Abstract
The PRIMARY score is a 5-category scale developed to identify clinically significant intraprostate malignancy (csPCa) on 68Ga-prostate-specific membrane antigen (PSMA)–11 PET/CT (68Ga-PSMA PET) using a combination of anatomic site, pattern, and intensity. Developed within the PRIMARY trial, the score requires evaluation in external datasets. This study aimed to assess the reproducibility and diagnostic accuracy of the PRIMARY score in a cohort of patients who underwent multiparametric MRI (mpMRI) and 68Ga-PSMA PET before prostate biopsy for the diagnosis of prostate cancer. Methods: In total, data from 242 men who had undergone 68Ga-PSMA PET and mpMRI before transperineal prostate biopsy were available for this ethics-approved retrospective study. 68Ga-PSMA PET and mpMRI data were centrally collated in a cloud-based deidentified image database. Six experienced prostate-focused nuclear medicine specialists were trained (1 h) in applying the PRIMARY score with 30 sample images. Six radiologists experienced in prostate mpMRI read images as per the Prostate Imaging–Reporting and Data System (PI-RADS), version 2.1. All images were read (with masking of clinical information) at least twice, with discordant findings sent to a masked third (or fourth) reader as necessary. Cohen κ was determined for both imaging scales as 5 categories and then collapsed to binary (negative and positive) categories (score 1 or 2 vs. 3, 4, or 5). Diagnostic performance parameters were calculated, with an International Society of Urological Pathology grade group of at least 2 (csPCa) on biopsy defined as the gold standard. Combined-imaging–positive results were defined as any PI-RADS score of 4 or 5 or as a PI-RADS score of 1–3 with a PRIMARY score of 3–5. Results: In total, 227 patients with histopathology, 68Ga-PSMA PET, and mpMRI imaging before prostate biopsy were included; 33% had no csPCa, and 67% had csPCa. 
Overall interrater reliability was higher for the PRIMARY scale (κ = 0.70) than for PI-RADS (κ = 0.58) when assessed as a binary category (benign vs. malignant). This was similar for all 5 categories (κ = 0.65 vs. 0.48). Diagnostic performance to detect csPCa was comparable between PSMA PET and mpMRI (sensitivity, 86% vs. 89%; specificity, 76% vs. 74%; positive predictive value, 88% vs. 88%; negative predictive value, 72% vs. 76%). Using combined imaging, sensitivity was 94%, specificity was 68%, positive predictive value was 86%, and negative predictive value was 85%. Conclusion: The PRIMARY score applied by first-user nuclear medicine specialists showed substantial interrater reproducibility, exceeding that of PI-RADS applied by mpMRI-experienced radiologists. Diagnostic performance was similar between the 2 modalities. The PRIMARY score should be considered when interpreting intraprostatic PSMA PET images.
The diagnosis of clinically significant prostate cancer (csPCa) has improved with the introduction of imaging-targeted biopsy with multiparametric MRI (mpMRI), allowing a proportion of men with normal MRI results to avoid biopsy and allowing MRI-targeted biopsy, improving the diagnosis of high-grade malignancy (1,2). The addition of 68Ga-prostate-specific membrane antigen (PSMA)–11 PET/CT (68Ga-PSMA PET) to mpMRI further improved the negative predictive value for prostate cancer diagnosis in the PRIMARY trial (3,4). The primary objective of this study was to validate the PRIMARY score developed in the prospective PRIMARY trial using a dataset of retrospectively collected real-world patients undergoing 68Ga-PSMA PET and mpMRI before prostate biopsy.
MATERIALS AND METHODS
This retrospective project was approved by the Human Research Ethics Board at St. Vincent’s Hospital Sydney (approval 2022/ETH00051). Patients from investigating urologists in Australia who had undergone both 68Ga-PSMA PET and mpMRI before initial prostate biopsy were identified. PSMA PET undertaken after biopsy for staging was an exclusion criterion. All patients had clinical data collected, including age, PSA level at the time of imaging, and the dates of imaging and biopsy procedures. The urologists’ reasons for biopsy or imaging were not available. 68Ga-PSMA PET and mpMRI were undertaken as per institutional protocols.
Imaging Data Collection
Both 68Ga-PSMA PET and mpMRI data were centrally collated and deidentified on a secure web-based server (MIMcloud; MIM Software) for independent review, with imaging modalities collated into separate imaging datasets to ensure masked reads by imagers. All imaging and clinical data were collated on an institutional REDCap database (St. Vincent’s Hospital Sydney) specifically designed for the trial. Case report forms were developed to record both PRIMARY and Prostate Imaging–Reporting and Data System (PI-RADS) scores. The PRIMARY score was documented as previously defined, with 5 categories: score 1, no significant pattern within the prostate; score 2, a diffuse transition or central zone pattern; score 3, focal transition zone activity above twice the background transition zone counts; score 4, focal peripheral zone activity of any intensity; and score 5, an SUVmax of more than 12 (Fig. 1) (4).
FIGURE 1. Five-point PRIMARY score (4).
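The 5 categories listed above can be encoded as a small decision function. The following is a minimal sketch only, not the trial's reporting software: the `ProstateFinding` fields are hypothetical names introduced for illustration, and the precedence (the highest applicable category wins) is an assumption rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass
class ProstateFinding:
    """Hypothetical summary of one reader's intraprostatic PSMA PET findings."""
    suv_max: float                        # highest SUVmax within the prostate
    focal_peripheral_zone: bool = False   # focal peripheral zone activity
    focal_transition_zone: bool = False   # focal transition zone activity
    tz_ratio: float = 0.0                 # focal TZ counts / background TZ counts
    diffuse_tz_or_cz: bool = False        # diffuse transition or central zone pattern


def primary_score(f: ProstateFinding) -> int:
    """Map findings to a PRIMARY score of 1-5 (assumed precedence: highest wins)."""
    if f.suv_max > 12:
        return 5  # intensity criterion
    if f.focal_peripheral_zone:
        return 4  # focal peripheral zone activity of any intensity
    if f.focal_transition_zone and f.tz_ratio > 2.0:
        return 3  # focal TZ activity above twice the TZ background
    if f.diffuse_tz_or_cz:
        return 2  # diffuse transition or central zone pattern
    return 1      # no significant pattern


if __name__ == "__main__":
    example = ProstateFinding(suv_max=3.5, focal_peripheral_zone=True)
    print(primary_score(example))
```

In this sketch, a focal transition zone finding that does not exceed twice the background falls through to score 2 or 1, which is a simplifying assumption.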
Image Reads and Analysis
Deidentified 68Ga-PSMA PET images were independently read according to the PRIMARY score by 6 nuclear medicine physicians experienced in 68Ga-PSMA PET. Each scan was read by 2 readers, with a maximum of 40 scans read by the same 2 readers. To obtain a single 68Ga-PSMA PET imaging decision per patient, a masked read by a third reader was requested when the 2 readers disagreed on the binary category, that is, when one score was 1 or 2 and the other was 3, 4, or 5. Before commencing, all nuclear medicine imaging investigators participated in a 1-h training session involving an explanation of the PRIMARY score and a consensus read of 30 68Ga-PSMA PET scans external to the study dataset. All readers were instructed to use the fused PET/CT images to allow differentiation between transition-zone and peripheral-zone activity. Images were read using the readers’ preferred PET image viewer platform.
Six experienced prostate MRI radiologists read mpMRI images as per PI-RADS version 2.1 independently of the PSMA PET or clinical results (5). To obtain a single PI-RADS score per patient, a third masked tie-breaking read was performed by another expert reader whenever the 2 initial readers disagreed on the PI-RADS score. If this third read was discordant with both earlier reads, a fourth read was undertaken.
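The escalation rules for extra reads described above can be sketched as simple predicates. This is an illustrative sketch with hypothetical function names. For PET, the trigger follows the binary split given in the text. For MRI, treating any difference in the 5-point PI-RADS score as a disagreement is an assumption, because the text does not state the granularity.

```python
def is_positive(score: int) -> bool:
    """Binary collapse used for both scales: 1 or 2 negative; 3, 4, or 5 positive."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError(f"score must be 1-5, got {score}")
    return score >= 3


def pet_needs_third_read(first: int, second: int) -> bool:
    """A third PET read is needed only when the readers split across the binary cut."""
    return is_positive(first) != is_positive(second)


def mri_needs_third_read(first: int, second: int) -> bool:
    """Assumption: any difference in PI-RADS score counts as disagreement."""
    return first != second


def mri_needs_fourth_read(first: int, second: int, third: int) -> bool:
    """A fourth read is needed when the tie-breaking read matches neither earlier read."""
    return third != first and third != second


if __name__ == "__main__":
    print(pet_needs_third_read(2, 4))       # readers on opposite sides of the cut
    print(mri_needs_fourth_read(2, 3, 3))   # third read agrees with one earlier read
```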
Histopathology
Histopathology from biopsy was derived from the clinical histopathology report. Transperineal rather than transrectal ultrasound biopsy was undertaken in all cases. The median number of cores was 22. Systematic biopsy was standard, with MRI-guided additional cores also obtained in some cases. The International Society of Urological Pathology (ISUP) grade group reported for the trial was for the index lesion.
Statistical Analysis
The primary outcome of interest was an estimate of the interrater reliability of the PRIMARY score, in its original 5 categories and as a binary evaluation (score 1 or 2 vs. 3, 4, or 5), for nuclear medicine specialists new to using the score. This was measured using the Cohen κ-coefficient, with the interpretations of the value by Landis and Koch adopted in the figures and text (0–0.2 indicating slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, substantial agreement; and 0.81–1.0, almost perfect or perfect agreement). Additionally, the analogous analysis was performed for the mpMRI read with PI-RADS, with a further evaluation for 3 categories of PI-RADS (1 or 2 vs. 3 vs. 4 or 5). The sample size per reader pair aimed to exceed the heuristic of 30 previously proposed in the literature (6). Further, we expected heterogeneity between PET reader pairs given the novelty of the scale. Given the convenience sample of about 240 patients, 6 readers were therefore invited for each imaging modality. Secondary aims were, first, to calculate diagnostic accuracy for 68Ga-PSMA PET using the PRIMARY score and for mpMRI using PI-RADS, with an ISUP grade group of 2 or more on biopsy being the gold standard; second, to calculate the diagnostic accuracy of a rational combination of imaging modalities based on the original PRIMARY paper (defining combined imaging positivity as PI-RADS 4/5 or PI-RADS 1–3 with a PRIMARY score of 3–5); third, to assess the distribution of ISUP grade group across categories of PI-RADS and binary PRIMARY score; and fourth, to evaluate the association between ISUP grade group and SUVmax, defined as the mean SUVmax reported by the 2 readers, with the Kruskal–Wallis test and significance set at a P value of less than 0.05. Stata version 17.0MP (StataCorp LLC) was used to generate the figures and analysis.
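For readers unfamiliar with the statistic, the following sketch computes an unweighted Cohen κ for one reader pair and attaches the Landis and Koch label quoted above. It is not the Stata code used in the study. The score lists are invented toy data, and how the overall κ was pooled across reader pairs is not covered here.

```python
from collections import Counter


def cohen_kappa(reads_a, reads_b):
    """Unweighted Cohen kappa for two readers scoring the same cases."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both readers must score the same non-empty set of cases")
    n = len(reads_a)
    observed = sum(a == b for a, b in zip(reads_a, reads_b)) / n
    freq_a, freq_b = Counter(reads_a), Counter(reads_b)
    # Chance agreement from the product of each reader's marginal proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use one category")
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Label a kappa value with the bands quoted in the text."""
    if kappa < 0:
        return "poor"  # below chance; this band is not quoted in the text
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


def collapse_binary(score):
    """Collapse a 5-point score to negative (1 or 2) or positive (3, 4, or 5)."""
    return "positive" if score >= 3 else "negative"


# Toy data (hypothetical): two readers scoring the same 10 scans.
reader_a = [1, 2, 3, 4, 5, 2, 4, 1, 3, 5]
reader_b = [1, 2, 4, 4, 5, 3, 4, 2, 3, 5]

kappa_5 = cohen_kappa(reader_a, reader_b)
kappa_2 = cohen_kappa([collapse_binary(s) for s in reader_a],
                      [collapse_binary(s) for s in reader_b])
print(round(kappa_5, 3), landis_koch(kappa_5))
print(round(kappa_2, 3), landis_koch(kappa_2))
```

Collapsing to 2 categories removes disagreements that stay on one side of the cut (for example, 3 vs. 4), which is why the binary κ is usually higher than the 5-category κ for the same reads.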
RESULTS
In total, 242 patients were available for analysis, having undergone 68Ga-PSMA PET and mpMRI followed by transperineal prostate biopsy, with no prior prostate biopsy or treatment for prostate cancer. Five had no pathology data available, a further 8 had no MR images available, and 2 more had no PET images available, leaving 227 patients for analysis (Table 1; Supplemental Fig. 1 [supplemental materials are available at http://jnm.snmjournals.org]). Seventy-four men (33%) did not have csPCa, 49 (22%) had ISUP grade group 2 disease, and 104 (46%) had ISUP grade group 3–5 disease.
TABLE 1. Patient Characteristics (n = 227)
Reproducibility
The overall κ for binary PRIMARY score categories (1 or 2 vs. 3, 4, or 5) was 0.70 (95% CI, 0.61–0.80), with a percentage agreement of 86% (Fig. 2). Pairwise κ-values ranged from 0.55 to 0.87. Interrater reliability for the full 5-category scale was substantial at 0.65 (95% CI, 0.58–0.73), percentage agreement was 74%, and pairwise κ-values ranged from 0.52 to 0.73. The overall κ for binary PI-RADS (1 or 2 vs. 3, 4, or 5) was moderate at 0.58 (95% CI, 0.46–0.70), with a percentage agreement of 82%. For a 3-group categorization of PI-RADS (1 or 2 vs. 3 vs. 4 or 5), the κ was 0.55 (95% CI, 0.46–0.65), with percentage agreement of 75%, and for the full 5-category scale, the κ was 0.48 (95% CI, 0.40–0.56), with percentage agreement of 61%. Twelve patients had discordant PI-RADS scores among 3 readers and required a fourth read. There were 7 pairs of MRI readers, instead of 6, as the allotment of 1 reader could not be completed (paired reader κ-values are shown in Supplemental Tables 1–5).
FIGURE 2. Cohen κ-coefficient for PRIMARY and PI-RADS divided into 2 categories or left as 5 categories. Overall coefficient and 95% CI are presented as thicker symbols and lines, with pairwise interrater values presented next to these as thinner, fainter symbols and lines.
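Percentage agreement and κ can diverge because κ discounts the agreement expected by chance. As a rough illustration, the definition κ = (p_o − p_e)/(1 − p_e) can be rearranged to back out the chance agreement p_e implied by each reported pair of values. This treats the rounded overall figures as if they came from a single agreement table, which is an assumption, so the outputs are approximate.

```python
def implied_chance_agreement(observed, kappa):
    """Solve kappa = (p_o - p_e) / (1 - p_e) for the chance agreement p_e."""
    if kappa >= 1.0:
        raise ValueError("p_e is not identifiable when kappa equals 1")
    return (observed - kappa) / (1.0 - kappa)


# (percentage agreement, kappa) pairs as reported in the Reproducibility section.
reported = {
    "PRIMARY, binary": (0.86, 0.70),
    "PI-RADS, binary": (0.82, 0.58),
    "PRIMARY, 5 categories": (0.74, 0.65),
    "PI-RADS, 5 categories": (0.61, 0.48),
}

for label, (p_o, kappa) in reported.items():
    print(f"{label}: implied chance agreement ~ {implied_chance_agreement(p_o, kappa):.2f}")
```

Under this approximation, the binary scales carry a chance agreement above 0.5, so a 4-point difference in percentage agreement translates into a larger difference in κ.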
Diagnostic Accuracy
In total, 149 patients (66%) had a positive 68Ga-PSMA PET result (PRIMARY score of 3, 4, or 5), and 155 (68%) had a positive MRI result (PI-RADS score of 3, 4, or 5). The sensitivity of 68Ga-PSMA PET was 86% (95% CI, 79%–91%), and the specificity was 76% (95% CI, 64%–85%), whereas the sensitivity of MRI was 89% (95% CI, 83%–93%), with a specificity of 74% (95% CI, 63%–84%). Individual PET and mpMRI reader sensitivities are represented in Figure 3. The positive predictive value for 68Ga-PSMA PET was 88% (95% CI, 82%–93%), with a negative predictive value of 72% (95% CI, 61%–81%), and the respective values for MRI were 88% (95% CI, 82%–93%) and 76% (95% CI, 65%–86%). Using the combined 68Ga-PSMA PET/MRI definition of positive findings, the sensitivity was 94% (95% CI, 89%–97%), with a specificity of 68% (95% CI, 56%–78%). The positive predictive value was 86% (95% CI, 80%–91%), and the negative predictive value was 85% (95% CI, 73%–93%) (Table 2; Supplemental Table 6).
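The four performance measures follow directly from a 2 × 2 table of imaging result against biopsy. The sketch below shows the arithmetic. The cell counts are not copied from Table 2. They were back-calculated for illustration from totals given in the text (153 patients with csPCa, 74 without, 149 PET-positive, 155 MRI-positive, and 59 combined-negative) and should be treated as an assumption. Confidence intervals are not computed.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Illustrative, back-calculated cell counts (assumption; each set sums to 227).
ILLUSTRATIVE_COUNTS = {
    "PRIMARY": dict(tp=131, fp=18, fn=22, tn=56),
    "PI-RADS": dict(tp=136, fp=19, fn=17, tn=55),
    "combined": dict(tp=144, fp=24, fn=9, tn=50),
}

for name, cells in ILLUSTRATIVE_COUNTS.items():
    metrics = diagnostic_metrics(**cells)
    print(name, {key: round(100 * value) for key, value in metrics.items()})
```

With these counts, the rounded percentages match those reported above, which suggests the reconstruction is at least consistent with the text.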
FIGURE 3. Sensitivity vs. (1 − specificity) overall for 68Ga-PSMA PET and MRI derived from PRIMARY and PI-RADS scores, respectively. Faint markers and lines denote individual readers.
TABLE 2. Diagnostic Accuracy for PI-RADS, PRIMARY, and Combination
Histopathology and Imaging
The distribution of histology grade according to positive or negative 68Ga-PSMA PET finding and category of PI-RADS is demonstrated in Figure 4. In each PI-RADS category, positive 68Ga-PSMA PET results, versus negative, resulted in a higher percentage of csPCa and ISUP grade group 3–5 cancer. Five of 45 patients (11%) with a PI-RADS score of 1 or 2 and 68Ga-PSMA PET–negative results had csPCa, all ISUP grade group 2. One patient of 59 (1.7%) had negative results on combined imaging (PI-RADS score of 3, 68Ga-PSMA PET–negative) and ISUP grade group 5 disease (Fig. 5). This patient’s MR images were read 3 times (PI-RADS scores of 2, 3, and 3), and the PET images were read twice (PRIMARY scores of 1 and 2).
FIGURE 4. Cumulative percentage distribution of ISUP grade group by positive or negative 68Ga-PSMA PET result and 3 categories of PI-RADS. GG = grade group.
FIGURE 5. PSMA PET imaging for patient with ISUP grade group 5 on histopathology, negative result on 68Ga-PSMA PET central read, and PI-RADS 3 on MRI. Central readers had 1-h training session on PRIMARY score with little prior exposure, and this lesion was missed by both readers. This image is technically PRIMARY score 4 (arrows) with focal lesion apically (SUVmax, 3.5).
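The combined-imaging rule from the Methods (positive if PI-RADS is 4 or 5, or if PI-RADS is 1–3 with a PRIMARY score of 3–5) is easy to state as a predicate. The sketch below uses a hypothetical function name and shows why the case above was combined-negative on central read and would have been positive had the focal lesion been scored as PRIMARY 4.

```python
def combined_imaging_positive(pi_rads: int, primary: int) -> bool:
    """Combined positivity: PI-RADS 4/5, or PI-RADS 1-3 with PRIMARY 3-5."""
    for name, score in (("PI-RADS", pi_rads), ("PRIMARY", primary)):
        if score not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name} score must be 1-5, got {score}")
    if pi_rads >= 4:
        return True
    return primary >= 3  # PI-RADS 1-3: PET decides


if __name__ == "__main__":
    # Central read of the case above: PI-RADS 3 with a negative PET (score 1 or 2).
    print(combined_imaging_positive(pi_rads=3, primary=2))
    # Same MRI result if the focal lesion had been scored as PRIMARY 4.
    print(combined_imaging_positive(pi_rads=3, primary=4))
```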
SUVmax and Grade Group
There was a statistically significant association between PSMA PET SUVmax, taken as the mean of 2 readers, and ISUP grade group (P < 0.001) (Fig. 6). All patients with an SUVmax of more than 12 (PRIMARY score, 5) had csPCa, with 51 of those 55 patients (93%) having ISUP grade group 3 or higher disease.
FIGURE 6. Horizontal box plot of SUVmax by ISUP grade group, taken as mean of 2 readers’ reports. Not shown are 1 ISUP grade group 3 lesion (SUVmax, 62) and 2 ISUP grade group 5 lesions (SUVmax, 44 and 46).
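The Kruskal–Wallis test named in the Methods compares the rank distribution of SUVmax across grade groups. The following pure-Python sketch shows the H statistic with tie correction. It is not the Stata analysis used in the study, and the SUVmax values are invented toy numbers. The P value uses the chi-square approximation, which has a closed form only because the example has 3 groups (2 degrees of freedom) and which is crude for samples this small.

```python
from collections import Counter
from math import exp


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H (undefined if every value is identical)."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3.0 * (n_total + 1)
    tie_sizes = Counter(pooled).values()
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction


# Toy SUVmax values (hypothetical) for three grade-group strata.
suvmax_by_group = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]

h_stat = kruskal_wallis_h(suvmax_by_group)
p_value = exp(-h_stat / 2.0)  # chi-square survival function for 2 degrees of freedom
print(round(h_stat, 2), round(p_value, 4))
```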
DISCUSSION
The PRIMARY score was developed to optimize the diagnostic accuracy of 68Ga-PSMA PET for csPCa intraprostatically and particularly to improve specificity over an SUVmax-based reporting method (3,4). The PRIMARY score has been incorporated into PROMISE version 2 for reporting of 68Ga-PSMA PET (7). This validation study undertaken on a real-world dataset confirmed the high diagnostic accuracy of the PRIMARY score, with substantial reproducibility among readers. Accuracy was equivalent to the MRI PI-RADS score, with better reproducibility, despite the limited experience of the readers with the PRIMARY score.
Benign intraprostatic patterns of PSMA activity with increased uptake in the transition and central zones are common as a result of benign prostatic hypertrophy and physiologic activity surrounding the ejaculatory ducts in the central zones (8,9). However, most malignancy (70%) arises within the peripheral zone of the prostate, with the incidence of transition and central zone malignancy significantly lower (10). The PRIMARY score uses this information to weight focal activity in the peripheral and transition zones while classifying diffuse transition zone activity as a benign finding. The initial PRIMARY score publication found that separating patterns of intraprostatic PSMA activity into focal or diffuse improved identification of significant malignancy (4). This validation cohort confirmed improved specificity using a pattern-based reporting system rather than an intensity (SUVmax)-based analysis as was used in the initial PRIMARY paper (3).
mpMRI is now the standard of care for the diagnosis of csPCa, with key randomized trials demonstrating improved accuracy and a reduced requirement for biopsy compared with a non–imaging-based transrectal ultrasound biopsy diagnostic paradigm (2,11). However, despite better targeting with MRI, a significant number of malignancies are missed using an MRI-targeted approach, with a high proportion of negative biopsies that could potentially have been avoided (1). 68Ga-PSMA PET using the PRIMARY score in conjunction with mpMRI may further optimize the diagnosis of prostate cancer, reducing the biopsy requirement and detection of insignificant malignancy while improving sensitivity for csPCa. This is being evaluated further in the prospective randomized PRIMARY2 trial (NCT05154162).
There is a strong association between ISUP grade group on histopathology and 68Ga-PSMA PET intensity, an association that is likely due to the pro-proliferative role of the PSMA receptor in prostate cancer (12–16). As with the PRIMARY study, the analysis found that an SUVmax of more than 12 was associated with csPCa in 100% of cases, with an ISUP grade group of at least 3 in 93% of those cases. This finding validates the use of PSMA intensity (SUVmax > 12) as the maximal PRIMARY score, although further work will be required to identify an optimum intensity for a PRIMARY score of 5 if PSMA-targeting peptides other than 68Ga-PSMA are to be utilized.
The Cohen κ demonstrated substantial agreement between the 6 PRIMARY score readers in the study, despite the fact that training was limited to a single 1-h session and despite the lack of harmonization between PET cameras, acquisition protocols, and doses due to the retrospective nature of the study. There was a single ISUP grade group 5 classified as PSMA-negative by both central readers and with equivocal mpMRI results (PI-RADS score, 3). This scan had focal PSMA avidity apically, fulfilling the criteria for a PRIMARY score of 4. It is expected that the reported diagnostic accuracy of the PRIMARY score will improve with further training of readers. There was a lower κ-score for PI-RADS despite the readers’ being high-volume prostate MRI specialists, pointing to reproducibility and simplicity as strengths of the PRIMARY score.
There are important limitations to the study. The study was designed as a real-world dataset to externally evaluate the findings of the PRIMARY trial. As such, it was retrospectively collected, and the reasons for which the urologists requested 68Ga-PSMA PET before biopsy were not documented and will have introduced a selection bias, particularly for the PI-RADS 1 and 2 patients included. Also because of the retrospective design, no camera or dose harmonization was possible. Nevertheless, the results show diagnostic accuracy similar to that of the prospective PRIMARY trial, demonstrating the reproducibility of the score, both for accuracy and among readers. A higher proportion of patients had more aggressive disease on histopathology than in the PRIMARY trial, with a lower proportion of benign or low-grade findings. This factor may have impacted the negative predictive value of the PRIMARY, PI-RADS, and combination scores. In the evaluation of reader agreement for both MRI and PSMA, the study utilized 2 readers—with a third reader for discordance—rather than the 3 readers considered optimal in a registration study (17).
Both the PRIMARY trial and this validation study utilized 68Ga-PSMA PET. Although it is likely that the PRIMARY score is translatable to other PSMA PET agents given the focus on pattern and anatomic site, this possibility requires further evaluation.
CONCLUSION
The PRIMARY score showed substantial interrater reproducibility by first-user nuclear medicine specialists, exceeding that of PI-RADS. Diagnostic performance was similar between the 2 modalities. The PRIMARY score should be considered when interpreting intraprostatic PSMA PET images.
DISCLOSURE
Wolfgang Fendler reports fees from SOFIE Biosciences (research funding), Janssen (consultant, speaker), Calyx (consultant, image review), Bayer (consultant, speaker, research funding), Novartis (speaker, consultant), Telix (speaker), GE Healthcare (speaker), and Eczacıbaşı Monrol (speaker) outside the submitted work. Louise Emmett reports funds from Movember (research), Clarity (research, consultant), Novartis (speaker, consultant), Astellas (speaker), Astrazeneca (speaker), and Telix (speaker). Matthias Eiber reports fees from Blue Earth Diagnostics Ltd. (consultant, research funding), Novartis/AAA (consultant, speaker), Telix (consultant), Bayer (consultant, research funding), RayzeBio (consultant), Point Biopharma (consultant), Eckert-Ziegler (speaker), Janssen Pharmaceuticals (consultant, speakers’ bureau), Parexel (image review), and Bioclinica (image review) outside the submitted work and a patent application for rhPSMA. No other potential conflict of interest relevant to this article was reported.
KEY POINTS
QUESTION: Is the PRIMARY score an accurate, reproducible method for reporting intraprostatic PSMA PET findings in men prior to prostate biopsy, and how does it compare to mpMRI?
PERTINENT FINDINGS: The PRIMARY score is equivalent in diagnostic accuracy to mpMRI when undertaken prior to prostate biopsy in a high-risk population. Further, it is more reproducible than mpMRI despite the readers having limited clinical experience with the score.
IMPLICATIONS FOR PATIENT CARE: A 5-level PRIMARY score incorporating intraprostatic patterns and intensity on 68Ga-PSMA PET/CT shows potential as an accurate method for diagnosing csPCa and should be considered when PSMA PET is undertaken for this purpose.
Footnotes
Published online Nov. 30, 2023.
- © 2024 by the Society of Nuclear Medicine and Molecular Imaging.
- Received for publication June 14, 2023.
- Revision received October 4, 2023.