Abstract
Prostate-specific membrane antigen (PSMA)–targeted radioligand therapy can improve the outcome of patients with advanced metastatic castration-resistant prostate cancer, but patients do not respond uniformly. We hypothesized that using the salivary glands as a reference organ can enable selective patient stratification. We aimed to establish a PSMA PET tumor–to–salivary gland ratio (PSG score) to predict outcomes after [177Lu]PSMA. Methods: In total, 237 men with metastatic castration-resistant prostate cancer treated with [177Lu]PSMA were included. A quantitative PSG (qPSG) score (SUVmean ratio of whole-body tumor to parotid glands) was semiautomatically calculated on baseline [68Ga]PSMA-11 PET images. Patients were divided into 3 groups: high (qPSG > 1.5), intermediate (qPSG = 0.5–1.5), and low (qPSG < 0.5) scores. Ten readers interpreted the 3-dimensional maximum-intensity-projection baseline [68Ga]PSMA-11 PET images and classified patients into 3 groups based on visual PSG (vPSG) score: high (most of the lesions showed higher uptake than the parotid glands), intermediate (neither low nor high), and low (most of the lesions showed lower uptake than the parotid glands). Outcome data included a more than 50% prostate-specific antigen (PSA) decline, PSA progression-free survival, and overall survival (OS). Results: Of the 237 patients, the numbers in the high, intermediate, and low groups were 56 (23.6%), 163 (68.8%), and 18 (7.6%), respectively, for qPSG score and 106 (44.7%), 96 (40.5%), and 35 (14.8%), respectively, for vPSG score. The interreader reproducibility of the vPSG score was substantial (Fleiss weighted κ, 0.68). The rate of a more than 50% PSA decline was higher in patients with a higher PSG score (high vs. intermediate vs. low, 69.6% vs. 38.7% vs. 16.7%, respectively, for qPSG [P < 0.001] and 63.2% vs. 33.3% vs. 16.1%, respectively, for vPSG [P < 0.001]).
The median PSA progression-free survival of the high, intermediate, and low groups was 7.2, 4.0, and 1.9 mo (P < 0.001), respectively, by qPSG score and 6.7, 3.8, and 1.9 mo (P < 0.001), respectively, by vPSG score. The median OS of the high, intermediate, and low groups was 15.0, 11.2, and 13.9 mo (P = 0.017), respectively, by qPSG score and 14.3, 9.6, and 12.9 mo (P = 0.018), respectively, by vPSG score. Conclusion: The PSG score was prognostic for PSA response and OS after [177Lu]PSMA. The visual PSG score assessed on 3-dimensional maximum-intensity-projection PET images yielded substantial reproducibility and comparable prognostic value to the quantitative score.
Patients with advanced metastatic castration-resistant prostate cancer (mCRPC) do not respond uniformly to [177Lu]prostate-specific membrane antigen (PSMA) (1,2). Thus, identification of patients who will likely benefit from PSMA-targeted radioligand therapy remains an unmet clinical need.
High PSMA expression as assessed by PET and whole-body (WB) tumor SUVmean is associated with better outcomes (3–7). PSMA PET should be used to select patients on the basis of tumor PSMA expression (8). However, the inclusion criteria based on baseline PSMA PET vary among major clinical trials and therapy centers across the world (Supplemental Table 1; supplemental materials are available at http://jnm.snmjournals.org) (1,2,6,9–17). The VISION trial applied qualitative (i.e., tumor uptake > liver uptake, assessed visually) thresholds (18). These criteria are relevant in identifying patients with absence of or low [68Ga]PSMA-11 expression, and 13% (126/1003) of patients were screening failures (1). Men with screening failure according to the VISION PET criteria had worse short-term outcomes than those who were eligible (19). However, even after selection of patients by VISION PET criteria, many patients do not respond favorably to [177Lu]PSMA, suggesting the need for further refinements of PSMA PET and other screening parameters.
When measured quantitatively, [68Ga]PSMA-11 uptake in the parotid glands exceeds liver uptake 2- to 3-fold (median SUVmax for liver vs. parotids, 9.7 vs. 21.3 (20)), which is close to the criteria used in the TheraP trial (lesion SUVmax, 20) (2). We hypothesized that use of the parotid glands rather than the liver as a reference organ would improve patient stratification for [177Lu]PSMA. The aim of this study was to test a quantitative and visual PSMA PET tumor–to–salivary gland ratio (PSG score) to predict outcomes after [177Lu]PSMA in a cohort of patients with mCRPC established retrospectively from multiple institutions.
MATERIALS AND METHODS
Study Design
This was a retrospective multicenter study using a published dataset (n = 270) (4,19). Images were visually analyzed by 10 masked central, independent readers. The informed consent requirement was waived by the UCLA institutional review board (waiver 19-000896).
Patients
Patients received [177Lu]PSMA-617 or [177Lu]PSMA-I&T between December 10, 2014, and July 19, 2019, in phase 2 clinical trials (NCT03042312 and ACTRN12615000912583) or via compassionate-use access programs (Supplemental Table 2). The [68Ga]PSMA-11 PET/CT protocol is provided in the supplemental Materials and Methods (20) and in Supplemental Table 3. Treatment details are provided in the supplemental Materials and Methods (21–23). Patients were excluded from the current analysis if more than 50% of the parotid glands was outside the PET field of view (as described in the eligibility criteria in Supplemental Table 4).
Image Analysis
Quantitative PSG (qPSG) Score
We first assessed the WB tumor burden quantitatively using the [68Ga]PSMA-11 PET qPSG score. Parotid glands and WB tumors were segmented semiautomatically on baseline [68Ga]PSMA-11 PET images using qPSMA software (24). Output parameters included WB tumor SUVmean, the SUVmax of the lesion with the highest uptake (H-lesion), WB PSMA tumor volume, and bilateral parotid gland SUVmean. The ratio of WB tumor SUVmean to parotid gland SUVmean (qPSG = WB tumor SUVmean/parotid gland SUVmean) was calculated. Patients were divided into 3 groups according to qPSG score: high (>1.5), intermediate (0.5–1.5), and low (<0.5). In addition, patients were grouped as high SUV versus low SUV to compare SUV-based criteria (2,6) with PSG scores (supplemental Materials and Methods).
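The qPSG definition and its 3-group cutoffs can be illustrated with a minimal Python sketch. The function names and the SUVmean values below are hypothetical and chosen only for illustration; the segmentation itself (performed with qPSMA in this study) is not reproduced here.

```python
def qpsg_score(tumor_suvmean: float, parotid_suvmean: float) -> float:
    """Ratio of whole-body tumor SUVmean to bilateral parotid gland SUVmean."""
    if parotid_suvmean <= 0:
        raise ValueError("parotid SUVmean must be positive")
    return tumor_suvmean / parotid_suvmean


def qpsg_group(score: float) -> str:
    """Map a qPSG score to the 3 groups: high (>1.5), low (<0.5), else intermediate."""
    if score > 1.5:
        return "high"
    if score < 0.5:
        return "low"
    return "intermediate"  # 0.5 to 1.5, both bounds included


if __name__ == "__main__":
    # Hypothetical (tumor SUVmean, parotid SUVmean) pairs, not study data.
    for tumor, parotid in [(24.0, 12.0), (9.0, 12.0), (4.0, 12.0)]:
        score = qpsg_score(tumor, parotid)
        print(f"qPSG = {score:.2f} -> {qpsg_group(score)}")
```

Because the score is a ratio, both SUVmean inputs must come from the same scan and the same SUV normalization.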
Visual PSG (vPSG) Score
In a second step, we assessed the reproducibility and prognostic value of visual criteria using the parotid glands as an organ of reference (vPSG score). All readers were board-certified nuclear medicine physicians with more than 2 y of experience in PSMA PET interpretation. To assess whether the reader experience in treating patients with [177Lu]PSMA therapy influences image scoring, both readers with extensive experience (>50 treatments; 5 readers) and readers with limited experience (≤50 treatments; 5 readers) were selected (Supplemental Table 5).
Three-dimensional maximum-intensity-projection (MIP) baseline [68Ga]PSMA-11 PET images adjusted to 3 different SUV window ranges (0–10, 0–20, and 0–30) were generated by a single lead investigator not involved in image analysis. Each reader was provided with the images (portable document format). Readers were asked to classify the patients into 3 groups (i.e., high, intermediate, and low) according to the vPSG score as described in Table 1. Representative images of each group are shown in Figure 1.
TABLE 1. Visual PSMA PET vPSG Score
FIGURE 1. Representative MIP images of 6 patients classified as having high, intermediate, and low vPSG scores (MIP SUV range, 0–20).
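As a rough illustration of how a MIP image is displayed in a fixed SUV window, the following Python sketch projects a small SUV volume and rescales it to gray levels. It assumes a single coronal projection and a linear 0–255 gray scale; the function names are hypothetical, and the software actually used to generate the study images is not described in the text.

```python
from typing import List

Volume = List[List[List[float]]]  # indexed [z][y][x], voxel values in SUV
Image = List[List[float]]         # indexed [z][x]


def coronal_mip(volume: Volume) -> Image:
    """Keep, for every (z, x) position, the maximum SUV along the y axis."""
    return [
        [max(row[x] for row in slab) for x in range(len(slab[0]))]
        for slab in volume
    ]


def apply_suv_window(image: Image, upper: float, lower: float = 0.0) -> List[List[int]]:
    """Clip SUVs to [lower, upper] and scale linearly to gray levels 0-255."""
    span = upper - lower
    if span <= 0:
        raise ValueError("upper must be greater than lower")
    return [
        [round(255 * (min(max(v, lower), upper) - lower) / span) for v in row]
        for row in image
    ]


if __name__ == "__main__":
    # Tiny hypothetical volume: 1 slab along z, 2 rows along y, 2 columns along x.
    volume = [[[1.0, 25.0], [5.0, 2.0]]]
    mip = coronal_mip(volume)
    for upper in (10, 20, 30):  # the 3 window ranges named in the text
        print(f"SUV 0-{upper}:", apply_suv_window(mip, upper))
```

In the narrow window, any lesion above SUV 10 saturates at full intensity; the wider windows keep differences between high-uptake lesions and the parotid glands visible.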
At more than 2 wk after the first reads, 50 cases were randomly selected for rereading to determine intrareader agreement. One lead investigator conducted the final analysis. A central majority rule (6 vs. 4) was applied in cases of disagreement to obtain the final reads. If disagreement persisted on intermediate versus high or on low versus intermediate (e.g., 5 vs. 5), the cases were classified as high or low, respectively, avoiding the intermediate category.
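The consensus rule can be sketched in Python as follows. The text specifies the majority rule and the handling of ties between adjacent categories; the use of a simple plurality when no category reaches 6 of 10 votes, and the `None` result for ties the text does not address (high vs. low, or a 3-way tie), are assumptions of this sketch.

```python
from collections import Counter
from typing import Optional, Sequence

GROUPS = ("low", "intermediate", "high")


def consensus_vpsg(reads: Sequence[str]) -> Optional[str]:
    """Combine individual reader classifications into one final vPSG group."""
    unknown = set(reads) - set(GROUPS)
    if unknown or not reads:
        raise ValueError(f"invalid reads: {sorted(unknown)}")
    counts = Counter(reads)
    top = max(counts.values())
    leaders = {group for group, n in counts.items() if n == top}
    if len(leaders) == 1:
        return next(iter(leaders))   # clear majority (assumed: plurality)
    if leaders == {"intermediate", "high"}:
        return "high"                # tie resolved away from intermediate
    if leaders == {"intermediate", "low"}:
        return "low"                 # tie resolved away from intermediate
    return None                      # tie not covered by the stated rule


if __name__ == "__main__":
    print(consensus_vpsg(["high"] * 6 + ["intermediate"] * 4))
    print(consensus_vpsg(["high"] * 5 + ["intermediate"] * 5))
    print(consensus_vpsg(["low"] * 5 + ["intermediate"] * 5))
```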
Clinical Outcomes
The clinical outcomes included a more than 50% prostate-specific antigen (PSA) decline (PSA50), PSA progression-free survival (PFS), and overall survival (OS). PSA50 was defined as a PSA decline of more than 50% from baseline at any time during treatment (best response). PSA PFS was defined as the time from treatment initiation to PSA progression or death from any cause, as per the criteria of Prostate Cancer Clinical Trials Working Group 3 (25). OS was defined as the time from treatment initiation to death from any cause.
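The PSA50 definition (best response, decline of more than 50% from baseline) can be written as a short Python sketch. The function names and PSA values are hypothetical; details such as confirmation measurements or timing windows are not modeled.

```python
from typing import Sequence


def best_psa_change(baseline: float, on_treatment: Sequence[float]) -> float:
    """Best (most negative) percentage PSA change from baseline."""
    if baseline <= 0:
        raise ValueError("baseline PSA must be positive")
    if not on_treatment:
        raise ValueError("at least one on-treatment PSA value is required")
    nadir = min(on_treatment)
    return 100.0 * (nadir - baseline) / baseline


def is_psa50(baseline: float, on_treatment: Sequence[float]) -> bool:
    """True when the best response is a decline of more than 50%."""
    return best_psa_change(baseline, on_treatment) < -50.0


if __name__ == "__main__":
    # Hypothetical PSA values in ng/mL, not study data.
    print(best_psa_change(100.0, [80.0, 45.0, 60.0]), is_psa50(100.0, [80.0, 45.0, 60.0]))
    print(best_psa_change(100.0, [80.0, 50.0]), is_psa50(100.0, [80.0, 50.0]))
```

A decline of exactly 50% does not count as PSA50 under the "more than 50%" wording, which is why the comparison is strict.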
Statistical Analysis
The R software package was used for statistical analysis. Two-tailed P values of less than 0.05 were considered significant. Clinical characteristics were compared among PSMA expression groups using the Mann–Whitney U test for continuous variables and the Fisher exact test for categoric variables. The proportion of patients who had a PSA50 was assessed by the Fisher exact test, and the odds ratio from logistic regression was calculated. Kaplan–Meier analysis with the log-rank test and Cox regression (hazard ratios) were performed to evaluate survival outcomes. Multivariate Cox and logistic regression analyses were performed to test the PSG scores and previously reported prognostic factors for [177Lu]PSMA (4). Intra- and interreader agreement was evaluated by weighted Fleiss κ-coefficients. Agreement between vPSG score (majority rule) and qPSG score was assessed by weighted Cohen κ-coefficients.
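For readers unfamiliar with weighted κ, the following Python sketch computes a weighted Cohen κ for 2 sets of ordinal ratings (e.g., vPSG vs. qPSG groups). The analysis in this study was done in R; the linear weighting used here is an assumption, because the weighting scheme is not stated in the text, and the example ratings are hypothetical.

```python
from typing import Sequence

CATEGORIES = ("low", "intermediate", "high")  # ordinal order matters


def weighted_cohen_kappa(a: Sequence[str], b: Sequence[str],
                         categories: Sequence[str] = CATEGORIES) -> float:
    """Weighted Cohen kappa with linear disagreement weights |i - j| / (k - 1)."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(a)
    # Observed joint proportions of (rating a, rating b).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[index[x]][index[y]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            num += w * observed[i][j]
            den += w * row[i] * col[j]
    if den == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - num / den


if __name__ == "__main__":
    visual = ["high", "high", "intermediate", "low", "intermediate", "high"]
    quantitative = ["high", "intermediate", "intermediate", "low", "intermediate", "intermediate"]
    print(round(weighted_cohen_kappa(visual, quantitative), 3))
```

With linear weights, a high-versus-low disagreement is penalized twice as heavily as a disagreement between adjacent categories.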
RESULTS
Patients
Between April 23, 2019, and January 13, 2020, 414 patients were retrospectively screened, and 177 men were excluded as specified in Supplemental Figure 1. Thus, 237 men were included in the final analysis. Seventy-five and 162 men were treated with [177Lu]PSMA-617 and [177Lu]PSMA-I&T, respectively. Table 2 depicts the clinical characteristics of the cohort. The median follow-up time was 21.2 mo (interquartile range, 14.1–30.6 mo).
TABLE 2. Patient Characteristics and Clinical Outcome
PSG Score
Of the 237 patients, the numbers in the high-, intermediate-, and low-PSG groups were 56 (23.6%), 163 (68.8%), and 18 (7.6%), respectively, by qPSG score and 106 (44.7%), 96 (40.5%), and 35 (14.8%), respectively, by vPSG score (majority rule) (Supplemental Tables 6 and 7 show the clinical and PSMA PET characteristics of each qPSG and vPSG score group). Baseline clinical characteristics did not differ among the groups, except for a lower proportion of patients with prior docetaxel treatment in the low-qPSG group. The proportion of patients with PSMA PET nodal metastasis (N1) was lowest in the low groups both by qPSG score (33.3%) and by vPSG score (20.0%) (P < 0.001). The proportion of patients with at least 20 distant metastases was lower in the low group (45.7%) than in the intermediate (75.0%) and high (76.4%) groups by vPSG score (P = 0.001). WB tumor SUVmean and PSMA tumor volume were highest in the high group, followed by the intermediate and low groups, both by qPSG score and by vPSG score.
PSG Score and Clinical Outcome
Clinical outcomes for each of the 3 groups by vPSG and qPSG scores are summarized in Table 3. Comparisons between PSA PFS and OS in patients with a nonhigh PSG score (intermediate + low) versus a high PSG score (2 groups) are provided in Supplemental Figures 2 and 3, respectively. The PSA50, PSA PFS, and OS obtained by PSG scores and SUV-based criteria (high SUV vs. low SUV) are compared in the supplemental Materials and Methods.
TABLE 3. Outcomes of qPSG Score and vPSG Score for High, Intermediate, and Low Patients
PSA Response
A higher PSA50 rate was observed in the groups with a high PSG score than in those with an intermediate or low PSG score (P < 0.001) (PSA50 odds ratios for qPSG and vPSG scores are shown in Supplemental Tables 8 and 9). Both qPSG score and vPSG score were independent predictors of PSA50. Moreover, the PSA50 rate in patients with a high PSG score was significantly higher than in those with an H-lesion SUVmax of at least 20 (supplemental Materials and Methods).
PSA PFS
PSA PFS was longest in the groups with a high qPSG or vPSG score (Fig. 2). The corresponding hazard ratios are shown in Supplemental Tables 10 and 11, respectively.
FIGURE 2. Kaplan–Meier curves for PSA PFS comparing groups with high, intermediate, and low PSMA expression classified by qPSG score (A) and vPSG score (B).
OS
The longest OS was in the groups with a high qPSG or vPSG score (Fig. 3). The hazard ratios of the high groups were lower than those of the intermediate groups but were not significantly different from the low groups in univariate and multivariate analyses (Supplemental Tables 12 and 13). There was no difference in OS between patients with an H-lesion SUVmax of at least 20 and those with an H-lesion SUVmax of less than 20. In contrast, OS was longer in patients with a WB SUVmean of at least 10 than in those with a WB SUVmean of less than 10. OS did not significantly differ between patients with a high PSG score and patients with a WB SUVmean of at least 10 (Supplemental Fig. 3).
FIGURE 3. Kaplan–Meier curves for OS comparing groups with high, intermediate, and low PSMA expression classified by qPSG score (A) and vPSG score (B).
Agreement
Agreement between qPSG and vPSG scores was moderate (weighted Cohen κ, 0.60; 95% CI, 0.52–0.68). Complete agreement between qPSG and vPSG scores was seen in 160 (67.5%) of the 237 patients.
The inter- and intrareader reproducibility of the vPSG score for all readers (n = 10) showed substantial agreement (Fleiss weighted κ, 0.68; 95% CI, 0.63–0.73) and almost perfect agreement (Cohen weighted κ [mean], 0.83 ± 0.06), respectively (supplemental Materials and Methods). Agreement among readers with and without prior [177Lu]PSMA experience is shown in Supplemental Table 14 and Supplemental Figure 4.
DISCUSSION
Quantitative (qPSG) and visual (vPSG) PET-derived scores for tumor [68Ga]PSMA-11 uptake relative to parotid gland uptake predicted PSA response and PSA PFS after [177Lu]PSMA in patients with mCRPC. The 3-dimensional MIP image–based vPSG score was substantially reproducible and did not require extensive experience with [177Lu]PSMA.
In the VISION study, the liver was used as the reference organ, and 87.4% of patients were eligible after [68Ga]PSMA-11 PET screening (1). PSMA tumor uptake equal to or greater than liver uptake appears to be the minimum target expression requirement for response to [177Lu]PSMA. The [68Ga]PSMA-11 uptake of the parotid gland is 2–3 times higher than that of the liver (20). Therefore, use of the parotid gland as a reference organ would make the criteria more stringent and specific.
Only MIP images were used for visual analysis. MIP images display WB tumor PSMA expression and disease extent in a single image. However, vPSG score should be used in combination with cross-sectional image analysis to determine the presence of PSMA-negative lesions (1,18,19). The greatest value of the PSG score may be in its use to exclude patients less likely to benefit from [177Lu]PSMA—those with a low PSG score. Also, when available, [18F]FDG PET/CT may complement the PSG score and potentially improve prognostication. The presence of [18F]FDG-positive/PSMA-negative lesions was associated with poor response to [177Lu]PSMA (9,26–28). We propose that patients with a low PSG score be deprioritized from [177Lu]PSMA. PSMA PET–based exclusion criteria for [177Lu]PSMA may encompass patients with PSMA-negative lesions by CT or by FDG, patients with lesion uptake below liver uptake, and patients with a low vPSG score.
Three different SUV-scale windows were used for interpreting MIP images. A MIP image with a narrow window (SUV, 0–10) is useful to observe the distribution of lesions with low PSMA expression, and MIP images with a wider window (SUVs, 0–20 and 0–30) are helpful to compare lesion uptake with parotid gland uptake. Using MIP images enables rapid and reproducible evaluations, which can facilitate clinical implementation.
Agreement between qPSG score and vPSG score (majority rule) was moderate, because vPSG score is based on the extent (>80%) of lesions with uptake greater than that of the parotid gland, whereas qPSG score is independent of disease extent (based on SUV ratio only). Despite the methodologic difference, the outcomes of each group by qPSG and vPSG score were similar, suggesting that both criteria are valuable. qPSG score enables higher reproducibility as it is obtained semiautomatically; however, segmentation software is necessary.
Recently developed nomograms to predict outcome after [177Lu]PSMA require WB SUVmean as a parameter (4). A classification using a WB SUVmean of at least 10 identified treatment responders in the VISION and TheraP cohorts (6). In our cohort, both qPSG score and vPSG score had similar prognostic value to a WB SUVmean of at least 10. The qPSG score is based on an SUV ratio (WB tumor to parotid glands) rather than a fixed SUV threshold to reduce some inherent variability in SUV measurements across patients, scanners, and reconstruction algorithms (29). The need for tumor segmentation software precludes current clinical use of quantitative parameters such as WB tumor volume/SUVmean. There are multiple WB segmentation tools under clinical development (24,30–33), but none are yet validated and widely available.
We propose a simple visual score to derive prognostic information from the screening 3-dimensional MIP [68Ga]PSMA-11 PET images. In contrast, a binary SUVmax classification (H-lesion SUVmax ≥ 20 vs. < 20) was not prognostic of patient OS, because H-lesion SUVmax does not account for disease heterogeneity, a key determinant of treatment response to [177Lu]PSMA (3,12). The 3-dimensional MIP-based vPSG score can be implemented quickly and at no cost in the clinic after further validation. Integration of the vPSG score in the [177Lu]PSMA nomogram approach (4) may improve its accuracy and further support clinical adoption.
We divided patients into 3 rather than 2 groups. The rationale was to capture, in the intermediate group, patients with heterogeneous PSMA expression. This grouping predicted PSA responses well. However, the group with an intermediate PSG score tended to show worse OS than the group with a low PSG score. Possible explanations include the small population, partial-volume effects, less advanced disease stage, and lower tumor burden in the low group. As such, the 3-group PSG score is more suitable as a biomarker for PSA response than for OS.
Limitations of this study include the lack of independent PSG score validation and the retrospective design. Moreover, the cohort did not include patients who were excluded from [177Lu]PSMA by the local treating sites. Thus, patients with low PSMA expression may be underrepresented. Also, the PSG score was tested only with [68Ga]PSMA-11 PET, and its efficacy with other PSMA-targeted PET tracers (e.g., [18F]DCFPyL) is unknown. Considering similar normal-organ and tumor biodistribution patterns between [68Ga]PSMA-11 and [18F]DCFPyL (34), we anticipate that the PSG score may be applicable to [18F]DCFPyL PET as well. Nevertheless, confirmatory studies have yet to be conducted. Finally, our criteria focus on only PSMA expression. Although high PSMA expression increases the likelihood of sufficient delivery of radiopharmaceutical to tumor, various factors (e.g., administered and absorbed dose, genomic DNA repair mechanism, radiosensitivity, and other biologic tumor characteristics) are associated with radioresistance (35). More comprehensive inclusion criteria may be necessary to refine patient selection.
CONCLUSION
This study proposes a PSG score derived from pretherapeutic [68Ga]PSMA-11 PET as a novel predictive and prognostic biomarker for response to [177Lu]PSMA. After further clinical validation, this score, together with other cross-sectional or metabolic imaging, may improve patient selection.
DISCLOSURE
Andrei Gafita is supported by the Prostate Cancer Foundation (21YOUN18), a UCLA Jonsson Comprehensive Cancer Center fellowship award, and a Dr. Christiaan Schiepers postdoctoral fellowship award. Jeremie Calais is supported by the Prostate Cancer Foundation (20YOUN05) and reports previous consulting activities for Astellas, Blue Earth Diagnostics, Curium Pharma, DS Pharma, GE Healthcare, IBA Cyclopharma, Isoray, Janssen Pharmaceuticals, Lantheus, Lightpoint Medical, Novartis, Point Biopharma, and Telix outside the submitted work. Michael Hofman is supported by a PCF Special Challenge Award through the Prostate Cancer Foundation with funding from CANICA, Oslo, Norway, and a Prostate Cancer Research Alliance Grant funded by Movember and the Australian Government Medical Research Future Fund; receives personal fees from Janssen (lecture honorarium), Mundipharma (lecture honorarium), Astellas (lecture honorarium), AstraZeneca (lecture honorarium), and MSD (advisory forum); receives research support from Endocyte, AAA, and Novartis, outside the submitted work; receives grant support from AAA/Novartis; and receives consulting fees for lectures or advisory boards from Astellas, AstraZeneca, Janssen, Merck/MSD, Mundipharma, and Point Biopharma. Wolfgang Fendler is supported by the German Research Foundation (FE1573/3-1/659216) and receives fees from SOFIE Bioscience (research funding), Janssen (consultant, speakers bureau), Calyx (consultant), Bayer (consultant, speakers bureau, research funding), and Parexel (image review) outside the submitted work. Johannes Czernin is a founder, board member, and holds equity in Sofie Biosciences and Trethera Therapeutics (intellectual property is patented by the University of California and licensed to Sofie Biosciences and Trethera Therapeutics) and was a consultant for Endocyte (VISION trial steering committee), Actinium Pharmaceuticals, and Point Biopharma, outside the submitted work. 
Thomas Hope reports consulting activities with Curium, RayzeBio, and ITM; receives research support from Clovis Oncology and Philips; and is supported by the NCI (R01CA212148 and R01CA235741) and the Prostate Cancer Foundation. Matthias Eiber reports previous consulting activities for BED, Novartis, Telix, Progenics, Bayer, Point Biopharma, and Janssen and has submitted a patent application for rhPSMA, outside the submitted work. No other potential conflict of interest relevant to this article was reported.
KEY POINTS
QUESTION: Can the PSMA PET criteria using salivary glands as a reference organ (i.e., PSG score) optimize stratification of patients with mCRPC based on the response to [177Lu]PSMA?
PERTINENT FINDINGS: WB tumor uptake was compared with salivary gland uptake visually and quantitatively on baseline [68Ga]PSMA-11 PET images, and patients were classified into groups with high, intermediate, and low PSMA expression. Patients with high expression classified visually and by qPSG score showed a significantly better PSA response and OS after [177Lu]PSMA.
IMPLICATIONS FOR PATIENT CARE: The PSG score can be a valuable biomarker for response to [177Lu]PSMA and may assist in individual clinical decision making and future clinical trial design.
Footnotes
Guest Editor: Todd Peterson, Vanderbilt University.
Published online Mar. 30, 2023.
- © 2023 by the Society of Nuclear Medicine and Molecular Imaging.
- Received for publication November 22, 2022.
- Revision received February 10, 2023.