Abstract
Primary central nervous system (CNS) lymphoma is an aggressive non-Hodgkin lymphoma with poor prognosis. We evaluated pretreatment 18F-FDG PET as a prognostic marker in primary CNS lymphoma. Methods: Forty-two immunocompetent patients with newly diagnosed primary CNS lymphoma who underwent pretreatment 18F-FDG PET were retrospectively analyzed. Baseline status and response to treatment were evaluated by MR imaging. Tumor maximum standardized uptake values were assessed by volume-of-interest analyses using an automatic isocontour definition. A 10-step semiquantitative visual rating system (metabolic imaging lymphoma aggressiveness scale, or MILAS) was used to assess primary CNS lymphoma metabolism as a marker of clinical aggressiveness. Logistic regression, log-rank testing, and multivariable Cox regression were used to investigate the association between 18F-FDG uptake and tumor response and survival. Results: Mean maximum standardized uptake value correlated linearly with MILAS. The distribution of patients according to MILAS (0–9) was 0%, 28.6%, 23.8%, 21.4%, 11.9%, 4.8%, 7.1%, 0%, 0%, and 2.4%. There was no correlation between MILAS and response to treatment. Respective 2- and 5-y survival rates were 52% and 32% for progression-free survival (PFS) and 64% and 50% for overall survival (OS). A cutoff at MILAS 3 was a good separator for PFS (median: 54.7 mo [≤3], 3.8 mo [>3], P = 0.0272) and showed a nonsignificant trend for OS (median: not reached [≤3], 13.8 mo [>3], P = 0.131). In multivariable analyses, increasing MILAS was significantly associated with shorter PFS (hazard ratio, 1.49, P = 0.006) and OS (hazard ratio, 1.43, P = 0.018). Conclusion: Pretreatment 18F-FDG PET may offer new opportunities for baseline risk evaluation in untreated primary CNS lymphoma.
Primary central nervous system (CNS) lymphoma is an aggressive extranodal non-Hodgkin lymphoma (NHL) confined to the CNS compartment at diagnosis. Primary CNS lymphoma is a rare disease that accounts for 3%–4% of all primary brain tumors and 4%–6% of extranodal lymphomas. Compared with systemic NHL, the prognosis is poor (1–3). High-dose methotrexate combined with high-dose cytarabine followed by whole-brain radiotherapy is currently considered standard treatment (4). Regarding baseline risk stratification, 2 scoring systems have been proposed: that of the International Extranodal Lymphoma Study Group (0–1, 2–3, and 4–5 points), which is based on serum lactate dehydrogenase, age, Karnofsky performance score (KPS), involvement of deep brain structures, and cerebrospinal fluid protein concentration (5), and that of the Memorial Sloan-Kettering Cancer Center, which distinguishes 3 prognostic groups based on age and KPS (6). During the last few years, several other factors such as serologic markers, tumor characteristics, and pharmacokinetic parameters of methotrexate have been proposed to potentially identify risk groups (7–11), but most of these findings still lack external validation from larger cohorts.
The use of 18F-FDG PET has been extensively validated for baseline staging, interim response assessment, and posttherapy evaluation in systemic Hodgkin lymphoma and extracranial NHL (12–14). Increased 18F-FDG uptake by malignancies reflects increased carrier-mediated transport into the cell by glucose transporter 1 and phosphorylation of 18F-FDG to FDG-6-phosphate by hexokinase inside the cell. Furthermore, it was demonstrated that high 18F-FDG uptake is associated with high proliferative activity (assessed by Ki-67 immunostaining) (15) and poor patient outcome (16). In line with this, 18F-FDG uptake is on average higher in aggressive NHL than in indolent NHL (17). However, 18F-FDG uptake in aggressive NHL has also been found to be quite variable and partly overlapping with indolent NHL, raising the question of to what extent 18F-FDG PET might be useful in identifying patients with better or worse clinical course and prognosis. The use of 18F-FDG PET for the management of primary CNS lymphoma has not been systematically investigated so far. However, it has been suggested that 18F-FDG PET scans may be helpful for exclusion of systemic lymphoma involvement (18–21). Recently, Kawai et al. showed that a high maximum standardized uptake value (SUVmax) for 18F-FDG was associated with decreased progression-free survival (PFS) and overall survival (OS) in univariate analyses (22). However, the number of primary CNS lymphoma patients evaluated in that study was small (n = 17), and the use of 18F-FDG PET has not yet been recommended in the evaluation of primary CNS lymphoma at diagnosis, during treatment, or during follow-up (23). The present study investigated the potential predictive value of pretreatment 18F-FDG PET regarding tumor response, PFS, and OS in primary CNS lymphoma. 
In addition to measurements of 18F-FDG SUVmax, we propose a simple metabolic imaging lymphoma aggressiveness scale (MILAS) to visually assess primary CNS lymphoma metabolism as a marker of clinical aggressiveness.
MATERIALS AND METHODS
Patient Selection Criteria
Eligibility criteria for inclusion into this monocentric retrospective analysis were biopsy-proven primary CNS lymphoma, exclusion of systemic lymphoma manifestation by CT body scans and bone marrow examination, exclusion of HIV and Epstein–Barr virus infection, a pretreatment baseline 18F-FDG PET scan (acquired on the same scanner for all patients), and an MR scan of the brain before the start of any chemotherapy. During 2002–2009, 107 patients underwent PET before or during treatment; of those, 67 underwent scanning before the initiation of chemotherapy. Further patients were excluded either because they underwent PET on a different scanner (PET/CT) or because they underwent 11C-methionine PET. A final total of 42 patients remained in our dataset for analysis, all of whom underwent 18F-FDG PET on the same scanner before initiation of any chemotherapy. All patients provided written informed consent for institution-initiated research studies and specifically for analyses of clinical outcome studies, in conformance with the guidelines of our institutional review board.
Lymphoma Response Assessment
Baseline examination before treatment and response assessments during treatment and during follow-up were performed using contrast-enhanced brain MR imaging. The scans were analyzed by experienced board-certified neuroradiologists. For the present analysis, we used response assessments as documented in clinical routine (i.e., no additional retrospective MR imaging readings were performed).
PET Examinations
All PET examinations were performed using the same ECAT EXACT 922/47 scanner (Siemens/CTI). During the study period, the ECAT EXACT scanner underwent quality control testing according to the manufacturer specification. This testing included a daily check for detector sensitivity based on the transmission blank scan. Twice a year, a check for detector homogeneity and quantification was done and, if necessary, the system was normalized and cross-calibrated. Throughout the study period, no major repairs to the detector system were needed.
A 15-min emission scan consisting of three 5-min frames was acquired at 79.5 ± 26.9 min after intravenous injection of 366 ± 55 MBq of 18F-FDG in patients who were resting with their eyes closed in a room that had reduced ambient noise. Datasets were reconstructed by filtered backprojection (Shepp filter, 5 mm in full width at half maximum) with subsequent calculated attenuation correction. Further PET data analyses were done using a commercial software package (PMOD, version 3.2; PMOD Technologies Ltd.). Individual PET emission frames were automatically corrected for possible minor head movements. The summed, realigned PET dataset was then coregistered to the individual MR scan (contrast-enhanced T1-weighted scan in most cases; time gap from PET to MR imaging, maximum of 18 d). Tumor mean SUV and SUVmax (that is, regional 18F-FDG radioactivity concentration normalized by injected 18F-FDG dose per body weight) were assessed by volume-of-interest analyses using an automatic isocontour definition (80% of tumor maximum). Only results concerning SUVmax are presented in the present work, since mean SUV provided no superior information (data not shown).
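The SUV definition and the 80% isocontour rule described above can be illustrated with a minimal sketch. This is not the PMOD implementation; the function names, the voxel values, and the 70-kg body weight are hypothetical, and a tissue density of 1 g/mL with decay-corrected concentrations is assumed.

```python
def suv(concentration_kbq_per_ml, injected_dose_mbq, body_weight_kg):
    """Standardized uptake value: tissue concentration / (injected dose / body weight).

    Assumes decay-corrected concentration and a tissue density of 1 g/mL.
    """
    dose_kbq = injected_dose_mbq * 1000.0
    weight_g = body_weight_kg * 1000.0
    return concentration_kbq_per_ml / (dose_kbq / weight_g)


def isocontour_voi(voxels, fraction=0.8):
    """Keep the voxel values at or above `fraction` of the maximum value."""
    threshold = fraction * max(voxels)
    return [v for v in voxels if v >= threshold]


# Hypothetical tumor voxel concentrations in kBq/mL (not patient data).
tumor_voxels = [12.0, 18.5, 25.0, 31.0, 36.0, 40.0]
voi = isocontour_voi(tumor_voxels)  # threshold is 0.8 * 40.0 = 32.0

# 366 MBq is the mean injected dose reported above; 70 kg is an assumed weight.
suv_max = suv(max(voi), injected_dose_mbq=366.0, body_weight_kg=70.0)
suv_mean = suv(sum(voi) / len(voi), injected_dose_mbq=366.0, body_weight_kg=70.0)
```

Because dose and body weight enter only as a common scaling factor, any error in either shifts all SUVs of a patient by the same proportion.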
In addition, MILAS was used to visually assess primary CNS lymphoma metabolism by means of a simple, custom-made 10-step color scale. The upper threshold of this MILAS color scale was individually adjusted to display physiologic 18F-FDG uptake of the cerebellum (reference region) as white (i.e., 10%–20% of maximum uptake; grade 1). 18F-FDG uptake below cerebellar uptake was color-coded as black (i.e., <10% of uptake maximum; grade 0), whereas 18F-FDG uptake above cerebellar uptake was color-coded in 8 discrete steps (i.e., 20%–30% of maximum, 30%–40% of maximum, and so forth, corresponding to MILAS scores of 2–9; Fig. 1). Maximum tumor 18F-FDG uptake in terms of MILAS score was then assessed by visual inspection (i.e., maximum tumor 18F-FDG uptake was scored according to its level on the MILAS color scale). Thus, MILAS scores linearly reflected maximum 18F-FDG uptake of the tumor relative to cerebellar 18F-FDG uptake, whereby 1 step of MILAS increase corresponded to about two thirds of cerebellar 18F-FDG uptake (e.g., a MILAS score of 4 implied that maximum 18F-FDG uptake of the tumor was about 200% higher than cerebellar 18F-FDG uptake). If the tumor could not be identified properly by inspection of the PET scan alone (i.e., tumor uptake close to physiologic brain uptake), the coregistered individual MR scan was used for precise tumor localization. In cases in which primary CNS lymphoma involved the cerebellum, we used either the contralateral cerebellar hemisphere (lateralized involvement) or a tumor-free cerebellar region (bilateral involvement) as the reference region.
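The scale logic above can be sketched numerically, although in the study MILAS was read visually from the color scale. The sketch assumes that cerebellar uptake sits at the midpoint of the 10%–20% band (15% of the scale maximum), which is consistent with one MILAS step corresponding to about two thirds of cerebellar uptake; the function name and this midpoint are illustrative assumptions.

```python
def milas_score(tumor_max_uptake, cerebellar_uptake, reference_fraction=0.15):
    """Map maximum tumor uptake to a MILAS grade (0-9).

    The color-scale maximum is chosen so that cerebellar uptake lies at
    `reference_fraction` of it (assumed midpoint of the 10%-20% band).
    Each grade spans 10% of the scale maximum.
    """
    if tumor_max_uptake < 0 or cerebellar_uptake <= 0:
        raise ValueError("uptake values must be positive")
    # Tumor maximum expressed as a fraction of the color-scale maximum.
    fraction_of_scale = reference_fraction * tumor_max_uptake / cerebellar_uptake
    # Grade = index of the 10% band; values above the scale maximum saturate at 9.
    return min(9, int(fraction_of_scale * 10))


# A tumor with 3 times the cerebellar uptake (200% higher) falls in the 40%-50% band.
example_grade = milas_score(tumor_max_uptake=45.0, cerebellar_uptake=15.0)
```

Since only the tumor-to-cerebellum ratio enters the calculation, the units of the two uptake values are irrelevant as long as they match.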
FIGURE 1. Representative 18F-FDG PET scans of 4 patients with cerebral lymphoma, coregistered to individual MR scans. First row: individual T1-weighted MR scans show strong contrast enhancement in lymphoma (marked by cross hairs; MR imaging gray level is adjusted for optimal display). Second row: individual 18F-FDG PET scans displayed using hot-metal color scale; individual maxima were adjusted for optimal display (i.e., 40, 25, 35, and 35 kBq/mL in first through fourth patients, respectively [from left to right]). Third row: 18F-FDG PET scans displayed using proposed MILAS color scale; individual maxima were adjusted to color-code physiologic 18F-FDG uptake of cerebellum (reference region) in white (i.e., 145, 90, 105, and 110 kBq/mL in first through fourth patients, respectively). Semiquantitative MILAS scores can be assessed simply by rating maximum 18F-FDG uptake according to its color level (i.e., 1, 2, 4, and 6 in first through fourth patients, respectively).
To assess interrater reproducibility and to strengthen diagnostic reliability, 2 investigators who were unaware of the patients’ outcomes independently evaluated tumor 18F-FDG uptake. The consensus of both investigators (achieved in a third, joint reading in cases with discrepant MILAS scores) was used for further analysis.
Statistical Analysis
We used logistic regression to investigate the association between MILAS and tumor response. The best documented response under first-line therapy, as evaluated by brain MR imaging, was used for analysis. Patients were categorized according to response: complete remission, partial remission, responders (complete remission + partial remission), and nonresponders (stable disease + progressive disease). We used linear regression for exploratory correlation of SUVmax and MILAS, and the assumption of linearity was investigated by graphical inspection of the model residuals. OS (time from diagnosis to death) and PFS (time from diagnosis to progression [under therapy], relapse, or death, whichever came first) were estimated using the Kaplan–Meier method. Unadjusted survival probabilities were compared using the log-rank test. Follow-up was estimated using the inverse Kaplan–Meier method (24). To investigate the prognostic values of MILAS and SUVmax on PFS and OS, we used multivariable Cox regression modeling. As potential confounders, we included age and KPS at diagnosis (both as continuous variables), which have been reported to be the strongest predictive factors found so far (6,25). The assumption of proportional hazards was formally tested.
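As an illustration of the Kaplan–Meier method named above, the following minimal sketch computes the product-limit estimate and the median survival time. It is not the R or STATA code used in the study; the function names and the example follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct event time.

    `events[i]` is True if subject i had the event at `times[i]`,
    False if the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects sharing this time point.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up times in months (not patient data); False = censored.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
median = median_survival(curve)
```

A return value of `None` corresponds to a median that is reported as "not reached".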
Because the functional relationships between the continuous measurements of SUVmax/MILAS and PFS/OS were a priori unclear, we used multivariable fractional polynomials within the Cox procedure. This approach allowed modeling of possible nonlinearity in the relationship between the outcome and continuous predictors by estimating smooth functions in a multivariable context (26). Accordingly, the exploratory SUVmax/MILAS cutoffs chosen to illustrate unadjusted PFS/OS rates in the Kaplan–Meier curves were based on inspection of the fractional polynomial regression plots (supplemental Figs. 1–4; supplemental materials are available online only at http://jnm.snmjournals.org). For sensitivity analysis (only for MILAS), we stratified the Cox models by therapy modality (high-dose chemotherapy followed by autologous stem cell transplantation [HCTASCT] and no-HCTASCT). All statistical tests were 2-sided, and a P value of less than 0.05 was considered significant. Weighted κ-statistics were calculated to assess interrater agreement in terms of MILAS. Statistical analyses were performed using the program R, version 2.14.0 (The R Project for Statistical Computing, www.r-project.org), and STATA, version 12.1 (STATA Corp.).
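The weighted κ-statistic mentioned above can be sketched as follows. The text does not state the weighting scheme, so linear disagreement weights are assumed here; the function name and the example ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Cohen's weighted kappa with linear weights (assumed weighting scheme).

    Ratings are integers in 0 .. n_categories - 1.
    kappa = 1 - (observed weighted disagreement) / (expected weighted disagreement)
    """
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("rating lists must be non-empty and of equal length")
    k = n_categories
    p_a = [0.0] * k  # marginal proportions of rater A
    p_b = [0.0] * k  # marginal proportions of rater B
    observed = 0.0
    for a, b in zip(ratings_a, ratings_b):
        p_a[a] += 1.0 / n
        p_b[b] += 1.0 / n
        observed += abs(a - b) / (k - 1) / n
    # Disagreement expected by chance from the two raters' marginals.
    expected = sum(
        abs(i - j) / (k - 1) * p_a[i] * p_b[j]
        for i in range(k)
        for j in range(k)
    )
    if expected == 0.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed / expected


# Hypothetical scores from two readers on a 3-step scale (not study data).
kappa = weighted_kappa([0, 1, 2, 2], [0, 1, 1, 2], n_categories=3)
```

For MILAS, `n_categories` would be 10 (grades 0–9), so a disagreement of one grade is penalized far less than a disagreement of several grades.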
RESULTS
Patients’ Characteristics and Treatment
Patients’ characteristics and first-line treatment regimens are summarized in Table 1. We identified 42 eligible patients who were diagnosed with primary CNS lymphoma and treated at our center between November 2002 and October 2009. Treatment was according to 3 different high-dose methotrexate–based protocols, all of which have been described previously (27–30). Of those, the majority additionally included HCTASCT for first-line therapy, but this treatment approach was primarily for younger (<65 y) and physically less compromised patients. Only 1 of 21 patients younger than 65 y did not receive HCTASCT. Most elderly patients were treated with methotrexate, lomustine, and procarbazine alone or in combination with rituximab. Before the start of treatment, all patients were on oral steroids, but these were tapered as soon as chemotherapy was initiated.
TABLE 1. Patients’ Basic Characteristics
18F-FDG Uptake and Tumor Response
Representative 18F-FDG PET scans displayed by a widely used hot-metal color scale (maximum threshold set for optimal illustration) and the proposed MILAS color scale (maximum threshold set to code cerebellar 18F-FDG uptake in white) are shown in Figure 1 (coregistered with the individual MR scans). The MILAS scale allowed for a simple semiquantitative assessment of maximum 18F-FDG uptake in each case (examples of MILAS scores 1, 2, 4, and 6 are shown in Fig. 1).
The distribution of patients, SUVmax, response status, median PFS, and OS according to the MILAS scores is summarized in Table 2. The calculated weighted κ-value between the 2 independent MILAS readings was 0.78, which can be regarded as substantial agreement for the MILAS scoring by the 2 investigators (31). By consensus, most patients were categorized in MILAS groups 1–4; overall response rate (complete remission + partial remission) was 88%. Five patients were categorized as nonresponders (stable disease, n = 3; progressive disease, n = 2), and 2 had a missing response status. The results of the logistic regression did not show a significant relation between MILAS and tumor response (for complete remission: odds ratio of 0.75, 95% confidence interval [CI] of 0.51–1.11; for complete remission + partial remission: odds ratio of 0.78, 95% CI of 0.49–1.23). As shown in Figure 2, SUVmax correlated with MILAS (P < 0.0001, R2 = 0.4734), and the residuals of the linear fit showed a normal distribution, thus supporting an approximately linear relationship between MILAS and SUVmax. We also calculated the ratio of tumor SUVmax to the mean SUV of the cerebellum. As expected, this uptake ratio correlated excellently with MILAS (R2 = 0.93) and yielded quite similar results. However, for the sake of simplicity and conciseness, we focused on MILAS.
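The coefficient of determination reported for the linear fit between MILAS and SUVmax can be illustrated with a minimal least-squares sketch. The function name and the example values are hypothetical; they are not the study data and do not reproduce the reported R2.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = intercept + slope * x and its R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R squared equals the squared Pearson correlation.
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical MILAS grades and SUVmax values (not patient data).
slope, intercept, r_squared = linear_fit_r2([1, 2, 3], [1.0, 3.0, 2.0])
```

An R2 near 0.47 means that roughly half of the variance in SUVmax is explained by MILAS, whereas the tumor-to-cerebellum ratio (R2 = 0.93) tracks MILAS much more closely.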
TABLE 2. Distribution of 42 Patients, Best Remission Status After First-Line Therapy, PFS, and OS According to MILAS
FIGURE 2. Correlation between MILAS and SUVmax; R2 = 0.4734.
Survival Analysis
After a median follow-up of 53 mo (range, 17–97 mo), 25 patients experienced an event regarding PFS (progression in 4, relapse in 2, and death in 19). Respective 2- and 5-y PFS rates were 52% (95% CI, 39–70) and 32% (95% CI, 18–56); corresponding OS rates were 64% (95% CI, 51–81) and 50% (95% CI, 35–71). The results of the fractional polynomial regression analyses suggested a linear relationship between the predictors (MILAS and SUVmax) and the 2 endpoints PFS and OS. Based on the inspection of the fractional polynomial regression plots (supplemental Figs. 1–4), we compared unadjusted PFS and OS probabilities based on MILAS and SUVmax (Figs. 3A–3D). A cutoff at MILAS 3 was a good separator with respect to PFS (median PFS: 54.7 mo [score ≤ 3], 3.8 mo [score > 3], P = 0.0272) and also showed a trend regarding OS (median OS: not reached [score ≤ 3], 13.8 mo [score > 3], P = 0.131). Grouping by SUVmax revealed similar, nonsignificant trends toward shorter PFS and OS in patients with higher SUVmax (median PFS: 35.3 mo [SUVmax ≤ 16], 3.2 mo [SUVmax > 16], P = 0.178; median OS: not reached [SUVmax ≤ 16], 13.8 mo [SUVmax > 16], P = 0.106).
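The unadjusted group comparisons above rely on the log-rank test. A minimal two-group sketch is given below; it is not the software routine used in the study, and the function name and example data are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, P value).

    At each distinct event time, the observed number of events in group A is
    compared with the number expected if both groups shared the same hazard.
    """
    subjects = [(t, e, 0) for t, e in zip(times_a, events_a)]
    subjects += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in subjects if e})
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for s in subjects if s[0] >= t)                    # at risk, both groups
        n_a = sum(1 for s in subjects if s[0] >= t and s[2] == 0)    # at risk, group A
        d = sum(1 for s in subjects if s[0] == t and s[1])           # events, both groups
        d_a = sum(1 for s in subjects if s[0] == t and s[1] and s[2] == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("test statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical data in months (not patient data): group A fails early, group B late.
chi2, p_value = logrank_test(
    [1, 2, 3, 4, 5], [True] * 5,
    [6, 7, 8, 9, 10], [True] * 5,
)
```

With small groups, as in the subgroups above the cutoffs, the test has limited power, which is one reason clear differences in median survival can remain nonsignificant.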
FIGURE 3. (A) PFS grouped by MILAS (≤3 vs. >3). (B) OS grouped by MILAS (≤3 vs. >3). (C) PFS grouped by SUVmax (≤16 vs. >16). (D) OS grouped by SUVmax (≤16 vs. >16).
The results of the final multivariable Cox regression analyses are summarized in Table 3. Increasing MILAS was significantly associated with worse PFS and OS after adjustment for age and KPS. These results were also consistent in the sensitivity analysis in which we stratified the Cox procedure according to the therapy applied (HCTASCT and no-HCTASCT) (Table 4). Age had no impact in this stratified analysis, because the decision on whether to treat patients with the HCTASCT approach was based mainly on age (cutoff at 65 y); therefore, the distribution of age in the stratified Cox model was roughly either above or below 65. SUVmax was not of predictive value in the multivariable analysis (Table 3).
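To make the per-step hazard ratios easier to interpret, the following sketch shows how they compound when MILAS enters the Cox model as a continuous, linear term. It uses only the point estimates given in the abstract (1.49 for PFS, 1.43 for OS) and ignores their confidence intervals; the function name is hypothetical.

```python
def hazard_ratio_for_difference(per_step_hr, steps):
    """Hazard ratio between two patients whose scores differ by `steps`.

    In a Cox model with a linear term, the log hazard ratio scales with the
    difference in the predictor, so the hazard ratio is per_step_hr ** steps.
    """
    return per_step_hr ** steps


# Point estimates from the abstract; a 3-step MILAS difference is an arbitrary example.
pfs_hr_3_steps = hazard_ratio_for_difference(1.49, 3)
os_hr_3_steps = hazard_ratio_for_difference(1.43, 3)
```

Under these assumptions, a patient scoring 3 MILAS steps higher than another would have roughly a threefold hazard of progression or death, all else being equal.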
TABLE 3. Multivariable Cox Regression Analyses of Prognostic Impact of MILAS and SUVmax on PFS and OS Adjusted for Age and KPS
TABLE 4. Multivariable Cox Regression Analyses, Stratified by Therapy Modality (HCTASCT and no-HCTASCT), of Prognostic Impact of MILAS on PFS and OS Adjusted for Age and KPS
DISCUSSION
The present study identified 18F-FDG uptake in primary CNS lymphoma at baseline evaluation as an independent predictor for PFS and OS. To the best of our knowledge, this is the largest series of primary CNS lymphoma patients in which the prognostic role of pretreatment 18F-FDG PET has been investigated in a multivariable fashion.
SUVmax calculations are cumbersome and potentially error-prone because of factors such as errors in region-of-interest definition, body-weight assessment, and injected-dose calculation (requiring careful decay correction and cross-calibration between well counter and PET camera). We therefore decided to propose a simple visual rating system to assess primary CNS lymphoma metabolism (i.e., MILAS) in addition to SUVmax. The use of MILAS relies on the common assumption that cerebellar 18F-FDG uptake can be applied as an internal reference to semiquantitatively assess relative tumor–to–reference region metabolism. This method circumvents the need for the aforementioned error-prone steps of SUV calculation since changes in these variables affect regional 18F-FDG uptake in lymphoma and the cerebellum to a comparable extent (i.e., they cancel each other out). This also explains why MILAS readings and SUVmax are not perfectly, but still reasonably, correlated (R2 = 0.4734) and did not provide identical results although both outcome measures yield an estimate of maximal 18F-FDG uptake. We expected that by virtue of the internal standardization (use of a reference region), MILAS would be a more robust approach and better suited than SUVmax for interindividual comparisons. In line with this expectation, the noisier SUVmax provided similar results, albeit failing to reach statistical significance (likely because of the limited number of patients). This suggests that MILAS is a better tool for individual risk assessment by 18F-FDG uptake than is SUVmax alone. Of note, the present study was also in line with earlier works showing that carefully standardized, semiquantitative visual readings may yield diagnostic accuracy comparable to that of region-of-interest analyses in brain imaging studies (32,33).
Besides the inferior OS probability for patients with a high 18F-FDG SUVmax, Kawai et al., like us, found no clear relationship between tumor response and 18F-FDG SUVmax (22). Regarding our multivariable survival analysis, the estimated unfavorable hazard ratios for MILAS on PFS/OS seemed to be quite robust, because the sensitivity analysis also yielded significantly inferior survival in patients with higher MILAS. However, although the relationship between increased 18F-FDG uptake as expressed by MILAS and decreased survival probability was in line with previous findings (22), our results still warrant external validation. Also, although the model output suggested a linear functional relationship between MILAS and PFS/OS, it may be that a possible nonlinear relationship was not detected because of limited statistical power. In addition, the MILAS cutoff at 3 needs to be considered exploratory, because it was not prespecified and was chosen only by simple inspection of the fractional polynomial regression plots. We propose that future research on the association between 18F-FDG uptake as a continuous measure and outcome should therefore focus primarily on the functional relationship before testing and claiming certain cutoffs.
One might assume that the correlation between MILAS and PFS/OS would be reflected in the tumor response (higher MILAS correlating with poor response), but such was not the case. One explanation could be that the main difficulty in the treatment of primary CNS lymphoma is not only the achievement of a response per se but the achievement of a durable response over time, because up to 60% of the patients who achieve a complete remission will relapse in the first 2 y (34). Thus, our findings suggest a positive relationship between 18F-FDG uptake and the overall aggressiveness of the course of disease. It would be interesting to investigate the association between 18F-FDG uptake and the histologic grade of primary CNS lymphoma; however, a grading classification such as that used in glioma does not exist yet (35). Because of the lack of 18F-FDG PET data during the course of treatment, we cannot make a statement about the possible value of interim 18F-FDG PET scans in primary CNS lymphoma. However, for baseline risk assessment, 18F-FDG PET could be a useful tool, particularly because of some evidence that classic risk factors such as age or performance status are no longer of prognostic value. This applies especially to patients who receive more aggressive treatment approaches such as HCTASCT (36).
Regarding systemic NHL, Watanabe et al. recently reported a strong correlation between the Ki-67 proliferation index and the SUVmax of the biopsy site and proposed SUVmax as a useful predictor for the proliferation potential of NHL (37). Transferring these findings to primary CNS lymphoma would be difficult since primary CNS lymphoma cells usually have an extremely high proliferation rate (35).
The scoring systems of the International Extranodal Lymphoma Study Group and the Memorial Sloan-Kettering Cancer Center are those usually applied for baseline risk stratification in primary CNS lymphoma. Because our population was small, we had to balance between overfitting of the Cox model and precision of the effect estimates. Therefore, we kept only age and KPS as baseline risk-stratifying factors in our analysis. In a recent analysis of 174 elderly patients with primary CNS lymphoma, no impact on OS was shown for serum lactate dehydrogenase, deep brain structure involvement, and cerebrospinal fluid protein concentration (factors needed to calculate the International Extranodal Lymphoma Study Group score); nevertheless, a KPS of 70% or more was still the strongest predictor of OS (25). Interestingly, KPS had no prognostic impact in our analysis. Although several other factors such as tumor-specific characteristics, serum markers, or methotrexate pharmacokinetics have been proposed as potential risk factors, the basic weakness of these analyses lies in their mostly retrospective nature, lack of external validation, and small patient numbers due to the rarity of the disease. A large multicenter study would be needed to establish a prospective dataset to refine the present scores and probably to establish new risk factors (including 18F-FDG PET).
Our study had 2 major limitations: its retrospective, single-center design and its relatively small number of patients. The fact that we included only patients who actually underwent pretreatment 18F-FDG PET implies a risk of patient selection bias with concomitant effects on risk factor analyses. However, by considering age and KPS in our multivariable analysis and conducting a sensitivity analysis based on treatment modality, we tried to minimize this potential bias. Regarding the second limitation, this series of 42 primary CNS lymphoma patients was small in absolute patient numbers and event numbers. Therefore, the power of our analysis was limited and the estimated hazard ratio for MILAS might be slightly overestimated.
CONCLUSION
Our data show the potential outcome-predicting ability of pretreatment 18F-FDG PET in primary CNS lymphoma patients. The inverse relationship between 18F-FDG uptake and survival is in line with the notion that high lymphoma aggressiveness is usually associated with high 18F-FDG uptake. With PET and PET/CT now being broadly available, we propose that pretreatment 18F-FDG PET be included in the baseline evaluation of primary CNS lymphoma not only for exclusion of systemic disease but also for further investigation of its prognostic role.
DISCLOSURE
The costs of publication of this article were defrayed in part by the payment of page charges. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734. No potential conflict of interest relevant to this article was reported.
Footnotes
- Received for publication May 14, 2012.
- Accepted for publication August 29, 2012.
- Published online Dec. 18, 2012.
- © 2013 by the Society of Nuclear Medicine and Molecular Imaging, Inc.