Abstract
Interim 18F-FDG PET (after 1–4 cycles of chemotherapy) may be useful for tailoring a risk-adapted therapeutic strategy in lymphoma. The purpose of this study was to investigate whether semiquantification of standardized uptake values (SUVs) may help to improve the prognostic value of 18F-FDG PET, compared with visual analysis, after 4 cycles of chemotherapy. Methods: In a previous report, we showed that a 65.7% reduction in maximal SUV (SUVmax) between baseline (PET0) and 2 cycles of chemotherapy (PET2) better predicted event-free survival in 92 prospective patients with diffuse large B-cell lymphoma, by reducing false-positive interpretation of visual analysis. Eighty patients also underwent 18F-FDG PET after induction had been completed, at 4 cycles of chemotherapy (PET4). Images were interpreted visually (as negative or positive) and by computing the optimal percentage of SUVmax reduction between PET0 and PET4. Survival curves were estimated using Kaplan–Meier analysis and compared using the log-rank test. Median follow-up was 41 mo. Results: With visual analysis, the 2-y estimate for event-free survival was 82% in the PET4-negative group, compared with 25% in the PET4-positive group (P < 0.0001, accuracy of predicting event-free survival, 81.3%). An optimal cutoff of 72.9% SUVmax reduction from PET0 to PET4 yielded a 2-y estimate for event-free survival of 79% in patients with reduction of more than 72.9%, versus 32% in those with reduction of 72.9% or less (P < 0.0001; accuracy of predicting event-free survival, 77.5%). Conclusion: Although SUV semiquantification helps reduce false-positive interim 18F-FDG PET interpretations at 2 cycles, its performance is equivalent to visual analysis at 4 cycles, when most of the therapeutic effect has occurred upstream. This approach may be useful for objectively tailoring consolidation strategies.
During the past decade, PET with 18F-FDG has been revealed to be a powerful tool for monitoring response to therapy in most lymphomas (1–5). Recent studies have also stressed that the use of PET to assess response early, during the very first treatment cycles, can indicate chemosensitivity and may help to tailor therapeutic strategies for diffuse large B-cell lymphoma (6–8) and classic Hodgkin disease (6,9,10), depending on an individual patient's risk.
Assessment of response by PET relies mostly on visual analysis, which is subject to the dichotomous interpretation of an observer or a panel of observers. Besides, recently revised interpretation criteria (11) are adapted to assess response at the end of therapy but may not be adapted to assess response early during the course of treatment, because minimal residual uptake often persists, leading to false-positive interpretations (12).
In a recent study, we showed that semiquantification of 18F-FDG uptake, using standardized uptake value (SUV), was helpful in reducing false-positive interpretations after 2 cycles of first-line chemotherapy (13,14). By computing the percentage of maximal SUV (SUVmax) reduction between baseline and 2 cycles, we found that an optimal cutoff of 65.7% SUVmax reduction better separated patients with favorable outcomes (reduction > 65.7%; event-free survival, 79%) from those with poor outcomes (reduction ≤ 65.7%; event-free survival, 21%; P < 0.0001), compared with visual analysis (13).
The present study investigated whether SUV-based assessment of response may also help improve the prognostic value of interim PET after 4 cycles of chemotherapy (at the end of induction treatment), compared with visual analysis, in a subset of 80 patients with diffuse large B-cell lymphoma and with a median follow-up of 41 mo.
MATERIALS AND METHODS
Patients
The study population initially consisted of 92 prospective patients with a newly diagnosed diffuse large B-cell lymphoma (13). These patients were enrolled in a multicenter trial involving 4 departments of hematology of the Assistance Publique-Hôpitaux de Paris between January 2000 and December 2005. The primary objective was to assess the prognostic value of early PET after 2 cycles of induction chemotherapy (8). The study was approved by our institutional review board, and all patients gave informed written consent. Among the 92 patients, complete attenuation-corrected raw data were also available in 80 patients after 4 cycles of chemotherapy (7 scans not performed, 5 scans not readable on the optical disks). Patient characteristics and treatment regimens are summarized in Table 1. Importantly, treatment strategy was planned at inclusion, according to age, International Prognostic Index, and anthracycline-based protocols currently active at that time and was not influenced by PET results.
18F-FDG PET
Patients underwent serial PET: before chemotherapy onset (PET0), after 2 cycles (PET2), and, in 80 patients, after 4 cycles (PET4), with a median interval of 18 d after the first day of the fourth cycle (range, 6–50 d). The delay before PET4 was due to limited access to the Assistance Publique-Hôpitaux de Paris PET center, which was shared by 5 hospitals, and to the priority that was given to PET2 assessment. Images were acquired on a dedicated C-PET camera (ADAC) for the first 69 patients who underwent PET4 (81 ± 7 min after injection of 2 MBq of 18F-FDG per kilogram) and then on a Gemini PET/CT system (Philips) in the last 11 patients (65 ± 8 min after injection of 5 MBq of 18F-FDG per kilogram). Image acquisition parameters and reconstruction methods have been described in detail in a previous publication (13). All patients also underwent concurrent diagnostic CT of the chest, abdomen, and pelvis within a week of each PET examination and then every 6 mo for follow-up, based on the International Workshop Criteria (15). Outcome was analyzed without consideration of the PET results.
Visual Analysis of 18F-FDG Uptake
PET images were analyzed by a consensus of 2 experienced observers who were unaware of the clinical, radiologic, and follow-up data. All foci were scored for their extent and intensity using a 3-point scale (1 = low, 2 = moderate, 3 = high). Extent was scored within each lymphatic area, organ, or skeletal region depending on the number of nodes or volume involved; intensity was scored by comparison with surrounding tissues after upper thresholding of the data in order to have the liver activity at around 30% of the gray scale. Then, PET4 images were scored as negative or positive by comparison with baseline PET, according to custom interpretation criteria derived from Mikhaeel et al. (16) and successfully applied in previous analyses (8,13). Negative was defined either as no residual abnormal uptake or as a residual site with an extent score of 1 and an intensity score of 1 when all other previously hypermetabolic sites were extinguished. Positive was defined either as at least 1 residual site with an extent score of 1 and an intensity score of 2 or as 2 or more residual sites with any extent and intensity scores.
In a second interpretation, PET4 images were scored as negative or positive according to the recently revised interpretation criteria (Juweid criteria) (11). Briefly, these criteria slightly differ from ours, as uptake in a residual mass 2 cm or larger must be compared with the mediastinal blood pool, and uptake in a residual mass smaller than 2 cm must be compared with the surrounding background. All PET4 scans were reviewed by the 2-observer consensus, and interpretation was modified depending on residual mass sizes seen on the concurrent CT scans. Specific criteria for defining PET positivity in the liver, spleen, lung, and bone marrow were applied when needed (11).
SUV-Based Assessment of 18F-FDG Uptake
For each PET dataset, the tumor with the most intense 18F-FDG uptake among all foci was carefully identified relying on a graded color-scale that used red to indicate the maximal count. A volumetric region of interest encompassing the entire tumor was drawn to ensure correct identification of the maximal count. SUVmax was calculated and normalized to body weight using the following formula:
$$\mathrm{SUV_{max}} = \frac{\text{maximal activity concentration in the region of interest}}{\text{injected activity} \,/\, \text{body weight}} \qquad \text{(Eq. 1)}$$
where activity was decay-corrected from the delay between injection and image acquisition.
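As an illustration of a body-weight-normalized SUV with decay correction, the following Python sketch may be helpful. It is not the software used in this study: the function name, the units (kBq/mL, MBq, kg), the assumption that 1 mL of tissue weighs 1 g, and the choice to decay-correct the injected activity to the acquisition time are all assumptions made here for illustration.

```python
# Physical half-life of fluorine-18, in minutes.
F18_HALF_LIFE_MIN = 109.77


def suv_body_weight(conc_kbq_per_ml, injected_mbq, weight_kg, delay_min):
    """Body-weight-normalized SUV (hypothetical helper, not from the paper).

    conc_kbq_per_ml : activity concentration measured in the region of interest
    injected_mbq    : activity injected, measured at injection time
    weight_kg       : patient body weight
    delay_min       : time between injection and image acquisition
    """
    # Decay-correct the injected activity to the time of acquisition.
    decay_factor = 2.0 ** (-delay_min / F18_HALF_LIFE_MIN)
    injected_kbq_at_scan = injected_mbq * 1000.0 * decay_factor
    # Assumes a tissue density of 1 g/mL, so that kBq/mL equals kBq/g.
    weight_g = weight_kg * 1000.0
    return conc_kbq_per_ml / (injected_kbq_at_scan / weight_g)


# Illustrative values only: 70-kg patient, 140 MBq injected (2 MBq/kg),
# acquisition 81 min after injection, 5 kBq/mL in the hottest voxel.
example_suv = suv_body_weight(5.0, 140.0, 70.0, 81.0)
```

Passing the maximal voxel concentration within the region of interest gives SUVmax; a longer delay between injection and acquisition lowers the decay-corrected denominator and therefore raises the SUV for the same measured concentration.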
To assess metabolic changes during induction chemotherapy, we used the most intense tumor in any region or organ on PET4 for comparison and as the indicator for disease status, even if its location differed from the initial tumor on PET0. In cases in which all lesions had disappeared, regions of interest were drawn in the same area on PET4 as on PET0, comparing carefully slice-to-slice and ensuring that region-of-interest size was restricted to the baseline tumor. SUV reduction was calculated as follows:
$$\mathrm{SUV_{max}}\ \text{reduction}\ (\%) = \frac{\mathrm{SUV_{max}}(\mathrm{PET0}) - \mathrm{SUV_{max}}(\mathrm{PET4})}{\mathrm{SUV_{max}}(\mathrm{PET0})} \times 100 \qquad \text{(Eq. 2)}$$
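The percentage reduction and its dichotomization can be sketched in a few lines of Python. The function names are hypothetical; the 72.9% threshold is the optimal PET0-to-PET4 cutoff reported in the Results, and a reduction of exactly 72.9% is treated as unfavorable, as in the text ("72.9% or less").

```python
# Optimal cutoff for SUVmax reduction from PET0 to PET4 reported in this study.
CUTOFF_PERCENT = 72.9


def suvmax_reduction_percent(suvmax_pet0, suvmax_pet4):
    """Percentage of SUVmax reduction between baseline and interim PET."""
    if suvmax_pet0 <= 0:
        raise ValueError("baseline SUVmax must be positive")
    return 100.0 * (suvmax_pet0 - suvmax_pet4) / suvmax_pet0


def is_favorable(reduction_percent, cutoff=CUTOFF_PERCENT):
    """True when the reduction is strictly greater than the cutoff."""
    return reduction_percent > cutoff


# Illustrative values only (not patient data): SUVmax falls from 10.0 to 2.0.
reduction = suvmax_reduction_percent(10.0, 2.0)
favorable = is_favorable(reduction)
```

A negative result indicates an increase in SUVmax between the 2 scans, which falls in the unfavorable group under this rule.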
Statistical Analysis
To evaluate the prognostic value of early PET4, event-free survival and overall survival were chosen as endpoints. Follow-up was performed every 6 mo. Event-free survival was defined as the interval from the date of enrollment to the first evidence of progression or relapse or to the date of death from any cause. Data were censored if the patients were alive and free of progression or relapse at the last follow-up. Overall survival was defined as the interval from the date of enrollment to the date of death from any cause. Data were censored if the patients were alive at the last follow-up. Receiver-operating-characteristic (ROC) analysis was performed to determine an optimal cutoff for uptake on PET4 or an optimal cutoff for uptake reduction from PET0 to PET4 in predicting event-free survival (event vs. no event) and overall survival (dead vs. alive). Differences in SUVs between groups were analyzed with the unpaired Student t test, and significance was obtained when the 2-sided P value was less than 0.05. Survival curves according to visual analysis and SUV-based assessment of PET scans were obtained using Kaplan–Meier plots and were compared using the log-rank test.
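For readers less familiar with the survival estimate used here, a minimal Kaplan–Meier sketch in plain Python is given below. It is an illustration only, not the statistical software used for this analysis: the function name and the toy follow-up times are hypothetical, and the log-rank comparison between groups is not implemented.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up durations (e.g., months from enrollment)
    events : 1 if the event occurred at that time, 0 if the patient was censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # events and censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exit_counts[t]
    return curve


# Toy data (hypothetical): 6 patients, 3 events, 3 censored observations.
toy_times = [3, 5, 5, 8, 12, 12]
toy_events = [1, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the number at risk until their last follow-up and then leave the risk set without lowering the survival estimate, which is how patients who were alive and event-free at the last follow-up are handled.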
RESULTS
Patient Outcome
During a median follow-up period of 41 mo after inclusion, 55 patients were free from events (event-free survival, 68.8%) and the remaining 25 underwent an event with a median delay of 4.7 mo; in addition, 63 patients survived (overall survival, 78.8%), whereas the remaining 17 died with a median delay of 6.7 mo.
Visual Analysis and Survival Prediction
All patients demonstrated intense foci of uptake on PET0, as expected in diffuse large B-cell lymphoma. At the end of induction therapy, PET4 was interpreted as negative in 62 patients and positive in 18 using custom visual analysis. The 2-y estimate for event-free survival was 82% (95% confidence interval [CI], 72%−92%) in the former, compared with 25% (95% CI, 4%−45%) in the latter (P < 0.0001, Fig. 1A). Positive and negative predictive values, as well as accuracies, in predicting event-free survival and overall survival are reported in Table 2. Of the 18 PET4-positive patients, 14 had an event with a median delay of 4.3 mo (including 7 who showed disease progression from PET2 to PET4) and only 4 remained free of events at the last follow-up.
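The relation between these counts and the reported accuracy can be checked with a short Python sketch. The true-positive and false-positive counts are stated above; the false-negative and true-negative counts are not stated explicitly and are derived here from the 25 events and 55 event-free patients given under Patient Outcome. The helper name is hypothetical.

```python
def predictive_values(tp, fp, fn, tn):
    """Positive predictive value, negative predictive value, and accuracy."""
    total = tp + fp + fn + tn
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Custom visual analysis of PET4, event-free survival as the endpoint.
tp, fp = 14, 4   # stated: 18 PET4-positive patients, 14 of whom had an event
fn = 25 - tp     # derived: 25 patients had an event overall
tn = 55 - fp     # derived: 55 patients remained event-free overall
visual = predictive_values(tp, fp, fn, tn)
```

With these counts, the accuracy is 65/80, or about 81.3%, in agreement with the value reported in the abstract for visual analysis.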
When Juweid criteria were used, 6 patients who were PET4-negative with our custom criteria became positive, of whom 5 were false-positive (no event), leading to a subsequent reduction in positive predictive value (Table 2). The 2-y estimate for event-free survival remained unchanged in the 56 PET4-negative patients, at 82% (95% CI, 71%−93%), but increased in the 24 PET4-positive patients, to 38% (95% CI, 17%−58%, Fig. 1B).
SUV-Based Assessment and Survival Prediction
There was no statistical difference between SUVmax computed from the C-PET system and SUVmax obtained from the Gemini PET/CT system, on either PET0 or PET4 (P = 0.5 and 0.6, respectively). At baseline, SUVmax averaged 13.2 ± 4.8 in the 92 included patients (13.1 ± 4.9 in patients who had an event and 13.3 ± 4.8 in those who did not, P = 0.8). At 4 cycles, SUVmax decreased to 2.9 ± 2.7 in the 80 patients who underwent PET4, corresponding to a mean reduction of 76.7%. SUVmax reduction averaged 82.6% in the 62 PET4-negative patients, versus 56.5% in the 18 PET4-positive patients (P < 0.0001). All SUVs are displayed in Table 3.
With ROC analysis, the optimal cutoff values of SUVmax at PET4 were 2.8 for event-free survival prediction and 3.0 for overall survival prediction, with accuracies of 77.5% (area under the ROC curve, 0.720) and 81.3% (area, 0.751), respectively (Table 2). Semiquantitative analysis led to 3 additional false-positives for event-free survival prediction, compared with custom visual analysis, with subsequent alteration of survival curves (Fig. 1C).
The percentage of SUVmax reduction from PET0 to PET4 averaged 82.2% ± 8.0% in the 55 patients who remained free of disease, versus 64.7% ± 26.3% in the 25 patients who relapsed, progressed, or died (P < 0.0001). ROC analysis yielded an optimal cutoff of 72.9% SUVmax reduction at the end of induction therapy for predicting event-free survival and overall survival. The 2-y estimate for event-free survival was 79% (95% CI, 68%−89%) in the 63 patients with an SUVmax reduction greater than 72.9%, compared with 32% (95% CI, 9%−54%) in the 17 patients with an SUVmax reduction of 72.9% or less (P < 0.0001, Fig. 1D). The overall accuracy was 77.5% (area under the ROC curve, 0.719) for event-free survival prediction and 80.0% (area, 0.687) for overall survival prediction (Table 2), with slightly lower positive predictive values and negative predictive values, compared with custom visual analysis, because of 1 additional false-positive and 2 false-negative patients.
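How an optimal cutoff might be read off an ROC analysis can be sketched as follows. The selection criterion used in this study is not stated in the text; the sketch assumes the Youden index (sensitivity + specificity − 1), and the data are hypothetical values chosen only to make the example run. A patient is predicted to have an event when the SUVmax reduction is at or below the candidate cutoff.

```python
def best_cutoff(reductions, had_event):
    """Return (cutoff, youden_index) maximizing sensitivity + specificity - 1.

    reductions : percentage of SUVmax reduction for each patient
    had_event  : 1 if the patient relapsed, progressed, or died, else 0
    Both outcome classes must be present.
    """
    pairs = list(zip(reductions, had_event))
    n_event = sum(1 for _, e in pairs if e)
    n_free = len(pairs) - n_event
    best = None
    for cutoff in sorted(set(reductions)):
        # Predicted positive (event) when reduction <= cutoff.
        tp = sum(1 for r, e in pairs if r <= cutoff and e)
        tn = sum(1 for r, e in pairs if r > cutoff and not e)
        youden = tp / n_event + tn / n_free - 1.0
        if best is None or youden > best[1]:
            best = (cutoff, youden)
    return best


# Hypothetical data: 4 event-free patients and 4 patients with an event.
toy_reductions = [90, 85, 80, 75, 70, 60, 40, 88]
toy_events = [0, 0, 0, 0, 1, 1, 1, 1]
toy_cutoff, toy_youden = best_cutoff(toy_reductions, toy_events)
```

Because the cutoff is chosen on the same patients in whom it is then evaluated, the resulting accuracy tends to be optimistic, a limitation acknowledged in the Discussion.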
Influence of International Prognostic Index and Treatment Regimens
The prognostic value of interim PET, especially SUVmax reduction between PET0 and PET4 for event-free survival prediction, was independent of whether patients were of lower risk (P = 0.0004) or higher risk according to the International Prognostic Index (P = 0.0004, Fig. 2), of whether their chemotherapy regimen was based on CHOP (P = 0.02) or ACVBP/ACE (P < 0.0001) (see abbreviations in Table 1), of whether they received rituximab (P = 0.02) or not (P < 0.0001), and of whether they had consolidation by autologous stem cell transplantation (P = 0.0008) or not (P = 0.0002).
DISCUSSION
In the present study, we assessed the prognostic value of interim PET after 4 cycles of induction chemotherapy and compared different methods of interpretation, including custom visual analysis, Juweid criteria, and semiquantitative assessment, in a histologically homogeneous series of patients with diffuse large B-cell lymphoma followed up for a median of 41 mo. We emphasize that custom interpretation criteria allowing minimal residual uptake are more accurate than Juweid criteria, which were defined for end-of-therapy assessment. In addition, a glucose metabolic change of 72.9% SUVmax reduction from baseline to end of induction provides comparable results to custom visual analysis and may therefore serve as an objective measure to further guide the next step of the treatment—that is, consolidation therapy.
Visual interpretation of early PET response during induction chemotherapy (2–4 cycles) has been proven an independent prognostic indicator, compared with pretherapeutic indices such as the International Prognostic Index in diffuse large B-cell lymphoma (8) or the International Prognostic Score in Hodgkin disease (17). Indeed, metabolic imaging is an indicator of individual tumor sensitivity or resistance to the planned treatment, as opposed to pretherapeutic indices, which are population-based. However, the definition of metabolic response may be challenging because 18F-FDG uptake is a continuous variable and must be converted into a dichotomous variable—that is, positive or negative. Visual interpretation is subjective, and difficult decisions must be made by nuclear medicine physicians when minimal residual uptake persists in a previously involved lymphomatous area. In our original series (8), we chose to consider that patients with minimal residual uptake had negative PET2 findings on the basis of previous reports (16). Nevertheless, a relatively large number of patients with false-positive findings persisted. By using an SUVmax cutoff of 5.0 or an SUVmax reduction of approximately two thirds (65.7%), we were able to strongly reduce the number of false-positive interpretations (13,14). Interestingly, in the study by Gallamini et al. (17) including 260 patients with Hodgkin disease, semiquantification was helpful in defining the minimal residual uptake category—that is, an SUVmax of between 2.0 and 3.5. The authors obtained high positive and negative predictive values for progression-free survival when minimal residual uptake was considered PET2-negative. Another option would be to wait for additional cycles of chemotherapy to refine image interpretation. 
As a matter of fact, in the present study, the accuracies at 4 cycles seemed better than those at 2 cycles when visual interpretation was used, with slight overlap of CIs: 81.3% (95% CI, 72.8–89.8) versus 65.2% (95% CI, 55.5–74.9) for event-free survival and 81.3% (95% CI, 72.8–89.8) versus 68.5% (95% CI, 59.0–78.0) for overall survival (13). Indeed, 42% of patients who were PET2-positive became PET4-negative, of whom only 7.7% had an event (1 patient died after 9.1 mo), whereas 58% of patients who were PET2-positive remained PET4-positive, of whom 77.8% rapidly had an event (after an average of 4.3 mo). By contrast, all PET2-negative patients who underwent PET4 remained negative, of whom 20.0% had an event (after an average of 8.9 mo).
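The CIs quoted for these accuracies are consistent with a normal-approximation (Wald) interval for a proportion, as the Python sketch below suggests. This is an assumption about the method, which is not stated in the text; the counts of correctly classified patients (65 of 80 at PET4, 60 of 92 at PET2) are back-calculated from the reported percentages for event-free survival.

```python
import math


def wald_interval(correct, total, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = correct / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p - half_width, p + half_width


# Event-free survival, visual analysis.
low4, high4 = wald_interval(65, 80)  # PET4: 65/80, about 81.3%
low2, high2 = wald_interval(60, 92)  # PET2: 60/92, about 65.2%

# The intervals overlap when the PET4 lower bound lies below the PET2 upper bound.
intervals_overlap = low4 < high2
```

The computed bounds agree with the reported intervals to within rounding, and the small overlap between them is the reason the difference between 2 and 4 cycles is described only as an apparent improvement.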
The use of semiquantification gave results comparable to visual analysis. The use of an absolute SUVmax threshold or SUVmax reduction yielded performance comparable to the Juweid criteria but slightly lower performance than that for visual analysis using our custom criteria for event-free survival prediction (Fig. 1; Table 2). Therefore, the advantages of semiquantitative analysis are not that apparent at 4 cycles, compared with what we have reported at 2 cycles (i.e., elimination of 14 false-positive patients) (13). This is not surprising, because cytotoxic chemotherapy is thought to kill cancer cells by first-order kinetics (18). At 2 cycles, in an idealized setting (assuming no interval tumor regrowth), one would expect 99.9% reduction in the number of viable cancer cells (18). Therefore, compared with a static parameter, an index expressing metabolic reduction is expected to be more discriminating for assessment of chemosensitivity at 2 cycles than response at 4 cycles, most of the therapeutic effect having occurred upstream.
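The first-order kinetics argument can be made concrete with a small Python sketch. It assumes the idealized model described above, in which each cycle kills the same fraction of viable cells with no regrowth between cycles; the per-cycle fraction is back-calculated from the 99.9% reduction quoted at 2 cycles and is not a measured value.

```python
def surviving_fraction(kill_per_cycle, cycles):
    """Fraction of viable cells remaining after a number of cycles,
    assuming a constant fractional kill per cycle and no regrowth."""
    return (1.0 - kill_per_cycle) ** cycles


# Per-cycle kill fraction implied by a 99.9% reduction after 2 cycles.
kill_per_cycle = 1.0 - 0.001 ** 0.5

after_2 = surviving_fraction(kill_per_cycle, 2)  # about 1e-3 of baseline
after_4 = surviving_fraction(kill_per_cycle, 4)  # about 1e-6 of baseline

# Share of the baseline cell population removed by each pair of cycles.
removed_by_cycles_1_2 = 1.0 - after_2
removed_by_cycles_3_4 = after_2 - after_4
```

Under this model, the first 2 cycles remove roughly 1,000 times more of the baseline population than cycles 3 and 4, which is the sense in which most of the therapeutic effect has occurred upstream of PET4.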
Another explanation can account for the equivalent value of visual and semiquantitative analyses observed at 4 cycles. Residual 18F-FDG uptake after a few cycles of chemotherapy may stem both from persisting viable tumor and from local inflammation (4). In our series, the median delay between the fourth cycle and PET4 was somewhat longer (18 d) than the delay between the second cycle and PET2 (14 d, P < 0.001), because a little more delay was allowed before starting consolidation therapy. Therefore, local inflammation was probably present less often at 4 cycles. Nevertheless, we found it preferable to use the same visual criteria as at 2 cycles, that is, allowing minimal residual uptake in a negative scan interpretation (extent score of 1 associated with intensity score of 1). This approach was efficient because visual analysis using our custom criteria gave the highest accuracy for prediction of event-free survival and overall survival. When the Juweid criteria were used, PET4-negative interpretations were converted into PET4-positive in 6 patients, of whom 5 had a good outcome (Fig. 3). In these cases, SUVmax reduction from PET0 to PET4 was always above 72.9% (mean, 79.2% ± 4.1%). Therefore, the Juweid criteria probably need refinement for the assessment of interim PET, by considering that minimal residual uptake could account for a negative PET interpretation, which is confirmed by a sufficient SUVmax reduction when using semiquantitative analysis.
Semiquantitative assessment is probably a more objective way to interpret PET response than is visual analysis. However, identification of the most intense tumor involves qualitative visual assessment. We believe that our approach, with the help of a color scale and by drawing contiguous regions of interest encompassing the tumor, should avoid interobserver variability, given that the observer consensus takes all precautions to avoid such physiologic pitfalls as brown fat, intestinal activity, or urinary elimination. Another limitation of our study is the use of a post hoc response criterion for SUV-based analysis (obtained from the same patient population), instead of a predefined response criterion, as we did for visual analysis. Finally, it seems difficult to rely on a single SUV at a given time point to appreciate the therapeutic response and to predict outcome. Indeed, although a cutoff value for an absolute SUV can vary greatly between different institutions (19,20)—here, the optimal cutoff of SUVmax was 2.8 for event-free survival prediction on PET4—the measurement of an interscan SUVmax reduction within the same institution is probably a better and more reproducible approach—here, a 72.9% SUVmax reduction.
CONCLUSION
In this prospective series of diffuse large B-cell lymphoma patients who underwent serial PET during induction therapy, we have demonstrated that SUV semiquantification helps reduce false-positive interpretations at 2 cycles but performs equivalently to visual analysis at 4 cycles, when false-positives are less frequent because most of the therapeutic effect has occurred upstream. This information may be useful for objectively tailoring risk-adapted consolidation strategies.
Acknowledgments
We particularly thank the entire team of the Assistance Publique-Hôpitaux de Paris PET center at Tenon Hospital, Paris (Prof. Jean-Noël Talbot) for their help with C-PET imaging. This study was supported by the Délégation à la Recherche Clinique de l'Assistance Publique-Hôpitaux de Paris (PHRC-AOM00152), the Société Française de Radiologie, and the Association pour la Recherche sur le Cancer. This study was presented in part at the 55th annual meeting of the Society of Nuclear Medicine, New Orleans, Louisiana, June 2008 (oral communication 537).
Footnotes
- Received for publication September 2, 2008.
- Accepted for publication December 19, 2008.
- COPYRIGHT © 2009 by the Society of Nuclear Medicine, Inc.