Stacking Ensemble Learning–Based [¹⁸F]FDG PET Radiomics for Outcome Prediction in Diffuse Large B-Cell Lymphoma

Shuilin Zhao; Jing Wang; Chentao Jin; Xiang Zhang; Chenxi Xue; Rui Zhou; Yan Zhong; Yuwei Liu; Xuexin He; Youyou Zhou; Caiyun Xu; Lixia Zhang; Wenbin Qian; Hong Zhang; Xiaohui Zhang; Mei Tian

doi:10.2967/jnumed.122.265244

Abstract

This study aimed to develop an analytic approach based on [¹⁸F]FDG PET radiomics using stacking ensemble learning to improve the outcome prediction in diffuse large B-cell lymphoma (DLBCL). Methods: In total, 240 DLBCL patients from 2 medical centers were divided into the training set (n = 141), internal testing set (n = 61), and external testing set (n = 38). Radiomics features were extracted from pretreatment [¹⁸F]FDG PET scans at the patient level using 4 semiautomatic segmentation methods (SUV threshold of 2.5, SUV threshold of 4.0 [SUV4.0], 41% of SUV_max, and SUV threshold of mean liver uptake [PERCIST]). All extracted features were harmonized with the ComBat method. The intraclass correlation coefficient was used to evaluate the reliability of radiomics features extracted by different segmentation methods. Features from the most reliable segmentation method were selected by Pearson correlation coefficient analysis and the LASSO (least absolute shrinkage and selection operator) algorithm. A stacking ensemble learning approach was applied to build radiomics-only and combined clinical–radiomics models for prediction of 2-y progression-free survival and overall survival based on 4 machine learning classifiers (support vector machine, random forests, gradient boosting decision tree, and adaptive boosting). Confusion matrix, receiver-operating-characteristic curve analysis, and survival analysis were used to evaluate the model performance. Results: Among 4 semiautomatic segmentation methods, SUV4.0 segmentation yielded the highest interobserver reliability, with 830 (66.7%) selected radiomics features. The combined model constructed by the stacking method achieved the best discrimination performance. For progression-free survival prediction in the external testing set, the areas under the receiver-operating-characteristic curve and accuracy of the stacking-based combined model were 0.771 and 0.789, respectively. 
For overall survival prediction, the stacking-based combined model achieved an area under the curve of 0.725 and an accuracy of 0.763 in the external testing set. The combined model also demonstrated a more distinct risk stratification than the International Prognostic Index in all sets (log-rank test, all P < 0.05). Conclusion: The combined model that incorporates [¹⁸F]FDG PET radiomics and clinical characteristics based on stacking ensemble learning could enable improved risk stratification in DLBCL.

Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of aggressive non-Hodgkin lymphoma. Rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone represents the current first-line treatment, which is effective in approximately 60%–70% of patients (1). Patients with refractory disease or relapse after initial treatment have a low probability of cure and dismal outcomes due to the modest response rates for salvage regimens (2). Therefore, early identification of those high-risk patients is essential for designing individualized therapeutic intervention. Current prognostic scoring systems, such as the International Prognostic Index (IPI) and the National Comprehensive Cancer Network–IPI, have been the basis for determining prognosis in DLBCL (3,4). However, those models are inaccurate in predicting refractory disease, possibly because of their lack of intratumoral metabolic and functional information.

[¹⁸F]FDG PET/CT, a type of molecular imaging and a means to “transpathology” (5), has been recommended for staging and response assessment in DLBCL (6,7). Quantitative parameters on PET/CT, particularly total metabolic tumor volume (TMTV) and total lesion glycolysis, are considered to have prognostic significance in DLBCL (8,9). These parameters may allow for the assessment of whole-body tumor burden but remain limited in their ability to characterize phenotypical profiles such as shape, morphology, spatial distribution, and heterogeneity across individual lesions. For PET/CT image analysis, radiomics has recently been proposed as a novel high-throughput, noninvasive approach that could quantify tumor phenotype at a microscale level by extracting thousands of imaging-derived features (10). With the assistance of artificial intelligence, such as machine learning, radiomics offers a promising tool for diagnosis, therapeutic response assessment, and outcome prediction in various tumor types (11), including DLBCL (12–16). Preliminary studies have suggested that the application of machine learning algorithms, such as LASSO (least absolute shrinkage and selection operator) regression (16), ridge regression (13), and random forest (17), may contribute to improved radiomics feature selection and prognostic modeling in DLBCL. However, most of those studies focused on evaluating a single machine learning approach, whereas only a minority used cross combination of different machine learning algorithms (14) or adopted ensemble machine learning (15). Stacking, an ensemble approach that combines different base classifiers into 1 metaclassifier, has been suggested to provide optimized performance and simplicity (18). In the present study, we aimed to develop an analytic approach based on [¹⁸F]FDG PET radiomics using stacking ensemble learning to improve the outcome prediction in DLBCL.

MATERIALS AND METHODS

Study Population

We retrospectively enrolled 240 consecutive patients with newly diagnosed DLBCL at 2 medical centers, including 202 patients at center 1 (the Second Affiliated Hospital of Zhejiang University School of Medicine) and 38 patients at center 2 (the First Affiliated Hospital of Zhejiang Chinese Medical University). Detailed information about the study population is shown in the supplemental materials (available at http://jnm.snmjournals.org) (19,20). The flowchart of patient enrollment is shown in Supplemental Figure 1. This study was approved by the Institutional Review Board at each institution, and the requirement to obtain written informed consent was waived.

PET/CT Imaging Protocol

Image acquisition and reconstruction were in accordance with the guidelines of the European Association of Nuclear Medicine, version 2.0 (21). Patients fasted for at least 6 h and had a blood glucose level below 200 mg/dL before PET/CT examination. They were scanned at about 60 min after intravenous injection of [¹⁸F]FDG (3.70 MBq/kg). All PET images were corrected for attenuation using acquired low-dose CT data. Acquisitions differed between the 2 institutions in terms of PET/CT scanners, acquisition protocols, and reconstruction settings (Supplemental Table 1).

PET Image Segmentation and Feature Extraction

PET/CT images were reviewed by 2 independent nuclear medicine physicians, who were masked to patients’ clinical outcome. The volumes of interest were semiautomatically delineated using LIFEx software (version 6.30, https://www.lifexsoft.org/index.php) (22). Four different segmentation methods were applied to delineate lesions, including an SUV threshold of 2.5, an SUV threshold of 4.0 (SUV4.0), 41% of SUV_max, and SUV_PERCIST (1.5 × liver SUV_mean + 2 SDs) (21,23). SUV was calculated as (tissue radioactivity concentration [Bq/mL]) × (body weight [g])/(injected radioactivity [Bq]). According to the European Association of Nuclear Medicine guidelines, the liver SUV_mean should be between 1.3 and 3.0 (21). Conventional PET parameters including SUV_max, SUV_peak, TMTV, and total lesion glycolysis of each patient were recorded. The distance between the largest lesion and the lesion farthest from that bulk was also recorded (16).
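
The SUV definition and the 4 threshold rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the LIFEx implementation: the function names and the example numbers are hypothetical, the PERCIST-style cutoff simply follows the formula as written in the text, the 41% rule is applied to a single SUV_max value for brevity, and treating the cutoff as inclusive is an assumption.

```python
def suv(concentration_bq_per_ml, body_weight_g, injected_bq):
    """Body-weight SUV: tissue concentration x body weight / injected activity."""
    return concentration_bq_per_ml * body_weight_g / injected_bq


def segmentation_thresholds(suv_max, liver_mean, liver_sd):
    """Return the SUV cutoff implied by each of the four segmentation rules."""
    return {
        "SUV2.5": 2.5,
        "SUV4.0": 4.0,
        "41%SUVmax": 0.41 * suv_max,
        "PERCIST": 1.5 * liver_mean + 2.0 * liver_sd,
    }


def segment(suv_values, threshold):
    """Keep voxels whose SUV is at or above the cutoff (inclusive by assumption)."""
    return [v for v in suv_values if v >= threshold]


# Hypothetical patient: 70 kg, 259 MBq injected (3.70 MBq/kg), decay ignored.
voxel_suv = suv(concentration_bq_per_ml=14800.0,
                body_weight_g=70000.0,
                injected_bq=259e6)

# Hypothetical uptake values used only to show how the cutoffs differ.
cutoffs = segmentation_thresholds(suv_max=20.0, liver_mean=2.0, liver_sd=0.25)
lesion_voxels = segment([1.0, 3.9, 4.0, 9.5], cutoffs["SUV4.0"])
```

With these made-up inputs the fixed cutoffs (2.5 and 4.0) do not depend on the patient, whereas the 41% and PERCIST-style cutoffs shift with the measured SUV_max and liver uptake.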

Before feature extraction, all PET images were resampled to a voxel size of 3 × 3 × 3 mm using bilinear interpolation (24) and were discretized with a fixed bin size of 0.25 SUV (25). In total, 1,245 radiomics features were extracted from the entire segmented disease (patient level) via the open-source toolbox PyRadiomics (version 3.0.1) (16,26), consistent with the Image Biomarker Standardization Initiative (27). Detailed descriptions of the extracted features are presented in Supplemental Table 2. The radiomics workflow is shown in Figure 1.
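
Fixed-bin-size discretization, as used above with a width of 0.25 SUV, can be illustrated as follows. This is a sketch only: the function name is hypothetical, and anchoring the first bin at 0 SUV is an assumption of this example (radiomics toolboxes may anchor the bins differently).

```python
import math

BIN_WIDTH = 0.25  # SUV units, as stated in the text


def discretize_fixed_bin_size(suv_values, bin_width=BIN_WIDTH, lower_bound=0.0):
    """Map each SUV to a 1-based bin index: floor((x - lower_bound) / width) + 1.

    lower_bound = 0 SUV is an assumption of this sketch.
    """
    return [int(math.floor((v - lower_bound) / bin_width)) + 1
            for v in suv_values]


# Hypothetical voxel values: 4.0 and 4.1 share a bin, 4.3 falls in the next one.
bins = discretize_fixed_bin_size([4.0, 4.1, 4.3, 10.0])
```

Because the bin width is fixed in SUV units, the same uptake value maps to the same bin in every patient, which keeps texture features comparable across scans with different SUV ranges.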

FIGURE 1.

Radiomics workflow.

Feature Selection

The interobserver repeatability of radiomics features was evaluated using the intraclass correlation coefficient (ICC) in 100 randomly selected patients from center 1. Features with an ICC above 0.80 were considered robust and retained for subsequent analysis. The segmentation method with the maximum number of selected features was considered to be the most reliable method.
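
The ICC-based robustness filter can be sketched as below. The text does not state which ICC form was used, so this example assumes ICC(2,1) (two-way random effects, absolute agreement, single rater); the function names and the toy ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per patient, one column per observer.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-patient mean square
    msc = ss_cols / (k - 1)                 # between-observer mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def robust_features(feature_ratings, cutoff=0.80):
    """Names of features whose ICC exceeds the cutoff (0.80 in the text)."""
    return [name for name, r in feature_ratings.items() if icc_2_1(r) > cutoff]


# Toy data: 4 patients, 2 observers per feature (values are made up).
example = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 4.5], [4.0, 2.0]],
}
kept = robust_features(example)
```

Counting the retained features per segmentation method then gives the reliability ranking described above.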

The ComBat harmonization method was applied to pool all conventional PET parameters and radiomics features derived from images acquired on the 2 different PET/CT scanners (28). Pearson correlation coefficient analysis followed by the LASSO algorithm was applied to select features. Details on feature selection are presented in the supplemental materials.
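
The correlation-based stage of the selection can be illustrated with a greedy redundancy filter. This is only a sketch of one common way to use Pearson correlation for feature pruning: the cutoff of 0.9, the greedy ordering, and the feature names are hypothetical (the study's actual settings are in its supplemental materials), and the subsequent LASSO stage is outside the scope of this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, max_abs_r=0.9):
    """Greedy filter: keep a feature only if |r| with every kept feature
    is below the cutoff.

    features: dict mapping feature name -> list of values (one per patient).
    max_abs_r is a hypothetical cutoff chosen for illustration.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) < max_abs_r for k in kept):
            kept.append(name)
    return kept


toy = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "volume_x2": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with "volume"
    "entropy": [2.0, 1.0, 4.0, 3.0, 5.0],
}
selected = drop_correlated(toy)
```

In this toy case the duplicate-like feature is dropped, while the moderately correlated one (r = 0.8) survives for the later sparse-regression step.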

Stacking Ensemble Learning–Based Model Construction

Stacking ensemble learning combines the predictions of several base learners and passes them to a metalearner, which generates the final prediction, to improve predictive accuracy (18). In this study, random forest, support vector machine, gradient boosting decision tree, and adaptive boosting were set as the base learners (first level), whereas random forest served as the metalearner (second level). The methodologic details are presented in the supplemental materials. Logistic regression was also applied to generate predictions. Confusion matrix analytics (including accuracy, F1 score, recall, and precision) were used to compare the performance of different machine learning algorithms. The detailed parameters of these algorithms are presented in Supplemental Table 3.
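
The two-level structure of stacking can be sketched in plain Python. This is a toy illustration, not the study's implementation: the simple threshold "stumps" stand in for the 4 base classifiers, a lookup-table metalearner stands in for the random forest, and the use of 3-fold out-of-fold predictions to train the metalearner is an assumption based on common stacking practice. All class and function names are hypothetical.

```python
from collections import Counter, defaultdict


class Stump:
    """Toy base learner: thresholds one feature midway between the class means."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        pos = [x[self.feature] for x, t in zip(X, y) if t == 1]
        neg = [x[self.feature] for x, t in zip(X, y) if t == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        self.cut = (mean_pos + mean_neg) / 2
        self.high_is_positive = mean_pos >= mean_neg
        return self

    def predict(self, x):
        return int((x[self.feature] > self.cut) == self.high_is_positive)


class TableMetaLearner:
    """Toy metalearner: majority label for each pattern of base predictions."""

    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, t in zip(Z, y):
            votes[tuple(z)][t] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        return self

    def predict(self, z):
        # Unseen pattern: fall back to a majority vote of the base learners.
        return self.table.get(tuple(z), int(sum(z) * 2 > len(z)))


def fit_stack(X, y, make_base_learners, n_folds=3):
    """Level 1: out-of-fold base predictions. Level 2: metalearner fitted on them."""
    Z = [None] * len(X)
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        held_out = [i for i in range(len(X)) if i % n_folds == fold]
        learners = [m.fit([X[i] for i in train], [y[i] for i in train])
                    for m in make_base_learners()]
        for i in held_out:
            Z[i] = [m.predict(X[i]) for m in learners]
    meta = TableMetaLearner().fit(Z, y)
    base = [m.fit(X, y) for m in make_base_learners()]  # refit on all training data
    return base, meta


def predict_stack(model, x):
    base, meta = model
    return meta.predict([m.predict(x) for m in base])


# Toy data: feature 0 is informative, feature 1 is noise.
X = [[0.1, 5.0], [0.9, 4.0], [0.2, 3.0], [0.8, 6.0], [0.3, 4.5], [0.7, 3.5]]
y = [0, 1, 0, 1, 0, 1]
model = fit_stack(X, y, lambda: [Stump(0), Stump(1)])
predictions = [predict_stack(model, x) for x in X]
```

Training the metalearner on out-of-fold predictions, rather than on predictions for samples the base learners have already seen, is what lets the second level learn which base learners to trust; here it learns to follow the informative stump and ignore the noisy one.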

We evaluated the predictive value of 5 different models: the radiomics model; the combined clinical–radiomics model; IPI; the model based on TMTV, the distance between the largest lesion and the lesion farthest from that bulk, and SUV_peak (17); and the International Metabolic Prognostic Index (29). Receiver-operating-characteristic (ROC) curve analysis was used to compare the predictive performance of the models.
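
For readers less familiar with ROC analysis, the area under the ROC curve can be computed directly as the probability that a randomly chosen patient with an event receives a higher score than a randomly chosen patient without one. The sketch below uses that rank-based definition; the function name and the toy scores are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 patients with an event (label 1) and 4 without (label 0).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.4]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the models are compared.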

Statistical Analysis

All statistical analyses were performed using SPSS (version 26.0), R (version 4.0.5, http://www.R-project.org), and Python (version 3.10). Progression-free survival (PFS) was defined as the time from diagnosis until lymphoma progression or death from any cause. Overall survival (OS) was defined as the time from diagnosis to death from any cause or to the last follow-up. Patients still alive were censored at the date of last contact. The differences in clinical characteristics were assessed using the χ² test and 1-way ANOVA, as appropriate. Patients were stratified into high- and low-risk groups using ROC curve analysis and maximizing the Youden index (30). Survival curves were estimated by the Kaplan–Meier analysis, and survival distributions were compared using the log-rank test. A P value of less than 0.05 was considered statistically significant.
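
The risk-group cutoff and the survival curves can be illustrated with a short sketch of the Youden index and the Kaplan–Meier product-limit estimator. The function names, scores, and follow-up times below are hypothetical; treating scores at or above the cutoff as high risk is an assumption, and the log-rank test is not shown.

```python
def youden_cutoff(labels, scores):
    """Cutoff maximizing J = sensitivity + specificity - 1.

    Scores at or above the cutoff are treated as high risk (assumption).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, t in zip(scores, labels) if s >= cut and t == 1)
        tn = sum(1 for s, t in zip(scores, labels) if s < cut and t == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j


def kaplan_meier(times, events):
    """Product-limit estimate: (time, survival) at each distinct event time.

    events[i] is 1 if the event was observed and 0 if the patient was censored.
    """
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Toy model scores and 2-y outcome labels (made up).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.4]
cutoff, j_stat = youden_cutoff(labels, scores)

# Toy follow-up in months for one risk group (made up); 0 = censored.
curve = kaplan_meier(times=[5, 8, 12, 12, 20, 24], events=[1, 0, 1, 1, 0, 0])
```

Applying the cutoff to the model output splits patients into the 2 risk groups whose Kaplan–Meier curves are then compared.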

RESULTS

Patient Characteristics and Outcome

Patients’ clinical characteristics are summarized in Table 1. No clinical characteristic differed significantly among the datasets (all P > 0.05). The median follow-up intervals for the training, internal testing, and external testing sets were 41 mo (range, 4–105 mo), 44 mo (range, 6–104 mo), and 39 mo (range, 4–69 mo), respectively. By the end of follow-up, relapse and progression occurred in 56, 21, and 14 patients in the training, internal testing, and external testing sets, respectively, whereas 45, 16, and 10 patients, respectively, had died.

TABLE 1.

Patient Characteristics

Feature Selection

Among the 4 segmentation methods, SUV4.0 segmentation showed the highest reliability, with 830 features (66.7%) retained at an ICC threshold above 0.8 (Supplemental Table 4). After the Pearson correlation coefficient test, 88 radiomics features were selected for SUV4.0 segmentation. The optimal features were obtained by the LASSO algorithm for construction of different stacking models (Supplemental Table 5).

Model Performance Evaluation

The model performance for 2-y PFS prediction based on different machine learning algorithms is shown in Supplemental Table 6. For the radiomics model, the stacking classifier showed better performance than the other 4 base classifiers and logistic regression, except for recall in the training set. For the combined model, the stacking classifier also demonstrated better performance than the other classifiers in the training set, internal testing set, and external testing set. Furthermore, the stacking-based combined model had higher predictive power than the radiomics model and IPI across nearly all evaluation metrics.

The model performance for 2-y OS prediction is shown in Supplemental Table 7. For the radiomics model, the stacking classifier demonstrated superior performance to the other base classifiers and logistic regression, except for precision in the internal testing set and accuracy and recall in the external testing set. For the combined model, the stacking classifier had relatively balanced performance in the training set but outperformed the other base classifiers in the internal testing set and the external testing set. Moreover, the stacking-based combined model performed better than the radiomics model and IPI.

We compared the performance of the stacking-based combined models by various combinations of base classifiers. As shown in Supplemental Tables 8 and 9, the combination of 4 base classifiers had a more balanced performance for PFS and OS prediction than did the other combinations. We also evaluated the performance of the radiomics and combined models trained on PFS prediction for predicting OS and vice versa; the results are shown in Supplemental Tables 10 and 11.

The results of ROC analysis are shown in Table 2. The combined model outperformed the other models for PFS prediction, with the area under the ROC curve (AUC) being 0.791, 0.762, and 0.771 in the training set, internal testing set, and external testing set, respectively. A similar trend was observed for OS prediction (the AUCs of the combined model were 0.843, 0.741, and 0.725 for the training set, internal testing set, and external testing set, respectively).

TABLE 2.

AUCs of Different Models

Survival Prediction

Kaplan–Meier survival estimates of the combined model and IPI in the training set, internal testing set, and external testing set are shown in Figures 2, 3, and 4, respectively. The Kaplan–Meier survival estimates of the radiomics model are shown in Supplemental Figure 2. The differences in survival rates between low- and high-risk groups were significant except for OS in the radiomics model in the external testing set (P = 0.053). Moreover, the combined model demonstrated a more distinct risk stratification than the radiomics model and IPI, with larger differences between subgroups for both PFS and OS prediction (all P < 0.05).

FIGURE 2.

Kaplan–Meier curves for PFS of combined model (A), PFS of IPI (B), OS of combined model (C), and OS of IPI (D) in training set. Hazard ratio with 95% CI and log-rank P value are reported. HR = hazard ratio.

FIGURE 3.

Kaplan–Meier curves for PFS of combined model (A), PFS of IPI (B), OS of combined model (C), and OS of IPI (D) in internal testing set. Hazard ratio with 95% CI and log-rank P value are reported. HR = hazard ratio.

FIGURE 4.

Kaplan–Meier curves for PFS of combined model (A), PFS of IPI (B), OS of combined model (C), and OS of IPI (D) in external testing set. Hazard ratio with 95% CI and log-rank P value are reported. HR = hazard ratio.

DISCUSSION

In this study, we developed an analytic approach based on [¹⁸F]FDG PET radiomics using stacking ensemble learning for outcome prediction in DLBCL. Radiomics and combined clinical–radiomics models constructed by the stacking method outperformed those built on other single machine learning classifiers. Furthermore, the combined models integrating radiomics features and clinical information exhibited predictive performance superior to that of radiomics-only models and IPI.

To the best of our knowledge, this was the first study to evaluate the prognostic effect of [¹⁸F]FDG PET radiomics through a stacking ensemble learning approach in patients with DLBCL. Several previous studies have found that machine learning–based PET radiomics could be of prognostic importance in DLBCL (12–14). A multicenter study with 317 DLBCL patients suggested that the radiomics model based on LASSO logistic regression was predictive of 2-y time to progression, with an AUC of 0.76 (16). Another study using a LASSO-Cox algorithm reported an AUC of 0.748 for the radiomics model in the test set for PFS prediction (12). In a recent study, Jiang et al. used cross combination of 7 different machine learning algorithms for feature selection and found that the radiomics signature obtained by the support vector machine–support vector machine was highly predictive of PFS (AUC, 0.757) (14). Despite these encouraging findings, a recently developed ensemble learning approach has revealed diagnostic and prognostic advantages over a single machine learning method by aggregating multiple algorithms to achieve higher prediction accuracy (31,32). In our current study, the radiomics model built on a stacking ensemble learning approach outperformed those developed by the other 4 base classifiers and logistic regression, with AUCs of 0.715 and 0.707 for PFS prediction in the internal and external testing sets, respectively. This finding is consistent with the results from a recent radiomics study on DLBCL, in which a soft voting ensemble–based model showed higher accuracy than those based on single machine learning classifiers for 2-y event-free survival prediction (15). Notably, voting considers only linear relationships among classifiers whereas stacking is able to learn complex associations when individual base classifiers are heterogeneous (33). 
In our study, the combined model developed by 4 classifiers showed a more balanced performance than the other combinations, supporting the potential of stacking ensemble learning for radiomics analysis in DLBCL.

Our study also demonstrated that the combined models incorporating patient-level PET radiomics and clinical characteristics yielded higher AUCs and more distinct risk stratifications than IPI for outcome prediction in DLBCL, which is in line with previous observations (12,14,16). Recent studies suggested that the predictive ability of IPI has been weakened in the rituximab era (4). In this context, PET radiomics might add a new perspective on the phenotypic characteristics of DLBCL through profiling the intratumoral metabolic heterogeneity. Therefore, it is likely that considering both clinical and imaging features in analysis may offer a deeper understanding of the complex biologic properties of malignancy and thereby provide a better prognosis estimation.

Radiomics analysis in lymphoma remains challenging because of the lack of a primary site and the complexity of lesion delineation, particularly for disseminated disease. To date, no consensus has been reached on which segmentation method for lesion delineation in DLBCL is preferable. Although the 41%-of-SUV_max method has been recommended by the European Association of Nuclear Medicine for TMTV evaluation (21), this method is more likely to be influenced by interobserver variability (34). Other studies indicated that the SUV4.0 method could give a good approximation of TMTV for prediction of disease progression (35). Moreover, the impact of different segmentations on radiomics features for prognosis prediction in DLBCL remains to be explored. In our study, we compared the reliability of radiomics features based on 4 different segmentation methods. The SUV4.0 method yielded the highest interobserver reliability, with 830 features (66.7%) retained in ICC analysis, which is in line with the results from a recent study suggesting that SUV4.0 is the most stable approach (with excellent reliability for 84.8% of all features) among 6 semiautomatic segmentation methods (36). By contrast, the interobserver reliability of radiomics features based on 41%-of-SUV_max segmentation was the lowest in the current study, with only 46 features (3.7%) having excellent reliability. This discrepancy may be related to differences in TMTV delineation. Previous studies demonstrated that variations in segmentation methods could have a marked effect on the outer contour of the segmentation, thereby influencing radiomics features, especially morphologic metrics (36,37). In our study, the SUV4.0 method exhibited a higher TMTV estimation and more stable radiomics features than the 41%-of-SUV_max method, indicating that a higher TMTV may cause the segmentation method to have less of an impact on radiomics features.

Several limitations of our study deserve mention. First, since this was a retrospective study with a relatively small sample size, our results need to be further validated in prospective multicenter studies involving a larger cohort of patients. Second, we applied only patient-level radiomics analysis; further studies are required to compare the impact of different lesion selection methods on radiomics analysis. Third, we applied ICC, Pearson correlation analysis, and LASSO for feature selection; further studies will be required to assess the performance of other strategies, for example, minimum redundancy maximum relevance and ReliefF. Fourth, to facilitate comparison with previous results, we used only PET images for radiomics analysis. A combination of PET and CT images may lead to the discovery of radiomics features that are more predictive. Fifth, Ki-67 expression and MYC/BCL-2 double-hit status are established prognostic factors but were not assessed in this study because of the incompleteness of the available data.

CONCLUSION

In the present study, we proposed an analytic approach using stacking ensemble learning for outcome prediction in DLBCL based on [¹⁸F]FDG PET radiomics. The stacking-based combined model that incorporates radiomics features and clinical characteristics could enable improved risk stratification in DLBCL patients.

DISCLOSURE

This study was partially supported by the National Natural Science Foundation of China (32027802), the National Key R&D Program of China (2021YFE0108300 and 2022YFE0118000), and the Key R&D Program of Zhejiang (2022C03071). No other potential conflict of interest relevant to this article was reported.

KEY POINTS

QUESTION: Can stacking ensemble learning–based [¹⁸F]FDG PET radiomics improve outcome prediction in patients with DLBCL?

PERTINENT FINDINGS: In a retrospective study of 240 DLBCL patients, a stacking ensemble learning–based model that incorporates radiomics features and clinical characteristics enabled improved risk stratification.

IMPLICATIONS FOR PATIENT CARE: The stacking ensemble learning–based model incorporating PET radiomics and clinical information can be useful for better survival prediction and therapeutic decision making.

Footnotes

Published online Jul. 27, 2023.

REFERENCES

1. Tilly H, Gomes da Silva M, Vitolo U, et al. Diffuse large B-cell lymphoma (DLBCL): ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2015;26(suppl 5):v116–v125.
2. Crump M, Neelapu SS, Farooq U, et al. Outcomes in refractory diffuse large B-cell lymphoma: results from the international SCHOLAR-1 study. Blood. 2017;130:1800–1808.
3. International Non-Hodgkin’s Lymphoma Prognostic Factors Project. A predictive model for aggressive non-Hodgkin’s lymphoma. N Engl J Med. 1993;329:987–994.
4. Zhou Z, Sehn LH, Rademaker AW, et al. An enhanced International Prognostic Index (NCCN-IPI) for patients with diffuse large B-cell lymphoma treated in the rituximab era. Blood. 2014;123:837–842.
5. Tian M, He X, Jin C, et al. Transpathology: molecular imaging-based pathology. Eur J Nucl Med Mol Imaging. 2021;48:2338–2350.
6. Barrington SF, Kluge R. FDG PET for therapy monitoring in Hodgkin and non-Hodgkin lymphomas. Eur J Nucl Med Mol Imaging. 2017;44(suppl 1):97–110.
7. Zhang X, Jiang H, Wu S, et al. Positron emission tomography molecular imaging for phenotyping and management of lymphoma. Phenomics. 2022;2:102–118.
8. Cottereau AS, Lanic H, Mareschal S, et al. Molecular profile and FDG-PET/CT total metabolic tumor volume improve risk classification at diagnosis for patients with diffuse large B-cell lymphoma. Clin Cancer Res. 2016;22:3801–3809.
9. Toledano MN, Desbordes P, Banjar A, et al. Combination of baseline FDG PET/CT total metabolic tumour volume and gene expression profile have a robust predictive value in patients with diffuse large B-cell lymphoma. Eur J Nucl Med Mol Imaging. 2018;45:680–688.
10. Lambin P, Leijenaar RTH, Deist TM, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14:749–762.
11. Bi WL, Hosny A, Schabath MB, et al. Artificial intelligence in cancer imaging: clinical challenges and applications. CA Cancer J Clin. 2019;69:127–157.
12. Zhang X, Chen L, Jiang H, et al. A novel analytic approach for outcome prediction in diffuse large B-cell lymphoma by [¹⁸F]FDG PET/CT. Eur J Nucl Med Mol Imaging. 2022;49:1298–1310.
13. Frood R, Clark M, Burton C, et al. Discovery of pre-treatment FDG PET/CT-derived radiomics-based models for predicting outcome in diffuse large B-cell lymphoma. Cancers (Basel). 2022;14:1711.
14. Jiang C, Li A, Teng Y, et al. Optimal PET-based radiomic signature construction based on the cross-combination method for predicting the survival of patients with diffuse large B-cell lymphoma. Eur J Nucl Med Mol Imaging. 2022;49:2902–2916.
15. Ritter Z, Papp L, Zámbó K, et al. Two-year event-free survival prediction in DLBCL patients based on in vivo radiomics and clinical parameters. Front Oncol. 2022;12:820136.
16. Eertink JJ, van de Brug T, Wiegers SE, et al. ¹⁸F-FDG PET baseline radiomics features improve the prediction of treatment outcome in diffuse large B-cell lymphoma. Eur J Nucl Med Mol Imaging. 2022;49:932–942.
17. Eertink JJ, Zwezerijnen GJC, Cysouw MCF, et al. Comparing lesion and feature selections to predict progression in newly diagnosed DLBCL patients with FDG PET/CT radiomics features. Eur J Nucl Med Mol Imaging. 2022;49:4642–4651.
18. Naimi AI, Balzer LB. Stacked generalization: an introduction to super learning. Eur J Epidemiol. 2018;33:459–464.
19. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321–357.
20. Bergstra J, Bengio Y. Random search for hyper-parameter optimization. J Mach Learn Res. 2012;13:281–305.
21. Boellaard R, Delgado-Bolton R, Oyen WJ, et al. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur J Nucl Med Mol Imaging. 2015;42:328–354.
22. Nioche C, Orlhac F, Boughdad S, et al. LIFEx: a freeware for radiomic feature calculation in multimodality imaging to accelerate advances in the characterization of tumor heterogeneity. Cancer Res. 2018;78:4786–4789.
23. Wahl RL, Jacene H, Kasamon Y, Lodge MA. From RECIST to PERCIST: evolving considerations for PET response criteria in solid tumors. J Nucl Med. 2009;50(suppl 1):122S–150S.
24. Shiri I, Vafaei Sadr A, Amini M, et al. Decentralized distributed multi-institutional PET image segmentation using a federated deep learning framework. Clin Nucl Med. 2022;47:606–617.
25. Pfaehler E, van Sluis J, Merema BBJ, et al. Experimental multicenter and multivendor evaluation of the performance of PET radiomic features using 3-dimensionally printed phantom inserts. J Nucl Med. 2020;61:469–476.
26. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77:e104–e107.
27. Zwanenburg A, Vallières M, Abdalah MA, et al. The Image Biomarker Standardization Initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295:328–338.
28. Orlhac F, Boughdad S, Philippe C, et al. A postreconstruction harmonization method for multicenter radiomic studies in PET. J Nucl Med. 2018;59:1321–1328.
29. Mikhaeel NG, Heymans MW, Eertink JJ, et al. Proposed new dynamic prognostic index for diffuse large B-cell lymphoma: International Metabolic Prognostic Index. J Clin Oncol. 2022;40:2352–2360.
30. Ruopp MD, Perkins NJ, Whitcomb BW, Schisterman EF. Youden index and optimal cut-point estimated from observations affected by a lower limit of detection. Biom J. 2008;50:419–430.
31. Chassagnon G, Vakalopoulou M, Battistella E, et al. AI-driven quantification, staging and outcome prediction of COVID-19 pneumonia. Med Image Anal. 2021;67:101860.
32. Papp L, Spielvogel CP, Grubmuller B, et al. Supervised machine learning enables non-invasive lesion characterization in primary prostate cancer with [⁶⁸Ga]Ga-PSMA-11 PET/MRI. Eur J Nucl Med Mol Imaging. 2021;48:1795–1805.
33. Heisler M, Karst S, Lo J, et al. Ensemble deep learning for diabetic retinopathy detection using optical coherence tomography angiography. Transl Vis Sci Technol. 2020;9:20.
34. Ilyas H, Mikhaeel NG, Dunn JT, et al. Defining the optimal method for measuring baseline metabolic tumour volume in diffuse large B cell lymphoma. Eur J Nucl Med Mol Imaging. 2018;45:1142–1154.
35. Barrington SF, Zwezerijnen B, de Vet HCW, et al. Automated segmentation of baseline metabolic total tumor burden in diffuse large B-cell lymphoma: which method is most successful? A study on behalf of the PETRA consortium. J Nucl Med. 2021;62:332–337.
36. Eertink JJ, Pfaehler EAG, Wiegers SE, et al. Quantitative radiomics features in diffuse large B-cell lymphoma: does segmentation method matter? J Nucl Med. 2022;63:389–395.
37. Belli ML, Mori M, Broggi S, et al. Quantifying the robustness of [¹⁸F]FDG-PET/CT radiomic features with respect to tumor delineation in head and neck and pancreatic cancer patients. Phys Med. 2018;49:105–111.

Received for publication November 23, 2022.
Revision received May 31, 2023.
