Journal of Nuclear Medicine
Research Article | Clinical Investigation

Stacking Ensemble Learning–Based [18F]FDG PET Radiomics for Outcome Prediction in Diffuse Large B-Cell Lymphoma

Shuilin Zhao, Jing Wang, Chentao Jin, Xiang Zhang, Chenxi Xue, Rui Zhou, Yan Zhong, Yuwei Liu, Xuexin He, Youyou Zhou, Caiyun Xu, Lixia Zhang, Wenbin Qian, Hong Zhang, Xiaohui Zhang and Mei Tian
Journal of Nuclear Medicine October 2023, 64 (10) 1603-1609; DOI: https://doi.org/10.2967/jnumed.122.265244
Shuilin Zhao
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
4Cancer Center, Department of Radiology, Zhejiang Provincial People’s Hospital, Affiliated People’s Hospital, Hangzhou Medical College, Hangzhou, China;
Jing Wang
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Chentao Jin
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Xiang Zhang
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Chenxi Xue
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Rui Zhou
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Yan Zhong
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Yuwei Liu
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Xuexin He
5Department of Medical Oncology, Huashan Hospital of Fudan University, Shanghai, China;
Youyou Zhou
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Caiyun Xu
6Department of Nuclear Medicine, First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Traditional Chinese Medicine), Hangzhou, China;
Lixia Zhang
6Department of Nuclear Medicine, First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Traditional Chinese Medicine), Hangzhou, China;
Wenbin Qian
7Department of Hematology, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
Hong Zhang
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
8College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China;
9Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, China; and
Xiaohui Zhang
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
Mei Tian
1Department of Nuclear Medicine and PET Center, Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China;
2Institute of Nuclear Medicine and Molecular Imaging of Zhejiang University, Hangzhou, China;
3Key Laboratory of Medical Molecular Imaging of Zhejiang Province, Hangzhou, China;
10Human Phenome Institute, Fudan University, Shanghai, China

Abstract

This study aimed to develop an analytic approach based on [18F]FDG PET radiomics using stacking ensemble learning to improve outcome prediction in diffuse large B-cell lymphoma (DLBCL). Methods: In total, 240 DLBCL patients from 2 medical centers were divided into the training set (n = 141), internal testing set (n = 61), and external testing set (n = 38). Radiomics features were extracted from pretreatment [18F]FDG PET scans at the patient level using 4 semiautomatic segmentation methods (SUV threshold of 2.5, SUV threshold of 4.0 [SUV4.0], 41% of SUVmax, and SUV threshold of mean liver uptake [PERCIST]). All extracted features were harmonized with the ComBat method. The intraclass correlation coefficient was used to evaluate the reliability of radiomics features extracted by different segmentation methods. Features from the most reliable segmentation method were selected by Pearson correlation coefficient analysis and the LASSO (least absolute shrinkage and selection operator) algorithm. A stacking ensemble learning approach was applied to build radiomics-only and combined clinical–radiomics models for prediction of 2-y progression-free survival and overall survival based on 4 machine learning classifiers (support vector machine, random forests, gradient boosting decision tree, and adaptive boosting). Confusion matrix, receiver-operating-characteristic curve analysis, and survival analysis were used to evaluate the model performance. Results: Among the 4 semiautomatic segmentation methods, SUV4.0 segmentation yielded the highest interobserver reliability, with 830 (66.7%) selected radiomics features. The combined model constructed by the stacking method achieved the best discrimination performance. For progression-free survival prediction in the external testing set, the area under the receiver-operating-characteristic curve and the accuracy of the stacking-based combined model were 0.771 and 0.789, respectively. For overall survival prediction, the stacking-based combined model achieved an area under the curve of 0.725 and an accuracy of 0.763 in the external testing set. The combined model also demonstrated a more distinct risk stratification than the International Prognostic Index in all sets (log-rank test, all P < 0.05). Conclusion: The combined model that incorporates [18F]FDG PET radiomics and clinical characteristics based on stacking ensemble learning could enable improved risk stratification in DLBCL.

  • PET
  • diffuse large B-cell lymphoma
  • prognosis
  • machine learning
  • radiomics

Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of aggressive non-Hodgkin lymphoma. Rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone represents the current first-line treatment, which is effective in approximately 60%–70% of patients (1). Patients with refractory disease or relapse after initial treatment have a low probability of cure and dismal outcomes due to the modest response rates for salvage regimens (2). Therefore, early identification of those high-risk patients is essential for designing individualized therapeutic intervention. Current prognostic scoring systems, such as the International Prognostic Index (IPI) and the National Comprehensive Cancer Network–IPI, have been the basis for determining prognosis in DLBCL (3,4). However, those models are inaccurate in predicting refractory disease, possibly because of their lack of intratumoral metabolic and functional information.

[18F]FDG PET/CT, a type of molecular imaging and a means to “transpathology” (5), has been recommended for staging and response assessment in DLBCL (6,7). Quantitative parameters on PET/CT, particularly total metabolic tumor volume (TMTV) and total lesion glycolysis, are considered to have prognostic significance in DLBCL (8,9). These parameters may allow for the assessment of whole-body tumor burden but remain limited in their ability to characterize phenotypical profiles such as shape, morphology, spatial distribution, and heterogeneity across individual lesions. For PET/CT image analysis, radiomics has recently been proposed as a novel high-throughput, noninvasive approach that could quantify tumor phenotype at a microscale level via extracting thousands of imaging-derived features (10). With the assistance of artificial intelligence, such as machine learning, radiomics offers a promising tool for diagnosis, therapeutic response assessment, and outcome prediction in various tumor types (11), including DLBCL (12–16). Preliminary studies have suggested that the application of machine learning algorithms, such as LASSO (least absolute shrinkage and selection operator) regression (16), ridge regression (13), and random forest (17), may contribute to the improved radiomics feature selection and prognostic modeling in DLBCL. However, most of those studies focused on evaluating a single machine learning approach, whereas only a minority used cross combination of different machine learning algorithms (14) or adopted ensemble machine learning (15). Stacking, an ensemble approach that combines different base classifiers into 1 metaclassifier, has been suggested to provide optimized performance and simplicity (18). In the present study, we aimed to develop an analytic approach based on [18F]FDG PET radiomics using stacking ensemble learning to improve the outcome prediction in DLBCL.

MATERIALS AND METHODS

Study Population

We retrospectively enrolled 240 consecutive patients with newly diagnosed DLBCL at 2 medical centers, including 202 patients at center 1 (the Second Affiliated Hospital of Zhejiang University School of Medicine) and 38 patients at center 2 (the First Affiliated Hospital of Zhejiang Chinese Medical University). Detailed information about the study population is shown in the supplemental materials (available at http://jnm.snmjournals.org) (19,20). The flowchart of patient enrollment is shown in Supplemental Figure 1. This study was approved by the Institutional Review Board at each institution, and the requirement to obtain written informed consent was waived.

PET/CT Imaging Protocol

Image acquisition and reconstruction were in accordance with the European Association of Nuclear Medicine guidelines, version 2.0 (21). Patients fasted for at least 6 h and had a blood glucose level below 200 mg/dL before PET/CT examination. They were scanned at about 60 min after intravenous injection of [18F]FDG (3.70 MBq/kg). All PET images were corrected for attenuation using acquired low-dose CT data. Acquisitions differed between the 2 institutions in terms of PET/CT scanners, acquisition protocols, and reconstruction settings (Supplemental Table 1).

PET Image Segmentation and Feature Extraction

PET/CT images were reviewed by 2 independent nuclear medicine physicians, who were masked to patients’ clinical outcome. The volumes of interest were semiautomatically delineated using LIFEx software (version 6.30, https://www.lifexsoft.org/index.php) (22). Four different segmentation methods were applied to delineate lesions, including an SUV threshold of 2.5, an SUV threshold of 4.0 (SUV4.0), 41% of SUVmax, and SUVPERCIST (1.5 × liver SUVmean + 2 SDs) (21,23). SUV was calculated as (tissue radioactivity concentration [Bq/mL]) × (body weight [g])/(injected radioactivity [Bq]). According to the European Association of Nuclear Medicine guidelines, the liver SUVmean should be between 1.3 and 3.0 (21). Conventional PET parameters including SUVmax, SUVpeak, TMTV, and total lesion glycolysis of each patient were recorded. The distance between the largest lesion and the lesion farthest from that bulk was also recorded (16).
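As an illustration of the SUV formula and the 4 threshold definitions above, the following minimal Python sketch may help. It is not the LIFEx implementation: the function names are hypothetical, decay correction is ignored, and the use of the sample SD of liver voxel values for the PERCIST threshold is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def suv(concentration_bq_per_ml: float, body_weight_g: float, injected_bq: float) -> float:
    """SUV = tissue concentration [Bq/mL] x body weight [g] / injected activity [Bq]."""
    return concentration_bq_per_ml * body_weight_g / injected_bq


def percist_threshold(liver_suv_values: Sequence[float]) -> float:
    """1.5 x liver SUVmean + 2 SDs (sample SD is an assumption of this sketch)."""
    return 1.5 * mean(liver_suv_values) + 2 * stdev(liver_suv_values)


def segmentation_thresholds(suv_max: float, liver_suv_values: Sequence[float]) -> Dict[str, float]:
    """Lower SUV bound used by each of the 4 semiautomatic segmentation methods."""
    return {
        "SUV2.5": 2.5,
        "SUV4.0": 4.0,
        "41%SUVmax": 0.41 * suv_max,
        "PERCIST": percist_threshold(liver_suv_values),
    }


# Hypothetical 70-kg patient injected with 3.70 MBq/kg (259 MBq in total).
example_suv = suv(14800.0, 70000.0, 259e6)  # 4.0
example_thresholds = segmentation_thresholds(10.0, [2.0, 2.2, 1.8])
```

In this hypothetical example, a voxel at 14,800 Bq/mL sits exactly at the SUV4.0 threshold, and the PERCIST threshold evaluates to 3.4 for a liver region with a mean SUV of 2.0 and an SD of 0.2.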

Before feature extraction, all PET images were resampled to a voxel size of 3 × 3 × 3 mm using bilinear interpolation (24) and were discretized with a fixed bin size of 0.25 SUV (25). In total, 1,245 radiomics features were extracted from the entire segmented disease (patient level) via the open-source toolbox PyRadiomics (version 3.0.1) (16,26), consistent with the Image Biomarker Standardization Initiative (27). Detailed descriptions of the extracted features are presented in Supplemental Table 2. The radiomics workflow is shown in Figure 1.
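The fixed-bin-size discretization step can be sketched as follows. This is a simplified illustration rather than the PyRadiomics code; it assumes the commonly used fixed-bin-width rule in which the bin index is floor(SUV / bin size) minus the index of the lowest occupied bin, plus 1. The function name is hypothetical, and resampling is not shown.

```python
import math
from typing import List, Sequence


def discretize_fixed_bin_size(suv_values: Sequence[float], bin_size: float = 0.25) -> List[int]:
    """Map SUVs to 1-based bin indices using a fixed bin size (0.25 SUV in the text)."""
    lowest_bin = math.floor(min(suv_values) / bin_size)
    return [math.floor(v / bin_size) - lowest_bin + 1 for v in suv_values]


# Hypothetical voxel values inside a segmented volume.
bins = discretize_fixed_bin_size([4.0, 4.1, 4.25, 4.6, 5.0])  # [1, 1, 2, 3, 5]
```

With a fixed bin size, the number of bins varies with the SUV range of each patient, whereas the width of each bin in SUV units stays constant across patients.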

FIGURE 1. Radiomics workflow.

Feature Selection

The interobserver repeatability of radiomics features was evaluated using the intraclass correlation coefficient (ICC) in 100 randomly selected patients from center 1. Features with an ICC above 0.80 were considered robust and retained for subsequent analysis. The segmentation method with the maximum number of selected features was considered to be the most reliable method.
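The ICC-based filter can be illustrated with a short sketch. The ICC form is not stated in this section, so the code below assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1); the function names and the example values are hypothetical.

```python
from typing import Dict, List, Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1); `ratings` has one row per patient and one column per observer."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                # between-patient mean square
    msc = ss_cols / (k - 1)                # between-observer mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def select_robust_features(features: Dict[str, Sequence[Sequence[float]]],
                           threshold: float = 0.80) -> List[str]:
    """Keep features whose ICC exceeds the threshold (0.80 in the text)."""
    return [name for name, ratings in features.items() if icc_2_1(ratings) > threshold]


# Hypothetical feature values measured by 2 observers.
example = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]],
}
robust = select_robust_features(example)
```

Counting the retained features per segmentation method then gives the reliability ranking described above.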

The ComBat harmonization method was applied to pool all conventional PET parameters and radiomics features derived from images acquired on the 2 different PET/CT scanners (28). Pearson correlation coefficient analysis followed by the LASSO algorithm was then applied to select features. Details on feature selection are presented in the supplemental materials.
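A correlation-based redundancy filter of the kind mentioned above could look like the sketch below. The exact procedure and cutoff are described only in the supplemental materials, so the greedy strategy and the cutoff of 0.8 used here are assumptions, and the function names are hypothetical. ComBat harmonization and the LASSO step are not shown.

```python
import math
from typing import Dict, List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def drop_correlated(features: Dict[str, Sequence[float]], cutoff: float = 0.8) -> List[str]:
    """Greedily keep a feature only if |r| with every already kept feature is below the cutoff."""
    kept: List[str] = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature values for 5 patients.
example_features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly proportional to f1, so redundant
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept_features = drop_correlated(example_features)  # ["f1", "f3"]
```

A filter like this reduces redundancy before a sparse model such as LASSO picks the final feature subset.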

Stacking Ensemble Learning–Based Model Construction

Stacking ensemble learning is a machine learning approach in which the predictions of several base learners are fed into a metalearner that generates the final prediction, with the aim of improving predictive accuracy (18). In this study, random forest, support vector machine, gradient boosting decision tree, and adaptive boosting were set as the base learners (first level), whereas random forest served as the metalearner (second level). The methodologic details are presented in the supplemental materials. Logistic regression was also applied to generate predictions. Confusion matrix analytics (including accuracy, F1 score, recall, and precision) were used to compare the performance of different machine learning algorithms. The detailed parameters of these algorithms are presented in Supplemental Table 3.
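The 2-level architecture described above can be sketched with scikit-learn's StackingClassifier. This is an illustrative sketch, not the authors' code: the data are synthetic, and the hyperparameters, the feature scaling before the support vector machine, and the 5-fold internal cross-validation are assumptions (the study's parameters are in Supplemental Table 3).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the selected features and a binary 2-y outcome (not study data).
X, y = make_classification(n_samples=200, n_features=10, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# First level: the 4 base learners named in the text.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("gbdt", GradientBoostingClassifier(random_state=0)),
    ("ada", AdaBoostClassifier(random_state=0)),
]

# Second level: a random forest metalearner trained on out-of-fold base predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

risk_scores = stack.predict_proba(X_test)[:, 1]  # predicted probability of the event
predicted = stack.predict(X_test)                # predicted class labels
```

Training the metalearner on out-of-fold predictions, rather than on predictions for samples the base learners have already seen, is what limits information leakage between the 2 levels.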

We evaluated the predictive value of 5 different models: the radiomics model; the combined clinical–radiomics model; IPI; the model based on TMTV, the distance between the largest lesion and the lesion farthest from that bulk, and SUVpeak (17); and the International Metabolic Prognostic Index (29). Receiver-operating-characteristic (ROC) curve analysis was used to compare the predictive performance of different models.
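For reference, the area under the ROC curve used in such comparisons equals the probability that a randomly chosen patient with the event receives a higher score than a randomly chosen patient without it. A minimal sketch follows; the function name and the example data are hypothetical.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcome labels (1 = event within 2 y) and model scores.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```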

Statistical Analysis

All statistical analyses were performed using SPSS (version 26.0), R (version 4.0.5, http://www.R-project.org), and Python (version 3.10). Progression-free survival (PFS) was defined as the time from diagnosis until lymphoma progression or death from any cause. Overall survival (OS) was defined as the time from diagnosis to death from any cause or to the last follow-up. Patients still alive were censored at the date of last contact. Differences in clinical characteristics were assessed using the χ2 test and 1-way ANOVA, as appropriate. Patients were stratified into high- and low-risk groups using ROC curve analysis by maximizing the Youden index (30). Survival curves were estimated by Kaplan–Meier analysis, and survival distributions were compared using the log-rank test. A P value of less than 0.05 was considered statistically significant.
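The Youden-index cutoff used for risk stratification can be sketched as follows. The function name and example data are hypothetical, and treating scores at or above the cutoff as high risk is an assumption of this sketch.

```python
from typing import List, Optional, Sequence, Tuple


def youden_cutoff(labels: Sequence[int], scores: Sequence[float]) -> Tuple[Optional[float], float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= cutoff)
        tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical outcome labels (1 = event) and model scores.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
cutoff, j_stat = youden_cutoff(labels, scores)  # cutoff 0.4, J = 0.75
high_risk: List[bool] = [s >= cutoff for s in scores]
```

The resulting high- and low-risk groups are the ones whose survival curves are then compared with the log-rank test.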

RESULTS

Patient Characteristics and Outcome

Patients’ clinical characteristics are summarized in Table 1. No clinical characteristic differed significantly among the 3 datasets (all P > 0.05). The median follow-up intervals for the training, internal testing, and external testing sets were 41 mo (range, 4–105 mo), 44 mo (range, 6–104 mo), and 39 mo (range, 4–69 mo), respectively. By the end of follow-up, relapse and progression had occurred in 56, 21, and 14 patients in the training, internal testing, and external testing sets, respectively, whereas 45, 16, and 10 patients, respectively, had died.

TABLE 1. Patient Characteristics

Feature Selection

Among the 4 segmentation methods, SUV4.0 segmentation showed the highest reliability, with 830 of 1,245 features (66.7%) having an ICC above 0.80 and therefore being retained (Supplemental Table 4). After Pearson correlation coefficient analysis, 88 radiomics features were selected for SUV4.0 segmentation. The optimal features were then obtained by the LASSO algorithm for construction of the different stacking models (Supplemental Table 5).

Model Performance Evaluation

The model performance for 2-y PFS prediction based on different machine learning algorithms is shown in Supplemental Table 6. For the radiomics model, the stacking classifier showed better performance than the other 4 base classifiers and logistic regression, except for recall in the training set. For the combined model, the stacking classifier also demonstrated better performance than the other classifiers in the training set, internal testing set, and external testing set. Furthermore, the stacking-based combined model had higher predictive power than the radiomics model and IPI across nearly all evaluation metrics.

The model performance for 2-y OS prediction is shown in Supplemental Table 7. For the radiomics model, the stacking classifier demonstrated superior performance to the other base classifiers and logistic regression, except for precision in the internal testing set and accuracy and recall in the external testing set. For the combined model, the stacking classifier had relatively balanced performance in the training set but outperformed the other base classifiers in the internal testing set and the external testing set. Moreover, the stacking-based combined model performed better than the radiomics model and IPI.
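The confusion-matrix metrics compared above (accuracy, precision, recall, and F1 score) follow their standard definitions, which can be sketched as below. The function name and example labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 score for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical true and predicted labels for 8 patients.
metrics = confusion_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

In this example there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all 4 metrics equal 0.75.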

We compared the performance of the stacking-based combined models by various combinations of base classifiers. As shown in Supplemental Tables 8 and 9, the combination of 4 base classifiers had a more balanced performance for PFS and OS prediction than did the other combinations. We also evaluated the performance of the radiomics and combined models trained on PFS prediction for predicting OS and vice versa; the results are shown in Supplemental Tables 10 and 11.

The results of ROC analysis are shown in Table 2. The combined model outperformed the other models for PFS prediction, with the area under the ROC curve (AUC) being 0.791, 0.762, and 0.771 in the training set, internal testing set, and external testing set, respectively. A similar trend was observed for OS prediction (the AUCs of the combined model were 0.843, 0.741, and 0.725 for the training set, internal testing set, and external testing set, respectively).

TABLE 2. AUCs of Different Models

Survival Prediction

Kaplan–Meier survival estimates of the combined model and IPI in the training set, internal testing set, and external testing set are shown in Figures 2, 3, and 4, respectively. The Kaplan–Meier survival estimates of the radiomics model are shown in Supplemental Figure 2. The differences in survival rates between low- and high-risk groups were significant except for OS in the radiomics model in the external testing set (P = 0.053). Moreover, the combined model demonstrated a more distinct risk stratification than the radiomics model and IPI, with larger differences between subgroups for both PFS and OS prediction (all P < 0.05).

FIGURE 2. Kaplan–Meier curves for PFS of combined model (A), PFS of IPI (B), OS of combined model (C), and OS of IPI (D) in training set. Hazard ratio with 95% CI and log-rank P value are reported. HR = hazard ratio.

FIGURE 3. Kaplan–Meier curves for PFS of combined model (A), PFS of IPI (B), OS of combined model (C), and OS of IPI (D) in internal testing set. Hazard ratio with 95% CI and log-rank P value are reported. HR = hazard ratio.

FIGURE 4. Kaplan–Meier curves for PFS of combined model (A), PFS of IPI (B), OS of combined model (C), and OS of IPI (D) in external testing set. Hazard ratio with 95% CI and log-rank P value are reported. HR = hazard ratio.

DISCUSSION

In this study, we developed an analytic approach based on [18F]FDG PET radiomics using stacking ensemble learning for outcome prediction in DLBCL. Radiomics and combined clinical–radiomics models constructed by the stacking method outperformed those built on other single machine learning classifiers. Furthermore, the combined models integrating radiomics features and clinical information exhibited predictive performance superior to that of radiomics-only models and IPI.

To the best of our knowledge, this was the first study to evaluate the prognostic effect of [18F]FDG PET radiomics through a stacking ensemble learning approach in patients with DLBCL. Several previous studies have found that machine learning–based PET radiomics could be of prognostic importance in DLBCL (12–14). A multicenter study with 317 DLBCL patients suggested that the radiomics model based on LASSO logistic regression was predictive of 2-y time to progression, with an AUC of 0.76 (16). Another study using a LASSO-Cox algorithm reported an AUC of 0.748 for the radiomics model in the test set for PFS prediction (12). In a recent study, Jiang et al. used cross combination of 7 different machine learning algorithms for feature selection and found that the radiomics signature obtained by the support vector machine–support vector machine was highly predictive of PFS (AUC, 0.757) (14). Despite these encouraging findings, a recently developed ensemble learning approach has revealed diagnostic and prognostic advantages over a single machine learning method by aggregating multiple algorithms to achieve higher prediction accuracy (31,32). In our current study, the radiomics model built on a stacking ensemble learning approach outperformed those developed by the other 4 base classifiers and logistic regression, with AUCs of 0.715 and 0.707 for PFS prediction in the internal and external testing sets, respectively. This finding is consistent with the results from a recent radiomics study on DLBCL, in which a soft voting ensemble–based model showed higher accuracy than those based on single machine learning classifiers for 2-y event-free survival prediction (15). Notably, voting considers only linear relationships among classifiers whereas stacking is able to learn complex associations when individual base classifiers are heterogeneous (33). 
In our study, the combined model developed by 4 classifiers showed a more balanced performance than the other combinations, supporting the potential of stacking ensemble learning for radiomics analysis in DLBCL.

Our study also demonstrated that the combined models incorporating patient-level PET radiomics and clinical characteristics yielded higher AUCs and more distinct risk stratifications than IPI for outcome prediction in DLBCL, which is in line with previous observations (12,14,16). Recent studies suggested that the predictive ability of IPI has been weakened in the rituximab era (4). In this context, PET radiomics might add a new perspective on the phenotypic characteristics of DLBCL through profiling the intratumoral metabolic heterogeneity. Therefore, it is likely that considering both clinical and imaging features in analysis may offer a deeper understanding of the complex biologic properties of malignancy and thereby provide a better prognosis estimation.

Radiomics analysis in lymphoma remains challenging because of the lack of a primary site and the complexity of lesion delineation, particularly for disseminated disease. To date, no consensus has been reached on which segmentation method for lesion delineation in DLBCL is preferable. Although the 41%-of-SUVmax method has been recommended by the European Association of Nuclear Medicine for TMTV evaluation (21), this method is more likely to be influenced by interobserver variability (34). Other studies indicated that the SUV4.0 method could give a good approximation of TMTV for prediction of disease progression (35). Moreover, the impact of different segmentations on radiomics features for prognosis prediction in DLBCL remains to be explored. In our study, we compared the reliability of radiomics features based on 4 different segmentation methods. The SUV4.0 method yielded the highest interobserver reliability, with 830 features (66.7%) retained in ICC analysis, which is in line with the results from a recent study suggesting that SUV4.0 is the most stable approach (with excellent reliability for 84.8% of all features) among 6 semiautomatic segmentation methods (36). By contrast, the interobserver reliability of radiomics features based on 41%-of-SUVmax segmentation was the lowest in the current study, with only 46 features (3.7%) having excellent reliability. This discrepancy may be related to differences in TMTV delineation. Previous studies demonstrated that variations in segmentation methods could have a marked effect on the outer contour of the segmentation, thereby influencing radiomics features, especially morphologic metrics (36,37). In our study, the SUV4.0 method exhibited a higher TMTV estimation and more stable radiomics features than the 41%-of-SUVmax method, suggesting that radiomics features may be less sensitive to the segmentation method when the delineated TMTV is larger.

Several limitations of our study deserve mention. First, since this was a retrospective study with a relatively small sample size, our results need to be further validated in prospective multicenter studies involving a larger cohort of patients. Second, we applied only patient-level radiomics analysis; further studies are required to compare the impact of different lesion selection methods on radiomics analysis. Third, we applied ICC, Pearson correlation analysis, and LASSO for feature selection; further studies will be required to assess the performance of other strategies, for example, minimum redundancy maximum relevance and ReliefF. Fourth, to facilitate comparison with previous results, we used only PET images for radiomics analysis. A combination of PET and CT images may lead to the discovery of radiomics features that are more predictive. Fifth, Ki-67 expression and MYC/BCL-2 double-hit status are established prognostic factors but were not assessed in this study because of the incompleteness of the available data.

CONCLUSION

In the present study, we proposed an analytic approach using stacking ensemble learning for outcome prediction in DLBCL based on [18F]FDG PET radiomics. The stacking-based combined model that incorporates radiomics features and clinical characteristics could enable improved risk stratification in DLBCL patients.

DISCLOSURE

This study was partially supported by the National Natural Science Foundation of China (32027802), the National Key R&D Program of China (2021YFE0108300 and 2022YFE0118000), and the Key R&D Program of Zhejiang (2022C03071). No other potential conflict of interest relevant to this article was reported.

KEY POINTS

QUESTION: Can stacking ensemble learning–based [18F]FDG PET radiomics improve outcome prediction in patients with DLBCL?

PERTINENT FINDINGS: In a retrospective study of 240 DLBCL patients, a stacking ensemble learning–based model that incorporates radiomics features and clinical characteristics enabled improved risk stratification.

IMPLICATIONS FOR PATIENT CARE: The stacking ensemble learning–based model incorporating PET radiomics and clinical information can be useful for better survival prediction and therapeutic decision making.

Footnotes

  • Published online Jul. 27, 2023.

  • © 2023 by the Society of Nuclear Medicine and Molecular Imaging.

REFERENCES

  1. Tilly H, Gomes da Silva M, Vitolo U, et al. Diffuse large B-cell lymphoma (DLBCL): ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol. 2015;26(suppl 5):v116–v125.
  2. Crump M, Neelapu SS, Farooq U, et al. Outcomes in refractory diffuse large B-cell lymphoma: results from the international SCHOLAR-1 study. Blood. 2017;130:1800–1808.
  3. International Non-Hodgkin’s Lymphoma Prognostic Factors Project. A predictive model for aggressive non-Hodgkin’s lymphoma. N Engl J Med. 1993;329:987–994.
  4. Zhou Z, Sehn LH, Rademaker AW, et al. An enhanced International Prognostic Index (NCCN-IPI) for patients with diffuse large B-cell lymphoma treated in the rituximab era. Blood. 2014;123:837–842.
  5. Tian M, He X, Jin C, et al. Transpathology: molecular imaging-based pathology. Eur J Nucl Med Mol Imaging. 2021;48:2338–2350.
  6. Barrington SF, Kluge R. FDG PET for therapy monitoring in Hodgkin and non-Hodgkin lymphomas. Eur J Nucl Med Mol Imaging. 2017;44(suppl 1):97–110.
  7. Zhang X, Jiang H, Wu S, et al. Positron emission tomography molecular imaging for phenotyping and management of lymphoma. Phenomics. 2022;2:102–118.
  8. Cottereau AS, Lanic H, Mareschal S, et al. Molecular profile and FDG-PET/CT total metabolic tumor volume improve risk classification at diagnosis for patients with diffuse large B-cell lymphoma. Clin Cancer Res. 2016;22:3801–3809.
  9. Toledano MN, Desbordes P, Banjar A, et al. Combination of baseline FDG PET/CT total metabolic tumour volume and gene expression profile have a robust predictive value in patients with diffuse large B-cell lymphoma. Eur J Nucl Med Mol Imaging. 2018;45:680–688.
  10. Lambin P, Leijenaar RTH, Deist TM, et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat Rev Clin Oncol. 2017;14:749–762.
  11. Bi WL, Hosny A, Schabath MB, et al. Artificial intelligence in cancer imaging: clinical challenges and applications. CA Cancer J Clin. 2019;69:127–157.
  12. Zhang X, Chen L, Jiang H, et al. A novel analytic approach for outcome prediction in diffuse large B-cell lymphoma by [18F]FDG PET/CT. Eur J Nucl Med Mol Imaging. 2022;49:1298–1310.
  13. Frood R, Clark M, Burton C, et al. Discovery of pre-treatment FDG PET/CT-derived radiomics-based models for predicting outcome in diffuse large B-cell lymphoma. Cancers (Basel). 2022;14:1711.
  14. Jiang C, Li A, Teng Y, et al. Optimal PET-based radiomic signature construction based on the cross-combination method for predicting the survival of patients with diffuse large B-cell lymphoma. Eur J Nucl Med Mol Imaging. 2022;49:2902–2916.
  15. Ritter Z, Papp L, Zámbó K, et al. Two-year event-free survival prediction in DLBCL patients based on in vivo radiomics and clinical parameters. Front Oncol. 2022;12:820136.
  16. Eertink JJ, van de Brug T, Wiegers SE, et al. 18F-FDG PET baseline radiomics features improve the prediction of treatment outcome in diffuse large B-cell lymphoma. Eur J Nucl Med Mol Imaging. 2022;49:932–942.
  17. Eertink JJ, Zwezerijnen GJC, Cysouw MCF, et al. Comparing lesion and feature selections to predict progression in newly diagnosed DLBCL patients with FDG PET/CT radiomics features. Eur J Nucl Med Mol Imaging. 2022;49:4642–4651.
  18. Naimi AI, Balzer LB. Stacked generalization: an introduction to super learning. Eur J Epidemiol. 2018;33:459–464.
  19. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321–357.
  20. Bergstra J, Bengio Y. Random search for hyper-parameter optimization. J Mach Learn Res. 2012;13:281–305.
  21. Boellaard R, Delgado-Bolton R, Oyen WJ, et al. FDG PET/CT: EANM procedure guidelines for tumour imaging: version 2.0. Eur J Nucl Med Mol Imaging. 2015;42:328–354.
  22. Nioche C, Orlhac F, Boughdad S, et al. LIFEx: a freeware for radiomic feature calculation in multimodality imaging to accelerate advances in the characterization of tumor heterogeneity. Cancer Res. 2018;78:4786–4789.
  23. Wahl RL, Jacene H, Kasamon Y, Lodge MA. From RECIST to PERCIST: evolving considerations for PET response criteria in solid tumors. J Nucl Med. 2009;50(suppl 1):122S–150S.
  24. Shiri I, Vafaei Sadr A, Amini M, et al. Decentralized distributed multi-institutional PET image segmentation using a federated deep learning framework. Clin Nucl Med. 2022;47:606–617.
  25. Pfaehler E, van Sluis J, Merema BBJ, et al. Experimental multicenter and multivendor evaluation of the performance of PET radiomic features using 3-dimensionally printed phantom inserts. J Nucl Med. 2020;61:469–476.
  26. van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 2017;77:e104–e107.
  27. Zwanenburg A, Vallières M, Abdalah MA, et al. The Image Biomarker Standardization Initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295:328–338.
  28. Orlhac F, Boughdad S, Philippe C, et al. A postreconstruction harmonization method for multicenter radiomic studies in PET. J Nucl Med. 2018;59:1321–1328.
  29. Mikhaeel NG, Heymans MW, Eertink JJ, et al. Proposed new dynamic prognostic index for diffuse large B-cell lymphoma: International Metabolic Prognostic Index. J Clin Oncol. 2022;40:2352–2360.
  30. Ruopp MD, Perkins NJ, Whitcomb BW, Schisterman EF. Youden index and optimal cut-point estimated from observations affected by a lower limit of detection. Biom J. 2008;50:419–430.
  31. Chassagnon G, Vakalopoulou M, Battistella E, et al. AI-driven quantification, staging and outcome prediction of COVID-19 pneumonia. Med Image Anal. 2021;67:101860.
  32. Papp L, Spielvogel CP, Grubmuller B, et al. Supervised machine learning enables non-invasive lesion characterization in primary prostate cancer with [68Ga]Ga-PSMA-11 PET/MRI. Eur J Nucl Med Mol Imaging. 2021;48:1795–1805.
  33. Heisler M, Karst S, Lo J, et al. Ensemble deep learning for diabetic retinopathy detection using optical coherence tomography angiography. Transl Vis Sci Technol. 2020;9:20.
  34. Ilyas H, Mikhaeel NG, Dunn JT, et al. Defining the optimal method for measuring baseline metabolic tumour volume in diffuse large B cell lymphoma. Eur J Nucl Med Mol Imaging. 2018;45:1142–1154.
  35. Barrington SF, Zwezerijnen B, de Vet HCW, et al. Automated segmentation of baseline metabolic total tumor burden in diffuse large B-cell lymphoma: which method is most successful? A study on behalf of the PETRA consortium. J Nucl Med. 2021;62:332–337.
  36. Eertink JJ, Pfaehler EAG, Wiegers SE, et al. Quantitative radiomics features in diffuse large B-cell lymphoma: does segmentation method matter? J Nucl Med. 2022;63:389–395.
  37. Belli ML, Mori M, Broggi S, et al. Quantifying the robustness of [18F]FDG-PET/CT radiomic features with respect to tumor delineation in head and neck and pancreatic cancer patients. Phys Med. 2018;49:105–111.
  • Received for publication November 23, 2022.
  • Revision received May 31, 2023.
Journal of Nuclear Medicine
Vol. 64, Issue 10
October 1, 2023
Stacking Ensemble Learning–Based [18F]FDG PET Radiomics for Outcome Prediction in Diffuse Large B-Cell Lymphoma
Shuilin Zhao, Jing Wang, Chentao Jin, Xiang Zhang, Chenxi Xue, Rui Zhou, Yan Zhong, Yuwei Liu, Xuexin He, Youyou Zhou, Caiyun Xu, Lixia Zhang, Wenbin Qian, Hong Zhang, Xiaohui Zhang, Mei Tian
Journal of Nuclear Medicine Oct 2023, 64 (10) 1603-1609; DOI: 10.2967/jnumed.122.265244

Keywords

  • PET
  • diffuse large B-cell lymphoma
  • prognosis
  • machine learning
  • radiomics