Abstract
2410
Introduction: Total metabolic tumor volume (TMTV) and tumor dissemination (Dmax) calculated from baseline [18F]FDG-PET/CT images are prognostic biomarkers in diffuse large B-cell lymphoma (DLBCL) patients. Yet their automated calculation remains challenging. In this work, we investigated whether TMTV and Dmax could be replaced by surrogate features automatically calculated by an artificial intelligence (AI) algorithm from only two maximum intensity projections (MIPs) of the whole-body [18F]FDG-PET images.
Methods: Two cohorts of DLBCL patients from the REMARC (NCT01122472) and LNH073B (NCT00498043) trials were retrospectively analyzed. Experts delineated lymphoma lesions from the baseline whole-body [18F]FDG-PET/CT images, from which TMTV and Dmax were measured. Coronal and sagittal MIP images and associated 2D MIP lesion masks were calculated. An AI algorithm was trained on the REMARC MIP data (297 patients) to segment lymphoma regions. The trained AI algorithm was tested on the unseen LNH073B MIP cohort (174 patients). The AI-based MIP segmentation results were then used to estimate surrogate TMTV (sTMTV) and surrogate Dmax (sDmax) on both datasets. The ability of the original (3D) and surrogate (MIP-based) TMTV and Dmax to stratify patients was compared.
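The projection and surrogate-feature steps described in the Methods can be illustrated with a minimal sketch. The abstract does not give the exact definitions of sTMTV and sDmax, so the ones used here are assumptions for illustration only: sTMTV is taken as the lesion area summed over the coronal and sagittal MIP masks, and sDmax as the sum over both views of the largest distance between two lesion pixels. All function names are hypothetical, and plain Python lists are used to keep the sketch dependency-free (an array library would be the usual choice in practice).

```python
from itertools import combinations
from math import hypot


def max_projection(volume, axis):
    """Maximum intensity projection of a 3D volume indexed [z][y][x].

    axis="y" collapses the anterior-posterior axis (coronal view, [z][x]);
    axis="x" collapses the left-right axis (sagittal view, [z][y]).
    Works for PET intensities and for binary lesion masks alike.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == "y":
        return [[max(volume[z][y][x] for y in range(ny)) for x in range(nx)]
                for z in range(nz)]
    if axis == "x":
        return [[max(volume[z][y][x] for x in range(nx)) for y in range(ny)]
                for z in range(nz)]
    raise ValueError("axis must be 'x' or 'y'")


def lesion_area(mask2d, pixel_area=1.0):
    """Area covered by lesion pixels in a 2D binary mask (assumed sTMTV term)."""
    return sum(sum(row) for row in mask2d) * pixel_area


def lesion_spread(mask2d, spacing=(1.0, 1.0)):
    """Largest distance between two lesion pixels (assumed sDmax term)."""
    points = [(r * spacing[0], c * spacing[1])
              for r, row in enumerate(mask2d)
              for c, value in enumerate(row) if value]
    return max((hypot(a[0] - b[0], a[1] - b[1])
                for a, b in combinations(points, 2)), default=0.0)


# Toy 4 x 3 x 3 lesion mask with two lesion voxels far apart.
mask3d = [[[0] * 3 for _ in range(3)] for _ in range(4)]
mask3d[0][0][0] = 1
mask3d[3][2][1] = 1

coronal = max_projection(mask3d, axis="y")
sagittal = max_projection(mask3d, axis="x")

# Assumed surrogate definitions: sum of the per-view values.
surrogate_tmtv = lesion_area(coronal) + lesion_area(sagittal)
surrogate_dmax = lesion_spread(coronal) + lesion_spread(sagittal)
```

In the study itself the 2D masks come from the AI segmentation of the MIP images rather than from projecting an expert 3D mask; the projection of the expert mask shown here corresponds to how the training targets are described.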
Results: The AI algorithm, evaluated patient-wise, achieved a 0.80 median Dice score (interquartile range [IQR]: 0.63-0.89), 80.7% (IQR: 64.5%-91.3%) sensitivity, and 99.7% (IQR: 99.4%-99.9%) specificity on the REMARC data. On the LNH073B data, the AI algorithm yielded a 0.86 (IQR: 0.77-0.92) Dice score, 87.9% (IQR: 74.9%-94.4%) sensitivity, and 99.7% (IQR: 99.4%-99.8%) specificity. The Dice score was not significantly different between the coronal and sagittal views (p>0.05). For the biomarker and survival analysis, 382 patients [287 REMARC, 95 LNH073B] (mean age, 62.1 years ±13.4 [standard deviation]; 207 men) for whom survival data were available were evaluated. Expert-based 3D biomarkers were significantly correlated with the associated surrogate biomarkers obtained automatically using AI. sTMTV was highly correlated with TMTV for the REMARC and LNH073B datasets (Spearman r=0.878 and r=0.752 respectively), and so were sDmax and Dmax (r=0.709 and r=0.714 respectively). For the REMARC data, the hazard ratios (HR) for progression-free survival (PFS) of 3D- and MIP-based features were similar, e.g., TMTV: 11.24 (95% confidence interval (CI): 2.10-46.20), sTMTV: 11.81 (95% CI: 3.29-31.77), and Dmax: 9.0 (95% CI: 2.53-23.63), sDmax: 12.49 (95% CI: 3.42-34.50). The time-dependent areas under the ROC curves (tdAUC) were TMTV: 0.67, sTMTV: 0.65, and Dmax: 0.65, sDmax: 0.68. For the LNH073B data, the tdAUC for PFS of 3D- and MIP-based features derived using AI were TMTV: 0.62 (95% CI: 0.49-0.75), sTMTV: 0.66 (95% CI: 0.53-0.80), and Dmax: 0.56 (95% CI: 0.39-0.72), sDmax: 0.58 (95% CI: 0.41-0.74). The concordance indices between the [18F]FDG-PET/CT expert-delineated biomarkers and the [18F]FDG-PET MIP AI-driven surrogate biomarkers were 0.861 (between TMTV and sTMTV) and 0.775 (between Dmax and sDmax) on the REMARC cohort. On the LNH073B data, they were 0.784 (between TMTV and sTMTV) and 0.768 (between Dmax and sDmax).
The classification of patients into three risk groups using the expert-driven 3D-based TMTV and Dmax agreed with the classification based on the AI-driven MIP-based sTMTV and sDmax. Visual assessment of the segmentation results suggested that the MIP-based surrogate biomarkers tended to perform well relative to the 3D-based biomarkers when the patient had lesions spread over the body, and less well when the patient had a large bulky lesion.
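The patient-wise segmentation metrics reported in the Results (Dice score, sensitivity, specificity) follow the standard confusion-matrix definitions, which can be sketched as below. This is an illustrative implementation on 2D binary masks stored as nested lists, not the evaluation code used in the study; the function name and the toy masks are hypothetical.

```python
def segmentation_metrics(pred, truth):
    """Return (dice, sensitivity, specificity) for two 2D binary masks.

    Both masks are nested lists of 0/1 values with identical shape.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # lesion pixel correctly segmented
            elif p:
                fp += 1      # background labelled as lesion
            elif t:
                fn += 1      # lesion pixel missed
            else:
                tn += 1      # background correctly left out

    overlap_denominator = 2 * tp + fp + fn
    # Convention assumed here: two empty masks count as perfect agreement.
    dice = 2 * tp / overlap_denominator if overlap_denominator else 1.0
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return dice, sensitivity, specificity


# Toy example: one true positive, one miss, one false alarm, five true negatives.
truth_mask = [[1, 1, 0, 0],
              [0, 0, 0, 0]]
pred_mask = [[1, 0, 1, 0],
             [0, 0, 0, 0]]

dice, sensitivity, specificity = segmentation_metrics(pred_mask, truth_mask)
```

Because background pixels dominate a whole-body MIP, specificity is expected to sit close to 100% even for mediocre segmentations, which is why the Dice score and sensitivity are the more informative figures.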
Conclusions: Surrogate TMTV and Dmax calculated from only two PET MIP images are prognostic biomarkers in DLBCL patients and can be automatically estimated using an AI algorithm. This might considerably facilitate the calculation and use of these features in clinical practice.