Abstract
241056
Introduction: Convolutional neural networks (CNNs) show potential for predicting treatment outcome in diffuse large B-cell lymphoma (DLBCL). Using maximum intensity projections (MIPs) derived from baseline [18F]-FDG PET/CT scans has facilitated the implementation of such models by reducing the computational cost and data storage required compared with whole-scan datasets. The main advantage of CNNs is that they do not require tumor segmentation, which is time-consuming and requires manual intervention. The aim of this study was to extend the validation of a MIP-CNN model, previously developed with the HOVON-84 trial data and validated on the PETAL trial data by Ferrández et al. (Sci Rep, 2023), to 5 other international clinical trials (GSTT15 [Mikhaeel et al. EJNMMI, 2016], IAEA [Carr et al. JNM, 2014], NCRI [Mikhaeel et al. Br J Haematol, 2021], SAKK [Mamot et al. JCO, 2015] and HOVON-130 [Chamuleau et al. Haematol, 2020]). The predictive performance of the MIP-CNN model was compared with the International Prognostic Index (IPI) and with two PET segmentation-dependent models (the clinical PET model and the PET model).
Methods: A total of 1140 DLBCL patients were included in this study, of whom 296 HOVON-84 patients were used to train the MIP-CNN model and 844 were used for external validation. The primary outcome was 2-year time-to-progression (TTP). The MIP-CNN model was trained on coronal and sagittal MIPs. The clinical PET model reported by Eertink et al. (Blood, 2023) included metabolic tumor volume (MTV), maximum distance between the bulkiest lesion and another lesion (Dmaxbulk), peak standardized uptake value (SUVpeak), age and performance status, whereas the PET model included only PET data, i.e. MTV, Dmaxbulk and SUVpeak. Model performance was assessed using the area under the receiver operating characteristic curve (AUC).
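As an illustration of the two computational steps named in the Methods (projecting a PET volume to coronal and sagittal MIPs, and scoring predictions with the AUC), the following is a minimal sketch and not the pipeline used in the study. The axis layout `volume[z][y][x]` (z = head to foot, y = anterior to posterior, x = left to right) and the function names `coronal_mip`, `sagittal_mip` and `roc_auc` are assumptions made for this example.

```python
from typing import List, Sequence

# Hypothetical layout (an assumption for this sketch): volume[z][y][x],
# with z = head-to-foot, y = anterior-posterior, x = left-right.
Volume = List[List[List[float]]]


def coronal_mip(volume: Volume) -> List[List[float]]:
    """Project along the anterior-posterior (y) axis: result is indexed [z][x]."""
    return [
        [max(row[x] for row in axial_slice) for x in range(len(axial_slice[0]))]
        for axial_slice in volume
    ]


def sagittal_mip(volume: Volume) -> List[List[float]]:
    """Project along the left-right (x) axis: result is indexed [z][y]."""
    return [[max(row) for row in axial_slice] for axial_slice in volume]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case (label 1) receives a
    higher score than a random negative case (label 0); ties count as 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    toy_volume = [[[1, 2], [3, 0]], [[0, 5], [4, 1]]]  # toy 2x2x2 volume
    print(coronal_mip(toy_volume))
    print(sagittal_mip(toy_volume))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

Each MIP reduces a three-dimensional volume to a single two-dimensional image, which is the source of the storage and compute savings mentioned in the Introduction. The pairwise AUC shown here is quadratic in the number of patients; it is chosen for readability, not efficiency.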
Results: For each individual trial, the performance of the three predictive models was consistently better than that of the IPI (Figure 1). The increase in AUC for the MIP-CNN model ranged from 2% to 15% compared to IPI. For the MIP-CNN model, the IAEA trial showed the lowest AUC increase (0.56 to 0.57) and the PETAL trial showed the largest AUC increase (0.65 to 0.74). The PET model showed an improved AUC compared to IPI, ranging from 5% to 27%, with the lowest AUC increase for HOVON-130 (0.53 to 0.55) and the largest AUC increase for PETAL (0.65 to 0.78). The clinical PET model AUC increase ranged from 13% to 37% compared to IPI, with the lowest AUC increase for HOVON-130 (0.53 to 0.60) and the largest AUC increase for SAKK (0.51 to 0.70). PETAL was the external trial with the highest AUC overall for the three models (MIP-CNN = 0.74, PET = 0.78 and clinical PET = 0.75) and HOVON-130 was the trial with the lowest AUC (MIP-CNN = 0.56, PET = 0.55 and clinical PET = 0.60) (Figure 2).
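The percentage increases quoted above can be read as relative changes in AUC with respect to the IPI. This reading is an assumption, but it reproduces the clinical PET figures: 0.51 to 0.70 for SAKK gives about 37%, and 0.53 to 0.60 for HOVON-130 gives about 13%. Percentages computed from the rounded AUCs need not match every reported range exactly. A minimal sketch, with `relative_auc_increase` as a hypothetical helper name:

```python
def relative_auc_increase(auc_reference: float, auc_model: float) -> float:
    """Relative AUC gain of a model over a reference (here the IPI), in percent.

    Assumption for this sketch: the reported percentages are relative
    increases, i.e. 100 * (AUC_model - AUC_IPI) / AUC_IPI.
    """
    if auc_reference <= 0:
        raise ValueError("reference AUC must be positive")
    return 100.0 * (auc_model - auc_reference) / auc_reference


if __name__ == "__main__":
    # Clinical PET model versus IPI, using the rounded AUCs quoted in the Results.
    print(round(relative_auc_increase(0.51, 0.70)))  # SAKK
    print(round(relative_auc_increase(0.53, 0.60)))  # HOVON-130
```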
Conclusions: The MIP-CNN was predictive of outcome in 6 individual external DLBCL trials, with higher performance than the IPI. The PET model and the clinical PET model, both of which rely on tumor segmentation, performed comparably. Our MIP-CNN can predict treatment outcome in DLBCL without tumor segmentation, but at the cost of lower prognostic performance than the segmentation-dependent models.