Abstract
Purpose: Radiomic features are potential imaging biomarkers for prognosis and predictive modeling in oncology. Selecting features that are sufficiently robust, or even completely insensitive, to the variability of imaging characteristics in multicenter data (scanner model, acquisition protocols and/or reconstruction settings) may help build robust models; however, it may also lead to a loss of potentially useful information, because predictive features are discarded beforehand. Our goal was to compare two approaches to radiomic predictive modeling relying on the embedded feature selection methods of machine learning: using only robust features or using all available features, in combination with ComBat harmonization.
Methods: Ninety-two IBSI-compliant radiomic features were extracted from FDG-PET and MRI sequences (T1, T1c, T2 and ADC maps) of 189 cervical cancer patients treated with radiochemotherapy in 3 centers (Brest, n = 117, and Nantes, n = 44, in France; Montreal, n = 28, in Canada). The feature distributions were harmonized using different versions of ComBat, including recently developed improved versions[1]. An intraclass correlation coefficient (ICC) > 0.90 was used to identify features robust across the 3 centers and the different scanners (two different PET scanners were used in Montreal). The predictive ability of these robust features was compared with that of features selected by three machine learning pipelines: LASSO for multivariate regression (MR), and embedded feature selection associated with Random Forest (RF) and Support Vector Machine (SVM). After a 70/30 split of the data into training and testing sets, models to predict recurrence were built with the 3 pipelines using either the untransformed or the harmonized features (4 ComBat versions). Given the class imbalance, the models were compared using the Matthews correlation coefficient (MCC), area under the curve (AUC), and balanced accuracy (BAcc).
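The three evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names and the toy labels are assumptions made for illustration, labels are assumed to be binary (1 = recurrence, 0 = no recurrence), and both classes are assumed to be present in the test set.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 by convention if undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (both classes must be present)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy imbalanced test set (hypothetical): 8 non-recurrences, 2 recurrences.
y_true = [0] * 8 + [1] * 2
always_negative = [0] * 10  # a trivial model that never predicts recurrence

plain_accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / 10
print(plain_accuracy)                              # 0.8
print(balanced_accuracy(y_true, always_negative))  # 0.5
print(matthews_corrcoef(y_true, always_negative))  # 0.0
```

The toy case shows why plain accuracy is not informative under class imbalance: a model that never predicts recurrence scores 0.8 in accuracy, whereas balanced accuracy and MCC both report chance-level performance.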
Results: Models built using features selected by either LASSO or RF/SVM embedded selection showed consistently better predictive ability, with or without harmonization. In MR, models built using LASSO-selected features outperformed those relying on ICC-selected features (BAcc 79% vs. 43%, AUC 80% vs. 45%, and MCC 0.48 vs. 0.05) using the untransformed data. A similar pattern was observed with RF and SVM: models built using features from embedded selection outperformed those relying only on ICC-selected ones. In line with our previous results, model performance was also consistently improved using harmonized features compared with untransformed ones, even for models using only robust features.
Conclusions: We have shown that feeding all available features to any machine learning pipeline relying on embedded feature selection techniques leads to models with much better predictive performance than pre-selecting a smaller set of features based on their robustness to changes in multicenter data. This suggests that, at least in our dataset, the informative and predictive features were the ones sensitive to changes in the imaging properties of the multicenter data, and that this information was lost during pre-selection. We thus recommend pooling datasets and then performing standardization, so that informative features are kept for model building. This study was partially funded by 766276 PREDICT H2020-MSCA-ITN-2017.