Abstract
Purpose: Multicentric studies are sorely lacking for a convincing demonstration of the potential clinical value of radiomics as a prognostic or predictive tool. This is because radiomic features are sensitive to variability in scanner models, acquisition protocols and reconstruction settings, which is unavoidable in a multicentric setting. A statistical harmonization method (ComBat) was developed to deal with the center effect. Our goal was to evaluate two improvements of ComBat that allow more flexibility in choosing a reference and improve the robustness of the estimation.
Methods: M-ComBat transforms all feature distributions to a chosen reference, instead of the overall mean, which avoids issues such as impossible values in the transformed features. B-ComBat adds bootstrap and Monte Carlo for improved robustness. BM-ComBat combines both. These were compared regarding their ability to harmonize features in a multicentric context in two different clinical settings: i) 119 locally advanced cervical cancer (LACC) patients with clinical as well as MRI and PET features, with 3 well-identified labels (the 3 clinical centers) used to apply ComBat; ii) 98 locally advanced laryngeal cancer (LALC) patients with contrast-enhanced CT features, from 5 centers with highly heterogeneous imaging settings, even within each site. In this second setting, unsupervised clustering was used to determine 2 labels. The impact of the improvements to ComBat was evaluated through three different machine learning pipelines for predicting the clinical outcomes, using 2 metrics: balanced accuracy (BAcc) and Matthews correlation coefficient (MCC).
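To illustrate the idea of aligning features to a chosen reference rather than to the overall mean, the following is a minimal sketch of a location and scale alignment for a single feature. It is a deliberate simplification and not the M-ComBat estimator itself: it omits the empirical Bayes estimation and covariate handling that ComBat relies on, and the function name `align_to_reference` and its arguments are hypothetical.

```python
from statistics import mean, pstdev


def align_to_reference(values, labels, reference_label):
    """Map each batch onto the mean and standard deviation of a reference batch.

    Simplified illustration only: no empirical Bayes shrinkage and no
    covariates, unlike ComBat and its variants.
    """
    # Mean and (population) standard deviation of the reference batch.
    ref = [v for v, lab in zip(values, labels) if lab == reference_label]
    ref_mean, ref_sd = mean(ref), pstdev(ref)

    # Per-batch mean and standard deviation.
    stats = {}
    for lab in set(labels):
        batch = [v for v, l in zip(values, labels) if l == lab]
        stats[lab] = (mean(batch), pstdev(batch))

    harmonized = []
    for v, lab in zip(values, labels):
        m, s = stats[lab]
        # Standardize within the batch, then rescale to the reference batch.
        z = (v - m) / s if s > 0 else 0.0
        harmonized.append(ref_mean + ref_sd * z)
    return harmonized


# Toy feature measured at two hypothetical centers, "A" (reference) and "B".
values = [1.0, 2.0, 3.0, 11.0, 14.0, 17.0]
labels = ["A", "A", "A", "B", "B", "B"]
harmonized = align_to_reference(values, labels, reference_label="A")
```

Because the reference batch is mapped onto itself, its values are left unchanged, so the transformed features stay within the range observed at the reference center.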
Results: Before harmonization, more than 96% of radiomic features had significantly different distributions between labels. These differences were successfully removed with all ComBat versions. The predictive ability of the radiomic models was always improved with harmonization, and the improved BM-ComBat provided the best results. This was observed consistently in both datasets, across all machine learning pipelines and performance metrics (Table 1).
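For reference, the two performance metrics (BAcc and MCC) can be computed from a binary confusion matrix as in the sketch below. It assumes binary outcomes coded as 0 and 1 with both classes present; the helper names and the toy labels are illustrative and are not taken from the study.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (assumes both classes occur)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 if the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example with made-up labels (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
bacc = balanced_accuracy(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
```

Both metrics account for class imbalance, which plain accuracy does not.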
Conclusions: The proposed modifications allow for more flexibility and robustness in the estimation. They also slightly but consistently improve the predictive power of the resulting radiomic models.
Acknowledgement: This study was partly funded by 766276 PREDICT H2020-MSCA-ITN-2017 and M2R201806006004.