Abstract
3166
Introduction: Medical imaging data are frequently affected by heterogeneity in image generation and by class imbalance, both of which make it difficult for data-driven learning methods to achieve strong, generalizable predictive performance. The purpose of this study was to investigate the impact of harmonization and oversampling methods on multi-center, imbalanced PET datasets, with specific application to radiomics-based predictive modeling of the histologic subtype of non-small cell lung cancer (NSCLC).
Methods: Radiomics analysis was performed on PET images acquired on multi-vendor (Philips, Siemens and GE) PET/CT scanners and reconstructed using different methods (i.e., VPFX, VPHD, VPHDS, OSEM). One hundred twenty-five patients with adenocarcinoma (ADC) and 27 patients with squamous cell carcinoma (SCC) from two independent institutions were randomly divided into training (50%) and testing (50%) datasets with approximately matching class-imbalance proportions; this process was repeated 50 times for further statistical analysis. Predictive performance was investigated for 25 cross-combinations, obtained by pairing each of 5 harmonization options (no harmonization, ComBat, centering-scaling, Singular Value Decomposition (SVD)-based matrix factorization and Independent Component Analysis (ICA)-based matrix factorization) with each of 5 oversampling options (no oversampling (NOS), synthetic minority oversampling technique (SMOTE), adaptive synthetic (ADASYN), borderline-SMOTE (BLSMOTE) and safe-level-SMOTE (SLSMOTE)). Before feature extraction, all images were interpolated to an isotropic voxel spacing of 4 × 4 × 4 mm³. Two hundred fifteen radiomic features (79 first-order features, including morphological, statistical, histogram and intensity-histogram features, and 136 3D texture features, including GLCM, GLRLM, GLSZM, GLDZM, NGTDM and NGLDM matrices) were extracted using the publicly available standardized environment for radiomics analysis (SERA). The minimum redundancy-maximum relevance (MRMR) feature selection method was used to reduce feature dimensionality, and the top k features were input into a logistic regression classifier (k was determined via 5-fold cross-validation within the training set). Area under the receiver operating characteristic curve (AUC) and balanced accuracy were used to evaluate predictive performance, and p-values from the paired t-test were reported for comparison of methods.
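As a rough illustration of two of the simpler steps named above, the sketch below shows centering-scaling applied per batch and SMOTE-style interpolation between minority-class samples, in plain Python. It is not the study's implementation: the function names, the toy feature values, the center labels "A" and "B", and the choice of k = 3 neighbours are all assumptions made for illustration only.

```python
import math
import random
import statistics


def center_scale_by_batch(values, batches):
    """Z-score one feature within each batch (e.g. scanner or center)."""
    out = [0.0] * len(values)
    for b in set(batches):
        idx = [i for i, bb in enumerate(batches) if bb == b]
        batch_vals = [values[i] for i in idx]
        mu = statistics.mean(batch_vals)
        sd = statistics.pstdev(batch_vals) or 1.0  # guard against zero spread
        for i in idx:
            out[i] = (values[i] - mu) / sd
    return out


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(p, x))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical values of one feature from two centers with different scales.
values = [10.0, 12.0, 14.0, 110.0, 120.0, 130.0]
batches = ["A", "A", "A", "B", "B", "B"]
harmonized = center_scale_by_batch(values, batches)

# Hypothetical minority-class samples in a two-feature space.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote(minority, n_new=5)
```

After centering-scaling, the two toy centers share the same mean and spread, so the center offset no longer dominates the feature. In a train/test design like the one described, oversampling would normally be applied to the training split only, so that synthetic samples never enter the test set.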
Results: ComBat harmonization (AUC 0.711; balanced accuracy 0.641) and BLSMOTE oversampling (AUC 0.667; balanced accuracy 0.607) showed good mean performance among the harmonization and oversampling methods evaluated in this study. The optimal combination, ComBat + BLSMOTE, performed significantly better than the combination of no harmonization and no oversampling (NOS) in terms of both AUC (0.708 ± 0.062 vs. 0.640 ± 0.064, p < 0.0001) and balanced accuracy (0.651 ± 0.067 vs. 0.544 ± 0.031, p < 0.0001).
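To make the balanced-accuracy figures easier to read, the sketch below computes balanced accuracy (the mean of per-class recall) for a hypothetical classifier that always predicts the majority class, using the cohort's class counts (125 ADC, 27 SCC). The always-ADC classifier is an illustrative assumption, not a model from this study, and the function name is hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Class counts from the cohort; predictions are a hypothetical "always ADC" rule.
y_true = ["ADC"] * 125 + ["SCC"] * 27
y_pred = ["ADC"] * 152

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(accuracy, 3))                 # 0.822
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

With this degree of imbalance, plain accuracy rewards ignoring the minority class, whereas balanced accuracy stays at 0.5 for such a rule. That is the reference point against which the balanced-accuracy values reported above can be compared.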
Conclusions: Our study showed a significant improvement in NSCLC-subtype predictive performance in multi-center, imbalanced PET radiomics analysis when harmonization and oversampling methods were applied. Harmonization improved overall data consistency by removing batch effects, while oversampling increased intra-class diversity by generating new samples, which could potentially improve the capture of biological status in NSCLC.