Abstract
242130
Introduction: Lung cancer poses a significant global health challenge, necessitating improved prognostic methods for personalized treatment. This study explores the integration of clinical and imaging features to enhance overall survival (OS) prediction. Clinical features, including demographics, molecular characteristics, and medical history, provide insight into individual disease profiles, while imaging features offer a detailed visualization of the cancer's dynamics. By combining handcrafted radiomic features (RFs) and deep features (DFs) with machine learning, our integrated approach aims to address the complex heterogeneity of lung cancer, improving the accuracy of OS prediction and guiding tailored treatment strategies for better patient outcomes.
Methods: We included 199 lung cancer patients with PET/CT images and clinical information, drawn from The Cancer Imaging Archive (33 patients) and our local clinical database (166 patients). In pre-processing, PET images were first registered to the CT images, after which Standardized Uptake Value correction, clipping, and normalization were applied. We investigated the impact of clinical features on OS prediction when combined with two kinds of imaging features: (i) a set of 215 RFs extracted from manually segmented tumors using the standardized ViSERA software, and (ii) 1024 DFs taken from the bottleneck layer of a 3D autoencoder using 3 different masks: whole, cropped (32×32×32 mm³), and segmented PET/CT images. Clinical feature categories, including surgical, biopsy, clinical history, tumor staging, chemotherapy and radiotherapy, and demographic information, were used alongside the DFs and RFs to predict OS. We employed various hybrid machine learning systems (HMLSs), each consisting of one of 3 feature selection methods (set to select 20 relevant features) linked with one of 10 regression or 3 survival regression algorithms. Regressor hyperparameters were optimized using grid search with 5-fold cross-validation. The data were split into 80% for 5-fold cross-validation and 20% for external nested testing.
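As an illustration only, one HMLS of the kind described above (F-Regression selection of 20 features linked to Gradient Boosting, tuned by grid search with 5-fold cross-validation on an 80/20 split) could be sketched with scikit-learn as follows. The data are synthetic placeholders, and the hyperparameter grid, random seeds, and variable names are assumptions for the example rather than the settings used in this study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data: 199 patients x 215 features (sizes taken from the text).
rng = np.random.default_rng(0)
X = rng.normal(size=(199, 215))
# Hypothetical OS in years, clipped to the reported outcome range.
y = np.clip(3.0 + 0.5 * X[:, :5].sum(axis=1) + rng.normal(scale=0.3, size=199),
            0.11, 6.6)

# 80% for 5-fold cross-validation, 20% held out for testing.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Feature selection sits inside the pipeline so it is refit on each
# training fold and never sees the validation fold.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_regression, k=20)),
    ("model", GradientBoostingRegressor(random_state=0)),
])

# Hypothetical grid; the study's actual search space is not stated.
param_grid = {
    "model__n_estimators": [50, 100],
    "model__max_depth": [2, 3],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X_dev, y_dev)

cv_mae = -search.best_score_  # mean MAE over the 5 folds for the best setting
test_mae = mean_absolute_error(y_test, search.predict(X_test))
n_selected = int(search.best_estimator_.named_steps["select"].get_support().sum())

print(f"5-fold CV MAE: {cv_mae:.2f} years")
print(f"Held-out test MAE: {test_mae:.2f} years")
```

Swapping `f_regression` for another scoring function, or `GradientBoostingRegressor` for another regressor, would give the other selector and regressor pairings in the same structure.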
Results: In the RF framework without any clinical features, CT-RF (RFs extracted from CT) combined with F-Regression feature selection (FR) + Gradient Boosting (GB) achieved a mean absolute error (MAE) of 0.74±0.08 years [outcome range: 0.11-6.6 years] in 5-fold cross-validation and 1.03±0.2 years in external nested testing (Fig. 1). In the DF framework without any clinical features, PET-C-DF (DFs extracted from the cropped PET image) combined with FR + Extra Trees (ET) significantly outperformed the RF framework, with an MAE of 0.38±0.03 and an external nested test MAE of 1.04±0.23 (P-value<0.0001, paired t-test). Clinical features alone yielded an MAE of 0.17±0.01 with an external nested test MAE of 0.6±0.11 using R-Regression feature selection (RR) + ET. Adding clinical features to both the RF and DF frameworks improved performance (in both 5-fold cross-validation and external nested testing) in most HMLSs; the lowest MAE, 0.13±0.04 with an external nested test MAE of 0.55±0.05, was obtained by combining PET-W-DF (DFs extracted from the whole PET image) with clinical features and applying Mutual Information feature selection + GB. In survival analysis, RR + Fast Survival Support Vector Machine applied to a combination including clinical features achieved the highest c-index of 0.82±0.01 and a log-rank p-value<<0.0001, with an external nested test c-index of 0.77±0.08 (see Fig. 2).
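For readers less familiar with the c-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It assumes a higher risk score means shorter expected survival, and it skips pairs with tied survival times as a simplification. The function name and the toy data are hypothetical and do not reproduce the evaluation code of this study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    times: observed follow-up times
    events: 1 if death was observed, 0 if the patient was censored
    risk_scores: model output, higher meaning shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (hypothetical values): times in years, two censored patients.
times = [0.5, 1.2, 2.0, 3.5, 6.0]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.7, 0.8, 0.3, 0.1]
c_index = concordance_index(times, events, risk)
print(c_index)  # 7 of 8 comparable pairs are concordant: 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.82 and 0.77.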
Conclusions: Our study indicated that adding clinical features to imaging features significantly improved survival prediction performance compared with imaging data alone. Furthermore, DFs alone, combined with an appropriate HMLS, significantly enhanced survival prediction performance compared with RFs alone.