Abstract
423
Objectives Lung cancer, the most common cancer worldwide, usually presents as solid pulmonary nodules (SPNs) on early diagnostic images. Identifying malignant disease at this early timepoint is critical for improving the success of surgical resection and increasing 5-year survival. This study aimed to determine whether quantitative heterogeneity derived from radiomics features on dual time point PET imaging (DTPI) can help differentiate malignant from benign SPNs.
Methods We retrospectively analyzed data from patients with SPNs who underwent DTPI 18F-FDG PET/CT between 2004 and 2014 at a single institution. Patients without a pathology-confirmed diagnosis and those with less than 1 year of surveillance follow-up imaging were excluded. The early and delayed PET scans were performed in the same session at 60 min and 180 min post injection, respectively. Nodules were identified and segmented on each scan by two experienced nuclear medicine physicians. The metabolic volume (MV) of each lesion was calculated on the early PET scan, and lesions with an MV smaller than 5 cc were excluded. Early and delayed lesion MVs were used to extract 59 imaging features from early PET, early CT, and delayed PET images. The SUVmax of each lesion was calculated on both the early and delayed PET images. The retention index (RI) of SUVmax was calculated for each lesion as the percent change in SUVmax from the early to the delayed timepoint. Four Support Vector Machine (SVM) models were built to classify the lesions as malignant or benign, each with a different feature set: (1) early PET features, (2) early PET and early CT features, (3) delayed PET features, and (4) all PET and CT features. Sequential forward floating selection within a nested cross-validation was used to reduce the dimension of the feature space and to evaluate the performance of the models. ROC analysis was performed on the clinical metrics, early SUVmax and RI of SUVmax, and on each model individually. The areas under the ROC curves (AUCs) were calculated and compared using DeLong's test.
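As an illustration of the RI definition above, the percent change in SUVmax can be written as RI = 100 × (delayed SUVmax − early SUVmax) / early SUVmax. The following is a minimal sketch, not the study's code; the function name `retention_index` and the example values are hypothetical.

```python
def retention_index(suv_early: float, suv_delayed: float) -> float:
    """Percent change in SUVmax from the early to the delayed scan."""
    if suv_early <= 0:
        # A non-positive early SUVmax makes the percent change undefined.
        raise ValueError("early SUVmax must be positive")
    return 100.0 * (suv_delayed - suv_early) / suv_early


# Hypothetical lesion (not study data): SUVmax rises from 4.0 at 60 min
# to 5.0 at 180 min post injection.
print(retention_index(4.0, 5.0))  # 25.0
```

A positive RI indicates increasing uptake between the two scans, and a negative RI indicates washout.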
Results In total, lesions from 85 patients were included in this study (63 malignant nodules and 22 benign nodules). The AUC for early SUVmax was 0.77. All of the models showed a larger AUC than early SUVmax. Models 1 and 2 had larger AUCs than early SUVmax (0.83 and 0.84, respectively), but the improvements were not significant. Models 3 and 4 had significantly larger AUCs than early SUVmax (0.90 and 0.91, respectively). Employing an optimal cut-off, Models 3 and 4 outperformed the clinical metrics in sensitivity, specificity, and accuracy (Table 1). An early SUVmax of 2.5 or greater is commonly used for clinical diagnosis; however, this criterion had very low specificity compared with the other metrics in this study. The AUC for RI was much lower, at only 0.56. Using an RI of SUVmax of 10% as the diagnostic criterion performed similarly to the standard SUVmax > 2.5. Three PET features (busyness, coarseness, and cluster prominence) were critical to discriminating benign from malignant nodules, as they had the highest frequency of inclusion across the cross-validation of all models.
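To make the cut-off comparison concrete, sensitivity, specificity, and accuracy for a fixed threshold such as SUVmax of 2.5 or greater can be computed as sketched below. This is an illustrative sketch only; the function name `diagnostic_metrics` and the toy scores and labels are hypothetical and are not the study data.

```python
def diagnostic_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, and accuracy for the rule score >= cutoff.

    labels: 1 = malignant, 0 = benign.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_malignant = score >= cutoff
        if predicted_malignant and label == 1:
            tp += 1
        elif predicted_malignant and label == 0:
            fp += 1
        elif not predicted_malignant and label == 0:
            tn += 1
        else:
            fn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign lesion")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Toy example (hypothetical values): early SUVmax for six lesions.
suvmax = [1.8, 3.1, 4.6, 2.7, 6.2, 2.2]
malignant = [0, 0, 1, 1, 1, 1]
print(diagnostic_metrics(suvmax, malignant, cutoff=2.5))
```

The same function applies to an RI threshold by passing RI values as `scores` and 10 as `cutoff`.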
Conclusions Quantitative heterogeneity measured by texture features on FDG DTPI images was useful for discriminating benign from malignant nodules in larger SPNs, especially on delayed FDG PET images. SVMs using texture features extracted from both time points of DTPI FDG PET/CT achieved significant improvements over standard clinical metrics in discriminating benign from malignant nodules.
Table 1 Diagnostic value for differentiation of malignant and benign SPNs with models and metrics