Abstract
1411
Objectives: PET-derived tumor burden biomarkers have shown promising results for risk stratification and response assessment 1 2 3. Their measurement requires detection of all tumor sites, an often demanding task that is impractical to perform manually in clinical routine. Deep learning has achieved high accuracy for lesion detection and segmentation in medical images 4. However, de novo training of deep learning algorithms typically requires large datasets with expert-annotated ground truth to reach sufficient accuracy. In this work we show how features extracted by a convolutional neural network trained to detect tumor sites in 18F-FDG PET/CT can be exploited to train a deep learning algorithm for whole-body tumor assessment in 68Ga-PSMA-11 PET/CT using transfer learning 5.
Methods: Fifty consecutive patients referred for 68Ga-PSMA-11 PET/CT for either primary staging or biochemical recurrence and 20 consecutive patients referred for the same modality for all indications of prostate cancer were assessed by two experienced nuclear physicians, respectively. Each region with elevated tracer uptake was segmented using semi-automatic tools and labeled as either physiological or suspicious. Each region was also assigned an anatomical location from a set of 46 possible sites, including sites of physiological tracer uptake and anatomical sites relevant for staging. The total group of 70 patients was split into training (60%) and testing (40%) sets. A convolutional neural network (CNN) trained on 629 subjects with lymphoma or lung cancer to evaluate regions of elevated uptake in 18F-FDG PET/CT 6 was used for transfer learning to 68Ga-PSMA-11 PET/CT. The network layers that extract hierarchical image features were kept fixed, while the fully connected layers that determine the output were trained on 68Ga-PSMA-11 PET/CT examples. All CNN hyperparameters were kept constant. The hold-out test set was used exclusively to evaluate the performance of the CNN. For comparison, training was also performed without transfer learning. To evaluate the performance of the CNN for whole-body tumor burden assessment, regions of elevated uptake in the test-set patients were systematically selected with SUVmax > 3, delineated at a threshold of 45% of the local SUVmax, and labeled using the CNN. Regions classified as physiological were discarded, and the remaining regions were used to generate a fully automated estimate of the total tumor mask. Precision, recall and Dice score of the total tumor mask segmentation were assessed with respect to the total tumor mask determined by expert annotation.
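As an illustration of the region delineation and mask comparison described above, the following is a minimal sketch and not the implementation used in this study. It assumes a region is given as a mapping from voxel index to SUV and that masks are sets of voxel indices; the function names, the inclusive comparison at the 45% cutoff and the handling of empty masks are all assumptions made for the example.

```python
def is_candidate(suv, suvmax_cutoff=3.0):
    """Assumed selection rule: keep a region if its SUVmax exceeds the cutoff."""
    return max(suv.values()) > suvmax_cutoff


def threshold_region(suv, fraction=0.45):
    """Return the voxels at or above `fraction` of the local SUVmax.

    `suv` maps a voxel index (any hashable, e.g. a tuple) to its SUV.
    Using >= rather than > at the cutoff is an assumption.
    """
    cutoff = fraction * max(suv.values())
    return {voxel for voxel, value in suv.items() if value >= cutoff}


def mask_agreement(predicted, reference):
    """Precision, recall and Dice score between two sets of voxel indices."""
    overlap = len(predicted & reference)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = len(predicted) + len(reference)
    dice = 2 * overlap / total if total else 0.0
    return precision, recall, dice


# Toy region with four voxels along one axis (hypothetical values).
region_suv = {(0, 0, 0): 10.0, (0, 0, 1): 5.0, (0, 0, 2): 4.0, (0, 0, 3): 2.0}
if is_candidate(region_suv):
    automated = threshold_region(region_suv)      # cutoff is 4.5
    expert = {(0, 0, 1), (0, 0, 2)}               # hypothetical expert mask
    print(mask_agreement(automated, expert))
```

In a whole-body setting the automated mask would be the union of all thresholded regions that the classifier does not label as physiological, compared against the union of the expert-annotated suspicious regions.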
Results: In total, 1955 regions with elevated tracer uptake were annotated, including 1470 regions with physiological uptake and 485 with suspicious uptake, of which 59 were pelvic lymph nodes, 109 were distant lymph nodes and 285 were bone lesions. For the annotated regions in the test set, the F-score for classification as physiological or suspicious was 0.95 with and 0.91 without transfer learning, and the average per-class recall for anatomical labeling was 0.64 with and 0.50 without transfer learning. For the test subjects in whom at least one suspicious finding was identified in the annotations (23/28), the median per-subject recall of the whole-body tumor mask was 0.80 (interquartile range [IQR]: 0.54-0.96), precision was 0.62 (IQR: 0.42-0.90) and Dice score was 0.61 (IQR: 0.50-0.77).
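The two region-level metrics reported above can be computed as in the following minimal sketch. The abstract does not state which F-score variant was used or which class was treated as positive; the sketch assumes the balanced F1 score with "suspicious" as the positive class, and it assumes that average per-class recall is the unweighted mean of the recall of each class present in the reference labels. All names and labels are hypothetical.

```python
from collections import Counter


def f1_score(true_labels, predicted_labels, positive):
    """Balanced F-score (F1) for one positive class; F1 is an assumption."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def average_per_class_recall(true_labels, predicted_labels):
    """Unweighted mean of per-class recall over classes in the reference."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    if not totals:
        return 0.0
    return sum(hits[label] / totals[label] for label in totals) / len(totals)


# Toy example with hypothetical labels for four regions.
truth = ["suspicious", "suspicious", "physiological", "physiological"]
guess = ["suspicious", "physiological", "physiological", "physiological"]
print(f1_score(truth, guess, positive="suspicious"))
print(average_per_class_recall(truth, guess))
```

Averaging recall per class rather than per region keeps rare anatomical sites from being outweighed by frequent ones, which matters when 46 possible sites are unevenly represented.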
Conclusions: In this study we show how a deep learning algorithm trained on a large set of 18F-FDG PET/CT images can be adapted to assess 68Ga-PSMA-11 PET/CT by leveraging transfer learning with a limited training set. The optimized CNN can evaluate regions of elevated tracer uptake for whole-body tumor burden assessment in good agreement with expert evaluation. Transfer learning may be used to improve algorithms for efficiently assessing total tumor burden biomarkers with novel tracers when the availability of expert-annotated cases is limited.