Abstract 284
Introduction: Review of oncologic FDG PET images normally includes optimizing the view window, locating lesions on the coronal MIP, synchronized review of other-modality images (CT/MRI) in the transversal plane, size measurement, and lesion counting, which requires considerable labor. Deep learning is an emerging technique that allows automatic detection and segmentation of lesions or organs, and might accelerate the review of oncologic FDG PET images and enable improved characterization of lesions. In this study, we aim to develop automatic high-uptake lesion detection as a first step toward fast review of oncologic FDG PET images.
Methods: The deep learning model explored in this study was a 2D Faster R-CNN pretrained on the COCO image dataset and fine-tuned with annotated coronal FDG PET MIP images from 840 patients. The input matrix size was modified to 128 x 128 with 1 channel, and the output comprised 7 classes (5 classes of normally FDG-avid organs: brain, thyroid, heart, kidneys, and bladder; plus 2 classes representing various markers and high-uptake lesions) with corresponding bounding boxes, as detailed in Figure 1. Among the 840 MIP images, 820 were collected from publications and the internet via keyword search; their high-uptake lesions were localized by referring to the image descriptions and contoured with 2D bounding boxes. The remaining 20 were acquired at our imaging center. For validation of the lesion detection model, MIP FDG PET images of another 60 patients from our imaging center were annotated and enrolled as the testing set. Both training and testing images were annotated by a nuclear medicine physician with 5 years of experience. Pre-processing steps included image normalization, sliding-window cropping in the z direction, and resizing; the post-processing step was ROI union. During the training phase, the images were augmented on the fly by mirroring, scaling, and random insertion of various markers, so that in total 6 times the original number of training images were used to train the model. Confusion matrix analysis was used to evaluate the performance of the trained model in both multi-class and binary classification settings. The accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were reported for binary classification.
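The pre-processing steps above (normalization, sliding window in the z direction, resizing to the 128 x 128 input matrix) can be sketched as follows. The window height, stride, and min-max normalization scheme are assumptions for illustration; the abstract does not report these values.

```python
import numpy as np

def normalize(img):
    # Hypothetical min-max normalization to [0, 1]; the abstract does
    # not specify which normalization scheme was used.
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo + 1e-8)

def resize_nn(img, out_h=128, out_w=128):
    # Nearest-neighbour resize to the model's 128 x 128 input matrix.
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols]

def sliding_windows(mip, win=256, stride=128):
    # Crop overlapping windows along the z (vertical) axis of the
    # coronal MIP, then normalize and resize each crop.
    # Window size and stride are assumed values.
    h, _ = mip.shape
    crops = []
    for z0 in range(0, max(h - win, 0) + 1, stride):
        crop = mip[z0:z0 + win]
        crops.append(resize_nn(normalize(crop)))
    return np.stack(crops)
```

For example, a 512-pixel-tall MIP with a 256-pixel window and 128-pixel stride yields three 128 x 128 crops ready for the detector; the predicted boxes from overlapping crops would then be merged by the ROI-union post-processing step.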
Results: The disease spectrum of the training datasets consisted of head and neck cancer, lung cancer, esophageal cancer, lymphoma, whole-body metastasis, gynecological cancer, and thyroid cancer; the spectrum of the testing datasets was lung cancer, thyroid cancer, and whole-body metastasis. Among the 820 training datasets collected from publications and the internet, 733 were acquired with PET/CT and the remainder with PET/MR; the additional 20 training datasets from our center were acquired with both PET/MR (10) and PET/CT (10). Of the testing datasets from our center, 32 were PET/MR and 28 were PET/CT images, and in total 157 high-uptake lesions were observed in 48 patients. Confusion matrix analyses of the testing datasets, in both multi-class and binary form, are shown in Figure 2. For binary classification, the accuracy, sensitivity, specificity, PPV, and NPV were 97.2%, 95.4%, 98.1%, 96.8%, and 97.0%, respectively.
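The binary-classification metrics reported above follow directly from a 2 x 2 confusion matrix. A minimal sketch, using hypothetical TP/FP/TN/FN counts (not the study's actual counts), is:

```python
def binary_metrics(tp, fp, tn, fn):
    # Standard derivations from a 2 x 2 confusion matrix.
    return {
        "accuracy":    (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true-positive rate (recall)
        "specificity": tn / (tn + fp),  # true-negative rate
        "ppv":         tp / (tp + fp),  # positive predictive value
        "npv":         tn / (tn + fn),  # negative predictive value
    }

# Hypothetical counts for illustration only.
m = binary_metrics(tp=45, fp=5, tn=90, fn=10)
```

In this study the counts would come from matching predicted lesion bounding boxes against the physician annotations on the 60 testing patients.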
Conclusions: This study shows the promise of detecting high-uptake lesions on MIP PET images using the Faster R-CNN model. Model generalization can be improved by training with MIP PET images from publications and transfer learning with a few on-site MIP PET images.