Abstract
285
Introduction: In our previous study, we developed a convolutional neural network (CNN)-based system that classified whole-body FDG PET images into 3 categories with high accuracy: 1) benign, 2) malignant, and 3) equivocal. Beyond detecting malignant findings, localizing them with a CNN-based system would provide further information to help both physicians and technologists in clinical settings. In this study, we developed a CNN-based system that predicts the location of malignant uptake and evaluated the accuracy of its predictions. To clarify how the CNN-based system behaves, we examined the predictions for each location.
Materials and Methods: This retrospective study included 3,485 sequential patients who underwent whole-body FDG PET-CT with 2 scanners at our institute between January 2016 and December 2017. A nuclear medicine physician evaluated every image based on the FDG PET maximum intensity projection (MIP) images and diagnostic reports. A patient was labeled as malignant when any malignant uptake was observed. The location of malignant uptake was classified at the same time into A) head-and-neck, B) chest, C) abdomen, and D) pelvic region. In the experiment for the head-and-neck region, a new labeling system was introduced to classify the images into 3 categories: 1) benign in the head-and-neck region, 2) malignant in the head-and-neck region, and 3) equivocal in the head-and-neck region. In each region-specific experiment, a case was labeled as benign if there was no malignant uptake in the region of interest, even if there was malignant uptake in other regions. We used a network model with a configuration equivalent to ResNet24 to classify whole-body FDG PET images.
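The region-specific labeling rule described in the Methods can be illustrated with a minimal sketch. The function name, the region strings, and the set-based inputs are hypothetical. The abstract does not state how the equivocal category was assigned, so the handling of equivocal findings here (and the precedence of malignant over equivocal) is an assumption made only for illustration.

```python
from enum import Enum


class Label(Enum):
    BENIGN = "benign"
    MALIGNANT = "malignant"
    EQUIVOCAL = "equivocal"


# The four regions named in the abstract (string spellings are illustrative).
REGIONS = ("head-and-neck", "chest", "abdomen", "pelvic")


def region_label(malignant_regions, equivocal_regions, target):
    """Return the label of one case for a single region-specific experiment.

    malignant_regions: set of regions where malignant uptake was observed.
    equivocal_regions: set of regions with equivocal findings (assumed input).
    target: the region the experiment is about.
    """
    if target not in REGIONS:
        raise ValueError(f"unknown region: {target}")
    if target in malignant_regions:
        return Label.MALIGNANT
    # Assumption: equivocal is checked only when the target is not malignant.
    if target in equivocal_regions:
        return Label.EQUIVOCAL
    # Malignant uptake elsewhere does not change the label for this region.
    return Label.BENIGN
```

Under this sketch, a patient with malignant uptake only in the chest is benign in the head-and-neck experiment and malignant in the chest experiment.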
Results: A total of 76,785 MIP images were investigated. When images from ‘malignant in the head-and-neck region’ patients were given to the trained model, the accuracy was 97.3%. The accuracies were 97.8% and 96.2% for ‘benign in the head-and-neck region’ patients and ‘equivocal in the head-and-neck region’ patients, respectively. When images with malignant uptake in each region were given to the trained model, the accuracies were 97.3% (head-and-neck), 96.6% (chest), 92.8% (abdomen), and 99.6% (pelvic region), respectively.
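The accuracies above are reported separately for each true category. One plausible reading is that each figure is the fraction of images in that category that the model classified correctly; the sketch below computes that quantity. The function name and the toy labels are hypothetical and are not data from this study.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly classified images within each true category."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / totals[label] for label in totals}


# Toy example (invented values, for illustration only).
true = ["malignant"] * 4 + ["benign"] * 2
pred = ["malignant", "malignant", "malignant", "benign", "benign", "benign"]
result = per_class_accuracy(true, pred)
# result == {"malignant": 0.75, "benign": 1.0}
```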
Conclusions: A frequent cause of classification failure was ‘strong physiological uptake’. Physiological uptake in specific organs such as the pharynx or intestine is often relatively high, which might have caused erroneous predictions. Differences in physiological uptake patterns may explain the different accuracy for each location. These results suggest that the system has the potential to help radiologists localize lesions with high accuracy. In the future, a CNN-based system that can localize malignant findings within a narrower range, or automatically segment malignant uptake, is expected. A network with deeper layers is technically feasible and is expected to improve performance. However, this requires a sufficient amount of data for training and testing to converge.