TY - JOUR
T1 - Automatic classification of myocardial 18-FDG uptake patterns using deep learning
JF - Journal of Nuclear Medicine
JO - J Nucl Med
SP - 525
LP - 525
VL - 61
IS - supplement 1
AU - Nicholas Josselyn
AU - Matthew MacLean
AU - Benjamin Fuchs
AU - Paco Bravo
AU - Walter Witschey
Y1 - 2020/05/01
UR - http://jnm.snmjournals.org/content/61/supplement_1/525.abstract
N2 - Introduction: Myocardial 18F-fluorodeoxyglucose positron emission tomography (FDG-PET) shows variable uptake patterns such as no uptake, diffuse uptake, and partial uptake in the form of focal or focal-on-diffuse uptake. Because of uncertainty in these uptake patterns, visual classification can vary when the same image is read by different experts. Robust, automatic techniques for classifying uptake patterns may improve the consistency of diagnosis between patients with similar uptake patterns and between caregivers reviewing the same data. Deep learning neural networks have shown excellent performance on some medical imaging tasks and can extract features and patterns unique to each uptake category. The purpose of this work was to develop a computer algorithm to classify patterns of myocardial FDG uptake using deep learning. These algorithms performed the joint tasks of left ventricular (LV) segmentation from whole-body FDG images and classification of the pattern of uptake within the LV myocardium. Methods: The LV was manually segmented (MIM Software Inc.) in 610 patients (age 64±14 years; 49% male) clinically referred for whole-body FDG PET/CT imaging for an oncology indication, and the myocardial FDG uptake pattern was visually classified as no uptake (n=225, 37%), diffuse uptake (n=250, 41%), or partial uptake (n=135, 22%). The data were split into 60% training, 20% validation, and 20% testing sets. We deployed a U-Net with 15 convolutional layers, 3 max-pooling layers, 3 layers of deconvolution and concatenation, and 1 dropout layer (batch normalized, with ReLU activation functions). Model performance was evaluated from the training and validation loss and Dice coefficients, in addition to linear regression and Bland-Altman plots comparing volume and activity quantification between the manual and predicted segmentations on the test data. Following segmentation, we used a 3D classification network with 8 convolutional layers, 4 max-pooling layers, 2 dense layers, and 2 levels of dropout. This network was trained on 609 images with identical data split parameters. Histogram equalization (MATLAB) was performed to enhance image contrast. Performance for each uptake category was further analyzed with a confusion matrix and ROC curves. Results: The U-Net achieved Dice scores of 91.4% (training), 79.1% (validation), and 78.9% (testing), and linear regression and Bland-Altman plots demonstrated good agreement between manual and automated segmentations. Linear regressions had an R2 of 0.353 and slope of 0.713 for volume measurements and an R2 of 0.934 and slope of 1.2 for activity measurements. Volumes in the no-uptake patients were underpredicted, potentially because there are fewer distinct features for either a network or a human observer to identify consistently when defining the LV ROI. The classification CNN performed well across all data sets. Per-category analysis showed a true-positive classification rate above 70% for all groups (78.3% for no uptake, 70.8% for diffuse, and 71.4% for partial). The largest source of confusion in the confusion matrix was between the diffuse and partial uptake groups (false-negative rates of 25% for diffuse uptake and 17.9% for partial uptake), which may be due to combining the focal and focal-on-diffuse groups into a single partial uptake group. ROC curves showed good AUCs of 0.96 for no-uptake patients, 0.91 for diffuse patients, and 0.77 for partial patients. Conclusions: Our fully automated method for segmenting the LV from whole-body FDG-PET images and classifying uptake into three clinically defined categories demonstrates that deep learning can be applied to patients receiving these scans. We intend to expand this work to four classification groups and to introduce unhealthy patients in order to distinguish between healthy and unhealthy FDG uptake in the LV.
ER - 
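The abstract gives only layer counts for the segmentation network. For illustration, a minimal U-Net with the stated structure (15 convolutional layers, 3 max-pooling layers, 3 deconvolution-and-concatenation steps, 1 dropout layer, batch normalization, and ReLU activations) could be sketched as below; the framework (Keras), the 2D input, the input shape, and the filter widths are assumptions for the sketch, not details taken from the abstract.

```python
# Minimal U-Net sketch matching the layer counts stated in the abstract:
# 15 convolutional layers, 3 max-pooling layers, 3 deconvolution +
# concatenation steps, and 1 dropout layer, with batch norm and ReLU.
# Framework, input shape, and filter widths are illustrative assumptions.
from tensorflow.keras import layers, models

def conv_block(x, filters):
    """Two 3x3 convolutions, each followed by batch normalization and ReLU."""
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    return x

def build_unet(input_shape=(128, 128, 1)):
    inputs = layers.Input(input_shape)

    # Encoder: 3 levels, each ending in max pooling (6 conv layers).
    skips, x = [], inputs
    for filters in (32, 64, 128):
        x = conv_block(x, filters)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)

    # Bottleneck with the single dropout layer (2 conv layers).
    x = conv_block(x, 256)
    x = layers.Dropout(0.5)(x)

    # Decoder: 3 transposed-convolution + concatenation steps (6 conv layers).
    for filters, skip in zip((128, 64, 32), reversed(skips)):
        x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skip])
        x = conv_block(x, filters)

    # Final 1x1 convolution brings the total to 15 convolutional layers.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return models.Model(inputs, outputs)
```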
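The Dice coefficient used to score the segmentations is twice the overlap between the predicted and manual LV masks divided by their combined size. A small NumPy sketch, with array names chosen for illustration:

```python
# Dice coefficient between a predicted and a manual binary LV mask.
import numpy as np

def dice_coefficient(pred_mask, manual_mask):
    """Both inputs are binary arrays of the same shape."""
    pred = pred_mask.astype(bool)
    manual = manual_mask.astype(bool)
    intersection = np.logical_and(pred, manual).sum()
    return 2.0 * intersection / (pred.sum() + manual.sum())
```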
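Likewise, the per-category results (true-positive rates from a confusion matrix and per-class ROC AUCs) correspond to a one-vs-rest analysis that could be computed with scikit-learn roughly as follows; the label encoding and variable names are assumptions, not the authors' code.

```python
# Per-category evaluation for a three-class uptake classifier: per-class
# true-positive (recall) rates from a confusion matrix and one-vs-rest
# ROC AUCs. Labels 0/1/2 for no/diffuse/partial uptake are an assumption.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.preprocessing import label_binarize

classes = ["no uptake", "diffuse", "partial"]

def evaluate(y_true, y_prob):
    """y_true: integer labels (n,); y_prob: softmax outputs (n, 3)."""
    y_pred = np.argmax(y_prob, axis=1)

    # Rows are true classes, columns are predictions; the diagonal divided
    # by the row sums gives the per-class true-positive (recall) rate.
    cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
    recall = cm.diagonal() / cm.sum(axis=1)

    # One-vs-rest ROC AUC for each uptake category.
    y_bin = label_binarize(y_true, classes=[0, 1, 2])
    aucs = [roc_auc_score(y_bin[:, k], y_prob[:, k]) for k in range(3)]

    for name, r, a in zip(classes, recall, aucs):
        print(f"{name}: recall={r:.3f}, AUC={a:.3f}")
```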