PT - JOURNAL ARTICLE
AU - Weisman, Amy
AU - Lee, Inki
AU - Im, HyungJun
AU - McCarten, Kathleen
AU - Kessel, Sandy
AU - Schwartz, Cindy
AU - Kelly, Kara
AU - Santoro-Fernandes, Victor
AU - Jeraj, Robert
AU - Cho, Steve
AU - Bradshaw, Tyler
TI - Machine learning-based assignment of Deauville scores is comparable to interobserver variability on interim FDG PET/CT images of pediatric lymphoma patients
DP - 2020 May 01
TA - Journal of Nuclear Medicine
PG - 1434--1434
VI - 61
IP - supplement 1
4099 - http://jnm.snmjournals.org/content/61/supplement_1/1434.short
4100 - http://jnm.snmjournals.org/content/61/supplement_1/1434.full
SO - J Nucl Med 2020 May 01; 61
AB - Purpose: Visually assigned Deauville scores (DS) on interim FDG PET images are used internationally for reporting response of lymphoma to treatment and in making subsequent treatment decisions. However, visual scores can be prone to interobserver variability due to challenges in identifying sites of residual disease, especially in differentiating FDG uptake relative to reference mediastinal blood pool and liver activity. The purpose of this study was to assess the feasibility of calculating DS in pediatric lymphoma patients using a completely automated method based on machine learning. Methods: 18F-FDG PET/CT images of 97 pediatric Hodgkin’s lymphoma (HL) patients were retrospectively analyzed. All patients had high-quality baseline (pre-therapy) and interim PET/CT images amenable to quantitative analysis, acquired in the Children’s Oncology Group (COG) AHOD0831 high-risk pediatric HL phase 3 clinical trial. Two experienced nuclear medicine physicians identified and segmented sites of FDG-avid disease using two PET thresholding methods (absolute 2.5 SUV and 40% maximum tumor SUV). Visual DS were assigned separately after independent review by two experienced nuclear medicine physicians, after which a consensus score was reached.
Baseline and interim PET and CT images were used as input to a 3D patch-based, multi-resolution pathway CNN. The CNN was trained to automatically predict physician segmentations on baseline and follow-up images. Lesion contours detected on baseline images were registered to follow-up interim images using a whole-body deformable registration technique. From the lesions that were detected by the CNN at interim and spatially overlapped with lesions from baseline, six PET features were extracted: SUVmax, MTV, TLG, qPET (ratio of SUVpeak to liver uptake), median SUV in bone marrow, and median SUV in spleen. A linear support vector machine classifier was implemented using these six features to predict consensus DS. Automated DS was compared to consensus DS using accuracy and Cohen’s kappa, and this agreement was compared with the agreement between the two individual readers’ DS. Results: On interim scans, consensus scores showed 56/97 patients had DS greater than 3 (DS1: 2, DS2: 16, DS3: 23, DS4: 35, DS5: 21). Overall accuracy of automated 5-class DS was 61%, with Cohen’s kappa of κ=0.47 and Cohen’s weighted kappa of κ=0.74. Between the two individual observer scores, agreement of the 5-class DS was 60%, with Cohen’s kappa of κ=0.47 and Cohen’s weighted kappa of κ=0.78. For binary classification grouping scores of DS 1, 2, and 3 vs. 4 and 5, the automated method’s accuracy was 82%, with Cohen’s kappa of κ=0.64. For comparison, interobserver DS showed a two-class agreement of 89% and κ=0.77. Similar results were found for grouping based on DS of 1 and 2 vs. DS of 3, 4, or 5: automated results showed 86% accuracy and κ=0.52, while interobserver DS showed 87% agreement and κ=0.62. Conclusions: A fully automated method for assigning Deauville scores on interim PET scans was developed using a dataset from a multi-institutional prospective clinical trial of pediatric HL patients. Performance comparable to interobserver variability of visual DS was found for five-class labeling. 
Binary DS classification results were more difficult to replicate; however, validation in larger datasets including more patients with interim PET DS less than 3 is needed. Acknowledgements: This work was partially supported by GE Healthcare. We would also like to thank IROC-Rhode Island for their assistance with image transfers.
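
The agreement statistics reported in the Results (accuracy, Cohen's kappa, weighted kappa, and the two binary groupings of DS) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names (`accuracy`, `cohen_kappa`, `binarize`) and the toy scores are hypothetical, and because the abstract does not state which weighting scheme was used for the weighted kappa, both linear and quadratic weights are offered here as assumptions.

```python
from collections import Counter


def accuracy(a, b):
    """Fraction of cases where the two raters assign the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b, categories=(1, 2, 3, 4, 5), weights=None):
    """Cohen's kappa between two raters over ordered categories.

    weights=None        -> unweighted kappa (any disagreement counts as 1)
    weights="linear"    -> disagreement penalised by |i - j|
    weights="quadratic" -> disagreement penalised by (i - j) ** 2
    The weighting used in the abstract is not stated; these are assumptions.
    """
    n = len(a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    observed = Counter((index[x], index[y]) for x, y in zip(a, b))
    row = Counter(index[x] for x in a)  # marginal counts, rater A
    col = Counter(index[y] for y in b)  # marginal counts, rater B

    def penalty(i, j):
        if weights is None:
            return 0.0 if i == j else 1.0
        if weights == "linear":
            return float(abs(i - j))
        if weights == "quadratic":
            return float((i - j) ** 2)
        raise ValueError("weights must be None, 'linear' or 'quadratic'")

    pairs = [(i, j) for i in range(k) for j in range(k)]
    # Observed and chance-expected (from the marginals) weighted disagreement.
    d_obs = sum(penalty(i, j) * observed[(i, j)] for i, j in pairs) / n
    d_exp = sum(penalty(i, j) * row[i] * col[j] for i, j in pairs) / n ** 2
    if d_exp == 0:
        return float("nan")  # both raters used one identical category
    return 1.0 - d_obs / d_exp


def binarize(scores, positive_from=4):
    """Group DS into two classes, e.g. DS 1-3 vs. DS 4-5 for positive_from=4."""
    return [int(s >= positive_from) for s in scores]


# Toy scores, made up for illustration only (not data from the study).
automated = [1, 2, 3, 4, 5, 4, 3, 2]
consensus = [1, 2, 3, 4, 5, 5, 3, 3]

five_class_accuracy = accuracy(automated, consensus)
five_class_kappa = cohen_kappa(automated, consensus)
five_class_weighted = cohen_kappa(automated, consensus, weights="linear")
binary_kappa = cohen_kappa(
    binarize(automated), binarize(consensus), categories=(0, 1)
)
```

Calling `binarize(scores, positive_from=3)` gives the second grouping described in the Results (DS 1 and 2 vs. DS 3, 4, or 5). Weighted kappa gives partial credit for near misses such as DS 4 vs. DS 5, which is why it can be considerably higher than the unweighted value on an ordinal five-point scale.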