%0 Journal Article
%A Gerben J.C. Zwezerijnen
%A Jakoba J. Eertink
%A Coreline N. Burggraaff
%A Sanne E. Wiegers
%A Ekhlas A.I.N. Shaban
%A Simone Pieplenbosch
%A Daniela E. Oprea-Lager
%A Pieternella J. Lugtenburg
%A Otto S. Hoekstra
%A Henrica C.W. de Vet
%A Josee M. Zijlstra
%A Ronald Boellaard
%T Interobserver Agreement on Automated Metabolic Tumor Volume Measurements of Deauville Score 4 and 5 Lesions at Interim 18F-FDG PET in Diffuse Large B-Cell Lymphoma
%D 2021
%R 10.2967/jnumed.120.258673
%J Journal of Nuclear Medicine
%P 1531-1536
%V 62
%N 11
%X Metabolic tumor volume (MTV) on interim PET (I-PET) is a potential prognostic biomarker for diffuse large B-cell lymphoma (DLBCL). Implementation of MTV on I-PET requires a consensus on which semiautomated segmentation method delineates lesions most successfully with least user interaction. Methods used for baseline PET are not necessarily optimal for I-PET because of lower lesional SUVs at I-PET. Therefore, we aimed to evaluate which method provides the best delineation quality for Deauville score (DS) 4–5 DLBCL lesions on I-PET at the best interobserver agreement on delineation quality and, second, to assess the effect of lesional SUVmax on delineation quality and performance agreement. Methods: DS 4–5 lesions from 45 I-PET scans were delineated using 6 semiautomated methods: a fixed SUV threshold of 2.5 g/cm3, a fixed SUV threshold of 4.0 g/cm3 (SUV4.0), an adaptive threshold corrected for source-to-local background activity contrast at 50% of the SUVpeak, 41% of SUVmax per lesion, a majority vote including voxels detected by at least 2 methods, and a majority vote including voxels detected by at least 3 methods (MV3). Delineation quality per MTV was rated by 3 independent observers as acceptable or nonacceptable. For each method, observer scores on delineation quality, specific agreement, and MTV were assessed for all lesions and per category of lesional SUVmax (<5, 5–10, >10).
Results: In 60 DS 4–5 lesions on I-PET, MV3 performed best, with acceptable delineation in 90% of lesions and a positive agreement of 93%. Delineation quality scores and agreement per method strongly depended on lesional SUV: the best delineation quality scores were obtained using MV3 in lesions with an SUVmax of less than 10 and using SUV4.0 in more 18F-FDG–avid lesions. Consequently, overall delineation quality and positive agreement improved by applying the most preferred method per SUV category instead of using MV3 as the single best method. The MV3- and SUV4.0-derived MTVs of lesions with an SUVmax of more than 10 were comparable after exclusion of visually failed MV3 contouring. For lesions with an SUVmax of less than 10, MTVs using different methods correlated poorly. Conclusion: On I-PET, MV3 performed best and provided the highest interobserver agreement regarding acceptable delineations of DS 4–5 DLBCL lesions. However, delineation-method preference strongly depended on lesional SUV. Therefore, we suggest exploration of an approach that identifies the optimal delineation method per lesion as a function of tumor 18F-FDG uptake characteristics, that is, SUVmax.
%U https://jnm.snmjournals.org/content/jnumed/62/11/1531.full.pdf