TY  - JOUR
T1  - Interobserver agreement in automated metabolic tumor volume measurements of Deauville score 4 and 5 lesions at interim 18F-FDG PET in DLBCL
JF  - Journal of Nuclear Medicine
JO  - J Nucl Med
DO  - 10.2967/jnumed.120.258673
SP  - jnumed.120.258673
AU  - Gerben JC Zwezerijnen
AU  - Jakoba J Eertink
AU  - Coreline N Burggraaff
AU  - Sanne E Wiegers
AU  - Ekhlas A Shaban
AU  - Simone Pieplenbosch
AU  - Daniela E Oprea-Lager
AU  - Pieternella J Lugtenburg
AU  - Otto S Hoekstra
AU  - Henrica CW de Vet
AU  - Josee M Zijlstra
AU  - Ronald Boellaard
Y1  - 2021/03/01
UR  - http://jnm.snmjournals.org/content/early/2021/03/05/jnumed.120.258673.abstract
N2  - Introduction: Metabolic tumor volume (MTV) on interim-PET (I-PET) is a potential prognostic biomarker for diffuse large B-cell lymphoma (DLBCL). Implementation of MTV on I-PET requires consensus on which semi-automated segmentation method delineates lesions most successfully with the least user interaction. Methods used for baseline PET are not necessarily optimal for I-PET because lesional standardized uptake values (SUV) are lower at I-PET. Therefore, we aimed to evaluate which method provides the best delineation quality of Deauville-score (DS) 4-5 DLBCL lesions on I-PET with the best interobserver agreement on delineation quality and, secondly, to assess the effect of lesional SUVmax on delineation quality and performance agreements. Methods: DS4-5 lesions from 45 I-PET scans were delineated using six semi-automated methods: i) SUV 2.5, ii) SUV 4.0, iii) adaptive threshold [A50%peak], iv) 41% of maximum SUV [41%max], v) majority vote including voxels detected by ≥2 methods [MV2] and vi) majority vote including voxels detected by ≥3 methods [MV3]. Delineation quality per MTV was rated by three independent observers as acceptable or non-acceptable. For each method, observer scores on delineation quality, specific agreements and MTV were assessed for all lesions, and per category of lesional SUVmax (<5, 5-10, >10).
Results: In 60 DS4-5 lesions on I-PET, MV3 performed best, with acceptable delineation in 90% of lesions and a positive agreement (PA) of 93%. Delineation quality scores and agreements per method strongly depended on lesional SUV: the best delineation quality scores were obtained using MV3 in lesions with SUVmax<10 and SUV4.0 in more FDG-avid lesions. Consequently, overall delineation quality and PA improved by applying the most preferred method per SUV category instead of using MV3 as the single best method. MV3- and SUV4.0-derived MTVs of lesions with SUVmax>10 were comparable after excluding visually failed MV3 contouring. For lesions with SUVmax<10, MTVs using different methods correlated poorly. Conclusion: On I-PET, MV3 performed best and provided the highest interobserver agreement regarding acceptable delineations of DS4-5 DLBCL lesions. However, delineation method preference strongly depended on lesional SUV. Therefore, we suggest exploring an approach that identifies the optimal delineation method per lesion as a function of tumor FDG uptake characteristics, i.e. SUVmax.
ER  -