Abstract
1409
Objectives: Partial volume effects (PVE) are known to degrade quantitative accuracy in PET, and thus multiple partial volume compensation (PVC) methods have been developed. There is an important need to clinically evaluate these methods, but this is challenging due to the lack of ground truth. To address this challenge in evaluating quantitative imaging methods, a no-gold-standard evaluation (NGSE) technique has been developed [1-3] and validated in the context of evaluating segmentation [4-6] and quantification [2] methods. Our objective was first to validate the NGSE technique in the context of evaluating different PVC methods, and then to apply the technique to clinically evaluate PVC methods on a 15O-H2O brain PET dataset.
Methods: The NGSE technique we validated assumes a linear relationship between the true and measured quantitative values. Using maximum-likelihood estimation, and without access to the true quantitative values, it estimates the slope, bias, and noise standard deviation terms that parameterize this relationship [3, 7]. The ratio of the noise standard deviation to the slope (noise-to-slope ratio: NSR) is used to rank the methods on the basis of the precision of their quantitative values. We first validated this NGSE technique for the task of evaluating PVC methods for measuring mean standardized uptake value (SUVmean). For this purpose, the BrainWeb dataset was used to generate 400 realistic FDG-PET images with known SUVmean values. The images were post-processed using three strategies: (1) deconvolution regularized with symmetric Bowsher (sBowsher) [8], (2) region-based voxelwise (RBV) PVC [9], and (3) no PVC. We first evaluated whether the linearity assumption made by the NGSE technique was valid for these three strategies. Next, the NGSE technique was used to rank these three strategies based on how precisely they computed the SUVmean in whole gray matter. NSRs estimated by the NGSE technique were compared with those obtained when the ground truth was known. We also computed 95% confidence intervals (CIs) for the NSR differences to determine the most precise method. Following validation, we applied the NGSE technique to evaluate three PVC methods, namely (1) RBV, (2) non-local means (NLM), and (3) anatomically-guided non-local means (NLMA) [10], as applied to a dataset of 429 15O-H2O brain PET scans collected from the Baltimore Longitudinal Study of Aging (BLSA). The NGSE technique was applied to evaluate these methods on the task of computing SUV ratios (SUVRs) in the gray matter, medial frontal, and posterior cingulate regions.
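To make the figure of merit concrete, the following is a minimal sketch of how an NSR could be computed in the reference case where the ground truth is known, which is the baseline the NGSE estimates were compared against. It is not the NGSE estimator itself, which works without the true values. The function name, the ordinary least-squares fit, and the synthetic slope, bias, and noise values for the two placeholder methods are all assumptions made for illustration and are not taken from the study.

```python
import numpy as np


def nsr_with_ground_truth(true_vals, measured_vals):
    """Fit measured = slope * true + bias + noise by least squares.

    Returns (slope, bias, noise_std, nsr), where nsr = noise_std / slope.
    Hypothetical helper: assumes the true values are available.
    """
    true_vals = np.asarray(true_vals, dtype=float)
    measured_vals = np.asarray(measured_vals, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, bias = np.polyfit(true_vals, measured_vals, deg=1)
    residuals = measured_vals - (slope * true_vals + bias)
    # Unbiased residual standard deviation (two fitted parameters).
    noise_std = np.sqrt(np.sum(residuals ** 2) / (true_vals.size - 2))
    return slope, bias, noise_std, noise_std / slope


# Synthetic illustration only: made-up (slope, bias, noise_std) per method.
rng = np.random.default_rng(0)
true_suv = rng.uniform(1.0, 5.0, size=400)
methods = {"method_A": (0.9, 0.1, 0.05), "method_B": (0.7, 0.3, 0.15)}

nsr = {}
for name, (a, b, s) in methods.items():
    measured = a * true_suv + b + rng.normal(0.0, s, size=true_suv.size)
    nsr[name] = nsr_with_ground_truth(true_suv, measured)[3]

# A lower NSR means more precise, so sort in ascending order.
ranking = sorted(nsr, key=nsr.get)
print(ranking, nsr)
```

Dividing the noise standard deviation by the slope expresses the noise in units of the true value, so methods with different gains can be compared on the same scale.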
Results: In the realistic simulation study, the scatter plots and Pearson correlation coefficients (rho > 0.95) between the true SUVmean values and those obtained with the different PVC methods indicated that the linearity assumption made by the NGSE technique was valid. Next, the NGSE technique yielded the same ranking of the PVC methods as when the ground truth was known in 50 out of 50 noise realizations. More specifically, the NGSE technique predicted that RBV yielded the most precise quantitative values and that no PVC yielded the least precise quantitative values. In the clinical study with the BLSA dataset, the NGSE technique indicated that the NLMA method was the most precise (NSR: 0.055 in gray matter, 0.068 in medial frontal, and 0.080 in posterior cingulate).
Conclusion: The NGSE technique yielded accurate rankings of different PVC methods, as evaluated using realistic simulation studies. Application of this technique to an in vivo 15O-H2O brain PET dataset indicated that the NLMA PVC method yielded more precise SUVR values than the other PVC methods evaluated.