Abstract
P1380
Introduction: Elevated levels of noise in PET images reduce image quality and quantitative accuracy. Numerous factors, such as radiotracer dose and scan time, contribute to the occurrence and variability of PET image noise. Our objective is to leverage recent advances in deep learning to denoise PET images. Denoising with supervised learning requires pairs of noisy and clean images, which are difficult to obtain in clinical settings. Here we present a model that leverages a self-supervised approach named neighbor-to-neighbor (NB2NB) to denoise PET images using only single noisy inputs.
Methods: NB2NB was inspired by the well-known Noise2Noise (N2N) method, in which training requires no ground-truth images but does require two noisy realizations of the input image: during N2N training, one noisy image serves as the input and the other as the target. In contrast, NB2NB training uses two sub-images drawn from a single noisy input by a neighborhood subsampler: within each 2-by-2 window (stride 2), two distinct pixels are randomly selected, one for each sub-image, and the network is then trained as in N2N. Another distinguishing aspect of NB2NB is that a denoised image is pre-generated without gradient tracking and subsampled into two denoised sub-images using the same neighborhood subsampler; these feed a regularizer that accounts for the underlying difference in ground-truth pixel values between the subsampled noisy image pair. We implemented a U-Net architecture with three resolution levels as the NB2NB network on the PyTorch platform. The network uses additional anatomical information in the form of high-resolution anatomical MR images. For training and validation of the NB2NB network, we used both simulation and clinical datasets. For the simulation data, 5 sets of 18F-FDG PET images with corresponding T1-weighted MR images were synthesized from the BrainWeb database. Segmented gray matter, white matter, and cerebrospinal fluid volumes were used to generate noiseless PET images with a gray-to-white contrast ratio of 4:1, and the Siemens ECAT HR+ scanner model was used to generate sinogram data. Noisy data were generated using Poisson deviates with a mean of 50 million counts and then reconstructed with the OSEM algorithm to produce noisy PET images. Our clinical dataset consists of 18F-fluorodeoxyglucose (18F-FDG) PET scans and corresponding T1-weighted MR images obtained from the ADNI database. Because no noiseless ground-truth image is available for this dataset, the mean image across six 5-min time frames was used as the reference low-noise image after confirming that contrast levels were comparable across the six frames.
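The neighborhood subsampling and regularized loss described above can be sketched as follows (a minimal PyTorch illustration, not the authors' implementation; the function names, the `gamma` regularizer weight, and the tensor shapes are assumptions):

```python
import torch

def neighbor_subsample(img, perm=None):
    """Split a noisy image (N, C, H, W; H, W even) into two (N, C, H/2, W/2)
    sub-images: in each non-overlapping 2x2 window, two distinct pixels are
    picked at random, one per sub-image. Pass `perm` to reuse a selection."""
    n, c, h, w = img.shape
    windows = img.unfold(2, 2, 2).unfold(3, 2, 2).reshape(n, c, h // 2, w // 2, 4)
    if perm is None:
        # Random ordering of the 4 pixels in every window; the first two are kept.
        perm = torch.rand(n, 1, h // 2, w // 2, 4).argsort(dim=-1)
    idx1 = perm[..., 0:1].expand(n, c, h // 2, w // 2, 1)
    idx2 = perm[..., 1:2].expand(n, c, h // 2, w // 2, 1)
    sub1 = torch.gather(windows, -1, idx1).squeeze(-1)
    sub2 = torch.gather(windows, -1, idx2).squeeze(-1)
    return sub1, sub2, perm

def nb2nb_loss(network, noisy, gamma=1.0):
    """One NB2NB loss evaluation: N2N-style reconstruction between the two
    sub-images, plus a regularizer built from a denoised image that is
    pre-generated without gradient tracking."""
    sub1, sub2, perm = neighbor_subsample(noisy)
    pred = network(sub1)
    rec = torch.mean((pred - sub2) ** 2)
    with torch.no_grad():
        denoised = network(noisy)                   # pre-generated, no gradient
    d1, d2, _ = neighbor_subsample(denoised, perm)  # same subsampler selection
    # Correct for the ground-truth gap between the two sub-image locations.
    reg = torch.mean((pred - sub2 - (d1 - d2)) ** 2)
    return rec + gamma * reg
```

Because the two sub-images come from adjacent pixels, their underlying clean values differ slightly; the regularizer compensates for that gap using the pre-generated denoised image.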
Results: Extensive preliminary studies were carried out to compare several unsupervised and self-supervised deep learning methods. Each state-of-the-art network was tuned to its highest achievable PSNR, and both PSNR and SSIM were used as evaluation metrics. In the simulation results, the mean PSNR order with corresponding mean SSIM values was as follows: N2N (20.91 dB/0.88) > NB2NB with MR (20.53 dB/0.88) > NB2NB (20.39 dB/0.87) > Noise2Void with MR (20.31 dB/0.88) > Noise2Void (20.04 dB/0.86) > Gaussian (19.62 dB/0.85). In the clinical results, the mean PSNR and SSIM values were: N2N (30.68 dB/0.95), NB2NB with MR (32.28 dB/0.95), NB2NB (32.15 dB/0.94), Noise2Void with MR (32.11 dB/0.95), Noise2Void (31.92 dB/0.94), and Gaussian (30.68 dB/0.92). Except for N2N, which requires two noisy realizations, the proposed NB2NB with MR had the highest PSNR and SSIM among the deep learning models that require only a single noisy image.
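For reference, a PSNR figure like those reported above can be computed as follows (a minimal NumPy sketch; taking the peak as the maximum of the reference image is an assumption, since PET intensities are not bounded to a fixed range):

```python
import numpy as np

def psnr(reference, test):
    """Peak signal-to-noise ratio in dB between a reference image and a
    test (e.g. denoised) image of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0.0:
        return float("inf")           # identical images
    peak = reference.max()            # assumed peak convention
    return 10.0 * np.log10(peak ** 2 / mse)
```

For the simulation data the noiseless image serves as the reference; for the clinical data the six-frame mean image plays that role.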
Conclusions: We developed an NB2NB architecture with additional anatomical inputs to denoise single noisy PET images. Our results indicate that, except for N2N (which uses two noise realizations and is therefore closer to a gold standard), NB2NB with MR outperformed all other deep learning methods as well as Gaussian filtering.