Abstract
1176
Introduction: Deep learning techniques have been used to recover high-quality images from low-dose PET images. Recent developments have shown that incorporating dilated convolutions into the U-Net architecture (dNet) gives superior image recovery performance, owing to the expanding nature of dilated convolutions and their larger field-of-view [1,2]. Most current deep learning techniques are trained using supervised learning, which means that every low-count training image must be paired with a high-quality PET image. As a consequence, a large amount of paired data is required to train such a network, which can be difficult to obtain for newly developed tracers. More recently, unsupervised learning techniques using a U-Net based deep learning framework [3] have been proposed for this task. They need no pairs of low-count and high-count images, which increases feasibility for clinical studies or newly developed tracers without large datasets [4]. Since there is no high-quality label for the network to learn from, an important aspect of these unsupervised learning techniques is a stopping criterion that halts training while the output is still of high quality. The recent PET unsupervised learning technique used contrast-to-noise ratio (CNR) as the stopping criterion. The novelty of this work is therefore twofold: 1) an unsupervised learning-based dNet framework is used to optimize image recovery and 2) improved stopping criteria are developed for optimal image recovery.
Methods: 185 MBq (5 mCi) of 18F-FDG was administered to a subject. Listmode data were acquired using a dedicated MRI head coil on a Siemens Biograph mMR PET/MRI scanner. Attenuation maps were generated using an established MRI-based algorithm [5,6]. Scanner attenuation maps were also extracted for reconstruction. Single static PET images were reconstructed from 10-minute emission data (50-60 minutes after injection) using Siemens’ E7tools with ordered subset expectation maximization (OSEM). Low-count PET data (10% of counts) were generated through Poisson thinning of the listmode file. The unsupervised learning dNet has skip connections similar to those of U-Net but incorporates dilated convolutions [2]. Mean absolute error (MAE) was used as the loss function. As in the previous unsupervised learning framework [4], anatomical MPRAGE images were used as input to the network, which was trained to output the low-count PET image. To optimize the stopping criteria, three different metrics were used: CNR, structural similarity index measure (SSIM), and the loss function MAE. These metrics were calculated after each epoch against the low-count image, and the best output was chosen by finding the optimal combination of the following stopping criteria: 1) CNR, 2) CNR and MAE, 3) SSIM and MAE. Final outputs from the unsupervised learning dNet framework were then compared to full-count data to assess the best stopping criteria using the following quantitative metrics: 1) Peak Signal to Noise Ratio (PSNR), 2) SSIM, 3) Mean Absolute Percent Error (MAPE), all with respect to the full-count image.
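As an illustration of two steps named in the Methods, the following minimal Python sketch shows event-wise thinning of listmode data and the MAE, PSNR and MAPE metrics computed against a reference image. It is not the code used in this study. All function names are hypothetical, images are represented as flat lists of voxel values, the PSNR peak is assumed to be the maximum of the reference image, and MAPE is assumed to skip voxels whose reference value is near zero. Keeping each event independently with probability 0.1 is one common way to realize Poisson thinning, since independently thinned Poisson counts remain Poisson with a proportionally reduced mean.

```python
import math
import random


def thin_events(events, keep_fraction, seed=0):
    """Keep each listmode event independently with probability keep_fraction.

    Hypothetical helper: 'events' is any sequence of event records.
    """
    rng = random.Random(seed)
    return [e for e in events if rng.random() < keep_fraction]


def mae(pred, ref):
    """Mean absolute error between two equally sized voxel lists."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def psnr(pred, ref):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak value is the maximum of the reference image.
    """
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max(ref) ** 2 / mse)


def mape(pred, ref, eps=1e-8):
    """Mean absolute percent error.

    Assumption: voxels with |ref| <= eps are excluded to avoid division by zero.
    """
    terms = [abs(p - r) / abs(r) for p, r in zip(pred, ref) if abs(r) > eps]
    return 100.0 * sum(terms) / len(terms)


if __name__ == "__main__":
    # Toy data only: integers stand in for listmode events.
    events = list(range(10000))
    low_count = thin_events(events, keep_fraction=0.10)
    print("kept events:", len(low_count))

    full_count_image = [1.0, 2.0, 4.0]
    network_output = [1.5, 2.0, 3.0]
    print("MAE :", mae(network_output, full_count_image))
    print("PSNR:", psnr(network_output, full_count_image))
    print("MAPE:", mape(network_output, full_count_image))
```

In practice the metrics would be evaluated over 3D volumes, and SSIM would typically come from an established image-processing library rather than being reimplemented.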
Results: Figure 1 shows the best network output for each of the three stopping criteria for the proposed unsupervised dNet. When CNR is used alone, it may yield a high CNR for the chosen ROI but not for the entire image, as seen in Figure 1. The outputs selected with “CNR and MAE” and “SSIM and MAE” look visually better than the output selected with “CNR” alone. As demonstrated in Table 1, among the three stopping criteria assessed, our proposed “SSIM and MAE” performed best across all metrics.
Conclusions: Previous unsupervised PET denoising frameworks have used a U-Net architecture with CNR, measured either in a lesion or in a specific region of muscle, as the stopping criterion. We proposed a dNet architecture with improved stopping criteria and showed that our proposed criterion for unsupervised PET denoising, namely “SSIM and MAE”, outperforms the previous criterion.
Table 1: Quantitative metrics used to evaluate unsupervised learning output using Full-count image a