Abstract
Objectives: CT and MRI provide better anatomical resolution than PET. Over the past decades, various methods have been proposed to improve PET image quality using anatomical information, such as mutual information priors and the kernel method. Because most of these methods rely on an explicit model of the relationship between anatomical and functional features, a mismatch between the two modalities can introduce bias into PET images and impair image quality. In this work, we propose a novel co-learning 3D convolutional neural network (CNN) that extracts modality-specific features from PET/CT image pairs and fuses the complementary features to represent high-quality PET images. The pre-trained network is integrated into an iterative reconstruction framework to ensure data consistency.
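To make the data-consistency constraint concrete, a minimal sketch of the constrained ML formulation follows; the notation is ours, since the abstract only states that the pre-trained network is used to represent feasible PET images:

\hat{x} = \arg\max_{x} \sum_{i} \left( y_i \log\,[Px + r]_i - [Px + r]_i \right) \quad \text{subject to} \quad x = f_{\theta}(z),

where y is the measured sinogram, P the system matrix, r the expected randoms and scatter, f_{\theta} the pre-trained co-learning CNN with fixed weights \theta, and z its input (assumed here to be an intermediate PET estimate together with the co-registered CT). Maximizing the Poisson log-likelihood while restricting x to the CNN output enforces data consistency on the network representation; in practice such constrained problems are typically solved by alternating EM-like data-fit updates with network-constrained update steps.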
Methods: Six patients received one-hour (one bed) FDG dynamic PET/CT scans on a GE 690 PET/CT scanner. Five were used for training and one was reserved for testing. Data from 20 to 60 minutes post injection were used as high-count data and were reconstructed with the maximum likelihood (ML) expectation-maximization (EM) algorithm using 50 iterations to generate training labels. For each training subject, ten identically distributed realizations of low-count data were generated by randomly down-sampling the high-count data to 1/10th of the events. The low-count data were reconstructed with the ML-EM algorithm using 20, 30, and 50 iterations to account for different noise levels. PET and CT images (180×180×49 matrix with 3.27-mm cubic voxels) were co-registered using manufacturer-provided software. The low-count PET/CT image pairs were used as the input to train 3D denoising CNNs, with high-count PET images as labels. For each training pair, additional data augmentation consisting of one random translation and one random rotation was applied to both input and label to increase the diversity of the training data. The trained denoising CNNs were then used to represent feasible PET images in a constrained ML reconstruction. Two co-learning strategies were compared. One was a multi-channel input CNN, which treated the PET and CT volumes as separate channels of a single input. The other was a multi-branch CNN, which used separate encoders to extract features from the PET and CT images and combined the latent features before upsampling. We compared the proposed method with existing methods, including EM reconstruction with Gaussian filtering, kernel-based reconstruction, and CNN-based penalized reconstruction.
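For illustration, the sketch below (PyTorch; the layer counts, channel widths, and pooling choices are our own assumptions, not the authors' exact architecture) shows the multi-branch strategy: separate PET and CT encoders whose latent features are concatenated before upsampling. The multi-channel strategy instead stacks PET and CT as two input channels of a single encoder.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    # Two 3D convolutions with ReLU; padding=1 keeps the spatial size.
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

class MultiBranchDenoiser(nn.Module):
    # Separate PET and CT encoders; latent features are concatenated
    # (co-learned) before upsampling and decoding to a denoised PET volume.
    def __init__(self, base_ch=16):
        super().__init__()
        self.pet_enc = nn.Sequential(conv_block(1, base_ch), nn.MaxPool3d(2),
                                     conv_block(base_ch, 2 * base_ch))
        self.ct_enc = nn.Sequential(conv_block(1, base_ch), nn.MaxPool3d(2),
                                    conv_block(base_ch, 2 * base_ch))
        self.fuse = conv_block(4 * base_ch, 2 * base_ch)
        self.head = nn.Sequential(conv_block(2 * base_ch, base_ch),
                                  nn.Conv3d(base_ch, 1, kernel_size=1))

    def forward(self, pet_lowcount, ct):
        # Extract modality-specific features, then fuse the latent codes.
        z = torch.cat([self.pet_enc(pet_lowcount), self.ct_enc(ct)], dim=1)
        z = self.fuse(z)
        # Upsample back to the input grid (handles odd sizes such as 49 slices).
        z = F.interpolate(z, size=pet_lowcount.shape[2:],
                          mode='trilinear', align_corners=False)
        return self.head(z)

# Multi-channel alternative: stack PET and CT as two channels of one encoder,
# e.g. conv_block(2, base_ch) applied to torch.cat([pet, ct], dim=1).

model = MultiBranchDenoiser()
# Small random patch for illustration; full volumes are 180x180x49 voxels.
pet = torch.rand(1, 1, 32, 64, 64)   # low-count PET (batch, channel, D, H, W)
ct = torch.rand(1, 1, 32, 64, 64)    # co-registered CT on the same grid
denoised = model(pet, ct)            # trained against high-count EM images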
Results: The proposed constrained ML reconstruction produced higher-quality images than the other methods. Tumors in the lung region showed higher contrast in the constrained ML reconstruction than in the CNN-based penalized reconstruction, and image quality was further improved by incorporating the anatomical information. Moreover, the standard deviation in the liver was lower in the proposed reconstruction than in the kernel-based reconstruction.
Conclusions: This study indicates that by co-learning PET/CT information in a CNN, the constrained reconstruction method improves image quality and produces a better lesion contrast versus background standard deviation trade-off curve than existing methods. Acknowledgements: Support for this work includes NIH grant R21 EB026668.