Abstract
Attenuation correction is a notable challenge associated with simultaneous PET/MRI, particularly in neuroimaging, where sharp boundaries between air and bone volumes exist. This challenge leads to concerns about the visual and, more specifically, quantitative accuracy of PET reconstructions for data obtained with PET/MRI. Recently developed techniques can synthesize attenuation maps using only MRI data and are likely adequate for clinical use; however, little work has been conducted to assess their suitability for the dynamic PET studies frequently used in research to derive physiologic information such as the binding potential of neuroreceptors in a region. At the same time, existing PET/MRI attenuation correction methods are predicated on synthesizing CT data, which is not ideal, as CT data are acquired with much lower-energy photons than PET data and thus do not optimally reflect the PET attenuation map. Methods: We trained a convolutional neural network to generate patient-specific transmission data from T1-weighted MRI. Using the trained network, we generated transmission data for a testing set comprising 11 subjects scanned with 11C-labeled N-[2-[4-(2-methoxyphenyl)-1-piperazinyl]ethyl]-N-(2-pyridinyl)cyclohexanecarboxamide (11C-WAY-100635) and 10 subjects scanned with 11C-labeled 3-amino-4-(2-dimethylaminomethyl-phenylsulfanyl)benzonitrile (11C-DASB). We assessed both static and dynamic reconstructions. For dynamic PET data, we report differences in both the nondisplaceable and the free binding potential for 11C-WAY-100635 and distribution volume for 11C-DASB. Results: The mean bias for generated transmission data was −1.06% ± 0.81%. Global biases in static PET uptake were −0.49% ± 1.7% and −1.52% ± 0.73% for 11C-WAY-100635 and 11C-DASB, respectively. Conclusion: Our neural network approach is capable of synthesizing patient-specific transmission data with sufficient accuracy for both static and dynamic PET studies.
The benefits of PET/MRI are perhaps nowhere greater than in neuroimaging, where PET/MRI stands to significantly improve the quality of quantitative neuroimaging-driven studies in the psychiatric community (1) by offering reduced radiation burden, motion correction (2), partial-volume correction (3), and MRI-guided PET reconstruction (4,5). Despite these promised benefits, PET/MRI attenuation correction (AC) remains a nontrivial problem, particularly in the head and neck, where sharp boundaries between air and bone exist in the field of view.
Historically, PET AC has been straightforward: standalone scanners use a rotating nuclear transmission source, deriving the attenuation map with the same 511-keV photons imaged in PET, whereas PET/CT scanners make use of the photon attenuation data provided by CT. In PET/MRI, the strong magnetic field and limited space within the magnet make it impractical to implement either technique (6,7).
CT AC and transmission AC techniques exhibit contrasting benefits and drawbacks. Transmission AC theoretically provides a direct representation of the patient’s attenuation map, given the fact that 511-keV photons are used to acquire the attenuation data. On the other hand, CT AC must account for several nonidealities, most notably the fact that attenuation values determined using CT must be adjusted to account for the roughly 4- to 5-fold difference between CT and PET photon energies. This difference has been shown to lead to significant and heterogeneous overestimation of attenuation coefficients and radiotracer uptake (8,9). The overestimation is exacerbated by the polychromatic nature of CT beams (10), which obfuscates the effective energy at which attenuation coefficients are determined and subjects the CT AC maps to beam-hardening artifacts. From a research perspective, it is equally concerning that multiple methods exist for correcting these problems: CT AC techniques can differ in terms of both the CT acquisition technique and the processing steps used to perform energy scaling (9). Given its ease of acquisition, CT AC has become the preeminent technique for clinical scanning. Nevertheless, CT cannot be considered a genuine gold standard for AC data (11), and a recent multicenter review of several clinically acceptable PET/MRI attenuation protocols stated that not having a gold standard transmission scan for comparison was a limitation (12).
Techniques for deriving attenuation maps in PET/MRI generally fall into 2 categories: those that use specialized pulse sequences such as ultrashort or zero echo time (13–16) and those that seek to generate pseudo-CT data from an atlas of matched MRI and CT data (12,17,18). These techniques demonstrate unique shortcomings beyond their reliance on a CT gold standard for evaluation of function. In particular, MRI-alone methods consume scan time that could be dedicated to other sequences and typically assign predetermined attenuation values to any number of segmented tissue classes (19), thus preventing MRI-alone methods from accurately reflecting differences in bone density. At the same time, whereas atlas-based methods seek to reflect patient-specific attenuation values, they are dependent on accurate registration of the atlas onto the input MRI volume. Although certain techniques, such as patch-based learning, have been presented to mitigate the effects of misregistration (17,20), no truly registration-free atlas-based method has been independently evaluated in the literature.
Alongside these limitations is the concern about the lack of task-based validation of published methods’ suitability for kinetic modeling (21). Given the proportion of PET/MR scanning dedicated to research (22), this limitation represents a gap in the literature. There is reasonable concern about the parameters obtained in dynamic PET studies; a side-by-side comparison of several PET/MRI AC techniques for 18F-FDG, 11C-Pittsburgh compound B, and 18F-florbetapir demonstrated tracer-dependent differences in performance across all methods (12). This dependence on tracer distribution is a possible source of error in dynamic PET/MRI studies given the varying dynamics of the radiotracer throughout the brain during the scan. Encouragingly, recent work has demonstrated the stability of a pseudo-CT technique for kinetic modeling of 2′-methoxyphenyl-(N-2′-pyridinyl)-p-18F-fluoro-benzamidoethylpiperazine (18F-MPPF) (21), although this study was limited and examined the kinetic modeling of only a single radiotracer.
The emergence of convolutional neural networks (CNNs) has led to considerable examination of their feasibility for synthesizing patient-specific AC data for PET/MRI. Whereas initial works focused only on the quantitative accuracy of CNN-derived pseudo-CT images (23), recent studies have demonstrated accurate PET reconstruction. A CNN model depending on T1-weighted MRI data was shown to yield accurate PET reconstructions, although reconstruction analysis was confined to a small cohort and analyzed only short acquisitions of a single radiotracer (24). More recently, models depending on Dixon or a combination of Dixon and zero echo time were seen to provide accurate static PET reconstructions, although these models depend on additional MRI sequences and were again validated for only a single radiotracer (25); a similar approach based on Dixon and zero echo time data was shown to yield significantly more accurate PET quantitation than vendor-implemented PET/MRI AC techniques for scans of the pelvis (26). Whereas CNN-based PET/MRI attenuation map generation is an established technique, there are unaddressed limitations in the field: no technique has been compared with gold standard transmission data, all analyses have been confined to a single radiotracer, and—most importantly for neuroimaging research communities—no technique has been validated for kinetic modeling of PET data.
Here, we demonstrate the suitability of generating pseudo-transmission data with a CNN using only T1-weighted MRI. We demonstrate that this method is well suited for static and dynamic PET analysis using data previously collected with 2 radiotracers: 11C-labeled N-[2-[4-(2-methoxyphenyl)-1-piperazinyl]ethyl]-N-(2-pyridinyl)cyclohexanecarboxamide (11C-WAY-100635), which is an antagonist of the serotonin-binding 5-hydroxytryptamine receptor 1A (27), and 11C-labeled 3-amino-4-(2-dimethylaminomethyl-phenylsulfanyl)benzonitrile (11C-DASB), which targets the serotonin 5-hydroxytryptamine transporter (28).
MATERIALS AND METHODS
Subject Population
The institutional review board approved this retrospective study, and the requirement to obtain informed consent was waived.
We queried all anonymized scans in our database for patients who had previously been scanned using both standalone PET and MRI. The PET transmission data were extracted alongside the PET emission and MRI data.
The largest single radiotracer dataset available consisted of 66 individuals scanned using 11C-WAY-100635. These individuals were randomly partitioned into training (n = 44), validation (n = 11), and testing (n = 11) datasets.
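A random partition of this kind can be sketched as follows. This is a minimal illustration only: the function name, the integer subject identifiers, and the fixed seed are assumptions, and the source does not describe how the randomization was actually performed.

```python
import random


def partition_subjects(subject_ids, n_train=44, n_val=11, n_test=11, seed=0):
    """Randomly split subject IDs into disjoint training/validation/testing lists.

    The seed is a hypothetical choice made only so the sketch is reproducible.
    """
    subject_ids = list(subject_ids)
    if n_train + n_val + n_test != len(subject_ids):
        raise ValueError("split sizes must sum to the number of subjects")
    shuffled = subject_ids[:]
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder IDs 0..65 stand in for the 66 anonymized subjects.
train, val, test = partition_subjects(range(66))
```

Partitioning at the subject level, rather than the slice level, keeps all slices from one individual within a single set.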
After validation of our method with 11C-WAY-100635, we similarly assessed performance for 10 subjects scanned with 11C-DASB.
MRI Acquisition
The MRI acquisition consisted of identical pulse sequences on the same scanner for subjects in both the 11C-WAY-100635 and the 11C-DASB datasets. All MRI was performed on a GE Healthcare 1.5-T Signa Advantage. MRI was acquired using a spoiled gradient recalled acquisition. Pulse sequence parameters were as follows: echo time, 2.8 ms; repetition time, 7.0 ms; inversion time, 500 ms; acquisition matrix, 256 × 256 × 170; coronal slices; and voxel size, 1.0 mm isotropic. Spoiled gradient recalled acquisition was the only CNN input used.
PET Acquisition, Reconstruction, and Modeling
PET data for both radiotracers were acquired on a Siemens ECAT HR+. 68Ge transmission data were acquired for 10 min before the injection of a single intravenous bolus of up to 185 MBq. Arterial input functions were obtained during the acquisitions. Arterial blood was drawn periodically throughout scans with both radiotracers, and metabolite correction was performed using previously described methods for both 11C-WAY-100635 (29) and 11C-DASB (30). Static and dynamic reconstructions were performed with synthesized and ground-truth transmission data using filtered backprojection. Dynamic PET data were motion-corrected by rigidly registering each frame onto a reference frame as previously described (31). Static images were formed by averaging the final 10 motion-corrected frames. PET reconstructions were registered onto their accompanying T1-weighted MR images for region-of-interest (ROI) analysis using FLIRT (32). ROIs were determined using a previously described, automated technique (33).
Kinetic modeling was used to derive physiologic parameters from dynamic PET data. For 11C-WAY-100635, we derived 2 binding potential parameters, free binding potential (BPF) and nondisplaceable binding potential (BPND), which are commonly reported estimates of neuroreceptor density. For 11C-DASB, we report the volume of distribution, VT, which is the ratio of concentration in tissue relative to plasma. Kinetic modeling techniques for each tracer are provided in the sections detailing their specific acquisition.
11C-WAY-100635 Acquisition and Modeling
Emission data were collected for 110 min and binned into 20 frames (frame durations, 3 × 20 s, 3 × 1 min, 3 × 2 min, 2 × 5 min, and 9 × 10 min). BPF and BPND were derived using a constrained 2-tissue-compartment model, with cerebellar white matter used as reference tissue as previously described (34,35).
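BPF and BPND are defined thus:
$$BP_F = \frac{V_T - V_{ND}}{f_P}, \qquad BP_{ND} = \frac{V_T - V_{ND}}{V_{ND}},$$
where VT is the volume of distribution in the ROI, VND the nondisplaceable volume of distribution in the reference region, and fP the amount of radiotracer freely available in the plasma.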
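As a worked illustration, the two outcome measures can be computed as below, assuming the conventional definitions BPF = (VT − VND)/fP and BPND = (VT − VND)/VND. The function name and the numeric inputs are hypothetical and are not taken from the study data.

```python
def binding_potentials(vt_roi, vnd_ref, fp):
    """Return (BP_F, BP_ND) from V_T in the ROI, V_ND in the reference
    region, and the free fraction f_P in plasma.

    Assumes the conventional definitions:
        BP_F  = (V_T - V_ND) / f_P
        BP_ND = (V_T - V_ND) / V_ND
    """
    if vnd_ref <= 0 or fp <= 0:
        raise ValueError("V_ND and f_P must be positive")
    specific = vt_roi - vnd_ref  # specific volume of distribution
    return specific / fp, specific / vnd_ref


# Arbitrary illustrative inputs, not measured values.
bp_f, bp_nd = binding_potentials(vt_roi=5.0, vnd_ref=1.0, fp=0.25)
print(bp_f, bp_nd)  # 16.0 4.0
```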
11C-DASB Acquisition and Modeling
Emission data were collected for 120 min and binned into 21 frames (frame durations, 3 × 20 s, 3 × 1 min, 3 × 2 min, 2 × 5 min, and 10 × 10 min). Outcome measures were derived using likelihood estimation in graphical analysis (36), which has been reported to be the most stable method of modeling 11C-DASB (37). Given that no brain region is devoid of specific 11C-DASB binding (28,30), we report VT, which has been shown to be the only reproducible 11C-DASB modeling measure in test–retest studies (37).
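The framing schemes reported for the 2 radiotracers can be checked against the stated frame counts and scan durations with a short sketch. The helper name and the (count, duration) block representation are illustrative assumptions.

```python
def frame_schedule(blocks):
    """Expand (count, duration_s) blocks into per-frame (start_s, end_s) tuples."""
    frames, t = [], 0
    for count, duration in blocks:
        for _ in range(count):
            frames.append((t, t + duration))
            t += duration
    return frames


# Frame durations as reported in the text, in seconds.
WAY_BLOCKS = [(3, 20), (3, 60), (3, 120), (2, 300), (9, 600)]
DASB_BLOCKS = [(3, 20), (3, 60), (3, 120), (2, 300), (10, 600)]

way_frames = frame_schedule(WAY_BLOCKS)
dasb_frames = frame_schedule(DASB_BLOCKS)
print(len(way_frames), way_frames[-1][1] / 60)    # 20 110.0
print(len(dasb_frames), dasb_frames[-1][1] / 60)  # 21 120.0
```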
Preprocessing
MR images were first downsampled to a 2-mm isotropic resolution and normalized using FreeSurfer (38). For the transmission images of the training and validation sets, areas outside the brain, notably the scanner bed, were cropped out of the transmission data because the network would have no ability to synthesize them from the MR images. Afterward, atlas transmission images were rigidly registered to their matched MRI volume using FLIRT.
CNN Design
Patient-specific pseudo-transmission data were generated using a CNN implemented in TensorFlow (39). The network made use of a uNet architecture (40), as shown in Figure 1. Upsampling was performed using transposed convolutions. The network was implemented for 2-dimensional inputs and synthesizes whole-volume data slice by slice. Five consecutive axial slices, centered on the slice to be synthesized, are presented to the network as unique channels. All activation functions were chosen to be rectified linear units.
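The multichannel slice input can be sketched as an index calculation. Note that the treatment of slices near the top and bottom of the volume is not described in the text; the clamping used here is an assumption made for illustration, as is the function name.

```python
def channel_indices(center, n_slices, n_channels=5):
    """Indices of the consecutive axial slices stacked as input channels
    around the slice at `center`.

    Indices that would fall outside the volume are clamped to the nearest
    valid slice. This edge policy is assumed, not taken from the source.
    """
    if not 0 <= center < n_slices:
        raise ValueError("center slice is outside the volume")
    half = n_channels // 2
    return [min(max(center + offset, 0), n_slices - 1)
            for offset in range(-half, half + 1)]


print(channel_indices(10, 100))  # [8, 9, 10, 11, 12]
print(channel_indices(0, 100))   # [0, 0, 0, 1, 2]
```

Presenting neighboring slices as channels gives a 2-dimensional network some through-plane context without the memory cost of fully 3-dimensional convolutions.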
CNN Training
The CNN was trained using 2 Nvidia Tesla K80 graphics processing units on a workstation running Ubuntu, version 14.04. Pairs of MRI and transmission volumes were presented to the network axially on a slice-by-slice basis. The training objective was the minimization of L1 error between synthesized and ground-truth transmission slices. In addition, L2 regularization was incorporated into the cost function to improve generalizability to the external testing sets. The Adam optimizer was used for updating during backpropagation (41). Data were presented to the network one subject at a time, without using batches.
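The form of such a cost function can be sketched in plain Python. This is not the TensorFlow implementation used in the study: the averaging of the L1 term over voxels, the regularization weight `l2_lambda`, and the flat lists standing in for image slices and network weights are all assumptions.

```python
def training_cost(pred, target, weights, l2_lambda=1e-4):
    """Mean absolute (L1) error between predicted and ground-truth voxels,
    plus an L2 penalty on the network weights.

    `l2_lambda` is a hypothetical value; the source does not report one.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal in length")
    l1_error = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
    l2_penalty = l2_lambda * sum(w * w for w in weights)
    return l1_error + l2_penalty


# Toy values: L1 term = (0.5 + 1.0) / 2 = 0.75, L2 term = 0.5 * 2.0**2 = 2.0
cost = training_cost([1.0, 2.0], [1.5, 1.0], [2.0], l2_lambda=0.5)
print(cost)  # 2.75
```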
The validation set was passed through the network after each epoch. Training was halted whenever the total cost across the validation set increased for 5 consecutive epochs. The model converged after 32 h of training. Once trained, whole-volume transmission data can be synthesized in about 1 s.
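The stopping rule can be sketched as follows, under the assumption that an "increase" means a validation cost higher than that of the immediately preceding epoch. Other readings, such as comparison against the best cost so far, are possible, and the function name is illustrative.

```python
def stopping_epoch(val_costs, patience=5):
    """Return the 1-based epoch at which training would halt, or None.

    Training halts once the validation cost has risen relative to the
    previous epoch for `patience` consecutive epochs (assumed reading).
    """
    streak = 0
    for epoch in range(1, len(val_costs)):
        if val_costs[epoch] > val_costs[epoch - 1]:
            streak += 1
        else:
            streak = 0
        if streak >= patience:
            return epoch + 1
    return None


# Made-up validation costs: five consecutive rises end at epoch 8.
print(stopping_epoch([5.0, 4.0, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5]))  # 8
print(stopping_epoch([5.0, 4.0, 4.1, 4.0, 4.1, 4.2]))            # None
```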
CNN Evaluation
After training, we examined the similarity of synthesized transmission data to ground truth, as well as the quantitative accuracy of PET reconstructions making use of the synthesized transmission data.
Synthesized Transmission Data
Previously masked scanner beds were added to the synthesized data before PET reconstruction. Synthesized attenuation maps were compared with scanner transmission data on the basis of mean bias:
$$\text{mean bias} = \frac{100\%}{N} \sum_{i=1}^{N} \frac{\mu_{\text{synth},i} - \mu_{\text{raw},i}}{\mu_{\text{raw},i}},$$
where μsynth represents the attenuation coefficients in voxels of the CNN-derived transmission data, μraw indicates those of the ground-truth transmission data, and the sum runs over the N voxels i within the head. Areas outside the head, which are identical for synthesized and ground-truth data, were not included.
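A mean relative bias of this kind can be sketched as below. The voxelwise form (the mean of (μsynth − μraw)/μraw, expressed in percent) is an assumption, as is the use of μraw > 0 as a stand-in for the exclusion of areas outside the head; flat lists stand in for image volumes.

```python
def mean_relative_bias(mu_synth, mu_raw):
    """Mean voxelwise relative difference, in percent.

    Voxels with mu_raw <= 0 are skipped as a simple proxy for regions
    outside the head (an assumed exclusion rule).
    """
    ratios = [(s - r) / r for s, r in zip(mu_synth, mu_raw) if r > 0]
    if not ratios:
        raise ValueError("no voxels with positive ground-truth attenuation")
    return 100.0 * sum(ratios) / len(ratios)


# Toy attenuation coefficients (cm^-1), not study data.
bias = mean_relative_bias([0.09, 0.2, 0.0], [0.1, 0.2, 0.0])
print(round(bias, 6))  # -5.0
```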
Static PET Analysis
Masks were generated by taking the intersection of FSL-derived brain masks (42) and voxels with at least 20% of maximum PET activity in the ground-truth reconstruction. This was done for consistency with a recent side-by-side comparison of many proposed PET/MRI AC techniques (12). For voxels contained within this mask, mean biases are reported along with SD. We present these data at both the global and the ROI levels. All ROIs were tested for statistically significant errors using the Student t test.
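The masking and bias calculation can be sketched as follows. Flat lists stand in for registered image volumes, the function names are hypothetical, and taking the SD across per-subject mean biases is an assumption about how the reported SDs were formed.

```python
import statistics


def masked_mean_bias(pet_synth, pet_true, brain_mask, threshold=0.20):
    """Mean percent bias over voxels that lie inside the brain mask AND
    reach at least `threshold` of the maximum ground-truth PET activity."""
    cutoff = threshold * max(pet_true)
    if cutoff <= 0:
        raise ValueError("ground-truth image has no positive activity")
    errors = [100.0 * (s - t) / t
              for s, t, m in zip(pet_synth, pet_true, brain_mask)
              if m and t >= cutoff]
    return statistics.mean(errors)


def summarize(subject_biases):
    """Mean and sample SD of per-subject biases (assumed level of the SD)."""
    return statistics.mean(subject_biases), statistics.stdev(subject_biases)


# Toy subject: voxel 3 falls below 20% of maximum, voxel 4 is outside the mask.
bias = masked_mean_bias([9.9, 8.0, 2.0, 5.0], [10.0, 8.0, 1.0, 5.0], [1, 1, 1, 0])
print(round(bias, 6))  # -0.5
```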
Kinetic Modeling Analysis
VT, BPF, and BPND were estimated as described above. We report the mean bias and SD in each ROI. The Student t test was again used to determine whether ROIs exhibited statistically significant errors.
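One plausible form of this test is a one-sample Student t test of the per-subject biases in an ROI against zero; the source does not state whether a one-sample or paired formulation was used, so this choice is an assumption. The sketch returns only the t statistic and degrees of freedom, because the Python standard library provides no t distribution; a P value would be obtained from tables or a statistics package.

```python
import math
import statistics


def one_sample_t(biases, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean bias == mu0."""
    n = len(biases)
    if n < 2:
        raise ValueError("at least two observations are required")
    sd = statistics.stdev(biases)
    if sd == 0:
        raise ValueError("zero variance; t statistic is undefined")
    t = (statistics.mean(biases) - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up percent biases for one ROI across three subjects.
t_stat, dof = one_sample_t([1.0, 2.0, 3.0])
print(round(t_stat, 4), dof)  # 3.4641 2
```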
RESULTS
Synthesized Transmission Data
Synthesized attenuation maps demonstrated a slight negative bias; the mean relative bias between the synthesized and ground-truth maps was seen to be −1.06% ± 0.81%.
Figure 2 shows a representative pseudo-transmission map alongside its accompanying ground-truth data. Slices in the sinus region were chosen because of their complexity relative to superior slices.
Static PET Analysis
Relative to ground truth, 11C-WAY-100635 PET images reconstructed using synthesized attenuation data demonstrated a mean relative bias of −0.49% ± 1.7%. 11C-DASB images demonstrated a mean relative bias of −1.52% ± 0.73%. Figure 3 shows slices of ground-truth and reconstructed PET images alongside percent error maps for each radiotracer.
Static PET analysis was continued by examining relative biases at the ROI level. Figure 4 illustrates subject-specific ROI biases. The Student t test did not suggest statistically significant errors in any ROIs for either radiotracer.
Dynamic PET Analysis
Kinetic modeling results for both radiotracers are shown in Figure 5. For 11C-WAY-100635, BPF was generally more stable than BPND, with mean biases closer to zero as well as lower SDs of the error in most ROIs. Using the Student t test, no statistically significant errors were found for either radiotracer.
DISCUSSION
The presented method improves on the state of the art in 2 important ways. Our primary contribution is the validation of a PET/MRI AC method predicated on gold standard nuclear transmission data, which provides more accurate quantitation than CT data. Second, we have approached this issue using a CNN; this allows us to circumvent the atlas registration requirements of many current pseudo-CT protocols. An interesting consideration motivated by this approach would be the technique’s performance in subjects with nonstandard anatomies relative to currently published pseudo-CT techniques, although no subjects with remarkable anatomic deviation were available for analysis. Future work investigating the utility of CNN-derived attenuation maps would certainly benefit from evaluation against more traditional techniques in the presence of anatomic deviations.
The primary motivation of this work is that transmission data provide a more accurate representation of the subject’s attenuation map than CT data, given the large energy difference between PET and CT photons. In addition, transmission data can be argued to have additional benefits given that the attenuation data are collected using the same detector system as the PET emission data. Unfortunately, in the present state, an analysis of these effects is not possible, given that only retrospective transmission data are available. This inability relates to the primary limitation of the proposed method, namely that a proper validation of the method using MRI and PET emission data collected on a simultaneous PET/MRI scanner, alongside subject-specific transmission data, cannot be provided. Pseudo-transmission attenuation maps applied to PET/MRI data yielded noticeably greater PET values throughout the brain—most notably the cortical areas—in comparison to standard vendor methods, although direct comparison is not possible because of the lack of gold standard attenuation data. An example 18F-FDG reconstruction is provided as supplemental data (supplemental materials are available at http://jnm.snmjournals.org).
A prospective study is warranted in light of the lack of direct validation on a PET/MRI scanner. This study will require the assembly of a PET/MRI T1-weighted MRI and transmission database for several subjects. Collecting such a database will require scans specific to this task; however, multiple solutions exist. Primarily, institutions equipped with a standalone PET scanner can seek volunteers willing to undergo a brief T1-weighted study on a simultaneous PET/MRI system. At the same time, the feasibility of a fixed torus geometry for transmission scanning on the Siemens Biograph mMR has been demonstrated (43). Although a modern database suitable for PET/MRI data is certainly required going forward, this necessity is addressable.
Our method is not based on any assumptions about anatomy, or the nature of the transformation between MRI and transmission data. As such, it is flexible and can be easily adapted to accept additional MRI sequences, such as ultrashort or zero echo time, into the training process. Any MRI contrast that can be reliably resampled to the same space as the input T1-weighted data could be simply added as an input channel. Because data were collected from a retrospective study that used MRI only to delineate PET ROIs, this addition was not possible in the current analysis.
The presented work, a validation of multiple dynamic PET measures in multiple radiotracers, adds to the existing literature in several important ways. To the best of our knowledge, only one study to date has examined the suitability of synthesized AC data (pseudo-CT) for dynamic PET studies (21). The previous work used 18F-MPPF, a 5-hydroxytryptamine receptor 1A binding tracer, for kinetic analysis. Our work adds evidence to the previous investigators’ conclusion that synthesized attenuation data are sufficient for kinetic modeling of PET data. By using multiple radiotracers probing different aspects of neurobiology, we have added significant generalizability to the prior work’s observations. Generalizability is crucial, as PET/MRI AC methods have been shown to exhibit varying performance with different radiotracers in static analysis (12). At the same time, our work generates data with comparable accuracy but relative to a higher gold standard in that the 18F-MPPF analysis used CT AC.
The cerebellum is an extremely important region in dynamic PET neuroimaging, as the cerebellar white matter expresses many neuroreceptors of interest in far lower concentrations than cerebral regions. As such, the cerebellum is a frequently used reference region in the kinetic modeling of a large number of radiotracers, including those used in this work. Given the reasonably accurate assumption of no specific binding occurring in the reference region, one can easily estimate the nondisplaceable volume of distribution, thus providing an estimate of the amount of signal measured in other ROIs that is not related to the tracer binding to its targeted neuroreceptor. In this work, we observed highly accurate cerebellar uptake estimations using our synthesized transmission data. This is crucial to the accuracy of such kinetic studies, and our demonstrated accuracy in BPF and, in particular, BPND modeling would not have been possible without accurate measurements of cerebellar activity. Relatively few studies have placed an emphasis on the effects of various AC paradigms on cerebellar activity, thus severely limiting the confidence with which they can be applied to dynamic PET studies.
Despite the use of a separate tracer and parcellation strategies, we report BPND biases with similar magnitude to those previously described. Further, our results add to those of the previous 18F-MPPF study by demonstrating the suitability of our technique for BPF estimation, which is the PET parameter most closely related to receptor density. Although BPF is the ideal outcome measure, its estimation depends on repeated, invasive blood sampling and subsequent metabolite analysis; BPND is therefore the more commonly reported metric.
Generally speaking, BPF estimations were found to be more stable than BPND in the 11C-WAY-100635 testing dataset. This result was largely expected given the formulae used to derive each expression. BPF estimations are normalized to the amount of free radiotracer in the plasma, whereas BPND estimations are normalized to the volume of distribution in some reference region, in this case the white matter of the cerebellum. As such, errors in reference region modeling will compound errors in BPND quantitation more than they do in BPF.
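This argument can be made explicit with a first-order sensitivity analysis. The derivation below is a sketch that assumes the conventional definitions BPF = (VT − VND)/fP and BPND = (VT − VND)/VND and considers only a small error δVND in the reference-region estimate, with VT and fP held fixed.

```latex
\begin{align*}
\delta BP_F &\approx \frac{\partial BP_F}{\partial V_{ND}}\,\delta V_{ND}
             = -\frac{\delta V_{ND}}{f_P}
 &\Rightarrow\quad
 \frac{\delta BP_F}{BP_F} &\approx -\frac{V_{ND}}{V_T - V_{ND}}\cdot\frac{\delta V_{ND}}{V_{ND}},\\[4pt]
\delta BP_{ND} &\approx \frac{\partial BP_{ND}}{\partial V_{ND}}\,\delta V_{ND}
             = -\frac{V_T}{V_{ND}^{2}}\,\delta V_{ND}
 &\Rightarrow\quad
 \frac{\delta BP_{ND}}{BP_{ND}} &\approx -\frac{V_T}{V_T - V_{ND}}\cdot\frac{\delta V_{ND}}{V_{ND}}.
\end{align*}
```

Under these assumptions, the relative error in BPND exceeds that in BPF by a factor of VT/VND, which is greater than 1 wherever specific binding is present. This is consistent with the greater stability observed for BPF.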
To contextualize our reported biases in PET quantitation, 11C-WAY-100635 BPF quantitation is associated with an average test–retest variability of 9% (33). In no ROI does the mean bias ± SD range reach this inherent variability. Moreover, VT estimation in 11C-DASB data has been shown to exhibit an inherent test–retest variability of 5.5% (37). Static reconstruction biases are well within the test–retest reproducibility of PET imaging in both cases.
CONCLUSION
Synthesizing pseudo-transmission data using CNNs is a promising technique for AC in psychiatric PET/MRI. CNN-based attenuation demonstrates accuracy comparable to that of the best-performing current techniques while possibly obviating some of their nonidealities, such as registration of prospective data onto a predefined atlas. Moreover, the technique presented here is suitable for static and dynamic PET imaging.
DISCLOSURE
This work is partially supported by a NARSAD young investigator award (Chuan Huang) and by the Stony Brook Bridge Fund Program (Chuan Huang). Yi Gao received support from the National Natural Science Foundation of China (61601302) and the Shenzhen Peacock Plan (KQTD2016053112051497). No other potential conflict of interest relevant to this article was reported.
Footnotes
Published online Aug. 30, 2018.
Received for publication May 10, 2018.
Accepted for publication August 23, 2018.
© 2019 by the Society of Nuclear Medicine and Molecular Imaging.