Abstract
Many authors have reported the importance of motion correction (MC) for PET. Patient motion during scanning disturbs kinetic analysis and degrades resolution. In addition, using misaligned transmission for attenuation and scatter correction may produce regional quantification bias in the reconstructed emission images. The purpose of this work was the development of quality control (QC) methods for MC procedures based on external motion tracking (EMT) for human scanning using an optical motion tracking system.
Methods: Two scans with minor motion and 5 with major motion (as reported by the optical motion tracking system) were selected from 18F-FDG scans acquired on a PET scanner. The motion was measured as the maximum displacement of the markers attached to the subject's head and was considered to be major if larger than 4 mm and minor if less than 2 mm. After allowing a 40- to 60-min uptake time after tracer injection, we acquired a 6-min transmission scan, followed by a 40-min emission list-mode scan. Each emission list-mode dataset was divided into 8 frames of 5 min. The reconstructed time-framed images were aligned to a selected reference frame using either EMT or the AIR (automated image registration) software. The following 3 QC methods were used to evaluate the EMT and AIR MC: a method using the ratio between 2 regions of interest with gray matter voxels (GM) and white matter voxels (WM), called GM/WM; mutual information; and cross correlation.
Results: The results of the 3 QC methods were in agreement with one another and with a visual subjective inspection of the image data. Before MC, the QC method measures varied significantly in scans with major motion and displayed limited variations on scans with minor motion. The variation was significantly reduced and measures improved after MC with AIR, whereas EMT MC performed less well.
Conclusion: The 3 presented QC methods produced similar results and are useful for evaluating tracer-independent external-tracking motion-correction methods for human brain scans.
The importance of motion correction (MC) for PET brain scanning has been reported widely in the literature (1–6). The most cited effects of patient motion are the misalignment of frame images affecting kinetic analysis of dynamic protocols and resolution degradation by motion blurring within frames. A more serious effect reported recently is scatter estimation error due to transmission and emission misalignment, resulting in a regional quantification bias (7). Misaligned frame images can be coregistered using software, whereas regional bias and resolution degradation cannot be recovered, thus raising the need for a reliable MC method for brain PET capable of registering all emission frames to the transmission image and potentially doing intraframe MCs as well.
MC using the Polaris Vicra (Northern Digital Inc.) optical motion tracking has been shown to be accurate on phantoms (3,6,8), to which the markers can be rigidly fixed. However, fixation of the markers on human heads is still a problem (3,9). The challenge is to find a fixation method acceptable for the subject's comfort and safety while minimizing the motion of the tool relative to the head. Multiple fixation methods have been used (e.g., neoprene cap (3,5,9), goggles (4), headband (10), and adhesive bandage (11)). No comparative study has been conducted to determine the most reliable fixation method. New motion-tracking methods using stereo cameras are being developed (12) to avoid the fixation problem. Alternatively, software realignment methods have been proposed (13–16), but their accuracy may be limited by activity distribution changes and noise in low-statistics images as they rely on the image data, which are tracer-dependent. Because external tracking methods are tracer-independent, they are of the most interest. However, to evaluate any motion compensation method in actual use on human subjects for whom exact motion is not known, one or more MC quality control (QC) methods are required.
Evaluation of MC methods in the literature is mostly done on either simulated motion data or phantom scans with known motion. In the few cases in which actual human scans are evaluated, cases with large motion can be visually inspected and improvements from MC are apparent in either the images or the time–activity curves generated from the images. One of the few exceptions is a recent publication (9) in which a large dataset of motion-corrected human scans was subjected to MC QC using a combination of an objective method and human observers.
In this article, we present 3 objective QC methods to evaluate motion tracking and correction using 18F-FDG human scans on a high-resolution PET scanner (the Siemens ECAT HRRT (17)) to evaluate the performance of external motion tracking (EMT) with a Polaris system in a clinical setting. The 3 QC methods can also be used to evaluate the Polaris system with different fixation methods or evaluate other tracer-independent EMT systems such as a new stereo camera–based system (12).
MATERIALS AND METHODS
Patients
Over a period of 1 y, seventeen 18F-FDG scans with Polaris tracking data recorded during the imaging examination were acquired on the HRRT PET scanner installed at Copenhagen University Hospital, Rigshospitalet. Three of the 17 scans were rescans. The study included 14 hepatitis C patients (mean age ± SD, 48.4 ± 6.7 y; age range, 38–63 y; 9 men [1 rescanned] and 5 women) participating in a drug combination-therapy study. The trial was approved by the Ethics Committee of Copenhagen, Denmark, and was in accordance with the Helsinki II declaration. Seven of the 17 datasets were selected for use in this MC QC study.
PET and Motion Tracking
The protocol comprised 6-min hot transmission scanning, followed directly by 40-min emission scanning that was started 40–60 min after tracer injection. Both the transmission and the emission scans were acquired in list-mode format. The patient tool for the Polaris system was fixed to the subject using a standard adhesive bandage with hook-and-loop tape from ApodanNordic (11) as shown in Figure 1. The patient tool displacement from its initial position during the scanning was plotted as a function of time (Fig. 2 [top]), with the horizontal line indicating how the maximum motion magnitude during a scan was determined. The bottom plot in Figure 2 shows the displacement of a point relative to the reference frame, as estimated by the automated image registration (AIR) software (13) used in our study.
FIGURE 1. Fixation of patient tool.
FIGURE 2. Motion data for subject 7. (Top) Polaris patient tool displacement relative to starting time of track with 1-s filtering to remove outliers. Turquoise horizontal line indicates motion magnitude classification criteria. (Bottom) AIR motion plot of point 100 mm above center of field of view (near face of subject). x-axes = time in seconds, where t = 0 is start of emission scanning; y-axes = displacement in mm; D = translational Euclidean distance (AIR only), and TX, TY, and TZ its subcomponents.
The motion magnitude classification is simple but mainly serves to select 2 extreme subsets of data for further use in this study: the ones with the largest and smallest motions. Four datasets had a maximum motion of 1.5–2 mm, 3 had approximately 3 mm of maximum motion, 6 had approximately 4 mm of maximum motion, and 4 had 5–10 mm of maximum motion. On the basis of these magnitudes, we selected 7 datasets for further investigation, 2 with minor motion (<2 mm) and 5 with major motion (≥4 mm).
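The classification rule can be illustrated with a minimal sketch. It assumes the tracking data are available as a list of (x, y, z) marker positions in millimetres, already filtered for outliers; the function names and the "intermediate" label for scans between the two thresholds are hypothetical and only serve the illustration.

```python
import math

def max_displacement(positions_mm):
    """Largest Euclidean distance (mm) of the tool from its initial position."""
    start = positions_mm[0]
    return max(math.dist(start, p) for p in positions_mm)

def classify_motion(max_displacement_mm, minor_below=2.0, major_from=4.0):
    """Label a scan from its maximum displacement, using the thresholds in the text."""
    if max_displacement_mm < minor_below:
        return "minor"
    if max_displacement_mm >= major_from:
        return "major"
    return "intermediate"  # hypothetical label; such scans were not selected

# Hypothetical track: the tool ends 5 mm from where it started.
track = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
label = classify_motion(max_displacement(track))
```

Only the two extreme groups matter for the study, so the exact handling of scans between 2 and 4 mm does not affect the selection.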
Image Reconstruction and MC
The emission data were divided into 8 frames of 5-min duration. An automatic method for thresholding of the tracking data (18) was used on the Polaris data to find the frame of reference and identify subframes for frames with high intraframe motion. The same reference frame was used by all MC methods to enable direct comparison.
The transmission list-mode data were histogrammed with emission contamination correction (19), and the transmission image (μ-map) was reconstructed from the blank and transmission sinograms using the HRRT Users Software TXTV method (20). The emission list-mode data were histogrammed into sinograms using the framing information (including subframes for EMT MC), and the sinograms were reconstructed using fast ordinary Poisson 3-dimensional ordered-subset expectation maximization (21) and resolution modeling (22,23) (16 subsets, 10 iterations) with attenuation and scatter correction. The image matrix size was 256 × 256 × 207, with a voxel size of 1.22 × 1.22 × 1.22 mm. We denote these images as no MC.
Four different MC strategies were applied with the reconstruction, aligning each frame to the reference frame of the scanning. An overview flow diagram of our data processing from scanning to MC evaluation is given in Figure 3.
FIGURE 3. Overview flow diagram of our data processing from scanning to MC evaluation.
The first of the 4 MC methods is EMT. An automatic thresholding of motion data for intraframe MC (18) was used to divide the frames into subframes when the detected motion was larger than 1 or 2 mm, as determined from the Polaris tracking data. The method also determined the frame with the minimum intraframe motion, which was used as a reference for all MC methods. The EMT MC method provides a filtered mean motion for each subframe and the transmission frame. Further details are given in the supplemental data (supplemental materials are available online only at http://jnm.snmjournals.org).
Before the emission reconstruction, the transmission image was aligned to each frame or subframe using the tracking data. The reconstructed images were aligned to the frame of reference using the tracking data, and the subframes were summed into full frames as in the work of Picard and Thompson (1) and Fulton et al. (2).
In the second MC method, postreconstruction MC (AIR), the no-MC images were filtered using a gaussian filter of 6 mm in full width at half maximum. This step improves the accuracy of aligning each frame to the reference frame using the AIR software (13). The AIR alignment was performed using default parameters, except that the threshold was set to 36% of the maximum voxel value. The alignment transformers were used on the unfiltered images to create the final motion-corrected images.
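Filter widths are quoted here as full width at half maximum (FWHM), whereas most image-processing libraries expect a standard deviation in voxels. The conversion is the standard gaussian relation FWHM = 2·sqrt(2·ln 2)·σ; the sketch below applies it to the 6-mm filter and the 1.22-mm voxel size given in the text. The function name is an assumption for illustration.

```python
import math

def fwhm_to_sigma_voxels(fwhm_mm, voxel_mm):
    """Convert a gaussian FWHM in mm to a standard deviation in voxel units."""
    sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return sigma_mm / voxel_mm

# 6-mm FWHM on 1.22-mm isotropic voxels gives a sigma of roughly 2.09 voxels.
sigma_vox = fwhm_to_sigma_voxels(6.0, 1.22)
```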
In the third method, aligned transmission MC (AIR aligned transmission [ATX]), we use aligned transmission images for the reconstruction of each emission frame (14–16). The alignment of the transmission image to the reference frame in each dataset was checked visually by an experienced operator, and no manual adjustments were needed. Then, a transmission image for each of the emission frames was computed from the reference frame transmission image using the inverse transformer of the frame alignment. A second reconstruction was then performed using an aligned transmission image for each frame. The reconstructed images were filtered using a gaussian filter of 6 mm in full width at half maximum and the alignments to the reference frame found using AIR. The alignment transformers were then used to create the final motion-corrected and unfiltered images.
The fourth and final MC method is non–attenuation-corrected MC (AIR NAC), in which a preliminary reconstruction of NAC and non–motion-corrected images was performed, and the 6-mm filtered images were aligned with AIR to obtain transformers (14–16). The alignment of the transmission image to reference frame was verified visually as was done for AIR ATX. The aligned transmission image for each frame was computed from the reference transmission image using the inverse NAC transformer of the frame and used for a final reconstruction. The transformers from the NAC images were applied to the reconstructed images to create the final motion-corrected images.
Gray Matter (GM)/White Matter (WM) MC QC Method
In the scans with minor motion, the sum image of the no-MC frames exhibited a good separation of the 3-mm-wide GM from the neighboring WM (Fig. 4). The GM/WM method defines a region of interest (ROI) with GM containing voxels with values above 50% of the voxel maximum in the whole image and an ROI containing neighboring WM voxels within a maximum distance of 2 voxels from any GM voxel (|dx| ≤ 2, |dy| ≤ 2, |dz| ≤ 2), where d is the distance in image space (x,y,z). This WM region is different from what would be segmented as WM from an MR image but is sufficient for the purpose of this study. A segmented image was created with values of, respectively, 2 and 1 for the GM and WM ROIs (Fig. 4). In the scans with major motion, a summed image of the AIR MC frames was used to create the segmented image. Our proposed method uses the segmented image to compute the average voxel values in the GM and WM regions in each frame. The GM/WM ratio can then be used as an evaluation measure for the MC methods. The hypothesis is that motion during scanning results in a mixing of the 2 ROIs that should lower the ratio in the no-MC images and in any images in which MC does not work as anticipated.
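The segmentation and ratio described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the code used in the study: the function names are hypothetical, the WM ROI is taken to exclude voxels already labeled GM, and the neighborhood is built by OR-ing shifted copies of the GM mask over a 5 × 5 × 5 cube.

```python
import numpy as np

def segment_gm_wm(sum_image, gm_fraction=0.5, reach=2):
    """Return a label image: 2 = GM, 1 = neighboring WM, 0 = everything else."""
    gm = sum_image > gm_fraction * sum_image.max()
    # Dilate the GM mask so that |dx|, |dy|, |dz| <= reach around any GM voxel.
    padded = np.pad(gm, reach, mode="constant", constant_values=False)
    near_gm = np.zeros_like(gm)
    nx, ny, nz = gm.shape
    for dx in range(2 * reach + 1):
        for dy in range(2 * reach + 1):
            for dz in range(2 * reach + 1):
                near_gm |= padded[dx:dx + nx, dy:dy + ny, dz:dz + nz]
    labels = np.zeros(gm.shape, dtype=np.uint8)
    labels[near_gm & ~gm] = 1  # assumption: WM excludes GM voxels
    labels[gm] = 2
    return labels

def gm_wm_ratio(frame, labels):
    """Mean GM voxel value divided by mean WM voxel value for one frame."""
    return frame[labels == 2].mean() / frame[labels == 1].mean()
```

The label image is computed once per scan from the summed image and then reused for every frame, so that differences in the ratio between frames reflect the frame data rather than a changing segmentation.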
FIGURE 4. 18F-FDG image for subject 1 with minor motion (sum of eight 5-min frames) at high resolution of HRRT (A). (B) Corresponding GM/WM segmented image. (C) Masking used for mutual information and XC.
Mutual-Information MC QC Method
Mutual information is a well-known measure for image registration (24,25). Instead of registering frames, we computed the mutual information between each frame and its reference using the Kullback–Leibler analog mutual information measure given in the study by Pluim et al. (25). The 1- and 2-dimensional mutual-information histograms are made on gaussian-filtered versions of the images (6 mm in full width at half maximum), and the 5 lowest of the 256 histogram bins were cut off. Removing several bins is simply a masking of the image background in the histogram domain. The mask as it looks in the image domain is shown in Figure 4. Changing the number of bins substantially from 256, omitting the mask, or masking too aggressively gives an ambiguous or inconclusive mutual-information evaluation. In that sense, mutual information is not highly parameter-sensitive, but some caution must still be exercised. We have not tested filters larger than 6 mm but noted that a 3-mm filter or no filtering degraded the performance of mutual information.
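A minimal NumPy sketch of this measure is given below. It is not the Matlab code from the supplemental data; it assumes that both inputs have already been gaussian filtered, that the histogram range of each image runs from its minimum to its maximum, and that "cutting off" the lowest bins means discarding the corresponding rows and columns of the joint histogram before normalizing. The function name and parameters are hypothetical.

```python
import numpy as np

def mutual_information(frame, reference, bins=256, drop_lowest=5):
    """Kullback-Leibler form: sum p(a,b) * log(p(a,b) / (p(a) * p(b)))."""
    joint, _, _ = np.histogram2d(frame.ravel(), reference.ravel(), bins=bins)
    # Mask the background in the histogram domain by dropping the lowest bins.
    joint = joint[drop_lowest:, drop_lowest:]
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)  # marginal of the frame
    p_b = p_ab.sum(axis=0, keepdims=True)  # marginal of the reference
    nonzero = p_ab > 0                     # empty cells contribute nothing
    return float(np.sum(p_ab[nonzero] * np.log(p_ab[nonzero] / (p_a @ p_b)[nonzero])))
```

Comparing an image with itself gives the entropy of its masked histogram, which is the largest value the measure can take for that image; this is consistent with a peak at the reference frame when the measure is plotted per frame.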
Normalized Cross Correlation (XC) MC QC Method
Correlation or XC is a classic image analysis (and signal processing) tool for measuring image similarities and recognizing objects (e.g., characters) in images (26). This method is also used in medical image registration (15). We computed the normalized XC (26) between each image and the reference image after filtering the images with a gaussian filter (6 mm in full width at half maximum). The images were masked in the image domain using a threshold of 5.5/256 = 1.96% of the maximum voxel value, matching the mask size used for mutual information. XC and mutual information are about equally sensitive to changes in filtering and masking. Matlab code (The MathWorks) for mutual information and XC is given in the supplemental data.
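The XC measure can be sketched as follows. Several definitions of normalized cross correlation are in use; this sketch assumes the zero-mean form evaluated at zero displacement, and it assumes that the mask keeps voxels exceeding the threshold in both images. The function name is hypothetical, the default threshold is the 1.96% quoted in the text, and the inputs are expected to be gaussian filtered beforehand.

```python
import numpy as np

def normalized_xc(frame, reference, threshold_fraction=0.0196):
    """Zero-mean normalized cross correlation over the masked voxels."""
    mask = (frame > threshold_fraction * frame.max()) & (
        reference > threshold_fraction * reference.max()
    )
    a = frame[mask] - frame[mask].mean()
    b = reference[mask] - reference[mask].mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))
```

With this definition the result lies between -1 and 1, equals 1 for self-comparison, and is unchanged by a positive linear rescaling of either image.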
Motion Simulation
Three common types of patient motion observed during scans (axial translation and rotations about the x-axis or z-axis) were applied to frame 5 of the scanning of subject 1, who had minor motion, to perform a basic validation and assess the sensitivity and linearity of the 3 methods.
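The three simulated motions are rigid transformations, which can be written as 4 × 4 homogeneous matrices. The sketch below builds such a matrix and reports how far a single point moves, in the spirit of the single-point motion plots. The axis conventions, the rotation order, the choice of the y-axis as "above", and the function names are all assumptions made for illustration.

```python
import numpy as np

def rigid_transform(rx_deg=0.0, rz_deg=0.0, tz_mm=0.0):
    """Homogeneous matrix: rotation about x, then about z, then axial translation."""
    rx, rz = np.deg2rad([rx_deg, rz_deg])
    rot_x = np.array([[1.0, 0.0, 0.0],
                      [0.0, np.cos(rx), -np.sin(rx)],
                      [0.0, np.sin(rx), np.cos(rx)]])
    rot_z = np.array([[np.cos(rz), -np.sin(rz), 0.0],
                      [np.sin(rz), np.cos(rz), 0.0],
                      [0.0, 0.0, 1.0]])
    matrix = np.eye(4)
    matrix[:3, :3] = rot_z @ rot_x
    matrix[2, 3] = tz_mm
    return matrix

def point_displacement(matrix, point_mm):
    """Euclidean distance (mm) between a point and its transformed position."""
    point = np.asarray(point_mm, dtype=float)
    moved = matrix @ np.append(point, 1.0)
    return float(np.linalg.norm(moved[:3] - point))

# A 1-degree rotation about x moves a point 100 mm from the center by about 1.75 mm.
d = point_displacement(rigid_transform(rx_deg=1.0), [0.0, 100.0, 0.0])
```

The inverse of such a matrix (for example, from `np.linalg.inv`) undoes the motion, which is the role played by the inverse transformer when a reference transmission image is mapped to an individual frame.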
Interpolation Effect
The default interpolation method to create aligned images in AIR (and other tools) is trilinear and is the one we applied. Trilinear interpolation averages over voxels, possibly lowering the 3 MC QC measures. We applied trilinear interpolation with a double transformation: the interpolation size we wished to test and its inverse. This method enabled us to evaluate the trilinear interpolation effect alone (a single transformation corresponds to testing simulated motion).
AIR has a simple nearest-neighbor (NN) interpolation option, which we tested and compared with trilinear interpolation.
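The double-transformation test for trilinear interpolation can be illustrated with a simplified one-axis version. For a pure sub-voxel translation along one axis, trilinear interpolation reduces to linear interpolation along that axis, which is what this sketch implements. It is not AIR's resampling code: the function names are hypothetical, and periodic boundaries (via `np.roll`) are assumed for brevity.

```python
import numpy as np

def shift_linear(volume, fraction, axis=0):
    """Shift by a sub-voxel fraction (|fraction| <= 1) using linear interpolation."""
    step = 1 if fraction >= 0 else -1
    weight = abs(fraction)
    return (1.0 - weight) * volume + weight * np.roll(volume, step, axis=axis)

def round_trip(volume, fraction, axis=0):
    """Apply a shift and its inverse; any change is due to interpolation alone."""
    return shift_linear(shift_linear(volume, fraction, axis), -fraction, axis)
```

Expanding the two steps shows that the round trip weights each voxel by (1 − f)² + f² and each of its two neighbors by f(1 − f). The smoothing is therefore symmetric in f and 1 − f and largest at f = 0.5, which is the behavior expected for shifts of 0.1 mm and (voxel size) − 0.1 mm and for a maximum effect at half a voxel.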
RESULTS
The results we present in this section show a high correspondence between our 3 QC methods and a significant discrepancy between AIR and EMT MCs.
Simulation
The simulation results in Tables 1 and 2 show that all 3 QC methods are sensitive to small motion and decrease monotonically with the magnitude of motion as we expected. To compare the 3 methods directly, we made each of them relative and normalized to their 0-motion value, mn* = (m0 − mn)/m0, where m is GM/WM, mutual information, or XC at simulated motion n. The results are given in Figure 5 and show that GM/WM and mutual information have high sensitivity, whereas the less-sensitive XC is the only measure that can be considered linear (as confirmed by a statistical test for linearity, with results given in Tables 1 and 2).
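The normalization is a relative drop from the zero-motion value, so all 3 measures start at 0 and grow with motion. A minimal sketch follows; the function name and the numeric values are hypothetical and are not taken from Tables 1 and 2.

```python
def relative_drop(m0, mn):
    """Return mn* = (m0 - mn) / m0 for a measure m at zero motion and at motion n."""
    if m0 == 0:
        raise ValueError("the zero-motion value m0 must be nonzero")
    return (m0 - mn) / m0

# Hypothetical measure values at increasing simulated motion.
m0 = 2.0
curve = [relative_drop(m0, mn) for mn in (2.0, 1.8, 1.5)]
```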
TABLE 1. Results of Simulated Translational Motion on Frame 5 for Subject 1
TABLE 2. Results of Simulated Rotational Motion on Frame 5 for Subject 1
FIGURE 5. Results of simulated translational (left) and rotational (right) motion normalized and relative to no motion. RX = rotation around x-axis; RZ = rotation around z-axis; TZ = translational.
Interpolation
Interpolation results (on frame 4 of the dataset for subject 1) are given in Table 3, where any difference from 0 motion (top row) indicates an interpolation effect. As expected, we see monotonically increasing effects of interpolation up to the maximum effect at a size of half a voxel and identical results from performing interpolations of 0.1 mm and (voxel size) – 0.1 mm. Because it was necessary to do a double interpolation (out and back), the results in Table 3 show 100%–200% of the true interpolation effect.
TABLE 3. Results of Back-and-Forth Interpolation in 3 Dimensions
We see that XC is robust to trilinear interpolation, which is important because it is the least sensitive to motion. The discrete natures of mutual information (its 251 bins) and GM/WM (its 3-level segmented image) make them much more sensitive to interpolation than XC.
Scans with Minor Motion
The QC method results for the 2 scans with minor motion are shown in Figure 6 (AIR NAC and EMT MC methods were left out), and as expected, the curves are almost flat because there is no need for MC. For mutual information and XC, we see a minimal improvement in the measures from performing MC, whereas the GM/WM actually drops significantly after MC (y-axis is at the same scale as major motion result plots shown later). This difference is due to the interpolation effect and the nature of the methods. For subject 1, we have added AIR NN (AIR with NN interpolation), which then gives results similar to no MC. For mutual information and XC, the difference between no MC and MC is small, and so is the effect of switching to NN. For XC, this small effect is due to its low interpolation sensitivity. For mutual information, the interpolation effect is small, as compared with the larger difference between frames and the reference frame. The peak for the reference frame is due to self-comparison and illustrates the magnitude of this difference. A peak is also seen for XC, but at a smaller scale.
FIGURE 6. GM/WM ratio, mutual information, and XC for 2 scans with minor motion: subject 1 (top) and subject 2 (bottom). For subject 1, AIR with NN interpolation is also shown. x-axes show frame numbers.
Because GM/WM has no comparison to the reference frame, the discrete nature of the segmented image makes the interpolation effect significant, telling us that we need rather large differences in GM/WM to verify a difference between 2 measures. Thus, we will be conservative and only conclude a difference between 2 MC (and no MC) methods if the GM/WM difference is larger than 0.05 (approximately the AIR–to–AIR NN difference for subject 1).
Scans with Major Motion
The results for 3 of the 5 subjects with major motion are plotted in Figure 7. We see a good correspondence between the 3 MC QC measures in 38 of the 5 × 8 frames (the 2 outliers are discussed later).
FIGURE 7. GM/WM ratio, mutual information, and XC for 3 of 5 subject scans with major motion: subject 3 (top), subject 4 (middle), and subject 5 (bottom). x-axes are frame numbers.
We also see a significant discrepancy between AIR and EMT MC performances: in the 23 frames in which AIR improves over no MC, EMT does not perform as well as AIR in any of them, and EMT is better than no MC in only 4 of the frames (subject 3, frame 8, and subject 4, frames 6–8). AIR never does worse than no MC. An overview of the results on the 23 frames and their motion relative to the reference is given in Table 4. These motion magnitudes are for just 1 point, and the complete motion of the brain is much more complex. In the last column of Table 4, we see that on 20 of the 23 frames in which AIR gives better measures than no MC, and EMT performs less well, we have more than 2 mm of motion. For the 17 frames not listed in Table 4, we see no effect of MC, and we see motion of some magnitude in only 3 of them (subject 3, frame 5, and subject 7, frames 5–6, in which either AIR or Polaris gives approximately 2 mm of motion).
TABLE 4. Results for the 5 Subjects with Major Motion Summarized
The poorer performance of EMT is due to the independent motion of the patient fixation method, which is most likely caused by the fixation loosening from the subject's head. This effect is confirmed by visual inspection of the corrected images (viewed as sequences), in which the motion patterns are different for EMT MC than for no MC (initial motion). AIR MC shows virtually no residual motion. This subjective visual inspection thus confirms the results found with our 3 objective QC measures.
For subject 3, frames 1 and 7, GM/WM deviates from the 2 other QC methods. In subject 3, frame 7, the XC result for EMT is also a bit lower than the mutual information result. The 3 different AIR MC methods perform similarly for all subjects, with the exception of subject 3. For this subject, AIR and AIR ATX deviate from each other in frames 1 and 7 (again), and AIR NAC is different in 6 of the 8 frames. Subject 3 is the subject with most (intraframe) motion, which may cause problems in the AIR alignment and the segmentation step in GM/WM QC.
DISCUSSION
We have proposed 3 measures for QC of MC methods on high-resolution 18F-FDG brain scans and tested them on the HRRT scanner with EMT and AIR MC. On our test data, we saw a high correspondence among the results of the 3 methods, and they all showed an effect of MC, mainly on frames with motion larger than 2 mm.
From our simulation studies, we saw a high sensitivity at small motions for both GM/WM and mutual information, but the interpolation problems of GM/WM and the high frame–to–reference frame difference for mutual information are most likely why we see only a limited effect of MC at small motions. Because we prefiltered the images with a 6-mm gaussian filter, we see no positive effect of interpolation and remove most of the noise problem; therefore, at least XC should be able to detect (positive) effects of MC even at small motions.
The measure GM/WM has problems with the standard trilinear interpolation because of the discrete nature of its segmented image. Still, the GM/WM method can be used to evaluate any external MC method (tracer- or image-independent) on high-resolution 18F-FDG scans, and the MC method can then be used on any other tracer or scanner. The XC and mutual information measures should be applicable for MC QC directly on other scanners and tracers. XC showed larger relative differences between MC and no MC than did mutual information, but on other tracers mutual information might show advantages because both mutual information and XC measure any differences between images, including noise and tracer dynamics. All 3 measures could be used as motion QC tools to assess whether MC is needed: if the no-MC curve is flat, no MC is needed.
We have chosen to test using 18F-FDG images to be able to include the GM/WM method but mainly to have high-statistic images with as few confounding effects (e.g., tracer dynamics and image noise) as possible when testing the QC measures. Our choice favors the image-based AIR MC, whereas external tracking systems are tracer-independent. Thus, it is expected that AIR MC will perform better than EMT MC, especially on smaller motions for which typical EMT errors are more pronounced.
Using the QC methods, we found that EMT performs poorly with the fairly small subject motions that we saw in our data, but performance was also worse than basic errors can explain. We have tried to improve the selection of just 1 transformer to represent the motion during a (5-min) time frame (Fig. 2 of Olesen et al. (18)) because this is a major source of problems with EMT methods. The fixation method is, to our knowledge, the main source of error. The adhesive bandage method was selected for its simplicity and safety in a clinical setting, and no comparative study has been conducted to show if any other fixation method works better.
AIR MC is better than the no-MC method in all 3 measures, giving flatter curves, and the improvements can be confirmed visually from the images. AIR does, however, have limitations because tracer dynamics and noise might cause problems, and AIR is unable to perform transmission-to-emission image registration. Also, the options for subframing in case of larger intraframe motion with AIR are limited. Thus, it seems that an improved external tracking system (12) may be the optimal MC method.
We have also validated that MC methods based on the Polaris tracking system are accurate on phantom scans (8), and such methods have been proposed or implemented for human scans using various fixations of the tracked markers on the subject's head (2,3,5,6). However, the validity of these methods relies on the recorded-motion data accuracy, which is difficult to assess on human scans and has not been demonstrated. Thus, there is a need for MC QC methods. In the literature (4–6), Polaris-based EMT methods outperform image-based MC such as AIR, but the better performance is on tracers for which such methods are expected to be less accurate, and the better performance is at higher motion magnitudes, for which the EMT errors are relatively less important.
The only alternative QC we have found also involves AIR and EMT MC but on 11C-raclopride scans (9). This alternative is not a purely objective QC method but rather a selective method that uses MC when the transformers of the 2 methods are close. In borderline cases, a human operator evaluates whether MC is an improvement over no MC.
On the basis of the results summarized in Table 4, we suggest using AIR ATX MC at motions larger than 2 mm. For the question “When does MC need to be applied?” the answer depends on scanner resolution, the tracer used, and the MC method. In answering questions about performance of line-of-response MC, fixation methods, or stereo camera–based EMT, our QC methods could be used as evaluation tools because they are simple and usable.
CONCLUSION
We have proposed 3 objective MC QC methods that show corresponding results on human 18F-FDG brain scans on the HRRT scanner, and the results were confirmed by visual inspection. As expected, we saw a clear positive effect of AIR MC due to the favorable conditions of 18F-FDG data, whereas EMT MC had a negative effect on some of the frames, most likely because of the problems with the patient fixation method and with selecting a representative transformer for a 5-min frame.
Any external-tracking MC method can be evaluated with 18F-FDG images on a high-resolution scanner and subsequently used on other scanners and tracers. Two of our MC QC methods, mutual information and XC, can most likely be used “as is” for evaluating external MC methods on other tracers and scanners. In the future, we plan to test these methods on 11C-tracers used in neuroimaging studies. The most widely used EMT system, the Polaris, has been validated to be accurate on phantom scans, but full evaluation on human scans has not yet been performed. A comparative study of Polaris patient marker fixation along with human validation of markerless external tracking systems would be highly desirable. Thus, there is a need for MC QC methods such as the three that were proposed in this work.
DISCLOSURE STATEMENT
The costs of publication of this article were defrayed in part by the payment of page charges. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Acknowledgments
The HRRT PET scanner was donated by the John and Birthe Meyer Foundation. No other potential conflict of interest relevant to this article was reported.
Footnotes
Published online Feb. 13, 2012.
- © 2012 by the Society of Nuclear Medicine, Inc.
- Received for publication July 1, 2011.
- Accepted for publication November 9, 2011.