Abstract
In this work, a method for registration of whole-body (WB) scintillation-camera images is presented. The primary motive for the development is to perform activity quantification using the conjugate view method on an image basis. Accurate image registration is required for sequential anterior and posterior scans, for serial emission images for analysis of the biokinetics, and for transmission and emission images for a pixel-based attenuation correction. Methods: Registration is performed by maximization of the mutual information. The spatial transformation has been tailored for the registration of WB images and is composed of global and local transformations, including rigid, projective, and curved transformations. A coarse registration is first performed using cross-correlation and direct pixel scaling. Optimization is then performed in a sequence, beginning with the 2 legs independently, followed by the upper body and head. Evaluation is performed for clinical images of an 131I-labeled monoclonal antibody and for Monte Carlo–simulated images. An anthropomorphic WB computer phantom, which has been especially modified to match the patient position during WB scanning, is used for the simulations. Results: For simulated images, registration errors are within 1 pixel (<3.6 mm) for a sufficient image count level. Separate evaluation of the influence of noise shows that the errors increase below a total image count of approximately 105 (signal-to-noise ratio, approximately 4). For clinical evaluations, the deviations between point markers are 9 ± 5 mm. Conclusion: An automatic registration method for WB images has been developed, which is applicable to emission–emission and transmission–emission registration. This method has been applied in more than 50 clinical studies and has shown to be robust and reliable.
Whole-body (WB) scintillation-camera imaging is frequently used to measure the uptake and retention of activity in organs and tissues. Activity measurements are required for absorbed dose assessments, for radiation protection in diagnostic nuclear medicine, and for dose planning in radionuclide therapy (1–4).
WB activity quantification can be performed using the conjugate view method (2,5,6). These calculations are usually based on the count level in separate regions of interest (ROIs) from anterior and posterior images, with attenuation correction by a factor corresponding to the mean value for the ROI. The conjugate view method can also be applied directly to the WB images. A geometric mean image is then calculated from the anteroposterior pair, and the attenuation correction is performed using a measured attenuation map. Analysis of the biokinetics can be performed on a pixel basis (7) or by applying identical ROIs to all images in a time series. Furthermore, an image of the cumulated activity can be formed, in which ROI analysis yields values that can be used directly for absorbed dose calculations.
This quantification methodology assumes that the patient position is the same in each imaging session. Image registration is required if the anterior and posterior images are acquired sequentially and the patient has moved between the scans, for alignment of images from a series of acquisitions, and for alignment to a transmission scan acquired on a separate occasion.
We present a method for registration of WB images that is applicable to between-emission image registration and to registration of transmission and emission images. Two methods have been presented for automatic registration of WB images (7,8). Both were used for emission scans only, either by manual or automatic extraction of features. Registration between emission and transmission images has been performed using point markers for regional spot scans (3). In this method, the mutual information (MI) is used as a measure of similarity (9–11). The MI method has proven to be robust and accurate in several studies (10,12) and, because it uses the entire image information, registration can be automated. With regard to spatial transformation, the registration of WB images places special demands on correcting for patient movement between scans. In this work, registration is accomplished using global linear scaling and translation, combined with local rotation, shearing, and curved transformation. The most suitable transformation was established after a survey among several candidate transformations.
The method is applied to clinical images of patients who are given radioimmunotherapy for non-Hodgkin’s lymphoma using an 131I-labeled antibody (13). Five intratherapy anteroposterior WB scans are acquired over a period of 1 wk. On each occasion, a geometric mean image is calculated by multiplication, where the posterior image is first registered to the anterior image. Transmission scanning is performed 1 d before injection, from which an attenuation factor image is calculated. Attenuation correction is performed by image multiplication after each of the geometric mean images has been registered to the attenuation factor image. For evaluation of the method, Monte Carlo simulation and a modified WB anthropomorphic computer phantom have been used.
MATERIALS AND METHODS
Regions for Local Transformation
The software consists of 2 parts: a preparation program and the main registration program. The preparation program is used to calculate the geometric mean and attenuation factor images from the acquired images and to determine the regions for the local spatial transformations. Figure 1 illustrates the regions, which are determined automatically. First, a threshold, calculated with an adaptive method, separates the patient from the background (14). A bounding box (xmin, xmax, ymin, and ymax) is determined from the minimal and maximal x- and y-coordinates in the threshold image. The coordinates xmidline, yhead, yupper, and ylegs are then calculated according to:
Eq. 1
Definition of regions for attenuation factor image. Bounding box, marked as rectangle surrounding patient, is determined from threshold image. Coordinates xmidline, yhead, yupper, and ylegs are calculated using Equation 1 and are used to define head region (x∈[0, xend-of-matrix], y∈[0, yhead]) (1), upper body (x∈[0, xend-of-matrix], y∈[0, yupper]) (2), pelvic region (x∈[0, xend-of-matrix], y∈[yupper, ylegs]) (3), right leg region (x∈[0, xmidline], y∈[ylegs, yend-of-matrix]) (4), and left leg region (x∈[xmidline, xend-of-matrix], y∈[ylegs, yend-of-matrix]) (5).
Five regions are determined covering the head, the upper body, the pelvic region, and each leg (Fig. 1). The constants in Equation 1 have been determined empirically from several patient WB images. Optionally, if the regions appear poorly positioned, an interactive display allows for manual readjustment.
Spatial Transformation
In the early stages of development, the spatial transformation consisted of a second-order polynomial (15) according to:
Eq. 2
where the unprimed system denotes the original coordinates and the primed system denotes the transformed coordinates, with transformation coefficients pij and qij.
The image quality of scintillation cameras does not allow for the determination of the 18 coefficients included in this transformation. Furthermore, some terms generate transformations that are not anatomically realistic. To simplify the expression, an investigation was conducted to identify the terms that could be removed. The usefulness of each term was ranked according to the resemblance to a patient movement, as judged by an experienced technologist. The sensitivity with which the MI measure was affected by each term was also investigated. Each coefficient was applied separately, and a registration to the original image was performed, where only the selected coefficient was allowed to vary in the registration. The coefficients that could be correctly recovered were highly ranked. Coefficient plots of the MI were also considered in the selection of the terms, where an increasing MI, with no local maxima and a clearly expressed peak at 0, was desired.
The following transformations were then considered most relevant:
Linear translations governed by the coefficients p00 and q00. The MI coefficient plot showed an increased slope for x–y translations near 0, and a coarse prepositioning was thus expected to make the optimization converge more rapidly.
The linear scaling coefficients p01 and q10. These were difficult to determine by optimization; therefore, the pixel dimensions were determined from measurements.
The linear shearing coefficients p10 and q01. These were useful but were sometimes difficult to recover. Good recovery and a better patient resemblance were achieved by applying the transformations locally (Fig. 2) and by constraining the transformation. For the head region, a rigid rotation described by the angle α was used with the center of rotation at (xmidline, yhead). For the upper body, a rotation angle, β, with the center of rotation at (xmidline, yupper) was used. For the leg regions, the coefficient p10 was determined for each leg separately, denoted p10L and p10R, centered at (xmidline, ylegs).
The coefficient q02 produced an image curvature along the x-direction. With the center at (xmidline, yupper), applied to the upper body and denoted q02U, it resembled a shrugging of the shoulders.
Test transformation of mass-density image. (A) Original image. (B) Transformed image. (C) Absolute difference between transformed image and original image. (D) Transformation applied to regular grid, shown for illustration. Coefficients used were p00 = −7, q00 = −7, p01 = 1.02, q01 = 1.02, α = −4, β = 5, p10L = −0.05, p10R = 0.07, and q02U = 0.0005. Euclidian distance is illustrated in A and B, where center of mass of evaluation regions for original image are marked as squares and as crosses for transformed image. Values are head region, 33 mm; trunk, 25 mm; right leg, 41 mm; and left leg, 9 mm, yielding average of 27 mm. Summation of counts in C yields value of SAD of 40%.
The final expression for the coordinate transformation is given by:
Eq. 3
where v, v2, and t are column vectors:
and S, Rγ, Q, and LX are 2 × 2 matrices:
The matrices t and S operate globally, whereas the other terms describe local transformations, with the factors ri determining the regions in which they are applied (Fig. 1): r1 = 1 for the head, r2 = 1 for the upper body, r4 = 1 for the right leg, r5 = 1 for the left leg, and ri = 0 otherwise (i = 1,…5). Figures 2B and 2D show an example of a spatial transformation.
Registration Procedure
Before registration, 1 image is defined as the match image and the other is defined as stationary, where the match image is spatially transformed at registration. The program is written so that any image can be used as match or stationary, and the effect of the use of the different images is discussed further in the Results.
A coarse preregistration is first performed by interpolation of the match image into the pixel dimensions of the stationary image, which are obtained from weekly calibration measurements. An approximate determination of translations t is then performed by a direct computation of the maximal cross-correlation, in the frequency domain. This preregistration is necessary because the image separator coordinates (xmidline,…,ylegs) determined for the stationary image are also used subsequently for the match image.
Optimization is then performed. Because the local transformations for the upper body and the legs are largely independent of each other, the optimization is divided into separate steps. The optimization is first done for the leg regions, including coefficients p10L, p10R, and p00. Initially, q00 was included, but it was excluded on finding that its value was little influenced. Optimization of the upper body includes α, β, q02U, p00, and q00. Initially, all 5 coefficients were optimized simultaneously, but, because of the occurrence of misregistration of q00 and q02U, these are optimized separately.
For optimization of the local transformations, the MI is calculated on the basis of the separate regions to avoid the coefficients for the legs being affected by a potentially mismatched upper body and vice versa. For optimization of p10L, p10R, and p00, the MI is calculated on the basis of the leg regions. Optimization of the coefficients p00, α, and β is performed with respect to the MI of the upper body. Optimization of q02U and q00 is based on the entire image MI because q00 requires information from the entire image for its determination.
In summary, registration is performed as follows:
Interpolation into equal pixel dimensions by a direct determination of S.
Approximate direct determination of t using cross-correlation.
Optimization of the coefficients p10L, p10R, and p00 on the basis of the MI of the leg regions.
Optimization of α, p00, and β on the basis of the MI of the upper body region.
Optimization of q02U and q00 on the basis of the MI of the entire image.
Readjustment of coefficients p10L and p10R on the basis of the MI of the leg regions.
Implementation and Application
The MI is implemented according to Studholme et al. (10), where estimates of the joint and marginal density distributions are calculated as the joint and marginal histograms. The histograms range from 0 to 255 and are normalized to unity. Powell’s algorithm is used for the optimization (16). To avoid the convergence to local maxima and ridges in the coefficient space, the magnitude of the initial search vectors has been adjusted for each coefficient to values on the order of what is typically obtained at registration.
Before the registration, some image preprocessing is performed to amplify features that enhance the similarity between the images. A logarithmic transform is applied to the emission images, after adding a value of 1 to account for the 0 values. The attenuation factor image is low-pass filtered to a spatial resolution comparable with that of the emission images. Thresholds are applied to remove the background outside the patient attributed to photons scattered in the couch. The registration transformation result is applied to the unprocessed images. To account for the discontinuities that may occur at the boundaries between the local regions, nearest-neighbor interpolation between the regions is used, and an averaging filter is applied.
All software is written in Interactive Data Language (Research Systems, Inc., Boulder, CO). The software is platform independent and has been successfully compiled and tested for Windows 98 (Microsoft, Redmond, WA), Compaq True-Unix (Compaq, Houston, TX), and Sun Solaris (Sun Microsystems, Mountain View, CA).
Imaging Systems
Clinical acquisitions were performed on a DIACAM scintillation camera (Siemens Gammasonic, Inc., Chicago, IL) connected to a NucLearMAC acquisition system (Scientific Imaging, Inc., Littleton, CO). A high-energy collimator was used for 131I measurements and a low-energy, all-purpose collimator was used for 57Co. A 15% energy window, centered at 364 keV for 131I and at 122 keV for 57Co, was used. For the transmission studies, an uncollimated 57Co flood source was mounted on the camera head at a distance of 100 cm. The patient position was adjusted in relation to the couch and to the directional lasers.
For test image simulation, a modified version of an anthropomorphic WB computer phantom (17,18) was used. The modification was performed to match the patient position during WB scanning. It consisted principally of a straightening of the originally angled arms, by cutting out the lower arms and rotating them in 3 dimensions, with care taken to maintain the continuity of the tissue compartments.
For simulation of a noise-free system with perfect spatial resolution, mass-density values were assigned to the various phantom organs (19,20). A projection of the mass-density distribution was obtained by line integration along the anteroposterior direction (Fig. 2A).
Monte Carlo simulation used the SIMIND code connected to a routine that allows photon interaction in the collimator (21,22). The activity in organs was estimated from clinical radioimmunotherapy WB and SPECT images using ROI analysis. Simulations of emission anteroposterior images were performed for 131I, with a complete decay scheme. Transmission studies of 57Co were simulated with and without the phantom present. On the basis of measurements in the clinical images, a background intensity of 15% of the average count density within the phantom was added homogeneously to the emission images. Nine different levels of Poisson noise were simulated. The clinical image count levels ranged between 104 and 107 counts, and realizations including 104 . (1, 2.5, 5, 10, 25, 50, 100, 250, 500) counts were made, corresponding to approximate signal-to-noise ratios (SNRs) of between 1 and 25. Figure 3 shows 3 geometric mean images with different noise levels together with a clinical image. Figure 4 depicts a simulated and a clinical attenuation factor image.
(A–C) Three noise realizations for simulated WB images. Image total counts are 104, 105, and 106 for A, B, and C, respectively, corresponding to SNRs of approximately 1, 4, and 12, respectively. (D) Patient image. In this patient, uptake in heart was higher than that used in simulation, and kidney uptake was lower. Source distribution was estimated from several patient images, and, as illustrated, individual variations occur.
WB attenuation factor images: simulated image (A) and patient image (B). Main differences between images occur at jaw because of amalgam fillings, in right lung because of portal catheter, and at hip because of implant at neck of femur. No negative effects on registration accuracy were seen because of such patient-related details. Noise level is generally low for clinical images; therefore, no separate noise simulations were performed for attenuation factor images.
Evaluation
The general evaluation methodology for the simulated images was to apply a test transform to the match image (M), yielding the image M′. Registration was performed between the stationary image and M′, followed by evaluation by measuring the deviation between the registered M′ and the original M image. The following figures of merit were used: (a) the Euclidian distance
between the center of mass of the images in pixels or mm and (b) the sum of the absolute difference
expressed as the percentage of the total number of counts in the original M image. The figures of merit were calculated for 4 separate regions covering the head, the trunk, and the left and right legs, as illustrated in Figures 2A–2C. These regions were identical to those used for registration (Fig. 1) apart from the trunk region, which extended from yhead to ylegs. Generation of test transformations was performed using Equation 3, with coefficient values randomly sampled within reasonable ranges. Average induced values for each evaluation region, obtained from 180 test images, are presented in Table 1. The induced value of the SAD for the WB was 98% ± 8%.
Euclidian Distance for Spatially Transformed Test Images
The registration performance for an ideal imaging system was evaluated by means of the mass-density projection image. Test versions of the image were generated whereby registration was performed to the original image.
To assess the accuracy of registration in a realistic situation, Monte Carlo–simulated images were used for registration of the 3 image couples: posteroanterior, geometric mean to attenuation factor, and attenuation factor to geometric mean. For each registration couple, the respective match image (posterior, geometric mean, attenuation factor) was test transformed. Registration was performed to the respective stationary image, and the accuracy was quantified by comparison with the original match images. The susceptibility to noise was evaluated by means of the images with different count levels. For each level, more than 2 noise realizations were made, and 20 versions of test images were registered. Registration was also performed with and without preprocessing of images. To assess the influence of collimator septal penetration, Monte Carlo simulation was performed with and without recording septal penetration counts, and the results were compared for a count level of 106.
The improvement gained through using the complete registration scheme in comparison with a direct, rigid method was also evaluated. The direct method consisted of pixel resizing and rigid translation obtained by cross-correlation (i.e., steps 1 and 2 in the registration scheme).
In 2 clinical studies, the registration accuracy was assessed by comparison with the position of 57Co markers, attached at the sternal notch, the pelvic bones, and the umbilicus, for the 131I emission and 57Co transmission acquisitions. For 131I acquisitions, the marker image was acquired in a separate energy window. The coefficient values resulting from the registration were applied to the marker image, whereby the centroid position of each marker was determined.
RESULTS
For the mass-density projection image, the average Euclidian distance was 0.4 ± 0.2 mm, with a maximal deviation of 2 mm. The average value of the SAD for the whole body was 1.5% ± 0.2%.
Figure 5 shows the errors obtained for the 9 count levels. Each data point represents an average of the 4 evaluation regions and the 20 test registrations. Similar results are obtained for the posteroanterior registration, regardless of the use of preprocessing (Fig. 5A). For count levels of >2.5 . 105, the error distance is between 1 and 4 mm, with an average of 2 ± 1 mm. The error and the associated SD increase gradually for count levels of <105 counts. At 5 . 104 counts (SNR, approximately 2), values of about 10 ± 5 mm are obtained, and, for 104 counts (SNR, approximately 1), the average error distance is 20 ± 10 mm. For the geometric mean to attenuation factor registration, results are similar to those for the posteroanterior registration for count levels of >2.5 . 105, with an average error distance of 2 ± 1 mm (Fig. 5B). For lower count levels, the error shows a sharp increase, and, for 104 counts, values similar to those induced for test transformation are obtained (Table 1). The results obtained when reversing the choice of the match image (not shown) are similar, although a subtle difference is the effect of the preprocessing. When using the attenuation factor as the stationary image, no effect is seen on applying the preprocessing, whereas for the reverse a slight improvement is seen for the lower count levels. Generally, for all data points in Figures 5A and 5B, 1 SD was approximately 50% of the error distance.
Euclidian distances obtained for registration of posteroanterior images (A) and attenuation factor to geometric mean images (B). Results are displayed as function of total number of counts in images. Results, including preprocessing, with associated error bars (mean ± SD) (○). (A) Error bars for 106, 2.5 . 106, and 5 . 106 counts cannot be seen on scale but were within ±1 mm for all 3 points. (B) SDs for 5 . 105, 106, 2.5 . 106, and 5 . 106 counts were within ±1 mm. Results for unprocessed images, where error bars have been omitted for clarity (•).
For the registration of posteroanterior and geometric mean to attenuation factor images with a count level of 106, error distances are ≤2 ± 1 mm for all anatomic regions (Table 2). Results are also given for registration performed with the direct, rigid method, in which the error distance lies between 6 ± 4 mm and 23 ± 13 mm for all anatomic regions with the largest obtained discrepancies for the head region and the legs. The values of the SAD for the posteroanterior rigid registration were obtained to 47% ± 6% calculated for the whole body and to 28% ± 6% using the complete registration method. For the geometric mean to attenuation factor registration, 44% ± 6% was obtained for the rigid method and 32% ± 5% was obtained for the complete method. The SAD arising from noise alone was computed from various noise realizations to a value of 27%. Thus, the WB SAD related to the registrations was 1% (28% − 27%) for the posteroanterior registration and was 5% for the geometric mean to attenuation factor, using a complete registration scheme, whereas for the rigid registration, the SADs were 19% and 17%.
Registration Results, in Terms of Euclidian Distance (mm), for Count Level of 106
No clear trend could be seen in the registration accuracy when septal penetration was included in the simulations. For similar count levels, the results were not significantly different from those obtained without septal penetration. The deviation assessed from the positions of the point markers in the clinical images was 9 ± 5 mm.
DISCUSSION
The novelty of this work compared with previous studies (1,3,7,8) is the use of MI for WB images in conjunction with a tailored spatial transformation. Furthermore, monomodality (posteroanterior) and multimodality (geometric mean to attenuation factor) image registration is addressed.
To date, more than 50 clinical images have been registered and, in general, good results have been obtained. The automatic division into local regions works for about 50% of the patient images. Manual readjustment has been required in cases of poorly positioned patients, where the image is truncated or the legs are angled and therefore fall outside the subregions. Further, the thresholding algorithm fails to find the patient outline for high-noise cases and in the presence of patient-related details, which alter the general intensity distribution (Fig. 4). For the preprocessing of images, a logarithmic transform is used because it amplifies the low image intensities (e.g., the blood background and scatter), which, to some extent, reflect the patient outline. In this study, the effect of image preprocessing was small in most cases because the body contour was well delineated with the present activity distribution. In preliminary studies using an activity distribution that is primarily concentrated to bone, preprocessing shows a considerable effect on the registration results. Attempts to use Compton scatter images for registration have also been made (23).
The MI measure has shown to be robust in that it copes well with noisy images and also shows no distinguishable effects from various artifacts in the clinical images (Fig. 4). The use of point markers (1,3) does not represent an alternative for the current application because of the radiation dose to the personnel. Injected activity levels are several gigabecquerels, and the laborious attaching of markers is not considered justified. Each patient is imaged on at least 5 occasions, and some return for a second treatment regime. Therefore, the error associated with the positioning of the markers is prone to be high. Other suggested similarity measures are manually selected landmark regions (7) and grid points generated after the images have been segmented into a binary version (8). Landmark identification is difficult in this study because of the high blood background and a generally diffuse uptake pattern. At late time points, the high level of noise makes segmentation difficult.
Spatial transformations of various complexity have been used in previous studies, including translation–rotation only (1), scaling (7), local projective transformations (8), and full second-order polynomial warping (3). Through comparisons with a less-detailed method, we concluded that registration using pixel resizing combined with linear translations does not yield sufficiently accurate results. We also concluded that a full second-order polynomial transformation is not achievable for the image information at hand. The transformation we used was established as a compromise, where the most relevant aspects required for the registration are included. In Equation 3, no local term is used for the pelvic region (r3 is always 0) because the patient positioning procedure is assumed to account for this position. Furthermore, patient movements that alter the lines of projection in the formation of the WB images, such as trunk twisting and rolling, are not modeled. A potential problem is the discontinuity of the transformation, which may affect the appearance of tumors situated at the boundaries of the transformation regions. Several registered images have been inspected using surface plots, and the areas between the regions have appeared smooth. However, in the future, a more sophisticated interpolation method may be desired.
The Monte Carlo simulation technique is useful for validation because the physics involved in the image formation can be modeled in detail, and images from different modalities that exactly match each other can be obtained. In the early stages of development, we found that the patient shape had a major influence on the registration and, thus, the straightening of the phantom’s arms was important. For the evaluation, 2 figures of merit were used: the SAD, which reflects the registration error on a pixel level, and the Euclidian distance, which reflects the mean distance error for 4 image regions. In this article, the Euclidian distance is generally stated as a measure of the accuracy because the SAD is influenced by the noise, which complicates comparisons of results from different image types.
The results for the mass-density projection image show that the method works well for an ideal system. The differences obtained between the registered and original image are distributed evenly over the image, suggesting that they originate mainly from interpolation effects. Generally, the largest systematic errors are obtained in the cranioinferior direction because of the appearance of WB images. A great deal of registration information is contained in the x-direction, whereas for the y-direction, the anatomic regions that supply information are the top of the head, the shoulders, the crotch, and the edges of the feet. Consequently, the appearance of the shoulders in the image was deemed to be important for the registration. Noisy images lead to larger errors and a larger variance in the results. Good results are obtained for posteroanterior registration, even at low count levels. Registration of attenuation factor and geometric mean images yields acceptable results down to levels of about 105 counts. For the lower count levels, the optimization of the local transformations fails. The conclusions regarding clinical applications are that registered images must be carefully inspected for count levels of <105, especially for registration between geometric mean and attenuation factor images. The proposed method is generally advantageous compared with simpler methods. However, for noisy images, a less flexible method may work better, and we are currently investigating which transformations should be included in such a case. A particular application here is dosimetry calculations, where late time points are of great importance.
CONCLUSION
A new registration method for WB scintillation-camera images has been presented. The method was evaluated using Monte Carlo simulations and has to date been applied to more than 50 clinical images. Our results show that the method is reliable and robust and satisfies the requirements for the current application. Future studies will focus on the use of the registration method for absolute activity quantification.
Acknowledgments
The authors thank Ola Lindén, MD (Department of Oncology, Lund University Hospital), for assistance in the definition of the activity distribution. This work was funded by the Swedish Cancer Foundation, the Gunnar Nilsson Foundation, the Berta Kamprad Foundation, the Royal Physiographic Society, the John and Augusta Persson Foundation, and the Lund University Hospital Foundations.
Footnotes
Received Dec. 8, 2000; revision accepted Jun. 12, 2001.
For correspondence or reprints contact: Katarina Sjögreen, MSc, Department of Radiation Physics, Lund University Hospital, SE-22185 Lund, Sweden.