Clinical Investigations
1 Department of Radiology/Nuclear Medicine, Memorial Sloan-Kettering Cancer Center, New York, New York
2 Department of Medical Physics, Memorial Sloan-Kettering Cancer Center, New York, New York
3 Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York
| ABSTRACT |
|---|
Key Words: PET; image reconstruction; standardized uptake value; cancer; 18F-FDG
| INTRODUCTION |
|---|
The standardized uptake value (SUV) is a commonly used parameter in clinical practice to assess semiquantitatively the intensity of 18F-FDG uptake in tumors (6–8). However, Ramos et al. (9) suggested that SUVs derived from filtered backprojection (FBP) images underestimate the true activity concentration in tissues. In normal tissues, SUVs for various organs were on average 20% lower when measured on FBP images as compared with iterative reconstruction (IR) images. Boellaard et al. (10) reported similar findings in 3 patients with lung cancer. They found quantitative measurements of glucose uptake to be 5%–40% higher when derived from iteratively reconstructed images as compared with FBP images. Similarly, Visvikis et al. (11) reported differences of 5%–20% between SUVs derived from IR images and those from FBP images. However, none of these previous studies performed a systematic comparison of SUV measurements between IR and FBP reconstructed images and, to our knowledge, no study has focused on the potential clinical implications of this phenomenon. For instance, changes in SUV over time are frequently used to evaluate the response to therapy in cancer patients (7,12). In addition, an SUV of 2.5 is commonly used to differentiate between 18F-FDG uptake in benign versus potentially malignant lesions (13,14). However, this number is (almost) exclusively based on FBP reconstructed PET studies, whereas IR is now being used increasingly in many institutions. The aim of this study was, therefore, to evaluate the accuracy of SUV measurements from FBP and IR images in phantom studies and to assess the magnitude and potential clinical implications of differences in SUV measurements between FBP images and IR images in cancer patients.
| MATERIALS AND METHODS |
|---|
Patient Studies
With approval of the data analysis by the Institutional Review Board, we analyzed whole-body 18F-FDG PET studies of 85 consecutive cancer patients (51 males, 34 females), with a mean age of 60 ± 17 y (range, 14–83 y). Imaging was done for primary staging or treatment evaluation. Malignancies included lymphoma, melanoma, lung cancer, colorectal cancer, esophageal cancer, breast cancer, head and neck cancer, mesothelioma, and neuroblastoma. Patients were injected intravenously with 370–555 MBq (10–15 mCi) 18F-FDG, depending on body weight and habitus, and images were acquired 45–60 min after injection. Before tracer injection, patients fasted for at least 6 h, although liberal water intake was encouraged.
PET
All studies were performed in 2-dimensional (2D) mode using an Advance whole-body tomograph (General Electric Medical Systems). This tomograph has an axial field of view of 15.2 cm and a transaxial field of view of 55 cm. The spatial resolution is 4.3 mm (full width at half maximum [FWHM]) at the center of the field of view, deteriorating to 7.5 mm at 20 cm off axis. Emission images were acquired first for 4–5 min per bed position, followed by transmission images for 3 min per bed position, using 68Ge rod sources.
Image Reconstruction
Comparison of Standard Clinical Image Sets.
In a first step, we compared image sets that were reconstructed using our standard clinical parameters, which were chosen previously because they provide images of good diagnostic quality.
For FBP image reconstruction, the FBP+MAC algorithm (filtered backprojection with measured attenuation correction) was used. FBP images were reconstructed at a 128 × 128 matrix using a Hanning filter with a cutoff frequency of 2.0 cycles/cm (8.5 mm). A nonquantitative filter with 3.5 cycles/cm (15 mm) was used for smoothing of the transmission data. This filter was chosen because the 3-min transmission images are rather noisy and considerable smoothing is necessary to obtain attenuation-corrected images of optimal quality (9). With rod-based transmission scans, some pixels may register negative values, which are caused by the real-time randoms correction (i.e., by subtracting the delayed channel from the prompt channel). Because these negative values do not represent valid data, they have to be replaced before any correction is applied. For the nonquantitative filtering applied here, negative values were therefore replaced with the lowest possible positive number of counts (1.0 count) before smoothing. Replacing negative values with 1.0 biases the transmission attenuation value and leads to an unavoidable underestimation of the activity in the reconstructed emission image.
For the IR, the IR+SAC (iterative reconstruction with segmented attenuation correction) algorithm was used as described previously (9), applying the expectation maximization (EM) algorithm as described by Shepp and Vardi (16) with ordered subsets (OS) (17). One OSEM iteration is defined as a single pass through all subsets; the pass was repeated a second time because 2 iterations gave the best balance between convergence and reconstruction time. OSEM parameters of 28 subsets and 2 iterations, with a loop filter of 1 cycle/cm (4.3 mm) and a postfilter of 1.4 cycles/cm (6 mm), appeared optimal in terms of image quality, as agreed on by a consensus of 4 experienced nuclear medicine physicians at our institution (9). A gaussian filter with a cutoff of 1.9 cycles/cm (8 mm) was chosen for smoothing of the transmission data. Further details of SAC used for this study were described previously (9). The final reconstructed slice thickness was 4.25 mm for both image sets.
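To make the "one iteration equals one pass through all subsets" convention concrete, the following is a minimal toy sketch of the OSEM update in plain Python. It is not the scanner's implementation: the 4 × 2 system matrix, the counts, and the subset grouping are hypothetical illustration values, and the loop filter, postfilter, and attenuation correction described above are deliberately omitted.

```python
def osem(system_matrix, counts, subsets, n_iterations, start=None):
    """Toy OSEM: one iteration is one pass through every subset of projections."""
    n_pixels = len(system_matrix[0])
    image = list(start) if start is not None else [1.0] * n_pixels
    for _ in range(n_iterations):
        for subset in subsets:
            # Sensitivity of each pixel to the projections in this subset.
            sens = [sum(system_matrix[i][j] for i in subset) for j in range(n_pixels)]
            # Ratio of measured to forward-projected counts, per projection.
            ratios = {}
            for i in subset:
                projected = sum(a * x for a, x in zip(system_matrix[i], image))
                ratios[i] = counts[i] / projected if projected > 0 else 0.0
            # Multiplicative EM update restricted to the current subset.
            for j in range(n_pixels):
                if sens[j] > 0:
                    back = sum(system_matrix[i][j] * ratios[i] for i in subset)
                    image[j] *= back / sens[j]
    return image


# Hypothetical 2-pixel object seen by 4 projections, split into 2 subsets.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 2.0]]
true_image = [2.0, 3.0]
y = [sum(a * x for a, x in zip(row, true_image)) for row in A]  # noise-free counts
estimate = osem(A, y, subsets=[[0, 2], [1, 3]], n_iterations=2)
print(estimate)
```

With noise-free data the true image is a fixed point of the update, and each pass moves a uniform starting image closer to it. In the clinical setting the number of subsets and iterations changes how far the estimate has converged, which is one reason SUVs depend on these parameters.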
SUV Calculation.
Coronal images reconstructed with either FBP or IR were displayed simultaneously on the monitor. For phantom images, a circular region of interest (ROI) was placed within the sphere. For each clinical study, circular ROIs were placed in the normal tissue of the right lobe of the liver, in the urinary bladder, and in tumor lesions in the chest, abdomen, or pelvis. The ROIs had a size of 50 pixels, where the pixel size is 4.3 × 4.3 mm. Hence, the ROI area is approximately 9.25 cm2, corresponding to a diameter of approximately 3.4 cm. Liver and urinary bladder were chosen because they represent 2 extremes regarding the intensity of 18F-FDG uptake in whole-body studies: The liver normally shows homogeneous tracer uptake of low intensity, whereas the urinary bladder is usually the location of highest activity. A tumor lesion was defined as abnormal focal 18F-FDG uptake above background level and outside of normal anatomic structures. The SUV was calculated as follows:
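$$\mathrm{SUV}\ (\mathrm{g/mL}) = \frac{\text{decay-corrected activity concentration in ROI (kBq/mL)}}{\text{injected } {}^{18}\text{F-FDG activity (kBq)} \,/\, \text{body weight (g)}}$$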
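As a worked illustration of the body-weight-normalized SUV (units of g/mL, as reported in the Results) and of the ROI geometry quoted above, a minimal Python sketch follows. The function names and the example patient values are hypothetical, and the tissue concentration is assumed to be already decay corrected to the time of injection.

```python
import math


def suv_body_weight(tissue_kbq_per_ml, injected_kbq, body_weight_g):
    """SUV in g/mL: tissue concentration divided by injected activity per gram."""
    return tissue_kbq_per_ml / (injected_kbq / body_weight_g)


def roi_geometry(n_pixels=50, pixel_mm=4.3):
    """Area (cm^2) and equivalent circular diameter (cm) of an ROI of n_pixels."""
    area_cm2 = n_pixels * pixel_mm ** 2 / 100.0  # mm^2 -> cm^2
    diameter_cm = 2.0 * math.sqrt(area_cm2 / math.pi)
    return area_cm2, diameter_cm


# Hypothetical patient: 370 MBq injected, 74 kg body weight, 10 kBq/mL in the ROI.
print(suv_body_weight(10.0, 370_000.0, 74_000.0))
print(roi_geometry())
```

The second call reproduces the ROI figures given in the text from the 50-pixel count and the 4.3-mm pixel size.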
Additional Comparison Using MAC and SAC for Attenuation Correction.
To evaluate specifically to what degree differences in SUV are related to image reconstruction (IR vs. FBP) as compared with the differences in the processing of transmission data for the attenuation correction (SAC vs. MAC), additional analysis was performed in a subset of 15 patients with 24 tumor lesions. Images were reconstructed using the following 4 combinations: FBP+MAC, FBP+SAC, IR+SAC, and IR+MAC. In this subgroup of patients, circular ROIs (50 pixels) were again placed in tumor lesions, liver, and urinary bladder, and SUVs were calculated as described above.
Statistical Analysis.
SUVmax and SUVavg for liver, urinary bladder, and tumor lesions were tabulated; all data are shown as mean ± SD. A paired t test was used to compare SUVmax and SUVavg derived from FBP images versus IR images using our clinical standard parameters IR+SAC and FBP+MAC. Least-squares regression analysis was used to evaluate the correlation between SUV measurements from the 2 reconstruction methods and, in the phantom study, the correlation between SUV measurements and true activity concentrations.
To further address the effect of different methods of attenuation correction (MAC vs. SAC), in a subset of 15 patients SUVs were compared for all possible pairs of IR+SAC, IR+MAC, FBP+MAC, and FBP+SAC using a paired t test, and P values were adjusted for multiplicity (18). This analysis was repeated for each organ site (tumor, bladder, and liver) as well as for SUVmax and SUVavg separately. For all analyses, a P value of <0.05 was considered significant.
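The pairwise comparison described above can be sketched in a few lines of Python. The paired t statistic is standard; the multiplicity adjustment shown here is the Holm step-down procedure, which is an assumption made for illustration only, since the text does not state which method reference 18 describes. All numeric inputs are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """t statistic for paired samples (n - 1 degrees of freedom)."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def holm_adjust(p_values):
    """Holm step-down adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, k in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[k])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[k] = running_max
    return adjusted


# Hypothetical SUVs for 3 lesions under two reconstructions, and 3 raw P values.
print(paired_t_statistic([1.0, 2.0, 3.0], [0.5, 1.0, 1.5]))
print(holm_adjust([0.01, 0.04, 0.03]))
```

With 4 reconstruction combinations there are 6 possible pairs per organ site and SUV type, so the adjustment would be applied to 6 raw P values at a time.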
| RESULTS |
|---|
Values of SUVmax and SUVavg for liver tissue, urinary bladder, and tumor lesions are shown in Figures 2 and 3. Regardless of the location, SUVs derived from FBP images were significantly lower than those derived from IR images (SUVavg for liver = 1.5 ± 0.3 vs. 2.1 ± 0.4 g/mL; SUVavg for urinary bladder = 23 ± 14 vs. 35 ± 23 g/mL; SUVmax for liver = 3.0 ± 0.6 vs. 3.6 ± 0.9 g/mL; SUVmax for urinary bladder = 45.9 ± 43.1 vs. 67.4 ± 49.4 g/mL; all P < 0.01). A similar discrepancy was noted for tumor lesions (SUVavg = 4.4 ± 2.5 vs. 6.1 ± 3.7 g/mL and SUVmax = 7.1 ± 5.3 vs. 10.7 ± 8.1 g/mL; both P < 0.01). This difference between SUVs derived from FBP versus IR images was consistently observed for every single lesion in all patients. Discrepancies between measurements became more apparent with increasing activity concentration and, in some cases, were as high as 55% (Fig. 4). An example of a patient with diffuse large B-cell lymphoma in the abdomen is shown in Figure 5.
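To put the reported group means on a common scale, the short Python sketch below expresses each FBP mean as a percentage below the corresponding IR mean. The numbers are the means quoted in the paragraph above; the choice of the IR value as the denominator is an assumption made for illustration, and differences of group means are not the same as the per-lesion discrepancies shown in Figure 4.

```python
# (FBP mean, IR mean) in g/mL, taken from the text above.
reported_means = {
    "liver SUVavg": (1.5, 2.1),
    "bladder SUVavg": (23.0, 35.0),
    "liver SUVmax": (3.0, 3.6),
    "bladder SUVmax": (45.9, 67.4),
    "tumor SUVavg": (4.4, 6.1),
    "tumor SUVmax": (7.1, 10.7),
}


def percent_lower(fbp, ir):
    """How far the FBP value lies below the IR value, as a percentage of IR."""
    return 100.0 * (ir - fbp) / ir


for site, (fbp, ir) in reported_means.items():
    print(f"{site}: FBP is {percent_lower(fbp, ir):.0f}% lower than IR")
```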
| DISCUSSION |
|---|
Methodologic Considerations
For this study, we have used the same clinical imaging parameters that we have been using for the past 8 y. Therefore, our findings truly reflect daily clinical practice. These parameters generate images of acceptable diagnostic quality in most cases. It is conceivable that reconstruction parameters would have to be altered for certain patient groups (e.g., small children or obese patients) to improve image quality. However, any such alteration of reconstruction parameters would introduce additional uncertainty; since we frequently use the SUV for treatment evaluation and for all patients enrolled in clinical research protocols, we prefer a standardized approach to PET imaging.
Reconstruction Method
In addition to method-inherent differences in reconstruction algorithms for the emission image, differences in activity quantification between FBP and IR might be affected by the filter selection. The reconstruction filter is the single most important factor determining the final image resolution. However, this is a variable effect that can be changed by selecting a filter with a smaller FWHM. In comparison, the combined effect of energy of the positron and noncolinearity on the resolution of a clinical (gantry opening, 60–70 cm) PET scanner is about 1.8 mm. The effect of scatter is about 1.5 mm. So, if we combine the effect of positron energy, noncolinearity, and scatter, the cumulative effect for the resolution is 3–3.5 mm, which is a major effect for the overall system resolution. In any event, this use of different filters for FBP and IR reconstructions was based on the attempt to produce images of good diagnostic quality, but this may have led to differences in image resolution (smoother images have a lower spatial resolution) and may have contributed to the observed differences in SUV measurements.
With IR, SUV measurements are sensitive to changes in the number of iterations and subsets. Our parameters of 28 subsets and 2 iterations were based on extensive phantom studies to determine the most suitable default parameters for OSEM reconstructions to generate PET images of superior diagnostic quality (19).
Attenuation Correction
In this study, the change of the method for reconstructing the emission image (FBP vs. IR) caused smaller changes in measured activity concentration than changing the method of attenuation correction (MAC vs. SAC) while keeping the reconstruction of the emission image identical (Table 1). The following provides some explanation for discrepancies in SUV measurements that are related to the preferred method for attenuation correction. Transmission data need to be smoothed before they can be applied for the attenuation correction of PET emission images. When attenuation-corrected FBP images are generated, we routinely use a nonquantitative filter for the smoothing of transmission data to reduce image noise that could interfere with the study interpretation. Nonquantitative filtering consists of putting 1 count into transmission sinogram elements with zero or negative counts. This is very effective in reducing streaks in the image that are induced by transmission noise. However, each of these replacements contributes to an underestimation of transmission attenuation by the patient's body. With an increasing number of such corrections or replacements, the true activity concentration on the attenuation-corrected emission image is increasingly underestimated. The gaussian smoothing of transmission data eliminates this "replace with ones" step for the negative values in the transmission scan. Therefore, the bias for underestimation in the transmission data is reduced but, at the same time, the probability for streaks in the attenuation-corrected emission image increases. In contrast, when SAC is applied as part of IR (i.e., IR+SAC algorithm), a nonquantitative filter cannot be used because SAC processing is based on creating histograms and organizing pixel values into 3 compartments: soft tissue, lung, and bone. If a nonquantitative filter were to be applied, as for the FBP images, negative values would have to be replaced with 1.0 count.
In the case of IR, such deliberate alteration of counts per pixel may cause misclassification of certain pixels as belonging to one of the other tissue compartments. This misclassification would introduce a bias into the processing of the transmission data that would eventually affect the attenuation-corrected emission images. Therefore, for SAC processing, a gaussian filter is preferred.
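The direction of the bias introduced by the "replace with ones" step can be illustrated with a small Python sketch. The transmission counts and blank-scan value below are hypothetical, and the attenuation correction factor is simplified to the ratio of blank to transmission counts for a single line of response.

```python
from statistics import mean


def replace_nonpositive(counts, floor=1.0):
    """Nonquantitative step: put `floor` counts into elements that are zero or negative."""
    return [c if c > 0 else floor for c in counts]


def attenuation_correction_factor(blank_counts, transmission_counts):
    """Simplified correction factor: blank scan counts over transmission counts."""
    return blank_counts / transmission_counts


# Hypothetical noisy transmission counts after randoms subtraction (true mean = 2).
raw = [-3.0, 0.0, 2.0, 5.0, 6.0]
fixed = replace_nonpositive(raw)

acf_unbiased = attenuation_correction_factor(100.0, mean(raw))
acf_biased = attenuation_correction_factor(100.0, mean(fixed))
print(mean(raw), mean(fixed), acf_unbiased, acf_biased)
```

Raising the low-count elements inflates the mean transmission counts, so the body appears less attenuating, the correction factor shrinks, and the corrected emission activity is underestimated, which matches the direction of the FBP+MAC bias described in this section.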
In summary, though changes from IR to FBP (or vice versa) do affect the measurement of activity concentration, such differences in SUV measurements are to a larger degree related to the way transmission data are processed. However, the use of different filters for the smoothing of transmission data is necessary to obtain attenuation-corrected images of optimal quality for clinical interpretation. Although this may be considered an "unfair" comparison between various ways of image processing, this approach does reflect daily clinical practice with the intent to achieve optimal image quality. For instance, when using SAC for FBP images with transmission data filtered with an 8-mm gaussian filter (same setting as for IR transmission filtering), the resulting images are very noisy and of low quality (Fig. 6). On the other hand, for the reasons explained in the previous paragraph, nonquantitative transmission smoothing cannot be used for IR images.
Clinical Implications
Since the introduction of seamless whole-body imaging by Dahlbom et al. several years ago (20), PET has continuously gained clinical acceptance and is now an essential and widely used modality for the staging and treatment evaluation in patients with a large variety of malignancies (1,2). The routine clinical use of attenuation correction and IR has vastly improved the quality of PET images. It appears that many institutions are now relying on IR images because they appear generally less noisy and easier to interpret (4). Some studies also suggest that the detection rate of tumor lesions (with abnormal FDG uptake) is higher on IR than on FBP images (19).
Optimal image quality is an essential prerequisite for the proper review and accurate interpretation of whole-body PET studies. Therefore, image reconstruction parameters need to be adjusted accordingly, even if this may cause some under- or overestimation of the true activity concentration in a given lesion using semiquantitative (SUV) analysis. Although there is a close correlation between true activity concentration and SUVs from FBP images as well as between SUVs from FBP and IR images, significant differences were noted. Previous studies have also investigated the variability in activity quantification with different image reconstruction methods (9–11). In addition, Ramos et al. (9) reported a considerable within-site variability of SUV measurements from FBP reconstructed images. This is probably due to the fact that FBP has a tendency to underestimate the true activity concentration (or introduce negative bias), which is the result of the filtering step. The magnitude of this effect can be influenced by the specific algorithm used, object size, and the activity concentration in the object. However, none of these prior studies focused on the inevitable clinical implications in cancer patients.
Systematic underestimation of the true radiotracer concentration in tissue would not affect the interpretation when repeated PET studies (for instance, for the evaluation of the response to therapy) are reconstructed with the same method as in the baseline study. In contrast, the study interpretation would change considerably when repeated PET studies are reconstructed using different algorithms: If the initial study is reconstructed with IR and follow-up studies are reconstructed with FBP, an apparent decrease in SUV might then falsely imply a response to therapy. Conversely, an apparent increase in SUV on a repeated PET study might be related solely to the use of a different image reconstruction method (such as the use of IR during follow-up and FBP for the baseline study). Such misinterpretations might be avoidable within a given institution where set protocols exist for image acquisition and reconstruction and physicians are familiar with these settings. In fact, the present study was prompted when we observed a 50% difference in tumor SUVs between the baseline and follow-up study in a patient with stable tumor markers and clinically and radiographically stable disease. This difference was solely due to the use of different image reconstruction parameters. However, a greater problem may arise when serial PET studies are performed at different institutions, not an uncommon scenario in patients seeking a second opinion or being referred to a tertiary care center from outside institutions. Similarly, the validity of apparently established cutoff values for SUV needs to be reassessed in light of these findings. If an SUV of 2.5 were a widely accepted cutoff for the differentiation between malignant and benign lesions (14), a cutoff that is at least 20% higher (i.e., 3.0) would appear appropriate for images reconstructed with IR. Obviously, this would have to be confirmed before applying it in clinical practice.
Though it is clear that SUV should not be taken as the single parameter in PET study interpretation, it is also true that many clinicians and research protocols rely heavily on changes (or lack of changes) in SUV to guide their treatment approach. In fact, the European Organization for Research and Treatment of Cancer has published specific guidelines for the interpretation of FDG PET studies performed for the evaluation of treatment response in cancer patients; for instance, an increase in tumor SUV by 25% is considered an indicator of progressive disease (21). As shown in the present study, differences in SUV much greater than 25%, which are solely due to differences in image reconstruction, can be observed in clinical practice. Of note, discrepancies between SUV measurements increased with increasing activity concentration. Thus, differences in SUV measurements will be most apparent for intensely hypermetabolic lesions and in this study were as high as 55%.
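The interaction between a reconstruction change and a fixed response threshold can be sketched as follows. The 25% progression criterion is the one quoted above; the example SUVs are the mean tumor SUVmax values reported in the Results (7.1 g/mL for FBP, 10.7 g/mL for IR), used here as a hypothetical patient whose tumor is biologically unchanged. Treating exactly 25% as meeting the criterion is an assumption.

```python
def percent_change(baseline_suv, followup_suv):
    """Relative change in SUV from baseline to follow-up, in percent."""
    return 100.0 * (followup_suv - baseline_suv) / baseline_suv


def meets_progression_threshold(baseline_suv, followup_suv, threshold_pct=25.0):
    """True if the SUV increase reaches the progression threshold."""
    return percent_change(baseline_suv, followup_suv) >= threshold_pct


# Same (unchanged) lesion: baseline reconstructed with FBP, follow-up with IR.
print(percent_change(7.1, 10.7), meets_progression_threshold(7.1, 10.7))
# Reverse order: baseline with IR, follow-up with FBP mimics a response.
print(percent_change(10.7, 7.1))
```

In both directions the apparent change comes entirely from the reconstruction method, which is why baseline and follow-up studies need to be processed identically before a threshold of this kind is applied.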
Some authors have questioned the usefulness of SUV measurements altogether (22,23); others have suggested normalization of injected activity to lean body mass (24); in children, body surface area seems to be more appropriate for normalization (25). Nevertheless, the SUV, usually normalized to body weight, is the most frequently used parameter for daily clinical use. This is justified and supported by several studies: For instance, the SUV at the time of initial diagnosis appeared to correlate with the aggressiveness of the primary tumor and the eventual clinical outcome after therapy (6,8,26). Others reported that changes in the SUV during treatment appeared highly predictive of the final treatment response and for the identification of nonresponders (7,12). Therefore, in spite of its limitations, the SUV is, and will likely remain, the most commonly used parameter for the quantification of tracer uptake in clinical PET imaging. Many factors can influence SUV measurements (22,23); the current study emphasizes the need to consider differences in image reconstruction parameters as yet another potential source of changes in SUV, which are not related to (tumor) biologic changes. In light of these findings, some consensus, if not standardization, regarding image acquisition and reconstruction parameters appears warranted.
| CONCLUSION |
|---|
| FOOTNOTES |
|---|
For correspondence or reprints contact: Heiko Schöder, MD, Department of Radiology/Nuclear Medicine, Memorial Sloan-Kettering Cancer Center, Box 77, 1275 York Ave., New York, NY 10021.
E-mail: schoderh{at}mskcc.org.
| REFERENCES |
|---|