Abstract
In 2005, 8 Imaging Response Assessment Teams (IRATs) were funded by the National Cancer Institute (NCI) as supplemental grants to existing NCI Cancer Centers. After discussion among the IRATs regarding the need for increased standardization of clinical and research PET/CT methodology, it became apparent that data acquisition and processing approaches differ considerably among centers. To determine the variability in detail, a survey of IRAT sites and IRAT affiliates was performed. Methods: A 34-question instrument evaluating patient preparation, scanner type, performance approach, display, and analysis was developed. Fifteen institutions, including the 8 original IRATs and 7 institutions that had developed affiliate IRATs, were surveyed. Results: The major areas of variation were 18F-FDG dose (259–740 MBq [7–20 mCi]), uptake time (45–90 min), sedation (never to frequently), handling of diabetic patients, imaging time (2–7 min/bed position), performance of diagnostic CT scans as a part of PET/CT, type of acquisition (2-dimensional vs. 3-dimensional), CT technique, duration of fasting (4 or 6 h), and (varying widely) acquisition, processing, display, and PACS software, with 4 sites stating that poor-quality images appear on PACS. Conclusion: There is considerable variability in the way PET/CT scans are performed at academic institutions that are part of the IRAT network. This variability likely makes it difficult to quantitatively compare studies performed at different centers. These data suggest that additional standardization in methodology will be required so that PET/CT studies, especially those performed quantitatively, are more comparable across sites.
PET with 18F-FDG has been established as a broadly useful technique in cancer imaging, especially in cancer diagnosis, staging, and treatment response assessment. The utility of the PET/CT approach has been widely appreciated, and randomized trials in lung cancer have shown that PET reduces rates of unnecessary thoracotomies (1–3). The Centers for Medicare and Medicaid Services have approved PET with 18F-FDG for a broad range of indications in most cancers, supporting the value of the method. However, comparing PET studies from institution to institution can be challenging, in part because of methodologic variability.
In 2005, 8 Imaging Response Assessment Teams (IRATs) were funded by the National Cancer Institute (NCI) as supplemental grants to existing NCI-designated Cancer Centers. The major rationale for supporting these teams was to increase the appropriate use of quantitative medical imaging in clinical trials. In addition, through annual meetings and frequent telephone conference calls, the 8 original IRATs, as well as other additional nonfunded IRATs, have worked together on several group initiatives.
One of the IRAT national groups was the PET/CT subcommittee, chaired by 2 authors of this paper, Michael Graham and Richard Wahl. After some discussion about the need for and feasibility of standardizing both clinical and research PET/CT methodology, the IRAT group began to realize that, although the members of the group represented major academic imaging programs, data acquisition and processing differed considerably among the centers. Accordingly, the group found it appropriate to survey its members to see how they were conducting clinical PET/CT—this being a needed starting point before any meaningful standardization of imaging protocols could occur.
This paper presents and summarizes the results of that survey.
MATERIALS AND METHODS
A series of teleconferences of the PET/CT subcommittee resulted in iterative development of a standard survey for PET/CT for oncologic applications. Several general areas of interest were addressed, including patient preparation; methods for performance of the scan; and display, analysis, review, and archiving. The complete survey is shown in Figures 1 and 2, and the questions are summarized in Table 1. Sites that had more than 1 PET/CT system filled out the scanner-specific questions for each scanner at the institution. Ambiguous or blank answers were clarified before the data were finalized and summarized.
Figure 1. The complete survey: page 1.
Figure 2. The complete survey: page 2.
Table 1. Summary of Survey Questions
The 34-question survey was completed by the following 15 sites: the University of Iowa, Johns Hopkins University, Ohio State University, the University of Pittsburgh, Roswell Park Cancer Institute, Washington University, the University of Washington, the University of Wisconsin, Memorial Sloan-Kettering Cancer Center, the University of Arizona, the Dana-Farber Cancer Institute, the University of Colorado, Georgetown University, Vanderbilt University, and the University of California–Davis.
RESULTS
Patient Dose and Preparation
The average administered 18F-FDG dose for adults varies from 259 to 740 MBq (7–20 mCi). At least 2 institutions give as much as 925 MBq (25 mCi). For those sites (only 3) that reported dose per kilogram, the range is 5.18–8.14 MBq (0.14–0.22 mCi)/kg. Pediatric dose modifications were not surveyed.
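The paired MBq and mCi values above can be checked with a short conversion helper. The sketch below relies only on the standard definition 1 mCi = 37 MBq; the function names and the 70-kg example patient are illustrative assumptions and do not come from the survey.

```python
MBQ_PER_MCI = 37.0  # 1 Ci = 3.7e10 Bq by definition, so 1 mCi = 37 MBq


def mci_to_mbq(mci: float) -> float:
    """Convert activity from millicuries to megabecquerels."""
    return mci * MBQ_PER_MCI


def mbq_to_mci(mbq: float) -> float:
    """Convert activity from megabecquerels to millicuries."""
    return mbq / MBQ_PER_MCI


def weight_based_dose_mbq(weight_kg: float, mbq_per_kg: float) -> float:
    """Administered activity for a given body weight and per-kilogram dose."""
    return weight_kg * mbq_per_kg


if __name__ == "__main__":
    # Reported adult range (7-20 mCi) and the highest dose mentioned (25 mCi).
    for mci in (7, 20, 25):
        print(f"{mci} mCi = {mci_to_mbq(mci):.0f} MBq")

    # Hypothetical 70-kg patient at both ends of the reported per-kg range.
    for per_kg in (5.18, 8.14):
        dose = weight_based_dose_mbq(70.0, per_kg)
        print(f"{per_kg} MBq/kg x 70 kg = {dose:.0f} MBq ({mbq_to_mci(dose):.1f} mCi)")
```

For the hypothetical 70-kg patient, the reported per-kilogram range corresponds to roughly 363 to 570 MBq (about 9.8 to 15.4 mCi), which falls inside the 259–740 MBq range reported for average adult doses.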
The uptake time after injection varies from 45 to 90 min. Most sites aim for 60 min but tolerate uptake times anywhere from 45 to 90 min. Five sites image as late as 90 min occasionally or usually, and 4 sites image as early as 45 min. For brain imaging, at least 1 site images as early as 30 min.
Most sites do the CT transmission scan before injection of CT intravenous contrast material.
Most sites do diagnostic CT 15%–50% of the time, 4 sites never do diagnostic CT, and 1 site does diagnostic CT in all patients. Almost all sites use intravenous contrast material for diagnostic CT.
Bladder catheterization is rarely or never used at any of the sites.
Ten sites almost never use sedation (<3% of the time). Others use it more frequently, particularly in head and neck cancer.
The percentage of cases in which intravenous contrast material is used varies widely by disease type. The most common use is in head and neck cancer (average, 28%; range, 0%–100%), followed by lymphoma (18%, 0%–95%), lung cancer (13%, 0%–80%), and colon cancer (13%, 0%–95%).
The recommended minimum duration for fasting was evenly split between 4 and 6 h, although 1 site suggested 12 h.
The sites were evenly split on recommending a low-carbohydrate diet on the day before the PET study.
A wide range of different strategies is used for preparation of diabetic patients: performing PET/CT early in the morning after an overnight fast and holding all diabetic medications (applied in 40% of diabetic studies); performing PET/CT early in the morning after an overnight fast and holding some diabetic medications (applied in 27%); performing PET/CT early in the morning after an overnight fast and allowing all diabetic medications (applied in 7%); allowing an early-morning light breakfast and all diabetic medications (applied in 15%); allowing an early-morning light breakfast and some diabetic medications (applied in 11%); and titrating with intravenous insulin when necessary (almost never applied).
All but one of the sites measure blood glucose levels before the 18F-FDG injection. The single exception is a site that measures blood glucose levels only in diabetic or research patients. Most sites have a policy of not performing a study on a patient whose blood glucose level is above 200 mg/dL. One site has a cutoff of 180 mg/dL, another has a cutoff of 220 mg/dL, and one does not have a cutoff. Two indicated they have a lower limit of 70 mg/dL.
Scanner, Acquisition, and Reconstruction Parameters
Eight sites have GE Healthcare scanners, 5 have Siemens Healthcare scanners, and 2 have both GE and Siemens scanners.
Apparently, the different sites all have different versions of PET scanner software, although this apparent difference may be partly due to lack of rigor in determining the version numbers. Clearly, substantial variability is present—even within the same manufacturer and scanner model.
The emission acquisition for whole-body imaging is 3-dimensional (3D) for 8 systems (all Siemens) and 2-dimensional (2D) for 13 systems (all GE). Brain imaging, when reported separately, is almost always 3D (7 sites), although 2 sites report doing 2D brain studies.
Most PET/CT studies are done from the base of the brain to the thighs (83%). Other scan extents include top of head to toes (7%), brain only (5%), and head and neck only (5%). Approximately 60% of studies are done with the arms up and 40% with the arms down. Arm position was almost certainly related to the type of examination—that is, neck versus chest or body.
The duration of the emission acquisition per bed position for whole-body scans ranges from 2 to 7 min. The sites with longer scans generally administer lower doses of 18F-FDG.
The transmission scans for all PET/CT scanners are done using the CT portion of the PET/CT scanner.
For the average adult patient, the CT technique varies moderately. Most studies are done at 120 kVp (10 CT systems), although 2 other energies are also used: 140 kVp (4 systems) and 160 kVp (1 system). At least 4 sites adjust amperage automatically, presumably using appropriate software incorporated into their systems. Of the sites with a fixed amperage, most use 50 mAs (4), although the range is 8–120 mAs. Most sites (13) adjust CT dose for pediatric patients, although 2 do not.
The most common PET reconstruction algorithm used is 2D ordered-subsets expectation maximization (OSEM) with 2 iterations and 20–30 subsets (total of 12 scanners). Fourier rebinning 2D OSEM is used by 4 scanners. 3D OSEM is used by 3 scanners. Filtered backprojection is not used by any of the sites.
2D postreconstruction filtering is used as part of image reconstruction for 14 scanners. 3D filtering is used for 4 scanners. z-axis filtering is used at least for some of the images for 13 scanners.
The most common PET voxel dimensions (in mm) for GE scanners (8/13) are 4.69 for x, 4.69 for y, and 3.27 for z. For Siemens scanners (5/9) the dimensions are 4.06 for x, 4.06 for y, and 3.37 for z. Generally, smaller voxels are used for dedicated brain imaging.
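To make the difference between the two most common voxel grids concrete, the following sketch computes the voxel volumes from the dimensions reported above. The dictionary labels and function name are illustrative assumptions; only the dimensions come from the survey.

```python
from math import prod

# Most common PET voxel dimensions (x, y, z) in mm, as reported in the survey.
VOXEL_DIMS_MM = {
    "GE": (4.69, 4.69, 3.27),
    "Siemens": (4.06, 4.06, 3.37),
}


def voxel_volume_mm3(dims_mm: tuple) -> float:
    """Volume of a single voxel in cubic millimeters."""
    return prod(dims_mm)


if __name__ == "__main__":
    volumes = {name: voxel_volume_mm3(d) for name, d in VOXEL_DIMS_MM.items()}
    for name, vol in volumes.items():
        print(f"{name}: {vol:.1f} mm^3")
    print(f"GE/Siemens volume ratio: {volumes['GE'] / volumes['Siemens']:.2f}")
```

The most common GE voxel is about 72 mm³ and the most common Siemens voxel about 56 mm³, a difference of roughly 30%. This is one reason that voxel-based measurements may not be directly comparable between systems without harmonization.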
Display, Review, and Archiving
Sites with GE scanners usually use GE Xeleris software for image interpretation by radiologists and nuclear medicine physicians. Half the sites with Siemens scanners use Siemens software (Esoft or Leonardo). The other sites use software by MedImage, MIM Software Inc., or Philips (Stentor, now iSite).
All sites archive their PET/CT images to a PACS, where the images are available for viewing. At 12 sites, fused images can be viewed on the PACS, although at some sites image fusion is achieved only through saving screenshots.
The display quality of PACS PET/CT images varies markedly across the sites: 5 sites assess the quality of the PACS image display as excellent, 6 as acceptable, and 4 as poor. A wide range of PACS software is used for the referring physicians. The software packages include those by McKesson Corp. (3 sites), Philips (iSite) (5 sites), Agfa Healthcare (2 sites), UltraVisual Medical Systems (1 site), and Emageon Inc. (1 site).
Most sites make digital images available on compact disk for referring physicians and patients. A wide range of viewing software is supplied with the compact disks. The most common are eFilm lite (Merge Healthcare Inc.) (4) and MIMviewer (MIM Software Inc.) (3). Other sites used packages by Hermes Medical Solutions, Philips (iSite), MedImage, and GE Healthcare (Centricity).
Most sites load outside DICOM images onto their PACS or PET image viewing systems for review.
DISCUSSION
Surprisingly little information has been available on the variability in PET practices in the United States. This survey showed that there are many commonalities but also that, across various institutions with considerable expertise in PET, there is remarkably wide variability in some aspects of how clinical PET studies are performed. It is likely that this large variation arises because of the lack of a rigorously proven and validated optimum method for conducting the studies. In the absence of a standard, each site has chosen what seems to be the most reasonable approach.
The wide variation certainly means that it is quite difficult to use the retrospective data of a site in any meaningful multicenter quantitative analysis of efficacy, especially if absolute quantitation is required across centers. This wide variation also impedes the ability of the various sites to participate in prospective clinical trials, because baseline studies are often done before a patient is recruited into a trial. If the standard technique varies markedly from the required technique in the study, the baseline scan will often have to be repeated.
Another significant problem with this wide variation is that sensitivities and specificities for a specific clinical setting—for example, staging lung cancer—potentially could be different at each institution. Quantitative thresholds, such as SUV, will also differ. This is a reason that health technology assessment experts often regard the literature on 18F-FDG PET/CT as limited in adequately justifying the utility of the studies.
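To illustrate why SUV thresholds depend on acquisition details, the sketch below computes the commonly used body-weight-normalized SUV, with the injected activity decay-corrected to the time of imaging. It is a simplified illustration rather than a method described in the survey: the function names are assumptions, tissue density is assumed to be 1 g/mL, and the 18F half-life is taken as approximately 109.77 min.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # approximate physical half-life of 18F in minutes


def remaining_activity_mbq(injected_mbq: float, minutes_elapsed: float,
                           half_life_min: float = F18_HALF_LIFE_MIN) -> float:
    """Injected activity decay-corrected to a time after injection."""
    return injected_mbq * math.exp(-math.log(2) * minutes_elapsed / half_life_min)


def suv_bw(tissue_kbq_per_ml: float, injected_mbq: float,
           weight_kg: float, uptake_min: float) -> float:
    """Body-weight SUV: tissue concentration / (decayed dose / body weight).

    Assumes 1 mL of tissue weighs 1 g, which makes the result dimensionless.
    """
    remaining_kbq = remaining_activity_mbq(injected_mbq, uptake_min) * 1000.0
    weight_g = weight_kg * 1000.0
    return tissue_kbq_per_ml / (remaining_kbq / weight_g)


if __name__ == "__main__":
    # Hypothetical lesion: the same measured concentration, evaluated at the
    # earliest and latest uptake times reported in the survey.
    for uptake_min in (45, 60, 90):
        value = suv_bw(20.0, injected_mbq=370.0, weight_kg=70.0, uptake_min=uptake_min)
        print(f"uptake {uptake_min} min -> SUV {value:.2f}")
```

In practice the measured tissue concentration also changes with uptake time, blood glucose level, and reconstruction settings, so the variation among sites affects both the numerator and the denominator of this ratio.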
Several guidelines on oncologic imaging with 18F-FDG PET/CT have been developed by the NCI (4), the Netherlands and European groups (5, 6), and the American College of Radiology Imaging Network (ACRIN) (7). Similar efforts are under way by other groups, including the Quantitative Imaging Biomarkers Alliance and the Uniform Protocols for Imaging in Clinical Trials group. Table 2 shows how the recommendations of the NCI, the European group, and ACRIN compare with the findings from the IRAT survey for several areas of concern.
Table 2. Comparison of IRAT Survey Results with Recommendations of NCI, European Group, and ACRIN
The following parameters have moderate variation among the sites surveyed but are reasonably close to the NCI, European, and ACRIN recommendations. It is likely that broad agreement can be achieved relatively easily regarding these parameters.
Administered Dose
Generally, the recommendations of the European group are lower than those in the United States. It appears that 370–555 MBq (10–15 mCi) is reasonable, although with newer tomographs and with the increased general concern about patient radiation exposure (8), recommendation of a lower range may be appropriate. In general, it would be more appropriate to specify an administered dose per kilogram of body weight, because patient weight ranges widely.
Uptake Time
The general recommendation is that imaging take place 60 ± 10 min after injection of the 18F-FDG. This is the only parameter whose value is essentially empiric and chosen for convenience, and that value is probably not optimal. Numerous papers show that tumor uptake continues to rise over time and that tumor-to-background ratio improves over time. The problem with defining 60 min now, for practical reasons, as standard for PET/CT studies is that this timing may become firmly established despite multiple existing and future studies that show the utility of imaging later. This issue should be carefully considered before 60 min is defined as standard.
Duration of Fasting
It appears that the most reasonable recommendation will be that fasting should be for at least 4 h and preferably for at least 6 h.
Recommendation for Low-Carbohydrate Diet
The NCI and half the IRAT sites recommend a low-carbohydrate diet to their patients. It is not really known how well patients adhere to the recommendation or how effective it is in improving the quality or reproducibility of 18F-FDG PET/CT studies. In the absence of any compelling evidence, this recommendation is difficult to support, and patient compliance is difficult to verify.
Blood Glucose Cutoff
There seems to be broad agreement that oncologic 18F-FDG PET/CT studies should certainly not be done if the blood glucose level is above 200 mg/dL (11 mmol/L); however, both the NCI and the European group suggest that the limit should be much lower: 120 mg/dL. This suggestion likely reflects concerns about standardization in clinical trials, whereas the cutoff of 200 mg/dL at the surveyed sites reflects the practical operation in a clinical PET center.
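Because glucose cutoffs are quoted in mg/dL in the United States and in mmol/L elsewhere, a small conversion helper can make the thresholds easier to compare. The sketch below assumes a glucose molar mass of 180.16 g/mol; the function names and the default cutoff argument are illustrative assumptions, not a policy stated by any site.

```python
GLUCOSE_MG_PER_MMOL = 180.16  # molar mass of glucose, g/mol (equivalently mg/mmol)


def mgdl_to_mmoll(mg_per_dl: float) -> float:
    """Convert a glucose concentration from mg/dL to mmol/L."""
    # mg/dL -> mg/L is a factor of 10; dividing by mg/mmol then gives mmol/L.
    return mg_per_dl * 10.0 / GLUCOSE_MG_PER_MMOL


def exceeds_cutoff(glucose_mg_per_dl: float, cutoff_mg_per_dl: float = 200.0) -> bool:
    """True if the measured glucose is above the chosen cutoff (both in mg/dL)."""
    return glucose_mg_per_dl > cutoff_mg_per_dl


if __name__ == "__main__":
    # Cutoffs mentioned in the survey and in the guideline comparison.
    for cutoff in (120, 180, 200, 220):
        print(f"{cutoff} mg/dL = {mgdl_to_mmoll(cutoff):.1f} mmol/L")
```

A hypothetical patient at 150 mg/dL would be scanned under the 200 mg/dL cutoff used at most surveyed sites but would exceed the 120 mg/dL limit suggested by the NCI and the European group.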
Emission Scan Duration per Bed Position
The appropriate duration of imaging per bed position depends on the type of tomograph (2D vs. 3D), the amount of overlap per bed position, the administered dose, and body weight. Recommending an optimal time or range of times seems inappropriate. Rather, imaging time should be based on a calculation involving administered dose and body weight, or on the true counting rate recorded from the patient.
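One hypothetical way to express such a calculation is to hold the product of dose per kilogram and time per bed position constant, so that a lower dose per kilogram is offset by a proportionally longer scan. The sketch below shows that idea only. The reference values are caller-supplied placeholders, the function name is an assumption, and the model ignores acquisition mode (2D vs. 3D) and bed overlap; it is not a recommendation from the survey or from any guideline.

```python
def minutes_per_bed(dose_mbq: float, weight_kg: float,
                    ref_minutes: float, ref_mbq_per_kg: float) -> float:
    """Scale scan time per bed position inversely with dose per kilogram.

    ref_minutes and ref_mbq_per_kg describe a site's own reference protocol
    (hypothetical inputs); the product of MBq/kg and minutes is kept constant.
    """
    if dose_mbq <= 0 or weight_kg <= 0:
        raise ValueError("dose_mbq and weight_kg must be positive")
    mbq_per_kg = dose_mbq / weight_kg
    return ref_minutes * ref_mbq_per_kg / mbq_per_kg


if __name__ == "__main__":
    # Placeholder reference protocol: 3 min per bed position at 5 MBq/kg.
    for dose in (175.0, 350.0, 700.0):
        t = minutes_per_bed(dose, weight_kg=70.0, ref_minutes=3.0, ref_mbq_per_kg=5.0)
        print(f"{dose:.0f} MBq in a 70-kg patient -> {t:.1f} min per bed position")
```

Under this simple inverse scaling, halving the administered dose doubles the time per bed position, which is consistent with the survey observation that sites with longer scans generally administer lower doses. A practical rule would also need to account for scanner sensitivity and the measured counting rate.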
CT Technique
A wide variation, particularly in amperage, was seen at the IRAT sites. The European recommendations are somewhat lower than the settings used at most IRAT sites. Because of increasing concern about radiation exposure (8), it may be appropriate to recommend 30–50 mAs or less for attenuation imaging for an average-size patient and to adjust for body size at a reasonably high pitch. Adjustment for body size, collimation, and pitch is particularly important in children (9).
Other parameters are more difficult: they vary widely both at the sites surveyed and in the NCI, European, and ACRIN recommendations. These parameters, which include the management of diabetic patients, reconstruction algorithms, voxel sizes, and analysis software, will require further investigation and discussion.
The survey showed a wide variation in how diabetic subjects are managed, and the recommendations of the NCI, European group, and ACRIN do not converge. This divergence likely reflects lack of knowledge of an optimum strategy and the fact that the final results may not be sensitive to the exact strategy. The field needs to select a strategy and adopt it widely.
Reconstruction algorithms and voxel size are dependent on the options available for the various available PET/CT systems. Standardization of these parameters is essential and will require collaboration between representatives of both industry and academia. The challenge will be to ensure that the images of all major manufacturers are quantitatively comparable.
The same problem exists for display and analysis software, but the problem may be more challenging in this area because of the wider range of available software. Validation methodology is essential to ensure that all software produces the same result.
One area not included in the survey was the frequency and type of calibration at the facilities surveyed. Frequent, careful calibration is important in maintaining the accuracy and capabilities of a PET/CT system. As recommendations for quality assurance in PET/CT expand, an important recommendation to include will be calibration with a common phantom across all the sites in a clinical trial. Such calibration is also appropriate for high-quality clinical imaging and is likely to become a future standard as part of the accreditation of all systems.
The quality of the display capabilities of the PACS systems at the surveyed sites varied widely, from poor to excellent. Because high-quality display capability, including viewing fused images, definitely exists, this variability in display quality likely reflects the fact that some sites have upgraded their PACS system more recently than other sites.
18F-FDG PET/CT has proven to be a remarkably effective clinical tool despite the wide variation in technique used at imaging centers throughout the United States and the world. It is virtually certain that the variation in the nonsurveyed centers is considerably wider than that found in the surveyed group of academic centers. Although 18F-FDG PET/CT is robust and can be used with wide variation, the technique would significantly benefit from standardization, particularly as we try to identify new indications for reimbursement, for clinical trials, and for reproducible quantitative imaging to assess response to therapy. Approaches for standardizing PET have been proposed in an NCI consensus paper (4) and, more recently, from experience in Europe (5, 6). Such studies suggest PET can be made more quantitatively comparable across sites by standardization of imaging approaches.
We have made some suggestions in this paper on the specific parameters that need to be standardized. This paper is not intended to be a definitive statement but points the way toward a consensus paper, which is definitely needed.
CONCLUSION
A survey applied mainly to academic PET centers participating in the IRAT network showed considerable variability in patient preparation, 18F-FDG dose, CT technique, tracer uptake, imaging time, reconstruction methods, and suitability of PACS for PET/CT display. The existence of this variance despite professional guidelines suggests that results from quantitative PET analyses are likely to differ widely across centers. These data indicate that additional standardization is needed to bring about results—particularly quantitative results—that are more comparable among sites.
Acknowledgments
The study was supported by IRAT supplements to NCI Cancer Center support grants at Johns Hopkins University (3P30CA006973), Memorial Sloan-Kettering Cancer Center (3P30CA008748), Ohio State University (3P30CA016058), the University of Arizona (2P30CA023074), the University of California–Davis (3P30CA093373), the University of Iowa (5P30CA086862), the University of Pittsburgh (3P30CA047904), and Washington University (5P30CA091842).
- © 2011 by Society of Nuclear Medicine
REFERENCES
- Received for publication December 19, 2009.
- Accepted for publication June 29, 2010.