Abstract
At present, there is no ideal imaging modality for the diagnosis of distant metastases and second primary cancers in cancer patients. We aimed to assess the accuracy of whole-body PET/CT for the overall assessment of distant malignancies in patients with various cancers. Methods: Studies about whole-body PET/CT for the detection of distant malignancies in cancer patients were systematically searched in MEDLINE and EMBASE. We determined sensitivities and specificities across studies, calculated positive and negative likelihood ratios, and constructed summary receiver operating characteristic curves using hierarchical regression models for whole-body PET/CT. Results: Across 41 studies (4,305 patients), the sensitivity and specificity of whole-body PET/CT were 0.93 (95% confidence interval [CI], 0.88–0.96) and 0.96 (95% CI, 0.95–0.96), respectively. Subgroup analysis showed that the sensitivity and specificity of whole-body PET/CT for various cancers, respectively, were as follows: head and neck cancer, 0.90 (95% CI, 0.83–0.95) and 0.95 (95% CI, 0.94–0.96); lung cancer, 0.91 (95% CI, 0.76–0.97) and 0.96 (95% CI, 0.94–0.98); breast cancer, 0.97 (95% CI, 0.93–0.99) and 0.95 (95% CI, 0.90–0.97); and cancers of the digestive system, 0.92 (95% CI, 0.68–0.98) and 0.97 (95% CI, 0.91–0.99). Conclusion: Whole-body PET/CT has excellent diagnostic performance for the overall assessment of distant malignancies in patients with various cancers, especially head and neck cancer, breast cancer, and lung cancer.
The presence of distant metastases is one of the most important prognostic factors in most cancer patients. Most tumors are classified according to the TNM staging system, and treatment is modified when distant metastases are present. Disease localized to primary sites and to regional lymph nodes is generally treated with curative strategies, including surgery, chemotherapy, and radiotherapy. In contrast, palliative treatment of patients with metastatic disease consists of less aggressive strategies. Moreover, distant metastases usually occur late during the course of cancer, whereas second primary cancers may be found even in early-stage patients. Early detection of distant metastases and second primary cancers is a fundamental precondition for guiding precise staging and optimal management.
Conventional imaging procedures (such as chest radiography, CT, abdominal ultrasonography, and bone scan) are commonly used to detect distant metastases and second primary cancers in patients with various cancers (1,2). However, conventional imaging procedures often do not reliably characterize the extent of disease because it is difficult to identify small distant lesions on the basis of morphologic criteria and to distinguish potential metastatic lesions from benign findings. 18F-FDG PET is a functional imaging modality that is based on the increased glucose metabolism of malignant cells. However, anatomic information concerning distant lesions is limited on 18F-FDG PET images, and the resolution is insufficient to detect small lesions. The introduction of PET/CT scanners combined the functional data of PET with the detailed anatomic information of CT into a single examination. In several previous studies, 18F-FDG PET/CT was shown to be more sensitive and specific than conventional imaging procedures for the detection of distant malignancies in cancer patients at initial staging before treatment or restaging after treatment (1–5). Although many studies of whole-body PET/CT for various cancers have been performed, the results remain controversial and inconclusive. Here, we undertook a meta-analysis to evaluate the diagnostic performance of whole-body PET/CT for detecting distant malignancies in patients with various cancers.
MATERIALS AND METHODS
Search Strategy
We searched for studies evaluating whole-body PET/CT for the overall assessment of distant metastases with or without second primary cancers in patients with various cancers. Articles were identified with a search of MEDLINE and EMBASE from January 1, 2000, to April 30, 2012. We used a search algorithm that was based on a combination of text words: (CT OR “computed tomography”) AND (PET OR “positron emission tomography”) AND (neoplasm OR cancer OR carcinoma) AND (staging OR “distant metastases”). We had no language restrictions for searching relevant studies. References in the retrieved articles were also screened for additional studies. Authors of eligible studies were contacted and asked to supplement additional data when key information relevant to the meta-analysis was missing.
Study Selection
We considered studies using 18F-FDG PET/CT for the overall assessment of distant malignancies in cancer patients. Inclusion criteria were as follows: 18F-FDG PET/CT was used as a diagnostic tool in cancer patients of all ages regardless of primary sites and treatment status; there were sufficient data to reconstruct a 2 × 2 table such that the cells in the table could be labeled as showing true-positive, false-positive, true-negative, and false-negative results; there was a minimal sample size of 10 cancer patients, including both patients with and patients without distant metastases; analysis was done at the patient level; studies with both retrospective and prospective designs were included in this meta-analysis; and histopathologic analysis or clinical and imaging follow-up was used as the gold standard to assess diagnostic performance. We excluded studies that focused exclusively on second primary cancers and studies from the same study group. We also excluded studies with verification bias, that is, those in which the reference standard was used only for subsets of patients based on positive PET/CT results.
Data Extraction
Two reviewers extracted data from eligible studies independently and resolved discrepancies by discussion. A third investigator settled any remaining discrepancies. For each report, we recorded the author names, year of publication, country of origin, number of eligible patients, type of eligible patients (those with primary cancer or recurrent cancer), study design (prospective or retrospective), type of cancer (head and neck, lung, breast, digestive system, urogenital system, melanoma, or others), and definition of positive PET/CT results (both qualitative and quantitative, qualitative, or unclear). For each study, we recorded the number of true-positive, false-positive, true-negative, and false-negative findings for whole-body PET/CT using histopathologic analysis or clinical and imaging follow-up as the reference standard.
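As a minimal sketch of how the four extracted counts translate into per-study accuracy measures, the following Python snippet computes sensitivity and specificity from a 2 × 2 table. The counts and the names `TwoByTwo`, `sensitivity`, and `specificity` are hypothetical illustrations and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Patient-level counts for one study (hypothetical container)."""
    tp: int  # true-positive findings
    fp: int  # false-positive findings
    tn: int  # true-negative findings
    fn: int  # false-negative findings

    def sensitivity(self) -> float:
        # Proportion of patients with distant malignancies who test positive
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of patients without distant malignancies who test negative
        return self.tn / (self.tn + self.fp)


# Hypothetical counts, for illustration only
study = TwoByTwo(tp=18, fp=4, tn=76, fn=2)
print(f"sensitivity = {study.sensitivity():.2f}")  # 18 / 20
print(f"specificity = {study.specificity():.2f}")  # 76 / 80
```

These per-study proportions are the inputs to pooling; the pooled estimates themselves come from the regression models described under Statistical Analysis, not from simple averaging.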
Quality Assessment
To evaluate the quality, applicability, and reporting of the studies, we used a tool for the quality assessment of studies of diagnostic accuracy (QUADAS); this tool was recently proposed to assess the quality of studies of diagnostic accuracy included in a meta-analysis. The QUADAS tool included 14 items, each of which was assessed as “yes” or “no.”
Statistical Analysis
We used bivariate regression models to obtain weighted overall estimates of sensitivity and specificity as the main outcome measures and to construct hierarchical summary receiver operating characteristic (HSROC) curves for whole-body PET/CT. On the basis of random-effects models, this bivariate approach accounted for potential between-study heterogeneity and incorporated the correlation between sensitivity and specificity. Overall sensitivity and specificity and their 95% confidence intervals (CIs) were calculated on the basis of the binomial distributions of true-positive and true-negative findings. By using the pooled sensitivities and specificities, we also calculated positive likelihood ratios (PLRs) and negative likelihood ratios (NLRs) for whole-body PET/CT. Discriminating ability is better with higher PLRs and lower NLRs. Although there is no absolute cutoff, a good diagnostic test may have PLRs of greater than 10.0 and NLRs of less than 0.1.
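The relationship between the likelihood ratios and sensitivity and specificity can be sketched with the standard definitions, PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. The function name `likelihood_ratios` and the input values are hypothetical. Point estimates computed this way from rounded values will not necessarily match ratios and CIs derived from a fitted bivariate model.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (PLR, NLR) from the standard definitions."""
    plr = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    nlr = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return plr, nlr


# Hypothetical inputs, for illustration only
plr, nlr = likelihood_ratios(sensitivity=0.90, specificity=0.95)
print(f"PLR = {plr:.1f}, NLR = {nlr:.2f}")
```

With these illustrative inputs, the PLR exceeds 10.0 while the NLR sits just above 0.1, which shows how a test can meet one of the two rule-of-thumb thresholds without meeting the other.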
We used the summary estimates of sensitivity and specificity obtained in the meta-analysis for cancer patients to calculate negative predictive values (that is, the probability that a patient does not have distant malignancies when the test results are negative) for whole-body PET/CT when the prevalences of distant malignancies in the population were assumed to be 10%, 20%, and 30%.
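The negative predictive value calculation can be sketched with Bayes' rule as follows. The function name `negative_predictive_value` is hypothetical; the sensitivity (0.93) and specificity (0.96) are the pooled estimates reported in this article, and the three prevalences are the assumed values stated above.

```python
def negative_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Probability of no distant malignancy given a negative test result."""
    true_negatives = specificity * (1.0 - prevalence)    # disease-free, test negative
    false_negatives = (1.0 - sensitivity) * prevalence   # diseased, test negative
    return true_negatives / (true_negatives + false_negatives)


# Pooled estimates from this meta-analysis; prevalences are the assumed scenarios
for prevalence in (0.10, 0.20, 0.30):
    npv = negative_predictive_value(0.93, 0.96, prevalence)
    print(f"prevalence {prevalence:.0%}: NPV = {npv:.2f}")
# prevalence 10%: NPV = 0.99
# prevalence 20%: NPV = 0.98
# prevalence 30%: NPV = 0.97
```

The negative predictive value falls as prevalence rises because a larger share of negative results then comes from patients who do have distant malignancies.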
We investigated the effect of heterogeneity on the diagnostic accuracy of whole-body PET/CT by subgroup analysis. Analysis of the covariates included the type of eligible patients (primary cancer vs. recurrent cancer), prevalence of distant malignancies (high prevalence [≥15%] vs. low prevalence [<15%]), number of items assessed as “yes” in the QUADAS tool (high quality [≥12] vs. low quality [<12]), imaging analysis (both quantitative and qualitative vs. qualitative), study design (prospective vs. retrospective), and primary site (head and neck, lung, breast, and digestive system). For studies that included both patients with primary cancer and patients with recurrent cancer, the relevant subsets of patients were included in the summary calculations for each subgroup when the data could be split into such subsets.
We also compared the performance of whole-body PET/CT with that of conventional imaging procedures using the same bivariate regression models. This analysis included all data regardless of the types of conventional imaging procedures compared.
All analyses were conducted with Stata version 11.0 (Stata Corp.).
RESULTS
Eligible Studies
The electronic search yielded 8,091 articles; 8,018 were excluded after review of the abstract because they did not present any diagnostic information. We screened 73 articles in full text and rejected 32; thus, 41 articles (1–41) were eligible for the meta-analysis. Reasons for exclusion are shown in Figure 1. Of the 41 studies, 22 (53.7%) were described as being prospective. All 41 studies (4,305 patients) were analyzed for the diagnostic accuracy of whole-body PET/CT for the detection of distant malignancies (Table 1). In 21 studies only patients with primary cancer were enrolled, in 14 studies only patients with recurrent cancer were enrolled, and in 6 studies mixed patient populations were enrolled. We were able to extract subgroup data (for primary cancer and recurrent cancer) from 2 (6,20) of the studies with mixed populations. In 17 studies, whole-body PET/CT–positive results were stated to have been assessed qualitatively, whereas in 19 studies, they were stated to have been assessed both quantitatively and qualitatively. In 5 studies (33–35,38,41), the manner used for assessment was not stated. The prevalence of distant malignancies in the included studies ranged from 5.0% to 80.8%. If 15% were used as a cutoff for high prevalence versus low prevalence, 41.5% of the 41 studies would have had a prevalence of less than 15%.
Quality Assessment
We assessed the quality of the 41 articles according to the 14-item QUADAS tool. Eight of the 14 items could be scored for all included articles: clear selection criteria (item 2), acceptable reference standard (item 3), partial verification (item 5), incorporation bias (item 7), masking to reference test results (item 10), availability of clinical data that would be available in clinical practice when using the index test (item 12), reporting of uninterpretable results (item 13), and explanation of withdrawals from the study (item 14). No study (0%) reported that all patients received the same reference test regardless of the index test result (item 6) or that the reference standard was masked to the index test result (item 11). Representative spectrum (item 1: was the spectrum of patients representative of the patients who will receive the test in practice?) was not present in 34.2% (2,4,9,16,18–21,26,28,29,34,38,39) of the 41 articles. Acceptable delay between tests (item 4) was not reported in 4.9% (20,39) of the 41 articles. The execution of the index test in detail (item 8) was not present in 12.2% (33–35,38,41) of the 41 articles. The reference standard in detail (item 9) was not present in 4.9% (22,28) of the 41 articles. The number of items assessed as “yes” in the QUADAS tool would have been less than 12 in 43.9% of the 41 studies if 12 were used as a cutoff for high quality versus low quality.
Diagnostic Accuracy of Whole-Body PET/CT
When we considered all 41 studies (4,305 patients) with data on a per-patient basis (1–41), the 18F-FDG PET/CT sensitivity was 0.93 (95% confidence interval [CI], 0.88–0.96) and the specificity was 0.96 (95% CI, 0.95–0.96). The type of eligible patients, prevalence of distant malignancies, quality scoring, imaging analysis, and study design did not statistically significantly influence the reported sensitivities and specificities of 18F-FDG PET/CT (P > 0.05) (Table 2).
Likelihood ratio syntheses yielded overall PLR of 20.8 (95% CI, 16.8–25.8) and NLR of 0.08 (95% CI, 0.05–0.13). Our data showed that the HSROC curve was positioned near the desirable upper left corner and that the overall weighted area under the curve was 0.97 (95% CI, 0.95–0.98), indicating a high level of overall accuracy (Fig. 2).
When the prevalences of distant malignancies in cancer patients were assumed to be 10%, 20%, and 30%, the negative predictive values for 18F-FDG PET/CT were 0.99, 0.98, and 0.97, respectively.
Diagnostic Accuracy of Whole-Body PET/CT for Various Cancers
Head and Neck Cancer
When we considered all 16 studies (1,800 patients) with data on head and neck cancer (1,6–20), the pooled sensitivity, specificity, PLR, and NLR of 18F-FDG PET/CT were 0.90 (95% CI, 0.83–0.95), 0.95 (95% CI, 0.94–0.96), 19.0 (95% CI, 14.6–24.7), and 0.10 (95% CI, 0.06–0.18), respectively.
Lung Cancer
When we considered all 5 studies (578 patients) with data on lung cancer (22–26), the pooled sensitivity, specificity, PLR, and NLR of 18F-FDG PET/CT were 0.91 (95% CI, 0.76–0.97), 0.96 (95% CI, 0.94–0.98), 25.9 (95% CI, 15.4–43.6), and 0.09 (95% CI, 0.03–0.26), respectively.
Breast Cancer
When we considered all 5 studies (547 patients) with data on breast cancer (2,28–31), the pooled sensitivity, specificity, PLR, and NLR of 18F-FDG PET/CT were 0.97 (95% CI, 0.93–0.99), 0.95 (95% CI, 0.90–0.97), 18.5 (95% CI, 10.0–34.1), and 0.03 (95% CI, 0.01–0.07), respectively.
Cancers of Digestive System
When we considered all 6 studies (379 patients) with data on carcinomas of the digestive system (32–37), the pooled sensitivity, specificity, PLR, and NLR of 18F-FDG PET/CT were 0.92 (95% CI, 0.68–0.98), 0.97 (95% CI, 0.91–0.99), 34.9 (95% CI, 9.8–123.9), and 0.09 (95% CI, 0.02–0.40), respectively.
Comparison With Conventional Imaging Procedures
Comparison of the performance of 18F-FDG PET/CT with that of conventional imaging procedures in 7 studies (823 patients) (1,2,12,29–31,36) suggested a major difference in sensitivity (43%) between 18F-FDG PET/CT and conventional imaging procedures. The pooled sensitivity, specificity, PLR, and NLR of 18F-FDG PET/CT were 0.95 (95% CI, 0.89–0.98), 0.96 (95% CI, 0.94–0.98), 25.2 (95% CI, 14.8–43.0), and 0.05 (95% CI, 0.02–0.12), respectively. The respective values for conventional imaging procedures were 0.52 (95% CI, 0.31–0.72), 0.89 (95% CI, 0.80–0.95), 4.9 (95% CI, 2.9–8.2), and 0.54 (95% CI, 0.36–0.80).
DISCUSSION
Early detection of distant malignancies in cancer patients is crucial for guiding subsequent staging procedures and treatment. In this meta-analysis, we included studies of whole-body PET/CT rather than of conventional imaging procedures. Histopathologic analysis or clinical and imaging follow-up was used as the reference standard. We considered 41 PET/CT studies (4,305 patients) for inclusion in the meta-analysis and quantified the pooled sensitivities and specificities of all 41 PET/CT studies. We found that the sensitivity and specificity of whole-body PET/CT were 0.93 (95% CI, 0.88–0.96) and 0.96 (95% CI, 0.95–0.96), respectively. Across 7 studies (823 patients), whole-body PET/CT had a higher sensitivity (0.95 vs. 0.52) than conventional imaging procedures. This meta-analysis documented that whole-body PET/CT has excellent diagnostic performance for the overall evaluation of distant metastases with or without second primary cancers in cancer patients.
Because HSROC curves are not easy to interpret and use in clinical practice and because likelihood ratios are considered to be more clinically meaningful, both PLRs and NLRs served as our measures of diagnostic accuracy. Discriminating ability is better with higher PLRs and lower NLRs. Although there is no absolute cutoff, a good diagnostic test may have PLRs of greater than 10.0 and NLRs of less than 0.1. The PLR for whole-body PET/CT was 20.8. This value may be high enough to diagnose distant malignancies in cancer patients. On the other hand, the NLR for whole-body PET/CT was 0.08. A negative whole-body PET/CT result may be used alone as a justification to rule out distant malignancies in cancer patients.
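To illustrate how a likelihood ratio is applied at the bedside, the sketch below converts a pretest probability into a post-test probability through odds. The function name `post_test_probability` and the 20% pretest probability are assumptions chosen for illustration; the PLR of 20.8 and NLR of 0.08 are the pooled values reported in the Results.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Update a pretest probability with a likelihood ratio via odds."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


pretest = 0.20  # assumed pretest probability of distant malignancy (illustrative)
after_positive = post_test_probability(pretest, 20.8)  # pooled PLR from the Results
after_negative = post_test_probability(pretest, 0.08)  # pooled NLR from the Results
print(f"after a positive result: {after_positive:.2f}")
print(f"after a negative result: {after_negative:.2f}")
```

Under this assumed pretest probability, a positive result raises the probability of distant malignancy to roughly 84% and a negative result lowers it to roughly 2%.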
The inherent limitations of PET are poor spatial resolution and failure to depict the anatomic structure of disease. Moreover, false-positive findings from inflammatory or granulomatous lesions in regions with a high prevalence of granulomatous disease are still problematic on 18F-FDG PET images. These issues may restrict its use for assessing distant malignancies in cancer patients. The poor spatial resolution of PET is substantially compensated for by integrated PET/CT, with coregistration of functional imaging with PET and anatomic imaging with CT. However, little is known about the validity of PET/CT relative to PET for detecting distant malignancies in cancer patients. A retrospective study of 248 patients with various cancers showed that PET/CT had a higher sensitivity than PET alone (0.94 vs. 0.78; P < 0.05) and a specificity similar to that of PET alone (0.97 vs. 0.99; P > 0.05) (3). A retrospective study of 250 patients with melanoma also showed that PET/CT had a higher sensitivity than PET alone (0.99 vs. 0.89; P < 0.05) and a specificity similar to that of PET alone (0.98 vs. 0.95; P > 0.05) (41). Limited prospective evidence supports the notion that integrating PET with CT may clearly improve diagnostic accuracy over that achieved with PET alone.
This meta-analysis had some limitations. First, there was no standard follow-up strategy or time. This factor may have affected the accuracy of whole-body PET/CT for the detection of distant malignancies in cancer patients. Second, selective reporting bias is a well-known threat for many clinical research fields, including diagnostic tests. The effect, if present, would be in favor of whole-body PET/CT. The exclusion of conference abstracts and letters to the editors may also have led to reporting bias. Third, we did not perform subgroup analyses for every location of the primary tumors because doing so would have required individual patient data and the number of included studies was too limited. Fourth, approximately 46% of the 41 studies were described as being retrospective. The retrospective nature of studies can be considered a limitation because the possibility that imaging observers might have known the diagnostic outcomes of other imaging modalities before assessing the PET/CT results cannot be excluded. Fifth, the sensitivities in the HSROC curves for the included studies were not uniformly high. Nine (22%) of the 41 included studies had sensitivities of less than 75% (5,10,11,13,19,21,24,25,32). No clear reason for the large difference between these studies and other studies was found.
CONCLUSION
Whole-body PET/CT has excellent diagnostic performance for the detection of distant malignancies in patients with various cancers, especially head and neck cancer, breast cancer, and lung cancer. Large, multicenter, and prospective studies with strict standardization of PET/CT protocols are now needed to investigate the added value of whole-body PET/CT over conventional imaging procedures and could help in establishing whole-body PET/CT as an accurate tool for the detection of distant malignancies in patients with thyroid cancer, bladder cancer, melanoma, and specific types of digestive system cancers.
Acknowledgments
No potential conflict of interest relevant to this article was reported.
Footnotes
* Contributed equally to this work.
Published online Oct. 16, 2012.
Received for publication February 27, 2012.
Accepted for publication July 2, 2012.
© 2012 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
REFERENCES