Abstract
The aim of this study was to assess the diagnostic performance of 18F-FDG PET and integrated 18F-FDG PET/CT for diagnosing recurrent esophageal cancer after initial treatment with curative intent. Methods: The PubMed, Embase, and Cochrane library were systematically searched for all relevant literature using the key words “18F-FDG PET” and “esophageal cancer” and synonyms. Studies examining the diagnostic value of 18F-FDG PET or integrated 18F-FDG PET/CT, either in routine clinical follow-up or in symptomatic patients in whom recurrence of esophageal cancer was suspected, were deemed eligible for inclusion. The primary outcome was the presence of recurrent esophageal cancer as determined by histopathologic biopsy or clinical follow-up. Risk of bias and applicability concerns were assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Sensitivities and specificities of individual studies were meta-analyzed using bivariate random-effects models. Results: Eight eligible studies were included for meta-analysis, comprising 486 patients with esophageal cancer who underwent 18F-FDG PET or PET/CT after previous treatment with curative intent. The quality of the included studies assessed by the QUADAS-2 tool was considered reasonable; there were few concerns with regard to the risk of bias and applicability. Integrated 18F-FDG PET/CT and standalone 18F-FDG PET were used in 4 and 3 studies, respectively. One other study analyzed both modalities separately. In 4 studies, 18F-FDG PET or PET/CT was performed as part of routine follow-up, whereas in 4 other studies the diagnostic test was performed on indication during clinical follow-up. Pooled estimates of sensitivity and specificity for 18F-FDG PET and PET/CT in diagnosing recurrent esophageal cancer were 96% (95% confidence interval, 93%–97%) and 78% (95% confidence interval, 66%–86%), respectively. 
Subgroup analysis revealed no statistically significant difference in diagnostic accuracy according to type of PET scanner (standalone PET vs. integrated PET/CT) or indication of scanning (routine follow-up vs. on indication). Conclusion: 18F-FDG PET and PET/CT are reliable imaging modalities with a high sensitivity and moderate specificity for detecting recurrent esophageal cancer after treatment with curative intent. The use of 18F-FDG PET or PET/CT particularly allows for a minimal false-negative rate. However, histopathologic confirmation of 18F-FDG PET– or PET/CT-suspected lesions remains required, because a considerable false-positive rate is noticed.
- 18F-FDG PET
- positron emission tomography/computed tomography
- esophageal cancer
- recurrent disease
- systematic review
Surgical resection of the esophagus with en-bloc lymphadenectomy remains the cornerstone of treatment with curative intent for patients with localized esophageal cancer (1). A multimodal approach is increasingly applied as strong evidence exists for a survival benefit of 7%–13% with neoadjuvant chemoradiotherapy over surgery alone (2,3). Overall 5-y survival rates of patients with esophageal cancer who are treated with curative intent remain relatively poor (34%–47%) (3,4). These low survival rates are mainly attributable to the high incidence of recurrent disease early after treatment ranging from 45% to 53% (5–7). Most recurrences occur within the first 2 y after surgery, with a median time to recurrence of 10–12 mo (6,7). About half of these patients (51%) are diagnosed with isolated distant systemic recurrence, which affects liver, bone, and lung mainly (5–7). Locoregional recurrence or a combination of locoregional and distant recurrence occur less frequently (14% and 35%, respectively) (7). After recurrent esophageal cancer is diagnosed, poor median survival rates of 3–9 mo have been reported (8).
Currently, most institutes use conventional imaging modalities such as CT and endoscopy with or without endoscopic ultrasound for the detection of recurrent esophageal cancer. However, the interpretation of these imaging techniques after prior treatment is difficult because of local anatomic changes caused by surgery (9). In addition, distant recurrent esophageal cancer may be radiologically occult on CT or may occur in unusual and unexpected locations outside the conventional field coverage of CT (10).
Whole-body 18F-FDG PET and integrated 18F-FDG PET/CT have emerged as useful adjuncts to conventional staging modalities in the pretreatment staging of esophageal cancer. In particular, baseline 18F-FDG PET/CT has gained ground by outperforming CT alone in the detection of unexpected distant metastases (11). Accordingly, 18F-FDG PET or PET/CT may also be a useful method for detecting recurrent disease in the postoperative follow-up of esophageal cancer patients because recurrences tend to occur predominantly at distant sites (7). In the past years, several studies have been published on the utility of 18F-FDG PET or PET/CT in the detection of esophageal cancer recurrence. However, it is difficult to draw conclusions based on the individual studies because methodologic quality may vary, sample sizes are generally small, and differences in study design and patient populations may cause heterogeneity in reported outcomes.
To critically appraise and potentially overcome shortcomings of individual studies, the aim of this study was to systematically review and meta-analyze the diagnostic performance of 18F-FDG PET and PET/CT for diagnosing recurrent esophageal cancer after initial treatment with curative intent.
MATERIALS AND METHODS
The study protocol has been registered in the PROSPERO international prospective register of systematic reviews and is accessible at http://www.crd.york.ac.uk/prospero/ (registration no., CRD42014009615).
Search Strategy
On December 16, 2014, a systematic search was performed in the databases Medline (via PubMed), Embase, and the Cochrane library. The full search strategy is presented in Table 1.
Table 1. Full Text of Search Strategy and Results as of December 16, 2014
Study Selection
After duplicates of the retrieved articles were removed, titles and abstracts were screened for eligibility by 2 authors independently. The full text of potentially relevant articles was retrieved and independently assessed by 2 authors for inclusion.
Studies examining the test accuracy of 18F-FDG PET or integrated 18F-FDG PET/CT, either in routine clinical follow-up with a fixed time interval irrespective of physical complaints or in symptomatic patients in whom recurrence of esophageal cancer was suspected, were deemed eligible for inclusion. Only studies that included patients who were previously treated with curative intent for esophageal cancer and that reported on the diagnostic accuracy of 18F-FDG PET or PET/CT for the detection of disease recurrence were included. Treatment with curative intent had to include at least surgery, whether or not combined with neoadjuvant chemoradiotherapy. The reference standard was recurrent esophageal cancer as confirmed by histopathologic biopsy or clinical follow-up.
Case reports, studies with fewer than 10 included patients, reviews, poster abstracts, and animal studies were excluded. Publications written in a language other than Dutch, English, or German were also excluded from this review. Missing data of possibly eligible studies were requested from the study authors. References of the included studies and of related review studies were also screened for inclusion. Disagreements regarding the eligibility of a study were resolved by consensus.
Data Extraction and Quality Assessment
Study and patient characteristics along with 18F-FDG PET or PET/CT parameters were extracted from each study. The quality of the included studies was critically appraised by 2 authors independently, according to the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool (12). QUADAS-2 assesses risk of bias and applicability concerns in 4 key domains: patient selection, index test, reference standard, and flow and timing. To reach a judgment on the risk of bias, the signaling questions provided with the QUADAS-2 tool were used. Risk of bias and applicability concerns were judged as low, high, or unclear for each of the QUADAS-2 domains.
Statistical Analysis
The target condition consisted of the presence of recurrent esophageal cancer as determined by histopathologic biopsy or clinical follow-up. From each included study, the numbers of true-positives (TPs), false-positives (FPs), true-negatives (TNs), and false-negatives (FNs) were obtained on a per-patient basis if available. For studies reporting on a per-lesion or per-scan basis, the reported sensitivities and specificities were used, but the absolute numbers underlying these estimates were recalculated from the total number of patients with and without recurrent disease, to prevent overestimation of the weight of these results. Subsequently, for each study the sensitivity and specificity along with 95% confidence intervals (95% CIs) were calculated and depicted in forest plots.
A bivariate random-effects model was used to obtain pooled estimates of sensitivity and specificity with their corresponding 95% CIs from the individual studies. The bivariate model uses a random-effects approach to incorporate heterogeneity beyond chance as a result of clinical and methodologic differences between studies (13). The bivariate model also estimates whether sensitivities and specificities are (negatively) correlated across studies due to implicit differences in the threshold for considering an 18F-FDG PET or PET/CT scan suspicious for recurrence (positive index test result). The pooled estimates of sensitivity and specificity and the corresponding 95% confidence ellipse are shown in receiver-operating-characteristic space (14).
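The per-study calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the article does not state which confidence interval method was used, so the Wilson score interval is an assumption, and the helper names and the example numbers are hypothetical.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_from_counts(tp, fp, tn, fn):
    """Sensitivity and specificity with 95% CIs from a per-patient 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_ci(tn, tn + fp),
    }


def rescale_to_patients(sens, spec, n_recurrence, n_no_recurrence):
    """Rebuild an approximate per-patient 2x2 table from a sensitivity and
    specificity that were reported on a per-lesion or per-scan basis."""
    tp = round(sens * n_recurrence)
    tn = round(spec * n_no_recurrence)
    return tp, n_no_recurrence - tn, tn, n_recurrence - tp  # TP, FP, TN, FN


# Hypothetical study: per-lesion sensitivity 93% and specificity 76%,
# with 30 patients with recurrence and 25 without.
counts = rescale_to_patients(0.93, 0.76, n_recurrence=30, n_no_recurrence=25)
result = accuracy_from_counts(*counts)
print(counts, result)
```

Rescaling to patient numbers keeps a study that reported many lesions or scans from carrying more weight in the pooled analysis than its number of patients justifies.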
Subgroup analyses were performed by adding the following study characteristics (covariates) to the bivariate model: “standalone 18F-FDG PET” versus “integrated 18F-FDG PET/CT,” “index test performed on indication” versus “index test performed as part of routine follow-up,” and “Asian studies” versus “non-Asian studies.” A P value of less than 0.05 was considered statistically significant. The nonlinear mixed model procedure of SAS (version 9.2; SAS Institute) was used to estimate the parameters of the bivariate model.
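For orientation, the bivariate model with a single study-level covariate is commonly written as below. This is the general textbook formulation and is offered as a sketch only: the article does not report its exact parameterization, and the symbols ($\mu_A$, $\mu_B$, $\nu_A$, $\nu_B$, $\Sigma$, $Z_i$) as well as the binomial within-study likelihood are assumptions introduced here.

```latex
% Within-study level (study i): exact binomial likelihoods
\mathrm{TP}_i \sim \mathrm{Binomial}\bigl(\mathrm{TP}_i + \mathrm{FN}_i,\ \mathrm{Se}_i\bigr),
\qquad
\mathrm{TN}_i \sim \mathrm{Binomial}\bigl(\mathrm{TN}_i + \mathrm{FP}_i,\ \mathrm{Sp}_i\bigr)

% Between-study level: correlated random effects on the logit scale,
% with Z_i a binary covariate (e.g., 1 = integrated PET/CT, 0 = standalone PET)
\begin{pmatrix} \operatorname{logit}(\mathrm{Se}_i) \\ \operatorname{logit}(\mathrm{Sp}_i) \end{pmatrix}
\sim
\mathcal{N}\!\left(
\begin{pmatrix} \mu_A + \nu_A Z_i \\ \mu_B + \nu_B Z_i \end{pmatrix},
\ \Sigma
\right),
\qquad
\Sigma =
\begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}

% Pooled estimates for the reference subgroup (Z_i = 0)
\widehat{\mathrm{Se}} = \operatorname{logit}^{-1}(\hat{\mu}_A),
\qquad
\widehat{\mathrm{Sp}} = \operatorname{logit}^{-1}(\hat{\mu}_B)
```

In this notation, a subgroup difference corresponds to testing $\nu_A = 0$ or $\nu_B = 0$, and a negative $\sigma_{AB}$ reflects the threshold effect mentioned above.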
RESULTS
Eligible Studies
The systematic search yielded 948 articles from Medline, 1,684 from Embase, and 60 from the Cochrane library (Table 1). After duplicates were removed, 1,867 articles remained, of which title and abstract were reviewed. Forty-three articles were deemed potentially relevant for this study. After the full text of the remaining studies was read, 35 articles were excluded because these concerned review studies (n = 13), nondiagnostic studies (n = 8), poster abstracts (n = 5), publications in other than prespecified languages (n = 4), case reports (n = 2), a study that included fewer than 10 patients (n = 1), or studies in which insufficient data were available (n = 2). Missing data of these latter 2 studies were requested from study authors without satisfying result (15,16). Screening of references of these eligible articles and related review studies did not yield additional relevant publications. Consequently, 8 studies met our inclusion and exclusion criteria, comprising 486 patients with esophageal cancer who underwent 18F-FDG PET or PET/CT after previous treatment with curative intent. The described process of study selection is summarized in Figure 1.
Figure 1. Flowchart summarizing search results and study selection.
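The counts in the study selection can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the number of duplicates removed is derived here and is not stated in the article.

```python
# Records retrieved per database, as reported in the text.
records = {"Medline": 948, "Embase": 1684, "Cochrane library": 60}
total_records = sum(records.values())

# Records remaining after duplicate removal (reported).
after_deduplication = 1867
duplicates_removed = total_records - after_deduplication  # derived, not reported

# Full-text articles assessed and reasons for exclusion (reported).
full_text_assessed = 43
excluded = {
    "review studies": 13,
    "nondiagnostic studies": 8,
    "poster abstracts": 5,
    "other than prespecified languages": 4,
    "case reports": 2,
    "fewer than 10 patients": 1,
    "insufficient data": 2,
}
total_excluded = sum(excluded.values())
included = full_text_assessed - total_excluded

print(total_records, duplicates_removed, total_excluded, included)
```

The totals are internally consistent: 35 full-text exclusions out of 43 articles leave the 8 included studies.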
The general characteristics of the included studies are presented in Table 2 (17–24). Table 3 outlines the 18F-FDG PET or PET/CT parameters and reference standards used. Only 1 of the 8 studies was prospectively designed to answer this research question (22). The duration of clinical follow-up after acquisition of an 18F-FDG PET or PET/CT scan was less than 6 mo in 1 of the included studies (23), 6 mo or longer in 5 studies (17,18,20,21,24), and not described in 2 other studies (19,22). In 4 studies, the diagnostic value of integrated 18F-FDG PET/CT was analyzed (17,18,20,21), and 3 studies analyzed the diagnostic value of standalone 18F-FDG PET (22–24). In 1 study, the value of integrated 18F-FDG PET with CT versus 18F-FDG PET alone was analyzed separately; hereafter, the data from this study are referred to as Roedl 1 and Roedl 2, respectively (19). In 4 studies, the diagnostic test was performed on a routine basis (19,21–23), whereas in the other studies the diagnostic test was performed on indication during clinical follow-up (17,18,20,24). In 6 studies, 18F-FDG PET- or PET/CT-positive results were analyzed on a per-patient basis, whereas in 2 studies the results were assessed on either a per-scan (17) or a per-lesion (24) basis.
Table 2. Characteristics of Included Studies (n = 8)
Table 3. 18F-FDG PET or PET/CT Parameters, Methods of Image Interpretation, and Reference Standard of Included Studies
Quality Assessment
The results of the quality assessment using the QUADAS-2 tool are presented in Table 4. The risk of bias concerning patient selection was low in 7 of the included studies; 1 study was deemed at high risk of bias because it did not include a consecutive sample of patients (20). Risk of bias with regard to the index test was low in all studies because the index test results were consistently interpreted without knowledge of the outcome of the reference test. However, the risk of bias for the reference test was deemed unclear for most studies because these articles failed to report whether or not the reference standard was interpreted without knowledge of the index test result. Furthermore, applicability concerns for patient selection were found in 4 studies because the study population consisted of patients who underwent a variety of treatment regimens. In general, there were only a few high concerns with regard to the risk of bias and applicability; the quality of the currently available literature was considered reasonable.
Table 4. Quality Assessment of Included Studies
Diagnostic Accuracy
The results of 2 studies that assessed the diagnostic value of 18F-FDG PET or PET/CT on a per-lesion or per-scan basis were adjusted according to their sample size (17,24). The paired forest plots of sensitivity and specificity of the 8 individual studies are presented in Figure 2. The reported sensitivities ranged from 89% to 100% and specificities from 55% to 94%. For the calculation of the overall pooled estimates, only the data of Roedl 1 (and not of Roedl 2) were used to prevent using the data from this study twice (19). Sensitivity was eventually pooled with a fixed-effect model because the between-study variation was not larger than could be expected by chance. More variation than expected by chance was observed for specificity; therefore, random-effects pooling was used for specificity. Pooled estimates of sensitivity and specificity were 96% (95% CI, 93%–97%) and 78% (95% CI, 66%–86%), respectively. The estimates from the individual studies and the pooled estimates of sensitivity and specificity together with the 95% confidence ellipse are shown in Figure 3.
Figure 2. (A) Forest plot of sensitivity of integrated 18F-FDG PET/CT and PET alone for detection of recurrent esophageal cancer after treatment with curative intent. n = number of TP; N = number of TP + number of FN. (B) Forest plot of specificity of integrated 18F-FDG PET/CT and PET alone for the detection of recurrent esophageal cancer after treatment with curative intent. n = number of TN; N = number of TN + number of FP.
Figure 3. Pooled estimate of sensitivity and specificity (▪) and corresponding 95% confidence ellipse along with estimates from individual studies (▲) in receiver-operating-characteristic space.
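To illustrate what fixed-effect pooling of sensitivity involves, the sketch below uses inverse-variance weighting on the logit scale. This is a simplified stand-in and not the procedure used in the article, which estimated the model with the nonlinear mixed model procedure of SAS. The 0.5 continuity correction, the function names, and the study counts are all assumptions made for illustration.

```python
import math


def inv_logit(x):
    """Back-transform from the logit scale to a proportion."""
    return 1 / (1 + math.exp(-x))


def pool_fixed_effect(events_totals, z=1.96):
    """Inverse-variance fixed-effect pooling of proportions on the logit scale.

    A 0.5 continuity correction is added to both cells of every study so that
    studies with a sensitivity of 100% (zero false-negatives) remain usable.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for events, total in events_totals:
        e = events + 0.5
        non_e = total - events + 0.5
        estimate = math.log(e / non_e)   # logit of the corrected proportion
        variance = 1 / e + 1 / non_e     # approximate variance of the logit
        weight = 1 / variance
        weighted_sum += weight * estimate
        weight_total += weight
    pooled_logit = weighted_sum / weight_total
    se = math.sqrt(1 / weight_total)
    return (
        inv_logit(pooled_logit),
        inv_logit(pooled_logit - z * se),
        inv_logit(pooled_logit + z * se),
    )


# Hypothetical (TP, TP + FN) pairs; these are NOT the data of the included studies.
hypothetical_studies = [(28, 30), (45, 47), (19, 19), (52, 55)]
pooled, lower, upper = pool_fixed_effect(hypothetical_studies)
print(f"pooled sensitivity {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

A fixed-effect model assumes one common underlying sensitivity across studies, which is why it is reserved for situations such as the one described above, where the between-study variation is no larger than expected by chance.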
The planned subgroup analysis was restricted to specificity alone because there was no real heterogeneity in sensitivity. These subgroup analyses revealed no statistically significant difference in specificity according to type of PET scanner (standalone PET vs. integrated PET/CT), indication of scanning (part of routine follow-up vs. on indication), and country of origin (Asian vs. non-Asian) (Table 5).
Table 5. Results from Subgroup Analyses for Specificity
DISCUSSION
This study is the first, to our knowledge, to systematically review and summarize the currently available evidence on the accuracy of 18F-FDG PET and PET/CT for diagnosing recurrent esophageal cancer after primary treatment with curative intent. The risk of bias, as assessed by the QUADAS-2 tool, was low in most of the 8 included studies. Pooled estimates for 18F-FDG PET and PET/CT yielded a high sensitivity and moderate specificity of 96% and 78%, respectively. Sensitivity was consistently high in all studies, but variation was present in specificity. Subgroup analysis could not link specific study characteristics to systematically higher or lower specificity. Current evidence indicates that 18F-FDG PET and PET/CT are valuable tests for clinical practice in the follow-up of patients with esophageal cancer after primary treatment.
Certain limitations apply to this meta-analysis. Methodologic concerns that may have influenced the results of the various studies include a lack of reported masking of the reference test to the index test result and the inclusion of heterogeneous treatment modalities among individual studies. Another limitation is the limited number of included studies in this meta-analysis. Also, in this meta-analysis, 3 of 8 studies included only patients with a clinical suspicion of recurrence. This may have led to an overestimation of the diagnostic value of 18F-FDG PET or PET/CT, as these patients have an increased pretest probability, compared with patients without suspicion of recurrence. However, subgroup analysis could not confirm this potential difference in diagnostic accuracy of 18F-FDG PET or PET/CT on clinical indication or as part of routine follow-up (specificity, 78% [95% CI, 69%–86%] vs. 76% [95% CI, 65%–85%], respectively; P = 0.748). In addition, the country of origin did not seem to have influenced the results of the different studies significantly. Last, differential verification bias was of concern in most included studies because different reference standards were used for confirmation of the diagnosis. Most negative 18F-FDG PET or PET/CT cases were verified by a potentially less reliable and second-best reference test (clinical follow-up instead of histopathologic biopsy), which may have resulted in a slight overestimation of sensitivity and underestimation of specificity (25). None of the included studies applied a correction method to their results for this potential bias.
Conventional imaging modalities for recurrent esophageal cancer include endoscopy with or without endoscopic ultrasound and CT of the thorax and abdomen. Endoscopic ultrasound has proven to be effective for the detection of locoregional recurrence (sensitivity > 90%), but both endoscopy and endoscopic ultrasound fail to detect distant metastases (26). Currently, distant metastases are of particular interest because the incidence of locoregional recurrence is substantially reduced by new treatment algorithms, including neoadjuvant chemoradiotherapy (7). CT scans are commonly used for the detection of distant metastases, although the diagnostic value of CT for local recurrence is limited at the site of resection because of anatomic distortion caused by surgery and radiotherapy (9). Furthermore, only limited data on the diagnostic value of CT for detecting recurrent esophageal cancer are available, with reported sensitivities ranging from 65% to 89% (22,23). The pooled sensitivity estimate for 18F-FDG PET and PET/CT of 96% from this meta-analysis indicates that 18F-FDG PET and PET/CT are likely to outperform CT in this regard, which is confirmed by direct comparison in 2 studies (22,23).
Comparison of reported specificities for CT and the current pooled specificity estimate for 18F-FDG PET and PET/CT suggests an inferior specificity for 18F-FDG PET and PET/CT, compared with standalone CT (78% vs. 79%–91%, respectively) (22,23). The lower specificity of 18F-FDG PET is a common problem in oncologic patients and is mainly caused by FP findings due to chronic inflammation after surgery, chronic respiratory tract disease, radiation pneumonitis, or dilation of anastomotic strictures (20,27,28). A combination of metabolic imaging (18F-FDG PET) with anatomic imaging (CT) has been reported to improve diagnostic accuracy, compared with PET alone, especially in diagnosing locoregional recurrence (15,17,19). In this regard, the only direct comparative study in esophageal cancer recurrence diagnosis found a higher specificity in favor of PET/CT, compared with PET alone (75% vs. 55%, respectively) (19). However, this potential benefit of 18F-FDG PET/CT as opposed to standalone 18F-FDG PET for diagnosing recurrent esophageal cancer did not reach statistical significance by subgroup analysis in this meta-analysis (specificity, 78% [95% CI, 70%–85%] vs. 70% [95% CI, 59%–80%], respectively; P = 0.213).
The specificities used in this meta-analysis were derived from analysis on a per-patient basis, and the pooled results therefore cannot exclude the possibility of superiority of 18F-FDG PET/CT over PET for specific anatomic sites. Anatomic site-specific TP and FP numbers were reported on a per-lesion basis in 5 of 8 studies (17,19,20,23,24) and suggested a difference in the positive predictive values (i.e., TP/[TP + FP]) for diagnosing locoregional recurrence using 18F-FDG PET/CT (range, 79%–95%) (17,19,20), compared with 18F-FDG PET (range, 59%–68%) (19,23,24). The differences between positive predictive values for diagnosing distant recurrence of PET/CT (range, 89%–95%) (17,19,20) and PET (84%–90%) studies were minor (19,23,24). However, in contrast to specificities, the pooling of positive predictive values is questionable because of their strong dependency on the pretest probability (i.e., the prevalence of true recurrences), which varies among the included studies with different clinical settings. Another subject of note is the continuous technologic progress of 18F-FDG PET/CT image generation and reconstruction algorithms, and 18F-FDG PET with integrated MR imaging has now been introduced clinically (29). These developments may prove to further increase the accuracy in diagnosing recurrent esophageal cancer.
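The dependency of predictive values on pretest probability noted above follows directly from Bayes' theorem. The sketch below applies the pooled sensitivity (96%) and specificity (78%) from this meta-analysis to a range of prevalences. The prevalence values are hypothetical and chosen only to show the trend.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value for a given pretest probability."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


SENSITIVITY = 0.96  # pooled estimate reported in this meta-analysis
SPECIFICITY = 0.78  # pooled estimate reported in this meta-analysis

# Hypothetical pretest probabilities, e.g. routine follow-up (lower)
# versus scanning on clinical suspicion (higher).
for prevalence in (0.10, 0.30, 0.50, 0.70):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With sensitivity and specificity held fixed, the positive predictive value rises steeply with prevalence, which is why positive predictive values from studies in different clinical settings are not directly comparable.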
In the current guidelines of the European Society for Medical Oncology and the National Comprehensive Cancer Network, there is no room for routine imaging or endoscopy with biopsies after initial treatment for esophageal cancer (30,31). The key reason to refrain from routine imaging is the limited number of adequate therapeutic options when recurrence is detected. Current treatment options for recurrent disease consist of salvage chemoradiotherapy, which is associated with symptomatic relief and improved survival rates (32,33). Furthermore, recent experimental studies have demonstrated that reoperation for selected cases of localized recurrence or solitary recurrence in lymph nodes, lungs, and subcutaneous lesions is safe and may improve survival (34–38). This finding is supported by a recent study that demonstrated a significant survival benefit for patients with cervical lymph node recurrence who were treated with salvage lymphadenectomy, compared with chemoradiotherapy (37).
Future clinical decision making with regard to treatment strategy for recurrent disease will depend on the extent and location of the recurrence. Routine imaging with CT and PET has been shown to possess the ability to detect recurrent esophageal cancer in a presymptomatic phase (8,39). However, so far no studies combining routine imaging with aggressive treatment strategies are available. Also, little is known about cost-effectiveness of routine imaging and gain of quality of life after early detection of recurrent esophageal disease. Therefore, with the limited evidence available for routine imaging in recurrent esophageal cancer, at this time routine imaging is not recommended. In the case that recurrent disease is clinically suspected, the method of choice is 18F-FDG PET/CT.
CONCLUSION
This meta-analysis demonstrates that 18F-FDG PET and PET/CT are reliable imaging modalities, with a high sensitivity and moderate specificity for detecting recurrent esophageal cancer. The use of 18F-FDG PET or PET/CT particularly allows for a minimal FN rate. However, histopathologic confirmation of 18F-FDG PET– and PET/CT-suspected lesions remains required, because a considerable FP rate is noticed. The benefit of 18F-FDG PET and PET/CT over conventional imaging techniques, in terms of cost-effectiveness and improving clinical outcome, remains a subject of debate. Future studies are warranted to analyze whether earlier detection of recurrent esophageal cancer along with more aggressive therapeutic approaches will improve survival and quality of life.
DISCLOSURE
The costs of publication of this article were defrayed in part by the payment of page charges. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734. No potential conflict of interest relevant to this article was reported.
Acknowledgments
We thank Professor Rob J.P.M. Scholten—director of The Dutch Cochrane Centre hosted by the Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, The Netherlands—for critically reviewing the manuscript and providing methodologic support.
Footnotes
* Contributed equally to this work.
Published online May 7, 2015.
- © 2015 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
REFERENCES
- Received for publication February 5, 2015.
- Accepted for publication April 25, 2015.