Abstract
The aim of this work was to assess the overall value of 18F-FDG PET/CT in the diagnosis of residual or recurrent nasopharyngeal carcinoma using a metaanalysis. Methods: The literature published between January 1990 and September 2014 was searched in the PubMed, EMBASE, Cochrane Library, EBSCO, VIP, CNKI, and Wanfang databases to identify eligible studies on PET/CT of residual or recurrent lesions. The methodologic quality of the included studies was evaluated using the “quality assessment for studies of diagnostic accuracy” tool. Summary sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, and receiver-operating characteristic curve were obtained using Meta-DiSc freeware. Subgroups were also analyzed. Results: A total of 23 studies, involving 1,253 subjects, were included in the metaanalysis. Pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio, with 95% confidence intervals in parentheses, for 18F-FDG PET or PET/CT were 0.93 (0.91–0.95), 0.87 (0.84–0.89), 5.52 (3.96–7.71), 0.12 (0.09–0.15), and 55.31 (34.94–87.57), respectively. The area under the receiver-operating characteristic curve and Q* index estimate of PET/CT were 0.9473 and 0.8869, respectively. There was no significant difference between the area under the curve of PET and PET/CT (P > 0.05). Conclusion: Our study has confirmed that 18F-FDG PET/CT has high sensitivity and specificity but significant heterogeneity in the diagnosis of residual or recurrent nasopharyngeal carcinoma.
- nasopharyngeal carcinoma
- PET/CT
- recurrence
- residue
- meta-analysis
Nasopharyngeal carcinoma is a rare malignancy in most parts of the world but is common in southern China. The incidence and mortality in China and constituent areas are 3.16 and 1.53 per 100,000 people, respectively, according to the national population in 2010. The world age-standardized incidence and mortality were 2.44 and 1.18 per 100,000 people, respectively (1). Nasopharyngeal carcinoma can invade tissue adjacent to the nasopharynx and even metastasize to bone, liver, lung, and other organs via blood or lymph. Currently, radiation therapy is preferred for newly diagnosed nonmetastatic nasopharyngeal cancer, particularly intensity-modulated radiation therapy, allowing a 5-y survival rate of approximately 50% (2). Advanced age, local extension, and advanced disease stage adversely affect the prognosis in patients with nasopharyngeal carcinoma after treatment (3). Despite significant improvements in local control due to advances in radiotherapy, local recurrence and residual disease remain the main reasons for failure in patients with advanced nasopharyngeal carcinoma (4). After radiation therapy, various changes occur in nasopharyngeal tissues, such as edema, inflammation, fibrosis, and scarring. Because traditional imaging is not always reliable in distinguishing between recurrent disease and posttherapy changes, the use of 18F-FDG PET/CT may be complementary for many lesions (5).
In general, MRI is preferred over CT as the conventional imaging modality for posttherapy surveillance of nasopharyngeal carcinoma because of its high contrast resolution (6,7). However, MRI still has some difficulty in differentiating posttreatment fibrosis from recurrent or residual tumor. Postradiation therapy fibrosis, which may appear as a mass seen on MRI, always poses a diagnostic question of whether it is a tumor (8). In contrast, the specificity of 18F-FDG PET/CT in assessing efficacy after radiotherapy and in diagnosing lesions is about 93.4%.
In our opinion, 18F-FDG PET/CT, which has high sensitivity and specificity in detecting lesions, is the ideal combination of metabolic imaging and morphologic imaging. Thus far, although there have been many studies reporting the diagnostic efficacy of PET/CT for recurrent or residual nasopharyngeal carcinoma, the results are incongruous (9–11). Therefore, systematic evaluation of 18F-FDG PET/CT in the diagnosis of recurrent or residual nasopharyngeal carcinoma is necessary.
In our previously published article (12), 26 studies involving 1,203 patients were included. We calculated the pooled sensitivity, specificity, and diagnostic odds ratios and concluded that 18F-FDG PET/CT performed well for diagnosis of residual and recurrent nasopharyngeal carcinoma, with relatively high sensitivity and specificity. However, some of the included studies had fewer than 25 patients, which is too few to establish conclusively whether heterogeneity exists. In the current article, every included study had more than 25 patients, and after assessing the positive likelihood ratio, negative likelihood ratio, and publication bias, we were able to confirm the existence of heterogeneity.
MATERIALS AND METHODS
Literature Search
We searched the PubMed, EMBASE, Cochrane Library, and EBSCO databases, as well as 3 Chinese databases (VIP, CNKI, and Wanfang), for studies published from January 1990 to September 2014, with no restrictions on the language of the publication. The search strategy was based on the combination of the following keywords: (“nasopharyngeal carcinoma”) AND (“PET” OR “positron emission tomography” OR “FDG”). Additionally, the reference lists of relevant studies were manually screened for additional eligible studies.
Selection of Studies
The inclusion criteria for relevant studies were as follows: whole-body 18F-FDG PET or PET/CT had been used to identify and characterize the suspected residual or recurrent nasopharyngeal carcinoma; histopathologic results (HP) or clinical or radiologic follow-up had been used as the reference standard; and absolute numbers of true-positive, true-negative, false-positive, and false-negative data had been presented. When the data were published in more than one article, the latest one or that with the greatest detail was included.
Studies were excluded if 18F-FDG PET or PET/CT had been used in staging and prognosis or if fewer than 10 patients had been included. In addition, duplicate publications were excluded, as were publications such as review articles, case reports, conference papers, and letters, which do not contain the original data.
Data Extraction and Quality Assessment
Two reviewers independently extracted data from each article and recorded them on a standardized form. Any disagreement in data extraction was resolved by consensus or by consulting a third investigator. True-positive, false-positive, true-negative, and false-negative results were recorded for further analysis. The following data were also extracted from each study: author names, article title, year of publication, country of origin, age of patients, sample size, sex ratio, study design (prospective or retrospective), the technical characteristics of the 18F-FDG PET or PET/CT examinations, and whether follow-up was histopathologic, clinical, or radiologic. Additionally, technical information such as collection device, dose, and intervals were extracted.
We assessed the methodologic quality of the studies using “quality assessment for studies of diagnostic accuracy” (QUADAS), a comprehensive, systematic reviewing tool. QUADAS contains 14 items, including questions on the validity of the reference standard, the spectrum of patients, verification methods, and incorporation bias. For each item, the tool uses a “yes” or “no” response (13).
Statistical Analysis
The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio with 95% confidence intervals (CIs) were calculated on the basis of the bivariate analysis. Heterogeneity was assessed by the likelihood ratio χ² test and the I² index; P < 0.05 or I² > 50% suggested heterogeneity. If heterogeneity existed, a random-effects model was used for the primary metaanalysis to obtain a summary estimate of the test’s performance with 95% CIs. Otherwise, a fixed-effects model was used.
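As an illustration of the per-study quantities that enter this pooling, the sketch below derives sensitivity, specificity, both likelihood ratios, and the diagnostic odds ratio from a single 2 × 2 table. The counts are hypothetical, the names `TwoByTwo` and `accuracy_measures` are illustrative, and the 0.5 continuity correction for tables with an empty cell is an assumed convention rather than a detail reported in this article.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one diagnostic accuracy study (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy_measures(table, correction=0.5):
    """Return per-study accuracy measures from a 2 x 2 table.

    Assumption: if any cell is zero, `correction` is added to every cell
    so that the ratios remain defined.
    """
    cells = [table.tp, table.fp, table.fn, table.tn]
    if any(c == 0 for c in cells):
        tp, fp, fn, tn = (c + correction for c in cells)
    else:
        tp, fp, fn, tn = (float(c) for c in cells)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),
    }


# Hypothetical study: 50 patients with disease, 50 without.
measures = accuracy_measures(TwoByTwo(tp=45, fp=5, fn=5, tn=45))
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Within a single study the diagnostic odds ratio equals the positive likelihood ratio divided by the negative likelihood ratio; that identity need not hold for pooled estimates, because each measure is pooled separately.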
Furthermore, we constructed the summary receiver-operating characteristic (SROC) curves and determined the area under the curve and the Q* estimate. (The Q* index is a useful single-number summary of diagnostic value. It is defined by the point at which sensitivity and specificity are equal, which is the point closest to the ideal top-left corner of the SROC space.) Because the results might be influenced by the scanning technique, subgroup analyses were performed for the “PET/CT” and “PET-alone” subgroups. Publication bias was assessed with Deeks funnel plots. All analyses were performed using Meta-DiSc, version 1.4 (Ramón y Cajal Hospital, Madrid, Spain).
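The relationship between Q*, the diagnostic odds ratio, and the area under the curve can be sketched for the special case of a symmetric SROC curve, on which the diagnostic odds ratio is constant along the curve. The article does not state that the fitted curve was symmetric, so this is an assumption made only for illustration; the function names are likewise illustrative.

```python
import math


def q_star_from_dor(dor):
    """Q* on a symmetric SROC curve: the point where sensitivity == specificity."""
    root = math.sqrt(dor)
    return root / (1 + root)


def dor_from_q_star(q_star):
    """Invert the relation above: DOR = (Q* / (1 - Q*)) ** 2."""
    return (q_star / (1 - q_star)) ** 2


def symmetric_sroc_auc(dor):
    """Area under a symmetric SROC curve with constant diagnostic odds ratio.

    The curve is TPR = DOR * FPR / (1 + (DOR - 1) * FPR); integrating it
    over FPR from 0 to 1 gives the closed form below.
    """
    if math.isclose(dor, 1.0):
        return 0.5  # uninformative test: the curve is the diagonal
    return dor / (dor - 1) ** 2 * ((dor - 1) - math.log(dor))


# Q* as reported in this article; everything derived from it is illustrative.
implied_dor = dor_from_q_star(0.8869)
implied_auc = symmetric_sroc_auc(implied_dor)
print(f"implied DOR = {implied_dor:.2f}, implied AUC = {implied_auc:.4f}")
```

Under this assumption the reported Q* of 0.8869 implies an area under the curve of roughly 0.947, in line with the reported 0.9473. The implied odds ratio is a property of the fitted curve and need not equal the separately pooled diagnostic odds ratio.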
RESULTS
Literature Search
The computerized search initially yielded 871 potential articles using the key words. After all the articles, titles, and abstracts had been reviewed, 810 were excluded: 655 did not concern PET or PET/CT or did not concern recurrent or residual nasopharyngeal carcinoma; 140 were case reports, reviews, conference papers, or letters; and 15 were duplicate articles. After excluding those in which the absolute data could not be extracted, 23 articles remained (a total of 1,253 patients) and were included in the metaanalysis (5,14–35). A flowchart of the literature search is shown in Fig. 1.
Study Characteristics
Five articles (16,21,31–33) used either histopathologic results or clinical follow-up as the reference standard and the rest used both (5,14,15,17–19,21–30,34,35). Twelve of the 23 studies were performed in China (24–35), 9 in Taiwan (15–23), 1 in Saudi Arabia (14), and 1 in Italy (5). Four were prospective (15–18) and 19 retrospective (5,14,19–35). Additionally, 16 studies (5,14,15,17–23,26,28,29,32–35) enrolled patients in a consecutive manner; the other 7 studies (16,24,25,27,30,31) did not provide that information. The principal characteristics of the 23 studies are listed in Table 1.
Assessment of Study Quality and Publication Bias
All studies fulfilled 8 or more of the 14 items in the QUADAS tool. The acceptable interval between PET/CT or PET and the reference standard (item 4) was not presented in 91% of the articles (5,14,15,17–19,21–35). The common weaknesses were concentrated in the masking between the reference standard and index test (items 10 and 11). The Deeks funnel plot (Fig. 2) showed no significant publication bias (P > 0.05).
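A Deeks funnel plot is commonly built by regressing the log diagnostic odds ratio on the inverse square root of the effective sample size, with the effective sample size as the weight; a slope far from zero suggests small-study effects. The sketch below shows only that regression. The 2 × 2 tables are hypothetical, the function names are illustrative, adding 0.5 to every cell is an assumed convention, and the significance test on the slope (the source of the P value) is not reproduced.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """Effective sample size: 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def deeks_regression(tables):
    """Regress ln(DOR) on 1 / sqrt(ESS), weighted by ESS.

    `tables` is a list of (tp, fp, fn, tn) tuples.
    Assumption: 0.5 is added to every cell before computing ln(DOR).
    """
    x, y, w = [], [], []
    for tp, fp, fn, tn in tables:
        ess = effective_sample_size(tp + fn, fp + tn)
        a, b, c, d = (cell + 0.5 for cell in (tp, fp, fn, tn))
        x.append(1.0 / math.sqrt(ess))
        y.append(math.log((a * d) / (b * c)))
        w.append(ess)
    return weighted_line(x, y, w)


# Hypothetical studies, not data from this metaanalysis.
hypothetical_tables = [(45, 5, 5, 45), (20, 4, 2, 14), (60, 10, 6, 84)]
slope, intercept = deeks_regression(hypothetical_tables)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```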
Diagnostic Accuracy of PET/CT
Table 1 shows the characteristics of the 23 studies, and Table 2 the diagnostic accuracy of whole-body 18F-FDG PET or PET/CT in them. The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratios, followed by 95% CIs in parentheses, for 18F-FDG PET or PET/CT were 0.93 (0.91–0.95), 0.87 (0.84–0.89), 5.52 (3.96–7.71), 0.12 (0.09–0.15), and 55.31 (34.94–87.57), respectively. Fig. 3 shows a forest plot of sensitivity and specificity, and Fig. 4 the positive and negative likelihood ratios for 18F-FDG PET or PET/CT in the diagnosis of residual or recurrent nasopharyngeal carcinoma. The χ² values for sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio were 30.49 (P = 0.1071 [>0.05]), 53.66 (P < 0.05), 63.47 (P < 0.05), and 18.83 (P = 0.6559 [>0.05]), respectively, indicating heterogeneity between studies for specificity and positive likelihood ratio but homogeneity for sensitivity and negative likelihood ratio. Therefore, a random-effects model was used for the primary metaanalysis to obtain a summary estimate for test performance with 95% CIs. The area under the curve was 0.9473, and the Q* index estimate was 0.8869 for 18F-FDG PET or PET/CT, suggesting outstanding performance (Fig. 5).
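The reported χ² values can be translated into I² percentages with the usual relation I² = max(0, (Q − df)/Q) × 100. The sketch below assumes that each reported χ² is a Cochran Q statistic on k − 1 = 22 degrees of freedom for the 23 studies; that reading is consistent with the reported P values but is not stated explicitly in the article.

```python
def i_squared(q, df):
    """I-squared (%) from a heterogeneity statistic Q with df degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


N_STUDIES = 23
DF = N_STUDIES - 1  # assumption: each chi-square has k - 1 degrees of freedom

# Chi-square values as reported in the Results.
reported_chi2 = {
    "sensitivity": 30.49,
    "specificity": 53.66,
    "positive LR": 63.47,
    "negative LR": 18.83,
}

i2_by_measure = {name: i_squared(q, DF) for name, q in reported_chi2.items()}
for name, value in i2_by_measure.items():
    flag = "heterogeneous" if value > 50 else "not flagged"
    print(f"{name}: I2 = {value:.1f}% ({flag} by the I2 > 50% rule)")
```

On this reading, specificity and positive likelihood ratio exceed the 50% threshold, whereas sensitivity and negative likelihood ratio do not, which matches the pattern described in the text.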
A threshold effect arises when differences in sensitivities, specificities, or likelihood ratios occur between studies because of differences in the thresholds used to define positive or negative test results. When a threshold effect exists, there is a negative correlation between sensitivities and specificities (or a positive correlation between sensitivities and 1 − specificities). In test accuracy studies, the threshold effect is one of the primary causes of heterogeneity. In this study, the Spearman rank correlation was −0.136 (P = 0.537), indicating absence of a threshold effect. The subgroup analyses by PET system (PET/CT or PET alone) showed no significant differences among subgroups (P > 0.05). Nevertheless, heterogeneity between studies was highly significant; thus, the results should be interpreted with caution.
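The threshold-effect check described above can be sketched as a Spearman rank correlation between the logit of sensitivity and the logit of 1 − specificity across studies. The implementation below is a plain-Python illustration with hypothetical study values; it does not reproduce the article's data or the P value.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical studies in which sensitivity rises as specificity falls,
# the pattern expected under a pure threshold effect.
sensitivities = [0.95, 0.90, 0.85, 0.80]
specificities = [0.70, 0.80, 0.85, 0.95]

rho = spearman(
    [logit(s) for s in sensitivities],
    [logit(1 - s) for s in specificities],
)
print(f"Spearman rho = {rho:.3f}")
```

A strongly positive coefficient, as in this constructed example, would point to a threshold effect; a coefficient near zero or negative, such as the −0.136 reported here, would not.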
DISCUSSION
For patients with nasopharyngeal carcinoma after treatment, local residual or recurrent lesions are important for tumor staging, treatment options, and prognosis. Patients with residual nasopharyngeal carcinoma have a poor prognosis, and the chance of recurrence is high (36). 18F-FDG PET/CT has a growing role in the diagnosis and management of nasopharyngeal carcinoma. Fischbein et al. evaluated the ability of 18F-FDG PET to detect residual or recurrent squamous cell carcinoma of the head and neck at both primary and nodal sites (37); the respective sensitivity and specificity were 100% and 64% for residual or recurrent disease at the primary site and 93% and 77% for nodal disease. However, PET is limited in differentiating between inflammation and tumors and has low specificity in the early diagnosis of disease (15).
We searched all qualified databases, including foreign databases and comprehensive Chinese databases, to minimize bias. Additionally, 2 reviewers screened qualified literature according to strict selection criteria and QUADAS standards. Of the included studies, only a few (19,27–29,33) used biopsy as the reference standard; most used biopsy and clinical follow-up because in clinical practice a biopsy cannot be performed for each lesion. Different reference standards may be an important source of heterogeneity.
Another explanation for heterogeneity is the difference among studies in the interval between the end of radiotherapy and PET examination. In general, increased 18F-FDG uptake early after radiotherapy should be considered an inflammatory reaction to the therapy. 18F-FDG uptake decreased significantly after a few months because of mitigation of the inflammation (38,39). An inflammatory response will create a false-positive PET/CT finding and affect diagnostic performance. Rate of disease progression and inconsistency in follow-up interval can also cause bias. For example, slowly progressing disease combined with a short follow-up interval may bring about a false-negative diagnosis. Complementary information from PET, particularly if combined with CT, adds value to the interpretation of both modalities. For example, improved accuracy from the use of 18F-FDG PET/CT in patients with head and neck tumors has been reported (40). However, because our analysis suggests no significant difference among subgroups, the heterogeneity might arise from too few samples and low-quality literature.
In the current metaanalysis, the pooled sensitivity, specificity, and diagnostic odds ratio of PET/CT were 0.93 (95% CI, 0.91–0.95), 0.87 (95% CI, 0.84–0.89), and 55.31 (95% CI, 34.94–87.57), respectively, which are similar to the values in our previous article (12): 0.92 (95% CI, 0.89–0.94), 0.87 (95% CI, 0.84–0.90), and 51.10 (95% CI, 34.29–76.15), respectively. The pooled diagnostic odds ratio indicated somewhat low accuracy in the detection of residual or recurrent nasopharyngeal carcinoma. Positive and negative likelihood ratios were also calculated, with a positive likelihood ratio above 10 or a negative likelihood ratio below 0.1 indicating high accuracy. The pooled positive likelihood ratio of PET/CT was 5.52, which is not high enough to confirm local residual or recurrent nasopharyngeal carcinoma. The pooled negative likelihood ratio was 0.12, which indicates that a negative PET/CT result alone does not justify ruling out local residual or recurrent nasopharyngeal carcinoma. The Deeks funnel plot (Fig. 2) showed no significant publication bias (P > 0.05). These additional analyses make the metaanalysis more comprehensive and strengthen its value as a clinical reference.
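To make the clinical meaning of these likelihood ratios concrete, the sketch below converts a pretest probability into a posttest probability through odds, using the pooled ratios reported in the Results (5.52 and 0.12). The 30% pretest probability is a hypothetical value chosen only for illustration, and the function name is illustrative.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Bayes update on the odds scale: posttest odds = pretest odds * LR."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


LR_POSITIVE = 5.52  # pooled positive likelihood ratio reported in the Results
LR_NEGATIVE = 0.12  # pooled negative likelihood ratio reported in the Results

pretest = 0.30  # hypothetical pretest probability of residual or recurrent disease

after_positive = post_test_probability(pretest, LR_POSITIVE)
after_negative = post_test_probability(pretest, LR_NEGATIVE)

print(f"after a positive scan: {after_positive:.1%}")
print(f"after a negative scan: {after_negative:.1%}")
```

With this hypothetical pretest probability, a positive scan raises the probability of disease to about 70% and a negative scan lowers it to about 5%, so neither result alone settles the diagnosis.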
Our metaanalysis had some shortcomings. Exclusion of reviews, conference papers, and letters might have produced a publication bias. Lack of conformity in the reference standard, follow-up time, and other important variables might have affected diagnosis. Because most of the studies were retrospective, there is the potential that the interpreters knew the results of the other modalities before assessing the PET/CT results. Finally, we had only 2 subgroup analyses for PET, and we did not analyze the median time to posttreatment 18F-FDG PET and follow-up time because of the limited number of studies.
CONCLUSION
PET/CT has high sensitivity and specificity, but significant heterogeneity, in the diagnosis of local residual or recurrent nasopharyngeal carcinoma after radiotherapy and is useful for clinical treatment decisions.
DISCLOSURE
The costs of publication of this article were defrayed in part by the payment of page charges. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734. No potential conflict of interest relevant to this article was reported.
Footnotes
Published online Nov. 5, 2015.
- © 2016 by the Society of Nuclear Medicine and Molecular Imaging, Inc.
- Received for publication August 11, 2015.
- Accepted for publication October 21, 2015.