Clinical Investigations
1 Tufts-New England Medical Center Evidence-Based Practice Center, Division of Clinical Care Research, Department of Medicine, Tufts-New England Medical Center, Boston, Massachusetts
2 Clinical Trials and Evidence-Based Medicine Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
| ABSTRACT |
|---|
Key Words: 18F-FDG PET; meta-analysis; soft-tissue sarcoma
| INTRODUCTION |
|---|
| MATERIALS AND METHODS |
|---|
We conducted MEDLINE and EMBASE searches (last update, February 2002) using various search terms for PET (PET, positron emission tomography, 18F-FDG, and fluorodeoxyglucose) and sarcoma and limited the search to "human subjects." Searches were also conducted using names of specific histologic types of STS (liposarcoma, malignant fibrous histiocytoma, leiomyosarcoma, fibrosarcoma, malignant schwannoma, synovial sarcoma, and peripheral nerve sheath tumor). We also perused the references of retrieved articles to find additional studies and communicated with expert investigators for additional data and clarifications. We set no language restrictions.
Data Extraction
We extracted information on authors, year of publication, age, number of subjects, benign and malignant lesions evaluated, inclusion and exclusion criteria, number of primary lesions and of lesions assessed for recurrence, study design (prospective, retrospective, or unclear), histologic types, technical characteristics of PET, diagnostic parameters considered, definition and interpretation of the reference test (biopsy and type thereof, operative histologic diagnosis, other imaging, clinical, other, and unspecified), and potential for verification bias. Verification bias refers to incomplete confirmation of the results of the test under investigation with the reference test (e.g., no biopsy performed when PET suggests benign lesion).
For each report, we recorded the number of true positives, false positives, true negatives, and false negatives for 18F-FDG PET in diagnosing malignant versus benign lesions using the following prespecified parameters: (a) qualitative visualization (simple visualization, qualitative interpretation by experts, or assessment based on a tumor-to-background ratio [TBR] ≥ 3.0 without correction for dose of 18F-FDG, weight, and plasma glucose); (b) standard uptake value (SUV) ≥ 2.0; (c) SUV ≥ 3.0; and (d) metabolic rate of glucose (MRG) ≥ 6 µmol/100 g/min. We also separated evaluations for primary lesions from evaluations for potential recurrences. Furthermore, whenever information was provided on tumor grade, we recorded the number of lesions that were positive by PET (based on each of the above definitions) for intermediate/high-grade (G II/III) and low-grade (G I) tumors. Benign lesions were separated into noninflammatory and inflammatory ones.
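As an illustration of how a quantitative cutoff turns lesion-level measurements into the 2 × 2 counts described above, the following minimal Python sketch may help. It assumes that a lesion counts as PET-positive when its value is at or above the cutoff; the function names and the SUV values in the demonstration are hypothetical and are not data from any included study.

```python
def confusion_counts(values, is_malignant, cutoff):
    """Return (tp, fp, tn, fn) when a lesion is called positive if value >= cutoff.

    Assumption: "positive" means at or above the cutoff.
    """
    tp = fp = tn = fn = 0
    for value, malignant in zip(values, is_malignant):
        positive = value >= cutoff
        if positive and malignant:
            tp += 1
        elif positive and not malignant:
            fp += 1
        elif not positive and not malignant:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical SUV measurements and reference-standard diagnoses.
    suv = [1.2, 2.0, 3.5, 4.1, 1.8, 2.6]
    malignant = [False, True, True, True, True, False]
    for cutoff in (2.0, 3.0):
        counts = confusion_counts(suv, malignant, cutoff)
        print(cutoff, counts, sensitivity_specificity(*counts))
```

Raising the cutoff in this toy example trades sensitivity for specificity, which is the threshold effect that the statistical analysis has to account for.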
For reports that had also used CT scans or MRI, we evaluated the performance of each test in diagnosing primary disease, local recurrence, and metastases. Data were compared on the same patients when 2 imaging procedures had been performed in parallel. For reports describing longitudinal evaluation of therapeutic response with serial 18F-FDG PET, the baseline (pretherapy) data were used in the quantitative synthesis. In addition, we extracted descriptive information regarding the impact of 18F-FDG PET on patient management.
Statistical Analysis
Data were combined quantitatively to provide summary information across studies for each of the 4 prespecified diagnostic definitions. We estimated the overall number of true positives, false negatives, true negatives, and false positives, and we estimated the overall sensitivities and specificities using a random-effects model that incorporated between-study variability. We also performed data synthesis using the summary receiver operating characteristic (SROC) approach that takes into consideration the interdependence between sensitivity and specificity (5,6). Combining sensitivity and specificity data independently across studies may underestimate both parameters and provides no information about the effect of diagnostic threshold. However, the overall random-effects estimates fall close to the SROC curve and can provide a useful indicator of where most investigations operated. The SROC curve shows how the true-positive rate (sensitivity) changes as a function of false-positive rate (1 - specificity) across all studies, when the same diagnostic criterion is used to classify cases as benign or malignant. It is estimated by the equation D = a + bS, where D is the difference of the logit of the true-positive rate and the logit of the false-positive rate and S is the respective sum. Both unweighted and weighted regressions were evaluated. When sufficient data are available, the area under the curve (AUC) can also be estimated. For all analyses, data on primary lesions were treated separately from data on evaluation of recurrences whenever feasible. The main analyses combined all data. Subgroup analyses were also performed for each subgroup (primary and recurrent). We also report typical pairs of sensitivity and specificity from these curves, taking as reference value for specificity the one estimated by random-effects weighting of specificities across the included studies.
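The SROC regression D = a + bS can be sketched in a few lines of Python. This is only an illustration of the unweighted fit, not the Meta-Test implementation used for the analyses. The function names, the 0.5 continuity correction (a common convention for avoiding zero cells, not stated in the text), and the study counts in the test are assumptions. Solving D = a + bS for the logit of the true-positive rate gives logit(TPR) = [a + (1 + b) logit(FPR)] / (1 - b), which is how a sensitivity is read off the curve at a reference specificity.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sroc_fit(studies, correction=0.5):
    """Unweighted least-squares fit of D = a + b*S.

    studies: list of (tp, fp, tn, fn) tuples, one per study.
    correction: value added to each cell (assumed convention, not from the text).
    """
    d_values, s_values = [], []
    for tp, fp, tn, fn in studies:
        tpr = (tp + correction) / (tp + fn + 2 * correction)
        fpr = (fp + correction) / (fp + tn + 2 * correction)
        d_values.append(logit(tpr) - logit(fpr))  # D: difference of logits
        s_values.append(logit(tpr) + logit(fpr))  # S: sum of logits
    n = len(studies)
    s_mean = sum(s_values) / n
    d_mean = sum(d_values) / n
    sxx = sum((s - s_mean) ** 2 for s in s_values)
    if sxx == 0:
        raise ValueError("S is identical in all studies; slope is undefined")
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_values, d_values))
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b


def sroc_sensitivity(a, b, specificity):
    """Sensitivity on the fitted SROC curve at a given specificity (requires b != 1)."""
    logit_fpr = logit(1.0 - specificity)
    logit_tpr = (a + (1.0 + b) * logit_fpr) / (1.0 - b)
    return 1.0 / (1.0 + math.exp(-logit_tpr))
```

When b is close to 0, the diagnostic odds ratio exp(a) is roughly constant across studies and the curve is symmetric; a nonzero b indicates that accuracy varies with the diagnostic threshold.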
For the analysis of grading, we calculated the percentage of subjects with each type/grade of lesion when 18F-FDG PET was positive by each of the prespecified diagnostic criteria. Exact 95% confidence intervals (CIs) are also provided. Differences in pooled proportions were tested by Fisher exact test. Analyses were conducted in SPSS (SPSS Inc.), Meta-Test (Joseph Lau), and StatXact 3.0 (Cytel Inc.).
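For readers who wish to reproduce an exact binomial CI for a proportion, a minimal Python sketch follows. It assumes that "exact" refers to the Clopper-Pearson method, which the text does not name explicitly, and it solves for the limits by bisection on the binomial distribution rather than calling any statistical package. The function names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing, steps=100):
    """Find p in [0, 1] with func(p) == target for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, level=0.95):
    """Clopper-Pearson interval for x events in n trials (assumed method)."""
    alpha = 1.0 - level
    # Lower limit: P(X >= x | p) = alpha/2, which increases with p.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: 1.0 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper limit: P(X <= x | p) = alpha/2, which decreases with p.
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


if __name__ == "__main__":
    low, high = exact_ci(3, 24)
    print(f"3/24: {100 * low:.1f}% to {100 * high:.1f}%")
```

The interval is asymmetric around the observed proportion when counts are small, which is why exact limits are preferred over normal-approximation limits for subgroups of this size.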
| RESULTS |
|---|
Diagnoses
We retrieved specific histologic diagnoses on malignant tumors from all studies with one exception (22). Among 208 malignant tumors with recorded histology, the most common were liposarcoma and variants (n = 56), malignant fibrous histiocytoma (n = 43), leiomyosarcoma (n = 21), fibrosarcoma (including myxofibrosarcoma) (n = 13), malignant schwannoma (n = 13), synovial sarcoma (n = 12), and peripheral nerve sheath tumor (n = 11). These types accounted for over 80% of STS. Of note, soft-tissue malignancies also included 1 case of non-Hodgkin lymphoma and 1 case of metastatic nasopharyngeal carcinoma.
Specific diagnoses were available for all 214 benign lesions. The most common diagnoses were postsurgical or posttraumatic lesions (typically scars, n = 113), lipoma (n = 23), neurofibroma (n = 21), hemangioma (n = 13), and schwannoma (n = 12). Overall, there were 199 benign noninflammatory lesions and 15 benign lesions with acute or chronic inflammation (infectious or noninfectious).
18F-FDG PET and Reference Test Characteristics
A variable amount of radiopharmaceutical was used across studies (148–407 MBq). Imaging was typically performed after fasting from a few hours to overnight. With one exception (21), all studies attempted some qualitative interpretation of the PET images, based mostly on various nonquantitative criteria with consensus between radiologists, and in 3 studies, a crude TBR was also used. SUV was estimated in 9 studies, all of which had primary lesions, and 3 also included an evaluation of recurrences (total of 12 case series). MRG was estimated in 5 studies (3 of primary lesions, 2 evaluations of recurrences). Histology was the typical reference standard for all lesions, but there were some notable exceptions. In 3 studies (19,20,22), most or all of the patients with benign-appearing lesions did not have histologic confirmation, and diagnosis depended only on clinical and radiologic criteria. This represents considerable verification bias. A small number of patients with benign lesions did not have histologic confirmation in a fourth study (14) (Table 2). Typically there was no clear mention about whether 18F-FDG PET evaluations were performed without knowledge of histologic results.
In the 8 case series in which tumor grading was assessed and SUV was estimated, values above 2.0 were seen in 59/66 (89.4% [95% CI, 79.4%–95.6%]) of intermediate/high-grade malignant lesions, 8/24 (33.3% [95% CI, 15.6%–55.3%]) of low-grade malignant lesions, and 13/68 (19.1% [95% CI, 10.6%–30.5%]) of benign lesions. SUV values above 3.0 were seen in 45/66 (68.2% [95% CI, 55.6%–79.1%]), 3/24 (12.5% [95% CI, 2.7%–32.4%]), and 8/68 (11.8% [95% CI, 5.2%–21.9%]), respectively. With either cutoff, there was no significant difference between the low-grade malignant lesions and the benign lesions, whereas intermediate/high-grade malignant lesions differed significantly from both other groups. Inferences were similar when data on benign lesions were included from studies without tumor grading (data not shown).
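The between-group comparisons above rest on the Fisher exact test. A minimal two-sided version is sketched below in Python, using only the counts reported in this paragraph for the SUV 2.0 cutoff. The function names are hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is an assumption about how the test was applied in StatXact.

```python
import math


def hypergeom_pmf(a, row1, row2, col1):
    """Probability of a positives in row 1, with all table margins fixed."""
    return (math.comb(row1, a) * math.comb(row2, col1 - a)
            / math.comb(row1 + row2, col1))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = hypergeom_pmf(a, row1, row2, col1)
    total = 0.0
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = hypergeom_pmf(k, row1, row2, col1)
        # Sum every table that is no more probable than the observed one.
        if p <= p_observed * (1 + 1e-9):
            total += p
    return min(total, 1.0)


if __name__ == "__main__":
    # Rows: lesion group; columns: SUV positive, SUV negative at the 2.0 cutoff.
    print("G II/III vs. G I:", fisher_exact_two_sided(59, 7, 8, 16))
    print("G I vs. benign:", fisher_exact_two_sided(8, 16, 13, 55))
```

With these counts, the first comparison yields a very small P value and the second does not reach the conventional 0.05 level, in line with the inferences stated in the text.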
MRG data were limited, but the inferences were similar. Values ≥ 6.0 µmol/100 g/min were seen in 32/35 (91.4% [95% CI, 76.9%–98.2%]) G II/III tumors, 1/13 (7.7% [95% CI, 0.2%–36.0%]) G I tumors, and 6/24 (25.0% [95% CI, 9.8%–46.7%]) benign lesions (all noninflammatory).
Comparison with MRI and CT
Several studies on primary lesions state that the inclusion criteria required a prior CT, MRI, or ultrasound examination suggestive of malignancy. Because MRI/CT and 18F-FDG PET were performed in series, rather than in parallel, one cannot address their comparative accuracy for primary soft-tissue lesions. The same applies to studies on evaluation of recurrence, with 2 exceptions. In one study (22), 18F-FDG PET sensitivity was 13/17 (76%) and specificity was 47/50 (94%) versus 15/17 (88%) and 48/50 (96%), respectively, for paired MRI. 18F-FDG PET was wrong in 3 cases in which MRI was correct. The study may have substantial verification bias. In the other study (15), PET had 92% sensitivity (12/13) and 100% specificity (2/2). Paired MRI failed to diagnose 3 malignant lesions (including the one also misdiagnosed as benign by PET) and misdiagnosed both benign lesions (scar and Ascaris mass). PET was correct in 4 cases in which MRI was wrong.
Data on the comparative performance of 18F-FDG PET for the diagnosis of distant metastasis were also sparse. In 1 study (22), PET was positive by qualitative interpretation in 13/15 lung metastases (sensitivity, 87%) and was negative in all 55 cases without metastasis (specificity, 100%). Paired CT scans had 100% sensitivity and 96% specificity. PET failed to detect 2 lung metastases seen on CT but was correct in 2 cases in which CT was falsely positive. The study suffers from substantial verification bias for patients with negative imaging. Inconclusive retrospective paired CT and PET data were also given by 1 of the excluded studies (7) on 8 patients and were also subject to verification bias.
Longitudinal Evaluation of Treated STS
18F-FDG PET was used for the evaluation of response to therapy in 3 case series with at least 4 subjects involving radiotherapy hyperthermia (n = 4); hyperthermic isolated limb perfusion with tumor necrosis factor-α, interferon-γ, and melphalan (n = 20); and various modalities (n = 8), for a total of only 32 patients (24,18,7). PET showed clear changes in the treated tumors but did not discriminate partial from complete response in the largest study (18).
| DISCUSSION |
|---|
2.0, and the large majority has MRG ≥ 6 µmol/100 g/min. Most low-grade tumors are also diagnosed on qualitative interpretation, but they usually have SUV < 2.0 and rarely have MRG ≥ 6 µmol/100 g/min. Neither SUV nor MRG can differentiate low-grade tumors from benign lesions, and inflammatory lesions have quantitative characteristics similar to those of high-grade malignancies. There were no major differences in the diagnostic performance when variable diagnostic parameters were considered, including qualitative visualization, single SUV measurements, or dynamic estimation of MRG. Paradoxically, the estimated diagnostic performance was at its worst with the more sophisticated MRG. The differences could easily be caused by chance. However, one might have expected the opposite trend, with worse diagnostic performance for subjective parameters. It is possible that the diagnostic performance of the qualitative evaluations was spuriously inflated if the evaluators were aware of the clinical and histologic information of each subject. In fact, although we standardized the cutoffs for SUV and MRG in the meta-analysis, the qualitative evaluations used a variety of rules across studies. With the exception of studies using TBR, these rules were largely subjective and depended on consensus between radiologists. Post hoc modulation of diagnostic thresholds may also spuriously inflate the estimated performance of a diagnostic test. Furthermore, it is unknown whether some publication bias may be operating against the publication of studies that may have found less promising results (30). Finally, there was a large scatter of sensitivity and specificity values with the qualitative rules used. Simple qualitative interpretation is unlikely to offer a generalized standard for routinely interpreting 18F-FDG PET scans across different centers.
SUV and MRG are more objective measures in this regard. For SUV, a cutoff of 2.0 was probably preferable to a cutoff of 3.0 in adequately discriminating between benign and malignant lesions. Nevertheless, even the 2.0 cutoff will give false-positive results for almost one fifth of the benign lesions, whereas it will miss two thirds of low-grade tumors. MRG had an even worse performance and, overall, was not a good discriminating parameter, although data were very limited. Although we had to exclude 1 MRG study with nonseparable data on bone and soft-tissue lesions (10,11), the composite data seem similar to what we observed in the remaining studies. Similarly, neither SUV nor MRG is perfect in assigning tumor grade to a malignant lesion. We should acknowledge, however, that tumor grade in STS may sometimes be difficult to establish even with histologic examination. Thus, even a perfect imaging test may spuriously seem to have less than perfect concordance with histologic readings. By measuring the metabolic activity of a lesion, it is possible that 18F-FDG PET values may offer additional long-term prognostic information besides histology. However, no long-term prognostic analyses have been performed with 18F-FDG PET on STS to date.
There are no good-quality data on the comparative diagnostic performance of 18F-FDG PET against CT scans or MRI for the diagnosis of primary soft-tissue lesions, because all studies to date have used these imaging tests in series, not in parallel. There is limited evidence suggesting an approximately equivalent diagnostic performance of 18F-FDG PET and MRI for assessing local recurrence and even more limited evidence suggesting approximately equivalent performance of PET and CT scan in searching for distant metastatic disease. The one study of adequate sample size pertains to lung metastatic disease. 18F-FDG PET has been considered to be a useful technique for the evaluation of lung nodules in general (31), and it is conceivable that this may apply also in the evaluation of lung lesions in patients with STS. There are very limited data on the usefulness of PET in assessing the response of STS to therapy. One study (16) suggests that tyrosine PET may be superior to 18F-FDG PET for assessing response to therapy, but the data are equally limited. Thus, based on the current evidence, it is unclear whether PET can offer any advantage over traditional imaging modalities. There is considerable enthusiasm about the ability of 18F-FDG PET to assess the response of various tumors to therapy (1,32), but more data are needed for STS.
By assembling a large number of subjects, the meta-analysis has managed to estimate the diagnostic accuracy of 18F-FDG PET in STS diagnosis and grading, decreasing the uncertainty inherent in isolated case series. 18F-FDG PET seems to be a promising technology, and its diagnostic performance is very good. The ability to gain insight into the metabolic parameters of a lesion with 18F-FDG PET has generated extensive enthusiasm (1,2,32). However, despite the experimental enthusiasm and the wide field of potential applications (1,2), routine use needs to be scrutinized carefully (33). The incremental value of 18F-FDG PET against other imaging modalities and its proper role in the management and prognostic assessment of STS remain largely unknown.
| CONCLUSION |
|---|
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
For correspondence or reprints contact: Joseph Lau, MD, Division of Clinical Care Research, Tufts-New England Medical Center, 750 Washington St., Boston, MA 02111.
E-mail: JLau1{at}tufts-nemc.org
| REFERENCES |
|---|