Abstract
18F-FDG PET is the most accurate noninvasive modality for staging mediastinal lymph nodes in lung cancer. Besides visual image interpretation, some institutions use standardized uptake value (SUV) measurements in lymph nodes. An SUV of 2.5 is most often used as the cutoff, but this choice has never been derived from dedicated studies. Receiver operating characteristic (ROC) analyses demonstrated that SUV thresholds of more than 4 resulted in the highest accuracy, but such high cutoffs imply high false-negative rates (FNRs). The aim of our evaluation was to determine an optimal SUV threshold and to compare its diagnostic performance with the results of visual interpretation.
Methods: This retrospective study included 95 patients with suspected lung cancer who underwent mediastinoscopy/mediastinal lymphadenectomy after 18F-FDG PET (90–150 min after 250 MBq of 18F-FDG). Maximum SUV was measured in 371 lymph node regions that were subsequently biopsied, and the same regions were interpreted visually using a 6-level score (− − − through + + +). Diagnostic performance was assessed by ROC analysis. FNR and false-positive rate (FPR), the sum of both error rates (FNR + FPR), and diagnostic accuracy were plotted against a hypothetical SUV threshold to determine the optimum SUV threshold.
Results: SUVs in metastatic lymph nodes were higher (mean ± SD, 7.1 ± 4.5; range, 1.4–26.9; n = 70) than in tumor-free lymph node stations (2.4 ± 1.7; range, 0.6–14.9; n = 301; P < 0.01). Inflammatory lymph nodes exhibited slightly increased SUVs (2.7 ± 2.0; range, 0.8–14.9; n = 146). The plot of error rates featured a minimum of the sum FNR + FPR at an SUV of 2.5. With increasing SUV threshold, the FPR decreased most prominently up to that value, whereas the FNR rose continuously. The highest diagnostic accuracy was achieved with an SUV of 4.5. The areas under the ROC curves indicated that visual interpretation tended to be more accurate than SUV quantification, without reaching significance (visual, 0.930 ± 0.022; SUV, 0.899 ± 0.025; P = 0.241). Using an SUV of 2.5 as the threshold, the resulting sensitivity, specificity, and negative predictive value were 89%, 84%, and 96%, respectively.
Conclusion: For mediastinal staging, the choice of an SUV of 2.5 as the threshold is justified because FNR + FPR is minimized. The resulting high negative predictive value of 96% allows the omission of mediastinoscopy in patients with negative mediastinal findings on 18F-FDG PET images. For the experienced observer, visual analysis should be relied on primarily, with calculation of the SUV used, at most, as a secondary aid. For the less experienced observer, the SUV may be of greater value.
The prognosis of non–small cell lung cancer is determined mainly by the presence of mediastinal lymph node metastases indicating an advanced stage of disease. Thus, mediastinal staging is essential to determine the treatment strategy (1). Mediastinoscopy remains the gold standard for the evaluation of mediastinal lymph node involvement and exhibits a sensitivity of about 80% with a specificity of, by definition, 100% (2,3). Mediastinoscopy entails a low but real risk (0.5%) of life-threatening complications such as hemorrhage, mediastinitis, pneumothorax, or vocal cord paresis (4,5). Thus, there is a need for less invasive procedures. Even though specialized centers have reported promising results for minimally invasive mediastinal staging by endoscopic ultrasound-guided fine-needle aspiration, this method has not yet become established in clinical routine (6).
Contrast-enhanced CT is broadly used for noninvasive staging of non–small cell lung cancer but lacks sufficient accuracy to detect or exclude lymph node metastases with high reliability before surgical exploration (7). The diagnostic performance of CT in mediastinal nodal staging is limited because of the use of size criteria to differentiate between benign and malignant lesions. Histopathologic studies showed that 21% of the metastases are in normal-sized lymph nodes (8), whereas no malignancy is found in 40% of the enlarged lymph nodes, especially in patients with poststenotic pneumonia (9). The use of MRI provides no improvement over CT (10), even with new superparamagnetic contrast agents (11).
Unlike CT, PET using 18F-FDG allows for the functional characterization of tissues. Tumor cells, especially those of non–small cell lung cancer, exhibit increased glucose metabolism. Thus, 18F-FDG PET is able to visualize not only the primary tumor but also its metastases (12). Several studies investigated the diagnostic performance of 18F-FDG PET in mediastinal lymph node staging, and their results have been summarized in meta-analyses (13–16). 18F-FDG PET is the most accurate noninvasive modality for staging mediastinal lymph nodes in lung cancer. 18F-FDG PET outperforms CT because the former detects metastases even in normal-sized lymph nodes (17). Thus, 18F-FDG PET has found its way into the guidelines (18) and is increasingly used in the diagnostic work-up of lung cancer.
Besides visual image interpretation, some institutions use standardized uptake value (SUV) measurements in lymph nodes. An SUV of 2.5 is most often used as the cutoff, but this choice has never been derived from dedicated studies. Receiver operating characteristic (ROC) analyses demonstrated that thresholds of 4.4 or 5.3 resulted in the highest accuracy (19,20). However, high SUV thresholds imply considerable false-negative rates (FNRs). With such thresholds, 18F-FDG PET may lose one of its clinically most important features: its high negative predictive value, which allows the omission of invasive surgical staging in patients with an 18F-FDG PET–negative mediastinum (18). Altogether, it remains unclear whether the use of an SUV cutoff improves mediastinal staging in comparison with visual analysis.
The aim here was to determine an optimum SUV threshold for mediastinal lymph node staging in patients with non–small cell lung cancer and to compare the diagnostic performance of semiquantitative SUV analysis with the results of visual interpretation of 18F-FDG PET images.
MATERIALS AND METHODS
Patients
This retrospective analysis is based on the records of patients with suspected non–small cell lung cancer who were referred to our institution between March 1997 and November 2002. Patients were eligible if they had undergone 18F-FDG PET with SUV quantification and, within the next 6 wk, mediastinoscopy (mean ± SD, 13 ± 9 d). The clinical indication for mediastinoscopy was suspected or proven lung cancer and radiologic suspicion of mediastinal N2 or N3 lymph node disease. A total of 95 patients met these criteria: 75 men and 20 women with a mean age of 62 ± 9 y.
18F-FDG PET
After overnight fasting, the blood glucose concentration was verified to be below 160 mg/dL. The patients received 250 ± 20 MBq of 18F-FDG by intravenous injection. Before injection, additional transmission scans for correction of attenuation were acquired using rotating 68Ge/68Ga rod sources (before February 1998) or 137Cs point sources (since February 1998). PET from neck to hips using an ECAT ART scanner (Siemens Medical Solutions) was started 90 min after injection and lasted 60 min. Images were iteratively reconstructed using attenuation-weighted ordered-subset expectation maximization with 8 subsets and 2 iterations (21). Plane separation in the reconstructed images was 5.15 mm, with an in-plane resolution of 6.5 mm in full width at half maximum.
Image Analysis
Unaware of the results of CT, interpreters visually scored the 18F-FDG PET images using 6 levels ranging from − − − (clearly negative) to + + + (clearly positive) by comparison of the intensity in a lesion with that of the mediastinal blood pool. Lymph node stations were assigned according to the classification of Mountain and Dresler (1).
SUVs were calculated as the regional radioactivity concentration divided by the injected amount of radioactivity normalized to body weight (22). The peak SUV in all lymph node stations sampled was measured with a region-of-interest technique. If a lymph node was visible on the PET image by increased activity, a 1.5-cm-diameter region was positioned around that node and the peak value measured. In patients with no apparent localized increase of radioactivity, the region of interest was positioned in the typical area for the lymph node station.
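The body-weight-normalized SUV described above can be sketched as follows. This is an illustrative calculation only, not the scanner software's implementation: the function name and units are hypothetical, and the sketch assumes that the tissue concentration is decay-corrected to the time of injection and that tissue density is 1 g/mL.

```python
def suv_body_weight(tissue_kbq_per_ml: float,
                    injected_mbq: float,
                    body_weight_kg: float) -> float:
    """Body-weight-normalized SUV (dimensionless under the 1 g/mL assumption).

    SUV = regional concentration / (injected activity / body weight)
    """
    if injected_mbq <= 0 or body_weight_kg <= 0:
        raise ValueError("injected activity and body weight must be positive")
    injected_kbq = injected_mbq * 1000.0   # MBq -> kBq
    weight_g = body_weight_kg * 1000.0     # kg -> g (taken as mL at 1 g/mL)
    return tissue_kbq_per_ml / (injected_kbq / weight_g)


if __name__ == "__main__":
    # Hypothetical values: 8 kBq/mL in the region, 250 MBq injected, 75-kg patient.
    print(round(suv_body_weight(8.0, 250.0, 75.0), 2))
```

With these hypothetical inputs the SUV is 2.4, which would fall just below a cutoff of 2.5.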
Surgical Staging
Mediastinoscopy was performed on all patients using the standard procedure (23) for radiographically enlarged lymph nodes. The lymph node stations biopsied were 1R, 1L, 2R, 2L, 4R, 4L, and 7. In addition to undergoing mediastinoscopy, 22 of these patients subsequently underwent thoracotomy. The histopathologic results of mediastinal lymph node stations 1R, 1L, 2R, 2L, 4R, 4L, and 7 harvested by systematic lymph node dissection during thoracotomy were also used for correlation with PET results. The presence of normal lymph node tissue, inflammation, or metastatic involvement was noted for all lymph nodes sampled.
Statistical Analysis
Positive PET findings without histologic proof of malignancy were defined as false-positive; negative PET findings in the presence of positive histologic findings were regarded as false-negative. The diagnostic test performance of 18F-FDG PET was compared with the results of surgical lymph node staging by calculating sensitivity, specificity, positive predictive value, negative predictive value, and accuracy.
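For readers who want to reproduce such test parameters from a 2 × 2 contingency table, a minimal sketch follows. The function name is hypothetical, and the counts in the example are invented for illustration; they are not taken from the tables of this study.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard test parameters from the four cells of a 2 x 2 table.

    tp/fn: histologically positive stations called positive/negative on PET.
    fp/tn: histologically negative stations called positive/negative on PET.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    for name, value in diagnostic_metrics(tp=45, fn=5, fp=20, tn=130).items():
        print(f"{name}: {value:.3f}")
```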
SUVs are reported as mean ± SD. Differences in SUV between lymph node stations with and without metastases were analyzed using the Student t test. The diagnostic performances of visual image interpretation and of semiquantitative analysis using SUV measurements were compared using ROC curves (24). FNRs (defined as 1 − sensitivity, or the number of false-negative findings divided by the number of metastatic lymph node stations) and false-positive rates (FPRs, defined as 1 − specificity, or the number of false-positive findings divided by the number of lymph node stations free of malignancy), the sum of both error rates (FNR + FPR), and diagnostic accuracy were plotted against a hypothetical SUV threshold to define an optimal SUV cutoff.
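The threshold sweep described above can be sketched as follows. The analysis in this study was performed with SPSS, so this is only an illustration of the idea. The function names and the toy SUV lists are invented, and the rule that a station counts as positive when its SUV is at or above the threshold is an assumption.

```python
from typing import List, Sequence, Tuple

# (threshold, FNR, FPR, FNR + FPR, accuracy)
Row = Tuple[float, float, float, float, float]


def error_rates(malignant: Sequence[float], benign: Sequence[float],
                threshold: float) -> Tuple[float, float, float, float]:
    """Error rates when stations with SUV >= threshold are called positive."""
    fn = sum(1 for s in malignant if s < threshold)   # missed metastases
    fp = sum(1 for s in benign if s >= threshold)     # benign called positive
    fnr = fn / len(malignant)
    fpr = fp / len(benign)
    total = len(malignant) + len(benign)
    accuracy = (total - fn - fp) / total
    return fnr, fpr, fnr + fpr, accuracy


def sweep_thresholds(malignant: Sequence[float], benign: Sequence[float],
                     thresholds: Sequence[float]) -> List[Row]:
    """Evaluate every candidate threshold; one row per threshold."""
    return [(t, *error_rates(malignant, benign, t)) for t in thresholds]


# Invented toy data (not study data): few malignant, many benign stations.
TOY_MALIGNANT = [1.8, 2.7, 3.8, 5.0, 7.0]
TOY_BENIGN = [0.9, 1.0, 1.2, 1.5, 1.9, 2.0, 2.1, 2.2, 2.4, 2.8, 3.2, 3.6, 3.7]
TOY_THRESHOLDS = [2.0, 2.5, 3.0, 3.5, 4.0]

if __name__ == "__main__":
    rows = sweep_thresholds(TOY_MALIGNANT, TOY_BENIGN, TOY_THRESHOLDS)
    for t, fnr, fpr, total, acc in rows:
        print(f"SUV {t:.1f}: FNR={fnr:.2f} FPR={fpr:.2f} sum={total:.2f} acc={acc:.2f}")
    print("min FNR+FPR at", min(rows, key=lambda r: r[3])[0])
    print("max accuracy at", max(rows, key=lambda r: r[4])[0])
```

On this toy data the sum of error rates is smallest at 2.5, whereas accuracy is highest at 4.0. When benign stations outnumber malignant ones, the two criteria can favor different cutoffs.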
The statistical calculations were performed with the software package SPSS, version 13.0 (SPSS Inc.). P values of less than 0.05 were regarded as significant.
RESULTS
Clinical Data
In 84 of 95 patients, thoracic malignancies were proven histologically. Eighty patients had lung cancer (39 with squamous cell carcinoma, 31 with adenocarcinoma, 4 with unspecified non–small cell lung cancer, 1 with carcinosarcoma of the lung, and 5 with small cell lung cancer). The remaining patients had malignant mesothelioma (n = 2), malignant lymphoma (n = 1), or malignant fibrous histiocytoma (n = 1). Eleven patients had benign lesions (4 with tuberculosis, 4 with inflammatory residuals after pneumonia, 2 with silicosis, and 1 with sarcoidosis).
Metastatic involvement of mediastinal lymph nodes was present in 23 of 80 patients with lung cancer (prevalence, 29%). Figure 1 illustrates the characteristic findings of 18F-FDG PET in different stages of lymph node involvement in 4 patients with lung cancer. During mediastinoscopy, 371 lymph node regions were biopsied. In the subpopulation of 80 patients with proven lung cancer, 311 lymph node stations were assessed. Biopsies from 44 lymph node stations from mediastinal dissection obtained at subsequent thoracotomy in 27 patients were available for comparison. For 4 patients, mediastinal metastases were found during surgery in lymph node stations that were negative at the previous mediastinoscopy. The prevalence of metastases in the mediastinal lymph nodes of lung cancer patients was 23% (70/311).
FIGURE 1. 18F-FDG PET in lymph node staging of lung cancer: unaffected lymph nodes (N0), peribronchial or hilar lymph node involvement (N1), ipsilateral mediastinal and subcarinal involvement (N2), and contralateral and supraclavicular involvement (N3).
18F-FDG Uptake into Mediastinal Lymph Nodes
SUVs were higher in metastatic lymph node stations (7.1 ± 4.5; range, 1.4–26.9; n = 70) than in tumor-free lymph nodes (2.4 ± 1.7; range, 0.6–14.9; n = 301; P < 0.01). Tumor-free lymph nodes with inflammatory changes exhibited slightly increased uptake of 18F-FDG (SUV, 2.7 ± 2.0; range, 0.8–14.9; n = 146). Figure 2 shows the mean SUV peaks in mediastinal lymph nodes for benign and malignant lesions and in relation to the results of visual interpretation. The error bars demonstrate overlap between groups.
FIGURE 2. Comparison of maximum SUV in mediastinal lymph nodes. FN = false-negative; TN = true-negative; FP = false-positive; TP = true-positive.
Diagnostic Performance Using Visual Interpretation of 18F-FDG PET
The visually interpreted 18F-FDG PET findings were compared with the surgical specimen findings for all patients and for the subset of patients with lung cancer using 2 × 2 contingency tables and are shown in Table 1. The corresponding diagnostic test parameters are shown in Table 2.
TABLE 1. Results of Characterization of Mediastinal Lymph Nodes by Visual Interpretation of 18F-FDG PET and Correlation with Pathologic Diagnosis
TABLE 2. Diagnostic Performance of 18F-FDG PET in Assessment of Mediastinal Lymph Nodes of Lung Cancer
Diagnostic Performance Using SUV 2.5 as Cutoff Value
The 18F-FDG PET findings with an SUV cutoff of 2.5 were compared with the surgical specimen findings for all patients and for the subset of patients with lung cancer using 2 × 2 contingency tables and are shown in Table 3.
TABLE 3. Results of Characterization of Mediastinal Lymph Nodes by SUV Analysis of 18F-FDG PET and Correlation with Pathologic Diagnosis
ROC Curves for Comparison of Visual Interpretation and SUV Analysis
ROC analyses were performed to compare the results of the visually interpreted 18F-FDG PET findings and the findings obtained with the use of an SUV threshold. Figure 3 illustrates ROC curves obtained from the data of patients with lung cancer. The area under the ROC curve was 0.930 ± 0.022 for visual interpretation and 0.899 ± 0.025 for SUV quantification. The area under the ROC curve for visual interpretation tended toward higher values, without reaching significance (P = 0.241). The same tendency was observed in the complete patient cohort, which also included patients without malignant disease or with thoracic neoplasms other than lung cancer (visual analysis, 0.915 ± 0.200; SUV quantification, 0.881 ± 0.027; P = 0.216).
FIGURE 3. ROC curves for semiquantitative SUV analysis and visual interpretation of 18F-FDG PET images for mediastinal lymph node staging in patients with lung cancer. Intensity of lesion was visually compared with that of mediastinal blood pool and scored from + + + to − − −.
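The area under an empirical ROC curve can be obtained for both a continuous measure (SUV) and an ordinal one (the 6-level visual score) with the rank-based estimate sketched below. This is not the SPSS procedure used in this study and it yields neither the standard errors nor the P values reported above. The function name and the example scores are hypothetical, as is the coding of the visual score as integers 1 (− − −) through 6 (+ + +).

```python
from typing import Sequence


def auc_rank(scores_positive: Sequence[float],
             scores_negative: Sequence[float]) -> float:
    """Probability that a random metastatic station scores higher than a
    random tumor-free station, counting ties as one half.

    This equals the area under the empirical ROC curve.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


if __name__ == "__main__":
    # Hypothetical visual scores coded 1..6 (not study data).
    visual_malignant = [6, 5, 5, 4, 2]
    visual_benign = [1, 1, 2, 2, 3, 4, 1, 2]
    print(round(auc_rank(visual_malignant, visual_benign), 3))
```

Because ties contribute one half, the same function handles the coarse visual scale and the continuous SUV scale without modification.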
Determination of an Optimal SUV Threshold
Because the shoulder of the ROC curve was only slightly curved, no self-evident SUV cutoff could be derived. To identify an SUV threshold for the differentiation of tumor-free lymph nodes from malignant lymph nodes, we plotted the error rates of positive and negative interpretations against the SUV cutoff applied (Fig. 4). The graph of the error rates featured a minimal FNR + FPR at an SUV of 2.5. With increasing SUV thresholds, the FPR decreased most prominently up to that value, whereas the FNR rose continuously. The highest diagnostic accuracy was achieved at an SUV of 4.5. The diagnostic test parameters resulting from the use of an SUV of 2.5 as the threshold are given in Table 2.
FIGURE 4. Error rates and diagnostic accuracy of lymph node characterization by 18F-FDG PET as function of SUV threshold applied.
DISCUSSION
The data presented here agree with meta-analyses of mediastinal staging showing sensitivities of 87%–90% and specificities of about 85% for 18F-FDG PET (13,16). Most studies on the detection of mediastinal lymph node metastases by 18F-FDG PET included fewer than 50 patients. Because of the large patient cohort investigated here, a detailed analysis of SUV quantification was possible. Lymph node metastases showed elevated 18F-FDG uptake and increased SUVs. Because of the broad range of SUVs in false-positive lymph nodes, SUV analysis failed to be more accurate than visual interpretation in predicting the presence of mediastinal metastases. Moreover, visual interpretation of 18F-FDG PET images seems to exceed the diagnostic test performance of a threshold SUV of 2.5.
Characterization of mediastinal lymph nodes using semiquantitative analysis of 18F-FDG SUVs was assessed previously (19,20). Vansteenkiste et al. stated that “the best SUV threshold to distinguish benign from malignant lymph nodes was 4.40” (19). Bryant et al. reported that accuracy was maximized at an SUV of 5.3 (20). Their results are in line with ours. Figure 4 shows that overall accuracy reached levels of above 80% with an SUV of more than 3 and exhibited only small changes for higher SUVs. In our series, accuracy was maximized at an SUV of 4.5. Because of the plateaulike shape of the graph, there may be uncertainty in the determination of the optimum value. This uncertainty may explain the variation in the cutoff values (4.4 and 5.3) identified by the other investigators. In any case, the use of such a high SUV threshold implies a considerable number of false-negative results, specifically 27% in our series. Considering this high FNR, it seems evident that the SUV threshold of 3.5 applied by Yi et al. was the cause of the low sensitivity (only 44%) they found for the detection of mediastinal lymph node metastases (25).
Here, we used the plot of error rates versus SUVs to expose the weakness of choosing a threshold by maximizing accuracy. Clinically, we need an FNR as small as possible and, to a lesser extent, a high overall accuracy. This need can be met by a lower threshold, with a reasonable limit being an SUV of 2.5. But even with this cutoff, malignant lymph nodes may be missed because of only slight uptake in small metastases or because of the effects of limited spatial resolution. The resulting false-negative findings can be avoided with visual interpretation by an experienced nuclear medicine physician. This result can also be seen in the ROC curve (Fig. 3), where the graph for visual interpretation lies above that for SUV analysis, demonstrating higher sensitivity for the human observer.
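The reason the two criteria favor different cutoffs can be made explicit with a short derivation. The symbol π for the prevalence of metastatic stations is introduced here for illustration, and the numeric weights assume that the prevalence of 70/311 in the lung cancer subgroup applies.

```latex
\begin{align}
\text{accuracy} &= \pi\,(1-\mathrm{FNR}) + (1-\pi)\,(1-\mathrm{FPR}) \\
                &= 1 - \bigl[\pi\,\mathrm{FNR} + (1-\pi)\,\mathrm{FPR}\bigr] \\
                &\approx 1 - \bigl[0.23\,\mathrm{FNR} + 0.77\,\mathrm{FPR}\bigr]
                 \qquad \text{for } \pi = 70/311 .
\end{align}
% By contrast, minimizing the unweighted sum is equivalent to maximizing
% the Youden index J:
\begin{equation}
J = \text{sensitivity} + \text{specificity} - 1 = 1 - (\mathrm{FNR} + \mathrm{FPR}).
\end{equation}
```

Under this assumption, maximizing accuracy penalizes a false-positive rate roughly 3 times as heavily as a false-negative rate, which pushes the optimum toward higher thresholds. Minimizing FNR + FPR weights both error rates equally.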
Even though other study groups have used an SUV threshold of 2.5 for mediastinal staging of lung cancer (11,26,27), until now there has been no evidence to justify this approach. In fact, the use of an SUV of 2.5 as the cutoff was a questionable generalization of results obtained from the evaluation of lung lesions (28,29). The present study provides the validation that has been lacking.
Positive and negative predictive values are important for the clinical management of patients. The high negative predictive value for mediastinal staging by 18F-FDG PET implies that preoperative invasive nodal staging may be omitted if the mediastinum is negative for 18F-FDG uptake. In the patients analyzed here, the 18F-FDG PET examination outperformed even the sensitivity of mediastinoscopy. Despite the high sensitivity of mediastinoscopy in our series (66/70, or 94%), compared with that reported in the literature (80% (30)), mediastinal nodal involvement was not proven until thoracotomy in 4 patients.
On the other hand, inflammatory changes in lymph nodes caused false-positive findings with 18F-FDG PET and led to a lower positive predictive value. Thus, confirmation of 18F-FDG PET–positive lymph nodes by invasive means is required. In the subgroup of patients with proven lung cancer, the positive predictive value was higher than in the unselected population.
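Predictive values depend on the prevalence of metastases as well as on sensitivity and specificity, which is why they differ between the lung cancer subgroup and the unselected population. A minimal sketch of this dependence follows. The function name is hypothetical, and the example assumes that the reported sensitivity (89%) and specificity (84%) for an SUV cutoff of 2.5 apply at the station-level prevalence of the lung cancer subgroup (70/311).

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a given prevalence, using Bayes' rule."""
    tp = sensitivity * prevalence                  # true-positive fraction
    fn = (1.0 - sensitivity) * prevalence          # false-negative fraction
    tn = specificity * (1.0 - prevalence)          # true-negative fraction
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


if __name__ == "__main__":
    ppv, npv = predictive_values(0.89, 0.84, 70 / 311)
    print(f"PPV={ppv:.2f} NPV={npv:.2f}")
```

Under these assumptions the negative predictive value comes out at roughly 0.96 and the positive predictive value at roughly 0.62. Lowering the prevalence raises the former and lowers the latter.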
We included patients with proven lung cancer as well as patients in whom lung cancer was finally excluded and inflammatory diseases or thoracic neoplasms other than lung cancer were confirmed. This broad patient spectrum is the population referred for 18F-FDG PET evaluation of the mediastinum in clinical routine. We here present the results for the whole patient population and the subgroup with lung cancer. Thus, our results should be relevant to daily practice.
The present investigation had some potential limitations that may affect the general applicability of our results. If histopathologic findings are compared with 18F-FDG PET findings, all accessible lymph node stations should be covered by the biopsy. This population consisted exclusively of patients with at least one enlarged mediastinal lymph node on CT. This factor may be influential, because most of these patients have abnormal lymph nodes. The large number of lymph node stations sampled per patient (mean, 3.9) and the high number (301) of tumor-free lymph nodes appear sufficient to exclude this theoretical bias.
Another potential reason for deviations between the results from the separate studies is the use of different PET scanners, acquisition protocols, and image reconstructions, which may affect the reproducibility of SUVs. The test–retest variability of SUV measurements in lung cancer is about 10% (31). The main precondition of reproducible SUV measurements is a uniform protocol for patient preparation and imaging. 18F-FDG is increasingly taken up into lung cancer for at least 2 h, with ascending SUVs within that time (32). A constant distribution time for the radiopharmaceutical is a necessary prerequisite to minimize SUV variations. The PET protocol defines the acquisition times and the reconstruction algorithm with the parameters applied and by this means affects spatial image resolution and hence the recovered amount of radioactivity (33). Our results are valid for the imaging equipment and evaluation methods we used and may vary for different PET scanners and reconstructions. Furthermore, 18F-FDG uptake into lymph node metastases differs between various tumor entities (34). Thus, an SUV of 2.5 cannot be generally applied as the threshold for the characterization of lymph nodes.
CONCLUSION
To our knowledge, the present study is the first to show that a threshold SUV of 2.5 for differentiating benign from metastatic lymph nodes is a feasible choice for mediastinal staging because FNR + FPR is minimized. The resulting high negative predictive value of 96% allows the omission of mediastinoscopy in patients with an 18F-FDG PET–negative mediastinum. For the experienced observer, visual analysis should be relied on primarily, with calculation of the SUV used, at most, as a secondary aid. For the less experienced observer, the SUV may be of greater value.
Acknowledgments
We thank the technologists of the Department of Nuclear Medicine at the Saarland University Medical Center for their valuable technical assistance.
Footnotes
- Received for publication June 28, 2007.
- Accepted for publication August 8, 2007.
- COPYRIGHT © 2007 by the Society of Nuclear Medicine, Inc.