Abstract
Benign and malignant pulmonary lesions usually are differentiated by 18F-FDG PET with a semiquantitative 18F-FDG standardized uptake value (SUV) of 2.5. However, the frequency of malignancies with an SUV of <2.5 is significant, and pulmonary nodules with low 18F-FDG uptake often present diagnostic challenges.
Methods: Among 360 consecutive patients who underwent 18F-FDG PET to evaluate pulmonary nodules found on CT, we retrospectively analyzed 43 who had solid pulmonary lesions (excluding lesions with ground-glass opacity, infiltration, or benign calcification) with an SUV of <2.5. The uptake of 18F-FDG was graded by a visual method (absent, faint, moderate, or intense) and 2 semiquantitative methods (SUV and contrast ratio [CR]). Final classification was based on histopathologic findings or at least 6 mo of clinical follow-up.
Results: We found 16 malignant (diameter, 8–32 mm) and 27 benign (7–36 mm) lesions. When faint visual uptake was the cutoff for positive 18F-FDG PET results, the receiver-operating-characteristic (ROC) analysis correctly identified all 16 malignancies and yielded false-positive results for 10 of 27 benign lesions. Sensitivity was 100%, specificity was 63%, and the positive and negative predictive values were 62% and 100%, respectively. When an SUV of 1.59 was the cutoff for positive 18F-FDG PET results, the ROC analysis revealed 81% sensitivity, 85% specificity, and positive and negative predictive values of 77% and 89%, respectively. At a cutoff for positive 18F-FDG PET results of a CR of 0.29, the ROC analysis revealed 75% sensitivity, 82% specificity, and positive and negative predictive values of 71% and 85%, respectively. The areas under the curve in ROC analyses did not differ significantly among the 3 analyses (visual, 0.84; SUV, 0.81; and CR, 0.82). Analyses of intra- and interobserver variabilities indicated that visual and SUV analyses were quite reproducible, whereas CR analysis was poorly reproducible.
Conclusion: These results suggest that for solid pulmonary lesions with low 18F-FDG uptake, semiquantitative approaches do not improve the accuracy of 18F-FDG PET over that obtained with visual analysis. For pulmonary lesions with visually absent uptake, the probability of malignancy is very low. In contrast, the probability of malignancy in any visually evident lesion is about 60%.
PET has been used widely with 18F-FDG to differentiate malignant from benign pulmonary lesions. The intensity of 18F-FDG uptake by malignant tumors is influenced by various factors, including biologic nature and lesion size. Relatively large, rapidly growing, and metabolically active lesions are usually obvious on 18F-FDG PET. In contrast, slowly growing, well-differentiated, or small lesions exhibit little or no accumulation (1). A ground-glass appearance on CT typically represents bronchioloalveolar carcinoma with either negative or very low 18F-FDG uptake (2). However, 18F-FDG uptake in solid malignant nodules also can be absent or low, thus presenting diagnostic challenges.
A standardized uptake value (SUV) of 2.5 generally has been used as a cutoff value for diagnosing pulmonary malignancies with 18F-FDG PET (3). However, 1 study indicated that the sensitivity of this SUV cutoff was lower than that of visual assessment (4). Some authors have recommended using visual evaluation rather than the SUV for small solitary pulmonary nodules (5), suggesting that the classical SUV criterion of 2.5 is inappropriate for diagnosing malignancies with low 18F-FDG uptake (4). Another study also indicated that the contrast ratio (CR), an index of relative tracer uptake of lesions versus background lung activity, is superior to the SUV for differentiating pulmonary malignancies (6). Thus, a reliable analytic method for discriminating lung lesions with low 18F-FDG uptake has not been established. Likewise, a relationship between the intensity of 18F-FDG uptake and a diagnosis of solid pulmonary lesions with low 18F-FDG uptake has not been confirmed.
In the present study, we examined the characteristics of solid nodules or masses with an 18F-FDG SUV of <2.5 and the diagnostic ability of 18F-FDG PET to differentiate benign from malignant lung diseases both visually and semiquantitatively.
MATERIALS AND METHODS
Patients
Patients were identified from a retrospective review of the PET center database at Tokyo Women's Medical University. Among 360 individuals who had a solitary pulmonary lesion and who presented between May 2003 and March 2005, 43 (27 men; age [mean ± SD], 65 ± 11 y; range, 38–85 y) fulfilled the following conditions for inclusion in the present study: chest CT scan available, solid nodules or masses seen on CT (excluding lesions with ground-glass appearance, infiltration, or typical benign calcification), a lesion 18F-FDG SUV of <2.5, and a definitive diagnosis (benign or malignant) determined by pathologic analysis or at least 6 mo of follow-up by chest radiograph or CT. A lesion that disappeared within the 6-mo follow-up period was classified as benign. We used SUVs from the original PET scan reports to select the study participants because we have routinely reported SUVs for visually detectable lesions as measured by the method used in this study. Two experienced radiologists independently reviewed all CT scans and measured maximum lesion diameters. Any disagreement was resolved by consensus. Prior malignancy and diabetes were not exclusion criteria, although patients with a fasting blood sugar concentration higher than 200 mg/dL at the time of 18F-FDG PET were excluded.
In addition, we analyzed a selection of studies with original SUVs of <3.0 to verify whether the optimal cutoff values would differ from those obtained in studies with SUVs of <2.5.
18F-FDG PET
Patients fasted for at least 5 h before receiving an intravenous injection of 18F-FDG (3.7 MBq/kg of body weight). Approximately 60 min later, PET was undertaken by use of a dedicated full-ring lutetium oxyorthosilicate (LSO) scanner (ECAT ACCEL; Siemens) with a transaxial spatial resolution of 6.3 mm at full width at half maximum. Attenuation was corrected by standard transmission scanning with 68Ge sources. Transmission scans were acquired for 1 min and emission scans were acquired for 2 min per bed position in the 3-dimensional mode from the skull base to the midthigh level. Images were reconstructed by use of ordered-subset expectation maximization (OSEM) with 2 iterations and 8 subsets, a 128 × 128 matrix, and postsmoothing with a gaussian filter.
Data Analysis
The 18F-FDG PET scans were analyzed visually and semiquantitatively by 2 independent observers, who also performed the CT examinations and who were also unaware of the definitive diagnosis. Lesions on CT images were localized at the time of PET analysis. The intensity of 18F-FDG uptake by pulmonary lesions relative to the background activity in the uninvolved adjacent lung parenchyma and the mediastinum was assessed visually, and the intensity was scored with a 4-point scale (absent, faint, moderate, or intense) modified from a previously reported method (7) as follows: absent, not visible on the image display; faint, less intense than mediastinal blood-pool activity; moderate, equal in intensity to mediastinal blood-pool activity; and intense, more intense than mediastinal blood-pool activity. Scans were analyzed semiquantitatively by use of the SUV and the CR (6) as indices of 18F-FDG uptake. Spheric regions of interest (ROIs) were placed over lesions visible on PET images, on simultaneously displayed axial, coronal, and sagittal tomograms. The ROIs of lesions that were invisible on PET images were located by use of the corresponding CT images. The highest activity within an ROI was measured, and the SUV was determined as the highest activity concentration per injected dose per body weight (kg) after correction for radioactive decay. The CR was determined by measuring the highest activity in the tumor ROI (T) and in the contralateral normal lung ROI (N) and was calculated as (T – N)/(T + N) for each lesion (6).
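The two semiquantitative indices described above can be illustrated with a minimal Python sketch. It is not the software used in the study. The function names, the units (kBq/mL for activity, MBq for dose), the assumption that 1 mL of tissue weighs 1 g, and the choice to decay the injected dose to scan time are all illustrative assumptions.

```python
import math

# Physical half-life of 18F in minutes.
F18_HALF_LIFE_MIN = 109.77


def decay_corrected_dose(injected_mbq, minutes_elapsed,
                         half_life_min=F18_HALF_LIFE_MIN):
    """Injected activity (MBq) remaining at scan time."""
    return injected_mbq * 0.5 ** (minutes_elapsed / half_life_min)


def suv_max(max_activity_kbq_per_ml, injected_mbq, weight_kg, minutes_elapsed):
    """Body-weight SUV from the highest activity in the ROI.

    Assumes 1 mL of tissue weighs 1 g, so kBq/mL equals kBq/g.
    """
    dose_kbq = decay_corrected_dose(injected_mbq, minutes_elapsed) * 1000.0
    weight_g = weight_kg * 1000.0
    return max_activity_kbq_per_ml / (dose_kbq / weight_g)


def contrast_ratio(t, n):
    """CR = (T - N) / (T + N) for lesion (T) and contralateral lung (N)."""
    return (t - n) / (t + n)


def lesion_to_background(cr):
    """Invert the CR definition: T/N = (1 + CR) / (1 - CR)."""
    return (1.0 + cr) / (1.0 - cr)
```

Under this definition, a CR of 0.29 corresponds to a lesion-to-background ratio of about 1.8, because T/N = (1 + CR)/(1 − CR).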
Statistical Analysis
Data are expressed as mean ± SD. Three datasets for the same lesion from 2 readers were averaged, and the mean values were used for further analyses. Receiver-operating-characteristic (ROC) curves for visual scores, the SUV, and the CR were derived and evaluated by comparing the areas under the curves. The sensitivity, specificity, and positive and negative predictive values of the 3 analyses were determined at the optimal cutoff values by use of the ROC curves. Unpaired t tests were used to examine normally distributed continuous variables, and χ2 analyses were used to assess differences in frequencies. The intra- and interobserver variabilities of each method were determined by use of the Cohen κ-statistic for visual scores and the coefficient of variation (CV) for the SUV and the CR. The CV was calculated by dividing the SD by the mean of the 2 repeated measurements, and the root-mean-square values of these CVs represented the overall intra- and interobserver variabilities. A P value of <0.05 was considered significant.
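The variability measure described above (the CV of each pair of repeated measurements, summarized as a root-mean-square value) can be sketched as follows. The function names are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption, because the text does not say which SD was used.

```python
import math
import statistics


def pair_cv(a, b):
    """CV of 2 repeated measurements: sample SD divided by their mean."""
    return statistics.stdev([a, b]) / statistics.mean([a, b])


def rms_cv(pairs):
    """Root-mean-square of the per-lesion CVs across all measurement pairs."""
    cvs = [pair_cv(a, b) for a, b in pairs]
    return math.sqrt(sum(cv * cv for cv in cvs) / len(cvs))
```

Because each CV divides by the mean of the pair, an index whose values lie close to zero may produce very large CVs even when the absolute difference between the 2 measurements is small.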
RESULTS
Malignant and Benign Lesions
Table 1 summarizes the clinical, 18F-FDG PET, and histologic findings for the 43 lesions. Sixteen (37%) were malignant, and 27 were benign. The maximum diameters did not differ significantly between malignant and benign lesions (15 ± 6 mm, with a range of 8–32 mm, and 15 ± 8 mm, with a range of 7–36 mm, respectively). The prevalence of small lesions (≤10 mm) also did not differ significantly between malignant and benign lesions (18.8% and 29.6%, respectively). All 16 malignancies were histologically confirmed primary lung cancers (13 adenocarcinomas, 1 squamous cell carcinoma, 1 small cell carcinoma, and 1 large cell carcinoma). The adenocarcinomas were classified as either well differentiated (n = 8) or moderately differentiated (n = 5), and the maximum diameter ranged from 10 to 32 mm (16.5 ± 5.4 mm). Seven of the 27 benign lesions also were confirmed by histologic analysis (2 hamartomas and 5 tuberculomas). For the remaining 20 benign lesions, the median duration of clinical follow-up for lesions that disappeared or decreased in size (n = 10) was 15.5 mo (range, 3–21 mo), and for those with no change (n = 10), this duration was 16 mo (range, 6–24 mo).
Diagnostic Performance of Visual and Semiquantitative Analyses
The frequency distributions of visual uptake scores for benign and malignant lesions are shown in a histogram (Fig. 1). The ROC analysis revealed that when faint uptake on visual assessment was taken as the cutoff for positive 18F-FDG PET results, the visual inspection yielded 100% sensitivity, 63% specificity, and positive and negative predictive values of 62% and 100%, respectively. At this threshold, visual scores correctly identified 33 of 43 lesions (77%) with low 18F-FDG uptake (Table 2).
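The figures for the visual cutoff follow from the 2 × 2 counts reported in the abstract: all 16 malignancies were detected, and 10 of 27 benign lesions were false-positive. The short Python sketch below reproduces them. The function name is hypothetical, and the code is only an arithmetic check, not part of the original analysis.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2 x 2 diagnostic indices as fractions between 0 and 1."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Visual cutoff of "faint": 16 true-positives, 10 false-positives,
# 17 true-negatives (27 benign - 10), and no false-negatives.
visual = diagnostic_metrics(tp=16, fp=10, tn=17, fn=0)
for name, value in visual.items():
    print(f"{name}: {value:.0%}")
```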
The median SUVs were 1.69 (range, 0.97–2.29) for malignant lesions and 1.31 (range, 0.65–2.31) for benign lesions. When an SUV of 1.59 was used as the cutoff for positive 18F-FDG PET results, the ROC analysis showed 81% sensitivity, 85% specificity, and positive and negative predictive values of 77% and 89%, respectively. At this threshold, the SUV correctly identified 36 of 43 lesions (84%) with low 18F-FDG uptake (Table 2).
The median CRs were 0.36 (range, 0.09–0.49) for malignant lesions and 0.17 (range, −0.18 to 0.42) for benign lesions. At a cutoff for positive 18F-FDG PET results of a CR of 0.29, the ROC analysis showed 75% sensitivity, 82% specificity, and positive and negative predictive values of 71% and 85%, respectively. At this threshold, the CR correctly identified 34 of 43 lesions (79%) with low 18F-FDG uptake (Table 2).
The areas under the ROC curves, which represent overall diagnostic performance, did not differ significantly among the 3 analytic methods (visual, 0.84; SUV, 0.81; and CR, 0.82) (Fig. 2).
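For readers unfamiliar with the area under the ROC curve, the sketch below computes it as the probability that a randomly chosen malignant lesion scores higher than a randomly chosen benign lesion, counting ties as one half. This equals the trapezoidal area under the empirical ROC curve. The text does not state how the areas were computed, so this is one common approach rather than the authors' procedure. The function name and the numeric coding of visual scores (absent = 0, faint = 1, moderate = 2, intense = 3) are assumptions.

```python
def roc_auc(malignant_scores, benign_scores):
    """Empirical AUC from pairwise comparisons; ties count as one half."""
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))
```

The tie handling matters for the 4-point visual scale, where many lesions share the same score. The same function applies unchanged to continuous indices such as the SUV and the CR.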
A reanalysis of 49 studies (18 malignant and 31 benign) with original SUVs of <3.0 showed that the optimal cutoff values for visual scores, the SUV, and the CR were faint, 1.59, and 0.29, respectively, values that were identical to those of studies with SUVs of <2.5.
Intra- and Interobserver Reproducibilities
The Cohen κ-statistics for intra- and interobserver reproducibilities were 0.62 and 0.65, respectively, indicating good agreement for visual scores of pulmonary lesions with low 18F-FDG uptake. The root-mean-square values of the CVs for the intraobserver variability were 11% and 221% for the SUV and the CR, respectively, and those for the interobserver variability were 20% and 142%, respectively. These findings indicated that the SUV was quite reproducible whereas the CR was poorly reproducible with respect to the semiquantitative assessment of pulmonary lesions with low 18F-FDG uptake.
DISCUSSION
One major finding of the present study was that visual and semiquantitative (SUV and CR) assessments can differentiate malignant from benign pulmonary lesions equally, a finding that is consistent with those of previous reports (3,4,8,9). The present study reconfirmed this fact for lesions with low 18F-FDG uptake (SUVs of <2.5). Another key finding was that a solid pulmonary lesion with visually absent or very low tracer activity (SUVs of <1.59 or CRs of <0.29) on 18F-FDG PET images had a low probability (0%–15%) of malignancy. In contrast, the probability of malignancy was likely to be moderate (62%–77%) when tracer uptake was at least faintly visible or moderate (SUVs of ≥1.59 or CRs of ≥0.29).
Traditionally, lung lesions have been evaluated visually by comparing the intensity of uptake in a lesion with normal mediastinal activity. That is, if the intensity of uptake is lower than that in the mediastinum, then the lesion is suspected to be benign (9). However, our findings suggested that any lesion visually detectable on 18F-FDG PET images should be considered carefully for the possibility of malignancy; other clinical (age, smoking history, or other cancer) and radiologic (spiculation, lesion location, and size) factors that influence the likelihood of malignancy also should be taken into consideration (10).
Among the various factors that influence the visibility and uptake of malignant tumors on 18F-FDG PET images, tissue differentiation of tumors is important (1). Most malignant pulmonary nodules with SUVs of <2.5 are differentiated adenocarcinomas (4). The present study also showed that 81% of the malignant tumors were determined histologically to be differentiated adenocarcinomas. Lesion size is also an important factor (11). The contrast between a tumor and normal lung decreases as the size of the lesion decreases and may even disappear (12) because of partial-volume averaging effects attributable to the limited resolution of a PET scanner (13). The detection of nodules measuring less than 15 mm in diameter is slightly less sensitive than that for lesions larger than 15 mm (14). Correction of the SUV on the basis of lesion size determined from axial CT images may help to improve sensitivity without degrading specificity compared with the use of conventional SUV measurements (11). However, other studies have indicated that the recovery coefficient (measured activity in a lesion divided by true activity) depends not only on lesion size but also on object geometry (15,16). Thus, whether a simple correction of the SUV on the basis of lesion size actually can improve semiquantitative discrimination between benign and malignant pulmonary nodules with low 18F-FDG uptake remains to be determined.
Other important factors affecting the visibility of target lesions include scintillator type, image reconstruction methodology, and image noise. We used an LSO-based PET scanner, a 3-dimensional acquisition mode, and an OSEM reconstruction method after the administration of 18F-FDG at 3.7 MBq/kg. Compared with the dose used for a conventional bismuth germanate (BGO) scanner, the dose administered in the present study seems to be rather low. However, compared with a BGO scintillator, an LSO scintillator has a similar attenuation length but 4 times the light output and a decay time 7 times shorter. The coincidence time resolution of the scanner used in the present study (ECAT ACCEL) is substantially narrower (6 ns) than that of BGO-based scanners (10–12 ns). All of these factors are likely to improve the visibility of target lesions on 18F-FDG PET images, because they improve scanner performance and reduce image noise even with a limited administered dose.
One study has indicated that the CR is more sensitive than the SUV in diagnosing faintly positive pulmonary nodules when the classical SUV criterion of 2.5 is applied (4). However, the ROC curve analysis in the present study showed that these methods were identical in terms of overall diagnostic performance. Furthermore, the present study also indicated that the inter- and intraobserver reproducibilities of CR measurements were quite poor. Factors that determine the reproducibility of CR measurements include the maximal SUVs of lesions and the contralateral pulmonary background (6). Because the lesion SUV was quite reproducible in the present study, key reasons for the poor CR reproducibility must have been related to the inconsistent pulmonary background SUV. Indeed, regional pulmonary SUVs differ significantly depending on the sampling location in the lung (17). In addition, the poor CR reproducibility might have been associated with the SDs of SUV measurements of the normal lung. When the tumor SUV was low and within the ranges of the present study, the value overlapped the reconstruction noise in the normal lung. Thus, the CR does not seem to have any advantage over the SUV in diagnosing malignant pulmonary nodules with low 18F-FDG uptake.
The present study has some limitations. Because semiquantitative analysis was performed with a PET-only scanner, the ROI location corresponding to the lesion site could not be determined from PET images alone when the lesion was invisible. Thus, we selected a nearby location by using corresponding CT slices; this method would have produced some inaccuracies in SUV and CR measurements. This problem can be resolved by using a PET/CT scanner, because the ROI location can be determined easily by use of fused PET and CT images even when 18F-FDG uptake in lesions is negative. Another limitation may be that not only the SUV but also the visibility of target lesions is dependent on image reconstruction methodology and image noise (18). For example, PET/CT apparently improves the contrast-to-noise ratio of images over that of PET alone because of the noise reduction achieved with CT-based attenuation correction rather than 68Ge-based attenuation correction. CT-based attenuation correction produces a significantly higher SUV than attenuation correction based on germanium (19). Thus, the optimal cutoff thresholds for visual scores, the SUV, and the CR for differentiating benign from malignant lesions should be determined individually depending on the detector type (BGO, LSO, or gadolinium oxyorthosilicate), the reconstruction method (filtered backprojection or OSEM), and the scanner (PET or PET/CT). Finally, the present study was a retrospective analysis, and we did not use dual-time-point imaging, which can be potentially valuable in distinguishing benign from malignant lung nodules. One study has indicated that 18F-FDG activity in cancerous lesions increases whereas that in benign lung nodules remains relatively stable over time (20). Dual-time-point imaging may help to improve the accuracy of characterizing lung nodules with 18F-FDG SUVs of less than 2.5 at the initial scan but requires further investigation.
CONCLUSION
Our results suggest that the abilities of visual and semiquantitative methods to identify malignancies in solid pulmonary lesions with low 18F-FDG uptake are equal. The probability of malignancy for pulmonary lesions with visually absent uptake is very low. In contrast, the probability of any visually obvious lesion being malignant is about 60%.
Footnotes
* Contributed equally to this work.
- Received for publication October 22, 2005.
- Accepted for publication December 15, 2005.