Abstract
Our aim was to evaluate the interobserver agreement in 18F-sodium fluoride (NaF) PET/CT for the detection of bone metastases in patients with prostate cancer (PCa). Methods: 18F-NaF PET/CT scans were retrieved from all patients who participated in 4 recent prospective trials. Two experienced observers independently evaluated the 18F-NaF PET/CT scans on a patient level using a 3-category scale (no bone metastases [M0], equivocal for bone metastases, and bone metastases present [M1]) and on a dichotomous scale (M0/M1). In patients with no more than 10 lesions, the location and number of lesions were recorded. On a patient level, the diagnostic performance was calculated using a sensitivity analysis, in which equivocal findings were handled first as M0 and then as M1. Results: 18F-NaF PET/CT scans from 219 patients with PCa were included, of whom 129 patients were scanned for primary staging, 67 for biochemical recurrence, and 23 for metastatic castration-resistant PCa. Agreement between the observers was almost perfect on a patient level (3-category unweighted κ = 0.83 ± 0.05, linear weighted κ = 0.90 ± 0.06, and dichotomous κ = 0.91 ± 0.07). On a lesion level (dichotomous scale), the observers agreed on the number and location of bone metastases in 205 (93.6%) patients. In the remaining 14 patients, the readers disagreed on the number of lesions in 13 patients and the location of bone metastases in 1 patient. A final diagnosis regarding the presence or absence of bone metastases was available for 211 of 219 patients. The sensitivity ranged from 0.86 to 0.92, specificity from 0.83 to 0.97, positive predictive value from 0.70 to 0.93, and negative predictive value from 0.94 to 0.96. Conclusion: The interobserver agreement on 18F-NaF PET/CT for the detection of bone metastases in patients with PCa was very high among trained observers, both on a patient level and on a lesion level. Moreover, the diagnostic performance of 18F-NaF PET/CT was satisfactory, rendering 18F-NaF PET/CT a robust tool in the diagnostic armamentarium.
Prostate cancer (PCa), the most common cancer in European men (1), has a predilection for metastasizing to the bones. A correct diagnosis of bone metastases is fundamental to determine the correct treatment for an individual patient. The European Association of Urology and the National Comprehensive Cancer Network recommend bone scintigraphy for assessment of bone metastases (2,3).
With increasing access to PET/CT, several centers have replaced bone scintigraphy with 18F-sodium fluoride (NaF) PET/CT for the assessment of bone metastases. 18F-NaF PET/CT exhibits improved pharmacokinetic properties and a favorable target-to-background ratio compared with methylene diphosphonate, which is frequently used for bone scintigraphy (4). Additionally, several studies imply improved diagnostic accuracy by 18F-NaF PET/CT compared with bone scintigraphy (5–7).
Despite favorable properties in 18F-NaF PET/CT, imaging evaluation remains a matter of subjectivity and has been labeled the weakest aspect of evaluation (8,9). We have previously shown excellent interobserver agreement among trained nuclear medicine physicians for the assessment of bone metastases by planar bone scintigraphy (10), but to the best of our knowledge, no such studies have been conducted using 18F-NaF PET/CT.
The aim of the present exploratory study was, primarily, to evaluate the interobserver agreement for 18F-NaF PET/CT findings among experienced observers in various settings of PCa and, second, to assess diagnostic accuracy for bone metastasis in patients with PCa.
MATERIALS AND METHODS
Patients
We included patients with PCa who had participated in 4 prospective studies at our department (6,11–13). All patients had undergone 18F-NaF PET/CT as part of the study-related procedures. If more than 1 18F-NaF PET/CT scan had been conducted, the first was included in the present analysis. The patients represent a wide range of disease stages, from newly diagnosed PCa and biochemical recurrence after previous curative intent treatment to metastatic castration-resistant PCa.
18F-NaF PET/CT
18F-NaF PET/CT was conducted in accordance with the guidelines from the Society of Nuclear Medicine and Molecular Imaging (14) and the European Association of Nuclear Medicine (15) as previously described (6). In short, image acquisition was performed in 3-dimensional mode from the vertex to the mid thigh 30 min after intravenous injection of approximately 200 MBq of 18F-NaF. The images were reconstructed using attenuation correction and an ordered-subset expectation maximization algorithm. Immediately after the PET acquisition, low-dose CT was conducted for use in attenuation correction and anatomic coregistration.
Masked Evaluation of 18F-NaF PET/CT Findings
Each patient’s set of 18F-NaF PET/CT images was independently interpreted by 2 board-certified nuclear medicine physicians. Both readers were very experienced (having evaluated more than 2,000 18F-NaF PET/CT scans at the start of the study). The observers were masked to all clinical information except the PCa diagnosis.
The 18F-NaF PET/CT findings were categorized on a patient level into 1 of 3 categories: no bone metastases (M0), equivocal, and bone metastases (M1). In addition, the observers had to classify the 18F-NaF PET/CT findings dichotomously as bone metastases being present or not present. For patients with 10 or fewer lesions, the observers marked metastatic and equivocal lesions on a schematic drawing of a skeleton to evaluate whether they were considering the same lesions metastatic in each patient. Patients with a metastatic superscan were included in the category with more than 10 bone metastases.
Clinical Data and Final Diagnosis of Metastatic Status
Clinical data were retrieved from the case report forms from the prospective studies. Patients were categorized as having biochemical recurrence and castration-resistant PCa on the basis of the criteria from the European Association of Urology (2,16). In most patients (211/219, 96%), a final diagnosis for the presence or absence of bone metastases was available from the initial study in which the patient participated (6,11–13). In short, in every study a final diagnosis on a patient level was achieved by combining all clinical data, biochemical data, and imaging conducted before inclusion in the study and during follow-up. This process was done by a group of experienced nuclear medicine physicians, a radiologist, and a urologist. In 2 studies, the patients underwent systematic imaging follow-up (6,11). In 1 study, the patients underwent additional MRI (diffusion-weighted MRI), and at least 2 y of follow-up was available for all patients (12). Finally, in 1 study, all patients underwent radical prostatectomy. All patients with postoperative prostate-specific antigen (PSA) values below 0.1 ng/mL for at least 6 mo after surgery were classified as being without bone metastasis at staging (13).
Statistics
The agreement between the 2 observers was assessed both on the original rating in 1 of the 3 categories and on the dichotomous classification. For the 3-category scale, we used the Cohen κ calculated as unweighted and with a linear weighting, treating the equivocal option as 0.5. The κ was categorized according to the suggestion by Landis and Koch (17). For the dichotomous scale, we calculated κ as well as the positive and negative agreement (ppos and pneg), defined as twice the number of cases for which the 2 observers agreed on a positive or negative rating divided by the sum of the number of cases for which each observer gave a positive or negative rating. Because the ppos and pneg are not simple proportions of independent observations, we calculated percentile-based confidence intervals from a bootstrapping procedure with 1,000 replications for these measures. All other confidence intervals are analytic.
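As a minimal sketch of the κ statistics described above, the following Python computes the Cohen κ from a square cross-tabulation of the 2 observers' ratings, either unweighted or with linear weights. With 3 ordered categories, linear weighting gives a one-category disagreement (for example, equivocal versus M0) a weight of 0.5. The function name and the example table are hypothetical and do not reproduce the study data; the analyses in the study itself were performed in STATA.

```python
def cohen_kappa(table, weighting="none"):
    """Cohen's kappa for a square table.

    table[i][j] is the number of patients rated category i by observer 1
    and category j by observer 2 (categories in increasing order).
    weighting: "none" for unweighted kappa, "linear" for linear weights.
    Undefined (division by zero) if expected agreement equals 1.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighting == "linear":
            return 1.0 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    # Observed and chance-expected (weighted) agreement.
    p_obs = sum(weight(i, j) * table[i][j] for i, j in cells) / n
    p_exp = sum(weight(i, j) * row_tot[i] * col_tot[j] for i, j in cells) / n**2
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical 3-category table (order: M0, equivocal, M1); NOT study data.
example = [[8, 2, 0],
           [2, 6, 2],
           [0, 2, 8]]
kappa_unweighted = cohen_kappa(example)
kappa_linear = cohen_kappa(example, weighting="linear")
print(round(kappa_unweighted, 3), round(kappa_linear, 3))  # prints 0.6 0.7
```

In this toy table all disagreements are one category apart, so the linear weighted κ is higher than the unweighted κ, the same direction of difference as reported for the study data.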
Furthermore, the observer interpretation of the 18F-NaF PET/CT was evaluated against the final diagnosis (used as the reference) by calculating the sensitivity, specificity, positive predictive value, and negative predictive value for each observer and for the consensus rating by the 2 observers. For the 3-category scale, we calculated diagnostic accuracy using sensitivity analysis, with equivocal findings being considered either M0 or M1. Statistical analyses were performed using STATA, version 15 (StataCorp LP).
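The sensitivity analysis for the 3-category scale can be sketched as follows, assuming per-patient ratings coded as "M0", "EQ" (equivocal), or "M1" and a boolean reference standard. The coding, the function name, and the toy data are hypothetical illustrations, not the study data or the authors' STATA code.

```python
import math


def diagnostic_metrics(ratings, reference, equivocal_as):
    """Patient-level accuracy of one observer against the final diagnosis.

    ratings: list of "M0", "EQ" (equivocal), or "M1", one per patient.
    reference: list of bools, True if the final diagnosis was bone metastases.
    equivocal_as: "M0" or "M1", how equivocal ratings are counted.
    """
    tp = fp = tn = fn = 0
    for rating, has_metastases in zip(ratings, reference):
        called_positive = (equivocal_as if rating == "EQ" else rating) == "M1"
        if called_positive and has_metastases:
            tp += 1
        elif called_positive:
            fp += 1
        elif has_metastases:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Return NaN instead of failing when a denominator is empty.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data for 8 patients; NOT study data.
ratings = ["M1", "M1", "EQ", "M0", "M0", "EQ", "M0", "M1"]
reference = [True, True, True, False, False, False, False, False]
as_m0 = diagnostic_metrics(ratings, reference, equivocal_as="M0")
as_m1 = diagnostic_metrics(ratings, reference, equivocal_as="M1")
```

Counting equivocal ratings as M1 can only raise or preserve sensitivity and lower or preserve specificity relative to counting them as M0, which is why the two analyses bracket the diagnostic performance.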
Approvals
This study complied with the Helsinki II Declaration. All patients provided written informed consent to participate. The studies were approved by the ethical committee (N-20130068, N-20140042, N-20140057, and N-20140080) and by the Danish Data Protection Agency.
RESULTS
Patients
In total, 219 patients were included in the analysis of interobserver agreement. The patients had participated in 4 recent prospective studies (6,11–13), and 18F-NaF PET/CT was performed as a part of the study-related procedures. 18F-NaF PET/CT was performed at primary staging in 129 patients and at the time of biochemical recurrence after previous treatment with curative intent in 67 patients. Finally, 18F-NaF PET/CT was performed on 23 patients with metastatic castration-resistant PCa (Table 1).
Observer Agreement on a Patient Level
Using the 3-category scale, crude agreement between the 2 observers was seen in 200 of 219 (91%) patients, corresponding to an unweighted κ value of 0.83 ± 0.05 (linear weighted κ was 0.90 ± 0.06). The observers agreed that 131 patients had no bone metastases, that 61 patients had bone metastases, and that the 18F-NaF PET/CT findings were equivocal in 8 patients. The observers disagreed on 19 patients (Supplemental Table 1); in 1 patient, the observers had a 2-category difference (no bone metastases vs. bone metastases), whereas in the remaining 18 patients, the difference in categorization was equivocal versus no bone metastases or equivocal versus bone metastases.
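The crude agreement follows directly from the counts reported above, as the short Python check below illustrates. The linearly weighted observed agreement is an added illustration that gives each one-category disagreement a credit of 0.5, in line with the weighting described in the Statistics section. The κ values themselves cannot be recomputed from these counts alone, because they also depend on each observer's marginal totals (Supplemental Table 1).

```python
n_patients = 219
agree_m0, agree_m1, agree_equivocal = 131, 61, 8  # reported full agreements
one_category_apart, two_categories_apart = 18, 1  # reported disagreements

n_agree = agree_m0 + agree_m1 + agree_equivocal
# The reported counts must account for every patient.
assert n_agree + one_category_apart + two_categories_apart == n_patients

crude_agreement = n_agree / n_patients
# Linear weights: 1 for agreement, 0.5 for one category apart, 0 for two.
weighted_agreement = (n_agree + 0.5 * one_category_apart) / n_patients
print(f"{crude_agreement:.3f} {weighted_agreement:.3f}")  # prints 0.913 0.954
```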
Using the dichotomous scale, the crude agreement increased. The observers agreed on the M0/M1 classification in 211 of 219 (96%) patients (κ = 0.91 ± 0.07). The ppos (M1) was 0.94, and the pneg (M0) was 0.97.
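The reported ppos and pneg can be checked with a small Python sketch under one assumption: that the 72 patients rated as having bone metastases by at least 1 observer (reported in the lesion-level section) refer to the same dichotomous ratings. The 2 × 2 cell counts then follow from the reported totals. The split of the 8 discordant patients between the observers is not reported, so κ is evaluated for every possible split. The bootstrap shown resamples patients from these reconstructed counts and is only an illustration of a percentile interval, not necessarily the authors' exact procedure; all function names are hypothetical.

```python
import random

n = 219
agree_total = 211  # reported: observers agreed on M0/M1
any_m1 = 72        # reported: M1 according to at least 1 observer (assumed dichotomous)

discordant = n - agree_total     # 8 patients rated M1 by exactly 1 observer
both_m1 = any_m1 - discordant    # 64 patients rated M1 by both
both_m0 = agree_total - both_m1  # 147 patients rated M0 by both


def p_pos(a, bc):
    # Twice the agreed positives over the sum of each observer's positives.
    return 2 * a / (2 * a + bc)


def p_neg(d, bc):
    return 2 * d / (2 * d + bc)


def kappa_2x2(a, b, c, d):
    total = a + b + c + d
    p_obs = (a + d) / total
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / total**2
    return (p_obs - p_exp) / (1 - p_exp)


ppos = p_pos(both_m1, discordant)
pneg = p_neg(both_m0, discordant)
# Kappa for every possible split of the discordant patients.
kappas = [kappa_2x2(both_m1, b, discordant - b, both_m0)
          for b in range(discordant + 1)]


def bootstrap_ci(stat, counts, reps=1000, seed=0):
    """Percentile 95% interval from resampling patients with replacement."""
    rng = random.Random(seed)
    labels = ["both_m1", "discordant", "both_m0"]
    values = []
    for _ in range(reps):
        sample = rng.choices(labels, weights=counts, k=sum(counts))
        values.append(stat(sample.count("both_m1"),
                           sample.count("discordant"),
                           sample.count("both_m0")))
    values.sort()
    return values[reps * 25 // 1000], values[reps * 975 // 1000 - 1]


ppos_ci = bootstrap_ci(lambda a, bc, d: p_pos(a, bc),
                       [both_m1, discordant, both_m0])
print(round(ppos, 2), round(pneg, 2), round(min(kappas), 2), round(max(kappas), 2))
```

Under the stated assumption, the reconstructed counts give ppos = 128/136 and pneg = 294/302, which round to the reported 0.94 and 0.97, and κ rounds to 0.91 for every possible split of the discordant patients.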
In addition, the agreement was assessed for each disease stage (Table 2). The best observer agreement was found for metastatic castration-resistant PCa (κ = 1.0), whereas the greatest disagreement was observed for biochemical recurrence (unweighted κ = 0.70, linear weighted κ = 0.79). Overall, disagreement was seen in patients with 18F-NaF uptake without corresponding changes on the low-dose CT scan.
Observer Agreement on a Lesion Level (Dichotomous Scale)
In patients with bone metastases according to at least 1 of the observers (n = 72), the number of lesions (from 1 to 10) and lesion location were compared between the observers. Thirty-nine patients had more than 10 bone metastases as determined by both observers.
Of the remaining 33 patients, complete agreement on both the number and the location of the lesions was observed in 19 patients (Fig. 1). A difference in the number of metastatic lesions was encountered in 13 patients (Fig. 2; Supplemental Fig. 1; supplemental materials are available at http://jnm.snmjournals.org), of whom 1 patient was categorized as having a metastatic superscan by 1 observer, who also indicated that similar changes may be caused by metabolic bone disease, whereas the other observer indicated benign metabolic bone disease. Finally, the observers indicated different locations of a single lesion of a total of 6 lesions considered to be bone metastases in 1 patient.
Diagnostic Characteristics of 18F-NaF PET/CT
In most patients (211/219, 96%), a final diagnosis of bone metastases present or absent was established on the basis of clinical and imaging follow-up (6,11–13). For both observers and for the consensus evaluation, the sensitivity, specificity, positive predictive value, and negative predictive value were calculated for the dichotomous evaluation and for the 3-category evaluation, with equivocal results analyzed as negative for bone metastases and then positive for bone metastases (Table 3). In general, the sensitivity ranged from 0.86 to 0.92, and the specificity ranged from 0.83 to 0.97.
DISCUSSION
To the best of our knowledge, the present study was the first to evaluate interobserver agreement in 18F-NaF PET/CT for the detection of bone metastases. Two experienced masked observers evaluated 18F-NaF PET/CT images for the presence or absence of bone metastases in a large group of patients with PCa and found almost perfect agreement in their assessment of bone metastases. The agreement was investigated across the various stages of PCa with substantial to almost perfect results.
Method evaluation comprises test–retest and observer assessments. Previous studies of 18F-NaF PET/CT have primarily focused on test–retest evaluation, particularly for quantitative parameters such as SUV (18–20); on methods for quantification of tumor burden (20); or on the inter- and intraobserver variation in the assessment of tumor burden (21). Our findings of almost perfect agreement are similar to the previously reported interobserver agreement for planar whole-body bone scintigraphy (10), for which a κ value of 0.87 was found for a dichotomous evaluation. In the study of bone scintigraphy, a 4-category scale for evaluation was available; this might explain the difference observed between the linear weighted κ values, which were 0.72 for bone scintigraphy versus 0.90 for 18F-NaF PET/CT. A similar study using the tracer 68Ga-prostate specific membrane antigen (PSMA) in PET/CT found almost perfect agreement among experienced readers for the presence of bone metastases, with κ values of 0.84 for patients with biochemical recurrence and 0.87 for patients with bone metastases at the time of staging (22). Additionally, a large study evaluating PSMA PET/CT in more than 600 patients with biochemical recurrence reported κ values of 0.78 for the assessment of bone metastases, analogous to the present findings (23).
The present study included a head-to-head comparison of the number and localization of lesions in patients with up to 10 bone lesions. In most patients, the observers agreed on both the number and the location of bone metastases. In only a minority of patients (n = 13) did the number of metastatic lesions differ between the 2 observers, and in only 1 patient did the observers consider different lesions metastatic. This is an important finding in the STAMPEDE era (Systemic Therapy in Advancing or Metastatic Prostate Cancer: Evaluation of Drug Efficacy), in which knowing the number and location of bone metastases is crucial to assigning upfront chemotherapy or abiraterone appropriately (24,25) or to offering radiotherapy in men with metastatic PCa who have a low metastatic burden (26).
Across the range of disease stages, the observer evaluation was compared with a final diagnosis of bone metastases at the patient level. The sensitivity ranged from 0.86 to 0.92, and the specificity was 0.83–0.97. The sensitivity reported in the present study was numerically larger than that reported in a recent prospective study by Löfgren et al. (7), whereas the specificity was comparable. Löfgren et al. used a 3-category scale for evaluation, whereas other studies have applied a dichotomous scale (27,28) with results comparable to our dichotomous evaluation. A review of 18F-NaF PET/CT by Wondergem et al. showed a sensitivity of 89% and specificity of 91% for the detection of bone metastases on a patient level (29); these numbers are in line with our data. However, a few studies have reported moderate specificity for 18F-NaF PET/CT due to uptake in benign degenerative and inflammatory lesions. Furthermore, the present diagnostic accuracy for 18F-NaF PET/CT was comparable to similar studies using PSMA PET/CT for bone metastases (23,30).
The use of a 3-category scale provides for the equivocal option and may thus resemble everyday clinical practice more than the dichotomous option does. We found that the proportion of equivocal findings by consensus was considerably lower in the present study of 18F-NaF PET/CT than that observed in a similar study using planar bone scintigraphy, in which approximately 1 in 4 patients had equivocal findings (31). The proportion of equivocal findings on 18F-NaF PET/CT resembles previously published findings in studies that added SPECT/CT to planar bone scintigraphy (6,32,33), as well as the reported proportion of equivocal findings in previous diagnostic test accuracy studies on 18F-NaF PET/CT (5,7). In contrast, the proportion of equivocal 18F-NaF PET/CT findings in the present study was much lower than that stated by the U.S. National PET Registry study, which reported equivocal 18F-NaF PET/CT results in 15% of the patients (34).
To ensure the independence and masking of the observers, they were not involved in recruiting or managing patients in any way. The observers had extensive experience with 18F-NaF PET/CT from having served at a high-volume center for many years. The high level of agreement might in part be due to their vast experience, and the experience level may render the results less generalizable to readers with less experience and to centers with a low volume of 18F-NaF PET/CT scans. To the best of our knowledge, no studies have investigated the impact of reader experience on agreement or accuracy with 18F-NaF PET/CT. The number of 18F-NaF PET/CT scans one would have to read to reach this expert assessment level remains unknown; in PSMA PET/CT this number is considered to be around 300 (22). Whether the number could be lower for 18F-NaF PET/CT if the observer has extensive experience in bone scintigraphy with SPECT/CT remains undetermined.
According to the U.S. National PET Registry study, 18F-NaF PET/CT changed management in approximately 40% of the patients, and this percentage was reduced to 12%–16% in patients for whom other imaging modalities were planned (34). In a study by Gauthé et al., 18F-NaF PET/CT changed management in 7% of patients at initial staging (35). However, neither the improved diagnostic properties of 18F-NaF PET/CT nor the 18F-NaF PET/CT-induced changes in patient management have yet been shown to improve patient-related outcomes. We recently showed that in patients with negative bone scintigraphy results who underwent radical prostatectomy, 18F-NaF PET/CT did not add prognostic value in terms of the outcome after radical prostatectomy (13). There is a lack of studies showing that 18F-NaF PET/CT improves patient-related outcomes, which might be one of the reasons why 18F-NaF PET/CT is not recommended by the international guidelines for assessment of bone metastases.
CONCLUSION
The present study demonstrated almost perfect agreement between 2 observers in using 18F-NaF PET/CT images to evaluate bone metastases in PCa. Likewise, the 18F-NaF PET/CT agreement on a lesion level was substantial, and the diagnostic accuracy was satisfactory, rendering 18F-NaF PET/CT a robust tool for the assessment of bone metastases in PCa. Future studies on 18F-NaF PET/CT should evaluate whether its advantageous properties translate into improved patient-related outcomes.
DISCLOSURE
Study N-20140042 was supported by an unrestricted grant from the Danielsen Foundation. Studies N-20130068, N-20140057, and N-20140080 were supported by unrestricted grants from the Obel Family Foundation. Studies N-20130068 and N-20140057 were supported by the Danish Medical Research Grant (the Højmosegård Grant) and the Heinrich Kopps Grant. No other potential conflict of interest relevant to this article was reported.
KEY POINTS
QUESTION: What is the level of interobserver agreement on 18F-NaF PET/CT for the detection of bone metastases in patients with PCa?
PERTINENT FINDINGS: Two experienced observers independently evaluated 18F-NaF PET/CT scans from 4 prospective trials for the detection of bone metastases in 219 patients with PCa across primary staging, biochemical recurrence, and metastatic castration-resistant PCa. Excellent agreement was seen both on a patient level and on a lesion level. Furthermore, satisfactory diagnostic accuracy was seen when the findings were compared with a final diagnosis.
IMPLICATIONS FOR PATIENT CARE: 18F-NaF PET/CT has excellent interobserver agreement and is a robust tool for the detection of bone metastases in patients with PCa.
Acknowledgments
The data were presented in part at the 2017 annual meeting of the European Association of Nuclear Medicine (abstract EP-0712).
Footnotes
Received for publication June 19, 2019.
Accepted for publication August 5, 2019.
Published online Sep. 3, 2019.
© 2020 by the Society of Nuclear Medicine and Molecular Imaging.
REFERENCES