Abstract
Clinical 123I-2-β-carbomethoxy-3β-(4-iodophenyl)-N-(3-fluoropropyl)nortropane (123I-FP-CIT) SPECT studies are commonly performed and reported using visual evaluation of tracer binding, an inherently subjective method. Increased objectivity can potentially be obtained using semiquantitative analysis. In this study, we assessed whether semiquantitative analysis of 123I-FP-CIT tracer binding created more reproducible clinical reporting. A secondary aim was to determine in what form semiquantitative data should be provided to the reporter.
Methods: Fifty-four patients referred for the assessment of nigrostriatal dopaminergic degeneration were scanned using SPECT/CT, followed by semiquantitative analysis calculating striatal binding ratios (SBRs) and caudate-to-putamen ratios (CPRs). Normal reference values were obtained using 131 healthy controls enrolled on a multicenter initiative backed by the European Association of Nuclear Medicine. A purely quantitative evaluation was first performed, with each striatum scored as normal or abnormal according to reference values. Three experienced nuclear medicine physicians then scored each striatum as normal or abnormal, also indicating cases perceived as difficult, using visual evaluation, visual evaluation in combination with SBR data, and visual evaluation in combination with SBR and CPR data. Intra- and interobserver agreement and agreement between observers and the purely quantitative evaluation were assessed using κ-statistics. The agreement between scan interpretation and clinical diagnosis was assessed for patients with a postscan clinical diagnosis available (n = 35).
Results: The physicians showed consistent reporting, with a good intraobserver agreement obtained for the visual interpretation (mean κ ± SD, 0.95 ± 0.029). Although visual interpretation of tracer binding gave good interobserver agreement (0.80 ± 0.045), this was improved as SBRs (0.86 ± 0.070) and CPRs (0.95 ± 0.040) were provided. The number of striata perceived as difficult to interpret decreased as semiquantitative data were provided (30 for the visual interpretation; 0 as SBR and CPR values were given). The agreement between physicians’ interpretations and the purely quantitative evaluation showed that readers used the semiquantitative data to different extents, with a more experienced reader relying less on the semiquantitative data. Good agreement between scan interpretation and clinical diagnosis was seen.
Conclusion: A combined approach of visual assessment and semiquantitative analysis of tracer binding created more reproducible clinical reporting of 123I-FP-CIT SPECT studies. Physicians should have access to both SBR and CPR data to minimize interobserver variability.
SPECT imaging of the presynaptic dopaminergic terminal using the tracer 123I-2-β-carbomethoxy-3β-(4-iodophenyl)-N-(3-fluoropropyl)nortropane (123I-FP-CIT) (DaTSCAN; GE Healthcare) has proven to be an effective tool in the diagnosis of neurodegenerative disorders linked to disturbances of the nigrostriatal dopaminergic system (1,2). Clinical indications for 123I-FP-CIT SPECT imaging include differentiation between idiopathic parkinsonism (Parkinson disease [PD]) and essential tremor; assessment of atypical parkinsonian syndromes such as multiple system atrophy, progressive supranuclear palsy (PSP), and corticobasal degeneration (CBD); and differentiation between dementia with Lewy bodies (DLB) and Alzheimer disease (3).
The results of clinical 123I-FP-CIT SPECT scans are most commonly obtained using visual assessment of tracer binding (4). For normal cases and cases entirely typical of dopaminergic degeneration, visual interpretation is usually sufficient for an accurate diagnosis (5). This type of assessment is, however, subjective in nature, with the reporter’s judgment relying heavily on experience and knowledge within the field. Further complications include assessment of patients with tracer binding in the lower range of normality, evaluation of follow-up studies, and assessment of nonstandard uptake patterns as seen for a subsection of patients with, for example, DLB, PSP, and CBD (6,7). The reduced tracer binding seen with advanced age can also pose a problem for the reporter (8). For a more objective approach, quantitative evaluation of tracer binding can be a useful aid. A variety of methods are available for quantification (9). Most commonly semiquantitative techniques are used, with regions of interest or volumes of interest (VOIs) defined to assess tracer binding in the striatum and its main components, the caudate nucleus and putamen.
The relative accuracy of diagnosis between visual analysis and a semiquantitative approach was investigated by Acton et al. (10), showing that semiquantification using region-of-interest analysis gave diagnostic accuracy comparable to visual analysis. The consistency between visual and semiquantitative assessments has also been investigated, with encouraging results (11–13). Tondeur et al. (14) looked at the reproducibility of a mixed visual and semiquantitative approach when interpreting 123I-FP-CIT SPECT data. Thirty nuclear medicine physicians of varying experience were asked to assess 12 scans using a combination of visual and semiquantitative data. The study found a suboptimal interobserver agreement, in particular for studies involving subtle changes and patients with structural brain abnormalities. The authors noted important differences in observers’ sensitivities, concluding that standardization of interpretation criteria is needed for improved reproducibility. As the authors pointed out, a limitation of the study was that normal reference values for the quantitative data were not provided, meaning each physician in effect had to make up his or her own reference range for the test data supplied. Because of its design, the study by Tondeur et al. could not evaluate to what extent physicians used the quantification in their diagnosis, and its overall usefulness for reporting could not be assessed.
The study by Tondeur et al. (14) highlighted a general problem with the semiquantitative approach for 123I-FP-CIT SPECT imaging—that normal reference values have not been easily available. Because of interscanner differences in, for example, sensitivity and collimator design and lack of standardization in imaging and reconstruction protocols, center-specific reference values have historically been needed. Difficulties with recruiting a sufficient number of healthy controls have made it impractical for most centers to use semiquantification. In 2007, the Neuroimaging Committee of the European Association of Nuclear Medicine, however, initiated the European Normal Control Database of DaTSCAN (ENCDAT) study (15). Through a collaboration of 15 European institutions, the committee aimed to generate a database of 123I-FP-CIT SPECT scans of healthy controls and also to standardize imaging protocols in terms of acquisition parameters and reconstruction methods. This study has now been finalized, with a database of one hundred fifty-two 123I-FP-CIT SPECT scans of healthy controls generated. The database can be used to create normal reference values, regardless of imaging system, through the usage of camera-specific calibration factors (16,17).
Semiquantification of 123I-FP-CIT SPECT images has been available for some time, and although this method is recommended, there is still uncertainty in its value for general clinical reporting (1,9,18). With a large database of healthy controls now available, this study investigates the usefulness of 123I-FP-CIT quantification in the clinical setting. The overall aim of the study was to assess whether 123I-FP-CIT quantification created more reproducible clinical reporting in terms of interobserver variability. The study also investigated in what form semiquantitative data should be supplied, whether striatal binding ratios were sufficient or whether specific information about binding in striatal subregions was also needed.
MATERIALS AND METHODS
Subjects
This was a retrospective study of patients referred for 123I-FP-CIT scans at the University College London Hospital between January 2010 and July 2010. Fifty-four patients who gave written informed consent for their data to be used for retrospective research were enrolled in the study (age range, 25–84 y; median age, 65 y; 26 women, 28 men). Local ethics committee approval was given for this study. All patients were referred for evaluation of nigrostriatal dopaminergic degeneration, and the cohort covered a broad spectrum of tracer binding. Reflecting the hospital’s referral base, most patients (86%) were referred for investigation of possible PD, with indications including rigidity, tremor, and atypical gait disorder. A small number of patients were referred to exclude or confirm DLB (4%), multiple system atrophy (4%), and PSP (2%). The remaining 2 patients (4%) could not be grouped into these standard referral groups: 1 patient was enrolled in a clinical trial, and for 1 patient clinical data were not available retrospectively.
SPECT Protocol
Patients were imaged on an Infinia Hawkeye SPECT/CT scanner (GE Healthcare) according to a routine clinical protocol similar to the standardized ENCDAT imaging protocol (19). A SPECT brain scan was obtained at 3–4 h after injection of 185 MBq of 123I-FP-CIT. Imaging parameters used were a 128 × 128 matrix, 15 cm or less radius of rotation, 120 projections over a 360° orbit, 30-s projection time, and 159 keV ± 10% photopeak energy window. Energy windows to enable triple-energy window scatter correction were also acquired (138 keV ± 3.5% and 184 keV ± 3%) (20). A low-dose CT acquisition followed the emission scan for attenuation-correction purposes. Data were reconstructed on a Xeleris workstation (GE Healthcare) using ordered-subset expectation maximization iterative reconstruction (10 subsets and 10 iterations), including triple-energy window scatter and CT attenuation corrections. A 3-dimensional Butterworth filter was used to smooth the data (cutoff, 0.55 cm−1; order, 10). A movie of the acquired projections for all patients was evaluated to assess patient motion before reconstruction.
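As an illustration of the scatter correction mentioned above, the following is a minimal sketch of the commonly used trapezoidal form of the triple-energy window estimate, applied per pixel or per projection bin. It is not the Xeleris implementation, whose details are not given here; the function names and the example counts are hypothetical, and only the window centers and percentage widths come from the protocol.

```python
def window_width_kev(center_kev: float, half_width_fraction: float) -> float:
    """Full width of a window quoted as center +/- fraction (e.g. 159 keV +/- 10%)."""
    return 2.0 * center_kev * half_width_fraction


def tew_scatter_estimate(lower_counts: float, upper_counts: float,
                         lower_width: float, upper_width: float,
                         peak_width: float) -> float:
    """Trapezoidal estimate of scatter counts inside the photopeak window.

    Counts in each narrow side window are normalized by that window's width,
    averaged, and scaled to the photopeak window width.
    """
    return (lower_counts / lower_width + upper_counts / upper_width) * peak_width / 2.0


def tew_primary_counts(peak_counts: float, scatter_counts: float) -> float:
    """Scatter-corrected counts, clipped at zero (the clipping is an assumption)."""
    return max(peak_counts - scatter_counts, 0.0)


# Window widths derived from the acquisition protocol described above.
PEAK_W = window_width_kev(159.0, 0.10)    # photopeak window
LOWER_W = window_width_kev(138.0, 0.035)  # lower scatter window
UPPER_W = window_width_kev(184.0, 0.03)   # upper scatter window

# Hypothetical counts for a single projection bin.
scatter = tew_scatter_estimate(12.0, 6.0, LOWER_W, UPPER_W, PEAK_W)
primary = tew_primary_counts(100.0, scatter)
```

In practice the correction is applied within the iterative reconstruction rather than by simple subtraction, so this sketch only conveys how the two side windows inform the scatter estimate.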
Image Quantification
For semiquantitative analysis of tracer binding, BRASS software was used (HERMES Medical Solutions, SE). BRASS software is a 3-dimensional semiautomatic brain analysis package in which the subject’s brain is first registered to a standard anatomic atlas, then tracer binding in the whole striatum, caudate nucleus, and putamen is assessed (21). VOIs are automatically defined over the caudate nucleus and putamen to assess specific tracer binding and over a reference region, the occipital cortex (OCC), to assess nonspecific binding (Fig. 1). The count concentrations in these regions were used to calculate striatal specific binding ratios (SBRs) as [VOIstriatum − OCC]/OCC, where VOIstriatum and OCC are the count concentrations in the striatum and occipital cortex, respectively. Because parkinsonian syndromes tend to affect the caudate nucleus and putamen with different severity, caudate-to-putamen ratios (CPRs) were also calculated for all subjects. An experienced image processor performed the semiquantitative analysis.
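The ratio definitions above can be sketched in a few lines. The SBR follows the formula given in the text; how the CPR is formed is not stated here, so the sketch assumes it is the ratio of the caudate and putamen specific binding ratios. The function names and example count concentrations are hypothetical.

```python
def specific_binding_ratio(voi_counts: float, occ_counts: float) -> float:
    """SBR = (VOI - OCC) / OCC, using mean count concentrations."""
    if occ_counts <= 0:
        raise ValueError("occipital reference counts must be positive")
    return (voi_counts - occ_counts) / occ_counts


def caudate_to_putamen_ratio(caudate_counts: float, putamen_counts: float,
                             occ_counts: float) -> float:
    """CPR, assumed here to be caudate SBR divided by putamen SBR."""
    putamen_sbr = specific_binding_ratio(putamen_counts, occ_counts)
    if putamen_sbr <= 0:
        raise ValueError("putamen SBR must be positive to form a ratio")
    return specific_binding_ratio(caudate_counts, occ_counts) / putamen_sbr


# Hypothetical count concentrations for one striatum.
sbr = specific_binding_ratio(30.0, 10.0)          # 2.0
cpr = caudate_to_putamen_ratio(40.0, 30.0, 10.0)  # 1.5
```

Because the occipital term appears in both numerator and denominator of the CPR, camera-dependent scaling largely cancels, which is consistent with the later observation that the CPR needs no cross-camera calibration.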
FIGURE 1. VOI definition used in BRASS software. Green/yellow = caudate nucleus; red/orange = putamen; dark blue = occipital cortex.
Normal Reference Values
Normal reference values were obtained using 131 of the 152 healthy controls from the ENCDAT database (subject demographics are given in Table 1). Subjects for whom scatter data were not available and subjects imaged on scanners for which acquisition data were not compatible with the reconstruction software used in this study were not included (n = 21). Images were reconstructed using parameters optimized by the core lab of the ENCDAT initiative (19). A HERMES HOSEM program was used to iteratively reconstruct the data with triple-energy window scatter correction and attenuation correction using a uniform attenuation correction map (16,22).
TABLE 1. Subject Demographics for Healthy Controls
Normal reference values for the Infinia Hawkeye were determined using cross-camera calibration factors as described by Tossici-Bolt et al. (16). Briefly, calibration factors for each camera–collimator combination included in the ENCDAT trial were created using an anthropomorphic basal ganglia phantom (Radiology Support Devices Inc.). The phantom was filled with activity concentrations mimicking the striatal-to-background ratios seen in normal clinical practice and scanned using the ENCDAT standardized imaging parameters (16). After image reconstruction using HERMES HOSEM software and semiquantitative analysis in BRASS, linear regression was performed, giving camera calibration factors relating true and measured binding ratios for each camera–collimator system. This calibration also accounts for the different reconstruction methods used for the database and clinical scans. Using these calibration factors, we calculated true SBRs for the healthy controls. A calibration factor for the Infinia Hawkeye system was then used to scale these to binding ratios relevant for this particular system. The normal reference SBR was set as the mean − 2 SDs. As expected, decreased striatal binding ratios were seen with increased age (8), and to accommodate this, decade-specific reference SBRs and a mean reference SBR over all ages were determined. The reference value for the CPR was set as the mean + 2 SDs. Because this is a ratio within the striatum, cross-camera calibration factors were not needed. The CPR proved unchanged with age, meaning an average reference value defined over all ages was sufficient. Mean and decade-specific reference values as calculated for the Infinia Hawkeye are shown in Figure 2.
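The reference cutoffs described above (mean − 2 SDs for the SBR, mean + 2 SDs for the CPR, with decade-specific SBR values) can be sketched as follows. This is a minimal illustration that assumes the sample standard deviation and calibrated control values as input; the cross-camera calibration step is not shown, and the function names and example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def lower_reference(values: list[float]) -> float:
    """Lower cutoff (mean - 2 SD), as used for the SBR."""
    return mean(values) - 2 * stdev(values)


def upper_reference(values: list[float]) -> float:
    """Upper cutoff (mean + 2 SD), as used for the CPR."""
    return mean(values) + 2 * stdev(values)


def decade_lower_references(ages: list[int], sbrs: list[float]) -> dict[int, float]:
    """Lower SBR cutoff per age decade (key 60 covers ages 60-69).

    Decades with fewer than two controls are skipped because an SD
    cannot be computed for them.
    """
    by_decade: dict[int, list[float]] = defaultdict(list)
    for age, sbr in zip(ages, sbrs):
        by_decade[(age // 10) * 10].append(sbr)
    return {decade: lower_reference(values)
            for decade, values in sorted(by_decade.items())
            if len(values) >= 2}


# Hypothetical control data, not taken from the ENCDAT database.
control_ages = [25, 28, 61, 67, 69]
control_sbrs = [3.0, 3.2, 2.0, 3.0, 4.0]
overall_cutoff = lower_reference(control_sbrs)
per_decade_cutoffs = decade_lower_references(control_ages, control_sbrs)
```

A single all-ages cutoff is simpler to apply, whereas decade-specific cutoffs account for the age-related decline in striatal binding at the cost of fewer controls per group.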
FIGURE 2. Normal reference values for Infinia Hawkeye γ-camera.
Data Analysis
Evaluations
For all evaluations, each striatum was taken as an independent measurement (n = 108). A purely quantitative interpretation was first performed, with each striatum being scored as normal or abnormal according to reference values obtained from the ENCDAT database. The studies were then read by 3 experienced nuclear medicine physicians, all actively involved in reading FP-CIT scans at the time of the study. As a marker of experience, the number of 123I-FP-CIT scans analyzed per year times the number of years of experience of reporting FP-CIT studies was calculated (360, 536, and 180 for physicians A, B, and C, respectively). The readers were masked to the clinical information and asked to score each striatum as normal or abnormal and also to indicate any difficult cases for each evaluation. The initial evaluation was performed using visual interpretation only. For this purpose, images were prepared in a standardized format by an experienced image processor, reorienting transversal data on a Xeleris workstation to the orbitomeatal plane and using the “GE COL” color scale (Fig. 3). Visual interpretation was performed twice to enable evaluation of intraobserver variability, followed by visual interpretation with additional information on SBRs for each patient. For cases for which the readers were still in disagreement after being given information about striatal binding, either with each other or with the purely quantitative evaluation, a final interpretation was performed using visual interpretation with information on SBRs and CPRs. Approximately 1 mo elapsed between each interpretation.
FIGURE 3. Standardized image format used in study, showing transversal images for patient with normal tracer binding. Patient-specific and reference SBR and CPR values are also shown.
Statistical Analysis
To assess intra- and interobserver agreement, the κ-statistic (κ) was calculated (23). For interpretation, a κ of 0.20 or less was taken to represent poor agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, good agreement; and 0.81 or more, very good agreement (24). Agreement between readers’ evaluations and the purely quantitative evaluation was also calculated using κ, providing a measure of the extent to which each physician changed his or her interpretation according to the semiquantitative data as they were provided.
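For readers who wish to reproduce this type of agreement analysis, the sketch below computes a two-rater κ and maps it to the interpretation bands quoted above. It assumes the unweighted Cohen form of κ, which is not confirmed by the text, and the handling of values that fall exactly on a band boundary is likewise an assumption. The function names and example scores are hypothetical.

```python
def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Unweighted Cohen kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    p_expected = sum((rater_a.count(label) / n) * (rater_b.count(label) / n)
                     for label in labels)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


def agreement_band(kappa: float) -> str:
    """Map kappa to the qualitative bands used for interpretation."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical scores for four striata from two readers.
reader_1 = ["abnormal", "abnormal", "normal", "normal"]
reader_2 = ["abnormal", "normal", "normal", "normal"]
kappa = cohen_kappa(reader_1, reader_2)  # 0.5
band = agreement_band(kappa)             # "moderate"
```

With three readers, a summary such as a mean κ ± SD would be obtained by applying the same function to each pair of readers; that pairing is an assumption about how such a mean is formed.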
Clinical Diagnosis
Information about clinical diagnosis was available for patients referred within the University College London Hospital (n = 35). For these patients, the clinical diagnosis in the most recent clinical letter was assessed with a mean follow-up time of 312 ± 151 d. Discordance between clinical diagnosis and scan interpretation was assessed, with patients classified as having normal or abnormal scan results according to consensus report between the 3 readers after they had been provided with both SBR and CPR data.
RESULTS
Evaluations
Of the 108 striata, the purely quantitative evaluation diagnosed 39 (mean SBR reference values) and 38 (decade-specific SBR reference values) as abnormal. Abnormal striata increased to 48 as reference CPR values were also considered. Cases for which the CPR changed the diagnosis exclusively included striata with a decreased uptake in the putamen but with a high enough tracer uptake in the caudate nucleus to push the striatal binding ratio into the reference range (Fig. 4).
FIGURE 4. Patient for whom purely semiquantitative evaluation using SBR values scored right striatum as normal, whereas evaluation using SBR and additional CPR moved striatum into abnormal category. Right SBR was in this case 2.28 (normal mean SBR > 1.91, normal decade-specific SBR > 1.84), whereas right CPR was 1.58 (normal mean CPR < 1.29).
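The purely quantitative scoring rule can be illustrated with the values of this patient. The sketch below is a minimal reading of the rule (abnormal if the SBR falls below its reference, or if the CPR exceeds its reference when one is supplied); the function name is hypothetical, and the use of strict inequalities at the cutoffs is an assumption.

```python
from typing import Optional


def score_striatum(sbr: float, sbr_reference: float,
                   cpr: Optional[float] = None,
                   cpr_reference: Optional[float] = None) -> str:
    """Score one striatum as 'normal' or 'abnormal' from reference cutoffs."""
    if sbr < sbr_reference:
        return "abnormal"
    # The CPR criterion is applied only when both CPR values are supplied.
    if cpr is not None and cpr_reference is not None and cpr > cpr_reference:
        return "abnormal"
    return "normal"


# Right striatum of the patient described above, using mean reference values.
sbr_only = score_striatum(2.28, 1.91)              # "normal"
with_cpr = score_striatum(2.28, 1.91, 1.58, 1.29)  # "abnormal"
```

The example shows why the SBR alone can miss a reduced putaminal uptake: preserved caudate binding keeps the whole-striatum ratio above the cutoff, and only the CPR flags the asymmetry between subregions.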
Overall, the number of abnormal striata for physicians A and C decreased as semiquantitative data were provided, whereas for physician B the number remained unchanged (Table 2). When visual interpretation only was used, physicians A and C indicated a larger number of striata than physician B (18 and 8 for physicians A and C, respectively, and 4 for physician B) as being difficult to interpret. After being given information about SBRs, readers were in disagreement for 15 striata, corresponding to 11 patients (4 of these striata also were indicated as difficult to interpret). For these 11 patients, a final reading was performed using SBR and CPR data. An example of a patient for whom 2 of the observers indicated the left and right striata as difficult to assess when the purely visual interpretation was performed is shown in Figure 5. Later, these observers were able to perform a confident read for the left and right striata when SBR and CPR data were supplied.
TABLE 2. Number of Abnormal and Difficult Striata Identified for Each Reader
FIGURE 5. Patient for whom 2 observers had indicated that left and right striata were difficult to interpret when visual evaluation only was performed; later, these 2 observers were able to give a confident read because semiquantitative data were provided. Right and left SBR values were 2.26 and 2.36, respectively (normal mean SBR > 1.91 and normal decade-specific SBR > 1.61), whereas right and left CPR values were 1.17 and 1.03, respectively (normal mean CPR < 1.29).
Statistical Analysis
A good intraobserver agreement was seen for all readers (Table 3). Interobserver agreement improved as quantitative data were provided (Table 3). Using a purely visual interpretation, we obtained a κmean of 0.80 ± 0.045 (mean ± SD) for the interobserver agreement, which was increased to 0.86 ± 0.070 as SBRs were provided and 0.95 ± 0.040 as SBRs and CPRs were given.
TABLE 3. Intra- and Interobserver Agreement Assessments
The agreement between the purely quantitative evaluation and readers’ evaluations was increased as physicians gained access to semiquantitative data (Table 4). Physicians A and C changed their evaluations to score a better agreement with the purely quantitative evaluation as semiquantitative data were provided, whereas for physician B the change in agreement was minor.
TABLE 4. Agreement Between Purely Quantitative Assessment and Observers’ Evaluations
Clinical Diagnosis
Overall, good agreement between scan interpretation and clinical diagnosis was seen (Table 5). Two patients with abnormal scan results had clinical diagnoses of dystonia and normal-pressure hydrocephalus (Fig. 6). The former patient, however, had an MR imaging result showing atrophy of the right striatum that corresponded to the area of nigrostriatal degeneration noted on the 123I-FP-CIT scan. The latter patient was reported to have reduced tracer binding of the right striatum. This patient presented with an atypical gait disorder and, before the 123I-FP-CIT scan, had been put on a trial of carbidopa and levodopa (Sinemet; Merck Sharp & Dohme) with no response. Because of enlarged ventricles, the patient subsequently had a ventriculoperitoneal shunt put in place, which was reported to improve the patient’s symptoms. The reduced tracer binding observed primarily in the right putamen could potentially be explained by a combination of abnormal anatomy and partial-volume effects due to the enlarged ventricles.
TABLE 5. Agreement Between Scan Interpretation and Clinical Diagnosis
FIGURE 6. Patients with abnormal scan interpretation and clinical diagnosis of (left) dystonia and (right) normal-pressure hydrocephalus. For both patients, right striatum was reported as abnormal according to consensus scan interpretation.
DISCUSSION
In this study, we assessed whether semiquantitative analysis of tracer binding was a useful aid for 123I-FP-CIT SPECT studies in terms of creating more reproducible clinical reporting. We also investigated in what form semiquantitative data should be supplied, whether striatal binding ratios were sufficient or whether specific information about binding in striatal subregions in the form of caudate-to-putamen ratios was also needed.
The study showed that as readers were given access to semiquantitative information, the reproducibility improved; the best interobserver agreement was obtained when information on both striatal tracer binding and binding in the striatal subregions was given in conjunction (Table 3). Providing observers with semiquantitative data also made them more confident in their readings, with no striatum perceived as being difficult to interpret once SBR and CPR data were provided (Table 2).
Evident in the study was that readers used the semiquantitative data to different extents (Table 4). Physician B, being the most experienced observer, showed confident reporting with only minor changes as semiquantitative data were provided and a small number of striata perceived as being difficult to interpret. Physicians A and C, however, changed their reads to a larger extent, with a tendency to overreport studies as abnormal when visual interpretation only was performed (59 and 57 abnormal striata for readers A and C, respectively, compared with 50 for reader B and 48 for the purely quantitative evaluation). For the visual interpretation, readers A and C also indicated a larger number of striata than reader B as being difficult to interpret (Table 2). After semiquantitative data were provided, the agreements of these readers with both the semiquantitative evaluation and physician B increased, and a smaller number of striata was perceived as difficult to interpret (Tables 3 and 4). Overall, this result suggests that a less experienced observer of 123I-FP-CIT SPECT studies can match the performance of a more experienced observer as semiquantitative data are provided. The measure of experience used in this study (the number of 123I-FP-CIT scans analyzed per year × the number of years’ experience of reporting 123I-FP-CIT studies) proved to relate well to reader performance.
A good agreement was seen between clinical diagnosis and scan interpretation for a subset of the patients (Table 5), supporting the validity of the results and giving an indication that the physicians’ consensus reports after being provided with both visual and semiquantitative data were of high accuracy. A problem with this approach is, however, that the clinical diagnosis can only evaluate patients on an overall level—that is, classification of each striatum as normal or abnormal is not possible as it is for the imaging data. A further limitation is that the referring physicians were not masked to the imaging results. The mean follow-up time of 312 d should, however, be sufficient to highlight discordance between scan interpretation and clinical diagnosis. Two cases were found for which the clinical diagnosis did not match the imaging result (Fig. 6). Both patients were found to have structural brain abnormalities, most likely affecting the imaging results and highlighting the importance of having accurate clinical information available to aid scan interpretation.
The study design has the potential limitation that repeated interpretations of the same data can bias the results. Because all readers were working in a busy nuclear medicine and PET/CT clinic at the time of the study, including the reporting of other 123I-FP-CIT cases, a 1-mo interval between readings was judged sufficient to avoid introducing bias. The very good intraobserver agreement obtained (mean κ, 0.95) also showed that observers were consistent in their readings, with only minor changes seen for the repeated visual analysis. This agreement served as a control of the study design and supports the decision that the interval between repeated presentations of the images was of appropriate length.
When semiquantitative analysis is used as an aid for clinical reporting, its limitations should be clear to the reporter, minimizing the risk of overrelying on the data. A technical limitation for VOI methods as used in this study includes poor fitting of individual patient studies to the anatomic atlas, most commonly seen for patients with abnormal anatomy (18). This was, however, not encountered for the subjects included in this study, with all automated VOI placements judged accurate. Additional limitations that physicians need to be aware of are medications and drugs shown to affect the ratio between specific and nonspecific tracer binding (25), the presence of vascular lesions (3), and effects of patient movement (26). These limitations, however, also present challenges for visual interpretation of tracer binding. A further shortcoming evident in this study is that the usage of striatal binding ratios only as an aid to visual interpretation can be misleading, because a high tracer binding in the caudate nucleus can give normal quantitative values, even though the tracer binding in the putamen is clearly reduced (Fig. 4). To avoid misinterpretation, particularly important for less experienced readers, information about tracer binding in striatal subregions should accompany the striatal tracer binding data. It has also been shown by others that semiquantitative analysis of the relative uptake in the caudate nucleus and putamen is a relevant measure to discriminate early PD from control subjects (27,28). Advantages of using the caudate-to-putamen ratio, compared with the putamen SBR, are that the ratio is both age-independent (Fig. 2) and camera-independent (cross-camera calibration factors not needed). Therefore, the CPR has the potential of an easier implementation in routine clinical practice. 
A further semiquantitative measure that has the potential to aid the reporting physician is the striatal asymmetry index, shown to distinguish between PD and non-PD tremor syndromes (29). The asymmetry index, however, was not used in this study because the left and right striata were kept as separate data points. Additional cases for which semiquantitative data could prove valuable include evaluation of nonstandard uptake patterns as seen for some PSP, CBD, and DLB patients. The SBR could for these cases prove an important quantitative measure because it has been shown that a symmetric decrease in tracer binding in the whole of the striatum can be obtained for a subsection of these patients (7), making visual interpretations more challenging. For DLB and PSP patients, there is also the possibility of a decreased tracer binding in the caudate nucleus, compared with the putamen, necessitating the inclusion of a lower cutoff reference value for the CPR. The small percentage of PSP, CBD, and DLB patients included in this study, however, meant that these aspects could not be evaluated.
In this study, an automatic 3-dimensional VOI method was used for semiquantitative analysis of tracer binding. Compared with manual semiquantitative techniques, the automatic VOI approach has the advantages of having a shorter processing time and being less observer-dependent and hence more reproducible (21). Other software with methodologies similar to the BRASS software used in this study include EXINI dat (EXINI Diagnostics, SE) and DaTQUANT (GE Healthcare). These have slight differences in the VOIs used and VOI placement algorithms, but in our experience they have shown similar results. Other automatic semiquantitative methods are available, including methods taking larger VOIs over the whole of the striatum (30) and homegrown methodologies (11–13). The former has the advantage of accounting for partial-volume losses but cannot help with assessments of separate basal ganglia compartments such as the caudate nucleus and putamen. Homegrown methodologies tend not to be available for the general imaging community. Other types of automatic quantitative methods available include voxel-based methods, such as statistical parametric mapping. Although promising (31,32), these methods tend to be better used for group analysis studies and are not designed for evaluation on a subject-by-subject basis, making them difficult to implement in routine clinical practice.
CONCLUSION
We have shown that using semiquantitative data as an aid to visual interpretation of clinical 123I-FP-CIT SPECT studies creates more reproducible reporting. For minimized interobserver variability and to provide the most complete description of tracer binding, information about tracer binding in the whole of the striatum and the striatal substructures should be given in conjunction.
DISCLOSURE
The costs of publication of this article were defrayed in part by the payment of page charges. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734. Professor Booij and Dr. Kemp are consultants for GE Healthcare. Part of this work was undertaken at UCLH/UCL, which received a proportion of funding from the Department of Health’s NIHR Biomedical Research Centres funding scheme. No other potential conflict of interest relevant to this article was reported.
Footnotes
- Received for publication June 18, 2012.
- Accepted for publication November 26, 2012.
- Published online Mar. 14, 2013.
- © 2013 by the Society of Nuclear Medicine and Molecular Imaging, Inc.