Abstract
This study was designed to evaluate the interobserver variability in reporting on 99mTc–dimercaptosuccinic acid (DMSA) scanning performed 6 mo after an acute episode of pyelonephritis for the detection of late renal sequelae. Methods: Forty-six children who had undergone both early and late DMSA studies, for evaluation of acute pyelonephritic lesions and of sequelae, respectively, were selected. Three observers reported independently and separately on the early and late DMSA scans and, in a second step, on the late scan in the presence of the early scan. Interobserver reproducibility was evaluated for the early DMSA scan, the late DMSA scan alone, and the late DMSA scan with the early scan for comparison. Results: Complete agreement between the three observers was reached in 75%, 78%, and 77% for the early DMSA scan, the late DMSA scan alone, and the late DMSA scan with the early scan for comparison, respectively. Conclusion: Interobserver reproducibility was high and was comparable for both early and late DMSA scintigraphy.
A recent Belgian survey examined interobserver variability in reporting on dimercaptosuccinic acid (DMSA) scintigraphy (1). De Sadeleer et al. (1) concluded that, in a series of DMSA studies performed mainly during the acute and remission phases of renal infection, the overall reproducibility was excellent among a large number of nuclear medicine physicians.
However, it is generally accepted that the main application of DMSA scintigraphy is not for diagnosis of acute infection but for accurate estimation of the permanent residual lesions (2,3) at least 6 mo after the acute episode (4). A review of the literature shows that the percentage of DMSA sequelae varies considerably from author to author. For example, Jakobsson (4) found that, among those patients having DMSA lesions during the acute phase of infection, as many as 45% still had lesions 1 y later. Hoberman et al. (5) found no more than 15% with lesions at 6 mo. Several factors may explain these striking differences; one factor could be the difficulty in reporting on minimal residual defects.
Our study focused on these residual abnormalities, with the aim of evaluating whether the level of interobserver reproducibility is influenced by this particular selection of cases.
MATERIALS AND METHODS
Selection of Patients
Forty-six patients (92 kidneys) were selected from a database. All of them were children, 3 mo to 16 y old (median age, 3 y), who underwent two DMSA studies. The first study was undertaken during the early days of acute pyelonephritis (defined on the basis of the classic clinical, biologic, and microbiologic criteria), and the second study for evaluation of residual sequelae took place 6 mo later. All patients were selected on the basis of the scintigraphic reports only, before access to the images was available. The selection was conducted to obtain various late DMSA patterns. For 37 kidneys, both early and late DMSA studies were considered normal; for 26 kidneys, the early DMSA study was abnormal, whereas the late control study was normal; for 14 kidneys, the late DMSA study was abnormal without significant change compared with the early study; and for 15 kidneys, the late DMSA study was abnormal, but significant improvement was indicated.
99mTc-DMSA Scintigraphy
Images were obtained using a gamma camera equipped with a high-resolution collimator 2–4 h after an intravenous injection of 99mTc-DMSA at a dose adapted to body surface (6). One posterior and two posterior oblique views were obtained in a 256 × 256 matrix. Additional views with zoom magnification and pinhole collimator were obtained when necessary.
For each kidney, the DMSA scan was interpreted as normal, abnormal, or equivocal. An equivocal interpretation was given when it was difficult to decide between normality and abnormality.
Design of Study
In a first step, all early and late DMSA scans were pooled and three observers reported independently, without knowing whether a given DMSA scan was an early or a late one. In a second step, some days later, the three observers reported again on the late DMSA scan, this time with the early DMSA scan available for comparison.
Therefore, it was possible to evaluate the interobserver reproducibility for the early DMSA scan, the late DMSA scan alone, and the late DMSA scan with the early scan for comparison. Complete agreement meant that all three observers agreed on a normal, abnormal, or equivocal result. Partial agreement meant that two observers agreed on a normal or abnormal result and one considered the result as equivocal. No agreement meant any disagreement on normality and abnormality.
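The three agreement categories defined above can be expressed as a small classification rule. The following Python sketch is an illustration only and is not the procedure used in the study; the function name `classify_agreement` and the report labels are hypothetical. One case is not covered by the definitions, namely two equivocal reports with one normal or abnormal report; the sketch assumes that this case counts as partial agreement.

```python
from collections import Counter

ALLOWED = {"normal", "abnormal", "equivocal"}


def classify_agreement(reports):
    """Classify the three reports on one kidney as 'complete', 'partial', or 'none'.

    Hypothetical helper written for illustration; not taken from the study.
    """
    if len(reports) != 3:
        raise ValueError("expected exactly three reports")
    if not set(reports) <= ALLOWED:
        raise ValueError("reports must be normal, abnormal, or equivocal")

    counts = Counter(reports)

    # All three observers gave the same reading (including three equivocal).
    if len(counts) == 1:
        return "complete"

    # At least one normal and at least one abnormal reading: disagreement
    # on normality and abnormality, regardless of the third reading.
    if counts["normal"] and counts["abnormal"]:
        return "none"

    # Remaining cases mix equivocal with a single definite category.
    # Two definite + one equivocal is partial agreement by definition;
    # one definite + two equivocal is ASSUMED to be partial as well.
    return "partial"


if __name__ == "__main__":
    print(classify_agreement(["normal", "normal", "equivocal"]))
    print(classify_agreement(["normal", "abnormal", "equivocal"]))
```

Checking the normal-versus-abnormal conflict before the equivocal case ensures that a kidney read as normal, abnormal, and equivocal is counted as no agreement rather than partial agreement.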
For each of the three observers, the reports on both the late DMSA scan alone and the late DMSA scan in the presence of the early scan were compared to assess the effect of the early DMSA scan on interpreting the late images. For this comparison, partial agreement was defined as one equivocal report and one normal or abnormal report.
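The intraobserver comparison for a single observer can be sketched in the same way. The Python code below is a minimal illustration under stated assumptions: the names `compare_reports` and `summarize` are hypothetical, and the example pairs are invented solely to show the calculation; they are not study data.

```python
from collections import Counter

ALLOWED = {"normal", "abnormal", "equivocal"}


def compare_reports(late_alone, late_with_early):
    """Compare one observer's two reports on the same late scan."""
    if late_alone not in ALLOWED or late_with_early not in ALLOWED:
        raise ValueError("reports must be normal, abnormal, or equivocal")
    if late_alone == late_with_early:
        return "total"
    # One equivocal report and one normal or abnormal report.
    if "equivocal" in (late_alone, late_with_early):
        return "partial"
    # Normal in one reading, abnormal in the other.
    return "disagreement"


def summarize(pairs):
    """Return the percentage of kidneys in each category for one observer."""
    if not pairs:
        raise ValueError("no report pairs given")
    tally = Counter(compare_reports(a, b) for a, b in pairs)
    return {
        key: 100.0 * tally[key] / len(pairs)
        for key in ("total", "partial", "disagreement")
    }


if __name__ == "__main__":
    # Invented example pairs (late scan alone, late scan with early scan).
    example = [
        ("normal", "normal"),
        ("abnormal", "abnormal"),
        ("equivocal", "normal"),
        ("normal", "abnormal"),
    ]
    print(summarize(example))
```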
RESULTS
Complete agreement between the three observers was reached in 75%, 78%, and 77% for the early DMSA scan, the late DMSA scan alone, and the late DMSA scan with the early scan for comparison, respectively (Table 1). Partial agreement was reached in 14%, 10%, and 7% for the early DMSA scan, the late DMSA scan alone, and the late DMSA scan with the early scan for comparison, respectively. Disagreement on normality and abnormality was found in 11%, 12%, and 16% for the early DMSA scan, the late DMSA scan alone, and the late DMSA scan with the early scan for comparison, respectively. These differences were not statistically significant (χ² test).
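The kind of comparison reported above can be illustrated with a Pearson χ² statistic on a 3 × 3 table of reading condition by agreement category. The sketch below rests on several assumptions that are not stated in the text: the counts are back-calculated from the rounded percentages assuming 92 kidneys per condition, the three conditions are treated as independent samples, and the function name `chi_square_statistic` is hypothetical. It is not a reconstruction of the authors' actual analysis.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# ASSUMED counts, back-calculated from the rounded percentages with
# 92 kidneys per condition; columns are complete, partial, no agreement.
approx_counts = [
    [69, 13, 10],  # early scan (75%, 14%, 11%)
    [72, 9, 11],   # late scan alone (78%, 10%, 12%)
    [71, 6, 15],   # late scan with early scan (77%, 7%, 16%)
]

# Tabulated critical value of chi-square for 4 degrees of freedom, alpha = 0.05.
CRITICAL_VALUE_DF4 = 9.488

if __name__ == "__main__":
    stat, dof = chi_square_statistic(approx_counts)
    print(f"chi-square = {stat:.2f}, df = {dof}")
    print("below 0.05 critical value:", stat < CRITICAL_VALUE_DF4)
```

With these approximate counts the statistic falls below the 0.05 critical value for 4 degrees of freedom, which is consistent with the reported absence of a significant difference. Because the same kidneys were read under all three conditions, a test for paired data might be more appropriate than this independent-samples sketch.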
TABLE 1. Interobserver Reproducibility
The analysis by pairs of observers revealed that the discordance was distributed equally among the three observers. For all three observers, the number of equivocal results was <5%, whether or not the early DMSA scan was available.
Concerning the intraobserver comparison between the late DMSA scan alone and the late DMSA scan in the presence of the early scan (Table 2), total agreement was observed in 96%, 89%, and 86% for observers 1, 2, and 3, respectively. Disagreement on normality and abnormality was observed in 2%, 2%, and 9%, respectively. For observers 2 and 3, the late DMSA scan was more often considered abnormal when the early scan was abnormal and was available for comparison.
TABLE 2. Comparison of Reports on Late DMSA Scintigraphy With and Without Early DMSA Scan
When the early DMSA finding was normal, the late DMSA finding was always reported as normal by the three observers when both scans were available for comparison. When only the late DMSA finding was available, an equivocal report was obtained in one, zero, and three cases for observers 1, 2, and 3, respectively, whereas an abnormal report was obtained in two, one, and two cases for the same three observers, respectively.
DISCUSSION
There have been conflicting recent reports concerning the reproducibility in reporting on DMSA scintigraphy (3,7–11). Several factors are responsible for these differences: the number and characteristics of the observers, the number and characteristics of the DMSA studies, the number and types of criteria used for evaluating the DMSA abnormalities, the type of display offered to the observers, and the manner in which the reproducibility is expressed. In a recent survey, De Sadeleer et al. (1) concluded that the overall reproducibility was excellent among a large number of nuclear medicine physicians.
However, the study by De Sadeleer et al. (1) was not designed to evaluate whether a selection of patients focused on late renal sequelae might influence the level of interobserver reproducibility. The general tendency is that lesions observed during the acute phase of renal infection often decrease in size and intensity or disappear when the examination is repeated some months later. Therefore, we attempted to evaluate whether the interobserver reproducibility would be different for acute lesions than for the remaining sequelae 6 mo later. Moreover, controversy exists as to whether reporting on a late DMSA scan is facilitated by the availability of an early DMSA scan. On the one hand, having the early DMSA scan for comparison might constitute a bias, with the smallest abnormality detected on the late DMSA scan in the suspected area then being considered as sequelae by some observers. On the other hand, knowing where the lesion was located on the acute DMSA scan should allow the observer to focus on this particular region, considering that small abnormalities in other parts of the kidneys are most likely not significant. In the absence of the acute DMSA scan, the observer might be tempted to describe abnormalities on the late scan in areas that are unlikely to be abnormal because they were normal on the early scan and no recurrence of infection occurred. To test this hypothesis, 37 kidneys with normal early and late DMSA findings were intentionally included in this study. We note that a random selection of patients would theoretically have been preferable. However, in a population of patients with acute pyelonephritis, the risk of developing late sequelae is low, and a random selection would then have included primarily late DMSA scans with normal findings.
The level of interobserver disagreement was rather low and was comparable with the results reported by De Sadeleer et al. (1). The interobserver reproducibility was comparable for the late DMSA scan alone and the early DMSA scan. This finding suggests that the size or the intensity of regional impairment does not influence the quality of reporting.
The report on the late DMSA scan was modified slightly by the availability of the early DMSA scan; in the case of an abnormal early DMSA scan, the late scan was more often considered to be abnormal by two of the three observers when the early scan was available than when it was not available. It is difficult to conclude whether the availability of the early scan resulted in overdiagnosis of sequelae or, alternatively, in a higher sensitivity. On the other hand, in the case of normal early DMSA findings, the late DMSA scan was always considered to be normal with the early DMSA scan for comparison, whereas some equivocal or abnormal reports were obtained when this comparison was not made. Obviously, having only the late scan for reporting will result in some loss of specificity. These small differences in interpretation probably do not constitute a valuable argument to promote acute DMSA scanning simply to get a better interpretation of residual sequelae.
CONCLUSION
Interobserver reproducibility was high and was comparable for both early and late DMSA scintigraphy.
Footnotes
Received Aug. 7, 2000; revision accepted Dec. 4, 2000.
For correspondence contact: Amy Piepsz, PhD, Department of Radioisotopes, Centre Hospitalo-Universitaire Saint Pierre, 322, Rue Haute, B-1000 Brussels, Belgium.