TO THE EDITOR: In a recent paper (1), Silberstein reported data from his study assessing outcomes of radioiodine ablation in patients with differentiated thyroid carcinoma after imaging with 2 different isotopes. This study analyzed the results from 49 patients, 26 of whom received 123I before ablation and 23 of whom received 131I before ablation. Acknowledging the difficulties of adequately defining successful ablation, Silberstein reported that 81% of the patients receiving 123I had a successful ablation, compared with 74% of the patients receiving 131I, and that this difference was not statistically significant.
However, we would suggest that the author has overextrapolated from this result to the statement that “the same” ablation rate was achieved, irrespective of diagnostic agent. The logical conclusion of such a statement is that either agent could be used for the purpose, with no loss of patient benefit. Even if true, that conclusion is not demonstrated by Silberstein's study, as it is underpowered to detect what may be clinically significant differences between the techniques. What constitutes such a difference is always difficult to judge, but one might argue that a reduction in the ablation failure rate from 26% to 19% (i.e., nearly a 27% relative reduction in failures) is clinically significant. A simple power calculation (2) would have revealed that detecting the difference between 74% and 81% would require 479 patients for each diagnostic agent. Even if Silberstein had powered his study to look for a bigger difference of 15%, which we believe most in the oncology community would agree represents a clinically meaningful improvement, detecting this difference would have required 71 patients for each diagnostic agent. These power calculations assume a 1-sided χ2 test, 80% power, and a 0.05 significance level. Conversely, for the patient numbers Silberstein reported, the rate of successful ablations would have needed to rise to 100% for 123I (compared with 131I) for the difference between the techniques to reach statistical significance (Fisher exact test, P = 0.014).
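The kind of two-proportion sample-size calculation referred to above can be sketched in a few lines of Python. This is a minimal illustration using the standard normal approximation with unpooled variances, not necessarily the exact formula behind the figures quoted (variants with pooled variances or a continuity correction yield somewhat different numbers), so the results agree with the quoted values only in order of magnitude:

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect a change from proportion p1
    to p2 with a one-sided test (normal approximation, unpooled variances)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # one-sided 0.05 significance level
    z_beta = z.inv_cdf(power)       # 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Difference actually observed (74% vs. 81% successful ablation):
print(n_per_arm(0.74, 0.81))
# A larger, 15-percentage-point difference (74% vs. 89%):
print(n_per_arm(0.74, 0.89))
```

For the observed 7-percentage-point difference this approximation requires several hundred patients per arm, and for a 15-point difference on the order of 80 per arm, either way far more than the 26 and 23 patients actually studied, which is the substance of the argument above.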
The danger of interpreting absence of evidence as absence of negative effects has recently been highlighted in this journal by a letter in which Walter et al. (3) made a plea for adequately powered trials. We would add our voice to that plea: Silberstein's study set out to answer an important question that was never going to be answered with the number of patients recruited. When studies are limited by the small number of patients referred through a single hospital or unit, a multicenter approach is the option of choice. Small-scale studies not only represent a waste of resources but also can lead to incorrect conclusions.
COPYRIGHT © 2008 by the Society of Nuclear Medicine, Inc.