There has been a trend over the past few decades toward increasing use of quantitation in cardiac perfusion imaging, initially for planar imaging (1–3) and subsequently for SPECT imaging (4–8). Such quantitation may assist the novice reader, thereby increasing the accuracy of the interpretation. Even in the setting of relatively expert readers, quantitation provides the opinion of a second experienced observer, possibly alerting the reader to defects that may have been overlooked. Artificial intelligence has the potential to further improve the quality of this second computer-observer through the use of expert systems or neural networks.
Expert Systems
The study presented by Garcia et al. (9) in this issue of The Journal of Nuclear Medicine takes quantitation to this next step, creating an expert system (PERFEX; Syntermed, Atlanta, GA) to analyze the results of the image quantitation and generate an interpretation of the examination. An expert system is a rule-based reasoning program, which uses a series of “if/then” rules in sequence to arrive at a conclusion. For example, one rule might be, “If a mild defect is present in the anterior wall, and the defect is nonreversible, and the patient is female, then it can be concluded with high certainty that the defect is artifactual.” Such rule-based systems have been shown to function reasonably well, even when dealing with uncertain data (i.e., possible rather than definite abnormalities). The degree of certainty can be propagated through the reasoning system to arrive at appropriate conclusions. Creation of such a set of rules requires lengthy interviews between the programmer and an expert reader, and subsequent refining of the rule set using sample cases. However, once the set of rules is created, the result is an expert system that is portable and available any time of day, bringing an “expert” to settings where the local expertise may be quite limited.
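To make the flavor of such reasoning concrete, the short sketch below encodes a single hypothetical rule of the kind quoted above and propagates a certainty factor through it. The rule contents, the certainty value, and the MYCIN-style combination scheme are illustrative assumptions only and are not drawn from the actual PERFEX rule base.

```python
# A minimal sketch of rule-based reasoning with certainty factors.
# Rule conditions, certainty values, and the combination scheme are
# illustrative assumptions, not the actual PERFEX rules.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    condition: Callable[[dict], bool]   # the "if" part, evaluated on study facts
    conclusion: str                     # the "then" part
    certainty: float                    # confidence attached to the rule (0..1)


def combine(old: float, new: float) -> float:
    """MYCIN-style combination of two positive certainty factors."""
    return old + new * (1.0 - old)


def run_rules(facts: dict, rules: list[Rule]) -> dict[str, float]:
    """Fire every rule whose condition holds and accumulate certainties."""
    conclusions: dict[str, float] = {}
    for rule in rules:
        if rule.condition(facts):
            prior = conclusions.get(rule.conclusion, 0.0)
            conclusions[rule.conclusion] = combine(prior, rule.certainty)
    return conclusions


# A single rule paraphrasing the example quoted in the text.
rules = [
    Rule(
        condition=lambda f: (f["anterior_defect"] == "mild"
                             and not f["reversible"]
                             and f["sex"] == "female"),
        conclusion="anterior defect is artifactual",
        certainty=0.85,  # illustrative stand-in for "high certainty"
    ),
]

facts = {"anterior_defect": "mild", "reversible": False, "sex": "female"}
print(run_rules(facts, rules))
# {'anterior defect is artifactual': 0.85}
```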
As noted by Datz et al. (10), care must be taken in validating the portability of computer diagnostic systems to assess the effect of differences in tracers, cameras, and acquisition techniques. Interestingly, the article by Garcia et al. (9) dealt with a mixture of protocols, including thallium stress-redistribution, same-day low-dose/high-dose sestamibi studies, and dual-isotope examinations, with a variety of stress techniques. However, because the inputs to the expert system for each study were the 32-segment quantitative bull’s-eye scores (expressed in SD units), such variation was likely accounted for by use of the appropriate reference database for each type of examination.
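As an illustration of this normalization step, the sketch below expresses the counts in each of 32 bull's-eye segments in SD units relative to a protocol-matched reference database, which is essentially the form of input the expert system receives. The reference means, SDs, and sign convention are hypothetical placeholders rather than values from any published database.

```python
import numpy as np

# Hypothetical reference database for one protocol/sex combination:
# mean and SD of normalized counts for each of 32 bull's-eye segments.
ref_mean = np.full(32, 75.0)   # illustrative values only
ref_sd = np.full(32, 6.0)


def segment_scores_in_sd_units(patient_counts: np.ndarray) -> np.ndarray:
    """Express each segment as the number of SDs below the reference mean.

    Positive values indicate counts lower than normal (a possible defect).
    This mirrors the idea of protocol-specific normalization, not the exact
    convention used by any particular quantitation package.
    """
    return (ref_mean - patient_counts) / ref_sd


patient = np.clip(np.random.normal(70, 8, size=32), 0, 100)
scores = segment_scores_in_sd_units(patient)
print(np.round(scores[:8], 2))   # first 8 segments, in SD units
```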
Several aspects of the article by Garcia et al. (9) deserve special attention. If the right coronary artery and circumflex artery territories are combined (which resolves certain problems arising from computer vs. human scoring), the sensitivity and specificity values are nearly identical for the human and the computer readers when compared with an angiographic gold standard (Table 1), an impressive result.
However, a concern with these results is the low specificity of the interpretations (21%–29% for the presence of coronary artery disease when operating at an approximately 85% sensitivity level). Although it is well known that referral bias can significantly lower the apparent specificity of a test (11), especially in the setting of an imperfect gold standard, these specificity values are lower than those reported in comparable quantitative and semiquantitative trials. For example, sensitivity and specificity were, respectively, 97% and 44% (12), 83% and 82% (13), 92% and 85% (14), and 93% and 78% (15) in other studies in which patients underwent both cardiac catheterization and perfusion imaging. It would be useful to assess the performance of the PERFEX expert system in a set of patients with a low probability of disease (the rate of normal interpretations in such patients is sometimes referred to as the normalcy rate) to better gauge its diagnostic accuracy. Given the extensive work of the authors in this area and the flexibility of the program with regard to exact imaging protocol, it is likely that they have an appropriate patient database available for this evaluation.
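For readers who wish to see the arithmetic behind these figures, the definitions of sensitivity, specificity, and normalcy rate are sketched below with made-up counts; none of the numbers are taken from the study under discussion.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of patients with disease whose test is positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of patients without disease whose test is negative."""
    return tn / (tn + fp)


def normalcy_rate(normal_reads: int, low_likelihood_patients: int) -> float:
    """Fraction of low-likelihood patients interpreted as normal."""
    return normal_reads / low_likelihood_patients


# Illustrative counts only (not from the Garcia et al. data):
print(round(sensitivity(170, 30), 2))    # 0.85, i.e., ~85% sensitivity
print(round(specificity(10, 38), 2))     # 0.21, i.e., ~21% specificity
print(round(normalcy_rate(45, 50), 2))   # 0.9 normalcy rate
```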
Also worthy of consideration is whether the expert system adds significant value when quantitation is already being used. There is fairly extensive literature on the sensitivity and specificity of the various cardiac perfusion quantitation programs, using simple thresholds for the number of abnormal pixels in each vessel territory (when compared with the appropriate sex-matched database). Using these thresholds is roughly comparable with creating an expert system containing only 3–4 rules (e.g., “If the number of abnormal pixels in the right coronary artery territory exceeds the threshold, then the study is abnormal and suggests right coronary artery disease,” with similar rules for the other vessels and for the study overall). The expert system of Garcia et al. (9) incorporated 253 rules, and it would be interesting to assess the added value of these additional rules over the simpler standard quantitation. It might be possible for Garcia et al. to perform this comparison relatively easily, because no additional human interpretation of the studies would be required and because standard quantitation was already performed as an input to the expert system.
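A rough sketch of what such a small threshold-based rule set amounts to is given below. The territory names and abnormal-pixel cutoffs are hypothetical and stand in for the published, protocol-specific thresholds.

```python
# Threshold-based interpretation: roughly the 3-4 rule system described
# above. Territory definitions and cutoffs are illustrative assumptions.
TERRITORY_THRESHOLDS = {"LAD": 120, "LCX": 80, "RCA": 100}  # abnormal-pixel cutoffs


def interpret_by_thresholds(abnormal_pixels: dict) -> dict:
    """Flag each vessel territory whose abnormal-pixel count exceeds its cutoff."""
    findings = {
        territory: abnormal_pixels.get(territory, 0) > cutoff
        for territory, cutoff in TERRITORY_THRESHOLDS.items()
    }
    findings["study_abnormal"] = any(findings.values())
    return findings


print(interpret_by_thresholds({"LAD": 250, "LCX": 10, "RCA": 40}))
# {'LAD': True, 'LCX': False, 'RCA': False, 'study_abnormal': True}
```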
Neural Networks
Neural networks provide an alternative use of artificial intelligence in the interpretation of myocardial perfusion studies. In a neural network, the programmer does not create explicit rules for interpretation but instead gathers a large collection of sample cases for training. To reduce the number of cases required for training and to accelerate learning, a processed image may be used as the input to the system in place of the raw slice data or full bull’s-eye images. Authors have used coarsely segmented bull’s-eye images (with only 8–16 “pixels”), features derived from the Fourier transform of the bull’s-eye image reduced to 30 values, and segmented blackout maps (from standard quantitation) as inputs to the neural networks (16–22). The output nodes (diagnoses) are connected to the input nodes through a layer of “hidden” nodes to increase the flexibility of the reasoning system. Weighting functions connect the layers of the network; these functions are derived from repeated presentation of case examples (“training”) rather than being programmed into the system by an expert reader. Applications of these systems to the diagnosis of coronary artery disease have yielded encouraging results, with recent sensitivity and specificity values comparable with those of standard quantitation programs (16).
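The sketch below shows a toy network of the kind described here, with a reduced bull's-eye feature vector as input, a single hidden layer, and per-territory outputs. The layer sizes, the randomly generated training data, and the use of scikit-learn are assumptions for illustration and do not reproduce any of the cited systems.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Hypothetical training set: 16-value reduced bull's-eye feature vectors
# (e.g., coarse segment scores) with per-territory disease labels
# (LAD, LCX, RCA). Both inputs and labels are randomly generated here
# purely so the example runs; a real system would use expert-read cases.
X = rng.normal(size=(200, 16))
y = (X[:, :3] > 0.5).astype(int)          # 3 output "diagnoses", toy labeling rule

# One hidden layer between the input and output nodes, as described above.
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X, y)                              # "training" adjusts the weights

new_study = rng.normal(size=(1, 16))
print(net.predict(new_study))              # e.g., [[0 1 0]] per-territory calls
```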
A potential disadvantage of neural networks is that once they are created, their method of function is difficult to describe in a way that can be comprehended by humans, making it impossible for the user to figure out how the program arrived at a given conclusion. Given that their exact function is obscure, great care must be taken in moving neural networks from one site to another, for fear that a trivial difference in procedure or processing technique may significantly alter the resulting interpretation. Despite this, a multicenter trial of a cardiac neural network system yielded good results even when differences in technique were known to be present (16).
Case-based reasoning programs are similar to neural networks in that they use a large set of examples with known diagnoses to arrive at a conclusion. However, they omit the neural network “training” step and instead give a diagnosis by finding the most similar case in their database of known examples. Khorsand et al. (23) report results similar to standard cardiac quantitation using this method.
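In computational terms, this amounts to a nearest-neighbor lookup over a library of previously interpreted studies, as the sketch below illustrates; the feature representation, distance metric, and case library are purely illustrative.

```python
import numpy as np

# Hypothetical case library: feature vectors summarizing previously
# interpreted studies, paired with their known diagnoses.
case_features = np.array([
    [0.2, 0.1, 0.3, 0.2],   # illustrative 4-feature summaries
    [2.5, 0.3, 0.2, 0.1],
    [0.3, 2.8, 2.6, 0.2],
])
case_diagnoses = ["normal", "LAD disease", "LCX/RCA disease"]


def most_similar_case(new_study: np.ndarray) -> str:
    """Return the diagnosis of the closest stored case (Euclidean distance)."""
    distances = np.linalg.norm(case_features - new_study, axis=1)
    return case_diagnoses[int(np.argmin(distances))]


print(most_similar_case(np.array([2.4, 0.2, 0.3, 0.2])))   # "LAD disease"
```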
The Future
Regardless of its exact nature, a validated computer interpretation assistant is likely to improve the quality of cardiac study interpretation. Lindahl et al. (18) compared physician interpretation of bull’s-eye images with and without such decision support, which in this case was provided by a neural network system. That study showed that intraobserver variability dropped by 10%, with a similar improvement in interobserver variability, and noted a significant improvement in diagnostic accuracy when compared with results of cardiac catheterization.
Although quantitative aids and artificial intelligence systems are valuable in an adjunct capacity, human readers will remain essential, with a superior ability to identify image artifacts and processing errors, compensate for variation in patient body size, and discuss examinations with referring physicians.
Washington University School of Medicine
St. Louis, Missouri
Footnotes
Received Apr. 4, 2001; accepted Apr. 9, 2001.
For correspondence or reprints contact: Jerold W. Wallis, MD, Radiology–Nuclear Medicine, Campus Box 8223, 510 S. Kingshighway, St. Louis, MO 63110-1076.