Evaluating the performance of a computer-based consultant

Abstract
The performance of a computer-based clinical consultation system is evaluated. The program, called MYCIN, is designed to function as an aid for infectious disease diagnosis and therapy selection, with an initial emphasis on bacteremias. The evaluation methodology is discussed, as well as the difficulties encountered in attempting to evaluate clinical judgments. Specialists in infectious diseases judged MYCIN's final therapy recommendations, as well as its intermediate conclusions about the significance of the infection and the identity of the infecting organisms. The evaluation techniques described may be useful in assessing the performance of other clinical decision aids. Results of the evaluation show that the program's therapy recommendations meet Stanford experts' standards of acceptable practice 90.9% of the time (table 2), with some variation noted both among individual experts and between Stanford experts and others (tables 1, 2).