Reliability of performance on standardized patient cases: A comparison of consistency measures based on generalizability theory

Abstract
Standardized patient cases have assumed an important role in the assessment of clinical competence in recent years. The reliability (consistency) of performance across standardized patient cases has been determined with consistency measures derived from generalizability theory, namely, the generalizability coefficient, Eρ²; the dependability index, Φ; and the dependability index with cutoff, Φ(C). These three consistency measures can be computed for quantitatively scored cases and for dichotomously scored cases; hence, six consistency measures could be computed for a given examination. Our purpose was to draw attention to the sizable differences among the computed values of these consistency measures for a new set of clinical competence examination data and to provide a review of the interpretations of the different measures. The findings showed considerable differences among the consistency measures, the number of cases needed to achieve the 0.80 reliability level, and the time required to administer that number of cases. These differences underscore the need to carefully identify the specific consistency measure used in a given study and to attend closely to the interpretation associated with that measure.
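
To illustrate how the three measures can diverge for the same examination, the following is a minimal sketch in Python. It assumes a fully crossed persons × cases design, which may not be the design used in this study, and uses the textbook population forms of the coefficients. The function names and the variance components in the usage note are hypothetical and are not taken from the study's data.

```python
import math


def g_coefficient(var_p, var_pc, n_cases):
    """Generalizability coefficient (relative decisions).

    Assumes a crossed persons x cases design: only the person-by-case
    interaction/residual component counts as error.
    """
    return var_p / (var_p + var_pc / n_cases)


def dependability(var_p, var_c, var_pc, n_cases):
    """Dependability index (absolute decisions).

    The case (difficulty) component is added to the error term.
    """
    return var_p / (var_p + (var_c + var_pc) / n_cases)


def dependability_cutoff(var_p, var_c, var_pc, n_cases, mean, cutoff):
    """Dependability index with cutoff (population form).

    The squared distance between the grand mean and the cutoff is
    added to the person variance in numerator and denominator.
    """
    signal = var_p + (mean - cutoff) ** 2
    return signal / (signal + (var_c + var_pc) / n_cases)


def cases_needed(signal, error_per_case, target=0.80):
    """Smallest number of cases for which signal / (signal + error/n) >= target."""
    n = target * error_per_case / ((1.0 - target) * signal)
    # Small tolerance so floating-point noise does not add a spurious case.
    return math.ceil(n - 1e-9)
```

With hypothetical variance components of 0.010 for persons, 0.005 for cases, and 0.043 for the residual, `cases_needed(0.010, 0.043)` gives the number of cases required under the generalizability coefficient, while `cases_needed(0.010, 0.005 + 0.043)` gives the larger number required under the dependability index. The gap between the two reflects the case-difficulty variance that only the dependability index counts as error.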