Assessment of clinical skills with standardized patients: State of the art

Abstract
A little more than 10 years ago, the objective structured clinical examination (OSCE) was introduced. It includes several “stations,” at which examinees perform a variety of clinical tasks. Although an OSCE may involve a range of testing methods, standardized patients (SPs), who are nonphysicians trained to play the role of a patient, are commonly used to assess clinical skills. This article provides a comprehensive review of large‐scale studies of the psychometric characteristics of SP‐based tests. Across studies, reliability analyses consistently indicate that the major source of measurement error is variation in examinee performance from station to station (termed content specificity in the medical‐problem‐solving literature). As a consequence, tests must include large numbers of stations to obtain a stable, reproducible assessment of examinee skills. Disagreements among raters observing examinee performance and differences between SPs playing the same patient role appear to have less effect on the precision of scores, as long as examinees are randomly assigned to raters and SPs. Results of validation studies (e.g., differences in group performance, correlations with other measures) are generally favorable, though not particularly informative. Future validation research should investigate the impact of station format, timing, and instructions on examinee performance; study the procedures used to translate examinee behavior into station and test scores; and work on rater and SP bias. Several recommendations are offered for improving SP‐based tests. These include (a) focusing on assessment of history taking, physical examination, and communication skills, with separately administered written tests used to measure diagnostic and management skills, (b) adoption of a mastery‐testing framework for score interpretation, and (c) development of improved procedures for setting pass‐fail standards.
Use of generalizability theory in analyzing and reporting results of psychometric studies is also suggested.
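
The link between station-to-station variation and test length can be illustrated with a minimal sketch. It assumes the simplest generalizability design, in which examinees are crossed with stations, so that the relative generalizability coefficient is the examinee variance divided by the sum of the examinee variance and the residual (examinee-by-station) variance averaged over the number of stations. The variance components and the target coefficient below are hypothetical values chosen only for illustration; they are not estimates taken from the studies reviewed, and the function names are likewise assumptions.

```python
def g_coefficient(var_person, var_residual, n_stations):
    """Relative generalizability coefficient for a persons-by-stations design.

    var_person   : variance attributable to true differences among examinees
    var_residual : examinee-by-station interaction plus residual error variance
    n_stations   : number of stations averaged into the test score
    """
    if var_person <= 0 or var_residual < 0 or n_stations < 1:
        raise ValueError("variances must be non-negative, var_person > 0, n_stations >= 1")
    return var_person / (var_person + var_residual / n_stations)


def stations_needed(var_person, var_residual, target):
    """Smallest number of stations whose coefficient reaches the target."""
    if not 0 < target < 1:
        raise ValueError("target must lie strictly between 0 and 1")
    n = 1
    while g_coefficient(var_person, var_residual, n) < target:
        n += 1
    return n


# Hypothetical variance components (illustrative only): station-to-station
# variation in performance is assumed to be much larger than the variation
# among examinees, as content specificity would suggest.
VAR_PERSON = 1.0
VAR_RESIDUAL = 9.0
TARGET = 0.80

if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        g = g_coefficient(VAR_PERSON, VAR_RESIDUAL, n)
        print(f"{n:>3} stations: coefficient = {g:.2f}")
    needed = stations_needed(VAR_PERSON, VAR_RESIDUAL, TARGET)
    print(f"Stations needed for a coefficient of {TARGET:.2f}: {needed}")
```

Under these assumed values, the coefficient rises slowly as stations are added, which is the pattern that motivates long tests when content specificity dominates measurement error. A fuller analysis would add rater and SP facets to the design.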