The misinterpretation of the standard error of measurement in medical education: A primer on the problems, pitfalls and peculiarities of the three different standard errors of measurement

18 April 2012

journal article
Published by Informa UK Limited in Medical Teacher

Vol. 34 (7), 569-576
https://doi.org/10.3109/0142159x.2012.670318

Abstract

Background: In high-stakes assessments in medical education, such as final undergraduate examinations and postgraduate assessments, an attempt is frequently made to set confidence limits on the probable true score of a candidate. Typically, this is carried out using what is referred to as the standard error of measurement (SEM). However, the wrong formula is often applied, as there are actually three different formulae for use in different situations.

Aims: To explain and clarify the calculation of the SEM, and to differentiate three separate standard errors, which here are called the standard error of measurement (SEmeas), the standard error of estimation (SEest) and the standard error of prediction (SEpred).

Results: Most accounts describe the calculation of SEmeas. For most purposes, though, what is required is the standard error of estimation (SEest), which has to be applied not to a candidate's actual score but to their estimated true score after taking into account the regression to the mean that occurs due to the unreliability of an assessment. A third formula, the standard error of prediction (SEpred), is less commonly used in medical education, but is useful in situations such as counselling, where one needs to predict a future actual score on an examination from a previous actual score on the same examination.

Conclusions: The various formulae can produce predictions that differ quite substantially, especially when reliability is not high and the mark in question is far removed from the average performance of candidates. That can have important, unintended consequences, particularly in a medico-legal context.
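The abstract names the three standard errors without stating their formulae. The sketch below uses the standard classical test theory forms, which are an assumption here rather than a quotation from the article: SEmeas = SD·sqrt(1 − r), SEest = SD·sqrt(r(1 − r)) and SEpred = SD·sqrt(1 − r²), with the estimated true score taken as mean + r·(observed − mean). The function names, and the example values (SD 10, reliability 0.8, mean 60, observed mark 80), are hypothetical and chosen only for illustration.

```python
import math


def standard_errors(sd, reliability):
    """Return (SEmeas, SEest, SEpred) under classical test theory.

    Assumed textbook formulae (not quoted from the article):
      SEmeas = SD * sqrt(1 - r)
      SEest  = SD * sqrt(r * (1 - r))
      SEpred = SD * sqrt(1 - r**2)
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    se_meas = sd * math.sqrt(1.0 - reliability)
    se_est = sd * math.sqrt(reliability * (1.0 - reliability))
    se_pred = sd * math.sqrt(1.0 - reliability ** 2)
    return se_meas, se_est, se_pred


def estimated_true_score(observed, mean, reliability):
    """Regress the observed score towards the mean by the reliability."""
    return mean + reliability * (observed - mean)


def interval(centre, se, z=1.96):
    """Symmetric interval centre +/- z * se (z = 1.96 for roughly 95%)."""
    return centre - z * se, centre + z * se


# Hypothetical examination: SD 10, reliability 0.8, mean 60, observed mark 80.
sd, r, mean, observed = 10.0, 0.8, 60.0, 80.0
se_meas, se_est, se_pred = standard_errors(sd, r)
true_hat = estimated_true_score(observed, mean, r)

# Interval centred on the observed score using SEmeas (the common approach).
naive_limits = interval(observed, se_meas)
# Interval for the true score: centred on the estimated true score, using SEest.
true_limits = interval(true_hat, se_est)
# Interval for a future observed score on the same examination, using SEpred.
retest_limits = interval(true_hat, se_pred)
```

With these illustrative numbers the three standard errors come out at about 4.47, 4.0 and 6.0, and the estimated true score is 76 rather than 80. The SEmeas interval around the observed mark runs from roughly 71.2 to 88.8, whereas the SEest interval around the estimated true score runs from roughly 68.2 to 83.8, which shows how the choice of formula can shift the limits for a candidate far from the mean.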
