The misinterpretation of the standard error of measurement in medical education: A primer on the problems, pitfalls and peculiarities of the three different standard errors of measurement
- 18 April 2012
- journal article
- Published by Informa UK Limited in Medical Teacher
- Vol. 34 (7), 569-576
- https://doi.org/10.3109/0142159x.2012.670318
Abstract
Background: In high-stakes assessments in medical education, such as final undergraduate examinations and postgraduate assessments, an attempt is frequently made to set confidence limits on the probable true score of a candidate. Typically, this is carried out using what is referred to as the standard error of measurement (SEM). However, it is often the case that the wrong formula is applied, there actually being three different formulae for use in different situations.

Aims: To explain and clarify the calculation of the SEM, and differentiate three separate standard errors, which here are called the standard error of measurement (SEmeas), the standard error of estimation (SEest) and the standard error of prediction (SEpred).

Results: Most accounts describe the calculation of SEmeas. For most purposes, though, what is required is the standard error of estimation (SEest), which has to be applied not to a candidate's actual score but to their estimated true score after taking into account the regression to the mean that occurs due to the unreliability of an assessment. A third formula, the standard error of prediction (SEpred) is less commonly used in medical education, but is useful in situations such as counselling, where one needs to predict a future actual score on an examination from a previous actual score on the same examination.

Conclusions: The various formulae can produce predictions that differ quite substantially, particularly when reliability is not particularly high, and the mark in question is far removed from the average performance of candidates. That can have important, unintended consequences, particularly in a medico-legal context.
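The abstract names the three standard errors without giving their formulae. As a minimal sketch, the Python below uses the usual classical test theory textbook forms, which are assumed (not confirmed by the abstract) to be the ones the paper discusses: SEmeas = SD·sqrt(1 − r), SEest = SD·sqrt(r(1 − r)) and SEpred = SD·sqrt(1 − r²), with the estimated true score taken as mean + r·(observed − mean). The function names and the exam figures (mean 60, SD 10, reliability 0.8, observed mark 40) are hypothetical and chosen only for illustration.

```python
import math


def se_measurement(sd, reliability):
    """SEmeas: spread of observed scores around a given true score."""
    return sd * math.sqrt(1.0 - reliability)


def se_estimation(sd, reliability):
    """SEest: spread of true scores around the estimated true score."""
    return sd * math.sqrt(reliability * (1.0 - reliability))


def se_prediction(sd, reliability):
    """SEpred: spread of retest scores around the predicted retest score."""
    return sd * math.sqrt(1.0 - reliability ** 2)


def estimated_true_score(observed, mean, reliability):
    """Observed score regressed towards the mean by the reliability."""
    return mean + reliability * (observed - mean)


def interval(centre, se, z=1.96):
    """Approximate 95% limits around `centre` (normality assumed)."""
    return (centre - z * se, centre + z * se)


# Hypothetical exam: all values below are illustrative assumptions.
mean, sd, r = 60.0, 10.0, 0.8
observed = 40.0  # a mark far below the average

true_hat = estimated_true_score(observed, mean, r)

# Common but questionable usage: SEmeas centred on the observed mark.
naive = interval(observed, se_measurement(sd, r))
# Limits on the true score: SEest centred on the estimated true score.
true_limits = interval(true_hat, se_estimation(sd, r))
# Limits on a future observed mark: SEpred centred on the same regressed value.
retest_limits = interval(true_hat, se_prediction(sd, r))

print("estimated true score:", true_hat)
print("SEmeas around observed mark:", naive)
print("SEest around estimated true score:", true_limits)
print("SEpred around predicted retest mark:", retest_limits)
```

Under these assumptions the three intervals differ in both centre and width, and the gap grows as reliability falls or as the observed mark moves further from the mean, which is the pattern the Conclusions describe.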