Psychometric characteristics of integrated multi-specialty examinations: Ebel ratings and unidimensionality

Abstract
Over recent years, UK medical schools have moved to more integrated summative examinations. This paper analyses data from the written assessment of undergraduate medical students to investigate two key psychometric aspects of this type of high-stakes assessment. Firstly, the strength of the relationship between examiner predictions of item performance (as required under the Ebel standard-setting method employed) and actual item performance (‘facility’) in the examination is explored. A systematic pattern of difference between these two measures is found: examiners tend to underestimate the difficulty of items classified as relatively easy, and to overestimate that of items classified as harder. The implications of these differences for standard setting are considered. Secondly, the integration of the assessment raises the question of whether a student’s total score in the examination can provide a single meaningful measure of performance across a broad range of medical specialties. Rasch measurement theory is therefore employed to evaluate psychometric characteristics of the examination, including its dimensionality. Once adjustment is made for item interdependency, the examination is shown to be unidimensional, with fit to the Rasch model implying that a single underlying trait, clinical knowledge, is being measured.
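For reference, the following is a minimal sketch, in standard notation, of the two methods named in the abstract; the symbols are illustrative conventions and are not drawn from the paper itself. Under the Ebel method, judges classify items by relevance and difficulty and estimate, for each category c, the proportion E_c of items in that category that a borderline candidate would answer correctly; with w_c denoting the proportion of the examination's items falling in category c, the cut score is the weighted sum

\[ \text{cut score} = \sum_{c} w_{c} \, E_{c}. \]

The dichotomous Rasch model expresses the probability that person n answers item i correctly in terms of a single person ability \( \theta_n \) and an item difficulty \( \beta_i \):

\[ P(X_{ni} = 1 \mid \theta_n, \beta_i) = \frac{\exp(\theta_n - \beta_i)}{1 + \exp(\theta_n - \beta_i)}. \]

Adequate fit of all items to this one-parameter model is what licenses treating the total score as a measure of a single underlying trait.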