Methods for assessing reliability and validity for a measurement tool: a case study and critique using the WHO haemoglobin colour scale

27 April 2004

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 23 (10), 1603-1619
https://doi.org/10.1002/sim.1804

Abstract

Before introducing a new measurement tool, it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was conducted to evaluate the usefulness of a specific measurement tool (the WHO Colour Scale) are then used to illustrate the application of these methods. The WHO Colour Scale was developed under the auspices of the WHO to provide a simple, portable and reliable method of detecting anaemia. This Colour Scale is a discrete interval scale, whereas the actual haemoglobin values it is used to estimate are on a continuous interval scale and can be measured accurately using electrical laboratory equipment. The methods we consider are: linear regression; correlation coefficients; paired t-tests; plotting differences against mean values and deriving limits of agreement; kappa and weighted kappa statistics; sensitivity and specificity; an intraclass correlation coefficient; and the repeatability coefficient. We note that, although the definition and properties of each of these methods are well established, inappropriate methods continue to be used in the medical literature for assessing reliability and validity, as evidenced in the context of the evaluation of the WHO Colour Scale. Copyright © 2004 John Wiley & Sons, Ltd.
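Two of the methods named in the abstract, limits of agreement and the (unweighted) kappa statistic, can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the haemoglobin readings and the 11 g/dl anaemia cut-off are all hypothetical and chosen only for illustration. The limits of agreement are computed in the usual way, as the mean difference plus or minus 1.96 standard deviations of the differences, and kappa as observed agreement corrected for the agreement expected by chance.

```python
from collections import Counter
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (bias, lower, upper) for paired measurements.

    bias is the mean of the paired differences; the limits are
    bias -/+ 1.96 * sample standard deviation of the differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 in the denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two sets of categorical ratings."""
    n = len(ratings_1)
    observed = sum(x == y for x, y in zip(ratings_1, ratings_2)) / n
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    # Chance agreement: product of the marginal proportions per category.
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical data (NOT from the study): colour-scale readings on a
# discrete scale and laboratory haemoglobin values, both in g/dl.
colour_scale = [8, 10, 12, 10, 14, 8, 12, 10]
laboratory = [8.6, 10.9, 11.4, 9.2, 13.1, 7.5, 12.8, 11.3]

bias, lower, upper = limits_of_agreement(colour_scale, laboratory)
print(f"bias = {bias:.2f} g/dl, limits of agreement = ({lower:.2f}, {upper:.2f})")

# Dichotomise at an assumed cut-off of 11 g/dl (illustrative only).
CUTOFF = 11.0
anaemic_scale = [x < CUTOFF for x in colour_scale]
anaemic_lab = [x < CUTOFF for x in laboratory]
print(f"kappa = {cohen_kappa(anaemic_scale, anaemic_lab):.2f}")
```

The sketch deliberately omits weighted kappa, the intraclass correlation coefficient and the repeatability coefficient, which need further choices (weights, variance-component model, repeated readings) that the abstract does not specify.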

