An Overview on Assessing Agreement with Continuous Measurements

Abstract
Reliable and accurate measurements serve as the basis for evaluation in many scientific disciplines. Issues related to reliable and accurate measurement have evolved over many decades, dating back to the nineteenth century and the pioneering work of Galton (1886), Pearson (1896, 1899, 1901), and Fisher (1925). Requiring a new measurement to be identical to the truth is often impractical, either because (1) we are willing to accept a measurement up to some tolerable (or acceptable) error, or (2) the truth is simply not available to us, because it is either not measurable or only measurable with some degree of error. To deal with issues related to both (1) and (2), a number of concepts, methods, and theories have been developed in various disciplines.
Some of these concepts have been used across disciplines, while others have been limited to a particular field but may have potential uses in other disciplines. In this paper, we elucidate and contrast fundamental concepts employed in different disciplines and unite these concepts into one common theme: assessing closeness (agreement) of observations. We focus on assessing agreement with continuous measurements and classify different statistical approaches as (1) descriptive tools; (2) unscaled summary indices based on absolute differences of measurements; and (3) scaled summary indices attaining values between –1 and 1 for various data structures, and for cases with and without a reference. We also identify gaps that require further research and discuss future directions in assessing agreement.
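
To make the three classes of approaches concrete, the following minimal sketch computes one commonly used example of each for paired readings from two methods. The abstract does not name specific indices, so the choices here are assumptions made for illustration: 95% limits of agreement as a descriptive tool, the mean absolute difference as an unscaled index, and the concordance correlation coefficient as a scaled index between –1 and 1. The function names are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(x, y):
    """Descriptive tool (assumed example): mean difference +/- 1.96 SD."""
    d = [a - b for a, b in zip(x, y)]
    m, s = mean(d), stdev(d)
    return m - 1.96 * s, m + 1.96 * s


def mean_absolute_difference(x, y):
    """Unscaled index (assumed example), in the units of the measurements."""
    return mean([abs(a - b) for a, b in zip(x, y)])


def concordance_correlation(x, y):
    """Scaled index (assumed example): concordance correlation in [-1, 1].

    Uses variances and covariance with divisor n.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Illustrative data only: method B reads one unit higher than method A.
method_a = [1, 2, 3, 4, 5]
method_b = [2, 3, 4, 5, 6]
print(limits_of_agreement(method_a, method_b))
print(mean_absolute_difference(method_a, method_b))
print(concordance_correlation(method_a, method_b))
```

In this toy data the two methods are perfectly correlated, yet the constant offset pulls the scaled index below 1, which shows how agreement differs from correlation.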
