Assigning readers to cases in imaging studies using balanced incomplete block designs

1 September 2021

journal article
research article
Published by SAGE Publications in Statistical Methods in Medical Research

Vol. 30 (10), 2288-2312
https://doi.org/10.1177/09622802211037074

Abstract

In many imaging studies, each case is reviewed by human readers and characterized according to one or more features. Often, the inter-reader agreement of the feature indications is of interest in addition to their diagnostic accuracy or association with clinical outcomes. Complete designs, in which all participating readers review all cases, maximize efficiency and guarantee estimability of agreement metrics for all pairs of readers, but often involve a heavy reading burden. Assigning readers to cases using balanced incomplete block designs substantially reduces reading burden by having each reader review only a subset of cases, while still maintaining estimability of inter-reader agreement for all pairs of readers. Methodology for data analysis and for power and sample size calculations under balanced incomplete block designs is presented and applied to simulation studies and an actual example. Simulation study results suggest that such designs may reduce reading burdens by >40% while, in most scenarios, incurring a <20% increase in the standard errors and a <8% and <20% reduction in power to detect between-modality differences in diagnostic accuracy and κ statistics, respectively.
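The balance property described in the abstract can be illustrated with a small sketch. It is not taken from the article: the 7-reader design below is the textbook (v = 7, b = 7, r = 3, k = 3, λ = 1) balanced incomplete block design, and the names BLOCKS, bibd_parameters and assign_cases, the choice of 70 cases, and the round-robin allocation of cases to blocks are all assumptions made for illustration. The sketch checks that every reader reads the same number of cases and that every pair of readers shares at least one block, which is what keeps pairwise agreement estimable.

```python
from collections import Counter
from itertools import combinations

# Hypothetical example design (not from the article): 7 readers, 7 blocks,
# 3 readers per block. Each block lists the readers who review the cases
# allocated to that block.
BLOCKS = [
    (0, 1, 2), (0, 3, 4), (0, 5, 6),
    (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5),
]


def bibd_parameters(blocks, n_readers):
    """Return the design parameters, or raise ValueError if unbalanced."""
    sizes = {len(set(block)) for block in blocks}
    if len(sizes) != 1:
        raise ValueError("blocks differ in size")
    # r: number of blocks each reader appears in.
    reps = Counter(reader for block in blocks for reader in block)
    # lambda: number of blocks in which each pair of readers appears together.
    pairs = Counter(
        pair for block in blocks for pair in combinations(sorted(block), 2)
    )
    r_values = {reps[reader] for reader in range(n_readers)}
    lam_values = {pairs[pair] for pair in combinations(range(n_readers), 2)}
    if len(r_values) != 1 or len(lam_values) != 1 or 0 in lam_values:
        raise ValueError("design is not balanced")
    return {
        "v": n_readers,
        "b": len(blocks),
        "k": sizes.pop(),
        "r": r_values.pop(),
        "lambda": lam_values.pop(),
    }


def assign_cases(blocks, n_cases):
    """Allocate cases to blocks in round-robin order (an assumed scheme)."""
    return [blocks[case % len(blocks)] for case in range(n_cases)]


params = bibd_parameters(BLOCKS, n_readers=7)
assignment = assign_cases(BLOCKS, n_cases=70)

# Cases read by each reader, and cases shared by each pair of readers.
reads_per_reader = Counter(r for readers in assignment for r in readers)
shared_cases = Counter(
    pair for readers in assignment for pair in combinations(readers, 2)
)

# Fraction of the complete-design reading burden that each reader carries.
burden_fraction = params["k"] / params["v"]
```

In this toy design each reader carries k/v = 3/7 of the complete-design burden, and every one of the 21 reader pairs still reviews some cases in common. The standard error and power trade-offs quoted in the abstract come from the article's own simulations and are not reproduced by this sketch.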

