Exact Bootstrap Variances of the Area Under ROC Curve

Abstract
The area under the Receiver Operating Characteristic (ROC) curve (AUC) and related summary indices are widely used for assessment of accuracy of an individual and comparison of performances of several diagnostic systems in many areas including studies of human perception, decision making, and the regulatory approval process for new diagnostic technologies. Many investigators have suggested implementing the bootstrap approach to estimate variability of AUC-based indices. Corresponding bootstrap quantities are typically estimated by sampling a bootstrap distribution. Such a process, frequently termed Monte Carlo bootstrap, is often computationally burdensome and imposes an additional sampling error on the resulting estimates. In this article, we demonstrate that the exact or ideal (sampling error free) bootstrap variances of the nonparametric estimator of AUC can be computed directly, i.e., avoiding resampling of the original data, and we develop easy-to-use formulas to compute them. We derive the formulas for the variances of the AUC corresponding to a single given or random reader, and to the average over several given or randomly selected readers. The derived formulas provide an algorithm for computing the ideal bootstrap variances exactly and hence improve many bootstrap methods proposed earlier for analyzing AUCs by eliminating the sampling error and sometimes burdensome computations associated with a Monte Carlo (MC) approximation. In addition, the availability of closed-form solutions provides the potential for an analytical assessment of the properties of bootstrap variance estimators. Applications of the proposed method are shown on two experimentally ascertained datasets that illustrate settings commonly encountered in diagnostic imaging. In the context of the two examples we also demonstrate the magnitude of the effect of the sampling error of the MC estimators on the resulting inferences.