Confidence Bands for ROC Curves

Abstract
We address the problem of comparing the performance of classifiers by studying techniques for generating and evaluating confidence bands on ROC curves. Historically, this has been done with one-dimensional confidence intervals, obtained by freezing one variable: either the false-positive rate or the threshold on the classification scoring function. We adapt two prior methods and introduce a new radial sweep method to generate confidence bands. Empirical studies show that the resulting bands are too tight, so we introduce a general optimization methodology for creating bands that better fit the data, along with methods for evaluating confidence bands. We show empirically that the optimized confidence bands fit much better and that our new evaluation method makes it possible to gauge the relative fit of different confidence bands.