Partially pooled covariance matrix estimation in discriminant analysis

Abstract
The Linear Discriminant Rule (LD) is theoretically justified for use in classification when the population within-groups covariance matrices are equal, something rarely known in practice. As an alternative, the Quadratic Discriminant Rule (QD) avoids assuming equal covariance matrices, but requires the estimation of a large number of parameters. Hence, the performance of QD may be poor if the training set sizes are small or moderate. In fact, simulation studies have shown that in the two-groups case LD often outperforms QD for small training sets even when the within -groups covariance matrices differ substantially. The present article shows this to be true when there are more than two groups, as well. Thus, it would seem reasonable and useful to develop a data-based method of classification that, in effect, represents a compromise between QD and LD. In this article we develop such a method based on an empirical Bayes formulation in which the within-groups covariance matrices are assumed to be outcomes of a common prior distribution whose parameters are estimated from the data. Two classification rules are developed under this framework and, through the use of extensive simulations, are compared to existing methods when the number of groups is moderate.

This publication has 6 references indexed in Scilit: