Robust Empirical Bayes Estimation of Means From Stratified Samples

Abstract
This article considers simultaneous estimation of means from several strata, where each stratum contains a finite number of elements. For example, quite often estimates of annual incomes, unemployment rates, and so forth must be made simultaneously for many areas. Our results apply to small area estimation, where very few sample observations are available from an individual area, and an estimate of a certain area mean or simultaneous estimates of several area means can be improved by incorporating information from similar neighboring areas. Ghosh and Meeden (1986) considered empirical Bayes estimation of the finite population mean assuming a normal superpopulation model. Their estimators generalize automatically when one estimates a vector of stratum means. This article relaxes the normality assumption and assumes instead that the posterior expectation of any stratum mean is a linear function of the sample observations. This is the so-called “posterior linearity” property as described in Ericson (1969b), Hartigan (1969), and Goldstein (1975), among others. The empirical Bayes estimators of Ghosh and Meeden (1986), however, enjoy a certain robustness property in the sense that they can be motivated by the basic assumption of posterior linearity rather than normality of the superpopulation. For a small number of strata, Monte Carlo simulations undertaken in this article clearly indicate that the Bayes risks of these empirical Bayes estimators are usually much smaller than the Bayes risk of the vector of sample means, not only when the underlying distribution is normal but also when it is binomial or Poisson. In addition, in the binomial case, the empirical Bayes estimators of Ghosh and Meeden perform better than some rival empirical Bayes estimators that can be developed along the lines of Morris (1983).
This is quite interesting, especially since the latter use the additional information about the form of the variance of the binomial distribution as a function of the mean. For the Poisson case, our estimators perform very similarly to some of the rival estimators that can be proposed along the lines of Morris (1983) using the additional information that the variance of a Poisson distribution equals its mean. We have also found that several asymptotic (as the number of strata becomes large) optimality properties of the empirical Bayes estimators of Ghosh and Meeden, established under normality of the superpopulation, continue to hold under the modest assumption that certain superpopulation moments exist. This type of asymptotics is particularly suited to small area estimation, where the number of small areas is very large but the sample size within each small area is typically very small. It is interesting to note that when samples of equal size are drawn from different strata, the empirical Bayes estimators of Ghosh and Meeden (1986) include as special cases the James-Stein estimators, the positive-part James-Stein estimators, and the estimators of Lindley and Smith (1972). A new feature of the present article is that the Bayes risks of such estimators, calculated without the normality assumption, indicate a clear robustness of procedures motivated originally under normality.
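
As a rough illustration of the equal-sample-size special case mentioned above, the following Python sketch applies a textbook positive-part James-Stein estimator that shrinks each stratum sample mean toward the grand mean, and compares its average squared error with that of the raw sample means when the data are binomial. It is a sketch under stated assumptions, not the Ghosh and Meeden (1986) estimator itself: the function names `positive_part_js` and `simulate_risks`, the Beta(2, 3) distribution for the stratum proportions, the pooled within-stratum variance estimate, and the omission of any finite-population adjustment are all choices made here for illustration.

```python
import random
from statistics import mean, variance


def positive_part_js(ybar, sampling_var):
    """Shrink stratum sample means toward their grand mean.

    Textbook positive-part James-Stein form (assumed here, not taken
    from the article): the shrinkage factor is
    max(0, 1 - (m - 3) * sampling_var / sum_i (ybar_i - grand)^2),
    where sampling_var is the (estimated) variance of each sample mean.
    """
    m = len(ybar)
    if m < 4:
        raise ValueError("shrinkage toward the grand mean needs at least 4 strata")
    grand = mean(ybar)
    ss = sum((y - grand) ** 2 for y in ybar)
    if ss == 0:
        # All sample means coincide, so there is nothing to shrink.
        return [grand] * m
    shrink = max(0.0, 1.0 - (m - 3) * sampling_var / ss)
    return [grand + shrink * (y - grand) for y in ybar]


def simulate_risks(m=10, n=5, reps=2000, seed=0):
    """Monte Carlo average squared error for binomial (0/1) data.

    Hypothetical setup: stratum proportions are drawn from Beta(2, 3),
    and n Bernoulli observations are drawn in each of m strata.
    Returns (risk of sample means, risk of shrinkage estimator).
    """
    rng = random.Random(seed)
    loss_mean = 0.0
    loss_js = 0.0
    for _ in range(reps):
        theta = [rng.betavariate(2.0, 3.0) for _ in range(m)]
        samples = [[1 if rng.random() < t else 0 for _ in range(n)] for t in theta]
        ybar = [mean(s) for s in samples]
        # Pooled within-stratum variance; divided by n below to give the
        # estimated variance of a stratum sample mean.
        within = mean(variance(s) for s in samples)
        est = positive_part_js(ybar, within / n)
        loss_mean += mean((a - t) ** 2 for a, t in zip(ybar, theta))
        loss_js += mean((e - t) ** 2 for e, t in zip(est, theta))
    return loss_mean / reps, loss_js / reps


if __name__ == "__main__":
    risk_mean, risk_js = simulate_risks()
    print("sample means:", risk_mean)
    print("positive-part James-Stein:", risk_js)
```

Note that the estimator uses only first and second moments of the data and no knowledge that the observations are binomial, which is the sense in which a procedure motivated under normality can be tried unchanged on other distributions.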