Abstract
We consider two-stage models of the kind used in parametric empirical Bayes (PEB) methodology, calling them conditionally independent hierarchical models. We suppose that there are k “units,” which may be experimental subjects, cities, study centers, etc. At the first stage, the observation vectors Yi for units i = 1, …, k are independently distributed with densities p(yi | θi), or more generally, p(yi | θi, λ). At the second stage, the unit-specific parameter vectors θi are iid with densities p(θi | λ). The PEB approach proceeds by regarding the second-stage distribution as a prior and noting that, if λ were known, inference about θ could be based on its posterior. Since λ is not known, the simplest PEB methods estimate λ by maximum likelihood or some variant and then treat λ as if it were known to equal this estimate. Although this procedure is sometimes satisfactory, a well-known defect is that it neglects the uncertainty due to the estimation of λ. In this article we suggest that approximate Bayesian inference can provide simple and manageable solutions to this problem. In the Bayesian approach, a prior density π(·) on λ is introduced, the posterior p(λ | y) is calculated, and the posterior density of θi is then the expectation, with respect to p(λ | y), of the conditional posterior p(θi | yi, λ). From the Bayesian point of view, the PEB estimate is of interest because it is a first-order approximation to the posterior mean [having an error of order O(k⁻¹)]. Letting Eλ and Vλ denote expectation and variance with respect to p(λ | y), we may write the posterior variance of θi as V(θi | y) = Eλ{V(θi | yi, λ)} + Vλ{E(θi | yi, λ)}. The conditional posterior variance V(θi | yi, λ̂), where λ̂ is the maximum likelihood estimator, approximates only the first term. When we include an approximation to the second term, we obtain a first-order approximation to the posterior variance itself.
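To make the variance decomposition concrete, the following is a minimal Monte Carlo sketch in a normal-normal model. The specific values (sigma2, y_i), the helper functions, and the draws mu_draws and tau2_draws standing in for p(λ | y) are illustrative assumptions, not quantities or examples from the article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical normal-normal setup (illustrative only):
# first stage:  y_i | theta_i ~ N(theta_i, sigma2), sigma2 known;
# second stage: theta_i ~ N(mu, tau2), so lambda = (mu, tau2).
sigma2 = 1.0
y_i = 2.0  # observation for one unit

def cond_post_mean(mu, tau2):
    """E(theta_i | y_i, lambda): y_i shrunk toward the prior mean mu."""
    B = sigma2 / (sigma2 + tau2)  # shrinkage factor
    return B * mu + (1.0 - B) * y_i

def cond_post_var(tau2):
    """V(theta_i | y_i, lambda)."""
    return sigma2 * tau2 / (sigma2 + tau2)

# Stand-in draws from p(lambda | y); in practice these would come from an
# exact or approximate posterior for the hyperparameters.
mu_draws = rng.normal(0.5, 0.3, size=50000)
tau2_draws = rng.gamma(4.0, 0.25, size=50000)

e_term = cond_post_var(tau2_draws).mean()            # E_lambda{V(theta_i | y_i, lambda)}
v_term = cond_post_mean(mu_draws, tau2_draws).var()  # V_lambda{E(theta_i | y_i, lambda)}
total = e_term + v_term                              # V(theta_i | y)

# Simple PEB plug-in: condition on a point estimate of lambda (the posterior
# mean of tau2 stands in for the maximum likelihood estimate here).
plug_in = cond_post_var(tau2_draws.mean())

print(f"plug-in variance:  {plug_in:.4f}")
print(f"two-term variance: {e_term:.4f} + {v_term:.4f} = {total:.4f}")
```

The plug-in variance tracks only the first term of the decomposition, so it understates V(θi | y) whenever the second term is appreciable; this is the defect of the simple PEB approach described above.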
In many examples, this elementary method, incorporating approximations to both terms, will substantially account for the uncertainty introduced by the estimation of λ. We briefly consider second-order approximations, noting that the work of Deely and Lindley (1981) may be extended using expansions derived by Lindley (1980), Mosteller and Wallace (1964), Tierney and Kadane (1986), and Tierney, Kass, and Kadane (1989). We suggest that second-order approximations provide rough and often easily computed assessments of the accuracy of first-order approximations. Although we confine our data-analytical examples to simple models, we believe the methods will be useful in general settings. An important area of application is longitudinal data analysis.