Nonconjugate Bayesian Estimation of Covariance Matrices and Its Use in Hierarchical Models

Abstract
The problem of estimating a covariance matrix from small samples has been considered by several authors following early work by Stein. This problem can be especially important in hierarchical models, where the standard errors of fixed and random effects depend on estimation of the covariance matrix of the distribution of the random effects. We propose a set of hierarchical priors (HPs) for the covariance matrix that produce posterior shrinkage toward a specified structure; here we examine shrinkage toward diagonality. We then address the computational difficulties raised by incorporating these priors, and nonconjugate priors in general, into hierarchical models. We apply a combination of approximation, Gibbs sampling (possibly with a Metropolis step), and importance reweighting to fit the models, and compare this hybrid approach to alternative Markov chain Monte Carlo methods. Our investigation involves three alternative HPs. The first works with the spectral decomposition of the covariance matrix and produces both shrinkage of the eigenvalues toward each other and shrinkage of the rotation matrix toward the identity. The second produces shrinkage of the correlations toward 0, and the third uses a conjugate Wishart distribution to shrink toward diagonality. A simulation study shows that the first two HPs can be very effective in reducing small-sample risk, whereas the conjugate Wishart version sometimes performs very poorly. We evaluate the computational algorithm in the context of a normal nonlinear random-effects model and illustrate the methodology with a logistic random-effects model.
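
To make the two nonconjugate shrinkage targets concrete, the following minimal sketch applies deterministic shrinkage to a small-sample covariance estimate. It is an illustration only, not the hierarchical priors or the fitting algorithm of the paper: the function names `shrink_eigenvalues` and `shrink_correlations`, the fixed `weight` parameter, and the simulated data are all assumptions introduced here. In a Bayesian fit the amount of shrinkage would be determined by the posterior rather than set by hand.

```python
import numpy as np

def shrink_eigenvalues(sigma, weight):
    """Pull the eigenvalues of sigma toward each other (hypothetical helper).

    Works on the log scale and shrinks toward the mean log eigenvalue.
    weight = 0 leaves sigma unchanged; weight = 1 makes all eigenvalues equal.
    The rotation (eigenvector) matrix is left untouched in this sketch.
    """
    eigvals, eigvecs = np.linalg.eigh(sigma)  # sigma = Q diag(lambda) Q^T
    log_vals = np.log(eigvals)
    shrunk = np.exp((1 - weight) * log_vals + weight * log_vals.mean())
    return eigvecs @ np.diag(shrunk) @ eigvecs.T

def shrink_correlations(sigma, weight):
    """Scale the correlations of sigma toward 0, keeping the variances fixed
    (hypothetical helper). weight = 1 gives a diagonal matrix."""
    sd = np.sqrt(np.diag(sigma))
    corr = sigma / np.outer(sd, sd)   # separate scales from correlations
    shrunk = (1 - weight) * corr      # off-diagonals move toward 0
    np.fill_diagonal(shrunk, 1.0)     # restore the unit diagonal
    return shrunk * np.outer(sd, sd)

# Simulated small sample (assumed for illustration): n = 8 draws, p = 4 dimensions.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
s = np.cov(x, rowvar=False)

s_eig = shrink_eigenvalues(s, 0.5)
s_cor = shrink_correlations(s, 0.5)
```

In this sketch the eigenvalue version preserves the determinant of the estimate while narrowing the spread of its eigenvalues, and the correlation version preserves the variances while moving the matrix toward diagonality.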