Predicting Multivariate Responses in Multiple Linear Regression

Abstract
We consider the problem of predicting several response variables from the same set of explanatory variables. The question is how to take advantage of correlations between the response variables to improve predictive accuracy over the usual procedure of regressing each response variable separately on the common set of predictor variables. We introduce a new procedure, called the curds and whey method. Its use can substantially reduce prediction errors when the responses are correlated, while maintaining accuracy even when they are uncorrelated. In extensive simulations, the new procedure is compared with several previously proposed methods for predicting multiple responses (including partial least squares) and exhibits superior accuracy. One version can be easily implemented in the context of standard statistical packages.
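For concreteness, the "usual procedure" referred to above, fitting a separate least-squares regression for each response on the common predictors, can be sketched as follows. This is only an illustration of the baseline, not of the curds and whey method itself. The function names `fit_separate_ols` and `predict`, the use of NumPy, and the simulated data are assumptions introduced here for illustration.

```python
import numpy as np

def fit_separate_ols(X, Y):
    """Regress each column of Y on X (plus an intercept), one response at a time.

    Hypothetical helper illustrating the baseline only.
    Returns a coefficient matrix of shape (p + 1, q); row 0 holds the intercepts.
    """
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])  # prepend an intercept column
    coefs = []
    for k in range(Y.shape[1]):
        # Ordinary least squares for response k alone; the other responses are ignored.
        beta_k, *_ = np.linalg.lstsq(X1, Y[:, k], rcond=None)
        coefs.append(beta_k)
    return np.column_stack(coefs)

def predict(B, X):
    """Predict all responses from the fitted coefficient matrix B."""
    X1 = np.column_stack([np.ones(X.shape[0]), X])
    return X1 @ B

# Simulated data (an assumption for illustration): q responses whose noise
# shares a common component, so the responses are correlated given X.
rng = np.random.default_rng(0)
n, p, q = 200, 5, 3
X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, q))
shared_noise = rng.normal(size=(n, 1))
Y = X @ B_true + shared_noise + 0.5 * rng.normal(size=(n, q))

B_hat = fit_separate_ols(X, Y)
Y_hat = predict(B_hat, X)
```

Because each column of `B_hat` is computed from its own response only, this baseline makes no use of the correlation among the responses, which is the information the proposed procedure aims to exploit.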