Auxiliary outcome data and the mean score method

30 November 1994

journal article
Published by Elsevier BV in Journal of Statistical Planning and Inference

Vol. 42 (1-2), 137-160
https://doi.org/10.1016/0378-3758(94)90194-5

Abstract

In medical research, outcomes of interest, Y, are often difficult to ascertain on a sufficiently large number of study subjects. Cost is frequently an issue, for example. A more feasible approach might be to ascertain an easily measured but less accurate surrogate outcome variable, A, and to supplement the study with a validation sample of observations for whom both Y and A have been measured. In the context of a regression model P_β(Y∣X) with X a covariate vector, we propose a method called mean score to make inference about β using such data. This method does not require specification of the association between Y and A and is semiparametric in this sense. Moreover, in contrast to previous work by Espeland and Odoroff (J. Amer. Statist. Assoc. 80 (1985), 663–670) and Buonaccorsi (J. Amer. Statist. Assoc. 85 (1990), 1075–1082), sampling of the true outcome can depend on both covariate and auxiliary data. Two illustrations in real medical contexts demonstrate that auxiliary data can substantially improve efficiency over standard statistical designs. Designs which incorporate auxiliary data may become increasingly useful as budgetary restrictions and health care management play a larger role in medical research.

A third illustration demonstrates that the mean score method can be useful in the classical setting when observational datasets contain missing outcome data. Data need not be missing at random in the usual sense (Rubin, Multiple Imputation for Non-Response in Surveys (1987), Wiley, New York). Indeed, the mean score method can adjust for biases induced by violation of the missing at random assumption in certain settings. We contend that the mean score method will be particularly useful in observational studies where it is possible, although perhaps inconvenient, to retrieve missing data.
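
As a rough illustration of the idea described in the abstract, the sketch below assumes that X and A are discrete and that selection into the validation sample depends only on (X, A). Under those assumptions, replacing each non-validated subject's score by the average score of validated subjects in the same (X, A) stratum reduces to a weighted score equation in which every validated subject in stratum (x, a) carries weight n(x, a)/n_V(x, a). The abstract itself gives no formula, so this form, the logistic model, the function names `mean_score_weights` and `fit_weighted_logistic`, and the simulated data are all illustrative assumptions rather than the article's own implementation.

```python
import numpy as np
from collections import Counter


def mean_score_weights(x, a, validated):
    """Weight n(x,a)/n_V(x,a) for validated subjects, 0 for the rest.

    Assumes every (x, a) stratum contains at least one validated subject.
    """
    strata = list(zip(x, a))
    n_all = Counter(strata)
    n_val = Counter(s for s, v in zip(strata, validated) if v)
    return np.array([n_all[s] / n_val[s] if v else 0.0
                     for s, v in zip(strata, validated)])


def fit_weighted_logistic(x, y, w, n_iter=50):
    """Newton-Raphson for a weighted logistic regression of y on (1, x)."""
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (w * (y - p))                      # weighted score
        info = (X * (w * p * (1 - p))[:, None]).T @ X    # weighted information
        beta = beta + np.linalg.solve(info, score)
    return beta


# Simulated example (all values are hypothetical).
rng = np.random.default_rng(0)
n = 2000
x = rng.integers(0, 2, n)                                # binary covariate
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-1.0 + 1.0 * x))))  # true outcome
a = np.where(rng.random(n) < 0.8, y, 1 - y)              # surrogate, 80% agreement
validated = rng.random(n) < np.where(a == 1, 0.6, 0.2)   # sampling depends on A

w = mean_score_weights(x, a, validated)
y_obs = np.where(validated, y, 0)   # placeholder for unobserved Y; weight is 0
beta = fit_weighted_logistic(x, y_obs, w)
print(beta)                         # estimates of (intercept, slope)
```

Only validated subjects contribute an observed Y; the surrogate A enters solely through the stratum counts that define the weights, which is why no model for the association between Y and A is needed in this sketch.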

This publication has 17 references indexed in Scilit:

Evaluating Therapeutic Interventions: Some Issues and Experiences
Statistical Science, 1992
Increasing Precision or Reducing Expense in Regression Experiments by Using Information from a Concomitant Variable
Published by JSTOR, 1991
Double Sampling for Exact Values in Some Multivariate Measurement Error Problems
Journal of the American Statistical Association, 1990
Surrogate endpoints in clinical trials: Cancer
Statistics in Medicine, 1989
Surrogate endpoints in clinical trials: Ophthalmologic disorders
Statistics in Medicine, 1989
Log-Linear Models for Doubly Sampled Categorical Data Fitted by the EM Algorithm
Journal of the American Statistical Association, 1985
The relationships among ventricular arrhythmias, left ventricular dysfunction, and mortality in the 2 years after myocardial infarction
Circulation, 1984
Log-Linear Models for Categorical Data With Misclassification and Double Sampling
Journal of the American Statistical Association, 1979
On the Use of Double Sampling Schemes in Analyzing Categorical Data with Misclassification Errors
Journal of the American Statistical Association, 1977
On the Unique Consistent Solution to the Likelihood Equations
Journal of the American Statistical Association, 1977

Cited by 62 articles