Auxiliary outcome data and the mean score method
- 30 November 1994
- journal article
- Published by Elsevier BV in Journal of Statistical Planning and Inference
- Vol. 42 (1-2), 137-160
- https://doi.org/10.1016/0378-3758(94)90194-5
Abstract
In medical research, outcomes of interest, Y, are often difficult to ascertain on a sufficiently large number of study subjects. Cost is frequently an issue, for example. A more feasible approach might be to ascertain an easily measured but less accurate surrogate outcome variable, A, and to supplement the study with a validation sample of observations for whom both Y and A have been measured. In the context of a regression model Pβ(Y∣X) with X a covariate vector, we propose a method called mean score to make inference about β using such data. This method does not require specification of the association between Y and A and is semiparametric in this sense. Moreover, in contrast to previous work by Espeland and Odoroff (J. Amer. Statist. Assoc. 80 (1985), 663–670) and Buonaccorsi (J. Amer. Statist. Assoc. 85 (1990), 1075–1082), sampling of the true outcome can depend on both covariate and auxiliary data. Two illustrations in real medical contexts demonstrate that auxiliary data can substantially improve efficiency over standard statistical designs. Designs which incorporate auxiliary data may become increasingly useful as budgetary restrictions and health care management play a larger role in medical research.
A third illustration demonstrates that the mean score method can be useful in the classical setting when observational datasets contain missing outcome data. Data need not be missing at random in the usual sense (Rubin, Multiple Imputation for Non-Response in Surveys (1987), Wiley, New York). Indeed, the mean score method can adjust for biases induced by violation of the missing at random assumption in certain settings. We contend that the mean score method will be particularly useful in observational studies where it is possible, although perhaps inconvenient, to retrieve missing data.
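The abstract does not spell out the estimator, so the following is only a minimal sketch under stated assumptions: Y, A and X are all binary and coded 0/1, the regression model is logistic, and the estimating equation takes the commonly stated mean score form in which each unvalidated subject contributes the average validated score within the same (A, X) stratum. With discrete strata this is the same as weighting each validated subject's score by n(A, X) / n_V(A, X). The function name, the simulation settings and the sampling probabilities are all hypothetical.

```python
import numpy as np


def expit(t):
    return 1.0 / (1.0 + np.exp(-t))


def mean_score_logistic(y, a, x, validated, n_iter=25):
    """Sketch of a mean score fit for logit P(Y=1|X) = b0 + b1*X.

    Assumes binary a and x coded 0/1. Values of y are read only where
    `validated` is True; the surrogate a is used only to form strata.
    """
    stratum = 2 * a + x                          # label for each (a, x) cell
    n_all = np.bincount(stratum, minlength=4)    # all subjects per cell
    n_val = np.bincount(stratum[validated], minlength=4)  # validated per cell
    if np.any((n_all > 0) & (n_val == 0)):
        raise ValueError("every occupied (a, x) cell needs a validated subject")

    s_val = stratum[validated]
    w = n_all[s_val] / n_val[s_val]              # weight n(a, x) / n_V(a, x)
    design = np.column_stack([np.ones(s_val.size), x[validated]])
    y_val = y[validated].astype(float)

    beta = np.zeros(2)
    for _ in range(n_iter):                      # Newton-Raphson on weighted score
        p = expit(design @ beta)
        score = design.T @ (w * (y_val - p))
        info = (design * (w * p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(info, score)
    return beta


# Hypothetical simulation: true beta = (-1, 1), surrogate agrees with Y 80% of
# the time, and validation sampling depends on the surrogate A.
rng = np.random.default_rng(0)
n = 50_000
x = rng.integers(0, 2, n)
y = (rng.random(n) < expit(-1.0 + 1.0 * x)).astype(int)
a = np.where(rng.random(n) < 0.8, y, 1 - y)
validated = rng.random(n) < np.where(a == 1, 0.5, 0.2)

y_observed = np.where(validated, y, -1)          # -1 marks "not ascertained"
beta_hat = mean_score_logistic(y_observed, a, x, validated)
print(beta_hat)
```

In this simulated setting, selection into the validation sample depends on A, which is correlated with Y, so a logistic fit restricted to validated subjects alone would generally be biased; the stratum weights are what correct for that here.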
This publication has 17 references indexed in Scilit:
- Evaluating Therapeutic Interventions: Some Issues and Experiences. Statistical Science, 1992
- Increasing Precision or Reducing Expense in Regression Experiments by Using Information from a Concomitant Variable. Published by JSTOR, 1991
- Double Sampling for Exact Values in Some Multivariate Measurement Error Problems. Journal of the American Statistical Association, 1990
- Surrogate endpoints in clinical trials: Cancer. Statistics in Medicine, 1989
- Surrogate endpoints in clinical trials: Ophthalmologic disorders. Statistics in Medicine, 1989
- Log-Linear Models for Doubly Sampled Categorical Data Fitted by the EM Algorithm. Journal of the American Statistical Association, 1985
- The relationships among ventricular arrhythmias, left ventricular dysfunction, and mortality in the 2 years after myocardial infarction. Circulation, 1984
- Log-Linear Models for Categorical Data With Misclassification and Double Sampling. Journal of the American Statistical Association, 1979
- On the Use of Double Sampling Schemes in Analyzing Categorical Data with Misclassification Errors. Journal of the American Statistical Association, 1977
- On the Unique Consistent Solution to the Likelihood Equations. Journal of the American Statistical Association, 1977