Marginal Analysis of Incomplete Longitudinal Binary Data: A Cautionary Note on LOCF Imputation

27 August 2004

journal article
Published by Oxford University Press (OUP) in Biometrics

Vol. 60 (3), 820-828
https://doi.org/10.1111/j.0006-341x.2004.00234.x

Abstract

In recent years there has been considerable research devoted to the development of methods for the analysis of incomplete data in longitudinal studies. Despite these advances, the methods used in practice have changed relatively little, particularly in the reporting of pharmaceutical trials. In this setting, perhaps the most widely adopted strategy for dealing with incomplete longitudinal data is imputation by the “last observation carried forward” (LOCF) approach, in which values for missing responses are imputed using observations from the most recently completed assessment. We examine the asymptotic and empirical bias, the empirical type I error rate, and the empirical coverage probability associated with estimators and tests of treatment effect based on the LOCF imputation strategy. We consider a setting involving longitudinal binary data with longitudinal analyses based on generalized estimating equations, and an analysis based simply on the response at the end of the scheduled follow-up. We find that for both of these approaches, imputation by LOCF can lead to substantial biases in estimators of treatment effects, the type I error rates of associated tests can be greatly inflated, and the coverage probability can be far from the nominal level. Alternative analyses based on all available data lead to estimators with comparatively small bias, and inverse probability weighted analyses yield consistent estimators subject to correct specification of the missing data process. We illustrate the differences between various methods of dealing with drop-outs using data from a study of smoking behavior.
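The LOCF rule described in the abstract can be illustrated with a minimal Python sketch. The function names `locf` and `endpoint_proportion` and the four-subject data set are hypothetical and are not taken from the article. The sketch only shows how carrying values forward can change an end-of-follow-up summary relative to using the responses actually observed; it does not reproduce the generalized estimating equation or inverse probability weighted analyses studied in the paper.

```python
from typing import List, Optional

Response = Optional[int]  # 1 or 0 when observed, None when missing


def locf(responses: List[Response]) -> List[Response]:
    """Fill each missing visit with the most recent observed value.

    Visits before the first observed response stay missing (None).
    """
    filled: List[Response] = []
    last: Response = None
    for y in responses:
        if y is not None:
            last = y
        filled.append(last)
    return filled


def endpoint_proportion(data: List[List[Response]], impute: bool) -> float:
    """Proportion of responses equal to 1 at the final scheduled visit.

    With impute=False only subjects observed at the final visit contribute;
    with impute=True the final value is taken after LOCF imputation.
    """
    final = [(locf(s) if impute else s)[-1] for s in data]
    observed = [y for y in final if y is not None]
    return sum(observed) / len(observed)


# Hypothetical binary responses at four visits (illustration only).
data = [
    [1, 1, 1, 1],           # completer
    [1, 1, 0, 0],           # completer
    [1, None, None, None],  # drops out after visit 1
    [1, 1, None, None],     # drops out after visit 2
]

print(locf(data[3]))                            # [1, 1, 1, 1]
print(endpoint_proportion(data, impute=False))  # 0.5
print(endpoint_proportion(data, impute=True))   # 0.75
```

In this toy data set both drop-outs last reported a response of 1, so LOCF raises the end-of-study proportion from 0.5 to 0.75. Whether either figure is close to the truth depends on the missing data process, which is the concern the abstract raises.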

