Multiple Imputation in a Longitudinal Cohort Study: A Case Study of Sensitivity to Imputation Methods
Open Access
- 9 October 2014
- journal article
- research article
- Published by Oxford University Press (OUP) in American Journal of Epidemiology
- Vol. 180 (9), 920-932
- https://doi.org/10.1093/aje/kwu224
Abstract
Multiple imputation has entered mainstream practice for the analysis of incomplete data. We have used it extensively in a large Australian longitudinal cohort study, the Victorian Adolescent Health Cohort Study (1992–2008). Although we have endeavored to follow best practices, there is little published advice on this, and we have not previously examined the extent to which variations in our approach might lead to different results. Here, we examined sensitivity of analytical results to imputation decisions, investigating choice of imputation method, inclusion of auxiliary variables, omission of cases with excessive missing data, and approaches for imputing highly skewed continuous distributions that are analyzed as dichotomous variables. Overall, we found that decisions made about imputation approach had a discernible but rarely dramatic impact for some types of estimates. For model-based estimates of association, the choice of imputation method and decisions made to build the imputation model had little effect on results, whereas estimates of overall prevalence and prevalence stratified by subgroup were more sensitive to imputation method and settings. Multiple imputation by chained equations gave more plausible results than multivariate normal imputation for prevalence estimates but appeared to be more susceptible to numerical instability related to a highly skewed variable.
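For readers unfamiliar with how estimates from multiply imputed data sets are combined, the following is a minimal sketch of Rubin's rules applied to a prevalence estimate. It is not taken from the article: the function name `pool_rubin`, the five prevalence values, and the sample size of 1000 are all hypothetical and chosen only for illustration. The sketch assumes each imputed data set has already been analyzed separately, giving one point estimate and one squared standard error per data set.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # average of the point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical prevalence estimates from m = 5 imputed data sets.
prevalences = [0.20, 0.22, 0.18, 0.21, 0.19]
n = 1000  # hypothetical sample size, not the cohort's actual size
variances = [p * (1 - p) / n for p in prevalences]  # binomial variance of a proportion

pooled, total_var = pool_rubin(prevalences, variances)
print(f"pooled prevalence = {pooled:.3f}, standard error = {total_var ** 0.5:.4f}")
```

The between-imputation term is what makes the pooled standard error wider than that of any single imputed data set. In practice, proportions are often pooled on a transformed scale such as the logit; the untransformed scale is used here only to keep the sketch short.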