Evaluation of software for multiple imputation of semi-continuous data

1 June 2007

journal article
research article
Published by SAGE Publications in Statistical Methods in Medical Research

Vol. 16 (3), 243-258
https://doi.org/10.1177/0962280206074464

Abstract

It is now widely accepted that multiple imputation (MI) methods handle the uncertainty arising from missing data more appropriately than single imputation methods. Several standard statistical software packages, such as SAS, R and Stata, have built-in procedures or user-written programs to perform MI. The performance of these packages is generally acceptable for most types of data. However, it is unclear whether they are appropriate for imputing data in which a large proportion of zero values produces a semi-continuous distribution, or whether they are suitable when the distribution of the data needs to be preserved for subsequent analysis. This article reports the findings of a simulation study carried out to evaluate the performance of the MI procedures in these packages for handling semi-continuous data. Complete resource use data on 1060 participants from a large randomized clinical trial were used as the simulation population, from which 500 bootstrap samples were drawn and missing data imposed. The study found differences in the performance of the MI programs when imputing semi-continuous data. Caution should be exercised when deciding which program to use for MI on this type of data.
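
The simulation design described in the abstract (a fixed population, repeated bootstrap samples, imposed missingness, then MI) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the population is synthetic rather than the trial data, missingness is imposed completely at random, and the imputation step uses an approximate Bayesian bootstrap as a simple stand-in, not the SAS, R or Stata procedures the article evaluates. All function names and parameter values are hypothetical.

```python
import random
import statistics


def make_population(n=1060, p_zero=0.4, seed=0):
    """Synthetic semi-continuous data (assumed): a spike at zero plus a skewed positive part."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < p_zero else rng.lognormvariate(5.0, 1.0)
            for _ in range(n)]


def bootstrap_sample(population, rng):
    """Draw one bootstrap sample: same size as the population, with replacement."""
    return rng.choices(population, k=len(population))


def impose_mcar(sample, p_missing, rng):
    """Set each value to None with probability p_missing (missing completely at random)."""
    return [None if rng.random() < p_missing else x for x in sample]


def abb_impute(incomplete, rng):
    """One approximate Bayesian bootstrap imputation.

    Resample the observed values to form a donor pool, then fill each
    missing entry with a random donor. Imputed values are always observed
    values, so the zeros and the skewed positive part are both retained.
    """
    observed = [x for x in incomplete if x is not None]
    donors = rng.choices(observed, k=len(observed))
    return [rng.choice(donors) if x is None else x for x in incomplete]


def pool_mean(incomplete, m, rng):
    """Estimate the mean from m imputed datasets and combine with Rubin's rules."""
    estimates, variances = [], []
    for _ in range(m):
        completed = abb_impute(incomplete, rng)
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / len(completed))
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between     # total variance
    return q_bar, total


def run_simulation(n_boot=500, m=5, p_missing=0.3, seed=1):
    """Return the population mean and the average bias of the pooled MI estimate."""
    rng = random.Random(seed)
    population = make_population()
    true_mean = statistics.fmean(population)
    pooled = []
    for _ in range(n_boot):
        sample = bootstrap_sample(population, rng)
        incomplete = impose_mcar(sample, p_missing, rng)
        q_bar, _ = pool_mean(incomplete, m, rng)
        pooled.append(q_bar)
    return true_mean, statistics.fmean(pooled) - true_mean
```

Calling `run_simulation(n_boot=20)` gives a quick check of the bias of the pooled mean. Because this stand-in method only ever imputes observed values, it cannot produce the negative or implausible values that a normal-model imputation may generate for data with many zeros, which is the kind of difference between programs that the study examines.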
