A Novel Imputation Approach for Sharing Protected Public Health Data
- 1 October 2021
- journal article
- research article
- Published by American Public Health Association in American Journal of Public Health
- Vol. 111 (10), 1830-1838
- https://doi.org/10.2105/ajph.2021.306432
Abstract
Objectives. To develop an imputation method to produce estimates for suppressed values within a shared government administrative data set to facilitate accurate data sharing and statistical and spatial analyses.
Methods. We developed an imputation approach that incorporated known features of suppressed Massachusetts surveillance data from 2011 to 2017 to predict missing values more precisely. Our methods for 35 de-identified opioid prescription data sets combined modified previous or next substitution followed by mean imputation and a count adjustment to estimate suppressed values before sharing. We modeled 4 methods and compared the results to baseline mean imputation.
Results. We assessed performance by comparing root mean squared error (RMSE), mean absolute error (MAE), and proportional variance between imputed and suppressed values. Our method outperformed mean imputation; we retained 46% of the suppressed value's proportional variance with better precision (22% lower RMSE and 26% lower MAE) than simple mean imputation.
Conclusions. Our easy-to-implement imputation technique largely overcomes the adverse effects of low count value suppression with superior results to simple mean imputation. This novel method is generalizable to researchers sharing protected public health surveillance data.
(Am J Public Health. 2021; 111(10):1830-1838. https://doi.org/10.2105/AJPH.2021.306432)
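The abstract evaluates imputation by comparing RMSE and MAE between imputed and true suppressed values, and reports the improvement as a percentage reduction relative to mean imputation. A minimal Python sketch of that comparison follows. It is not the authors' code: the counts, the fill value of 5, and the names `rmse`, `mae`, and `pct_lower` are made up for illustration, and the "alternative" imputed values simply stand in for the output of any competing method.

```python
import math

def rmse(true_vals, imputed_vals):
    """Root mean squared error between true and imputed values."""
    n = len(true_vals)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_vals, imputed_vals)) / n)

def mae(true_vals, imputed_vals):
    """Mean absolute error between true and imputed values."""
    n = len(true_vals)
    return sum(abs(t - p) for t, p in zip(true_vals, imputed_vals)) / n

def pct_lower(baseline, candidate):
    """Percentage by which `candidate` error is lower than `baseline` error."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical true values of cells that were suppressed before sharing.
true_suppressed = [2, 9, 4, 7, 3]

# Baseline: simple mean imputation, every suppressed cell gets the same
# value (5 is an assumed fill value, not a figure from the article).
mean_imputed = [5] * len(true_suppressed)

# Stand-in for the output of some alternative imputation method.
alt_imputed = [3, 8, 4, 6, 4]

base_rmse, base_mae = rmse(true_suppressed, mean_imputed), mae(true_suppressed, mean_imputed)
alt_rmse, alt_mae = rmse(true_suppressed, alt_imputed), mae(true_suppressed, alt_imputed)

print(f"RMSE: baseline {base_rmse:.3f}, alternative {alt_rmse:.3f}, "
      f"{pct_lower(base_rmse, alt_rmse):.1f}% lower")
print(f"MAE:  baseline {base_mae:.3f}, alternative {alt_mae:.3f}, "
      f"{pct_lower(base_mae, alt_mae):.1f}% lower")
```

Such a comparison is only possible for whoever holds the unsuppressed values, which matches the abstract's setting of estimating suppressed values before the data are shared.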