Automatic de-identification of textual documents in the electronic health record: a review of recent research
Open Access
- 2 August 2010
- journal article
- review article
- Published by Springer Science and Business Media LLC in BMC Medical Research Methodology
- Vol. 10 (1), 70
- https://doi.org/10.1186/1471-2288-10-70
Abstract
In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Institutional Review Board to use data for research purposes, but these requirements can be waived if the data are de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" method requires the removal of 18 data elements (called PHI: Protected Health Information). The de-identification of narrative text documents is often done manually and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here.