Text Mining of Adverse Events in Clinical Trials: Deep Learning Approach
Open Access
- 24 December 2021
- journal article
- research article
- Published by JMIR Publications Inc. in JMIR Public Health and Surveillance
- Vol. 7 (12), e28632
- https://doi.org/10.2196/28632
Abstract
Background: Pharmacovigilance and safety reporting, which involve processes for monitoring the use of medicines in clinical trials, play a critical role in the identification of previously unrecognized adverse events or changes in the patterns of adverse events.
Objective: This study aims to demonstrate the feasibility of automating the coding of adverse events described in the narrative section of serious adverse event report forms to enable statistical analysis of these patterns.
Methods: We used the Unified Medical Language System (UMLS) as the coding scheme. The UMLS integrates 217 source vocabularies, thus enabling coding against other relevant terminologies such as the International Classification of Diseases–10th Revision, Medical Dictionary for Regulatory Activities, and Systematized Nomenclature of Medicine. We used MetaMap, a highly configurable dictionary lookup software, to identify mentions of UMLS concepts. We trained a binary classifier using Bidirectional Encoder Representations from Transformers (BERT), a transformer-based language model that captures contextual relationships, to differentiate between mentions of UMLS concepts that represented adverse events and those that did not.
Results: The model achieved a high F1 score of 0.8080, despite the class imbalance. This is 10.15 percentage points lower than human-like performance but 17.45 percentage points higher than that of the baseline approach.
Conclusions: These results confirmed that automated coding of adverse events described in the narrative section of serious adverse event reports is feasible. Once coded, adverse events can be statistically analyzed so that any correlations with the trialed medicines can be estimated in a timely fashion.
This publication has 40 references indexed in Scilit:
- Portable automatic text classification for adverse drug reaction detection via multi-corpus training. Journal of Biomedical Informatics, 2014
- A survey on annotation tools for the biomedical literature. Briefings in Bioinformatics, 2012
- Vaccine adverse event text mining system for extracting features from vaccine safety reports. Journal of the American Medical Informatics Association, 2012
- Text mining for the Vaccine Adverse Event Reporting System: medical text classification using informative feature selection. Journal of the American Medical Informatics Association, 2011
- Medication information extraction with linguistic pattern matching and semantic rules. Journal of the American Medical Informatics Association, 2010
- ConText: An algorithm for determining negation, experiencer, and temporal status from clinical reports. Journal of Biomedical Informatics, 2009
- Active Computerized Pharmacovigilance Using Natural Language Processing, Statistics, and Electronic Health Records: A Feasibility Study. Journal of the American Medical Informatics Association, 2009
- Detecting possible vaccine adverse events in clinical notes of the electronic medical record. Vaccine, 2009
- Agreement, the F-Measure, and Reliability in Information Retrieval. Journal of the American Medical Informatics Association, 2005
- The Unified Medical Language System (UMLS): integrating biomedical terminology. Nucleic Acids Research, 2004
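As a reading aid for the Results reported in the abstract, the following minimal Python sketch shows how an F1 score is computed for the positive class (here, a UMLS concept mention that represents an adverse event) and what the stated percentage-point gaps imply. The confusion counts and the names `f1_score`, `tp`, `fp`, and `fn` are hypothetical and chosen only for illustration. The human-like and baseline scores are not quoted from the study; they are derived arithmetically from the 0.8080 figure and the two gaps given in the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts (NOT from the study): true positives,
# false positives, and false negatives. True negatives do not enter F1,
# which is why the metric stays informative under class imbalance.
tp, fp, fn = 80, 20, 20
example_f1 = f1_score(tp, fp, fn)

# Figures taken from the abstract.
MODEL_F1 = 0.8080
GAP_TO_HUMAN = 0.1015      # model is 10.15 percentage points lower
GAIN_OVER_BASELINE = 0.1745  # model is 17.45 percentage points higher

# Implied scores, derived here by simple arithmetic.
implied_human_f1 = MODEL_F1 + GAP_TO_HUMAN
implied_baseline_f1 = MODEL_F1 - GAIN_OVER_BASELINE

print(f"example F1:          {example_f1:.4f}")
print(f"implied human F1:    {implied_human_f1:.4f}")
print(f"implied baseline F1: {implied_baseline_f1:.4f}")
```

The gaps are read here as absolute differences in F1 expressed in percentage points, which is the interpretation the abstract's wording suggests.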