Text Mining of Adverse Events in Clinical Trials: Deep Learning Approach

Open Access

24 December 2021

journal article
research article
Published by JMIR Publications Inc. in JMIR Public Health and Surveillance

Vol. 7 (12), e28632
https://doi.org/10.2196/28632

Abstract

Background: Pharmacovigilance and safety reporting, which involve processes for monitoring the use of medicines in clinical trials, play a critical role in identifying previously unrecognized adverse events or changes in the patterns of adverse events.

Objective: This study aims to demonstrate the feasibility of automating the coding of adverse events described in the narrative section of serious adverse event report forms, to enable statistical analysis of these patterns.

Methods: We used the Unified Medical Language System (UMLS) as the coding scheme. It integrates 217 source vocabularies, thus enabling coding against other relevant terminologies such as the International Classification of Diseases–10th Revision, the Medical Dictionary for Regulatory Activities, and the Systematized Nomenclature of Medicine. We used MetaMap, a highly configurable dictionary lookup software, to identify mentions of UMLS concepts. We trained a binary classifier using Bidirectional Encoder Representations from Transformers (BERT), a transformer-based language model that captures contextual relationships, to differentiate between mentions of UMLS concepts that represented adverse events and those that did not.

Results: The model achieved a high F1 score of 0.8080 despite the class imbalance. This is 10.15 percentage points lower than human-like performance but 17.45 percentage points higher than that of the baseline approach.

Conclusions: These results confirmed that automated coding of adverse events described in the narrative section of serious adverse event reports is feasible. Once coded, adverse events can be statistically analyzed so that any correlations with the trialed medicines can be estimated in a timely fashion.
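The reported comparison can be made concrete with a short Python sketch. It shows how an F1 score is computed from confusion counts, why F1 is more informative than accuracy when classes are imbalanced, and what the stated percentage-point gaps imply for the two reference scores. The confusion counts are hypothetical, chosen only for illustration. The human-like and baseline F1 values are derived here by arithmetic from the reported gaps; they are not quoted from the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical imbalanced data: 10 adverse-event mentions among 100 concepts.
# A classifier that labels everything as "not an adverse event" looks
# accurate but has an F1 of zero for the positive class.
always_negative = dict(tp=0, fp=0, fn=10, tn=90)
acc = accuracy(**always_negative)            # 0.9
f1 = f1_score(tp=0, fp=0, fn=10)             # 0.0

# Figures reported in the abstract.
MODEL_F1 = 0.8080
GAP_TO_HUMAN_PP = 10.15     # model is this many percentage points lower
GAP_TO_BASELINE_PP = 17.45  # model is this many percentage points higher

# Implied reference scores (derived, not stated in the abstract).
human_f1 = round(MODEL_F1 + GAP_TO_HUMAN_PP / 100, 4)
baseline_f1 = round(MODEL_F1 - GAP_TO_BASELINE_PP / 100, 4)

print(f"always-negative: accuracy={acc:.2f}, F1={f1:.2f}")
print(f"implied human-like F1: {human_f1:.4f}")
print(f"implied baseline F1:   {baseline_f1:.4f}")
```

Under these assumptions, the implied human-like F1 is about 0.9095 and the implied baseline F1 is about 0.6335, which places the model closer to human-like performance than to the baseline.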


Cited by 5 articles