Inferring new relations between medical entities using literature curated term co-occurrences

Open Access

1 July 2019

journal article
research article
Published by Oxford University Press (OUP) in JAMIA Open

Vol. 2 (3), 378-385
https://doi.org/10.1093/jamiaopen/ooz022

Abstract

Identifying new relations between medical entities, such as drugs, diseases, and side effects, is typically a resource-intensive task involving experimentation and clinical trials. The increased availability of related data and curated knowledge enables a computational approach to this task, notably by training models to predict likely relations. Such models rely on meaningful representations of the medical entities being studied. We propose a generic feature vector representation that leverages co-occurrences of medical terms linked with PubMed citations.

We demonstrate the usefulness of the proposed representation by inferring two types of relations: a drug causes a side effect, and a drug treats an indication. To predict these relations and assess the effectiveness of the representation, we applied two modeling approaches: multi-task modeling using neural networks, and single-task modeling based on gradient boosting machines and logistic regression.

The trained models, which predict either side effects or indications, obtained significantly better results than baseline models that use a single direct co-occurrence feature. The results demonstrate the advantage of a comprehensive representation.

Selecting the appropriate representation has an immense impact on the predictive performance of machine learning models. Our proposed representation is powerful, as it spans multiple medical domains and can be used to predict a wide range of relation types.

The discovery of new relations between various medical entities can be translated into meaningful insights, for example, related to drug development or disease understanding. Our representation of medical entities can be used to train models that predict such relations, thus accelerating healthcare-related discoveries.
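The contrast between a co-occurrence feature vector and a single direct co-occurrence feature can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the toy citations, the term names, and the functions `cooccurrence_counts`, `feature_vector`, and `direct_cooccurrence` are all hypothetical, and the abstract does not specify how the actual features are computed or normalized.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citation is the set of curated terms attached to it.
citations = [
    {"aspirin", "headache", "nausea"},
    {"aspirin", "headache"},
    {"aspirin", "bleeding"},
    {"ibuprofen", "headache", "nausea"},
]


def cooccurrence_counts(citations):
    """Count, for every ordered pair of distinct terms, the citations containing both."""
    counts = Counter()
    for terms in citations:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1  # store both orders so lookups are symmetric
    return counts


def feature_vector(entity, vocabulary, counts):
    """Represent an entity by its co-occurrence count with every vocabulary term."""
    return [counts[(entity, term)] for term in vocabulary]


def direct_cooccurrence(a, b, counts):
    """Baseline-style single feature: how often the two terms appear together."""
    return counts[(a, b)]


vocabulary = sorted(set().union(*citations))
counts = cooccurrence_counts(citations)

aspirin_vec = feature_vector("aspirin", vocabulary, counts)
ibuprofen_vec = feature_vector("ibuprofen", vocabulary, counts)
baseline = direct_cooccurrence("aspirin", "headache", counts)
```

In this sketch, a model for a drug and side-effect pair could take the drug's vector together with the side effect's vector as input, whereas the baseline sees only the one count for that pair.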

Funding Information

European Union’s Horizon
