Drug drug interaction extraction from biomedical literature using syntax convolutional neural network

Abstract
Motivation: Detecting drug-drug interactions (DDIs) has become a vital part of public health safety, so using text mining techniques to extract DDIs from biomedical literature has received great attention. However, this research is still at an early stage and its performance leaves much room for improvement.
Results: In this article, we present a DDI extraction method based on a syntax convolutional neural network (SCNN). In this method, a novel word embedding, the syntax word embedding, is proposed to exploit the syntactic information of a sentence. Position and part-of-speech features are then introduced to extend the embedding of each word. Next, an auto-encoder is introduced to encode the traditional bag-of-words feature (a sparse 0–1 vector) as a dense real-valued vector. Finally, a combination of embedding-based convolutional features and traditional features is fed to a softmax classifier to extract DDIs from biomedical literature. Experimental results on the DDIExtraction 2013 corpus show that SCNN achieves better performance (an F-score of 0.686) than other state-of-the-art methods.
Availability and Implementation: The source code is available for academic use at http://202.118.75.18:8080/DDI/SCNN-DDI.zip.
Contact: yangzh@dlut.edu.cn
Supplementary information: Supplementary data are available at Bioinformatics online.
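The data flow described in the Results can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: every function name, dimension, and weight below is a hypothetical placeholder, the weights are random rather than trained, the "word" table merely stands in for the syntax word embedding, and the number of output classes is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def extend_embeddings(word_ids, tag_ids, dist1, dist2, tables):
    """Concatenate word, part-of-speech and two position embeddings per token."""
    return np.concatenate(
        [tables["word"][word_ids],   # stand-in for the syntax word embedding
         tables["pos"][tag_ids],     # part-of-speech embedding
         tables["dist"][dist1],      # offset to the first drug (already shifted >= 0)
         tables["dist"][dist2]],     # offset to the second drug (already shifted >= 0)
        axis=1)

def conv_max_pool(x, filters, bias):
    """Convolve over token windows, then take the max over time per filter."""
    n_filters, window, dim = filters.shape
    n_windows = x.shape[0] - window + 1
    windows = np.stack([x[i:i + window] for i in range(n_windows)])
    maps = np.tanh(np.einsum("nwd,fwd->nf", windows, filters) + bias)
    return maps.max(axis=0)

def encode_bow(bow, w_enc, b_enc):
    """Encoder half of an auto-encoder: sparse 0-1 vector to dense real vector."""
    return 1.0 / (1.0 + np.exp(-(bow @ w_enc + b_enc)))

def classify(conv_feats, dense_feats, w_out, b_out):
    """Softmax over the concatenated convolutional and traditional features."""
    return softmax(np.concatenate([conv_feats, dense_feats]) @ w_out + b_out)

# Hypothetical sizes, chosen only to keep the example small.
n_tokens, n_classes = 7, 5
tables = {
    "word": rng.normal(size=(50, 8)),
    "pos": rng.normal(size=(10, 3)),
    "dist": rng.normal(size=(21, 2)),
}
word_ids = rng.integers(0, 50, size=n_tokens)
tag_ids = rng.integers(0, 10, size=n_tokens)
dist1 = rng.integers(0, 21, size=n_tokens)
dist2 = rng.integers(0, 21, size=n_tokens)

extended = extend_embeddings(word_ids, tag_ids, dist1, dist2, tables)
conv_feats = conv_max_pool(extended,
                           rng.normal(size=(4, 3, extended.shape[1])),
                           np.zeros(4))
bow = rng.integers(0, 2, size=30).astype(float)
dense_feats = encode_bow(bow, rng.normal(size=(30, 5)), np.zeros(5))
probs = classify(conv_feats, dense_feats,
                 rng.normal(size=(9, n_classes)), np.zeros(n_classes))
print(probs.shape, probs.sum())
```

In a real system the embedding tables, filters, encoder and output weights would be learned from the training corpus; the sketch only shows how the per-token features are extended and how the two feature families are joined before the classifier.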
Funding Information
  • the Natural Science Foundation of China (61070098, 61272373, 61340020, 61572102, 61572098)
  • the Fundamental Research Funds for the Central Universities (DUT13JB09, DUT14YQ213)
  • the Major State Research Development Program of China (2016YFC0901902)
