DeepAffinity: interpretable deep learning of compound–protein affinity through unified recurrent and convolutional neural networks
- 15 February 2019
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 35 (18), 3329-3338
- https://doi.org/10.1093/bioinformatics/btz111
Abstract
Drug discovery demands rapid quantification of compound–protein interaction (CPI). However, there is a lack of methods that can predict compound–protein affinity from sequences alone with high applicability, accuracy and interpretability. We present a seamless integration of domain knowledge and learning-based approaches. Under novel representations of structurally annotated protein sequences, a semi-supervised deep learning model that unifies recurrent and convolutional neural networks is proposed to exploit both unlabeled and labeled data, jointly encoding molecular representations and predicting affinities. Our representations and models outperform conventional options, achieving relative error in IC50 within 5-fold for test cases and within 20-fold for protein classes not included in training. Performance for new protein classes with few labeled data is further improved by transfer learning. Furthermore, separate and joint attention mechanisms are developed and embedded in our model to add to its interpretability, as illustrated in case studies for predicting and explaining selective drug–target interactions. Lastly, alternative representations using protein sequences or compound graphs, and a unified RNN/GCNN-CNN model using graph CNN (GCNN), are also explored to reveal algorithmic challenges ahead. Data and source code are available at https://github.com/Shen-Lab/DeepAffinity. Supplementary data are available at Bioinformatics online.
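The phrase "relative error in IC50 within 5-fold" can be made concrete with a small calculation. The sketch below is not taken from the DeepAffinity code base. It assumes that error is summarized as a root-mean-square error on log10(IC50), so that 10 raised to that error gives a typical multiplicative (fold) error. The function names and the IC50 values are hypothetical and chosen only for illustration.

```python
import math


def fold_error(true_ic50, pred_ic50):
    """Multiplicative error between a measured and a predicted IC50.

    Always >= 1: a prediction 3x too high and one 3x too low both give 3.0.
    """
    ratio = pred_ic50 / true_ic50
    return max(ratio, 1.0 / ratio)


def rmse_log10(true_vals, pred_vals):
    """Root-mean-square error computed on log10(IC50)."""
    squared = [
        (math.log10(p) - math.log10(t)) ** 2
        for t, p in zip(true_vals, pred_vals)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical measured and predicted IC50 values (same units, e.g. nM).
true_ic50 = [10.0, 250.0, 4000.0]
pred_ic50 = [30.0, 100.0, 8000.0]

# Per-compound fold errors and a check against a 5-fold threshold.
folds = [fold_error(t, p) for t, p in zip(true_ic50, pred_ic50)]
all_within_5_fold = all(f <= 5.0 for f in folds)

# Assumption: 10 ** RMSE(log10 IC50) is used as the "typical" fold error.
typical_fold = 10 ** rmse_log10(true_ic50, pred_ic50)

print(folds, all_within_5_fold, typical_fold)
```

Under this assumption, a 5-fold error corresponds to an RMSE of about log10(5) ≈ 0.70 log units and a 20-fold error to about log10(20) ≈ 1.30 log units.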
Funding Information
- National Institute of General Medical Sciences
- National Institutes of Health (R35GM124952)
- Defense Advanced Research Projects Agency (FA8750-18-2-0027)
- Texas A&M High Performance Research Computing