Deep Architectures and Deep Learning in Chemoinformatics: The Prediction of Aqueous Solubility for Drug-Like Molecules
- 24 June 2013
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Modeling
- Vol. 53 (7), 1563-1575
- https://doi.org/10.1021/ci400187y
Abstract
Shallow machine learning methods have been applied to chemoinformatics problems with some success. As more data becomes available and more complex problems are tackled, deep machine learning methods may also become useful. Here, we present a brief overview of deep learning methods and show in particular how recursive neural network approaches can be applied to the problem of predicting molecular properties. However, molecules are typically described by undirected cyclic graphs, while recursive approaches typically use directed acyclic graphs. Thus, we develop methods to address this discrepancy, essentially by considering an ensemble of recursive neural networks associated with all possible vertex-centered acyclic orientations of the molecular graph. One advantage of this approach is that it relies only minimally on the identification of suitable molecular descriptors because suitable representations are learned automatically from the data. Several variants of this approach are applied to the problem of predicting aqueous solubility and tested on four benchmark data sets. Experimental results show that the performance of the deep learning methods matches or exceeds the performance of other state-of-the-art methods according to several evaluation metrics and expose the fundamental limitations arising from training sets that are too small or too noisy. A Web-based predictor, AquaSol, is available online through the ChemDB portal (cdb.ics.uci.edu) together with additional material.
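The idea of vertex-centered acyclic orientations mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`bfs_distances`, `orient_toward`, `all_orientations`) and the adjacency-dictionary input are hypothetical, the graph is assumed to be connected, and the tie-breaking rule for edges whose endpoints are equally far from the center (which arise in odd-sized rings) is an assumption of this sketch rather than something stated in the abstract. For each vertex, every edge is directed from the endpoint farther from that vertex to the endpoint nearer to it, which yields one directed acyclic graph per vertex.

```python
from collections import deque


def bfs_distances(adj, root):
    """Shortest-path distance (in bonds) from root to every vertex."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def orient_toward(adj, root):
    """Direct every edge toward root; returns a set of (source, target) pairs.

    Assumption of this sketch: ties in distance are broken by vertex index,
    so the ordering is strict and the result is guaranteed to be acyclic.
    """
    dist = bfs_distances(adj, root)

    def key(v):
        return (dist[v], v)

    edges = set()
    for u in adj:
        for w in adj[u]:
            if key(u) > key(w):  # u is farther from root than w
                edges.add((u, w))
    return edges


def all_orientations(adj):
    """One rooted DAG per vertex of the undirected graph."""
    return {root: orient_toward(adj, root) for root in adj}


# Toy graph: a three-membered ring (0, 1, 2) with a substituent (3) on vertex 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
dags = all_orientations(adj)
```

Here `dags[v]` holds the edges of the orientation centered on vertex `v`, in which `v` is the only vertex without outgoing edges. A recursive network could then be evaluated along each such DAG; how the per-vertex results are combined is outside the scope of this sketch.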