XenoSite: Accurately Predicting CYP-Mediated Sites of Metabolism with Neural Networks

Abstract
Understanding how xenobiotic molecules are metabolized is important because it influences the safety, efficacy, and dose of medicines and how they can be modified to improve these properties. The cytochrome P450s (CYPs) are proteins responsible for metabolizing 90% of drugs on the market, and many computational methods can predict which atomic sites of a molecule—sites of metabolism (SOMs)—are modified during CYP-mediated metabolism. This study improves on prior methods of predicting CYP-mediated SOMs by using new descriptors and machine learning based on neural networks. The new method, XenoSite, is faster to train and more accurate by as much as 4% or 5% for some isozymes. Furthermore, some “incorrect” predictions made by XenoSite were subsequently validated as correct predictions by revaluation of the source literature. Moreover, XenoSite output is interpretable as a probability, which reflects both the confidence of the model that a particular atom is metabolized and the statistical likelihood that its prediction for that atom is correct.