Genetic Algorithm-Optimized QSPR Models for Bioavailability, Protein Binding, and Urinary Excretion

Abstract
In this work, a genetic algorithm (GA) was applied to build a set of QSPR (quantitative structure-property relationship) models for human absolute oral bioavailability, plasma protein binding, and urinary excretion, using counts of molecular fragments as descriptors. For each pharmacokinetic property, the consensus score of a set of 20 or 30 models was found to improve the correlation coefficient and reduce the standard error significantly. Key fragments that may boost or reduce pharmacokinetic properties were also identified. Database searches were performed for a set of key fragments identified by the bioavailability models. In MDDR (MDL Drug Data Report), a database of drugs and drug leads that have entered or are entering development, the hit rates of bioavailability-boosting fragments were significantly higher than those of bioavailability-reducing fragments. The opposite trend was observed in ACD (Available Chemicals Directory), a database of all kinds of available compounds.
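
To illustrate the consensus idea, the sketch below averages the predictions of several fragment-count models for one compound. It is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the consensus score is computed, so an arithmetic mean is assumed, each model is assumed to be linear in the fragment counts, and the fragment names, weights, and the functions `predict` and `consensus_score` are all hypothetical.

```python
from statistics import mean


def predict(model, fragment_counts):
    """Assumed linear model: intercept plus weighted fragment counts.

    Fragments absent from the compound contribute zero.
    """
    return model["intercept"] + sum(
        weight * fragment_counts.get(fragment, 0)
        for fragment, weight in model["weights"].items()
    )


def consensus_score(models, fragment_counts):
    """Assumed consensus rule: arithmetic mean of the model predictions."""
    return mean(predict(m, fragment_counts) for m in models)


# Hypothetical models; a real set would hold 20 or 30 GA-selected models.
# All weights are made-up illustration values, not results from the paper.
models = [
    {"intercept": 50.0, "weights": {"COOH": -10.0, "OH": 5.0}},
    {"intercept": 40.0, "weights": {"COOH": -5.0, "NH2": 10.0}},
]

# Fragment counts for one hypothetical compound.
compound = {"COOH": 1, "OH": 2}

score = consensus_score(models, compound)
print(score)
```

Averaging over many models that use different fragment subsets tends to cancel the errors of individual models, which is one plausible reason a consensus score would show a higher correlation and a lower standard error than a single model.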