Multiple-Ligand-Based Virtual Screening: Methods and Applications of the MTree Approach

Abstract
We present a novel approach to ligand-based virtual screening that combines several query molecules into a multiple feature tree model called MTree. All molecules are described by the established feature tree descriptor, which is derived from a topological molecular graph. A new pairwise alignment algorithm produces a consistent topological molecular alignment based on chemically reasonable matching of corresponding functional groups. The resulting multiple feature tree models are applied in ligand-based virtual screening to identify new lead structures for chemical optimization. Retrospective virtual screening of a large candidate database with MTree models generated for angiotensin-converting enzyme and the α1a receptor yielded enrichment factors of up to 71 for the first 1% of the screened database. MTree models outperformed database searches using single feature trees in terms of hit rates and quality, and additionally identified alternative molecular scaffolds not present in any of the query molecules. Furthermore, the methodology identifies molecular features that are known to be important for affinity to the target.
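
To put the reported enrichment factor in context, the following minimal sketch computes an enrichment factor under the commonly used definition: the hit rate among the top-ranked fraction of the database divided by the hit rate in the whole database. It is assumed here that this definition matches the one behind the reported value. The function name `enrichment_factor` and the toy ranking are hypothetical illustrations and are not data from this study.

```python
def enrichment_factor(ranked_labels, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked database.

    ranked_labels: list of 0/1 flags (1 = known active), ordered from the
    best-scoring to the worst-scoring database molecule.
    Assumed definition: (actives in top / size of top) / (actives total / N).
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty ranking with at least one active")
    # Number of molecules in the inspected top fraction (at least one).
    n_top = max(1, int(round(n_total * fraction)))
    hits_top = sum(ranked_labels[:n_top])
    # Algebraically equal to (hits_top / n_top) / (n_actives / n_total).
    return (hits_top * n_total) / (n_top * n_actives)


# Toy ranking (synthetic, not from the paper): 10,000 molecules, 100 actives,
# 71 of which are ranked within the first 100 positions (the top 1%).
toy_ranking = [1] * 71 + [0] * 29 + [1] * 29 + [0] * 9871
print(enrichment_factor(toy_ranking, fraction=0.01))
```

Under this definition the enrichment factor at a 1% cutoff cannot exceed 100, so a value of 71 means that the top 1% of the ranking holds actives at 71 times the rate expected from random selection.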