Abstract
Using the DUD-E+ benchmark, we explore the impact of using a single protein pocket or ligand for virtual screening compared with using ensembles of alternative pockets, ligands, and sets thereof. For both structure-based and ligand-based approaches, the precise characterization of the binding site in question had a significant impact on screening performance. Using the single original DUD-E protein, Surflex-Dock yielded a mean ROC area of 0.81±0.11. Using the cognate ligand instead, with the eSim method for screening, yielded 0.77±0.14. Moving to ensembles of five protein pocket variants increased docking performance to 0.84±0.09. The result for the analogous ligand-based approach (using the five crystallographically aligned cognate ligands) was 0.83±0.11. Using the same ligands, but making use of an automatically generated mutual alignment, yielded a mean AUC nearly as good as that from single-structure docking: 0.80±0.12. Detailed results and statistical analyses show that structure-based and ligand-based methods are complementary and can be fruitfully combined to enhance screening efficiency. A hybrid approach combining ensemble docking with eSim-based screening produced the best and most consistent performance (mean ROC area of 0.89±0.08 and 1% early enrichment of 46-fold). Based on results from both the docking and ligand-similarity approaches, it is clearly unwise to make use of a single arbitrarily chosen protein structure for docking or a single ligand query for similarity-based screening.
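The two metrics quoted above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes that higher scores are better, that the ROC area is the probability that a randomly chosen active outscores a randomly chosen decoy, and that early enrichment is the true-positive rate at a fixed decoy false-positive rate divided by that rate (one common definition; the abstract does not state which one was used). The function names, and the max-over-variants rule in `ensemble_scores`, are hypothetical choices made for illustration only.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC area: chance that a random active outscores a random decoy.

    Ties count as half a win. Assumes higher score = better.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


def roc_enrichment(active_scores, decoy_scores, fpr=0.01):
    """True-positive rate at a decoy false-positive rate, divided by that rate.

    This is an assumed definition of "early enrichment", not one taken
    from the abstract.
    """
    decoys_sorted = sorted(decoy_scores, reverse=True)
    n_fp = int(fpr * len(decoys_sorted))  # decoys allowed above the cutoff
    if n_fp < len(decoys_sorted):
        # Actives must beat the first decoy that is NOT allowed through.
        threshold = decoys_sorted[n_fp]
    else:
        threshold = float("-inf")
    hits = sum(1 for a in active_scores if a > threshold)
    return (hits / len(active_scores)) / fpr


def ensemble_scores(per_variant_scores):
    """Combine per-variant score lists into one score per molecule.

    Assumed rule (hypothetical): keep each molecule's best score across
    the pocket or ligand variants.
    """
    return [max(scores) for scores in zip(*per_variant_scores)]


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the functions.
    actives = [9.0, 8.0, 5.0]
    decoys = [7.0, 4.0, 3.0, 2.0]
    print("ROC area:", roc_auc(actives, decoys))
    print("Enrichment at 25% FPR:", roc_enrichment(actives, decoys, fpr=0.25))
    print("Ensemble:", ensemble_scores([[1, 5, 2], [3, 4, 0]]))
```

A perfect early ranking under this definition gives an enrichment of 1/fpr, so the ceiling at 1% is 100-fold.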
Funding Information
  • National Institute of General Medical Sciences (R01-GM101689)
