SFCscoreRF: A Random Forest-Based Scoring Function for Improved Affinity Prediction of Protein–Ligand Complexes
- 10 June 2013
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Modeling
- Vol. 53 (8), 1923-1933
- https://doi.org/10.1021/ci400120b
Abstract
A major shortcoming of empirical scoring functions for protein–ligand complexes is the low degree of correlation between predicted and experimental binding affinities, as frequently observed not only for large and diverse data sets but also for SAR series of individual targets. Improvements can be envisaged by developing new descriptors, employing larger training sets of higher quality, and resorting to more sophisticated regression methods. Herein, we describe the use of SFCscore descriptors to develop an improved scoring function by means of a PDBbind training set of 1005 complexes in combination with random forest for regression. This provided SFCscoreRF as a new scoring function with significantly improved performance on the PDBbind and CSAR–NRC HiQ benchmarks in comparison to previously developed SFCscore functions. A leave-cluster-out cross-validation and performance in the CSAR 2012 scoring exercise point out remaining limitations but also directions for further improvements of SFCscoreRF and empirical scoring functions in general.
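The leave-cluster-out cross-validation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how clusters are defined, so the sketch assumes each complex carries one cluster label (for example a protein family name). The function name `leave_cluster_out_splits` and the example labels are hypothetical and not taken from the paper. Fitting the regression model (random forest in the paper) on each training fold is omitted; only the splitting scheme is shown.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_cluster_out_splits(
    clusters: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_cluster, train_indices, test_indices) for each cluster.

    Every complex belonging to the held-out cluster goes to the test set,
    so the model is never evaluated on a cluster it was trained on.
    """
    # Group complex indices by their cluster label.
    members: Dict[Hashable, List[int]] = defaultdict(list)
    for index, cluster in enumerate(clusters):
        members[cluster].append(index)

    # Hold out one whole cluster at a time.
    for cluster, test_indices in members.items():
        held_out = set(test_indices)
        train_indices = [i for i in range(len(clusters)) if i not in held_out]
        yield cluster, train_indices, test_indices


# Hypothetical cluster labels, one per complex (assumption for illustration).
example_clusters = ["kinase", "protease", "kinase", "other", "protease", "kinase"]
splits = list(leave_cluster_out_splits(example_clusters))
for cluster, train, test in splits:
    print(f"hold out {cluster!r}: train on {train}, test on {test}")
```

Compared with a random split, this scheme keeps related complexes out of the training fold, which is why it tends to expose limitations that a random split can hide.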