Benchmarking of QSAR Models for Blood-Brain Barrier Permeation

Abstract
Using the largest available database of 328 blood-brain distribution (logBB) values, a quantitative benchmark was proposed to allow consistent comparison of the predictive accuracy of current and future logBB quantitative structure-activity relationship (logBB-QSAR) models. The usefulness of the benchmark was illustrated by comparing global and k-nearest neighbors (kNN) multiple linear regression (MLR) models based on linear free-energy relationship (LFER) descriptors with one non-LFER-based MLR model. The leave-one-out (LOO) and leave-group-out Monte Carlo (MC) cross-validation results (q² = 0.766, qms = 0.290, and qmsmc = 0.311) indicated that the LFER-based kNN-MLR model is currently one of the most accurate predictive logBB-QSAR models. The LOO, MC, and kNN-MLR methods have been implemented in the QSAR-BENCH program, which is freely available from www.dmitrykonovalov.org for academic use.
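
The two cross-validation schemes named above can be sketched for a plain (global) MLR model as follows. This is a minimal illustration, not the QSAR-BENCH implementation: the data are synthetic, the five descriptor columns are placeholders, and the function names (fit_mlr, loo_cv, mc_cv), the 20% test fraction, and the number of MC rounds are all assumptions. It also assumes the conventional definitions q² = 1 - PRESS/SS_total and a root-mean-square cross-validated error, which the abstract itself does not spell out.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict responses from intercept coef[0] and slopes coef[1:]."""
    return coef[0] + X @ coef[1:]

def loo_cv(X, y):
    """Leave-one-out CV: refit without each compound, then predict it."""
    n = len(y)
    pred = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        coef = fit_mlr(X[mask], y[mask])
        pred[i] = predict_mlr(coef, X[i:i + 1])[0]
    press = np.sum((y - pred) ** 2)                    # predictive residual sum of squares
    q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)     # assumed definition of q^2
    rms = np.sqrt(press / n)                           # assumed RMS cross-validated error
    return q2, rms

def mc_cv(X, y, test_fraction=0.2, n_rounds=200, seed=0):
    """Leave-group-out Monte Carlo CV: repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_fraction * n)))
    sq_errors = []
    for _ in range(n_rounds):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        coef = fit_mlr(X[train], y[train])
        sq_errors.append((y[test] - predict_mlr(coef, X[test])) ** 2)
    return np.sqrt(np.mean(np.concatenate(sq_errors)))

# Synthetic stand-in data (NOT the 328-compound logBB set).
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 5))                           # 5 placeholder descriptors
true_w = np.array([0.5, -0.7, 0.3, 0.9, -0.4])
y = 0.1 + X @ true_w + rng.normal(scale=0.3, size=60)

q2, rms_loo = loo_cv(X, y)
rms_mc = mc_cv(X, y)
print(f"LOO q2 = {q2:.3f}, LOO rms = {rms_loo:.3f}, MC rms = {rms_mc:.3f}")
```

A kNN-MLR variant would, under the same assumptions, refit the regression for each query compound using only its k nearest training compounds in descriptor space; the cross-validation loops would stay unchanged.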