Prediction of Human Blood: Air Partition Coefficient: A Comparison of Structure‐Based and Property‐Based Methods
- 1 December 2003
- journal article
- research article
- Published by Wiley in Risk Analysis
- Vol. 23 (6), 1173-1184
- https://doi.org/10.1111/j.0272-4332.2003.00390.x
Abstract
In recent years, there has been increased interest in the development and use of quantitative structure-activity/property relationship (QSAR/QSPR) models. This is largely because experimental data are sparse and costly to obtain, while theoretical structural descriptors can be computed quickly and inexpensively. In this study, three linear regression methods, viz. principal component regression (PCR), partial least squares (PLS), and ridge regression (RR), were used to develop QSPR models for estimating the human blood:air partition coefficient (logPblood:air) of a group of 31 diverse low-molecular-weight volatile chemicals from their computed molecular descriptors. In general, RR was found to be superior to PCR or PLS. Models developed using parameters based solely on molecular structure were compared with linear regression (LR) models developed using experimental properties, including the saline:air partition coefficient (logPsaline:air) and the olive oil:air partition coefficient (logPolive oil:air), as independent variables. The comparison indicates that the structure-property correlations are comparable to the property-property correlations. The best models, however, were those that used rat logPblood:air as the independent variable. Haloalkane subgroups were modeled separately for comparative purposes and, although models based on the congeneric compounds were superior, the models developed on the complete set of diverse compounds were of acceptable quality. The structural descriptors were placed into one of three classes based on level of complexity: topostructural (TS), topochemical (TC), or three-dimensional/geometrical (3D). Modeling was performed using the structural descriptor classes both in a hierarchical fashion and separately. The results indicate that the highest-quality structure-based models, in terms of descriptor classes, were those derived using TC descriptors.
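As a rough illustration of the ridge regression approach named in the abstract, the following sketch fits a ridge model to a small set of correlated descriptors and scores it by leave-one-out cross-validation. Everything here is an assumption made for illustration: the data are synthetic random numbers (not the study's 31 chemicals or its descriptors), the penalty grid is arbitrary, and the helper names `solve`, `ridge_fit`, `predict`, and `loo_q2` are hypothetical. The abstract does not state which validation scheme or software the authors used.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_fit(X, y, lam):
    """Return (intercept, weights) minimizing ||y - b - Xw||^2 + lam * ||w||^2."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center the data so the intercept is not penalized.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations with the ridge penalty on the diagonal: (Xc'Xc + lam*I) w = Xc'yc
    A = [[sum(Xc[i][a] * Xc[i][k] for i in range(n)) + (lam if a == k else 0.0)
          for k in range(p)] for a in range(p)]
    rhs = [sum(Xc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    w = solve(A, rhs)
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return intercept, w

def predict(model, row):
    intercept, w = model
    return intercept + sum(wj * xj for wj, xj in zip(w, row))

def loo_q2(X, y, lam):
    """Leave-one-out cross-validated R^2: 1 - PRESS / total sum of squares."""
    n = len(y)
    press = 0.0
    for i in range(n):
        model = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        press += (y[i] - predict(model, X[i])) ** 2
    y_mean = sum(y) / n
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_tot

# Synthetic stand-in data: 31 "compounds", 10 highly correlated "descriptors"
# generated from 3 latent factors. These are NOT the study's data.
random.seed(0)
n_compounds, n_latent, n_desc = 31, 3, 10
X, y = [], []
for _ in range(n_compounds):
    z = [random.gauss(0, 1) for _ in range(n_latent)]
    X.append([z[j % n_latent] + random.gauss(0, 0.1) for j in range(n_desc)])
    y.append(1.5 * z[0] - 0.8 * z[1] + 0.3 * z[2] + random.gauss(0, 0.2))

# Arbitrary illustrative penalty grid; pick the value with the best cross-validated score.
scores = {lam: loo_q2(X, y, lam) for lam in (0.01, 0.1, 1.0, 10.0)}
best_lam = max(scores, key=scores.get)
print("cross-validated R^2 by penalty:", scores)
print("selected penalty:", best_lam)
```

The penalty term shrinks the coefficients of collinear descriptors toward zero, which is the usual motivation for preferring ridge regression over ordinary least squares when descriptors outnumber or are strongly correlated across a small set of compounds. PCR and PLS address the same problem by regressing on a reduced set of derived components instead.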