Abstract
In this work, we report the development of a novel QSAR technique that combines genetic algorithms and neural networks to select a subset of relevant descriptors and to build the optimal neural network architecture for QSAR studies. The technique uses a neural network to map the descriptors preselected by the genetic algorithm to the dependent property of interest. It differs from other variable selection techniques that combine genetic algorithms with neural networks in two main respects: (1) the variable selection search performed by the genetic algorithm is not constrained to a predefined number of descriptors, and (2) the optimal neural network architecture is explored in parallel with the variable selection by dynamically modifying the size of the hidden layer. Using both artificial data and real biological data, we show that this technique can be used to build both classification and regression models and that it outperforms simpler variable selection techniques, mainly on nonlinear data sets. The results obtained on real data are compared with previous work that used other modeling techniques. We also discuss some important issues in building QSAR models and good practices for QSAR studies.
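The two distinguishing features can be illustrated with a minimal sketch of how a genetic algorithm individual might encode both a descriptor subset of unconstrained size and a hidden-layer size. This is not the authors' implementation: the names `Chromosome`, `mutate`, `crossover`, `evolve`, and `toy_fitness`, the mutation rates, and the truncation selection scheme are all assumptions made for illustration. In a real study the fitness would be the validated performance of a neural network trained on the selected descriptors; here a hypothetical toy scoring function stands in for that step so the example stays self-contained.

```python
import random
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Chromosome:
    """Hypothetical encoding: one bit per descriptor plus a hidden-layer size."""
    mask: Tuple[bool, ...]  # mask[i] is True if descriptor i is selected
    n_hidden: int           # number of hidden neurons to use


def random_chromosome(n_desc: int, max_hidden: int, rng: random.Random) -> Chromosome:
    # Each descriptor is included independently, so the subset size is not fixed.
    mask = tuple(rng.random() < 0.5 for _ in range(n_desc))
    return Chromosome(mask, rng.randint(1, max_hidden))


def mutate(c: Chromosome, max_hidden: int, rng: random.Random,
           p_bit: float = 0.05, p_hidden: float = 0.2) -> Chromosome:
    # Flip descriptor bits, and occasionally grow or shrink the hidden layer by one.
    mask = tuple((not b) if rng.random() < p_bit else b for b in c.mask)
    n_hidden = c.n_hidden
    if rng.random() < p_hidden:
        n_hidden = min(max_hidden, max(1, n_hidden + rng.choice((-1, 1))))
    return Chromosome(mask, n_hidden)


def crossover(a: Chromosome, b: Chromosome, rng: random.Random) -> Chromosome:
    # Uniform crossover on the mask; hidden size inherited from either parent.
    mask = tuple(x if rng.random() < 0.5 else y for x, y in zip(a.mask, b.mask))
    return Chromosome(mask, rng.choice((a.n_hidden, b.n_hidden)))


def evolve(fitness: Callable[[Chromosome], float], n_desc: int, max_hidden: int = 8,
           pop_size: int = 30, generations: int = 40, seed: int = 0) -> Chromosome:
    rng = random.Random(seed)
    pop = [random_chromosome(n_desc, max_hidden, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; best individuals survive
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), max_hidden, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)


# Toy stand-in for "train a network and score it": rewards three hypothetical
# relevant descriptors, penalizes irrelevant ones and hidden sizes far from 3.
RELEVANT = {0, 3, 7}


def toy_fitness(c: Chromosome) -> float:
    selected = {i for i, b in enumerate(c.mask) if b}
    hits = len(selected & RELEVANT)
    false_positives = len(selected - RELEVANT)
    return hits - 0.5 * false_positives - 0.1 * abs(c.n_hidden - 3)


best = evolve(toy_fitness, n_desc=12)
print([i for i, b in enumerate(best.mask) if b], best.n_hidden)
```

Because descriptor selection and hidden-layer size live in the same individual, a single fitness evaluation scores both choices together, which is the sense in which the architecture is explored in parallel with the variable selection. A real fitness function would also need to handle the empty-subset case, which this toy scoring simply rates poorly.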