Classifier Performance Estimation Under the Constraint of a Finite Sample Size: Resampling Schemes Applied to Neural Network Classifiers
- 1 August 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2007 International Joint Conference on Neural Networks
- No. 21614393, p. 1762-1766
- https://doi.org/10.1109/ijcnn.2007.4371224
Abstract
In a practical classifier design problem, the sample size is limited, and the available finite sample must be used both to design a classifier and to predict the classifier's performance for the true population. Since a larger sample is more representative of the population, it is advantageous to design the classifier with all the available cases and to use a resampling technique for performance prediction. We conducted a Monte Carlo simulation study to compare the ability of different resampling techniques to predict the performance of a neural network classifier designed with the available sample. We investigated a technique based on cross-validation, the leave-one-out method, and three different types of bootstrapping, namely the ordinary, .632, and .632+ bootstrap techniques. Our results indicated that, under the study conditions, there can be a large difference in the accuracy of the prediction obtained from different resampling methods, especially when the feature space dimensionality is relatively large and the sample size is small. Under these conditions, the .632 and .632+ bootstrap methods were superior to the other techniques studied. Although this investigation was performed under some specific conditions, it reveals important trends for the problem of classifier performance prediction under the constraint of a limited data set.
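As an illustration of the .632 bootstrap named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a misclassification error rate as the performance measure (the abstract does not say which measure the study used), swaps in a nearest-centroid rule for the neural network classifier, and averages the out-of-bag error per bootstrap replicate rather than per case, which is a common simplification. All function names are hypothetical. The .632+ variant is not shown.

```python
import random


def fit_centroids(X, y):
    """Stand-in classifier (assumption): per-class mean vectors, not a neural network."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict(model, x):
    """Assign x to the class whose centroid is nearest (squared Euclidean distance)."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))


def error_rate(model, X, y):
    """Fraction of cases in (X, y) that the model misclassifies."""
    wrong = sum(predict(model, xi) != yi for xi, yi in zip(X, y))
    return wrong / len(y)


def bootstrap_632(X, y, n_boot=200, seed=0):
    """Return resubstitution, out-of-bag, and .632 bootstrap error estimates."""
    rng = random.Random(seed)
    n = len(y)

    # Resubstitution error: train and test on the full sample (optimistically biased).
    resub = error_rate(fit_centroids(X, y), X, y)

    oob_errors = []
    for _ in range(n_boot):
        # Draw n cases with replacement to form the bootstrap training set.
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        held_out = [i for i in range(n) if i not in chosen]
        if not held_out:
            continue  # no out-of-bag cases in this replicate
        model = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        oob_errors.append(
            error_rate(model, [X[i] for i in held_out], [y[i] for i in held_out])
        )
    if not oob_errors:
        raise ValueError("no bootstrap replicate had out-of-bag cases")

    # Out-of-bag error: tested only on cases the model did not see (pessimistically biased).
    oob = sum(oob_errors) / len(oob_errors)

    # The .632 estimate blends the two biased estimates with fixed weights.
    return {"resub": resub, "oob": oob, "b632": 0.368 * resub + 0.632 * oob}
```

Calling `bootstrap_632(X, y)` with `X` as a list of feature vectors and `y` as a list of class labels returns the three estimates in a dictionary. The classifier that would actually be deployed is still the one trained on all available cases, and the resampling loop only estimates how well that design procedure generalizes.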