A survey of cross-validation procedures for model selection

Open Access

1 January 2010

journal article
research article
Published by Institute of Mathematical Statistics in Statistics Surveys

Vol. 4, 40-79
https://doi.org/10.1214/09-ss054

Abstract

Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its (apparent) universality. Many results exist on the model selection performance of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem at hand.
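
As an illustration of the two uses named in the abstract (risk estimation and model selection), the following is a minimal sketch of V-fold cross-validation in plain Python. It is not taken from the survey: the k-nearest-neighbour regressor, the synthetic data, the candidate list, and every function name (`v_fold_splits`, `knn_predict`, `cv_risk`, `select_k`) are assumptions chosen only to keep the example self-contained.

```python
import random

def v_fold_splits(n, v, seed=0):
    """Partition indices 0..n-1 into v disjoint validation blocks of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::v] for i in range(v)]

def knn_predict(train, x, k):
    """Average the responses of the k training points closest to x (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_risk(data, k, v=5, seed=0):
    """V-fold cross-validation estimate of the squared-error risk of k-NN."""
    fold_risks = []
    for block in v_fold_splits(len(data), v, seed):
        held_out = set(block)
        # Train on everything outside the current block, validate on the block.
        train = [p for i, p in enumerate(data) if i not in held_out]
        errors = [(knn_predict(train, data[i][0], k) - data[i][1]) ** 2 for i in block]
        fold_risks.append(sum(errors) / len(errors))
    return sum(fold_risks) / len(fold_risks)

def select_k(data, candidates, v=5, seed=0):
    """Return the candidate k with the smallest cross-validated risk, plus all risks."""
    risks = {k: cv_risk(data, k, v, seed) for k in candidates}
    return min(risks, key=risks.get), risks

# Hypothetical synthetic data: a noisy linear signal on [0, 1].
rng = random.Random(1)
xs = [rng.uniform(0, 1) for _ in range(60)]
data = [(x, 2.0 * x + rng.gauss(0, 0.1)) for x in xs]

best_k, risks = select_k(data, candidates=[1, 3, 5, 10, 30])
print("selected k:", best_k)
for k, r in sorted(risks.items()):
    print(f"k={k:2d}  estimated risk={r:.4f}")
```

The same split is reused for every candidate so that the estimated risks are compared on identical validation blocks. The number of folds `v` is left as a parameter here, since its choice depends on the problem.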

Cited by 2658 articles