Extended Bayesian information criteria for model selection with large model spaces
Open Access
- 1 September 2008
- journal article
- Published by Oxford University Press (OUP) in Biometrika
- Vol. 95 (3), 759-771
- https://doi.org/10.1093/biomet/asn034
Abstract
The ordinary Bayesian information criterion is too liberal for model selection when the model space is large. In this paper, we re-examine the Bayesian paradigm for model selection and propose an extended family of Bayesian information criteria, which take into account both the number of unknown parameters and the complexity of the model space. Their consistency is established, in particular allowing the number of covariates to increase to infinity with the sample size. Their performance in various situations is evaluated by simulation studies. It is demonstrated that the extended Bayesian information criteria incur a small loss in the positive selection rate but tightly control the false discovery rate, a desirable property in many applications. The extended Bayesian information criteria are extremely useful for variable selection in problems with a moderate sample size but with a huge number of covariates, especially in genome-wide association studies, which are now an active area in genetics research.
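The abstract does not spell out the criterion. A commonly quoted form for a model with k selected covariates out of P candidates is EBIC_γ = −2 log L + k log n + 2γ log C(P, k), with 0 ≤ γ ≤ 1, where γ = 0 recovers the ordinary BIC. The sketch below illustrates that form only; the function and argument names are hypothetical, and whether an intercept or variance parameter is counted in k is an assumption left to the caller.

```python
import math


def log_binomial(p, k):
    """Return log C(p, k), computed via log-gamma to avoid overflow for large p."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)


def ebic(log_likelihood, n_selected, n_samples, n_covariates, gamma=1.0):
    """Extended BIC for a model using n_selected of n_covariates candidates.

    Assumed form (not stated in the abstract):
        -2 * logL + k * log(n) + 2 * gamma * log C(P, k)
    gamma = 0 reduces to the ordinary BIC.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma is assumed to lie in [0, 1]")
    bic = -2.0 * log_likelihood + n_selected * math.log(n_samples)
    model_space_penalty = 2.0 * gamma * log_binomial(n_covariates, n_selected)
    return bic + model_space_penalty


# Illustrative inputs only: a 2-covariate model chosen from 1000 candidates.
ordinary = ebic(-120.0, n_selected=2, n_samples=200, n_covariates=1000, gamma=0.0)
extended = ebic(-120.0, n_selected=2, n_samples=200, n_covariates=1000, gamma=1.0)
print(extended > ordinary)  # the extra term penalises the size of the model space
```

The extra term grows with the number of candidate models of a given size, which is how the criterion accounts for the complexity of the model space when the number of covariates is large relative to the sample size.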
This publication has 12 references indexed in Scilit:
- High-dimensional graphs and variable selection with the Lasso. The Annals of Statistics, 2006
- Genome-wide strategies for detecting multiple loci that influence complex diseases. Nature Genetics, 2005
- Model selection in irregular problems: Applications to mapping quantitative trait loci. Biometrika, 2004
- Modifying the Schwarz Bayesian Information Criterion to Locate Multiple Interacting Quantitative Trait Loci. Genetics, 2004
- A Model Selection Approach for the Identification of Quantitative Trait Loci in Experimental Crosses. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2002
- Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties. Journal of the American Statistical Association, 2001
- A strongly consistent procedure for model selection in a regression problem. Biometrika, 1989
- Asymptotic Optimality for $C_p, C_L$, Cross-Validation and Generalized Cross-Validation: Discrete Index Set. The Annals of Statistics, 1987
- Estimating the Dimension of a Model. The Annals of Statistics, 1978
- The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses. The Annals of Mathematical Statistics, 1938