A robust variable screening procedure for ultra-high dimensional data
- 30 May 2021
- journal article
- research article
- Published by SAGE Publications in Statistical Methods in Medical Research
- Vol. 30 (8), 1816-1832
- https://doi.org/10.1177/09622802211017299
Abstract
Variable selection in ultra-high dimensional regression problems has become an important issue. In such situations, penalized regression models may face computational problems, and some pre-screening of the variables may be necessary. A number of procedures for such pre-screening have been developed; among them, Sure Independence Screening (SIS) enjoys some popularity. However, SIS is vulnerable to outliers in the data, and in small samples in particular this may lead to faulty inference. In this paper, we develop a new robust screening procedure. We build on the density power divergence (DPD) estimation approach and introduce DPD-SIS and its extension, iterative DPD-SIS. We illustrate the behavior of the methods through extensive simulation studies and show that they are superior to both the original SIS and other robust methods when there are outliers in the data. Finally, we illustrate their use in a study on the regulation of lipid metabolism.
Funding Information
- Department of Science and Technology, Government of India (INSPIRE Faculty Research Grant)
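The screening idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default model size d = n / log(n), the tuning value alpha = 0.3, and the reweighted least-squares iteration are all assumptions chosen for illustration. `sis_screen` ranks covariates by absolute marginal correlation (classical SIS); `dpd_screen` ranks them by the absolute slope of a marginal regression fitted by minimizing the density power divergence under an assumed normal error model, which downweights observations with large residuals.

```python
import numpy as np


def sis_screen(X, y, d=None):
    """Classical SIS: keep the d covariates with largest |marginal correlation|."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # assumed default model size
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    omega = np.abs(Xs.T @ ys) / n  # marginal correlations
    return np.argsort(-omega)[:d]


def dpd_marginal_slope(x, y, alpha=0.3, n_iter=50):
    """Slope of y = a + b*x + normal error, fitted by a minimum-DPD
    fixed-point iteration (illustrative sketch, assumed normal model)."""
    n = len(y)
    a, b = np.median(y), 0.0
    # Robust starting scale from the median absolute deviation.
    sigma2 = max((1.4826 * np.median(np.abs(y - a))) ** 2, 1e-12)
    Z = np.column_stack([np.ones(n), x])
    c = alpha / (1.0 + alpha) ** 1.5
    for _ in range(n_iter):
        r = y - a - b * x
        # Observations with large residuals receive weights near zero.
        w = np.exp(-alpha * r**2 / (2.0 * sigma2))
        # Weighted least-squares update of intercept and slope.
        a, b = np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * y))
        r = y - a - b * x
        # Scale update from the DPD estimating equation for sigma^2.
        denom = max(w.sum() - n * c, 1e-8)
        sigma2 = max(np.sum(w * r**2) / denom, 1e-12)
    return b


def dpd_screen(X, y, d=None, alpha=0.3):
    """Robust screening sketch: rank covariates by |minimum-DPD marginal slope|."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))
    # Standardize each covariate robustly (median and MAD).
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    Xs = (X - med) / mad
    slopes = np.array([dpd_marginal_slope(Xs[:, j], y, alpha) for j in range(p)])
    return np.argsort(-np.abs(slopes))[:d]
```

Both functions return column indices ordered from strongest to weakest marginal signal. Setting alpha close to zero makes the weights nearly constant, so the robust fit approaches ordinary least squares; larger alpha trades efficiency for resistance to outliers.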
This publication has 39 references indexed in Scilit:
- Feature Screening via Distance Correlation Learning. Journal of the American Statistical Association, 2012
- Principled sure independence screening for Cox models with ultra-high-dimensional covariates. Journal of Multivariate Analysis, 2012
- Stability Selection. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2010
- Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 2010
- Sure Independence Screening for Ultrahigh Dimensional Feature Space. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2008
- The sparsity and bias of the Lasso selection in high-dimensional linear regression. The Annals of Statistics, 2008
- Robust Linear Model Selection Based on Least Angle Regression. Journal of the American Statistical Association, 2007
- The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 2006
- Regression Approaches for Microarray Data Analysis. Journal of Computational Biology, 2003
- Robust and efficient estimation by minimising a density power divergence. Biometrika, 1998