Model-Free Feature Screening for Ultrahigh-Dimensional Data

1 December 2011

journal article
Published by Informa UK Limited in Journal of the American Statistical Association

Vol. 106 (496), 1464-1475
https://doi.org/10.1198/jasa.2011.tm10563

Abstract

With the recent explosion of scientific data of unprecedented size and complexity, feature ranking and screening are playing an increasingly important role in many scientific studies. In this article, we propose a novel feature screening procedure under a unified model framework, which covers a wide variety of commonly used parametric and semiparametric models. The new method does not require imposing a specific model structure on regression functions, and thus is particularly appealing to ultrahigh-dimensional regressions, where there are a huge number of candidate predictors but little information about the actual model forms. We demonstrate that, with the number of predictors growing at an exponential rate of the sample size, the proposed procedure possesses consistency in ranking, which is both useful in its own right and can lead to consistency in selection. The new procedure is computationally efficient and simple, and exhibits a competent empirical performance in our intensive simulations and real data analysis.
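The abstract does not state the ranking statistic itself, so the following is only a minimal sketch of what a model-free marginal screening step can look like. It assumes a utility of the form ω_k = (1/n) Σ_j { (1/n) Σ_i Z_ik · 1(Y_i < Y_j) }², where Z_k is the standardized k-th predictor; this form, the function names `marginal_utilities` and `screen`, and the simulated data are illustrative assumptions rather than details taken from the text above. Because the response enters only through the indicators 1(Y_i < Y_j), no regression model has to be specified.

```python
import numpy as np


def marginal_utilities(X, y):
    """Illustrative model-free utility for each column of X (assumed form)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    # Standardize predictors so utilities are comparable across columns.
    # Assumes no column is constant (otherwise the division is undefined).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # indicator[i, j] = 1 if y[i] < y[j], else 0
    indicator = (y[:, None] < y[None, :]).astype(float)
    # inner[j, k] = (1/n) * sum_i Z[i, k] * 1(y[i] < y[j])
    inner = indicator.T @ Z / n
    # Average the squared inner terms over j to get one utility per predictor.
    return (inner ** 2).mean(axis=0)


def screen(X, y, d):
    """Return the indices of the d predictors with the largest utilities."""
    omega = marginal_utilities(X, y)
    return np.argsort(omega)[::-1][:d]


# Simulated example (hypothetical data): more predictors than observations,
# with a nonlinear response that depends only on the first two predictors.
rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.standard_normal((n, p))
y = np.exp(X[:, 0] + X[:, 1] + 0.5 * rng.standard_normal(n))

top = screen(X, y, d=2)
print(top)
```

The utility depends on the response only through its ordering, so any strictly increasing transformation of y leaves the ranking unchanged. That is the sense in which this sketch avoids committing to a model form.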

Cited by 350 articles