Sparse regression with exact clustering

Open Access

1 January 2010

journal article
Published by Institute of Mathematical Statistics in Electronic Journal of Statistics

Vol. 4, 1055-1096
https://doi.org/10.1214/10-ejs578

Abstract

This paper studies a generic sparse regression problem with a customizable sparsity pattern matrix, motivated by, but not limited to, a supervised gene clustering problem in microarray data analysis. The clustered lasso method is proposed with the l₁-type penalties imposed on both the coefficients and their pairwise differences. Somewhat surprisingly, it behaves differently than the lasso or the fused lasso – the exact clustering effect expected from the l₁ penalization is rarely seen in applications. An asymptotic study is performed to investigate the power and limitations of the l₁-penalty in sparse regression. We propose to combine data-augmentation and weights to improve the l₁ technique. To address the computational issues in high dimensions, we successfully generalize a popular iterative algorithm both in practice and in theory and propose an ‘annealing’ algorithm applicable to generic sparse regressions (including the fused/clustered lasso). Some effective accelerating techniques are further investigated to boost the convergence. The accelerated annealing (AA) algorithm, involving only matrix multiplications and thresholdings, can handle a large design matrix as well as a large sparsity pattern matrix.
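As a rough illustration of the penalty structure described in the abstract, the sketch below builds a pairwise-difference matrix, evaluates an objective with l₁ penalties on both the coefficients and their pairwise differences, and runs a plain iterative soft-thresholding loop. The names `lam1`, `lam2`, `T`, and the 1/2 scaling of the squared loss are assumptions made for this example and are not taken from the paper. The loop handles only the ordinary lasso case (no difference penalty) and is not the paper's accelerated annealing algorithm; it is included only to show what "matrix multiplications and thresholdings" can look like in practice.

```python
import itertools
import numpy as np

def pairwise_difference_matrix(p):
    """Rows encode beta_i - beta_j for all i < j.

    This is one possible choice of sparsity pattern matrix (an assumption).
    """
    pairs = list(itertools.combinations(range(p), 2))
    T = np.zeros((len(pairs), p))
    for row, (i, j) in enumerate(pairs):
        T[row, i] = 1.0
        T[row, j] = -1.0
    return T

def clustered_lasso_objective(X, y, beta, lam1, lam2):
    """Squared loss plus l1 penalties on beta and on its pairwise differences."""
    residual = y - X @ beta
    T = pairwise_difference_matrix(len(beta))
    return (0.5 * residual @ residual
            + lam1 * np.abs(beta).sum()
            + lam2 * np.abs(T @ beta).sum())

def soft_threshold(z, lam):
    """Elementwise soft-thresholding: shrink toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def ista_lasso(X, y, lam, n_iter=500):
    """Iterative soft-thresholding for the plain lasso (lam2 = 0 case only)."""
    # Step size 1 / ||X||_2^2 keeps the gradient step non-expansive.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        gradient_step = beta + step * X.T @ (y - X @ beta)
        beta = soft_threshold(gradient_step, step * lam)
    return beta
```

With an identity design matrix the loop reduces to a single soft-thresholding of `y`, which gives a quick sanity check on the implementation.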

Cited by 78 articles