Sparse regression with exact clustering
Open Access
- 1 January 2010
- journal article
- Published by Institute of Mathematical Statistics in Electronic Journal of Statistics
- Vol. 4, 1055-1096
- https://doi.org/10.1214/10-ejs578
Abstract
This paper studies a generic sparse regression problem with a customizable sparsity pattern matrix, motivated by, but not limited to, a supervised gene clustering problem in microarray data analysis. The clustered lasso method is proposed with the l1-type penalties imposed on both the coefficients and their pairwise differences. Somewhat surprisingly, it behaves differently than the lasso or the fused lasso – the exact clustering effect expected from the l1 penalization is rarely seen in applications. An asymptotic study is performed to investigate the power and limitations of the l1-penalty in sparse regression. We propose to combine data-augmentation and weights to improve the l1 technique. To address the computational issues in high dimensions, we successfully generalize a popular iterative algorithm both in practice and in theory and propose an ‘annealing’ algorithm applicable to generic sparse regressions (including the fused/clustered lasso). Some effective accelerating techniques are further investigated to boost the convergence. The accelerated annealing (AA) algorithm, involving only matrix multiplications and thresholdings, can handle a large design matrix as well as a large sparsity pattern matrix.
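To make the penalty described in the abstract concrete, the sketch below evaluates an l1 penalty on the coefficients plus an l1 penalty on their pairwise differences, and checks that the second term equals the l1 norm of a "sparsity pattern matrix" applied to the coefficients. This is only an illustration: the names `lam1`, `lam2`, `pairwise_difference_matrix`, and `clustered_lasso_penalty` are assumptions, and the paper's exact formulation (weights, scaling, a more general pattern matrix) may differ.

```python
from itertools import combinations


def pairwise_difference_matrix(p):
    """Build rows e_i - e_j for all i < j (an assumed form of the pattern matrix)."""
    rows = []
    for i, j in combinations(range(p), 2):
        row = [0.0] * p
        row[i], row[j] = 1.0, -1.0
        rows.append(row)
    return rows


def l1_of_product(matrix, beta):
    """Return the l1 norm of matrix @ beta using plain Python lists."""
    return sum(abs(sum(m * b for m, b in zip(row, beta))) for row in matrix)


def clustered_lasso_penalty(beta, lam1, lam2):
    """lam1 * sum_i |beta_i| + lam2 * sum_{i<j} |beta_i - beta_j| (hypothetical notation)."""
    sparsity_term = sum(abs(b) for b in beta)
    clustering_term = sum(abs(bi - bj) for bi, bj in combinations(beta, 2))
    return lam1 * sparsity_term + lam2 * clustering_term


# Illustrative coefficients: two equal entries (one cluster), one zero, one negative.
beta = [2.0, 2.0, 0.0, -1.0]
T = pairwise_difference_matrix(len(beta))
penalty = clustered_lasso_penalty(beta, lam1=1.0, lam2=0.5)
```

Coefficients that share a value contribute nothing to the pairwise-difference term, which is why this kind of penalty is expected to encourage clusters of equal coefficients; the abstract notes that exact clustering is nonetheless rarely seen in applications.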
This publication has 26 references indexed in Scilit:
- Sparse regression with exact clustering. Electronic Journal of Statistics, 2010
- Thresholding-based iterative selection procedures for model selection and shrinkage. Electronic Journal of Statistics, 2009
- The sparsity and bias of the Lasso selection in high-dimensional linear regression. The Annals of Statistics, 2008
- Rejoinder: One-step sparse estimates in nonconcave penalized likelihood models. The Annals of Statistics, 2008
- Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR. Biometrics, 2008
- Sparsity oracle inequalities for the Lasso. Electronic Journal of Statistics, 2007
- The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 2006
- Regularization and Variable Selection Via the Elastic Net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2005
- An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 2004
- On the Mann iterative process. Transactions of the American Mathematical Society, 1970