Sparse Principal Component Analysis

1 June 2006

journal article
research article
Published by Taylor & Francis Ltd in Journal of Computational and Graphical Statistics

Vol. 15 (2), 265-286
https://doi.org/10.1198/106186006x113430

Abstract

Principal component analysis (PCA) is widely used in data processing and dimensionality reduction. However, each principal component is a linear combination of all the original variables, so the results are often difficult to interpret. We introduce a new method called sparse principal component analysis (SPCA), which uses the lasso (elastic net) to produce modified principal components with sparse loadings. We first show that PCA can be formulated as a regression-type optimization problem; sparse loadings are then obtained by imposing the lasso (elastic net) constraint on the regression coefficients. Efficient algorithms are proposed to fit our SPCA models for both regular multivariate data and gene expression arrays. We also give a new formula to compute the total variance of modified principal components. As illustrations, SPCA is applied to real and simulated data with encouraging results.
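
The regression view described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it handles a single component, skips the alternating updates a full SPCA fit would use, and relies on a hand-written coordinate-descent solver. The synthetic data, the penalty values (lam, lam_lasso) and the helper names soft_threshold and elastic_net are all assumptions made for illustration. The sketch first checks that ridge-regressing the first principal component on the data recovers the ordinary loadings, then adds a lasso penalty to the same regression so that some loadings become exactly zero.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam_ridge, lam_lasso, n_sweeps=300):
    """Coordinate descent (illustrative helper) for
    min_b ||y - X b||^2 + lam_ridge * ||b||^2 + lam_lasso * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)      # squared norm of each column
    r = y.copy()                       # residual y - X b (b starts at 0)
    for _ in range(n_sweeps):
        for j in range(p):
            r += X[:, j] * b[j]        # remove variable j from the fit
            rho = X[:, j] @ r          # correlation with partial residual
            b[j] = soft_threshold(rho, lam_lasso / 2.0) / (col_sq[j] + lam_ridge)
            r -= X[:, j] * b[j]        # put the updated variable j back
    return b

# Assumed synthetic data: variables 0-2 share one strong latent factor,
# variables 3-5 are pure noise.
rng = np.random.default_rng(0)
n = 200
factor = 3.0 * rng.standard_normal(n)
X = rng.standard_normal((n, 6))
X[:, :3] += factor[:, None]
X -= X.mean(axis=0)                    # centre the columns

# Ordinary PCA via the SVD: v1 holds the first loadings, z1 the scores.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
v1 = Vt[0]
z1 = X @ v1

# Ridge regression of the scores on X; after normalisation the
# coefficients coincide with the ordinary loadings v1.
lam = 1.0
b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(6), X.T @ z1)
ridge_loading = b_ridge / np.linalg.norm(b_ridge)

# Adding a lasso penalty to the same regression gives sparse loadings.
b_sparse = elastic_net(X, z1, lam_ridge=lam, lam_lasso=200.0)
sparse_loading = b_sparse / np.linalg.norm(b_sparse)

print("ordinary loadings:", np.round(v1, 3))
print("ridge loadings:   ", np.round(ridge_loading, 3))
print("sparse loadings:  ", np.round(sparse_loading, 3))
```

In this setup the ordinary loadings put small but nonzero weight on the noise variables, while the penalised regression sets those weights to exactly zero. Raising lam_lasso removes more variables from the component, and lowering it toward zero brings the loadings back to ordinary PCA.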

Cited by 1924 articles