Principal component analysis: a review and recent developments
- 13 April 2016
- journal article
- review article
- Published by The Royal Society in Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
- Vol. 374 (2065), 20150202
- https://doi.org/10.1098/rsta.2015.0202
Abstract
Large datasets are increasingly common and are often difficult to interpret. Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability but at the same time minimizing information loss. It does so by creating new uncorrelated variables that successively maximize variance. Finding such new variables, the principal components, reduces to solving an eigenvalue/eigenvector problem, and the new variables are defined by the dataset at hand, not a priori, hence making PCA an adaptive data analysis technique. It is adaptive in another sense too, since variants of the technique have been developed that are tailored to various different data types and structures. This article will begin by introducing the basic ideas of PCA, discussing what it can and cannot do. It will then describe some variants of PCA and their application.
Funding Information
- Portuguese Science Foundation FCT (PEst-OE/MAT/UI0006/2014)
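The abstract's statement that finding the principal components reduces to an eigenvalue/eigenvector problem can be illustrated with a minimal sketch. It is not taken from the article: the function name `pca_eig`, the use of the sample covariance matrix, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def pca_eig(X, n_components):
    """Illustrative PCA via eigendecomposition of the sample covariance matrix.

    X is an (n_samples, n_features) array. The name and signature are
    hypothetical, chosen only for this sketch.
    """
    # Centre each variable so the covariance is taken about the mean.
    Xc = X - X.mean(axis=0)
    # Sample covariance matrix of the variables (columns).
    cov = np.cov(Xc, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the directions of largest variance come first.
    order = np.argsort(eigvals)[::-1][:n_components]
    variances = eigvals[order]
    loadings = eigvecs[:, order]
    # Principal component scores: the new, uncorrelated variables.
    scores = Xc @ loadings
    return scores, loadings, variances

# Synthetic correlated data (an assumption, not data from the article).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
scores, loadings, variances = pca_eig(X, n_components=2)
```

In this sketch the columns of `scores` are uncorrelated and their variances equal the retained eigenvalues, in decreasing order, which matches the abstract's description of new variables that successively maximize variance.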
This publication has 38 references indexed in Scilit:
- Robust principal component analysis? Journal of the ACM, 2011
- Super-sparse principal component analyses for high-throughput genomic data. BMC Bioinformatics, 2010
- On Consistency and Sparsity for Principal Components Analysis in High Dimensions. Journal of the American Statistical Association, 2009
- A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 2009
- Sparse Principal Component Analysis. Journal of Computational and Graphical Statistics, 2006
- Functional Data Analysis. Published by Wiley, 2005
- Generalized Minkowski metrics for mixed feature-type data analysis. IEEE Transactions on Systems, Man, and Cybernetics, 1994
- Principal Variables. Technometrics, 1984
- The biplot graphic display of matrices with application to principal component analysis. Biometrika, 1971
- Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 1933