A Novel Sparse Compositional Technique Reveals Microbial Perturbations
Open Access
- 26 February 2019
- journal article
- research article
- Published by American Society for Microbiology in mSystems
- Vol. 4 (1), e00016-19
- https://doi.org/10.1128/msystems.00016-19
Abstract
The central aims of many host or environmental microbiome studies are to elucidate factors associated with microbial community compositions and to relate microbial features to outcomes. However, these aims are often complicated by difficulties stemming from high-dimensionality, non-normality, sparsity, and the compositional nature of microbiome data sets. A key tool in microbiome analysis is beta diversity, defined by the distances between microbial samples. Many different distance metrics have been proposed, all with varying discriminatory power on data with differing characteristics. Here, we propose a compositional beta diversity metric rooted in a centered log-ratio transformation and matrix completion called robust Aitchison PCA. We demonstrate the benefits of compositional transformations upstream of beta diversity calculations through simulations. Additionally, we demonstrate improved effect size, classification accuracy, and robustness to sequencing depth over the current methods on several decreased sample subsets of real microbiome data sets. Finally, we highlight the ability of this new beta diversity metric to retain the feature loadings linked to sample ordinations revealing salient intercommunity niche feature importance.
IMPORTANCE By accounting for the sparse compositional nature of microbiome data sets, robust Aitchison PCA can yield high discriminatory power and salient feature ranking between microbial niches. The software to perform this analysis is available under an open-source license and can be obtained at https://github.com/biocore/DEICODE; additionally, a QIIME 2 plugin is provided to perform this analysis at https://library.qiime2.org/plugins/deicode/.
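The abstract names two ingredients, a centered log-ratio transformation and matrix completion, followed by a PCA whose sample distances serve as the beta diversity metric. The following is a minimal NumPy sketch of that idea, not the DEICODE implementation. It assumes that the log-ratio is taken over the nonzero entries of each sample only, and it swaps in a simple iterative truncated-SVD imputation as the matrix completion step; the solver used by the published software may differ. The function names (`rclr`, `complete_low_rank`, `robust_aitchison_sketch`), the `rank` and `n_iter` parameters, and the toy count table are all hypothetical and chosen for illustration.

```python
import numpy as np


def rclr(counts):
    """Centered log-ratio over the nonzero entries of each sample (row).

    Assumes every sample has at least one nonzero count.
    Zeros are treated as unobserved and returned as NaN.
    """
    counts = np.asarray(counts, dtype=float)
    observed = counts > 0
    # Take logs only where counts are positive; placeholders elsewhere.
    logs = np.where(observed, np.log(np.where(observed, counts, 1.0)), 0.0)
    # Mean of the log counts per sample, over observed entries only.
    row_means = logs.sum(axis=1, keepdims=True) / observed.sum(axis=1, keepdims=True)
    return np.where(observed, logs - row_means, np.nan), observed


def complete_low_rank(x, observed, rank=2, n_iter=200):
    """Illustrative matrix completion by iterative truncated-SVD imputation.

    This is a stand-in technique, assumed here for simplicity.
    """
    filled = np.where(observed, x, 0.0)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep observed values fixed, update only the unobserved entries.
        filled = np.where(observed, x, approx)
    return approx


def robust_aitchison_sketch(counts, rank=2):
    """Return sample coordinates, feature loadings, and sample distances."""
    x, observed = rclr(counts)
    low_rank = complete_low_rank(x, observed, rank=rank)
    # PCA on the completed matrix: center features, then SVD.
    centered = low_rank - low_rank.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    sample_coords = u[:, :rank] * s[:rank]
    feature_loadings = vt[:rank].T
    # Euclidean distances between samples in the ordination space.
    diffs = sample_coords[:, None, :] - sample_coords[None, :, :]
    distances = np.sqrt((diffs ** 2).sum(axis=-1))
    return sample_coords, feature_loadings, distances


# Toy table (made-up counts): rows are samples, columns are features.
toy_counts = np.array([
    [10, 20, 0, 1, 0, 5],
    [12, 18, 0, 2, 1, 4],
    [0, 1, 30, 25, 9, 0],
    [1, 0, 28, 22, 11, 0],
])
coords, loadings, dist = robust_aitchison_sketch(toy_counts, rank=2)
```

In this sketch the feature loadings come from the same decomposition as the sample coordinates, which is what allows features to be ranked against the ordination axes. The distance matrix could then be passed to any downstream beta diversity analysis.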
Funding Information
- HHS | National Institutes of Health (AR071731)
- National Science Foundation (1332344)
- National Science Foundation (1144086)
- U.S. Department of Energy (DE-SC0012658)
- U.S. Department of Energy (DE-SC0012586)