Focus on the spectra that matter by clustering of quantification data in shotgun proteomics
Open Access
- 26 June 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Communications
- Vol. 11 (1), 1-12
- https://doi.org/10.1038/s41467-020-17037-3
Abstract
In shotgun proteomics, the analysis of label-free quantification experiments is typically limited by the identification rate and the noise level in the quantitative data. This generally causes a low sensitivity in differential expression analysis. Here, we propose a quantification-first approach for peptides that reverses the classical identification-first workflow, thereby preventing valuable information from being discarded in the identification stage. Specifically, we introduce a method, Quandenser, that applies unsupervised clustering on both MS1 and MS2 level to summarize all analytes of interest without assigning identities. This reduces search time due to the data reduction. We can now employ open modification and de novo searches to identify analytes of interest that would have gone unnoticed in traditional pipelines. Quandenser+Triqler outperforms the state-of-the-art method MaxQuant+Perseus, consistently reporting more differentially abundant proteins for all tested datasets. Software is available for all major operating systems at https://github.com/statisticalbiotechnology/quandenser, under Apache 2.0 license.
Matching mass spectra to peptide sequences is the usual first step in proteomics data analysis, often followed by peptide quantification. Here, the authors show that clustering and quantifying mass spectral features prior to peptide identification can increase the sensitivity of label-free quantitative proteomics.
Funding Information
- Vetenskapsrådet (2017-04030)