Strategies to enable large-scale proteomics for reproducible research
Open Access
- 30 July 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Communications
- Vol. 11 (1), 1-13
- https://doi.org/10.1038/s41467-020-17641-3
Abstract
Reproducible research is the bedrock of experimental science. To enable the deployment of large-scale proteomics, we assess the reproducibility of mass spectrometry (MS) over time and across instruments and develop computational methods for improving quantitative accuracy. We perform 1560 data independent acquisition (DIA)-MS runs of eight samples containing known proportions of ovarian and prostate cancer tissue and yeast, or control HEK293T cells. Replicates are run on six mass spectrometers operating continuously with varying maintenance schedules over four months, interspersed with ~5000 other runs. We utilise negative controls and replicates to remove unwanted variation and enhance biological signal, outperforming existing methods. We also design a method for reducing missing values. Integrating these computational modules into a pipeline (ProNorM), we mitigate variation among instruments over time and accurately predict tissue proportions. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.
Funding Information
- Department of Health | National Health and Medical Research Council (GNT1138536, GNT1047070)
- Cancer Institute NSW (REG171150)
- NSW Ministry of Health (CMP-01)
- University of Sydney
- Medical Research Futures Fund