Bolt: a New Age Peptide Search Engine for Comprehensive MS/MS Sequencing Through Vast Protein Databases in Minutes
- 26 August 2019
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of the American Society for Mass Spectrometry
- Vol. 30 (11), 2408-2418
- https://doi.org/10.1007/s13361-019-02306-3
Abstract
Recent increases in mass spectrometry speed, sensitivity, and resolution now permit comprehensive proteomics coverage. However, the results are often hindered by sub-optimal data processing pipelines. In almost all MS/MS peptide search engines, users must limit their search space to a canonical database due to time constraints and q-value considerations, but this typically does not reflect the individual genetic variations of the organism being studied. In addition, engines will nearly always assume the presence of only fully tryptic peptides and limit PTMs to a handful. Even on high-performance servers, these search engines are computationally expensive, and most users decide to dial back their search parameters. We present Bolt, a new cloud-based search engine that can search more than 900,000 protein sequences (canonical, isoform, mutations, and contaminants) with 41 post-translational modifications and N-terminal and C-terminal partial tryptic search in minutes on a standard-configuration laptop. Along with increases in speed, Bolt provides an additional benefit of improvement in high-confidence identifications. Sixty-one percent of peptides uniquely identified by Bolt may be validated by strong fragmentation patterns, compared with 13% of peptides uniquely identified by SEQUEST and 6% of peptides uniquely identified by Mascot. Furthermore, 30% of unique Bolt identifications were verified by all three software packages on the longer gradient analysis, compared with only 20% and 27% for SEQUEST and Mascot identifications, respectively. Bolt represents, to the best of our knowledge, the first fully scalable, cloud-based quantitative proteomic solution that can be operated within a user-friendly graphical interface. Data are available via ProteomeXchange with identifier PXD012700.
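To illustrate why a partial tryptic search enlarges the search space compared with a fully tryptic one, the following is a minimal sketch, not Bolt's implementation. It assumes the commonly used simplified trypsin rule (cleave after K or R unless the next residue is P); the function names, the toy sequence, and the parameters are hypothetical and chosen only for this example.

```python
def cleavage_sites(protein: str) -> list[int]:
    """Return cut positions under the simplified trypsin rule (assumption):
    cleave after K or R unless the next residue is P.
    The sequence start and end are always included."""
    sites = [0]
    for i in range(len(protein) - 1):
        if protein[i] in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))
    return sites


def fully_tryptic(protein: str, missed: int = 0) -> set[str]:
    """Peptides whose both ends fall on cleavage sites,
    allowing up to `missed` internal missed cleavages."""
    sites = cleavage_sites(protein)
    last = len(sites) - 1
    peptides = set()
    for i in range(last):
        for j in range(i + 1, min(i + 1 + missed, last) + 1):
            peptides.add(protein[sites[i]:sites[j]])
    return peptides


def partial_tryptic(protein: str, missed: int = 0) -> set[str]:
    """Peptides with at least one tryptic end: every prefix (tryptic
    N-terminus kept) and every suffix (tryptic C-terminus kept) of
    each fully tryptic peptide."""
    peptides = set()
    for pep in fully_tryptic(protein, missed):
        for k in range(1, len(pep) + 1):
            peptides.add(pep[:k])   # C-terminal end is non-tryptic
            peptides.add(pep[-k:])  # N-terminal end is non-tryptic
    return peptides


if __name__ == "__main__":
    toy = "AKCDRPEFK"  # hypothetical toy sequence; the R-P bond is not cut
    full = fully_tryptic(toy)
    partial = partial_tryptic(toy)
    print(sorted(full))
    print(len(full), "fully tryptic vs", len(partial), "partial tryptic candidates")
```

Even for this nine-residue toy sequence the candidate set grows several-fold once one non-tryptic terminus is allowed. Adding variable modifications multiplies the candidates further, which is the cost that leads users of conventional engines to restrict their search parameters.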