A New Tag Index Scheme Enables Fast Peptide Retrieval for Protein Identification
Open Access
- 1 January 2022
- journal article
- research article
- Published by Scientific Research Publishing, Inc. in Journal of Computer and Communications
- Vol. 10 (04), 14-23
- https://doi.org/10.4236/jcc.2022.104002
Abstract
Sequence tag indexes are used in computational proteomics to speed up open-search-based identification of modified peptides and in-depth analysis of mass spectrometry data. Over the past ten years, sequence tag indexes have played a prominent role in protein-identification search engines because of their fast search speed. However, in pursuit of lower index space consumption, some protein search engines adopt excessively concise index schemes, which leads to a higher computational burden. We propose a new tag index scheme named TIIP that strikes a better balance between space and time complexity. TIIP has a unique two-level hierarchical index structure that allows rapid retrieval of all peptide sequences and their corresponding masses. In theory, the index space consumption of TIIP is not much higher than that of typical tag index schemes, while the time complexity of sequence retrieval is reduced to O(1); in practice, TIIP searches about one million times faster than a brute-force approach.
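The abstract does not describe TIIP's actual layout, so the following is only a generic sketch of how a two-level tag index can work, not the authors' implementation. It assumes a first level that maps each fixed-length tag to (peptide id, offset) entries and a second level that stores each peptide's sequence and precomputed mass. The names `build_index`, `lookup`, `brute_force`, and the tag length of 3 are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added to the residue sum for a free peptide


def peptide_mass(seq):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def build_index(peptides, tag_len=3):
    """Build a hypothetical two-level index (an assumption, not TIIP itself).

    Level 2 (table): peptide id -> (sequence, mass), computed once.
    Level 1 (tags):  tag string -> list of (peptide id, offset in peptide).
    """
    table = [(seq, peptide_mass(seq)) for seq in peptides]
    tags = defaultdict(list)
    for pid, (seq, _mass) in enumerate(table):
        for i in range(len(seq) - tag_len + 1):
            tags[seq[i:i + tag_len]].append((pid, i))
    return tags, table


def lookup(tags, table, tag):
    """Average O(1) hash lookup of the tag, then direct access by peptide id."""
    return [(table[pid][0], table[pid][1], offset)
            for pid, offset in tags.get(tag, [])]


def brute_force(peptides, tag):
    """Baseline: scan every peptide for the first occurrence of the tag."""
    return [(seq, peptide_mass(seq), seq.find(tag))
            for seq in peptides if tag in seq]


peptides = ["PEPTIDE", "TIDES", "GASP"]
tags, table = build_index(peptides)
hits = lookup(tags, table, "TID")
for seq, mass, offset in hits:
    print(f"{seq}\t{mass:.4f}\t{offset}")
```

In this sketch the indexed lookup costs one hash probe plus one table access per hit, while the baseline rescans every peptide and recomputes masses on each query. Storing the offsets and masses is what trades extra index space for retrieval time.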