A method of finding optimal weight factors for compound identification in gas chromatography–mass spectrometry
Open Access
- 13 February 2012
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 28 (8), 1158-1163
- https://doi.org/10.1093/bioinformatics/bts083
Abstract
Motivation: Compound identification in gas chromatography–mass spectrometry (GC–MS) is achieved by matching the experimental mass spectrum to the mass spectra in a spectral library. It is known that the peak intensities at higher m/z values in a GC–MS mass spectrum are the most diagnostic. Therefore, to increase the relative significance of peak intensities at higher m/z values, the intensities and m/z values are usually transformed with a set of weight factors. Poorly chosen weight factors can significantly decrease the accuracy of compound identification. With the significant enrichment of mass spectral databases and the broad application of GC–MS, it is important to revisit the methods of discovering the optimal weight factors for high-confidence compound identification.
Results: We developed a novel approach to finding the optimal weight factors for high-accuracy compound identification using only a reference library. The developed approach first calculates the ratio of skewness to kurtosis of the mass spectral similarity scores among spectra (compounds) in a reference library and then takes the weight factor with the maximum ratio as the optimal weight factor. We examined our approach by comparing the accuracy of compound identification using the mass spectral library maintained by the National Institute of Standards and Technology. The results demonstrate that the optimal weight factors for fragment ion peak intensity and m/z value found by the developed approach outperform the current weight factors for compound identification.
Availability: The results and R package are available at http://stage.louisville.edu/faculty/x0zhan17/software/software-development.
Contact: s0kim023@louisville.edu; xiang.zhang@louisville.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
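The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R package. Several details are assumptions made for illustration only: the weighted peak is taken as intensity^a × (m/z)^b, similarity is the cosine between weighted spectra, kurtosis is the non-excess fourth standardized moment, and the toy library, the grids and all function names are hypothetical.

```python
import itertools
import math


def weight_spectrum(spectrum, a, b):
    """Apply weight factors (a, b) to a spectrum given as {m/z: intensity}.

    Assumed transform: weighted peak = intensity**a * mz**b.
    """
    return {mz: (inten ** a) * (mz ** b) for mz, inten in spectrum.items()}


def cosine_similarity(x, y):
    """Cosine similarity between two weighted spectra (dicts keyed by m/z)."""
    dot = sum(v * y.get(mz, 0.0) for mz, v in x.items())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


def skewness_kurtosis_ratio(scores):
    """Ratio of sample skewness to (non-excess) kurtosis of the scores."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n
    m3 = sum((s - mean) ** 3 for s in scores) / n
    m4 = sum((s - mean) ** 4 for s in scores) / n
    if m2 == 0.0:
        return float("-inf")  # degenerate case: all scores identical
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return skewness / kurtosis


def best_weight_factors(library, a_grid, b_grid):
    """Grid-search (a, b), keeping the pair with the largest ratio.

    Scores are computed between every pair of spectra in the library,
    so no experimental (query) spectra are needed.
    """
    best = None
    for a, b in itertools.product(a_grid, b_grid):
        weighted = [weight_spectrum(s, a, b) for s in library]
        scores = [cosine_similarity(x, y)
                  for x, y in itertools.combinations(weighted, 2)]
        ratio = skewness_kurtosis_ratio(scores)
        if best is None or ratio > best[0]:
            best = (ratio, a, b)
    return best


# Hypothetical toy library: four spectra as {m/z: intensity}.
toy_library = [
    {41: 30.0, 43: 100.0, 57: 80.0, 71: 40.0},
    {41: 25.0, 43: 90.0, 57: 100.0, 85: 35.0},
    {77: 100.0, 91: 60.0, 105: 20.0},
    {43: 15.0, 77: 70.0, 91: 100.0, 119: 45.0},
]

ratio, a_opt, b_opt = best_weight_factors(
    toy_library, a_grid=[0.5, 0.6, 1.0], b_grid=[0, 1, 2, 3])
print(f"selected a={a_opt}, b={b_opt}, skewness/kurtosis={ratio:.4f}")
```

Because every score is computed between pairs of library spectra, the search needs no experimental query spectra, which matches the abstract's statement that the weight factors are found through a reference library alone. A real library would require a much larger set of spectra and a finer grid than this toy example.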