Identifying metabolic enzymes with multiple types of association evidence

Open Access

29 March 2006

journal article
research article
Published by Springer Science and Business Media LLC in BMC Bioinformatics

Vol. 7 (1), 177
https://doi.org/10.1186/1471-2105-7-177

Abstract

Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions that cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. We present a novel method for identifying genes that encode a specific metabolic function, based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events, and others. Using the E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained by combining all of the data. In this way, our method ranks 60% of the enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and ranks the correct gene as the top candidate in 43% of cases. We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities.
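The general idea of ranking candidate genes by combined association evidence can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not say how the evidence types are combined, so the weighted sum, the averaging over network neighbors, the gene names, the scores, and all function names below are assumptions chosen only for illustration.

```python
from statistics import mean

def candidate_score(candidate, neighbors, evidence, weights):
    """Weighted sum, over evidence types, of the candidate's mean
    association with the genes neighboring the target function.
    ASSUMPTION: a linear combination; the abstract does not specify one.
    `neighbors` must be non-empty."""
    total = 0.0
    for ev_type, pair_scores in evidence.items():
        # Missing gene pairs are treated as "no evidence" (score 0.0).
        per_neighbor = [
            pair_scores.get(frozenset((candidate, n)), 0.0) for n in neighbors
        ]
        total += weights.get(ev_type, 0.0) * mean(per_neighbor)
    return total

def rank_candidates(candidates, neighbors, evidence, weights):
    """Return candidates sorted from best to worst combined score."""
    scored = [
        (candidate_score(c, neighbors, evidence, weights), c)
        for c in candidates
        if c not in neighbors
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [c for _, c in scored]

def fraction_in_top_k(ranks, k):
    """Fraction of known genes whose 1-based rank is at most k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical toy data: two evidence types, pairwise scores in [0, 1].
evidence = {
    "chromosome_clustering": {
        frozenset(("geneX", "geneA")): 0.9,
        frozenset(("geneY", "geneA")): 0.1,
    },
    "phylogenetic_profile": {
        frozenset(("geneX", "geneB")): 0.8,
        frozenset(("geneZ", "geneB")): 0.4,
    },
}
weights = {"chromosome_clustering": 1.0, "phylogenetic_profile": 1.0}
neighbors = ["geneA", "geneB"]  # genes of reactions adjacent to the target
candidates = ["geneX", "geneY", "geneZ"]

ranking = rank_candidates(candidates, neighbors, evidence, weights)
print(ranking)
```

A figure such as "60% within the top 10" corresponds to an evaluation in the style of `fraction_in_top_k`: each known enzyme-encoding gene is hidden in turn, all candidates are ranked for its function, and the rank of the true gene is recorded.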

