Genome evolution reveals biochemical networks and functional modules
- 12 December 2003
- journal article
- Published by Proceedings of the National Academy of Sciences in Proceedings of the National Academy of Sciences of the United States of America
- Vol. 100 (26), 15428-15433
- https://doi.org/10.1073/pnas.2136809100
Abstract
The analysis of completely sequenced genomes uncovers an astonishing variability between species in terms of gene content and order. During genome history, the genes are frequently rearranged, duplicated, lost, or transferred horizontally between genomes. These events appear to be stochastic, yet they are under selective constraints resulting from the functional interactions between genes. These genomic constraints form the basis for a variety of techniques that employ systematic genome comparisons to predict functional associations among genes. The most powerful techniques to date are based on conserved gene neighborhood, gene fusion events, and common phylogenetic distributions of gene families. Here we show that these techniques, if integrated quantitatively and applied to a sufficiently large number of genomes, have reached a resolution which allows the characterization of function at a higher level than that of the individual gene: global modularity becomes detectable in a functional protein network. In Escherichia coli, the predicted modules can be benchmarked by comparison to known metabolic pathways. We found as many as 74% of the known metabolic enzymes clustering together in modules, with an average pathway specificity of at least 84%. The modules extend beyond metabolism, and have led to hundreds of reliable functional predictions both at the protein and pathway level. The results indicate that modularity in protein networks is intrinsically encoded in present-day genomes.
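The workflow summarized in the abstract (integrate several genomic-context signals quantitatively, group linked genes into modules, then benchmark the modules against known pathways) can be illustrated with a small sketch. This is not the authors' scoring or clustering method, which the abstract does not specify. The gene names, evidence scores, pathway labels, the threshold of 0.7, the rule for combining scores (treating the three signals as independent), and the use of connected components as modules are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical evidence scores in [0, 1] for gene pairs, one per technique
# named in the abstract. All values are invented for illustration.
evidence = {
    ("geneA", "geneB"): {"neighborhood": 0.8, "fusion": 0.0, "cooccurrence": 0.5},
    ("geneB", "geneC"): {"neighborhood": 0.6, "fusion": 0.7, "cooccurrence": 0.0},
    ("geneC", "geneD"): {"neighborhood": 0.1, "fusion": 0.0, "cooccurrence": 0.2},
    ("geneD", "geneE"): {"neighborhood": 0.9, "fusion": 0.0, "cooccurrence": 0.9},
}

# Hypothetical pathway annotations used only for benchmarking.
pathway_of = {"geneA": "p1", "geneB": "p1", "geneC": "p2",
              "geneD": "p3", "geneE": "p3"}


def combined_score(scores):
    """Combine scores as 1 - prod(1 - s), assuming independent evidence."""
    remaining = 1.0
    for s in scores.values():
        remaining *= 1.0 - s
    return 1.0 - remaining


def find_modules(evidence, threshold=0.7):
    """Return connected components of the graph of links scoring >= threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), scores in evidence.items():
        root_a, root_b = find(a), find(b)
        if combined_score(scores) >= threshold:
            parent[root_a] = root_b

    groups = {}
    for gene in list(parent):
        groups.setdefault(find(gene), set()).add(gene)
    # Keep only groups with at least two genes, in a stable order.
    return sorted(sorted(m) for m in groups.values() if len(m) > 1)


def pathway_specificity(module, pathway_of):
    """Fraction of annotated module members that share the most common pathway."""
    labels = [pathway_of[g] for g in module if g in pathway_of]
    if not labels:
        return 0.0
    return Counter(labels).most_common(1)[0][1] / len(labels)


modules = find_modules(evidence)
for module in modules:
    print(module, round(pathway_specificity(module, pathway_of), 2))
```

With these invented inputs, the weak link between geneC and geneD falls below the threshold, so the genes split into two modules. Averaging `pathway_specificity` over all modules gives a figure analogous in spirit to the pathway specificity reported in the abstract, though the paper's exact definition may differ.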