The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

Open Access

29 November 2013

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 42 (D1), D206-D214
https://doi.org/10.1093/nar/gkt1226

Abstract

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.
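
The annotation cycle described in the abstract (annotate new proteins against a family collection, then feed public genomes back into that collection) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `FamilyCollection`, `annotate`, `annotation_cycle`, the shared k-mer scoring and the thresholds are hypothetical and are not the RAST or FIGfam implementation. Gene calling is skipped; the sketch starts from protein sequences.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


def kmers(seq: str, k: int = 4) -> Set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


@dataclass
class FamilyCollection:
    """Toy stand-in for a protein-family collection (hypothetical, not the FIGfam format)."""
    members: Dict[str, List[str]] = field(default_factory=dict)  # function -> sequences

    def add(self, function: str, protein: str) -> None:
        """Add a protein to the family that carries the given function."""
        self.members.setdefault(function, []).append(protein)

    def annotate(self, protein: str, k: int = 4, min_shared: int = 2) -> Optional[str]:
        """Return the function of the family sharing the most k-mers, or None."""
        query = kmers(protein, k)
        best_function, best_score = None, 0
        for function, sequences in self.members.items():
            # Score a family by its best-matching member (assumed scoring rule).
            score = max((len(query & kmers(s, k)) for s in sequences), default=0)
            if score > best_score:
                best_function, best_score = function, score
        return best_function if best_score >= min_shared else None


def annotation_cycle(collection: FamilyCollection,
                     proteins: List[str],
                     make_public: bool) -> Dict[str, Optional[str]]:
    """Annotate proteins; if the genome is public, feed annotated proteins back."""
    annotations = {p: collection.annotate(p) for p in proteins}
    if make_public:
        for protein, function in annotations.items():
            if function is not None:
                collection.add(function, protein)
    return annotations


if __name__ == "__main__":
    collection = FamilyCollection()
    collection.add("DNA polymerase (toy label)", "MKTAYIAKQR")
    print(annotation_cycle(collection, ["MKTAYIAKQL", "GGGGSSSSPP"], make_public=True))
```

In this sketch the collection grows only when `make_public` is true, mirroring the statement that public genomes populate the family collection and so improve later annotations.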
