A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach
Open Access
- 17 September 2012
- journal article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 13 (1), 237
- https://doi.org/10.1186/1471-2105-13-237
Abstract
Yeasts are a model system for exploring eukaryotic genome evolution. Next-generation sequencing technologies are poised to vastly increase the number of yeast genome sequences, both from resequencing projects (population studies) and from de novo sequencing projects (new species). However, the annotation of genomes presents a major bottleneck for de novo projects, because it still relies on a process that is largely manual. Here we present the Yeast Genome Annotation Pipeline (YGAP), an automated system designed specifically for new yeast genome sequences lacking transcriptome data. YGAP does automatic de novo annotation, exploiting homology and synteny information from other yeast species stored in the Yeast Gene Order Browser (YGOB) database. The basic premises underlying YGAP's approach are that data from other species already tells us what genes we should expect to find in any particular genomic region and that we should also expect that orthologous genes are likely to have similar intron/exon structures. Additionally, it is able to detect probable frameshift sequencing errors and can propose corrections for them. YGAP searches intelligently for introns, and detects tRNA genes and Ty-like elements. In tests on Saccharomyces cerevisiae and on the genomes of Naumovozyma castellii and Tetrapisispora blattae newly sequenced with Roche-454 technology, YGAP outperformed another popular annotation program (AUGUSTUS). For S. cerevisiae and N. castellii, 91-93% of YGAP's predicted gene structures were identical to those in previous manually curated gene sets. YGAP has been implemented as a webserver with a user-friendly interface at http://wolfe.gen.tcd.ie/annotation.
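The conserved-synteny premise stated in the abstract, that gene order in related species indicates which genes to expect in a given region, can be illustrated with a minimal sketch. This is not YGAP's implementation and does not use the YGOB data format; the function names, the flat list of gene identifiers, and the toy gene labels are all hypothetical, chosen only to show the idea of listing expected genes between two already-annotated anchor genes.

```python
from typing import List


def expected_genes_between(reference_order: List[str],
                           left_anchor: str,
                           right_anchor: str) -> List[str]:
    """Return the reference genes lying strictly between two anchor genes.

    reference_order is a hypothetical gene order taken from a related,
    already-annotated species. The anchors may be given in either order.
    """
    i = reference_order.index(left_anchor)
    j = reference_order.index(right_anchor)
    lo, hi = sorted((i, j))
    return reference_order[lo + 1:hi]


def missing_candidates(reference_order: List[str],
                       annotated: List[str],
                       left_anchor: str,
                       right_anchor: str) -> List[str]:
    """List genes expected from synteny but not yet annotated in the new genome."""
    found = set(annotated)
    expected = expected_genes_between(reference_order, left_anchor, right_anchor)
    return [gene for gene in expected if gene not in found]


# Toy example with made-up gene labels (assumption, not real data).
reference = ["geneA", "geneB", "geneC", "geneD", "geneE"]
annotated_in_new_genome = ["geneA", "geneC", "geneE"]
candidates = missing_candidates(reference, annotated_in_new_genome, "geneA", "geneE")
print(candidates)
```

In this toy case the interval between the two anchors would be searched again for the genes reported as missing; a real pipeline would also have to handle rearrangements, gene loss, and duplicated regions, which this sketch ignores.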