Sybil: Methods and Software for Multiple Genome Comparison and Visualization
- 1 January 2007
- book chapter
- research article
- Published by Springer Science and Business Media LLC in Methods in molecular biology (Clifton, N.J.)
- Vol. 408, 93-108
- https://doi.org/10.1007/978-1-59745-547-3_6
Abstract
With the successful completion of genome sequencing projects for a variety of model organisms, the selection of candidate organisms for future sequencing efforts has been guided increasingly by a desire to enable comparative genomics. This trend has both depended on and encouraged the development of software tools that can elucidate and capitalize on the similarities and differences between genomes. “Sybil,” one such tool, is a primarily web-based software package whose main goal is to facilitate the analysis and visualization of comparative genome data, with a particular emphasis on protein and gene cluster data. Described herein are a two-phase protein clustering algorithm, used to generate protein clusters suitable for analysis through Sybil, and a method for creating graphical displays of protein or gene clusters that span multiple genomes. When combined, these two relatively simple techniques provide the user of the Sybil software (The Institute for Genomic Research [TIGR] Bioinformatics Department) with a browsable graphical display of his or her “input” genomes, showing which genes are conserved based on the parameters supplied to the protein clustering algorithm. For any given protein cluster, the graphical display consists of a local alignment of the genomes in which the clustered genes are located. The genomes are arranged in a vertical stack, as in a multiple alignment, and shaded areas are used to connect genes in the same cluster, thus displaying conservation at the protein level in the context of the underlying genomic sequences. The authors have found this display, and slight variants thereof, useful for a variety of annotation and comparison tasks, ranging from identifying “missed” gene models or single-exon discrepancies between orthologous genes, to finding large or small regions of conserved gene synteny, and investigating the properties of the breakpoints between such regions.
This publication has 21 references indexed in Scilit:
- Comparative Genomics of Emerging Human Ehrlichiosis Agents. PLoS Genetics, 2006
- The Genome Sequence of Trypanosoma cruzi, Etiologic Agent of Chagas Disease. Science, 2005
- SynBrowse: a synteny browser for comparative sequence analysis. Bioinformatics, 2005
- OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Research, 2003
- The Bioperl Toolkit: Perl Modules for the Life Sciences. Genome Research, 2002
- The Human Genome Browser at UCSC. Genome Research, 2002
- An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research, 2002
- Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. Journal of Molecular Biology, 2001
- CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 1994
- Basic local alignment search tool. Journal of Molecular Biology, 1990
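
Illustrative sketch (not from the chapter): the abstract describes genomes drawn as a vertical stack, with shaded areas connecting genes that belong to the same protein cluster. The Python below shows one way such connector shapes might be computed. It is not Sybil's implementation. The `Gene` record, the `cluster_connectors` function, and the `track_height` and `track_gap` parameters are all hypothetical names chosen for this example, and the sketch assumes gene coordinates are already expressed in a shared horizontal scale, ignoring strand and alignment offsets.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; not Sybil's actual data model."""
    genome: str
    start: int
    end: int
    cluster: Optional[str]  # protein cluster id, or None if unclustered


def cluster_connectors(genome_order, genes, track_height=10, track_gap=30):
    """Return one quadrilateral per pair of same-cluster genes that lie on
    vertically adjacent genomes in the stack.

    Each result is (cluster_id, [four (x, y) corner points]), running from
    the bottom edge of the upper gene to the top edge of the lower gene.
    """
    row = {name: i for i, name in enumerate(genome_order)}
    by_cluster = defaultdict(list)
    for gene in genes:
        if gene.cluster is not None:
            by_cluster[gene.cluster].append(gene)

    polygons = []
    for cluster_id, members in by_cluster.items():
        for upper in members:
            for lower in members:
                # Only connect a genome to the one drawn directly below it.
                if row[lower.genome] - row[upper.genome] != 1:
                    continue
                y_top = row[upper.genome] * (track_height + track_gap) + track_height
                y_bottom = row[lower.genome] * (track_height + track_gap)
                polygons.append((cluster_id, [
                    (upper.start, y_top), (upper.end, y_top),
                    (lower.end, y_bottom), (lower.start, y_bottom),
                ]))
    return polygons


# Made-up example data: three genomes, one cluster spanning all of them.
example_genes = [
    Gene("A", 100, 200, "c1"),
    Gene("B", 150, 260, "c1"),
    Gene("C", 90, 180, "c1"),
    Gene("B", 300, 400, None),   # unclustered gene: no connector
    Gene("A", 500, 600, "c2"),   # singleton cluster: no connector
]
example_polygons = cluster_connectors(["A", "B", "C"], example_genes)
```

Each returned polygon could be passed to any drawing layer as a filled shape. Restricting connectors to adjacent rows keeps the shading readable when a cluster spans many genomes, which is a choice made for this sketch and not something the abstract specifies.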