Digital DNA-DNA hybridization for microbial species delineation by means of genome-to-genome sequence comparison
Open Access
- 28 January 2010
- journal article
- Published by Springer Science and Business Media LLC in Standards in Genomic Sciences
- Vol. 2 (1), 117-134
- https://doi.org/10.4056/sigs.531120
Abstract
The pragmatic species concept for Bacteria and Archaea is ultimately based on DNA-DNA hybridization (DDH). While enabling the taxonomist, in principle, to obtain an estimate of the overall similarity between the genomes of two strains, this technique is tedious and error-prone and cannot be used to incrementally build up a comparative database. Recent technological progress in the area of genome sequencing calls for bioinformatics methods to replace the wet-lab DDH by in-silico genome-to-genome comparison. Here we investigate state-of-the-art methods for inferring whole-genome distances in their ability to mimic DDH. Algorithms to efficiently determine high-scoring sequence pairs or maximally unique matches perform well as a basis of inferring intergenomic distances. The examined distance functions, which are able to cope with heavily reduced genomes and repetitive sequence regions, outperform previously described ones regarding the correlation with and error ratios in emulating DDH. Simulation of incompletely sequenced genomes indicates that some distance formulas are very robust against missing fractions of genomic information. Digitally derived genome-to-genome distances show a better correlation with 16S rRNA gene sequence distances than DDH values. The future perspectives of genome-informed taxonomy are discussed, and the investigated methods are made available as a web service for genome-based species delineation.