Co-phylog: an assembly-free phylogenomic approach for closely related organisms

Open Access

18 January 2013

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 41 (7), e75
https://doi.org/10.1093/nar/gkt003

Abstract

With the advent of high-throughput sequencing technologies, the rapid generation and accumulation of large amounts of sequencing data create a pressing demand for efficient algorithms for constructing whole-genome phylogenies. Existing phylogenomic methods all use assembled sequences, which are often unavailable owing to the difficulty of assembling short reads; this obstructs phylogenetic investigations of species without a reference genome. In this report, we present co-phylog, an assembly-free phylogenomic approach that creates a ‘micro-alignment’ at each ‘object’ in the sequence using the ‘context’ of the object, then calculates pairwise distances and reconstructs the phylogenetic tree from those distances. We explored the use of its parameters and the optimal working range of co-phylog, and assessed it using both simulated next-generation sequencing (NGS) data and real raw NGS data. We also compared the co-phylog method with traditional alignment-based and alignment-free methods, and illustrated its advantages and limitations. In conclusion, we demonstrated that co-phylog is an efficient algorithm that delivers high-resolution, accurate phylogenies from unassembled whole-genome sequencing data, especially for closely related organisms, thereby significantly alleviating the computational burden in the genomic era.
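
The context/object idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation. It assumes, for illustration only, that the ‘context’ is a fixed number of bases (`half`, a hypothetical parameter) on each side of a single-base ‘object’, and that the distance between two samples is the fraction of shared contexts whose objects differ. The function names are likewise hypothetical, and the sketch ignores reverse complements and sequencing errors.

```python
from itertools import combinations


def context_object_map(seqs, half=4):
    """Map each context (`half` bases on either side) to its middle base.

    `seqs` is an iterable of reads or contigs; windows never span two
    sequences. Contexts seen with more than one object are discarded.
    The layout and the default `half` are assumptions for illustration.
    """
    table = {}
    ambiguous = set()
    width = 2 * half + 1
    for seq in seqs:
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            context = window[:half] + window[half + 1:]
            obj = window[half]
            if context in ambiguous:
                continue
            if table.setdefault(context, obj) != obj:
                # Same context with two different objects: drop it.
                del table[context]
                ambiguous.add(context)
    return table


def co_distance(map_a, map_b):
    """Fraction of shared contexts whose objects differ (None if none shared)."""
    shared = map_a.keys() & map_b.keys()
    if not shared:
        return None
    differing = sum(1 for c in shared if map_a[c] != map_b[c])
    return differing / len(shared)


def distance_matrix(samples, half=4):
    """Pairwise distances for a dict {sample name: list of sequences}."""
    maps = {name: context_object_map(seqs, half) for name, seqs in samples.items()}
    return {
        (a, b): co_distance(maps[a], maps[b])
        for a, b in combinations(sorted(maps), 2)
    }
```

Each shared context acts as a one-column ‘micro-alignment’, so no assembly or global alignment is needed. The resulting pairwise distances could then be passed to any distance-based tree builder, such as neighbor joining.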
