Abstract
With the advent of high-throughput sequencing technologies, the rapid generation and accumulation of large amounts of sequencing data have created a pressing demand for efficient algorithms for constructing whole-genome phylogenies. Existing phylogenomic methods all rely on assembled sequences, which are often unavailable owing to the difficulty of assembling short reads; this obstructs phylogenetic investigations of species without a reference genome. In this report, we present co-phylog, an assembly-free phylogenomic approach that creates a ‘micro-alignment’ at each ‘object’ in the sequence using the ‘context’ of the object, calculates pairwise distances, and then reconstructs the phylogenetic tree from those distances. We explored the use of its parameters and the optimal working range of co-phylog, and assessed it using both simulated next-generation sequencing (NGS) data and real raw NGS data. We also compared the co-phylog method with traditional alignment and alignment-free methods and illustrated its advantages and limitations. In conclusion, we demonstrated that co-phylog is an efficient algorithm and that it delivers high-resolution, accurate phylogenies from unassembled whole-genome sequencing data, especially for closely related organisms, thereby significantly alleviating the computational burden in the genomic era.
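The context-object idea described above can be illustrated with a minimal sketch. It is not the co-phylog implementation: the abstract does not specify the seed layout or the distance formula, so the sketch assumes a context made of two flanking stretches of `half` bases, an object of `obj_len` bases between them, and a distance equal to the fraction of shared contexts whose objects differ. The function names and parameters are hypothetical, and reverse complements, sequencing errors, and read-level input are ignored.

```python
from typing import Dict, Optional


def context_objects(seq: str, half: int = 4, obj_len: int = 1) -> Dict[str, Optional[str]]:
    """Map each context (left flank + right flank) to the object between them.

    Assumed layout: `half` bases, then `obj_len` object bases, then `half` bases.
    A context seen with more than one distinct object is marked None, because
    it cannot anchor an unambiguous micro-alignment.
    """
    table: Dict[str, Optional[str]] = {}
    span = 2 * half + obj_len
    for i in range(len(seq) - span + 1):
        window = seq[i:i + span]
        context = window[:half] + window[half + obj_len:]
        obj = window[half:half + obj_len]
        if context in table and table[context] != obj:
            table[context] = None  # ambiguous context, excluded later
        else:
            table[context] = obj
    return table


def context_object_distance(seq_a: str, seq_b: str, half: int = 4, obj_len: int = 1) -> float:
    """Fraction of shared, unambiguous contexts whose objects differ (assumed formula)."""
    a = context_objects(seq_a, half, obj_len)
    b = context_objects(seq_b, half, obj_len)
    shared = [c for c in a.keys() & b.keys() if a[c] is not None and b[c] is not None]
    if not shared:
        raise ValueError("no shared unambiguous contexts between the two sequences")
    differing = sum(1 for c in shared if a[c] != b[c])
    return differing / len(shared)


if __name__ == "__main__":
    # Toy sequences differing by a single substitution (G -> A at index 4).
    print(context_object_distance("AACCGGTTAC", "AACCAGTTAC", half=2))
```

Each shared context acts as an anchor for a tiny alignment of the two objects, so no global alignment or assembly is needed. A matrix of such pairwise values could then be passed to any distance-based tree-building method; the abstract does not say which one co-phylog uses.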