Leveraging skewed transcript abundance by RNA-Seq to increase the genomic depth of the tree of life

4 January 2010

journal article
research article
Published by Proceedings of the National Academy of Sciences in Proceedings of the National Academy of Sciences of the United States of America

Vol. 107 (4), 1476-1481
https://doi.org/10.1073/pnas.0910449107

Abstract

Assembling the tree of life is a major goal of biology, but progress has been hindered by the difficulty and expense of obtaining the orthologous DNA required for accurate and fully resolved phylogenies. Next-generation DNA sequencing technologies promise to accelerate progress, but sequencing the genomes of hundreds of thousands of eukaryotic species remains impractical. Eukaryotic transcriptomes, which are smaller than genomes and biased toward highly expressed genes that tend to be conserved, could potentially provide a rich set of phylogenetic characters. We sampled the transcriptomes of 10 mosquito species by assembling 36-bp sequence reads into phylogenomic data matrices containing hundreds of thousands of orthologous nucleotides from hundreds of genes. Analysis of these data matrices yielded robust phylogenetic inferences, even with data matrices constructed from surprisingly few sequence reads. This approach is more efficient, data-rich, and economical than traditional PCR-based and EST-based methods and provides a scalable strategy for generating phylogenomic data matrices to infer the branches and twigs of the tree of life.
