OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds
Open Access
- 8 April 2013
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 41 (10), 5149-5163
- https://doi.org/10.1093/nar/gkt216
Abstract
A crucial step in analyzing mRNA-Seq data is to accurately and efficiently map hundreds of millions of reads to the reference genome and exon junctions. Here we present OLego, an algorithm specifically designed for de novo mapping of spliced mRNA-Seq reads. OLego adopts a multiple-seed-and-extend scheme, and does not rely on a separate external aligner. It achieves high sensitivity of junction detection by strategic searches with small seeds (∼14 nt for mammalian genomes). To improve accuracy and resolve ambiguous mapping at junctions, OLego uses a built-in statistical model to score exon junctions by splice-site strength and intron size. Burrows–Wheeler transform is used in multiple steps of the algorithm to efficiently map seeds, locate junctions and identify small exons. OLego is implemented in C++ with fully multithreaded execution, and allows fast processing of large-scale data. We systematically evaluated the performance of OLego in comparison with published tools using both simulated and real data. OLego demonstrated better sensitivity, higher or comparable accuracy and substantially improved speed. OLego also identified hundreds of novel micro-exons (in vivo. OLego is freely available at http://zhanglab.c2b2.columbia.edu/index.php/OLego.
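As an illustration of the seed-mapping idea in the abstract, the following is a minimal sketch of exact seed lookup by backward search over a Burrows–Wheeler transform. It is not OLego's implementation: the function names (`build_fm_index`, `find_seed`, `seed_hits`), the toy genome, and the 6 nt seed size are assumptions chosen to keep the example small, and the index is built naively rather than with the memory-efficient structures a real aligner would need.

```python
from collections import Counter

def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table) for `text`."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array construction: fine for a toy genome only.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps to the sentinel
    # c_table[c]: number of characters in text strictly smaller than c.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for k in occ:
            occ[k][i + 1] = occ[k][i] + (1 if k == ch else 0)
    return sa, c_table, occ

def find_seed(seed, index):
    """Return sorted genome positions where `seed` matches exactly."""
    sa, c_table, occ = index
    lo, hi = 0, len(sa)  # half-open suffix-array interval
    for ch in reversed(seed):  # backward search: extend the match leftwards
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])

def seed_hits(read, k, index):
    """Split `read` into non-overlapping k-mers and map each one."""
    return [(off, find_seed(read[off:off + k], index))
            for off in range(0, len(read) - k + 1, k)]

# Hypothetical toy genome: two exons separated by an 18 nt intron.
EXON1 = "ACGTTGCAAGCT"
INTRON = "GTAAGTCCCCCCTTTTAG"
EXON2 = "TCGGATACCATG"
GENOME = EXON1 + INTRON + EXON2
INDEX = build_fm_index(GENOME)

# A spliced read covering the last 6 nt of exon 1 and the first 6 nt of exon 2.
READ = EXON1[-6:] + EXON2[:6]
HITS = seed_hits(READ, 6, INDEX)
```

In this toy case the two seeds of the read land on opposite sides of the intron, so the distance between their genome positions exceeds their distance in the read; a gap of that kind is what a seed-and-extend mapper would then examine as a candidate splice junction. Scoring the candidate by splice-site strength and intron size, as the abstract describes, is outside the scope of this sketch.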