ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data
- 20 January 2023
- journal article
- research article
- Published by American Association for the Advancement of Science (AAAS) in Science Advances
- Vol. 9 (3), eabq5072
- https://doi.org/10.1126/sciadv.abq5072
Abstract
Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.
This publication has 58 references indexed in Scilit:
- nhmmer: DNA homology search with profile HMMs. Bioinformatics, 2013
- STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 2012
- GENCODE: The reference human genome annotation for The ENCODE Project. Genome Research, 2012
- Landscape of transcription in human cells. Nature, 2012
- Functional consequences of developmentally regulated alternative splicing. Nature Reviews Genetics, 2011
- CD44 splice isoform switching in human and mouse epithelium is essential for epithelial-mesenchymal transition and breast cancer progression. Journal of Clinical Investigation, 2011
- Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 2010
- ZEB1 Enhances Transendothelial Migration and Represses the Epithelial Phenotype of Prostate Cancer Cells. Molecular Biology of the Cell, 2009
- Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Research, 2006
- An expectation-maximization algorithm for probabilistic reconstructions of full-length isoforms from splice graphs. Nucleic Acids Research, 2006