An expectation-maximization algorithm for probabilistic reconstructions of full-length isoforms from splice graphs
Open Access
- 1 January 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 34 (10), 3150-3160
- https://doi.org/10.1093/nar/gkl396
Abstract
Reconstructing full-length transcript isoforms from sequence fragments (such as ESTs) is a major interest and challenge for bioinformatic analysis of pre-mRNA alternative splicing. This problem has been formulated as finding traversals across the splice graph, which is a directed acyclic graph (DAG) representation of gene structure and alternative splicing. In this manuscript we introduce a probabilistic formulation of the isoform reconstruction problem, and provide an expectation-maximization (EM) algorithm for its maximum likelihood solution. Using a series of simulated data and expressed sequences from real human genes, we demonstrate that our EM algorithm can correctly handle various situations of fragmentation and coupling in the input data. Our work establishes a general probabilistic framework for splice graph-based reconstructions of full-length isoforms.
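To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' formulation: it enumerates the source-to-sink paths of a toy splice graph as candidate isoforms and runs a generic mixture-model EM to estimate their relative frequencies from fragments. The toy graph, the fragment data, the contiguous-subpath compatibility rule, and all function names (`enumerate_paths`, `is_compatible`, `em_isoform_frequencies`) are assumptions made for illustration; the paper's actual likelihood model may differ.

```python
from typing import Dict, List, Sequence, Tuple

Path = Tuple[str, ...]


def enumerate_paths(graph: Dict[str, List[str]], source: str, sink: str) -> List[Path]:
    """Return all source-to-sink paths of a DAG given as adjacency lists."""
    if source == sink:
        return [(sink,)]
    paths: List[Path] = []
    for nxt in graph.get(source, []):
        for tail in enumerate_paths(graph, nxt, sink):
            paths.append((source,) + tail)
    return paths


def is_compatible(fragment: Sequence[str], isoform: Sequence[str]) -> bool:
    """Assumed rule: a fragment fits an isoform if it is a contiguous sub-path."""
    n = len(fragment)
    return any(
        tuple(isoform[i:i + n]) == tuple(fragment)
        for i in range(len(isoform) - n + 1)
    )


def em_isoform_frequencies(
    isoforms: Sequence[Path], fragments: Sequence[Path], n_iter: int = 200
) -> List[float]:
    """Generic EM for mixture proportions over candidate isoforms."""
    k = len(isoforms)
    compat = [
        [1.0 if is_compatible(f, iso) else 0.0 for iso in isoforms]
        for f in fragments
    ]
    theta = [1.0 / k] * k  # uniform starting point
    for _ in range(n_iter):
        counts = [0.0] * k
        # E-step: split each fragment across the isoforms it is compatible with,
        # in proportion to the current frequency estimates.
        for row in compat:
            weights = [t * c for t, c in zip(theta, row)]
            total = sum(weights)
            if total == 0.0:
                continue  # fragment matches no candidate isoform; skip it
            for i, w in enumerate(weights):
                counts[i] += w / total
        # M-step: renormalize the expected counts into frequencies.
        norm = sum(counts)
        if norm == 0.0:
            break
        theta = [c / norm for c in counts]
    return theta


# Toy splice graph (hypothetical): exon A, then either B or C, then D.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
isoforms = enumerate_paths(toy_graph, "A", "D")

# Hypothetical fragments: three support A-B, one supports C-D, three are ambiguous.
fragments = [("A", "B")] * 3 + [("C", "D")] + [("A",)] * 2 + [("D",)]

theta = em_isoform_frequencies(isoforms, fragments)
for iso, freq in zip(isoforms, theta):
    print("-".join(iso), round(freq, 3))
```

In this toy setting the ambiguous fragments are shared between the two candidate isoforms according to the current estimates, so the frequencies settle near 0.75 for A-B-D and 0.25 for A-C-D, matching the 3:1 ratio of the unambiguous fragments. The sketch ignores fragment length, sampling bias, and sequencing error, and enumerating all paths is only practical for small graphs.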