A machine learning-based framework for modeling transcription elongation
- 9 February 2021
- journal article
- research article
- Published by Proceedings of the National Academy of Sciences in Proceedings of the National Academy of Sciences of the United States of America
- Vol. 118 (6)
- https://doi.org/10.1073/pnas.2007450118
Abstract
RNA polymerase II (Pol II) generally pauses at certain positions along gene bodies, thereby interrupting the transcription elongation process, which is often coupled with various important biological functions, such as precursor mRNA splicing and gene expression regulation. Characterizing the transcriptional elongation dynamics can thus help us understand many essential biological processes in eukaryotic cells. However, experimentally measuring Pol II elongation rates is generally time and resource consuming. We developed PEPMAN (polymerase II elongation pausing modeling through attention-based deep neural network), a deep learning-based model that accurately predicts Pol II pausing sites based on the native elongating transcript sequencing (NET-seq) data. Through fully taking advantage of the attention mechanism, PEPMAN is able to decipher important sequence features underlying Pol II pausing. More importantly, we demonstrated that the analyses of the PEPMAN-predicted results around various types of alternative splicing sites can provide useful clues into understanding the cotranscriptional splicing events. In addition, associating the PEPMAN prediction results with different epigenetic features can help reveal important factors related to the transcription elongation process. All these results demonstrated that PEPMAN can provide a useful and effective tool for modeling transcription elongation and understanding the related biological factors from available high-throughput sequencing data.
Funding Information
- National Natural Science Foundation of China (61872216, 81630103, 31900862)
- Turing AI Institute of Nanjing and Zhongguancun Haihua Institute for Frontier Information Technology