SeqFold: Genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data
Open Access
- 11 October 2012
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 23 (2), 377-387
- https://doi.org/10.1101/gr.138545.112
Abstract
We present an integrative approach, SeqFold, that combines high-throughput RNA structure profiling data with computational prediction for genome-scale reconstruction of RNA secondary structures. SeqFold transforms experimental RNA structure information into a structure preference profile (SPP) and uses it to select stable RNA structure candidates representing the structure ensemble. Under a high-dimensional classification framework, SeqFold efficiently matches a given SPP to the most likely cluster of structures sampled from the Boltzmann-weighted ensemble. SeqFold is able to incorporate diverse types of RNA structure profiling data, including parallel analysis of RNA structure (PARS), selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), fragmentation sequencing (FragSeq) data generated by deep sequencing, and conventional SHAPE data. Using the known structures of a wide range of mRNAs and noncoding RNAs as benchmarks, we demonstrate that SeqFold outperforms or matches existing approaches in accuracy and is more robust to noise in experimental data. Application of SeqFold to reconstruct the secondary structures of the yeast transcriptome reveals the diverse impact of RNA secondary structure on gene regulation, including translation efficiency, transcription initiation, and protein-RNA interactions. SeqFold can be easily adapted to incorporate any new types of high-throughput RNA structure profiling data and is widely applicable to analyze RNA structures in any transcriptome.
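The matching step described in the abstract (comparing an SPP against clusters of sampled structures) can be illustrated with a toy sketch. This is not SeqFold's implementation. It assumes that the SPP is a per-nucleotide preference for being paired, scaled to [0, 1]; that the sampled structures are already grouped into clusters and written in dot-bracket notation; and that Manhattan distance to each cluster centroid is the matching criterion. All function names and the example data are hypothetical.

```python
from typing import List, Sequence


def pairing_vector(dot_bracket: str) -> List[float]:
    """Encode a dot-bracket structure: 1.0 for a paired base, 0.0 for '.'."""
    return [0.0 if ch == "." else 1.0 for ch in dot_bracket]


def centroid(structures: Sequence[str]) -> List[float]:
    """Per-position fraction of cluster members in which the base is paired."""
    vectors = [pairing_vector(s) for s in structures]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute per-position differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_cluster(spp: Sequence[float], clusters: Sequence[Sequence[str]]) -> int:
    """Index of the cluster whose centroid is closest to the SPP (assumed criterion)."""
    distances = [manhattan(spp, centroid(c)) for c in clusters]
    return min(range(len(clusters)), key=distances.__getitem__)


def representative(structures: Sequence[str]) -> str:
    """Cluster member closest to its own centroid."""
    center = centroid(structures)
    return min(structures, key=lambda s: manhattan(pairing_vector(s), center))


# Hypothetical toy data: two pre-computed clusters for an 8-nt sequence.
clusters = [
    ["((....))", "((....))", "(......)"],  # hairpin-like structures
    ["........", "..(..)..", "........"],  # mostly unpaired structures
]
# Hypothetical SPP: both ends prefer pairing, the middle prefers being unpaired.
spp = [0.9, 0.8, 0.1, 0.1, 0.1, 0.1, 0.8, 0.9]

best = select_cluster(spp, clusters)
print(best, representative(clusters[best]))
```

In this toy case the hairpin-like cluster is selected because its centroid disagrees with the SPP at far fewer positions than the mostly unpaired cluster. A real pipeline would obtain the structures by sampling from the Boltzmann-weighted ensemble and would derive the SPP from PARS, SHAPE-Seq, FragSeq, or SHAPE measurements.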