RNA secondary structure prediction using deep learning with thermodynamic integration
Open Access
- 11 February 2021
- journal article
- research article
- Published by Springer Science and Business Media LLC in Nature Communications
- Vol. 12 (1), 1-9
- https://doi.org/10.1038/s41467-021-21194-4
Abstract
Accurate predictions of RNA secondary structures can help uncover the roles of functional non-coding RNAs. Although machine learning-based models have achieved high performance in terms of prediction accuracy, overfitting is a common risk for such highly parameterized models. Here we show that overfitting can be minimized when RNA folding scores learnt using a deep neural network are integrated together with Turner's nearest-neighbor free energy parameters. Training the model with thermodynamic regularization ensures that folding scores and the calculated free energy are as close as possible. In computational experiments designed for newly discovered non-coding RNAs, our algorithm (MXfold2) achieves the most robust and accurate predictions of RNA secondary structures without sacrificing computational efficiency compared to several other algorithms. The results suggest that integrating thermodynamic information could help improve the robustness of deep learning-based predictions of RNA secondary structure.
Accurately predicting the secondary structure of non-coding RNAs can help unravel their function. Here the authors propose a method integrating thermodynamic information and deep learning to improve the robustness of RNA secondary structure prediction compared to several existing algorithms.
Funding Information
- MEXT | Japan Society for the Promotion of Science (19H04210, 19K22897, 17H06410, 18J21767)
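The abstract's idea of thermodynamic regularization, keeping the learnt folding score close to the calculated free energy, can be illustrated with a minimal sketch. This is not the MXfold2 implementation: the function names, the squared-difference form of the penalty, and the `weight` value are assumptions made for illustration, and both quantities are assumed to be expressed on the same scale and with the same sign convention.

```python
def thermodynamic_penalty(folding_score, free_energy_score, weight=0.125):
    """Penalty that grows as the learnt score drifts from the thermodynamic one.

    Hypothetical helper: the squared-difference form and the default weight
    are illustrative assumptions, not values taken from the paper.
    """
    diff = folding_score - free_energy_score
    return weight * diff * diff


def regularized_loss(base_loss, folding_score, free_energy_score, weight=0.125):
    """Add the thermodynamic penalty to an arbitrary base training loss."""
    return base_loss + thermodynamic_penalty(folding_score, free_energy_score, weight)


if __name__ == "__main__":
    # When the two scores agree, the penalty vanishes and the base loss is unchanged.
    print(regularized_loss(1.0, 3.0, 3.0))
    # When they disagree, the loss increases with the squared gap.
    print(regularized_loss(1.0, 5.0, 3.0, weight=0.5))
```

In this sketch a larger `weight` pulls the learnt score more strongly toward the thermodynamic value, which is one way a model with many parameters might be discouraged from overfitting its training structures.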
This publication has 38 references indexed in Scilit.