miTAR: a hybrid deep learning-based approach for predicting miRNA targets

Abstract
MicroRNAs (miRNAs) have been shown to play essential roles in a wide range of biological processes. Many computational methods have been developed to identify targets of miRNAs. However, the majority of these methods depend on pre-defined features that require considerable effort and resources to compute and often prove suboptimal at predicting miRNA targets. We developed a novel hybrid deep learning-based (DL-based) approach that predicts miRNA targets with higher accuracy. This approach integrates convolutional neural networks (CNNs), which excel at learning spatial features, and recurrent neural networks (RNNs), which discern sequential features. Our approach can therefore learn both the intrinsic spatial and the sequential features of miRNA:target pairs. The inputs for our approach are raw sequences of miRNAs and genes, which are readily obtained. We applied our approach to two human datasets from recent miRNA target prediction studies and trained two models. We demonstrated that the two models consistently outperform the previous methods according to evaluation metrics on test datasets. Comparing our approach with currently available alternatives on independent datasets shows that our approach delivers substantial improvements in performance. We also show with multiple lines of evidence that our approach is more robust than other methods on small datasets. Our study is the first to compare multiple existing DL-based approaches on miRNA target prediction. Furthermore, we examined the contribution of a max pooling layer between the CNN and RNN and demonstrated that it improves the performance of all our models. Finally, we developed a unified model that is robust in fitting different input datasets. We present a new DL-based approach for predicting miRNA targets and demonstrate that our approach outperforms the current alternatives. We provide an easy-to-use tool, miTAR, at https://github.com/tjgu/miTAR.
Furthermore, our analysis indicates that max pooling generally benefits hybrid models and may help prevent overfitting in them.
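
The hybrid design described in the abstract (raw sequences in, a CNN, max pooling, then an RNN) can be illustrated with a minimal forward pass in plain Python. This is a sketch under stated assumptions, not the miTAR implementation: the one-hot encoding, the concatenation of miRNA and target sequences, the plain tanh recurrence, the layer sizes, and all function names (`one_hot`, `conv1d_relu`, `max_pool`, `simple_rnn`, `predict`, `random_params`) are hypothetical, and the weights are random rather than trained.

```python
import math
import random

NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-dimensional one-hot vectors.
    DNA-style 'T' is mapped to 'U'; unknown bases become all-zero vectors."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]


def conv1d_relu(x, kernels):
    """Valid 1D convolution over the sequence axis, followed by ReLU.
    kernels has shape [n_filters][width][channels]."""
    width = len(kernels[0])
    out = []
    for start in range(len(x) - width + 1):
        window = x[start:start + width]
        row = []
        for kernel in kernels:
            total = sum(kernel[i][c] * window[i][c]
                        for i in range(width)
                        for c in range(len(window[i])))
            row.append(max(0.0, total))
        out.append(row)
    return out


def max_pool(x, size):
    """Non-overlapping max pooling along the sequence axis.
    It shortens the sequence that the recurrent layer has to process."""
    pooled = []
    for start in range(0, len(x) - size + 1, size):
        block = x[start:start + size]
        pooled.append([max(step[f] for step in block)
                       for f in range(len(block[0]))])
    return pooled


def simple_rnn(x, w_in, w_rec):
    """Plain tanh recurrence (an assumed stand-in for the RNN layer);
    returns the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for step in x:
        hidden = [
            math.tanh(
                sum(w_in[h][f] * step[f] for f in range(len(step)))
                + sum(w_rec[h][j] * hidden[j] for j in range(len(hidden)))
            )
            for h in range(len(w_rec))
        ]
    return hidden


def random_params(n_filters=4, width=3, n_hidden=3, seed=0):
    """Untrained, randomly initialised weights (illustration only)."""
    rng = random.Random(seed)

    def u():
        return rng.uniform(-0.5, 0.5)

    return {
        "kernels": [[[u() for _ in range(4)] for _ in range(width)]
                    for _ in range(n_filters)],
        "w_in": [[u() for _ in range(n_filters)] for _ in range(n_hidden)],
        "w_rec": [[u() for _ in range(n_hidden)] for _ in range(n_hidden)],
        "w_out": [u() for _ in range(n_hidden)],
    }


def predict(mirna, target, params, pool_size=2):
    """Score a miRNA:target pair: CNN -> max pooling -> RNN -> sigmoid."""
    x = one_hot(mirna + target)  # assumed: the two sequences are concatenated
    features = conv1d_relu(x, params["kernels"])
    features = max_pool(features, pool_size)
    hidden = simple_rnn(features, params["w_in"], params["w_rec"])
    logit = sum(w * h for w, h in zip(params["w_out"], hidden))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Toy sequences chosen only for illustration.
    score = predict("UGAGGUAGUAGGUUGUAUAGUU",
                    "ACUAUACAACCUACUACCUCAAAGCUUAGCUUAGCUUAGC",
                    random_params())
    print(f"untrained score: {score:.3f}")
```

With a pooling size of 2, the recurrent layer sees roughly half as many time steps as the convolution produces, which is one intuition for why a pooling layer between the CNN and RNN could reduce the number of positions the RNN must model.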