Research and Implementation of DNA Molecular Sequence Pattern Matching Algorithm Based on Bioinformatics
- 1 January 2023
- journal article
- Published by Hans Publishers in Computer Science and Application
- Vol. 13 (02), 236-250
- https://doi.org/10.12677/csa.2023.132024
Abstract
Bioinformatics integrates biological science with mathematics, information science, and computer technology to organize, sort, and summarize biological and medical information. DNA sequence alignment is one of the most important and fundamental research directions in bioinformatics and an important means of exploring the relationship between genes and diseases. The main objective of this paper is to find, in uncertain molecular sequence data, all sequences that are identical to the target sequence and whose occurrence probability is greater than a given threshold, and to report the total number of target sequences and the starting site of each. A DNA sequence pattern matching algorithm based on weighted suffix trees is proposed to address two limitations: existing “space for time” molecular sequence pattern matching algorithms are limited to counting occurrences, and the image stereo matching method based on the double DNA sequence alignment algorithm in bioinformatics is limited when the source data are uncertain. The method uses weighted suffix trees as its main data structure, improves matching accuracy on uncertain source data, and overcomes the restriction of map data structures to counting occurrences. Experimental results show that the proposed algorithm improves matching speed and sensitivity to some extent.
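The matching problem stated in the abstract can be illustrated with a small brute-force sketch. This is not the paper's weighted suffix tree algorithm; it is only a naive baseline that makes the problem definition concrete. It assumes, hypothetically, that an uncertain sequence is given as one base-to-probability mapping per position, that positions are independent so an occurrence probability is the product of per-position probabilities, and that starting sites are 0-based. The names `WeightedSequence` and `find_occurrences` are illustrative and do not come from the paper.

```python
from typing import Dict, List, Tuple

# Assumed representation: one dict per position, mapping base -> probability.
WeightedSequence = List[Dict[str, float]]


def find_occurrences(
    seq: WeightedSequence, pattern: str, threshold: float
) -> Tuple[int, List[int]]:
    """Return (count, 0-based start sites) of pattern occurrences whose
    probability is strictly greater than threshold.

    Naive O(n * m) scan; a baseline, not the weighted suffix tree method.
    """
    if not pattern:
        return 0, []

    starts: List[int] = []
    m = len(pattern)
    for i in range(len(seq) - m + 1):
        prob = 1.0
        for j, base in enumerate(pattern):
            # A base absent from the mapping has probability 0 at that position.
            prob *= seq[i + j].get(base, 0.0)
            # The product can only shrink, so stop once it cannot exceed the threshold.
            if prob <= threshold:
                break
        else:
            # No early exit: the full pattern matched with probability > threshold.
            starts.append(i)
    return len(starts), starts


# Toy uncertain sequence: positions 1 and 2 are ambiguous.
example_seq: WeightedSequence = [
    {"A": 1.0},
    {"C": 0.5, "G": 0.5},
    {"A": 0.8, "T": 0.2},
    {"C": 1.0},
    {"A": 1.0},
]

count, sites = find_occurrences(example_seq, "CA", 0.3)
```

With the toy data, the pattern `CA` occurs at site 1 with probability 0.5 × 0.8 and at site 3 with probability 1.0, so both sites pass a threshold of 0.3 while only site 3 passes a threshold of 0.5. An index such as a weighted suffix tree aims to avoid rescanning the sequence for every query, which this baseline does not attempt.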