Deep learning model for metagenome fragment classification using spaced k-mers feature extraction
Open Access
- 25 May 2020
- journal article
- Published by Institute of Research and Community Services Diponegoro University (LPPM UNDIP) in Jurnal Teknologi dan Sistem Komputer
- Vol. 8 (3), 234-238
- https://doi.org/10.14710/jtsiskom.2020.13407
Abstract
An open challenge in bioinformatics is the analysis of sequenced metagenomes from various environments. Several studies have demonstrated bacterial classification at the genus level using k-mers for feature extraction, where higher values of k give better accuracy but are costly in computational resources and time. The spaced k-mers method was used to extract sequence features with the pattern 111 1111 10001, where 1 denotes a position that must match and 0 denotes a position that may or may not match. Deep learning currently provides the best solutions to many problems in image recognition, speech recognition, and natural language processing. In this research, two deep learning architectures, a Deep Neural Network (DNN) and a Convolutional Neural Network (CNN), were trained for the taxonomic classification of metagenome data, with the spaced k-mers method used for feature extraction. The results showed that the DNN classifier reached 90.89% accuracy and the CNN classifier reached 88.89% accuracy at the genus level.
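As an illustration of the spaced k-mers idea described in the abstract, the following is a minimal Python sketch, not the authors' implementation. The function names `spaced_kmers` and `feature_vector` are hypothetical, as are the choices to ignore spaces in the pattern, to skip windows containing non-ACGT characters, and to normalise counts to relative frequencies; the abstract does not specify any of these details.

```python
from collections import Counter
from itertools import product


def spaced_kmers(seq, mask):
    """Slide `mask` over `seq`, keeping only the bases under '1' positions.

    Assumption: spaces in the mask are visual separators and are ignored.
    """
    mask = mask.replace(" ", "")
    keep = [i for i, m in enumerate(mask) if m == "1"]
    span = len(mask)
    kmers = []
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        kmer = "".join(window[i] for i in keep)
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if set(kmer) <= set("ACGT"):
            kmers.append(kmer)
    return kmers


def feature_vector(seq, mask):
    """Relative frequency of every possible spaced k-mer, in lexicographic order."""
    weight = mask.replace(" ", "").count("1")  # number of '1' positions
    counts = Counter(spaced_kmers(seq, mask))
    total = sum(counts.values()) or 1  # avoid division by zero on short reads
    return [counts["".join(p)] / total for p in product("ACGT", repeat=weight)]


# Small illustrative mask (hypothetical, chosen so the output is easy to check).
print(spaced_kmers("ACGTAC", "1101"))
```

With the small mask `1101`, each window of four bases contributes its first, second, and fourth base, so the vector has 4^3 = 64 entries. If the abstract's pattern is read with spaces removed, it has nine `1` positions and the vector would have 4^9 = 262,144 entries, which shows how the number of matched positions, rather than the window length, determines the feature dimension.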
Funding Information
- Institut Pertanian Bogor