A novel hierarchical clustering algorithm for gene sequences
Open Access
- 23 July 2012
- journal article
- research article
- Published by Springer Science and Business Media LLC in BMC Bioinformatics
- Vol. 13 (1), 174
- https://doi.org/10.1186/1471-2105-13-174
Abstract
Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. The method transforms each DNA sequence into a feature vector that captures the occurrence, location and order relation of the k-tuples in the sequence. A hierarchical procedure is then applied to cluster the DNA sequences based on these feature vectors.
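The two-stage idea in the abstract (k-tuple feature vectors, then hierarchical grouping) can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not define DMk or mBKM, so the sketch assumes a simplified feature vector (the frequency and the mean normalized position of each k-tuple, with no order-relation term), plain Euclidean distance in place of DMk, and average-linkage agglomerative merging as a stand-in hierarchical procedure. The names `ktuple_features`, `euclidean` and `agglomerative` are hypothetical.

```python
from itertools import product


def ktuple_features(seq, k=2):
    """Map a DNA sequence to [frequency, mean normalized position] per k-tuple.

    Assumption: this simplified vector stands in for the paper's features;
    it covers occurrence and location only, not the order relation.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    pos_sums = dict.fromkeys(words, 0)
    n = len(seq) - k + 1  # number of k-tuple windows in the sequence
    for i in range(n):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
            pos_sums[word] += i + 1  # 1-based start position
    features = []
    for w in words:
        c = counts[w]
        freq = c / n if n > 0 else 0.0
        mean_pos = pos_sums[w] / (c * n) if c else 0.0
        features.extend([freq, mean_pos])
    return features


def euclidean(u, v):
    """Euclidean distance, used here in place of the paper's DMk measure."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(
                    euclidean(vectors[i], vectors[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters


# Toy input: two A-rich and two CG-repeat sequences (illustrative data only).
toy_sequences = ["AAAAAAAA", "AAAAAAAT", "CGCGCGCG", "CGCGCGCA"]
toy_vectors = [ktuple_features(s, k=2) for s in toy_sequences]
toy_clusters = agglomerative(toy_vectors, n_clusters=2)
```

On the toy input, the A-rich sequences and the CG-repeat sequences end up in separate groups. The quadratic pair search is acceptable only for small inputs; it is kept for readability rather than speed.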