K-means clustering based on Gower similarity coefficient: A comparative study
- 1 April 2013
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2013 5th International Conference on Modeling, Simulation and Applied Optimization (ICMSAO)
Abstract
Clustering is one of the most important data mining tasks, used in knowledge extraction to partition data sets into groups of similar objects. In this paper we present the k-means clustering algorithm with different metrics and similarity measures, in particular the Gower similarity coefficient. We use external validity measures to compare the results of k-means using Weka. The experiments are carried out on various data sets from the UCI Machine Learning Repository. Experimental results show that, for the data sets used, the accuracy of the k-means algorithm using the Gower similarity coefficient is better than with the other tested metrics.
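As an illustration of the similarity measure named in the abstract, the following is a minimal sketch of a Gower-style similarity between two records with mixed numeric and categorical features. It is not the paper's Weka implementation: the function name `gower_similarity` and the `ranges` argument are hypothetical, and the sketch assumes equal feature weights and no missing values, which simplifies the general coefficient.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def gower_similarity(
    x: Sequence[Value],
    y: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> float:
    """Average per-feature similarity between records x and y.

    ranges[k] is the observed range (max - min) of numeric feature k,
    or None when feature k is categorical. (Hypothetical convention
    chosen for this sketch.)
    """
    if not (len(x) == len(y) == len(ranges)) or len(ranges) == 0:
        raise ValueError("x, y and ranges must be non-empty and equal length")

    total = 0.0
    for a, b, r in zip(x, y, ranges):
        if r is None:
            # Categorical feature: 1 on exact match, 0 otherwise.
            total += 1.0 if a == b else 0.0
        elif r == 0:
            # Constant numeric feature: all records agree on it.
            total += 1.0
        else:
            # Numeric feature: 1 minus the range-normalised absolute difference.
            total += 1.0 - abs(a - b) / r
    return total / len(ranges)


# One numeric feature with range 4.0 and one categorical feature.
print(gower_similarity([1.0, "red"], [3.0, "red"], [4.0, None]))
```

A k-means variant built on this measure would typically use `1 - gower_similarity(...)` as the distance when assigning records to clusters; how centroids are updated for categorical features is a separate design choice that this sketch does not cover.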
This publication has 10 references indexed in Scilit:
- A binarization strategy for modelling mixed data in multigroup classification. Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
- K-means clustering using Max-min distance measure. Published by Institute of Electrical and Electronics Engineers (IEEE), 2009
- Combined use of association rules mining and clustering methods to find relevant links between binary rare attributes in a large data set. Computational Statistics & Data Analysis, 2007
- K-means clustering versus validation measures. Published by Association for Computing Machinery (ACM), 2006
- Elements of Information Theory. Published by Wiley, 2001
- On Clustering Validation Techniques. Journal of Intelligent Information Systems, 2001
- Data mining and KDD: Promise and challenges. Future Generation Computer Systems, 1997
- Data mining: an overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering, 1996
- Pictures of relevance: A geometric analysis of similarity measures. Journal of the American Society for Information Science, 1987
- A General Coefficient of Similarity and Some of Its Properties. Biometrics, 1971