Optimal Value for Number of Clusters in a Dataset for Clustering Algorithm
Open Access
- 30 April 2022
- journal article
- Published by Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP in International Journal of Engineering and Advanced Technology
- Vol. 11 (4), 24-29
- https://doi.org/10.35940/ijeat.d3417.0411422
Abstract
It is essential to know the parameters required to cluster a dataset. One of these parameters is the number of clusters k, and selecting a suitable k value is very important for obtaining efficient clustering results. Few algorithms exist for finding the k value for the k-means algorithm, and they require a maximum value for k or a range of values for k as input. This paper proposes a novel method, the Optimal cluster number estimation algorithm (OCNE), which finds the optimal number of clusters without specifying a maximum or a range of k values and without knee-point detection in a graph. In the experiments, the method is compared with existing methods on different real-world and synthetic datasets and provides good performance.
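For context, the kind of range-based search that the abstract contrasts OCNE with can be sketched as follows. This is not the OCNE algorithm, whose details the abstract does not give. It is a minimal illustration, assuming the silhouette coefficient as the selection criterion, of why existing approaches need a maximum k as input. The function names, the farthest-point initialisation, and the synthetic data are all assumptions made for this example.

```python
import math
import random


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k_by_silhouette(points, k_max):
    """Scan k = 2..k_max and return the k with the highest mean silhouette."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


# Synthetic data (an assumption for illustration): three well-separated 2-D blobs.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in blob_centers for _ in range(20)]

best_k, scores = best_k_by_silhouette(points, k_max=6)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

The `k_max` argument is the input the abstract says OCNE avoids: the scan can only return a value inside the range the caller supplies, so a range that is too narrow silently excludes the true number of clusters.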