Impact of Similarity Measures in K-means Clustering Method used in Movie Recommender Systems

Abstract
One of the most challenging tasks today is delivering personalized information to a user or group of users according to their preferences. Recommender systems use a decision-making process as a tool to suggest products and items. The goal of a recommender system is to deliver germane information to users based on their preferences. In this research work, we study and compare the effect of the similarity measures used in k-means clustering in movie recommender systems. Our proposed method uses sampling, PCA and k-means clustering to recommend movies from the MovieLens dataset. Throughout the process, several similarity measures are used in k-means clustering: Euclidean, Minkowski, Mahalanobis, Cosine similarity and Pearson correlation. The aim of our work is to study the effect of these similarity measures in movie recommender systems in terms of standard deviation (SD), mean absolute error (MAE), root mean square error (RMSE), t-value, Dunn Matrix, average similarity and computational time, using the publicly available MovieLens dataset. The experimental results indicate that Cosine similarity is the best technique for movie recommender systems in terms of accuracy, efficiency and processing speed; it also achieves an MAE of 0.65, the best among all the similarity measures compared.
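
To make the measures named in the abstract concrete, the following is a minimal NumPy sketch of the five similarity or distance measures and of the MAE and RMSE error metrics. It is illustrative only and is not the authors' implementation: the function names, the Minkowski order `p=3`, and the toy rating vectors `u` and `v` are assumptions introduced here, and the Mahalanobis function expects the caller to supply an inverse covariance matrix.

```python
import numpy as np

def euclidean(a, b):
    # Straight-line distance between two rating vectors.
    return float(np.linalg.norm(a - b))

def minkowski(a, b, p=3):
    # Generalised distance; p=2 gives Euclidean, p=1 gives Manhattan.
    # The default p=3 is an arbitrary choice for illustration.
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def mahalanobis(a, b, cov_inv):
    # Distance scaled by the inverse covariance matrix of the data.
    d = a - b
    return float(np.sqrt(d @ cov_inv @ d))

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors (1.0 means same direction).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pearson_correlation(a, b):
    # Cosine similarity of the mean-centred vectors.
    a_c, b_c = a - a.mean(), b - b.mean()
    return float(np.dot(a_c, b_c) / (np.linalg.norm(a_c) * np.linalg.norm(b_c)))

def mae(actual, predicted):
    # Mean absolute error between true and predicted ratings.
    return float(np.mean(np.abs(actual - predicted)))

def rmse(actual, predicted):
    # Root mean square error; penalises large errors more than MAE.
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical rating vectors for two users over four movies.
u = np.array([5.0, 3.0, 4.0, 1.0])
v = np.array([4.0, 2.0, 5.0, 1.0])
```

Note that Euclidean, Minkowski and Mahalanobis are distances (smaller means more alike), whereas Cosine similarity and Pearson correlation are similarities (larger means more alike). A k-means variant that uses the latter two would typically convert them to a dissimilarity such as `1 - similarity` before assigning points to clusters.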
