K-nearest neighbor performance for Nusantara scripts image transliteration

Abstract
Classification with the k-nearest neighbor (KNN) method is simple, easy to understand, and easy to implement. The main challenges in KNN classification are choosing the proximity measure between objects and building a compact set of reference classes. This paper studies the use of KNN for the automatic transliteration of Javanese, Sundanese, and Bataknese script images into Roman script. The study used the KNN algorithm with k set to 1, 3, 5, 7, and 9. Tests used a dataset of 2520 images. Under 3-fold and 10-fold cross-validation, the results showed that accuracy varied with the area of the extracted image, the number of neighbors used in classification, and the amount of training data.
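The evaluation protocol described above (KNN with k = 1, 3, 5, 7, 9 under 3-fold and 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the proximity measure or the image features, so Euclidean distance and synthetic two-dimensional feature vectors are assumptions here, and the names `knn_predict`, `cross_validate`, and `make_toy_data` are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Return the majority label among the k training points nearest to query.

    Assumption: Euclidean distance is the proximity measure.
    """
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, the label met first in distance order is returned.
    return votes.most_common(1)[0][0]


def cross_validate(xs, ys, k, folds, seed=0):
    """Estimate KNN accuracy with a shuffled, non-stratified k-fold split."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test_idx = idx[f::folds]          # every folds-th shuffled index
        held_out = set(test_idx)
        train_x = [xs[i] for i in idx if i not in held_out]
        train_y = [ys[i] for i in idx if i not in held_out]
        for i in test_idx:
            if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
                correct += 1
    return correct / len(xs)


def make_toy_data(per_class=30, seed=42):
    """Synthetic stand-in for image features: three Gaussian clusters.

    Assumption: real inputs would be feature vectors extracted from
    script images; the labels below are placeholders only.
    """
    rng = random.Random(seed)
    centers = {"javanese": (0.0, 0.0),
               "sundanese": (5.0, 5.0),
               "bataknese": (10.0, 0.0)}
    xs, ys = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(per_class):
            xs.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
            ys.append(label)
    return xs, ys


xs, ys = make_toy_data()
for folds in (3, 10):
    for k in (1, 3, 5, 7, 9):
        acc = cross_validate(xs, ys, k, folds)
        print(f"{folds}-fold, k={k}: accuracy={acc:.3f}")
```

With more folds, each training split holds a larger share of the data (9/10 rather than 2/3), which is one way the amount of training data can influence the measured accuracy.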
Funding Information
  • Universitas Sanata Dharma (042/LPPM USD/V/2019)
