Improving recall of k-nearest neighbor algorithm for classes of uneven size

1 November 2013

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Abstract

The k-nearest neighbor algorithm is one of the most suitable methods of classification owing to its simplicity, adaptability and performance. Problems arise when classes overlap and when the sample size is unevenly distributed between categories. Many studies present optimization techniques based on discriminant metrics, feature weighting, probabilistic measures or adjustment of the prototype positions. Classes that are represented by a small sample size are overwhelmed by the large number of prototypes of the dominant groups. In this paper we describe a method of weighting the prototypes of each class among the k nearest neighbors to cope with the uneven distribution of data. The proposed method increases the classification rate in terms of the recall measure.
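
The idea of weighting the votes of the k nearest neighbors by class can be illustrated with a minimal sketch. The abstract does not give the paper's weighting formula, so the inverse class frequency weights used here are an assumption made purely for illustration, as are the function name `class_weighted_knn` and the toy data.

```python
import math
from collections import Counter, defaultdict


def class_weighted_knn(train, query, k=3):
    """Classify `query` from `train`, a list of (point, label) pairs.

    Assumption: each class is weighted by 1 / (number of its prototypes).
    This is a common heuristic, not the weighting scheme of the paper.
    """
    # Count prototypes per class and derive the per-class weights.
    counts = Counter(label for _, label in train)
    weights = {label: 1.0 / n for label, n in counts.items()}

    # Select the k nearest prototypes by Euclidean distance.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]

    # Each neighbor votes with its class weight instead of a unit vote.
    votes = defaultdict(float)
    for _, label in neighbors:
        votes[label] += weights[label]
    return max(votes, key=votes.get)


# Toy data (hypothetical): six prototypes of class "A", two of class "B".
toy_train = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
    ((1.0, 1.0), "A"), ((2.0, 0.0), "A"), ((2.0, 1.0), "A"),
    ((3.0, 0.5), "B"), ((5.0, 5.0), "B"),
]

# The 3 nearest neighbors of the query are one "B" and two "A" prototypes.
print(class_weighted_knn(toy_train, (2.6, 0.5), k=3))
```

In this toy case an unweighted majority vote would pick "A" (two votes against one), whereas the weighted vote favors the minority class "B" (0.5 against 2/6), which is the kind of recall improvement for small classes that the abstract describes.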

This publication has 8 references indexed in Scilit:

A probabilistic approach for semi-supervised nearest neighbor classification
Pattern Recognition Letters, 2012
Dimensionality reduction by minimizing nearest-neighbor classification error
Pattern Recognition Letters, 2011
Improving nearest neighbor rule with a simple adaptive distance measure
Pattern Recognition Letters, 2007
Learning weighted metrics to minimize nearest-neighbor classification error
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006
Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence
Pattern Recognition, 2005
Neighbor-weighted K-nearest neighbor for unbalanced text corpus
Expert Systems with Applications, 2005
k-Nearest Neighbor Algorithm
Wiley, 2004
Learning prototypes and distances (LPD). A prototype reduction technique based on nearest neighbor error minimization
Institute of Electrical and Electronics Engineers (IEEE), 2004
