Classification in the Presence of Label Noise: A Survey

17 December 2013

journal article
research article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks and Learning Systems

Vol. 25 (5), 845-869
https://doi.org/10.1109/tnnls.2013.2292894

Abstract

Label noise is an important issue in classification, with many potential negative consequences. For example, the accuracy of predictions may decrease, whereas the complexity of inferred models and the number of necessary training samples may increase. Many works in the literature have been devoted to the study of label noise and the development of techniques to deal with it. However, the field lacks a comprehensive survey of the different types of label noise, their consequences, and the algorithms that take label noise into account. This paper aims to fill this gap. First, the definitions and sources of label noise are considered and a taxonomy of the types of label noise is proposed. Second, the potential consequences of label noise are discussed. Third, label noise-robust, label noise cleansing, and label noise-tolerant algorithms are reviewed. For each category of approaches, a short discussion is provided to help practitioners choose the most suitable technique for their own field of application. Finally, the design of experiments is also discussed, which may interest researchers who would like to test their own algorithms. In this paper, label noise consists of mislabeled instances: no additional information, such as confidences on labels, is assumed to be available.

Cited by 1007 articles