An improved adaboost algorithm for imbalanced data based on weighted KNN
- 1 March 2017
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Imbalanced data have become an obstacle in data mining. The minority class is sometimes more important than the majority class, as in medical diagnosis and credit card fraud detection. This paper focuses on the problem that the AdaBoost algorithm cannot achieve a proper accuracy rate for the minority class on imbalanced data, and proposes an improved AdaBoost algorithm for imbalanced data based on weighted KNN (K-Adaboost). K-Adaboost uses the KNN algorithm to reduce the weights of majority-class samples that lie near the minority class, so that the classifier pays more attention to the minority class. In addition, the paper uses a new error function and sets a threshold during the classification process to avoid weight distortion.
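The abstract's central idea, reducing the AdaBoost weights of majority-class samples whose nearest neighbours include minority samples, can be illustrated with a small sketch. This is not the paper's implementation: the function name `knn_weight_adjust`, the shrink factor `alpha`, and the linear shrink rule are all illustrative assumptions; the paper's exact weight-reduction formula, error function, and threshold are not given in the abstract.

```python
import numpy as np

def knn_weight_adjust(X, y, weights, k=3, minority=1, alpha=0.5):
    """Hypothetical sketch of K-Adaboost's weight-reduction step:
    shrink the weight of each majority-class sample in proportion to
    the fraction of minority samples among its k nearest neighbours.
    (`alpha` and the linear shrink rule are assumptions, not the
    paper's formula.)"""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.asarray(weights, dtype=float).copy()
    for i in np.where(y != minority)[0]:
        # Euclidean distances from sample i to every other sample
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the sample itself
        nn = np.argsort(d)[:k]             # indices of k nearest neighbours
        frac = np.mean(y[nn] == minority)  # minority fraction among them
        w[i] *= 1.0 - alpha * frac         # down-weight near-minority majority points
    return w / w.sum()                     # renormalize, as AdaBoost requires

# Toy example: one minority point at the origin, one majority point
# right next to it, three majority points far away.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [5.2, 5.0]])
y = np.array([1, 0, 0, 0, 0])
w = knn_weight_adjust(X, y, np.ones(5) / 5, k=2)
```

After the adjustment, the majority point at `(0.1, 0.0)` carries less relative weight than the distant majority points, so a subsequent weak learner is nudged toward classifying the boundary region in the minority class's favour.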
This publication has 8 references indexed in Scilit:
- Measuring classifier performance: a coherent alternative to the area under the ROC curve. Machine Learning, 2009
- A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explorations Newsletter, 2004
- Mining with rarity. ACM SIGKDD Explorations Newsletter, 2004
- Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction. Journal of Artificial Intelligence Research, 2003
- Improved Rooftop Detection in Aerial Images with Machine Learning. Machine Learning, 2003
- Evaluating boosting algorithms to classify rare classes: comparison and improvements. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Learning and making decisions when costs and probabilities are both unknown. Published by Association for Computing Machinery (ACM), 2001
- A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences, 1997