On the Enhancement of Classification Algorithms Using Biased Samples
Open Access
- 24 October 2019
- journal article
- research article
- Published by IBERAMIA: Sociedad Iberoamericana de Inteligencia Artificial in INTELIGENCIA ARTIFICIAL
- Vol. 22 (64), 36-46
- https://doi.org/10.4114/intartif.vol22iss64pp36-46
Abstract
The performance of classification algorithms can be enhanced by including many representative points in the training sample. In this paper, a new border and rare biased sampling (BRBS) scheme is proposed that assigns each point in the dataset an importance factor. Border points and rare points (i.e. points belonging to rare classes) receive a higher importance factor than other points. Points are then selected for the training sample according to these factors. Including these points in the training sample enriches the classifier's experience. Experiments on 10 datasets from the UCI machine learning repository show that the BRBS algorithm outperforms many sampling algorithms and enhances the performance of several classification algorithms by about 8%. BRBS is designed to be easy to configure, to cover the whole point space, and to generate a unique sample every time it is executed.
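The abstract does not say how the importance factor is computed, so the following is only a minimal sketch of the general idea and not the authors' BRBS algorithm. It assumes a border score equal to the fraction of a point's k nearest neighbours that carry a different label, and a rarity score equal to one minus the relative frequency of the point's class. The function names `importance_factors` and `biased_sample`, and the parameters `k` and `eps`, are hypothetical.

```python
import numpy as np


def importance_factors(X, y, k=5):
    """Hypothetical importance factor: border score plus rarity score.

    Assumption: this formula is illustrative only; the abstract does not
    define the factor. Requires k < number of points.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()
    n = len(X)
    # Pairwise squared Euclidean distances (fine for small datasets).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nn = np.argsort(d2, axis=1)[:, :k]
    # Border score: fraction of the k nearest neighbours with another label.
    border = (y[nn] != y[:, None]).mean(axis=1)
    # Rarity score: one minus the relative frequency of the point's class.
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    rarity = 1.0 - counts[inverse] / n
    return border + rarity


def biased_sample(X, y, size, k=5, eps=1e-3, seed=None):
    """Draw `size` distinct indices with probability proportional to importance."""
    rng = np.random.default_rng(seed)
    w = importance_factors(X, y, k) + eps  # eps keeps every point selectable
    p = w / w.sum()
    return rng.choice(len(p), size=size, replace=False, p=p)


if __name__ == "__main__":
    # Toy data: 20 majority points on a line, 2 rare points far away.
    X = np.array([[i, 0.0] for i in range(20)] + [[100.0, 0.0], [101.0, 0.0]])
    y = np.array([0] * 20 + [1] * 2)
    print(biased_sample(X, y, size=10, k=3))
```

Because every point keeps a non-zero selection probability, the whole point space remains reachable, and leaving `seed=None` yields a different sample on each run, which mirrors two of the properties the abstract claims for BRBS. The brute-force distance matrix is chosen for brevity; a spatial index would be more appropriate for large datasets.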