MSVM-kNN: Combining SVM and k-NN for Multi-class Text Classification
- 1 July 2008
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 133-140
- https://doi.org/10.1109/wscs.2008.36
Abstract
Text categorization is the process of assigning documents to a set of predefined categories. It is widely used in data-oriented management applications. Many popular algorithms for text categorization have been proposed, such as Naive Bayes, k-Nearest Neighbor (k-NN), and Support Vector Machine (SVM). However, these approaches do not perform well in every case. For example, SVM cannot correctly identify the categories of documents whose texts lie in the cross zones between multiple categories, and k-NN cannot effectively handle overlapping category borders. In this paper, we propose an approach named Multi-class SVM-kNN (MSVM-kNN), which combines SVM and k-NN. In this approach, SVM is first used to identify category borders, and k-NN then classifies the documents that lie among the borders. MSVM-kNN can overcome the shortcomings of SVM and k-NN and improve the performance of multi-class text classification. The experimental results show that MSVM-kNN performs better than either SVM or k-NN.
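The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how border documents are detected, so the sketch assumes a document is "among the borders" when the gap between its two highest one-vs-rest SVM scores falls below a hypothetical threshold `delta`. The linear SVM trained by subgradient descent, the toy term-count vectors, the Euclidean k-NN, and all function names are likewise assumptions made for illustration.

```python
from collections import Counter

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_binary_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.
    Labels in y must be +1 or -1. (Stand-in trainer, assumed for this sketch.)"""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * (dot(w, xi) + b) < 1    # inside the margin or misclassified
            w = [wj * (1 - lr * lam) for wj in w]   # L2 shrinkage step
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def train_one_vs_rest(X, labels):
    """One binary SVM per category (one-vs-rest multi-class scheme)."""
    models = {}
    for c in sorted(set(labels)):
        y = [1 if label == c else -1 for label in labels]
        models[c] = train_binary_svm(X, y)
    return models

def knn_predict(X, labels, x, k=3):
    """Majority vote among the k training documents closest to x
    (squared Euclidean distance)."""
    def sq_dist(u):
        return sum((a - b) ** 2 for a, b in zip(u, x))
    nearest = sorted(zip(X, labels), key=lambda pair: sq_dist(pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def msvm_knn_predict(models, X, labels, x, delta=1.0, k=3):
    """Use the SVM decision when it is clear; otherwise fall back to k-NN.
    `delta` is a hypothetical border-width parameter, not taken from the paper."""
    scores = sorted(((dot(w, x) + b, c) for c, (w, b) in models.items()),
                    reverse=True)
    (best, label), (second, _) = scores[0], scores[1]
    if best - second >= delta:
        return label, "svm"
    return knn_predict(X, labels, x, k), "knn"

# Toy term-count vectors over a made-up vocabulary: [goal, vote, stock]
X_train = [[3, 0, 0], [2, 0, 1], [3, 1, 0],
           [0, 3, 0], [1, 2, 0], [0, 3, 1],
           [0, 0, 3], [0, 1, 2], [1, 0, 3]]
y_train = ["sport"] * 3 + ["politics"] * 3 + ["finance"] * 3

models = train_one_vs_rest(X_train, y_train)
print(msvm_knn_predict(models, X_train, y_train, [4, 0, 0]))  # clear-cut document
print(msvm_knn_predict(models, X_train, y_train, [2, 2, 1]))  # mixed-topic document
```

Each call returns the predicted category together with the stage that produced it, which makes it easy to see how often the k-NN fallback is triggered for a given `delta`. A larger `delta` sends more documents to k-NN; `delta=0` reduces the sketch to plain one-vs-rest SVM.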