Spam Detection Using Dynamic Weighted Voting Based on Clustering
- 1 December 2008
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2008 Second International Symposium on Intelligent Information Technology Application
- Vol. 2, 122-126
- https://doi.org/10.1109/iita.2008.140
Abstract
In the last decade, spam detection has been addressed as a text classification (categorization) problem. In this paper, we propose a new dynamic weighted voting method that combines clustering with weighted voting, and apply it to the task of spam filtering. To classify a new sample, the method first compares it with all cluster centroids to determine its similarity to each cluster; classifiers in the vicinity of the input sample then receive greater weight in the final decision of the ensemble. The evaluation shows that the algorithm outperforms a pure SVM.
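The voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the weighting formula, or how each cluster's classifier is trained. The sketch assumes cosine similarity, weights proportional to similarity, and one pre-trained classifier per cluster that returns +1 (spam) or -1 (ham). All names (`cosine_similarity`, `dynamic_weighted_vote`) are hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
# Assumption: each cluster has one classifier returning +1 (spam) or -1 (ham).
Classifier = Callable[[Vector], int]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dynamic_weighted_vote(
    sample: Vector,
    centroids: Sequence[Vector],
    classifiers: Sequence[Classifier],
) -> int:
    """Combine per-cluster classifiers, weighting each by how similar
    the sample is to that cluster's centroid (assumed weighting scheme)."""
    # Similarity of the sample to every cluster centroid (negatives clipped to 0).
    sims = [max(cosine_similarity(sample, c), 0.0) for c in centroids]
    total = sum(sims)
    if total == 0.0:
        # No cluster is similar: fall back to equal weights.
        weights = [1.0 / len(centroids)] * len(centroids)
    else:
        weights = [s / total for s in sims]
    # Weighted sum of the +1/-1 votes; ties are resolved as spam here.
    score = sum(w * clf(sample) for w, clf in zip(weights, classifiers))
    return 1 if score >= 0 else -1
```

With two toy clusters whose classifiers always vote spam and ham respectively, a sample close to the first centroid is labelled by the first classifier, because that classifier receives most of the weight.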
This publication has 8 references indexed in Scilit:
- A dynamic overproduce-and-choose strategy for the selection of classifier ensembles. Pattern Recognition, 2008
- A local boosting algorithm for solving classification problems. Computational Statistics & Data Analysis, 2008
- An Ensemble-Based Incremental Learning Approach to Data Fusion. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2007
- Classifier selection for majority voting. Information Fusion, 2005
- Switching between selection and fusion in combining classifiers: an experiment. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2002
- Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recognition, 2001
- Support vector machines for spam categorization. IEEE Transactions on Neural Networks, 1999
- An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants. Machine Learning, 1999