Accelerated PSO Swarm Search Feature Selection for Data Stream Mining Big Data
- 1 June 2015
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Services Computing
- Vol. 9 (1), 33-45
- https://doi.org/10.1109/tsc.2015.2439695
Abstract
Although Big Data is surrounded by hype, it raises many technical challenges that confront both academic research communities and commercial IT deployment. Its root sources are data streams and the curse of dimensionality. Data sourced from streams are generally known to accumulate continuously, which makes traditional batch-based model induction algorithms infeasible for real-time data mining. Feature selection has been widely used to lighten the processing load of inducing a data mining model. However, for high-dimensional data, the search space from which an optimal feature subset is derived grows exponentially, leading to an intractable computational demand. To tackle this problem, which stems mainly from the high dimensionality and streaming format of Big Data feeds, a novel lightweight feature selection method is proposed. It is designed specifically for mining streaming data on the fly, using an accelerated particle swarm optimization (APSO) type of swarm search that achieves enhanced analytical accuracy within reasonable processing time. In this paper, a collection of Big Data sets with exceptionally high dimensionality is put to the test with the new feature selection algorithm for performance evaluation.
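To make the idea of an APSO-style swarm search over feature subsets concrete, the following is a minimal sketch and not the authors' implementation. It assumes the commonly cited simplified APSO update, x <- (1 - beta) * x + beta * g_best + alpha * noise, with a decaying alpha, and it assumes a feature counts as selected when its coordinate exceeds 0.5. The names `apso_feature_select`, `toy_fitness`, and `INFORMATIVE` are hypothetical. The toy fitness stands in for whatever classifier accuracy a real system would measure on the stream.

```python
import random

# Hypothetical toy problem: features 0, 3 and 7 are "informative".
INFORMATIVE = (0, 3, 7)

def toy_fitness(mask):
    """Stand-in for classifier accuracy: reward informative features,
    penalize subset size. A real system would evaluate a model here."""
    hits = sum(1 for j in INFORMATIVE if mask[j])
    return hits - 0.1 * sum(mask)

def apso_feature_select(fitness, n_features, n_particles=15, n_iters=40,
                        alpha0=0.5, beta=0.5, gamma=0.95, seed=0):
    """Sketch of an APSO-style search; all parameter values are assumptions."""
    rng = random.Random(seed)

    def to_mask(pos):
        # Assumed encoding: coordinate > 0.5 means the feature is selected.
        return tuple(p > 0.5 for p in pos)

    # Each particle is a point in [0, 1]^n_features.
    swarm = [[rng.random() for _ in range(n_features)]
             for _ in range(n_particles)]
    best_pos = max(swarm, key=lambda p: fitness(to_mask(p)))[:]
    best_fit = fitness(to_mask(best_pos))

    for t in range(n_iters):
        alpha = alpha0 * gamma ** t  # randomness shrinks over time
        for pos in swarm:
            for j in range(n_features):
                # APSO uses only the global best; no velocity or personal best.
                x = ((1 - beta) * pos[j] + beta * best_pos[j]
                     + alpha * rng.gauss(0, 1))
                pos[j] = min(1.0, max(0.0, x))  # keep within [0, 1]
            f = fitness(to_mask(pos))
            if f > best_fit:
                best_fit, best_pos = f, pos[:]
    return to_mask(best_pos), best_fit

mask, score = apso_feature_select(toy_fitness, n_features=10)
selected = [j for j, keep in enumerate(mask) if keep]
print("selected features:", selected, "fitness:", round(score, 2))
```

In a streaming setting, one plausible design would be to rerun a short search like this on each incoming window of data and evaluate the fitness with an incremental learner. That pairing is an assumption here, not something the abstract specifies.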
Funding Information
- Adaptive OVFDT with Incremental Pruning
- ROC Corrective Learning for Data Stream Mining (MYRG073(Y2-L2)-FST12-FCC)
- University of Macau
- FST
- RDAO