FUSION

6 November 2017

conference paper
Published by Association for Computing Machinery (ACM)

https://doi.org/10.1145/3132847.3132886

Abstract

Traditional data stream classification assumes that data is generated from a single non-stationary process. In contrast, the multistream classification problem involves two independent non-stationary data generating processes. One of them is the source stream, which continuously generates labeled data. The other is the target stream, which generates unlabeled test data from the same domain. The distribution represented by the source stream data is biased compared to that of the target stream. Moreover, concept drifts may occur asynchronously in the two streams. The multistream classification problem is to predict the class labels of target stream instances by utilizing labeled data from the source stream. This kind of scenario is often observed in real-world applications because labeled data is scarce. The only existing approach for multistream classification uses separate drift detection on each stream to address the asynchronous concept drift problem. If a concept drift is detected in either stream, it uses an expensive batch technique for data shift adaptation. Both steps add significant execution overhead and limit its usability. In this paper, we propose an efficient solution for multistream classification by fusing drift detection into online data shift adaptation. We study the theoretical convergence rate and computational complexity of the proposed approach. Moreover, empirical results on benchmark data sets indicate significantly improved performance over the baseline methods.
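
The abstract does not spell out the adaptation procedure, so the following is only an illustrative sketch of the data shift idea it describes, not the paper's algorithm. It assumes one fixed window from each stream, a one-dimensional feature, and a swapped-in technique: a logistic-regression domain classifier whose odds estimate the density ratio p_target(x) / p_source(x). There is no drift detection and no online update here. All names (fit_domain_classifier, importance_weights, the Gaussian windows) are hypothetical.

```python
import numpy as np

def fit_domain_classifier(source_x, target_x, steps=2000, lr=0.1):
    """Fit logistic regression that separates target (1) from source (0)."""
    x = np.concatenate([source_x, target_x])
    y = np.concatenate([np.zeros(len(source_x)), np.ones(len(target_x))])
    features = np.column_stack([np.ones_like(x), x])  # bias term + 1-D feature
    theta = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(features @ theta)))
        theta -= lr * features.T @ (p - y) / len(y)   # gradient of mean log-loss
    return theta

def importance_weights(theta, x, n_source, n_target):
    """Estimate p_target(x) / p_source(x) from the classifier odds."""
    logits = theta[0] + theta[1] * x
    return (n_source / n_target) * np.exp(logits)

# Assumed toy data: the source window is biased relative to the target window.
rng = np.random.default_rng(0)
source_x = rng.normal(0.0, 1.0, 2000)  # labeled source stream window
target_x = rng.normal(1.0, 1.0, 2000)  # unlabeled target stream window

theta = fit_domain_classifier(source_x, target_x)
w = importance_weights(theta, source_x, len(source_x), len(target_x))

# Reweighted source statistics should move toward the target distribution.
weighted_mean = np.sum(w * source_x) / np.sum(w)
print("source mean:", source_x.mean())
print("weighted source mean:", weighted_mean)
print("target mean:", target_x.mean())
```

In a classifier trained on the source window, the same weights could scale each labeled instance's loss so that the model is fitted toward the target distribution. Handling drift would require re-estimating the weights as either stream changes, which this batch sketch does not attempt.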

Funding Information

International Business Machines Corporation (Faculty Award (Research))
Air Force Office of Scientific Research (FA9550-14-1-0173)
National Science Foundation (1737978)

Cited by 114 articles