Statistically Robust Evaluation of Stream-Based Recommender Systems
- 17 December 2019
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Knowledge and Data Engineering
- Vol. 33 (7), 2971-2982
- https://doi.org/10.1109/tkde.2019.2960216
Abstract
Online incremental models for recommendation are now pervasive in both industry and academia. However, there is not yet a standard evaluation methodology for the algorithms that maintain such models. Moreover, the online evaluation methodologies available in the literature generally fall short on the statistical validation of results, since this validation is not trivially applicable to stream-based algorithms. We propose a k-fold validation framework for the pairwise comparison of recommendation algorithms that learn from user feedback streams, using prequential evaluation. Our proposal enables continuous statistical testing on adaptive-size sliding windows over the outcome of the prequential process, allowing practitioners and researchers to make decisions in real time based on solid statistical evidence. We present a set of experiments to gain insights into the sensitivity and robustness of two statistical tests, McNemar's and the Wilcoxon signed-rank test, in a streaming data environment. Our results show that, besides allowing a real-time, fine-grained online assessment, the online versions of the statistical tests are at least as robust as the batch versions, and definitely more robust than a simple prequential single-fold approach.
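To make the idea of continuous testing over a prequential process concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fixed-size sliding window (the article describes adaptive-size windows), a single fold (the k-fold scheme is omitted), McNemar's statistic without continuity correction, and a hypothetical model interface with `recommend(user, n)` and `update(user, item)` methods. The names `SlidingMcNemar`, `prequential_compare`, `Popularity`, and `Fixed` are illustrative only.

```python
from collections import Counter, deque

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_1DF_05 = 3.841


class SlidingMcNemar:
    """McNemar's test over the most recent `window` paired outcomes.

    Assumption: fixed-size window, no continuity correction.
    """

    def __init__(self, window=1000):
        self.pairs = deque(maxlen=window)  # old pairs fall out automatically

    def update(self, hit_a, hit_b):
        self.pairs.append((bool(hit_a), bool(hit_b)))

    def statistic(self):
        # Only discordant pairs (one model hits, the other misses) count.
        n10 = sum(1 for a, b in self.pairs if a and not b)
        n01 = sum(1 for a, b in self.pairs if b and not a)
        if n10 + n01 == 0:
            return 0.0
        return (n10 - n01) ** 2 / (n10 + n01)

    def significant(self):
        return self.statistic() > CHI2_CRIT_1DF_05


def prequential_compare(stream, model_a, model_b, n=10, window=1000):
    """Test-then-learn loop; returns the McNemar statistic after each event."""
    test = SlidingMcNemar(window)
    stats = []
    for user, item in stream:
        # Test first: both models recommend before seeing the event.
        hit_a = item in model_a.recommend(user, n)
        hit_b = item in model_b.recommend(user, n)
        test.update(hit_a, hit_b)
        stats.append(test.statistic())
        # Then learn: both models are updated with the observed event.
        model_a.update(user, item)
        model_b.update(user, item)
    return stats


class Popularity:
    """Toy incremental model: recommends the most frequent items so far."""

    def __init__(self):
        self.counts = Counter()

    def recommend(self, user, n):
        return [item for item, _ in self.counts.most_common(n)]

    def update(self, user, item):
        self.counts[item] += 1


class Fixed:
    """Toy baseline: always recommends the same list and never learns."""

    def __init__(self, items):
        self.items = list(items)

    def recommend(self, user, n):
        return self.items[:n]

    def update(self, user, item):
        pass


if __name__ == "__main__":
    stream = [("u", "x")] * 30
    stats = prequential_compare(stream, Popularity(), Fixed(["y"]), n=1)
    print(stats[-1], stats[-1] > CHI2_CRIT_1DF_05)
```

Because the statistic is recomputed after every event, a practitioner can read off at any point in the stream whether the difference between the two models is currently significant within the window. A production version would maintain the discordant counts incrementally rather than rescanning the window.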
Funding Information
- Fundação para a Ciência e a Tecnologia (UID/EEA/50014/2019)