Partially Observable Reinforcement Learning for Dialog-based Interactive Recommendation

Open Access

13 September 2021

conference paper
Published by Association for Computing Machinery (ACM)

https://doi.org/10.1145/3460231.3474256

Abstract

In a dialog-based interactive recommendation task, users express natural-language feedback while interacting with the recommender system. However, this feedback, which takes the form of natural-language critiques of the recommendation at each iteration, gives the recommender system only a partial portrayal of the users’ preferences. Such partial observations make it challenging to track the users’ preferences correctly over time, which can result in poor recommendation performance and less effective satisfaction of the users’ information needs when the number of iterations is limited. Reinforcement learning, in the form of a partially observable Markov decision process (POMDP), can simulate the interactions between a partially observable environment (i.e., a user) and an agent (i.e., a recommender system). To alleviate this partial observation issue, we propose a novel dialog-based recommendation model, the Estimator-Generator-Evaluator (EGE) model, with Q-learning for POMDP, to effectively incorporate the users’ preferences over time. Specifically, we leverage an Estimator to track and estimate the users’ preferences, a Generator to match the estimated preferences against the candidate items and rank the next recommendations, and an Evaluator to judge the quality of the estimated preferences given the users’ historical feedback. Following previous work, we train our EGE model using a user simulator, which is itself trained to describe, in natural language, the differences between the target users’ preferences and the recommended items. Extensive experiments on two recommendation datasets of fashion product images (dresses and shoes) demonstrate that our proposed EGE model yields significant improvements over existing state-of-the-art baseline models.
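
The interaction loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural or training details, so every function name, the vector representation of items and critiques, and the hand-written update rule below are assumptions made for illustration. In particular, nothing here is learned, whereas the paper trains its model with Q-learning, and the "critique" is a plain difference vector standing in for natural-language feedback.

```python
# Toy Estimator-Generator-Evaluator loop (illustrative assumptions only).
# Items, preferences and critiques are plain lists of floats.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def estimate(belief, critique, step=0.5):
    """Estimator (hypothetical rule): shift the preference estimate along the critique."""
    return [b + step * c for b, c in zip(belief, critique)]

def generate(belief, items, k=1):
    """Generator: rank candidate items by dot-product match with the estimate."""
    ranked = sorted(range(len(items)), key=lambda i: dot(belief, items[i]), reverse=True)
    return ranked[:k]

def evaluate(belief, history):
    """Evaluator (hypothetical score): mean agreement between the estimate and past critiques."""
    if not history:
        return 0.0
    return sum(dot(belief, c) for c in history) / len(history)

def simulate_critique(target, item):
    """Toy user simulator: the gap between the target and the recommended item."""
    return [t - x for t, x in zip(target, item)]

def run_session(items, target, turns=5):
    belief = [0.0] * len(target)   # the agent never sees `target` directly
    history, shown = [], []
    for _ in range(turns):
        top = generate(belief, items, k=1)[0]
        shown.append(top)
        critique = simulate_critique(target, items[top])
        history.append(critique)
        belief = estimate(belief, critique)
    return shown, belief, evaluate(belief, history)

if __name__ == "__main__":
    items = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    shown, belief, score = run_session(items, target=[0.0, 1.0])
    print(shown, belief, score)
```

The agent only ever observes critiques, never the target itself, which is the partial observability the abstract refers to; the belief vector plays the role of the estimated preferences that the Generator ranks against and the Evaluator scores.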

Funding Information

Engineering and Physical Sciences Research Council (EP/R018634/1)

Cited by 8 articles