A stochastic model of human-machine interaction for learning dialog strategies

1 January 2000

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Speech and Audio Processing

Vol. 8 (1), 11-23
https://doi.org/10.1109/89.817450

Abstract

In this paper, we propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) are available within the MDP framework, based on reinforcement learning. For an effective use of the available training data, we propose a combination of supervised and reinforcement learning: supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user's behavior. Then a reinforcement learning algorithm is used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups.

Index Terms: Dialog systems, Markov decision process, reinforcement learning, sequential decision process, speech, spoken language systems.
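The two-stage recipe in the abstract (estimate a user model from data, then learn a strategy against the simulated user) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's ATIS setup: the two-slot task, the `ask`/`close` action set, the cost constants, the made-up corpus counts, and the choice of tabular Q-learning, which is one standard reinforcement learning algorithm that the abstract does not itself name.

```python
import random

# Hypothetical toy task, not the ATIS system from the paper.
N_SLOTS = 2                 # state = number of slots filled so far (0..N_SLOTS)
ACTIONS = ("ask", "close")  # ask for the next missing slot, or end the dialog
TURN_COST = 1.0             # assumed cost of one more exchange with the user
MISSING_COST = 5.0          # assumed cost per slot still unfilled at closing


def estimate_user_model(answered, prompted):
    """Supervised step: maximum-likelihood estimate of P(slot filled | asked)."""
    return answered / prompted


def simulate_turn(filled, action, p_answer, rng):
    """Simulated user plus cost assignment: returns (next_state, cost, done)."""
    if action == "close":
        return filled, MISSING_COST * (N_SLOTS - filled), True
    if filled < N_SLOTS and rng.random() < p_answer:
        filled += 1
    return filled, TURN_COST, False


def q_learning(p_answer, episodes=5000, alpha=0.1, epsilon=0.2,
               max_turns=50, seed=0):
    """Tabular Q-learning on expected total dialog cost (lower is better)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_turns):
            # Epsilon-greedy exploration over the action set.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = min(ACTIONS, key=lambda a: q[(state, a)])
            nxt, cost, done = simulate_turn(state, action, p_answer, rng)
            # Cost-to-go of the best follow-up action (zero once the dialog ends).
            future = 0.0 if done else min(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            if done:
                break
            state = nxt
    return q


def greedy_strategy(q):
    """Strategy = the lowest-cost action in each state."""
    return {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


# Made-up corpus counts: 80 of 100 logged prompts yielded a usable answer.
p_hat = estimate_user_model(answered=80, prompted=100)
strategy = greedy_strategy(q_learning(p_hat))
print(strategy)
```

With these assumed costs, asking is cheaper in expectation than closing early, so the learned strategy should keep asking until both slots are filled and only then close. Changing `TURN_COST` or `MISSING_COST` changes the objective function and can change the learned strategy, which mirrors the abstract's point that the criterion encodes the application's priorities.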


Cited by 253 articles