Efficient model learning for dialog management

Bibliographic Details
Published in:	2007 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 65-72
Main Authors:	Doshi, Finale; Roy, Nicholas
Format:	Conference Proceeding
Language:	English
Published:	New York, NY, USA: ACM; IEEE, 10.03.2007
Series:	ACM Conferences
Subjects:	Abstracts; Computing methodologies > Machine learning > Learning paradigms; Computing methodologies > Machine learning > Machine learning approaches > Markov decision processes; Convergence; decision-making under uncertainty; Face; History; Human-robot interaction; model learning; Planning; Pragmatics; Robots; Theory of computation > Theory and algorithms for application domains > Machine learning theory > Markov decision processes
ISBN:	1595936173, 9781595936172
ISSN:	2167-2121

Description
Summary:	Intelligent planning algorithms such as the Partially Observable Markov Decision Process (POMDP) have succeeded in dialog management applications [10, 11, 12] because they are robust to the inherent uncertainty of human interaction. Like all dialog planning systems, however, POMDPs require an accurate model of the user (e.g., what the user might say or want). POMDPs are generally specified using a large probabilistic model with many parameters. These parameters are difficult to specify from domain knowledge, and gathering enough data to estimate the parameters accurately a priori is expensive. In this paper, we take a Bayesian approach to learning the user model simultaneously with the dialog manager policy. At the heart of our approach is an efficient incremental update algorithm that allows the dialog manager to replan just long enough to improve the current dialog policy given data from recent interactions. The update process has a relatively small computational cost, preventing long delays in the interaction. We are able to demonstrate a robust dialog manager that learns from interaction data, outperforming a hand-coded model in simulation and in a robotic wheelchair application.
DOI:	10.1145/1228716.1228726
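
Illustration:	The summary describes a Bayesian approach to learning the user model from interaction data but does not give the form of the update. The sketch below is one common way to do this, not the authors' algorithm: it assumes a discrete user model with a Dirichlet prior (stored as pseudo-counts) over what the user says given what the user wants. The class name, intent labels, utterance labels, and the assumption that the intent is known once a dialog ends are all hypothetical.

```python
class DirichletUserModel:
    """Pseudo-counts for P(utterance | intent); hypothetical illustration only."""

    def __init__(self, intents, utterances, prior_count=1.0):
        # Start every (intent, utterance) pair at the same prior pseudo-count.
        self.counts = {
            intent: {utt: float(prior_count) for utt in utterances}
            for intent in intents
        }

    def update(self, intent, utterance, weight=1.0):
        # Incremental Bayesian update: add the observation to the counts.
        # Assumes the intent was confirmed when the dialog finished; with an
        # unconfirmed intent, `weight` could be the belief in that intent.
        self.counts[intent][utterance] += weight

    def expected_model(self, intent):
        # Posterior mean of the Dirichlet: normalized counts.
        total = sum(self.counts[intent].values())
        return {utt: c / total for utt, c in self.counts[intent].items()}


model = DirichletUserModel(
    intents=["go_kitchen", "go_office"],
    utterances=["kitchen", "office", "noise"],
)
# Three finished dialogs in which the user wanted the kitchen and said "kitchen".
for _ in range(3):
    model.update("go_kitchen", "kitchen")

probs = model.expected_model("go_kitchen")
print(probs)
```

A planner could re-solve the dialog policy against `expected_model` after each batch of updates; how long to replan is the part the paper's incremental algorithm addresses and is not shown here.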