Reinforcement learning from human reward: Discounting in episodic tasks

Abstract
Several studies have demonstrated that teaching agents with human-generated reward can be a powerful technique. However, the algorithmic space for learning from human reward has not yet been explored systematically. Using model-based reinforcement learning from human reward in goal-based, episodic tasks, we investigate how anticipated future rewards should be discounted to create behavior that performs well on the task the human trainer intends to teach. We identify a “positive circuits” problem with low discounting (i.e., high discount factors) that arises from an observed bias among human trainers toward giving positive reward. Empirical analyses indicate that high discounting (i.e., low discount factors) of human reward is necessary in goal-based, episodic tasks and lend credence to the existence of the positive circuits problem.
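For readers unfamiliar with the terminology, the discounting in question is the standard exponential weighting of future reward in reinforcement learning; the notation below is a generic sketch, not necessarily the paper’s exact formulation:

\[
R_t = \sum_{k=0}^{\infty} \gamma^k \, r_{t+k+1}, \qquad \gamma \in [0, 1],
\]

where $r_{t+k+1}$ is the (human-delivered) reward received $k+1$ steps after time $t$ and $\gamma$ is the discount factor. A $\gamma$ near 1 weights distant rewards almost as heavily as immediate ones (low discounting), while a $\gamma$ near 0 suppresses them (high discounting). The positive circuits problem follows from this arithmetic: if the trainer gives mostly positive reward, then under a $\gamma$ near 1 a behavioral cycle that repeatedly collects small positive rewards can accrue a larger discounted return than proceeding to the goal and ending the episode, so the agent is incentivized to loop rather than finish the task.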