A New Improved Penalty Avoiding Rational Policy Making Algorithm for Keepaway with Continuous State Spaces
- 20 November 2009
- journal article
- Published by Fuji Technology Press Ltd. in Journal of Advanced Computational Intelligence and Intelligent Informatics
- Vol. 13 (6), 675-682
- https://doi.org/10.20965/jaciii.2009.p0675
Abstract
The penalty avoiding rational policy making algorithm (PARP) [1], previously improved to save memory and cope with uncertainty as IPARP [2], requires that states be discretized, by function approximation or some other method, when it is applied to real environments with continuous state spaces. In particular, a method that discretizes the state space for PARP using basis functions is known [3]. However, because this method creates a new basis function from the current input and its next observation, it may generate unsuitable basis functions in some asynchronous multiagent environments. We therefore propose a uniform basis function whose range is estimated before learning. We show the effectiveness of our proposal on a soccer task called “Keepaway.”
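The abstract contrasts basis functions generated online from the current input and its next observation with a uniform basis whose range is fixed from observations gathered before learning. The following is a minimal sketch of that latter idea only, not the authors' exact formulation: hypothetical helpers `estimate_range` and `make_uniform_basis` place evenly spaced Gaussian basis functions over each continuous state variable, with the grid range estimated from pre-learning samples.

```python
import numpy as np

def estimate_range(samples):
    """Estimate the range of each state variable from samples
    collected before learning (assumption: simple min/max bounds)."""
    samples = np.asarray(samples)
    return samples.min(axis=0), samples.max(axis=0)

def make_uniform_basis(low, high, n_centers=10):
    """Place n_centers Gaussian basis functions uniformly over
    [low, high] in each state dimension; widths follow the spacing."""
    centers = np.stack(
        [np.linspace(l, h, n_centers) for l, h in zip(low, high)], axis=1
    )  # shape: (n_centers, n_dims)
    widths = (np.asarray(high) - np.asarray(low)) / (n_centers - 1)

    def features(state):
        # One Gaussian activation per (center, dimension), flattened.
        diffs = (np.asarray(state)[None, :] - centers) / widths[None, :]
        return np.exp(-0.5 * diffs ** 2).ravel()

    return features

# Usage sketch: sample continuous state variables (e.g., distances in
# Keepaway, hypothetical data here), estimate the range once, then build
# the fixed uniform basis used throughout learning.
pre_samples = np.random.uniform(0.0, 20.0, size=(500, 3))
low, high = estimate_range(pre_samples)
phi = make_uniform_basis(low, high, n_centers=8)
print(phi(np.array([5.0, 10.0, 15.0])).shape)  # (8 * 3,) feature vector
```

Because the basis is fixed in advance, no new basis functions are created from successive observations during learning, which is the situation the abstract identifies as problematic in asynchronous multiagent environments.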
References:
- Extension of Improved Penalty Avoiding Rational Policy Making Algorithm to Tile Coding Environment for Keepaway Tasks. IEEE, 2008.
- Reinforcement Learning for Penalty Avoidance in Continuous State Spaces. Journal of Advanced Computational Intelligence and Intelligent Informatics, 2007.
- Experimental Analysis of Reward Design for Continuing Task in Multiagent Domains: RoboCup Soccer Keepaway. Transactions of the Japanese Society for Artificial Intelligence, 2006.
- Reinforcement Learning for RoboCup Soccer Keepaway. Adaptive Behavior, 2005.
- Reinforcement Learning for Penalty Avoiding Policy Making. IEEE, 2002.