A New Improved Penalty Avoiding Rational Policy Making Algorithm for Keepaway with Continuous State Spaces
- 20 November 2009
- journal article
- Published by Fuji Technology Press Ltd. in Journal of Advanced Computational Intelligence and Intelligent Informatics
- Vol. 13 (6), 675-682
- https://doi.org/10.20965/jaciii.2009.p0675
Abstract
The penalty avoiding rational policy making algorithm (PARP) [1], previously improved to save memory and cope with uncertainty as IPARP [2], requires that states be discretized, by function approximation or some other method, when it is applied to real environments with continuous state spaces. In particular, a method that discretizes the state space for PARP using basis functions is known [3]. However, because this method creates a new basis function from the current input and its next observation, it may generate unsuitable basis functions in some asynchronous multiagent environments. We therefore propose a uniform basis function whose range is estimated before learning. We show the effectiveness of our proposal on a soccer task called “Keepaway.”
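The abstract contrasts basis functions generated online from the current input and its next observation with a uniform basis whose range is fixed from observations gathered before learning. The following is a minimal sketch of that latter idea only, not the authors' exact formulation: hypothetical helpers `estimate_range` and `make_uniform_basis` place evenly spaced Gaussian basis functions over each continuous state variable, with the grid range estimated from pre-learning samples.

```python
import numpy as np

def estimate_range(samples):
    """Estimate the range of each state variable from samples
    collected before learning (assumption: simple min/max bounds)."""
    samples = np.asarray(samples)
    return samples.min(axis=0), samples.max(axis=0)

def make_uniform_basis(low, high, n_centers=10):
    """Place n_centers Gaussian basis functions uniformly over
    [low, high] in each state dimension; widths follow the spacing."""
    centers = np.stack(
        [np.linspace(l, h, n_centers) for l, h in zip(low, high)], axis=1
    )  # shape: (n_centers, n_dims)
    widths = (np.asarray(high) - np.asarray(low)) / (n_centers - 1)

    def features(state):
        # One Gaussian activation per (center, dimension), flattened.
        diffs = (np.asarray(state)[None, :] - centers) / widths[None, :]
        return np.exp(-0.5 * diffs ** 2).ravel()

    return features

# Usage sketch: sample continuous state variables (e.g., distances in
# Keepaway, hypothetical data here), estimate the range once, then build
# the fixed uniform basis used throughout learning.
pre_samples = np.random.uniform(0.0, 20.0, size=(500, 3))
low, high = estimate_range(pre_samples)
phi = make_uniform_basis(low, high, n_centers=8)
print(phi(np.array([5.0, 10.0, 15.0])).shape)  # (8 * 3,) feature vector
```

Because the basis is fixed in advance, no new basis functions are created from successive observations during learning, which is the situation the abstract identifies as problematic in asynchronous multiagent environments.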
References:
- Extension of Improved Penalty Avoiding Rational Policy Making Algorithm to Tile Coding Environment for Keepaway Tasks. IEEE, 2008.
- Reinforcement Learning for Penalty Avoidance in Continuous State Spaces. Journal of Advanced Computational Intelligence and Intelligent Informatics, 2007.
- Experimental Analysis of Reward Design for Continuing Task in Multiagent Domains: RoboCup Soccer Keepaway. Transactions of the Japanese Society for Artificial Intelligence, 2006.
- Reinforcement Learning for RoboCup Soccer Keepaway. Adaptive Behavior, 2005.
- Reinforcement Learning for Penalty Avoiding Policy Making. IEEE, 2002.