Approximate policy iteration: a survey and some new methods
Open Access
- 19 July 2011
- journal article
- Published by Springer Science and Business Media LLC in Control Theory and Technology
- Vol. 9 (3), 310-335
- https://doi.org/10.1007/s11768-011-1005-3
Abstract
No abstract available
This publication has 48 references indexed in Scilit:
- Projected equation methods for approximate solution of large linear systems. Journal of Computational and Applied Mathematics, 2009
- Learning Tetris Using the Noisy Cross-Entropy Method. Neural Computation, 2006
- A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning. Discrete Event Dynamic Systems, 2006
- Basis Function Adaptation in Temporal Difference Reinforcement Learning. Annals of Operations Research, 2005
- A Tutorial on the Cross-Entropy Method. Annals of Operations Research, 2005
- 10.1162/jmlr.2003.4.6.1107. Applied Physics Letters, 2000
- Mean-Field Theory for Batched TD(λ). Neural Computation, 1997
- An analysis of temporal-difference learning with function approximation. IEEE Transactions on Automatic Control, 1997
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms. Neural Computation, 1994
- Distributed dynamic programming. IEEE Transactions on Automatic Control, 1982