Algorithms for Reinforcement Learning
- 1 January 2010
- journal article
- Published by Springer Science and Business Media LLC in Synthesis Lectures on Artificial Intelligence and Machine Learning
- Vol. 4 (1), 1-103
- https://doi.org/10.2200/s00268ed1v01y201005aim009
Abstract
Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. What distinguishes reinforcement learning from supervised learning is that only partial feedback is given to the learner about the learner's predictions. Further, the predictions may have long-term effects through influencing the future state of the controlled system. Thus, time plays a special role. The goal in reinforcement learning is to develop efficient learning algorithms, as well as to understand the algorithms' merits and limitations. Reinforcement learning is of great interest because of the large number of practical applications that it can be used to address, ranging from problems in artificial intelligence to operations research or control engineering. In this book, we focus on those algorithms of reinforcement learning that build on the powerful theory of dynamic programming. We give a fairly comprehensive ca...
This publication has 100 references indexed in Scilit:
- Natural Actor-Critic. Neurocomputing, 2008
- Learning Representation and Control in Markov Decision Processes: New Frontiers. Foundations and Trends® in Machine Learning, 2007
- Opportunities and challenges in using online preference data for vehicle pricing: A case study at General Motors. Journal of Revenue and Pricing Management, 2006
- Basis Function Adaptation in Temporal Difference Reinforcement Learning. Annals of Operations Research, 2005
- On the Almost Sure Rate of Convergence of Linear Stochastic Approximation Algorithms. IEEE Transactions on Information Theory, 2004
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms. Neural Computation, 1994
- Asynchronous stochastic approximation and Q-learning. Machine Learning, 1994
- Applied Nonparametric Regression. Biometrics, 1994
- Likelihood ratio gradient estimation for stochastic systems. Communications of the ACM, 1990
- A theory of cerebellar function. Mathematical Biosciences, 1971