Adaptive Critic Designs for Discrete-Time Zero-Sum Games With Application to $H_{\infty}$ Control
- 22 January 2007
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
- Vol. 37 (1), 240-247
- https://doi.org/10.1109/tsmcb.2006.880135
Abstract
In this correspondence, adaptive critic approximate dynamic programming designs are derived to solve the discrete-time zero-sum game in which the state and action spaces are continuous. This results in a forward-in-time reinforcement learning algorithm that converges to the Nash equilibrium of the corresponding zero-sum game. The results in this correspondence can be thought of as a way to solve the Riccati equation of the well-known discrete-time $H_{\infty}$ optimal control problem forward in time. Two schemes are presented, namely: 1) a heuristic dynamic programming and 2) a dual-heuristic dynamic programming, to solve for the value function and the costate of the game, respectively. An $H_{\infty}$ autopilot design for an F-16 aircraft is presented to illustrate the results.
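To make the idea of solving the game Riccati equation "forward in time" concrete, the following is a minimal sketch, not the authors' implementation. It assumes linear dynamics x_{k+1} = A x_k + B u_k + E w_k, a quadratic cost x'Qx + u'Ru - gamma^2 w'w, and a quadratic value function V(x) = x'Px; none of these symbols, nor the scalar numbers used, appear in the abstract. It iterates the standard zero-sum game Riccati recursion directly on known matrices, whereas the adaptive critic schemes described above learn the value function or costate with function approximators.

```python
import numpy as np

def game_riccati_step(P, A, B, E, Q, R, gamma):
    """One value-iteration update of the zero-sum game Riccati recursion."""
    q = E.shape[1]
    # Block matrix coupling the control (minimizer) and disturbance (maximizer).
    M = np.block([
        [R + B.T @ P @ B, B.T @ P @ E],
        [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(q)],
    ])
    N = np.vstack([B.T @ P @ A, E.T @ P @ A])
    return Q + A.T @ P @ A - N.T @ np.linalg.solve(M, N)

def solve_game_riccati(A, B, E, Q, R, gamma, iters=500, tol=1e-12):
    """Iterate from P = 0 until successive updates differ by less than tol."""
    P = np.zeros_like(Q, dtype=float)
    for _ in range(iters):
        P_next = game_riccati_step(P, A, B, E, Q, R, gamma)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P

# Hypothetical scalar system, chosen only for illustration.
A = np.array([[0.9]])
B = np.array([[1.0]])
E = np.array([[0.5]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
gamma = 5.0

P = solve_game_riccati(A, B, E, Q, R, gamma)
# How far P is from being a fixed point of the recursion.
residual = np.max(np.abs(game_riccati_step(P, A, B, E, Q, R, gamma) - P))
print("P =", P, "residual =", residual)
```

Starting from P = 0 mirrors a value function initialized to zero. The sketch does not check the saddle-point conditions (R + B'PB positive definite and E'PE - gamma^2 I negative definite), which should be verified for any other choice of system or gamma.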
This publication has 11 references indexed in Scilit:
- Adaptive linear quadratic control using policy iteration. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Handbook of Learning and Approximate Dynamic Programming. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Hamilton-Jacobi-Isaacs formulation for constrained input nonlinear systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Adaptive dynamic programming. IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 2002
- Value-function reinforcement learning in Markov games. Cognitive Systems Research, 2001
- Adaptive critic designs. IEEE Transactions on Neural Networks, 1997
- The discrete-time Riccati equation related to the $H_{\infty}$ control problem. IEEE Transactions on Automatic Control, 1994
- Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Transactions on Systems, Man, and Cybernetics, 1983
- Kronecker products and matrix calculus in system theory. IEEE Transactions on Circuits and Systems, 1978
- Punish/Reward: Learning with a Critic in Adaptive Threshold Systems. IEEE Transactions on Systems, Man, and Cybernetics, 1973