Adaptive Critic Designs for Discrete-Time Zero-Sum Games With Application to $H_{\infty}$ Control

Abstract

In this correspondence, adaptive critic approximate dynamic programming designs are derived to solve the discrete-time zero-sum game in which the state and action spaces are continuous. This results in a forward-in-time reinforcement learning algorithm that converges to the Nash equilibrium of the corresponding zero-sum game. The results in this correspondence can be thought of as a way to solve the Riccati equation of the well-known discrete-time $H_{\infty}$ optimal control problem forward in time. Two schemes are presented: 1) heuristic dynamic programming, which solves for the value function of the game, and 2) dual-heuristic dynamic programming, which solves for its costate. An $H_{\infty}$ autopilot design for an F-16 aircraft is presented to illustrate the results.
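To make the "forward in time" idea concrete, the following is a minimal sketch and not the correspondence's implementation. It assumes a linear system x_{k+1} = A x_k + B u_k + E w_k, a stage cost x'Qx + u'u - gamma^2 w'w, and a quadratic value function V(x) = x'Px; under those assumptions a value-function update of this kind reduces to iterating the game Riccati recursion starting from P = 0. The matrices, the attenuation level gamma, and the function names are illustrative choices, not values from the paper (in particular, this is not the F-16 model).

```python
import numpy as np

def game_riccati_step(P, A, B, E, Q, gamma):
    """One forward sweep of the zero-sum game Riccati recursion (assumed form)."""
    m, q = B.shape[1], E.shape[1]
    G = np.hstack([B, E])  # stacked input matrix [B E]
    # Weight on the stacked input [u; w]: +I for control, -gamma^2 I for disturbance.
    R = np.block([[np.eye(m), np.zeros((m, q))],
                  [np.zeros((q, m)), -gamma**2 * np.eye(q)]])
    M = R + G.T @ P @ G
    L = G.T @ P @ A
    return Q + A.T @ P @ A - L.T @ np.linalg.solve(M, L)

def solve_game_riccati(A, B, E, Q, gamma, tol=1e-12, max_iter=5000):
    """Iterate from P = 0 until successive iterates stop changing."""
    P = np.zeros_like(Q, dtype=float)
    for i in range(max_iter):
        P_next = game_riccati_step(P, A, B, E, Q, gamma)
        if np.linalg.norm(P_next - P) < tol:
            return P_next, i + 1
        P = P_next
    return P, max_iter

def saddle_point_gains(P, A, B, E, gamma):
    """Gains with u = -Ku x (control) and w = -Kw x (worst-case disturbance)."""
    m, q = B.shape[1], E.shape[1]
    G = np.hstack([B, E])
    R = np.block([[np.eye(m), np.zeros((m, q))],
                  [np.zeros((q, m)), -gamma**2 * np.eye(q)]])
    K = np.linalg.solve(R + G.T @ P @ G, G.T @ P @ A)
    return K[:m], K[m:]

# Hypothetical 2-state example; all numbers are illustrative assumptions.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
Q = np.eye(2)
gamma = 5.0

P, n_iter = solve_game_riccati(A, B, E, Q, gamma)
Ku, Kw = saddle_point_gains(P, A, B, E, gamma)
print("iterations:", n_iter)
print("P =\n", P)
print("control gain Ku =", Ku, " disturbance gain Kw =", Kw)
```

The sketch uses the known model matrices directly, which keeps it short; an adaptive critic design would instead fit the value function (or, for the dual-heuristic scheme, the costate) with a parametric approximator. The recursion is only meaningful while E'PE - gamma^2 I stays negative definite, so gamma must be chosen large enough for the system at hand.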

