Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning
- 11 April 2014
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Automatic Control
- Vol. 59 (11), 3051-3056
- https://doi.org/10.1109/tac.2014.2317301
Abstract
In this technical note, an online learning algorithm is developed to solve the linear quadratic tracking (LQT) problem for partially-unknown continuous-time systems. It is shown that the value function is quadratic in terms of the state of the system and the command generator. Based on this quadratic form, an LQT Bellman equation and an LQT algebraic Riccati equation (ARE) are derived to solve the LQT problem. The integral reinforcement learning technique is used to find the solution to the LQT ARE online and without requiring the knowledge of the system drift dynamics or the command generator dynamics. The convergence of the proposed online algorithm to the optimal control solution is verified. To show the efficiency of the proposed approach, a simulation example is provided.
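The abstract does not state the equations themselves, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes a discounted quadratic tracking cost with discount factor gamma, an augmented state X = [x; y_d] that stacks the system state and the command generator state, and an ARE of the form T'P + PT - gamma*P + Q_T - P*B1*R^{-1}*B1'*P = 0. It solves that equation offline with full model knowledge, which gives a reference solution that an online, partially model-free learner could be compared against. The function name, the scalar plant, and all numeric values are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def solve_discounted_lqt_are(A, B, C, F, Q, R, gamma):
    """Model-based solve of an ASSUMED discounted LQT ARE on X = [x; y_d].

    Plant: x' = A x + B u, y = C x. Command generator: y_d' = F y_d.
    """
    n, m = B.shape
    p = F.shape[0]
    # Augmented dynamics: X' = T X + B1 u (the reference is not actuated).
    T = np.block([[A, np.zeros((n, p))], [np.zeros((p, n)), F]])
    B1 = np.vstack([B, np.zeros((p, m))])
    # Tracking error e = C x - y_d = [C, -I] X, so the state weight is E' Q E.
    E = np.hstack([C, -np.eye(p)])
    QT = E.T @ Q @ E
    # The discount term folds into a shifted drift matrix:
    # T'P + PT - gamma*P = (T - gamma/2 I)'P + P(T - gamma/2 I).
    Ts = T - 0.5 * gamma * np.eye(n + p)
    P = solve_continuous_are(Ts, B1, QT, R)
    K = np.linalg.solve(R, B1.T @ P)  # feedback gain, u = -K X
    return P, K, T, B1, QT


# Hypothetical scalar example: stable plant tracking a constant reference.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
F = np.array([[0.0]])  # y_d' = 0, i.e. a constant command
Q = np.array([[1.0]])
R = np.array([[1.0]])
gamma = 0.1

P, K, T, B1, QT = solve_discounted_lqt_are(A, B, C, F, Q, R, gamma)

# Residual of the assumed discounted ARE; it should be numerically zero.
residual = T.T @ P + P @ T - gamma * P + QT - P @ B1 @ np.linalg.solve(R, B1.T @ P)
print("P =\n", P)
print("K =", K)
print("max |residual| =", np.abs(residual).max())
```

The shift by gamma/2 is what lets a standard ARE solver handle a command generator that is not asymptotically stable, such as the constant reference used here: the uncontrollable reference mode only needs to be stable after the shift.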
Funding Information
- National Science Foundation (ECCS-1128050)
- NSF (IIS-1208623)
- ONR (N00014-13-1-0562)
- AFOSR EOARD (13-3055)
- China NNSF (61120106011)
- China Education Ministry Project 111 (B08015)
This publication has 22 references indexed in Scilit:
- Optimal Tracking Control Scheme for Discrete-Time Nonlinear Systems with Approximation Errors. Lecture Notes in Computer Science, 2013
- Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems. Automatica, 2012
- Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica, 2012
- Integral reinforcement learning with explorations for continuous-time nonlinear systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2012
- Reinforcement Learning for Partially Observable Dynamic Processes: Adaptive Dynamic Programming Using Measured Output Data. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2010
- Optimal tracking control of affine nonlinear discrete-time systems with unknown internal dynamics. Published by Institute of Electrical and Electronics Engineers (IEEE), 2009
- Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica, 2009
- A Novel Infinite-Time Optimal Tracking Control Scheme for a Class of Discrete-Time Nonlinear Systems via the Greedy HDP Iteration Algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2008
- Policy Iterations on the Hamilton–Jacobi–Isaacs Equation for $H_{\infty}$ State Feedback Control With Input Saturation. IEEE Transactions on Automatic Control, 2006
- Adaptive linear quadratic control using policy iteration. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005