Abstract
Differential dynamic programming is a technique, based on dynamic programming rather than the calculus of variations, for determining the optimal control function of a nonlinear system. Unlike conventional dynamic programming, where the optimal cost function is considered globally, differential dynamic programming applies the principle of optimality in the neighborhood of a nominal, possibly nonoptimal, trajectory. This allows the coefficients of a linear or quadratic expansion of the cost function to be computed in reverse time along the trajectory: these coefficients may then be used to yield a new, improved trajectory (i.e., the algorithms are of the "successive sweep" type). A class of nonlinear control problems, linear in the control variables, is studied using differential dynamic programming. It is shown that for the free-end-point problem, the first partial derivatives of the optimal cost function are continuous throughout the state space, and the second partial derivatives experience jumps at switch points of the control function. A control problem that has an analytic solution is used to illustrate these points. The fixed-end-point problem is converted into an equivalent free-end-point problem by adjoining the end-point constraints to the cost functional using Lagrange multipliers: a useful interpretation of Pontryagin's adjoint variables for this type of problem emerges from this treatment. The above results are used to devise new second- and first-order algorithms for determining the optimal bang-bang control by successively improving a nominal guessed control function. The usefulness of the proposed algorithms is illustrated by the computation of a number of control problem examples.
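The "successive sweep" structure described above can be made concrete with a minimal sketch: a reverse-time backward pass propagates the first and second partial derivatives of the cost-to-go (V_x, V_xx) along a nominal trajectory, and a forward pass uses the resulting gains to generate an improved trajectory. The sketch below is an illustrative assumption, not the paper's algorithm: it uses a hypothetical scalar system x_{k+1} = x_k + dt*(sin(x_k) + u_k) and a smooth quadratic control cost, whereas the paper treats problems linear in the control (bang-bang solutions); all parameter values are made up for the example.

```python
# A minimal DDP sketch on an assumed toy scalar system, not the paper's examples.
import numpy as np

dt, N = 0.05, 100           # step size and horizon (assumed values)
q, r, qf = 1.0, 0.1, 10.0   # running/terminal cost weights (assumed)

def f(x, u):                 # dynamics: one Euler step of x' = sin(x) + u
    return x + dt * (np.sin(x) + u)

def rollout(x0, u, k=None, K=None, x_ref=None, alpha=1.0):
    """Simulate forward; optionally apply the DDP policy update."""
    x = np.empty(N + 1); x[0] = x0
    u_new = u.copy()
    for i in range(N):
        if k is not None:
            u_new[i] = u[i] + alpha * k[i] + K[i] * (x[i] - x_ref[i])
        x[i + 1] = f(x[i], u_new[i])
    return x, u_new

def cost(x, u):
    return 0.5 * np.sum(q * x[:-1]**2 + r * u**2) + 0.5 * qf * x[-1]**2

def backward_sweep(x, u):
    """Reverse-time pass: propagate V_x, V_xx and store the feedback law."""
    Vx, Vxx = qf * x[-1], qf          # terminal expansion of the cost-to-go
    k, K = np.empty(N), np.empty(N)
    for i in reversed(range(N)):
        fx, fu, fxx = 1.0 + dt * np.cos(x[i]), dt, -dt * np.sin(x[i])
        Qx  = q * x[i] + fx * Vx
        Qu  = r * u[i] + fu * Vx
        Qxx = q + fx * Vxx * fx + Vx * fxx  # Vx*fxx is the second-order term
        Quu = r + fu * Vxx * fu
        Qux = fu * Vxx * fx
        k[i], K[i] = -Qu / Quu, -Qux / Quu  # open-loop and feedback gains
        Vx  = Qx + K[i] * Quu * k[i] + K[i] * Qu + Qux * k[i]
        Vxx = Qxx + K[i] * Quu * K[i] + 2.0 * K[i] * Qux
    return k, K

x0, u = 1.5, np.zeros(N)
x, _ = rollout(x0, u)
for it in range(20):                  # successive-sweep iterations
    k, K = backward_sweep(x, u)
    for alpha in (1.0, 0.5, 0.25):    # crude line search on the step size
        x_new, u_new = rollout(x0, u, k, K, x, alpha)
        if cost(x_new, u_new) < cost(x, u):
            x, u = x_new, u_new
            break
print("final cost:", cost(x, u))
```

In the control-linear setting the paper actually studies, Quu degenerates (the running cost has no quadratic control term), which is precisely why the optimal control is bang-bang and why V_xx jumps at switch points; the smooth quadratic cost above is a deliberate simplification to keep the sweep structure visible.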
