Model-free control of nonlinear stochastic systems with discrete-time measurements

Consider the problem of developing a controller for general (nonlinear and stochastic) systems where the equations governing the system are unknown. Using discrete-time measurements, this paper presents an approach for estimating a controller without building or assuming a model for the system. Such an approach has potential advantages in accommodating complex systems with possibly time-varying dynamics. The controller is constructed through use of a function approximator, such as a neural network or polynomial. This paper considers the use of the simultaneous perturbation stochastic approximation (SPSA) algorithm, which requires only system measurements. A convergence result for stochastic approximation algorithms with time-varying objective functions and feedback is established. It is shown that the SPSA algorithm can be substantially more efficient than more standard stochastic approximation algorithms based on finite-difference gradient approximations.
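
To make the measurement-only character of simultaneous perturbation concrete, the following is a minimal Python sketch of a generic SPSA iteration. It is not the controller-estimation procedure of this paper. It minimizes a fixed noisy loss rather than a time-varying control objective with feedback, and every name in it (spsa_gradient, spsa_minimize, make_noisy_quadratic), the Bernoulli +/-1 perturbations, and the gain constants a, c, A, alpha, gamma are assumptions chosen for illustration. The point it illustrates is that each gradient estimate uses two loss measurements regardless of the number of parameters, whereas a two-sided finite-difference estimate needs two measurements per parameter.

```python
import random


def spsa_gradient(loss, theta, c, rng):
    """Estimate the gradient of `loss` at `theta` from two measurements."""
    # Perturb all components at once with independent +/-1 draws
    # (assumed perturbation distribution for this sketch).
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    # Only two (possibly noisy) loss measurements, whatever len(theta) is.
    diff = loss(plus) - loss(minus)
    return [diff / (2.0 * c * d) for d in delta]


def spsa_minimize(loss, theta0, iterations=2000, a=0.5, c=0.1,
                  A=100.0, alpha=0.602, gamma=0.101, seed=0):
    """Run a basic SPSA recursion; gain constants are illustrative assumptions."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        g = spsa_gradient(loss, theta, c_k, rng)
        theta = [t - a_k * g_i for t, g_i in zip(theta, g)]
    return theta


def make_noisy_quadratic(target, noise_std=0.01, seed=1):
    """Hypothetical stand-in for a measured performance criterion."""
    rng = random.Random(seed)

    def loss(theta):
        exact = sum((t - m) ** 2 for t, m in zip(theta, target))
        return exact + rng.gauss(0.0, noise_std)  # measurement noise

    return loss
```

As a usage example, spsa_minimize(make_noisy_quadratic([1.0, 1.0, 1.0]), [0.0, 0.0, 0.0]) should return a parameter vector close to the target. In a control setting the loss callable would be replaced by a measured performance criterion of the closed-loop system, and theta would hold the weights of the function approximator. That substitution is left out here because it depends on details not given in the abstract.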