Deep Learning Techniques for Autonomous Spacecraft Guidance During Proximity Operations

Abstract
This paper investigates the use of deep learning techniques for real-time optimal spacecraft guidance during terminal rendezvous maneuvers, in the presence of both operational constraints and stochastic effects, such as inaccurate knowledge of the initial spacecraft state and random in-flight disturbances. The performance of two well-studied deep learning methods, behavioral cloning (BC) and reinforcement learning (RL), is investigated on a linear multi-impulsive rendezvous mission. To this end, a multilayer perceptron network with a custom architecture is designed to map any observation of the actual spacecraft relative position and velocity to the propellant-optimal control action, which corresponds to a bounded-magnitude impulsive velocity variation. In the BC approach, the deep neural network is trained by supervised learning on a set of optimal trajectories, generated by repeatedly solving the deterministic optimal control problem via convex optimization, starting from scattered initial conditions. Conversely, in the RL approach, a state-of-the-art actor–critic algorithm, proximal policy optimization, is used to train the network through repeated interactions with the stochastic environment. Finally, the robustness and propellant efficiency of the resulting closed-loop control policies are assessed and compared by means of a Monte Carlo analysis, carried out on different test cases with increasing levels of perturbations.
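The core architectural idea described above, a feedforward network mapping the 6-D relative state (position and velocity) to a bounded-magnitude impulsive velocity variation, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the layer sizes, the delta-v bound `DV_MAX`, and the saturation rule are hypothetical placeholders.

```python
import numpy as np

DV_MAX = 0.5  # hypothetical per-impulse delta-v bound [m/s]


def relu(x):
    return np.maximum(0.0, x)


class PolicyMLP:
    """Multilayer perceptron mapping a 6-D relative state (position,
    velocity) to a 3-D impulsive delta-v whose magnitude is saturated
    at DV_MAX, mirroring the bounded-magnitude control action described
    in the abstract. Weights are randomly initialized for illustration;
    in BC they would be fit by supervised learning on optimal
    trajectories, in RL by proximal policy optimization."""

    def __init__(self, sizes=(6, 64, 64, 3), seed=0):
        rng = np.random.default_rng(seed)
        # He-style initialization for the ReLU hidden layers.
        self.weights = [rng.standard_normal((m, n)) * np.sqrt(2.0 / m)
                        for m, n in zip(sizes[:-1], sizes[1:])]
        self.biases = [np.zeros(n) for n in sizes[1:]]

    def __call__(self, state):
        h = np.asarray(state, dtype=float)
        for W, b in zip(self.weights[:-1], self.biases[:-1]):
            h = relu(h @ W + b)
        dv = h @ self.weights[-1] + self.biases[-1]
        # Enforce the magnitude constraint while preserving direction.
        norm = np.linalg.norm(dv)
        if norm > DV_MAX:
            dv = dv * (DV_MAX / norm)
        return dv
```

In closed loop, such a policy would be evaluated at each impulse epoch on the currently estimated relative state, so that deviations caused by navigation errors and in-flight disturbances are corrected at the next maneuver.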
