Reinforcement Learning Based Control of Coherent Transport by Adiabatic Passage of Spin Qubits

Abstract
Several tasks involving the determination of the time evolution of a system of solid-state qubits require stochastic methods to identify the best sequence of gates and the interaction time among the qubits. The major success of deep learning in several scientific disciplines has suggested its application to quantum information as well. Thanks to its capability to identify the best strategy in problems involving a competition between short-term and long-term rewards, the reinforcement learning (RL) method has been successfully applied, for instance, to discover sequences of quantum gate operations that minimize information loss. To extend the application of RL to the transfer of quantum information, we focus on Coherent Transport by Adiabatic Passage (CTAP) on a chain of three semiconductor quantum dots (QDs). This task is usually performed by the so-called counter-intuitive sequence of gate pulses. Such a sequence can coherently transfer an electronic population from the first to the last site of a chain with an odd number of QDs while leaving the central QD unpopulated. We apply a technique that finds a nearly optimal gate pulse sequence without explicitly giving the RL agent any prior knowledge of the underlying physical system. Using the advantage actor-critic algorithm, with a small neural network as function approximator, we trained an RL agent to choose the best action at every time step of the physical evolution and thereby achieve the same results previously found only by ansatz solutions.
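As an illustration of the counter-intuitive sequence mentioned above, the following minimal sketch propagates a single electron on a three-site chain under two overlapping pulses, with the coupling between sites 2 and 3 switched on before the coupling between sites 1 and 2. It is not the RL procedure of this work. The Gaussian pulse shape, the width `sigma = t_max / 8`, the units (ħ = 1), the absence of detuning and dephasing, and all function and parameter names (`gaussian_pulses`, `simulate_ctap`, `t_max`, `omega_max`, `n_steps`) are assumptions made for the example.

```python
import numpy as np


def gaussian_pulses(t, t_max, omega_max):
    """Assumed Gaussian tunnel couplings in the counter-intuitive order.

    omega_23 peaks before omega_12, although the electron starts on site 1.
    """
    sigma = t_max / 8.0  # assumed pulse width
    omega_12 = omega_max * np.exp(-((t - (t_max + sigma) / 2.0) ** 2) / (2.0 * sigma**2))
    omega_23 = omega_max * np.exp(-((t - (t_max - sigma) / 2.0) ** 2) / (2.0 * sigma**2))
    return omega_12, omega_23


def simulate_ctap(t_max=200.0, omega_max=1.0, n_steps=4000):
    """Propagate |1> under a piecewise-constant three-site Hamiltonian (hbar = 1).

    Returns an array of shape (n_steps + 1, 3) with the site populations.
    """
    dt = t_max / n_steps
    psi = np.array([1.0, 0.0, 0.0], dtype=complex)  # electron on the first site
    populations = np.empty((n_steps + 1, 3))
    populations[0] = np.abs(psi) ** 2

    for k in range(n_steps):
        # Evaluate the couplings at the midpoint of the time step.
        o12, o23 = gaussian_pulses((k + 0.5) * dt, t_max, omega_max)
        h = -np.array([[0.0, o12, 0.0],
                       [o12, 0.0, o23],
                       [0.0, o23, 0.0]])
        # Exact propagator for this step via eigendecomposition of H.
        evals, evecs = np.linalg.eigh(h)
        psi = evecs @ (np.exp(-1j * evals * dt) * (evecs.conj().T @ psi))
        populations[k + 1] = np.abs(psi) ** 2

    return populations


if __name__ == "__main__":
    pops = simulate_ctap()
    print("final population on site 3:", pops[-1, 2])
    print("maximum population on site 2:", pops[:, 1].max())
```

Under these assumed parameters the evolution is strongly adiabatic, so the population should end almost entirely on the third site while the central site stays nearly empty. Swapping the order of the two pulses in `gaussian_pulses` gives the intuitive sequence for comparison.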