Multi-Agent Reinforcement Learning Based Cognitive Anti-Jamming

Abstract
This paper proposes a reinforcement learning based approach to anti-jamming communications with wideband autonomous cognitive radios (WACRs) in a multi-agent environment. The assumed system model allows multiple WACRs to operate simultaneously over the same wide spectrum band. Each radio attempts to evade the transmissions of the other WACRs and to avoid a jammer signal that sweeps across the whole spectrum band of interest. The WACR uses its spectrum knowledge acquisition ability to detect and identify the location (in frequency) of this sweeping jammer and of the signals of other WACRs. This information is then used, together with reinforcement learning, to learn a sub-band selection policy that avoids both the jammer signal and interference from other radios. Simulations show that the proposed learning-based sub-band selection policy has low computational complexity and significantly outperforms a random sub-band selection policy.
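
The idea of learning a sub-band selection policy against a sweeping jammer can be illustrated with a small sketch. The abstract does not specify the learning algorithm, the state or reward definitions, or any parameter values, so everything below is an assumption made for illustration: tabular Q-learning, a jammer that advances by one sub-band per time step, a single learning radio, one sub-band held by a static second radio, and a reward of 1 for a collision-free transmission. The names `N_SUBBANDS`, `OCCUPIED`, `jammer_next`, `reward`, `train`, and `success_rate` are hypothetical.

```python
import random

N_SUBBANDS = 5   # assumed number of sub-bands (not from the paper)
OCCUPIED = {2}   # assumed sub-band held by another, static radio


def jammer_next(pos):
    """Assumed sweep pattern: the jammer moves up one sub-band per step."""
    return (pos + 1) % N_SUBBANDS


def reward(action, jam_pos):
    """Assumed reward: 1 if the chosen sub-band is free, 0 on a collision."""
    return 0.0 if action == jam_pos or action in OCCUPIED else 1.0


def train(steps=20000, alpha=0.5, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning; the state is the jammer's last sensed sub-band."""
    rng = random.Random(seed)
    q = [[0.0] * N_SUBBANDS for _ in range(N_SUBBANDS)]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy choice of the sub-band to transmit in next.
        if rng.random() < epsilon:
            action = rng.randrange(N_SUBBANDS)
        else:
            action = max(range(N_SUBBANDS), key=lambda a: q[state][a])
        nxt = jammer_next(state)          # jammer position during transmission
        r = reward(action, nxt)
        # Standard one-step Q-learning update.
        q[state][action] += alpha * (r + gamma * max(q[nxt]) - q[state][action])
        state = nxt
    return q


def success_rate(policy, steps=1000):
    """Fraction of steps in which the policy picks a collision-free sub-band."""
    state, ok = 0, 0
    for _ in range(steps):
        nxt = jammer_next(state)
        ok += reward(policy(state), nxt) == 1.0
        state = nxt
    return ok / steps


q_table = train()
eval_rng = random.Random(1)
learned_rate = success_rate(
    lambda s: max(range(N_SUBBANDS), key=lambda a: q_table[s][a]))
random_rate = success_rate(lambda s: eval_rng.randrange(N_SUBBANDS))
print(f"learned: {learned_rate:.2f}, random: {random_rate:.2f}")
```

In this toy setting the greedy learned policy should avoid every collision, while uniform random selection collides whenever it lands on the jammed or occupied sub-band. A multi-agent version in the spirit of the paper would replace the static `OCCUPIED` set with other radios that learn their own policies at the same time.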
