Distributed Q-learning based dynamic spectrum management in cognitive cellular systems: Choosing the right learning rate

Abstract
This paper presents the concept of the Win-or-Learn-Fast (WoLF) variable learning rate for distributed Q-learning based dynamic spectrum management algorithms. It demonstrates the importance of choosing the learning rate correctly by simulating a large-scale stadium temporary event network. The results show that using the WoLF variable learning rate provides a significant improvement in quality of service, in terms of the probabilities of file blocking and interruption, over typical fixed learning rate values. The results also demonstrate that distributed Q-learning with a WoLF variable learning rate can provide better and more robust quality of service than a spectrum-sensing-based opportunistic spectrum access scheme, while requiring no spectrum sensing at all.
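The abstract refers to the Win-or-Learn-Fast principle: an agent learns slowly while it is "winning" and quickly while it is "losing". The sketch below illustrates that idea for a stateless Q-learning channel-selection agent. It is a minimal illustration only; the win/lose test (comparing the current Q-value of the chosen channel against its long-run average), the constants ALPHA_WIN, ALPHA_LOSE, GAMMA and EPSILON, and the class name WoLFQAgent are assumptions for this example, not the specific formulation used in the paper.

```python
import random

# Minimal sketch of a Q-learning agent with a WoLF-style variable learning
# rate, choosing among spectrum channels. The win/lose criterion below is an
# illustrative assumption, not necessarily the one used in the paper.

ALPHA_WIN = 0.01    # small learning rate when the agent is "winning"
ALPHA_LOSE = 0.1    # larger learning rate when the agent is "losing"
GAMMA = 0.0         # stateless single-stage reward for simplicity
EPSILON = 0.1       # exploration probability


class WoLFQAgent:
    """One base station / agent selecting among spectrum channels."""

    def __init__(self, n_channels: int):
        self.q = [0.0] * n_channels      # Q-value per channel
        self.avg_q = [0.0] * n_channels  # long-run average of each Q-value
        self.counts = [0] * n_channels

    def select_channel(self) -> int:
        # Epsilon-greedy exploration over channels
        if random.random() < EPSILON:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, channel: int, reward: float) -> None:
        # WoLF rule: learn slowly when doing better than the historical
        # average ("winning"), quickly when doing worse ("losing").
        winning = self.q[channel] >= self.avg_q[channel]
        alpha = ALPHA_WIN if winning else ALPHA_LOSE

        # Standard Q-learning update (stateless form, GAMMA = 0)
        self.q[channel] += alpha * (reward + GAMMA * max(self.q) - self.q[channel])

        # Track the long-run average used by the win/lose test
        self.counts[channel] += 1
        self.avg_q[channel] += (self.q[channel] - self.avg_q[channel]) / self.counts[channel]


if __name__ == "__main__":
    # Toy usage: 4 channels, channel 2 yields the highest reward.
    agent = WoLFQAgent(n_channels=4)
    for _ in range(2000):
        ch = agent.select_channel()
        reward = 1.0 if ch == 2 else random.uniform(-0.5, 0.5)
        agent.update(ch, reward)
    print("Learned Q-values:", [round(v, 2) for v in agent.q])
```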
