Tuning continual exploration in reinforcement learning: An optimality property of the Boltzmann strategy

Abstract
No abstract available