Optimal Learning by Experimentation

Abstract
This paper considers a problem of optimal learning through experimentation faced by a single decision maker. Most of the analysis is concerned with characterising limit beliefs and actions. We take a two-stage approach: first, we study the case where the agent's payoff function is deterministic; then, we address the additional issues that arise when noise is present. Our analysis indicates that local properties of the payoff function (such as smoothness) are crucial in determining whether the agent eventually attains the true maximum payoff. The paper also makes a limited attempt at characterising optimal experimentation strategies.