Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis
- 1 January 2012
- book chapter
- conference paper
- Published by Springer Science and Business Media LLC in Lecture Notes in Computer Science
Abstract
No abstract available
This publication has 7 references indexed in Scilit:
- On Upper-Confidence Bound Policies for Switching Bandit Problems. Lecture Notes in Computer Science, 2011
- Deviations of Stochastic Bandit Regret. Lecture Notes in Computer Science, 2011
- Solving two‐armed Bernoulli bandit problems using a Bayesian learning automaton. International Journal of Intelligent Computing and Cybernetics, 2010
- Exploration–exploitation tradeoff using variance estimates in multi-armed bandits. Theoretical Computer Science, 2009
- Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning, 2002
- Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 1985
- On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples. Biometrika, 1933