Thompson Sampling: An Asymptotically Optimal Finite-Time Analysis

1 January 2012

book chapter
conference paper
Published by Springer Science and Business Media LLC in Lecture Notes in Computer Science

pp. 199-213
https://doi.org/10.1007/978-3-642-34106-9_18

Other Versions

Version 2, 2012-05-18, preprints

This publication has 7 references indexed in Scilit:

On Upper-Confidence Bound Policies for Switching Bandit Problems
Lecture Notes in Computer Science, 2011
Deviations of Stochastic Bandit Regret
Lecture Notes in Computer Science, 2011
Solving two‐armed Bernoulli bandit problems using a Bayesian learning automaton
International Journal of Intelligent Computing and Cybernetics, 2010
Exploration–exploitation tradeoff using variance estimates in multi-armed bandits
Theoretical Computer Science, 2009
Finite-time Analysis of the Multiarmed Bandit Problem
Machine Learning, 2002
Asymptotically efficient adaptive allocation rules
Advances in Applied Mathematics, 1985
On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples
Biometrika, 1933

Cited by 143 articles