Search results: 1-6 of 6 records found for “Theoretical Statistics Bandits”. Query time: 0.093 seconds
Efficient Optimal Learning for Contextual Bandits
Efficient Optimal Learning Contextual Bandits
2011/7/6
We address the problem of learning in an online setting where the learner repeatedly observes features, selects among a set of actions, and receives reward for the action taken.
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Finite-Time Multi-armed Bandits Problems Kullback-Leibler Divergences
2011/6/20
We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem
in the case of distributions with finite supports (not necessarily known beforehand),
whose asymptotic r...
PAC-Bayesian Analysis of Martingales and Multiarmed Bandits
PAC-Bayesian Analysis Martingales Multiarmed Bandits
2011/6/21
We present two alternative ways to apply PAC-Bayesian analysis to sequences of dependent
random variables. The first is based on a new lemma that enables us to bound expectations
of convex functions of...
The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond
Stochastic Bandits Beyond KL-UCB
2011/3/21
This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems. We prove two distinct results: first, for arbitrary bounded rew...
Nonparametric Bandits with Covariates
Bandit regression regret inferior sampling rate minimax rate
2010/3/11
We consider a bandit problem which involves sequential sampling from two populations
(arms). Each arm produces a noisy reward realization which depends on an observable
random covariate. The goal is...
We consider a generalization of stochastic bandits where the set of arms, X, is allowed to be
a generic measurable space and the mean-payoff function is “locally Lipschitz” with respect to a
dissimi...