The time until the final zero crossing of random sums with application to nonparametric bandit theory

Authors:

Abstract

Motivated by problems in machine learning and, more fundamentally, by non-Bayesian, nonparametric problems in the sequential design of experiments, this work contributes to the task of obtaining probability bounds for the number of times suboptimal bandits are chosen in a nonterminating sequence of experiments. To our knowledge, only the growth of the expected number of incorrect choices has been examined previously. The derivation rests, in part, on new contributions to the theory of zero crossings for sums of biased independent, identically distributed (i.i.d.) random variables.
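
The quantity named in the title can be illustrated with a small simulation. The sketch below is not the paper's method: it assumes Gaussian steps with a positive mean, takes the "final zero crossing" to be the last index at which the partial sum is non-positive, and truncates each path at a finite horizon. The function names, the step distribution, and all parameter values are hypothetical choices made for illustration only.

```python
import random


def last_zero_crossing(steps):
    """Return the last 1-based index n with S_n <= 0, or 0 if S_n > 0 throughout.

    S_n is the partial sum of the first n steps. Treating "sum is non-positive"
    as the crossing event is an assumption of this sketch.
    """
    total = 0.0
    last = 0
    for n, x in enumerate(steps, start=1):
        total += x
        if total <= 0:
            last = n
    return last


def simulate(mean=0.2, horizon=5000, trials=200, seed=0):
    """Sample final-crossing times for i.i.d. Gaussian steps with positive drift.

    Each path is truncated at `horizon`, so a reported time is only a lower
    bound on the true final crossing of the infinite sequence.
    """
    rng = random.Random(seed)
    times = []
    for _ in range(trials):
        steps = [rng.gauss(mean, 1.0) for _ in range(horizon)]
        times.append(last_zero_crossing(steps))
    return times


def tail_fraction(times, t):
    """Empirical estimate of P(final crossing time > t)."""
    return sum(1 for x in times if x > t) / len(times)


if __name__ == "__main__":
    times = simulate()
    for t in (10, 50, 100, 500):
        print(t, tail_fraction(times, t))
```

Because the steps have positive mean, the strong law of large numbers implies that the partial sums are eventually positive, so the final crossing time is finite with probability one. The empirical tail fractions printed above are the kind of quantity that a probability bound would control, as opposed to a bound on the expectation alone.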

Keywords:

Review history: Available online 1 April 2002.

Paper URL: https://doi.org/10.1016/0096-3003(94)90197-X