Logarithmic regret algorithms for online convex optimization

作者：Elad Hazan, Amit Agarwal, Satyen Kale

摘要

In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) convex cost functions. Zinkevich (ICML 2003) introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover’s Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret \(O(\sqrt{T})\) , for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight.

论文关键词：Online learning, Online optimization, Regret minimization, Portfolio management

论文评审过程：

论文官网地址：https://doi.org/10.1007/s10994-007-5016-8