A novel estimator based learning automata algorithm

Authors: Hao Ge, Wen Jiang, Shenghong Li, Jianhua Li, Yifan Wang, Yuchun Jing

Abstract

Reinforcement learning is a subfield of artificial intelligence, and learning automata (LA) are regarded as among the most powerful tools in this research area. In the development of learning automata, improving the rate of convergence has been the primary goal in designing a learning algorithm. In this paper, we propose a deterministic-estimator-based learning automaton in which the estimate of each action is the upper bound of a confidence interval, rather than the maximum likelihood estimate (MLE) that is widely used in current estimator LA schemes. The philosophy is to place more confidence in actions that have been selected only a few times, so that the automaton is encouraged to explore uncertain actions. Once all actions have been fully explored, the automaton behaves just like the Generalized Pursuit Algorithm. A refined analysis is presented to show the ε-optimality of the proposed algorithm. Extensive simulations demonstrate that the presented LA is faster than any deterministic estimator learning automaton reported to date. Moreover, we extend the algorithm to stochastic estimator schemes. The extended LA is shown to achieve a significant performance improvement over the current state-of-the-art learning automata algorithms, especially in complex and confusing environments.
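
The contrast between an MLE and an upper-confidence-bound estimate can be illustrated with a minimal Python sketch. The abstract does not state the paper's confidence interval, so the Hoeffding-style margin below is an assumption chosen only for illustration, as are the function names and the sample counts; this is not the authors' formula.

```python
import math


def mle_estimate(rewards, pulls):
    """Maximum likelihood estimate of an action's reward probability."""
    return rewards / pulls if pulls > 0 else 0.0


def upper_bound_estimate(rewards, pulls, total_pulls):
    """MLE plus a confidence margin that shrinks as the action is selected more.

    Assumption: a Hoeffding-style margin stands in for the paper's
    confidence interval, which the abstract does not specify.
    """
    if pulls == 0:
        return float("inf")  # an untried action is maximally uncertain
    margin = math.sqrt(math.log(max(total_pulls, 2)) / (2 * pulls))
    return rewards / pulls + margin


# Hypothetical counts: action A is well explored, action B has barely been tried.
total = 103
mle_a = mle_estimate(80, 100)
mle_b = mle_estimate(2, 3)
ucb_a = upper_bound_estimate(80, 100, total)
ucb_b = upper_bound_estimate(2, 3, total)
```

With these counts the MLE ranks A above B, while the upper-bound estimate ranks the rarely tried B above A, which is the exploration incentive the abstract describes. As the selection count of an action grows, the margin shrinks and the two estimates come to agree.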

Keywords: Learning automata, Stationary environment, Estimator LA, Discrete estimator algorithm, Deterministic estimator algorithm

Paper URL: https://doi.org/10.1007/s10489-014-0594-1