A set of novel continuous action-set reinforcement learning automata models to optimize continuous functions

作者:Ying Guo, Hao Ge, Shenghong Li

摘要

Learning automata (LA) as a powerful tool for reinforcement learning which belongs to the subject of Artificial Intelligence, could search for the optimal state adaptively in a random environment. In the past decades quite a few FALA algorithms are maturely developed but exposing critical defects, when they are applied to optimize continuous functions. In order to overcome their shortcomings and explore a higher-performance LA, we propose a novel CALA algorithm to solve the function optimization problems via one kind of LA prototypes, i.e, the continuous action-set reinforcement learning automata, which is abbreviated as CARLA. The key mechanism of the proposed algorithm lies in a combination of equidistant discretization and linear interpolation. Specifically, four categories of application models are constructed. Two of them are created to obtain continuous actions when the priori information is finite ones, thus avoiding the drawbacks of FALA. The realization of this functionality recourses to the so-called cumulative distribution function (CDF) and a new concept of area surrounded by curves (AsbC) respectively. The other two models are modified versions to balance the trade-off between accuracy and speed. Moreover, these models are expanded to their generalized versions so that multidimensional function optimization problems can be handled as well. A massive amount of experiments including four benchmarks and three scenarios are designed to demonstrate the effectiveness and efficiency of the proposed application models. The proposed algorithm outperforms the state of the arts of LA as well as optimization algorithms, with a high accuracy rate, a fast convergence speed, and a competitive time consumption, especially in noised environments.

论文关键词:Artificial intelligence, Reinforcement learning, Learning automata, CALA, Function optimization

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-016-0853-4