Attention to multiple local critics in decision making and control

Authors:

Highlights:

Abstract

Under uncertainty and incomplete knowledge about a problem, evaluating situations and action values at every time step is persistently difficult. Moreover, designing critics that finely guide the agent in such complicated or ambiguous environments, even with reinforcement rewards and punishments, is laborious. In this study, we propose a framework for concurrently learning three things: control of attention to the sensory space, attention to the various critics that evaluate the selected motor actions, and the motor actions themselves. Previous work has implemented attention control to select the most important parts of the data and/or to reduce the dimensionality of the input space. However, decision making can depend not only on objective sensory data but also on mental states or subjective inputs. Specifically, we examine attention to the evaluations of the selected action by various local critics, as well as attention to sensory inputs. Each local critic evaluates the agent's actions from its own standpoint on the task domain. Our agent learns the degree of importance of each local critic's assessment using sparse punishing revisions from a super-critic. The agent thus learns how to combine, or even disregard, some of the local critics while it concurrently learns to focus on the appropriate subset of features and learns physical actions. By discovering an effective combination of local critics, the agent does not need an accurately designed critic in advance. The mathematical formulation of the proposed learning method is developed, and two benchmarks are discussed to evaluate it. The effect of attention control on robustness is analyzed via Monte Carlo analysis. The experimental results show the efficiency of the proposed formulation in the presence of uncertainties.
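The idea of weighting local critics and revising those weights from sparse super-critic punishments can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not give. The function names, the linear combination, the blame rule, and the learning rate `lr` are all assumptions chosen for illustration only.

```python
# Illustrative sketch only: the abstract does not specify the update rule.
# All names and the blame heuristic below are hypothetical.

def combine_critics(evaluations, weights):
    """Combine local critic evaluations into one signal by a weighted sum."""
    return sum(w * e for w, e in zip(weights, evaluations))


def punish_update(weights, evaluations, punishment, lr=0.1):
    """Revise critic weights after a (sparse) super-critic punishment.

    Assumption: critics that evaluated the punished action positively are
    blamed in proportion to their evaluation, and their weight is reduced.
    Weights are kept non-negative and renormalized to sum to 1.
    """
    blamed = [max(e, 0.0) for e in evaluations]
    raw = [max(w - lr * punishment * b, 0.0) for w, b in zip(weights, blamed)]
    total = sum(raw)
    if total == 0.0:
        # Every critic was fully discounted; fall back to uniform attention.
        return [1.0 / len(weights)] * len(weights)
    return [r / total for r in raw]


if __name__ == "__main__":
    weights = [0.5, 0.5]
    evaluations = [1.0, -1.0]  # critic 0 approved the action, critic 1 did not
    print(combine_critics(evaluations, weights))
    print(punish_update(weights, evaluations, punishment=1.0))
```

In this sketch, a critic whose weight is driven to zero is effectively disregarded, which mirrors the abstract's remark that the agent may learn to ignore some local critics.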

Keywords: Attention control, Reinforcement learning, Q-learning, Brain emotional learning controller, Multi-objective problems, Model-free control

Review history: Available online 25 March 2010.

Official paper URL: https://doi.org/10.1016/j.eswa.2010.03.029