Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development

Authors:

Abstract

In this paper, we first discuss the meaning of physical embodiment and the complexity of the environment in the context of multi-agent learning. We then propose a vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment. We use the robot soccer game initiated by RoboCup (Kitano et al., 1997) to illustrate the effectiveness of our method. Each agent works with other team members to achieve a common goal against opponents. Our method estimates the relationships between a learner's behaviors and those of other agents in the environment through interactions (observations and actions), using a technique from system identification. To identify a model of each agent, Akaike's Information Criterion is applied to the results of Canonical Variate Analysis, clarifying the relationship between observed data, in terms of actions, and future observations. Reinforcement learning based on the estimated state vectors is then performed to obtain an optimal behavior policy. The proposed method is applied to a soccer-playing situation: it successfully models a rolling ball and other moving agents, and acquires appropriate behaviors for the learner. Results from computer simulations and real robot experiments are presented and discussed.
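The identification step described above, estimating state vectors from past input/output data via Canonical Variate Analysis (CVA) and selecting the state dimension with an information criterion, can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the Hankel lag, the simplified AIC penalty (a crude parameter count), and the synthetic test system at the bottom are all assumptions made for the example.

```python
import numpy as np

def cva_state_estimate(u, y, lag=2, max_order=None):
    """Sketch of CVA-based state estimation with a simplified AIC
    for order selection (assumed form, not the paper's exact criterion)."""
    T = len(y)
    # Stack past (inputs and outputs) and future (outputs) windows.
    past, fut = [], []
    for t in range(lag, T - lag + 1):
        past.append(np.concatenate([u[t - lag:t][::-1].ravel(),
                                    y[t - lag:t][::-1].ravel()]))
        fut.append(y[t:t + lag].ravel())
    P = np.array(past).T          # (dim_past, N)
    F = np.array(fut).T           # (dim_future, N)
    N = P.shape[1]
    P -= P.mean(axis=1, keepdims=True)
    F -= F.mean(axis=1, keepdims=True)

    # Canonical correlations via SVD of the whitened cross-covariance.
    eps = 1e-8
    Spp = P @ P.T / N + eps * np.eye(P.shape[0])
    Sff = F @ F.T / N + eps * np.eye(F.shape[0])
    Sfp = F @ P.T / N
    Lp = np.linalg.cholesky(Spp)
    Lf = np.linalg.cholesky(Sff)
    A = np.linalg.solve(Lf, Sfp)              # Lf^{-1} Sfp
    M = np.linalg.solve(Lp, A.T).T            # Lf^{-1} Sfp Lp^{-T}
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.clip(s, 0.0, 1.0 - 1e-12)          # canonical correlations

    # Simplified AIC: misfit of dropped correlations + parameter penalty.
    k = len(s)
    max_order = max_order or k
    dim_p, dim_f = P.shape[0], F.shape[0]
    best_n, best_aic = 1, np.inf
    for n in range(1, max_order + 1):
        dev = -N * np.sum(np.log(1.0 - s[n:] ** 2))
        aic = dev + 2 * n * (dim_p + dim_f - n)
        if aic < best_aic:
            best_aic, best_n = aic, n

    # State = leading canonical variates of the past: x_t = V_n^T Lp^{-1} p_t.
    J = np.linalg.solve(Lp.T, Vt[:best_n].T).T
    X = J @ P                                  # (best_n, N) state sequence
    return best_n, s, X

# Synthetic demo: a first-order linear system driven by a known input.
rng = np.random.default_rng(0)
T = 600
u = rng.normal(size=T)
x = np.zeros(T)
for t in range(T - 1):
    x[t + 1] = 0.8 * x[t] + u[t]
y = x + 0.05 * rng.normal(size=T)
n, s, X = cva_state_estimate(u, y, lag=2)
```

The estimated state sequence `X` would then serve as the state input to a reinforcement learner, as the abstract describes; the canonical correlations `s` quantify how strongly past actions and observations predict future observations.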

Keywords: Multi-agent learning, Vision-based learning, Reinforcement learning, Cooperative behavior, Physical embodiment

History: Received 1 February 1998; revised 1 November 1998; available online 27 July 1999.

DOI: https://doi.org/10.1016/S0004-3702(99)00026-0