Model-Free Optimal Consensus Control for Multi-agent Systems Based on DHP Algorithm

Authors: Haoen Shi, Yanghe Feng, Chaoxu Mu, Yunkai Wu

Abstract

This paper develops a novel model-free dual heuristic dynamic programming (DHP) algorithm that combines policy iteration with least-squares techniques to achieve optimal consensus control of discrete-time multi-agent systems. Optimal consensus control requires solving the coupled Hamilton-Jacobi-Bellman (HJB) equations, which is generally difficult, especially when the mathematical model is unknown. To overcome these difficulties, the DHP method is implemented through reinforcement learning, using data collected online rather than accurate system dynamics. First, the performance index and the corresponding Bellman equation are derived, with each agent’s value function taking a quadratic form. Then, a model network is employed to approximate the system dynamics. Next, the Q-function Bellman equation is obtained, and by taking the derivative of the Q-function, the DHP method is applied to construct the update formula. Convergence and stability analyses of the proposed algorithm are presented. Two simulation examples illustrate the validity of the proposed algorithm.
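To make the combination of policy iteration, a quadratic Q-function, and least squares more concrete, the following is a minimal sketch under strong simplifying assumptions. It is not the paper's DHP algorithm: it treats a single agent with linear dynamics and a quadratic stage cost, fits the Q-function itself rather than its derivative, and uses no model network or neighbor coupling. The function names (`quad_features`, `theta_to_H`, `ls_policy_iteration`) and the example matrices are hypothetical, and the matrices `A` and `B` are used only to generate transition data, standing in for measurements collected online.

```python
import numpy as np

def quad_features(z):
    """Features phi(z) such that phi(z) @ theta == z @ H @ z for symmetric H,
    where theta stacks the upper-triangular entries of H."""
    rows, cols = np.triu_indices(len(z))
    scale = np.where(rows == cols, 1.0, 2.0)  # off-diagonal terms appear twice
    return np.outer(z, z)[rows, cols] * scale

def theta_to_H(theta, size):
    """Rebuild the symmetric matrix H from its upper-triangular entries."""
    H = np.zeros((size, size))
    H[np.triu_indices(size)] = theta
    return H + H.T - np.diag(np.diag(H))

def ls_policy_iteration(A, B, Qc, Rc, K0, iters=10, samples=200, seed=0):
    """Least-squares policy iteration on Q(x, u) = [x; u]^T H [x; u].

    Assumes K0 is stabilizing. The learner only sees tuples
    (x, u, cost, x_next); A and B act as a data generator here.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = K0.copy()
    for _ in range(iters):
        Phi, y = [], []
        for _ in range(samples):
            x = rng.standard_normal(n)
            u = -K @ x + rng.standard_normal(m)   # exploratory action
            x_next = A @ x + B @ u                # stand-in for a measurement
            u_next = -K @ x_next                  # action of the current policy
            cost = x @ Qc @ x + u @ Rc @ u
            # Bellman equation: Q(x, u) - Q(x_next, u_next) = cost
            Phi.append(quad_features(np.concatenate([x, u]))
                       - quad_features(np.concatenate([x_next, u_next])))
            y.append(cost)
        # Policy evaluation: solve for H in the least-squares sense
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, n + m)
        # Policy improvement: minimize Q over u, giving u = -H_uu^{-1} H_ux x
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Illustrative system (hypothetical values); A is stable, so K0 = 0 is admissible.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc, Rc = np.eye(2), np.eye(1)
K, H = ls_policy_iteration(A, B, Qc, Rc, K0=np.zeros((1, 2)))
```

Under these assumptions the learned gain `K` should approach the gain given by the discrete-time Riccati equation, which offers a simple sanity check. A DHP-style variant would instead approximate the gradient of the value or Q-function with respect to the state.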

Keywords: Reinforcement learning, DHP, Optimal control, Least square

Official paper URL: https://doi.org/10.1007/s11063-021-10641-4