An adaptive adjustment strategy for bolt posture errors based on an improved reinforcement learning algorithm

Authors: Wentao Luo, Jianfu Zhang, Pingfa Feng, Haochen Liu, Dingwen Yu, Zhijun Wu

Abstract

Designing an intelligent, autonomous system remains a major challenge in the assembly field. Most reinforcement learning (RL) methods have been applied to problems with relatively small state spaces, and the complexity and high dimensionality of the assembly environment make traditional RL methods perform poorly in both efficiency and accuracy. This paper presents a model-driven adaptive proximal policy optimization (MAPPO) method that lets the assembly system autonomously correct bolt posture errors. MAPPO uses a probabilistic tree and an adaptive reward mechanism to improve the computational efficiency and accuracy of the traditional PPO method. The probabilistic tree reduces the size of the action space by establishing a hierarchical logical relationship among the parameters. The adaptive reward mechanism mitigates the algorithm's tendency to fall into local minima. The proposed method was verified in the Unity simulation engine, and its performance and robustness were validated by comparing different cases in simulations and experiments. The results show that MAPPO achieves better efficiency and accuracy than other state-of-the-art algorithms.
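The abstract does not spell out how the probabilistic tree is built, so the following is only a minimal sketch of the general idea that a hierarchy over action parameters can prune the action space. Every name, branch, and probability here (`TREE`, `sample_action`, the "translate"/"rotate"/"hold" modes) is a hypothetical illustration, not the authors' model. An action is sampled by walking from the root to a leaf, and only parameter combinations that appear as leaves can ever be chosen.

```python
import random

# Hypothetical tree: each node maps a choice name to (probability, subtree).
# A subtree of None marks a leaf. Branches and probabilities are illustrative
# assumptions, not values taken from the paper.
TREE = {
    "translate": (0.5, {
        "x": (0.5, {"small": (0.7, None), "large": (0.3, None)}),
        "y": (0.5, {"small": (0.7, None), "large": (0.3, None)}),
    }),
    "rotate": (0.4, {
        "roll": (0.5, {"small": (1.0, None)}),
        "pitch": (0.5, {"small": (1.0, None)}),
    }),
    "hold": (0.1, None),
}


def sample_action(tree, rng):
    """Walk from the root to a leaf, sampling one branch per level."""
    path = []
    node = tree
    while node is not None:
        choices = list(node)
        weights = [node[c][0] for c in choices]
        choice = rng.choices(choices, weights=weights, k=1)[0]
        path.append(choice)
        node = node[choice][1]
    return tuple(path)


def count_leaves(tree):
    """Number of distinct actions reachable through the tree."""
    if tree is None:
        return 1
    return sum(count_leaves(sub) for _, sub in tree.values())


def leaf_probabilities(tree, prefix=(), prob=1.0):
    """Map each root-to-leaf path to the product of its branch probabilities."""
    out = {}
    for name, (p, sub) in tree.items():
        path = prefix + (name,)
        if sub is None:
            out[path] = prob * p
        else:
            out.update(leaf_probabilities(sub, path, prob * p))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print("sampled action:", sample_action(TREE, rng))
    # A flat Cartesian product of 3 modes x 4 axes x 2 magnitudes would have
    # 24 entries; the tree only exposes the combinations that make sense.
    print("reachable actions:", count_leaves(TREE))
```

In this toy setup the tree exposes 7 actions instead of the 24 in the flat product, because a magnitude is only chosen where it is meaningful for the parent choice. In a learned policy the branch probabilities would come from the network rather than being fixed constants.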

Keywords: Model-driven method, Intelligent assembly, Probabilistic tree, Adaptive reward mechanism, Reinforcement learning, Physical simulation engine

Paper URL: https://doi.org/10.1007/s10489-020-01906-x