Lifelong reinforcement learning with temporal logic formulas and reward machines

Authors:

Highlights:

Abstract

Continuously learning new tasks from high-level ideas or knowledge is a key human capability. In this paper, we propose lifelong reinforcement learning with sequential linear temporal logic formulas and reward machines (LSRM), which enables an agent to leverage previously learned knowledge to accelerate the learning of logically specified tasks. To allow more flexible task specification, we first introduce sequential linear temporal logic (SLTL), which extends the existing linear temporal logic (LTL) formal language. We then utilize reward machines (RMs) to exploit structured reward functions for tasks encoded with high-level events, and propose an automatic extension of RMs together with efficient knowledge transfer across tasks for continuous lifelong learning. Experimental results show that LSRM outperforms methods that learn the target tasks from scratch, by taking advantage of task decomposition through SLTL and knowledge transfer over RMs during the lifelong learning process.
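To illustrate the reward-machine concept the abstract relies on, the following is a minimal sketch of an RM as a finite-state machine whose transitions are labeled with high-level events and emit scalar rewards. The class name, states, events, and rewards are hypothetical choices for illustration, not the paper's actual implementation.

```python
class RewardMachine:
    """Toy reward machine: a finite-state machine over high-level events.

    Hypothetical sketch; state names, events, and reward values are
    illustrative, not taken from the paper.
    """

    def __init__(self, initial_state, transitions, terminal_states):
        # transitions maps (state, event) -> (next_state, reward)
        self.initial_state = initial_state
        self.transitions = transitions
        self.terminal_states = terminal_states
        self.state = initial_state

    def step(self, event):
        """Advance the RM on a high-level event and return the emitted reward."""
        if (self.state, event) in self.transitions:
            self.state, reward = self.transitions[(self.state, event)]
            return reward
        return 0.0  # unmatched events leave the RM state unchanged

    def is_terminal(self):
        return self.state in self.terminal_states


# A task like "first get the key, then open the door", which LTL can
# express as F(key & F(door)), becomes a three-state RM:
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "key"): ("u1", 0.0),   # key obtained, no reward yet
        ("u1", "door"): ("u2", 1.0),  # door opened after the key: task done
    },
    terminal_states={"u2"},
)

rewards = [rm.step(e) for e in ["door", "key", "door"]]
print(rewards, rm.is_terminal())  # [0.0, 0.0, 1.0] True
```

Exposing the task's structure this way is what lets an agent decompose a logically specified task into subtasks and reuse value estimates for shared RM states across tasks.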

Keywords: Lifelong reinforcement learning, Reward machine, Temporal logic

Article history: Received 3 March 2022, Revised 4 August 2022, Accepted 5 August 2022, Available online 11 August 2022, Version of Record 3 October 2022.

DOI: https://doi.org/10.1016/j.knosys.2022.109650