Characterizing reinforcement learning methods through parameterized learning problems

Authors: Shivaram Kalyanakrishnan, Peter Stone

Abstract

The field of reinforcement learning (RL) has been energized in the past few decades by elegant theoretical results indicating under what conditions, and how quickly, certain algorithms are guaranteed to converge to optimal policies. However, in practical problems, these conditions are seldom met. When we cannot achieve optimality, the performance of RL algorithms must be measured empirically. Consequently, in order to meaningfully differentiate learning methods, it becomes necessary to characterize their performance on different problems, taking into account factors such as state estimation, exploration, function approximation, and constraints on computation and memory. To this end, we propose parameterized learning problems, in which such factors can be controlled systematically and their effects on learning methods characterized through targeted studies. Apart from providing very precise control of the parameters that affect learning, our parameterized learning problems enable benchmarking against optimal behavior; their relatively small sizes facilitate extensive experimentation.

Keywords: Reinforcement learning, Empirical evaluation, Partial observability, Function approximation, Parameterized learning problem


DOI: https://doi.org/10.1007/s10994-011-5251-x