Planning and acting in partially observable stochastic domains

Authors:

Abstract

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). We then outline a novel algorithm for solving POMDPs offline and show how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a discussion of how our approach relates to previous work, the complexity of finding exact solutions to POMDPs, and some possibilities for finding approximate solutions.

Keywords: Planning, Uncertainty, Partially observable Markov decision processes

Review history: Received 11 October 1995, Revised 17 January 1998, Available online 5 October 1998.

Paper URL: https://doi.org/10.1016/S0004-3702(98)00023-X