An integrated approach to solving influence diagrams and finite-horizon partially observable decision processes

Authors:

Abstract

We show how to integrate a variable elimination approach to solving influence diagrams with a value iteration approach to solving finite-horizon partially observable Markov decision processes (POMDPs). The integration of these approaches creates a variable elimination algorithm for influence diagrams that has much more relaxed constraints on elimination order, which allows improved scalability in many cases. The new algorithm can also be viewed as a generalization of the value iteration algorithm for POMDPs that solves non-Markovian as well as Markovian problems, in addition to leveraging a factored representation for improved efficiency. The development of a single algorithm that integrates and generalizes both of these classic algorithms, one for influence diagrams and the other for POMDPs, unifies these two approaches to solving Bayesian decision problems in a way that combines their complementary advantages.
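As context for the value iteration approach the abstract refers to, below is a minimal sketch of classic exact value iteration for a finite-horizon POMDP using alpha-vectors (exhaustive cross-sums, no pruning). It is a generic illustration only, not the paper's integrated algorithm; the tiny problem data (`T`, `O`, `R`) and all names are illustrative assumptions.

```python
# Minimal sketch of exact finite-horizon POMDP value iteration over
# alpha-vectors (exhaustive cross-sums, no pruning). The problem data
# below is made up for illustration; it is not from the paper.
import itertools
import numpy as np

def backup(alphas, T, O, R):
    """One dynamic-programming step: alpha-vectors for horizon h+1
    from the alpha-vectors `alphas` for horizon h.
    T[a, s, s'] : transition probabilities
    O[a, s', o] : observation probabilities
    R[a, s]     : immediate rewards
    """
    n_actions, n_states, _ = T.shape
    n_obs = O.shape[2]
    new_alphas = []
    for a in range(n_actions):
        # Projected vectors: alpha^{a,o}(s) = sum_{s'} T(s,a,s') O(s',a,o) alpha(s')
        proj = [[T[a] @ (O[a, :, o] * alpha) for alpha in alphas]
                for o in range(n_obs)]
        # Cross-sum: one projected vector per observation, plus reward.
        for choice in itertools.product(*proj):
            new_alphas.append(R[a] + sum(choice))
    return new_alphas

def value(alphas, belief):
    """Value of a belief state: max over alpha-vectors of their dot product."""
    return max(float(alpha @ belief) for alpha in alphas)

if __name__ == "__main__":
    # Tiny two-state, two-action, two-observation example (hypothetical).
    T = np.array([[[1.0, 0.0], [0.0, 1.0]],    # action 0: stay put
                  [[0.5, 0.5], [0.5, 0.5]]])   # action 1: reset
    O = np.array([[[0.85, 0.15], [0.15, 0.85]],
                  [[0.5, 0.5], [0.5, 0.5]]])
    R = np.array([[1.0, -1.0], [0.0, 0.0]])

    horizon = 3
    alphas = [np.zeros(2)]   # terminal value function
    for _ in range(horizon):
        alphas = backup(alphas, T, O, R)
    print(value(alphas, np.array([0.5, 0.5])))
```

Without pruning, the number of alpha-vectors grows exponentially with the horizon; practical solvers prune dominated vectors at each step, and the paper's contribution is to combine this style of dynamic programming with variable elimination on a factored influence-diagram representation.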

Keywords: Influence diagram, Variable elimination, Partially observable Markov decision process, Dynamic programming, Decision-theoretic planning

Article history: Received 9 October 2019, Revised 2 August 2020, Accepted 23 November 2020, Available online 2 December 2020, Version of Record 29 January 2021.

DOI: https://doi.org/10.1016/j.artint.2020.103431