Application of Markov decision processes to search problems

Abstract

Many decision problems contain, in some form, an NP-hard combinatorial problem. Decision support systems therefore have to solve such combinatorial problems in a reasonable time. Many combinatorial problems can be solved by a search method. The search methods used in decision support systems have to be robust in two senses: they must handle a large variety of (user-defined) constraints, and they must allow user interaction, i.e. they must let a decision maker control the search process manually. In this paper we show how Markov decision processes can be used to guide a random search process. We first formulate search problems as a special class of Markov decision processes in which the search space of the search problem is the state space of the Markov decision process. In general it is not possible to compute an optimal control procedure for these Markov decision processes in a reasonable time. We therefore define several simplifications of the original problem, namely decompositions and abstractions, that have much smaller state spaces. For these simplified problems we compute optimal strategies, and we use these exact solutions to guide a randomized search process. The search process selects states for further search at random, with probabilities based on the optimal strategies of the simplified problems. This randomization is a substitute for explicit backtracking and avoids problems with local extrema. The randomized search procedure is repeated for as long as time is available to solve the problem, and the best solution generated during that time is accepted. We illustrate the approach with two examples: the N-puzzle and a job shop scheduling problem.
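The following is a minimal sketch of the kind of randomized search the abstract describes, applied to the 8-puzzle. It is an illustration under stated assumptions, not the procedure used in the paper. The simplification chosen here is a decomposition in which every tile moves independently of the others, so the exact optimal cost of the simplified problem is the sum of Manhattan distances. The exponential weighting, the `temperature` parameter, the fixed number of restarts (standing in for a time budget), and all function names are hypothetical choices made for this example.

```python
import math
import random

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank square


def neighbors(state):
    """Return all states reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    result = []
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            target = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            result.append(tuple(nxt))
    return result


def decomposed_cost(state):
    """Exact optimal cost of the simplified problem (assumed decomposition):
    each tile is moved on its own, ignoring the others, so the cost is the
    sum of Manhattan distances to the goal positions."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = tile - 1
        total += abs(index // 3 - goal_index // 3)
        total += abs(index % 3 - goal_index % 3)
    return total


def random_walk(start, rng, max_steps, temperature):
    """One randomized search run without backtracking.
    Successors with a lower simplified cost are chosen with higher
    probability (assumed weighting: exp(-cost / temperature))."""
    state, path = start, [start]
    for _ in range(max_steps):
        if state == GOAL:
            return path
        options = neighbors(state)
        weights = [math.exp(-decomposed_cost(s) / temperature) for s in options]
        state = rng.choices(options, weights=weights, k=1)[0]
        path.append(state)
    return path if state == GOAL else None


def randomized_search(start, restarts=50, max_steps=200, temperature=1.0, seed=0):
    """Repeat the random walk and keep the shortest solution found.
    A fixed restart count replaces the time budget to keep runs reproducible."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        path = random_walk(start, rng, max_steps, temperature)
        if path is not None and (best is None or len(path) < len(best)):
            best = path
    return best
```

Calling `randomized_search((1, 2, 3, 4, 5, 6, 7, 0, 8))` returns a list of states from the start to `GOAL`, or `None` if no run reaches the goal. Because every successor keeps a nonzero selection probability, a run can step away from a local minimum of the simplified cost, which is the role the abstract assigns to randomization in place of explicit backtracking.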