Embedding decision-analytic control in a learning architecture

Authors:

Abstract

An autonomous agent's control problem is often formulated as the attempt to minimize the expected cost of accomplishing a goal. This paper presents a three-dimensional view of the control problem that is substantially more realistic. The agent's control policy is assessed along three dimensions: deliberation cost, execution cost, and goal value. The agent must choose which goal to attend to as well as which action to take. Our control policy seeks to maximize satisfaction by trading execution cost and goal value while keeping deliberation cost low. The agent's control decisions are guided by the MU heuristic: choose the alternative whose marginal expected utility is maximal. Thus, when necessary, the agent will prefer easily-achieved goals to attractive but difficult-to-attain alternatives. The MU heuristic is embedded in an architecture with record-keeping and learning capabilities. The architecture offers its control module expected utility and expected cost estimates that are gradually refined as the agent accumulates experience. A programmer is not required to supply that knowledge, and the estimates are provided without recourse to distributional assumptions.
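The abstract's MU heuristic can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes marginal expected utility is estimated as expected goal value per unit of expected execution cost, and that the architecture's experience-based estimates are simple running averages (consistent with the claim that no distributional assumptions are needed). All names (`Estimate`, `Alternative`, `mu_choose`) are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Estimate:
    """Running-average estimate refined from experience; no distributional assumptions."""
    total: float = 0.0
    count: int = 0
    prior: float = 1.0  # default used before any experience is recorded

    def update(self, observation: float) -> None:
        self.total += observation
        self.count += 1

    @property
    def value(self) -> float:
        return self.total / self.count if self.count else self.prior

@dataclass
class Alternative:
    """A (goal, action) pair the control module may attend to."""
    name: str
    goal_value: Estimate = field(default_factory=Estimate)
    execution_cost: Estimate = field(default_factory=Estimate)

    def marginal_expected_utility(self) -> float:
        # Expected value gained per unit of expected execution cost.
        return self.goal_value.value / max(self.execution_cost.value, 1e-9)

def mu_choose(alternatives):
    """MU heuristic: pick the alternative with maximal marginal expected utility."""
    return max(alternatives, key=Alternative.marginal_expected_utility)

# An easily-achieved goal beats an attractive but difficult-to-attain one.
easy = Alternative("easy-goal")
easy.goal_value.update(5.0)       # modest value...
easy.execution_cost.update(1.0)   # ...but cheap: marginal utility 5.0

hard = Alternative("hard-goal")
hard.goal_value.update(20.0)      # high value...
hard.execution_cost.update(10.0)  # ...but costly: marginal utility 2.0

print(mu_choose([easy, hard]).name)  # → easy-goal
```

Keeping the selection rule to a single `max` over cached estimates is one way to honor the third dimension, deliberation cost: choosing itself stays cheap while the estimates improve with experience.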

Keywords:

Review status: Available online 19 February 2003.

Paper URL: https://doi.org/10.1016/0004-3702(91)90008-8