From vision to multimodal communication: Incremental route descriptions

Author: Wolfgang Maaß

Abstract

In the last few years, there has been growing interest within cognitive science in the connection between vision and natural language. The question of interest is: How can we discuss what we see? With this question in mind, we look at the area of incremental route descriptions. Here, a speaker presents the relevant route information step by step in a 3D environment. The speaker must adjust his/her descriptions to the currently visible objects. Two major questions arise in this context: (1) How is visually obtained information used in natural language generation? (2) How are these modalities coordinated? We present a computational framework for the interaction of vision and natural language descriptions which integrates several processes and representations. Specifically, we discuss the interaction between the spatial representation and the presentation representation used for natural language descriptions. We have implemented a prototypical version of the proposed model, called MOSES.

Keywords: spatial cognition, wayfinding, multimodal presentation, object representation

Paper URL: https://doi.org/10.1007/BF00849072