Understanding dynamic scenes based on human sequence evaluation

Authors:

Abstract

This paper presents a Cognitive Vision System (CVS) that explains the human behaviour observed in monitored scenes using natural-language texts. This cognitive analysis of human movements recorded in image sequences is referred to here as Human Sequence Evaluation (HSE), which defines a set of transformation modules involved in the automatic generation of semantic descriptions from pixel values. In essence, the trajectories of human agents are obtained in order to generate textual interpretations of their motion and to infer the conceptual relationships of each agent with respect to its environment. For this purpose, a human behaviour model based on Situation Graph Trees (SGTs) is considered, which permits both bottom-up (hypothesis generation) and top-down (hypothesis refinement) analysis of dynamic scenes. The resulting system prototype interprets different kinds of behaviour and reports textual descriptions in multiple languages.

Keywords: Image Sequence Evaluation, High-level processing of monitored scenes, Segmentation and tracking in complex scenes, Event recognition in dynamic scenes, Human motion understanding, Human behaviour interpretation, Natural-language text generation, Realistic demonstrators

Review history: Received 23 February 2007, Revised 14 January 2008, Accepted 6 February 2008, Available online 14 February 2008.

Official paper URL: https://doi.org/10.1016/j.imavis.2008.02.004