On the semantics of visual behaviour, structured events and trajectories of human action

Authors:

Highlights:

Abstract

The problem of modelling the semantics of visual events without segmentation or computation of object-centred trajectories is addressed. Two examples are presented. The first illustrates the detection of autonomous visual events without segmentation. The second shows how high-level semantics can be extracted without spatio-temporal tracking or modelling of object trajectories. We wish to infer the semantics of human behavioural patterns for autonomous visual event recognition in dynamic scenes. This is achieved by learning to model the temporal structures of pixel-wise change energy histories using CONDENSATION. The performance of a pixel-energy-history based event model is compared to that of an adaptive Gaussian mixture based scene model.

Given low-level autonomous visual events, grouping and high-level reasoning are required both to infer associations between these events and to give meaning to those associations. We present an approach for modelling the semantics of interactive human behaviours for the association of a moving head and two hands under self-occlusion and intersection from a single camera view. For associating and tracking the movements of multiple intersecting body parts, we compare the effectiveness of spatio-temporal dynamics based prediction with that of reasoning about body-part associations based on modelling semantics using Bayesian belief networks.
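The adaptive Gaussian mixture scene model used as a baseline above maintains, for each pixel, a small mixture of Gaussians over observed intensities and labels observations as background or foreground. The following is a minimal single-pixel sketch in the spirit of such models; all parameter values (`k`, `alpha`, `var0`, the matching threshold, and the background weight fraction) are illustrative assumptions, not settings from the paper.

```python
import math

class PixelGMM:
    """Adaptive Gaussian mixture model for a single pixel's intensity
    (illustrative sketch; parameters are assumed, not from the paper)."""

    def __init__(self, k=3, alpha=0.05, var0=36.0, match_sigma=2.5, bg_thresh=0.7):
        self.k = k                    # max number of Gaussian components
        self.alpha = alpha            # learning rate
        self.var0 = var0              # initial variance for new components
        self.match_sigma = match_sigma  # match if within this many std devs
        self.bg_thresh = bg_thresh    # weight fraction treated as background
        self.weights, self.means, self.vars = [], [], []

    def update(self, x):
        """Update the mixture with intensity x; return True if x is background."""
        matched = None
        for i, (m, v) in enumerate(zip(self.means, self.vars)):
            if abs(x - m) <= self.match_sigma * math.sqrt(v):
                matched = i
                break
        if matched is None:
            # No component explains x: add one, or replace the weakest.
            if len(self.weights) < self.k:
                self.weights.append(self.alpha)
                self.means.append(float(x))
                self.vars.append(self.var0)
            else:
                j = min(range(self.k), key=lambda i: self.weights[i])
                self.weights[j], self.means[j], self.vars[j] = self.alpha, float(x), self.var0
        else:
            # Adapt the matched component and reweight all components.
            rho = self.alpha
            self.means[matched] += rho * (x - self.means[matched])
            self.vars[matched] += rho * ((x - self.means[matched]) ** 2 - self.vars[matched])
            self.vars[matched] = max(self.vars[matched], 4.0)  # variance floor
            for i in range(len(self.weights)):
                self.weights[i] = (1 - self.alpha) * self.weights[i] + (self.alpha if i == matched else 0.0)
        s = sum(self.weights)
        self.weights = [w / s for w in self.weights]
        # Components with high weight and low variance form the background set.
        order = sorted(range(len(self.weights)),
                       key=lambda i: self.weights[i] / math.sqrt(self.vars[i]),
                       reverse=True)
        bg, cum = set(), 0.0
        for i in order:
            bg.add(i)
            cum += self.weights[i]
            if cum > self.bg_thresh:
                break
        return matched in bg

pixel = PixelGMM()
for _ in range(50):           # a stable background intensity around 100
    pixel.update(100)
print(pixel.update(100))      # -> True  (matches the dominant component)
print(pixel.update(200))      # -> False (sudden change flagged as foreground)
```

Per-pixel change detection of this kind yields the low-level foreground evidence over which event-level grouping and reasoning then operate.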

Keywords: Adaptive Gaussian mixture models, Autonomous visual events, Bayesian belief nets, CONDENSATION, Discontinuous motion trajectories, Dynamic scene models, Pixel-energy-history, Segmentation, Semantics of visual behaviour

Article history: Available online 16 September 2002.

DOI: https://doi.org/10.1016/S0262-8856(02)00096-3