Recognizing expressions from face and body gesture by temporal normalized motion and appearance features

Abstract

Recently, recognizing affect from both face and body gestures has attracted increasing attention. However, efficient and effective features for describing the dynamics of the face and body gestures in real-time automatic affect recognition are still lacking. In this paper, we combine local motion and appearance features in a novel framework to model the temporal dynamics of face and body gestures. The proposed framework employs MHI-HOG and Image-HOG features, through either temporal normalization or a bag-of-words representation, to capture motion and appearance information. MHI-HOG computes the Histogram of Oriented Gradients (HOG) on the Motion History Image (MHI); it captures the motion direction and speed of a region of interest as an expression evolves over time. Image-HOG captures the appearance information of the corresponding region of interest. The temporal normalization method explicitly resolves the time-resolution differences in video-based affect recognition. To implicitly model the local temporal dynamics of an expression, we further propose a bag-of-words (BOW) representation for both MHI-HOG and Image-HOG features. Experimental results demonstrate promising performance compared with the state of the art, and a significant improvement in recognition accuracy is achieved over a frame-based approach that does not consider the underlying temporal dynamics.
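The sketch below illustrates the general idea behind the features described in the abstract: a Motion History Image updated frame by frame, HOG computed on the MHI (MHI-HOG) and on the corresponding image patch (Image-HOG), and a simple temporal normalization of a per-frame feature sequence. It is a minimal illustration, not the authors' implementation; the parameter values (tau, threshold, HOG cell sizes, target sequence length) and the function names are assumptions chosen for clarity.

```python
# Minimal sketch of MHI-HOG / Image-HOG extraction and temporal normalization.
# Assumes grayscale frames of equal size; all parameter values are illustrative.
import numpy as np
from skimage.feature import hog  # standard HOG implementation


def update_mhi(mhi, prev_frame, curr_frame, tau=15, thresh=25):
    """Update a Motion History Image: pixels that moved between the two
    frames are set to tau, all other pixels decay by one (floored at zero)."""
    motion = np.abs(curr_frame.astype(np.int16)
                    - prev_frame.astype(np.int16)) > thresh
    return np.where(motion, tau, np.maximum(mhi - 1, 0))


def mhi_hog(mhi, roi):
    """HOG descriptor on an MHI region of interest (MHI-HOG):
    encodes local motion direction and speed of the expression."""
    y, x, h, w = roi
    return hog(mhi[y:y + h, x:x + w], orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))


def image_hog(frame, roi):
    """HOG descriptor on the corresponding image patch (Image-HOG):
    encodes the local appearance of the region of interest."""
    y, x, h, w = roi
    return hog(frame[y:y + h, x:x + w], orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))


def temporally_normalize(feature_seq, target_len=20):
    """Resample a variable-length sequence of per-frame feature vectors
    (shape T x D) to a fixed number of frames by linear interpolation,
    so that expressions of different durations become comparable."""
    feature_seq = np.asarray(feature_seq)
    t_src = np.linspace(0.0, 1.0, len(feature_seq))
    t_dst = np.linspace(0.0, 1.0, target_len)
    return np.stack([np.interp(t_dst, t_src, feature_seq[:, d])
                     for d in range(feature_seq.shape[1])], axis=1)
```

As an alternative to temporal normalization, the per-frame MHI-HOG and Image-HOG descriptors can be quantized against a learned codebook (e.g., k-means centroids) and pooled into a bag-of-words histogram, which models the local temporal dynamics implicitly rather than by resampling.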

Keywords: Affect recognition, Facial feature, Body gesture, MHI-HOG, Image-HOG

Article history: Received 18 October 2011, Revised 18 June 2012, Accepted 28 June 2012, Available online 11 July 2012.

DOI: https://doi.org/10.1016/j.imavis.2012.06.014