Head detection using motion features and multi level pyramid architecture

Authors:

Highlights:

Abstract

Monitoring large crowds using video cameras is a challenging task. Detecting humans in video is becoming essential for monitoring crowd behavior. However, occlusion and low resolution in the region of interest hinder accurate crowd segmentation. In such scenarios, it is likely that only the head is visible, and it is often very small. Most existing people-detection systems rely on low-level visual appearance features such as the Histogram of Oriented Gradients (HOG), which are unsuitable for detecting human heads at low resolutions. In this paper, a novel head detector based on motion histogram features is presented. Shape and motion information, including crowd direction and magnitude, are learned and used to detect humans in occluded crowds. We introduce novel features based on a multi-level pyramid architecture for the Motion Boundary Histogram (MBH) and the Histogram of Oriented Optical Flow (HOOF), derived from TV-L1 optical flow. In addition, a new feature, called Relative Motion Distance (RMD), is proposed to efficiently capture correlation statistics. To distinguish human heads from similar features, a two-stage Support Vector Machine (SVM) is used, with an explicit kernel mapping applied to our motion histogram features using Bhattacharyya-distance kernels. The second classification stage is required to reduce the number of false positives. The proposed features and system were tested on videos from the PETS 2009 dataset and compared with state-of-the-art features, against which our system achieved excellent results.
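
As a rough illustration of the kind of feature the abstract describes, the sketch below builds a pyramid of orientation histograms over a dense optical-flow patch and applies the explicit feature map of the Bhattacharyya kernel (an element-wise square root of an L1-normalized histogram, so that a plain dot product equals the Bhattacharyya coefficient). It is not the authors' implementation: the function names, the plain magnitude-weighted orientation binning, the 1x1 and 2x2 grid layout, and the bin count are all assumptions made for illustration, and the flow field is taken as a given array rather than computed with TV-L1.

```python
import numpy as np

def hoof(flow, n_bins=8):
    """Magnitude-weighted histogram of flow orientations, L1-normalized.

    Simplified stand-in for HOOF (assumption): plain binning over [0, 2*pi).
    `flow` has shape (H, W, 2) holding the (dx, dy) components.
    """
    fx, fy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(fx, fy)
    ang = np.arctan2(fy, fx) % (2 * np.pi)
    # Clamp guards against an angle rounding up to exactly 2*pi.
    bins = np.minimum((ang / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins, weights=mag, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

def pyramid_hoof(flow, levels=2, n_bins=8):
    """Concatenate per-cell histograms over 1x1, 2x2, ... grids (assumed layout)."""
    h, w = flow.shape[:2]
    feats = []
    for level in range(levels):
        cells = 2 ** level
        ys = np.linspace(0, h, cells + 1).astype(int)
        xs = np.linspace(0, w, cells + 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                cell = flow[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                feats.append(hoof(cell, n_bins))
    v = np.concatenate(feats)
    s = v.sum()
    return v / s if s > 0 else v  # re-normalize the concatenated vector

def bhattacharyya_map(hist):
    """Explicit map for the Bhattacharyya kernel: phi(h) = sqrt(h)."""
    return np.sqrt(hist)

# Toy patch whose flow points uniformly in the +x direction.
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
phi = bhattacharyya_map(pyramid_hoof(flow))
```

After this mapping, a linear SVM trained on `phi` behaves like a kernel SVM with the Bhattacharyya kernel on the original histograms, which is the usual motivation for an explicit mapping of this kind.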

Keywords:

Review history: Received 27 June 2014, Revised 23 December 2014, Accepted 18 April 2015, Available online 23 April 2015, Version of Record 1 June 2015.

Paper URL: https://doi.org/10.1016/j.cviu.2015.04.007