Multiple Granularity Modeling: A Coarse-to-Fine Framework for Fine-grained Action Analysis

Authors: Bingbing Ni, Vignesh R. Paramathayalan, Teng Li, Pierre Moulin

Abstract

Detecting fine-grained human actions in video sequences is challenging. In this work, we propose to decompose this difficult analytic problem into two sequential tasks of increasing granularity. First, we infer the coarse interaction status, i.e., which object is being manipulated and where the interaction occurs. To address the frequent mutual occlusions that arise during manipulation, we propose an interaction tracking framework in which hand (object) positions and interaction status are jointly tracked by explicitly modeling the occlusion context. Second, for a given query sequence, the inferred interaction status is used to efficiently identify a small set of candidate matching sequences from the annotated training set. Frame-level action labels are then transferred to the query sequence by establishing a matching between the query and the candidate sequences. Comprehensive experiments on two challenging fine-grained activity datasets show that: (1) the proposed interaction tracking approach achieves high tracking accuracy for multiple mutually occluded objects (hands) during manipulation actions; and (2) the proposed multiple granularity analysis framework improves action detection performance over state-of-the-art methods.
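
The second, fine-grained stage described above (candidate retrieval by interaction status, followed by frame-level label transfer) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the matching is computed, so dynamic time warping (DTW) over scalar per-frame features is assumed here as a stand-in, and all names (`dominant_status`, `dtw_path`, `transfer_labels`, the `status`/`feat`/`labels` keys) are hypothetical.

```python
from collections import Counter


def dominant_status(statuses):
    """Most frequent coarse interaction status (e.g. the manipulated object)."""
    return Counter(statuses).most_common(1)[0][0]


def dtw_path(a, b):
    """DTW between two sequences of scalar features (an assumed matching method).

    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end to recover the frame-to-frame alignment.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def transfer_labels(query, training):
    """Coarse step: keep training sequences sharing the query's dominant status.
    Fine step: align the query to each candidate and copy labels from the best one."""
    q_status = dominant_status(query["status"])
    candidates = [t for t in training if dominant_status(t["status"]) == q_status]
    if not candidates:
        candidates = training  # assumption: fall back to the full training set
    best = None
    for t in candidates:
        total, path = dtw_path(query["feat"], t["feat"])
        if best is None or total < best[0]:
            best = (total, path, t)
    _, path, match = best
    labels = [None] * len(query["feat"])
    for qi, ti in path:
        labels[qi] = match["labels"][ti]
    return labels


# Toy data, invented purely for illustration.
training = [
    {"status": ["cup"] * 4, "feat": [0.0, 0.0, 1.0, 1.0],
     "labels": ["reach", "reach", "pour", "pour"]},
    {"status": ["knife"] * 3, "feat": [0.0, 1.0, 1.0],
     "labels": ["grasp", "cut", "cut"]},
]
query = {"status": ["cup", "cup", "cup"], "feat": [0.0, 1.0, 1.0]}
print(transfer_labels(query, training))  # ['reach', 'pour', 'pour']
```

The coarse filter is what keeps the fine-grained matching cheap: only sequences involving the same manipulated object are aligned frame by frame, so the "knife" sequence above is never compared against the "cup" query even though its features match exactly.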

Keywords: Multiple granularity, Fine-grained action detection, Multiple object tracking, Nonparametric label transfer

Paper URL: https://doi.org/10.1007/s11263-016-0891-8