Primary object discovery and segmentation in videos via graph-based transductive inference

Authors:

Highlights:

Abstract

The proliferation of video data makes it imperative to develop automatic approaches that semantically analyze and summarize the ever-growing mass of visual data. In contrast to existing approaches built on still images, we propose an algorithm that detects the recurring primary object and learns cohort object proposals over space-time in video. Our core contribution is a graph transduction process that exploits both appearance cues learned from rudimentary detections of object-like regions and the intrinsic structures within video data. Rudimentary detections of recurring objects in video, despite appearance variation and the sporadic nature of detection, collectively describe the primary object; by exploiting this fact, we are able to learn a holistic model from a small set of object-like regions. This prior knowledge of the recurring primary object can be propagated to the rest of the video to generate a diverse set of object proposals in all frames, incorporating both spatial and temporal cues. This rich set of descriptions underpins an object segmentation method that is robust to changes in appearance, shape and occlusion in natural videos. We present extensive experiments on challenging datasets that demonstrate the superior performance of our approach compared with state-of-the-art methods.
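The abstract does not give the formulation of the graph transduction step, so the following is only a minimal sketch of one generic, widely used form of graph-based transductive inference (closed-form score propagation over a normalized affinity graph), not the authors' method. The function name `propagate_scores`, the parameter `alpha`, and the toy affinity matrix are all assumptions made for illustration; in a video setting the nodes would plausibly be regions and the edge weights spatial and temporal affinities.

```python
import numpy as np


def propagate_scores(W, y, alpha=0.9):
    """Generic graph transduction (hypothetical helper, not from the paper).

    Solves F = (I - alpha * S)^-1 y, where S = D^-1/2 W D^-1/2 is the
    symmetrically normalized affinity matrix and y holds the initial
    (seed) scores of the nodes.
    """
    W = np.asarray(W, dtype=float)
    y = np.asarray(y, dtype=float)
    degree = W.sum(axis=1)

    # Guard against isolated nodes (degree 0) to avoid division by zero.
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])

    # Symmetric normalization of the affinity matrix.
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]

    # Closed-form solution of the propagation; alpha < 1 keeps the
    # linear system well conditioned.
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, y)


# Toy graph (assumed data): five regions in a chain, with a weak link
# between regions 2 and 3. Only region 0 starts with a seed score.
W = np.array([
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
])
y = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
scores = propagate_scores(W, y)
print(scores)
```

In this toy example the seed score spreads along the graph edges, so regions 1 and 2, which are strongly connected to the seed, end up with higher scores than regions 3 and 4 on the far side of the weak link. This mirrors, in a simplified way, the idea of propagating knowledge from a few object-like regions to the rest of the video.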

Keywords:

Review history: Received 2 October 2014, Revised 17 August 2015, Accepted 15 November 2015, Available online 13 January 2016, Version of Record 13 January 2016.

Official paper URL: https://doi.org/10.1016/j.cviu.2015.11.006