4G-VOS: Video Object Segmentation using guided context embedding

Authors:

Highlights:

Abstract

Video Object Segmentation (VOS) is a fundamental task underlying many high-level, real-world computer vision applications. VOS is challenging due to background distractors and variations in object appearance. Many existing VOS approaches rely on online model updates to capture appearance variations, which incurs a high computational cost. Template-matching and propagation-based VOS methods, although cost-effective, suffer from performance degradation in challenging scenarios such as occlusion and background clutter. To tackle these challenges, we propose a network architecture, dubbed 4G-VOS, that encodes video context for improved VOS performance. To preserve long-term semantic information, we propose a guided transfer embedding module. We employ a global instance matching module to generate similarity maps from the initial image and its mask. In addition, we use a generative directional appearance module to estimate and dynamically update the foreground/background class probabilities in a spherical embedding space. Moreover, existing approaches may lose contextual information during feature refinement; we therefore propose a guided pooled decoder that exploits both global and local contextual information during feature refinement. The proposed framework is an end-to-end architecture trained entirely offline. Evaluations on three VOS benchmark datasets, DAVIS2016, DAVIS2017, and YouTube-VOS, demonstrate outstanding performance of the proposed algorithm compared with 40 existing state-of-the-art methods.
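To illustrate the matching step described in the abstract, below is a minimal sketch (not the authors' released code) of how a global instance matching module could turn first-frame features and the given mask into foreground and background similarity maps via cosine similarity. The function name `instance_matching`, the top-k aggregation, and the use of PyTorch are assumptions made for illustration; the exact formulation in 4G-VOS may differ.

```python
# Sketch only: cosine-similarity-based instance matching between a reference
# (first) frame and the current frame, producing two similarity maps.
import torch
import torch.nn.functional as F

def instance_matching(feat_ref, mask_ref, feat_cur, top_k=20):
    """feat_ref, feat_cur: (C, H, W) backbone features; mask_ref: (H, W) binary mask."""
    C, H, W = feat_ref.shape
    # L2-normalise along channels so dot products become cosine similarities.
    ref = F.normalize(feat_ref.reshape(C, -1), dim=0)   # (C, H*W)
    cur = F.normalize(feat_cur.reshape(C, -1), dim=0)   # (C, H*W)
    sim = cur.t() @ ref                                  # (H*W, H*W) pairwise similarities
    m = mask_ref.reshape(1, -1).bool()                   # reference foreground locations
    # For each current-frame location, aggregate its top-k similarities to
    # reference foreground pixels and to reference background pixels.
    fg = sim.masked_fill(~m, -1.0).topk(top_k, dim=1).values.mean(dim=1)
    bg = sim.masked_fill(m, -1.0).topk(top_k, dim=1).values.mean(dim=1)
    return fg.reshape(H, W), bg.reshape(H, W)

# Example usage with random features (shapes chosen only for illustration):
# fg_map, bg_map = instance_matching(torch.randn(256, 30, 54),
#                                    (torch.rand(30, 54) > 0.8).float(),
#                                    torch.randn(256, 30, 54))
```

In a full pipeline, such similarity maps would typically be fused with current-frame features and passed to the decoder; the specific fusion and the spherical-embedding probability update used by 4G-VOS are not detailed in this abstract.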

Keywords: Video Object Segmentation, Feature transfer and matching, Spherical embedding, Feature refinement, Channel convolutional neural networks, Encoder–decoder

Article history: Received 21 January 2021, Revised 12 July 2021, Accepted 14 August 2021, Available online 30 August 2021, Version of Record 6 September 2021.

Paper URL: https://doi.org/10.1016/j.knosys.2021.107401