Co-segmentation inspired attention module for video-based computer vision tasks

Authors:

Highlights:

• Exploring the applicability of the “Co-segmentation” concept to video-based tasks.

• A generic “Co-Segmentation inspired Attention Module” (COSAM) that can be plugged into any CNN (a minimal sketch follows this list).

• The COSAM module captures task-specific salient regions in an end-to-end manner.

• The COSAM module aids interpretability through its spatial attention mask.

• Experiments on three video-based tasks show the effectiveness of the COSAM module.
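
To make the plug-in idea above concrete, the sketch below shows one way a co-segmentation-style attention block for per-frame CNN features could look. This is an illustrative assumption based only on the highlights: the class name `CoSegAttention`, the 1×1 projection, and the clip-summary cosine correlation are hypothetical choices, not the paper's actual COSAM formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoSegAttention(nn.Module):
    """Illustrative co-segmentation-style spatial attention over video frames.

    Given feature maps of T frames from a shared CNN backbone, the block builds
    a clip-level summary descriptor, correlates every spatial location with it,
    and re-weights the features with the resulting mask. The mask can also be
    visualized for interpretability.
    """

    def __init__(self, channels: int, reduced: int = 128):
        super().__init__()
        # Project features to a lower-dimensional space before correlation.
        self.project = nn.Conv2d(channels, reduced, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (T, C, H, W) -- per-frame features of one video clip.
        emb = self.project(feats)                        # (T, R, H, W)
        # Clip-level summary: mean embedding over frames and locations.
        summary = emb.mean(dim=(0, 2, 3), keepdim=True)  # (1, R, 1, 1)
        # Similarity of each spatial location to the clip summary.
        sim = F.cosine_similarity(emb, summary, dim=1)   # (T, H, W)
        mask = torch.sigmoid(sim).unsqueeze(1)           # (T, 1, H, W)
        # Re-weight the original features with the spatial attention mask.
        return feats * mask

# Example usage: a clip of 8 frames with 512-channel 16x8 feature maps.
feats = torch.randn(8, 512, 16, 8)
out = CoSegAttention(512)(feats)   # same shape as feats
```

Because the block keeps the input shape, it could in principle be inserted between intermediate layers of an existing backbone and trained end-to-end with the downstream task loss, which is the usage pattern the highlights describe.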


Keywords: Attention, Co-segmentation, Person re-ID, Video-captioning, Video classification

Article history: Received 26 November 2021, Revised 2 August 2022, Accepted 3 August 2022, Available online 9 August 2022, Version of Record 23 August 2022.

DOI: https://doi.org/10.1016/j.cviu.2022.103532