Video semantic segmentation via feature propagation with holistic attention

Authors:

Highlights:

• Propose LERNet, a Light, Efficient and Real-time network, as a strong backbone for per-frame processing.

• Propagate features efficiently across redundant video frames using a key-frame selection schedule.

• Use temporal holistic attention to model spatial correlations between key frames and non-key frames.

• Achieve a speed of 131 fps on the Cityscapes dataset.

Keywords: Real-time, Attention mechanism, Feature propagation, Video semantic segmentation

Review history: Received 7 June 2019, Revised 9 January 2020, Accepted 10 February 2020, Available online 11 February 2020, Version of Record 11 May 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107268