CANet: Co-attention network for RGB-D semantic segmentation

Authors:

Highlights:

• We propose a novel CANet for RGB-D semantic segmentation. Its key co-attention fusion part consists of three modules, i.e. the PCFM, the CCFM, and the FCM: the PCFM and CCFM aggregate the position-wise and channel-wise features of RGB and depth images, respectively, and the FCM produces the final fused features by integrating the outputs of the PCFM, the CCFM, and the mixture branch.

• We perform extensive experiments on the NYUDv2 and SUN-RGBD datasets, where the CANet significantly improves RGB-D semantic segmentation results, achieving state-of-the-art performance on these two popular RGB-D benchmarks.
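The highlighted fusion design (position-wise and channel-wise co-attention between RGB and depth features, combined by a final fusion module) can be illustrated with a minimal numpy sketch. This is a hypothetical reading of the PCFM/CCFM/FCM structure, not the paper's actual implementation: the function names, the residual connections, and the simple averaging "mixture branch" are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_coattention(rgb, depth):
    """Hypothetical PCFM-style fusion: RGB queries attend over the
    spatial positions of the depth feature map."""
    C, H, W = rgb.shape
    q = rgb.reshape(C, H * W).T                    # (HW, C) queries from RGB
    k = depth.reshape(C, H * W)                    # (C, HW) keys from depth
    attn = softmax(q @ k / np.sqrt(C), axis=-1)    # (HW, HW) position affinity
    v = depth.reshape(C, H * W).T                  # (HW, C) values from depth
    fused = (attn @ v).T.reshape(C, H, W)
    return rgb + fused                             # residual fusion (assumed)

def channel_coattention(rgb, depth):
    """Hypothetical CCFM-style fusion: cross-modal channel affinity
    reweights the depth channels before adding them to RGB."""
    C, H, W = rgb.shape
    r = rgb.reshape(C, H * W)
    d = depth.reshape(C, H * W)
    attn = softmax(r @ d.T / np.sqrt(H * W), axis=-1)  # (C, C) channel affinity
    return rgb + (attn @ d).reshape(C, H, W)

def fuse(rgb, depth):
    """Hypothetical FCM: integrate the position-wise branch, the
    channel-wise branch, and a simple mixture branch."""
    mixture = 0.5 * (rgb + depth)                  # assumed mixture branch
    return position_coattention(rgb, depth) + channel_coattention(rgb, depth) + mixture
```

The sketch preserves the feature-map shape `(C, H, W)` end to end, so the fused output can feed a standard segmentation head.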

Abstract:


Keywords: RGB-D, Multi-modal fusion, Co-attention, Semantic segmentation

Article history: Received 28 June 2020, Revised 30 September 2021, Accepted 27 November 2021, Available online 29 November 2021, Version of Record 6 December 2021.

DOI: https://doi.org/10.1016/j.patcog.2021.108468