Encoder deep interleaved network with multi-scale aggregation for RGB-D salient object detection

Authors:

Highlights:

• We design a two-stream deep interleaved encoder network to obtain multi-level continuous multi-modal features for saliency detection.

• We utilize a cross-modal mutual guidance module to locate the salient region.

• We present a residual multi-scale aggregation module to combine the global-to-local context progressively.

• Our method performs favorably against other state-of-the-art saliency detection algorithms, and the network can run at about 93 FPS in the testing stage.

Keywords: RGB-D salient object detection, Deep interleaved encoder, Cross-modal mutual guidance, Residual multi-scale feature aggregation, Real-time

Review history: Received 7 January 2021, Revised 10 March 2022, Accepted 22 March 2022, Available online 24 March 2022, Version of Record 29 March 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108666