Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features

Highlights:

• We propose to leverage the aggregation of view and instance attentive features for multi-view 3D object retrieval.

• To leverage local view-relevant discriminative information within each of the view images, we propose a View Attention Module (VAM) to learn view attentive features for each view image.

• To leverage global correlative information across all the view images, we propose an Instance Attention Module (IAM) to learn instance attentive features for each view image.

• We propose to employ the ArcFace loss together with a cosine-distance-based triplet-center loss as metric learning guidance, in order to learn discriminative representations in the angular feature space.
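The last highlight pairs two angular-space objectives, and a small sketch may make the combination concrete. The code below uses the standard textbook forms of both losses for a single feature vector: ArcFace adds an angular margin to the true-class angle before a scaled softmax, and the triplet-center loss compares the cosine distance to the true class center against the nearest other center. The function names, the values of scale, margin, and weight, and the plain weighted sum are illustrative assumptions, not the settings or implementation used in the paper.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def arcface_loss(feature, class_weights, label, scale=30.0, margin=0.5):
    """Softmax cross-entropy over scaled cosines, with an additive angular
    margin on the true class. scale and margin are assumed values.
    Simplification: no special handling when theta + margin exceeds pi."""
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = max(-1.0, min(1.0, _cosine(feature, w)))
        if j == label:
            # Penalize the true class by widening its angle to the weight vector.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[label]


def cosine_triplet_center_loss(feature, centers, label, margin=0.3):
    """Hinge on (distance to own center) + margin - (distance to nearest
    other center), with distance = 1 - cosine similarity.
    margin is an assumed value."""
    dists = [1.0 - _cosine(feature, c) for c in centers]
    positive = dists[label]
    negative = min(d for j, d in enumerate(dists) if j != label)
    return max(0.0, positive + margin - negative)


def combined_loss(feature, class_weights, centers, label, weight=1.0):
    """Hypothetical combination: a weighted sum of the two losses."""
    return (arcface_loss(feature, class_weights, label)
            + weight * cosine_triplet_center_loss(feature, centers, label))


if __name__ == "__main__":
    feature = [0.9, 0.1]
    weights = [[1.0, 0.0], [0.0, 1.0]]   # one weight vector per class
    centers = [[1.0, 0.0], [0.0, 1.0]]   # one learned center per class
    print(combined_loss(feature, weights, centers, label=0))
```

Both terms depend only on angles between vectors, which is why they can be applied to the same normalized embedding. In a training setup the class weights and centers would be learnable parameters updated alongside the network.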

Keywords: View-based 3D object retrieval, View attention module, Instance attention module, ArcFace loss, Cosine distance triplet-center loss

Article history: Received 1 December 2021, Revised 3 April 2022, Accepted 5 April 2022, Available online 11 April 2022, Version of Record 25 April 2022.