Multi-label image recognition with two-stream dynamic graph convolution networks

Authors:

Highlights:

Abstract

Recent studies use Graph Convolution Networks (GCNs) to model label correlations in multi-label images because GCNs perform well on relational modeling tasks. However, the traditional GCN generalizes poorly, and the accuracy of current state-of-the-art methods is poor. We therefore propose a Two-Stream Dynamic Graph Convolution Network (2S-DGCN) to improve the performance of multi-label image recognition. In the upstream of 2S-DGCN, a Semantic Attention Module (SAM) and a Dynamic Graph Convolution Network (DGCN) first produce the Up Confidence Score of prediction categories (UCS), the content-aware category, and the label discriminant vector. Then, new graph feature nodes, reconstructed by laterally embedding the content-aware category and the label discriminant vector, are fed into a second DGCN in the downstream to produce the Down Confidence Score of prediction categories (DCS). Finally, the Final Confidence Score of prediction categories (FCS) for multi-label image recognition is obtained by fusing the UCS and the DCS. In extensive experiments on public multi-label benchmarks, the method achieves mAPs of 85.6% on MS-COCO and 95.4% on VOC 2007. Comparative experiments and visualizations demonstrate that our method outperforms the current state-of-the-art methods.
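
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layer definitions, the lateral embedding, or the fusion rule, so the softmax attention, the feature-derived dynamic adjacency, the element-wise sum used as the lateral embedding, and the simple average used for fusion are all assumptions, and every function and parameter name below is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def semantic_attention(features, attn):
    """Assumed SAM: pool N spatial features (N, D) into C category vectors (C, D)."""
    logits = attn @ features.T                      # (C, N)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ features                       # (C, D)

def dynamic_gcn(nodes, static_adj, w_static, w_dynamic):
    """Assumed DGCN: a static graph convolution followed by one whose
    adjacency is computed from the node features themselves."""
    h = np.maximum(static_adj @ nodes @ w_static, 0.0)        # (C, D)
    dyn_adj = sigmoid(h @ h.T)                                # (C, C), input-dependent
    dyn_adj = dyn_adj / dyn_adj.sum(axis=1, keepdims=True)    # row-normalize
    return np.maximum(dyn_adj @ h @ w_dynamic, 0.0)           # (C, D)

def two_stream_scores(features, p):
    # Upstream: SAM + DGCN give category vectors, discriminant vectors, and UCS.
    category = semantic_attention(features, p["attn"])
    discriminant = dynamic_gcn(category, p["adj"], p["ws_up"], p["wd_up"])
    ucs = sigmoid((discriminant * p["cls_up"]).sum(axis=1))   # (C,)

    # Downstream: rebuild graph nodes (assumption: element-wise sum) and
    # run a second DGCN to get DCS.
    nodes = category + discriminant
    down = dynamic_gcn(nodes, p["adj"], p["ws_down"], p["wd_down"])
    dcs = sigmoid((down * p["cls_down"]).sum(axis=1))         # (C,)

    # Fusion (assumption: plain average of the two streams).
    fcs = 0.5 * (ucs + dcs)
    return ucs, dcs, fcs

def random_params(rng, num_classes, dim):
    """Hypothetical random parameters, only for exercising the sketch."""
    def small(*shape):
        return 0.1 * rng.normal(size=shape)
    return {
        "attn": small(num_classes, dim),
        "adj": np.full((num_classes, num_classes), 1.0 / num_classes),
        "ws_up": small(dim, dim), "wd_up": small(dim, dim),
        "ws_down": small(dim, dim), "wd_down": small(dim, dim),
        "cls_up": small(num_classes, dim), "cls_down": small(num_classes, dim),
    }
```

Calling `two_stream_scores(features, random_params(rng, num_classes, dim))` with a `(N, dim)` feature array returns three per-category score vectors; at inference, a category would typically be predicted as present when its FCS exceeds a chosen threshold.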

Keywords: Multi-label image recognition, Two streams, Reconstructing graph feature nodes, Dynamic graph convolution networks

Review history: Received 17 May 2021, Accepted 2 June 2021, Available online 24 June 2021, Version of Record 26 June 2021.

Official paper URL: https://doi.org/10.1016/j.imavis.2021.104238