IIT-GAT: Instance-level image transformation via unsupervised generative attention networks with disentangled representations

Authors:

Highlights:

Abstract

Image-to-image translation is an important research field in computer vision and is widely associated with Generative Adversarial Networks (GANs) and dual learning. However, existing methods mainly translate the global image of the source domain to the target domain; they fail to achieve instance-level image-to-image translation, and the translation results in the target domain cannot be controlled. In this paper, an instance-level image-to-image translation network (IIT-GAT) is proposed, which comprises an attention module and a feature-encoder module. The attention module guides the model to focus on the instances of interest and to generate instance masks, which help separate an instance from the background of an image. The feature-encoder module embeds images into two different spaces: a domain-invariant content space and a domain-specific attribute space. The content features and attribute features of different images are fed to the generator simultaneously to improve the controllability of image-to-image translation. To this end, we introduce a local self-reconstruction loss that encourages the network to learn the style features of target instances. Overall, our method not only improves the quality of instance-level image-to-image translation but also increases its controllability. Extensive experiments on multiple datasets validate the effectiveness of the proposed framework, and the results show that our method outperforms previous methods.
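The abstract describes the architecture only at a high level; the minimal PyTorch sketch below illustrates how the described pieces could fit together. All module designs, layer sizes, and names (AttentionModule, ContentEncoder, AttributeEncoder, Generator, local_self_reconstruction_loss) are illustrative assumptions, not the authors' released implementation: a soft attention mask separates instance from background, two encoders disentangle content and attribute features, the generator combines the content of one image with the attributes of another, and the local self-reconstruction loss is restricted to the masked instance region.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: all modules, shapes, and names are illustrative,
# not the paper's released implementation.

class AttentionModule(nn.Module):
    """Predicts a soft instance mask that separates instance from background."""
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),  # mask values in [0, 1]
        )

    def forward(self, x):
        return self.net(x)

class ContentEncoder(nn.Module):
    """Maps an image into the shared, domain-invariant content space."""
    def __init__(self, in_ch=3, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class AttributeEncoder(nn.Module):
    """Maps an image into a domain-specific attribute (style) vector."""
    def __init__(self, in_ch=3, attr_dim=8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, attr_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Generator(nn.Module):
    """Decodes content features conditioned on an attribute vector."""
    def __init__(self, dim=64, attr_dim=8, out_ch=3):
        super().__init__()
        self.fc = nn.Linear(attr_dim, dim)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, content, attr):
        # Broadcast the attribute code over the spatial content map.
        b = self.fc(attr)[:, :, None, None]
        return self.net(content + b)

def local_self_reconstruction_loss(x, x_rec, mask):
    """L1 reconstruction restricted to the masked instance region, pushing
    the network to learn the style of the target instance itself."""
    return torch.mean(mask * torch.abs(x - x_rec))

# Usage: translate the instance style of x_b onto the content of x_a.
attn, E_c, E_a, G = AttentionModule(), ContentEncoder(), AttributeEncoder(), Generator()
x_a, x_b = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
mask_a = attn(x_a)
y = G(E_c(x_a), E_a(x_b))        # content of x_a, attributes of x_b
x_a_rec = G(E_c(x_a), E_a(x_a))  # self-reconstruction path
loss_local = local_self_reconstruction_loss(x_a, x_a_rec, mask_a)
```

Note that swapping E_a(x_b) for E_a(x_a) in the call to G switches between cross-domain translation and the self-reconstruction path scored by the local loss, which is how feeding features of different images to the generator yields controllable outputs.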

Keywords: Generative adversarial networks, Image-to-image translation, Attention mechanism, Disentangled representation

Article history: Received 21 December 2020; Revised 4 April 2021; Accepted 2 May 2021; Available online 4 May 2021; Version of Record 7 May 2021.

DOI: https://doi.org/10.1016/j.knosys.2021.107122