DMDIT: Diverse multi-domain image-to-image translation

Authors:

Highlights:

Abstract

Cross-domain image translation, which aims to learn the mapping between two different domains, has made brilliant progress in recent years. A good cross-domain image translation model should meet the following conditions: (1) it does not rely on paired datasets, (2) it can handle multiple domains, and (3) it can produce diverse outputs from the same source image. Most state-of-the-art studies address only two of these requirements, i.e., either (1) and (2), or (1) and (3). In this paper, we construct a unified diverse multi-domain image-to-image translation framework (DMDIT) that satisfies all three requirements simultaneously. Unlike traditional approaches, the proposed generator achieves diverse and multi-label image-to-image translation while retaining the underlying features of the input image. Diverse outputs are obtained by randomly sampling latent noise from a normal distribution. To further improve the diversity of the outputs, we propose a novel style regularization loss to constrain the latent noise. Because mode collapse usually occurs when the noise is unconstrained, we embed a noise separation module in the discriminator to avoid this issue. In addition, we apply an attention mechanism so that the model attentively focuses on the most attribute-relevant regions, which helps improve the quality of the generated images. Extensive qualitative and quantitative evaluations clearly demonstrate the effectiveness of our approach.
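The abstract does not give the exact form of the style regularization loss. As a rough illustration of the idea it describes (encouraging outputs to vary with the sampled latent noise, thereby discouraging mode collapse), the following is a minimal NumPy sketch of a mode-seeking-style diversity regularizer; the function name and the ratio form are assumptions for illustration, not the paper's actual formulation:

```python
import numpy as np

def style_regularization_loss(out1, out2, z1, z2, eps=1e-8):
    """Hypothetical diversity regularizer: penalize the generator when two
    different noise codes z1, z2 yield nearly identical outputs out1, out2.
    Minimizing the ratio below pushes the output distance to grow with the
    noise distance, countering mode collapse."""
    d_out = np.mean(np.abs(out1 - out2))  # distance between generated images
    d_z = np.mean(np.abs(z1 - z2))        # distance between noise codes
    return d_z / (d_out + eps)            # small when outputs are diverse

# Toy check: identical outputs (collapse) incur a much larger loss
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=8), rng.normal(size=8)
img = rng.normal(size=(4, 4))
collapsed = style_regularization_loss(img, img, z1, z2)
diverse = style_regularization_loss(img, img + 1.0, z1, z2)
```

In a real training loop such a term would be added to the generator's objective alongside the adversarial and reconstruction losses, weighted by a hyperparameter.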

Keywords: Multi-domain, Multi-modality, Image translation

Article history: Received 14 March 2021, Revised 12 July 2021, Accepted 14 July 2021, Available online 16 July 2021, Version of Record 30 July 2021.

DOI: https://doi.org/10.1016/j.knosys.2021.107311