Lightweight dynamic conditional GAN with pyramid attention for text-to-image synthesis

Authors:

Highlights:

• We propose a Conditional Manipulating Modular (CM-M) within the Conditional Manipulating Block (CM-B) to compensate for semantic information.

• We develop a Pyramid Attention Refine Block (PAR-B) to capture multi-scale context.

• A perceptual loss L1 and an image-consistency loss L2 are used to optimize the generator, improving the sharpness and consistency of the generated images.

Keywords: Text-to-image synthesis, Conditional generative adversarial network (CGAN), Network complexity, Disentanglement process, Entanglement process, Information compensation, Pyramid attentive fusion

Review history: Received 18 June 2019, Revised 19 March 2020, Accepted 15 April 2020, Available online 29 May 2020, Version of Record 1 November 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107384