Lightweight dynamic conditional GAN with pyramid attention for text-to-image synthesis

Authors:

Highlights:

• We propose a Conditional Manipulating Modular (CM-M) within the Conditional Manipulating Block (CM-B) to compensate for lost semantic information.

• We develop a Pyramid Attention Refine Block (PAR-B) to capture multi-scale context.

• A perceptual loss L1 and an image-consistency loss L2 are used to optimize the generator, improving the sharpness and consistency of the generated images.


Keywords: Text-to-image synthesis, Conditional generative adversarial network (CGAN), Network complexity, Disentanglement process, Entanglement process, Information compensation, Pyramid attentive fusion

Review history: Received 18 June 2019, Revised 19 March 2020, Accepted 15 April 2020, Available online 29 May 2020, Version of Record 1 November 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107384