Loss functions for pose guided person image generation

Authors:

Highlights:

Abstract

Pose guided person image generation aims to transform a source person image to a target pose. It is an ill-posed problem, as we often need to generate pixels that are invisible in the source image. Recent works focus on designing new deep neural network architectures and show promising performance. However, they simply adopt loss functions widely used in generic image generation tasks, e.g., adversarial loss, L1-norm loss, perceptual loss, and style loss, which fail to consider the unique structural patterns of a person. In addition, it remains unclear how each individual loss and their combinations impact the generated person images. The goal of this paper is to conduct a comprehensive study of loss functions for pose guided person image generation. After revisiting these generic loss functions, we consider the structural similarity (SSIM) index as a loss function, since it is widely used as an evaluation metric and can capture the perceptual quality of generated images. In addition, motivated by the observation that a person can be divided into part regions with homogeneous pixel values or texture, we extend the SSIM loss into a novel Part-based SSIM (PSSIM) loss to explicitly account for the articulated body structure. A new PSSIM metric then follows naturally to assess the quality of generated person images. To investigate loss functions in depth, we conduct extensive experiments including single-loss analysis, multi-loss combination analysis, optimal loss combination search, and comparison with state-of-the-art methods. Both quantitative and qualitative results indicate that (1) using different loss functions significantly impacts the generated person images, (2) the combination of adversarial loss, perceptual loss, and PSSIM loss is the optimal choice for person image generation, and (3) the proposed PSSIM loss is complementary to prior losses and helps improve the performance of state-of-the-art methods.
We have made the source code publicly available at https://github.com/shyern/Pose-Transfer-pSSIM.git.
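
To make the idea of turning SSIM into a part-aware loss concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the exact PSSIM formulation, so this sketch assumes the simplest reading: compute SSIM from the global statistics of the pixels inside each body-part region (rather than over sliding Gaussian windows, as standard SSIM does), average the per-part scores, and use one minus that average as the loss. The names `ssim_global`, `part_ssim_loss`, and `part_labels` are hypothetical, images are flattened grayscale pixel lists in the 0-255 range, and the constants are the usual SSIM stabilizers.

```python
from typing import Sequence

# Standard SSIM stabilising constants for 8-bit images (dynamic range 255).
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def ssim_global(x: Sequence[float], y: Sequence[float]) -> float:
    """SSIM computed from global statistics of two equal-length pixel lists.

    Simplification: standard SSIM averages this quantity over local windows;
    here a single window covers all the pixels passed in.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and of equal length")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + C1) * (2 * cov + C2)
    denominator = (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    return numerator / denominator


def part_ssim_loss(
    generated: Sequence[float],
    target: Sequence[float],
    part_labels: Sequence[int],
) -> float:
    """One minus the mean per-part SSIM (assumed form, not the paper's).

    part_labels[i] is a hypothetical body-part id for pixel i, e.g. derived
    from a pose estimate or a human-parsing map.
    """
    if not (len(generated) == len(target) == len(part_labels)):
        raise ValueError("all inputs must have the same length")
    scores = []
    for part in sorted(set(part_labels)):
        gen_px = [g for g, lab in zip(generated, part_labels) if lab == part]
        tgt_px = [t for t, lab in zip(target, part_labels) if lab == part]
        scores.append(ssim_global(gen_px, tgt_px))
    return 1.0 - sum(scores) / len(scores)


# Toy example: six pixels split into two "parts" of three pixels each.
labels = [0, 0, 0, 1, 1, 1]
target_img = [30, 20, 10, 200, 210, 220]
generated_img = [10, 20, 30, 200, 210, 220]  # part 0 is mirrored, part 1 matches
loss_value = part_ssim_loss(generated_img, target_img, labels)
```

In this toy example only part 0 differs from the target, so the loss is driven entirely by that region, which is the motivation for scoring parts separately. A trainable version would need a differentiable framework and windowed statistics; the released repository linked above is the reference for the actual method.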

Keywords: Person image generation, Loss function analysis, Structure similarity index

Review history: Received 4 March 2021, Revised 21 July 2021, Accepted 24 September 2021, Available online 4 October 2021, Version of Record 11 October 2021.

Official paper URL: https://doi.org/10.1016/j.patcog.2021.108351