Loss functions for pose guided person image generation

Abstract

Pose guided person image generation aims to transform a source person image to a target pose. It is an ill-posed problem, as we often need to generate pixels that are invisible in the source image. Recent works focus on designing new deep neural network architectures and show promising performance. However, they simply adopt loss functions widely used in generic image generation tasks, e.g., adversarial loss, L1-norm loss, perceptual loss, and style loss, which fail to consider the unique structural patterns of a person. In addition, it remains unclear how each individual loss and their combinations impact the generated person images. The goal of this paper is to present a comprehensive study of loss functions for pose guided person image generation. After revisiting these generic loss functions, we consider the structural similarity (SSIM) index as a loss function, since it is widely used as an evaluation metric and can capture the perceptual quality of generated images. In addition, motivated by the observation that a person can be divided into part regions with homogeneous pixel values or texture, we extend the SSIM loss into a novel Part-based SSIM (PSSIM) loss to explicitly account for the articulated body structure. A new PSSIM metric then follows naturally for assessing the quality of generated person images. To investigate loss functions in depth, we conduct extensive experiments including single-loss analysis, multi-loss combination analysis, optimal loss combination search, and comparison with state-of-the-art methods. Both quantitative and qualitative results indicate that (1) using different loss functions significantly impacts the generated person images, (2) the combination of adversarial loss, perceptual loss, and PSSIM loss is the optimal choice for person image generation, and (3) the proposed PSSIM loss is complementary to prior losses and helps improve the performance of state-of-the-art methods.
We have made the source code publicly available at https://github.com/shyern/Pose-Transfer-pSSIM.git.
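
To make the idea of a part-based SSIM loss concrete, the following is a minimal NumPy sketch, not the paper's implementation. It assumes images scaled to [0, 1] and one boolean mask per body part; the function names `region_ssim` and `part_ssim_loss` are hypothetical. It also simplifies in two ways that are assumptions of this sketch: SSIM is computed from the statistics of all pixels in a region rather than from local sliding windows, and part scores are averaged with equal weight. The exact PSSIM definition used in the paper may differ.

```python
import numpy as np

# Standard SSIM stabilising constants for a dynamic range of 1.0.
C1 = (0.01 * 1.0) ** 2
C2 = (0.03 * 1.0) ** 2


def region_ssim(x, y):
    """SSIM from the statistics of two equally sized pixel sets.

    Simplification: uses one set of statistics for the whole region
    instead of the usual local sliding windows.
    """
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
    denominator = (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    return numerator / denominator


def part_ssim_loss(generated, target, part_masks):
    """1 - mean region SSIM over non-empty part masks.

    Hypothetical formulation for illustration; equal weighting of
    parts is an assumption, not taken from the paper.
    """
    scores = [
        region_ssim(generated[mask], target[mask])
        for mask in part_masks
        if mask.any()  # skip parts that are not visible
    ]
    if not scores:
        raise ValueError("no non-empty part masks")
    return 1.0 - float(np.mean(scores))
```

Because SSIM is at most 1, this loss is non-negative and reaches 0 only when every part region of the generated image matches the target. In a training setup it would be written with a differentiable framework and combined with the adversarial and perceptual losses mentioned above.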