A dual-source approach for 3D human pose estimation from single images

Authors:

Highlights:

Abstract

In this work, we address the challenging problem of 3D human pose estimation from single images. Recent approaches train deep neural networks to regress 3D pose directly from images. One major challenge for such methods, however, is the collection of large amounts of training data. In particular, collecting a large number of unconstrained images annotated with accurate 3D poses is impractical. We therefore propose to use two independent training sources. The first source consists of accurate 3D motion capture data, and the second source consists of unconstrained images with annotated 2D poses. To incorporate both sources, we propose a dual-source approach that combines 2D pose estimation with efficient 3D pose retrieval. To this end, we first convert the motion capture data into a normalized 2D pose space, and separately learn a 2D pose estimation model from the image data. During inference, we estimate the 2D pose and efficiently retrieve the nearest 3D poses. We then jointly estimate a mapping from the 3D pose space to the image and reconstruct the 3D pose. We provide a comprehensive evaluation of the proposed method and experimentally demonstrate the effectiveness of our approach, even when the skeleton structures of the two sources differ substantially.
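The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the orthographic projection, the mean-centering and unit-norm normalization, the brute-force Euclidean search, and all function names (`normalize_2d`, `project_orthographic`, `build_index`, `retrieve_nearest`) are assumptions made for illustration, and the mocap data is random. The sketch only shows the general idea of projecting 3D poses into a normalized 2D space and looking up the nearest entries for an estimated 2D pose; the joint estimation of the 3D-to-image mapping and the final reconstruction are not covered.

```python
import numpy as np

def normalize_2d(pose_2d):
    """Center a (J, 2) pose on its mean joint and scale it to unit norm.

    This particular normalization is an assumption, not the paper's definition.
    """
    centered = pose_2d - pose_2d.mean(axis=0, keepdims=True)
    scale = np.linalg.norm(centered)
    return centered / scale if scale > 0 else centered

def project_orthographic(pose_3d):
    """Drop the depth axis of a (J, 3) pose (simplifying assumption)."""
    return pose_3d[:, :2]

def build_index(mocap_3d):
    """Convert (N, J, 3) mocap poses into an (N, J*2) matrix of normalized 2D poses."""
    return np.stack(
        [normalize_2d(project_orthographic(p)).ravel() for p in mocap_3d]
    )

def retrieve_nearest(query_2d, index, k=5):
    """Return indices of the k database poses closest to the query in normalized 2D space."""
    q = normalize_2d(query_2d).ravel()
    dists = np.linalg.norm(index - q, axis=1)
    return np.argsort(dists)[:k]

# Synthetic stand-in for a motion capture database: 200 poses, 14 joints each.
rng = np.random.default_rng(0)
mocap_3d = rng.normal(size=(200, 14, 3))
index = build_index(mocap_3d)

# Simulate an estimated 2D pose: pose 42 projected, rescaled, shifted, and slightly noisy.
query = 3.0 * project_orthographic(mocap_3d[42]) + 10.0
query = query + rng.normal(scale=0.01, size=query.shape)

nearest = retrieve_nearest(query, index, k=5)
print(nearest)
```

Because the query is normalized the same way as the database, its scale and image position do not affect the lookup. A real system would likely replace the brute-force search with an approximate nearest-neighbor structure and would need to handle viewpoint variation, which this sketch ignores.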

Keywords:

Review history: Received 20 September 2017, Revised 24 January 2018, Accepted 22 March 2018, Available online 4 April 2018, Version of Record 5 December 2018.

Official paper URL: https://doi.org/10.1016/j.cviu.2018.03.007