Exploiting projective geometry for view-invariant monocular human motion analysis in man-made environments

摘要

Example-based approaches have been very successful for human motion analysis but their accuracy strongly depends on the similarity of the viewpoint in testing and training images. In practice, roof-top cameras are widely used for video surveillance and are usually placed at a significant angle from the floor, which is different from typical training viewpoints. We present a methodology for view-invariant monocular human motion analysis in man-made environments in which we exploit some properties of projective geometry and the presence of numerous easy-to-detect straight lines. We also assume that observed people move on a known ground plane. First, we model body poses and silhouettes using a reduced set of training views. Then, during the online stage, the homography that relates the selected training plane to the input image points is calculated using the dominant 3D directions of the scene, the location on the ground plane and the camera view in both training and testing images. This homographic transformation is used to compensate for the changes in silhouette due to the novel viewpoint. In our experiments, we show that it can be employed in a bottom-up manner to align the input image to the training plane and process it with the corresponding view-based silhouette model, or top-down to project a candidate silhouette and match it in the image. We present qualitative and quantitative results on the CAVIAR dataset using both bottom-up and top-down types of framework and demonstrate the significant improvements of the proposed homographic alignment over a commonly used similarity transform.