Multibody Grouping from Motion Images | 数据学习(DataLearner)

摘要

We want to deduce, from a sequence of noisy two-dimensional images of a scene of several rigid bodies moving independently in three dimensions, the number of bodies and the grouping of given feature points in the images to the bodies. Prior processing is assumed to have identified features or points common to all frames and the images are assumed to be created by orthographic projection (i.e., perspective effects are minimal). We describe a computationally inexpensive algorithm that can determine which points or features belong to which rigid body using the fact that, with exact observations in orthographic projection, points on a single body lie in a three or less dimensional linear manifold of frame space. If there are enough observations and independent motions, these manifolds can be viewed as a set linearly independent, four or less dimensional subspaces. We show that the row echelon canonical form provides direct information on the grouping of points to these subspaces. Treatment of the noise is the most difficult part of the problem. This paper uses a statistical approach to estimate the grouping of points to subspaces in the presence of noise by computing which partition has the maximum likelihood. The input data is assumed to be contaminated with independent Gaussian noise. The algorithm can base its estimates on a user-supplied standard deviation of the noise, or it can estimate the noise from the data. The algorithm can also be used to estimate the probability of a user-specified partition so that the hypothesis can be combined with others using Bayesian statistics.