Caltech
Center for Neuromorphic Systems Engineering


Human Motion Detection and Classification
Claudio Fanti, Pietro Perona

Abstract. We foresee a future in which machines interact autonomously with humans in their surrounding environment. Very good results have already been achieved in detecting the presence of humans and labeling their body parts with algorithms based on graphical models. Such systems must unavoidably deal with uncertainty and reason in the absence of complete information. To that end, we explore and extend the state of the art in probabilistic inference and sampling techniques, with machine understanding of human actions as the primary application.

Motivation. Machines increasingly need to interact autonomously with humans in their surrounding environment. Detecting and interpreting human presence, actions and activities is one of the most valuable functions of our own visual system. Endowing machines with the same ability would enable a great number of useful industrial applications, ranging from convenient non-contact user interfaces for consumer products to on-board safety systems for automobiles and surveillance systems for stores and museums.

In order to interpret human activities, a system must first be able to detect human presence. Furthermore, it is fundamental to localize the visible parts of the body and to label the corresponding regions of the image. Once a labeling is achieved, the different parts of the body can be tracked over time, and their trajectories and/or spatiotemporal energy patterns can be used to classify actions and activities.

So far, we have focused primarily on detection and labeling, restricting ourselves to a generalization of the specific setting known as the "Johansson problem". More precisely, the positions and velocities of point features are input to a system that decides whether human motion is present. The system also assigns probabilistic labels to the detected features. The method performs very well on both artificial and real image sequences. We also address the problem of unsupervised learning of the model structure.
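To make the input and output of such a system concrete, here is a minimal sketch of per-feature soft labeling. It is an illustration under stated assumptions, not the method of this project: each point feature is taken to be a tuple (x, y, vx, vy), each body part is given a hypothetical axis-aligned Gaussian model, clutter is given a constant hypothetical log-density, and features are scored independently of one another, so the dependencies between body parts are ignored. The names `gaussian_logpdf`, `soft_labels` and `part_models` are invented for this example.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of an axis-aligned Gaussian; x, mean and var have equal length."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def soft_labels(feature, part_models, background_logpdf):
    """Posterior over labels for one feature (x, y, vx, vy).

    part_models maps a part name to (mean, var). A uniform prior over all
    labels, including "background", is assumed for simplicity.
    """
    scores = {
        name: gaussian_logpdf(feature, mean, var)
        for name, (mean, var) in part_models.items()
    }
    scores["background"] = background_logpdf
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical two-part model, for illustration only.
part_models = {
    "head": ((0.0, 1.0, 0.0, 0.0), (0.1, 0.1, 0.1, 0.1)),
    "foot": ((0.0, -1.0, 0.5, 0.0), (0.1, 0.1, 0.1, 0.1)),
}
posterior = soft_labels((0.0, 1.0, 0.0, 0.0), part_models, background_logpdf=-10.0)
```

A detection decision could then be sketched as a comparison between the part scores and the background score, but the threshold and the background model would both be further assumptions.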

Research. Our investigation of graphical models and probabilistic inference has led to a powerful scheme that learns a probabilistic model of the human body, describing the correlations between the random variables that represent the position and motion of each body part. To achieve invariance with respect to translation, we express the data relative to the body's center of gravity, which is treated as a hidden variable in a variant of the EM algorithm. A message-passing algorithm determines the labeling based on the potentials of the clique graph (or tree). We conducted experiments on both artificial data and motion-captured image sequences. The results show that we can label the body with very high accuracy even when a substantial amount of noise is present.
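The labeling step can be illustrated with a small max-product (Viterbi-style) message-passing sketch on a chain, the simplest kind of tree. This is a simplified stand-in, not the scheme described above: instead of treating the center of gravity as a hidden variable inside EM, it obtains translation invariance by scoring only the displacement between adjacent parts, and it uses hypothetical isotropic Gaussian pairwise potentials with known mean offsets. The function names and the `offsets` parameter are assumptions made for this example.

```python
import math


def pair_logpot(p, q, mean_offset, var):
    """Log-potential for a part at q given its chain neighbour at p.

    Isotropic Gaussian on the displacement q - p, so adding the same
    translation to every point leaves the value unchanged.
    """
    dx = q[0] - p[0] - mean_offset[0]
    dy = q[1] - p[1] - mean_offset[1]
    return -0.5 * (dx * dx + dy * dy) / var - math.log(2 * math.pi * var)


def label_chain(points, offsets, var=1.0):
    """Max-product labeling of a chain of len(offsets) + 1 parts.

    Returns (best_log_score, assignment), where assignment[k] is the index
    of the point chosen for part k. Distinct parts are not forced to take
    distinct points; the pairwise potentials merely discourage it.
    """
    n = len(points)
    msg = [0.0] * n  # msg[j]: best score of the parts so far, last part at j
    back = []        # back[k-1][j]: best predecessor index for part k at j
    for offset in offsets:
        new_msg, ptr = [], []
        for j in range(n):
            cands = [
                msg[i] + pair_logpot(points[i], points[j], offset, var)
                for i in range(n)
            ]
            best_i = max(range(n), key=cands.__getitem__)
            new_msg.append(cands[best_i])
            ptr.append(best_i)
        msg = new_msg
        back.append(ptr)
    # Backtrack from the best final state to recover the full assignment.
    last = max(range(n), key=msg.__getitem__)
    assignment = [last]
    for ptr in reversed(back):
        assignment.append(ptr[assignment[-1]])
    assignment.reverse()
    return msg[last], assignment


# Hypothetical three-part chain (head, torso, hip) plus one clutter point.
points = [(5.0, 5.0), (0.0, 0.0), (0.0, -1.0), (0.0, -2.0)]
offsets = [(0.0, -1.0), (0.0, -1.0)]
score, assignment = label_chain(points, offsets)
```

Each pass over `offsets` sends one message along the chain, so the cost is quadratic in the number of candidate points per edge rather than exponential in the number of parts. On a general tree the same recursion runs from the leaves to a root.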

Moving one step further, we are considering the problem of describing the dynamics of the body in order to infer, in a probabilistic fashion, the action being observed, by means of hybrid Bayesian networks. Furthermore, in order to work directly on grayscale images, we are investigating ways of incorporating the data association problem directly into the (dynamic) probabilistic model of the human body.

References
Fanti C, Polito M and Perona P - "An Improved Scheme for Detection and Labeling in Johansson's Displays" - Submitted to NIPS 2003

 

