Dissertation/Thesis Abstract

Human pose estimation from a single view point
by Siddiqui, Matheen, Ph.D., University of Southern California, 2009, 155; 3368720
Abstract (Summary)

We address the estimation of human poses from a single view point in images and image sequences. This is an important problem with a range of applications in human-computer interaction, security and surveillance monitoring, image understanding, and motion capture. In this work we develop methods that make use of single-view cameras, stereo cameras, and range sensors.

First, we develop a 2D limb tracking scheme for color images using skin color and edge information. Multiple 2D limb models are used to enhance tracking of the underlying 3D structure. These include models for lateral forearm views (waving) as well as for pointing gestures.

In our color image pose tracking framework, we find candidate 2D articulated model configurations by searching for locally optimal configurations under a weak but computationally manageable fitness function. By parameterizing 2D poses by their joint locations organized in a tree structure, candidates can be efficiently and exhaustively localized in a bottom-up manner. We then adapt this algorithm for use on sequences and develop methods to automatically construct a fitness function from annotated image data.
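The bottom-up search over a tree of joints can be illustrated with a minimal sketch. It is not the method of the dissertation: it uses plain min-sum dynamic programming to recover one best configuration rather than enumerating locally optimal candidates, and the function name best_tree_pose, the joint names, the candidate locations, and the limb-length cost are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]


def best_tree_pose(
    children: Dict[str, List[str]],
    root: str,
    candidates: Dict[str, List[Point]],
    unary: Callable[[str, Point], float],
    pairwise: Callable[[str, Point, str, Point], float],
) -> Tuple[float, Dict[str, Point]]:
    """Min-sum dynamic programming over a tree of joints (illustrative only)."""
    # Visit joints so that every child is processed before its parent.
    order: List[str] = []
    stack = [root]
    while stack:
        joint = stack.pop()
        order.append(joint)
        stack.extend(children.get(joint, []))
    order.reverse()

    cost: Dict[str, Dict[Point, float]] = {}   # best subtree cost per joint location
    choice: Dict[Tuple[str, Point], Point] = {}  # best child location per parent location
    for joint in order:
        cost[joint] = {}
        for loc in candidates[joint]:
            total = unary(joint, loc)
            for child in children.get(joint, []):
                best = min(
                    candidates[child],
                    key=lambda c: pairwise(joint, loc, child, c) + cost[child][c],
                )
                choice[(child, loc)] = best
                total += pairwise(joint, loc, child, best) + cost[child][best]
            cost[joint][loc] = total

    # Pick the best root location, then read the children back top-down.
    root_loc = min(candidates[root], key=lambda loc: cost[root][loc])
    pose = {root: root_loc}
    stack = [root]
    while stack:
        joint = stack.pop()
        for child in children.get(joint, []):
            pose[child] = choice[(child, pose[joint])]
            stack.append(child)
    return cost[root][root_loc], pose


# Hypothetical three-joint arm with limbs that should be about 10 pixels long.
children = {"shoulder": ["elbow"], "elbow": ["wrist"]}
candidates = {
    "shoulder": [(0.0, 0.0)],
    "elbow": [(0.0, 10.0), (0.0, 3.0)],
    "wrist": [(0.0, 20.0), (0.0, 5.0)],
}


def limb_length_cost(parent: str, p: Point, child: str, c: Point) -> float:
    return abs(math.hypot(p[0] - c[0], p[1] - c[1]) - 10.0)


total_cost, pose = best_tree_pose(
    children, "shoulder", candidates, lambda joint, loc: 0.0, limb_length_cost
)
```

Because each joint's table is filled once from its children's tables, the work grows with the number of parent-child location pairs rather than with the number of full-body configurations.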

With a stereo camera, we use depth data to track the movement of a user using an articulated upper body model. We define an objective function that evaluates the saliency of this upper body model with a stereo depth image and track the arms of a user by numerically maintaining the optimum using an annealed particle filter.
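As a rough illustration of annealed particle filtering, the sketch below runs the annealing layers for a single frame. It assumes a toy two-parameter state and a quadratic energy in place of the depth-based objective; the function name annealed_layers, the schedule betas, and the noise settings are assumptions made for this example, not values from the dissertation.

```python
import math
import random
from typing import Callable, List, Sequence


def annealed_layers(
    particles: List[List[float]],
    energy: Callable[[Sequence[float]], float],
    rng: random.Random,
    betas: Sequence[float] = (0.5, 1.0, 2.0, 4.0),
    noise: float = 0.5,
    shrink: float = 0.5,
) -> List[List[float]]:
    """Run the annealing layers of one time step (illustrative only)."""
    for beta in betas:
        energies = [energy(p) for p in particles]
        e_min = min(energies)
        # Larger beta sharpens the weighting around low-energy particles.
        # Subtracting e_min keeps the best weight at 1 and avoids underflow.
        weights = [math.exp(-beta * (e - e_min)) for e in energies]
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=len(particles))
        # Diffuse with Gaussian noise that shrinks from layer to layer.
        particles = [[x + rng.gauss(0.0, noise) for x in p] for p in particles]
        noise *= shrink
    return particles


# Hypothetical objective: squared distance to a known optimum.
rng = random.Random(0)
target = (2.0, -1.0)


def energy(p: Sequence[float]) -> float:
    return (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2


start = [[rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)] for _ in range(500)]
final = annealed_layers(start, energy, rng)
mean = [sum(p[i] for p in final) / len(final) for i in range(2)]
```

In a tracker, the particles returned for one frame would seed the next frame, so the set keeps following the optimum as the subject moves.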

With range sensors, we use a data-driven Markov chain Monte Carlo (DDMCMC) approach to find an optimal pose based on a likelihood that compares synthesized and observed depth images. To speed up convergence of this search, we make use of bottom-up detectors that generate candidate part locations. Our Markov chain dynamics explore solutions about these parts and thus combine bottom-up and top-down processing. The current implementation runs at 10 fps, and we provide a quantitative performance evaluation using hand-annotated data. We demonstrate significant improvement over a baseline iterative closest point (ICP) approach. This algorithm is then adapted to estimate the specific shape parameters of subjects for use in tracking.
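The way detector output can steer a Markov chain is sketched below on a toy problem. This is an assumption-laden illustration, not the sampler of the dissertation: the pose is a single scalar, the log-likelihood is a hypothetical stand-in for the depth-image comparison, and the names data_driven_mh, p_jump, walk_std, and det_std are invented for the example. Proposals mix a local random walk with jumps to detector candidates, and the Metropolis-Hastings acceptance test uses the full mixture density so that the jumps are accounted for.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def normal_pdf(x: float, mean: float, std: float) -> float:
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def data_driven_mh(
    log_likelihood: Callable[[float], float],
    detections: Sequence[float],
    x0: float,
    n_steps: int,
    rng: random.Random,
    p_jump: float = 0.3,
    walk_std: float = 0.2,
    det_std: float = 0.3,
) -> Tuple[float, List[float]]:
    """Metropolis-Hastings with a random-walk plus detector-jump proposal."""

    def proposal_density(to: float, frm: float) -> float:
        walk = normal_pdf(to, frm, walk_std)
        jump = sum(normal_pdf(to, d, det_std) for d in detections) / len(detections)
        # The tiny floor keeps the logarithm below defined after underflow.
        return (1.0 - p_jump) * walk + p_jump * jump + 1e-300

    x, lx = x0, log_likelihood(x0)
    best, l_best = x, lx
    samples = [x]
    for _ in range(n_steps):
        if rng.random() < p_jump:
            # Bottom-up move: jump near a candidate supplied by a detector.
            prop = rng.gauss(rng.choice(list(detections)), det_std)
        else:
            # Top-down move: small local perturbation of the current state.
            prop = rng.gauss(x, walk_std)
        lp = log_likelihood(prop)
        log_accept = (
            lp - lx
            + math.log(proposal_density(x, prop))
            - math.log(proposal_density(prop, x))
        )
        if rng.random() < math.exp(min(0.0, log_accept)):
            x, lx = prop, lp
            if lx > l_best:
                best, l_best = x, lx
        samples.append(x)
    return best, samples


# Hypothetical setup: the true optimum is 3.0; one detection is near it, one is spurious.
def toy_log_likelihood(x: float) -> float:
    return -0.5 * ((x - 3.0) / 0.2) ** 2


best, samples = data_driven_mh(
    toy_log_likelihood, detections=[2.9, -4.0], x0=0.0, n_steps=2000,
    rng=random.Random(0),
)
```

Starting far from the optimum, a pure random walk with this step size would need many steps to arrive, whereas a single accepted jump to the good detection places the chain close to it; the spurious detection is simply rejected by the likelihood.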

Indexing (document details)
Advisor: Medioni, Gerard
Committee: Gratch, Jonathan; Kuo, C.-C. Jay
School: University of Southern California
Department: Computer Science
School Location: United States -- California
Source: DAI-B 70/08, Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: Articulated motion, Human pose estimation, Range sensors
Publication Number: 3368720
ISBN: 9781109295160