1
Articulated Pose Estimation in a Learned Smooth
Space of Feasible Solutions
  • Taipeng Tian, Rui Li and Stan Sclaroff
  • Computer Science Dept.
  • Boston University

2
Introduction
  • Motivating application
  • Gesture Recognition
  • Fixed Gesture Lexicon.
  • For example

Aircraft Signaler hand gestures
Basketball Referee Hand Signals
Traffic Controller Hand Signals
3
Problem Definition
Input (observation): a silhouette (represented using Alt moments)
Output (via pose estimation): 2D projected marker positions
4
Related Work: Pose Estimation from a Single Image
  • Geometry Based
  • Taylor, CVIU 01
  • Barron and Kakadiaris, IVC 01
  • Parameswaran and Chellappa, CVPR 04
  • Learning Based
  • Rosales and Sclaroff, HUMO 00
  • Agarwal and Triggs, CVPR 04
  • Others
  • Lee and Cohen, CVPR 04
  • Shakhnarovich, Viola and Darrell, ICCV 03
  • Mori, Ren, Efros and Malik, CVPR 04
  • Many more

5
Idea 1: Learning Mappings (image features → pose)
  • Specialized Mapping Architecture (SMA), Rosales and Sclaroff, NIPS 01
  • Relevance Vector Regression, Agarwal and Triggs, CVPR 04
6
Idea 1: Learning Mappings (image features → pose)
  • Specialized Mapping Architecture (SMA), Rosales and Sclaroff, NIPS 01
  • Relevance Vector Regression, Agarwal and Triggs, CVPR 04
7
Idea 2: Exploring the Solution Space
  • Simulated Annealing, Deutscher et al., CVPR 00
  • Markov Chain Monte Carlo (MCMC), Lee and Cohen, CVPR 04
  • etc.

8
Idea 2: Exploring the Solution Space
  • Simulated Annealing, Deutscher et al., CVPR 00
  • Markov Chain Monte Carlo (MCMC), Lee and Cohen, CVPR 04
  • etc.
  • These methods use an accurate body model, typically with a high number of DOF.
  • They explore the pose space for a solution consistent with the observations.
  • This is difficult for high DOF.
  • Computationally intensive.

9
Key Observations
  • We have a constrained set of poses.
  • Not necessary to explore the full parameter
    space.
  • Combine the two ideas:
  • Learn mappings
  • Explore a constrained space (i.e., a learned model of body poses)

Aircraft Signaler hand gestures
Basketball Referee Hand Signals
Traffic Controller Hand Signals
10
Overview of Framework
Learning Phase: from training data Y, learn a model of human body poses in a latent space X, and learn the rendering function F(.).
11
Learning a Model of Human Poses
  • The Gaussian Process Latent Variable Model (GPLVM), Neil Lawrence, NIPS 04, is used.
  • GPLVM was originally used for visualizing high-dimensional data.
  • Grochow et al. (SIGGRAPH 03) use it to solve the inverse kinematics problem for human motion animation.
  • Here we use it for automated articulated body pose inference.

12
Gaussian Process Latent Variable Model(GPLVM)
Overview
A probabilistic mapping between the higher-dimensional data space and a lower-dimensional latent space.
13
GPLVM Training: Learning a Model of Body Poses
  • Given a training set of 2D projected marker positions y_i (each y_i is D-dimensional).
  • Goal: learn the parameters, namely the latent variable value x_i corresponding to each training data point, and the variables related to the kernel.
14
Kernel Function
  • Also known as a covariance function.
  • Measures the similarity of the latent variables x and x'.
  • For a data set of size N, we form an N-by-N kernel matrix K, in which K_{i,j} = k(x_i, x_j). (A reconstruction of the kernel is given below.)
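The kernel itself appears only as an image in the original deck; a hedged reconstruction of the RBF-plus-noise kernel standard in GPLVM work (following Lawrence), with one hyperparameter for each role listed on the backup kernel slide, is

k(x, x') = \theta_1 \exp\!\left(-\tfrac{\theta_2}{2}\,\|x - x'\|^2\right) + \theta_3\,\delta_{x,x'}

where \theta_1 sets how correlated x and x' are in general, \theta_2 the spread of the function, and \theta_3 the noise in the prediction.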

15
GPLVM Training: Learning a Model of Body Poses
  • For a single dimension, the likelihood of y given the Gaussian Process (GP) model parameters is a zero-mean Gaussian with covariance K.
  • The joint likelihood for D dimensions is the product over the dimensions (a reconstruction is given below).
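The likelihood equations were images in the original slides; a reconstruction of the standard GPLVM formulation (following Lawrence, NIPS 04), with Y the N-by-D matrix of training poses and Y_{:,d} its d-th column, is

p(Y_{:,d} \mid X, \theta) = \mathcal{N}(Y_{:,d} \mid 0, K)

p(Y \mid X, \theta) = \prod_{d=1}^{D} \mathcal{N}(Y_{:,d} \mid 0, K) = \frac{1}{(2\pi)^{ND/2}\,|K|^{D/2}} \exp\!\left(-\tfrac{1}{2}\,\operatorname{tr}\!\left(K^{-1} Y Y^{\top}\right)\right)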

16
  • To learn the GPLVM from the training set {y_i}, we maximize the posterior over the latent positions and the kernel parameters.
  • Taking the negative log and placing priors on the latent variables and kernel parameters gives the training objective (a reconstruction is given below).
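A reconstruction of the resulting negative log posterior (again following Lawrence's formulation, assuming a unit Gaussian prior on each latent point and priors p(\theta_k) \propto 1/\theta_k on the kernel hyperparameters), up to an additive constant:

-\ln p(X, \theta \mid Y) \;\simeq\; \frac{D}{2}\ln|K| + \frac{1}{2}\operatorname{tr}\!\left(K^{-1} Y Y^{\top}\right) + \frac{1}{2}\sum_{i}\|x_i\|^2 + \sum_{k}\ln\theta_k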
17
  • To learn the GPLVM from the training set {y_i}, we maximize the same posterior as above (equivalently, minimize its negative log).
  • This is computationally intensive, so a subset of the training poses is chosen to compute the kernel matrix. This subset is called the Active Set.
18
  • For a new pair (x, y), we can predict using the GP predictive distribution (a reconstruction is given below).
  • This equation can be used to solve for x given y, or for y given x, via gradient descent.
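The prediction equation is an image in the deck; a hedged reconstruction of the standard GP prediction used in GPLVM-based pose work (as in Grochow et al.), assuming zero-mean training outputs, is

f(x) = Y^{\top} K^{-1} k(x), \qquad \sigma^2(x) = k(x, x) - k(x)^{\top} K^{-1} k(x)

where k(x) = [k(x, x_1), \ldots, k(x, x_N)]^{\top}. The corresponding negative log likelihood of a pair, often written L(x, y), is

L(x, y) = \frac{\|y - f(x)\|^2}{2\sigma^2(x)} + \frac{D}{2}\ln\sigma^2(x) + \frac{1}{2}\|x\|^2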

19
GPLVM
20
GPLVM
21
GPLVM
Silhouettes with the left hand raised tend to be clustered together.
22
GPLVM
The embedding does not always do a good job.
23
About GPLVM
  • Allows mapping to and from the lower-dimensional space.
  • Allows a smooth parameterization (i.e., allows derivatives) in the lower-dimensional space.
  • Two dimensions work well for our data set (Grochow et al. use 2-5).

24
Learning the Forward/Rendering Function
Input: 2D pose. Output: silhouettes, represented using Alt moments. Learned similarly to Rosales and Sclaroff. (An illustrative sketch is given below.)
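The slides do not say how F(.) is fitted; as an illustrative stand-in (not the SMA-style learner of Rosales and Sclaroff, and with assumed array names), any multi-output regressor from 2D marker positions to Alt-moment descriptors would do, for example:

```python
# Illustrative sketch only: learn a rendering function F(.) that maps
# flattened 2D marker positions to silhouette moment descriptors.
# `poses_2d` (N x 2M) and `moments` (N x K) are assumed to come from the
# synthesized (silhouette, pose) pairs described later in the deck.
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_rendering_function(poses_2d: np.ndarray, moments: np.ndarray):
    """Fit F: 2D pose -> silhouette moments on the synthesized training pairs."""
    model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000)
    model.fit(poses_2d, moments)
    return model.predict  # usable as F(.) inside the pose-inference objective
```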
25
Overview of Framework
Learning Phase: from training data Y, learn a model of human body poses in a latent space X, and learn the rendering function F(.).
26
Pose Inference
The typical regularization approach (also used by Agarwal and Triggs): minimize a data term plus a regularization term over the pose y.
27
Data Term
Data term: the forward (rendering) function is applied to the 2D projected marker positions and compared against the observed silhouette (Alt moments).
28
Regularization Term
The regularization term is independent of the observed feature s.
We replace it with a prior knowledge term, i.e., the learned model of poses (a schematic form of the full objective is given below).
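Putting the data term and the prior term together (a schematic reconstruction based on the slides and the post-talk comments, not a verbatim copy of the paper's equations): the typical Tikhonov-regularized objective and the modified objective with the learned prior are, respectively,

E(y) = \|F(y) - s\|^2 + \lambda \|Dy\|^2

E(x, y) = \|F(y) - s\|^2 + \lambda\, L(x, y)

where s is the observed silhouette descriptor (Alt moments), F(.) is the learned rendering function, and L(x, y) is the negative log likelihood of the pose and latent point under the learned GPLVM.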
29
Pose Inference
Initialization also needs to be addressed.
The solution is obtained using conjugate gradient descent, with initialization from the Active Set (a sketch of the estimation loop is given below).
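A minimal sketch of this estimation phase, assuming already-learned components passed in as functions (`render` for F(.), `gplvm_nll` for L(x, y), and `predict_pose` for the GP mean prediction from a latent point); these names and the exact objective are assumptions, not the authors' code:

```python
# Sketch of pose inference: minimize a data term plus the learned-prior term
# over the latent point x, restarting conjugate-gradient descent from each
# active-set latent point and keeping the best solution found.
import numpy as np
from scipy.optimize import minimize

def objective(x, s, render, predict_pose, gplvm_nll, lam=1.0):
    """||F(y) - s||^2 + lam * L(x, y), with y the pose implied by x."""
    y = predict_pose(x)                        # pose predicted from latent point x
    data_term = np.sum((render(y) - s) ** 2)   # discrepancy with observed moments s
    return data_term + lam * gplvm_nll(x, y)

def infer_pose(s, active_set_latents, render, predict_pose, gplvm_nll):
    best = None
    for x0 in active_set_latents:              # multiple restarts from the Active Set
        res = minimize(objective, x0,
                       args=(s, render, predict_pose, gplvm_nll), method="CG")
        if best is None or res.fun < best.fun:
            best = res
    return predict_pose(best.x), best.x        # estimated 2D pose and latent point
```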
30
Data Collection
3D Pose
  • 12 gestures in the flight director lexicon.
  • Synthesize 6000 (silhouette, pose) pairs using Poser.
  • 3000 for training (male model).
  • 3000 for testing (female model).

Synthesized silhouettes are sampled uniformly over the frontal view-sphere.
31
Experiments (Synthetic Data)
(a) Silhouette images generated by Poser 5 (test set)
(b) Estimation from SMA (Specialized Mapping Architecture)
(c) Our approach
(d) Ground truth
32
Comparison with SMA
33
Additional Constraints
Additional constraints can be added to achieve a more accurate estimate, e.g. temporal consistency (one plausible form is sketched below).
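The slide does not show the constraint itself; one plausible form (an assumption, not taken verbatim from the paper) is an extra penalty tying the latent points of consecutive frames together,

E_{\text{temporal}} = \|x_t - x_{t-1}\|^2

added to the objective when estimating frame t.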
34
Experiments (Real Data)
(a) Silhouette images of a real person
(b) SMA (Specialized Mapping Architecture)
(c) Our Approach (Without Temporal Consistency)
(d) Our Approach (With Temporal Consistency)
35
Experiments (Real Data)
(a) Silhouette images of a real person
(b) SMA (Specialized Mapping Architecture)
(c) Our Approach (Without Temporal Consistency)
(d) Our Approach (With Temporal Consistency)
36
Conclusion
  • Proposed a novel method for pose estimation for a pre-defined gesture lexicon.
  • It is interesting to note that two dimensions are enough in our case.
  • The technique is fast (about 0.1 sec per frame in Matlab).
  • Tracking as an extension (video).

37
Thank You
38
Comments after the talk
  • Related Work
  • Bullets / summary of strengths vs. weaknesses
  • Why do we need this work?
  • Include the year of publication for the related work (e.g. the Rosales and Sclaroff work is not mentioned, the Sminchisescu work is not mentioned)
  • Order the related work temporally?
  • Include an introduction slide and motivating
    slide
  • How to motivate this work?
  • "The state of the art is such-and-such; we found this common weakness, so we propose this work."
  • Human pose is not mentioned in the intro.
  • At the end of the talk, say why to use this work over the others.
  • Why GPLVM and not other dimensionality reduction techniques, like LLE/PCA/ISOMAP, etc.?
  • Give a top overview of the algorithm. A flow
    chart view?
  • Explain the L(x,y) mapping using an illustration, like a mapping between two planes. Clearly say which is the high-dimensional y and which is the low-dimensional x.
  • Give a reference for GPLVM or a website link.
  • Add a slide on the math of GPLVM.
  • The Tikhonov regularization approach of minimizing ||phi(y) - s|| plus a regularization term. Usually the regularization term is Dx, but here we chose L(x,y). Explain why.
  • Slide to talk about temporal constraint.
  • Why learn the rendering function? I.e., because we want to take the derivative.
  • Give the numbers for the training set; this gives an idea of how good the quantitative results are.

39
Related Work
  • Model Based
  • Simulated Annealing, Deutscher et al., CVPR 00
  • Kinematic Jump Processes, Sminchisescu and Triggs, CVPR 03
  • Markov Chain Monte Carlo (MCMC), Lee and Cohen, CVPR 04
  • etc.
  • Learning Based
  • Specialized Mapping Architecture (SMA), Rosales and Sclaroff, NIPS 01
  • Relevance Vector Regression, Agarwal and Triggs, CVPR 04
  • Parameter Sensitive Hashing, Shakhnarovich et al., ICCV 03
  • etc.

40
  • To learn the GPLVM from the training set {y_i}, we maximize the posterior (equivalently, minimize its negative log).
41
Overview of Framework (Learning Phase)
Learn a model of human body poses (using GPLVM).
Learn the rendering function F(.).
42
Overview of Framework (Estimation Phase)
Search over the learned model of human body poses for a solution consistent with the observation.
43
Kernel Function
  • Measures the similarity of the latent variables x and x'.
  • For a data set of size N, we can form an N-by-N kernel matrix K, in which K_{i,j} = k(x_i, x_j).
  • The kernel hyperparameters control: how correlated x and x' are in general, the spread of the function, and the noise in the prediction.
44
GPLVM Training: Learning a Model of Body Poses
  • To learn the parameters of the GPLVM from the training set {y_i}, we maximize the posterior, placing priors on the latent variables and the kernel parameters.
45
Gaussian Process Latent Variable Model(GPLVM)
The space of feasible poses has a representation in the original (high-dimensional) space and a low-dimensional parameterization; the model expresses how well the two values match.
46
  • For a new pair (x, y), we can predict using the GP predictive distribution (as on slide 18).