Vision-Based Recognition of Continuous Dynamic Hand Gestures

Provided by: yuanx

1
Vision-Based Recognition of Continuous Dynamic
Hand Gestures
  • Yuanxin Zhu
  • Department of Computer Science and Technology
  • Tsinghua University, Beijing, China

2
Outline
  • 1. Introduction
  • 2. Literature Review
  • 3. Interaction Model and Prototype System Design
  • 4. Real-time Segmentation of Hand Gestures
  • 5. Parameterized Image Motion Model and Robust
    Regression
  • 6. Spatio-temporal Appearance Modeling
  • 7. Dynamic Time Warping Based Recognition
  • 8. Experiment Results
  • 9. Summary
  • 10. Future Work

3
1 Introduction
  • Human-computer interaction (HCI) has become an
    increasingly important part of our daily lives.
  • Keyboards and mice are the most popular mode of
    HCI.
  • Virtual reality and wearable computing require
    novel interaction modalities that let users
    interact in the way humans communicate with
    each other.
  • Hand gestures are a natural and intuitive
    communication mode.
  • Other applications: sign language recognition,
    video transmission, and so on.

4
1 Introduction
  • Vision-based recognition of dynamic hand gestures
    is a challenging interdisciplinary project.
  • Hand gestures are diverse, carry multiple
    meanings, and vary in space and time.
  • The human hand is a complex non-rigid object.
  • Computer vision itself is an ill-posed problem.

5
1 Introduction
  • Recognizing continuous dynamic hand gestures
    involves:
  • Design of the gesture command set and interaction
    model.
  • Real-time segmentation of gesture streams.
  • Modeling, analysis, and recognition of gestures.
  • Real-time processing is mandatory for the
    practical use of hand gestures in HCI.

6
2. State of the Art of Hand Gesture
Recognition
  • 2.1 Hand gesture taxonomy and interaction model
  • 2.2 Hand gesture modeling
  • 2.3 Hand gesture Analysis
  • 2.4 Hand gesture recognition techniques

7
2.1 Taxonomy of Gesture for Human-computer
Interaction
Fig.1 A Taxonomy of hand gestures for
Human-computer Interaction. Meaningful gestures
are differentiated from unintentional movements.
Gestures used for manipulation of objects are
separated from the gestures which possess
inherent communicational character. Symbols are
those gestures having a linguistic role. They
symbolize some referential action or are used as
modalizers, often of speech.
8
2.2 Hand Gesture Modeling
  • Fig. 2 Classification of hand gesture models

9
2.2 Hand Gesture Modeling
  • Fig. 3 Representing the same hand posture by
    different hand models. (a) 3-D textured
    volumetric model (b) 3-D wireframe volumetric
    model (c) 3-D skeletal model (d) Binary
    silhouette (e) Contour model.

10
2.3 Gesture Analysis
  • Gesture detection and feature extraction
  • skin-color cue based approaches
  • motion cue based approaches
  • multiple-cue based approaches
  • features include the gray image, binary silhouette,
    moving region, edges, contours, and so on.

11
2.3 Gesture Analysis
  • Recovering gesture model parameters
  • Estimation of 3-D hand/arm model parameters
  • two sets of parameters: angular (joint angles)
    and linear (palm dimensions)
  • the initial parameter estimation
  • the parameter update as the hand gesture evolves
    in time.
  • Estimation of appearance-based model parameters
  • image motion estimation (e.g. optical flow)
  • shape analysis (e.g. computing moments)
  • histogram based feature parameters (e.g. )
  • active contour model.

12
2.4 Gesture Recognition Techniques
  • Fig. 4 Classification of hand gesture
    recognition techniques

13
3.1 Interaction Model
  • Strengths and weaknesses of gesture-based
    interaction
  • Structure of the interaction model
  • users performing gestures follow three steps.
  • suitable feedback
  • apply gesture-based input to appropriate tasks
  • A set of rules for designing the gesture command set.
  • Gestures should be performed intentionally and
    intensively, be easy to learn, be symmetrical ...

14
3.2 A Prototype System: Gesture-controlled
Panoramic Map Browser
  • Fig. 5 Gesture-controlled panoramic map browser.
    (a) System setting (b) User interface.

15
3.3 Gesture Command Set
  • Four translation gesture commands
  • move up (1), move down (2), move left (3), move
    right (4)
  • Six rotation gesture commands
  • yaw right (7), yaw left (8), roll clockwise (9),
    roll counterclockwise (10), pitch down (11),
    pitch up (12)
  • Two other gesture commands
  • zoom in (5), zoom out (6).

16
4 Real-Time Segmentation of Continuous Dynamic
Hand Gestures
  • Goals
  • segment the moving hand from the background.
  • partition gesture streams into meaningful
    sections.
  • Methodology
  • integrating multiple cues: skin color and motion.
  • post-processing (morphological filtering
    techniques).
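
The methodology above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system's actual implementation: the normalized-chromaticity skin ranges, the luminance-difference motion threshold, and all function names are hypothetical choices made for the example, and frames are plain nested lists of (R, G, B) tuples.

```python
def is_skin(rgb, r_range=(0.36, 0.55), g_range=(0.26, 0.36)):
    """Skin test in normalized (r, g) chromaticity; the ranges are illustrative."""
    r, g, b = rgb
    total = r + g + b
    if total == 0:
        return False
    return (r_range[0] <= r / total <= r_range[1]
            and g_range[0] <= g / total <= g_range[1])


def luminance(rgb):
    """Approximate luminance of an (R, G, B) pixel."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def skin_motion_mask(prev_frame, cur_frame, motion_threshold=15.0):
    """True where a pixel is skin-colored AND its brightness changed between frames."""
    return [
        [is_skin(cur) and abs(luminance(cur) - luminance(prev)) > motion_threshold
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev_frame, cur_frame)
    ]


def _neighborhood(mask, y, x):
    """Values in the 3x3 window around (y, x); outside the image counts as False."""
    h, w = len(mask), len(mask[0])
    return [
        mask[j][i] if 0 <= j < h and 0 <= i < w else False
        for j in (y - 1, y, y + 1)
        for i in (x - 1, x, x + 1)
    ]


def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is set."""
    return [[all(_neighborhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighborhood is set."""
    return [[any(_neighborhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def open_mask(mask):
    """Morphological opening (erosion then dilation) removes isolated noise pixels."""
    return dilate(erode(mask))
```

Combining the two cues with a logical AND means a stationary skin-colored region is ignored, while the opening step discards isolated pixels that pass both tests by chance.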

17
Fig. 6 Processing flow chart of real-time
segmentation
18
5. Recovering Image Motion Model Parameters by
Robust Regression
  • 5.1 Parameterized Image Motion Model
  • 5.2 Constructing Objective Function
  • 5.3 Robust Error Norms
  • 5.4 Simultaneous Over Relaxation with
    Continuation Method.
  • 5.5 Multi-resolution Analysis.
  • 5.6 Examples of Experiment Results.

19
5.1 Parameterized Image Motion Models
  • Define
  • Translation Model
  • Affine Model
  • Planar Model
  • For example

20
5.2 Constructing Objective Function
  • Brightness Constancy assumption

Taking the Taylor series expansion, simplifying,
and dropping terms above first order gives the
optical flow constraint equation.
The model parameters are recovered by minimizing
the following objective function
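
The equations for this slide are missing from the transcript. A standard way to write the three steps is given below as a sketch; the symbols are assumed notation: I is image brightness, I_x, I_y, I_t are its partial derivatives, R is the image region being analyzed, a is the vector of motion model parameters, and rho is an error norm with scale sigma.

```latex
\begin{aligned}
& I(x, y, t) = I(x + u,\; y + v,\; t + 1)
  && \text{(brightness constancy)} \\
& I_x u + I_y v + I_t = 0
  && \text{(first-order Taylor expansion)} \\
& E(\mathbf{a}) = \sum_{(x,y) \in R}
  \rho\bigl(I_x\, u(x,y;\mathbf{a}) + I_y\, v(x,y;\mathbf{a}) + I_t,\; \sigma\bigr)
  && \text{(objective function)}
\end{aligned}
```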
21
5.3 Robust Error Norms
  • Quadratic
  • Truncated quadratic
  • Geman-McClure function
  • Lorentzian function
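
The formulas for these norms are not preserved in the transcript. The sketch below uses common textbook parameterizations, which are an assumption and may differ from the slide by a scale convention (for example sigma versus sigma squared in the Geman-McClure denominator). The function names are illustrative.

```python
import math


def quadratic(x):
    """Least-squares norm: grows without bound, so outliers dominate."""
    return x * x


def truncated_quadratic(x, threshold):
    """Quadratic up to |x| = threshold, constant beyond it."""
    return min(x * x, threshold * threshold)


def geman_mcclure(x, sigma):
    """Geman-McClure norm (assumed form): saturates at 1 for large residuals."""
    return x * x / (sigma * sigma + x * x)


def geman_mcclure_influence(x, sigma):
    """Derivative of geman_mcclure with respect to x (the influence function)."""
    s2 = sigma * sigma
    return 2.0 * x * s2 / (s2 + x * x) ** 2


def lorentzian(x, sigma):
    """Lorentzian norm (assumed form): grows only logarithmically."""
    return math.log(1.0 + 0.5 * (x / sigma) ** 2)
```

The influence function of the Geman-McClure norm falls back toward zero as the residual grows, which is why large residuals (outliers) contribute little to the parameter update.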
22
5.3 Robust Error Norms
  • Fig. 7 Geman-McClure function. (a) Geman-McClure
    function (b) Its derivative function.

23
5.4 Simultaneous Over Relaxation with
Continuation Method
  • The iterative updating equation at the (n+1)-th iteration

Where,
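
The update equation and its accompanying definitions are missing from the transcript. In the robust-regression literature, a simultaneous over-relaxation step for one model parameter a_i is commonly written as below. This is a sketch in assumed notation, not necessarily the exact form on the slide: omega is the relaxation parameter, E is the objective function, and T(a_i) is an upper bound on the second derivative of E.

```latex
a_i^{(n+1)} = a_i^{(n)} - \omega \, \frac{1}{T(a_i)} \,
              \frac{\partial E}{\partial a_i},
\qquad 0 < \omega < 2,
\qquad T(a_i) \ge \frac{\partial^2 E}{\partial a_i^2}
```

In a continuation method, the scale sigma of the robust norm typically starts large, so that the objective is nearly convex, and is lowered gradually across iterations.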
24
5.5 Multi-resolution Analysis
Fig. 8 Illustration of multi-resolution
analysis.
25
5.6 Examples of Image Motion Estimation
  • Fig. 9 An example of robust image motion
    regression. (a) and (b) are the 2nd and 3rd
    frames in an image sequence. (c) Inliers and
    outliers identified according to the result of
    the first regression. (d) Segmentation of the
    moving hand. (e) Outliers identified according to
    the result of the second regression. (f) The
    difference image between (a) and (b).

26
5.6 Examples of Image Motion Estimation
  • Fig. 10 Another example of robust image motion
    regression

27
6. Spatio-Temporal Appearance Modeling
  • 6.1. Inter-frame Motion Appearance
  • 6.2. Inner-frame Shape Appearance
  • 6.3. Spatio-temporal Appearance

28
6.1. Inter-frame Motion Appearance
29
6.2. Inner-frame Shape Appearance
30
6.3. Spatio-temporal Appearance
Where,
31
7.1 Dynamic Time Warping
Fig. 11 DTW assumes that the endpoints of the
two patterns have been accurately located and
formulates the pattern matching problem as
finding the optimal path from the start to the
end on a finite grid. The optimal path can be
found efficiently by dynamic programming.
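
The grid search described in the caption can be sketched as a classic dynamic-programming recurrence. This is the textbook form of DTW with the usual three step directions, shown for scalar sequences; the function name and the default absolute-difference cost are choices made for this example.

```python
def dtw_distance(a, b, dist=lambda p, q: abs(p - q)):
    """Cost of the optimal warping path between sequences a and b.

    The path starts at the first elements, ends at the last elements,
    and advances by one step in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in a only
                                     cost[i][j - 1],      # advance in b only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])` is zero, because the second sequence is the first one performed at half speed.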
32
7.2 Modified DTW
  • Our experiments show that traditional DTW is
    not adequate for matching two spatio-temporal
    appearance patterns.
  • Unlike the high sampling rate used in speech
    recognition, the sampling rate in hand gesture
    recognition is usually 10 Hz. Therefore, the
    fluctuation along the time axis of hand gesture
    patterns is much sharper than that of speech
    patterns.
  • A modified DTW algorithm, a kind of non-linear
    re-sampling technique, is developed to
    dynamically warp each spatio-temporal pattern to
    a fixed temporal length while preserving the
    necessary temporal information and spatial
    distribution of the original patterns.
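
The slides do not give the details of the modified DTW, so the following is not the author's algorithm. It is only a sketch of one plausible kind of non-linear re-sampling: a pattern (a list of feature vectors) is re-sampled to a fixed length at points spaced evenly along its cumulative feature change rather than evenly in time. The function name and the choice of Euclidean distance are assumptions.

```python
import math


def resample_to_length(pattern, length):
    """Re-sample a list of feature vectors to exactly `length` vectors.

    Sample points are spaced uniformly along the cumulative Euclidean
    change between consecutive frames, with linear interpolation.
    """
    if length < 2 or len(pattern) < 2:
        raise ValueError("need at least two frames and length >= 2")
    # Cumulative path length travelled in feature space up to each frame.
    cum = [0.0]
    for prev, cur in zip(pattern, pattern[1:]):
        cum.append(cum[-1] + math.dist(prev, cur))
    total = cum[-1]
    if total == 0.0:  # the pattern never changes
        return [list(pattern[0]) for _ in range(length)]
    out, k = [], 0
    for i in range(length):
        target = total * i / (length - 1)
        # Move to the segment [cum[k], cum[k + 1]] that contains the target.
        while k < len(cum) - 2 and cum[k + 1] < target:
            k += 1
        seg = cum[k + 1] - cum[k]
        t = 0.0 if seg == 0.0 else (target - cum[k]) / seg
        out.append([p + t * (q - p) for p, q in zip(pattern[k], pattern[k + 1])])
    return out
```

With this scheme, a gesture performed slowly and the same gesture performed quickly map to similar fixed-length patterns, since frames with little change contribute few samples.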

33
7.3 Template based Recognition
  • The distance between two spatio-temporal
    appearance patterns is calculated from the
    correlation between their warped patterns.
  • Given a training set, a reference template is
    created for each type of gesture by a minimax
    type of optimization; a template-based
    classification technique is then employed to
    recognize hand gestures.
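
A minimal sketch of this recognition step follows, under stated assumptions: warped patterns are flattened to equal-length lists of numbers, the distance is taken as one minus the Pearson correlation, and the "minimax" template is read as the training sample whose worst-case distance to the other samples of its class is smallest. The slides do not spell out these details, so all three choices and all function names are illustrative.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length flattened patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant pattern carries no shape information
    return cov / math.sqrt(var_a * var_b)


def distance(a, b):
    """Correlation-based distance: 0 for identical shape, 2 for opposite shape."""
    return 1.0 - correlation(a, b)


def minimax_template(samples):
    """Training sample that minimizes the maximum distance to all samples."""
    return min(samples, key=lambda s: max(distance(s, t) for t in samples))


def classify(pattern, templates):
    """Label of the template closest to the pattern."""
    return min(templates, key=lambda label: distance(pattern, templates[label]))
```

A usage example: with `templates = {"up": [0, 1, 2, 3], "down": [3, 2, 1, 0]}`, the call `classify([0, 2, 4, 6.5], templates)` selects `"up"`, since correlation ignores the overall amplitude of the pattern.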

34
8. Experiment Results
  • 8.1 Examples of Hand Gesture Segmentation.
  • 8.2 Choosing Image Motion Models.
  • 8.3 Examples of Spatio-temporal Appearance.
  • 8.4 Determining the Warping Length.
  • 8.5 Examples of Warped Spatio-temporal
    Appearance.
  • 8.6 Motion Appearance versus Shape Appearance.
  • 8.7 Testing.

35
8.1 Examples of hand gesture segmentation
Fig.12 Segmentation result of a move up hand
gesture.
36
8.1 Examples of hand gesture segmentation
Fig. 13 Segmentation result of a move left
hand gesture.
37
8.1 Examples of hand gesture segmentation
Fig. 14 Segmentation result of a zoom in hand
gesture.
38
8.1 Examples of hand gesture segmentation
Fig. 15 Segmentation result of a yaw right
hand gesture.
39
8.2 Choosing Image Motion Model
Recognition rates when choosing different image
motion models
Conclusion: For our gesture command set, the
affine model is necessary and sufficient.
40
8.3 Examples of Spatio-temporal Appearances
Table 1 Spatio-temporal appearance model
parameters of the move up gesture sample.
41
8.3 Examples of Spatio-temporal Appearances
Table 2 Spatio-temporal appearance parameters of
the move left gesture sample.
42
8.3 Examples of Spatio-temporal Appearances
Table 3 Spatio-temporal appearance parameters of
the zoom in gesture sample.
43
8.3 Examples of Spatio-temporal Appearances
Table 4 Spatio-temporal appearance parameters of
the yaw right gesture sample.
44
8.4 Determining the Warping Length
45
8.5 Examples of Warped Spatio-temporal Appearance
Table 6 Parameters of the warped spatio-temporal
appearance of the move up gesture sample.
46
8.5 Examples of Warped Spatio-temporal Appearances
Table 7 Parameters of the warped spatio-temporal
appearance of the move left gesture sample.
47
8.5 Examples of Warped Spatio-temporal Appearances
Table 8 Parameters of the warped spatio-temporal
appearance of the zoom in gesture sample.
48
8.5 Examples of Warped Spatio-temporal Appearances
Table 9 Parameters of the warped spatio-temporal
appearance of the yaw right gesture sample.
49
8.6 Motion Appearance Vs Shape Appearance
  • To explore the discriminative power of motion
    appearance and shape appearance separately, two
    experiments are carried out: one using only
    motion appearances as feature vectors and the
    other using only shape appearances as feature
    vectors.

50
8.7 Testing Experiment
  • The average recognition rate achieved on the test
    set is 89.6%.
  • Gesture-controlled panoramic map controller.
  • The prototype system can recognize hand gestures
    performed by a trained user with accuracy ranging
    from 83% to 92%.

51
9. Summary
  • Aiming at real-time gesture-controlled
    human-computer interaction, we propose novel
    approaches for visual modeling, analysis, and
    recognition of continuous dynamic hand gestures.

52
9. Summary
  • A spatio-temporal appearance model is proposed to
    represent dynamic hand gestures.
  • The model integrates temporal information,
    motion and shape appearances.
  • The motion appearance represents the image
    appearance changes caused by motion itself, not a
    temporal sequence of static configurations.
  • The shape appearance is based on the geometrical
    features of an ellipse fitted to the hand image
    region rather than on simple moment-based
    features.
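
The slides do not say how the ellipse is fitted. One common approach, shown here only as an assumed illustration, derives the ellipse from the centroid and second-order central moments of the binary hand mask and then reports geometric quantities (center, semi-axes, orientation). The function name and the return layout are hypothetical.

```python
import math


def fit_ellipse(mask):
    """Return (cx, cy, semi_major, semi_minor, angle) for a binary mask.

    The ellipse has the same centroid and second-order central moments
    as the set pixels; angle is the major-axis orientation in radians.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of the 2x2 covariance matrix [[mu20, mu11], [mu11, mu02]].
    spread = math.hypot(2.0 * mu11, mu20 - mu02)
    lam1 = (mu20 + mu02 + spread) / 2.0
    lam2 = max((mu20 + mu02 - spread) / 2.0, 0.0)
    angle = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    # For a uniformly filled ellipse, variance along an axis is (semi-axis)^2 / 4.
    return cx, cy, 2.0 * math.sqrt(lam1), 2.0 * math.sqrt(lam2), angle
```

Quantities such as the axis ratio and the orientation change little under small segmentation errors at the hand boundary, which is one reason ellipse parameters are a popular compact shape descriptor.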

53
9. Summary
  • Novel approaches are developed to extract model
    parameters by hierarchically integrating multiple
    cues.
  • At the low level, fusion of flesh chrominance
    analysis and coarse image motion detection is
    employed to detect and segment hand gestures.
  • At the high level, the model parameters are recovered
    by integrating fine image motion estimation and
    shape analysis.
  • The approaches achieve both real-time processing
    and high recognition rates.

54
9. Summary
  • A modified Dynamic Time Warping algorithm is
    proposed for eliminating the time variation of
    spatio-temporal appearance patterns caused by
    varying gesturing rates.
  • It is a kind of non-linear re-sampling
    technique.
  • It preserves the necessary temporal information
    and spatial distribution of the original patterns.

55
9. Summary
  • A prototype system, gesture-controlled panoramic
    map browser, is designed and implemented to
    demonstrate the usability of gesture-controlled
    real-time interaction.
  • Dynamic hand gestures are recognized without
    resorting to any special markers, restricted or
    uniform backgrounds, or particular illumination.
  • Only one uncalibrated video camera is used.
  • Higher recognition rates are achieved.
  • The user may perform continuous hand
    gestures, starting at any point within the field
    of view of the camera.

56
10. Future Work
  • We currently assume that the moving skin-colored
    region in the scene is the gesturing hand, which
    may not hold when a moving human face appears
    in the scene. Exploiting a simple geometrical
    model of the human body can alleviate this
    problem, although multiple cameras may then be
    necessary.

57
10. Future Work
  • To use hand gestures in HCI in practice, more
    gestural commands will be needed.
  • Some kinds of commands are more naturally
    given by static hand gestures (hand postures).
  • On the other hand, speech commands can be an
    alternative to some gestural commands.
  • Incorporating hand gesture recognition into a
    multi-modal interface (MMI) is our next step.

58
  • Thanks!