Segmentation and Tracking of Multiple Humans in Crowded Environments

1
Segmentation and Tracking of Multiple Humans in
Crowded Environments
  • Tao Zhao, Ram Nevatia, Bo Wu. IEEE Transactions
    on Pattern Analysis and Machine Intelligence,
    vol. 30, no. 7, July 2008.

2
Outline
  • Introduction
  • Overview
  • Probabilistic modeling
  • Computing MAP by efficient MCMC
  • Experimental results
  • Conclusion

3
Introduction
  • Segmentation and tracking of multiple humans in
    crowded situations is made difficult by
    interobject occlusion.

4
Introduction
  • The method is feasible for a crowded scene.
  • It handles persistent and temporarily heavy
    occlusion.
  • It does not require that humans be isolated when
    they first enter the scene.
  • More complex shape models are needed.
  • Joint reasoning about the collection of objects
    is needed.

5
Introduction
  • Main features of this work
  • A three-dimensional part-based human body model
    which enables the segmentation and tracking of
    humans in 3D and the inference of interobject
    occlusion naturally.
  • A Bayesian framework that integrates segmentation
    and tracking based on a joint likelihood for the
    appearance of multiple objects.

6
Introduction
  • The design of an efficient Markov chain dynamics,
    directed by proposal probabilities based on image
    cues.
  • The incorporation of a color-based background
    model in a mean-shift tracking step.

7
Overview
  • The prior models
  • Background model
  • Based on a background model, the foreground blobs
    are extracted as the basic observation.
  • 3D human shape model
  • Since the hypotheses are in 3D, occlusion
    reasoning is straightforward.
  • Camera model and ground plane
  • Multiple 3D human hypotheses are projected onto
    the image plane and matched with the foreground
    blobs.

8
Overview
  • Segmentation and tracking are integrated in a
    unified framework and interoperate over time.

9
Overview
  • We formulate the problem as one of Bayesian
    inference to find the best interpretation given
    the image observations, the prior model, and the
    estimates from the previous frame analysis.
  • That is, maximum a posteriori (MAP) estimation.

10
Overview
  • The state to be estimated at each frame
  • The number of objects
  • Their correspondences to the objects in the
    previous frame (if any).
  • Their parameters (for example, position)
  • Uncertainty of the parameters

11
Probabilistic modeling
  • Our goal is to estimate the state at time t,
    θ(t), given the image observations I(1), ..., I(t).
  • θ: the state of the objects. Θ: the solution
    space.

12
Probabilistic modeling
  • A state containing n objects can be written as
    θ = {(k1, m1), ..., (kn, mn)} ∈ Θn, where ki is
    the unique identity of the ith object, whose
    parameters are mi, and Θn is the solution space
    of exactly n objects.
  • The entire solution space is the union of the
    subspaces Θn over all object counts n.

13
3D human shape model
  • The parameters of an individual human, m, are
    defined based on a 3D human shape model.
  • The model does not attempt to capture the
    detailed shape and articulation parameters of the
    human body.
  • The body is modeled as head, torso, and legs,
    with a fixed spatial relationship.

14
3D human shape model
  • The parameters (mi) that describe a 3D human
    hypothesis:
  • size (hi): the 3D height of the model; it also
    controls the overall scaling of the object in the
    three directions.
  • thickness (fi): captures extra scaling in the
    horizontal directions.
  • position (ui or (xi, yi)): the image position of
    the head.

15
3D human shape model
  • orientation (oi): the 3D orientation of the body.
  • Orientations of the models are quantized into a
    few levels for computational efficiency.
  • inclination (ii): the 2D inclination of the body.
  • The body may be slightly inclined.
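
The parameter list above maps naturally onto a small record type. The following is a minimal sketch only: the class name, field names, units, and example values are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HumanHypothesis:
    """Hypothetical container for one object's identity k_i and parameters m_i."""
    identity: int                  # k_i, unique track identity
    height: float                  # h_i, 3D height (overall scale)
    thickness: float               # f_i, extra horizontal scaling
    head_xy: Tuple[float, float]   # u_i = (x_i, y_i), image position of the head
    orientation: int               # o_i, index of a quantized orientation level
    inclination: float             # i_i, 2D inclination (radians assumed)


# A state theta is a collection of hypotheses; its length is the object count n.
State = List[HumanHypothesis]


def num_objects(state: State) -> int:
    return len(state)


# Example state with two made-up hypotheses.
theta = [
    HumanHypothesis(1, 1.70, 1.0, (120.0, 45.0), 0, 0.0),
    HumanHypothesis(2, 1.82, 1.1, (200.0, 60.0), 1, 0.05),
]
```

Keeping the identity next to the shape parameters mirrors the (ki, mi) pairing used for the state.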

16
Object appearance model
  • We use a color histogram of the object, defined
    within the object shape.
  • It helps establish correspondence in tracking
    because it is insensitive to the nonrigidity of
    human motion.
  • Efficient algorithms exist, for example the
    mean-shift technique, to optimize a
    histogram-based objective function.

17
Background appearance model
  • The probability of pixel j being from the
    background is denoted Pb(Ij).

18
The prior distribution
  • The first term
  • is independent of time and is defined
    by
  • Si is the projected image of the ith object and
    |Si| is its area.

19
The prior distribution
  • P(o = frontal) = P(o = profile) = 1/2
  • P(xi, yi) is a uniform distribution in the region
    where a human head is plausible.
  • P(hi) is a Gaussian distribution N(μh, σh²)
    truncated to the range [hmin, hmax].
  • P(fi) is a Gaussian distribution N(μf, σf²)
    truncated to the range [fmin, fmax].
  • P(ii) is a Gaussian distribution N(μi, σi²).
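
Drawing from these priors can be sketched as follows. This is an illustrative sketch only: the function name, the rejection-sampling approach, and every numeric value (mean, standard deviation, range) are assumptions, since the slides give no concrete numbers.

```python
import numpy as np


def sample_truncated_gaussian(mean, std, low, high, rng, max_tries=10000):
    """Draw one sample from N(mean, std^2) restricted to [low, high] by rejection."""
    for _ in range(max_tries):
        x = rng.normal(mean, std)
        if low <= x <= high:
            return x
    raise RuntimeError("no sample fell inside the truncation range")


rng = np.random.default_rng(0)

# Hypothetical height prior in metres (values are not from the slides).
MU_H, SIGMA_H, H_MIN, H_MAX = 1.7, 0.1, 1.4, 2.0
heights = [
    sample_truncated_gaussian(MU_H, SIGMA_H, H_MIN, H_MAX, rng)
    for _ in range(100)
]

# Orientation prior: frontal and profile are equally likely.
orientation = rng.choice(["frontal", "profile"])
```

Rejection sampling is simple and adequate when the truncation range keeps most of the Gaussian mass, as it does with the values assumed here.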

20
The prior distribution
  • The second term
  • We approximate it by a product over objects.
  • We rearrange θ(t) and θ(t-1) such that, for each
    object, one of the cases (association, new track,
    or terminated track) is true.

21
The prior distribution
  • Passoc
  • We assume that the position and the inclination
    of an object follow constant velocity models with
    Gaussian noise.

22
The prior distribution
  • The height and thickness follow a Gaussian
    distribution.
  • We use Kalman filters for temporal estimation.
  • Pnew and Pdead
  • Pnew: the likelihood of initializing a new
    track.
  • Pdead: the likelihood of terminating an existing
    track.
  • They are set empirically according to the
    distance of the object to the entrances/exits.
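
The constant-velocity model with Gaussian noise and Kalman filtering can be illustrated for the position alone. This is a generic textbook Kalman filter, not the authors' implementation: the state layout [x, y, vx, vy], the noise covariances Q and R, and the example measurement are all assumed values.

```python
import numpy as np

DT = 1.0  # time step of one frame (assumed)

# Constant-velocity transition for the state [x, y, vx, vy].
F = np.array([[1, 0, DT, 0],
              [0, 1, 0, DT],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
# Only the position is observed.
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
Q = 0.01 * np.eye(4)  # process noise covariance (hypothetical)
R = 1.0 * np.eye(2)   # measurement noise covariance (hypothetical)


def predict(x, P):
    """Propagate mean and covariance one frame ahead."""
    return F @ x, F @ P @ F.T + Q


def update(x, P, z):
    """Correct the prediction with a position measurement z."""
    y = z - H @ x                    # innovation
    S = H @ P @ H.T + R              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(4) - K @ H) @ P
    return x_new, P_new


x = np.array([0.0, 0.0, 1.0, 0.5])
P = np.eye(4)
x_pred, P_pred = predict(x, P)
x_post, P_post = update(x_pred, P_pred, np.array([1.2, 0.4]))
```

The predicted covariance also gives a natural way to score how well a hypothesis at time t agrees with its track from time t-1.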

23
Joint image likelihood for multiple objects and
the background
  • The visible part of each object is determined by
    the depth order of all of the objects, which can
    be inferred from their 3D positions and the
    camera model.
  • Non-object region: the pixels covered by no
    object.
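
Depth-ordered visibility can be sketched with boolean masks. The function name and the toy masks are assumptions; the sketch only presumes that each object has a projected mask and a distance to the camera.

```python
import numpy as np


def visibility_masks(object_masks, depths):
    """Return the visible part of each object and the non-object region.

    object_masks: list of boolean arrays (projected object images).
    depths: distance of each object from the camera; smaller means nearer.
    """
    order = np.argsort(depths)  # process the nearest object first
    occupied = np.zeros_like(object_masks[0], dtype=bool)
    visible = [None] * len(object_masks)
    for i in order:
        # Pixels of object i that no nearer object already covers.
        visible[i] = object_masks[i] & ~occupied
        occupied |= object_masks[i]
    return visible, ~occupied


# Two overlapping toy masks on a 4 x 6 image; object b is nearer.
a = np.zeros((4, 6), dtype=bool)
a[:, 0:4] = True
b = np.zeros((4, 6), dtype=bool)
b[:, 2:5] = True
visible, non_object = visibility_masks([a, b], depths=[5.0, 3.0])
```

Because b is nearer, it keeps all of its pixels, while a loses the two columns where the masks overlap.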

24
Joint image likelihood for multiple objects and
the background
  • The joint likelihood P(I | θ) consists of two
    terms
  • The first term

25
Joint image likelihood for multiple objects and
the background
  • di is the color histogram of the background image
    within the visibility mask of object i.
  • pi is the color histogram of the object.
  • The Bhattacharyya coefficient reflects the
    similarity of the two histograms.
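
The Bhattacharyya coefficient of two normalized histograms is the sum over bins of the square root of the product of the bin values. A minimal sketch with made-up four-bin histograms follows; the function name is an assumption.

```python
import numpy as np


def bhattacharyya_coefficient(p, d):
    """Similarity of two histograms: 1 for identical, 0 for non-overlapping."""
    p = np.asarray(p, dtype=float)
    d = np.asarray(d, dtype=float)
    p = p / p.sum()  # normalize so that each histogram sums to 1
    d = d / d.sum()
    return float(np.sum(np.sqrt(p * d)))


same = bhattacharyya_coefficient([2, 1, 1, 0], [2, 1, 1, 0])
disjoint = bhattacharyya_coefficient([1, 1, 0, 0], [0, 0, 1, 1])
partial = bhattacharyya_coefficient([1, 0], [1, 1])
```

A low coefficient between the object histogram and the background histogram under the object's visibility mask means the object stands out from the background.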

26
Joint image likelihood for multiple objects and
the background
  • The second term is
  • ej = log(Pb(Ij)) is the log-probability of pixel j
    belonging to the background model.

27
Computing MAP by efficient MCMC
  • Computing the MAP is an optimization problem.
  • Optimization is challenging:
  • With an unknown number of objects, the solution
    space contains subspaces of varying dimension.
  • It includes both discrete and continuous
    variables.
  • We adapt a data-driven Markov chain Monte Carlo
    (MCMC) approach to explore this complex solution
    space.

28
Computing MAP by efficient MCMC
  • An MCMC method with jump/diffusion dynamics is
    used to sample the posterior probability.
  • Jumps cause the Markov chain to move between
    subspaces with different dimensions and traverse
    the discrete variables.
  • Diffusions make the Markov chain sample
    continuous variables.
  • In the process of sampling, the best solution is
    recorded and the uncertainty associated with the
    solution is also obtained.

29
Computing MAP by efficient MCMC
30
Computing MAP by efficient MCMC
  • MCMC method
  • We want to design a Markov chain with stationary
    distribution P(θ).
  • At the gth iteration, we sample a candidate state
    θ' from a proposal distribution q(θg | θg-1).
  • If the candidate state θ' is accepted, θg = θ'.
  • Otherwise, θg = θg-1.
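
The accept/reject loop can be sketched on a toy problem. The slides do not state the acceptance probability, so this sketch assumes the standard Metropolis-Hastings rule with a symmetric random-walk proposal. The one-dimensional target, the function names, and all numeric settings are illustrative assumptions, not the tracker's actual posterior.

```python
import math
import random


def log_target(theta):
    """Hypothetical unnormalized log posterior: a Gaussian bump centred at 3.0."""
    return -0.5 * (theta - 3.0) ** 2


def metropolis_hastings(n_iters, theta0, step, seed=0):
    rng = random.Random(seed)
    theta = theta0
    best, best_lp = theta, log_target(theta)
    samples = []
    for _ in range(n_iters):
        # Symmetric proposal, so the q terms cancel in the acceptance ratio.
        candidate = theta + rng.gauss(0.0, step)
        log_ratio = log_target(candidate) - log_target(theta)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta = candidate  # accept: theta_g = candidate
        # otherwise reject: theta_g = theta_{g-1}
        samples.append(theta)
        lp = log_target(theta)
        if lp > best_lp:  # record the best state visited so far
            best, best_lp = theta, lp
    return samples, best


samples, best = metropolis_hastings(n_iters=500, theta0=0.0, step=1.0)
```

Recording the highest-scoring state during sampling is what turns the sampler into a MAP estimator, and the spread of the samples gives a rough measure of uncertainty.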

31
Computing MAP by efficient MCMC
  • A Markov chain constructed in this way has its
    stationary distribution equal to P(θ), independent
    of the choice of the proposal probability q(·) and
    the initial state θ0.
  • The choice of the proposal probability q(·) can
    affect the efficiency of MCMC significantly.
  • Using more informed proposal probabilities, for
    example, as in data-driven MCMC, will make
    the Markov chain traverse the solution space more
    efficiently. Therefore, the proposal distribution
    is written as q(θg | θg-1, I).

32
Markov chain dynamics
  • The dynamics correspond to a proposal
    distribution that is a mixture density over the
    individual dynamics a ∈ A, where A is the set of
    all dynamics: {add, remove, establish, break,
    exchange, diff}.
  • We assume that we have the sample θg-1 from the
    (g-1)th iteration and now propose a candidate θ'
    for the gth iteration.
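
Sampling from a mixture proposal amounts to first picking a dynamic, then applying it to the previous sample. In this sketch the mixing weights, the state layout (a list of (identity, parameters) pairs), and the height values are assumptions. Only the add and remove dynamics are filled in; the other four return the state unchanged as placeholders.

```python
import random

DYNAMICS = ["add", "remove", "establish", "break", "exchange", "diff"]
# Hypothetical mixing weights, one per dynamic; they sum to 1.
WEIGHTS = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]


def choose_dynamic(rng):
    """Pick one dynamic according to the mixture weights."""
    return rng.choices(DYNAMICS, weights=WEIGHTS, k=1)[0]


def propose(state, rng):
    """Return (dynamic, candidate) without modifying the input state."""
    dynamic = choose_dynamic(rng)
    candidate = list(state)
    if dynamic == "add":
        # Append a new hypothesis with a fresh identity.
        new_id = max((k for k, _ in candidate), default=0) + 1
        candidate.append((new_id, {"height": rng.gauss(1.7, 0.1)}))
    elif dynamic == "remove" and candidate:
        # Drop one existing hypothesis chosen uniformly.
        candidate.pop(rng.randrange(len(candidate)))
    # establish, break, exchange, diff: placeholders in this sketch.
    return dynamic, candidate


rng = random.Random(0)
state = [(1, {"height": 1.75})]
dynamic, candidate = propose(state, rng)
```

In a data-driven sampler the add step would draw the new hypothesis from image cues rather than from a fixed Gaussian as done here.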

33
Markov chain dynamics
  • Dynamics
  • object hypothesis addition
  • Sample the parameters of a new human hypothesis
    (kn+1, mn+1) and add it to θg-1.
  • object hypothesis removal
  • establish correspondence

34
Markov chain dynamics
  • break correspondence
  • exchange identity
  • Parameter update

35
Experimental results
  • Evaluation on an outdoor scene

36
(No Transcript)
37
Experimental results
  • There are 20 occlusion events overall, nine of
    which are heavy occlusions.
  • We use 500 iterations per frame.
  • Trajectory-based errors
  • Trajectories of three objects are broken once (ID
    28 -> ID 35, ID 31 -> ID 32, ID 30 -> ID 41).
  • Trajectory initialization
  • Some trajectories start when the objects are only
    partially inside the scene.
  • Only the initialization of three objects (objects
    31, 50, and 52) is noticeably delayed.
  • Partial occlusion and/or the lack of contrast
    with the background are the causes of the delays.
  • The detection rate and the false-alarm rate are
    98.13 percent and 0.27 percent, respectively.

38
Conclusion
  • A principled approach to simultaneously detect
    and track humans in a crowded scene.
  • We formulate the problem as a Bayesian MAP
    estimation problem.
  • The inference is performed by an MCMC-based
    approach to explore the joint solution space.
  • The success lies in the integration of the
    top-down Bayesian formulation following the image
    formation process and the bottom-up features that
    are directly extracted from images.