Segmentation and Tracking of Multiple Humans in Crowded Environments

About This Presentation

Title:

Segmentation and Tracking of Multiple Humans in Crowded Environments

Description:

Joint image likelihood for multiple objects and the background. The visible part of object ... We assume that we have the sample in the (g-1)th iteration , ...

Slides: 39

Provided by: Char1161

Transcript and Presenter's Notes

Title: Segmentation and Tracking of Multiple Humans in Crowded Environments

1
Segmentation and Tracking of Multiple Humans in
Crowded Environments

Tao Zhao, Ram Nevatia, Bo Wu
IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 30, no. 7, July 2008

2
Outline

Introduction
Overview
Probabilistic modeling
Computing MAP by efficient MCMC
Experimental results
Conclusion

3
Introduction

Segmentation and tracking of multiple humans in
crowded situations is made difficult by
interobject occlusion.

4
Introduction

The method targets crowded scenes with
persistent and temporarily heavy occlusion.
It does not require that humans be isolated when they
first enter the scene.
More complex shape models are needed.
Joint reasoning about the collection of objects
is needed.

5
Introduction

Main features of this work
A three-dimensional part-based human body model
which enables the segmentation and tracking of
humans in 3D and the inference of interobject
occlusion naturally.
A Bayesian framework that integrates segmentation
and tracking based on a joint likelihood for the
appearance of multiple objects.

6
Introduction

The design of an efficient Markov chain dynamics,
directed by proposal probabilities based on image
cues.
The incorporation of a color-based background
model in a mean-shift tracking step.

7
Overview

The prior models
Background model
Based on a background model, the foreground blobs
are extracted as the basic observation.
3D human shape model
Since the hypotheses are in 3D, occlusion
reasoning is straightforward.
Camera model and ground plane
Multiple 3D human hypotheses are projected onto
the image plane and matched with the foreground
blobs.

8
Overview

Segmentation and tracking are integrated in a
unified framework and operate jointly over time.

9
Overview

We formulate the problem as one of Bayesian
inference to find the best interpretation given
the image observations, the prior model, and the
estimates from the previous frame analysis.
That is, maximum a posteriori (MAP) estimation.

10
Overview

The state to be estimated at each frame
The number of objects
Their correspondences to the objects in the
previous frame (if any).
Their parameters (for example, position)
Uncertainty of the parameters

11
Probabilistic modeling

Our goal is to estimate the state at time t,
θ(t), given the image observations I(1), ..., I(t).
θ: the state of the objects. Θ: the solution
space.

12
Probabilistic modeling

A state containing n objects can be written
as θ = {(k_1, m_1), ..., (k_n, m_n)} ∈ Θ_n, where k_i is the unique identity of the ith
object, whose parameters are m_i, and Θ_n is the
solution space of exactly n objects.
The entire solution space is the union of Θ_n over all n.

13
3D human shape model

The parameters of an individual human, m, are
defined based on a 3D human shape model.
The model does not attempt to capture the detailed shape and
articulation parameters of the human body.
It consists of three parts (head, torso, and legs) with a fixed spatial
relationship.

14
3D human shape model

The parameters (m_i) that describe a 3D human
hypothesis
size (h_i): the 3D height of the model; it also
controls the overall scaling of the object in the
three directions.
thickness (f_i): captures extra scaling in the
horizontal directions.
position (u_i or (x_i, y_i)): the image position of
the head.

15
3D human shape model

orientation (o_i): the 3D orientation of the body.
Orientations of the models are quantized into a few
levels for computational efficiency.
inclination (i_i): the 2D inclination of the body.
The body may be inclined
slightly.
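
The parameters listed on slides 14 and 15 can be collected in a small record. The following is a minimal sketch in Python; the class name, field names, units, and example values are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass

# Hypothetical container for the parameters m_i of one human hypothesis.
# Field names, units, and the example values below are assumptions.
@dataclass(frozen=True)
class HumanHypothesis:
    identity: int        # k_i, unique identity of the object
    height: float        # h_i, 3D height (also the overall scale)
    thickness: float     # f_i, extra scaling in the horizontal directions
    head_x: float        # x_i, image column of the head
    head_y: float        # y_i, image row of the head
    orientation: str     # o_i, quantized into a few levels
    inclination: float   # i_i, 2D inclination of the body

# The slides mention frontal and profile orientations.
ORIENTATIONS = ("frontal", "profile")

def make_state(hypotheses):
    """Build a state theta: a collection of hypotheses with unique identities."""
    ids = [h.identity for h in hypotheses]
    if len(ids) != len(set(ids)):
        raise ValueError("identities k_i must be unique")
    return tuple(hypotheses)

state = make_state([
    HumanHypothesis(1, 1.70, 1.0, 120.0, 45.0, "frontal", 0.0),
    HumanHypothesis(2, 1.82, 1.1, 200.0, 50.0, "profile", 0.05),
])
```

A frozen dataclass is used here so that a proposed candidate state is always a new object rather than an in-place modification of the previous sample.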

16
Object appearance model

We use a color histogram of the object,
defined within the object shape.
It helps establish correspondence in tracking
because it is insensitive to the nonrigidity of
human motion.
Efficient algorithms exist, for example
the mean-shift technique, to optimize a
histogram-based objective function.

17
Background appearance model

The probability of pixel j being from the
background is

18
The prior distribution

The first term
is independent of time and is defined
by
S_i is the projected image of the ith object and
|S_i| is its area.

19
The prior distribution

P(o = frontal) = P(o = profile) = 1/2
P(x_i, y_i) is a uniform distribution in the region
where a human head is plausible
P(h_i) is a Gaussian distribution N(μ_h, σ_h^2)
truncated to the range [h_min, h_max]
P(f_i) is a Gaussian distribution N(μ_f, σ_f^2)
truncated to the range [f_min, f_max]
P(i_i) is a Gaussian distribution N(μ_i, σ_i^2)
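
As an illustration of the truncated Gaussian priors above, the density can be evaluated by renormalizing a Gaussian over the allowed range. This is a sketch only; the numerical values of μ_h, σ_h, h_min, and h_max below are hypothetical placeholders, since the slides do not give them.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """Cumulative distribution of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def truncated_normal_pdf(x, mu, sigma, lo, hi):
    """Density of N(mu, sigma^2) truncated to [lo, hi]."""
    if x < lo or x > hi:
        return 0.0
    # Probability mass of the untruncated Gaussian inside [lo, hi].
    mass = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    return normal_pdf(x, mu, sigma) / mass

# Hypothetical height prior in meters (values are assumptions).
MU_H, SIGMA_H, H_MIN, H_MAX = 1.7, 0.1, 1.4, 2.0
prior_typical = truncated_normal_pdf(1.7, MU_H, SIGMA_H, H_MIN, H_MAX)
prior_too_short = truncated_normal_pdf(1.2, MU_H, SIGMA_H, H_MIN, H_MAX)
```

Hypotheses outside the plausible range get zero prior probability, which removes them from the search.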

20
The prior distribution

The second term
We approximate it by
We rearrange θ(t) and θ(t-1) such that
one of
is true.

21
The prior distribution

P_assoc
We assume that the position and the inclination
of an object follow constant velocity models with
Gaussian noise.
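
The constant velocity assumption can be illustrated with a small prediction-and-scoring sketch. The exact form of P_assoc is not preserved in this transcript, so the isotropic Gaussian score, the function names, and the noise level SIGMA_POS below are assumptions. This is also not a Kalman filter; it only shows the one-step prediction idea.

```python
import math

def predict_constant_velocity(prev, prev_prev):
    """Predict x(t) = x(t-1) + (x(t-1) - x(t-2)), component by component."""
    return tuple(2.0 * a - b for a, b in zip(prev, prev_prev))

def gaussian_log_density(value, mean, sigma):
    """Log density of an isotropic Gaussian with standard deviation sigma."""
    return sum(
        -0.5 * ((v - m) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
        for v, m in zip(value, mean)
    )

SIGMA_POS = 5.0  # hypothetical position noise in pixels

# Head positions in the two previous frames (made-up numbers).
predicted = predict_constant_velocity((10.0, 5.0), (8.0, 5.0))

# A candidate close to the prediction scores higher than a distant one.
near_score = gaussian_log_density((12.5, 5.0), predicted, SIGMA_POS)
far_score = gaussian_log_density((40.0, 30.0), predicted, SIGMA_POS)
```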

22
The prior distribution

The height and thickness follow a Gaussian
distribution.
We use Kalman filters for temporal estimation.
P_new and P_dead
P_new: the likelihood of the initialization of a new
track
P_dead: the likelihood of the termination of an existing
track
They are set empirically according to the
distance of the object to the entrances/exits.

23
Joint image likelihood for multiple objects and
the background

The visible part of object i is
determined by the depth order of all of the
objects, which can be inferred from their 3D
positions and the camera model.
Non-object region: the image area covered by no object.
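
The depth-order reasoning can be sketched as follows. This assumes that each object has already been projected to a set of image pixels and has a scalar distance to the camera; the function names and the pixel-set representation are illustrative choices, not the authors' implementation.

```python
def visible_parts(objects):
    """Return the visible part of each object, in input order.

    objects: list of (depth, mask) pairs, where depth is the distance to the
    camera and mask is a set of (row, col) pixels of the projected shape.
    """
    # Visit objects from nearest to farthest.
    order = sorted(range(len(objects)), key=lambda idx: objects[idx][0])
    occupied = set()
    visible = [set() for _ in objects]
    for idx in order:
        mask = objects[idx][1]
        # Pixels already claimed by nearer objects are occluded.
        visible[idx] = mask - occupied
        occupied |= mask
    return visible

def non_object_region(image_pixels, objects):
    """Pixels of the image that no object covers."""
    covered = set().union(*(mask for _, mask in objects))
    return image_pixels - covered

near = {(0, 0), (0, 1)}
far = {(0, 1), (0, 2)}
objects = [(5.0, far), (2.0, near)]
image = {(0, c) for c in range(4)}
vis = visible_parts(objects)
background = non_object_region(image, objects)
```

In this toy example the far object keeps only pixel (0, 2), because (0, 1) is hidden behind the nearer object.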

24
Joint image likelihood for multiple objects and
the background

The joint likelihood P(I | θ) consists of two
terms
The first term

25
Joint image likelihood for multiple objects and
the background

d_i is the color histogram of the background image
within the visibility mask of object i.
p_i is the color histogram of the object.
B(p_i, d_i) is
the Bhattacharyya coefficient, which reflects the
similarity of the two histograms.
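
For reference, the Bhattacharyya coefficient of two normalized histograms p and d is the sum over bins of sqrt(p_u * d_u). A minimal sketch follows; the function names are illustrative and the example histograms are made up.

```python
import math

def normalize(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = float(sum(hist))
    if total <= 0.0:
        raise ValueError("histogram must have positive mass")
    return [v / total for v in hist]

def bhattacharyya_coefficient(p, d):
    """Sum over bins of sqrt(p_u * d_u); 1 for identical, 0 for disjoint."""
    if len(p) != len(d):
        raise ValueError("histograms must have the same number of bins")
    p, d = normalize(p), normalize(d)
    return sum(math.sqrt(a * b) for a, b in zip(p, d))

same = bhattacharyya_coefficient([1, 2, 3], [2, 4, 6])
disjoint = bhattacharyya_coefficient([1, 0], [0, 1])
partial = bhattacharyya_coefficient([3, 1], [1, 3])
```

A value near 1 means the two color distributions are nearly the same, and a value near 0 means they barely overlap.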

26
Joint image likelihood for multiple objects and
the background

The second term is
e_j = log(P_b(I_j)), where P_b(I_j) is the probability of pixel j belonging to
the background model
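
To make e_j concrete, the sketch below evaluates log(P_b(I_j)) and sums it over the non-object pixels. The transcript does not preserve the form of P_b, so an independent per-pixel Gaussian on intensity is assumed here purely for illustration; the dictionaries and values are made up.

```python
import math

def background_log_prob(value, mean, sigma):
    """e_j = log P_b(I_j) under an ASSUMED per-pixel Gaussian background model."""
    return -0.5 * ((value - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def sum_background_log_prob(image, bg_mean, bg_sigma, pixels):
    """Sum of e_j over the given pixels (for example, the non-object region)."""
    return sum(background_log_prob(image[p], bg_mean[p], bg_sigma[p]) for p in pixels)

# Toy data: pixel (0, 0) matches the background, pixel (0, 1) does not.
image = {(0, 0): 100.0, (0, 1): 150.0}
bg_mean = {(0, 0): 100.0, (0, 1): 100.0}
bg_sigma = {(0, 0): 10.0, (0, 1): 10.0}

e_match = background_log_prob(image[(0, 0)], 100.0, 10.0)
e_mismatch = background_log_prob(image[(0, 1)], 100.0, 10.0)
total = sum_background_log_prob(image, bg_mean, bg_sigma, image.keys())
```

Leaving a foreground-looking pixel in the non-object region lowers the sum, which favors states whose objects explain the foreground.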

27
Computing MAP by efficient MCMC

Computing the MAP is an optimization problem.
The optimization is challenging:
With an unknown number of objects, the solution space
contains subspaces of varying dimension.
It includes both discrete and continuous
variables.
We adopt a data-driven Markov chain Monte Carlo
(MCMC) approach to explore this complex solution
space.

28
Computing MAP by efficient MCMC

MCMC method with jump/diffusion dynamics to
sample the posterior probability.
Jumps cause the Markov chain to move between
subspaces with different dimensions and traverse
the discrete variables.
Diffusions make the Markov chain sample
continuous variables.
In the process of sampling, the best solution is
recorded and the uncertainty associated with the
solution is also obtained.

29
Computing MAP by efficient MCMC

30
Computing MAP by efficient MCMC

MCMC method
We want to design a Markov chain with stationary
distribution P(θ).
At the gth iteration, we sample a candidate state
θ' from a proposal distribution q(θ_g | θ_{g-1}).
If the candidate state θ' is accepted, θ_g = θ'.
Otherwise, θ_g = θ_{g-1}.
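
The acceptance rule itself is not preserved in this transcript. The sketch below uses the standard Metropolis-Hastings rule, which accepts θ' with probability min(1, P(θ') q(θ_{g-1} | θ') / (P(θ_{g-1}) q(θ' | θ_{g-1}))), and records the best state visited. The one-dimensional target, the random-walk proposal, and all names are toy assumptions, not the tracking posterior.

```python
import math
import random

def metropolis_hastings(log_p, propose, log_q, theta0, iterations, rng):
    """Run Metropolis-Hastings and return the best state seen and its log P.

    log_q(a, b) is the log proposal density of moving to a from b.
    """
    theta, cur_lp = theta0, log_p(theta0)
    best, best_lp = theta, cur_lp
    for _ in range(iterations):
        cand = propose(theta, rng)
        cand_lp = log_p(cand)
        # log of P(cand) q(theta | cand) / (P(theta) q(cand | theta))
        log_ratio = (cand_lp + log_q(theta, cand)) - (cur_lp + log_q(cand, theta))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta, cur_lp = cand, cand_lp          # accept the candidate
            if cur_lp > best_lp:
                best, best_lp = theta, cur_lp      # record the best solution
        # otherwise the chain stays at theta
    return best, best_lp

def toy_log_p(x):
    """Unnormalized log density with its mode at x = 3."""
    return -0.5 * (x - 3.0) ** 2

def toy_propose(x, rng):
    """Symmetric random-walk proposal."""
    return x + rng.gauss(0.0, 1.0)

def toy_log_q(a, b):
    """Symmetric proposal: the two q terms cancel, so a constant suffices."""
    return 0.0

best, best_lp = metropolis_hastings(
    toy_log_p, toy_propose, toy_log_q, 0.0, 2000, random.Random(0)
)
```

Only ratios of P appear in the rule, so the posterior needs to be known only up to a normalizing constant.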

31
Computing MAP by efficient MCMC

A Markov chain constructed in this way has its
stationary distribution equal to P(θ), independent
of the choice of the proposal probability q(·) and
the initial state θ_0.
The choice of the proposal probability q(·) can
affect the efficiency of MCMC significantly.
Using more informed proposal probabilities, for
example, as in data-driven MCMC, will make
the Markov chain traverse the solution space more
efficiently. Therefore, the proposal distribution
is written as q(θ_g | θ_{g-1}, I).

32
Markov chain dynamics

The dynamics correspond to the proposal
distribution with a mixture density, where A =
{add, remove, establish, break, exchange, diff} is
the set of all dynamics.
We assume that we have the sample θ_{g-1} from the (g-1)th
iteration and now propose a
candidate θ' for the gth iteration.
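
Sampling from a mixture proposal can be done in two steps: first pick one dynamic a from A according to its mixture weight, then draw the candidate from that dynamic's own proposal. The sketch below shows only the first step; the weights are hypothetical, since the transcript does not give them.

```python
import random

DYNAMICS = ("add", "remove", "establish", "break", "exchange", "diff")

def sample_dynamic(weights, rng):
    """Pick one dynamic with probability proportional to its mixture weight."""
    unknown = set(weights) - set(DYNAMICS)
    if unknown:
        raise ValueError(f"unknown dynamics: {sorted(unknown)}")
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical mixture weights (assumptions, chosen to sum to 1).
WEIGHTS = {
    "add": 0.1, "remove": 0.1, "establish": 0.1,
    "break": 0.1, "exchange": 0.1, "diff": 0.5,
}

rng = random.Random(0)
draws = [sample_dynamic(WEIGHTS, rng) for _ in range(6000)]
counts = {name: draws.count(name) for name in DYNAMICS}
```

With these weights, about half of the iterations refine the parameters of existing hypotheses and the rest change the number of objects or their correspondences.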

33
Markov chain dynamics

Dynamics
object hypothesis addition
Sample the parameters of a new human hypothesis
(k_{n+1}, m_{n+1}) and add it to θ_{g-1}.
object hypothesis removal
establish correspondence

34
Markov chain dynamics

break correspondence
exchange identity
parameter update

35
Experimental results

Evaluation on an outdoor scene

36
(No Transcript)

37
Experimental results

There are 20 occlusion events overall, nine of
which are heavy occlusions.
We use 500 iterations per frame.
Trajectory-based errors
Trajectories of three objects are broken once (ID
28 -> ID 35, ID 31 -> ID 32, ID 30 -> ID 41).
Trajectory initialization
Some trajectories start when the objects are only partially
inside.
Only the initialization of three objects (objects
31, 50, and 52) is noticeably delayed.
Partial occlusion and/or the lack of contrast
with the background are the causes of the delays.
The detection rate and the false-alarm rate
are 98.13 percent and 0.27 percent, respectively.

38
Conclusion

A principled approach to simultaneously detect
and track humans in a crowded scene.
We formulate the problem as a Bayesian MAP
estimation problem.
The inference is performed by an MCMC-based
approach to explore the joint solution space.
The success lies in the integration of the
top-down Bayesian formulation following the image
formation process and the bottom-up features that
are directly extracted from images.