Title: Multiple Object Class Detection with a Generative Model
- K. Mikolajczyk, B. Leibe, and B. Schiele
Carolina Galleguillos
2Goal
- Simultaneous recognition and localization of multiple object classes using a generative model.
- Recognition: a codebook whose features are shared among several object classes.
- Detection: a probabilistic model for various objects in the same image.
3Introduction
Multiple object class detection performance is still far from that of single object class detection.
Single object class detection is a mature problem.
- Their approach:
- Fast and dense sampling of scale-invariant features.
- Effective object representation.
- Efficient and reliable training and recognition.
4Introduction
- Other approaches:
- Based on feature detectors: local features from several detectors.
- Based on appearance clusters: visual vocabulary, codebook, keywords.
- Object class representations: star-shaped graphical model, etc.
5Features - Appearance
- The features can be computed efficiently.
- Scale-space pyramid built with a Gaussian kernel.
- For each level: Canny edge detection with Laplacian automatic scale selection (position, scale, and dominant orientation).
- For each edge point, a region of interest is identified (in the gradient orientation). This region is described by a SIFT descriptor (128-dimensional vector).
- PCA is used for dimensionality reduction (to 40 dimensions).
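The scale-space pyramid step above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the choice of `sigma`, the 3-sigma kernel radius, the border clamping, and the subsampling by a factor of two between levels are all assumptions made for the example.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1-D Gaussian kernel as a list of floats."""
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))  # assumed 3-sigma support
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_1d(values, kernel):
    """Convolve a list with a kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def smooth(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows = [convolve_1d(row, kernel) for row in image]
    cols = [convolve_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def build_pyramid(image, levels=3, sigma=1.0):
    """Blur each level, then keep every second pixel (assumed scheme)."""
    pyramid = [image]
    for _ in range(levels - 1):
        blurred = smooth(pyramid[-1], sigma)
        pyramid.append([row[::2] for row in blurred[::2]])
    return pyramid

# Usage: an 8x8 image (list of rows) yields levels of size 8, 4, and 2.
image = [[float((r + c) % 2) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, levels=3, sigma=1.0)
```

Edge detection, scale selection, SIFT description, and PCA would then run on each level; those steps are left out here.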
6Features - Geometry
- Rotation invariance: convert the position of each feature to polar coordinates.
d: distance to the object center. φ: angle.
Dominant gradient orientation.
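As a rough illustration of why this parametrization is rotation invariant, the sketch below computes a distance and an angle for one feature. The name `theta` for the dominant gradient orientation, the choice to measure the angle relative to `theta`, and both function names are assumptions made for the example; the slides do not spell out the exact convention.

```python
import math

def polar_geometry(feature_xy, center_xy, theta):
    """Return (d, phi) for a feature relative to the object center.

    d   : Euclidean distance from the feature to the object center.
    phi : direction to the center, measured relative to the feature's
          dominant gradient orientation theta (assumed convention).
    """
    dx = center_xy[0] - feature_xy[0]
    dy = center_xy[1] - feature_xy[1]
    d = math.hypot(dx, dy)
    phi = (math.atan2(dy, dx) - theta) % (2 * math.pi)
    return d, phi

def rotate(point, alpha):
    """Rotate a 2-D point about the origin by alpha radians."""
    x, y = point
    return (x * math.cos(alpha) - y * math.sin(alpha),
            x * math.sin(alpha) + y * math.cos(alpha))

# Usage: rotating the whole image by alpha also rotates theta by alpha,
# so (d, phi) stays the same up to floating-point error.
d0, phi0 = polar_geometry((1.0, 0.0), (4.0, 4.0), theta=0.3)
alpha = 0.7
d1, phi1 = polar_geometry(rotate((1.0, 0.0), alpha),
                          rotate((4.0, 4.0), alpha),
                          theta=0.3 + alpha)
```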
7Hierarchical Codebook
Hierarchical tree of clusters.
Appearance clusters are formed by similar features at the first level.
Each cluster has several geometric distributions that correspond to object classes (information about the geometric relations between object centers and local appearance).
Each node is a hyperball.
8Building Tree Efficiently
- Apply k-means to divide the feature space (top-down).
- Use reciprocal nearest neighbors within each k-means partition, with a similarity threshold.
- Apply agglomerative clustering (bottom-up).
- Use Euclidean distance to group clusters.
- The clustering trace is used to construct the tree.
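The bottom-up part of the procedure above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: it merges pairs of clusters that are each other's nearest neighbor when their centroids are closer than a hypothetical `threshold`, uses size-weighted centroids, and records each merge in a trace from which a tree could be built. The k-means partitioning step is omitted.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rnn_agglomerate(points, threshold):
    """Merge reciprocal nearest neighbors closer than `threshold`.

    Returns (clusters, trace): clusters maps id -> (centroid, size),
    trace lists the merges as (child_a, child_b, parent) in order.
    """
    clusters = {i: (tuple(p), 1) for i, p in enumerate(points)}
    trace = []
    next_id = len(points)
    while len(clusters) > 1:
        # Nearest neighbor of every current cluster.
        nearest = {}
        for i, (ci, _) in clusters.items():
            nearest[i] = min((j for j in clusters if j != i),
                             key=lambda j: euclidean(ci, clusters[j][0]))
        merged = False
        for i, j in list(nearest.items()):
            if i not in clusters or j not in clusters:
                continue  # already merged in this pass
            if i < j and nearest[j] == i:  # reciprocal pair
                (ci, ni), (cj, nj) = clusters[i], clusters[j]
                if euclidean(ci, cj) < threshold:
                    n = ni + nj
                    centroid = tuple((a * ni + b * nj) / n
                                     for a, b in zip(ci, cj))
                    del clusters[i], clusters[j]
                    clusters[next_id] = (centroid, n)
                    trace.append((i, j, next_id))
                    next_id += 1
                    merged = True
        if not merged:
            break
    return clusters, trace

# Usage: two tight pairs of points end up as two clusters.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
clusters, trace = rnn_agglomerate(points, threshold=2.0)
```

Each trace entry names two children and their parent, which is the information needed to link nodes into a hierarchy.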
14Tree - Advantages
- Appearance clusters are shared within one image and among different classes (and object parts).
- Compact representation.
- Can represent individual objects or all object classes.
- Efficient search.
15Recognition
F: features. A: appearance clusters. G: geometric distributions.
Decision
The likelihood of each feature is modeled by a mixture of distributions from the appearance clusters that match the query feature.
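The slide's own formula did not survive extraction. As a hedged illustration only, a mixture of this kind is often written in the following generic form. The symbols $f$ (a query feature), $O$ (an object hypothesis), $a_i$ (an appearance cluster), and $A(f)$ (the set of clusters matching $f$) are assumptions introduced here, and the expression should not be read as the authors' exact equation.

```latex
% Generic mixture over the appearance clusters matched by feature f
% (illustrative notation, not taken from the slides).
p(f \mid O) \;=\; \sum_{a_i \in A(f)} p(f \mid a_i)\, p(a_i \mid O)
```

Only clusters that match the query feature contribute terms, which is what makes the sum cheap to evaluate when the codebook is large.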
16Recognition
- Problem: similar objects in the model have comparable probabilities in shared clusters.
- Condition: each feature can contribute to only one hypothesis.
- Average confusion factor between pairs of objects.
- If it approaches 1, all information that comes from those clusters is removed from both hypotheses.
17Learning
- The joint probability distribution is separated into two terms.
- To estimate the model:
Extract features F from labeled training examples. Build appearance clusters, then match the features back to the cluster centers (threshold β). Each feature that matches a cluster contributes to the probability estimates for the appearance and to the cluster's geometric distribution at the corresponding position.
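The estimation procedure described above can be sketched as a counting pass over the training features. This is an illustrative sketch under assumptions: the data layout (`descriptor`, class `label`, geometry tuple), the function names, and the use of simple relative frequencies as probability estimates are not taken from the slides.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate_model(training_features, centers, beta):
    """Match training features back to cluster centers within `beta`.

    training_features: list of (descriptor, label, geometry) tuples.
    Returns (counts, geometry): counts[cluster][label] is the number of
    matched features, geometry[cluster][label] the list of their
    geometry tuples (samples of the geometric distribution).
    """
    counts = defaultdict(lambda: defaultdict(int))
    geometry = defaultdict(lambda: defaultdict(list))
    for descriptor, label, geom in training_features:
        for idx, center in enumerate(centers):
            if euclidean(descriptor, center) <= beta:
                counts[idx][label] += 1
                geometry[idx][label].append(geom)
    return counts, geometry

def appearance_probability(counts, cluster, label):
    """Relative frequency of `label` among features matched to `cluster`."""
    total = sum(counts[cluster].values())
    return counts[cluster][label] / total if total else 0.0

# Usage with toy 2-D descriptors and (d, phi) geometry tuples.
centers = [(0.0, 0.0), (5.0, 5.0)]
features = [
    ((0.1, 0.0), "car", (3.0, 0.5)),
    ((0.0, 0.2), "bike", (2.0, 1.0)),
    ((5.0, 5.1), "car", (1.0, 0.0)),
    ((0.2, 0.1), "car", (3.1, 0.4)),
]
counts, geometry = estimate_model(features, centers, beta=0.5)
```

A feature may match more than one cluster, in which case it contributes to each of them; whether the original method allows this is not stated in the slides.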
18Fast Matching
- Match features to cluster centers using a ball tree.
- Represent the query and the model as tree structures.
- Match the two trees by computing the Euclidean distance between the centroids of the top nodes.
If the distance is smaller than the sum of their radii, the first node is compared with all the children of the intersecting node.
Same precision as exhaustive search, and 200 times faster.
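The intersection test described above can be sketched as a recursive descent over two trees of hyperballs. This is a minimal sketch with assumed names (`Ball`, `balls_intersect`, `match`) and an assumed descent order (model children first, then query children); it is not the authors' implementation and says nothing about the reported speed-up.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Ball:
    center: Tuple[float, ...]
    radius: float
    children: List["Ball"] = field(default_factory=list)

def balls_intersect(a, b):
    """True if the centroid distance is smaller than the sum of the radii."""
    return math.dist(a.center, b.center) < a.radius + b.radius

def match(query, model, out):
    """Collect (query_leaf_center, model_leaf_center) pairs whose balls intersect."""
    if not balls_intersect(query, model):
        return  # prune: nothing below these nodes can intersect
    if not query.children and not model.children:
        out.append((query.center, model.center))
    elif model.children:
        for child in model.children:
            match(query, child, out)
    else:
        for child in query.children:
            match(child, model, out)

# Usage: one query leaf lies close to the first model leaf only.
model = Ball((5.0, 0.0), 6.0, [Ball((0.0, 0.0), 0.5), Ball((10.0, 0.0), 0.5)])
query = Ball((0.5, 0.0), 1.0, [Ball((0.2, 0.0), 0.5)])
pairs = []
match(query, model, pairs)
```

The pruning is safe only if every parent ball encloses its children's balls, which the tree construction is assumed to guarantee.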
19Experimental results
5 object classes: pedestrians, cars, motorbikes, bicycles, and RPG shooters.
20Experimental results
Recall is higher, and the number of appearance clusters grows sub-linearly with an increasing number of object classes.
Motorbike test data
21Conclusions
- An approach capable of detecting multiple object classes simultaneously in images using a single codebook.
- Performance comparable with state-of-the-art discriminative approaches.
- An efficient method for building the object class representation and for recognition.