Streetlevel Scene Understanding - PowerPoint PPT Presentation

About This Presentation
Title:

Streetlevel Scene Understanding

Slides: 69
Provided by: paulst4
Transcript and Presenter's Notes

Title: Streetlevel Scene Understanding


1
Street-level Scene Understanding
  • Paul Sturgess, Karteek Alahari, Pushmeet Kohli,
    Chris Russell, Lubor Ladický,
  • Philip Torr.

2
Motivation
  • Abundance of street level imagery
  • Google Street View
  • Microsoft Live Search Maps
  • Yotta DCL
  • Aim: identify object classes automatically
  • Build highway inventories
  • Augment driving experience

3
We would like a framework
  • That can combine
  • Deal with Things and stuff. (Ted Adelson)
  • Efficient optimization.
  • Leverage Video
  • Large scale Learning
  • Some of this is under review, so I won't give all the
    details, but will sketch out the plan.
  • Use of a global (CRF) energy
  • A hierarchical representation
  • (vehicle has sub categories car, van etc)

4
Google Street View
54 Great Russell St
5
Yotta
http://www.geospatialvision.com/
6
Yotta
  • The scenario is as follows
  • a van drives around the roads of the UK, in the
    van are
  • GPS equipment and multiple calibrated cameras,
  • synchronized to capture and store an image every
    two metres
  • giving a massive data set.

7
CamVid (Brostow et al., ECCV 2008)
  • 31 hand-labelled object classes plus void

Brostow et al 2009
8
CamVid
  • We use the top 11 classes plus void

9
Labelled Ground Truth
10
Sub-window Classification
  • There is a car at x,y,size
  • Classifies a sub-image as before
  • Requires a separate model for each object-class
  • Good results for rigid objects

Lampert, Blaschko, Hofmann (2008)
11
Sub-window Classification
  • But what about amorphous objects such as road and
    sky (stuff)?
  • Not so good for amorphous objects
  • Lots of pixels mislabelled in bounding box
    segment
  • Lots of the image is left un-classified

un-classified
Misclassified
12
Segment → Classify
  • Refines a sub-window segment
  • Fewer misclassified pixels
  • More unclassified pixels
  • The model becomes prohibitively large as we add
    more object-classes

Larlus, Verbeek and Jurie (2008)
13
Segment → Classify
  • Need to classify each segment
  • But what if the segments are wrong?

14
Classify → Segment
  • Every pixel in image is classified as one of a
    set of object-class labels

15
We would like a framework
  • That can combine
  • Superpixels, edges or segments
  • Sliding window classifiers
  • Use of a global (CRF) energy (so we know what we
    are optimizing)
  • A hierarchical representation (vehicle has sub
    categories car, van etc)
  • Efficient optimization.
  • Some of this is under review, so I won't give all the
    details, but will sketch out the plan.

16
Sketch
  • Lots of ongoing work
  • Results not perfect
  • Need more training data
  • Snapshot/sketch ahead.
  • Philosophy: link algorithms to problems (what can
    we solve now or soon?)

17
Enforcing Label Consistency using Higher Order
Potentials
  • CVPR08 VOC 08
  • Joint work with Lubor Ladicky and Pushmeet Kohli
  • Cambridge 2008

18
Image labelling Problems
Assign a label to each image pixel
Object Segmentation
Image Denoising
Geometry Estimation
Sky
Building
Tree
Grass
19
Segmentation Taster (VOC2008 data)
Competition "comp5" (train on VOC2008 data). Accuracy (%).
Entries in parentheses are synthesized from
detection results.
All pretty bad; ours slightly worse than some.
Problem: training data. Big overhead to entry,
lots of stuff to code; first year we entered.
20
Object Segmentation using CRFs
(Shotton et al. ECCV 2006)
CRF Energy
Unary potentials based on Colour, Location and
Texture features
Encourages label consistency in adjacent pixels
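The energy on this slide can be illustrated with a minimal sketch. It assumes a 4-connected pixel grid and a plain Potts pairwise term, whereas the model in the talk uses colour, location and texture features. The names `pairwise_crf_energy`, `unary` and `lam` are illustrative and do not come from the talk.

```python
def pairwise_crf_energy(labels, unary, lam=1.0):
    """Energy of a labelling on a 4-connected grid (illustrative sketch).

    labels: 2D list of integer labels, labels[r][c].
    unary:  unary[r][c][l] is an assumed cost of giving pixel (r, c) label l.
    lam:    Potts penalty paid by each neighbouring pair with different labels.
    """
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            energy += unary[r][c][labels[r][c]]
            # Count each neighbouring pair once: right and down neighbours.
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                energy += lam
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                energy += lam
    return energy
```

A labelling with fewer label changes between neighbours pays a smaller pairwise cost, which is what "encourages label consistency in adjacent pixels" means here.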
21
Limitations of Pairwise CRFs
  • Encourages short boundaries (Shrinkage bias)
  • Can only enforce label consistency in pairs of
    pixels
  • Inability to incorporate region based features

Image
Unary Potential
MAP-CRF Solution
22
Label Consistency in Image Regions
Image (MSRC)
Segmentation (Mean shift)
  • All pixels constituting some regions belong to
  • Same plane (Orientation)
  • (Hoiem, Efros, Hebert, ICCV05)
  • Same object
  • (Russell, Efros, Sivic, Freeman, Zisserman,
    CVPR06)

23
Image labelling using Segments
Object Labelling
Unsupervised Segmentation
Image
  • Geometric Context
  • Hoiem et al, ICCV05
  • Object Segmentation
  • He et al. ECCV06, Yang et al. CVPR07,
    Rabinovich et al. ICCV07, Batra et al. CVPR08
  • Interactive Video Segmentation
  • Wang, SIGGRAPH 2005

Not robust to Inconsistent Segments!
24
Our Higher Order CRF Model
Encourages label consistency in regions
Multiple Segmentations
Comaniciu and Meer, PAMI 2002; Shi and Malik, PAMI
2000; Felzenszwalb and Huttenlocher, IJCV 2004
25
Higher Order Energy Functions
Unary
Pairwise
Higher order
  • Efficient BP in Higher Order MRFs
  • ECCV06 (Lan, Roth, Huttenlocher, Black)
  • 2x2 cliques learned using FOE model
  • Approximation methods to make BP feasible
  • Search a restricted state space
  • 16 minutes per iteration
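The three terms named on this slide are usually combined into a single energy. The form below is a sketch under assumed notation, not taken from the slides: V is the set of pixels, E the set of neighbouring pixel pairs, S the set of segments (cliques), and x_c the labels of the pixels in segment c.

```latex
E(\mathbf{x}) = \sum_{i \in \mathcal{V}} \psi_i(x_i)
  + \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)
  + \sum_{c \in \mathcal{S}} \psi_c(\mathbf{x}_c)
```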

26
Label Consistency in Segments
  • Encourages consistency within super-pixels
  • Takes the form of a P^n Potts model
  • Kohli et al. CVPR 2007

27
Label Consistency in Segments (contd.)
Cost 0 when all pixels in segment c take the same label
28
Label Consistency in Segments (contd.)
Cost f(c) when the pixels in segment c take different labels
29
Label Consistency in Segments (contd.)
Does not distinguish between good and bad segments!
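The all-or-nothing behaviour of this potential can be written down as a minimal sketch. The function name and the argument `inconsistency_cost` (standing in for the value f(c) on the slide) are illustrative assumptions.

```python
def pn_potts_cost(clique_labels, inconsistency_cost):
    """Return 0 if every pixel in the clique shares one label, else a fixed cost.

    inconsistency_cost stands in for f(c); how f depends on the segment
    is not specified here and is left to the caller.
    """
    if len(set(clique_labels)) <= 1:
        return 0.0
    return inconsistency_cost
```

Note that a segment with one stray pixel pays the same cost as a segment split half and half, and that every segment is treated alike regardless of how well it follows object boundaries.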
30
Quality based Label Consistency
Label inconsistency cost depends on segment
quality
31
Quality based Label Consistency
Label inconsistency cost depends on segment
quality
  • How to measure quality G(c)?
  • Ren and Malik ICCV03, Rabinovich et al. ICCV07,
    many others
  • Colour and Texture Similarity
  • Contour Energy

Measure quality from variance in feature responses
Higher order generalization of contrast-sensitive
pairwise potential
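One plausible way to turn feature variance into a quality score is sketched below. It assumes a single scalar feature per pixel and an exponential mapping with a hypothetical scale parameter `beta`; the exact form of G(c) used in the talk is not given on the slide.

```python
import math

def segment_quality(features, beta=1.0):
    """Map the variance of a scalar feature inside a segment to a score in (0, 1].

    features: feature response of each pixel in the segment (assumed non-empty).
    beta:     assumed scale parameter; larger values punish variance harder.
    """
    n = len(features)
    mean = sum(features) / n
    variance = sum((f - mean) ** 2 for f in features) / n
    return math.exp(-beta * variance)
```

A perfectly uniform segment scores 1, and the score decays towards 0 as the responses inside the segment spread out, so the label inconsistency cost can be scaled by this score.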
32
Quality based Label Consistency
Segment Quality (darker is better)
Mean shift segmentation
MSRC image
33
Robust Consistency Potentials
Plot: cost against the number of inconsistent pixels.
The P^n Potts cost jumps from 0 to γ_max as soon as a
single pixel is inconsistent.
Too rigid!
Kohli, Ladicky, Torr, CVPR 2008
Kohli, Kumar, Torr, CVPR 2007
34
Robust Consistency Potentials
Add multiple potentials to generate an arbitrary
concave increasing function of the number of inconsistent pixels.
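A minimal sketch of this idea, assuming each robust potential is linear in the number of inconsistent pixels and truncated at a maximum cost. The parameter names `gamma_max` and `truncation` are illustrative, not taken from the slides.

```python
def robust_cost(num_inconsistent, gamma_max, truncation):
    """Cost rises linearly with the number of inconsistent pixels, capped at gamma_max."""
    return min(num_inconsistent * gamma_max / truncation, gamma_max)

def summed_cost(num_inconsistent, terms):
    """Sum several truncated-linear terms given as (gamma_max, truncation) pairs."""
    return sum(robust_cost(num_inconsistent, g, q) for g, q in terms)
```

For example, with `terms = [(4.0, 2), (6.0, 6)]` the summed cost rises by 3 per pixel for the first two inconsistent pixels, then by 1 per pixel, and is flat after six: a piecewise-linear, concave, non-decreasing curve.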
35
Higher Order Cliques
  • A way of assigning a cost if all the pixels in a
    clique take a particular value.
  • Cliques can come
  • from detectors
  • Features
  • Segments/super pixels
  • Optimizer sorts it out (CRF energy).

36
Minimizing Higher order Energy Functions
  • Message passing is computationally expensive
  • High runtime and space complexity: O(L^N)
  • L = number of labels, N = size of clique
  • Efficient BP for Higher Order MRFs
  • Lan et al. ECCV 06, Potetz CVPR 2007
  • 2x2 clique potentials for Image Denoising
  • Take minutes per iteration (Hours to converge)
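To see why general message passing does not scale, note that the number of joint labellings of a clique grows as L^N. The short sketch below just evaluates that count; the function name is illustrative.

```python
def num_clique_labellings(num_labels, clique_size):
    """Number of joint labellings a naive clique message update must consider."""
    return num_labels ** clique_size
```

With 21 labels, a 2x2 clique (N = 4) already has 194,481 joint labellings, and a segment of 100 pixels has more than 10^100, which is why cliques the size of a superpixel need a different approach.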

37
Graph Cuts for Minimizing Higher order Energy
Functions (Our Approach)
  • Binary label problems can be solved exactly
  • Can handle very high order energy functions
  • Extremely efficient: computation time on the
    order of seconds
  • Graph Cut based move making algorithm for
    Multilabel Functions
  • Primal Dual methods indicate how accurate we are
    (duality gap).

38
Solving the P^n Potts Model
  • Computing the optimal expansion move

Figure: graph with nodes Source, Ms, v1, v2, ..., vn, Mt, Sink
39
Solving the P^n Potts Model (contd.)
Case 1: all t_i = 0 (every x_i keeps its current label)
40
Solving the P^n Potts Model (contd.)
Case 2: all t_i = 1 (every x_i takes label α)
41
Solving the P^n Potts Model (contd.)
Case 3: t_i takes both values 0 and 1 (some x_i keep their
current label, others take label α)
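The three cases can be made concrete with a small sketch of how a binary move vector t maps to a new labelling in an expansion move. It follows the convention on these slides (t_i = 0 keeps the current label, t_i = 1 switches to α); the function names are illustrative, and the graph construction itself is not reproduced.

```python
def apply_expansion_move(current, move, alpha):
    """t_i = 0 keeps the current label of pixel i, t_i = 1 switches it to alpha."""
    return [alpha if t == 1 else x for x, t in zip(current, move)]

def move_case(move):
    """Classify a move restricted to one clique into the three cases."""
    if all(t == 0 for t in move):
        return 1  # every pixel keeps its current label
    if all(t == 1 for t in move):
        return 2  # every pixel takes label alpha
    return 3      # mixture of current labels and alpha
```

In case 2 the clique always ends up consistent (every pixel has label α). In case 1 the result is consistent only if the current labels already agreed.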
42
Figure: graph construction between Source and Sink
43
Figure: graph with Source, source clique nodes, pixel nodes,
sink clique nodes, Sink
44
Figure: graph construction between Source and Sink
45
Opens up to Hierarchies
Same sort of idea as deep belief nets but
tractable inference
46
Overview of our Method
Higher Order Energy
Unary Potentials Shotton et al. ECCV 2006

Energy Minimization
Contrast Sensitive Pairwise Potentials

Segmentation Solution
Higher Order Potentials (Multiple Segmentations)
47
Experimental results
Datasets: MSRC (21 classes), Sowerby (7 classes)
Shotton et al. ECCV 2006
He et al. CVPR 04
48
Qualitative Results
Image (MSRC-21)
Pairwise CRF
Higher order CRF
Ground Truth
Grass
Sheep
49
Qualitative Results (Contd..)
Image (MSRC-21)
Pairwise CRF
Higher order CRF
Ground Truth
50
Qualitative Results (Contd..)
Image (MSRC-21)
Pairwise CRF
Higher order CRF
Ground Truth
Results can be improved using image-specific
colour models
Rother et al. SIGGRAPH 2004; Shotton et al. ECCV 2006
51
Quantitative Results: Problems
Rough ground truth segmentations
Fine structures have small influence on overall
pixel accuracy
52
Generating Accurate Segmentations
  • Generated accurate segmentation of 27 images
  • 30 minutes per image

Image (MSRC-21)
Original Segmentation
New Segmentation
53
Relationship between Qualitative and Quantitative
Results
Pairwise CRF
Higher order CRF
Ground Truth
Image (MSRC-21)
Overall pixel accuracy: 95.8% (pairwise CRF), 98.7% (higher order CRF)
Small changes in pixel accuracy can lead to large
improvements in segmentation results.
54
Quantitative Accuracy
  • Measure accuracy in labelling boundary pixels.
  • Accuracy evaluated in boundary bands of variable
    width

Hand-labelled Segmentation
Trimap (8-pixels)
Trimap (16-pixels)
Image (MSRC-21)
55
Quantitative Accuracy
  • Measure accuracy in labelling boundary pixels.
  • Accuracy evaluated in boundary bands of variable
    width
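A minimal sketch of this evaluation, assuming the band is the set of pixels within a given Chebyshev distance of a pixel with a different ground-truth label. The function names are illustrative, and details such as how void pixels are treated are left out.

```python
def boundary_band(truth, width):
    """Pixels within `width` (Chebyshev distance) of a differently labelled pixel."""
    rows, cols = len(truth), len(truth[0])
    band = set()
    for r in range(rows):
        for c in range(cols):
            for dr in range(-width, width + 1):
                for dc in range(-width, width + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and truth[rr][cc] != truth[r][c]:
                        band.add((r, c))
    return band

def band_accuracy(pred, truth, width):
    """Fraction of band pixels whose predicted label matches the ground truth."""
    band = boundary_band(truth, width)
    if not band:
        return None  # no label boundary in this image
    correct = sum(1 for r, c in band if pred[r][c] == truth[r][c])
    return correct / len(band)
```

Evaluating `band_accuracy` for several widths gives an accuracy-versus-band-width curve, which exposes boundary errors that overall pixel accuracy hides.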

56
Generating Multiple Segmentations
  • Sampling likely segmentations (Tu and Zhu, PAMI 2002)
  • Segmentations at multiple scales (Sharon et al., CVPR 2001)
  • Varying the parameters of segmentation algorithms
    (Russell et al., CVPR 2006)
  • Unsupervised segmentation algorithms (Comaniciu and Meer,
    PAMI 2002; Shi and Malik, PAMI 2000; Felzenszwalb and
    Huttenlocher, IJCV 2004)

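The "vary the parameters" strategy can be sketched with a toy stand-in for a real segmenter. The 1D threshold segmenter below is purely hypothetical and much simpler than mean shift or the other algorithms cited; it only shows how running one algorithm at several settings yields a pool of overlapping segments.

```python
def segment_1d(values, threshold):
    """Toy segmenter: start a new segment when neighbouring values differ by
    more than `threshold`. Assumes `values` is non-empty."""
    segments, current = [], [0]
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) > threshold:
            segments.append(current)
            current = []
        current.append(i)
    segments.append(current)
    return segments

def multiple_segmentations(values, thresholds):
    """Run the toy segmenter once per parameter setting and pool all segments."""
    return [seg for t in thresholds for seg in segment_1d(values, t)]
```

Each pooled segment can then carry its own higher order potential, so a poor segment from one setting can be overruled by better segments from another.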
57
Qualitative Results (Contd..)
Image (MSRC-21)
Pairwise CRF
Higher order CRF
Ground Truth
58
Results
Figure: VOC2008 images and their results. Labels shown:
background, person, aeroplane, dining table, horse, car,
bird, train.
59
Extension to Video
  • Tracking features
  • Use of 3D
  • Space time super pixels (on volume)

60
Cues from Point-clouds, Brostow et al
Brostow et al (2008)
61
From 3D to 2D
Brostow et al (2008)
62
Point-clouds for object-class segmentation
  • Cues
  • SfM
  • Texton
  • Learning/inference
  • Energy involves unary and pairwise terms
  • Pixels
  • Superpixels
  • Sets of super pixels

63
Higher Order Potential
  • Single Segmentation?
  • Combine multiple segmentations

64
Unary
pairwise
higher order
G-Truth
Raw
65
Result
Unary Pairwise Higher Order
G-Truth
Raw
66
Results Summary
  • Note: some classes that are not superpixel-friendly
    do worse, e.g. pole, sign.
  • Combine detectors (things) with this.
  • Detectors fit very naturally into this framework.

67
Results for all test frames
Raw Image
Ground Truth
Unary Pairwise
Unary Pairwise Higher Order
68
Conclusion
  • Big problem: CamVid
  • only 700 images labelled,
  • half used to test, half to train.
  • Training data
  • how to get much more?
  • Internet games (ESP), LabelMe, etc.
  • Unsupervised training? Semi-supervised?
  • Once we have the data
  • how to do inference?
  • large-scale learning?