Self-Paced Learning for Semantic Segmentation

1
Self-Paced Learning for Semantic Segmentation
  • M. Pawan Kumar

2
(No Transcript)
3
Self-Paced Learning for Latent Structural SVM
  • M. Pawan Kumar

Daphne Koller
Benjamin Packer
4
Aim
To learn accurate parameters for the latent structural SVM
Input $x$
Output $y \in Y$
Hidden variable $h \in H$
(Image: example input with ground-truth label "Deer")
$Y = \{\text{Bison, Deer, Elephant, Giraffe, Llama, Rhino}\}$
5
Aim
To learn accurate parameters for the latent structural SVM
Feature $\Phi(x,y,h)$ (HOG, BoW)
Parameters $w$
$(\hat{y}, \hat{h}) = \operatorname{argmax}_{y \in Y, h \in H} w^\top \Phi(x,y,h)$
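As a concrete illustration (not from the slides): a minimal Python sketch of this joint prediction rule, assuming a user-supplied feature map `phi` and label/latent spaces small enough to enumerate; all names here are hypothetical.

```python
import numpy as np

def predict(x, w, Y, H, phi):
    """Joint prediction: (y, h) = argmax over Y x H of w . phi(x, y, h).
    `phi` is an assumed feature map returning a vector the size of w."""
    best_pair, best_score = None, -np.inf
    for y in Y:
        for h in H:
            score = float(w @ phi(x, y, h))
            if score > best_score:
                best_pair, best_score = (y, h), score
    return best_pair
```

In practice the inner maximization is carried out by a task-specific inference routine (e.g., scanning candidate boxes for detection) rather than by brute-force enumeration.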
6
Motivation
"Math is for losers!!"
Real Numbers
Imaginary Numbers
$e^{i\pi} + 1 = 0$
FAILURE: BAD LOCAL MINIMUM
7
Motivation
"Euler was a genius!!"
Real Numbers
Imaginary Numbers
$e^{i\pi} + 1 = 0$
SUCCESS: GOOD LOCAL MINIMUM
8
Motivation
Start with easy examples, then consider hard ones
Simultaneously estimate easiness and parameters
Easiness is a property of data sets, not of single instances
Sorting examples into easy vs. hard by hand is expensive, and
easy for a human ≠ easy for a machine
9
Outline
  • Latent Structural SVM
  • Concave-Convex Procedure
  • Self-Paced Learning
  • Experiments

10
Latent Structural SVM
Felzenszwalb et al., 2008; Yu and Joachims, 2009
Training samples $x_i$
Ground-truth labels $y_i$
Loss function $\Delta(y_i, y_i(w), h_i(w))$
11
Latent Structural SVM
$(y_i(w), h_i(w)) = \operatorname{argmax}_{y \in Y, h \in H} w^\top \Phi(x_i, y, h)$
$\min_w \|w\|^2 + C \sum_i \Delta(y_i, y_i(w), h_i(w))$
Non-convex objective
Minimize an upper bound
12
Latent Structural SVM
$(y_i(w), h_i(w)) = \operatorname{argmax}_{y \in Y, h \in H} w^\top \Phi(x_i, y, h)$
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
s.t. $\max_{h_i \in H} w^\top \Phi(x_i, y_i, h_i) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
Still non-convex
Difference-of-convex objective
CCCP algorithm converges to a local minimum
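To make the difference-of-convex structure concrete, here is a hedged sketch of the per-sample upper bound that CCCP minimizes: a convex loss-augmented max minus another max-of-linear term (the latter enters the objective negated, i.e. concavely). The helper names `phi` and `delta` are assumptions, not from the slides.

```python
def hinge_upper_bound(w, x, y, phi, delta, Y, H):
    """Upper bound on Delta(y, y(w), h(w)) for one sample: the difference
    of two max-of-linear (convex) functions. `phi`/`delta` are assumed
    user-supplied feature-map and loss helpers."""
    # concave part (enters negated): best latent completion of the true label
    imputed = max(w @ phi(x, y, h) for h in H)
    # convex part: loss-augmented inference over every candidate (y', h')
    augmented = max(w @ phi(x, yy, hh) + delta(y, yy, hh)
                    for yy in Y for hh in H)
    return augmented - imputed  # >= Delta(y, y(w), h(w))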
13
Outline
  • Latent Structural SVM
  • Concave-Convex Procedure
  • Self-Paced Learning
  • Experiments

14
Concave-Convex Procedure
Start with an initial estimate $w_0$
Update $h_i^* = \operatorname{argmax}_{h \in H} w_t^\top \Phi(x_i, y_i, h)$
Update $w_{t+1}$ by solving the convex problem
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
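A compact sketch of this alternation, under the same hypothetical helpers as above; a crude subgradient pass stands in for the convex structural-SVM solve, so treat it as illustrative rather than as the authors' implementation.

```python
import numpy as np

def cccp(xs, ys, w0, phi, delta, Y, H, C, n_outer=10, n_passes=5, lr=0.01):
    """CCCP sketch: impute latent variables with the current weights, then
    descend on the convex upper bound ||w||^2 + C * sum_i hinge_i."""
    w = w0.copy()
    for _ in range(n_outer):
        # Step 1: h_i* = argmax_h w_t . phi(x_i, y_i, h)
        h_star = [max(H, key=lambda h: w @ phi(x, y, h))
                  for x, y in zip(xs, ys)]
        # Step 2: approximate the convex solve by subgradient descent
        for _ in range(n_passes):
            for x, y, hs in zip(xs, ys, h_star):
                # loss-augmented inference for the most violated (y', h')
                yy, hh = max(((a, b) for a in Y for b in H),
                             key=lambda p: w @ phi(x, *p) + delta(y, *p))
                g = 2.0 * w  # gradient of the ||w||^2 term
                if w @ phi(x, yy, hh) + delta(y, yy, hh) > w @ phi(x, y, hs):
                    g = g + C * (phi(x, yy, hh) - phi(x, y, hs))  # hinge active
                w = w - lr * g
    return w
```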
15
Concave-Convex Procedure
CCCP looks at all samples simultaneously
Hard samples cause confusion early on
Better: start with easy samples, then consider hard ones
16
Outline
  • Latent Structural SVM
  • Concave-Convex Procedure
  • Self-Paced Learning
  • Experiments

17
Self-Paced Learning
REMINDER
Simultaneously estimate easiness and parameters
Easiness is a property of data sets, not of single instances
18
Self-Paced Learning
Start with an initial estimate $w_0$
Update $h_i^* = \operatorname{argmax}_{h \in H} w_t^\top \Phi(x_i, y_i, h)$
Update $w_{t+1}$ by solving the convex problem
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
19
Self-Paced Learning
$\min_{w,\xi} \|w\|^2 + C \sum_i \xi_i$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
20
Self-Paced Learning
Introduce indicator variables $v_i \in \{0,1\}$
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
Trivial solution: set every $v_i = 0$ and ignore all samples
21
Self-Paced Learning
$v_i \in \{0,1\}$
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i - \sum_i v_i / K$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
Large K: only the easiest samples are selected
Medium K: more samples are admitted
Small K: all samples are selected
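With $w$ fixed, each $v_i$ decouples: its contribution to the objective is $v_i (C\xi_i - 1/K)$, so the optimum is $v_i = 1$ exactly when $C\xi_i < 1/K$. This is why a large K admits only easy (low-loss) samples. A minimal sketch (helper names assumed):

```python
import numpy as np

def update_v(losses, C, K):
    """Closed-form v-step: v_i = 1 iff including sample i decreases the
    objective, i.e. C * loss_i - 1/K < 0. `losses` holds per-sample hinge
    losses computed at the current w."""
    return (C * np.asarray(losses) < 1.0 / K).astype(float)
```

Decreasing K over iterations raises the admission threshold $1/(KC)$, so progressively harder samples are selected.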
22
Self-Paced Learning
Alternating Convex Search on a biconvex problem
Relax $v_i \in [0,1]$
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i - \sum_i v_i / K$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
Convex in $w$ for fixed $v$; linear (hence convex) in $v$ for fixed $w$
23
Self-Paced Learning
Start with an initial estimate $w_0$
Update $h_i^* = \operatorname{argmax}_{h \in H} w_t^\top \Phi(x_i, y_i, h)$
Update $w_{t+1}$ by solving the convex problem
$\min_{w,\xi,v} \|w\|^2 + C \sum_i v_i \xi_i - \sum_i v_i / K$
s.t. $w^\top \Phi(x_i, y_i, h_i^*) - w^\top \Phi(x_i, y, h) \geq \Delta(y_i, y, h) - \xi_i \quad \forall y, h$
Decrease $K \leftarrow K / \mu$
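Putting the pieces together, a hedged sketch of the full loop, reusing the hypothetical `hinge_upper_bound`, `update_v`, and `cccp` sketches from earlier; the annealing factor mu is a free parameter (here arbitrarily 1.3), and the fallback to all samples when none are selected is my own design choice.

```python
def self_paced_learning(xs, ys, w0, phi, delta, Y, H, C=1.0, K=10.0,
                        mu=1.3, n_rounds=8):
    """SPL sketch: alternate the closed-form v-step with a w-step on the
    currently selected samples, then decrease K so that harder samples
    are admitted in later rounds."""
    w = w0.copy()
    for _ in range(n_rounds):
        losses = [hinge_upper_bound(w, x, y, phi, delta, Y, H)
                  for x, y in zip(xs, ys)]
        v = update_v(losses, C, K)              # select the easy subset
        idx = [i for i, vi in enumerate(v) if vi > 0]
        idx = idx or list(range(len(xs)))       # fallback: use all samples
        w = cccp([xs[i] for i in idx], [ys[i] for i in idx],
                 w, phi, delta, Y, H, C)        # w-step on that subset
        K /= mu                                 # loosen the 1/(KC) threshold
    return w
```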
24
Outline
  • Latent Structural SVM
  • Concave-Convex Procedure
  • Self-Paced Learning
  • Experiments

25
Object Detection
Input $x$ - image
Output $y \in Y$
Latent $h$ - box
$\Delta$ - 0/1 loss
$Y = \{\text{Bison, Deer, Elephant, Giraffe, Llama, Rhino}\}$
Feature $\Phi(x,y,h)$ - HOG
26
Object Detection
Mammals Dataset
271 images, 6 classes
90/10 train/test split
4 folds
27
Object Detection
(Figure: Self-Paced vs. CCCP)
28
Object Detection
(Figure: Self-Paced vs. CCCP)
29
Object Detection
(Figure: Self-Paced vs. CCCP)
30
Object Detection
(Figure: Self-Paced vs. CCCP)
31
Object Detection
(Figures: objective value and test error, Self-Paced vs. CCCP)
32
Handwritten Digit Recognition
Input $x$ - image
Output $y \in Y$
Latent $h$ - rotation
$\Delta$ - 0/1 loss
MNIST dataset
$Y = \{0, 1, \ldots, 9\}$
Feature $\Phi(x,y,h)$ - PCA projection
33
Handwritten Digit Recognition
(Figure: SPL vs. CCCP; markers denote a significant difference)
34
Handwritten Digit Recognition
(Figure: SPL vs. CCCP; markers denote a significant difference)
35
Handwritten Digit Recognition
(Figure: SPL vs. CCCP; markers denote a significant difference)
36
Handwritten Digit Recognition
(Figure: SPL vs. CCCP; markers denote a significant difference)
37
Motif Finding
Input $x$ - DNA sequence
Output $y \in Y$, $Y = \{0, 1\}$
Latent $h$ - motif location
$\Delta$ - 0/1 loss
Feature $\Phi(x,y,h)$ - Ng and Cardie, ACL 2002
38
Motif Finding
UniProbe Dataset
40,000 sequences
50/50 train/test split
5 folds
39
Motif Finding
Average Hamming distance of inferred motifs
(Figure: SPL vs. CCCP comparison)
40
Motif Finding
(Figure: objective value, SPL vs. CCCP)
41
Motif Finding
(Figure: test error, SPL vs. CCCP)
42
Noun Phrase Coreference
Input $x$ - nouns
Output $y$ - clustering
Latent $h$ - spanning forest over nouns
Feature $\Phi(x,y,h)$ - Yu and Joachims, ICML 2009
43
Noun Phrase Coreference
60 documents
MUC6 Dataset
1 predefined fold
50/50 train/test split
44
Noun Phrase Coreference
(Figure: MITRE loss and pairwise loss; markers denote significant improvement or significant decrement)
45
Noun Phrase Coreference
(Figure: SPL results under MITRE loss and pairwise loss)
46
Noun Phrase Coreference
(Figure: SPL results under MITRE loss and pairwise loss)
47
Summary
  • Automatic self-paced learning
  • Concave-biconvex procedure
  • Generalization to other latent-variable models, e.g. Expectation-Maximization:
  • E-step remains the same
  • M-step includes the indicator variables $v_i$ (see the sketch below)

Kumar, Packer and Koller, NIPS 2010
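As a loose illustration of that last point, here is a minimal self-paced EM sketch for a one-dimensional Gaussian mixture: the E-step is untouched, while the M-step weights responsibilities by indicators $v_i$ chosen by thresholding each sample's negative log-likelihood at 1/K. The threshold rule, the fixed shared variance, and all names are assumptions made for illustration, not the construction in the paper.

```python
import numpy as np

def spl_em_gmm(x, means, sigma=1.0, K=2.0, mu_anneal=1.3, n_iters=25):
    """Self-paced EM sketch: 1-D Gaussian mixture with fixed shared sigma
    and uniform mixing weights. E-step unchanged; M-step uses only samples
    whose negative log-likelihood beats the (assumed) 1/K threshold."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float).copy()
    for _ in range(n_iters):
        # E-step: responsibilities r[i, k], exactly as in standard EM
        logp = (-0.5 * (x[:, None] - means[None, :]) ** 2 / sigma**2
                - 0.5 * np.log(2 * np.pi * sigma**2))
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # v-step: per-sample negative log-likelihood under the mixture
        nll = -np.log(np.exp(logp).mean(axis=1))
        v = (nll < 1.0 / K).astype(float)
        # M-step: the usual mean update, with responsibilities scaled by v_i
        wts = r * v[:, None]
        denom = wts.sum(axis=0)
        upd = denom > 0                     # only update components with mass
        means[upd] = (wts * x[:, None]).sum(axis=0)[upd] / denom[upd]
        K /= mu_anneal                      # admit harder samples over time
    return means
```

For example, `spl_em_gmm(np.concatenate([np.random.normal(-2, 1, 200), np.random.normal(3, 1, 200)]), means=[0.0, 1.0])` fits the two means while initially ignoring low-likelihood points.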