Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification

Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik

Slides: 30
Transcript and Presenter's Notes

1
Learning Globally-Consistent Local Distance
Functions for Shape-Based Image Retrieval and
Classification
  • Andrea Frome, Yoram Singer, Fei Sha, Jitendra
    Malik

2
Goal
3
Nearest neighbor classification
D(image, image)
4
Nearest neighbor classification
D(image, image)
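A minimal sketch of nearest-neighbor classification with a pluggable distance function D, which is the setting these two slides describe. The function name `nearest_neighbor_label`, the toy scalar "images", and the absolute-difference distance are illustrative assumptions, not taken from the slides.

```python
def nearest_neighbor_label(query, train_items, train_labels, distance):
    """Return the label of the training item with the smallest distance to `query`.

    `distance(query, item)` may be any function returning a number; it does not
    need to be symmetric or satisfy the triangle inequality.
    """
    best_index = min(
        range(len(train_items)),
        key=lambda idx: distance(query, train_items[idx]),
    )
    return train_labels[best_index]


def abs_distance(a, b):
    """Toy distance on scalars, standing in for an image-to-image distance."""
    return abs(a - b)


# Hypothetical toy data: three "images" represented by single numbers.
toy_items = [0.0, 1.0, 5.0]
toy_labels = ["cat", "cat", "dog"]
predicted = nearest_neighbor_label(4.2, toy_items, toy_labels, abs_distance)
```

Swapping `abs_distance` for a learned distance changes the classifier without touching the search itself.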
5
Learning a Distance Metric from Relative
Comparisons
Schultz & Joachims, NIPS 2003
( - )T
7
Approach
image i
image j
8
Approach
image i
d_{ji,m}
image j
9
Approach
image i
D_{ji} = \sum_m w_{j,m} d_{ji,m}
image j
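A minimal sketch of the image-to-image distance on this slide: a weighted sum, over the features of image j, of elementary distances to image i. It assumes that each elementary distance is the Euclidean distance from the m-th feature of image j to its closest feature in image i; the slides do not spell this out, and the function names are illustrative.

```python
import math


def elementary_distances(features_j, features_i):
    """One distance per feature of image j.

    Assumption: each value is the Euclidean distance from that feature
    to its nearest feature in image i.
    """
    return [min(math.dist(f, g) for g in features_i) for f in features_j]


def image_distance(weights_j, features_j, features_i):
    """Weighted sum of image j's elementary distances to image i."""
    distances = elementary_distances(features_j, features_i)
    return sum(w * d for w, d in zip(weights_j, distances))


# Hypothetical toy data: two 2-D features per image.
features_j = [(0.0, 0.0), (1.0, 0.0)]
features_i = [(0.0, 1.0), (3.0, 0.0)]
weights_j = [2.0, 0.5]
distance_j_to_i = image_distance(weights_j, features_j, features_i)
```

Because the weights and features belong to image j only, the distance from j to i generally differs from the distance from i to j, which is why the function is not a metric.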
10
Approach
image i
D_{ji} < D_{ki}
image j
11
Core
w_{j,m} = ?
image j
12
Derivations
  • Notation
  • Large-margin formulation
  • Dual problem
  • Solution

13
Notations
for triplet i, j, k
14
Large-margin formulation
15
SVM
16
SVM
17
SVM
18
SVM
19
Soft-margin SVM
20
Derivation
21
Dual
22
Details: Features and descriptors
  • Find 400 features per image
  • Compute geometric blur descriptor

23
Descriptors
  • Geometric blur

24
Descriptors
  • Two sizes of geometric blur (42 pixels and 70
    pixels)
  • Each is 204 dimensions (4 orientations and 51
    samples each)
  • HSV histograms of 42-pixel patches

25
Choosing triplets
  • Caltech101 at 15 images per class
  • 31.8 million triplets
  • Many are easy to satisfy
  • For each image j, for each feature:
      • Find the N images I with closest features
      • For each negative example i in I, form
        triplets (j, k, i)
  • Eliminates half of triplets
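The selection procedure above can be sketched as follows. This is a minimal illustration under assumptions: "closest features" is taken to mean the smallest Euclidean distance between the given feature and any feature of the other image, k ranges over the other images in the same class as j, and the name `select_triplets` is hypothetical.

```python
import math


def select_triplets(features, labels, n_closest):
    """Collect triplets (j, k, i): reference image j, same-class k, negative i.

    features[j] is the list of feature vectors of image j;
    labels[j] is its class label.
    """
    num_images = len(features)
    triplets = set()
    for j in range(num_images):
        positives = [
            k for k in range(num_images) if k != j and labels[k] == labels[j]
        ]
        others = [i for i in range(num_images) if i != j]
        for feature in features[j]:
            # Rank the other images by how close their best-matching feature is.
            ranked = sorted(
                others,
                key=lambda i: min(math.dist(feature, g) for g in features[i]),
            )
            for i in ranked[:n_closest]:
                if labels[i] != labels[j]:  # only negatives form triplets
                    for k in positives:
                        triplets.add((j, k, i))
    return triplets


# Hypothetical toy data: four images with one 1-D feature each.
toy_features = [[(0.0,)], [(1.0,)], [(1.5,)], [(9.0,)]]
toy_labels = ["a", "a", "b", "b"]
hard_triplets = select_triplets(toy_features, toy_labels, n_closest=1)
all_triplets = select_triplets(toy_features, toy_labels, n_closest=3)
```

With a small N, only negatives that look similar to image j in at least one feature generate triplets, which drops the constraints that are easy to satisfy.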

26
Choosing C
27
Choosing C
  • Train with multiple values of C, testing on a
    held-out part of the training set
  • Choose whichever gives the best results
  • For each C, run online version of the training
    algorithm
  • Make one sweep through training triplets
  • For each misclassified triplet (i,j,k), update
    weights for the three images
  • Choose C which gets the most right answers
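A rough sketch of the selection loop described above. The slides do not give the update rule, so this uses a passive-aggressive style step capped at C and clipped to non-negative weights as a stand-in, and it updates on any margin violation rather than only on misclassified triplets. Each triplet is represented by an assumed difference vector x whose dot product with the weights equals D_{ki} - D_{ji}; all function names are hypothetical.

```python
def dot(weights, x):
    return sum(w * v for w, v in zip(weights, x))


def online_sweep(triplet_vectors, num_weights, C):
    """One pass over the training triplets (stand-in update, not the authors')."""
    weights = [0.0] * num_weights
    for x in triplet_vectors:
        loss = max(0.0, 1.0 - dot(weights, x))  # hinge loss with unit margin
        norm_sq = sum(v * v for v in x)
        if loss > 0.0 and norm_sq > 0.0:
            step = min(C, loss / norm_sq)  # step size capped by C
            # Move toward satisfying the triplet, keeping weights non-negative.
            weights = [max(0.0, w + step * v) for w, v in zip(weights, x)]
    return weights


def count_satisfied(weights, triplet_vectors):
    """Number of triplets ranked correctly, i.e. with D_ki - D_ji > 0."""
    return sum(1 for x in triplet_vectors if dot(weights, x) > 0.0)


def choose_C(train_vectors, heldout_vectors, num_weights, candidate_Cs):
    """Pick the candidate C whose one-sweep weights rank the most held-out triplets correctly."""
    return max(
        candidate_Cs,
        key=lambda C: count_satisfied(
            online_sweep(train_vectors, num_weights, C), heldout_vectors
        ),
    )


# Hypothetical toy data: two training triplets and one held-out triplet.
train_vectors = [[1.0, -1.0], [2.0, -0.5]]
heldout_vectors = [[1.0, -0.5]]
swept_weights = online_sweep(train_vectors, num_weights=2, C=0.1)
best_C = choose_C(train_vectors, heldout_vectors, 2, [0.01, 0.1, 1.0])
```

Running a single sweep per candidate keeps the search over C cheap compared with solving the full problem for each value.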

28
Results
  • At 15 training examples per class: 63.2% (3%
    improvement)
  • At 20 training examples per class: 66.6% (5%
    improvement)

29
Results
  • Confusion matrix

Hardest categories: crocodile, cougar_body,
cannon, bass
30
Questions
  • Is there any disadvantage to a non-metric
    distance function?
  • Could the images be embedded in a metric space?
  • Why not learn everything?
  • Include a feature for each image pixel
  • Include multiple types of descriptors
  • Could this be used to do unsupervised
    learning for sets of tagged images (e.g., for
    image segmentation)?
  • Can you learn a single distance per class?