CS 676: Computer vision and image processing


1
CS 676 Computer vision and image processing
  • Discovering objects and their categories in
    image collections
  • Ankur Saxena (Y3167058)
  • Sujeet Kumar (Y3177360)

2
Motivation
  • Scope for improvement in topic discovery
    approaches
  • Successful results of generative methods (pLSA,
    LDA) with the bag-of-words representation in the
    text domain
  • Extending the approach to images using a visual
    analogue of text words
  • Detecting objects and their categories in an
    unsupervised manner, so that large image
    collections can be handled efficiently

3
Bag-of-words
  • A document is treated as an unordered collection
    of M words
  • A corpus (collection of documents) is summarized
    in a term-document matrix
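
A minimal sketch of how a term-document matrix can be built from tokenized documents. The function name `term_document_matrix` and the toy corpus are illustrative assumptions, not part of the original slides.

```python
from collections import Counter

def term_document_matrix(documents):
    """Return (vocabulary, counts) with counts[i][j] = n(w_i, d_j)."""
    # Vocabulary: every distinct word in the corpus, in sorted order.
    vocabulary = sorted({word for doc in documents for word in doc})
    index = {word: i for i, word in enumerate(vocabulary)}
    # One row per word, one column per document.
    counts = [[0] * len(documents) for _ in vocabulary]
    for j, doc in enumerate(documents):
        for word, n in Counter(doc).items():
            counts[index[word]][j] = n
    return vocabulary, counts

# Toy corpus (hypothetical data, for illustration only).
docs = [["wheel", "engine", "wheel"], ["petal", "stem"], ["wheel", "petal"]]
vocab, n_wd = term_document_matrix(docs)
```

Word order inside each document is discarded; only the counts survive, which is what makes the representation a "bag" of words.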

4
Probabilistic Latent Semantic Analysis (pLSA)
  • Goal: to discover topics from a collection of
    documents (images)
  • pLSA parameters are fitted with the
    expectation-maximization (EM) algorithm
    [Hofmann 99]

5
pLSA
Slide credit: Josef Sivic
6
Learning the pLSA parameters
n(w_i, d_j): observed count of word i in document j
Maximize the likelihood of the data using EM.
Slide credit: Josef Sivic
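
For reference, the likelihood in the standard pLSA formulation (Hofmann 99) can be written as below. The index ranges V (vocabulary size), N (number of documents) and K (number of topics) are notation assumed here, since the equation itself did not survive in the transcript.

```latex
L = \prod_{i=1}^{V} \prod_{j=1}^{N} P(w_i \mid d_j)^{\,n(w_i, d_j)},
\qquad
P(w_i \mid d_j) = \sum_{k=1}^{K} P(w_i \mid z_k)\, P(z_k \mid d_j)
```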
7
EM for pLSA (training on a corpus)
  • E-step: compute the posterior probabilities of the
    latent variables
  • M-step: maximize the expected complete-data
    log-likelihood
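
The two steps above can be sketched in NumPy as follows. This is an illustrative implementation of the standard pLSA updates, not the authors' code; the function names, the random initialization, the fixed iteration count and the toy count matrix are all assumptions.

```python
import numpy as np

def plsa_em(n_wd, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM; n_wd[i, j] is the count of word i in document j.

    Returns P(w|z) with shape (W, K) and P(z|d) with shape (K, D).
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = n_wd.shape
    eps = 1e-12  # guards against division by zero
    # Random initialization, normalized so each column is a distribution.
    p_w_z = rng.random((n_words, n_topics))
    p_w_z /= p_w_z.sum(axis=0, keepdims=True)
    p_z_d = rng.random((n_topics, n_docs))
    p_z_d /= p_z_d.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z | w, d), shape (W, K, D).
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both tables from expected counts n(w,d) P(z|w,d).
        expected = n_wd[:, None, :] * post
        p_w_z = expected.sum(axis=2)
        p_w_z /= p_w_z.sum(axis=0, keepdims=True) + eps
        p_z_d = expected.sum(axis=0)
        p_z_d /= p_z_d.sum(axis=0, keepdims=True) + eps
    return p_w_z, p_z_d

def log_likelihood(n_wd, p_w_z, p_z_d):
    """Data log-likelihood: sum over (w, d) of n(w,d) * log P(w|d)."""
    p_w_d = p_w_z @ p_z_d
    return float((n_wd * np.log(p_w_d + 1e-12)).sum())

# Toy counts (hypothetical): words 0-1 occur in documents 0-1, words 2-3 in 2-3.
n_wd = np.array([[4, 5, 0, 0],
                 [3, 4, 0, 0],
                 [0, 0, 5, 3],
                 [0, 0, 4, 6]], dtype=float)
p_w_z, p_z_d = plsa_em(n_wd, n_topics=2)
```

EM only finds a local maximum of the likelihood, so in practice it is common to run it from several random initializations and keep the best result.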

8
Graphical View of pLSA

9
Training
  • The visual analogues of words are first obtained
    from SIFT descriptors, which are invariant to
    scale and rotation and robust to moderate affine
    distortion
  • The number of topics is chosen a priori
  • Using annotated images, a probability matrix is
    obtained for P(w|z), i.e. the probability of a
    word given a topic
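
The slides do not say how descriptors are turned into a word dictionary. A common choice, used in the bags-of-keypoints approach of Csurka et al., is k-means clustering of the descriptors, with each cluster centre acting as one visual word. The sketch below assumes that choice; random vectors stand in for real 128-dimensional SIFT descriptors, and `build_vocabulary` is a hypothetical helper name.

```python
import numpy as np

def build_vocabulary(descriptors, n_words, n_iter=20, seed=0):
    """Cluster descriptors (N x D) with plain k-means.

    Returns the cluster centres (n_words x D), one per visual word.
    """
    rng = np.random.default_rng(seed)
    # Initialize the centres with distinct randomly chosen descriptors.
    picks = rng.choice(len(descriptors), n_words, replace=False)
    centres = descriptors[picks].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every descriptor to every centre.
        diff = descriptors[:, None, :] - centres[None, :, :]
        labels = (diff ** 2).sum(axis=2).argmin(axis=1)
        for k in range(n_words):
            members = descriptors[labels == k]
            if len(members):  # an empty cluster keeps its previous centre
                centres[k] = members.mean(axis=0)
    return centres

# Stand-in data (assumption): 200 random 128-D vectors instead of real SIFT output.
descriptors = np.random.default_rng(1).normal(size=(200, 128))
dictionary = build_vocabulary(descriptors, n_words=10)
```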

10
Testing a new image
  • Given a novel image, significant features are
    first extracted using SIFT
  • These features are then compared with the word
    dictionary to get n(w,d), i.e. the number of
    occurrences of word w in document d
  • The topic for the document is obtained by
    maximizing the following expression (EM algorithm)
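
A hedged sketch of the test procedure described above: descriptors are assigned to their nearest dictionary word to give n(w,d), then EM is run with P(w|z) held fixed so that only P(z|d) for the new image is estimated. This "fold-in" scheme is the usual pLSA approach and is assumed here, because the expression from the original slide is not in the transcript. The function names and the toy dictionary, descriptors and P(w|z) table are illustrative.

```python
import numpy as np

def word_counts(descriptors, dictionary):
    """Assign each descriptor to its nearest word; return the count vector n(w, d)."""
    diff = descriptors[:, None, :] - dictionary[None, :, :]
    words = (diff ** 2).sum(axis=2).argmin(axis=1)
    return np.bincount(words, minlength=len(dictionary))

def infer_topics(n_w, p_w_z, n_iter=50):
    """EM with P(w|z) held fixed: estimate P(z|d) for one new document."""
    n_topics = p_w_z.shape[1]
    p_z = np.full(n_topics, 1.0 / n_topics)  # uniform starting point
    for _ in range(n_iter):
        # E-step: P(z | w, d) for every word, shape (W, K).
        joint = p_w_z * p_z[None, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: update only the topic mixture of this document.
        p_z = (n_w[:, None] * post).sum(axis=0)
        p_z /= p_z.sum() + 1e-12
    return p_z

# Toy example (hypothetical values): 3 visual words in 2-D, 2 topics.
dictionary = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
descriptors = np.array([[0.5, 0.2], [9.0, 1.0], [9.5, -0.5], [0.1, 9.0]])
p_w_z = np.array([[0.8, 0.1],
                  [0.1, 0.8],
                  [0.1, 0.1]])  # each column sums to 1

n_w = word_counts(descriptors, dictionary)
p_z_d = infer_topics(n_w, p_w_z)
topic = int(p_z_d.argmax())  # most probable topic for the new image
```

Taking the argmax of P(z|d) is one simple way to report a single topic per image; the full distribution can be kept if an image may contain several object categories.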

11
Datasets and standard routines
  • For training and testing, the Caltech dataset
    is used; it contains annotated images of
    objects such as cars and motorbikes
  • For calculating SIFT features, we use the
    implementation available at
    http://www.robots.ox.ac.uk/vgg/research/affine

12
Results for an experiment over three categories:
bike, sunflower and pagoda temple
  • Testing results on a new (out-of-training-set)
    image of a motorbike. The SIFT words are shown in
    the image, with red ones corresponding to bike,
    green to sunflower and blue to pagoda.
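
One way such a colouring can be produced is to give each visual word occurrence the topic that maximizes P(z|w,d), which is proportional to P(w|z) P(z|d). Whether the original results used exactly this rule is an assumption, as are the topic order and all numbers in the toy example below.

```python
import numpy as np

# Assumed topic order: 0 = bike, 1 = sunflower, 2 = pagoda.
TOPIC_COLOURS = ["red", "green", "blue"]

def colour_words(word_ids, p_w_z, p_z_d):
    """Colour each word occurrence by its most probable topic.

    Uses argmax over z of P(w|z) * P(z|d), proportional to P(z|w,d).
    """
    post = p_w_z[word_ids] * p_z_d[None, :]  # (n_occurrences, K), unnormalized
    return [TOPIC_COLOURS[k] for k in post.argmax(axis=1)]

# Toy values (hypothetical): 3 words, 3 topics, one image dominated by topic 0.
p_w_z = np.array([[0.7, 0.1, 0.1],
                  [0.2, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
p_z_d = np.array([0.6, 0.3, 0.1])
colours = colour_words(np.array([0, 1, 2, 0]), p_w_z, p_z_d)
```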

13
Contd.
  • Testing result for a new sunflower image

14
Contd.
  • Testing result for a new pagoda temple image

15
References
  • J. Sivic, B. C. Russell, A. A. Efros, A.
    Zisserman, and W. T. Freeman. Discovering object
    categories in image collections. 2005.
  • G. Csurka, C. Bray, C. Dance, and L. Fan. Visual
    categorization with bags of keypoints. 2004.
  • T. Hofmann. Probabilistic latent semantic
    indexing. 1999.
  • D. Lowe. Object recognition from local
    scale-invariant features. 1999.
  • R. Zhang and Z. Zhang. Hidden semantic concept
    discovery in region based image retrieval. 2004.