CS 676: Computer vision and image processing

1
CS 676: Computer vision and image processing

Discovering objects and their categories in
image collections
Ankur Saxena (Y3167058)
Sujeet Kumar (Y3177360)

2
Motivation

Scope for improvement in topic discovery
approaches
Successful results of generative methods (pLSA,
LDA) with the bag-of-words representation in the
text domain
Extending the approach to images using a visual
analogue of text words
Detecting objects and their categories in an
unsupervised manner for efficient handling of
large data

3
Bag-of-words

A document is a collection of M words
A corpus (collection of documents) is summarized
in a term-document matrix
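
As a minimal sketch of the term-document matrix described above, the snippet below counts word occurrences per document. The function name `term_document_matrix` and the toy word labels are illustrative assumptions, not part of the original slides.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (vocabulary, counts) with counts[i][j] = n(w_i, d_j)."""
    # Sorted vocabulary gives every word a stable row index.
    vocabulary = sorted({word for doc in documents for word in doc})
    per_doc = [Counter(doc) for doc in documents]
    # A Counter returns 0 for words that do not occur in a document.
    counts = [[c[word] for c in per_doc] for word in vocabulary]
    return vocabulary, counts


# Toy corpus: each document is a list of (visual) word labels.
docs = [["wheel", "wheel", "petal"], ["petal", "petal", "roof"]]
vocab, n_wd = term_document_matrix(docs)
for word, row in zip(vocab, n_wd):
    print(word, row)
```

Rows index words and columns index documents, matching the "word i in document j" convention used on the later slides.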

4
Probabilistic Latent Semantic Analysis (pLSA)

Goal: to discover topics in a collection of
documents (images)
pLSA parameters are fitted with the expectation
maximization (EM) algorithm [Hofmann 99]

5
pLSA
Slide credit: Josef Sivic
6
Learning the pLSA parameters
Observed counts of word i in document j
Maximize likelihood of data using EM.
Slide credit: Josef Sivic
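
In the standard pLSA formulation (Hofmann 1999), which this deck appears to follow, the model and the log-likelihood maximized by EM can be written as below. The symbols $z_k$ for a latent topic and $K$ for the number of topics are assumptions; $n(w_i, d_j)$ is the observed count of word $i$ in document $j$.

```latex
P(w_i \mid d_j) = \sum_{k=1}^{K} P(w_i \mid z_k)\, P(z_k \mid d_j),
\qquad
\mathcal{L} = \sum_{i} \sum_{j} n(w_i, d_j)\, \log P(w_i \mid d_j)
```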
7
EM for pLSA (training on a corpus)

E-step: compute posterior probabilities for the
latent variables
M-step: maximize the expected complete data
log-likelihood
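
The two steps above can be sketched in plain Python as follows. This is an illustrative implementation of the textbook pLSA updates, not the presenters' code; the names `plsa_em`, `log_likelihood` and `normalize`, the random initialization and the toy count matrix are all assumptions.

```python
import math
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa_em(n_wd, num_topics, iterations=50, seed=0):
    """Fit pLSA by EM; n_wd[i][j] is the count of word i in document j.

    Returns (p_w_z, p_z_d) with p_w_z[k][i] = P(w_i | z_k) and
    p_z_d[j][k] = P(z_k | d_j).
    """
    rng = random.Random(seed)
    num_words, num_docs = len(n_wd), len(n_wd[0])
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(num_words)])
             for _ in range(num_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(num_topics)])
             for _ in range(num_docs)]
    for _ in range(iterations):
        new_w_z = [[0.0] * num_words for _ in range(num_topics)]
        new_z_d = [[0.0] * num_topics for _ in range(num_docs)]
        for i in range(num_words):
            for j in range(num_docs):
                if n_wd[i][j] == 0:
                    continue
                # E-step: posterior P(z_k | w_i, d_j) over the topics.
                post = normalize([p_w_z[k][i] * p_z_d[j][k]
                                  for k in range(num_topics)])
                # Accumulate expected counts for the M-step.
                for k in range(num_topics):
                    expected = n_wd[i][j] * post[k]
                    new_w_z[k][i] += expected
                    new_z_d[j][k] += expected
        # M-step: renormalize the expected counts into probabilities.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


def log_likelihood(n_wd, p_w_z, p_z_d):
    """Sum over observed (i, j) of n(w_i, d_j) * log P(w_i | d_j)."""
    total = 0.0
    for i, row in enumerate(n_wd):
        for j, n in enumerate(row):
            if n:
                p = sum(p_w_z[k][i] * p_z_d[j][k] for k in range(len(p_w_z)))
                total += n * math.log(p)
    return total


# Toy counts: words 0-1 occur only in documents 0-1, words 2-3 only in 2-3.
counts = [[4, 5, 0, 0], [3, 4, 0, 0], [0, 0, 5, 3], [0, 0, 4, 6]]
p_w_z, p_z_d = plsa_em(counts, num_topics=2)
for j, topics in enumerate(p_z_d):
    print("document", j, "dominant topic", topics.index(max(topics)))
```

EM never decreases the log-likelihood from one iteration to the next, so `log_likelihood` is a convenient convergence check. A practical implementation would normally use array operations rather than nested loops.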

8
Graphical View of pLSA

9
Training

The visual analogues of words are first obtained
from SIFT descriptors, which are invariant to
scaling, affine transforms, etc.
The number of topics is chosen a priori
Using annotated images, a probability matrix is
obtained for the mixing coefficients, i.e. the
probability of a word given a topic
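
The slides do not say how descriptors are turned into a word dictionary. A common choice, assumed here, is to cluster the descriptors with k-means and treat each cluster center as one visual word. The sketch below uses 2-D toy vectors in place of 128-D SIFT descriptors, and the function names and the evenly spaced initialization are illustrative simplifications.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, point):
    """Index of the center closest to point."""
    return min(range(len(centers)),
               key=lambda c: squared_distance(centers[c], point))


def build_vocabulary(descriptors, num_words, iterations=20):
    """Cluster descriptors with plain k-means; centers act as visual words."""
    # Assumed simplification: start from evenly spaced descriptors.
    step = len(descriptors) // num_words
    centers = [list(descriptors[c * step]) for c in range(num_words)]
    for _ in range(iterations):
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest(centers, d)].append(d)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


# Toy "descriptors": two well-separated groups of 2-D points.
toy_descriptors = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
vocabulary = build_vocabulary(toy_descriptors, num_words=2)
print(vocabulary)
```

k-means is sensitive to initialization, so a real pipeline would typically use random restarts or a library implementation.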

10
Testing a new image

Given a novel image, significant features are
first extracted using SIFT
These features are then compared with the word
dictionary to get n(w,d), i.e. the number of
occurrences of word w in document d
The topic for the document is obtained by
maximizing the following expression (EM algorithm)
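
One common way to carry out this step is the fold-in heuristic: keep P(w|z) from training fixed and run EM only over the topic mixture of the new image. Whether the presenters used exactly this procedure is an assumption. In the sketch below, the dictionary centers, the topic-word matrix and the toy descriptors are made-up values for illustration.

```python
def word_counts(descriptors, centers):
    """n(w, d): number of descriptors whose nearest visual word is w."""
    counts = [0] * len(centers)
    for d in descriptors:
        dists = [sum((x - y) ** 2 for x, y in zip(c, d)) for c in centers]
        counts[dists.index(min(dists))] += 1
    return counts


def fold_in(counts, p_w_z, iterations=50):
    """Estimate P(z | d_new) by EM with P(w | z) held fixed."""
    num_topics = len(p_w_z)
    p_z_d = [1.0 / num_topics] * num_topics
    for _ in range(iterations):
        new = [0.0] * num_topics
        for i, n in enumerate(counts):
            if n == 0:
                continue
            # E-step: posterior over topics for word i in the new image.
            joint = [p_w_z[k][i] * p_z_d[k] for k in range(num_topics)]
            norm = sum(joint)
            if norm == 0:
                continue
            for k in range(num_topics):
                new[k] += n * joint[k] / norm
        total = sum(new)
        if total == 0:
            break
        # M-step: update only the topic mixture of the new image.
        p_z_d = [v / total for v in new]
    return p_z_d


# Hypothetical dictionary of three visual words (2-D toy centers).
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
# Hypothetical trained P(w | z): one row per topic, one column per word.
p_w_z = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]

new_image = [(0.2, 0.1), (0.1, -0.1), (4.8, 5.2), (0.0, 0.3)]
n_w = word_counts(new_image, centers)
p_z = fold_in(n_w, p_w_z)
print("word counts:", n_w)
print("most likely topic:", p_z.index(max(p_z)))
```

The image is assigned to the topic with the largest estimated P(z|d).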

11
Datasets and standard routines

For training and testing, the Caltech dataset
will be used; it contains annotated images of
objects such as cars and motorbikes
For calculating SIFT features, we are using the
implementation available at
http://www.robots.ox.ac.uk/vgg/research/affine

12
Results for an experiment over three categories:
bike, sunflower and pagoda temple

Testing result on a new (out-of-training-set)
image of a motorbike. The SIFT words are shown in
the image, with red ones corresponding to bike,
green to sunflower and blue to pagoda.

13
Contd.

Testing result for a new sunflower image

14
Contd.

Testing result for a new pagoda temple image

15
References

J. Sivic, B. C. Russell, A. A. Efros, A. Zisserman, W. T. Freeman. Discovering object categories in image collections. 2005.
G. Csurka, C. Bray, C. Dance, and L. Fan. Visual categorization with bags of keypoints. 2004.
T. Hofmann. Probabilistic latent semantic indexing. 1999.
D. Lowe. Object recognition from local scale-invariant features. 1999.
R. Zhang and Z. Zhang. Hidden semantic concept discovery in region based image retrieval. 2004.