Title: Discovering Collocation Patterns: from Visual Words to Visual Phrases
1 Discovering Collocation Patterns: from Visual Words to Visual Phrases
Junsong Yuan, Ying Wu and Ming Yang, CVPR07
2 Discovering Visual Collocation
3 An exciting idea detour
- Related work: J. Sivic et al., CVPR04; B. C. Russell et al., CVPR06; G. Wang et al., CVPR06; T. Quack et al., CIVR06; S. C. Zhu et al., IJCV05
4 Confrontation
- Spatial characteristics of images
  - Over-counting of co-occurrence frequency
- Uncertainty in visual patterns
  - Continuous visual features are quantized into words
  - Visual synonymy and polysemy
5 Our Approach
6 Selecting visual phrases
- Visual collocations may occur by chance
- Selecting phrases by a likelihood ratio test
  - H0: the occurrence of phrase P is randomly generated
  - H1: phrase P is generated by a hidden pattern
  - Prior
  - Likelihood
- Check whether words are co-located by chance or whether the co-location is statistically meaningful
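The hypothesis test above can be illustrated with a minimal sketch. It assumes that H0 means the words of a phrase occur independently across spatial groups, and that H1 is estimated by the observed joint frequency. The Prior and Likelihood terms of the talk are not recoverable from the slide, so the function name `log_likelihood_ratio` and this particular statistic are illustrative assumptions, not the authors' exact L(P).

```python
import math

def log_likelihood_ratio(groups, phrase):
    """Log ratio of the observed joint frequency of `phrase` (H1 estimate)
    to the frequency expected if its words were independent (H0 estimate).
    This is an assumed, simplified statistic, not the formula from the talk.
    """
    groups = [set(g) for g in groups]
    phrase = set(phrase)
    n = len(groups)
    # Fraction of groups that contain every word of the phrase.
    joint = sum(1 for g in groups if phrase <= g) / n
    if joint == 0:
        return float("-inf")
    # Product of per-word fractions: the chance-level expectation.
    independent = 1.0
    for word in phrase:
        independent *= sum(1 for g in groups if word in g) / n
    return math.log(joint / independent)

# Toy group database (each string is one group of visual words).
groups = ["ABFP", "CDES", "ABFT", "CDEX", "ABDK"]
score_ab = log_likelihood_ratio(groups, "AB")
score_ad = log_likelihood_ratio(groups, "AD")
```

A positive score means the words co-occur more often than independence would predict; a negative score means less often.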
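7 Discovery of visual phrases
[Figure: pipeline from the group database to the Visual Phrase Lexicon]
- Group database (example rows): {A, B, F, P}, {C, D, E, S}, {A, B, F, T}, {C, D, E, X}, {A, B, D, K}
- Closed FIM yields the frequent word-sets (|P| ≥ 2): AB, CD, DE, CE, AE, AF, BE, BF, CDE, ABF, ABE
- Pair-wise Student t-test; candidates ranked by the likelihood ratio L(P) (values shown: 15.7, 14.3, 12.2, 10.9, 9.7)
- Visual Phrase Lexicon (VPL): AB, AF, ABF, BF, CD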
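The "pair-wise student t-test" step of this pipeline could look like the following sketch. It assumes the classic collocation t-score from text analysis (observed co-occurrence rate versus the rate expected under independence, with a Bernoulli variance estimate). Whether the talk uses exactly this form is an assumption, and `pair_t_score` is a hypothetical name.

```python
import math

def pair_t_score(groups, a, b):
    """t-score for the hypothesis that words a and b co-occur by chance.
    Assumed form: t = (observed - expected) / sqrt(variance / n).
    """
    groups = [set(g) for g in groups]
    n = len(groups)
    count_a = sum(1 for g in groups if a in g)
    count_b = sum(1 for g in groups if b in g)
    count_ab = sum(1 for g in groups if a in g and b in g)
    observed = count_ab / n                     # sample mean of co-occurrence
    expected = (count_a / n) * (count_b / n)    # mean under independence
    variance = observed * (1.0 - observed)      # Bernoulli sample variance
    if variance == 0:
        return 0.0  # score is undefined here; treated as "no evidence"
    return (observed - expected) / math.sqrt(variance / n)

groups = ["ABFP", "CDES", "ABFT", "CDEX", "ABDK"]
t_ab = pair_t_score(groups, "A", "B")
t_ad = pair_t_score(groups, "A", "D")
```

With only five groups the scores carry no real statistical weight; the toy data only shows the sign and relative size of the statistic.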
8 Frequent Itemset Mining (FIM)
- If an itemset is frequent, then all of its subsets must also be frequent
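The subset property above is what lets Apriori-style miners prune candidates: a k-itemset is counted only if all of its (k-1)-subsets are already frequent. Below is a minimal sketch of plain Apriori on a toy group database. It is not the closed-FIM algorithm used in the talk, and the function name `apriori` and the support threshold are illustrative assumptions.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset found in at least
    `min_support` transactions (plain Apriori, not closed-itemset mining)."""
    transactions = [frozenset(t) for t in transactions]
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        candidates = set()
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must already be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

groups = ["ABFP", "CDES", "ABFT", "CDEX", "ABDK"]
result = apriori(groups, min_support=2)
# Keep only word-sets with at least two words, as candidate phrases.
phrases = sorted("".join(sorted(s)) for s in result if len(s) >= 2)
```

Closed itemset mining would additionally discard any itemset that has a superset with the same support.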
9 Phrase Summarization
- Measuring the similarity between visual phrases by KL-divergence (Yan et al., SIGKDD 05)
- Clustering visual phrases by Normalized-cut
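The two summarization steps can be sketched together under some assumptions. Each phrase is represented here by a hypothetical discrete "profile" distribution, the distance is a smoothed symmetrized KL divergence, and similarity is taken as exp(-distance). The normalized-cut criterion is then minimized by exhaustive search over bipartitions, which replaces the usual eigenvector relaxation and is only feasible for a handful of phrases.

```python
import math
from itertools import combinations

def sym_kl(p, q, eps=1e-9):
    """Symmetrized KL divergence KL(p||q) + KL(q||p) with additive smoothing."""
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))

def ncut_value(w, part_a, part_b):
    """Normalized-cut cost of splitting the nodes into part_a and part_b."""
    nodes = range(len(w))
    cut = sum(w[i][j] for i in part_a for j in part_b)
    assoc_a = sum(w[i][j] for i in part_a for j in nodes)
    assoc_b = sum(w[i][j] for i in part_b for j in nodes)
    return cut / assoc_a + cut / assoc_b

def best_bipartition(w):
    """Exhaustively search for the bipartition with the smallest Ncut value."""
    n = len(w)
    best = None
    for size in range(1, n // 2 + 1):
        for part_a in combinations(range(n), size):
            part_b = tuple(i for i in range(n) if i not in part_a)
            value = ncut_value(w, part_a, part_b)
            if best is None or value < best[0]:
                best = (value, part_a, part_b)
    return best

# Hypothetical profiles for four phrases (illustrative data only).
profiles = [[0.9, 0.1, 0.0, 0.0], [0.8, 0.2, 0.0, 0.0],
            [0.0, 0.0, 0.5, 0.5], [0.0, 0.0, 0.4, 0.6]]
similarity = [[math.exp(-sym_kl(p, q)) for q in profiles] for p in profiles]
value, part_a, part_b = best_bipartition(similarity)
```

Applying the bipartition recursively to each part would give more than two clusters.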
10 Pattern Summarization Results
Face database: top-10 phrases summarized into 6 semantic phrase patterns
Car database: top-10 phrases summarized into 2 semantic phrase patterns
11 Partition of visual word lexicon
- Metric learning method
- Neighborhood components analysis (NCA) (Goldberger et al., NIPS05)
  - improves the leave-one-out performance of the nearest neighbor classifier
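To make the leave-one-out criterion concrete, here is a small sketch of the NCA objective: the expected number of points correctly classified by a stochastic nearest-neighbor rule under a linear transform A. Only the objective is evaluated; the gradient-based optimization of A is omitted. The data and the candidate transforms are illustrative assumptions.

```python
import math

def nca_objective(A, X, y):
    """Expected number of correctly classified points under stochastic
    leave-one-out nearest-neighbor selection in the space z = A x."""
    Z = [[sum(a * v for a, v in zip(row, x)) for row in A] for x in X]
    total = 0.0
    for i in range(len(Z)):
        weights = []
        for j in range(len(Z)):
            if i == j:
                weights.append(0.0)  # a point never selects itself
            else:
                d2 = sum((u - v) ** 2 for u, v in zip(Z[i], Z[j]))
                weights.append(math.exp(-d2))
        norm = sum(weights)
        if norm > 0:
            # Probability that point i selects a neighbor of its own class.
            same = sum(w for w, label in zip(weights, y) if label == y[i])
            total += same / norm
    return total

# The first feature separates the classes; the second is noise.
X = [[0.0, 0.0], [0.2, 3.0], [1.0, 0.1], [1.2, 3.1]]
y = [0, 0, 1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[3.0, 0.0], [0.0, 0.0]]  # emphasize feature 1, drop feature 2
f_identity = nca_objective(identity, X, y)
f_stretched = nca_objective(stretched, X, y)
```

A transform that stretches the discriminative feature and suppresses the noisy one scores higher, which is the direction in which NCA moves A.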
12 Evaluation
- K-NN spatial group: K = 5
- Two image category databases: car (123 images) and face (435 images)
- Precision of visual phrase lexicon
  - the percentage of visual phrases Pi in the lexicon that are located in the foreground object
- Precision of background word lexicon
  - the percentage of background words Wi ∈ Ω− that are located in the background
- Percentage of images that are retrieved
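The K-NN spatial groups used in the evaluation can be sketched as follows: each visual word instance forms one group together with the words of its K nearest neighbors in the image plane. The function name `knn_groups`, the input layout and the toy coordinates are assumptions for illustration.

```python
import math

def knn_groups(points, words, k=5):
    """Build one spatial group per point: the word at that point plus
    the words at its k nearest neighbors (Euclidean distance)."""
    groups = []
    for i, (xi, yi) in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.hypot(points[j][0] - xi,
                                             points[j][1] - yi))
        groups.append({words[i]} | {words[j] for j in others[:k]})
    return groups

# Two well-separated clusters of three feature points each.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
words = ["A", "B", "C", "D", "E", "F"]
toy_groups = knn_groups(points, words, k=2)
```

Because neighboring points produce heavily overlapping groups, the same co-occurrence can be counted several times, which is the over-counting issue raised earlier in the talk.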
13 Results: visual phrases from car category
Visual phrase pattern 1: wheels
(different colors represent different semantic meanings)
Visual phrase pattern 2: car bodies
14 Results: visual phrases from face category
15 Comparison