Title: Local and Global Algorithms for Disambiguation to Wikipedia
Lev Ratinov (1), Dan Roth (1), Doug Downey (2), Mike Anderson (3)
(1) University of Illinois at Urbana-Champaign
(2) Northwestern University
(3) Rexonomy
March 2011
2. Information overload
3. Organizing knowledge
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
Chicago VIII was one of the early-'70s-era Chicago albums to catch my ear, along with Chicago II.
4. Cross-document co-reference resolution
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
Chicago VIII was one of the early-'70s-era Chicago albums to catch my ear, along with Chicago II.
5. Reference resolution (disambiguation to Wikipedia)
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
Chicago VIII was one of the early-'70s-era Chicago albums to catch my ear, along with Chicago II.
6. The reference collection has structure
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
Chicago VIII was one of the early-'70s-era Chicago albums to catch my ear, along with Chicago II.
[Figure: relation edges among the Chicago senses, labeled Is_a, Is_a, Used_In, Released, Succeeded]
7. Analysis of Information Networks
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997.
Chicago VIII was one of the early-'70s-era Chicago albums to catch my ear, along with Chicago II.
8. Here: Wikipedia as a knowledge resource, but we can use other resources
[Figure: relation edges among the Chicago senses, labeled Is_a, Is_a, Used_In, Released, Succeeded]
9. Talk outline
- High-level algorithmic approach:
  - bipartite graph matching with global and local inference.
- Local inference.
  - Experiments and results.
- Global inference.
  - Experiments and results.
- Results, conclusions.
- Demo.
10. Problem formulation: a matching/ranking problem
[Figure: mentions in text documents (news, blogs) matched to Wikipedia articles]
11. Local approach
- G is a solution to the problem:
  - a set of pairs (m, t), where
  - m is a mention in the document, and
  - t is the matched Wikipedia title.
12. Local approach
- G is a solution to the problem:
  - a set of pairs (m, t), where
  - m is a mention in the document, and
  - t is the matched Wikipedia title.
- The local score φ(m, t) of matching the mention to the title.
13. Local + Global: using the Wikipedia structure
- A global term evaluating how good the structure of the solution is.
14. Can be reduced to an NP-hard problem
15. A tractable variation
- Invent a surrogate solution G:
  - disambiguate each mention independently;
  - evaluate the structure based on pair-wise coherence scores ψ(ti, tj).
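The tractable variation above can be sketched directly: build a surrogate solution by taking each mention's best local candidate, then re-rank every mention's candidates by local score plus summed coherence with the other mentions' surrogate titles. All names here are hypothetical; `local` plays the role of the local score φ and `psi` the pairwise coherence ψ.

```python
# Sketch of the tractable local+global objective (hypothetical names).
# local[m][t] : local score phi(m, t) for mention m and candidate title t
# psi(t1, t2) : pairwise coherence between two Wikipedia titles

def disambiguate(mentions, candidates, local, psi):
    # Step 1: surrogate solution -- the best title per mention, judged locally.
    surrogate = {m: max(candidates[m], key=lambda t: local[m][t]) for m in mentions}
    # Step 2: re-rank each mention's candidates against the surrogate titles
    # of the *other* mentions.
    solution = {}
    for m in mentions:
        others = [surrogate[n] for n in mentions if n != m]
        solution[m] = max(
            candidates[m],
            key=lambda t: local[m][t] + sum(psi(t, o) for o in others),
        )
    return solution
```

In a toy instance this shows the point of the global term: a coherence bonus from an unambiguous co-occurring title can overturn a narrowly wrong local decision.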
16. Talk outline
- High-level algorithmic approach:
  - bipartite graph matching with global and local inference.
- Local inference.
  - Experiments and results.
- Global inference.
  - Experiments and results.
- Results, conclusions.
- Demo.
17. I. Baseline: P(Title | Surface Form)
P(Title | "Chicago")
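This baseline is typically estimated from Wikipedia hyperlink statistics: how often each surface form appears as anchor text linking to each title. A minimal sketch with invented counts:

```python
from collections import Counter, defaultdict

# anchor_counts[surface][title] = how often this surface form links to the title
anchor_counts = defaultdict(Counter)

def observe(surface, title):
    """Record one anchor-text occurrence of `surface` linking to `title`."""
    anchor_counts[surface][title] += 1

def baseline(surface):
    """Estimate P(title | surface) from the observed link counts."""
    counts = anchor_counts[surface]
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}
```

For example, if "Chicago" links to the city 99 times out of 100, `baseline("Chicago")` gives the city a prior of 0.99, mirroring the score table on the "Putting it all together" slide.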
18. II. Context(Title)
Context(Charcoal): "a font called __ is used to"
19. III. Text(Title)
Just the text of the page (one per title)
20. Putting it all together
- City vs. font: (0.99 vs. 0.0001, 0.01 vs. 0.2, 0.03 vs. 0.01)
- Band vs. font: (0.001 vs. 0.0001, 0.001 vs. 0.2, 0.02 vs. 0.01)
- Training: ranking SVM.
  - Consider all title pairs.
  - Train a ranker on the pairs (learn to prefer the correct solution).
- Inference: knockout tournament.
- Key: abstracts over the text; learns which scores are important.

Title        | Score Baseline | Score Context | Score Text
Chicago_city | 0.99           | 0.01          | 0.03
Chicago_font | 0.0001         | 0.2           | 0.01
Chicago_band | 0.001          | 0.001         | 0.02
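The knockout-tournament inference can be sketched in a few lines: the current winner plays the next candidate, and the pairwise ranker decides each match. In this sketch a simple sum of the three scores stands in for the learned pairwise ranker.

```python
def knockout(candidates, prefer):
    """Knockout tournament: the running winner plays each remaining
    candidate in turn; `prefer(a, b)` returns whichever of the two
    the pairwise ranker prefers."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        winner = prefer(winner, challenger)
    return winner

# Score table from the slide: (baseline, context, text) per candidate title.
scores = {
    "Chicago_city": (0.99, 0.01, 0.03),
    "Chicago_font": (0.0001, 0.2, 0.01),
    "Chicago_band": (0.001, 0.001, 0.02),
}

def prefer(a, b):
    # Stand-in for the trained ranking SVM: compare summed scores.
    return a if sum(scores[a]) >= sum(scores[b]) else b
```

With n candidates this needs only n − 1 ranker calls, instead of scoring all pairs at inference time.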
21. Example: font or city?
Text(Chicago_city), Context(Chicago_city)
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Text(Chicago_font), Context(Chicago_font)
22. Lexical matching
Text(Chicago_city), Context(Chicago_city)
Cosine similarity with TF-IDF weighting.
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Text(Chicago_font), Context(Chicago_font)
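The lexical-matching step, TF-IDF-weighted cosine similarity between the mention's context and a candidate title's Text/Context representation, can be sketched with the standard library (a minimal sketch; real systems use sparse vectors and precomputed IDF tables):

```python
import math
from collections import Counter

def tfidf_cosine(doc_a, doc_b, corpus):
    """Cosine similarity of two token lists under TF-IDF weighting,
    with IDF estimated from `corpus` (a list of token lists)."""
    n = len(corpus)

    def idf(term):
        df = sum(1 for doc in corpus if term in doc)
        return math.log((n + 1) / (df + 1)) + 1  # smoothed IDF

    def vec(doc):
        tf = Counter(doc)
        return {t: c * idf(t) for t, c in tf.items()}

    va, vb = vec(doc_a), vec(doc_b)
    dot = sum(va[t] * vb.get(t, 0.0) for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
           math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0
```

Rare, discriminative terms ("font", "diagonal") get high IDF, so overlap on them pulls the mention toward Chicago_font rather than the city.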
23. Ranking: font vs. city
Text(Chicago_city), Context(Chicago_city)
Similarity scores for Chicago_city: (0.5, 0.2, 0.1, 0.8)
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Similarity scores for Chicago_font: (0.3, 0.2, 0.3, 0.5)
Text(Chicago_font), Context(Chicago_font)
24. Train a ranking SVM
Text(Chicago_city), Context(Chicago_city)
(0.5, 0.2, 0.1, 0.8)
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Difference vector: (0.2, 0, -0.2, 0.3), label -1
(0.3, 0.2, 0.3, 0.5)
Text(Chicago_font), Context(Chicago_font)
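The pairwise-ranking reduction above turns each candidate pair into one training example: the difference of their feature vectors, labeled -1 here because the second candidate (the font) should outrank the first (the city). The talk trains a ranking SVM on these; to keep this sketch dependency-free, a perceptron learns from the same difference vectors, which is the identical reduction with a different loss.

```python
# Pairwise ranking via difference vectors. Each training pair is
# (x_a - x_b, y), with y = +1 if candidate a is the correct title
# and y = -1 if candidate b is. A ranking SVM learns a weight vector
# w from these; this sketch substitutes a perceptron.

def train_ranker(pairs, epochs=100):
    """pairs: list of (diff_vector, label) with label in {+1, -1}."""
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in pairs:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misranked pair: perceptron update
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

def score(w, x):
    """Dot product: a positive score on (x_a - x_b) prefers candidate a."""
    return sum(wi * xi for wi, xi in zip(w, x))
```

Trained on the slide's single pair, the learned weights score the city-minus-font difference negative, i.e. the ranker prefers the font, as the label demands.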
25. Scaling issues: one of our key contributions
Text(Chicago_city), Context(Chicago_city)
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Text(Chicago_font), Context(Chicago_font)
26. Scaling issues
Text(Chicago_city), Context(Chicago_city)
These Text and Context representations are big, and must be loaded into memory from disk.
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Text(Chicago_font), Context(Chicago_font)
27. Improving performance
Text(Chicago_city), Context(Chicago_city)
Rather than computing TF-IDF-weighted cosine similarity, we want to train a classifier on the fly; but due to the aggressive feature pruning, we choose PrTFIDF.
It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the "N".
Text(Chicago_font), Context(Chicago_font)
28. Performance (local only): ranking accuracy

Dataset        | Baseline (solvable) | Local TFIDF (solvable) | Local PrTFIDF (solvable)
ACE            | 94.05               | 95.67                  | 96.21
MSN News       | 81.91               | 84.04                  | 85.10
AQUAINT        | 93.19               | 94.38                  | 95.57
Wikipedia Test | 85.88               | 92.76                  | 93.59
29. Talk outline
- High-level algorithmic approach:
  - bipartite graph matching with global and local inference.
- Local inference.
  - Experiments and results.
- Global inference.
  - Experiments and results.
- Results, conclusions.
- Demo.
30. Co-occurrence(Title1, Title2)
The city senses of Boston and Chicago appear together often.
31. Co-occurrence(Title1, Title2)
Rock music and albums appear together often.
32. Global ranking
- How do we approximate the global semantic context in the document? (What is G?)
  - Use only non-ambiguous mentions for G.
  - Use the top baseline disambiguation for NER surface forms.
  - Use the top baseline disambiguation for all the surface forms.
- How do we define relatedness between two titles? (What is ψ?)
33. ψ: pair-wise relatedness between two titles
- Normalized Google Distance
- Pointwise Mutual Information
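Both relatedness measures can be computed from the two titles' sets of incoming links. A sketch using the standard NGD and PMI formulas over in-link sets (variable names are mine; real systems precompute these over a full Wikipedia link graph):

```python
import math

def ngd(inlinks_a, inlinks_b, num_articles):
    """Normalized Google Distance over Wikipedia in-link sets
    (lower = more related). inlinks_*: sets of linking-article ids."""
    a, b = len(inlinks_a), len(inlinks_b)
    common = len(inlinks_a & inlinks_b)
    if common == 0:
        return float("inf")  # no shared in-links: maximally unrelated
    return (math.log(max(a, b)) - math.log(common)) / (
        math.log(num_articles) - math.log(min(a, b))
    )

def pmi(inlinks_a, inlinks_b, num_articles):
    """Pointwise mutual information of the two titles' in-link events."""
    common = len(inlinks_a & inlinks_b)
    if common == 0:
        return 0.0
    p_ab = common / num_articles
    p_a = len(inlinks_a) / num_articles
    p_b = len(inlinks_b) / num_articles
    return math.log(p_ab / (p_a * p_b))
```

Titles co-linked by many of the same articles, like the city senses of Boston and Chicago, get low NGD and high PMI.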
34. What is the best G? (ranker accuracy, solvable mentions)

Dataset        | Baseline | Baseline + Global (unambiguous) | Baseline + Global (NER) | Baseline + Global (all mentions)
ACE            | 94.05    | 94.56                           | 96.21                   | 96.75
MSN News       | 81.91    | 84.46                           | 84.04                   | 88.51
AQUAINT        | 93.19    | 95.40                           | 94.04                   | 95.91
Wikipedia Test | 85.88    | 89.67                           | 89.59                   | 89.79
35. Results: ranker accuracy (solvable mentions)

Dataset        | Baseline | Baseline + Lexical | Baseline + Global (all mentions)
ACE            | 94.05    | 96.21              | 96.75
MSN News       | 81.91    | 85.10              | 88.51
AQUAINT        | 93.19    | 95.57              | 95.91
Wikipedia Test | 85.88    | 93.59              | 89.79
36. Results: Local + Global

Dataset        | Baseline | Baseline + Lexical | Baseline + Lexical + Global
ACE            | 94.05    | 96.21              | 97.83
MSN News       | 81.91    | 85.10              | 87.02
AQUAINT        | 93.19    | 95.57              | 94.38
Wikipedia Test | 85.88    | 93.59              | 94.18
37. Talk outline
- High-level algorithmic approach:
  - bipartite graph matching with global and local inference.
- Local inference.
  - Experiments and results.
- Global inference.
  - Experiments and results.
- Results, conclusions.
- Demo.
38. Conclusions
- Dealt with a very large-scale knowledge acquisition and extraction problem.
- State-of-the-art algorithmic tools that exploit the content and structure of the network.
- Formulated a framework for local + global reference resolution and disambiguation into knowledge networks.
- Proposed local and global algorithms with state-of-the-art performance.
- Addressed scaling, a major issue.
- Identified key remaining challenges (next slide).
39. We want to know what we don't know
- Not dealt with well in the literature:
  - "As Peter Thompson, a 16-year-old hunter, said .."
  - "Dorothy Byrne, a state coordinator for the Florida Green Party .."
- We train a separate SVM classifier to identify such cases. The features are:
  - all the baseline, lexical, and semantic scores of the top candidate;
  - the score assigned to the top candidate by the ranker;
  - the confidence of the ranker on the top candidate with respect to the second-best disambiguation;
  - the Good-Turing probability of an out-of-Wikipedia occurrence for the mention.
- Limited success; future research.
40. Comparison to the previous state of the art (all mentions, including out-of-Wikipedia)

Dataset        | Baseline | Milne & Witten | Our system (GLOW)
ACE            | 69.52    | 72.76          | 77.25
MSN News       | 72.83    | 68.49          | 74.88
AQUAINT        | 82.64    | 83.61          | 83.94
Wikipedia Test | 81.77    | 80.32          | 90.54
41. Demo