Local and Global Algorithms for Disambiguation to Wikipedia (presentation transcript)

1
Local and Global Algorithms for Disambiguation to Wikipedia
  • Lev Ratinov¹, Dan Roth¹, Doug Downey², Mike Anderson³
  • ¹University of Illinois at Urbana-Champaign
  • ²Northwestern University
  • ³Rexonomy

March 2011
2
Information overload
3
Organizing knowledge
"It's a version of Chicago, the standard classic Macintosh menu font, with that distinctive thick diagonal in the 'N'."
"Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997."
"Chicago VIII was one of the early-'70s-era Chicago albums to catch my ear, along with Chicago II."
4
Cross-document co-reference resolution
(Same three "Chicago" example sentences as above.)
5
Reference resolution (disambiguation to Wikipedia)
(Same three "Chicago" example sentences as above.)
6
The reference collection has structure
(Same three "Chicago" example sentences as above.)
Relations in the figure: Is_a, Is_a, Used_In, Released, Succeeded.
7
Analysis of Information Networks
(Same three "Chicago" example sentences as above.)
8
Here: Wikipedia as a knowledge resource, but we can use other resources.
(Same relation graph as before: Is_a, Is_a, Used_In, Released, Succeeded.)
9
Talk outline
  • High-level algorithmic approach
  • Bipartite graph matching with global and local inference
  • Local inference
  • Experiments and results
  • Global inference
  • Experiments and results
  • Results and conclusions
  • Demo

10
Problem formulation - matching/ranking problem
(Figure: text documents (news, blogs) matched against Wikipedia articles.)
11
Local approach
  • G is a solution to the problem:
  • a set of pairs (m, t), where
  • m is a mention in the document
  • t is the matched Wikipedia title

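The local formulation above, choosing a title for each mention independently, can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation; `phi_scores` and all numbers are made up:

```python
# Minimal sketch of the local approach: each mention is matched to the
# title with the highest local score, independently of all other mentions.
# phi_scores is a hypothetical dict: mention -> {candidate_title: score}.

def local_disambiguation(phi_scores):
    """Return a {mention: title} solution maximizing the sum of local scores."""
    solution = {}
    for mention, candidates in phi_scores.items():
        # Independent argmax per mention: no interaction between mentions.
        solution[mention] = max(candidates, key=candidates.get)
    return solution

phi = {
    "Chicago": {"Chicago_city": 0.99, "Chicago_font": 0.0001},
    "Macintosh": {"Macintosh_computer": 0.9, "Macintosh_cultivar": 0.1},
}
print(local_disambiguation(phi))
```

Because each mention is resolved on its own, this scales linearly in the number of mentions, which is exactly what the global term later has to improve on.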
12
Local approach
(Same setup as the previous slide.)
Local score of matching the mention to the title
13
Local + Global: using the Wikipedia structure
A global term evaluating how good the
structure of the solution is
14
Can be reduced to an NP-hard problem
15
A tractable variation
  • Invent a surrogate solution G:
  • disambiguate each mention independently.
  • Evaluate the structure based on pairwise coherence scores ψ(ti, tj).

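The tractable variation can be sketched as a two-step re-ranking: build the surrogate solution by independent local disambiguation, then re-score each candidate against it. This is a sketch under assumed toy `phi` and `psi` functions, not the paper's exact scoring:

```python
# Sketch of the tractable global variation: build a surrogate solution G by
# independent local disambiguation, then re-rank each mention's candidates
# by adding pairwise coherence psi(t_i, t_j) with the surrogate titles.

def global_rerank(phi_scores, psi):
    # Step 1: surrogate solution G, the independent local argmax per mention.
    surrogate = {m: max(c, key=c.get) for m, c in phi_scores.items()}
    solution = {}
    for mention, candidates in phi_scores.items():
        others = [t for m, t in surrogate.items() if m != mention]
        # Step 2: local score plus summed coherence with the surrogate titles.
        solution[mention] = max(
            candidates,
            key=lambda t: candidates[t] + sum(psi(t, o) for o in others),
        )
    return solution

def psi(a, b):
    # Toy coherence: only the city and state senses are related here.
    return 0.5 if {a, b} == {"Chicago_city", "Illinois_state"} else 0.0

phi = {
    "Chicago": {"Chicago_city": 0.5, "Chicago_font": 0.6},
    "Illinois": {"Illinois_state": 0.9},
}
print(global_rerank(phi, psi))
```

In this toy run the local scores alone would pick the font sense of "Chicago", but coherence with the unambiguous "Illinois" mention flips the decision to the city, which is the point of the global term.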
16
Talk outline
(Outline recap; next: local inference.)

17
I. Baseline: P(Title | Surface Form)
P(Title | "Chicago")
18
II. Context(Title)
Context(Charcoal): "a font called __ is used to ..."
19
III. Text(Title)
Just the text of the page (one per title)
20
Putting it all together
  • City vs. Font: (0.99 vs. 0.0001, 0.01 vs. 0.2, 0.03 vs. 0.01)
  • Band vs. Font: (0.001 vs. 0.0001, 0.001 vs. 0.2, 0.02 vs. 0.01)
  • Training: a ranking SVM.
  • Consider all title pairs.
  • Train a ranker on the pairs (learn to prefer the correct solution).
  • Inference: a knockout tournament.
  • Key: abstracts over the text; learns which scores are important.

Title          Score: Baseline   Score: Context   Score: Text
Chicago_city   0.99              0.01             0.03
Chicago_band   0.001             0.001            0.02
Chicago_font   0.0001            0.2              0.01
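The knockout-tournament inference can be sketched with a stand-in pairwise ranker. Here a toy linear ranker (summed feature scores) replaces the trained ranking SVM; the feature vectors are the slide's numbers:

```python
# Sketch of knockout-tournament inference: a pairwise ranker compares two
# candidate titles and the winner advances; the survivor is the answer.
# The "ranker" here is a toy stand-in (sum of scores), not the trained SVM.

def prefer(a, b, features):
    """Return the preferred of two titles under a toy linear ranker."""
    return a if sum(features[a]) >= sum(features[b]) else b

def knockout(candidates, features):
    winner = candidates[0]
    for challenger in candidates[1:]:
        winner = prefer(winner, challenger, features)
    return winner

# (baseline, context, text) scores from the example table.
features = {
    "Chicago_city": (0.99, 0.01, 0.03),
    "Chicago_band": (0.001, 0.001, 0.02),
    "Chicago_font": (0.0001, 0.2, 0.01),
}
print(knockout(list(features), features))
```

The tournament needs only pairwise comparisons, which is exactly what a ranker trained on title pairs provides.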
21
Example: font or city?
Text(Chicago_city), Context(Chicago_city)
(The "It's a version of Chicago ..." sentence again.)
Text(Chicago_font), Context(Chicago_font)
22
Lexical matching
Text(Chicago_city), Context(Chicago_city)
Cosine similarity, TF-IDF weighting
(The "It's a version of Chicago ..." sentence again.)
Text(Chicago_font), Context(Chicago_font)
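The lexical-matching step, TF-IDF-weighted cosine similarity between the mention's context and each title's Text/Context representation, can be sketched over a tiny toy corpus (the documents and vocabulary are invented for illustration):

```python
import math
from collections import Counter

# Sketch of lexical matching: TF-IDF vectors over a toy two-document
# "Wikipedia", then cosine similarity against the mention's context.

docs = {
    "Chicago_city": "chicago is a city in illinois on lake michigan",
    "Chicago_font": "chicago is a sans serif font used in mac menus",
}
context = "a version of chicago the classic macintosh menu font"

def tfidf_vec(text, df, n_docs):
    """Term-frequency counts weighted by smoothed inverse document frequency."""
    tf = Counter(text.split())
    return {w: tf[w] * math.log((1 + n_docs) / (1 + df[w])) for w in tf}

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

df = Counter()
for text in docs.values():
    df.update(set(text.split()))

n = len(docs)
vecs = {t: tfidf_vec(text, df, n) for t, text in docs.items()}
ctx = tfidf_vec(context, df, n)
best = max(vecs, key=lambda t: cosine(ctx, vecs[t]))
print(best)
```

Note how the IDF weighting does the work: "chicago" occurs in both documents, so it gets zero weight, and the discriminative word "font" decides the match.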
23
Ranking font vs. city
Text(Chicago_city), Context(Chicago_city): scores (0.5, 0.2, 0.1, 0.8)
Text(Chicago_font), Context(Chicago_font): scores (0.3, 0.2, 0.3, 0.5)
(Scored against the "It's a version of Chicago ..." sentence.)
24
Train a ranking SVM
Text(Chicago_city), Context(Chicago_city): (0.5, 0.2, 0.1, 0.8)
Text(Chicago_font), Context(Chicago_font): (0.3, 0.2, 0.3, 0.5)
Pairwise training example: difference vector (0.2, 0, -0.2, 0.3), label -1
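The pairwise transform behind a ranking SVM turns each candidate pair into one training example whose features are the difference of the two score vectors; the sign of the label encodes which candidate is correct (the slide's -1 follows one convention, the +1 below follows the opposite one, correct-minus-wrong):

```python
# Sketch of the pairwise (ranking SVM) transform: a pair of candidates
# becomes one example whose features are the difference of score vectors.
# Label convention here: correct-minus-wrong gets +1 (conventions vary).

def pairwise_example(correct_vec, wrong_vec):
    diff = tuple(round(a - b, 6) for a, b in zip(correct_vec, wrong_vec))
    return diff, +1

city = (0.5, 0.2, 0.1, 0.8)   # scores for Chicago_city (the correct sense here is the font)
font = (0.3, 0.2, 0.3, 0.5)   # scores for Chicago_font
print(pairwise_example(city, font))
```

A standard linear SVM trained on such difference vectors learns a weight vector w such that w·(correct - wrong) > 0, i.e. a global ordering over candidates.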
25
Scaling issues: one of our key contributions
26
Scaling issues
This stuff (the Text and Context representations) is big, and it is loaded into memory from disk.
27
Improving performance
Rather than computing TF-IDF-weighted cosine similarity, we want to train a classifier on the fly. But because of the aggressive feature pruning, we choose PrTFIDF.
28
Performance (local only): ranking accuracy
Dataset          Baseline (solvable)   Local TF-IDF (solvable)   Local PrTFIDF (solvable)
ACE              94.05                 95.67                     96.21
MSN News         81.91                 84.04                     85.10
AQUAINT          93.19                 94.38                     95.57
Wikipedia Test   85.88                 92.76                     93.59
29
Talk outline
(Outline recap; next: global inference.)

30
Co-occurrence(Title1,Title2)
The city senses of Boston and Chicago appear
together often.
31
Co-occurrence(Title1,Title2)
Rock music and albums appear together often
32
Global ranking
  • How do we approximate the global semantic context in the document? (What is G?)
  • Use only non-ambiguous mentions for G.
  • Use the top baseline disambiguation for NER surface forms.
  • Use the top baseline disambiguation for all the surface forms.
  • How do we define relatedness between two titles? (What is ψ?)

33
ψ: pairwise relatedness between two titles
  • Normalized Google Distance
  • Pointwise Mutual Information

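Both relatedness measures are commonly computed over the sets of Wikipedia pages that link to each title. A sketch over toy in-link sets; the total page count W, the link sets, and the exact normalization are illustrative assumptions:

```python
import math

# Sketch of the two pairwise relatedness measures over toy in-link sets
# (sets of page ids that link to each title). W is a made-up corpus size.

W = 1000  # hypothetical total number of Wikipedia pages

def ngd(a_links, b_links):
    """Normalized Google Distance over in-link sets (lower = more related)."""
    inter = len(a_links & b_links)
    if inter == 0:
        return float("inf")
    big = max(len(a_links), len(b_links))
    small = min(len(a_links), len(b_links))
    return (math.log(big) - math.log(inter)) / (math.log(W) - math.log(small))

def pmi(a_links, b_links):
    """Pointwise mutual information over in-link sets (higher = more related)."""
    inter = len(a_links & b_links)
    if inter == 0:
        return 0.0
    p_a, p_b, p_ab = len(a_links) / W, len(b_links) / W, inter / W
    return math.log(p_ab / (p_a * p_b))

boston_city = {1, 2, 3, 4, 5}
chicago_city = {3, 4, 5, 6, 7, 8}   # shares many in-links with Boston
chicago_font = {9, 10, 3}           # shares almost none
print(ngd(boston_city, chicago_city) < ngd(boston_city, chicago_font))
```

Either measure can serve as ψ(ti, tj); NGD is a distance (smaller is better) while PMI is a similarity, so their signs must be handled consistently when summed into the global term.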
34
What is the best G? (ranker accuracy, solvable mentions)
Dataset          Baseline   Baseline+Global (Unambiguous)   Baseline+Global (NER)   Baseline+Global (All Mentions)
ACE              94.05      94.56                           96.21                   96.75
MSN News         81.91      84.46                           84.04                   88.51
AQUAINT          93.19      95.40                           94.04                   95.91
Wikipedia Test   85.88      89.67                           89.59                   89.79
35
Results: ranker accuracy (solvable mentions)
Dataset          Baseline   Baseline+Lexical   Baseline+Global (All Mentions)
ACE              94.05      96.21              96.75
MSN News         81.91      85.10              88.51
AQUAINT          93.19      95.57              95.91
Wikipedia Test   85.88      93.59              89.79
36
Results: Local + Global
Dataset          Baseline   Baseline+Lexical   Baseline+Lexical+Global
ACE              94.05      96.21              97.83
MSN News         81.91      85.10              87.02
AQUAINT          93.19      95.57              94.38
Wikipedia Test   85.88      93.59              94.18
37
Talk outline
(Outline recap; next: results, conclusions, and demo.)

38
Conclusions
  • Dealt with a very large-scale knowledge acquisition and extraction problem.
  • Built state-of-the-art algorithmic tools that exploit the content and structure of the network.
  • Formulated a framework for local + global reference resolution and disambiguation into knowledge networks.
  • Proposed local and global algorithms with state-of-the-art performance.
  • Addressed scaling, a major issue.
  • Identified key remaining challenges (next slide).

39
We want to know what we don't know
  • Not dealt with well in the literature:
  • "As Peter Thompson, a 16-year-old hunter, said ..."
  • "Dorothy Byrne, a state coordinator for the Florida Green Party, ..."
  • We train a separate SVM classifier to identify such cases. The features are:
  • all the baseline, lexical and semantic scores of the top candidate;
  • the score assigned to the top candidate by the ranker;
  • the confidence of the ranker on the top candidate with respect to the second-best disambiguation;
  • the Good-Turing probability of out-of-Wikipedia occurrence for the mention.
  • Limited success; future research.

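One of the features above, the ranker's confidence on the top candidate relative to the second-best, can be sketched as a score margin. The candidate titles and scores here are hypothetical:

```python
# Sketch of one out-of-Wikipedia feature: the margin between the ranker's
# best and second-best candidate. A small margin suggests the ranker is
# unsure, which is evidence the mention may not be in Wikipedia at all.

def confidence_margin(ranker_scores):
    """Score gap between the top candidate and the runner-up."""
    ordered = sorted(ranker_scores.values(), reverse=True)
    if len(ordered) < 2:
        return ordered[0] if ordered else 0.0
    return ordered[0] - ordered[1]

# Two weak, nearly tied candidates: a likely out-of-Wikipedia mention.
print(round(confidence_margin({"Peter_Thompson_(footballer)": 0.31,
                               "Peter_Thompson_(politician)": 0.29}), 6))
```

This margin is one input among several to the separate SVM classifier; on its own it cannot distinguish a hard in-Wikipedia case from a true out-of-Wikipedia mention, which is part of why the slide reports only limited success.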
40
Comparison to the previous state of the art (all mentions, including OOW)
Dataset          Baseline   Milne & Witten   Our system (GLOW)
ACE              69.52      72.76            77.25
MSN News         72.83      68.49            74.88
AQUAINT          82.64      83.61            83.94
Wikipedia Test   81.77      80.32            90.54
41
Demo