1
Information Extraction, Conditional Random
Fields, and Social Network Analysis
  • Andrew McCallum
  • Computer Science Department
  • University of Massachusetts Amherst
  • Joint work with
  • Aron Culotta, Charles Sutton, Ben Wellner,
    Khashayar Rohanimanesh, Wei Li, Andres Corrada,
    Xuerui Wang

2
Goal
Mine actionable knowledge from unstructured text.
3
Extracting Job Openings from the Web
4
A Portal for Job Openings
5
Job Openings: Category = High Tech, Keyword = Java,
Location = U.S.
6
Data Mining the Extracted Job Information
7
IE from Chinese Documents regarding Weather
Department of Terrestrial System, Chinese Academy
of Sciences
200k documents, several millennia old:
  • Qing Dynasty Archives
  • memos
  • newspaper articles
  • diaries
8
IE from Research Papers
McCallum et al 99
9
IE from Research Papers
10
Mining Research Papers
Rosen-Zvi, Griffiths, Steyvers, Smyth, 2004
Giles et al
11
What is Information Extraction
As a family of techniques:
Information Extraction = segmentation + classification + association + clustering
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed
against the economic philosophy of open-source
software with Orwellian fervor, denouncing its
communal licensing as a "cancer" that stifled
technological innovation. Today, Microsoft
claims to "love" the open-source concept, by
which software code is made public to encourage
improvement and development by outside
programmers. Gates himself says Microsoft will
gladly disclose its crown jewels--the coveted
code behind the Windows operating system--to
select customers. "We can be open source. We
love the concept of shared source," said Bill
Veghte, a Microsoft VP. "That's a super-important
shift for us in terms of code access." Richard
Stallman, founder of the Free Software
Foundation, countered saying
Extracted segments: Microsoft Corporation; CEO; Bill Gates; Microsoft;
Gates; Microsoft; Bill Veghte; Microsoft; VP; Richard Stallman; founder;
Free Software Foundation
14
What is Information Extraction
As a family of techniques:
Information Extraction = segmentation + classification + association + clustering
Segments after classification, association, and clustering:

NAME              TITLE     ORGANIZATION
Bill Gates        CEO       Microsoft
Bill Veghte       VP        Microsoft
Richard Stallman  founder   Free Soft..
15
Larger Context
Document collection → Spider → Filter → IE → Database → Data Mining → Actionable knowledge
  • IE: Segment, Classify, Associate, Cluster
  • Data Mining: Discover patterns (entity types, links / relations, events)
  • Actionable knowledge: Prediction, Outlier detection, Decision support
16
Outline
  • Examples of IE and Data Mining.
  • Brief review of Conditional Random Fields
  • Joint inference: Motivation and examples
  • Joint Labeling of Cascaded Sequences (Belief
    Propagation)
  • Joint Labeling of Distant Entities (BP by Tree
    Reparameterization)
  • Joint Co-reference Resolution (Graph
    Partitioning)
  • Joint Segmentation and Co-ref (Iterated
    Conditional Samples)
  • Interactive IE
  • Two example projects
  • Email, contact management, and Social Network
    Analysis
  • Research Paper search and analysis

17
Hidden Markov Models
HMMs are the standard sequence modeling tool in
genomics, music, speech, NLP, ...
Two views: graphical model and finite state model.
[Diagram: states ... $S_{t-1}$, $S_t$, $S_{t+1}$ ... linked by transitions; each state emits one of the observations ... $O_{t-1}$, $O_t$, $O_{t+1}$ ...]
Generates:
  • State sequence
  • Observation sequence: $o_1\ o_2\ o_3\ o_4\ o_5\ o_6\ o_7\ o_8$
Parameters, for all states $S = \{s_1, s_2, \ldots\}$:
  • Start state probabilities: $P(s_t)$
  • Transition probabilities: $P(s_t \mid s_{t-1})$
  • Observation (emission) probabilities: $P(o_t \mid s_t)$,
    usually a multinomial over an atomic, fixed alphabet
Training: Maximize probability of training observations (w/
prior)
18
IE with Hidden Markov Models
Given a sequence of observations:
    Yesterday Rich Caruana spoke this example sentence.
and a trained HMM with the states:
  • person name
  • location name
  • background
Find the most likely state sequence (Viterbi):
    Yesterday Rich Caruana spoke this example sentence.
Any words said to be generated by the designated
"person name" state are extracted as a person name:
    Person name: Rich Caruana
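The Viterbi step above can be sketched in a few lines. This is a minimal illustration, not the model from the talk: the probability tables `START`, `TRANS`, `EMIT` and the floor value `UNSEEN` are made-up numbers chosen only so the example sentence decodes sensibly.

```python
import math

# Hypothetical toy parameters (assumptions for illustration, not from the slides).
STATES = ["background", "person", "location"]
START = {"background": 0.8, "person": 0.1, "location": 0.1}
TRANS = {
    "background": {"background": 0.8, "person": 0.1, "location": 0.1},
    "person": {"background": 0.4, "person": 0.5, "location": 0.1},
    "location": {"background": 0.5, "person": 0.1, "location": 0.4},
}
EMIT = {
    "background": {"Yesterday": 0.2, "spoke": 0.2, "this": 0.2,
                   "example": 0.2, "sentence.": 0.1},
    "person": {"Rich": 0.4, "Caruana": 0.4},
    "location": {"Amherst": 0.8},
}
UNSEEN = 1e-6  # floor probability for words a state never emitted


def emit(state, word):
    return EMIT[state].get(word, UNSEEN)


def viterbi(words):
    """Return the most likely state sequence for `words` (log-space Viterbi)."""
    # delta[s] = best log-probability of any state path ending in state s
    delta = {s: math.log(START[s]) + math.log(emit(s, words[0])) for s in STATES}
    backpointers = []
    for word in words[1:]:
        new_delta, pointer = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: delta[p] + math.log(TRANS[p][s]))
            new_delta[s] = (delta[prev] + math.log(TRANS[prev][s])
                            + math.log(emit(s, word)))
            pointer[s] = prev
        delta = new_delta
        backpointers.append(pointer)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: delta[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    return path[::-1]


words = "Yesterday Rich Caruana spoke this example sentence.".split()
path = viterbi(words)
# Words assigned to the designated "person" state are extracted as the name.
person_name = " ".join(w for w, s in zip(words, path) if s == "person")
print(person_name)
```

With these toy tables, only "Rich" and "Caruana" have non-negligible emission probability under the `person` state, so those two tokens are the ones extracted.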
19
We want More than an Atomic View of Words
Would like richer representation of text: many
arbitrary, overlapping features of the words.
  • identity of word
  • ends in "-ski"
  • is capitalized
  • is part of a noun phrase
  • is in a list of city names
  • is under node X in WordNet
  • is in bold font
  • is indented
  • is in hyperlink anchor
  • last person name was female
  • next two words are "and Associates"
[Diagram: states $S_{t-1}$, $S_t$, $S_{t+1}$ over observations $O_{t-1}$, $O_t$, $O_{t+1}$; one observation carries the features: is "Wisniewski", part of noun phrase, ends in "-ski"]
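A few of the overlapping features listed above can be sketched as a feature-extraction function. This is only an illustration under stated assumptions: the function name `word_features`, the feature-name strings, and the tiny `CITY_NAMES` lexicon are hypothetical, and features that need external resources (WordNet, formatting, parsing) are left out.

```python
CITY_NAMES = {"amherst", "boston"}  # hypothetical stand-in for a city lexicon


def word_features(words, t):
    """Return the set of binary features that fire for the word at position t."""
    word = words[t]
    feats = {"word=" + word.lower()}          # identity of word
    if word.lower().endswith("ski"):
        feats.add("ends-in-ski")
    if word[:1].isupper():
        feats.add("is-capitalized")
    if word.lower() in CITY_NAMES:
        feats.add("in-city-list")
    # Look-ahead feature: the next two words are "and Associates".
    if [w.lower() for w in words[t + 1:t + 3]] == ["and", "associates"]:
        feats.add("next-two=and-Associates")
    return feats


feats = word_features("Mr. Wisniewski and Associates".split(), 1)
print(sorted(feats))
```

Several of these features fire at once for the same token and are clearly correlated, which is exactly the situation the following slides discuss.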
20
Problems with Richer Representation and a Joint
Model
  • These arbitrary features are not independent.
  • Multiple levels of granularity (chars, words,
    phrases)
  • Multiple dependent modalities (words, formatting,
    layout)
  • Past & future
  • Two choices:

Ignore the dependencies. This causes
over-counting of evidence (à la naïve Bayes).
Big problem when combining evidence, as in
Viterbi!
Model the dependencies. Each state would have its
own Bayes Net. But we are already starved for
training data!
[Diagrams: one state/observation chain ($S_{t-1}$, $S_t$, $S_{t+1}$ over $O_{t-1}$, $O_t$, $O_{t+1}$) for each of the two choices]
21
Conditional Sequence Models
  • We prefer a model that is trained to maximize a
    conditional probability rather than a joint
    probability: $P(s \mid o)$ instead of $P(s, o)$
  • Can examine features, but not responsible for
    generating them.
  • Don't have to explicitly model their
    dependencies.
  • Don't waste modeling effort trying to generate
    what we are given at test time anyway.

22
From HMMs to Conditional Random Fields
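[Lafferty, McCallum, Pereira 2001]
Joint model: state chain ... $S_{t-1}$, $S_t$, $S_{t+1}$ ... together with the observations ... $O_{t-1}$, $O_t$, $O_{t+1}$ ...
Conditional model: the same state chain, conditioned on the observations ... $O_{t-1}$, $O_t$, $O_{t+1}$ ...
(The slide's equations for the joint and conditional distributions, and the definition introduced by "where", did not survive in this transcript.)
(A super-special case of Conditional Random
Fields.)
Set parameters by maximum likelihood, using an
optimization method on the gradient of L.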
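For reference, the step from an HMM to its conditional counterpart is commonly written as below. This is a sketch in standard notation, not a recovery of the slide's own equations; the potential names $\Phi_s$, $\Phi_o$, the weights $\lambda_k$ and the feature functions $f_k$ are assumed notation, and $s_0$ denotes a fixed start state.

```latex
% Joint (HMM) model, built from the transition and emission parameters:
P(s, o) = \prod_{t=1}^{|o|} P(s_t \mid s_{t-1}) \, P(o_t \mid s_t)

% The same model, conditioned on the observation sequence:
P(s \mid o) = \frac{1}{P(o)} \prod_{t=1}^{|o|} P(s_t \mid s_{t-1}) \, P(o_t \mid s_t)

% Generalization: replace the probabilities with arbitrary positive potentials
% and normalize per observation sequence:
P(s \mid o) = \frac{1}{Z(o)} \prod_{t=1}^{|o|} \Phi_s(s_t, s_{t-1}) \, \Phi_o(o_t, s_t),
\qquad
\Phi_o(o_t, s_t) = \exp\Big( \sum_k \lambda_k f_k(s_t, o_t) \Big)
```

The last form only has to score state sequences given the observations, which is why the features $f_k$ may overlap freely.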
23
Conditional Random Fields
[Lafferty, McCallum, Pereira 2001]
1. FSM special case: linear chain among
unknowns, parameters tied across time steps.
[Diagram: chain $S_t$, $S_{t+1}$, $S_{t+2}$, $S_{t+3}$, $S_{t+4}$ conditioned on $O = O_t, O_{t+1}, O_{t+2}, O_{t+3}, O_{t+4}$]
2. In general: CRFs = "Conditionally-trained
Markov Network", arbitrary structure among
unknowns
3. Relational Markov Networks [Taskar, Abbeel,
Koller 2002]: Parameters tied across hits
from SQL-like queries ("clique templates")
24
(Linear Chain) Conditional Random Fields
[Lafferty, McCallum, Pereira 2001]
Undirected graphical model, trained to maximize
conditional probability of outputs given inputs
Two views: finite state model and graphical model.
[Diagram: chain of FSM states (output seq) $y_{t-1}$, $y_t$, $y_{t+1}$, $y_{t+2}$, $y_{t+3}$ over observations (input seq) $x_{t-1}$, $x_t$, $x_{t+1}$, $x_{t+2}$, $x_{t+3}$]

input seq x:    said    Veghte   a       Microsoft   VP
output seq y:   OTHER   PERSON   OTHER   ORG         TITLE
25
Training CRFs
Gradient of the penalized log-likelihood with respect to each feature weight:
    (Feature count using correct labels)
  - (Feature count using predicted labels)
  - (Smoothing penalty)
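The three gradient terms can be checked on a toy linear-chain CRF by brute-force enumeration. This is a sketch under stated assumptions: the two-label set, the feature function `features`, and the Gaussian-prior variance `sigma2` are hypothetical, "feature count using predicted labels" is taken to mean the expected count under the model, and real implementations use forward-backward rather than enumerating all label sequences.

```python
import itertools
import math

LABELS = ["OTHER", "PERSON"]


def features(y_prev, y, x, t):
    """Hypothetical feature functions f_k(y_{t-1}, y_t, x, t) as sparse counts."""
    return {
        ("trans", y_prev, y): 1.0,
        ("cap", x[t][0].isupper(), y): 1.0,
    }


def seq_features(x, y):
    """Sum the feature counts over all positions of one labeled sequence."""
    total, prev = {}, "START"
    for t, label in enumerate(y):
        for k, v in features(prev, label, x, t).items():
            total[k] = total.get(k, 0.0) + v
        prev = label
    return total


def score(w, x, y):
    return sum(w.get(k, 0.0) * v for k, v in seq_features(x, y).items())


def log_likelihood(w, x, y, sigma2=10.0):
    all_y = itertools.product(LABELS, repeat=len(x))
    log_z = math.log(sum(math.exp(score(w, x, yy)) for yy in all_y))
    penalty = sum(v * v for v in w.values()) / (2 * sigma2)
    return score(w, x, y) - log_z - penalty


def gradient(w, x, y, sigma2=10.0):
    all_y = list(itertools.product(LABELS, repeat=len(x)))
    z = sum(math.exp(score(w, x, yy)) for yy in all_y)
    grad = dict(seq_features(x, y))      # feature count using correct labels
    for yy in all_y:                     # minus expected count under the model
        p = math.exp(score(w, x, yy)) / z
        for k, v in seq_features(x, yy).items():
            grad[k] = grad.get(k, 0.0) - p * v
    for k, v in w.items():               # minus smoothing (Gaussian prior) penalty
        grad[k] = grad.get(k, 0.0) - v / sigma2
    return grad


x = ["said", "Veghte", "a"]
y = ["OTHER", "PERSON", "OTHER"]
w = {("cap", True, "PERSON"): 0.5}
g = gradient(w, x, y)
```

A finite-difference comparison against `log_likelihood` is a quick way to confirm that the three terms really are the derivative of the penalized objective.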
26
Linear-chain CRFs vs. HMMs
  • Comparable computational efficiency for inference
  • Features may be arbitrary functions of any or all
    observations
  • Parameters need not fully specify generation of
    observations; can require less training data
  • Easy to incorporate domain knowledge

27
Table Extraction from Government Reports
Cash receipts from marketings of milk during 1995
at 19.9 billion dollars, was slightly below
1994. Producer returns averaged 12.93 per
hundredweight, 0.19 per hundredweight
below 1994. Marketings totaled 154 billion
pounds, 1 percent above 1994. Marketings
include whole milk sold to plants and dealers as
well as milk sold directly to consumers.

An estimated 1.56 billion pounds of milk
were used on farms where produced, 8 percent
less than 1994. Calves were fed 78 percent of
this milk with the remainder consumed in
producer households.

      Milk Cows and Production of Milk and Milkfat: United States, 1993-95
--------------------------------------------------------------------------------
      :  Number of  :            Production of Milk and Milkfat 2/
      :  Milk Cows  :----------------------------------------------------------
Year  :     1/      :   Per Milk Cow   :  Percentage of Fat   :      Total
      :             : Milk   : Milkfat : in All Milk Produced : Milk    : Milkfat
--------------------------------------------------------------------------------
      : 1,000 Head  :  --- Pounds ---  :       Percent        :  Million Pounds
1993  :    9,589    : 15,704 :   575   :         3.66         : 150,582 : 5,514.4
1994  :    9,500    : 16,175 :   592   :         3.66         : 153,664 : 5,623.7
1995  :    9,461    : 16,451 :   602   :         3.66         : 155,644 : 5,694.3
--------------------------------------------------------------------------------
1/ Average number during year, excluding heifers not yet fresh.
2/ Excludes milk sucked by calves.

28
Table Extraction from Government Reports
Pinto, McCallum, Wei, Croft, 2003 SIGIR
100 documents from www.fedstats.gov
CRF labels:
  • Non-Table
  • Table Title
  • Table Header
  • Table Data Row
  • Table Section Data Row
  • Table Footnote
  • ... (12 in all)

Features:
  • Percentage of digit chars
  • Percentage of alpha chars
  • Indented
  • Contains 5 consecutive spaces
  • Whitespace in this line aligns with prev.
  • ...
  • Conjunctions of all previous features, time
    offset: {0,0}, {-1,0}, {0,1}, {1,2}.
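The per-line features listed above are simple to compute. The sketch below is illustrative only: the function name `line_features`, the dictionary keys, and the 0.5 overlap threshold used for the whitespace-alignment feature are assumptions, not details from the cited paper, and the feature conjunctions are omitted.

```python
def line_features(line, prev_line=""):
    """Compute a few layout features for one line of a plain-text report."""
    n = max(len(line), 1)
    feats = {
        "pct_digit": sum(c.isdigit() for c in line) / n,
        "pct_alpha": sum(c.isalpha() for c in line) / n,
        "indented": line.startswith((" ", "\t")),
        "five_spaces": " " * 5 in line,
    }
    # Whitespace alignment: do most blank columns of this line coincide with
    # blank columns of the previous line? (0.5 is an arbitrary threshold.)
    gaps = {i for i, c in enumerate(line) if c == " "}
    prev_gaps = {i for i, c in enumerate(prev_line) if c == " "}
    feats["aligns_with_prev"] = bool(gaps) and len(gaps & prev_gaps) / len(gaps) > 0.5
    return feats


row = line_features("  1994     9,500    16,175", "  1993     9,589    15,704")
text = line_features("Calves were fed 78 percent of this milk")
print(row)
print(text)
```

A data row scores high on digits and alignment while a prose line scores high on alphabetic characters, which is the contrast the line labeler exploits.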

29
Table Extraction Experimental Results
[Pinto, McCallum, Wei, Croft, 2003 SIGIR]

                   Line labels, percent correct   Table segments, F1
HMM                65                             64
Stateless MaxEnt   85                             -
CRF                95                             92
30
IE from Research Papers
McCallum et al 99
31
IE from Research Papers
Method                             Field-level F1   Reference
Hidden Markov Models (HMMs)        75.6             [Seymore, McCallum, Rosenfeld, 1999]
Support Vector Machines (SVMs)     89.7             [Han, Giles, et al, 2003]
Conditional Random Fields (CRFs)   93.9             [Peng, McCallum, 2004]
Δ error: 40%
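The error-reduction figure can be reproduced with a little arithmetic, assuming it means the relative reduction in error (100 minus F1) of the CRF over the SVM baseline. That reading is an assumption, and the helper name `error_reduction` is hypothetical.

```python
def error_reduction(baseline_f1, new_f1):
    """Relative reduction in error, where error is taken as 100 - F1."""
    baseline_error = 100.0 - baseline_f1
    new_error = 100.0 - new_f1
    return (baseline_error - new_error) / baseline_error


# SVMs (89.7) versus CRFs (93.9): error drops from 10.3 to 6.1 points.
reduction = error_reduction(89.7, 93.9)
print(f"{100 * reduction:.1f}% relative error reduction")
```

Under this reading the reduction comes out at roughly 40 percent, in line with the figure on the slide.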
32
Named Entity Recognition
CRICKET - MILLNS SIGNS FOR BOLAND
CAPE TOWN 1996-08-22
South African provincial side Boland
said on Thursday they had signed Leicestershire
fast bowler David Millns on a one year contract.
Millns, who toured Australia with England A in
1992, replaces former England all-rounder Phillip
DeFreitas as Boland's overseas professional.

Labels   Examples
PER      Yayuk Basuki; Innocent Butare
ORG      3M; KDP; Cleveland
LOC      Cleveland; Nirmal Hriday; The Oval
MISC     Java; Basque; 1,000 Lakes Rally
33
Automatically Induced Features
[McCallum & Li, 2003, CoNLL]

Index   Feature
0       inside-noun-phrase ($o_{t-1}$)
5       stopword ($o_t$)
20      capitalized ($o_{t+1}$)
75      word=the ($o_t$)
100     in-person-lexicon ($o_{t-1}$)
200     word=in ($o_{t+2}$)
500     word=Republic ($o_{t+1}$)
711     word=RBI ($o_t$) & header=BASEBALL
1027    header=CRICKET ($o_t$) & in-English-county-lexicon ($o_t$)
1298    company-suffix-word (firstmention$_{t+2}$)
4040    location ($o_t$) & POS=NNP ($o_t$) & capitalized ($o_t$) & stopword ($o_{t-1}$)
4945    moderately-rare-first-name ($o_{t-1}$) & very-common-last-name ($o_t$)
4474    word=the ($o_{t-2}$) & word=of ($o_t$)
34
Named Entity Extraction Results
[McCallum & Li, 2003, CoNLL]

Method                                                  F1
HMMs: BBN's Identifinder                                73
CRFs w/out Feature Induction                            83
CRFs with Feature Induction (based on LikelihoodGain)   90
35
Outline
  • Examples of IE and Data Mining.
  • Brief review of Conditional Random Fields
  • Joint inference: Motivation and examples
  • Joint Labeling of Cascaded Sequences (Belief
    Propagation)
  • Joint Labeling of Distant Entities (BP by Tree
    Reparameterization)
  • Joint Co-reference Resolution (Graph
    Partitioning)
  • Joint Segmentation and Co-ref (Iterated
    Conditional Samples)
  • Interactive IE
  • Two example projects
  • Email, contact management, and Social Network
    Analysis
  • Research Paper search and analysis

36
Larger Context
Document collection → Spider → Filter → IE → Database → Data Mining → Actionable knowledge
  • IE: Segment, Classify, Associate, Cluster
  • Data Mining: Discover patterns (entity types, links / relations, events)
  • Actionable knowledge: Prediction, Outlier detection, Decision support
37
Problem
  • Combined in serial juxtaposition, IE and DM are
    unaware of each other's weaknesses and
    opportunities.
  • DM begins from a populated DB, unaware of where
    the data came from, or its inherent
    uncertainties.
  • IE is unaware of emerging patterns and
    regularities in the DB.
  • The accuracy of both suffers, and significant
    mining of complex text sources is beyond reach.

38
Solution
Document collection → Spider → Filter → IE → Database → Data Mining → Actionable knowledge
  • IE: Segment, Classify, Associate, Cluster
  • Data Mining: Discover patterns (entity types, links / relations, events)
  • Actionable knowledge: Prediction, Outlier detection, Decision support
  • New links between IE and Data Mining: Uncertainty Info and Emerging Patterns
39
Solution
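Document collection → Spider → Filter → Unified Model (IE + Probabilistic Model + Data Mining) → Actionable knowledge
  • IE: Segment, Classify, Associate, Cluster
  • Data Mining: Discover patterns (entity types, links / relations, events)
  • Actionable knowledge: Prediction, Outlier detection, Decision support
  • The Probabilistic Model takes the place of the Database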
40
Larger-scale Joint Inference for IE
  • What model structures will capture salient
    dependencies?
  • Will joint inference improve accuracy?
  • How to do inference in these large graphical
    models?
  • How to efficiently train these models, which are
    built from multiple large components?

41
1. Jointly labeling cascaded sequences: Factorial
CRFs
[Sutton, Rohanimanesh, McCallum, ICML 2004]
Layers, top to bottom:
  • Named-entity tag
  • Noun-phrase boundaries
  • Part-of-speech
  • English words
43
1. Jointly labeling cascaded sequences: Factorial
CRFs
[Sutton, Rohanimanesh, McCallum, ICML 2004]
Layers, top to bottom:
  • Named-entity tag
  • Noun-phrase boundaries
  • Part-of-speech
  • English words
But errors cascade--must be perfect at every
stage to do well.
44
1. Jointly labeling cascaded sequences: Factorial
CRFs
[Sutton, Rohanimanesh, McCallum, ICML 2004]
Layers, top to bottom:
  • Named-entity tag
  • Noun-phrase boundaries
  • Part-of-speech
  • English words
Joint prediction of part-of-speech and
noun-phrase in newswire, matching accuracy with
only 50% of the training data.
Inference: Tree reparameterization BP
[Wainwright et al, 2002]
45
2. Jointly labeling distant mentions: Skip-chain
CRFs
[Sutton, McCallum, SRL 2004]

Senator Joe Green said today . . . .
Green ran for . . .
Dependency among similar, distant mentions
ignored.
46
2. Jointly labeling distant mentions: Skip-chain
CRFs
[Sutton, McCallum, SRL 2004]

Senator Joe Green said today . . . .
Green ran for . . .
14% reduction in error on most repeated field in
email seminar announcements.
Inference: Tree reparameterization BP
[Wainwright et al, 2002]
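One simple way to decide where skip edges go is to connect repeated occurrences of the same capitalized token, as in the "Green" example above. The sketch below only builds that edge list; the function name `skip_edges` and the exact matching rule (identical capitalized strings) are assumptions for illustration, and no inference is performed.

```python
def skip_edges(tokens):
    """Return index pairs (i, j), i < j, linking identical capitalized tokens."""
    seen = {}    # token -> positions where it has already occurred
    edges = []
    for j, token in enumerate(tokens):
        if token[:1].isupper():
            for i in seen.get(token, []):
                edges.append((i, j))
            seen.setdefault(token, []).append(j)
    return edges


tokens = "Senator Joe Green said today . Green ran for".split()
edges = skip_edges(tokens)
print(edges)
```

These extra edges sit on top of the usual linear chain, so evidence about the first "Green" (preceded by "Senator Joe") can influence the label of the second.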
47
3. Joint co-reference among all pairs: Affinity
Matrix CRF
Also known as "entity resolution" and "object correspondence".
[Diagram: mentions ". . . Mr Powell . . .", ". . . Powell . . ." and ". . . she . . ."; one Y/N co-reference variable per pair; edge weights 45 (Mr Powell, Powell), -99 and 11]
25% reduction in error on co-reference of proper
nouns in newswire.
Inference: Correlational clustering graph
partitioning
[McCallum, Wellner, IJCAI WS 2003, NIPS 2004]
[Bansal, Blum, Chawla, 2002]
48
Coreference Resolution
AKA "record linkage", "database record
deduplication", "entity resolution", "object
correspondence", "identity uncertainty"
Input: News article, with named-entity "mentions" tagged
    Today Secretary of State Colin Powell met with . . . he . . .
    Condoleezza Rice . . . Mr Powell . . . she . . . Powell . . .
    President Bush . . . Rice . . . Bush . . .
Output: Number of entities, N = 3
  1. Secretary of State Colin Powell; he; Mr. Powell; Powell
  2. Condoleezza Rice; she; Rice
  3. President Bush; Bush
49
Inside the Traditional Solution
Pair-wise Affinity Metric
Mention (3): . . . Mr Powell . . .
Mention (4): . . . Powell . . .
Y/N?

Fires   Feature                                              Weight
N       Two words in common                                   29
Y       One word in common                                    13
Y       "Normalized" mentions are string identical            39
Y       Capitalized word in common                            17
Y       > 50% character tri-gram overlap                      19
N       < 25% character tri-gram overlap                     -34
Y       In same sentence                                       9
Y       Within two sentences                                   8
N       Further than 3 sentences apart                        -1
Y       "Hobbs Distance" < 3                                  11
N       Number of entities in between two mentions = 0        12
N       Number of entities in between two mentions > 4        -3
Y       Font matches                                           1
Y       Default                                              -19
OVERALL SCORE = 98 > threshold = 0
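The overall score is the sum of the weights of the features that fire (the rows marked Y). The snippet below simply replays that arithmetic with the numbers from the slide; the variable names are illustrative.

```python
# (fires, feature description, weight), copied from the slide's table.
FEATURES = [
    (False, "Two words in common", 29),
    (True, "One word in common", 13),
    (True, '"Normalized" mentions are string identical', 39),
    (True, "Capitalized word in common", 17),
    (True, "> 50% character tri-gram overlap", 19),
    (False, "< 25% character tri-gram overlap", -34),
    (True, "In same sentence", 9),
    (True, "Within two sentences", 8),
    (False, "Further than 3 sentences apart", -1),
    (True, '"Hobbs Distance" < 3', 11),
    (False, "Number of entities in between two mentions = 0", 12),
    (False, "Number of entities in between two mentions > 4", -3),
    (True, "Font matches", 1),
    (True, "Default", -19),
]

THRESHOLD = 0
overall_score = sum(weight for fires, _, weight in FEATURES if fires)
decision = "Y" if overall_score > THRESHOLD else "N"
print(overall_score, decision)
```

Summing only the firing rows gives 98, which is above the threshold of 0, so the pair is judged co-referent.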
50
The Problem
Pair-wise merging decisions are being made
independently from each other.
They should be made in relational dependence with
each other.
Affinity measures are noisy and imperfect.
Pair-wise decisions among the mentions ". . . Mr Powell . . .", ". . . Powell . . ." and ". . . she . . .":
  • Mr Powell, Powell: affinity 98, decision Y
  • the two pairs involving "she": affinity -104, decision N; affinity 11, decision Y
51
A Generative Model Solution
[Russell 2001], [Pasula et al 2002], [Milch et al
2003], [Marthi et al 2003]
(Applied to citation matching, and object
correspondence in vision)
[Diagram: generative model with N entities; node labels include id, words, context, surname, distance, fonts, age, gender, . . .]
52
A Markov Random Field for Co-reference
(MRF)
[McCallum & Wellner, 2003, ICML]
Make pair-wise merging decisions in dependent
relation to each other by
  • calculating a joint prob.
  • including all edge weights
  • adding dependence on consistent triangles.
[Diagram: mentions ". . . Mr Powell . . .", ". . . Powell . . ." and ". . . she . . ."; one Y/N variable per pair; edge weights 45 (Mr Powell, Powell), -30 and 11]
54
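A Markov Random Field for Co-reference
(MRF)
[McCallum & Wellner, 2003]
Mentions: ". . . Mr Powell . . .", ". . . Powell . . .", ". . . she . . ."

Edge weight               Decision   Contribution
45 (Mr Powell, Powell)    N          -(45)
-30                       N          -(-30)
11                        Y          +(11)
Total score: -4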
55
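A Markov Random Field for Co-reference
(MRF)
[McCallum & Wellner, 2003]
Mentions: ". . . Mr Powell . . .", ". . . Powell . . .", ". . . she . . ."

Edge weight               Decision   Contribution
45 (Mr Powell, Powell)    Y          +(45)
-30                       N          -(-30)
11                        Y          +(11)
Total score: -infinity (the triangle is inconsistent)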
56
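A Markov Random Field for Co-reference
(MRF)
[McCallum & Wellner, 2003]
Mentions: ". . . Mr Powell . . .", ". . . Powell . . .", ". . . she . . ."

Edge weight               Decision   Contribution
45 (Mr Powell, Powell)    Y          +(45)
-30                       N          -(-30)
11                        N          -(11)
Total score: 64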
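The three scored configurations can be reproduced by enumerating every Y/N assignment on the triangle: a Y edge adds its weight, an N edge subtracts it, and an assignment that violates transitivity scores minus infinity. In this sketch, the attachment of the weights -30 and 11 to specific pairs involving "she" is an assumption (the transcript does not say which is which), though the scores do not depend on that choice.

```python
import itertools
import math

MENTIONS = ["Mr Powell", "Powell", "she"]
# Edge weights; which "she" pair carries -30 and which carries 11 is assumed.
WEIGHTS = {(0, 1): 45, (1, 2): -30, (0, 2): 11}


def consistent(decisions):
    """A triangle with exactly two Y edges violates transitivity."""
    return sum(decisions.values()) != 2


def score(decisions):
    """Sum +w for Y edges and -w for N edges; -inf if inconsistent."""
    if not consistent(decisions):
        return -math.inf
    return sum(w if decisions[edge] else -w for edge, w in WEIGHTS.items())


configs = [dict(zip(WEIGHTS, values))
           for values in itertools.product([True, False], repeat=len(WEIGHTS))]
best = max(configs, key=score)
print(best, score(best))
```

The best consistent assignment merges "Mr Powell" with "Powell" and keeps "she" separate, which is the configuration with score 64.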
57
Inference in these MRFs: Graph Partitioning
[Boykov, Veksler, Zabih, 1999], [Kolmogorov &
Zabih, 2002], [Yu, Cross, Shi, 2002]
[Diagram: fully connected graph over the mentions ". . . Mr Powell . . .", ". . . Powell . . .", ". . . Condoleezza Rice . . ." and ". . . she . . ."; edge weights 45, -106, -30, -134, 11, 10]
58
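Inference in these MRFs: Graph Partitioning
[Boykov, Veksler, Zabih, 1999], [Kolmogorov &
Zabih, 2002], [Yu, Cross, Shi, 2002]
[Diagram: fully connected graph over the mentions ". . . Mr Powell . . .", ". . . Powell . . .", ". . . Condoleezza Rice . . ." and ". . . she . . ."; edge weights 45, -106, -30, -134, 11, 10]
Score of the partitioning shown: -22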
59
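Inference in these MRFs: Graph Partitioning
[Boykov, Veksler, Zabih, 1999], [Kolmogorov &
Zabih, 2002], [Yu, Cross, Shi, 2002]
[Diagram: fully connected graph over the mentions ". . . Mr Powell . . .", ". . . Powell . . .", ". . . Condoleezza Rice . . ." and ". . . she . . ."; edge weights 45, -106, -30, -134, 11, 10]
Score of the partitioning shown: 314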
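The two partition scores can be reproduced by brute force: edges inside a cluster add their weight and edges cut by the partition subtract it. Two caveats apply. First, the transcript lists the six weights without saying which pair each belongs to, so the assignment in `W` is an assumption chosen to be consistent with the scores -22 and 314. Second, exhaustive enumeration is used here only because there are four mentions; the cited work relies on approximate graph partitioning.

```python
import itertools

MENTIONS = ["Mr Powell", "Powell", "Condoleezza Rice", "she"]
# Assumed assignment of the slide's six edge weights to mention pairs.
W = {
    (0, 1): 45,    # Mr Powell - Powell
    (0, 2): -134,  # Mr Powell - Condoleezza Rice
    (1, 2): -106,  # Powell - Condoleezza Rice
    (0, 3): 11,    # Mr Powell - she
    (1, 3): -30,   # Powell - she
    (2, 3): 10,    # Condoleezza Rice - she
}


def partition_score(labels):
    """labels[i] is the cluster id of mention i; +w within clusters, -w across."""
    return sum(w if labels[i] == labels[j] else -w for (i, j), w in W.items())


def best_partition():
    """Exhaustively search all cluster labelings (feasible only for tiny graphs)."""
    best = None
    n = len(MENTIONS)
    for labels in itertools.product(range(n), repeat=n):
        s = partition_score(labels)
        if best is None or s > best[0]:
            best = (s, labels)
    return best


best_score, best_labels = best_partition()
print(best_score, best_labels)
```

Under the assumed weights, grouping {Mr Powell, Powell} and {Condoleezza Rice, she} scores 314, while isolating "Powell" from the other three scores -22.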
60
Co-reference Experimental Results
[McCallum & Wellner, 2003]
Proper noun co-reference

DARPA ACE broadcast news transcripts, 117 stories
                            Partition F1   Pair F1
Single-link threshold       16             18
Best prev match [Morton]    83             89
MRFs                        88             92
                            Δerror=30%     Δerror=28%

DARPA MUC-6 newswire article corpus, 30 stories
                            Partition F1   Pair F1
Single-link threshold       11             7
Best prev match [Morton]    70             76
MRFs                        74             80
                            Δerror=13%     Δerror=17%
61
Joint Co-reference Decisions, Discriminative Model
[Culotta & McCallum 2005]
[Diagram: People mentions "Stuart Russell", "Stuart Russell" and "S. Russel", with a Y/N co-reference variable for each pair]
62
Co-reference for Multiple Entity Types
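[Culotta & McCallum 2005]
[Diagram: People mentions "Stuart Russell", "Stuart Russell" and "S. Russel"; Organizations mentions "University of California at Berkeley", "Berkeley" and "Berkeley"; a Y/N co-reference variable for each pair of mentions of the same type]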
63
Joint Co-reference of Multiple Entity Types
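[Culotta & McCallum 2005]
[Diagram: People mentions "Stuart Russell", "Stuart Russell" and "S. Russel"; Organizations mentions "University of California at Berkeley", "Berkeley" and "Berkeley"; a Y/N co-reference variable for each pair of mentions of the same type]
Reduces error by 22%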
64
Joint Co-reference Experimental Results
Culotta McCallum 2005
CiteSeer Dataset: 1500 citations, 900 unique
papers, 350 unique venues

                 Paper            Venue
                 indep   joint    indep   joint
constraint       88.9    91.0     79.4    94.1
reinforce        92.2    92.2     56.5    60.1
face             88.2    93.7     80.9    82.8
reason           97.4    97.0     75.6    79.5
Micro Average    91.7    93.4     73.1    79.1
                 Δerror=20%       Δerror=22%
65
Joint co-reference among all pairs: Affinity
Matrix CRF
[Diagram: mentions ". . . Mr Powell . . .", ". . . Powell . . ." and ". . . she . . ."; one Y/N co-reference variable per pair; edge weights 45 (Mr Powell, Powell), -99 and 11]
25% reduction in error on co-reference of proper
nouns in newswire.
Inference: Correlational clustering graph
partitioning
[McCallum, Wellner, IJCAI WS 2003, NIPS 2004]
[Bansal, Blum, Chawla, 2002]
66
4. Joint segmentation and co-reference
Extraction from and matching of research paper
citations.
Example citations:
  • Laurel, B. Interface Agents Metaphors with
    Character, in The Art of Human-Computer
    Interface Design, B. Laurel (ed), Addison-Wesley,
    1990.
  • Brenda Laurel. Interface Agents Metaphors with
    Character, in Laurel, The Art of Human-Computer
    Interface Design, 355-366, 1990.
[Diagram: variables o, s, c, y and p, labeled Segmentation, Citation attributes, Co-reference decisions, Database field values and World Knowledge]
35% reduction in co-reference error by using
segmentation uncertainty.
6-14% reduction in segmentation error by using
co-reference.
Inference: Variant of Iterated Conditional Modes
[Besag, 1986]
[Wellner, McCallum, Peng, Hay, UAI 2004]
see also [Marthi, Milch, Russell, 2003]
67
Joint IE and Coreference from Research Paper
Citations
4. Joint segmentation and co-reference
Textual citation mentions (noisy, with duplicates)
Paper database, with fields, clean, duplicates
collapsed

AUTHORS             TITLE      VENUE
Cowell, Dawid       Probab     Springer
Montemerlo, Thrun   FastSLAM   AAAI
Kjaerulff           Approxi    Technic
68
Citation Segmentation and Coreference
Laurel, B. Interface Agents Metaphors with
Character , in The Art of Human-Computer
Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Brenda Laurel . Interface Agents Metaphors
with Character , in Smith , The Art of
Human-Computr Interface Design , 355-366 ,
1990 .
71
Citation Segmentation and Coreference
Laurel, B. Interface Agents Metaphors with
Character , in The Art of Human-Computer
Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Y ? N
Brenda Laurel . Interface Agents Metaphors
with Character , in Smith , The Art of
Human-Computr Interface Design , 355-366 ,
1990 .
Segmentation Quality   Citation Co-reference (F1)
No Segmentation        78
CRF Segmentation       91
True Segmentation      93

  1. Segment citation fields
  2. Resolve coreferent citations

72
Citation Segmentation and Coreference
Laurel, B. Interface Agents Metaphors with
Character , in The Art of Human-Computer
Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Y ? N
Brenda Laurel . Interface Agents Metaphors
with Character , in Smith , The Art of
Human-Computr Interface Design , 355-366 ,
1990 .
AUTHOR = Brenda Laurel
TITLE = Interface Agents Metaphors with Character
PAGES = 355-366
BOOKTITLE = The Art of Human-Computer Interface Design
EDITOR = T. Smith
PUBLISHER = Addison-Wesley
YEAR = 1990
  1. Segment citation fields
  2. Resolve coreferent citations
  3. Form canonical database record

Resolving conflicts
73
Citation Segmentation and Coreference
Laurel, B. Interface Agents Metaphors with
Character , in The Art of Human-Computer
Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Y ? N
Brenda Laurel . Interface Agents Metaphors
with Character , in Smith , The Art of
Human-Computr Interface Design , 355-366 ,
1990 .
AUTHOR = Brenda Laurel
TITLE = Interface Agents Metaphors with Character
PAGES = 355-366
BOOKTITLE = The Art of Human-Computer Interface Design
EDITOR = T. Smith
PUBLISHER = Addison-Wesley
YEAR = 1990
  1. Segment citation fields
  2. Resolve coreferent citations
  3. Form canonical database record

Perform jointly.
74
IE Coreference Model
Observed citation (x):   J     Besag   1986   On     the
CRF Segmentation (s):    AUT   AUT     YR     TITL   TITL
75
IE Coreference Model
Citation mention attributes (c): AUTHOR = J Besag, YEAR = 1986, TITLE = On the
CRF Segmentation (s)
Observed citation (x): J Besag 1986 On the
76
IE Coreference Model
Smyth , P Data mining
Structure for each citation mention
c
s
x
J Besag 1986 On the
Smyth . 2001 Data Mining
77
IE Coreference Model
Smyth , P Data mining
Binary coreference variables for each pair of
mentions
c
s
x
J Besag 1986 On the
Smyth . 2001 Data Mining
78
IE Coreference Model
Smyth , P Data mining
Binary coreference variables for each pair of
mentions (values shown: y, n, n)
c
s
x
J Besag 1986 On the
Smyth . 2001 Data Mining
79
IE Coreference Model
Smyth , P Data mining
Research paper entity attribute nodes: AUTHOR = P Smyth, YEAR = 2001, TITLE = Data Mining ...
Coreference variables (values shown: y, n, n)
c
s
x
J Besag 1986 On the
Smyth . 2001 Data Mining
80
IE Coreference Model
Smyth , P Data mining
Research paper entity attribute node
y
y
y
c
s
x
J Besag 1986 On the
Smyth . 2001 Data Mining
81
IE Coreference Model
Smyth , P Data mining
y
n
n
c
s
x
J Besag 1986 On the
Smyth . 2001 Data Mining
82
  • Such a highly connected graph makes exact
    inference intractable

83
Approximate Inference 1
  • Loopy Belief Propagation

[Diagram: nodes v1, v2, v3, v4, v5, v6; messages passed between nodes, e.g. m1(v2), m2(v3), m3(v2), m2(v1)]
84
Approximate Inference 1
  • Loopy Belief Propagation: messages passed between nodes
  • Generalized Belief Propagation: messages passed between regions

[Diagram: nodes v1, v2, v3, v4, v5, v6; messages such as m1(v2), m2(v3), m3(v2), m2(v1)]
Here, a message is a conditional probability
table passed among nodes. But message size
grows exponentially with the size of the overlap
between regions!
85
Approximate Inference 2
  • Iterated Conditional Modes (ICM) [Besag 1986]

$v_6^{i+1} = \arg\max_{v_6^i} P(v_6^i \mid v \setminus v_6^i)$
[Diagram: nodes v1, v2, v3, v4, v5, v6; v6 is being updated]
86
Approximate Inference 2
  • Iterated Conditional Modes (ICM) [Besag 1986]

$v_5^{j+1} = \arg\max_{v_5^j} P(v_5^j \mid v \setminus v_5^j)$
[Diagram: nodes v1, v2, v3, v4, v5, v6; v5 is being updated]
87
Approximate Inference 2
  • Iterated Conditional Modes (ICM) [Besag 1986]

$v_4^{k+1} = \arg\max_{v_4^k} P(v_4^k \mid v \setminus v_4^k)$
[Diagram: nodes v1, v2, v3, v4, v5, v6; v4 is being updated]
Structured inference scales well here,
but it is greedy, and easily falls into local optima.
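The update rule can be sketched on a small binary model: each variable in turn is set to the value that maximizes its conditional probability given all the others, until nothing changes. The graph `EDGES`, the `UNARY` preferences and the `COUPLING` strength below are hypothetical toy values. Because the conditional of one variable is proportional to the exponential of the terms that mention it, maximizing that local sum is equivalent to the argmax on the slide.

```python
# Hypothetical 2x3 grid over v1..v6 with binary values (toy numbers).
VARIABLES = [1, 2, 3, 4, 5, 6]
VALUES = [0, 1]
EDGES = [(1, 2), (2, 3), (4, 5), (5, 6), (1, 4), (2, 5), (3, 6)]
UNARY = {1: 1.5, 2: -0.2, 3: -0.2, 4: 1.5, 5: -0.2, 6: -0.2}  # > 0 prefers value 1
COUPLING = 0.5  # reward when neighbors agree, penalty when they disagree


def local_score(v, value, assignment):
    """Sum of the log-potentials that involve variable v, with v set to value."""
    s = UNARY[v] if value == 1 else -UNARY[v]
    for a, b in EDGES:
        if v in (a, b):
            other = b if v == a else a
            s += COUPLING if assignment[other] == value else -COUPLING
    return s


def icm(init, max_sweeps=20):
    """Iterated Conditional Modes: greedy coordinate-wise maximization."""
    assignment = dict(init)
    for _ in range(max_sweeps):
        changed = False
        for v in VARIABLES:
            best = max(VALUES, key=lambda val: local_score(v, val, assignment))
            if best != assignment[v]:
                assignment[v] = best
                changed = True
        if not changed:  # fixed point: no single-variable change helps
            break
    return assignment


result = icm({v: 0 for v in VARIABLES})
print(result)
```

The result is only guaranteed to be a fixed point of single-variable updates, which is the greediness the slide warns about: a different starting assignment can end at a different fixed point.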
88
Approximate Inference 2
  • Iterated Conditional Modes (ICM) [Besag 1986]
  • Iterated Conditional Sampling (ICS) (our
    name)
      • Instead of selecting only the argmax, select a sample of
        argmaxes of $P(v_4^k \mid v \setminus v_4^k)$
      • e.g. an N-best list (the top N values)

$v_4^{k+1} = \arg\max_{v_4^k} P(v_4^k \mid v \setminus v_4^k)$
[Diagram: nodes v1, v2, v3, v4, v5, v6, shown once for ICM and once for ICS]
Can use a "Generalized Version" of this, doing
exact inference on a region of several nodes at
once. Here, a message grows only linearly with
overlap region size and N!
89
Features of this Inference Method
  1. Structured or factored representation (à la GBP)
  2. Uses samples to approximate density
  3. Closed-loop message-passing on loopy graph (à la
    BP)

Related Work
  • Beam search
      • Forward-only inference
  • Particle filtering, e.g. [Doucet 1998]
      • Usually on tree-shaped graph, or feedforward
        only.
  • MC Sampling... Embedded HMMs [Neal, 2003]
      • Sample from high-dim continuous state space; do
        forward-backward
  • Sample Propagation [Paskin, 2003]
      • Messages = samples, on a junction tree
  • Fields to Trees [Hamze & de Freitas, UAI 2003]
      • Rao-Blackwellized MCMC, partitioning G into
        non-overlapping trees
  • Factored Particles for DBNs [Ng, Peshkin,
    Pfeffer, 2002]
      • Combination of Particle Filtering and
        Boyen-Koller for DBNs

90
IE Coreference Model
Smyth , P Data mining
Exact inference on these linear-chain regions
From each chain, pass an N-best list into
coreference
J Besag 1986 On the
Smyth . 2001 Data Mining
91
IE Coreference Model
Smyth , P Data mining
Approximate inference by graph partitioning,
integrating out uncertainty in samples of
extraction
Make scale to 1M citations with
Canopies, McCallum, Nigam, Ungar 2000
J Besag 1986 On the
Smyth . 2001 Data Mining
92
Inference: Sample N-best List from CRF
Segmentation
When calculating similarity with another
citation, have more opportunity to find correct,
matching fields.
Name Title Book Title Year
Laurel, B. Interface Agents Metaphors with Character The Art of Human Computer Interface Design 1990
Laurel, B. Interface Agents Metaphors with Character The Art of Human Computer Interface Design 1990
Laurel, B. Interface Agents Metaphors with Character The Art of Human Computer Interface Design 1990
Name Title
Laurel, B Interface Agents Metaphors with Character The
Laurel, B. Interface Agents Metaphors with Character
Laurel, B. Interface Agents Metaphors with Character
y ? n
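To make the "more opportunity to find correct, matching fields" point concrete, here is a small sketch. The exact-match overlap score and the names `field_overlap` and `best_match` are hypothetical stand-ins for whatever similarity the actual system uses, and the two N-best lists are illustrative data built from the Laurel citation on this slide.

```python
from itertools import product


def field_overlap(seg_a, seg_b):
    """Fraction of shared fields whose extracted strings match exactly."""
    shared = set(seg_a) & set(seg_b)
    if not shared:
        return 0.0
    return sum(seg_a[f] == seg_b[f] for f in shared) / len(shared)


def best_match(nbest_a, nbest_b):
    """Best field overlap over all pairs of candidate segmentations."""
    return max(field_overlap(a, b) for a, b in product(nbest_a, nbest_b))


# Illustrative N-best lists; the top hypothesis for citation A misplaces "The".
citation_a = [
    {"name": "Laurel, B.",
     "title": "Interface Agents Metaphors with Character The",
     "booktitle": "Art of Human Computer Interface Design"},
    {"name": "Laurel, B.",
     "title": "Interface Agents Metaphors with Character",
     "booktitle": "The Art of Human Computer Interface Design"},
]
citation_b = [
    {"name": "Laurel, B.",
     "title": "Interface Agents Metaphors with Character",
     "booktitle": "The Art of Human Computer Interface Design"},
]

top1_only = best_match(citation_a[:1], citation_b)  # uses only the 1-best segmentation
with_nbest = best_match(citation_a, citation_b)     # may use any listed segmentation
```

Using only the top segmentation, one of three fields matches; with the second hypothesis available, all three match, so the coreference decision sees much stronger evidence.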
93
IE Coreference Model
Smyth , P Data mining
Exact (exhaustive) inference over entity
attributes
y
n
n
J Besag 1986 On the
Smyth . 2001 Data Mining
94
IE Coreference Model
Smyth , P Data mining
Revisit exact inference on IE linear chain, now
conditioned on entity attributes
y
n
n
J Besag 1986 On the
Smyth . 2001 Data Mining
95
Parameter Estimation
Separately for different regions
IE linear-chain: exact MAP
Coref graph edge weights: MAP on individual edges
Entity attribute potentials: MAP, pseudo-likelihood
y
n
n
In all cases: climb MAP gradient
with quasi-Newton method
96
Experimental Results
  • Set of citations from CiteSeer
  • 1500 citation mentions
  • to 900 paper entities
  • Hand-labeled for coreference and field-extraction
  • Divided into 4 subsets, each on a different topic
  • RL, Face detection, Reasoning, Constraint
    Satisfaction
  • Within each subset many citations share authors,
    publication venues, publishers, etc.
  • 70% of the citation mentions are singletons

97
4. Joint segmentation and co-reference
Wellner, McCallum, Peng, Hay, UAI 2004
Extraction from and matching of research paper
citations.
[Figure: graphical model with node labels o, s, c, y, p and annotated layers: World Knowledge, Co-reference decisions, Database field values, Citation attributes, Segmentation]
Example citations:
Laurel, B. Interface Agents Metaphors with
Character, in The Art of Human-Computer
Interface Design, B. Laurel (ed), Addison-Wesley,
1990.
Brenda Laurel. Interface Agents Metaphors with
Character, in Laurel, The Art of Human-Computer
Interface Design, 355-366, 1990.
35% reduction in co-reference error by using
segmentation uncertainty.
6-14% reduction in segmentation error by using
co-reference.
Inference: Variant of Iterated Conditional Modes,
Besag 1986
98
Coreference Results
Coreference cluster recall
N             Reinforce  Face  Reason  Constraint
1 (Baseline)  0.946      0.96  0.94    0.96
3             0.95       0.98  0.96    0.96
7             0.95       0.98  0.95    0.97
9             0.982      0.97  0.96    0.97
Optimal       0.99       0.99  0.99    0.99
  • Average error reduction is 35%.
  • Optimal makes best use of N-best list by using
    true labels.
  • Indicates that even more improvement can be
    obtained

99
Information Extraction Results
Segmentation F1
             Reinforce  Face   Reason  Constraint
Baseline     .943       .908   .929    .934
w/ Coref     .949       .914   .935    .943
Err. Reduc.  .101       .062   .090    .142
P-value      .0442      .0014  .0001   .0001
  • Error reduction ranges from 6-14%.
  • Small, but significant at 95% confidence level
    (p-value < 0.05)

Biggest limiting factor in both sets of results:
the data set is small and does not have large
coreferent sets.
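The error-reduction row can be read as a relative reduction in error, assuming error is defined as 1 minus F1. That definition is an assumption (the slide does not state it) and `error_reduction` is a hypothetical helper name. Values recomputed from the rounded F1 scores land close to, but not exactly on, the reported row, presumably because the slide's figures came from unrounded scores.

```python
def error_reduction(baseline_f1, improved_f1):
    """Relative reduction in error, with error taken to be 1 - F1 (assumed)."""
    baseline_error = 1.0 - baseline_f1
    improved_error = 1.0 - improved_f1
    return (baseline_error - improved_error) / baseline_error


# Rounded F1 values from the table above.
reinforce = error_reduction(0.943, 0.949)
constraint = error_reduction(0.934, 0.943)
```

For the Reinforce column this gives roughly 0.105 against the reported .101, and for Constraint roughly 0.136 against the reported .142.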
100
Parameter Estimation
Separately for different regions
IE linear-chain: exact MAP
Coref graph edge weights: MAP on individual edges
Entity attribute potentials: MAP, pseudo-likelihood
y
n
n
In all cases: climb MAP gradient
with quasi-Newton method
101
Outline
  • Examples of IE and Data Mining.
  • Brief review of Conditional Random Fields
  • Joint inference Motivation and examples
  • Joint Labeling of Cascaded Sequences (Belief
    Propagation)
  • Joint Labeling of Distant Entities (BP by Tree
    Reparameterization)
  • Joint Co-reference Resolution (Graph
    Partitioning)
  • Joint Segmentation and Co-ref (Iterated
    Conditional Samples)
  • Interactive IE
  • Two example projects
  • Email, contact management, and Social Network
    Analysis
  • Research Paper search and analysis

102
Interactive Information Extraction with End-Users
Correction for Classification
Seminar: How to Organize your Life, by Jane
Smith, Stevenson Smith, Mezzanine Level,
Papadopoulos Sq., 3:30 pm Thursday March 31. In
this seminar we will learn how to use CALO to...
Seminar announcement
Todo request
Other
Easy: often found in user interfaces, e.g. Apple
Mail
103
Multiple-choice Annotation for Interactive IE
with End-Users
Culotta, McCallum 2005
Task: Information Extraction. Fields: NAME,
COMPANY, ADDRESS (and others)
Jane Smith , Stevenson Smith , Mezzanine Level,
Papadopoulos Sq.
104
Multiple-choice Annotation for Interactive IE
with End-Users
Culotta, McCallum 2005
Task: Information Extraction. Fields: NAME,
COMPANY, ADDRESS (and others)
Jane Smith , Stevenson Smith , Mezzanine Level,
Papadopoulos Sq.
Interface presents top hypothesized segmentations
Jane Smith , Stevenson Smith Mezzanine Level ,
Papadopoulos Sq.
Jane Smith , Stevenson Smith Mezzanine Level ,
Papadopoulos Sq.
Jane Smith , Stevenson Smith Mezzanine Level ,
Papadopoulos Sq.
user corrects labels, not segmentations
105
Multiple-choice Annotation for Interactive IE
with End-Users
Culotta, McCallum 2005
Task: Information Extraction. Fields: NAME,
COMPANY, ADDRESS (and others)
Jane Smith , Stevenson Smith , Mezzanine Level,
Papadopoulos Sq.
Interface presents top hypothesized segmentations
Jane Smith , Stevenson Smith Mezzanine Level ,
Papadopoulos Sq.
Jane Smith , Stevenson Smith Mezzanine Level ,
Papadopoulos Sq.
Jane Smith , Stevenson Smith Mezzanine Level ,
Papadopoulos Sq.
29 percent reduction in user actions needed to
train
106
Outline
  • Examples of IE and Data Mining.
  • Brief review of Conditional Random Fields
  • Joint inference Motivation and examples
  • Joint Labeling of Cascaded Sequences (Belief
    Propagation)
  • Joint Labeling of Distant Entities (BP by Tree
    Reparameterization)
  • Joint Co-reference Resolution (Graph
    Partitioning)
  • Joint Segmentation and Co-ref (Iterated
    Conditional Samples)
  • Interactive IE
  • Two example projects
  • Email, contact management, and Social Network
    Analysis
  • Research Paper search and analysis

107
Managing and Understanding Connections of People
in our Email World
Workplace effectiveness: ability to leverage a
network of acquaintances. But filling the Contacts DB
by hand is tedious, and the result is incomplete.
Contacts DB
Email Inbox
Automatically
WWW
108
System Overview
CRF
WWW
Email
names
109
An Example
To: Andrew McCallum mccallum_at_cs.umass.edu
Subject: ...
First Name Andrew
Middle Name Kachites
Last Name McCallum
JobTitle Associate Professor
Company University of Massachusetts
Street Address 140 Governors Dr.
City Amherst
State MA
Zip 01003
Company Phone (413) 545-1323
Links Fernando Pereira, Sam Roweis,
Key Words Information extraction, social network,
Search for new people
110
Summary of Results
Example keywords extracted
Person: Keywords
William Cohen: Logic programming, Text categorization, Data integration, Rule learning
Daphne Koller: Bayesian networks, Relational models, Probabilistic models, Hidden variables
Deborah McGuinness: Semantic web, Description logics, Knowledge representation, Ontologies
Tom Mitchell: Machine learning, Cognitive states, Learning apprentice, Artificial intelligence
Contact info and name extraction performance (25
fields)
      Token Acc  Field Prec  Field Recall  Field F1
CRF   94.50      85.73       76.33         80.76
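As a quick check on the table, field F1 is the harmonic mean of field precision and field recall. The helper name `f1_score` below is just for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Field precision and recall from the CRF row above, in percent.
field_f1 = f1_score(85.73, 76.33)
```

This evaluates to about 80.76, which agrees with the Field F1 column.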
  1. Expert Finding: When solving some task, find
    friends-of-friends with relevant expertise.
    Avoid stove-piping in large orgs by
    automatically suggesting collaborators. Given a
    task, automatically suggest the right team for
    the job. (Hiring aid!)
  2. Social Network Analysis: Understand the social
    structure of your organization. Suggest
    structural changes for improved efficiency.

111
Clustering words into topics with Latent
Dirichlet Allocation
Blei, Ng, Jordan 2003
112
Example topics induced from a large collection of
text
JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE
SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES
BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY
FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED
STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL
MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE
DISEASE BACTERIA DISEASES GERMS FEVER CAUSE CAUSED SPREAD VIRUSES INFECTION VIRUS MICROORGANISMS PERSON INFECTIOUS COMMON CAUSING SMALLPOX BODY INFECTIONS CERTAIN
WATER FISH SEA SWIM SWIMMING POOL LIKE SHELL SHARK TANK SHELLS SHARKS DIVING DOLPHINS SWAM LONG SEAL DIVE DOLPHIN UNDERWATER
Tenenbaum et al
114
From LDA to Author-Recipient-Topic (ART)
115
Inference and Estimation
  • Gibbs Sampling
  • Easy to implement
  • Reasonably fast
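To illustrate why Gibbs sampling is considered easy to implement, here is a minimal collapsed Gibbs sampler for plain LDA. It is not the ART sampler from the talk; as the previous slide suggests, ART extends LDA, and its sampler would additionally draw a recipient for each token. The hyperparameter values, the toy corpus, and the function name `lda_gibbs` are assumptions made for the sketch.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=100, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of integer word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    z = []                                                       # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            t = rng.randrange(num_topics)
            z[d].append(t)
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                t = z[d][i]
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1
                # Unnormalized conditional P(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Sample a new topic and add the token back.
                t = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1
    return z, doc_topic, topic_word


# Toy corpus of word ids: two documents favor words 0-1, two favor words 3-4.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 2], [0, 1, 1, 0], [4, 3, 3, 4]]
z, doc_topic, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=5)
```

The whole sampler is count bookkeeping plus one weighted draw per token, which is the sense in which the method is simple to code and reasonably fast.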

116
Enron Email Corpus
  • 250k email messages
  • 23k people

Date: Wed, 11 Apr 2001 06:56:00 -0700 (PDT)
From: debra.perlingiere_at_enron.com
To: steve.hooser_at_enron.com
Subject: Enron/TransAlta Contract dated Jan 1, 2001
Please see below. Katalin Kiss of TransAlta has
requested an electronic copy of our final draft?
Are you OK with this? If so, the only version I
have is the original draft without revisions.
DP
Debra Perlingiere
Enron North America Corp.
Legal Department
1400 Smith Street, EB 3885
Houston, Texas 77002
dperlin_at_enron.com
117
Topics, and prominent sender/receivers discovered
by ART
118
Topics, and prominent sender/receivers discovered
by ART
Beck: Chief Operations Officer
Dasovich: Government Relations Executive
Shapiro: Vice President of Regulatory Affairs
Steffes: Vice President of Government Affairs
119
Comparing Role Discovery
Traditional SNA
Author-Topic
ART
connection strength (A,B)
distribution over recipients
distribution over authored topics
distribution over authored topics
120
Comparing Role Discovery: Tracy Geaconne vs. Dan
McCarty
Traditional SNA
Author-Topic
ART
Different roles
Different roles
Similar roles
Geaconne: Secretary; McCarty: Vice President
121
Comparing Role Discovery: Tracy Geaconne vs. Rod
Hayslett
Traditional SNA
Author-Topic
ART
Very similar
Not very similar
Different roles
Geaconne: Secretary; Hayslett: Vice President and
CTO
122
Comparing Role Discovery: Lynn Blair vs. Kimberly
Watson
Traditional SNA
Author-Topic
ART
Very different
Very similar
Different roles
Blair: Gas pipeline logistics; Watson:
Pipeline facilities planning
123
Comparing Group Discovery: Enron TransWestern
Division
Traditional SNA
Author-Topic
ART
Not
Not
Block structured
124
McCallum Email Corpus 2004
  • January - October 2004
  • 23k email messages
  • 825 people

From: kate_at_cs.umass.edu
Subject: NIPS and ....
Date: June 14, 2004 2:27:41 PM EDT
To: mccallum_at_cs.umass.edu
There is pertinent stuff on the first yellow folder
that is completed either travel or other things, so
please sign that first folder anyway. Then, here is the
reminder of the things I'm still waiting for:
NIPS registration receipt. CALO registration receipt.
Thanks, Kate
125
McCallum Email Blockstructure
126
Four most prominent topics in discussions with
____?
128
Two most prominent topics in discussions with
____?
129
Topic 37
130
Topic 40
132
Pairs with highest rank difference between ART and
SNA
5 other professors, 3 other ML researchers
133
Role-Author-Recipient-Topic Models
134
Outline
  • Examples of IE and Data Mining.
  • Brief review of Conditional Random Fields
  • Joint inference Motivation and examples
  • Joint Labeling of Cascaded Sequences (Belief
    Propagation)
  • Joint Labeling of Distant Entities (BP by Tree
    Reparameterization)
  • Joint Co-reference Resolution (Graph
    Partitioning)
  • Joint Segmentation and Co-ref (Iterated
    Conditional Samples)
  • Interactive IE
  • Two example projects
  • Email, contact management, and Social Network
    Analysis
  • Research Paper search and analysis

135
Previous Systems
137
Previous Systems
Cites
Research Paper
138
More Entities and Relations
Expertise
Cites
Research Paper
Person
Grant
University
Venue
Groups
146
Summary
  • Conditional Random Fields
  • Conditional probability models of structured data
  • Data mining complex unstructured text suggests
    the need for joint inference across IE and DM.
  • Early examples
  • Factorial finite state models
  • Jointly labeling distant entities
  • Coreference analysis
  • Segmentation uncertainty aiding coreference
  • Interactive IE
  • Bring IE to the masses!
  • Current projects
  • Email, contact management, expert-finding, SNA
  • Mining the scientific literature

147
End of Talk