Title: Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques
1. Learning for Semantic Parsing Using Statistical Syntactic Parsing Techniques
Ruifang Ge, Ph.D. Final Defense
Supervisor: Raymond J. Mooney
Machine Learning Group, Department of Computer Science, The University of Texas at Austin
2. Semantic Parsing
- Semantic parsing: transforming natural language (NL) sentences into completely formal meaning representations (MRs)
- Sample application domains where MRs are directly executable by another computer system to perform some task:
  - CLang: the RoboCup Coach Language
  - Geoquery: a database query application
3. CLang (RoboCup Coach Language)
- In the RoboCup Coach competition, teams compete to coach simulated players
- The coaching instructions are given in a formal language called CLang
[Figure: simulated soccer field]
4. GeoQuery: A Database Query Application
- Query application for a U.S. geography database (Zelle & Mooney, 1996)
[Figure: the geography database]
5. Motivation for Semantic Parsing
- Theoretically, it answers the question of how people interpret language
- Practical applications:
  - Question answering
  - Natural language interfaces
  - Knowledge acquisition
  - Reasoning
6. Motivating Example
If our player 2 has the ball, our player 4 should stay in our half
((bowner (player our 2)) (do our 4 (pos (half our))))
Semantic parsing is a compositional process: sentence structures are needed for building meaning representations.
(bowner = ball owner; pos = position)
7. Syntax-Based Approaches
- Meaning composition follows the tree structure of a syntactic parse
- The meaning of a constituent is composed from the meanings of its sub-constituents in the syntactic parse
- Hand-built approaches (Woods, 1970; Warren and Pereira, 1982)
- Learned approaches:
  - Miller et al. (1996): conceptually simple sentences
  - Zettlemoyer & Collins (2005): hand-built Combinatory Categorial Grammar (CCG) template rules
8. Example
MR: bowner(player(our,2))
[Syntactic parse tree:
  (S (NP (PRP our) (NN player) (CD 2))
     (VP (VB has) (NP (DT the) (NN ball))))]
Use the structure of a syntactic parse
9. Example
MR: bowner(player(our,2))
[Parse tree with word-level semantic labels:
  (S (NP (PRP-our our) (NN-player(_,_) player) (CD-2 2))
     (VP (VB-bowner(_) has) (NP (DT-null the) (NN-null ball))))]
Assign semantic concepts to words
10. Example
MR: bowner(player(our,2))
[The subject NP is now labeled:
  (S (NP-player(our,2) (PRP-our our) (NN-player(_,_) player) (CD-2 2))
     (VP (VB-bowner(_) has) (NP (DT-null the) (NN-null ball))))]
Compose meaning for the internal nodes
11. Example
MR: bowner(player(our,2))
[The VP and object NP are now labeled:
  (S (NP-player(our,2) (PRP-our our) (NN-player(_,_) player) (CD-2 2))
     (VP-bowner(_) (VB-bowner(_) has) (NP-null (DT-null the) (NN-null ball))))]
Compose meaning for the internal nodes
12. Example
MR: bowner(player(our,2))
[The root is now labeled, completing the composition:
  (S-bowner(player(our,2)) (NP-player(our,2) (PRP-our our) (NN-player(_,_) player) (CD-2 2))
     (VP-bowner(_) (VB-bowner(_) has) (NP-null (DT-null the) (NN-null ball))))]
Compose meaning for the internal nodes
13. Semantic Grammars
- Non-terminals in a semantic grammar correspond to semantic concepts in application domains
- Hand-built approaches (Hendrix et al., 1978)
- Learned approaches: Tang & Mooney (2001), Kate & Mooney (2006), Wong & Mooney (2006)
14. Example
MR: bowner(player(our,2))
[Semantic-grammar parse tree:
  (bowner (player (our our) (2 2)) has the ball)]
bowner → player has the ball
15. Thesis Contributions
- Introduce two novel syntax-based approaches to semantic parsing
- Theoretically well-founded in computational semantics (Blackburn and Bos, 2005)
- Great opportunity to leverage the significant progress made in statistical syntactic parsing (Collins, 1997; Charniak and Johnson, 2005; Huang, 2008)
16. Thesis Contributions
- SCISSOR: a novel integrated syntactic-semantic parser
- SYNSEM: exploits an existing syntactic parser to produce disambiguated parse trees that drive the compositional construction of meaning
- Investigate when knowledge of syntax can help
17. Representing Semantic Knowledge in a Meaning Representation Language Grammar (MRLG)
- Assumes the meaning representation language (MRL) is defined by an unambiguous context-free grammar
- Each production rule introduces a single predicate into the MRL
- The parse of an MR gives its predicate-argument structure

Production                    Predicate
CONDITION → (bowner PLAYER)   P_BOWNER
PLAYER → (player TEAM UNUM)   P_PLAYER
UNUM → 2                      P_UNUM
TEAM → our                    P_OUR
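To make the MRLG concrete, here is a minimal Python sketch (an illustration, not the thesis implementation) that parses the MR bowner(player(our,2)) top-down with the productions above, so that each production contributes exactly one predicate to the predicate-argument structure:

```python
# Minimal sketch: recover the predicate-argument structure of an MR
# under the unambiguous MRLG above. Data structures are illustrative.
MRLG = {
    "CONDITION": [(("bowner", ["PLAYER"]), "P_BOWNER")],
    "PLAYER":    [(("player", ["TEAM", "UNUM"]), "P_PLAYER")],
    "UNUM":      [(("2", []), "P_UNUM")],
    "TEAM":      [(("our", []), "P_OUR")],
}

def split_args(s):
    """Split top-level comma-separated arguments: 'our,2' -> ['our', '2']."""
    args, depth, cur = [], 0, ""
    for ch in s:
        if ch == "," and depth == 0:
            args.append(cur); cur = ""
        else:
            depth += ch == "("
            depth -= ch == ")"
            cur += ch
    if cur:
        args.append(cur)
    return args

def parse_mr(nonterminal, mr):
    """Return the predicate-argument tree of an MR string such as
    'bowner(player(our,2))'."""
    head, _, rest = mr.partition("(")
    args = split_args(rest[:-1]) if rest else []  # strip the trailing ')'
    for (symbol, arg_nts), predicate in MRLG[nonterminal]:
        if symbol == head and len(arg_nts) == len(args):
            return (predicate, [parse_mr(nt, a) for nt, a in zip(arg_nts, args)])
    raise ValueError(f"no production for {head} under {nonterminal}")

print(parse_mr("CONDITION", "bowner(player(our,2))"))
# ('P_BOWNER', [('P_PLAYER', [('P_OUR', []), ('P_UNUM', [])])])
```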
18. Roadmap
- SCISSOR
- SYNSEM
- Future Work
- Conclusions
19. SCISSOR
- Semantic Composition that Integrates Syntax and Semantics to get Optimal Representations
- Integrated syntactic-semantic parsing
  - Allows both syntax and semantics to be used simultaneously to obtain an accurate combined syntactic-semantic analysis
- A statistical parser is used to generate a semantically augmented parse tree (SAPT)
20. Syntactic Parse
[(S (NP (PRP our) (NN player) (CD 2))
    (VP (VB has) (NP (DT the) (NN ball))))]
21. SAPT
[(S-P_BOWNER (NP-P_PLAYER (PRP-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2))
    (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))]
Non-terminals now have both syntactic and semantic labels
Semantic labels dominate the predicates in their sub-trees
22. SAPT
[Same SAPT as slide 21]
MR: P_BOWNER(P_PLAYER(P_OUR,P_UNUM))
23. SCISSOR Overview
[Diagram: the integrated semantic parser is learned from training data]
24. SCISSOR Overview
[Diagram: at testing, an NL sentence is input to the integrated semantic parser]
25. Extending the Collins (1997) Syntactic Parsing Model
- Find the SAPT with the maximum probability (written formally below)
- A lexicalized head-driven syntactic parsing model
- Extend the parsing model to generate semantic labels simultaneously with syntactic labels
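Stated formally (standard notation, assumed rather than quoted from the thesis), decoding picks the most probable SAPT for the sentence S:

```latex
T^{*} = \arg\max_{T \,\in\, \mathrm{SAPTs}(S)} P(T \mid S)
```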
26. Why Extend the Collins (1997) Syntactic Parsing Model?
- Suitable for incorporating semantic knowledge:
  - Head dependencies resemble predicate-argument relations
  - Syntactic subcategorization: the set of arguments that a predicate appears with
- The Bikel (2004) implementation is easily extendable
27. Parser Implementation
- Supervised training on annotated SAPTs is just frequency counting
- Testing uses a variant of the standard CKY chart-parsing algorithm (a rough sketch follows below)
- Details in the thesis
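As a rough illustration only, a Viterbi CKY chart parser over combined labels might look like the following sketch; the lexicon and binary-rule representations are hypothetical, and the real system extends Bikel's head-driven model rather than a plain binary PCFG:

```python
import math
from collections import defaultdict

def cky(words, lexicon, rules):
    """Illustrative probabilistic CKY over (syntactic, semantic) labels.
    lexicon[word] -> list of (label, log_prob);
    rules[(left_label, right_label)] -> list of (parent_label, log_prob)."""
    n = len(words)
    chart = defaultdict(dict)  # chart[(i, j)][label] = (log_prob, backpointer)
    for i, w in enumerate(words):
        for label, lp in lexicon[w]:
            chart[(i, i + 1)][label] = (lp, w)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for l1, (p1, _) in chart[(i, k)].items():
                    for l2, (p2, _) in chart[(k, j)].items():
                        for parent, lp in rules.get((l1, l2), []):
                            score = p1 + p2 + lp
                            best = chart[(i, j)].get(parent, (-math.inf, None))
                            if score > best[0]:
                                chart[(i, j)][parent] = (score, (k, l1, l2))
    return chart[(0, n)]  # best labels spanning the whole sentence
```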
28. Smoothing
- Each label in a SAPT is the combination of a syntactic label and a semantic label, which increases data sparsity
- Break the parameters down (a counting sketch follows below):
  - Ph(H | P, w)
  - = Ph(Hsyn, Hsem | P, w)
  - = Ph(Hsyn | P, w) × Ph(Hsem | P, w, Hsyn)
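A minimal counting sketch of these factored estimates, assuming a simple two-level linear interpolation (the actual system follows Collins/Bikel-style backoff; the weight lam is illustrative):

```python
from collections import Counter

# Relative-frequency estimates for the factored head parameters:
#   Ph(Hsyn | P, w)        backs off from context (P, w) to (P,)
#   Ph(Hsem | P, w, Hsyn)  backs off from (P, w, Hsyn) to (P, Hsyn)
event = Counter()  # counts of (outcome, context) pairs from training SAPTs
ctx = Counter()    # counts of contexts

def observe(outcome, *contexts):
    for c in contexts:
        event[(outcome, c)] += 1
        ctx[c] += 1

def estimate(outcome, context, backoff_context, lam=0.9):
    p_spec = event[(outcome, context)] / ctx[context] if ctx[context] else 0.0
    p_back = event[(outcome, backoff_context)] / ctx[backoff_context] if ctx[backoff_context] else 0.0
    return lam * p_spec + (1 - lam) * p_back

# e.g. Ph(Hsyn="VP" | P="S", w="has"):
#   observe("VP", ("S", "has"), ("S",)) during training, then
#   estimate("VP", ("S", "has"), ("S",))
```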
29. Experimental Corpora
- CLang (Kate, Wong & Mooney, 2005)
  - 300 pieces of coaching advice
  - 22.52 words per sentence
- Geoquery (Zelle & Mooney, 1996)
  - 880 queries on a geography database
  - 7.48 words per sentence
  - MRLs: Prolog and FunQL
30. Prolog vs. FunQL (Wong, 2007)
What are the rivers in Texas?
Prolog: answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
FunQL: answer(river(loc_2(stateid(texas))))
Logical forms are widely used as MRLs in computational semantics and support reasoning
31. Prolog vs. FunQL (Wong, 2007)
What are the rivers in Texas?
Prolog (flexible argument order): answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
FunQL (strict argument order): answer(river(loc_2(stateid(texas))))
Better generalization with Prolog
32. Experimental Methodology
- Standard 10-fold cross validation
- Correctness:
  - CLang: exactly matches the correct MR
  - Geoquery: retrieves the same answers as the correct MR
- Metrics (computed as in the sketch below):
  - Precision: percentage of returned MRs that are correct
  - Recall: percentage of NL sentences whose MRs are correctly returned
  - F-measure: harmonic mean of precision and recall
33. Compared Systems
- COCKTAIL (Tang & Mooney, 2001): deterministic, inductive logic programming
- WASP (Wong & Mooney, 2006): semantic grammar, machine translation
- KRISP (Kate & Mooney, 2006): semantic grammar, string kernels
- ZC (Zettlemoyer & Collins, 2007): syntax-based, combinatory categorial grammar (CCG)
- LU (Lu et al., 2008): semantic grammar, generative parsing model
34. Compared Systems
[Same list as slide 33, highlighting ZC: hand-built lexicon for Geoquery, manual CCG template rules]
35. Compared Systems
[Same list as slide 33, adding λ-WASP: handles logical forms]
36. Results on CLang
System    Precision  Recall  F-measure
COCKTAIL  -          -       -          (memory overflow)
SCISSOR   89.5       73.7    80.8
WASP      88.9       61.9    73.0
KRISP     85.2       61.9    71.7
ZC        -          -       -          (not reported)
LU        82.4       57.7    67.8
(LU F-measure after reranking is 74.4)
37. Results on CLang
[Same table as slide 36, restricted to SCISSOR, WASP, KRISP, and LU]
(LU F-measure after reranking is 74.4)
38. Results on Geoquery
System    Precision  Recall  F-measure  MRL
SCISSOR   92.1       72.3    81.0       FunQL
WASP      87.2       74.8    80.5       FunQL
KRISP     93.3       71.7    81.1       FunQL
LU        86.2       81.8    84.0       FunQL
COCKTAIL  89.9       79.4    84.3       Prolog
λ-WASP    92.0       86.6    89.2       Prolog
ZC        95.5       83.2    88.9       Prolog
(LU F-measure after reranking is 85.2)
39. Results on Geoquery (FunQL)
[Same FunQL rows as slide 38: SCISSOR, WASP, KRISP, and LU are competitive]
(LU F-measure after reranking is 85.2)
40. Why Knowledge of Syntax Does Not Help
- Geoquery: 7.48 words per sentence
  - Short sentences
  - Sentence structure can be feasibly learned from NLs paired with MRs
- Gain from knowledge of syntax vs. flexibility loss
41. Limitation of Using Prior Knowledge of Syntax
Traditional syntactic analysis:
[(N1 (N2 What state) is the smallest)]
answer(smallest(state(all)))
42. Limitation of Using Prior Knowledge of Syntax
Traditional syntactic analysis:
[(N1 (N2 What state) is the smallest)]
Semantic grammar:
[(N1 What (N2 state is the smallest))]
answer(smallest(state(all)))
The semantic grammar yields a syntactic structure isomorphic with the MR: better generalization
43. Why Prior Knowledge of Syntax Does Not Help
- Geoquery: 7.48 words per sentence
  - Short sentences
  - Sentence structure can be feasibly learned from NLs paired with MRs
- Gain from knowledge of syntax vs. flexibility loss
- LU vs. WASP and KRISP: decomposed model for semantic grammar
44. Detailed CLang Results on Sentence Length
[Chart: F-measure by sentence-length bin, 0-10 (7 sentences), 11-20 (33), 21-30 (46), 31-40 (13)]
45. SCISSOR Summary
- Integrated syntactic-semantic parsing approach
- Learns accurate semantic interpretations by utilizing the SAPT annotations
- Knowledge of syntax improves performance on long sentences
46. Roadmap
- SCISSOR
- SYNSEM
- Future Work
- Conclusions
47. SYNSEM Motivation
- SCISSOR requires extra SAPT annotation for training
- Must learn both syntax and semantics from the same limited training corpus
- High-performance syntactic parsers are available that are trained on existing large corpora (Collins, 1997; Charniak & Johnson, 2005)
48. SCISSOR Requires SAPT Annotation
[SAPT for "our player 2 has the ball", as on slide 21]
Time consuming. Automate it!
49. Part I: Syntactic Parse
[(S (NP (PRP our) (NN player) (CD 2))
    (VP (VB has) (NP (DT the) (NN ball))))]
Use a statistical syntactic parser
50. Part II: Word Meanings
[Word alignment between "our player 2 has the ball" and the MR predicates:
  our ↔ P_OUR, player ↔ P_PLAYER, 2 ↔ P_UNUM, has ↔ P_BOWNER;
  "the" and "ball" are unaligned (NULL)]
Use a word alignment model (Wong and Mooney, 2006)
51. Learning a Semantic Lexicon
- IBM Model 5 word alignment (GIZA)
- Top 5 word/predicate alignments for each training example
- Assume each word alignment and syntactic parse defines a possible SAPT for composing the correct MR
(A sketch of turning alignments into a lexicon follows below.)
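A minimal sketch of pooling alignments into a lexicon, assuming a simple (word, predicate) pair format for the aligner output (the thesis retains the top 5 IBM Model 5 alignments per example):

```python
from collections import Counter, defaultdict

def build_lexicon(alignments):
    """alignments: iterable of (word, predicate) pairs pooled from the
    retained alignments of all training examples."""
    counts = Counter(alignments)
    lexicon = defaultdict(list)
    for (word, pred), c in counts.most_common():
        lexicon[word].append((pred, c))  # predicates ranked by frequency
    return lexicon

# e.g. build_lexicon([("player", "P_PLAYER"), ("has", "P_BOWNER"), ...])
```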
52. Introducing λ-Variables in Semantic Labels for Missing Arguments (a1: the first argument)
[Syntactic parse with leaf semantic labels only:
  (S (NP P_OUR λa1λa2.P_PLAYER P_UNUM) (VP λa1.P_BOWNER (NP NULL NULL)))]
53. Part III: Internal Semantic Labels
[MR parse: P_BOWNER(P_PLAYER(P_OUR, P_UNUM)); syntactic parse of "our player 2 has the ball" with leaf labels P_OUR, λa1λa2.P_PLAYER, P_UNUM, λa1.P_BOWNER, NULL, NULL; the internal nodes S, VP, and NP are still unlabeled]
How to choose the dominant predicates?
54. Learning Semantic Composition Rules
[From the MR parse, the node over "player 2" must dominate P_PLAYER with its second argument P_UNUM filled:]
λa1λa2.P_PLAYER + P_UNUM → λa1.P_PLAYER, a2 = c2
(c2: child 2)
55. Learning Semantic Composition Rules
[The NP over "player 2" is labeled λa1.P_PLAYER]
λa1λa2.P_PLAYER + P_UNUM → λa1.P_PLAYER, a2 = c2
56. Learning Semantic Composition Rules
[The NP over "our player 2" is labeled P_PLAYER]
P_OUR + λa1.P_PLAYER → P_PLAYER, a1 = c1
57. Learning Semantic Composition Rules
[The VP over "has the ball" is labeled λa1.P_BOWNER; the NULL labels of the object NP do not contribute a predicate]
58. Learning Semantic Composition Rules
[S is labeled P_BOWNER, completing the SAPT]
P_PLAYER + λa1.P_BOWNER → P_BOWNER, a1 = c1
(A sketch of rule application follows below.)
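A small sketch of how such rules fill argument slots during composition, using an illustrative (name, missing-args, filled-args) representation rather than the thesis data structures:

```python
def apply_rule(left, right, arg_slot, filler_child):
    """Compose two child meanings; filler_child (1 or 2) names which
    child fills argument slot arg_slot of the other child."""
    head, filler = (right, left) if filler_child == 1 else (left, right)
    name, missing, filled = head
    assert arg_slot in missing, "rule does not match this head"
    return (name, [a for a in missing if a != arg_slot],
            filled + [(arg_slot, filler)])

our = ("P_OUR", [], [])
player = ("P_PLAYER", ["a1", "a2"], [])
unum = ("P_UNUM", [], [])
# λa1λa2.P_PLAYER + P_UNUM -> λa1.P_PLAYER, a2 = c2
np1 = apply_rule(player, unum, "a2", 2)
# P_OUR + λa1.P_PLAYER -> P_PLAYER, a1 = c1
np2 = apply_rule(our, np1, "a1", 1)
print(np2)  # ('P_PLAYER', [], [('a2', ...), ('a1', ...)])
```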
59. Ensuring Meaning Composition
[Traditional syntactic analysis of "What state is the smallest": (N1 (N2 What state) is the smallest)]
answer(smallest(state(all)))
Non-isomorphism
60. Ensuring Meaning Composition
- Non-isomorphism between the NL parse and the MR parse, caused by:
  - Various linguistic phenomena
  - Machine translation between NL and MRL
  - Use of automated syntactic parses
- Introduce macro-predicates that combine multiple predicates
- Ensure that the MR can be composed using a syntactic parse and word alignment
61. SYNSEM Overview
[Pipeline diagram:
  Before training and testing: a syntactic parser produces the syntactic parse tree T; the unambiguous CFG of the MRL is given
  Training: semantic knowledge acquisition over the training set (S, T, MR) yields a semantic lexicon and composition rules; parameter estimation yields a probabilistic parsing model
  Testing: semantic parsing maps the input sentence S and its parse T to the output MR]
62. SYNSEM Overview
[Same pipeline diagram as slide 61]
63. Parameter Estimation
- Apply the learned semantic knowledge to all training examples to generate possible SAPTs
- Use a standard maximum-entropy model similar to those of Zettlemoyer & Collins (2005) and Wong & Mooney (2006)
- Training finds parameters that (approximately) maximize the sum of the conditional log-likelihood of the training set, including syntactic parses (written out below)
- Incomplete data, since SAPTs are hidden variables
64. Features
- Lexical features:
  - Unigram features: a word is assigned a predicate
  - Bigram features: a word is assigned a predicate given its previous/subsequent word
- Rule features: a composition rule applied in a derivation
(See the feature-extraction sketch below.)
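An illustrative extractor for these three feature families (the feature-name strings are assumptions, not the thesis templates):

```python
from collections import Counter

def extract_features(words, predicates, rules_used):
    """words: sentence tokens; predicates: predicate assigned to each
    word (or 'NULL'); rules_used: composition rules in one derivation."""
    feats = Counter()
    for i, (w, p) in enumerate(zip(words, predicates)):
        feats[f"unigram:{w}->{p}"] += 1               # word assigned predicate
        prev = words[i - 1] if i > 0 else "<s>"
        nxt = words[i + 1] if i + 1 < len(words) else "</s>"
        feats[f"bigram-prev:{prev},{w}->{p}"] += 1    # given previous word
        feats[f"bigram-next:{w},{nxt}->{p}"] += 1     # given subsequent word
    for r in rules_used:
        feats[f"rule:{r}"] += 1                       # composition rule fired
    return feats
```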
65. Handling Logical Forms
What are the rivers in Texas?
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
λv1.P_ANSWER(x1)
λv1.P_RIVER(x1), λv1λv2.P_LOC(x1,x2), λv1.P_EQUAL(x2)
Handle shared logical variables
Use lambda calculus (v: variable)
66. Prolog Example
What are the rivers in Texas?
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
λv1.P_ANSWER(x1)
(λv1.P_RIVER(x1), λv1λv2.P_LOC(x1,x2), λv1.P_EQUAL(x2))
Handle shared logical variables
Use lambda calculus (v: variable)
67. Prolog Example
[Same example as slide 66]
68. Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Start from a syntactic parse:
[(SBARQ (WHNP What) (SQ (VBP are) (NP (NP the rivers) (PP (IN in) (NP Texas)))))]
69. Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Add predicates to words:
[What: λv1λa1.P_ANSWER; are: NULL; rivers: λv1.P_RIVER; in: λv1λv2.P_LOC; Texas: λv1.P_EQUAL]
70. Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Learn a rule with variable unification:
[The PP over "in Texas" receives λv1.P_LOC]
λv1λv2.P_LOC(x1,x2) + λv1.P_EQUAL(x2) → λv1.P_LOC
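Read as ordinary lambda-calculus composition (a paraphrase of the slide's rule, not the thesis notation), the unification binds the shared variable x2 of P_LOC and P_EQUAL, leaving only x1 free:

```latex
\lambda v_1 \lambda v_2.\,\mathrm{loc}(x_1, x_2)
\;+\;
\lambda v_1.\,\mathrm{equal}(x_2, \mathrm{stateid}(\mathrm{texas}))
\;\Rightarrow\;
\lambda v_1.\,\big(\mathrm{loc}(x_1, x_2) \wedge \mathrm{equal}(x_2, \mathrm{stateid}(\mathrm{texas}))\big)
```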
71. Experimental Results
72. Syntactic Parsers (Bikel, 2004)
- Trained on WSJ only:
  - CLang (SYN0): F-measure 82.15
  - Geoquery (SYN0): F-measure 76.44
- Trained on WSJ plus in-domain sentences:
  - CLang (SYN20): 20 sentences, F-measure 88.21
  - Geoquery (SYN40): 40 sentences, F-measure 91.46
- Gold-standard syntactic parses (GOLDSYN)
73. Questions
- Q1. Can SYNSEM produce accurate semantic interpretations?
- Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers?
- Q3. Does it also improve on long sentences?
- Q4. Does it improve on limited training data, due to the prior knowledge from large treebanks?
- Q5. Can it handle syntactic errors?
74. Results on CLang
System    Precision  Recall  F-measure
GOLDSYN   84.7       74.0    79.0       (SYNSEM)
SYN20     85.4       70.0    76.9       (SYNSEM)
SYN0      87.0       67.0    75.7       (SYNSEM)
SCISSOR   89.5       73.7    80.8       (SAPTs)
WASP      88.9       61.9    73.0
KRISP     85.2       61.9    71.7
LU        82.4       57.7    67.8
(LU F-measure after reranking is 74.4)
GOLDSYN > SYN20 > SYN0
75. Questions
- Q1. Can SYNSEM produce accurate semantic interpretations? Yes
- Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? Yes
- Q3. Does it also improve on long sentences?
76. Detailed CLang Results on Sentence Length
[Chart: F-measure by sentence-length bin, 0-10 (7 sentences), 11-20 (33), 21-30 (46), 31-40 (13), annotated with the trade-off between prior knowledge and flexibility, and with syntactic errors]
77. Questions
- Q1. Can SYNSEM produce accurate semantic interpretations? Yes
- Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? Yes
- Q3. Does it also improve on long sentences? Yes
- Q4. Does it improve on limited training data, due to the prior knowledge from large treebanks?
78. Results on CLang (training size 40)
System    Precision  Recall  F-measure
GOLDSYN   61.1       35.7    45.1       (SYNSEM)
SYN20     57.8       31.0    40.4       (SYNSEM)
SYN0      53.5       22.7    31.9       (SYNSEM)
SCISSOR   85.0       23.0    36.2       (SAPTs)
WASP      88.0       14.4    24.7
KRISP     68.35      20.0    31.0
The quality of the syntactic parser is critically important!
79. Questions
- Q1. Can SYNSEM produce accurate semantic interpretations? Yes
- Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? Yes
- Q3. Does it also improve on long sentences? Yes
- Q4. Does it improve on limited training data, due to the prior knowledge from large treebanks? Yes
- Q5. Can it handle syntactic errors?
80. Handling Syntactic Errors
- Training ensures meaning composition even from syntactic parses with errors
- For test NLs that generate correct MRs, measure the F-measures of their syntactic parses:
  - SYN0: 85.5
  - SYN20: 91.2
Example: If DR2C7 is true then players 2, 3, 7 and 8 should pass to player 4
81. Questions
- Q1. Can SYNSEM produce accurate semantic interpretations? Yes
- Q2. Can more accurate Treebank syntactic parsers produce more accurate semantic parsers? Yes
- Q3. Does it also improve on long sentences? Yes
- Q4. Does it improve on limited training data, due to the prior knowledge of large treebanks? Yes
- Q5. Is it robust to syntactic errors? Yes
82. Results on Geoquery (Prolog)
System    Precision  Recall  F-measure
GOLDSYN   91.9       88.2    90.0       (SYNSEM)
SYN40     90.2       86.9    88.5       (SYNSEM)
SYN0      81.8       79.0    80.4       (SYNSEM)
COCKTAIL  89.9       79.4    84.3
λ-WASP    92.0       86.6    89.2
ZC        95.5       83.2    88.9
SYN0 does not perform well; all other recent systems perform competitively
83. SYNSEM Summary
- Exploits an existing syntactic parser to drive the meaning composition process
- Prior knowledge of syntax improves performance on long sentences
- Prior knowledge of syntax improves performance on limited training data
- Robust to syntactic errors
84. Discriminative Reranking for Semantic Parsing
- Adapts global features used for reranking syntactic parses to semantic parsing
- Improvement on CLang
- No improvement on Geoquery, where sentences are short and global features are less likely to help
85. Roadmap
- SCISSOR
- SYNSEM
- Future Work
- Conclusions
86. Future Work
- Improve SCISSOR:
  - Discriminative SCISSOR (Finkel et al., 2008)
  - Handling logical forms
  - SCISSOR without extra annotation (Klein and Manning, 2002, 2004)
- Improve SYNSEM:
  - Utilizing syntactic parsers with improved accuracy and in other syntactic formalisms
87. Future Work
- Utilizing wide-coverage semantic representations (Curran et al., 2007)
  - Better generalization over syntactic variations
- Utilizing semantic role labeling (Gildea and Palmer, 2002)
  - Provides a layer of correlated semantic information
88. Roadmap
- SCISSOR
- SYNSEM
- Future Work
- Conclusions
89. Conclusions
- SCISSOR: a novel integrated syntactic-semantic parser
- SYNSEM: exploits an existing syntactic parser to produce disambiguated parse trees that drive the compositional construction of meaning
- Both produce accurate semantic interpretations
- Using knowledge of syntax improves performance on long sentences
- SYNSEM also improves performance on limited training data
90. Thank you!