Learning and Inference for Hierarchically Split PCFGs presentation

Title: Learning and Inference for Hierarchically Split PCFGs

1
Learning and Inference for Hierarchically Split PCFGs

Slav Petrov and Dan Klein

2
The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar
Parent annotation [Johnson 98]

3
The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar
Parent annotation [Johnson 98]
Head lexicalization [Collins 99, Charniak 00]

4
The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar
Parent annotation [Johnson 98]
Head lexicalization [Collins 99, Charniak 00]
Automatic clustering?

5
Learning Latent Annotations
[Matsuzaki et al. 05]

EM algorithm

Brackets are known
Base categories are known
Only induce subcategories

Just like Forward-Backward for HMMs.
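Because the brackets and base categories are fixed, the E-step only has to sum over subcategory assignments on the observed tree. The following is a minimal sketch of the inside pass under that assumption. The tree encoding, the dictionary layout, and all probabilities are made up for illustration and are not taken from the talk.

```python
def inside(tree, binary, lexical):
    """Return inside scores, one per latent subcategory of the root label.

    tree    : (tag, word) for a preterminal, (label, left, right) otherwise
    binary  : {(A, B, C): rule} with rule[x][y][z] = P(A_x -> B_y C_z)
    lexical : {(tag, word): probs} with probs[x] = P(tag_x -> word)
    """
    label = tree[0]
    if len(tree) == 2:  # preterminal node
        return list(lexical[(label, tree[1])])
    left, right = tree[1], tree[2]
    in_left = inside(left, binary, lexical)
    in_right = inside(right, binary, lexical)
    rule = binary[(label, left[0], right[0])]
    # Sum over the children's subcategories; the bracket itself is fixed.
    return [
        sum(rule[x][y][z] * in_left[y] * in_right[z]
            for y in range(len(in_left))
            for z in range(len(in_right)))
        for x in range(len(rule))
    ]


# Toy example: two subcategories per tag, an unsplit root (hypothetical numbers).
toy_tree = ("S", ("PRP", "She"), ("VBD", "slept"))
toy_lexical = {("PRP", "She"): [0.5, 0.1], ("VBD", "slept"): [0.2, 0.4]}
toy_binary = {("S", "PRP", "VBD"): [[[0.4, 0.1], [0.3, 0.2]]]}
scores = inside(toy_tree, toy_binary, toy_lexical)
print(scores)
```

An outside pass over the same tree would combine with these scores to give the posterior counts used in the M-step, in the same way the backward pass complements the forward pass for HMMs.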
6
Overview
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing
7
Refinement of the DT tag
DT
9
Hierarchical refinement of the DT tag
DT
10
Hierarchical Estimation Results
Model                  F1
Baseline               87.3
Hierarchical Training  88.4
11
Refinement of the , tag

Splitting all categories by the same amount is wasteful

12
Adaptive Splitting

Want to split complex categories more
Idea: split everything, then roll back the splits that were least useful
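The roll-back step can be sketched as a ranking problem: estimate how much likelihood each split contributes, then undo the least useful ones. The sketch below assumes those loss estimates are already available. The function name, the `split_losses` layout, and the numbers are hypothetical, and the 0.5 default is an assumption chosen to mirror the merging figure on the results slide.

```python
def choose_merges(split_losses, merge_fraction=0.5):
    """Pick the splits to roll back.

    split_losses   : {split_id: estimated likelihood loss if the split is undone}
    merge_fraction : share of splits to undo (assumed default)
    """
    # Splits whose removal costs the least likelihood come first.
    ranked = sorted(split_losses, key=split_losses.get)
    n_merge = int(len(ranked) * merge_fraction)
    return set(ranked[:n_merge])


# Hypothetical loss estimates for one split of each category.
losses = {"NP": 12.5, "VP": 9.1, ",": 0.02, "TO": 0.1}
to_merge = choose_merges(losses)
print(to_merge)
```

Categories with simple behaviour (here the comma and TO) lose almost nothing when merged back, so they end up with fewer subcategories than complex ones such as NP or VP.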

14
Adaptive Splitting Results
Model             F1
Previous          88.4
With 50% Merging  89.5
15
Number of Phrasal Subcategories
16
Number of Phrasal Subcategories
[Chart labels: NP, VP, PP]
17
Number of Phrasal Subcategories
[Chart labels: NAC, X]
18
Number of Lexical Subcategories
[Chart labels: POS, TO, ,]
19
Number of Lexical Subcategories
[Chart labels: NNP, JJ, NNS, NN]
20
Smoothing

Heavy splitting can lead to overfitting

Idea: Smoothing allows us to pool statistics
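One simple way to pool statistics is to shrink each subcategory's probability for a rule toward the average over its sibling subcategories. The sketch below illustrates that idea only. The interpolation form, the weight `alpha`, and the numbers are assumptions rather than the talk's exact settings.

```python
def smooth(probs, alpha=0.01):
    """Shrink sibling subcategories' probabilities for one rule toward their mean.

    probs : probs[x] = P(rule | subcategory x of the same base category)
    alpha : interpolation weight (assumed value)
    """
    mean = sum(probs) / len(probs)
    return [(1 - alpha) * p + alpha * mean for p in probs]


# Hypothetical probabilities of one rule under three subcategories.
smoothed = smooth([0.2, 0.0, 0.4], alpha=0.1)
print(smoothed)
```

Applied to every rule of a category, this interpolation keeps each subcategory's distribution normalized, and a rule never observed under one subcategory still receives some mass from its siblings.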

21
Result Overview
Model           F1
Previous        89.5
With Smoothing  90.7
22
Linguistic Candy

Proper Nouns (NNP)

NNP-14 Oct. Nov. Sept.
NNP-12 John Robert James
NNP-2 J. E. L.
NNP-1 Bush Noriega Peters
NNP-15 New San Wall
NNP-3 York Francisco Street

Personal pronouns (PRP)

PRP-0 It He I
PRP-1 it he they
PRP-2 it them him
23
Linguistic Candy

Comparative adverbs (RBR)

RBR-0 further lower higher
RBR-1 more less More
RBR-2 earlier Earlier later

Cardinal Numbers (CD)

CD-7 one two Three
CD-4 1989 1990 1988
CD-11 million billion trillion
CD-0 1 50 100
CD-3 1 30 31
CD-9 78 58 34
24
Inference

She heard the noise.

Exhaustive parsing: 1 min per sentence
25
Coarse-to-Fine Parsing
[Goodman 97, Charniak & Johnson 05]

26
Hierarchical Pruning
< t

Consider again the span 5 to 12:

coarse: QP NP VP
split in two: QP1 QP2 NP1 NP2 VP1 VP2
split in four: QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4
split in eight
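The pruning step for a single span can be sketched as follows: a coarse symbol whose posterior falls below the threshold t is dropped, and only the refinements of the surviving symbols are considered at the next, finer level. The function name, the data layout, and the posterior values are hypothetical.

```python
def allowed_refinements(coarse_posteriors, children, threshold):
    """Return the finer symbols that survive pruning for one span.

    coarse_posteriors : {symbol: posterior of that symbol over the span}
    children          : {symbol: list of its refinements at the next level}
    threshold         : pruning threshold t
    """
    allowed = []
    for symbol, posterior in coarse_posteriors.items():
        if posterior >= threshold:  # keep only sufficiently likely symbols
            allowed.extend(children[symbol])
    return allowed


# Hypothetical posteriors for the span 5 to 12 under the coarse grammar.
posteriors = {"QP": 1e-6, "NP": 0.7, "VP": 0.05}
children = {"QP": ["QP1", "QP2"], "NP": ["NP1", "NP2"], "VP": ["VP1", "VP2"]}
survivors = allowed_refinements(posteriors, children, threshold=1e-4)
print(survivors)
```

Repeating the same step from two to four to eight subcategories keeps the chart small at every level, since most symbols are ruled out before their refinements are ever scored.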

27
Intermediate Grammars
X-Bar = G0
G
28
Projected Grammars
X-Bar = G0
G
29
Final Results (Efficiency)

Parsing the development set (1600 sentences):
Berkeley Parser: 10 min, implemented in Java
Charniak & Johnson 05 Parser: 19 min, implemented in C

30
Final Results (Accuracy)
                                          ≤ 40 words F1   all F1
ENG  Charniak & Johnson 05 (generative)   90.1            89.6
ENG  This Work                            90.6            90.1

GER  Dubey 05                             76.3            -
GER  This Work                            80.8            80.1

CHN  Chiang et al. 02                     80.0            76.6
CHN  This Work                            86.3            83.4
31
Extensions

Acoustic modeling [Petrov, Pauls & Klein 07]
Infinite Grammars: Nonparametric Bayesian Learning [Liang, Petrov, Jordan & Klein 07]
32
Conclusions

Split & Merge Learning
- Hierarchical Training
- Adaptive Splitting
- Parameter Smoothing
Hierarchical Coarse-to-Fine Inference
- Projections
- Marginalization
Multi-lingual Unlexicalized Parsing
