Learning and Inference for Hierarchically Split PCFGs

1
Learning and Inference for Hierarchically Split
PCFGs
  • Slav Petrov and Dan Klein

2-4
The Game of Designing a Grammar
  • Annotation refines base treebank symbols to
    improve statistical fit of the grammar
  • Parent annotation [Johnson 98]
  • Head lexicalization [Collins 99, Charniak 00]
  • Automatic clustering?
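
As a concrete illustration of what "annotation refines base treebank symbols" means, here is a minimal sketch of parent annotation. The tuple-based tree layout, the `^` separator, and the POS tags in the example are assumptions made for illustration. This variant also annotates preterminal tags, which is a simplification.

```python
# Trees are nested tuples: (label, child, child, ...); a leaf is a plain string.
def parent_annotate(tree, parent=None):
    """Return a copy of `tree` whose nonterminal labels carry their parent's label."""
    if isinstance(tree, str):  # a word: leave unchanged
        return tree
    label, *children = tree
    new_label = label if parent is None else f"{label}^{parent}"
    # Children are annotated with the ORIGINAL label of this node.
    return (new_label, *(parent_annotate(child, label) for child in children))


tree = ("S",
        ("NP", ("PRP", "She")),
        ("VP", ("VBD", "heard"),
               ("NP", ("DT", "the"), ("NN", "noise"))))
annotated = parent_annotate(tree)
```

After annotation the subject NP becomes `NP^S` and the object NP becomes `NP^VP`, so the grammar can give the two positions different expansion probabilities.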

5
Learning Latent Annotations
[Matsuzaki et al. 05]
  • EM algorithm
  • Brackets are known
  • Base categories are known
  • Only induce subcategories

Just like Forward-Backward for HMMs.
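
Because brackets and base categories are fixed, the E-step only has to sum over subcategory assignments on the observed tree. The sketch below computes the inside scores that such a sum needs. It is a simplified illustration, not the authors' implementation: the dictionary layouts for `binary` and `lexical` are hypothetical, and only binary and lexical rules are handled.

```python
def inside(tree, binary, lexical, n_sub=2):
    """Inside score for each subcategory of the root label of `tree`.

    tree:    (label, word) for a preterminal, or (label, left, right).
    binary:  {(A, x, B, y, C, z): P(A-x -> B-y C-z)}   (hypothetical layout)
    lexical: {(A, x, word): P(A-x -> word)}            (hypothetical layout)
    """
    label = tree[0]
    if isinstance(tree[1], str):  # preterminal over a word
        return [lexical.get((label, x, tree[1]), 0.0) for x in range(n_sub)]
    left, right = tree[1], tree[2]
    in_left = inside(left, binary, lexical, n_sub)
    in_right = inside(right, binary, lexical, n_sub)
    scores = []
    for x in range(n_sub):
        total = 0.0
        # Sum over all subcategory choices for the two children.
        for y in range(n_sub):
            for z in range(n_sub):
                rule = binary.get((label, x, left[0], y, right[0], z), 0.0)
                total += rule * in_left[y] * in_right[z]
        scores.append(total)
    return scores


lexical = {("DT", 0, "the"): 1.0, ("DT", 1, "the"): 0.5,
           ("NN", 0, "noise"): 0.2, ("NN", 1, "noise"): 0.4}
binary = {("NP", x, "DT", y, "NN", z): 0.25
          for x in (0, 1) for y in (0, 1) for z in (0, 1)}
scores = inside(("NP", ("DT", "the"), ("NN", "noise")), binary, lexical)
```

Each tree node is visited once, so the pass is linear in the size of the tree, much as the forward pass is linear in the length of an HMM sequence.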
6
Overview
  • Hierarchical Training
  • Adaptive Splitting
  • Parameter Smoothing
7-8
Refinement of the DT tag
[Figure: DT subcategories]
9
Hierarchical refinement of the DT tag
[Figure: DT subcategory hierarchy]
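
One way to picture a single round of hierarchical refinement is as a split step that doubles every symbol, followed by EM re-estimation (not shown). The sketch below is an illustration under stated assumptions: the rule-table layout, the `noise` parameter, and the `-0`/`-1` naming are hypothetical, only binary rules are handled, and every symbol is split, including the root.

```python
import random
from collections import defaultdict


def split_in_two(rules, noise=0.01, seed=0):
    """Split every symbol X into X-0 and X-1.

    rules: {(parent, left, right): probability} for binary rules.
    Each rule's mass is shared across the child combinations, with a small
    random perturbation to break symmetry, and then renormalized per parent.
    """
    rng = random.Random(seed)
    new_rules = {}
    for (a, b, c), p in rules.items():
        for i in (0, 1):
            for j in (0, 1):
                for k in (0, 1):
                    jitter = 1.0 + noise * (rng.random() - 0.5)
                    new_rules[(f"{a}-{i}", f"{b}-{j}", f"{c}-{k}")] = p / 4.0 * jitter
    # Renormalize so the rules of each split parent sum to 1.
    totals = defaultdict(float)
    for (a, _, _), p in new_rules.items():
        totals[a] += p
    return {rule: p / totals[rule[0]] for rule, p in new_rules.items()}


rules = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.7, ("NP", "NP", "PP"): 0.3}
split = split_in_two(rules)
```

Without the perturbation the two halves of each symbol would be identical, and EM would have no reason to pull them apart.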
10
Hierarchical Estimation Results
Model                  F1
Baseline               87.3
Hierarchical Training  88.4
11
Refinement of the , tag
  • Splitting all categories the same amount is
    wasteful

12-13
Adaptive Splitting
  • Want to split complex categories more
  • Idea: split everything, roll back splits which
    were least useful
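
The "split everything, then roll back" idea can be sketched as a ranking step. The slides do not say how usefulness is measured, so this sketch assumes each split already has an estimated likelihood loss for undoing it. The numbers and the `fraction` parameter are hypothetical.

```python
def choose_merges(loss_by_split, fraction=0.5):
    """Return the splits to roll back: the `fraction` with the smallest loss."""
    ranked = sorted(loss_by_split, key=loss_by_split.get)  # least useful first
    n_merge = int(len(ranked) * fraction)
    return set(ranked[:n_merge])


# Hypothetical estimates of the likelihood lost if each split were undone.
losses = {"NP": 41.0, "VP": 35.5, "DT": 9.2, ",": 0.01, "TO": 0.3, "NNP": 27.8}
to_merge = choose_merges(losses)
```

With these made-up numbers the splits of `,`, `TO` and `DT` are rolled back, while `NP`, `VP` and `NNP` keep their new subcategories.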

14
Adaptive Splitting Results
Model             F1
Previous          88.4
With 50% Merging  89.5
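15-17
Number of Phrasal Subcategories
[Figure: number of subcategories per phrasal category; labeled categories: NP, VP, PP (slide 16); NAC, X (slide 17)]
18-19
Number of Lexical Subcategories
[Figure: number of subcategories per lexical category; labeled categories: POS, TO, "," (slide 18); NNP, JJ, NNS, NN (slide 19)]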
20
Smoothing
  • Heavy splitting can lead to overfitting
  • Idea: smoothing allows us to pool
    statistics
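
One simple way to pool statistics across subcategories is to shrink each subcategory's rule probability toward the average over its siblings. The sketch below shows that interpolation. The function name and the `alpha` value are assumptions for illustration, not settings taken from the slides.

```python
def smooth(probs, alpha=0.01):
    """Shrink each subcategory's probability toward the mean over its siblings.

    probs: probability of the same rule under each subcategory of one base symbol.
    """
    mean = sum(probs) / len(probs)
    return [(1 - alpha) * p + alpha * mean for p in probs]


# The same rule under four subcategories of one symbol (hypothetical values).
smoothed = smooth([0.30, 0.10, 0.02, 0.00], alpha=0.1)
```

The subcategory that never saw the rule now gets a small nonzero probability. The interpolation is linear, so applying it to every rule of a symbol leaves each subcategory's distribution summing to one.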

21
Result Overview
Model           F1
Previous        89.5
With Smoothing  90.7
22
Linguistic Candy
  • Proper Nouns (NNP)
  • Personal pronouns (PRP)

NNP-14 Oct. Nov. Sept.
NNP-12 John Robert James
NNP-2 J. E. L.
NNP-1 Bush Noriega Peters
NNP-15 New San Wall
NNP-3 York Francisco Street
PRP-0 It He I
PRP-1 it he they
PRP-2 it them him
23
Linguistic Candy
  • Comparative adverbs (RBR)
  • Cardinal Numbers (CD)

RBR-0 further lower higher
RBR-1 more less More
RBR-2 earlier Earlier later
CD-7 one two Three
CD-4 1989 1990 1988
CD-11 million billion trillion
CD-0 1 50 100
CD-3 1 30 31
CD-9 78 58 34
24
Inference
  • She heard the noise.

Exhaustive parsing: 1 min per sentence
25
Coarse-to-Fine Parsing
[Goodman 97, Charniak & Johnson 05]
26
Hierarchical Pruning
< t
  • Consider again the span 5 to 12

coarse:         QP NP VP
split in two:   QP1 QP2 NP1 NP2 VP1 VP2
split in four:  QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4
split in eight
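
The pruning step between two levels of the hierarchy can be sketched as follows: symbols whose posterior for a span falls below a threshold are dropped, and only the refinements of the survivors are considered in the next pass. The posterior values, the threshold, and the function name are hypothetical. The numbering scheme matches only the first split level shown above.

```python
def allowed_refinements(posteriors, threshold):
    """Keep symbols whose span posterior is at least `threshold` and
    return the two finer symbols of each survivor for the next pass."""
    survivors = [sym for sym, p in posteriors.items() if p >= threshold]
    return [f"{sym}{i}" for sym in survivors for i in (1, 2)]


# Hypothetical posteriors for one span under the coarse grammar.
coarse = {"QP": 0.0001, "NP": 0.62, "VP": 0.37}
next_pass = allowed_refinements(coarse, threshold=0.001)
```

Here QP is pruned at the coarse level, so neither QP1 nor QP2 is ever scored for this span in the finer grammar.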

27
Intermediate Grammars
X-Bar = G0
G
28
Projected Grammars
X-Bar = G0
G
29
Final Results (Efficiency)
  • Parsing the development set (1600 sentences)
  • Berkeley Parser: 10 min, implemented in Java
  • Charniak & Johnson 05 Parser: 19 min, implemented in C

30
Final Results (Accuracy)
Language  Parser                              ≤ 40 words F1  all F1
ENG       Charniak & Johnson 05 (generative)  90.1           89.6
ENG       This Work                           90.6           90.1
GER       Dubey 05                            76.3           -
GER       This Work                           80.8           80.1
CHN       Chiang et al. 02                    80.0           76.6
CHN       This Work                           86.3           83.4
31
Extensions
  • Acoustic modeling
  • Infinite Grammars
  • Nonparametric Bayesian Learning

[Petrov, Pauls & Klein 07]
[Liang, Petrov, Jordan & Klein 07]
32
Conclusions
  • Split & Merge Learning
      • Hierarchical Training
      • Adaptive Splitting
      • Parameter Smoothing
  • Hierarchical Coarse-to-Fine Inference
      • Projections
      • Marginalization
  • Multi-lingual Unlexicalized Parsing

33
Thank You!
  • http://nlp.cs.berkeley.edu