Probabilistic Parsing II

Provided by: mitchel4

Transcript and Presenter's Notes

1
Probabilistic Parsing II
  • (many slides adapted from slides by Michael Collins)

2
How well do PCFGs work?
  • Not very well
  • A PCFG adequate to parse over 90% of the MIT
    Voyager Corpus was successful in picking the
    correct parse on only 35% of a reserved test set.
  • Sample Sentences -- The MIT Voyager Corpus
  • I'm currently at MIT
  • What kind of food does LaGroceria serve
  • Where is the closest library to MIT
  • What's the closest ice cream parlor to Harvard
    University
  • Is there a subway stop by the Mount Auburn
    Hospital
  • Can you show me the intersection of Cambridge
    Street and Hampshire Street
  • Which subway stop is closest to the library at
    forty five Pearl Street

3
Voyager Experiments Results
  • Results of parsing the reserved Voyager corpus
    (ref: Magerman & Marcus, 1991)

4
PP Attachment: V-NP-PP forced choice
[Figure: parse tree with nodes S, VP, PP, NP, NP, V, P]
  • He joined the board as a nonexecutive
    director
  • Quintuple: (V-attach, v=joined, n1=board, p=as,
    n2=director)
  • Training set: 20,801 quintuples (V- or N-attach,
    v, n1, p, n2)
  • Test set: 3,097 quintuples
  • Development set: 4,059 quintuples

5
Core Statistical Approach
  • Estimate p(Noun-attach | v, n1, p, n2)
  • If p(Noun-attach | v, n1, p, n2) >= 0.5: Noun-attach
  • Else: Verb-attach
  • Estimation using Maximum Likelihood Estimate
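
The formulas on this slide did not survive in the transcript. As a minimal sketch, assume the estimated quantity is the probability of noun attachment given the quadruple (v, n1, p, n2), and that the decision threshold is 0.5. The maximum likelihood estimate is then a ratio of counts. The toy training tuples and the function names below are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy training data: (attachment, v, n1, p, n2), where
# attachment is "N" (noun-attach) or "V" (verb-attach).
train = [
    ("V", "joined", "board", "as", "director"),
    ("N", "ate", "pizza", "with", "anchovies"),
    ("V", "ate", "pizza", "with", "fork"),
    ("N", "ate", "pizza", "with", "anchovies"),
]

quad_counts = Counter()  # count(v, n1, p, n2)
noun_counts = Counter()  # count(N-attach, v, n1, p, n2)
for attach, v, n1, p, n2 in train:
    quad_counts[(v, n1, p, n2)] += 1
    if attach == "N":
        noun_counts[(v, n1, p, n2)] += 1


def mle_noun_attach(v, n1, p, n2):
    """Return count(N, v, n1, p, n2) / count(v, n1, p, n2), or None if unseen."""
    total = quad_counts[(v, n1, p, n2)]
    if total == 0:
        return None  # the pure MLE is undefined for unseen quadruples
    return noun_counts[(v, n1, p, n2)] / total


def classify(v, n1, p, n2, threshold=0.5):
    """Pick noun- or verb-attachment; the 0.5 threshold is an assumption."""
    prob = mle_noun_attach(v, n1, p, n2)
    if prob is None:
        return None
    return "N" if prob >= threshold else "V"
```

The pure estimate returns nothing for a quadruple that never occurred in training, which is the usual motivation for backing off to smaller sub-tuples.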

6
The final algorithm w/ backoff
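
The algorithm itself is missing from the transcript. The sketch below assumes it is a backed-off estimate in the style of Collins & Brooks (1995): use the full quadruple if it was seen, otherwise pool counts over triples, then pairs, then the preposition alone, always keeping the preposition in the sub-tuple. The level layout, the default of noun attachment for a completely unseen preposition, and all names here are assumptions for illustration.

```python
from collections import Counter

# Each backoff level lists the sub-tuples (as index sets into (v, n1, p, n2))
# whose counts are pooled; every sub-tuple keeps the preposition (index 2).
LEVELS = [
    [(0, 1, 2, 3)],
    [(0, 1, 2), (0, 2, 3), (1, 2, 3)],
    [(0, 2), (1, 2), (2, 3)],
    [(2,)],
]


def train_counts(data):
    """Count every sub-tuple, overall and for noun-attach examples only."""
    total, noun = Counter(), Counter()
    for attach, *quad in data:
        for level in LEVELS:
            for idx in level:
                key = (idx, tuple(quad[i] for i in idx))
                total[key] += 1
                if attach == "N":
                    noun[key] += 1
    return total, noun


def backoff_noun_prob(quad, total, noun):
    """Estimate p(noun-attach | quad) from the first level with nonzero counts."""
    for level in LEVELS:
        keys = [(idx, tuple(quad[i] for i in idx)) for idx in level]
        denom = sum(total[k] for k in keys)
        if denom > 0:
            return sum(noun[k] for k in keys) / denom
    return 1.0  # assumed default: noun-attach when even the preposition is unseen


def classify(quad, total, noun):
    """Return "N" for noun-attach or "V" for verb-attach."""
    return "N" if backoff_noun_prob(quad, total, noun) >= 0.5 else "V"
```

A call such as `classify(("joined", "club", "as", "member"), total, noun)` falls through the quadruple and triple levels when those were never observed and decides from pair counts like (joined, as).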
7
Q: How to make PCFGs sensitive to
  • Lexical relationships
  • Larger Contexts

8
Lexical relationships: paths in the parse tree
9
Lexical relationships are now coded locally in
the tree itself
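
One way to picture this is head lexicalization: each nonterminal is annotated with the head word of its phrase, so a rule such as VP(joined) -> V(joined) NP(board) PP(as) exposes word-to-word relationships within a single local tree. The sketch below propagates head words upward. The head rules, the tree encoding, and the part-of-speech labels are hypothetical simplifications, not the rules used by any particular parser.

```python
# Hypothetical, heavily simplified head rules: for each parent label, the
# child labels to look for, in priority order.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["V", "VP"],
    "NP": ["N", "NP"],
    "PP": ["P"],
}


def lexicalize(tree):
    """Annotate every node with its head word.

    tree is (label, word) for a preterminal or (label, [children]) otherwise.
    Returns (label, head_word, lexicalized_children).
    """
    label, rest = tree
    if isinstance(rest, str):
        return (label, rest, [])
    kids = [lexicalize(child) for child in rest]
    for wanted in HEAD_RULES.get(label, []):
        for kid in kids:
            if kid[0] == wanted:
                return (label, kid[1], kids)
    return (label, kids[0][1], kids)  # fallback: leftmost child is the head


# The example sentence from the PP-attachment slide, with the PP under the VP.
tree = ("S", [
    ("NP", [("N", "He")]),
    ("VP", [
        ("V", "joined"),
        ("NP", [("D", "the"), ("N", "board")]),
        ("PP", [("P", "as"),
                ("NP", [("D", "a"), ("N", "nonexecutive"), ("N", "director")])]),
    ]),
])
lexicalized = lexicalize(tree)
```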
10
The SPATTER Parser (Magerman 95)
11
A Lexicalized PCFG
12
Factoring rule expansion (Charniak 97)
13
Smoothed Estimation
14
Smoothed Estimation II
15
Smoothed Estimation III
16
Independence Assumptions
17
Head Probabilities
18
Rule Probabilities
19
Estimating Head Probabilities