1
PARSING WITH CONTEXT-FREE GRAMMARS
  • James Pustejovsky
  • CS 114

Thanks to Massimo Poesio
2
PARSING
  • Parsing is the process of recognizing and
    assigning STRUCTURE
  • Parsing a string with a CFG:
  • Finding a derivation of the string consistent
    with the grammar
  • The derivation gives us a PARSE TREE

3
EXAMPLE (CF. LAST WEEK)
4
PARSING AS SEARCH
  • Just as in the case of non-deterministic regular
    expressions, the main problem with parsing is the
    existence of CHOICE POINTS
  • There is a need for a SEARCH STRATEGY determining
    the order in which alternatives are considered

5
TOP-DOWN AND BOTTOM-UP SEARCH STRATEGIES
  • The search has to be guided by the INPUT and the
    GRAMMAR
  • TOP-DOWN search: the parse tree has to be rooted
    in the start symbol S
  • EXPECTATION-DRIVEN parsing
  • BOTTOM-UP search: the parse tree must be an
    analysis of the input
  • DATA-DRIVEN parsing

6
AN EXAMPLE OF TOP-DOWN SEARCH (IN PARALLEL)
7
AN EXAMPLE OF BOTTOM-UP SEARCH
8
NON-PARALLEL SEARCH
  • If it's not possible to examine all alternatives
    in parallel, it's necessary to make further
    decisions:
  • Which node in the current search space to expand
    first (breadth-first or depth-first)
  • Which of the applicable grammar rules to expand
    first
  • Which leaf node in a parse tree to expand next
    (e.g., leftmost)

9
TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT
10
TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (II)
11
TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (III)
12
TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (IV)
13
A T-D, D-F, L-R PARSER
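
A minimal sketch of a top-down, depth-first, left-to-right recognizer may help make the strategy concrete. The toy grammar and the names `GRAMMAR`, `parse`, and `recognize` are assumptions made for illustration; they are not taken from the slides. The sketch always expands the leftmost symbol, tries rules in the order listed, and backtracks through Python generators.

```python
# Toy grammar (an assumption for illustration): each nonterminal maps to
# a list of right-hand sides; anything that is not a key is treated as a word.
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["ProperNoun"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
    "Det": [["that"], ["the"]],
    "Noun": [["flight"], ["book"]],
    "Verb": [["book"]],
    "ProperNoun": [["Houston"]],
}


def parse(symbols, words):
    """Yield the words left over each time `symbols` derives a prefix of `words`."""
    if not symbols:
        yield words
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Top-down: expand the leftmost nonterminal, trying rules in order.
        # Depth-first: each alternative is explored fully before the next one.
        for rhs in GRAMMAR[first]:
            yield from parse(rhs + rest, words)
    elif words and words[0] == first:
        # Left-to-right: a terminal must match the next input word.
        yield from parse(rest, words[1:])


def recognize(sentence):
    """True if some derivation from S consumes the whole sentence."""
    return any(left == [] for left in parse(["S"], sentence.split()))


print(recognize("book that flight"))
print(recognize("flight that book"))
```

Because the leftmost symbol is expanded before any input is consumed, adding a rule such as NP → NP PP to this grammar would make the sketch recurse without end.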
14
TOP-DOWN vs BOTTOM-UP
  • TOP-DOWN:
  • Only searches among grammatical answers
  • BUT suggests hypotheses that may not be
    consistent with data
  • Problem: left-recursion
  • BOTTOM-UP:
  • Only forms hypotheses consistent with data
  • BUT may suggest hypotheses that make no sense
    globally

15
LEFT-RECURSION
  • A LEFT-RECURSIVE grammar may cause a T-D, D-F,
    L-R parser to never return
  • Examples of left-recursive rules:
  • NP → NP PP
  • S → S and S
  • But also (indirectly):
  • NP → Det Nom
  • Det → NP 's
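
Direct and indirect left-recursion can both be detected mechanically before parsing. The following is a small sketch under the assumption that the grammar has no empty productions; the function name `left_recursive_nonterminals` and the dictionary encoding of the grammar are hypothetical choices made for this example.

```python
def left_recursive_nonterminals(grammar):
    """Return the nonterminals A that can derive a string beginning with A.

    Assumes no empty right-hand sides, so only the first symbol of each
    rule matters.
    """
    # left[A] = symbols that can appear first in some expansion of A
    left = {a: {rhs[0] for rhs in rules} for a, rules in grammar.items()}
    changed = True
    while changed:  # transitive closure over "can start with"
        changed = False
        for a in left:
            for b in list(left[a]):
                new = left.get(b, set()) - left[a]
                if new:
                    left[a] |= new
                    changed = True
    return {a for a in left if a in left[a]}


grammar = {
    "S": [["S", "and", "S"], ["NP", "VP"]],
    "NP": [["NP", "PP"], ["Det", "Nom"]],
    "Det": [["NP", "'s"], ["the"]],
    "Nom": [["Noun"]],
    "PP": [["Prep", "NP"]],
}
print(sorted(left_recursive_nonterminals(grammar)))
# → ['Det', 'NP', 'S']
```

Det is reported even though no Det rule mentions Det on its right-hand side: it reaches itself through NP, which is the indirect case.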

16
THE PROBLEM WITH LEFT-RECURSION
17
LEFT-RECURSION: POOR SOLUTIONS
  • Rewrite the grammar to a weakly equivalent one
  • Problem: may not get correct parse tree
  • Limit the depth during search
  • Problem: limit is arbitrary

18
LEFT-CORNER PARSING
  • A hybrid of top-down and bottom-up parsing
  • Strategy: don't consider any expansion unless the
    current input can serve as the LEFT-CORNER of
    that expansion
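
The left-corner check can be sketched as a precomputed table plus a filter on expansions. The grammar below and the names `left_corner_table` and `viable_expansions` are assumptions for illustration, and the grammar is assumed to have no empty productions.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"], ["ProperNoun"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}


def left_corner_table(grammar):
    """Map each nonterminal to every category that can begin one of its derivations."""
    table = {a: {rhs[0] for rhs in rules} for a, rules in grammar.items()}
    changed = True
    while changed:  # close the table under "left corner of a left corner"
        changed = False
        for a in table:
            for b in list(table[a]):
                new = table.get(b, set()) - table[a]
                if new:
                    table[a] |= new
                    changed = True
    return table


def viable_expansions(grammar, table, symbol, category):
    """Keep the expansions of `symbol` whose first symbol is `category`
    or has `category` among its left corners."""
    return [rhs for rhs in grammar[symbol]
            if rhs[0] == category or category in table.get(rhs[0], set())]


table = left_corner_table(GRAMMAR)
# If the next word is a Verb, only S → VP is worth trying.
print(viable_expansions(GRAMMAR, table, "S", "Verb"))
# → [['VP']]
```

The table is computed once per grammar, so the filter costs only a set lookup at each choice point.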

19
FURTHER PROBLEMS IN PARSING
  • Ambiguity
  • Church and Patil (1982): the number of attachment
    ambiguities grows like the Catalan numbers
  • C(2) = 2, C(3) = 5, C(4) = 14, C(5) = 42, C(6) =
    132, C(7) = 429, C(8) = 1430
  • Avoiding reparsing
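
To get a feel for how quickly the Catalan numbers grow, they can be computed from the standard closed form C(n) = (2n choose n) / (n + 1). The helper name `catalan` is an assumption for this sketch; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def catalan(n):
    """n-th Catalan number via the closed form C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


print([catalan(n) for n in range(2, 9)])
# → [2, 5, 14, 42, 132, 429, 1430]
```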

20
COMMON STRUCTURAL AMBIGUITIES
  • COORDINATION ambiguity
  • OLD (MEN AND WOMEN) vs (OLD MEN) AND WOMEN
  • ATTACHMENT ambiguity
  • Gerundive VP attachment ambiguity
  • I saw the Eiffel Tower flying to Paris
  • PP attachment ambiguity
  • I shot an elephant in my pajamas

21
PP ATTACHMENT AMBIGUITY
22
AMBIGUITY: SOLUTIONS
  • Use a PROBABILISTIC GRAMMAR (not covered in this
    module)
  • Use semantics

23
AVOID RECOMPUTING INVARIANTS
  • Consider parsing the following NP with a top-down
    parser:
  • A flight from Indianapolis to Houston on TWA
  • With the grammar rules:
  • NP → Det Nominal
  • NP → NP PP
  • NP → ProperNoun

24
INVARIANTS AND TOP-DOWN PARSING
25
THE EARLEY ALGORITHM

26
DYNAMIC PROGRAMMING
  • A standard T-D parser would reanalyze A FLIGHT 4
    times, always in the same way
  • A DYNAMIC PROGRAMMING algorithm uses a table (the
    CHART) to avoid repeating work
  • The Earley algorithm also:
  • Does not suffer from the left-recursion problem
  • Solves an exponential problem in O(n^3) time

27
THE CHART
  • The Earley algorithm uses a table (the CHART) of
    size N+1, where N is the length of the input
  • Table entries sit in the gaps between words
  • Each entry in the chart is a list of:
  • Completed constituents
  • In-progress constituents
  • Predicted constituents
  • All three types of objects are represented in the
    same way as STATES
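
The indexing convention is easy to get wrong, so here is a tiny sketch of it. The sentence and the helper name `show_positions` are assumptions for illustration: an input of N words gets N+1 chart entries, numbered 0 to N, one for each gap.

```python
def show_positions(words):
    """Interleave chart indices with the words they sit between."""
    parts = []
    for i, word in enumerate(words):
        parts.append(f"{i} {word}")
    parts.append(str(len(words)))
    return " ".join(parts)


words = "book that flight".split()
chart = [[] for _ in range(len(words) + 1)]  # one (initially empty) entry per gap

print(show_positions(words))
# → 0 book 1 that 2 flight 3
print(len(chart))
# → 4
```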

28
THE CHART: GRAPHICAL REPRESENTATION
29
STATES
  • A state encodes two types of information:
  • How much of a certain rule has been encountered
    in the input
  • Which positions are covered
  • A → α, [X,Y]
  • DOTTED RULES:
  • VP → V NP •
  • NP → Det • Nominal
  • S → • VP
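
One possible encoding of a state is a small immutable record holding the rule, the dot position, and the span covered. The class name `State`, its field names, and the example spans below are assumptions for this sketch rather than anything the slides prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    lhs: str      # left-hand side of the rule
    rhs: tuple    # right-hand side symbols
    dot: int      # how many rhs symbols have been recognized so far
    start: int    # chart position where the constituent begins
    end: int      # chart position reached so far

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """Symbol immediately after the dot, or None if the rule is complete."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} → {' '.join(symbols)}, [{self.start},{self.end}]"


print(State("S", ("VP",), 0, 0, 0))               # predicted
print(State("NP", ("Det", "Nominal"), 1, 1, 2))   # in progress
print(State("VP", ("V", "NP"), 2, 0, 3))          # completed
```

Making the record frozen keeps states hashable, which is convenient for checking whether a state is already in a chart entry.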

30
EXAMPLES
31
SUCCESS
  • The parser has succeeded if entry N+1 of the
    chart (the last one) contains the state
  • S → α •, [0,N]

32
THE ALGORITHM
  • The algorithm loops through the input without
    backtracking, at each step performing three
    operations:
  • PREDICTOR: add predictions to the chart
  • COMPLETER: move the dot to the right when
    looked-for constituent is found
  • SCANNER: read in the next input word
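
The three operations can be put together in a compact recognizer. This is a sketch under stated assumptions, not the exact formulation from the readings: the toy `GRAMMAR` and `LEXICON`, the dummy `GAMMA` start rule, and the tuple encoding `(lhs, rhs, dot, origin)` are all choices made for this example, and the grammar is assumed to have no empty productions. The end position of a state is implicit in the chart entry that holds it.

```python
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal"), ("NP", "PP"), ("ProperNoun",)],
    "Nominal": [("Noun",)],
    "VP": [("Verb", "NP"), ("Verb",)],
    "PP": [("Prep", "NP")],
}
# Parts of speech for each word (assumed toy lexicon).
LEXICON = {
    "book": {"Noun", "Verb"},
    "that": {"Det"},
    "flight": {"Noun"},
    "from": {"Prep"},
    "Houston": {"ProperNoun"},
}


def earley_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]  # N+1 entries, one per gap

    def add(state, i):
        if state not in chart[i]:  # never enter the same state twice
            chart[i].append(state)

    add(("GAMMA", (start,), 0, 0), 0)  # dummy state GAMMA → • S
    for i in range(n + 1):
        # chart[i] may grow while we iterate; new states are processed too.
        for lhs, rhs, dot, origin in chart[i]:
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:
                    # PREDICTOR: expect every expansion of nxt starting here.
                    for expansion in grammar[nxt]:
                        add((nxt, expansion, 0, i), i)
                elif i < n and nxt in lexicon.get(words[i], ()):
                    # SCANNER: the next word has the predicted part of speech.
                    add((nxt, (words[i],), 1, i), i + 1)
            else:
                # COMPLETER: advance every state that was waiting for lhs.
                for l2, r2, d2, o2 in chart[origin]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, o2), i)
    return ("GAMMA", (start,), 1, 0) in chart[n]


print(earley_recognize("book that flight".split()))
print(earley_recognize("book that flight from Houston".split()))
print(earley_recognize("book that".split()))
```

The grammar includes the left-recursive rule NP → NP PP on purpose: the duplicate check in `add` is what stops the predictor from looping on it.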

33
THE ALGORITHM: CENTRAL LOOP
34
EARLEY ALGORITHM: THE THREE OPERATORS
35
EXAMPLE, AGAIN
36
EXAMPLE: BOOK THAT FLIGHT
37
EXAMPLE: BOOK THAT FLIGHT (II)
38
EXAMPLE: BOOK THAT FLIGHT (III)
39
EXAMPLE: BOOK THAT FLIGHT (IV)
40
READINGS
  • Jurafsky and Martin, chapter 10.1-10.4