1
Chapter 9: Parsing with Context-Free Grammars
  • Heshaam Faili
  • hfaili@ece.ut.ac.ir
  • University of Tehran

2
Context-Free Grammars
  • Context-free grammar rules are of the form A → α, where A is a non-terminal and α is a string of terminals and/or non-terminals
  • Note that the regular grammars are a proper subset of the context-free grammars.
  • This means that every regular grammar is context-free, but there are context-free grammars that aren't regular
  • CFGs only specify what trees look like, not how they should be computationally derived, so we need to parse the sentences (see the sketch below)
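For concreteness, the sketches added on later slides assume a small Python encoding of this chapter's toy flight-domain grammar. The dictionary layout and the is_terminal helper below are illustrative choices, not the book's notation:

    # A minimal CFG encoding: each non-terminal maps to a list of possible
    # right-hand sides (tuples of terminals and/or non-terminals). The rules
    # follow the grammar used in this chapter's examples.
    GRAMMAR = {
        "S":          [("NP", "VP"), ("VP",), ("Aux", "NP", "VP")],
        "NP":         [("Det", "Nominal"), ("ProperNoun",)],
        "Nominal":    [("Noun",), ("Nominal", "Noun")],
        "VP":         [("Verb",), ("Verb", "NP")],
        "Det":        [("that",), ("a",)],
        "Noun":       [("book",), ("flight",), ("meal",), ("money",)],
        "Verb":       [("book",), ("include",), ("prefer",)],
        "Aux":        [("does",)],
        "ProperNoun": [("Houston",), ("TWA",)],
    }

    def is_terminal(symbol):
        # Terminals (words) are the symbols with no expansion of their own.
        return symbol not in GRAMMAR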

3
Parsing Intro
  • Input: a string
  • Output: a (single) parse tree
  • A useful step in the process of obtaining meaning
  • We can view the problem as searching through all
    possible parses (tree structures) to find the
    right one
  • Strategies:
  • Top-Down (goal-directed) vs. Bottom-Up (data-directed)
  • Breadth-First vs. Depth-First
  • Adding Bottom-Up to Top-Down: Left-Corner Parsing
  • Example: "Book that flight!"

4
Grammar and Desired Tree
5
Top-Down Parsing
  • Expand rules, starting with S and working down to
    leaves
  • Replace the left-most non-terminal with each of
    its possible expansions.
  • The full search is on p. 361, Fig. 10.3
  • While we guarantee that any parse in progress will be S-rooted, it will expand non-terminals that can't lead to the existing input
  • e.g., 5 of 6 trees in third ply level of the
    search space
  • None of the trees take the properties of the
    lexical items into account until the last stage
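A minimal sketch of this strategy, assuming the GRAMMAR dictionary from slide 2; this plain recursive-descent recognizer is an illustration, not the book's Fig. 10.3 search:

    # Naive top-down, depth-first recognizer: can `symbols` derive exactly
    # `words`? Replaces the left-most non-terminal with each expansion in
    # textual order, backtracking on failure. Caveat: it can loop forever
    # on left-recursive rules such as Nominal -> Nominal Noun (slide 16).
    def parse(symbols, words):
        if not symbols:
            return not words              # success iff all words consumed
        first, rest = symbols[0], symbols[1:]
        if is_terminal(first):            # leaf: must match the next word
            return bool(words) and words[0] == first and parse(rest, words[1:])
        return any(parse(list(rhs) + rest, words) for rhs in GRAMMAR[first])

    # parse(["S"], ["book", "that", "flight"])  ->  True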

6
Top-down (breadth-first) parsing
(figure: breadth-first search space, expanding from S)
7
Expansion techniques
  • Breadth-First Expansion (shown in figure)
  • All the nodes at each level are expanded once
    before going to the next (lower) level.
  • This is memory intensive when many grammar rules
    are involved
  • Depth-First (shown on p. 367, Fig. 10.7)
  • Expand a particular node at a level, only
    considering an alternate node at that level if
    the parser fails as a result of the earlier
    expansion
  • i.e., expand the tree all the way down until you
    cant expand any more

8
Top-down (depth-first) parsing
Does this flight include a meal?
9
Top-Down Depth-First Parsing
  • There are still some choices that have to be
    made
  • 1. Which leaf node should be expanded first?
  • Left-to-right strategy moves through the leaf
    nodes in a left-to-right fashion
  • 2. Which rule should be applied first?
  • There are multiple NP rules; which should be used first?
  • Can just use the textual order of rules from the
    grammar
  • There may be reasons to take rules in a
    particular order (e.g., probabilities)

10
Parsing with an agenda
  • Search states are kept in an agenda
  • Search states consist of partial trees and a
    pointer to the next input word in the sentence
  • Based on what we've seen before, apply the next item on the agenda to the current tree
  • Add new items to (the front of) the agenda, based
    on the rules in the grammar which can expand at
    the (leftmost) node
  • We maintain the depth-first strategy by adding
    new hypotheses (rules) to the front of the agenda
  • If we added them to the back, we would have a
    breadth-first strategy
  • See Fig. 10.6, p. 366 (a sketch of the agenda discipline follows)
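The sketch below shows only the agenda discipline (search states treated as opaque objects); the deque-based function is an illustration of the front-vs-back distinction, not the book's code:

    from collections import deque

    # Pushing new hypotheses on the FRONT of the agenda gives depth-first
    # search; pushing them on the BACK gives breadth-first search.
    def agenda_search(initial, expand, is_goal, depth_first=True):
        agenda = deque([initial])
        while agenda:
            state = agenda.popleft()          # always take from the front
            if is_goal(state):
                return state
            successors = expand(state)        # grammar rules applicable at
                                              # the left-most open node
            if depth_first:
                agenda.extendleft(reversed(successors))   # front: DFS
            else:
                agenda.extend(successors)                 # back: BFS
        return None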

11
Bottom-Up Parsing
  • Bottom-Up Parsing is input-driven: start from the words and move up to form a tree
  • Here we match one or more nodes on the upper
    fringe of the parse tree against the right-hand
    side of a CFG rule, building the left-hand side
    as a parent node of those nodes.
  • We can also have breadth-first and depth-first
    approaches
  • The example on the next slide (p. 362, Fig. 10.4)
    moves in a breadth-first fashion
  • While any parse in progress will be tied to the
    input, many may not lead to an S!
  • e.g., left-most trees in plies 1-4 of Fig 10.4
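A minimal sketch of the bottom-up idea, again assuming the GRAMMAR dictionary from slide 2; this naive shift-reduce recognizer (exponential in the worst case) is an illustration, not the book's Fig. 10.4 search:

    # Either reduce a suffix of the fringe that matches some rule's RHS to
    # its LHS, or shift the next input word onto the fringe; succeed when
    # the fringe is exactly ["S"] and no input remains.
    def bottom_up(fringe, words):
        if fringe == ["S"] and not words:
            return True
        for lhs, rhss in GRAMMAR.items():         # try every reduction
            for rhs in rhss:
                n = len(rhs)
                if n <= len(fringe) and tuple(fringe[-n:]) == rhs:
                    if bottom_up(fringe[:-n] + [lhs], words):
                        return True
        # ... or shift the next word onto the fringe
        return bool(words) and bottom_up(fringe + [words[0]], words[1:])

    # bottom_up([], ["book", "that", "flight"])  ->  True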

12
Bottom-up parsing
13
Comparing Top-Down and Bottom-Up Parsing
  • Top-Down
  • While we guarantee that any parse in progress will be S-rooted, it will expand non-terminals that can't lead to the existing input, e.g., first 4 trees in third ply.
  • Bottom-Up
  • While any parse in progress will be tied to the
    input, many may not lead to an S, e.g., left-most
    trees in plies 1-4 of p. 362, Fig 10.4.
  • So, both pure top-down and pure bottom-up approaches are highly inefficient.

14
Left-Corner Parsing
  • Motivation
  • Both pure top-down and bottom-up approaches are
    inefficient
  • The correct top-down parse has to be consistent
    with the left-most word of the input
  • Left-corner parsing: a way of using bottom-up constraints as part of a top-down strategy.
  • Left-corner rule: expand a node with a grammar rule only if the current input can serve as the left corner of that rule.
  • Left corner of a rule: the first word along the left edge of a derivation from the rule
  • Put the left-corners into a table, which can then
    guide parsing

15
Left-Corner Example
  • S → NP VP
  • S → VP
  • S → Aux NP VP
  • NP → Det Nominal | ProperNoun
  • Nominal → Noun | Nominal Noun
  • VP → Verb | Verb NP
  • Noun → book | flight | meal | money
  • Verb → book | include | prefer
  • Aux → does
  • ProperNoun → Houston | TWA
  • Left Corners:
  • S > NP > Det, ProperNoun
  • S > VP > Verb
  • S > Aux > Aux
  • NP > Det, ProperNoun
  • VP > Verb
  • Nominal > Noun
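A sketch of how such a table can be computed from the GRAMMAR dictionary of slide 2 (the formulation is ours): take the first symbol of every right-hand side, then close the relation transitively.

    # Left-corner table: for each non-terminal, every symbol that can
    # appear first in one of its derivations (categories and words).
    def left_corners(grammar):
        table = {nt: {rhs[0] for rhs in rhss}     # direct left corners
                 for nt, rhss in grammar.items()}
        changed = True
        while changed:                            # transitive closure
            changed = False
            for corners in table.values():
                for c in list(corners):
                    extra = table.get(c, set()) - corners
                    if extra:
                        corners |= extra
                        changed = True
        return table

    # left_corners(GRAMMAR)["S"] contains Det, ProperNoun, Verb and Aux,
    # matching the table above (plus the words those categories dominate).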

16
Other problems: Left Recursion
  • Left-corner parsers still guided by top-down
    parsing
  • Consider rules like
  • S → S and S
  • NP → NP PP
  • A top-down left-to-right depth-first parser could
    apply a rule to expand a node (e.g., S), and then
    apply that same rule again, and again, ad
    infinitum.
  • Left recursion: a grammar is left-recursive if a non-terminal leads to a derivation that includes itself as its leftmost immediate or non-immediate child (i.e., along its leftmost branch).
  • PROBLEM: Top-down parsers may not terminate on a left-recursive grammar (see the detection sketch below)
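With the left-corner table from the earlier sketch, left recursion is easy to detect, assuming the grammar has no ε-rules (true of the grammars in this chapter): a grammar is left-recursive exactly when some non-terminal appears in its own left-corner set.

    def is_left_recursive(grammar):
        table = left_corners(grammar)     # from the sketch on slide 15
        return any(nt in corners for nt, corners in table.items())

    # is_left_recursive(GRAMMAR) -> True, because Nominal -> Nominal Noun
    # puts Nominal in its own left-corner set.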

17
Other problems: Repeated Parsing of Subtrees
  • When the parser backtracks to an alternative expansion of a non-terminal, it loses all parses of subconstituents that it built.
  • There is a good chance that it will rebuild the
    parses of some of those constituents again.
  • This can occur repeatedly.
  • "a flight from Indianapolis to Houston on TWA"
  • NP → Det Nom
  • Will build an NP for "a flight", before failing when the parser realizes the input PPs aren't covered
  • NP → NP PP
  • Will again build an NP for "a flight", before failing when the parser realizes the two remaining PPs in the input aren't covered

18
Other problems: Ambiguity
  • Repeated parsing of subtrees is even more of a
    problem for ambiguous sentences
  • PP attachment
  • NP or VP: "I shot an elephant in my pajamas."
  • NP bracketing: "the meal on flight 286 from SF to Denver"
  • Coordination
  • "[old men] and women" vs. "old [men and women]"
  • 3 kinds of ambiguities: attachment, coordination, noun-phrase bracketing.
  • Parsers have to disambiguate between lots of
    valid parses or return all parses
  • Will repeat a lot of work parsing the
    commonalities of each ambiguity

19
Ambiguity
20
Addressing the problems: Chart Parsing
  • More or less the standard method for carrying out parsing: keep tables of constituents that have been parsed earlier, so the parser doesn't duplicate the work.
  • Each possible sub-tree is represented exactly
    once.
  • This makes it a form of dynamic programming
    (which we saw with minimum edit distance and the
    Viterbi algorithm)
  • Combines bottom-up and top-down parsing
  • Rather simple and elegant in the way it works!

21
Earley Chart Parsing Representation
  • The parser uses a representation for parse state based on dotted rules, e.g., S → NP • VP
  • Dotted rules distinguish what has been seen so
    far from what has not been seen (i.e., the
    remainder).
  • The constituents seen so far are to the left of
    the dot in the rule, the remainder is to the
    right.
  • Parse information is stored in a chart,
    represented as a graph.
  • The nodes represent word positions.
  • The labels represent the portion (using the dot
    notation) of the grammar rule that spans that
    word position.
  • ? In other words, at each position, there is a
    set of labels (each of which is a dotted rule,
    also called a state), indicating the partial
    parse tree produced until then.
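One possible Python encoding of such a state, used by the Earley sketch on slide 29; the class and field names are illustrative, not the book's:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class State:
        lhs: str        # e.g. "S"
        rhs: tuple      # e.g. ("NP", "VP")
        dot: int        # how many RHS symbols have been seen
        start: int      # input position where this rule began
        end: int        # input position the dot has reached

        def complete(self):
            return self.dot == len(self.rhs)

        def next_symbol(self):
            return None if self.complete() else self.rhs[self.dot]

    # S -> NP . VP spanning words 0..2 is State("S", ("NP", "VP"), 1, 0, 2)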

22
Example Chart for A Dog
  • Given a trivial grammar:
  • NP → D N
  • D → a
  • N → dog
  • Here's the chart for the complete parse of "a dog":
  • 0 D → a • 1 (scan)
  • 1 N → dog • 2 (scan)
  • 0 NP → • D N 0 (predict)
  • 0 NP → D • N 1 (complete)
  • 0 NP → D N • 2 (complete)

23
More Chart Parsing Terminology
  • A state is complete if it has a dot at the
    right-hand side of its rule. Otherwise, it is
    incomplete.
  • At each position, there is a list (actually, a
    queue) of states.
  • The parser moves through the N+1 sets of states in the chart left-to-right, processing the states in each set in order.
  • States will be stored in a FIFO (first-in
    first-out) queue at each start position
  • The processing applies one of three operators,
    each of which takes a state and produces new
    states added to the chart.
  • Scanner, Predictor, Completer
  • There is no backtracking.

24
Earley Parsing Algorithm
  • The parsing algorithm is just a few lines long,
    as can be seen on p. 381, Figure 10.16
  • In the top level loop, for each position, for
    each state, it calls the predictor, or else the
    scanner, or else the completer.
  • The algorithm never backtracks and never removes states, so we don't redo any work
  • The goal is to have S → α • as the last chart entry, i.e., the dot has moved over the entire input to derive an S

25
The Earley Algorithm
26
The 3 Operators Predictor, Scanner, Completer
  • Procedure PREDICTOR((A → α • B β, [i, j]))
  • For each (B → γ) in grammar do
  • Enqueue((B → • γ, [j, j]), chart[j])
  • End
  • Procedure SCANNER((A → α • B β, [i, j]))
  • If B is a part-of-speech for word[j] then
  • Enqueue((B → word[j] •, [j, j+1]), chart[j+1])
  • Procedure COMPLETER((B → γ •, [j, k]))
  • For each (A → α • B β, [i, j]) in chart[j] do
  • Enqueue((A → α B • β, [i, k]), chart[k])
  • End

27
Prediction
  • Procedure PREDICTOR((A → α • B β, [i, j]))
  • For each (B → γ) in grammar do
  • Enqueue((B → • γ, [j, j]), chart[j])
  • End
  • Predicting is the task of saying what kinds of input we expect to see
  • Add a rule to the chart saying that we have not yet seen γ, but when we do, it will form a B
  • The rule covers no input, so it goes from j to j
  • Such rules provide the top-down aspect of the algorithm

28
Scanning
  • Procedure SCANNER((A → α • B β, [i, j]))
  • If B is a part-of-speech for word[j] then
  • Enqueue((B → word[j] •, [j, j+1]), chart[j+1])
  • Scanning reads in lexical items
  • We add a dotted rule indicating that a word has been seen between j and j+1
  • This is then added to the following (j+1) chart
  • Such a completed dotted rule can be used to
    complete other dotted rules
  • These rules also show how the Earley parser has a
    bottom-up component

29
Completion
  • Procedure COMPLETER((B → γ •, [j, k]))
  • For each (A → α • B β, [i, j]) in chart[j] do
  • Enqueue((A → α B • β, [i, k]), chart[k])
  • End
  • Completion combines two rules in order to move
    the dot, i.e., indicate that something has been
    seen
  • A rule covering B has been seen, so any rule A
    which refers to B in its RHS moves the dot
  • Instead of spanning from i to j, A now spans from
    i to k, which is where B ended
  • Once the dot is moved, the rule will not be
    created again
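Putting the three operators together, here is a compact Earley recognizer. It is a sketch under stated assumptions, not the book's Fig. 10.16: it reuses the State class (slide 21) and GRAMMAR dictionary (slide 2), assumes the grammar has no ε-rules, and treats single-word right-hand sides as lexical (part-of-speech) rules so that prediction and scanning share one branch:

    def earley(words, grammar, start="S"):
        chart = [[] for _ in range(len(words) + 1)]

        def enqueue(state, pos):
            if state not in chart[pos]:           # never add duplicates
                chart[pos].append(state)

        enqueue(State("GAMMA", (start,), 0, 0, 0), 0)   # dummy start state
        for j in range(len(words) + 1):
            for state in chart[j]:                # chart[j] grows as we go
                if state.complete():              # COMPLETER
                    for other in chart[state.start]:
                        if other.next_symbol() == state.lhs:
                            enqueue(State(other.lhs, other.rhs,
                                          other.dot + 1, other.start, j), j)
                else:                             # PREDICTOR / SCANNER
                    for rhs in grammar[state.next_symbol()]:
                        if len(rhs) == 1 and rhs[0] not in grammar:
                            # lexical rule: scan if it matches the next word
                            if j < len(words) and words[j] == rhs[0]:
                                enqueue(State(state.next_symbol(), rhs,
                                              1, j, j + 1), j + 1)
                        else:                     # predict a new rule at j
                            enqueue(State(state.next_symbol(), rhs, 0, j, j), j)
        return any(s.lhs == "GAMMA" and s.complete() for s in chart[-1])

    # earley(["book", "that", "flight"], GRAMMAR)  ->  True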

30
Example (Book that flight)
31
Example (Book that flight)
32
Example (Book that flight), cont'd
33
Example (Book that flight), cont'd
34
Example (Book that flight), cont'd
35
Earley parsing
  • The Earley algorithm is efficient, running in
    polynomial time.
  • Technically, however, it is a recognizer, not a
    parser
  • To make it a parser, each state needs to be
    augmented with a pointer to the states that its
    rule covers
  • For example, a VP would point to the state where
    its V was completed and the state where its NP
    was completed
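A sketch of that augmentation on top of the State class from slide 21 (field and class names are again illustrative, not the book's): each state carries the completed sub-states that advanced its dot, and the tree is read off the final state's children.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ParseState(State):
        children: tuple = ()      # completed sub-states, oldest first

    # In the completer, instead of building
    #     State(other.lhs, other.rhs, other.dot + 1, other.start, j)
    # we would build
    #     ParseState(other.lhs, other.rhs, other.dot + 1, other.start, j,
    #                other.children + (state,))
    # so a finished VP points at the states where its V and NP completed.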

36
Other Dynamic Programming methods
  • CYK (Cocke-Kasami-Younger) Parser
  • Uses Chomsky Normal Form (CNF) grammar rules
  • Chart Parsing
  • Modified version of Earley parsing with dynamic
    ordering of states in the algorithm

37
CYK Parsing
  • A DP method that uses a grammar in CNF
  • A → B C
  • A → w (a single terminal)
  • Any CFG can be converted to CNF, so we don't lose anything
  • Unit productions A → B can be eliminated by rewriting them as A → γ for every rule B → γ
  • Like other DP methods, a simple (n+1) × (n+1) matrix is used to encode the structure of the sentence (n = sentence length)
  • Indices refer to the gaps between words
  • 0 Book 1 that 2 flight 3
  • [i, j] is the set of non-terminals that represent all the constituents that span positions i through j of the input

38
CYK Parsing, cont'd
  • Since our grammar is in CNF, the non-terminal
    entries in the table have exactly two daughters
    in the parse.
  • For each constituent represented by an entry [i, j] in the table, there must be a position in the input, k, where it can be split into two parts, such that i < k < j.
  • Given such a k, the first constituent [i, k] must lie to the left of entry [i, j] somewhere along row i, and the second entry [k, j] must lie beneath it, along column j
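A CYK recognizer sketch following the table scheme just described. The grammar encoding is an assumption of this sketch: binary rules as a map {(B, C): set of A with A → B C} and lexical rules as {word: set of part-of-speech categories}.

    def cyk(words, binary, lexical, start="S"):
        n = len(words)
        # table[i][j] = set of non-terminals spanning positions i..j
        table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for j in range(1, n + 1):
            table[j - 1][j] = set(lexical.get(words[j - 1], ()))
            for i in range(j - 2, -1, -1):        # widen the span leftward
                for k in range(i + 1, j):         # every split i < k < j
                    for B in table[i][k]:
                        for C in table[k][j]:
                            table[i][j] |= binary.get((B, C), set())
        return start in table[0][n]

    # For "a dog" (slide 22's grammar, already in CNF):
    # cyk(["a", "dog"], {("D", "N"): {"NP"}},
    #     {"a": {"D"}, "dog": {"N"}}, start="NP")  ->  True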

39
CYK Algorithm
40
(No Transcript)
41
CYK example (CNF Grammar)
42
CYK example (Book the flight through Houston)
43
CYK in practice
  • Theoretically, CYK has no major problems
  • The resulting parse trees are not what syntacticians expect (because of the CNF form)
  • This complicates the syntax-to-semantics mapping
  • Post-processing is needed to convert the result back to a more acceptable form

44
Chart Parser
  • In both the CYK and Earley algorithms, the order in which events occur (adding entries to the table, reading words, making predictions, etc.) is statically determined by the procedures that make up these algorithms.
  • Unfortunately, dynamically determining the order
    in which events occur based on the current
    information is often necessary
  • Chart Parsing facilitates just such dynamic
    determination of the order in which chart entries
    are processed.
  • This is done using an agenda

45
Chart Parser
  • The fundamental rule generalizes the ideas in CYK and Earley
  • If the chart contains two edges (A → α • B β, [i, j]) and (B → γ •, [j, k]), then we should add the new edge (A → α B • β, [i, k]) to the chart
  • Prediction can be top-down or bottom-up

46
(No Transcript)
47
Prediction in Chart Parser
48
Inadequacies of parsing with plain CFGs
  • While the Earley algorithm works well for CFGs,
    we have to at some point question the validity of
    using plain CFGs
  • We'll show this by looking at two phenomena (although there are many more):
  • Subject-verb agreement
  • Subcategorization frames

49
Modeling Subject-Verb Agreement in CFGs
  • "The flights leave." vs. "The flight leaves."
  • S → 3sgNP 3sgVP
  • S → PluralNP PluralVP
  • 3sgVP → 3sgVerb ("flies")
  •   | 3sgVerb NP ("wants a flight")
  •   | 3sgVerb NP PP ("leaves Boston in the morning")
  •   | 3sgVerb PP ("leaves on Thursday")
  • 3sgNP → Pronoun ("I")
  •   | ProperNoun ("Denver")
  •   | Det 3sgNominal ("a flight")
  • 3sgNominal → Noun 3sgNominal ("morning flight")
  •   | 3sgNoun ("flight")

50
Problems with Modeling Agreement in CFGs
  • You can see how messy this is, resulting in a
    massive increase in the size of the grammar.
  • Of course, once we add in determiner-noun agreement (e.g., "a flight" vs. "(the) flights"), it would get even larger.
  • Other languages which have gender agreement
    (e.g., French) will make it even worse.
  • Furthermore, we miss generalizations: all transitive verbs have an NP object, regardless of whether the verb is 3rd-person singular or not
  • We will need to go to feature-based grammars to
    address these problems.

51
Subcategorization Frames in CFGs
  • V1, no complement: eat, sleep ("I want to eat")
  • V2, NP: prefer, find, leave ("Find [NP the flight from Pittsburgh to Boston]")
  • V3, NP NP: show, give, find ("Show [NP me] [NP the airlines with flights from Pittsburgh]")
  • V4, PPfrom PPto: fly, travel ("I would like to fly [PP from Boston] [PP to Philadelphia]")
  • V5, NP PPwith: help, load ("Can you help [NP me] [PP with a flight]")
  • V6, VPto: prefer, want, need ("I would prefer [VPto to go by United Airlines]")
  • V7, VPbare_stem: can, would, might ("I can [VP go from Boston]")
  • V8, S: mean, imply ("Does this mean [S American has a hub in Boston]")

52
CFG Grammar For Subcategorization
  • VP → V1
  •   | V2 NP
  •   | V3 NP NP
  •   | V4 PPfrom PPto
  •   | V5 NP PPwith
  •   | V6 VPto
  •   | V7 VPbare_stem
  •   | V8 S
  • V1 → eat | sleep | ...

53
Problem with Modeling Subcat in CFGs
  • Again, this results in an explosion in the number
    of rules, especially when a full set of
    subcategorization frames is included.
  • If we combine these rules with the agreement
    rules, it gets even worse
  • Nouns, adjectives, and prepositions can also subcategorize for complements.
  • And again, we have no way to state what's in common about these rules
  • So, we turn to feature-based grammars