Title: CPSC 503 Computational Linguistics
1. CPSC 503 Computational Linguistics
- Parsing 
 - Lecture 12 
 - Giuseppe Carenini
 
2. Today 27/2
- Top-down (TD) 
 - Bottom-up (BU) 
 - Comparing TD and BU 
 - TD depth-first left-to-right 
 - Adding BU Filtering 
 - The Earley Algorithm
 
3. Parsing with CFGs
[Diagram: the sentence "I prefer a morning flight" and a CFG are the inputs to the Parser]
- Assign valid trees: a valid tree covers all and only the elements of the input and has an S at the top
4. Parsing as Search
- Search space of possible parse trees

- S -> NP VP
- S -> Aux NP VP
- NP -> Det Noun
- VP -> Verb
- Det -> a
- Noun -> flight
- Verb -> left
- Aux -> do | does

- Parsing: find all trees that cover all and only the words in the input
5. Constraints on Search
[Diagram: the sentence "I prefer a morning flight" and the CFG (search space) are the inputs to the Parser]
- Search Strategies
  - Top-down or goal-directed
  - Bottom-up or data-directed
 
6. Top-Down Parsing
- Since we're trying to find trees rooted with an S (sentences), start with the rules that give us an S.
- Then work your way down from there to the words.
 
7. Next Step: Top-Down Space
- When POS categories are reached, reject trees whose leaves fail to match all words in the input
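A minimal Python sketch of this top-down, depth-first, left-to-right strategy, using the toy grammar from slide 4. The names `GRAMMAR`, `LEXICON`, `expand`, and `recognize` are illustrative and not from the lecture. It only recognizes (no trees are built), and it would recurse forever on a left-recursive rule.

```python
# Toy grammar from slide 4; the data layout and function names are assumptions.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
}
LEXICON = {
    "Det": {"a"},
    "Noun": {"flight"},
    "Verb": {"left"},
    "Aux": {"do", "does"},
}


def expand(symbols, words, pos):
    """Yield each input position reachable after deriving `symbols` from words[pos:]."""
    if not symbols:
        yield pos
        return
    first, rest = symbols[0], symbols[1:]
    if first in LEXICON:
        # A POS category is reached: keep the hypothesis only if the word matches.
        if pos < len(words) and words[pos] in LEXICON[first]:
            yield from expand(rest, words, pos + 1)
    else:
        # Always expand the leftmost symbol, trying rules in grammar order
        # (depth-first, with backtracking supplied by the generator).
        for rhs in GRAMMAR[first]:
            yield from expand(rhs + rest, words, pos)


def recognize(words):
    """True if some derivation from S covers all and only the input words."""
    return any(end == len(words) for end in expand(["S"], words, 0))
```

For example, `recognize("a flight left".split())` is true, while `recognize("a flight".split())` is false because no VP can be found.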
8. Bottom-Up Parsing
- Of course, we also want trees that cover the input words. So start with trees that link up with the words in the right way.
- Then work your way up from there.
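A rough Python sketch of the bottom-up idea, again on the slide 4 grammar: start from the words and keep replacing any substring that matches a rule's right-hand side with its left-hand side, until only S remains. The breadth-first search over reduced strings and the names `RULES` and `bottom_up_recognize` are assumptions made for illustration, not the lecture's procedure.

```python
from collections import deque

# Slide 4 grammar as (lhs, rhs) pairs; words are treated as terminals.
RULES = [
    ("S", ("NP", "VP")),
    ("S", ("Aux", "NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("VP", ("Verb",)),
    ("Det", ("a",)),
    ("Noun", ("flight",)),
    ("Verb", ("left",)),
    ("Aux", ("do",)),
    ("Aux", ("does",)),
]


def bottom_up_recognize(words, goal="S"):
    """Breadth-first search over the strings obtained by reducing the input."""
    start = tuple(words)
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == (goal,):
            return True
        for lhs, rhs in RULES:
            n = len(rhs)
            for i in range(len(form) - n + 1):
                if form[i:i + n] == rhs:
                    # Replace the matched right-hand side by its left-hand side.
                    reduced = form[:i] + (lhs,) + form[i + n:]
                    if reduced not in seen:
                        seen.add(reduced)
                        queue.append(reduced)
    return False
```

Note that for "left a flight" the search happily builds a VP and an NP before failing: locally consistent trees that make no sense globally.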
 
9. Two More Steps: Bottom-Up Space
10. Top-Down vs. Bottom-Up
- Top-down
  - Only searches for trees that can be answers
  - But suggests trees that are not consistent with the words
- Bottom-up
  - Only forms trees consistent with the words
  - But suggests trees that make no sense globally
 
11. So Combine Them
- Top-down control strategy to generate trees
- Bottom-up to filter out inappropriate parses

- Top-down control strategy
  - Depth-first vs. breadth-first
  - Which node to try to expand next
  - Which grammar rule to use to expand a node
 
12. Top-Down, Depth-First, Left-to-Right Search
Sample sentence: Does this flight include a meal?
13. Example
Does this flight include a meal?
14. Example
Does this flight include a meal?
15. Example
Does this flight include a meal?
16. Adding Bottom-Up Filtering
- The following sequence was a waste of time because an NP cannot generate a parse tree starting with an Aux
[Figure: sequence of partial parse trees over an input that begins with an Aux]
17. Bottom-Up Filtering
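One common way to implement such a filter is a left-corner table: for each category, precompute which categories can begin a tree rooted in it, and refuse to expand a node whose table does not contain the POS of the current word. The sketch below is an assumption about how this could be coded (slide 4 grammar, hypothetical names `left_corners` and `can_start_with`), and it assumes no empty rules.

```python
# Slide 4 grammar; POS_TAGS are the preterminal categories.
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
}
POS_TAGS = {"Det", "Noun", "Verb", "Aux"}


def left_corners(grammar, pos_tags):
    """Map each category to the set of categories that can start it."""
    # Every category is trivially its own left corner.
    table = {cat: {cat} for cat in list(grammar) + sorted(pos_tags)}
    changed = True
    while changed:  # iterate to a fixed point (transitive closure)
        changed = False
        for lhs, expansions in grammar.items():
            for rhs in expansions:
                new = table[rhs[0]] - table[lhs]
                if new:
                    table[lhs] |= new
                    changed = True
    return table


def can_start_with(category, word_pos, table):
    """Filter: is it worth expanding `category` when the next word has POS `word_pos`?"""
    return word_pos in table[category]


TABLE = left_corners(GRAMMAR, POS_TAGS)
```

With this table, `can_start_with("NP", "Aux", TABLE)` is false, so the wasted NP expansion on an Aux-initial input is never attempted, while `can_start_with("S", "Aux", TABLE)` is true.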
18. Problems with TD-BU Filtering
- Left recursion
- Ambiguity
- Repeated parsing

- SOLUTION: Earley Algorithm
  - (once again, dynamic programming!)
 
19. (1) Left Recursion
- These rules appear in most English grammars
  - S -> S and S
  - VP -> VP PP
  - NP -> NP PP
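A top-down, depth-first, left-to-right parser that expands NP -> NP PP immediately has to expand NP again without consuming any input, so it never terminates. A small Python sketch for spotting such categories follows. The three left-recursive rules are from the slide; the non-recursive alternatives, the PP rule, and the name `left_recursive` are added assumptions, and empty rules are assumed absent.

```python
# Left-recursive rules from the slide plus assumed non-recursive alternatives.
RULES = {
    "S": [["S", "and", "S"], ["NP", "VP"]],
    "VP": [["VP", "PP"], ["Verb"]],
    "NP": [["NP", "PP"], ["Det", "Noun"]],
    "PP": [["Prep", "NP"]],
}


def left_recursive(rules):
    """Return the nonterminals that can reach themselves through leftmost symbols."""
    # reach[a] = nonterminals that can appear leftmost after one or more expansions
    reach = {
        a: {rhs[0] for rhs in expansions if rhs[0] in rules}
        for a, expansions in rules.items()
    }
    changed = True
    while changed:  # transitive closure
        changed = False
        for a in rules:
            for b in list(reach[a]):
                new = reach[b] - reach[a]
                if new:
                    reach[a] |= new
                    changed = True
    return {a for a in rules if a in reach[a]}
```

Here `left_recursive(RULES)` returns S, VP, and NP, but not PP.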
 
20. (2) Ambiguity
- I shot an elephant in my pajamas
 
21. (3) Repeated Work
- Parsing is hard, and slow. It's wasteful to redo stuff over and over and over.
- Consider an attempt to top-down parse the following as an NP
  - A flight from Indi to Houston on TWA
 
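22.
- NP -> Det Nom
- NP -> NP PP
- Nom -> Noun
[Figure: partial parse tree over "flight"]
23.
- NP -> Det Nom
- NP -> NP PP
- Nom -> Noun
[Figure: partial parse tree over "flight"]
24.
[Figure: partial parse tree over "flight"]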
25, 26. [Figures not recovered]
27. Dynamic Programming
- Fills tables with solutions to subproblems

Parsing: sub-trees consistent with the input, once discovered, are stored and can be reused
- Does not fall prey to left recursion
- Stores ambiguous parses compactly
- Does not do (avoidable) repeated work
 
28. Earley Parsing
- Fills a table in a single sweep over the input words
  - Table is length N+1, where N is the number of words
  - Table entries represent
    - Completed constituents and their locations
    - In-progress constituents
    - Predicted constituents
 
29. States
- The table entries are called states and are represented with dotted rules.
  - S -> • VP (a VP is predicted)
  - NP -> Det • Nominal (an NP is in progress)
  - VP -> V NP • (a VP has been found)
 
30. States/Locations
- Each state has a location indicating the portion of the input it applies to
  - S -> • VP [0,0]: a VP is predicted at the start of the sentence
  - NP -> Det • Nominal [1,2]: an NP is in progress; the Det goes from 1 to 2
  - VP -> V NP • [0,3]: a VP has been found starting at 0 and ending at 3
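One possible Python encoding of a state as a dotted rule plus its location. The class name `State` and its field names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A dotted rule with the span of input it covers (hypothetical layout)."""
    lhs: str      # left-hand side of the rule
    rhs: tuple    # right-hand side symbols
    dot: int      # number of rhs symbols already found
    start: int    # input position where the constituent begins
    end: int      # input position reached so far

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """The symbol right after the dot, or None for a complete state."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self):
        symbols = list(self.rhs)
        symbols.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(symbols)} [{self.start},{self.end}]"


predicted = State("S", ("VP",), 0, 0, 0)
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)
found = State("VP", ("V", "NP"), 2, 0, 3)
```

Printing the three objects reproduces the three states on the slide. Making the class frozen keeps states hashable and immutable, which suits the "never add a duplicate" and "copy before advancing" rules used later.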
31. Graphically
S -> • VP [0,0]
NP -> Det • Nominal [1,2]
VP -> V NP • [0,3]
32. Earley answer
- As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
- In this case, the following state should be in the final column:
S -> α • [0,N]
- i.e., an S state that spans from 0 to N (the whole input, ending in the last of the N+1 columns) and is complete.
33. Earley processes
- So sweep through the table from 0 to N
  - New predicted states are created
    - E.g., S -> • VP [0,0] => VP -> • Verb [0,0]
  - New incomplete states are created by advancing existing states as new constituents are discovered
    - E.g., VP -> • Verb NP [..] => VP -> Verb • NP [..]
  - New complete states are created in the same way.
    - E.g., VP -> Verb • NP [..] => VP -> Verb NP • [..]
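The three kinds of state creation can be sketched as a compact Earley recognizer in Python. This is a minimal sketch under assumptions: the grammar and lexicon for "Book that flight" are made up for illustration, a dummy GAMMA -> S rule seeds the chart, the scanner advances a state directly over a matching POS tag, and empty rules are not supported.

```python
# Hypothetical toy grammar and lexicon (lower-cased words), not from the slides.
GRAMMAR = {
    "S": [("VP",), ("NP", "VP")],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",)],
    "VP": [("Verb", "NP"), ("Verb",)],
}
LEXICON = {"book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"}}


def earley_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]  # N + 1 columns

    def add(state, col):
        if state not in chart[col]:  # never add a state that is already there
            chart[col].append(state)

    # A state is (lhs, rhs, dot, origin); its end is the column it sits in.
    add(("GAMMA", (start,), 0, 0), 0)
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # the column may grow while it is processed
            lhs, rhs, dot, origin = chart[i][j]
            j += 1
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:
                    # Predictor: expect every expansion of nxt starting at i.
                    for expansion in grammar[nxt]:
                        add((nxt, expansion, 0, i), i)
                elif i < n and nxt in lexicon.get(words[i], ()):
                    # Scanner: the next word has the expected POS, so advance.
                    add((lhs, rhs, dot + 1, origin), i + 1)
            else:
                # Completer: advance copies of the states that were waiting for lhs.
                for w_lhs, w_rhs, w_dot, w_origin in list(chart[origin]):
                    if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
                        add((w_lhs, w_rhs, w_dot + 1, w_origin), i)
    # Success: a complete start state spanning the whole input is in the last column.
    return ("GAMMA", (start,), 1, 0) in chart[n]
```

For example, `earley_recognize("book that flight".split())` is true and `earley_recognize("book that".split())` is false.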
  
34. Example
Book that flight
- We should find an S from 0 to 3 that is a completed state
35. Example
Book that flight
36. So far only a recognizer
- To generate all parses
  - When old states waiting for the just-completed constituent are updated, add a pointer from each updated state to the completed one
  - Then simply read off all the backpointers from every complete S in the last column of the table
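A sketch of how the backpointers might be stored and read off, applied to the ambiguous sentence from slide 20. The grammar, lexicon, tuple layout, and all names are assumptions for illustration. Each pointer pairs the state that was advanced with the completed child (or scanned word) that advanced it. Empty rules and unary cycles are assumed absent.

```python
from collections import defaultdict

# Hypothetical toy grammar and lexicon (lower-cased words), not from the slides.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pronoun",), ("Det", "Noun"), ("NP", "PP")],
    "VP": [("Verb", "NP"), ("VP", "PP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "i": {"Pronoun"}, "shot": {"Verb"}, "an": {"Det"}, "my": {"Det"},
    "elephant": {"Noun"}, "pajamas": {"Noun"}, "in": {"Prep"},
}


def earley_parse(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]
    back = defaultdict(list)  # state -> [(advanced_from, child)]

    def add(state, pointer=None):
        column = chart[state[4]]  # a state lives in the column of its end
        if state not in column:
            column.append(state)
        if pointer is not None and pointer not in back[state]:
            back[state].append(pointer)  # keep every way of building the state

    # A state is (lhs, rhs, dot, origin, end).
    add(("GAMMA", (start,), 0, 0, 0))
    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):
            state = chart[i][j]
            j += 1
            lhs, rhs, dot, origin, _ = state
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:  # predictor
                    for expansion in grammar[nxt]:
                        add((nxt, expansion, 0, i, i))
                elif i < n and nxt in lexicon.get(words[i], ()):  # scanner
                    add((lhs, rhs, dot + 1, origin, i + 1), (state, (nxt, words[i])))
            else:  # completer: record who was advanced and by which constituent
                for waiting in list(chart[origin]):
                    w_lhs, w_rhs, w_dot, w_origin, _ = waiting
                    if w_dot < len(w_rhs) and w_rhs[w_dot] == lhs:
                        add((w_lhs, w_rhs, w_dot + 1, w_origin, i), (waiting, state))

    def trees(state):
        return [(state[0], *kids) for kids in children(state)]

    def children(state):
        """All subtree sequences for the symbols to the left of the dot."""
        if state[2] == 0:
            return [()]
        results = []
        for previous, child in back[state]:
            # A 2-tuple child is a scanned (POS, word) leaf; otherwise a state.
            subtrees = [child] if len(child) == 2 else trees(child)
            for left in children(previous):
                for sub in subtrees:
                    results.append(left + (sub,))
        return results

    final = ("GAMMA", (start,), 1, 0, n)
    return [tree[1] for tree in trees(final)] if final in chart[n] else []
```

On "i shot an elephant in my pajamas" this returns two trees, one attaching the PP to the VP and one attaching it to the NP. The chart itself stores each shared sub-constituent only once.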
37. Earley and Left Recursion
- So Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search.
  - Never place a state into the chart that's already there
  - Copy states before advancing them
 
38. Earley and Left Recursion 1
- S -> NP VP
- NP -> NP PP
- The first rule predicts
  - S -> • NP VP [0,0], which adds
  - NP -> • NP PP [0,0]
  - and stops there, since adding any subsequent prediction would be fruitless
39. Earley and Left Recursion 2
- When a state gets advanced, make a copy and leave the original alone
  - Say we have NP -> • NP PP [0,0]
  - We find an NP from 0 to 2, so we create
    - NP -> NP • PP [0,2]
  - But we leave the original state as is
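Both safeguards can be seen in a few lines of Python. The sketch assumes states are immutable tuples `(lhs, rhs, dot, start, end)`; the function names and the extra rule NP -> Det Noun are illustrative additions, not part of the slides.

```python
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "Noun")],  # NP -> Det Noun is an added assumption
}


def predict_column(grammar, start="S"):
    """Run the predictor to a fixed point in column 0."""
    column = []

    def add(state):
        if state not in column:  # never place a state that is already there
            column.append(state)

    for rhs in grammar[start]:
        add((start, rhs, 0, 0, 0))
    i = 0
    while i < len(column):
        lhs, rhs, dot, begin, end = column[i]
        i += 1
        # Predict expansions of the symbol after the dot, if it is a nonterminal.
        for expansion in grammar.get(rhs[dot], []):
            add((rhs[dot], expansion, 0, 0, 0))
    return column


def advance(state, new_end):
    """Return an advanced copy; the original tuple is left untouched."""
    lhs, rhs, dot, begin, _ = state
    return (lhs, rhs, dot + 1, begin, new_end)


column = predict_column(GRAMMAR)
original = column[1]            # NP -> • NP PP [0,0]
advanced = advance(original, 2)  # NP -> NP • PP [0,2]
```

Prediction stops after three states even though NP -> NP PP predicts NP again, and `original` is still in the column after `advance`, ready to be advanced by a longer NP later.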
 
40. Dynamic Programming Approaches
- Earley
  - Top-down, no filtering, no restriction on grammar form
- CYK
  - Bottom-up, no filtering, grammars restricted to Chomsky Normal Form (CNF)
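For contrast with Earley, a minimal CYK recognizer sketch in Python. The CNF grammar, the lexicon, and the names `BINARY`, `LEXICAL`, and `cyk_recognize` are hypothetical and chosen only to illustrate the bottom-up table filling.

```python
# Hypothetical CNF grammar: binary rules A -> B C and lexical rules A -> word.
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "Noun"): {"NP"},
    ("Verb", "NP"): {"VP"},
}
LEXICAL = {
    "this": {"Det"}, "a": {"Det"},
    "flight": {"Noun"}, "meal": {"Noun"},
    "includes": {"Verb"},
}


def cyk_recognize(words, binary=BINARY, lexical=LEXICAL, start="S"):
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the categories that span words[i:j].
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(lexical.get(word, ()))
    for span in range(2, n + 1):            # shorter spans are filled first
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # every split point of words[i:j]
                for left in table[i][k]:
                    for right in table[k][j]:
                        table[i][j] |= binary.get((left, right), set())
    return start in table[0][n]
```

For example, `cyk_recognize("this flight includes a meal".split())` is true, while `cyk_recognize("includes a meal".split())` is false because the words form only a VP.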
41. Next Time
- Read Chapter 11: Features and Unification