Chapter 4 - Part 3: Bottom-Up Parsing - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 4 - Part 3: Bottom-Up Parsing

Description:

Chapter 4 - Part 3: Bottom-Up Parsing Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Unit 2155 – PowerPoint PPT presentation

Date added: 6 January 2020
Slides: 89
Provided by: StevenAD2
Transcript and Presenter's Notes

Title: Chapter 4 - Part 3: Bottom-Up Parsing


1
Chapter 4 - Part 3: Bottom-Up Parsing
Prof. Steven A. Demurjian
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Way, Unit 2155, Storrs, CT 06269-3155
steve@engr.uconn.edu  http://www.engr.uconn.edu/steve  (860) 486-4818
Material for course thanks to Laurent Michel, Aggelos Kiayias, Robert LeBarre
2
Basic Intuition
  • Recall that
  • LL(k) works
  • TOP-DOWN
  • With a LEFTMOST Derivation
  • Predicts the right production to select based on
    lookahead
  • Our new motto
  • LR(k) works
  • BOTTOM-UP
  • With a RIGHTMOST Derivation
  • Commits to the production choice after seeing the whole body (right hand side), working in reverse

3
Bottom-Up Parsing
  • Inverse or Complement of Top-Down Parsing
  • Top Down Parsing Utilizes Start Symbol and
    Attempts to Derive the Input String using
    Productions
  • Bottom-Up Parsing Makes Modifications to the
    Input String which Allows it to Reduce to Start
    Symbol
  • For Example, Consider the Grammar: S → a A B e, A → A b c | b, B → d
  • What Does Each Derivation Represent?
  • Top-Down ---- Leftmost Derivation
  • Bottom-Up ---- Rightmost Derivation in Reverse!

Bottom-Up: abbcde ⇐ aAbcde ⇐ aAde ⇐ aABe ⇐ S
Top-Down:  S ⇒ aABe ⇒ aAbcBe ⇒ abbcBe ⇒ abbcde
4
Type of Derivation
  • Grammar: S → a A B e, A → A b c | b, B → d
  • Key Issues
  • How do we Determine which Substring to Reduce?
  • How do we Know which Production Rule to Use?
  • What is the General Processing for BUP?
  • How are Conflicts Resolved?
  • What Types of BUP are Considered?

TDP: S ⇒ aABe ⇒ aAbcBe ⇒ abbcBe ⇒ abbcde
BUP: S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde is a rightmost derivation that happens in reverse!
5
What is a Handle?
  • Defn: A Right-Sentential Form is a Sentential Form that has Been Derived in a Rightmost Derivation
  • S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
  • Every Form in this Derivation is a Right Sentential Form
  • Handle is a Substring of a Right Sentential Form that
  • Appears on Right Hand Side of Production Rule
  • Can be Used to Reduce the Right Sentential Form via a Substitution in a Step of a RM Derivation
  • Formally, a Handle is a rule A → β and a position in a Right Sentential Form γ s.t. S ⇒*rm αAw ⇒rm αβw = γ, where β occurs at that position in γ
  • Example: Handles in S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
  • aABe in aABe, d in aAde, Abc in aAbcde, and the first b in abbcde
  • Abc is Right Hand Side of Rule A → Abc at Position 2 in Right Sentential Form γ = aAbcde

6
What is a Handle?
  • Consider again...
  • S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde

S → aABe    A → Abc | b    B → d
7
Handle Pruning
  • What bottom-up really means...

abbcde
aAbcde
8
Handle Pruning
aAbcde
aAde
9
Handle Pruning
aAde
aABe
10
Handle Pruning
aABe
S
11
What's Going on in Parse Tree?
  • Consider Right Sentential Form αβw and Rule A → β

[Parse tree figure: root S; node A derives β; α lies to the left of A and w to its right]
What Does α Signify? Input Processed, Still on Parsing Stack
What Does w Contain? Input yet to be Consumed
What Does β Represent? Candidate Handle to be Reduced
12
Bottom-Up Parsing
  • Recognize the body of the last production applied in the rightmost derivation
  • Replace the symbol sequence of that body by the LHS of the Production Rule Based on Current Input
  • Repeats
  • At the end
  • Either
  • We are left with the start symbol → Success!
  • Or
  • We get stuck somewhere → Syntax error!
  • Key Issue If there are Multiple Handles for the
    Same Sentential Form, then the Grammar G is
    Ambiguous

13
General Processing of BUP
  • Basic mechanisms
  • Shift
  • Reduce
  • Basic data-structure
  • A stack of grammar symbols (Terminals and
    Non-Terminals)
  • Basic idea
  • Shift input symbols on the stack until the entire handle of the last rightmost reduction is on the stack
  • When the body of the last RM reduction is on Stack, reduce it by replacing the body by the left-hand-side of the Production Rule
  • When only start symbol is left
  • We are done.

14
Example
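Stack    Input     Action
         abbcde    Shift
a        bbcde     Shift
ab       bcde      Reduce by A → b
aA       bcde      Shift
aAb      cde       Shift
aAbc     de        Reduce by A → Abc
aA       de        Shift
aAd      e         Reduce by B → d
aAB      e         Shift
aABe               Reduce by S → aABe
S                  Accept

In each Reduce row, the Handle is the top of the stack that matches the right hand side of the Rule to Reduce with.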
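The trace above can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides: the function name `replay` and the encoding of actions as `"shift"` or `(head, body)` pairs are assumptions made for illustration. The action sequence is supplied by hand, so the sketch shows only the stack mechanics, not how a parser decides between shift and reduce.

```python
# Grammar from the slides: S -> a A B e, A -> A b c | b, B -> d
def replay(tokens, actions):
    """Apply a scripted list of shift/reduce actions; return stack snapshots and leftover input."""
    stack, rest, snapshots = [], list(tokens), []
    for action in actions:
        if action == "shift":
            stack.append(rest.pop(0))          # move the next input symbol onto the stack
        else:
            head, body = action                # ("A", "Abc") means reduce by A -> A b c
            body = list(body)
            # The handle must sit on top of the stack before we may reduce.
            assert stack[-len(body):] == body, f"handle {body} not on top of {stack}"
            del stack[-len(body):]             # pop the handle
            stack.append(head)                 # push the left-hand side
        snapshots.append("".join(stack))
    return snapshots, rest

ACTIONS = ["shift", "shift", ("A", "b"), "shift", "shift", ("A", "Abc"),
           "shift", ("B", "d"), "shift", ("S", "aABe")]
snapshots, rest = replay("abbcde", ACTIONS)
print(snapshots)
```

The printed snapshots are the successive stack contents; the input is fully consumed when the stack holds only S.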
19
Key Observation
  • At any point in time
  • Content of the stack is a prefix of a right-sentential form
  • This prefix is called a viable prefix
  • Check again!
  • Below are all the right-sentential forms of a rightmost derivation
  • S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde


a
ab
aA
aAb
aAbc
aA
aAd
aAB
aABe
S
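A small Python sketch of the check suggested above, under stated assumptions: it looks only at the right-sentential forms of this one derivation (the grammar generates more), and the handle end positions are filled in by hand. The name `is_viable` is hypothetical. A string counts as viable here if it is a prefix of one of the listed forms and does not extend past the right end of that form's handle.

```python
# (right-sentential form, index just past its handle), for the derivation on this slide.
# "S" is included for the final stack content; its handle is S itself (augmented rule S' -> S).
FORMS = [("S", 1), ("aABe", 4), ("aAde", 3), ("aAbcde", 4), ("abbcde", 2)]

STACK_CONTENTS = ["a", "ab", "aA", "aAb", "aAbc", "aA", "aAd", "aAB", "aABe", "S"]

def is_viable(prefix):
    """True if prefix is a prefix of a listed form that ends no later than that form's handle."""
    return any(form.startswith(prefix) and len(prefix) <= end for form, end in FORMS)

for content in STACK_CONTENTS:
    print(content, is_viable(content))
```

With the same function, `aAde`, `Abc` and `aAA` are rejected: the first runs past the handle d, and the other two are not prefixes of any listed form.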
20
What is General Processing for BUP?
  • Utilize a Stack Implementation
  • Contains Symbols, Non-Terminals, and Input
  • Input is Examined w.r.t. Stack/Current State
  • General Operation Options to Process Stack
    Include
  • Shift Symbols from Input onto Stack
  • When Handle ß on Top of Stack
  • Reduce by using Rule A ? ß
  • Pop all Symbols of Handle ß
  • Push Non-Terminal A onto Stack
  • When Configuration ($S, $) of Stack, ACCEPT
  • Error Occurs when Handle Can't be Found or S is on Stack with Non-Empty Input

21
Consider the Example Below
22
What are Possible Grammar Conflicts?
  • Shift-Reduce (S/R) Conflict
  • Content of Stack and Reading Current Input
  • More than One Option of What to do Next
  • stmt → if expr then stmt | if expr then stmt else stmt | other
  • Consider Stack as below with input of token else: $ ... if expr then stmt
  • Do we Reduce if expr then stmt to stmt
  • Do we Shift else onto Stack?

23
What are Possible Grammar Conflicts?
  • Reduce-Reduce (R/R) Conflict
  • stmt → id ( parameter_list )
  • parameter_list → parameter_list , parameter | parameter
  • parameter → id
  • expr → id ( expression_list ) | id
  • expression_list → expression_list , expr | expr
  • Consider Stack as below with input of token , : stack is ... id ( id and remaining input is , id ) ...
  • Do we Reduce to stmt?
  • Do we Reduce to expr?

24
Bottom-Up Parsing Techniques
  • LR(k) Parsers
  • Left to Right Input Scanning (L)
  • Construct a Rightmost Derivation in Reverse (R)
  • Use k Lookahead Symbols for Decisions
  • Advantages
  • Well Suited to Almost All PLs
  • Most General Approach/Efficiently Implemented
  • Detects Syntax Errors Very Quickly
  • Disadvantages
  • Difficult to Build by Hand
  • Tools to Assist Parser Construction (Yacc, Bison)

25
Components of an LR Parser
Grammar → Table Generator → Parsing Table (Differs Based on Grammar/Lookaheads)
Input Tokens + Parsing Table → Driver Routines → Output Parse Tree (Common to all LR Parsers)
26
Three Classes of LR Parsers
  • Simple LR (SLR) or LR(0)
  • Easiest but Limited in Grammar Applicability
  • Grammar Leads to S/R and R/R Conflicts
  • Canonical LR
  • Powerful but Expensive
  • LR(k) Usually LR(1)
  • Lookahead LR (LALR) In Between Two
  • Two Fold Focus
  • Parser Table Construction Item and Item Sets
  • Examination of LR Parsing Algorithm

27
LR Parser Structure
INPUT: a1 ... ai ai+1 ... an $
STACK (top to bottom): sm Xm sm-1 Xm-1 ... X1 s0
  Xi: Grammar symbol (Terminal or non-terminal)
  si: state
LR Parsing Program: reads INPUT and STACK, consults the action and goto tables, produces OUTPUT
Configuration: (s0 X1 s1 X2 ... Xm-1 sm-1 Xm sm , ai ai+1 ... an $)
  • action[sm, ai] is Parsing Table Entry with Four Options: 1. Shift s onto Stack 2. Reduce by Rule 3. Accept 4. Report an Error
  • goto[sm, ai] determines next state for action
  • Question: What does following Represent?

X1 X2 ... Xm-1 Xm ai ai+1 ... an
28
What is the Parsing Table?
  • Combination of State, Action, and Goto
  • Shift s5 means shift input symbol and state 5
  • Reduce r2 means reduce using rule 2
  • goto state/NT indicates the next state

29
Actions Against Configuration
Configuration: (s0 X1 s1 X2 ... Xm-1 sm-1 Xm sm , ai ai+1 ... an $)
  • action[sm, ai]
  • Shift s in Parsing Table: Move ai and s = sm+1 to Stack: (s0 X1 s1 X2 ... Xm-1 sm-1 Xm sm ai sm+1 , ai+1 ... an $)
  • Reduce A → β means
  • Remove 2|β| symbols from stack and Push A along with state s = goto[sm-r, A] onto stack, where r = |β|
  • Uses Prior State after popping to determine goto
  • Accept: Parsing Complete
  • Error: Call recovery Routine

30
How Does BUP Work?
Stack Input Action
31
Another Detailed Example
32
Constructing Parsing Tables
  • Three Types of Parsers (SLR, Canonical, LALR) all
    have Shared Concept for Parsing Table
    Construction
  • An Item Characterizes for Each Grammar Rule
  • What we've Seen or Derived
  • What we've Yet to See or Derive
  • Consider the Grammar Rule E → E + T
  • There are Four Items for this Rule: E → . E + T, E → E . + T, E → E + . T, E → E + T .
  • E → E . + T Means we've Derived E and have yet to Derive + T, so we are Expecting + Next
  • Note: A → ε has the Single Item A → .

In an item, what is left of the . Has Been Seen/Derived; what is right of the . is To Be Seen/Derived
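As a quick illustration of how the items of a rule are enumerated, here is a minimal Python sketch. The names `lr0_items` and `show`, and the representation of an item as a (head, seen, to-do) triple, are assumptions made for this example only.

```python
def lr0_items(head, body):
    """Return all LR(0) items of head -> body as (head, before_dot, after_dot) triples."""
    body = tuple(body)
    # One item per dot position: before the first symbol, between symbols, after the last.
    return [(head, body[:i], body[i:]) for i in range(len(body) + 1)]

def show(item):
    head, seen, todo = item
    return f"{head} -> {' '.join(seen + ('.',) + todo)}"

for item in lr0_items("E", ["E", "+", "T"]):
    print(show(item))

# An epsilon rule has an empty body and therefore exactly one item.
print(show(lr0_items("A", [])[0]))
```

A rule with n symbols on its right hand side always yields n + 1 items, which is why E → E + T has four.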
33
Another Characterization of Items
  • Consider the Grammar Rule E → E + T
  • There are Four Items for this Rule: E → . E + T, E → E . + T, E → E + . T, E → E + T .
  • This Represents Summary of History of Parse
  • Each Item Refers to
  • What's Been Placed on Stack (Left of .)
  • What Remains to Reduce for a Rule (Right of .)

For the item E → E + . T:
On stack (left of .): Seen a string derived from E; Found input through the +
Left to derive/reduce (right of .): Looking for String Derivable from T; Yet to process input for T
34
Start with SLR Parsing Table Construction
  • Step 1: Construct an Augmented Grammar which has a Single Alternative/Production Rule
  • Now, Every Derivation Starts with the Production Rule E' → E

Original:  E → E + T    E → T    T → T * F    T → F    F → ( E )    F → id
Augmented: E' → E    E → E + T    E → T    T → T * F    T → F    F → ( E )    F → id
35
Start with SLR Parsing Table Construction
  • Step 2: Construct the Closure of All Items
  • Intuitively, if A → α . B β is in Closure, we would Expect to see B β at Some Point in Derivation
  • If B → γ is a Production Rule, Expect to see a Substring Derivable from γ in Future
  • Step 3: Compute GOTO(Item_Set, X), where X is a Grammar Symbol
  • Intuitively, Identifies Which Items are Valid for Viable Prefix γ
  • Utilized to Determine Next Action (State) for the Parser
  • Note: Different from goto as Previously Discussed!

36
Calculating Closure
  • Closure(I) where I is Set of Items
  • 1. All Items in I are in Closure(I)
  • 2. If A → α . B β in Closure(I) and B → γ is a Production Rule, then Add B → . γ to Closure(I)
  • Repeat Step 2 Until there are No New Items Added
  • I0 = Closure({E' → . E}) --- Add in Following Items:
  • E' → . E - Rule 1 - Any Rules E → γ? - Yes
  • E → . E + T - Rule 2, E → . T - Rule 3 - Any Rules T → γ? - Yes
  • T → . T * F - Rule 4, T → . F - Rule 5 - Any Rules F → γ? - Yes
  • F → . ( E ) - Rule 6, F → . id - Rule 7
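The closure steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the course's own code: the dictionary `GRAMMAR` and the encoding of an item as a `(head, body, dot)` triple are assumptions chosen for brevity.

```python
# Augmented expression grammar from the slides.
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def closure(items):
    """items: set of (head, body, dot). Add B -> . gamma for every non-terminal B right of a dot."""
    result = set(items)
    changed = True
    while changed:                       # repeat until no new items are added
        changed = False
        for head, body, dot in list(result):
            if dot < len(body) and body[dot] in GRAMMAR:
                for production in GRAMMAR[body[dot]]:
                    new_item = (body[dot], production, 0)
                    if new_item not in result:
                        result.add(new_item)
                        changed = True
    return result

I0 = closure({("E'", ("E",), 0)})
for head, body, dot in sorted(I0):
    print(head, "->", " ".join(body[:dot] + (".",) + body[dot:]))
```

Starting from the single item E' → . E, the loop pulls in the E rules, then the T rules, then the F rules, giving the seven items of I0.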

37
What's Next Step?
  • Recall the Parsing Table
  • States are 0, 1, 2, ..., 11 which Correspond to Item Sets
  • actions based on Input and Current State
  • goto is What State to Transition to Next
  • This is a Push Down Automaton!
  • What are Three Critical Functions to Calculate?
  • State closure
  • To compute the set of productions in a given
    state
  • Transition function
  • To compute the states reachable from a given
    state
  • Items
  • To compute the set of states in the PDA

38
What is Important Part of Process?
  • Viable Prefix Definition
  • (1) a string that equals a prefix of a
    right-sentential form up to (and including) its
    unique handle.
  • (2) any prefix of a string that satisfies (1)
  • Essentially a prefix of a right-sentential form that does not extend past the handle
  • May be inclusive of entire handle (right hand side of a production rule)
  • Examples of Viable Prefixes are
  • a, aA, aAd, aAbc, ab, aAb, ...
  • Not viable prefixes: aAde, Abc, aAA, ...

39
What is The Big Deal ?
  • Consider the stack again
  • Each Snapshot of the Stack is a Prefix of a right sentential form
  • They are all Viable Prefixes
  • When Parsing, two Alternatives
  • lengthening a viable prefix
  • pruning a handle
  • In other words...
  • States represent viable prefixes
  • We transition between viable prefixes!


a
ab
aA
aAb
aAbc
aA
aAd
aAB
aABe
S
Answer: We are either lengthening a viable prefix or pruning a handle
40
Intuition for this Process
  • Objective
  • Turn a Grammar into a PDA
  • We want
  • A PDA
  • With states that capture viable prefixes
  • We have
  • A grammar
  • With production rules
  • We know that
  • Production rules are used to derive handles
  • Viable prefixes are (strings) prefixes of handles

41
Example
  • Consider augmented grammar given below.
  • Assume that
  • We start the parsing (with E) and therefore
  • We are at the initial state of the PDA
  • We have some input (e.g., id id id)
  • Questions
  • Which productions are activated at this point ?
  • In other words, which productions could be used
    to match the rest of the input ?

1 E' → E    2 E → E + T    3 | T
4 T → T * F    5 | F    6 F → ( E )    7 | id
42
Example II
  • Consider the Derivation Given Below
  • In Example, Production Rules 1,2,3,5,7 are
    active and utilized to lead to the viable
    prefix id

1 E' → E    2 E → E + T    3 | T
4 T → T * F    5 | F    6 F → ( E )    7 | id
E' ⇒ E by (1) ⇒ E + T by (2) ⇒ T + T by (3) ⇒ F + T by (5) ⇒ id + T by (7) ....
43
PDA State: Closure({E' → . E})
  • A PDA State is...
  • The set of productions that are active in the
    state
  • Question
  • How do we compute that from G ?

1 E' → E    2 E → E + T    3 | T
4 T → T * F    5 | F    6 F → ( E )    7 | id
State I0, built up step by step:
E' → . E
E' → . E    E → . E + T
E' → . E    E → . E + T    E → . T
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
44
PDA Transition
  • How can we leave state I0 ?
  • What does it mean to leave I0 ?
  • Terminals means that we've Consumed the terminal from the input stream
  • Non-terminals means that we have pushed onto the stack the non-terminal, input, and states that will allow for a future reduction

State I0
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
This defines the GOTO Function!
This defines the GOTO Function!
45
The GOTO Function
  • GOTO(I, X) is Defined for
  • An item set I
  • A grammar symbol (non-terminal or terminal) X
  • GOTO(I, X) = {items A → α X . β where A → α . X β in I}
  • Algorithmically
  • Look for Rules of Form A → α . X β
  • Identify the Grammar Symbols in I to Right of .
  • Group all A → α . X β with Same X to Form a New State
  • Compute the Closure of the New State for All X
  • This leads to

46
Destination states
State I0
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
47
Destination states
State I0
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
  • For GOTO(I0, ( ) we compute Closure({F → ( . E )})
  • Since E → E + T and E → T, include E → . E + T, E → . T
  • Since T → T * F and T → F, include T → . T * F, T → . F
  • Since F → ( E ) and F → id, include F → . ( E ), F → . id
  • Now, compute GOTO(I1, X) for X = E, T, F, (, id

48
What Does it Mean when . at End of Rule?
State I0
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
  • For the Three States above, the . Occurs at the end of an Item
  • E → T . and T → F . and F → id .
  • Each of these is a Reduction to Replace
  • T by E on Stack
  • F by T on Stack
  • id by F on Stack

49
How is this Interpreted
State I0
E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
  • Represents the Possible Next Steps in a Derivation
  • Consider Symbol Directly to Right of .
  • That is what we Expect to see Next in a Derivation
  • For two Rules, we Expect to See E
  • Move . to Right to Consume E for Both Production Rules
  • We've Seen E
  • We expect to see What Follows . Next
  • Now, Compute Closure({E' → E . , E → E . + T}) = State I1

E' → E .    E → E . + T
50
Continue Process to Yield
  • The State Machine also Represents Viable Prefixes
  • Possible Combinations that appear on Parsing Stack

51
Viable Prefixes and Valid Items
  • Consider a Derivation S' ⇒*rm α A w ⇒rm α β1 β2 w
  • Let α β1 be a Viable Prefix
  • A → β1 . β2 is a Valid Item for α β1 if the above derivation exists
  • When α β1 is on the Parsing Stack, Two Cases
  • If β2 ≠ ε Then we Don't have Handle on Stack
  • If β2 = ε Then Perhaps A → β1 is the Reduction
  • However, Reduction Choice may not be Limited to a
    Single Production Rule
  • There may be two or more Valid Items for the Same
    Viable Prefix!
  • Shift/Reduce or Reduce/Reduce Conflicts Possible!

52
How Does this Relate to State Machine?
  • Consider the Viable Prefix E + T *
  • Each State in Machine Represents a Set of One or More Items
  • Specifically, for E + T *, we end up in State I7 if you Follow the Transitions of the State Machine

53
Consider the State
  • Item Set is with three possible derivations
  • Which do you Choose? Why?

T → T * . F    F → . ( E )    F → . id
E' ⇒ E ⇒ E + T ⇒ E + T * F
E' ⇒ E ⇒ E + T ⇒ E + T * F ⇒ E + T * ( E )
E' ⇒ E ⇒ E + T ⇒ E + T * F ⇒ E + T * id
54
End Result of Process?
  • Machine that Contains
  • All Item Set States
  • Transitions Between States on
  • Terminals
  • Non-Terminals
  • What do we need this for?
  • To Construct the Parsing Table!

55
What's Next Step?
  • Constructing SLR Parsing table
  • action[state, symbol]
  • goto[state, symbol]
  • Easy Part of this Process
  • Determining shift actions
  • Examine Machine for all terminal transitions
  • These are shifts from one state to next
  • Push both the terminal and state onto parsing
    stack
  • More Difficult Part of this Process
  • Reductions are Items with . at End of Item
  • Two Questions
  • What is the input that Determines Correct
    Reduction?
  • What is the state to push onto Stack?

56
Recall First and Follow Calculations
  • Recall the Grammar
  • First(E') = First(E) = First(T) = {(, id}
  • Follow(E') = {$}
  • Follow(E) = {First(+ T), First( ) ), First($)} = {+, ), $}
  • Follow(T) = {Follow(E), First(* F)} = {+, ), $, *}
  • Follow(F) = Follow(T) = {+, ), $, *}

1 E' → E    2 E → E + T    3 | T
4 T → T * F    5 | F    6 F → ( E )    7 | id
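The First and Follow sets above can be recomputed with a short fixed-point iteration. The Python sketch below is illustrative only: it assumes a grammar without ε rules (true for this expression grammar), and the names `first_sets` and `follow_sets` are made up for the example.

```python
GRAMMAR = [
    ("E'", ["E"]),
    ("E", ["E", "+", "T"]), ("E", ["T"]),
    ("T", ["T", "*", "F"]), ("T", ["F"]),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def first_sets():
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            x = body[0]                  # no epsilon rules, so only the first symbol matters
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first

def follow_sets(first, start="E'"):
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add("$")               # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, sym in enumerate(body):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(body):    # something follows sym inside the rule
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                    # sym ends the rule: inherit Follow(head)
                    new = follow[head]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

FIRST = first_sets()
FOLLOW = follow_sets(FIRST)
print(FIRST)
print(FOLLOW)
```

Grammars with ε rules would need the usual nullable handling, which this sketch deliberately leaves out.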
57
Return to Item Sets
  • Suppose an Item Set Contains the Item A → α .
  • When Reach this Item it is Time to Reduce and Replace α on the Stack with A
  • However, What is the Input under which this Reduction is Allowed to Occur?
  • Want to Replace α with A
  • Reading some current input x
  • Only Do the Reduction if x in Follow(A)
  • Consider Two Reductions in the Same Item Set, A → α . and B → α . with current input x
  • If x in Follow(A), reduce using A → α
  • If x in Follow(B), reduce using B → α
  • If x in both, Reduce/Reduce Conflict!
  • We'll See Two Examples Shortly

58
Back to Item Sets/State Machine
  • RED underlines are all shifts with associated
    gotos
  • BLUE circles are all gotos for non-terminals
  • GREEN underlines are all reductions
  • Reductions are based on Follow

59
Action and goto tables
  • Action contains shifts, reduction, and accept
    (green)
  • All other states are error states
  • Goto contains the next state to shift onto the
    stack

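Next state for each transition (blank = none):

State  id   +    *    (    )    $    E    T    F
0      5              4              1    2    3
1           6
2                7
3
4      5              4              8    2    3
5
6      5              4                   9    3
7      5              4                        10
8           6              11
9                7
10
11

Kind of entry (S = shift/transition, Rn = reduce by rule n, acc = accept, blank = error):

State  id   +    *    (    )    $    E    T    F
0      S              S              S    S    S
1           S                   acc
2           R2   S         R2   R2
3           R4   R4        R4   R4
4      S              S              S    S    S
5           R6   R6        R6   R6
6      S              S                   S    S
7      S              S                        S
8           S              S
9           R1   S         R1   R1
10          R3   R3        R3   R3
11          R5   R5        R5   R5

The Rn entries number the rules without the augmented rule: 1 E → E + T, 2 E → T, 3 T → T * F, 4 T → F, 5 F → ( E ), 6 F → id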
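To see the tables drive a parse, here is a minimal Python sketch of an LR driver loop. It is an illustration under assumptions, not the course's implementation: the dictionaries `ACTION`, `GOTO` and `RULES` are one possible encoding of the table above, the rule numbers 1 to 6 refer to the grammar without the augmented rule (as the R entries suggest), and the stack keeps states only, so a reduction pops |β| entries instead of the 2|β| symbols and states described on the slides.

```python
# rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}
REDUCE_ON = ["+", "*", ")", "$"]

def reduce_row(rule, shift=None):
    """Row that reduces by `rule` on + * ) $, optionally overridden by shifts."""
    row = {tok: f"r{rule}" for tok in REDUCE_ON}
    row.update(shift or {})
    return row

START = {"id": "s5", "(": "s4"}
ACTION = {
    0: START, 4: START, 6: START, 7: START,
    1: {"+": "s6", "$": "acc"},
    2: reduce_row(2, {"*": "s7"}),
    3: reduce_row(4),
    5: reduce_row(6),
    8: {"+": "s6", ")": "s11"},
    9: reduce_row(1, {"*": "s7"}),
    10: reduce_row(3),
    11: reduce_row(5),
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def parse(tokens):
    """Return the list of rule numbers used, in the order the reductions happen."""
    stack, tokens, pos, reductions = [0], list(tokens) + ["$"], 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        if act == "acc":
            return reductions
        if act[0] == "s":                      # shift: push the new state, consume the token
            stack.append(int(act[1:]))
            pos += 1
        else:                                  # reduce: pop |body| states, then follow goto
            rule = int(act[1:])
            head, length = RULES[rule]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][head])
            reductions.append(rule)

print(parse(["id", "*", "id", "+", "id"]))
```

The reductions come out in the order of a rightmost derivation in reverse; an input such as `id id` raises `SyntaxError` because state 5 has no entry for `id`.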
60
Formal Algorithms
  • To Calculate the Parsing Table, we Require Three
    Algorithms
  • State closure
  • To compute the set of productions in a given
    state
  • Transition function
  • To compute the states reachable from a given
    state
  • Items
  • To compute the set of states in the PDA
  • Algorithms from Prof. Michel

61
State Closure Algorithm
function closure(setItem I) : setItem
  setItem J0 = I
  repeat
    Ji+1 = Ji
    for each A → α.Bβ in Ji and each B → γ in P s.t. B → .γ not in Ji
      Ji+1 = Ji+1 ∪ {B → .γ}
    i = i + 1
  until Ji = Ji-1
  return Ji
62
GOTO Function
function GOTO(setItem s, symbol X) : setItem
  setItem J = ∅
  for each c in s
    if c of the form A → α.Xβ
      J = J ∪ {A → αX.β}
  return closure(J)
63
All State Functions (set-of-items)
function items(Grammar G) : setState
  setState C0 = {closure({S' → .S})}
  i = 0
  repeat
    Ci+1 = Ci
    for each S in Ci and each symbol X in G
      Z = goto(S, X)
      if Z ≠ ∅ AND Z not in Ci then Ci+1 = Ci+1 ∪ {Z}
    i = i + 1
  until Ci = Ci-1
  return Ci
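The three functions can be put together in a compact Python sketch that builds the whole set of item sets for the expression grammar. This is an illustrative rendering, not Prof. Michel's code: a worklist replaces the indexed sets Ji and Ci, and the names `GRAMMAR`, `SYMBOLS` and the `(head, body, dot)` item encoding are assumptions.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
SYMBOLS = ["E", "T", "F", "+", "*", "(", ")", "id"]

def closure(items):
    result, work = set(items), list(items)
    while work:
        _, body, dot = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:        # non-terminal right of the dot
            for production in GRAMMAR[body[dot]]:
                item = (body[dot], production, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

def goto(state, symbol):
    # Move the dot over `symbol` wherever possible, then close the result.
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return frozenset(closure(moved))

def items():
    states = [frozenset(closure({("E'", ("E",), 0)}))]
    for state in states:                 # the list grows while we iterate over it
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
    return states

STATES = items()
print(len(STATES))
```

With the symbol order chosen here the states happen to be discovered in the order I0 to I11 used by the parsing table; a different order would give the same sets under different numbers.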
64
Using Ambiguous Grammars
  • Ambiguous Grammars will Cause Multiple Entries
    for a given state/terminal in Parsing Table
  • Results in Two Types of Conflicts
  • Shift/Reduce Conflicts
  • Reduce/Reduce Conflicts
  • Compiler Writing Tools (Yacc, Bison, etc.)
    Automatically Resolve these by
  • For Shift/Reduce chooses Shift
  • For Reduce/Reduce Reduce by earlier rule
  • Consider Two Examples
  • Dangling Else
  • Simplified Expression Grammar
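The default resolution policy described above fits in a few lines. The Python sketch below is only an illustration of the policy, not code from Yacc or Bison: the function `resolve` and the `"s5"` / `"r2"` strings for table entries are assumptions made for the example.

```python
def resolve(candidates):
    """candidates: actions such as 's5' or 'r2' competing for one parsing table cell."""
    shifts = [a for a in candidates if a.startswith("s")]
    if shifts:
        return shifts[0]                                  # shift/reduce: prefer the shift
    return min(candidates, key=lambda a: int(a[1:]))      # reduce/reduce: earlier rule wins

print(resolve(["r2", "s5"]))   # shift/reduce conflict
print(resolve(["r4", "r2"]))   # reduce/reduce conflict
```

Real generators also report the conflicts they resolve this way, so a grammar author can check that the default matches the intended language.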

65
Dangling Else Ambiguity
  • Recall the Grammar: stmt → if expr then stmt else stmt | if expr then stmt | other
  • Rewrite the Grammar as s' → s, s → i s e s | i s | a, Essentially collapsing if expr then into i, else into e, and stmt into s, with a representing all other statements
  • Now Compute LR(0) Items and SLR Parsing Table

66
The Item Sets for the Grammar
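Follow(s') = {$}    Follow(s) = {$, e}

I0: s' → . s    s → . i s e s    s → . i s    s → . a
I1: s' → s .
I2: s → i . s e s    s → i . s    s → . i s e s    s → . i s    s → . a
I3: s → a .
I4: s → i s . e s    s → i s .
I5: s → i s e . s    s → . i s e s    s → . i s    s → . a
I6: s → i s e s .

Transitions: I0 --s--> I1, I0 --i--> I2, I0 --a--> I3, I2 --s--> I4, I2 --i--> I2, I2 --a--> I3, I4 --e--> I5, I5 --s--> I6, I5 --i--> I2, I5 --a--> I3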
67
The Parsing table
Follow(s') = {$}    Follow(s) = {$, e}
Rules: 1 s → i s e s, 2 s → i s, 3 s → a
  • Notice s/r conflict for action[4, e]: if <expr> then <stmt> else <stmt>
  • If shift on else what is the result w.r.t. language?
  • If reduce on else what is the result w.r.t. language?

68
Solution to Dangling Else
  • Pick Shift over Reduce: action[4, e] = s5
  • Consider input iiaea which is equivalent to if <expr> then if <expr> then <stmt> else <stmt>
  • Parser as follows w.r.t. stack/input
  • Using this approach, we eliminate the need for a more complex unambiguous grammar with more rules

Stack (symbols only)   Input   Action
i i s                  e a $   shift e
i i s e                a $     shift a
i i s e a              $       reduce using s → a
i i s e s              $       reduce using s → i s e s
i s                    $       reduce using s → i s
s                      $       accept
69
Example 2 Simplified Expression Grammar
  • Consider the Grammar
  • E → E + E | E * E | ( E ) | id
  • What's Problem with this Grammar?
  • Why would this Grammar be Preferable?
  • Employ Techniques Similar to Previous Example to Remove Multiple Table Entries
  • Result is to Achieve both Associative and Precedence Behavior for + and *
  • Change Assoc/Precedence by Changing Table
  • No more Extra Work → Improve Performance

70
First, Calculate Item Sets
Follow(E') = {$}    Follow(E) = {$, +, *, )}

I0: E' → . E    E → . E + E    E → . E * E    E → . ( E )    E → . id
I1: E' → E .    E → E . + E    E → E . * E
I2: E → ( . E )    E → . E + E    E → . E * E    E → . ( E )    E → . id
I3: E → id .
I4: E → E + . E    E → . E + E    E → . E * E    E → . ( E )    E → . id
I5: E → E * . E    E → . E + E    E → . E * E    E → . ( E )    E → . id
I6: E → ( E . )    E → E . + E    E → E . * E
I7: E → E + E .    E → E . + E    E → E . * E
I8: E → E * E .    E → E . + E    E → E . * E
I9: E → ( E ) .
71
Consider States I7 and I8
  • State I7: E → E + E . gives action[7, +] = reduce by E → E + E, action[7, *] = reduce by E → E + E, action[7, )] = reduce by E → E + E, action[7, $] = reduce by E → E + E
  • E → E . + E gives action[7, +] = shift to state 4; E → E . * E gives action[7, *] = shift to state 5
  • State I8: action[8, +] = reduce by E → E * E or shift to state 4; action[8, *] = reduce by E → E * E or shift to state 5
  • How is Each Conflict Resolved?

72
Parsing Table
Rules: 1 E' → E    2 E → E + E    3 E → E * E    4 E → ( E )    5 E → id
+ is left assoc
Shift * onto stack since it has higher precedence
Reduce using rule 2 regardless of + or *
73
Canonical Parser Table Construction
  • Not all Parser Tables are Created Equally!
  • Differentiate between SLR/LR(0), LR(1), and
    LALR(1) (Yacc/Bison)
  • Key Issue Utilization of Lookaheads
  • SLR: Current Input
  • LR(1): Current Input plus Next Token
  • LR(k): Current Input plus Next k Tokens
  • Consider id id id

LR(1): id determines if shift or reduce; 2nd token determines rule. If conflict, 2nd token can break tie on the fly (disambiguation): sometimes s, sometimes r, depends on that 2nd token
SLR/LR(0): Current Input
74
Recall the Prior Grammar
  • Item set I0 as given below left
  • For LR(1) items, we must consider basis on which
    the rule causes a shift on a lookahead terminal
  • When we put E' → . E into LR(1) set, we must also consider the first terminal that appears after E
  • This is the lookahead

LR(0): E' → . E    E → . E + T    E → . T    T → . T * F    T → . F    F → . ( E )    F → . id
Step 1 LR(1): E' → . E, $    E → . E + T, $    E → . T, $
Step 2 LR(1): E' → . E, $    E → . E + T, $/+    E → . T, $
Step 3 LR(1): E' → . E, $    E → . E + T, $/+    E → . T, $/+
What appears after E in 2nd Item?
If it appears after E, what else does it appear after?
75
Another Way to View Process
  • Closure({E' → . E, $}) begins with placing E' → . E, $ into the item set
  • Since E → E + T, we place E → . E + T, $ into item set carrying along lookahead $ from E' → . E, $
  • Now, for E → . E + T, $ what can E on right hand side be replaced with? E → E + T again!
  • If we do this replacement, we need to ask what is the lookahead that follows E on r.h.s. in E → E + T?
  • We calculate First(+ T), the remainder of the rule
  • This is +, so we add in this additional lookahead

E' → . E, $    E → . E + T, $    E → . E + T, +
We abbreviate this as
E' → . E, $    E → . E + T, $/+
76
Continuing
  • Since E → T, we add E → . T, $/+ into the Set
  • Now, what does T go to? T → T * F and T → F
  • So we add T → . T * F, $/+ and T → . F, $/+ into Set
  • What can T go to? T → T * F
  • What is the First token following T? First(* F) = *
  • So, add in * to get T → . T * F, $/+/*
  • Since T → F, we also add * to yield T → . F, $/+/*
  • Are we done?

77
Continuing
  • Since T → . F, $/+/* we now consider the two F rules F → ( E ) and F → id
  • We add in the items F → . ( E ), $/+/*
  • F → . id, $/+/* bringing along the lookaheads from T → . F, $/+/*
  • The lookaheads in this case are First(what follows F concatenated with $/+/*)
  • This is $/+/*!
  • We arrive at item set I0

LR(1): E' → . E, $    E → . E + T, $/+    E → . T, $/+    T → . T * F, $/+/*    T → . F, $/+/*    F → . ( E ), $/+/*    F → . id, $/+/*
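The lookahead propagation walked through on these slides can be checked with a short Python sketch of the LR(1) closure. It is a sketch under assumptions: the grammar has no ε rules, so First of "rest of the rule followed by the lookahead" is decided by the first symbol of the rest when there is one; the names `first_of`, `closure1` and `lookaheads_by_core` are invented for the example.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

def first_of(symbol, visiting=()):
    """First set of one symbol; adequate for grammars without epsilon rules."""
    if symbol not in GRAMMAR:
        return {symbol}
    result = set()
    for body in GRAMMAR[symbol]:
        if body[0] != symbol and body[0] not in visiting:   # skip left recursion
            result |= first_of(body[0], visiting + (symbol,))
    return result

def closure1(items):
    """items: set of (head, body, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot < len(body) and body[dot] in GRAMMAR:
            rest = body[dot + 1:]
            # First(rest followed by la): rest[0] decides when rest is non-empty.
            lookaheads = first_of(rest[0]) if rest else {la}
            for production in GRAMMAR[body[dot]]:
                for b in lookaheads:
                    item = (body[dot], production, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return result

def lookaheads_by_core(items):
    """Group lookaheads per LR(0) core, matching the $/+/* abbreviation on the slides."""
    table = {}
    for head, body, dot, la in items:
        table.setdefault((head, body, dot), set()).add(la)
    return table

I0 = closure1({("E'", ("E",), 0, "$")})
SUMMARY = lookaheads_by_core(I0)
for (head, body, dot), las in sorted(SUMMARY.items()):
    print(head, "->", " ".join(body[:dot] + (".",) + body[dot:]), ",", "/".join(sorted(las)))
```

Each core with several lookaheads stands for several LR(1) items, which is where the extra states of canonical LR come from.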
78
Another Example LR(0) Sets
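S' → S    S → C C    C → c C | d
Follow(S') = Follow(S) = {$}    Follow(C) = {c, d, $}

I0: S' → . S    S → . C C    C → . c C    C → . d
I1: S' → S .
I2: S → C . C    C → . c C    C → . d
I3: C → c . C    C → . c C    C → . d
I4: C → d .
I5: S → C C .
I6: C → c C .

Transitions: I0 --S--> I1, I0 --C--> I2, I0 --c--> I3, I0 --d--> I4, I2 --C--> I5, I2 --c--> I3, I2 --d--> I4, I3 --C--> I6, I3 --c--> I3, I3 --d--> I4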
79
Now Consider LR(1) Sets
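Follow(S') = Follow(S) = {$}    Follow(C) = {c, d, $}

I0: S' → . S, $    S → . C C, $    C → . c C, c/d    C → . d, c/d
I1: S' → S ., $
I2: S → C . C, $    C → . c C, $    C → . d, $
I3: C → c . C, c/d    C → . c C, c/d    C → . d, c/d
I4: C → d ., c/d
I5: S → C C ., $
I6: C → c . C, $    C → . c C, $    C → . d, $
I7: C → d ., $
I8: C → c C ., c/d
I9: C → c C ., $

Transitions: I0 --S--> I1, I0 --C--> I2, I0 --c--> I3, I0 --d--> I4, I2 --C--> I5, I2 --c--> I6, I2 --d--> I7, I3 --C--> I8, I3 --c--> I3, I3 --d--> I4, I6 --C--> I9, I6 --c--> I6, I6 --d--> I7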
80
Parsing Table
  • Easy to Construct from the State Machine
  • Shifts on terminals (arcs)
  • Reductions based on lookaheads
  • Gotos as with SLR case

State   action            goto
        c     d     $     S     C
0       s3    s4          1     2
1                   acc
2       s6    s7                5
3       s3    s4                8
4       r3    r3
5                   r1
6       s6    s7                9
7                   r3
8       r2    r2
9                   r2
81
What's Real Problem Here?
  • Grammar we used with 3 Production Rules
  • Result was 10 LR(1) states!
  • For Expression Grammar (slide 58), LR(1) would
    have 22 states!
  • Lookahead LR Parsing (LALR), on which Compiler Tools (Yacc, Bison) are Based, Achieves Similar Results with Fewer States
  • Objective is to Create LR(1) Sets
  • Identify Sets with Similar Cores (Items are the
    same but lookaheads may be different)
  • Merge Sets with Similar Cores
  • Factor of 10 in Reduction of States
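The merge step can be illustrated on the ten LR(1) sets of the S → C C grammar. In the Python sketch below the sets are typed in by hand as (item, lookahead) pairs rather than computed, and `merge_by_core` is a hypothetical helper name; it groups sets whose items agree once lookaheads are ignored and unions their lookaheads.

```python
def expand(items, lookaheads):
    """Helper: pair every item string with every lookahead."""
    return {(item, la) for item in items for la in lookaheads}

LR1_STATES = {
    "I0": expand(["S' -> . S", "S -> . C C"], "$") | expand(["C -> . c C", "C -> . d"], "cd"),
    "I1": expand(["S' -> S ."], "$"),
    "I2": expand(["S -> C . C", "C -> . c C", "C -> . d"], "$"),
    "I3": expand(["C -> c . C", "C -> . c C", "C -> . d"], "cd"),
    "I4": expand(["C -> d ."], "cd"),
    "I5": expand(["S -> C C ."], "$"),
    "I6": expand(["C -> c . C", "C -> . c C", "C -> . d"], "$"),
    "I7": expand(["C -> d ."], "$"),
    "I8": expand(["C -> c C ."], "cd"),
    "I9": expand(["C -> c C ."], "$"),
}

def merge_by_core(states):
    merged = {}
    for name, items in states.items():
        core = frozenset(item for item, _ in items)      # drop the lookaheads
        names, all_items = merged.setdefault(core, ([], set()))
        names.append(name)
        all_items |= items                               # union the lookaheads
    # Name a merged state after its members, e.g. I3 + I6 -> I36.
    return {"I" + "".join(n[1:] for n in names): items for names, items in merged.values()}

LALR = merge_by_core(LR1_STATES)
print(sorted(LALR))
```

For this grammar the ten LR(1) sets collapse to seven; a full LALR construction would also redirect the transitions to the merged states.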

82
What are the Similar Cores?
(LR(1) item sets I0 to I9 as on the slide "Now Consider LR(1) Sets")
Sets with the same core (same items, different lookaheads): I3 and I6, I4 and I7, I8 and I9
83
Resulting State Machine
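I0: S' → . S, $    S → . C C, $    C → . c C, c/d    C → . d, c/d
I1: S' → S ., $
I2: S → C . C, $    C → . c C, $    C → . d, $
I5: S → C C ., $
I36: C → c . C, c/d/$    C → . c C, c/d/$    C → . d, c/d/$
I47: C → d ., c/d/$
I89: C → c C ., c/d/$

Transitions: I0 --S--> I1, I0 --C--> I2, I0 --c--> I36, I0 --d--> I47, I2 --C--> I5, I2 --c--> I36, I2 --d--> I47, I36 --C--> I89, I36 --c--> I36, I36 --d--> I47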
84
With Simplified Parsing Table
State   action              goto
        c      d      $     S     C
0       s36    s47          1     2
1                     acc
2       s36    s47                5
36      s36    s47                89
47      r3     r3     r3
5                     r1
89      r2     r2     r2
85
Parser Generators
  • The entire process we describe can be automated
  • Computation of the machine states
  • Computation of the lookaheads
  • Computation of the action and goto tables
  • Optimization of the LALR tables.
  • Therefore...
  • Tools exist to do this for you!

86
Parser Generators II
  • In the C/C++ world
  • Most famous parser generator
  • YACC LALR(1)
  • Most used parser generator
  • BISON LALR(1)
  • Table-driven leftmost
  • PCCTS LL(k)
  • In the Java world
  • Several alternatives
  • CUP (a BISON/YACC lookalike) LALR(1)
  • JACK LALR(1)

87
Big Picture
88
The Road Ahead
  • What are we missing ?
  • A parse tree!
  • How can we get one ?
  • By augmenting the grammar!
  • With actions: pieces of Java code
  • Purpose of actions
  • Manufacture the tree as a side-effect of parsing.
  • Reading
  • Syntax directed translation via
  • Attribute Grammars
  • Yacc