Chapter 4 - Part 3 Bottom-Up Parsing

Prof. Steven A. Demurjian
Computer Science & Engineering Department
The University of Connecticut
371 Fairfield Way, Unit 2155
Storrs, CT 06269-3155

steve_at_engr.uconn.edu
http://www.engr.uconn.edu/steve
(860) 486 - 4818

Material for course thanks to Laurent Michel, Aggelos Kiayias, Robert LeBarre

Basic Intuition

- Recall that
- LL(k) works
- TOP-DOWN
- With a LEFTMOST Derivation
- Predicts the right production to select based on lookahead
- Our new motto
- LR(k) works
- BOTTOM-UP
- With a RIGHTMOST Derivation
- Commits to the production choice after seeing the whole body (right-hand side), working in reverse

Bottom-Up Parsing

- Inverse or Complement of Top-Down Parsing
- Top-Down Parsing Utilizes Start Symbol and Attempts to Derive the Input String using Productions
- Bottom-Up Parsing Makes Modifications to the Input String which Allow it to be Reduced to the Start Symbol
- For Example, Consider the Grammar: S → a A B e, A → A b c | b, B → d
- What Does Each Derivation Represent?
- Top-Down ---- Leftmost Derivation
- Bottom-Up ---- Rightmost Derivation in Reverse!

Bottom-Up (reductions): abbcde → aAbcde → aAde → aABe → S

Top-Down (leftmost derivation): S ⇒ aABe ⇒ aAbcBe ⇒ abbcBe ⇒ abbcde

Type of Derivation

- Grammar: S → a A B e, A → A b c | b, B → d
- Key Issues
- How do we Determine which Substring to Reduce?
- How do we Know which Production Rule to Use?
- What is the General Processing for BUP?
- How are Conflicts Resolved?
- What Types of BUP are Considered?

TDP: S ⇒ aABe ⇒ aAbcBe ⇒ abbcBe ⇒ abbcde

BUP: S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde is a rightmost derivation that happens in reverse!

What is a Handle?

- Defn: A Right-Sentential Form is a Sentential Form that has Been Derived in a Rightmost Derivation
- S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde
- Underline all Right Sentential Forms
- Handle is a Substring of a Right Sentential Form that
- Appears on Right Hand Side of Production Rule
- Can be Used to Reduce the Right Sentential Form via a Substitution in a Step of a RM Derivation
- Formally: a rule A → β and a position in Right Sentential Form γ s.t. S ⇒* αAw ⇒ αβw = γ (both rightmost), where the position is that of β following α
- Example: Handles are Marked with [ ] in
- S ⇒ [aABe] ⇒ aA[d]e ⇒ a[Abc]de ⇒ a[b]bcde
- Abc is Right Hand Side of Rule A → A b c at Position 2 in Right Sentential Form γ = aAbcde

What is a Handle?

- Consider again...
- S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde

Grammar: S → a A B e, A → A b c | b, B → d

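Handle Pruning

- What bottom-up really means...
- abbcde → aAbcde (prune handle b using A → b)
- aAbcde → aAde (prune handle Abc using A → A b c)
- aAde → aABe (prune handle d using B → d)
- aABe → S (prune handle aABe using S → a A B e)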
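The four pruning steps can be replayed mechanically. The following is a minimal Python sketch, assuming the handle positions are supplied by hand (finding them automatically is what the rest of the chapter is about); the names `prune`, `STEPS`, and `reduce_to_start` are hypothetical.

```python
# Grammar from the slides: S -> a A B e, A -> A b c | b, B -> d

def prune(form, position, lhs, rhs):
    """Replace the handle `rhs` found at 1-based `position` in `form` by `lhs`."""
    start = position - 1
    if form[start:start + len(rhs)] != rhs:
        raise ValueError(f"{rhs!r} is not at position {position} of {form!r}")
    return form[:start] + lhs + form[start + len(rhs):]

# (position, left-hand side, handle) for each step; chosen by hand here.
STEPS = [(2, "A", "b"), (2, "A", "Abc"), (3, "B", "d"), (1, "S", "aABe")]

def reduce_to_start(sentence, steps=STEPS):
    """Return every sentential form visited while pruning handles."""
    forms = [sentence]
    for position, lhs, rhs in steps:
        forms.append(prune(forms[-1], position, lhs, rhs))
    return forms

print(" -> ".join(reduce_to_start("abbcde")))
```

Read right to left, the printed forms are the rightmost derivation of abbcde.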

What's Going on in Parse Tree?

- Consider Right Sentential Form αβw and Rule A → β
- [Figure: parse tree rooted at S with interior node A; the frontier reads α, then β (under A), then w]
- What Does α Signify? Input Processed, Still on Parsing Stack
- What Does w Contain? Input yet to be Consumed
- What Does β Represent? Candidate Handle to be Reduced

Bottom-Up Parsing

- Recognize body of last production applied in rightmost derivation
- Replace the symbol sequence of that body by the LHS of the Production Rule, Based on Current Input
- Repeat
- At the end
- Either
- We are left with the start symbol → Success!
- Or
- We get stuck somewhere → Syntax error!
- Key Issue: If there are Multiple Handles for the Same Sentential Form, then the Grammar G is Ambiguous

General Processing of BUP

- Basic mechanisms
- Shift
- Reduce
- Basic data-structure
- A stack of grammar symbols (Terminals and Non-Terminals)
- Basic idea
- Shift input symbols on the stack until the entire handle of the last rightmost reduction is on the stack
- When the body of the last RM reduction is on Stack, reduce it by replacing the body by the left-hand side of the Production Rule
- When only start symbol is left
- We are done.

Example

Stack   Input     Action   Handle   Rule to Reduce with
$       abbcde$   Shift
$a      bbcde$    Shift
$ab     bcde$     Reduce   b        A → b
$aA     bcde$     Shift
$aAb    cde$      Shift
$aAbc   de$       Reduce   Abc      A → A b c
$aA     de$       Shift
$aAd    e$        Reduce   d        B → d
$aAB    e$        Shift
$aABe   $         Reduce   aABe     S → a A B e
$S      $         Accept

Key Observation

- At any point in time
- Content of the stack is a prefix of a right-sentential form
- This prefix is called a viable prefix
- Check again!
- Below, all the right-sentential forms of a rightmost derivation
- S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde

Stack contents during the parse, in order: a, ab, aA, aAb, aAbc, aA, aAd, aAB, aABe, S

What is General Processing for BUP?

- Utilize a Stack Implementation
- Contains Symbols, Non-Terminals, and Input
- Input is Examined w.r.t. Stack/Current State
- General Operation Options to Process Stack Include
- Shift Symbols from Input onto Stack
- When Handle β on Top of Stack
- Reduce by using Rule A → β
- Pop all Symbols of Handle β
- Push Non-Terminal A onto Stack
- When Configuration ($S, $) of Stack, ACCEPT
- Error Occurs when Handle Can't be Found or S is on Stack with Non-Empty Input

Consider the Example Below

What are Possible Grammar Conflicts?

- Shift-Reduce (S/R) Conflict
- Content of Stack and Reading Current Input
- More than One Option of What to do Next
- stmt → if expr then stmt | if expr then stmt else stmt | other
- Consider Stack as below with input of token else
- Stack: $ ... if expr then stmt
- Do we Reduce if expr then stmt to stmt?
- Do we Shift else onto Stack?

What are Possible Grammar Conflicts?

- Reduce-Reduce (R/R) Conflict
- stmt → id ( parameter_list )
- parameter_list → parameter_list , parameter | parameter
- parameter → id
- expr → id ( expression_list ) | id
- expression_list → expression_list , expr | expr
- Consider Stack as below with the remaining input starting at token ,
- Stack: $ ... id ( id     Input: , id ) ... $
- Do we Reduce the id on Top of Stack to parameter (heading toward stmt)?
- Do we Reduce it to expr?

Bottom-Up Parsing Techniques

- LR(k) Parsers
- Left to Right Input Scanning (L)
- Construct a Rightmost Derivation in Reverse (R)
- Use k Lookahead Symbols for Decisions
- Advantages
- Well Suited to Almost All PLs
- Most General Approach/Efficiently Implemented
- Detects Syntax Errors Very Quickly
- Disadvantages
- Difficult to Build by Hand
- Tools to Assist Parser Construction (Yacc, Bison)

Components of an LR Parser

[Figure: Grammar → Table Generator → Parsing Table; Input Tokens → Driver Routines (which consult the Parsing Table) → Output Parse Tree. The Parsing Table Differs Based on Grammar/Lookaheads; the Driver Routines are Common to all LR Parsers.]

Three Classes of LR Parsers

- Simple LR (SLR) or LR(0)
- Easiest but Limited in Grammar Applicability
- Grammar Leads to S/R and R/R Conflicts
- Canonical LR
- Powerful but Expensive
- LR(k) Usually LR(1)
- Lookahead LR (LALR) In Between Two
- Two Fold Focus
- Parser Table Construction Item and Item Sets
- Examination of LR Parsing Algorithm

LR Parser Structure

[Figure: the LR Parsing Program reads the INPUT a1 ... ai ai+1 ... an $, keeps a stack s0 X1 s1 ... Xm-1 sm-1 Xm sm (top at sm), consults the action and goto tables, and produces the OUTPUT. Each Xi is a Grammar symbol (Terminal or non-terminal); each si is a state.]

Configuration: (s0 X1 s1 X2 ... Xm-1 sm-1 Xm sm , ai ai+1 ... an $)

- action[sm, ai] is the Parsing Table Entry, with Four Options: 1. Shift s onto Stack 2. Reduce by Rule 3. Accept 4. Report an Error
- goto[s, A] (state s, non-terminal A) determines the next state after a Reduce
- Question: What does following Represent?

X1 X2 ... Xm-1 Xm ai ai+1 ... an

What is the Parsing Table?

- Combination of State, Action, and Goto
- Shift s5 means shift input symbol and state 5
- Reduce r2 means reduce using rule 2
- goto state/NT indicates the next state

Actions Against Configuration

Configuration: (s0 X1 s1 X2 ... Xm-1 sm-1 Xm sm , ai ai+1 ... an $)

- action[sm, ai]
- Shift s in Parsing Table: Move ai and s onto Stack, giving (s0 X1 s1 X2 ... Xm-1 sm-1 Xm sm ai s , ai+1 ... an $)
- Reduce A → β means
- Remove 2·|β| symbols from stack and Push A along with state s = goto[sm-r, A] onto stack, where r = |β|
- Uses Prior State after popping to determine goto
- Accept: Parsing Complete
- Error: Call recovery Routine
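The four actions can be sketched as a small table-driven driver in Python. This is an illustration under stated assumptions, not the course's implementation: the table is the standard SLR table for the expression grammar E → E + T | T, T → T * F | F, F → ( E ) | id with rules numbered 1 to 6 in that order, and the stack holds states only (so a reduce pops |β| entries rather than 2·|β|). The names `RULES`, `ACTION`, `GOTO`, and `lr_parse` are hypothetical.

```python
# Rule number -> (left-hand side, length of right-hand side)
# 1 E->E+T  2 E->T  3 T->T*F  4 T->F  5 F->(E)  6 F->id
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {0: {"E": 1, "T": 2, "F": 3}, 4: {"E": 8, "T": 2, "F": 3},
        6: {"T": 9, "F": 3}, 7: {"F": 10}}

def lr_parse(tokens):
    """Return the rule numbers in the order they are reduced, or raise SyntaxError."""
    stack = [0]                          # states only; symbols are implied
    tokens = list(tokens) + ["$"]
    pos, output = 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:                  # empty table entry: error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at token {pos}")
        if act == "acc":
            return output
        if act[0] == "s":                # shift: push new state, consume token
            stack.append(int(act[1:]))
            pos += 1
        else:                            # reduce A -> beta
            rule = int(act[1:])
            lhs, length = RULES[rule]
            del stack[-length:]          # pop |beta| states
            stack.append(GOTO[stack[-1]][lhs])   # goto from the exposed state
            output.append(rule)

print(lr_parse(["id", "*", "id", "+", "id"]))
```

The returned rule numbers, read backwards, spell out a rightmost derivation of the input.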

How Does BUP Work?

Stack Input Action

Another Detailed Example

Constructing Parsing Tables

- Three Types of Parsers (SLR, Canonical, LALR) all have Shared Concept for Parsing Table Construction
- An Item Characterizes for Each Grammar Rule
- What we've Seen or Derived
- What we've Yet to See or Derive
- Consider the Grammar Rule E → E + T
- There are Four Items for this Rule: E → . E + T, E → E . + T, E → E + . T, E → E + T .
- E → E . + T Means we've Derived E and have yet to Derive + T, so we are Expecting + Next
- Note A → ε has the Single Item A → .

Layout of an Item: [Has Been Seen/Derived] . [To Be Seen/Derived]

Another Characterization of Items

- Consider the Grammar Rule E → E + T
- There are Four Items for this Rule: E → . E + T, E → E . + T, E → E + . T, E → E + T .
- This Represents Summary of History of Parse
- Each Item Refers to
- What's Been Placed on Stack (Left of .)
- What Remains to Reduce for a Rule (Right of .)

For the Item E → E + . T:
- On stack (left of .): Seen a string derived from E; Found input through the +
- Left to derive/reduce (right of .): Looking for String Derivable from T; Yet to process input for T

Start with SLR Parsing Table Construction

- Step 1: Construct an Augmented Grammar whose New Start Symbol E' has a Single Alternative/Production Rule
- Now, Every Derivation Starts with the Production Rule E' → E

Original:
E → E + T
E → T
T → T * F
T → F
F → ( E )
F → id

Augmented:
E' → E
E → E + T
E → T
T → T * F
T → F
F → ( E )
F → id

Start with SLR Parsing Table Construction

- Step 2: Construct the Closure of All Items
- Intuitively, if A → α . B β is in Closure, we would Expect to see B β at Some Point in Derivation
- If B → γ is a Production Rule, Expect to see a Substring Derivable from γ in Future
- Step 3: Compute the GOTO (Item_Set, X), where X is a Grammar Symbol
- Intuitively, Identifies Which Items are Valid for Viable Prefix γ
- Utilized to Determine Next Action (State) for the Parser
- Note: Different from goto as Previously Discussed!

Calculating Closure

- Closure (I) where I is Set of Items
- Step 1: All Items in I are in Closure (I)
- Step 2: If A → α . B β in Closure (I) and B → γ is a Production Rule, then Add B → . γ to Closure (I)
- Repeat Step 2 Until there are No New Items Added
- I0 = Closure ({E' → . E}) --- Add in Following Items
- E' → . E - Rule 1
- Any Rules E → γ? - Yes
- E → . E + T - Rule 2
- E → . T - Rule 3
- Any Rules T → γ? - Yes
- T → . T * F - Rule 4
- T → . F - Rule 5
- Any Rules F → γ? - Yes
- F → . ( E ) - Rule 6
- F → . id - Rule 7

What's Next Step?

- Recall the Parsing Table
- States are 0, 1, 2, ..., 11 which Correspond to Item Sets
- actions based on Input and Current State
- goto is What State to Transition to Next
- This is a Push Down Automaton!
- What are Three Critical Functions to Calculate?
- State closure
- To compute the set of productions in a given state
- Transition function
- To compute the states reachable from a given state
- Items
- To compute the set of states in the PDA

What is Important Part of Process?

- Viable Prefix Definition
- (1) a string that equals a prefix of a right-sentential form up to (and including) its unique handle.
- (2) any prefix of a string that satisfies (1)
- Essentially a prefix of a right-sentential form that does not extend past the handle
- May be inclusive of entire handle (right hand side of a production rule)
- Examples of Viable Prefixes are
- a, aA, aAd, aAbc, ab, aAb, ...
- Not viable prefixes: aAde, Abc, aAA, ...
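Definition (1) and (2) can be checked directly on the running example. The sketch below is in Python and covers only the four right-sentential forms of the derivation S ⇒ aABe ⇒ aAde ⇒ aAbcde ⇒ abbcde (the grammar has infinitely many more, since A → A b c is recursive); the handle end positions are written in by hand, and `FORMS` and `viable_prefixes` are hypothetical names.

```python
# Each right-sentential form is paired with the index just past its handle:
# aABe -> handle aABe, aAde -> handle d, aAbcde -> handle Abc, abbcde -> handle b.
FORMS = [("aABe", 4), ("aAde", 3), ("aAbcde", 4), ("abbcde", 2)]

def viable_prefixes(forms):
    """All non-empty prefixes that do not extend past the handle."""
    return {form[:n]
            for form, handle_end in forms
            for n in range(1, handle_end + 1)}

prefixes = viable_prefixes(FORMS)
for candidate in ["a", "aA", "aAd", "aAbc", "ab", "aAb", "aAde", "Abc", "aAA"]:
    print(candidate, candidate in prefixes)
```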

What is The Big Deal ?

- Consider the stack again
- Stack contents during the parse: a, ab, aA, aAb, aAbc, aA, aAd, aAB, aABe, S
- Each Stack Content is a Prefix of a right-sentential form
- They are all Viable Prefixes
- When Parsing, two Alternatives. Answer: We are either
- lengthening a viable prefix
- pruning a handle
- In other words...
- States represent viable prefixes
- We transition between viable prefixes!

Intuition for this Process

- Objective
- Turn a Grammar into a PDA
- We want
- A PDA
- With states that capture viable prefixes
- We have
- A grammar
- With production rules
- We know that
- Production rules are used to derive handles
- Viable prefixes are (strings) prefixes of handles

Example

- Consider augmented grammar given below.
- Assume that
- We start the parsing (with E) and therefore
- We are at the initial state of the PDA
- We have some input (e.g., id id id)
- Questions
- Which productions are activated at this point ?
- In other words, which productions could be used

to match the rest of the input ?

1 E' → E
2 E → E + T
3   | T
4 T → T * F
5   | F
6 F → ( E )
7   | id

Example II

- Consider the Derivation Given Below
- In Example, Production Rules 1,2,3,5,7 are

active and utilized to lead to the viable

prefix id

1 E' → E
2 E → E + T
3   | T
4 T → T * F
5   | F
6 F → ( E )
7   | id

E' ⇒ E        by (1)
   ⇒ E + T    by (2)
   ⇒ T + T    by (3)
   ⇒ F + T    by (5)
   ⇒ id + T   by (7)
   ....

PDA State (Closure({E' → . E}))

- A PDA State is...
- The set of productions that are active in the state
- Question
- How do we compute that from G ?

1 E' → E
2 E → E + T
3   | T
4 T → T * F
5   | F
6 F → ( E )
7   | id

State I0, built up step by step:
- Start with E' → . E
- E follows the dot: add E → . E + T, E → . T
- T follows the dot: add T → . T * F, T → . F
- F follows the dot: add F → . ( E ), F → . id

State I0: E' → . E, E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id

PDA Transition

- How can we leave state I0 ?
- What does it mean to leave I0 ?
- On a Terminal: means that we've Consumed the terminal from the input stream
- On a Non-terminal: means that we have pushed onto the stack the non-terminal, input, and states that will allow for a future reduction

State I0: E' → . E, E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id

This defines the GOTO Function!

The GOTO Function

- GOTO(I, X) is Defined for
- An item set I
- A grammar symbol (non-terminal or terminal) X
- GOTO(I, X) = Closure of the items A → α X . β where A → α . X β is in I
- Algorithmically
- Look for Rules of Form A → α . X β
- Identify the Grammar Symbols in I to Right of .
- Group all A → α . X β with Same X to Form a New State
- Compute the Closure of the New State for All X
- This leads to Destination states

State I0: E' → . E, E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id

Destination states

- For GOTO(I0, ( ) we compute Closure({F → ( . E )})
- Since E → E + T and E → T, include E → . E + T, E → . T
- Since T → T * F and T → F, include T → . T * F, T → . F
- Since F → ( E ) and F → id, include F → . ( E ), F → . id
- Now, compute GOTO(I1, X) for X = E, T, F, (, id

What Does it Mean when . at End of Rule?

- For the Three States Reached from I0 on T, F, and id, the . Occurs at the end of an Item
- E → T . and T → F . and F → id .
- Each of these is a Reduction to Replace
- T by E on Stack
- F by T on Stack
- id by F on Stack

How is this Interpreted

State I0: E' → . E, E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id

- Represents the Possible Next Steps in a Derivation
- Consider Symbol Directly to Right of .
- That is what we Expect to see Next in a Derivation
- For two Rules, we Expect to See E
- Move . to Right to Consume E for Both Production Rules
- We've Seen E
- We expect to see What Follows . Next
- Now, Compute Closure({E' → E . , E → E . + T})

State I1: E' → E . , E → E . + T

Continue Process to Yield

- The State Machine also Represents Viable Prefixes
- Possible Combinations that appear on Parsing Stack

Viable Prefixes and Valid Items

- Consider a Derivation S' ⇒* α A w ⇒ α β1 β2 w (rightmost)
- Let α β1 be a Viable Prefix
- A → β1 . β2 is a Valid Item if the above derivation exists
- When α β1 is on the Parsing Stack, Two Cases
- If β2 ≠ ε Then we Don't have Handle on Stack
- If β2 = ε Then Perhaps A → β1 is the Reduction
- However, Reduction Choice may not be Limited to a Single Production Rule
- There may be two or more Valid Items for the Same Viable Prefix!
- Shift/Reduce or Reduce/Reduce Conflicts Possible!

How Does this Relate to State Machine?

- Consider the Viable Prefix E + T *
- Each State in Machine Represents a Set of One or More Items
- Specifically, for E + T *, we end up in State I7 if you Follow the Transitions of the State Machine

Consider the State

- Item Set is below, with three possible derivations
- Which do you Choose? Why?

T → T * . F
F → . ( E )
F → . id

E' ⇒ E ⇒ E + T ⇒ E + T * F
E' ⇒ E ⇒ E + T ⇒ E + T * F ⇒ E + T * ( E )
E' ⇒ E ⇒ E + T ⇒ E + T * F ⇒ E + T * id

End Result of Process?

- Machine that Contains
- All Item Set States
- Transitions Between States on
- Terminals
- Non-Terminals
- What do we need this for?
- To Construct the Parsing Table!

What's Next Step?

- Constructing SLR Parsing table
- action[state, symbol]
- goto[state, symbol]
- Easy Part of this Process
- Determining shift actions
- Examine Machine for all terminal transitions
- These are shifts from one state to next
- Push both the terminal and state onto parsing stack
- More Difficult Part of this Process
- Reductions are Items with . at End of Item
- Two Questions
- What is the input that Determines Correct Reduction?
- What is the state to push onto Stack?

Recall First and Follow Calculations

- Recall the Grammar
- First (E') = First (E) = First (T) = First (F) = { (, id }
- Follow (E') = { $ }
- Follow (E) = Follow (E'), First ( + T ), First ( ) ) = { $, +, ) }
- Follow (T) = Follow (E), First ( * F ) = { $, +, ), * }
- Follow (F) = Follow (T) = { $, +, ), * }

1 E' → E
2 E → E + T
3   | T
4 T → T * F
5   | F
6 F → ( E )
7   | id
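These sets can be recomputed by fixed-point iteration. The Python sketch below is a simplified version that assumes the grammar has no ε-productions (true for this expression grammar, not in general); `GRAMMAR`, `first_sets`, and `follow_sets` are hypothetical names.

```python
GRAMMAR = [  # (head, body); E' is the augmented start symbol
    ("E'", ["E"]), ("E", ["E", "+", "T"]), ("E", ["T"]),
    ("T", ["T", "*", "F"]), ("T", ["F"]),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def first_sets(grammar):
    """FIRST of every non-terminal, for a grammar without epsilon-productions."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            x = body[0]
            new = first[x] if x in NONTERMINALS else {x}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first

def follow_sets(grammar, start="E'"):
    """FOLLOW of every non-terminal; $ marks end of input."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            for i, x in enumerate(body):
                if x not in NONTERMINALS:
                    continue
                if i + 1 < len(body):        # something follows x in the body
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                        # x ends the body: inherit FOLLOW(head)
                    new = follow[head]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow

for nt, symbols in sorted(follow_sets(GRAMMAR).items()):
    print(f"Follow({nt}) = {sorted(symbols)}")
```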

Return to Item Sets

- Suppose an Item Set Contains the Item A → α .
- When we Reach this Item it is Time to Reduce and Replace α on the Stack with A
- However, What is the Input under which this Reduction is Allowed to Occur?
- Want to Replace α with A
- Reading some current input x
- Only Do the Reduction if x in Follow (A)
- Consider Two Reductions in the Same Item Set, A → α . and B → α . , and current input x
- If x in Follow (A), reduce using A → α
- If x in Follow (B), reduce using B → α
- If x in both, Reduce/Reduce Error!
- We'll See Two Examples Shortly

Back to Item Sets/State Machine

- RED underlines are all shifts with associated gotos
- BLUE circles are all gotos for non-terminals
- GREEN underlines are all reductions
- Reductions are based on Follow

Action and goto tables

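- Action contains shifts, reductions, and accept (green)
- All other entries are error entries
- Goto contains the next state to shift onto the stack

Rules as numbered in the table: 1 E → E + T, 2 E → T, 3 T → T * F, 4 T → F, 5 F → ( E ), 6 F → id

Next state for each transition:

State   id   +    *    (    )    $    E    T    F
0       5              4              1    2    3
1            6
2                 7
3
4       5              4              8    2    3
5
6       5              4                   9    3
7       5              4                        10
8            6              11
9                 7
10
11

Kind of entry (S = shift/transition, Rn = reduce by rule n, acc = accept):

State   id   +    *    (    )    $    E    T    F
0       S              S              S    S    S
1            S                   acc
2            R2   S         R2   R2
3            R4   R4        R4   R4
4       S              S              S    S    S
5            R6   R6        R6   R6
6       S              S                   S    S
7       S              S                        S
8            S              S
9            R1   S         R1   R1
10           R3   R3        R3   R3
11           R5   R5        R5   R5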

Formal Algorithms

- To Calculate the Parsing Table, we Require Three Algorithms
- State closure
- To compute the set of productions in a given state
- Transition function
- To compute the states reachable from a given state
- Items
- To compute the set of states in the PDA
- Algorithms from Prof. Michel

State Closure Algorithm

function closure(setItem I) : setItem
  setItem J0 = I
  i = 0
  repeat
    Ji+1 = Ji
    for each A → α.Bβ in Ji and each B → γ in P s.t. B → .γ ∉ Ji
      Ji+1 = Ji+1 ∪ { B → .γ }
    i = i + 1
  until Ji = Ji-1
  return Ji

GOTO Function

function GOTO(setItem s, symbol X) : setItem
  setItem J = ∅
  for each c in s
    if c of the form A → α.Xβ
      J = J ∪ { A → αX.β }
  return closure(J)

All State Functions (set-of-items)

function items(Grammar G) : setState
  setState C0 = { closure({S' → .S}) }
  i = 0
  repeat
    Ci+1 = Ci
    for each S in Ci and each symbol X in G
      Z = goto(S, X)
      if Z ≠ ∅ AND Z ∉ Ci then
        Ci+1 = Ci+1 ∪ { Z }
    i = i + 1
  until Ci = Ci-1
  return Ci
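The three functions translate almost line for line into Python. The sketch below is an illustration rather than Prof. Michel's code: it uses a worklist instead of the repeat/until iteration, represents an item as a (production index, dot position) pair, and hard-codes the augmented expression grammar; all of these choices are assumptions.

```python
GRAMMAR = [  # production 0 is the augmenting rule E' -> E
    ("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
SYMBOLS = ["E", "T", "F", "id", "+", "*", "(", ")"]

def closure(items):
    """Add B -> .gamma for every non-terminal B that follows a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot == len(body):            # dot at the end: nothing to expand
            continue
        for i, (head, _) in enumerate(GRAMMAR):
            if head == body[dot] and (i, 0) not in result:
                result.add((i, 0))
                work.append((i, 0))
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else frozenset()

def items():
    """All item sets reachable from closure({E' -> .E})."""
    start = closure({(0, 0)})
    states, work = [start], [start]
    while work:
        state = work.pop()
        for symbol in SYMBOLS:
            target = goto(state, symbol)
            if target and target not in states:
                states.append(target)
                work.append(target)
    return states

print(len(items()), "LR(0) states")
```

Because of the worklist, the states are discovered in a different order than I0 ... I11 on the slides, so only the set of states matches, not their numbering.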

Using Ambiguous Grammars

- Ambiguous Grammars will Cause Multiple Entries for a given state/terminal in Parsing Table
- Results in Two Types of Conflicts
- Shift/Reduce Conflicts
- Reduce/Reduce Conflicts
- Compiler Writing Tools (Yacc, Bison, etc.) Automatically Resolve these by
- For Shift/Reduce: chooses Shift
- For Reduce/Reduce: Reduce by earlier rule
- Consider Two Examples
- Dangling Else
- Simplified Expression Grammar
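The two default policies amount to a small selection function over the candidate actions of one table entry. The Python sketch below uses a hypothetical tuple encoding, ("shift", state) or ("reduce", rule number), and is not the code of any actual tool.

```python
def resolve(actions):
    """Pick one action from the candidates competing for a single table entry."""
    shifts = [a for a in actions if a[0] == "shift"]
    if shifts:
        # Shift/reduce conflict: prefer the shift.
        return shifts[0]
    # Reduce/reduce conflict: prefer the rule listed earlier in the grammar.
    return min(actions, key=lambda a: a[1])

print(resolve([("reduce", 2), ("shift", 5)]))
print(resolve([("reduce", 5), ("reduce", 3)]))
```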

Dangling Else Ambiguity

- Recall the Grammar: stmt → if expr then stmt else stmt | if expr then stmt | other
- Rewrite the Grammar as: s' → s, s → i s e s | i s | a
- Essentially collapsing if expr then into i, else into e, and stmt into s, with a representing all other statements
- Now Compute LR(0) Items and SLR Parsing Table

The Item Sets for the Grammar

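Follow(s') = { $ }, Follow(s) = Follow(s'), e = { $, e }

I0: s' → . s, s → . i s e s, s → . i s, s → . a
I1: s' → s .
I2: s → i . s e s, s → i . s, s → . i s e s, s → . i s, s → . a
I3: s → a .
I4: s → i s . e s, s → i s .
I5: s → i s e . s, s → . i s e s, s → . i s, s → . a
I6: s → i s e s .

Transitions:
I0 on s → I1, on i → I2, on a → I3
I2 on s → I4, on i → I2, on a → I3
I4 on e → I5
I5 on s → I6, on i → I2, on a → I3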

The Parsing table

Follow(s') = { $ }, Follow(s) = Follow(s'), e = { $, e }

Rules: s → i s e s, s → i s, s → a

- Notice s/r conflict for action[4, e]: if <expr> then <stmt> else <stmt>
- If shift on else, what is the result w.r.t. language?
- If reduce on else, what is the result w.r.t. language?

Solution to Dangling Else

- Pick Shift over Reduce: action[4, e] = s5
- Consider input iiaea, which is equivalent to if <expr> then if <expr> then <stmt> else <stmt>
- Parser as follows w.r.t. stack/input
- Using this approach, we eliminate the need for a more complex unambiguous grammar with more rules

Stack          Input   Action
$ i i s        e a $   shift e
$ i i s e      a $     shift a
$ i i s e a    $       reduce using s → a
$ i i s e s    $       reduce using s → i s e s
$ i s          $       reduce using s → i s
$ s            $       accept
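The whole parse of iiaea can be replayed in Python. The sketch assumes the SLR table follows from the item sets I0 to I6 of this grammar, with the conflict entry for state 4 on e already set to shift 5; the rule numbering (1 s → i s e s, 2 s → i s, 3 s → a) and the names `RULES`, `ACTION`, `GOTO`, and `parse` are assumptions made for the illustration.

```python
# Rule number -> (left-hand side, length of right-hand side)
RULES = {1: ("s", 4), 2: ("s", 2), 3: ("s", 1)}

ACTION = {
    0: {"i": "s2", "a": "s3"},
    1: {"$": "acc"},
    2: {"i": "s2", "a": "s3"},
    3: {"e": "r3", "$": "r3"},
    4: {"e": "s5", "$": "r2"},   # on e both s5 and r2 apply; shift is chosen
    5: {"i": "s2", "a": "s3"},
    6: {"e": "r1", "$": "r1"},
}
GOTO = {0: {"s": 1}, 2: {"s": 4}, 5: {"s": 6}}

def parse(text):
    """Return the list of actions taken while parsing a string over i, e, a."""
    stack, tokens, pos, trace = [0], list(text) + ["$"], 0, []
    while True:
        act = ACTION[stack[-1]].get(tokens[pos])
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            trace.append("accept")
            return trace
        if act[0] == "s":                      # shift
            trace.append("shift " + tokens[pos])
            stack.append(int(act[1:]))
            pos += 1
        else:                                  # reduce by rule number
            rule = int(act[1:])
            lhs, length = RULES[rule]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            trace.append(f"reduce {rule}")

print(parse("iiaea"))
```

Rule 1 is reduced before rule 2, so the else ends up attached to the inner if, which is the usual reading of the dangling else.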

Example 2 Simplified Expression Grammar

- Consider the Grammar
- E → E + E | E * E | ( E ) | id
- What's the Problem with this Grammar?
- Why would this Grammar be Preferable?
- Employ Techniques Similar to Previous Example to Remove Multiple Table Entries
- Result is to Achieve both Associative and Precedence Behavior for + and *
- Change Assoc/Precedence by Changing Table
- No more Extra Work ⇒ Improve Performance

First, Calculate Item Sets

Follow(E') = { $ }, Follow(E) = Follow(E'), +, *, ) = { $, +, *, ) }

I0: E' → . E, E → . E + E, E → . E * E, E → . ( E ), E → . id
I1: E' → E . , E → E . + E, E → E . * E
I2: E → ( . E ), E → . E + E, E → . E * E, E → . ( E ), E → . id
I3: E → id .
I4: E → E + . E, E → . E + E, E → . E * E, E → . ( E ), E → . id
I5: E → E * . E, E → . E + E, E → . E * E, E → . ( E ), E → . id
I6: E → ( E . ), E → E . + E, E → E . * E
I7: E → E + E . , E → E . + E, E → E . * E
I8: E → E * E . , E → E . + E, E → E . * E
I9: E → ( E ) .

Consider States I7 and I8

- State I7
- E → E + E . gives action[7, +] = action[7, *] = action[7, )] = action[7, $] = reduce by E → E + E
- E → E . + E gives action[7, +] = shift to state 4
- E → E . * E gives action[7, *] = shift to state 5
- State I8
- action[8, +] = reduce by E → E * E or shift to state 4
- action[8, *] = reduce by E → E * E or shift to state 5
- How is Each Conflict Resolved?

Parsing Table

Rules: 1 E' → E, 2 E → E + E, 3 E → E * E, 4 E → ( E ), 5 E → id

- State 7 on +: Reduce, since + is left assoc
- State 7 on *: Shift * onto stack since it has higher precedence
- State 8: Reduce using E → E * E regardless of + or *
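The three decisions follow from one comparison between the operator already on the stack and the lookahead operator. The Python sketch below encodes that comparison; the precedence numbers, the dictionaries, and the name `shift_or_reduce` are assumptions chosen for the illustration.

```python
PRECEDENCE = {"+": 1, "*": 2}              # higher number binds tighter
ASSOCIATIVITY = {"+": "left", "*": "left"}

def shift_or_reduce(stack_op, lookahead):
    """Resolve the conflict when E <stack_op> E is on the stack."""
    if PRECEDENCE[lookahead] > PRECEDENCE[stack_op]:
        return "shift"      # lookahead binds tighter: keep building the right operand
    if PRECEDENCE[lookahead] < PRECEDENCE[stack_op]:
        return "reduce"     # operator on the stack binds tighter
    # Equal precedence: left associativity reduces, right associativity shifts.
    return "reduce" if ASSOCIATIVITY[stack_op] == "left" else "shift"

for stack_op in "+*":
    for lookahead in "+*":
        print(f"E {stack_op} E on stack, next {lookahead}:",
              shift_or_reduce(stack_op, lookahead))
```

Changing an entry in either dictionary changes the resulting table without touching the grammar.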

Canonical Parser Table Construction

- Not all Parser Tables are Created Equally!
- Differentiate between SLR/LR(0), LR(1), and LALR(1) (Yacc/Bison)
- Key Issue: Utilization of Lookaheads
- SLR: Current Input
- LR(1): Current Input plus Next Token
- LR(k): Current Input plus Next k Tokens
- Consider id id id

LR(1): id determines if shift or reduce; 2nd token determines rule if conflict. The 2nd token can break the tie on the fly (disambiguation): sometimes shift, sometimes reduce, depending on that 2nd token.

SLR/LR(0): Current Input

Recall the Prior Grammar

- Item set I0 as given below
- For LR(1) items, we must consider basis on which the rule causes a shift on a lookahead terminal
- When we put E' → . E into LR(1) set, we must also consider the first terminal that appears after E
- This is the lookahead

LR(0): E' → . E, E → . E + T, E → . T, T → . T * F, T → . F, F → . ( E ), F → . id

Step 1 LR(1): E' → . E, $   E → . E + T, $     E → . T, $
Step 2 LR(1): E' → . E, $   E → . E + T, $/+   E → . T, $
Step 3 LR(1): E' → . E, $   E → . E + T, $/+   E → . T, $/+

- Step 2: What appears after E in 2nd Item?
- Step 3: If it appears after E, what else does it appear after?

Another Way to View Process

- Closure({E' → . E, $}) begins with placing E' → . E, $ into the item set
- Since E → E + T, we place E → . E + T, $ into item set, carrying along lookahead $ from E' → . E, $
- Now, for E → . E + T, $ what can E on right hand side be replaced with? E → E + T again!
- If we do this replacement, we need to ask what is the lookahead that follows E on r.h.s. in E → E + T ?
- We calculate First (+ T), the remainder of the rule
- This is +, so we add in this additional lookahead

E' → . E, $   E → . E + T, $   E → . E + T, +

We abbreviate this as

E' → . E, $   E → . E + T, $/+

Continuing

- Since E → T, we add E → . T, $/+ into the Set
- Now, what does T go to? T → T * F and T → F
- So we add T → . T * F, $/+ and T → . F, $/+ into Set
- What can T go to? T → T * F
- What is the First token following T? First (* F) = *
- So, add in * to get T → . T * F, $/+/*
- Since T → F, we also add * to yield T → . F, $/+/*
- Are we done?

Continuing

- Since T → . F, we now consider the two F rules F → ( E ) and F → id
- We add in the items F → . ( E ), $/+/* and F → . id, $/+/* bringing along the lookaheads from T → . F, $/+/*
- The lookaheads in this case are First (what follows F concatenated with $/+/*)
- This is $/+/*!
- We arrive at item set I0

LR(1) I0:
E' → . E, $
E → . E + T, $/+
E → . T, $/+
T → . T * F, $/+/*
T → . F, $/+/*
F → . ( E ), $/+/*
F → . id, $/+/*
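The same I0 falls out of a short LR(1) closure routine. The Python sketch below assumes an item is a (production index, dot position, lookahead) triple and writes the FIRST sets out by hand, which is only safe because this grammar has no ε-productions; `first_of`, `closure_lr1`, and `grouped` are hypothetical names.

```python
GRAMMAR = [  # production 0 is the augmenting rule E' -> E
    ("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
# FIRST of every non-terminal in this grammar is { (, id }.
FIRST = {nt: {"(", "id"} for nt in NONTERMINALS}

def first_of(symbols, lookahead):
    """FIRST of the string `symbols` followed by the terminal `lookahead`."""
    if not symbols:
        return {lookahead}
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}

def closure_lr1(items):
    """For [A -> alpha . B beta, a], add [B -> . gamma, b] for b in FIRST(beta a)."""
    result, work = set(items), list(items)
    while work:
        prod, dot, la = work.pop()
        body = GRAMMAR[prod][1]
        if dot == len(body) or body[dot] not in NONTERMINALS:
            continue
        for b in first_of(body[dot + 1:], la):
            for i, (head, _) in enumerate(GRAMMAR):
                item = (i, 0, b)
                if head == body[dot] and item not in result:
                    result.add(item)
                    work.append(item)
    return result

def grouped(items):
    """Collect the lookaheads of items sharing a core, like the $/+ shorthand."""
    table = {}
    for prod, dot, la in items:
        table.setdefault((prod, dot), set()).add(la)
    return table

I0 = grouped(closure_lr1({(0, 0, "$")}))
for (prod, dot), lookaheads in sorted(I0.items()):
    head, body = GRAMMAR[prod]
    print(head, "->", ". " + " ".join(body), ",", "/".join(sorted(lookaheads)))
```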

Another Example LR(0) Sets

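Grammar: S' → S, S → C C, C → c C | d

Follow(S') = Follow(S) = { $ }, Follow(C) = { c, d, $ }

I0: S' → . S, S → . C C, C → . c C, C → . d
I1: S' → S .
I2: S → C . C, C → . c C, C → . d
I3: C → c . C, C → . c C, C → . d
I4: C → d .
I5: S → C C .
I6: C → c C .

Transitions:
I0 on S → I1, on C → I2, on c → I3, on d → I4
I2 on C → I5, on c → I3, on d → I4
I3 on C → I6, on c → I3, on d → I4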

Now Consider LR(1) Sets

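Follow(S') = Follow(S) = { $ }, Follow(C) = { c, d, $ }

I0: S' → . S, $   S → . C C, $   C → . c C, c/d   C → . d, c/d
I1: S' → S . , $
I2: S → C . C, $   C → . c C, $   C → . d, $
I3: C → c . C, c/d   C → . c C, c/d   C → . d, c/d
I4: C → d . , c/d
I5: S → C C . , $
I6: C → c . C, $   C → . c C, $   C → . d, $
I7: C → d . , $
I8: C → c C . , c/d
I9: C → c C . , $

Transitions:
I0 on S → I1, on C → I2, on c → I3, on d → I4
I2 on C → I5, on c → I6, on d → I7
I3 on C → I8, on c → I3, on d → I4
I6 on C → I9, on c → I6, on d → I7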

Parsing Table

- Easy to Construct from the State Machine
- Shifts on terminals (arcs)
- Reductions based on lookaheads
- Gotos as with SLR case

Rules: 1 S → C C, 2 C → c C, 3 C → d

        action            goto
State   c     d     $     S    C
0       s3    s4          1    2
1                   acc
2       s6    s7               5
3       s3    s4               8
4       r3    r3
5                   r1
6       s6    s7               9
7                   r3
8       r2    r2
9                   r2

What's Real Problem Here?

- Grammar we used had 3 Production Rules
- Result was 10 LR(1) states!
- For Expression Grammar (slide 58), LR(1) would have 22 states!
- Lookahead LR Parsing (LALR), on which Compiler Tools (Yacc, Bison) are Based, Achieves Similar Results with Fewer States
- Objective is to Create LR(1) Sets
- Identify Sets with Similar Cores (Items are the same but lookaheads may be different)
- Merge Sets with Similar Cores
- Factor of 10 in Reduction of States

What are the Similar Cores?

Among the LR(1) states I0 ... I9 of the previous example, the following pairs share a core:

- I3 and I6: C → c . C, C → . c C, C → . d
- I4 and I7: C → d .
- I8 and I9: C → c C .

Resulting State Machine

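I0: S' → . S, $   S → . C C, $   C → . c C, c/d   C → . d, c/d
I1: S' → S . , $
I2: S → C . C, $   C → . c C, $   C → . d, $
I5: S → C C . , $
I36: C → c . C, c/d/$   C → . c C, c/d/$   C → . d, c/d/$
I47: C → d . , c/d/$
I89: C → c C . , c/d/$

Transitions:
I0 on S → I1, on C → I2, on c → I36, on d → I47
I2 on C → I5, on c → I36, on d → I47
I36 on C → I89, on c → I36, on d → I47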

With Simplified Parsing Table

        action             goto
State   c      d      $    S    C
0       s36    s47         1    2
1                     acc
2       s36    s47              5
36      s36    s47              89
47      r3     r3     r3
5                     r1
89      r2     r2     r2
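The merge step itself is a grouping by core. The Python sketch below writes the ten LR(1) states of S' → S, S → C C, C → c C | d out by hand as sets of (LR(0) item, lookahead) pairs and unions the ones whose cores coincide; it does not remap the transitions, and `LR1_STATES` and `merge_by_core` are hypothetical names.

```python
from collections import defaultdict

LR1_STATES = {
    0: {("S'->.S", "$"), ("S->.CC", "$"), ("C->.cC", "c"), ("C->.cC", "d"),
        ("C->.d", "c"), ("C->.d", "d")},
    1: {("S'->S.", "$")},
    2: {("S->C.C", "$"), ("C->.cC", "$"), ("C->.d", "$")},
    3: {("C->c.C", "c"), ("C->c.C", "d"), ("C->.cC", "c"), ("C->.cC", "d"),
        ("C->.d", "c"), ("C->.d", "d")},
    4: {("C->d.", "c"), ("C->d.", "d")},
    5: {("S->CC.", "$")},
    6: {("C->c.C", "$"), ("C->.cC", "$"), ("C->.d", "$")},
    7: {("C->d.", "$")},
    8: {("C->cC.", "c"), ("C->cC.", "d")},
    9: {("C->cC.", "$")},
}

def merge_by_core(states):
    """Union the LR(1) states whose cores (items without lookaheads) coincide."""
    groups = defaultdict(list)
    for number, items in states.items():
        core = frozenset(item for item, _ in items)
        groups[core].append(number)
    merged = {}
    for numbers in groups.values():
        name = "".join(str(n) for n in sorted(numbers))   # e.g. 3 and 6 -> "36"
        merged[name] = set().union(*(states[n] for n in numbers))
    return merged

merged = merge_by_core(LR1_STATES)
print(sorted(merged))
```

For this grammar the merge introduces no new conflicts; in general, merging cores can create reduce/reduce conflicts that the canonical LR(1) table does not have.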

Parser Generators

- The entire process we describe can be automated
- Computation of the machine states
- Computation of the lookaheads
- Computation of the action and goto tables
- Optimization of the LALR tables.
- Therefore...
- Tools exist to do this for you!

Parser Generators II

- In the C/C++ world
- Most famous parser generator
- YACC LALR(1)
- Most used parser generator
- BISON LALR(1)
- Table-driven leftmost
- PCCTS LL(k)
- In the Java world
- Several alternatives
- CUP (a BISON/YACC lookalike) LALR(1)
- JACK LALR(1)

Big Picture

The Road Ahead

- What are we missing ?
- A parse tree!
- How can we get one ?
- By augmenting the grammar!
- With actions: pieces of Java code
- Purpose of actions
- Manufacture the tree as a side-effect of parsing.
- Reading
- Syntax directed translation via
- Attribute Grammars
- Yacc