1
ML-YACC
  • David Walker
  • COS 320

2
Outline
  • Last Week
  • Introduction to Lexing, CFGs, and Parsing
  • Today
  • More parsing
  • automatic parser generation via ML-Yacc
  • Reading: Chapter 3 of Appel

3
The Front End
  • Lexical Analysis: create a sequence of tokens from
    characters
  • Syntax Analysis: create an abstract syntax tree from
    a sequence of tokens
  • Type Checking: check the program for well-formedness
    constraints

stream of characters -> Lexer -> stream of tokens -> Parser -> abstract syntax -> Type Checker
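
The first stage of this pipeline can be sketched in a few lines. This is a minimal illustration in Python, not the lexer used in the course: the token names mirror the terminals that appear on later slides, and the helper name lex is an assumption made for this example.

```python
import re

# Hypothetical token table; names chosen to match the terminals on later slides.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("MUL", r"\*"),
    ("LPAR", r"\("),
    ("RPAR", r"\)"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def lex(text):
    """Turn a stream of characters into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":          # whitespace produces no token
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    tokens.append(("EOF", ""))             # end-of-parse marker for the parser
    return tokens
```

The parser then consumes this token list and never looks at raw characters again.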
4
Parser Implementation
  • Implementation Options
  • Write a Parser from scratch
  • not as boring as writing a lexer, but not exactly
    a weekend in the Bahamas
  • Use a Parser Generator
  • Very general and robust. Sometimes not quite as
    efficient as hand-written parsers. Nevertheless,
    good for lazy compiler writers.

Parser Specification -> parser generator -> Parser
stream of tokens -> Parser -> abstract syntax
7
ML-Yacc specification
  • three parts

User Declarations: declare values available in
the rule actions
%%
ML-Yacc Definitions: declare terminals and
non-terminals; special declarations to
resolve conflicts
%%
Rules: parser specified by CFG rules and
associated semantic actions that generate
abstract syntax
8
ML-Yacc declarations (preliminaries)
  • specify type of positions
  • %pos int * int
  • specify terminal and nonterminal symbols
  • %term IF | THEN | ELSE | PLUS | MINUS ...
  • %nonterm prog | exp | op
  • specify end-of-parse token
  • %eop EOF
  • specify start symbol (by default, the nonterminal on
    the LHS of the first rule)
  • %start prog

9
Simple ML-Yacc Example
grammar symbols:
  %term NUM | PLUS | MUL | LPAR | RPAR
  %nonterm exp | fact | base
  %pos int
  %start exp
  %eop EOF
  %%
grammar rules, each followed by its semantic action (currently do nothing):
  exp  : fact                ()
       | fact PLUS exp       ()
  fact : base                ()
       | base MUL fact       ()
  base : NUM                 ()
       | LPAR exp RPAR       ()
10
attribute-grammars
  • ML-Yacc uses an attribute-grammar scheme
  • each nonterminal may have a semantic
    value associated with it
  • when the parser reduces the parsing stack using
    rule (X ::= s), a semantic action that uses the
    semantic values from s will be executed
  • when parsing is completed successfully, the
    parser returns the value associated with the
    start symbol
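
What a reduction does to the semantic values can be sketched as follows. This is an illustrative Python model, not ML-Yacc's generated code: the name reduce_by_rule and the list-based value stack are assumptions made for the example.

```python
def reduce_by_rule(value_stack, rhs_len, action):
    """Pop the semantic values of the rule's right-hand side, run the
    semantic action on them, and push the result for the left-hand side."""
    split = len(value_stack) - rhs_len
    args = value_stack[split:]
    del value_stack[split:]
    value_stack.append(action(*args))
    return value_stack

# Semantic values for a stack ending in  exp PLUS exp  (PLUS carries no value).
stack = [10, 2, None, 5]
# Reduce by  exp : exp PLUS exp  with the action (exp1 + exp2).
reduce_by_rule(stack, 3, lambda exp1, _plus, exp2: exp1 + exp2)
```

After the call the three right-hand-side values have been replaced by the single value 7, and the untouched value 10 below them is still on the stack.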

11
attribute-grammars
  • semantic actions typically build the abstract
    syntax for the internal language
  • to use semantic values during parsing, we must
    declare symbol types
  • %term NUM of int | PLUS | MUL | ...
  • %nonterm exp of int | fact of int | base of int
  • type of semantic action must match type declared
    for LHS nonterminal in rule

12
ML-Yacc with Semantic Actions
grammar symbols with type declarations:
  %term NUM of int | PLUS | MUL | LPAR | RPAR
  %nonterm exp of int | fact of int | base of int
  %pos int
  %start exp
  %eop EOF
  %%
grammar rules with semantic actions (computing integer result via semantic actions):
  exp  : fact                (fact)
       | fact PLUS exp       (fact + exp)
  fact : base                (base)
       | base MUL base       (base1 * base2)
  base : NUM                 (NUM)
       | LPAR exp RPAR       (exp)
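
The effect of these actions can be mimicked by hand. The sketch below is a Python recursive-descent evaluator for the same three-level grammar; it is not the LR parser ML-Yacc would generate, and the token encoding (Python ints for NUM, one-character strings for the other terminals) is an assumption made for the example.

```python
def parse_exp(toks, i):
    # exp : fact (fact) | fact PLUS exp (fact + exp)
    left, i = parse_fact(toks, i)
    if toks[i] == "+":
        right, i = parse_exp(toks, i + 1)
        return left + right, i
    return left, i

def parse_fact(toks, i):
    # fact : base (base) | base MUL base (base1 * base2)
    left, i = parse_base(toks, i)
    if toks[i] == "*":
        right, i = parse_base(toks, i + 1)
        return left * right, i
    return left, i

def parse_base(toks, i):
    # base : NUM (NUM) | LPAR exp RPAR (exp)
    if toks[i] == "(":
        value, i = parse_exp(toks, i + 1)
        if toks[i] != ")":
            raise SyntaxError("expected )")
        return value, i + 1
    if isinstance(toks[i], int):
        return toks[i], i + 1
    raise SyntaxError(f"unexpected token {toks[i]!r}")

def evaluate(toks):
    """Return the semantic value of the start symbol exp."""
    toks = list(toks) + ["EOF"]        # end-of-parse marker, like %eop EOF
    value, i = parse_exp(toks, 0)
    if toks[i] != "EOF":
        raise SyntaxError(f"unexpected token {toks[i]!r}")
    return value
```

For example, evaluate([1, "+", 2, "*", 3]) yields 7, because the grammar forces the multiplication to be grouped below the addition.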
13
ML-Yacc with Semantic Actions
datatype exp = Int of int | Add of exp * exp
             | Mul of exp * exp
...
exp  : fact                (fact)
     | fact PLUS exp       (Add (fact, exp))
fact : base                (base)
     | base MUL fact       (Mul (base, fact))
base : NUM                 (Int NUM)
     | LPAR exp RPAR       (exp)
computing abstract syntax via semantic actions
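
Once the actions build a tree instead of a number, later phases can walk it. A rough Python analogue of the exp datatype, with a small evaluator over it, might look like this; the class and field names are chosen for the example and are not part of the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def eval_exp(e):
    """Walk the abstract syntax tree and compute its integer value."""
    if isinstance(e, Int):
        return e.value
    if isinstance(e, Add):
        return eval_exp(e.left) + eval_exp(e.right)
    if isinstance(e, Mul):
        return eval_exp(e.left) * eval_exp(e.right)
    raise TypeError(f"not an expression: {e!r}")

# The tree the actions would build for  1 + 2 * 3.
tree = Add(Int(1), Mul(Int(2), Int(3)))
```

Evaluating tree gives 7; the same tree could equally be handed to a type checker or code generator, which is the point of building abstract syntax rather than computing a result inside the parser.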
14
A simpler grammar
datatype exp = Int of int | Add of exp * exp
             | Mul of exp * exp
...
exp : NUM                  (Int NUM)
    | exp PLUS exp         (Add (exp1, exp2))
    | exp MUL exp          (Mul (exp1, exp2))
    | LPAR exp RPAR        (exp)
why don't we just use this simpler grammar?
15
A simpler grammar
datatype exp = Int of int | Add of exp * exp
             | Mul of exp * exp
...
exp : NUM                  (Int NUM)
    | exp PLUS exp         (Add (exp1, exp2))
    | exp MUL exp          (Mul (exp1, exp2))
    | LPAR exp RPAR        (exp)
this grammar is ambiguous!
[Figure: the two parse trees for NUM + NUM * NUM, one grouped as NUM + (NUM * NUM) and one as (NUM + NUM) * NUM]
16
a simpler grammar
datatype exp = Int of int | Add of exp * exp
             | Mul of exp * exp
...
exp : NUM                  (Int NUM)
    | exp PLUS exp         (Add (exp1, exp2))
    | exp MUL exp          (Mul (exp1, exp2))
    | LPAR exp RPAR        (exp)
But it is so clean that it would be nice to use.
Moreover, we know which parse tree we want.
We just need a mechanism to specify it!
[Figure: the two parse trees for NUM + NUM * NUM, one grouped as NUM + (NUM * NUM) and one as (NUM + NUM) * NUM]
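
The ambiguity can be checked mechanically by enumerating every parse tree. The Python sketch below does this for the NUM, PLUS and MUL rules only (the parenthesis rule is left out to keep it short); the function name parses and the bracketed-string tree format are assumptions made for the example.

```python
from functools import lru_cache

def parses(tokens):
    """Return every parse tree of `tokens` under
    exp ::= NUM | exp + exp | exp * exp, as fully bracketed strings."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def go(lo, hi):
        # All trees deriving tokens[lo:hi].
        if hi - lo == 1:
            return ("NUM",) if tokens[lo] == "NUM" else ()
        trees = []
        for mid in range(lo + 1, hi - 1):      # candidate top-level operator
            if tokens[mid] in ("+", "*"):
                for left in go(lo, mid):
                    for right in go(mid + 1, hi):
                        trees.append(f"({left} {tokens[mid]} {right})")
        return tuple(trees)

    return list(go(0, len(tokens)))
```

Calling parses(["NUM", "+", "NUM", "*", "NUM"]) returns two trees, "(NUM + (NUM * NUM))" and "((NUM + NUM) * NUM)", so the grammar alone does not determine the grouping.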
17
Recall how LR parsing works
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E + E
yet to read: * NUM
We have a shift-reduce conflict. What should we
do to get the right parse?
18
Recall how LR parsing works
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E + E
yet to read: * NUM
We have a shift-reduce conflict. What should we
do to get the right parse? SHIFT
19
Recall how LR parsing works
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E + E * NUM
yet to read: (nothing)
SHIFT SHIFT
20
Recall how LR parsing works
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E + E * E
yet to read: (nothing)
REDUCE
21
Recall how LR parsing works
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E + E
yet to read: (nothing)
REDUCE
22
Recall how LR parsing works
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E
yet to read: (nothing)
REDUCE
23
The alternative parse
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
[Figure: partial parse tree showing the elements parsed so far]
State of parse so far: E + E
yet to read: * NUM
We have a shift-reduce conflict. Suppose we
REDUCE next
24
The alternative parse
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
[Figure: partial parse tree showing the elements parsed so far, with NUM + NUM grouped under one E]
State of parse so far: E
yet to read: * NUM
REDUCE
25
The alternative parse
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
[Figure: partial parse tree showing the elements parsed so far]
State of parse so far: E * E
yet to read: (nothing)
Now SHIFT SHIFT REDUCE
26
The alternative parse
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
resulting parse tree: (NUM + NUM) * NUM [figure marks the elements parsed so far]
State of parse so far: E
yet to read: (nothing)
REDUCE
27
Summary
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
Input from lexer: NUM + NUM * NUM
desired parse tree: NUM + (NUM * NUM) [figure marks the elements parsed so far]
State of parse so far: E + E
yet to read: * NUM
We have a shift-reduce conflict. We have E + E on the
stack, we see *. We want to shift. We ALWAYS want
to shift since * has higher precedence than +.
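
The two outcomes can be reproduced with a toy shift-reduce loop. This Python sketch only handles inputs of the form operand (op operand)* and assumes they are well formed; it is not an LR automaton, and the names parse and on_conflict are invented for the example.

```python
OPERATORS = {"+", "*"}

def parse(tokens, on_conflict):
    """Shift-reduce parse building a bracketed tree. Whenever the stack ends
    in  E op E  and more input remains, on_conflict(stack_op, next_token)
    decides between "shift" and "reduce"."""
    stack, rest = [], list(tokens)
    while True:
        ends_in_e_op_e = len(stack) >= 3 and stack[-2] in OPERATORS
        if ends_in_e_op_e and (not rest or on_conflict(stack[-2], rest[0]) == "reduce"):
            right = stack.pop()
            op = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {op} {right})")   # reduce E op E to E
        elif rest:
            stack.append(rest.pop(0))                # shift the next token
        else:
            return stack[0]

always_shift = parse([1, "+", 2, "*", 3], lambda stack_op, nxt: "shift")
always_reduce = parse([1, "+", 2, "*", 3], lambda stack_op, nxt: "reduce")
```

Shifting at the conflict gives "(1 + (2 * 3))", the desired tree; reducing gives "((1 + 2) * 3)", the alternative parse.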
28
Example 2
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | exp MINUS exp | LPAR exp RPAR
Input from lexer: NUM - NUM - NUM
[Figure: partial parse tree showing the elements parsed so far]
State of parse so far: E - E
yet to read: - NUM
We have a shift-reduce conflict. We have E - E on the
stack, we see -. What do we do?
29
Example 2
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | exp MINUS exp | LPAR exp RPAR
Input from lexer: NUM - NUM - NUM
[Figure: partial parse tree showing the elements parsed so far, with NUM - NUM grouped under one E]
State of parse so far: E
yet to read: - NUM
We have a shift-reduce conflict. We have E - E on the
stack, we see -. What do we do? REDUCE
30
Example 2
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | exp MINUS exp | LPAR exp RPAR
Input from lexer: NUM - NUM - NUM
[Figure: partial parse tree showing the elements parsed so far]
State of parse so far: E - E
yet to read: (nothing)
SHIFT SHIFT REDUCE
31
Example 2
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | exp MINUS exp | LPAR exp RPAR
Input from lexer: NUM - NUM - NUM
resulting parse tree: (NUM - NUM) - NUM [figure marks the elements parsed so far]
State of parse so far: E
yet to read: (nothing)
REDUCE
32
Example 2 Summary
grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | exp MINUS exp | LPAR exp RPAR
Input from lexer: NUM - NUM - NUM
resulting parse tree: (NUM - NUM) - NUM [figure marks the elements parsed so far]
State of parse so far: E
yet to read: (nothing)
We have a shift-reduce conflict. We have E - E on the
stack, we see -. What do we do? REDUCE. We
ALWAYS want to reduce since - is left-associative.
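
Why the choice matters for subtraction is easy to see with concrete numbers. The Python snippet below uses 10 - 4 - 3 as an assumed example input and compares the two groupings.

```python
from functools import reduce

operands = [10, 4, 3]          # stands for the input  10 - 4 - 3

# Always REDUCE at the conflict: group from the left, (10 - 4) - 3.
left_assoc = reduce(lambda acc, x: acc - x, operands)

# Always SHIFT at the conflict: group from the right, 10 - (4 - 3).
right_assoc = reduce(lambda acc, x: x - acc, reversed(operands))
```

Here left_assoc is 3, the conventional meaning of the expression, while right_assoc is 9.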
33
precedence and associativity
  • three solutions to dealing with operator
    precedence and associativity
  • 1) let Yacc complain.
  • its default choice is to shift when it encounters
    a shift-reduce error
  • programmer intentions unclear; harder to debug
    other parts of your grammar; generally inelegant
  • 2) rewrite the grammar to eliminate ambiguity
  • can be complicated and less clear
  • 3) use Yacc precedence directives
  • %left, %right, %nonassoc

34
precedence and associativity
  • given directives, ML-Yacc assigns precedence to
    each terminal and rule
  • precedence of terminal based on order in which
    associativity is specified (later directives
    give higher precedence)
  • precedence of rule is the precedence of the
    right-most terminal
  • e.g. precedence of rule (E ::= E + E) is prec(+)
  • a shift-reduce conflict is resolved as follows
  • prec(terminal) > prec(rule) ==> shift
  • prec(terminal) < prec(rule) ==> reduce
  • prec(terminal) = prec(rule) ==>
  • assoc(terminal) = left ==> reduce
  • assoc(terminal) = right ==> shift
  • assoc(terminal) = nonassoc ==> report as error

RHS of rule on stack: ........ E op E
input terminal T next, rest yet to read: T E ....................
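
The resolution rules above can be written as a small decision function. This is a Python sketch of the rules as stated on the slide, not ML-Yacc's implementation; the function name resolve and the list-of-pairs encoding of the directives are assumptions made for the example.

```python
def resolve(rule_terminal, next_terminal, directives):
    """Decide a shift-reduce conflict.

    rule_terminal: right-most terminal of the rule whose RHS is on the stack.
    next_terminal: the input terminal T that comes next.
    directives: list of (assoc, [terminals]) in the order written in the
                specification; later entries get higher precedence.
    """
    prec, assoc = {}, {}
    for level, (kind, terminals) in enumerate(directives):
        for t in terminals:
            prec[t], assoc[t] = level, kind
    if prec[next_terminal] > prec[rule_terminal]:
        return "shift"
    if prec[next_terminal] < prec[rule_terminal]:
        return "reduce"
    # Equal precedence: fall back on associativity of the input terminal.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[assoc[next_terminal]]

# Corresponds to:  %left PLUS MINUS  followed by  %left MUL DIV
DIRECTIVES = [("left", ["PLUS", "MINUS"]), ("left", ["MUL", "DIV"])]
```

With these directives, resolve("PLUS", "MUL", DIRECTIVES) returns "shift" and resolve("PLUS", "MINUS", DIRECTIVES) returns "reduce".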
35
precedence and associativity
datatype exp = Int of int | Add of exp * exp
             | Sub of exp * exp | Mul of exp * exp
             | Div of exp * exp

%left PLUS MINUS
%left MUL DIV
%%
exp : NUM                  (Int NUM)
    | exp PLUS exp         (Add (exp1, exp2))
    | exp MINUS exp        (Sub (exp1, exp2))
    | exp MUL exp          (Mul (exp1, exp2))
    | exp DIV exp          (Div (exp1, exp2))
    | LPAR exp RPAR        (exp)
36
precedence and associativity
precedence directives:
  %left PLUS MINUS
  %left MUL DIV
prec(MUL) > prec(PLUS)
RHS of rule on stack: ... E PLUS E
input terminal T next, rest yet to read: MUL E ....................
37
precedence and associativity
precedence directives:
  %left PLUS MINUS
  %left MUL DIV
prec(MUL) > prec(PLUS)
RHS of rule on stack: ... E PLUS E
input terminal T next, rest yet to read: MUL E ....................
SHIFT
38
precedence and associativity
precedence directives:
  %left PLUS MINUS
  %left MUL DIV
prec(PLUS) = prec(MINUS)
RHS of rule on stack: ... E PLUS E
input terminal T next, rest yet to read: MINUS E ....................
39
precedence and associativity
precedence directives:
  %left PLUS MINUS
  %left MUL DIV
prec(PLUS) = prec(MINUS)
RHS of rule on stack: ... E PLUS E
input terminal T next, rest yet to read: MINUS E ....................
REDUCE
40
one more example
datatype exp = Int of int | Add of exp * exp
             | Sub of exp * exp | Mul of exp * exp
             | Div of exp * exp | Uminus of exp

%left PLUS MINUS
%left MUL DIV
%%
exp : NUM                  (Int NUM)
    | MINUS exp            (Uminus exp)
    | exp PLUS exp         (Add (exp1, exp2))
    | exp MINUS exp        (Sub (exp1, exp2))
    | exp MUL exp          (Mul (exp1, exp2))
    | exp DIV exp          (Div (exp1, exp2))
    | LPAR exp RPAR        (exp)

RHS of rule on stack: ... MINUS E
input terminal next, rest yet to read: MUL E ....................
what happens?
41
one more example
datatype exp = Int of int | Add of exp * exp
             | Sub of exp * exp | Mul of exp * exp
             | Div of exp * exp | Uminus of exp

%left PLUS MINUS
%left MUL DIV
%%
exp : NUM                  (Int NUM)
    | MINUS exp            (Uminus exp)
    | exp PLUS exp         (Add (exp1, exp2))
    | exp MINUS exp        (Sub (exp1, exp2))
    | exp MUL exp          (Mul (exp1, exp2))
    | exp DIV exp          (Div (exp1, exp2))
    | LPAR exp RPAR        (exp)

RHS of rule on stack: ... MINUS E
input terminal next, rest yet to read: MUL E ....................
what happens? prec(*) > prec(-) ==> we SHIFT
42
the fix
datatype exp = Int of int | Add of exp * exp
             | Sub of exp * exp | Mul of exp * exp
             | Div of exp * exp | Uminus of exp

%left PLUS MINUS
%left MUL DIV
%left UMINUS
%%
exp : NUM                        (Int NUM)
    | MINUS exp %prec UMINUS     (Uminus exp)
    | exp PLUS exp               (Add (exp1, exp2))
    | exp MINUS exp              (Sub (exp1, exp2))
    | exp MUL exp                (Mul (exp1, exp2))
    | exp DIV exp                (Div (exp1, exp2))
    | LPAR exp RPAR              (exp)

RHS of rule on stack: ... MINUS E
input terminal next, rest yet to read: MUL E ....................
43
the fix
datatype exp = Int of int | Add of exp * exp
             | Sub of exp * exp | Mul of exp * exp
             | Div of exp * exp | Uminus of exp

%left PLUS MINUS
%left MUL DIV
%left UMINUS
%%
exp : NUM                        (Int NUM)
    | MINUS exp %prec UMINUS     (Uminus exp)
    | exp PLUS exp               (Add (exp1, exp2))
    | exp MINUS exp              (Sub (exp1, exp2))
    | exp MUL exp                (Mul (exp1, exp2))
    | exp DIV exp                (Div (exp1, exp2))
    | LPAR exp RPAR              (exp)

RHS of rule on stack: ... MINUS E
input terminal next, rest yet to read: MUL E ....................
changing precedence of rule alters decision:
prec(-) > prec(*) ==> we REDUCE
(the rule MINUS exp now takes the precedence of UMINUS)
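
The effect of %prec can be modelled by letting a rule override its default precedence. The Python sketch below assumes three %left levels as in the directives of this example; the names LEVELS, PREC, decide and prec_override are invented for illustration.

```python
# All three levels are %left; lowest precedence first.
LEVELS = [["PLUS", "MINUS"], ["MUL", "DIV"], ["UMINUS"]]
PREC = {t: level for level, terminals in enumerate(LEVELS) for t in terminals}

def decide(rule_rhs, next_terminal, prec_override=None):
    """Shift or reduce when rule_rhs is on the stack and next_terminal is next.
    prec_override plays the role of a %prec annotation on the rule."""
    if prec_override is not None:
        rule_prec = PREC[prec_override]
    else:
        # Default: precedence of the right-most terminal in the rule.
        rightmost = [s for s in rule_rhs if s in PREC][-1]
        rule_prec = PREC[rightmost]
    if PREC[next_terminal] > rule_prec:
        return "shift"
    return "reduce"        # lower precedence, or equal precedence with %left

without_fix = decide(["MINUS", "exp"], "MUL")
with_fix = decide(["MINUS", "exp"], "MUL", prec_override="UMINUS")
```

Without the annotation the parser shifts MUL, so the unary minus would apply to the whole product; with it the parser reduces first, so the minus applies only to the operand immediately after it.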
44
the dangling else problem
  • Grammar
  • S ::= if E then S else S
  • S ::= if E then S
  • S ::= ...
  • Consider: if a then if b then S else S
  • parse 1: if a then (if b then S else S)
  • parse 2: if a then (if b then S) else S
  • Parser reports shift-reduce error
  • default behavior: shift (what we want)
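
A parser that always takes an else when one is available behaves like the SHIFT choice. The Python sketch below is a hand-written recursive-descent version of that policy, not an LR parser; conditions and plain statements are single tokens, an assumption made to keep the example short.

```python
def parse_stmt(toks, i=0):
    """S ::= if E then S else S | if E then S | other
    Returns (tree, next_index). Taking `else` greedily mirrors SHIFT."""
    if toks[i] == "if":
        cond = toks[i + 1]
        if toks[i + 2] != "then":
            raise SyntaxError("expected then")
        then_branch, i = parse_stmt(toks, i + 3)
        if i < len(toks) and toks[i] == "else":
            # The innermost unfinished `if` claims the else.
            else_branch, i = parse_stmt(toks, i + 1)
            return ("if", cond, then_branch, else_branch), i
        return ("if", cond, then_branch), i
    return toks[i], i + 1

tree, _ = parse_stmt("if a then if b then s1 else s2".split())
```

The resulting tree is ("if", "a", ("if", "b", "s1", "s2")), which is parse 1: the else belongs to the nearest if.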

45
the dangling else problem
  • Grammar
  • S ::= if E then S else S
  • S ::= if E then S
  • S ::= ...
  • Alternative solution is to rewrite grammar
  • S ::= M
  • S ::= U
  • M ::= if E then M else M
  • M ::= ...
  • U ::= if E then S
  • U ::= if E then M else U

46
default behavior of ML-Yacc
  • Shift-Reduce error
  • shift
  • Reduce-Reduce error
  • reduce by first rule
  • generally considered unacceptable
  • for assignment 3, your job is to write a grammar
    for Fun such that there are no conflicts
  • you may use precedence directives tastefully

47
Note: To enter ML-Yacc hell, use a parser to
catch type errors
  • when doing assignment 3, your job is to catch
    parse errors
  • there are lots of programming errors that will
    slip by the parser
  • e.g. 3 + true
  • catching these sorts of errors is the job of the
    type checker
  • just as catching program structure errors was the
    job of the parser, not the lexer
  • attempting to do type checking in the parser is
    impossible (in general)
  • why? Hint: what does context-free grammar
    imply?
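
The division of labour can be illustrated with a separate pass over the tree. In this Python sketch an expression such as 3 + true is assumed to have parsed successfully into the tuple ("+", 3, True); the tuple encoding and the function name type_of are inventions for the example.

```python
def type_of(e):
    """Return "int" or "bool" for a tiny expression tree, or raise TypeError."""
    if isinstance(e, bool):        # test bool first: in Python, bool is a subclass of int
        return "bool"
    if isinstance(e, int):
        return "int"
    op, left, right = e
    if op == "+" and type_of(left) == "int" and type_of(right) == "int":
        return "int"
    raise TypeError(f"ill-typed expression: {e!r}")
```

Both ("+", 3, 4) and ("+", 3, True) have the shape exp PLUS exp, so a grammar accepts both; only this later pass rejects the second.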