Design Patterns for Recursive Descent Parsing

1
Design Patterns for Recursive Descent Parsing
  • Dung Nguyen, Mathias Ricken, Stephen Wong
  • Rice University

2
RDP in CS2?
  • Context: objects-first intro curriculum that
    already covers
      • Polymorphism
      • Recursion
      • Design patterns (visitors, factories, etc.)
      • OOD principles
  • Want a good OOP/D example
  • Want a relevant CS topic
  • Recursive Descent Parsing
      • Smooth transitions from simple to complex
        examples, developing an abstract model
      • Small change in grammar → small change in code

3
The Problem of Teaching RDP
Mutual Recursion!
A complex, isolated, advanced topic for upper
division only
Global Analysis
New Grammar → New Code
4
Object-Oriented Approach
  • Grammar must drive any processing related to it,
    e.g. parsing.
  • → Model the grammar first
      • Terminal symbols (tokens)
      • Non-terminal symbols (incl. start symbol)
      • Rules
  • Driving forces
      • Decouple intelligent tokens from rules → visitors
        to tokens
      • Extensible system: open-ended number of tokens →
        extended visitors

Then parsing will come!
5
Representing Tokens
  • Intelligent tokens → no type checking!
  • Decoupled from processing → Visitor pattern
  • For LL(1) grammars, in any given situation, the
    token determines the parsing action taken
  • → Parsing is done by visitors to tokens

6
Processing Tokens with Visitors
Standard Visitor Pattern
[Diagram: a Visitor with methods caseA and caseB
visits Token A and Token B. Token A calls caseA;
Token B calls caseB.]
But we want to be able to add an unbounded number
of tokens!
7
Processing Tokens with Visitors
Visitor Pattern modified with Chain-of-Responsibility
[Diagram: VisitorA (caseA, defaultCase) visits
Token A, which calls caseA. When VisitorA visits
Token B, Token B calls defaultCase, which
delegates to the next visitor in the chain,
VisitorB (caseB, defaultCase). VisitorB then
visits Token B, which calls caseB.]
Handles any type of token!
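
A minimal Java sketch of the visitor-with-chain idea from slides 6 and 7. Every name in it (ITokVisitor, Token, NumToken, IdToken, ChainVisitor, NumHandler, IdHandler, TokenVisitorDemo) is hypothetical and chosen for illustration; the slides do not show the authors' code. It assumes each token class declares its own visitor interface, and that a visitor which does not know a token falls back to defaultCase and hands the token to its successor.

```java
// Sketch only: all names are assumptions, not the presenters' actual classes.

/** Base visitor: the only case every visitor must have is the default one. */
interface ITokVisitor {
    Object defaultCase(Token host, Object param);
}

/** An "intelligent" token: it decides which visitor case to call. */
abstract class Token {
    private final String lexeme;
    Token(String lexeme) { this.lexeme = lexeme; }
    String getLexeme() { return lexeme; }
    abstract Object execute(ITokVisitor v, Object param);
}

class NumToken extends Token {
    /** Visitors that know about NumToken implement this extra case. */
    interface INumVisitor extends ITokVisitor {
        Object numCase(NumToken host, Object param);
    }
    NumToken(String lexeme) { super(lexeme); }
    @Override Object execute(ITokVisitor v, Object param) {
        return (v instanceof INumVisitor)
            ? ((INumVisitor) v).numCase(this, param)
            : v.defaultCase(this, param);
    }
}

class IdToken extends Token {
    interface IIdVisitor extends ITokVisitor {
        Object idCase(IdToken host, Object param);
    }
    IdToken(String lexeme) { super(lexeme); }
    @Override Object execute(ITokVisitor v, Object param) {
        return (v instanceof IIdVisitor)
            ? ((IIdVisitor) v).idCase(this, param)
            : v.defaultCase(this, param);
    }
}

/** Chain link: an unknown token is re-dispatched to the successor visitor. */
abstract class ChainVisitor implements ITokVisitor {
    private final ITokVisitor successor;
    ChainVisitor(ITokVisitor successor) { this.successor = successor; }
    @Override public Object defaultCase(Token host, Object param) {
        return host.execute(successor, param);
    }
}

class NumHandler extends ChainVisitor implements NumToken.INumVisitor {
    NumHandler(ITokVisitor successor) { super(successor); }
    @Override public Object numCase(NumToken host, Object param) {
        return "num:" + host.getLexeme();
    }
}

class IdHandler extends ChainVisitor implements IdToken.IIdVisitor {
    IdHandler(ITokVisitor successor) { super(successor); }
    @Override public Object idCase(IdToken host, Object param) {
        return "id:" + host.getLexeme();
    }
}

class TokenVisitorDemo {
    /** End of the chain: no visitor recognized the token. */
    static final ITokVisitor FAIL = (host, param) -> {
        throw new IllegalArgumentException("Unexpected token: " + host.getLexeme());
    };
    /** NumHandler first; anything it does not know goes on to IdHandler. */
    static final ITokVisitor NUM_OR_ID = new NumHandler(new IdHandler(FAIL));

    static String describe(Token t) {
        return (String) t.execute(NUM_OR_ID, null);
    }

    public static void main(String[] args) {
        System.out.println(describe(new NumToken("42")));
        System.out.println(describe(new IdToken("x")));
    }
}
```

Adding a new token type in this sketch means adding one token class and one handler, then linking the handler into the chain; the existing tokens and handlers stay unchanged.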
8
Modeling an LL(1) Grammar
E  → F | F + E
F  → num | id

E  → F E1
E1 → empty | + E
F  → num | id
  • Left-Factoring
  • Make grammar predictively parsable

9
Modeling an LL(1) Grammar
E  → F E1
E1 → empty | + E
F  → num | id

E   → F E1
E1  → empty | E1a
E1a → + E
F   → F1 | F2
F1  → num
F2  → id
  • In multiple rules (branches), replace sequences
    and tokens with unique non-terminal symbols
  • Branches only contain non-terminals

10
Modeling an LL(1) Grammar
  • Branches modeled by inheritance (is-a)

A → B | C
  • Sequences modeled by composition (has-a)

S → X Y
11
Object Model of Grammar
E   → F E1
E1  → empty | E1a
E1a → + E
F   → F1 | F2
F1  → num
F2  → id
Grammar Structure = Class Structure
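
The mapping from grammar to classes could be sketched in Java as follows. The class names simply mirror the grammar symbols and are assumptions for illustration, as is the reading of E1a as "+" followed by E. Leaves hold plain strings rather than token objects to keep the sketch short.

```java
// Sketch only: class names mirror the grammar symbols and are assumptions.

/** E -> F E1 is a sequence, so E has-a F and has-a E1 (composition). */
class E {
    final F f;
    final E1 e1;
    E(F f, E1 e1) { this.f = f; this.e1 = e1; }
    @Override public String toString() { return f.toString() + e1.toString(); }
}

/** E1 -> empty | E1a is a branch, so each alternative is-a E1 (inheritance). */
abstract class E1 { }

class Empty extends E1 {
    @Override public String toString() { return ""; }
}

/** E1a -> + E is a sequence: the "+" token followed by an E. */
class E1a extends E1 {
    final E e;
    E1a(E e) { this.e = e; }
    @Override public String toString() { return " + " + e; }
}

/** F -> F1 | F2 is a branch. */
abstract class F { }

/** F1 -> num */
class F1 extends F {
    final String num;
    F1(String num) { this.num = num; }
    @Override public String toString() { return num; }
}

/** F2 -> id */
class F2 extends F {
    final String id;
    F2(String id) { this.id = id; }
    @Override public String toString() { return id; }
}

class GrammarModelDemo {
    /** Object structure for the input "a + 3". */
    static E sample() {
        return new E(new F2("a"), new E1a(new E(new F1("3"), new Empty())));
    }

    public static void main(String[] args) {
        System.out.println(sample()); // prints "a + 3"
    }
}
```

Each rule becomes exactly one class, so the class diagram can be read off the grammar directly.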
12
Modeling an LL(1) Grammar
No Predictive Parsing Table!
Declarative, not procedural
Model the grammar, not the parsing!
13
Detailed and Global Analysis
E   → F E1
E1  → empty | E1a
E1a → + E
F   → F1 | F2
F1  → num
F2  → id
  • To process E, we must first know about F and E1
  • But to process F, we must first know about F1
    and F2
  • But to process F1, we must first know about num!
  • The processing of one rule requires deep
    knowledge of the whole grammar!
  • Or does it??...
Abstract and Local Analysis!
  • To process E, we must have the ability to
    process F and E1, independent of how either F
    or E1 is processed!
  • Since parsing is done with visitors to tokens,
    all we need to parse E are the visitors to parse
    F and E1.
  • But E doesn't know what it takes to make the F
    and E1 parsing visitors
  • We need abstract construction of the visitors
Abstract Factories Decouple Rules
14
Factory Model of Parser
E   → F E1
E1  → empty | E1a
E1a → + E
F   → F1 | F2
F1  → num
F2  → id
Parser Structure = Factory Structure
Grammar represented purely with composition
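
One way the factory structure might look in Java is sketched below. It is a simplification under stated assumptions, not the authors' implementation: tokens are plain strings instead of intelligent token objects, the parse result is a string rather than a grammar object tree, E1a is assumed to be "+" followed by E, and all names (TokVisitor, VisitorFactory, ChainableFactory, GrammarFactories and its members) are hypothetical. The point it tries to show is that the factory for E is composed from the factories for F and E1 and only asks them to make visitors, without knowing how they do it.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Simplified token visitor (assumption: tokens are plain strings here). */
interface TokVisitor { String visit(String tok, Deque<String> rest); }

/** Abstract factory: a rule only knows its parts can make parsing visitors. */
interface VisitorFactory { TokVisitor makeVisitor(); }

/** Factory for a rule that can appear as one alternative of a branch. */
interface ChainableFactory extends VisitorFactory {
    TokVisitor makeChainedVisitor(TokVisitor successor);
    @Override default TokVisitor makeVisitor() {
        return makeChainedVisitor((tok, rest) -> {
            throw new IllegalArgumentException("Unexpected token: " + tok);
        });
    }
}

final class GrammarFactories {
    static final String EOF = "<eof>";

    static String nextToken(Deque<String> rest) {
        return rest.isEmpty() ? EOF : rest.pop();
    }

    // F1 -> num, F2 -> id: handle a matching token, else delegate down the chain.
    static final ChainableFactory F1_FACT = succ -> (tok, rest) ->
        tok.matches("\\d+") ? "num(" + tok + ")" : succ.visit(tok, rest);
    static final ChainableFactory F2_FACT = succ -> (tok, rest) ->
        tok.matches("[A-Za-z]\\w*") ? "id(" + tok + ")" : succ.visit(tok, rest);

    /** Branch A -> B | C: B's visitor is chained in front of C's. */
    static ChainableFactory branch(ChainableFactory b, ChainableFactory c) {
        return succ -> b.makeChainedVisitor(c.makeChainedVisitor(succ));
    }

    /** Sequence S -> X Y: parse X from this token, then Y from the next one. */
    static VisitorFactory sequence(VisitorFactory x, VisitorFactory y) {
        return () -> (tok, rest) ->
            x.makeVisitor().visit(tok, rest) + y.makeVisitor().visit(nextToken(rest), rest);
    }

    /** Wires the factories together exactly as the grammar rules are written. */
    static VisitorFactory buildE() {
        VisitorFactory[] e = new VisitorFactory[1]; // E1a refers back to E, filled in below
        ChainableFactory f = branch(F1_FACT, F2_FACT);
        ChainableFactory e1a = succ -> (tok, rest) -> tok.equals("+")
            ? " + " + e[0].makeVisitor().visit(nextToken(rest), rest)
            : succ.visit(tok, rest);
        // empty: consume nothing, so the token is put back for whoever comes next
        ChainableFactory empty = succ -> (tok, rest) -> { rest.push(tok); return ""; };
        VisitorFactory e1 = branch(e1a, empty);
        e[0] = sequence(f, e1);
        return e[0];
    }

    static String parse(String input) {
        Deque<String> rest = new ArrayDeque<>(Arrays.asList(input.trim().split("\\s+")));
        String tree = buildE().makeVisitor().visit(nextToken(rest), rest);
        if (!nextToken(rest).equals(EOF)) {
            throw new IllegalArgumentException("Trailing input after expression");
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(parse("a + 3")); // prints "id(a) + num(3)"
    }
}
```

In this sketch, buildE contains one line per grammar rule and no prediction table. The only ordering concern is the recursive reference from E1a back to E, which is resolved lazily at parse time.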
15
Extending the Grammar
  • Adding new tokens and rules
  • Highly localized impact on code
  • No re-computing of prediction tables

16
Original grammar:
E   → F E1
E1  → empty | E1a
E1a → + E
F   → F1 | F2
F1  → num
F2  → id

Extended grammar:
E   → S E1
E1  → empty | E1a
E1a → + E
S   → P | T
P   → ( E )
T   → F T1
T1  → empty | T1a
T1a → * S
F   → F1 | F2
F1  → num
F2  → id
17
Parser Demo
(If time permits)
We change your grammar in two minutes while you
wait!
18
Automatic Parser Generator
  • No additional theory needed for generalization
      • No fixed points, no FIRST and FOLLOW sets
  • Kooprey
      • Parser generator: BNF → Java
      • kouprey (noun): a rare short-haired ox (Bos
        sauveli) of forests of Indochina
        (Merriam-Webster Online)
  • Extensions
      • Skip generation of source, create parser at
        runtime

19
Conclusion
  • Simple enough to introduce in a CS2 course (at Rice:
    near end of CS2)
  • Teaches an abstraction of grammars and parsing
  • Reinforces foundational OO principles
      • Abstract representations
      • Abstract construction
      • Decoupled systems
      • Recursion

http://www.exciton.cs.rice.edu/research/sigcse05