BNF Grammar - PowerPoint PPT Presentation

About This Presentation
Title:

BNF Grammar

Description:

... mark , any chars (any printable nonalphabetic character except double quote) ... A grammar typically has a 'top level' element ... – PowerPoint PPT presentation

Number of Views:156
Avg rating:3.0/5.0
Slides: 23
Provided by: DaveMa56
Category:
Tags: bnf | grammar | printable

less

Transcript and Presenter's Notes

Title: BNF Grammar


1
BNF Grammar
  • Example design of a data structure

2
Form of BNF rules
  • ltBNF rulegt ltnonterminalgt ""
    ltdefinitionsgt
  • ltnonterminalgt "lt" ltwordsgt "gt"
  • ltterminalgt ltwordgt ltpunctuation markgt
    ' " ' ltany charsgt ' " '
  • ltwordsgt ltwordgt ltwordsgt ltwordgt
  • ltwordgt ltlettergt ltwordgt ltlettergt ltwordgt
    ltdigitgt
  • ltdefinitionsgt ltdefinitiongt ltdefinitionsgt
    "" ltdefinitiongt
  • ltdefinitiongt ltemptygt lttermgt
    ltdefinitiongt lttermgt
  • ltemptygt
  • lttermgt ltterminalgt ltnonterminalgt
  • Not defined here (but you know what they are)
    ltlettergt, ltdigitgt, ltpunctuation markgt, ltany
    charsgt (any printable nonalphabetic character
    except double quote)

3
Notes on terminology
  • A grammar typically has a top level element
  • A string is a sentential form if it satisfies
    the syntax of the top level element
  • A string is a sentence if it is a sentential
    form and is composed entirely of terminals
  • A string belongs to a grammar if it satisfies
    the rules of that grammar

4
Uses of a grammar
  • A BNF grammar can be used in two ways
  • To generate strings belonging to the grammar
  • To do this, start with a string containing a
    nonterminalwhile there are still nonterminals
    in the string replace a nonterminal with
    one of its definitions
  • To recognize strings belonging to the grammar
  • This is the way programs are compiled--a program
    is a string belonging to the grammar that defines
    the language
  • Recognition is much harder than generation

5
Generating sentences
  • I want to write a program that reads in a
    grammar, stores it in some appropriate data
    structure, then generates random sentences
    belonging to that grammar
  • I need to decide
  • How to store the grammar
  • What operations to provide on the grammar
  • These decisions are intertwined!
  • How I store the grammar determines what
    operations are easy and efficient (and even
    possible!)

6
Development approaches
  • Bad approachDesign a general representation for
    grammars and a complete set of operations on them
  • Actually, this is a good approach if you are
    writing a general-purpose package for general
    use--for example, for inclusion in the Java API
  • Otherwise, it just makes your program much more
    complex
  • Good approachDecide the operations you need for
    this program, and design a representation for
    grammars that supports these operations
  • Its a nice bonus if the design can later be
    extended for other purposes
  • Remember the Extreme Programming slogan YAGNI
    You aint gonna need it.

7
Requirements and constraints
  • We need to read the grammar in
  • But we dont need to modify it later
  • Any tools for building the grammar structure can
    be private
  • We need to look up the definitions of
    nonterminals
  • We need this because we will need to replace each
    nonterminal with one of its definitions
  • We need to know the top level element of the
    grammar
  • But we can just assume that we know what it is
  • For example, we can insist that the top-level
    element be ltsentencegt

8
First cut
  • public class Grammar implements Iterable
  • ListltStringgt rule
    // a single alternative for a nonterminal
  • ListltListltStringgtgt definition
    // all the rules for one nonterminal
  • MapltString, ListltListltStringgtgtgt grammar //
    rules for all the nonterminals
  • public Grammar() grammar new
    TreeMapltString, ListltStringgtgt()
  • public void addRule(String rule) throws
    IllegalArgumentException
  • public ListltListltStringgtgt getDefinition(String
    nonterminal)
  • public ListltStringgt getOneRule(String
    nonterminal) // random choice
  • public Iterator iterator()
  • public void print()

9
First cut Evaluation
  • Advantages
  • Small, easily learned interface
  • Just one class
  • Can be made to work
  • Disadvantages
  • As designed, ltfoogt bar baz is two rules,
    requiring two calls to addRule hence requires
    caller to do some of the parsing, to separate out
    the left-hand side
  • Requires some fairly complicated use of generics
  • ArrayList implements List (hence is a List), but
    consider ListltListltStringgtgt
    definition makeList()
  • This statement is legal if makeList() returns an
    ArrayListltListltStringgtgt
  • It is not legal if makeList() returns an
    ArrayListltArrayListltStringgtgt

10
Second cut Overview
  • We can eliminate the compound generics by using
    more than one class
  • public class Grammar implements Iterable
    MapltString, Definitiongt grammar // all the
    rules
  • public class Definition ListltRulegt
    definition
  • public class Rule String lhs //
    the definiendum ListltStringgt rhs // the
    definiens

11
Second cut More detail
  • public class Grammar implements Iterable
    MapltString, Definitiongt grammar // rules for all
    the nonterminals public Grammar()
    grammar new TreeMapltString, Definitiongt()
    // constructor public void addRule(String
    rule) throws IllegalArgumentException public
    Definition getDefinition(String nonterminal)
    public Iterator iterator() public void
    print()
  • public class Definition ListltRulegt
    definition // all definitions for some
    unspecified nonterminal Definition()
    // constructor void addRule(Rule rule)
    Rule getOneRule() public String
    toString()
  • public class Rule String lhs //
    the definiendum ListltStringgt rhs //
    the definiens Rule(String text) //
    constructor public String getLeftHandSide()
    public ListltStringgt getRightHandSide()
    public String toString()

12
Second cut Evaluation
  • Advantages
  • Simplifies use of generics
  • Disadvantages
  • Many more methods
  • Definitions are unattached from nonterminal
    being defined
  • This makes it easier to parse definitions
  • Seems a bit unnatural
  • Need to pass the tokenizer around as an
    additional argument
  • Doesnt help with the problem that the caller
    still has to separate out the definiendum from
    the definiens

13
Third (very brief) cut
  • Definition and Rule are basically both just lists
    of strings
  • Why not just have them implement List?
  • Methods to implement
  • public boolean add(Object o)public void add(int
    index, Object element)public boolean
    addAll(Collection c)public boolean addAll(int
    index, Collection c)public void clear() public
    boolean contains(Object o)public boolean
    containsAll(Collection c)public Object get(int
    index)public int indexOf(Object o)public
    boolean isEmpty()public Iterator iterator()
    public int lastIndexOf(Object o)public
    ListIterator listIterator()public ListIterator
    listIterator(int index)public boolean
    remove(Object o)public Object remove(int
    index)public boolean removeAll(Collection
    c)public boolean retainAll(Collection c)public
    Object set(int index, Object element)public int
    size() public List subList(int fromIndex, int
    toIndex)public Object toArray()public
    Object toArray(Object a)

14
Fourth cut, not quite as brief
  • The class AbstractList provides a skeletal
    implementation of the List interface...the
    programmer needs only to extend this class and
    provide implementations for the get(int index)
    and size() methods.
  • I tried this, but...
  • If I dont know how AbstractList is implemented,
    how can I write these methods?
  • No book or API class that I looked at provided
    any clues
  • I may be missing something, but it looks like the
    only thing to do is to look at the source code
    for some of Javas classes (like ArrayList) to
    see how they do it
  • Doable, but too much work!

15
Letting go of a constraint
  • It is good practice to use a more general class
    or interface if you dont need the services of a
    more specific class
  • In this problem, I want to use lists, but I dont
    care whether they are ArrayLists, or LinkedLists,
    or something else
  • Hence, I generally prefer declarations like
    ListltStringgt list new ArrayListltStringgt()
  • In this case, however, trying to do this just
    seems to be the cause of many of the problems
  • What happens if I just make all lists ArrayLists?

16
Fifth (and final) cut
  • public class Grammar
  • MapltString, Definitionsgt grammar // rules for
    all the nonterminals
  • public Grammar() grammar new TreeMapltString,
    Definitionsgt()
  • public void addRule(String rule) throws
    IllegalArgumentException
  • public Definitions getDefinitions(String
    nonterminal)
  • public void print()
  • private void addToGrammar(String lhs,
    SingleDefinition definition)
  • private static boolean isNonterminal(String s)
    return s.startsWith("lt")
  • public class Definitions extends
    ArrayListltSingleDefinitiongt
  • _at_Override public String toString()
  • public class SingleDefinition extends
    ArrayListltStringgt
  • _at_Override public String toString()

17
Explanation I of final BNF API
  • Exampleltunsigned integergt ltdigitgt
    ltunsigned integergt ltdigitgt
  • The above is a rule
  • ltunsigned integergt is the definiendum (the thing
    being defined)
  • ltdigitgt is a single definition of ltunsigned
    integergt
  • ltunsigned integergt ltdigitgt is another single
    definition of ltunsigned integergt
  • So,
  • There is a SingleDefinition consisting of the
    ArrayList "ltdigitgt"
  • Another SingleDefinition consists of the
    ArrayList "ltunsigned integergt", "ltdigitgt"
  • A Definitions object is a list of single
    definitions, in this case "ltdigitgt" ,
    "ltunsigned integergt", "ltdigitgt"
  • A Grammar maps nonterminals onto their
    definitions thus, a grammar containing the above
    rule would include the mapping"ltunsigned
    integergt" ? "ltdigitgt" , "ltunsigned
    integergt", "ltdigitgt"

18
Explanation II of final BNF API
  • A Grammar is a set of mappings from definienda
    (nonterminals) to definitions, along with some
    operations on that set of definitions
  • You can addRule(String rule) to a Grammar
  • The rule is parsed, and an entry made in the map
  • Definitions for a nonterminal may be together, as
    in the above example, or separate
  • ltunsigned integergt ltdigitgt
  • ltunsigned integergt ltunsigned integergt ltdigitgt
  • You can get a list of all the Definitions for a
    given nonterminal
  • You can print the complete Grammar

19
Final version Evaluation
  • Advantages
  • Grammar has one constructor and three public
    methods
  • Definitions and SingleDefinition are just
    ArrayLists, so there are no new methods to learn
  • All rule parsing is consolidated into a single
    public method, addRule(String rule)
  • I was able to come up with more meaningful names
    for classes
  • Disadvantages
  • User has to do a bit more list manipulation in
    particular, choosing a random element from a list
  • This doesnt seem like an appropriate thing to
    have in a grammar, anyway

20
Morals
  • Weeks of programming can save you hours of
    planning.
  • The mistake most programmers make is to use the
    first design that comes to mind
  • This usually can be made to work, but its seldom
    optimal
  • Much as we would like to pretend otherwise,
    programming is an iterative process--we design,
    then try to implement, then change the design,
    then try to implement....
  • TDD (Test-Driven Development) is a lightweight
    (low cost) way to try out a design
  • For example, in my first design, I discovered how
    difficult it was to write tests that used the
    complex generics
  • Consequently, I never even tried to implement
    this first design
  • Morals to take home
  • Be flexible try out more than one design
  • Do TDD

21
Aside Tokenizing the input grammar
  • I wrote a BnfTokenizer class that returns every
    token as a String
  • Nonterminals keep their angle brackets, and may
    be multi-word
  • Double-quoted strings are returned as a single
    token (minus the double quotes)
  • and are returned as single tokens
  • BnfTokenizer uses StreamTokenizer
  • It provides two constructors, BnfTokenizer()
    and BnfTokenizer(String text)
  • And two methods, void tokenize(text) and
    String nextToken()

22
The End
Write a Comment
User Comments (0)
About PowerShow.com