Regular Grammars - PowerPoint PPT Presentation

Slides: 21
Provided by: Dr231886
Transcript and Presenter's Notes

Title: Regular Grammars


1
Scanning, or Lexical Analysis.
  • Regular Grammars
  • Non-terminals (arbitrary names)
  • Terminals (characters)
  • Productions limited to the following forms
  • Non-terminal ::= terminal
  • Non-terminal ::= terminal Non-terminal
  • Treat character class (e.g. digit) as terminal
  • Regular grammars cannot count: size limits on
    identifiers and literals cannot be expressed
    without enumerating every length
  • Cannot express proper nesting (parentheses)

2
Regular Grammars
  • Grammar for real literals with no exponent
  • digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  • REALVAL ::= digit REALVAL1
  • REALVAL1 ::= digit REALVAL1 (arbitrary size)
  • REALVAL1 ::= . INTEGERVAL
  • INTEGERVAL ::= digit INTEGERVAL (arbitrary size)
  • INTEGERVAL ::= digit
  • Start symbol is ?
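
The grammar for real literals can be tried out directly. The following is a minimal Python sketch, not part of the original slides. It assumes the start symbol is REALVAL and treats "digit" as a terminal character class; the names GRAMMAR, matches_terminal, derives and is_real_literal are hypothetical.

```python
# Productions of the real-literal grammar, with "digit" treated as a terminal
# character class. Each non-terminal maps to (terminal, next non-terminal)
# pairs; None means the production ends after the terminal.
GRAMMAR = {
    "REALVAL": [("digit", "REALVAL1")],
    "REALVAL1": [("digit", "REALVAL1"), (".", "INTEGERVAL")],
    "INTEGERVAL": [("digit", "INTEGERVAL"), ("digit", None)],
}


def matches_terminal(terminal, ch):
    """A terminal is either the class 'digit' or a literal character."""
    if terminal == "digit":
        return ch in "0123456789"
    return ch == terminal


def derives(nonterminal, text):
    """Return True if `nonterminal` can derive exactly `text`."""
    if not text:
        return False  # every production consumes at least one character
    head, rest = text[0], text[1:]
    for terminal, next_nonterminal in GRAMMAR[nonterminal]:
        if not matches_terminal(terminal, head):
            continue
        if next_nonterminal is None:
            if rest == "":
                return True
        elif derives(next_nonterminal, rest):
            return True
    return False


def is_real_literal(text):
    # Assumption: REALVAL is the start symbol.
    return derives("REALVAL", text)
```

Under these assumptions, is_real_literal("3.14") is True, while "42", ".5" and "1." are rejected because the grammar requires at least one digit on each side of the point.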

3
Regular Expressions
  • REs are defined by an alphabet (terminal symbols)
    and three operations
  • Alternation: RE1 | RE2
  • Concatenation: RE1 RE2
  • Repetition: RE* (zero or more REs)
  • The languages described by REs are exactly those
    of regular grammars
  • Regular expressions are more convenient for some
    applications
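
As an illustration of the three operations, here is a small sketch using Python's standard re module, which is not mentioned in the slides. It writes the real literal with no exponent using only alternation, concatenation and repetition; the names DIGIT, REAL and is_real are hypothetical.

```python
import re

# Alternation: a digit is one of ten alternatives.
DIGIT = "(0|1|2|3|4|5|6|7|8|9)"

# Concatenation and repetition: digit digit* . digit digit*
# The dot is escaped because an unescaped "." matches any character in `re`.
REAL = DIGIT + DIGIT + "*" + r"\." + DIGIT + DIGIT + "*"

REAL_RE = re.compile(REAL)


def is_real(text):
    """Return True if the whole string is a real literal with no exponent."""
    return REAL_RE.fullmatch(text) is not None
```

In everyday use the same pattern would normally be abbreviated as [0-9]+\.[0-9]+, which is shorthand for the same three operations.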

4
Finite State Machines or Finite Automata (FSM or FA)
  • A language defined by a grammar is a (possibly
    infinite) set of strings
  • An automaton is a computation that determines
    whether a given string belongs to a specified
    language
  • A finite state machine (FSM) is an automaton that
    recognizes regular languages (those described by
    regular expressions)
  • Simplest automaton: its memory is a single number
    (the state)

5
Specifying a Finite State Machine (FA)
  • A set of labeled states, with directed arcs between
    states labeled with characters
  • One or more states may be terminal (accepting)
  • Start is a distinguished state
  • Automaton makes a transition from state S1 to S2
    if and only if the arc from S1 to S2 is labeled with
    the next character in the input
  • Token is legal if the automaton stops in a terminal
    state
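
The specification above maps naturally onto a transition table. The sketch below is an illustration rather than the slides' own machine: the state names (START, WHOLE, DOT, FRACTION) and the helper names are hypothetical, and the automaton shown recognizes real literals with no exponent.

```python
def char_class(ch):
    """Group all digits under one arc label; other characters label themselves."""
    return "digit" if ch in "0123456789" else ch


# Arcs: (state, label) -> next state. Missing entries are error transitions.
TRANSITIONS = {
    ("START", "digit"): "WHOLE",
    ("WHOLE", "digit"): "WHOLE",
    ("WHOLE", "."): "DOT",
    ("DOT", "digit"): "FRACTION",
    ("FRACTION", "digit"): "FRACTION",
}

START_STATE = "START"
ACCEPTING = {"FRACTION"}  # terminal (accepting) states


def accepts(text):
    """Follow one arc per input character; accept if we stop in a terminal state."""
    state = START_STATE
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no arc labeled with the next character
            return False
    return state in ACCEPTING
```

The only memory the loop carries between characters is the current state, which is the point of the "memory is a single number" description.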

6
FA from Grammar
  • One state for each non-terminal
  • A rule of the form Nt1 ::= terminal generates a
    transition from the state for Nt1 to a final state
  • A rule of the form Nt1 ::= terminal Nt2 generates a
    transition from the state for Nt1 to the state for
    Nt2 on an arc labeled by the terminal
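
The construction can be sketched in a few lines of Python. This is an illustrative sketch, not the presenter's code: it uses the real-literal grammar as input, adds one extra state named FINAL (a hypothetical name), and simulates the result by tracking the set of states the machine could be in, which is one possible way to run a machine that may have two arcs with the same label.

```python
# Productions as (non-terminal, terminal, next non-terminal or None).
PRODUCTIONS = [
    ("REALVAL", "digit", "REALVAL1"),
    ("REALVAL1", "digit", "REALVAL1"),
    ("REALVAL1", ".", "INTEGERVAL"),
    ("INTEGERVAL", "digit", "INTEGERVAL"),
    ("INTEGERVAL", "digit", None),
]

FINAL = "FINAL"  # extra accepting state, not a non-terminal


def build_fa(productions):
    """One state per non-terminal; each rule contributes one labeled arc."""
    arcs = {}
    for source, terminal, target in productions:
        # Nt1 ::= terminal      -> arc from Nt1 to the final state
        # Nt1 ::= terminal Nt2  -> arc from Nt1 to Nt2
        destination = FINAL if target is None else target
        arcs.setdefault((source, terminal), set()).add(destination)
    return arcs


def accepts(arcs, start, text):
    """Simulate the machine by tracking every state it could be in."""
    current = {start}
    for ch in text:
        label = "digit" if ch in "0123456789" else ch
        current = set().union(*(arcs.get((s, label), set()) for s in current))
        if not current:
            return False
    return FINAL in current
```

For this grammar the state for INTEGERVAL ends up with two arcs labeled digit, one back to itself and one to FINAL, so the machine produced this way is non-deterministic.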

7
Graphic representation of FA

8
FA from RE
  • Each RE corresponds to a grammar
  • For all REs, a natural translation to an FSM exists
  • Alternation often leads to non-deterministic
    machines

9
Deterministic Finite Automata (DFA)
  • For all states S and all characters C, there is at
    most one arc from S that is labeled with C
  • Easier to implement
  • No backtracking
  • Conventions for DFA
  • Error transitions are not explicitly shown
  • Input symbols that result in the same transition
    are grouped together (this set can even be given
    a name)
  • Stopping conditions and actions are still not
    displayed

10
Non-Deterministic Finite Automata (NFA)
  • A non-deterministic FA has at least one state with
    two arcs to two distinct states labeled with the
    same character
  • Example: from the start state, a digit can begin an
    integer literal or a real literal
  • Direct implementation requires backtracking
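
The integer-or-real example can be sketched as a backtracking recognizer. The state names and the classify function below are hypothetical, chosen only for illustration; the start state has two arcs labeled digit, one toward an integer literal and one toward a real literal.

```python
# Arcs: (state, label) -> list of possible next states.
# START has two arcs labeled "digit", which makes the machine non-deterministic.
NFA = {
    ("START", "digit"): ["INT", "REAL_WHOLE"],
    ("INT", "digit"): ["INT"],
    ("REAL_WHOLE", "digit"): ["REAL_WHOLE"],
    ("REAL_WHOLE", "."): ["DOT"],
    ("DOT", "digit"): ["FRACTION"],
    ("FRACTION", "digit"): ["FRACTION"],
}

# Accepting states and the kind of token each one recognizes.
ACCEPTING = {"INT": "integer", "FRACTION": "real"}


def classify(text, state="START", pos=0):
    """Return 'integer', 'real', or None, trying each arc in turn."""
    if pos == len(text):
        return ACCEPTING.get(state)
    ch = text[pos]
    label = "digit" if ch in "0123456789" else ch
    for target in NFA.get((state, label), []):
        result = classify(text, target, pos + 1)
        if result is not None:
            return result
        # Dead end: back up to this position and try the next arc.
    return None
```

On the input "4.2" the recognizer first guesses an integer literal, fails at the point, backs up to the first character, and only then follows the real-literal arc.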

11
Lookahead and Backtracking in NFA

12
Implementation of FA

13
From RE to DFA: RE to NFA

14
NFA to DFA
  • There is an algorithm for converting a
    non-deterministic machine to a deterministic one
  • Result may have exponentially more states
  • Intuitively, new states are needed to express
    uncertainty about the token: int or real
  • Other algorithms for minimizing number of states
    of FSM, for showing equivalence, etc.
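
The slides do not name the conversion algorithm. The sketch below uses the standard subset construction, in which each deterministic state is a set of non-deterministic states, and applies it to a hypothetical integer-or-real machine. All names are illustrative assumptions, and the sketch does not handle empty (epsilon) transitions.

```python
from collections import deque

# Hypothetical NFA: START has two arcs labeled "digit".
NFA = {
    ("START", "digit"): ["INT", "REAL_WHOLE"],
    ("INT", "digit"): ["INT"],
    ("REAL_WHOLE", "digit"): ["REAL_WHOLE"],
    ("REAL_WHOLE", "."): ["DOT"],
    ("DOT", "digit"): ["FRACTION"],
    ("FRACTION", "digit"): ["FRACTION"],
}
NFA_ACCEPTING = {"INT", "FRACTION"}


def subset_construction(nfa, start, accepting):
    """Build a DFA whose states are frozensets of NFA states."""
    alphabet = {label for (_, label) in nfa}
    start_set = frozenset([start])
    dfa = {}
    dfa_accepting = set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for label in alphabet:
            target = frozenset(
                t for s in current for t in nfa.get((s, label), ())
            )
            if not target:
                continue  # error transition, left implicit
            dfa[(current, label)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa, start_set, dfa_accepting


def run(dfa, start_set, dfa_accepting, text):
    """Run the resulting DFA: one arc per character, no backtracking."""
    state = start_set
    for ch in text:
        label = "digit" if ch in "0123456789" else ch
        state = dfa.get((state, label))
        if state is None:
            return False
    return state in dfa_accepting
```

After the first digit the deterministic machine sits in the combined state {INT, REAL_WHOLE}, which is exactly the "uncertainty about the token" mentioned above. This small example produces only four deterministic states, although in general the number of subsets can grow exponentially.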

15
Example DFA

16
Another view of the same DFA

17
Yet another view of the same DFA

18
State Minimization in DFA

19
TINY DFA

20
Lex for Scanner
  • Lex Conventions for RE
  • Format of a Lex Input File