Minimize the number of states in a DFA presentation


Minimize the number of states in a DFA
Algorithm (3.6, page 142)
Input: a DFA M
Output: a minimum-state DFA M'
If some states in M have no transition on some inputs, add
transitions to a dead state.
Let P = { all accepting states, all nonaccepting states }
Let P' = { }
Loop: for each group G in P do
  Partition G into subgroups so that s and t (in G)
  belong to the same subgroup if and only if each
  input a moves s and t to states in the same
  group of P
  put the new subgroups in P'
if (P' != P) { P = P'; P' = { }; goto Loop }
Choose one state from each group of P as a state of M'.
Remove any dead states and unreachable states.
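The refinement loop above can be sketched in Python. This is a minimal sketch, not the textbook's code: the function name `minimize_partition`, the dictionary encoding of the transition function, and the five-state example DFA for (a|b)*abb are all assumptions made for illustration. It returns only the final partition; picking a representative per group and dropping dead or unreachable states is left out.

```python
def minimize_partition(states, alphabet, delta, accepting):
    """Refine {accepting, nonaccepting} until no group splits.

    `delta` maps (state, symbol) -> state and is assumed to be total,
    i.e. a dead state has already been added if it was needed.
    """
    symbols = sorted(alphabet)
    accepting = frozenset(accepting)
    nonaccepting = frozenset(states) - accepting
    # Initial partition P; empty groups are dropped.
    partition = {g for g in (accepting, nonaccepting) if g}
    while True:
        # Map every state to the group of P that currently contains it.
        group_of = {s: g for g in partition for s in g}
        new_partition = set()
        for group in partition:
            # Two states stay together only if, on every input symbol,
            # they move to states in the same group of P.
            subgroups = {}
            for s in group:
                signature = tuple(group_of[delta[(s, a)]] for a in symbols)
                subgroups.setdefault(signature, set()).add(s)
            new_partition.update(frozenset(sub) for sub in subgroups.values())
        if new_partition == partition:
            return partition
        partition = new_partition


# Hypothetical example: a five-state DFA for (a|b)*abb, start state A.
delta = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}
groups = minimize_partition("ABCDE", "ab", delta, {"E"})
```

In this example states A and C end up in the same group, so the minimized DFA has four states.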

Lex internal
Construct an NFA that recognizes the union of all
patterns.
Convert the NFA to a DFA (record all accepting
states for each individual pattern).
Minimize the DFA (in the initial partition, keep the
accepting states of distinct patterns in separate groups).
Simulate the DFA to termination (that is, until no
further transition is possible).
Find the last DFA state entered that holds an
accepting NFA state (this picks the longest
match). If there is no such state, the input is an
invalid token.
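The last two steps (run until stuck, then fall back to the last accepting state) can be illustrated with a small sketch. It assumes a hand-built DFA for two hypothetical patterns, `a` and `abc`; the function name `longest_match` and the `accepting` map from DFA state to pattern name are illustrative choices, not part of Lex itself.

```python
def longest_match(delta, start, accepting, text, pos=0):
    """Return (pattern_name, end_index) for the longest match at `pos`.

    `delta` maps (state, char) -> state; a missing entry means the DFA
    is stuck. `accepting` maps each accepting DFA state to the pattern
    it recognizes. Returns None if no accepting state was ever entered.
    """
    state = start
    last_accept = None
    i = pos
    while i < len(text) and (state, text[i]) in delta:
        state = delta[(state, text[i])]
        i += 1
        if state in accepting:
            # Remember the most recent accepting state and where it ended.
            last_accept = (accepting[state], i)
    return last_accept


# Hypothetical DFA for the patterns "a" (named A) and "abc" (named ABC).
delta = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 3}
accepting = {1: "A", 3: "ABC"}

full = longest_match(delta, 0, accepting, "abc")      # reaches state 3
fallback = longest_match(delta, 0, accepting, "abd")  # stuck in state 2
invalid = longest_match(delta, 0, accepting, "xyz")   # stuck immediately
```

On input "abd" the DFA gets stuck in a nonaccepting state, so the scanner backs up to the last accepting state it entered and reports the one-character match for pattern A.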

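Chapter 4 Syntax analysis

[Figure: position of the parser in the compiler front end. Source program -> lexical analyzer -> (token) -> parser -> (parse tree) -> rest of front end -> intermediate code. The parser sends "request for token" back to the lexical analyzer; the components share the symbol table.]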

The syntax of a programming language is described
by a context-free grammar (Backus-Naur Form
(BNF)).
A grammar gives a precise syntactic specification
of a language.
For some classes of grammars, tools exist that
can automatically construct an efficient parser.
These tools can also detect syntactic ambiguities
and other problems automatically.
A compiler based on a grammatical description of
a language is more easily maintained and updated.

A grammar G = (N, T, P, S)
N is a finite set of non-terminal symbols
T is a finite set of terminal symbols
P is a finite subset of (N ∪ T)* N (N ∪ T)* × (N ∪ T)*
An element (α, β) of P is written as α -> β
S is a distinguished symbol in N and is called
the start symbol.
Language defined by a grammar
We say αAβ derives αwβ in one step, denoted as
αAβ => αwβ, if A -> w is a production and α and β
are arbitrary strings of terminal or nonterminal
symbols.
We say α1 derives αm if α1 => α2 => ... => αm, written as
α1 =>* αm
The language L(G) defined by G is the set of
strings of terminals w such that S =>* w.
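To make the one-step derivation relation concrete, here is a small sketch for the context-free case. The grammar S -> ( S ) | x, the dictionary encoding of P, and the function names are assumptions chosen for illustration. The bounded enumeration relies on this grammar having no empty productions, so sentential forms never shrink and over-long forms can be pruned.

```python
def one_step(productions, form):
    """All sentential forms derivable from `form` in exactly one step.

    `form` is a tuple of symbols; `productions` maps each nonterminal
    to a list of bodies (tuples of symbols).
    """
    results = set()
    for i, symbol in enumerate(form):
        for body in productions.get(symbol, []):
            # Replace the occurrence at position i by the production body.
            results.add(form[:i] + body + form[i + 1:])
    return results


def sentences_up_to(productions, start, max_len):
    """Sentences of L(G) with at most `max_len` symbols.

    Assumes no production has an empty body, so pruning by length is safe.
    """
    seen = {(start,)}
    frontier = [(start,)]
    sentences = set()
    while frontier:
        form = frontier.pop()
        if all(sym not in productions for sym in form):
            sentences.add(" ".join(form))  # only terminals remain
            continue
        for nxt in one_step(productions, form):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return sentences


# Hypothetical grammar: N = {S}, T = {"(", ")", "x"}, start symbol S.
productions = {"S": [("(", "S", ")"), ("x",)]}
first_steps = one_step(productions, ("S",))
short_sentences = sentences_up_to(productions, "S", 5)
```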

Chomsky Hierarchy (classification of grammars)
A grammar is said to be
regular if it is
  right-linear, where each production in P has the
  form A -> wB or A -> w. Here, A and B are
  non-terminals and w is a terminal
  left-linear, where each production in P has the
  form A -> Bw or A -> w
context-free if each production in P is of the
form A -> α, where A ∈ N and α ∈ (N ∪ T)*
context sensitive if each production in P is of
the form α -> β where |α| ≤ |β|
unrestricted if each production in P is of the
form α -> β where α ≠ ε

Context-free grammars are sufficient to describe
the syntax of most programming languages.
Example: a grammar for arithmetic expressions.
<expr> -> <expr> <op> <expr>
<expr> -> ( <expr> )
<expr> -> - <expr>
<expr> -> id
<op> -> + | - | * | /
Derive -(id) from the grammar:
<expr> => - <expr> => - ( <expr> ) => - ( id )
sentence: a string of terminals that can be
derived from S.
sentential form: a string of terminals or
non-terminals that can be derived from S.

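Derive id + id * id from the grammar (E stands for expr):
E => E + E => E + E * E => E + E * id => E + id * id => id + id * id
Leftmost/rightmost derivation: each step
replaces the leftmost/rightmost non-terminal.
The derivation above is rightmost; the leftmost derivation is
E => E + E => id + E => id + E * E => id + id * E => id + id * id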
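The difference between the two derivation orders can be checked mechanically. The sketch below assumes the expression grammar E -> E + E | E * E | ( E ) | - E | id and the target string id + id * id; the `derive` helper and the way production choices are passed in are illustrative, not taken from the slides.

```python
PRODUCTIONS = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("-", "E"), ("id",)],
}


def derive(start, choices, leftmost=True):
    """Apply the production bodies in `choices`, one per step.

    Each body replaces the leftmost (or rightmost) nonterminal of the
    current sentential form. Returns every sentential form as a string.
    """
    form = [start]
    steps = [" ".join(form)]
    for body in choices:
        positions = [i for i, sym in enumerate(form) if sym in PRODUCTIONS]
        if not positions:
            raise ValueError("no nonterminal left to expand")
        i = positions[0] if leftmost else positions[-1]
        if tuple(body) not in PRODUCTIONS[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(body)} is not a production")
        form[i:i + 1] = body  # splice the body in place of the nonterminal
        steps.append(" ".join(form))
    return steps


plus, times, ident = ("E", "+", "E"), ("E", "*", "E"), ("id",)
left = derive("E", [plus, ident, times, ident, ident], leftmost=True)
right = derive("E", [plus, times, ident, ident, ident], leftmost=False)
```

Both runs end in the same sentence; only the order in which nonterminals are expanded differs.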
Parse tree
A parse tree pictorially shows how the start
symbol of a grammar derives a specific string in
the language. Given a context-free grammar, a
parse tree has the following properties
The root is labeled by the start symbol
Each leaf is labeled by a token or the empty
string
Each interior node is labeled by a nonterminal
If A is a non-terminal labeling some interior
node and abcdefg..z are the labels of the
children of that node from left to right, then
A -> abcdefg..z is a production of the grammar.

The leaves of the parse tree, read from left to
right, form the yield of the parse tree. The yield
is the string derived from the
nonterminal at the root of the parse tree.
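The parse-tree properties and the yield can be sketched as follows. The nested-tuple tree encoding, the function names, and the example tree for - ( id ) under the grammar E -> E + E | E * E | ( E ) | - E | id are assumptions made for illustration.

```python
PRODUCTIONS = {
    "E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("-", "E"), ("id",)],
}


def tree_yield(node):
    """Leaves of the tree from left to right.

    An interior node is (label, [children]); a leaf is a string.
    A leaf labeled with the empty string contributes nothing.
    """
    if isinstance(node, str):
        return [node] if node else []
    _label, children = node
    leaves = []
    for child in children:
        leaves.extend(tree_yield(child))
    return leaves


def is_parse_tree(node, productions):
    """Check that every interior node corresponds to a production."""
    if isinstance(node, str):
        return True
    label, children = node
    body = tuple(c if isinstance(c, str) else c[0] for c in children)
    return body in productions.get(label, []) and all(
        is_parse_tree(c, productions) for c in children
    )


# Parse tree for - ( id ): E -> - E, E -> ( E ), E -> id.
tree = ("E", ["-", ("E", ["(", ("E", ["id"]), ")"])])
leaves = tree_yield(tree)
valid = is_parse_tree(tree, PRODUCTIONS)
```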
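An ambiguous grammar is one that can generate two
or more parse trees that yield the same string.
E.g.
string -> string + string
string -> string - string
string -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The string 9 - 5 + 2 has two derivations that
correspond to different parse trees:
string => string + string => string - string + string
=> ... => 9 - 5 + 2
string => string - string => string - string + string
=> ... => 9 - 5 + 2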
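One way to see the ambiguity is to count parse trees. The sketch below is specific to this grammar (string -> string + string | string - string | digit) and is not a general parser; the function name `count_parse_trees` and the token-list input are illustrative assumptions.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Number of parse trees deriving `tokens` from `string`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees for the slice tokens[i:j].
        if j - i == 1:
            # A single token must be one digit: string -> 0 | 1 | ... | 9
            return 1 if len(tokens[i]) == 1 and tokens[i].isdigit() else 0
        total = 0
        # Try every operator position k as the root production
        # string -> string + string or string -> string - string.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


ambiguous = count_parse_trees(["9", "-", "5", "+", "2"])
unambiguous = count_parse_trees(["9", "-", "5"])
```

The two trees for 9 - 5 + 2 group the operands differently: (9 - 5) + 2 evaluates to 6, while 9 - (5 + 2) evaluates to 2, which is why ambiguity matters for the meaning of a program.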
