Title: Chomsky Hierarchy
1 Chomsky Hierarchy
We don't need them all for PL.
Lexical structure (Scanner): regular expressions (type 3)
Syntactic structure (Parser): context-free languages (type 2)
Context-sensitive languages (type 1)
Computable (formal) languages (type 0)
Type 3 ⊂ Type 2 ⊂ Type 1 ⊂ Type 0
The inclusions are all proper.
2 Syntax
The structure of a program: Grammar
Analyze lexical structure
 Scanner: outputs a token string
Analyze phrase structure
 Parser: outputs a parse tree (intermediate data structure)
3 Separate Grammars
 Phrase structure
<program> ::= <end-of-file> | <statement-block> <program>
<statement-block> ::= <declarator-list> <statement-list>
<declarator-list> ::= <declarator> | <declarator> , <declarator-list>
<declarator> ::= <id> | <id> = <exp>
................
 Lexical structure
<id> ::= <alphabet> | <alphabet><id-tail>
<id-tail> ::= <delimiter> | <alphabet> <id-tail> | <digit> <id-tail>
<delimiter> ::= <space> | <tab> | <end-of-line>
<alphabet> ::= a | b | c | ... | z
<digit> ::= 0 | 1 | 2 | ... | 9
................
4 Scanner Simplifies the Parser's Job
 Phrase structure
<E> ::= <E> + <T> | <T>
<T> ::= <T> * <F> | <F>
<F> ::= ( <E> ) | <id> | <v>
.................
 Lexical structure
<id> ::= <alphabet> | <alphabet><id-tail>
<id-tail> ::= <delimiter> | <alphabet> <id-tail> | <digit> <id-tail>
<delimiter> ::= <space> | <tab> | <end-of-line>
<alphabet> ::= a | b | c | ... | z
<digit> ::= 0 | 1 | 2 | ... | 9
<v> ::= <digit> | <digit><v>
...................
5 To Simplify the Parser's Job
program source: total = total + price * number * (1 - discount) * (1 + saletax)
 Scanner
token string: Id1 = Id1 + Id2 * Id3 * ( v1 - Id4 ) * ( v1 + Id5 )
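The scanner step on this slide can be sketched in Python. This is only an illustration under stated assumptions: the source statement is read as `total = total + price * number * (1 - discount) * (1 + saletax)` (the operators are a reconstruction), identifiers and values are numbered in order of first appearance, and the names `scan` and `TOKEN_RE` are hypothetical, not from the slides.

```python
import re

# One alternative per token class: identifier, integer value, or one-character operator.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[a-z][a-z0-9]*)|(?P<v>[0-9]+)|(?P<op>[=+\-*()]))")

def scan(source):
    """Turn a source string into a token string such as ['Id1', '=', 'Id1', ...]."""
    ids, values, tokens = {}, {}, []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        if m.group("id"):
            # Reuse the number of a known identifier, otherwise assign the next one.
            n = ids.setdefault(m.group("id"), len(ids) + 1)
            tokens.append(f"Id{n}")
        elif m.group("v"):
            n = values.setdefault(m.group("v"), len(values) + 1)
            tokens.append(f"v{n}")
        else:
            tokens.append(m.group("op"))
        pos = m.end()
    return tokens
```

With this sketch the parser only ever sees tokens like `Id1`, `v1`, `+`, and `(`, never individual letters, digits, or blanks.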
6 Regular Expressions, Regular Grammars (BNF)
(Rules to specify)
Lexical Analysis
Scanners, Finite Automata, Finite State Machines
(Formalism to recognize)
7 Context-Free Grammars (CFG)
(to specify)
Syntactic Analysis
Push-Down Automata (PDA)
(to recognize)
Theoretically possible, but may not be practically doable for compilers
Unambiguous CFG (to specify PL)
Deterministic PDA (DPDA) (to parse programs)
8 Σ: a finite alphabet (set of symbols)
 ε, the empty string, is a regular expression
 S is a regular expression if S ∈ Σ
 If S is a regular expression, so is S^i for i ∈ N
 If S is a regular expression, so are S* and S+
 If R and S are two regular expressions, so is RS
 If R and S are two regular expressions, so is R|S
9 Languages Specified by Regular Expressions
 Σ: a finite alphabet (set of symbols)
 ε: the empty string
 L(S): the set of sentences represented by regular expression S
Let S and R be two regular expressions.
 L(∅) = ∅
 L(ε) = {ε}
 If S ∈ Σ, then L(S) = {S}
 L(S)L(R) = { xy | x ∈ L(S) and y ∈ L(R) }
 L(SR) = L(S)L(R)
 L(S^0) = {ε}
 For n ≥ 1, L(S^n) = L(S S^(n-1))
 L(S+) = L(S) ∪ L(S^2) ∪ L(S^3) ∪ ....
 L(S*) = L(S^0) ∪ L(S) ∪ L(S^2) ∪ L(S^3) ∪ ....
 L(S|R) = L(S) ∪ L(R)
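The set definitions on this slide can be tried out on small finite languages. The following Python sketch assumes the usual reading of the rules (concatenation of languages, powers, and the star as a union of powers); the function names `concat`, `power`, and `star_up_to` are hypothetical, and the star is necessarily cut off at a chosen bound because L(S*) is infinite in general.

```python
def concat(ls, lr):
    """L(S)L(R) = { xy | x in L(S) and y in L(R) }."""
    return {x + y for x in ls for y in lr}

def power(ls, n):
    """L(S^0) = {""}; for n >= 1, L(S^n) = L(S)L(S^(n-1))."""
    result = {""}            # the empty string plays the role of epsilon
    for _ in range(n):
        result = concat(ls, result)
    return result

def star_up_to(ls, max_n):
    """Finite approximation of L(S*): the union of L(S^0) ... L(S^max_n)."""
    out = set()
    for n in range(max_n + 1):
        out |= power(ls, n)
    return out
```

For example, `star_up_to({"ab"}, 2)` yields the empty string, `ab`, and `abab`, which are the first three members of L((ab)*).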
10 Here we use regular expressions on the right-hand side of BNF
id ::= A(A|D)*
A ::= a | b | c | ... | z
D ::= 0 | 1 | 2 | ... | 9
L(id) = { a, ab, a234b, xyz, x5z9, li2, .... }
ab(a|ab)*(a|aa) is a regular expression
{ abaa, abaaaabaaba, abaaaaa, abababaa, .... }
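The listed sentences can be checked against the two expressions with Python's `re` module. This is a sketch under assumptions: the expressions are read as `id ::= A(A|D)*` and `ab(a|ab)*(a|aa)` (the operators are a reconstruction), A is taken to be all lowercase letters, and the helper name `in_language` is hypothetical.

```python
import re

ID = re.compile(r"[a-z][a-z0-9]*")        # id ::= A(A|D)*
EXAMPLE = re.compile(r"ab(a|ab)*(a|aa)")  # assumed reading of the second expression

def in_language(pattern, s):
    """True if the whole string s is matched by the regular expression."""
    return pattern.fullmatch(s) is not None
```

`fullmatch` is used rather than `match` because membership in L(S) concerns the entire string, not a prefix of it.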
11 id ::= A(A|D)*
A ::= a | b | c | ... | z
D ::= 0 | 1 | 2 | ... | 9
[Figure: finite automaton for id with states q0, q'0, q1, q'1; edges labelled A (a, b, ..., z) and A|D]
12 id ::= A(A|D)*
A ::= a | b | c | ... | z
D ::= 0 | 1 | 2 | ... | 9
[Figure: finite automaton for id with states q0 and q1; edges labelled A (a, b, ..., z) and A|D]
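A two-state automaton of this kind can be simulated directly. The sketch below assumes q0 is the start state, q1 is the only accepting state, a letter moves q0 to q1, and letters or digits loop on q1; the diagram itself is not recoverable from the slide text, so these transitions are an assumption, and `accepts_id` is a hypothetical name.

```python
import string

LETTERS = set(string.ascii_lowercase)   # A
DIGITS = set(string.digits)             # D

def accepts_id(s):
    """Simulate the assumed DFA for id ::= A(A|D)*."""
    state = "q0"
    for ch in s:
        if state == "q0" and ch in LETTERS:
            state = "q1"                 # first symbol must be a letter
        elif state == "q1" and (ch in LETTERS or ch in DIGITS):
            state = "q1"                 # stay in q1 on letters and digits
        else:
            return False                 # no transition defined: reject
    return state == "q1"
```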
13 DFA (Deterministic Finite Automata)
ab(a|ab)*(a|aa)
14 DFA (Deterministic Finite Automata)
{ ω | (n_a(ω) − n_b(ω)) mod 3 = 1 }
n_a(ω): the number of a's in ω
n_b(ω): the number of b's in ω
aaabaa (5 − 1 = 4), bbabbab (2 − 5 = −3), bbabbaba (3 − 5 = −2)
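A DFA for this language needs only three states, one per value of (n_a − n_b) mod 3. The Python sketch below assumes the condition reads "(n_a − n_b) mod 3 = 1", with state 0 as the start state and state 1 as the accepting state; the function name `accepts_mod3` is hypothetical.

```python
def accepts_mod3(w):
    """DFA with states 0, 1, 2 = (n_a - n_b) mod 3; start 0, accepting state 1."""
    state = 0
    for ch in w:
        if ch == "a":
            state = (state + 1) % 3      # one more a
        elif ch == "b":
            state = (state - 1) % 3      # one more b; Python's % keeps this in 0..2
        else:
            raise ValueError(f"symbol {ch!r} is not in the alphabet {{a, b}}")
    return state == 1
```

On the slide's strings, aaabaa (difference 4) and bbabbaba (difference −2) end in state 1 and are accepted, while bbabbab (difference −3) ends in state 0 and is rejected.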
15 NFA (Nondeterministic Finite Automata)
aa* | aba*b*
aaaaa, ababb
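Nondeterminism can be simulated by tracking the set of all states the automaton could be in. The sketch below assumes the slide's expression is aa* | aba*b*; the state names `s`, `p`, `q`, `r`, `t` and the transition table are hypothetical, since the original diagram is not recoverable from the text.

```python
# Transition relation of an assumed NFA for aa* | aba*b*.
DELTA = {
    ("s", "a"): {"p", "q"},   # nondeterministic choice between the two branches
    ("p", "a"): {"p"},        # branch aa*: further a's
    ("q", "b"): {"r"},        # branch aba*b*: the b after the first a
    ("r", "a"): {"r"},        # a*
    ("r", "b"): {"t"},        # first b of b*
    ("t", "b"): {"t"},        # remaining b's
}
ACCEPTING = {"p", "r", "t"}

def nfa_accepts(w):
    """Accept if at least one run of the NFA ends in an accepting state."""
    current = {"s"}
    for ch in w:
        # Follow every possible transition from every currently reachable state.
        current = set().union(*(DELTA.get((q, ch), set()) for q in current))
    return bool(current & ACCEPTING)
```

After reading the first `a`, the simulation holds both `p` and `q` at once, which is exactly the choice a deterministic machine cannot postpone without extra states.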
16 aa* | aba*b*
aaaaa, ababb
17 aa* | aba*b*
S → aA
S → aX
A → ε
A → aA
X → bY
Y → aY
Y → B
B → ε
B → bB
A Regular Grammar
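Because every production has at most one nonterminal, and it sits at the right end, the grammar can be run as a recognizer almost directly. The Python sketch below assumes the productions read S → aA | aX, A → ε | aA, X → bY, Y → aY | B, B → ε | bB; the dictionary `GRAMMAR` and the function `derives` are hypothetical names.

```python
# Right-linear grammar: uppercase letters are nonterminals, "" stands for epsilon.
GRAMMAR = {
    "S": ["aA", "aX"],
    "A": ["", "aA"],
    "X": ["bY"],
    "Y": ["aY", "B"],
    "B": ["", "bB"],
}

def derives(nonterminal, w):
    """True if `nonterminal` can derive exactly the string w."""
    for rhs in GRAMMAR[nonterminal]:
        if rhs == "":
            if w == "":
                return True
        elif rhs[-1].isupper():
            # Match the leading terminals, then continue with the trailing nonterminal.
            terminals, nxt = rhs[:-1], rhs[-1]
            if w.startswith(terminals) and derives(nxt, w[len(terminals):]):
                return True
        elif w == rhs:
            return True
    return False
```

Calling `derives("S", w)` tries each production in turn, which mirrors the nondeterministic choices of the corresponding NFA.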
18 DFAs have no memory to count
{ a^n b^n | n ≥ 0 } is not regular
aaaaaaaaaaaaaaaabbbbbbbbbbbbbb
S → ε
S → aSb
Context-Free Grammar
Not a right-linear grammar. Is there one? Or is there a left-linear grammar for this language? No.
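The context-free grammar translates into a short recursive recognizer. The sketch below assumes the productions are S → ε and S → aSb (the recursive production is read with S in the middle); `derives_anbn` is a hypothetical name. The recursion depth stands in for the unbounded counter that a finite automaton lacks.

```python
def derives_anbn(w):
    """Recognize { a^n b^n | n >= 0 } by following S -> epsilon | aSb."""
    if w == "":
        return True                       # S -> epsilon
    if len(w) >= 2 and w[0] == "a" and w[-1] == "b":
        return derives_anbn(w[1:-1])      # S -> aSb: strip one a and one b
    return False
```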