PLT 2 - PowerPoint PPT Presentation

Provided by: deptSc7

1
PLT 2
  • Grammar Review and Regular Expressions

2
Grammar Specification
  • Production Rules
  • S -> jSVR
  • S -> jVR
  • RV -> VR
  • jV -> jv
  • vV -> vv
  • vR -> vr
  • rR -> rr
  • R -> q
  • VT.
  • VN.
  • Note that applying the rules in certain orders
    leads to a dead end, from which no sentence of
    the language can be produced

3
Context Free Grammar
  • Context Free Grammars
  • Example
  • ({S}, {0, 1}, {S -> 0S1, S -> 01}, S)
  • S -> 0S1
  • S -> 01
  • Constraints on Production Rules
  • Start domain (left-hand side): a single unit of VN
  • A -> Ca
  • B -> cB
  • A -> λ
  • S -> anything
  • Non-reduced CFG (B never derives a terminal
    string; C is unreachable from S)
  • S -> AB
  • A -> a
  • B -> Bb
  • C -> c
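
The example grammar S -> 0S1, S -> 01 generates one or more 0s followed by the same number of 1s. As a rough illustration only, the productions can be mirrored by a small recursive recogniser in C. The function names `match_S` and `in_language` are hypothetical and not part of the slides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Try to match one S starting at s[*pos]; on success *pos is advanced
 * past it. Mirrors the two productions S -> 0 S 1 and S -> 0 1.
 * (match_S is a hypothetical helper name.) */
static bool match_S(const char *s, size_t *pos)
{
    if (s[*pos] != '0')
        return false;              /* both productions start with 0 */
    (*pos)++;
    if (s[*pos] == '0') {          /* another 0: must be S -> 0 S 1 */
        if (!match_S(s, pos))
            return false;
    }
    if (s[*pos] != '1')            /* both productions end with 1 */
        return false;
    (*pos)++;
    return true;
}

/* True when the whole string is a sentence of the grammar. */
bool in_language(const char *s)
{
    size_t pos = 0;
    return match_S(s, &pos) && s[pos] == '\0';
}
```

With this sketch, `in_language("000111")` holds while `in_language("001")` and `in_language("0101")` do not. The matching of each 0 with a later 1 is what the nesting in S -> 0S1 expresses.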

4
Variant Grammars
  • Regular Grammars
  • Example (Drunkard's Lament)
  • Some dirty swine stole my drink
  • A rotten pig stole my drink
  • Subset of CFGs
  • Constraints on Production Rules
  • CFG RHS Constraints
  • A -> aB
  • A -> a
  • A -> λ
  • Context-Sensitive Grammars
  • Constraints on Production Rules
  • Non-Terminals in Context
  • aAb -> aaBb

5
Context Free Grammar
  • <Lament> -> <Article> <AdjectivalStuff>
    <Noun> <Verb> my drink
  • <Article> -> Some
  • <Article> -> A
  • <Article> -> The
  • <AdjectivalStuff> -> <Adjective>
  • <AdjectivalStuff> -> <Adjective> <AdjectivalStuff>
  • <Adjective> -> rotten
  • <Adjective> -> lousy
  • <Adjective> -> dirty
  • <Noun> -> swine
  • <Noun> -> pig
  • <Noun> -> rat
  • <Verb> -> took
  • <Verb> -> swiped
  • <Verb> -> stole

6
Regular Grammar
  • <Lament> -> A <AdjectNounVerbObject>
  • <Lament> -> Some <AdjectNounVerbObject>
  • <Lament> -> The <AdjectNounVerbObject>
  • <AdjectNounVerbObject> -> lousy
    <AdjectNounVerbObject>
  • <AdjectNounVerbObject> -> rotten
    <AdjectNounVerbObject>
  • <AdjectNounVerbObject> -> stinking
    <AdjectNounVerbObject>
  • <AdjectNounVerbObject> -> <NounVerbObject>
  • <NounVerbObject> -> pig <VerbObject>
  • <NounVerbObject> -> rat <VerbObject>
  • <NounVerbObject> -> swine <VerbObject>
  • <VerbObject> -> stole <Object>
  • <VerbObject> -> took <Object>
  • <Object> -> my <drinkword>
  • <drinkword> -> drink
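
Each non-terminal of a regular grammar can be read as a state of a finite automaton, with each production consuming one word and moving to the next state. The following is a minimal C sketch of that reading for the regular grammar above. The state names, `step`, `one_of` and `is_lament` are hypothetical, and the input is assumed to be words separated by single spaces.

```c
#include <stdbool.h>
#include <string.h>

/* One state per non-terminal, plus DONE (accept) and FAIL (dead end). */
enum state { LAMENT, ADJ_NOUN_VERB_OBJ, VERB_OBJ, OBJECT, DRINKWORD, DONE, FAIL };

/* True when word w appears in the NULL-terminated list set. */
static bool one_of(const char *w, const char *const *set)
{
    for (; *set != NULL; set++)
        if (strcmp(w, *set) == 0)
            return true;
    return false;
}

/* Apply the production that consumes word w in state st. */
static enum state step(enum state st, const char *w)
{
    static const char *const articles[]   = { "A", "Some", "The", NULL };
    static const char *const adjectives[] = { "lousy", "rotten", "stinking", NULL };
    static const char *const nouns[]      = { "pig", "rat", "swine", NULL };
    static const char *const verbs[]      = { "stole", "took", NULL };

    switch (st) {
    case LAMENT:
        return one_of(w, articles) ? ADJ_NOUN_VERB_OBJ : FAIL;
    case ADJ_NOUN_VERB_OBJ:
        if (one_of(w, adjectives))
            return ADJ_NOUN_VERB_OBJ;               /* loop on adjectives */
        return one_of(w, nouns) ? VERB_OBJ : FAIL;  /* via <NounVerbObject> */
    case VERB_OBJ:
        return one_of(w, verbs) ? OBJECT : FAIL;
    case OBJECT:
        return strcmp(w, "my") == 0 ? DRINKWORD : FAIL;
    case DRINKWORD:
        return strcmp(w, "drink") == 0 ? DONE : FAIL;
    default:
        return FAIL;                                /* DONE or FAIL: no moves */
    }
}

/* True when sentence can be derived from <Lament>. */
bool is_lament(const char *sentence)
{
    char buf[256];
    if (strlen(sentence) >= sizeof buf)
        return false;                               /* too long for this sketch */
    strcpy(buf, sentence);

    enum state st = LAMENT;
    for (char *w = strtok(buf, " "); w != NULL; w = strtok(NULL, " "))
        st = step(st, w);
    return st == DONE;
}
```

Under these assumptions `is_lament("A rotten pig stole my drink")` holds. Because the regular grammar lists lousy, rotten and stinking as its adjectives, `Some dirty swine stole my drink` is rejected by this sketch even though the context-free version accepts it.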

7
Logical Architecture of Demo Compiler
8
Compiler Structure Tie In
  • Remember Compilers
  • Descriptive Grammar Role
  • ASCII Character Stream ->
  • Lexical Analyser -> (Regular Grammar)
  • Token Stream ->
  • Parser -> (CF Grammar)
  • Parse Tree ->
  • Code Generation ->
  • Object Code

9
Lexical Analysis
  • Identifying VT From Stream
  • for(x=1;x<10;x++)
  • for keyword
  • (
  • Variable named x
  • Assignment
  • Integer 1
  • Statement end
  • Variable named x
  • Less than
  • Integer 10
  • Statement end
  • Variable named x
  • ++ operator
  • )
  • )
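
The token list above can be reproduced with a small hand-written scanner. This is only a sketch: it assumes the statement being scanned is `for(x=1;x<10;x++)` (the reading that matches the thirteen tokens listed), and the `token_kind` names and the `tokenize` function are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds, one per terminal seen in the example. */
enum token_kind {
    T_FOR, T_LPAREN, T_RPAREN, T_IDENT, T_INT,
    T_ASSIGN, T_LESS, T_INCR, T_SEMI, T_UNKNOWN
};

/* Scan src and store up to max token kinds in out; returns the count stored. */
size_t tokenize(const char *src, enum token_kind *out, size_t max)
{
    size_t n = 0;
    const char *p = src;

    while (*p != '\0' && n < max) {
        unsigned char c = (unsigned char)*p;
        if (isspace(c)) {                      /* skip blanks between tokens */
            p++;
        } else if (isalpha(c)) {               /* keyword or variable name */
            const char *start = p;
            while (isalnum((unsigned char)*p))
                p++;
            size_t len = (size_t)(p - start);
            out[n++] = (len == 3 && strncmp(start, "for", 3) == 0)
                           ? T_FOR : T_IDENT;
        } else if (isdigit(c)) {               /* integer literal */
            while (isdigit((unsigned char)*p))
                p++;
            out[n++] = T_INT;
        } else if (p[0] == '+' && p[1] == '+') {
            out[n++] = T_INCR;                 /* two-character operator */
            p += 2;
        } else {                               /* single-character tokens */
            switch (*p++) {
            case '(': out[n++] = T_LPAREN;  break;
            case ')': out[n++] = T_RPAREN;  break;
            case '=': out[n++] = T_ASSIGN;  break;
            case '<': out[n++] = T_LESS;    break;
            case ';': out[n++] = T_SEMI;    break;
            default:  out[n++] = T_UNKNOWN; break;
            }
        }
    }
    return n;
}
```

The ad hoc `if` chain already hints at the difficulties raised on the next slide: each token class needs its own stopping rule, which is what regular expressions and finite state automata make precise.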

10
Formalism Needed Why?
  • Difficulties
  • How do you know when to stop?
  • Some tokens are difficult to define
  • String example
  • 1., .10 example
  • Requirements
  • Need Precise Definitions
  • Thus
  • Regular Expressions
  • Finite State Automata

11
Regular Expressions
  • Convenient means of specifying certain simple
    (though possibly infinite) sets of strings
  • Here's how it works
  • Vocabulary -> Reg Exp -> Set of strings
  • (a, b, ..., z) -> Reg Exp -> identifier
  • (a, b, ..., z) -> Reg Exp -> for
  • Generation / Recognition
  • f concat o -> fo
  • fo concat r -> for
  • Strings built from vocabulary via concatenation
  • Based on Regular Grammar
  • <FOR> -> f <ROF>
  • <ROF> -> o <LL>
  • <LL> -> r

12
Reg Expressions
  • Components
  • Small finite sets are represented by listing
    their elements: chars or lists of chars
  • Meta-characters: punctuation and operators that
    specify the structure of more complex sets
    (in regular expressions)
  • Concatenation
  • Alternation (or)
  • Kleene Closure (zero or more)
  • Each Regular Expression denotes a set of strings
  • ∅ is a regular expression denoting the empty
    set
  • λ is a regular expression denoting the set
    containing only the empty string
  • A string s is a regular expression denoting a set
    containing s only
  • If A and B are regular expressions then A | B, AB
    and A* are regular expressions

13
Reg Expressions and Tokens
  • 2 Additional Terms
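  • P+ (one or more strings from P concatenated)
  • Not(A) = V - A
  • Given
  • D = (0 | 1 | 2 | ... | 9) or [0-9]
  • L = (A | B | C | ... | Z) or [A-Z]
  • Literal
  • D+ . D+
  • C Line Comment
  • // Not(EOL)* EOL
  • Identifier
  • L (L | D)* (no underscores)
  • L (_ | L | D)* (with underscores)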
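
The two identifier definitions translate almost directly into code: one letter, then any number of letters or digits, optionally allowing underscores. A minimal C sketch follows. The function name `is_identifier` and its `allow_underscore` flag are hypothetical, and `isalpha` is assumed as a stand-in for L, so lower-case letters are accepted as well.

```c
#include <ctype.h>
#include <stdbool.h>

/* True when s has the shape L (L | D)*, or L (_ | L | D)* when
 * allow_underscore is set. An empty string is never an identifier. */
bool is_identifier(const char *s, bool allow_underscore)
{
    if (!isalpha((unsigned char)s[0]))
        return false;                          /* must start with a letter */

    for (const char *p = s + 1; *p != '\0'; p++) {
        bool ok = isalnum((unsigned char)*p)
                  || (allow_underscore && *p == '_');
        if (!ok)
            return false;                      /* outside (L | D), or (_ | L | D) */
    }
    return true;
}
```

The loop is the Kleene closure: it may run zero times, so a single letter such as `x` is already a valid identifier.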

14
Finite State Automata
  • Recognise tokens Specified by Regular
    Expressions
  • Consists Of
  • A Finite Set of States
  • Set of Transitions Between States
  • Special Start State
  • Final / Acceptance State
  • Transition Diagram Representation
  • Deterministic FSA
  • Drivers of Scanners
  • By Transition Table

15
Deterministic FSA for C Comment (Transition Table)
16
Physical Architecture
17
Scanner Driver (with transition table T)
  • state = initial_state;
  • cur_char = getchar();
  • while (true) {
  •     next_state = T[state][cur_char];  /* T needs a column for EOF */
  •     if (next_state == ERROR)
  •         break;                        /* no transition: stop scanning */
  •     state = next_state;
  •     if (cur_char == EOF)
  •         break;
  •     cur_char = getchar();
  • }
  • if (is_final_state(state))
  •     return (valid_token);
  • else
  •     return (not_valid_token);
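
To make the driver concrete, here is a self-contained C sketch that runs the same loop over a transition table for the C line comment `// Not(EOL)* EOL`. The table, the three character classes and the state numbering are assumptions made for this illustration (the transition table from the slides did not survive in this transcript), and the input is taken from a string rather than `getchar()` so the function can be called directly.

```c
#include <stddef.h>
#include <stdbool.h>

enum { ERROR = -1, NUM_STATES = 4, NUM_CLASSES = 3 };
enum char_class { C_SLASH, C_EOL, C_OTHER };

/* Assumed transition table: rows are states, columns are character classes. */
static const int T[NUM_STATES][NUM_CLASSES] = {
    /*                         '/'    EOL    other */
    /* 0: start           */ { 1,     ERROR, ERROR },
    /* 1: seen one '/'    */ { 2,     ERROR, ERROR },
    /* 2: inside comment  */ { 2,     3,     2     },
    /* 3: seen EOL, final */ { ERROR, ERROR, ERROR },
};

static enum char_class classify(char c)
{
    if (c == '/')
        return C_SLASH;
    if (c == '\n')
        return C_EOL;
    return C_OTHER;
}

static bool is_final_state(int state)
{
    return state == 3;
}

/* Returns the number of characters in the line comment at the start of
 * input, or 0 when input does not start with a complete line comment. */
size_t scan_line_comment(const char *input)
{
    int state = 0;
    size_t consumed = 0;

    while (input[consumed] != '\0') {
        int next_state = T[state][classify(input[consumed])];
        if (next_state == ERROR)
            break;                 /* no transition: stop, as in the driver */
        state = next_state;
        consumed++;
    }
    return is_final_state(state) ? consumed : 0;
}
```

As in the driver above, scanning stops at the first character with no transition, and the token is valid only if the automaton stopped in a final state. For the input `"// hi\nint x;"` the sketch consumes the six characters of the comment and leaves the rest for the next token.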

18
Some Examples Flex
  • Literal
  • [0-9]
  • [0-9]+
  • Literal with Optional minus
  • -?[0-9]+
  • Identifier
  • [a-zA-Z][a-zA-Z0-9]*
  • Comment
  • \/\/.*\n
  • Matching Quoted Strings
  • \"[^"\n]*["\n]

19
Minimal Flex Specification
  • /*
  •  * this sample demonstrates (very) simple
     * recognition: a verb / not a verb
  •  */
  • %%
  • [\t ]+    /* ignore whitespace */ ;
  • is |
  • am |
  • are |
  • were |
  • was |
  • be    { printf("%s is a verb\n", yytext); }
  • [a-zA-Z]+    { printf("%s is not a verb\n", yytext); }
  • .|\n    { ECHO; /* normal default anyway */ }
  • %%

20
Description
  • Definition Section
  • Legal C Code
  • For Inclusion into final C program
  • Rules Section
  • Rules
  • Reg. Exp. Recognition Rule
  • Action Taken
  • Delimiters
  • Matching Order
  • Each part of the input is matched only once
  • Longest possible match wins
  • If two matches are the same length, the rule
    listed first in the specification wins
  • User Subroutines Section
  • Legal C Code
  • Again for inclusion
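
The matching order can be sketched in a few lines of C: try every rule at the current position, keep the longest match, and on a tie keep the rule that comes first. This only illustrates the policy, not how Flex implements it (Flex compiles all rules into a single automaton). The names `matcher`, `match_is`, `match_word` and `pick_rule` are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A rule is modelled as a function returning the length of its match
 * at the start of s, or 0 when it does not match. */
typedef size_t (*matcher)(const char *s);

/* Rule for the literal pattern: is */
size_t match_is(const char *s)
{
    return strncmp(s, "is", 2) == 0 ? 2 : 0;
}

/* Rule for the pattern: [a-zA-Z]+ */
size_t match_word(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns the index of the winning rule, or -1 when no rule matches;
 * the length of the winning match is stored in *len. */
int pick_rule(const char *s, const matcher *rules, int nrules, size_t *len)
{
    int best = -1;
    *len = 0;
    for (int i = 0; i < nrules; i++) {
        size_t n = rules[i](s);
        if (n > *len) {            /* strictly longer only: ties keep the earlier rule */
            *len = n;
            best = i;
        }
    }
    return best;
}
```

With the rules ordered as `{ match_is, match_word }`, the input `is` matches both rules with length 2 and the first rule wins, while `island` goes to the second rule because its match is longer. This is why the verb rules are listed before the general word rule in the specification.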

21
Getting It Working
  • Write specification lex file
  • Name it firsteg.l
  • flex firsteg.l
  • cc lex.yy.c -o first -ll
  • Run it
  • first
  • Standard input is now analysed
  • The cat is a dog
  • The is not a verb
  • cat is not a verb
  • is is a verb
  • a is not a verb
  • dog is not a verb

22
Flex Punctuation
  • * (0, 1 or more occurrences of prev. expression)
  • + (1 or more occurrences of prev. expression)
  • ? (0 or 1 occurrences of previous expression)
  • . (any single character except newline char)
  • [] (any character within the brackets)
  • [0-9], [a-z], [A-Z], [a-zA-Z]
  • [^] (any character except those within brackets
    after ^)
  • {} (how many times prev. expression is allowed to
    match)
  • [a-z]{1,3}
  • \ (used to escape meta-characters)
  • | (or)
  • "..." (matches everything in the quotation marks
    literally)
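
Most of the operators above behave the same way in POSIX extended regular expressions, so they can be tried out from plain C without running Flex. The sketch below assumes a POSIX system that provides `<regex.h>`. The helper name `full_match` is hypothetical, and the `"..."` quoting form is specific to Flex and is not covered.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* True when the whole of text matches the POSIX extended regular
 * expression pattern. The pattern is wrapped in ^( )$ so that a match
 * on only part of the text does not count. */
bool full_match(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;

    int n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return false;                      /* pattern too long for this sketch */
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                      /* pattern did not compile */

    bool ok = regexec(&re, text, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

For example, `full_match("-?[0-9]+", "-42")` and `full_match("[a-z]{1,3}", "abc")` hold, while `full_match("[a-z]{1,3}", "abcd")` and `full_match("[^0-9]", "7")` do not.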