CPSC 325 - Compiler - PowerPoint PPT Presentation

Slides: 15
Provided by: Jer962
1
CPSC 325 - Compiler
  • Tutorial 2
  • Scanner and Lex

2
Tokens
Input
  • Token stream: each significant lexical chunk of
    the program is represented by a token
  • Operators and punctuation: ! -
  • Keywords: if while return goto
  • Identifiers: id plus the actual name
  • Constants: kind plus value (int, floating-point,
    character, string, ...)

3
Token example 1
  • Input text
  • if( x >= y ) y = 10;
  • Token Stream

IF LP ID(x) GEQ ID(y) RP ID(y) Assign INT(10) SEMI
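The mapping from input text to a token stream can be sketched with a small regular-expression tokenizer. This is only an illustration in Python, not the Lex scanner the tutorial builds: the `Token` class, the `tokenize` function and the pattern table are hypothetical names, and the token kinds simply mirror the ones on this slide.

```python
import re
from typing import NamedTuple, Optional


class Token(NamedTuple):
    kind: str                     # e.g. "IF", "ID", "INT", "GEQ"
    value: Optional[str] = None   # only identifiers and constants carry a value

    def __str__(self) -> str:
        return self.kind if self.value is None else f"{self.kind}({self.value})"


KEYWORDS = {"if": "IF", "while": "WHILE", "return": "RETURN", "goto": "GOTO"}

# Order matters: Python's "|" takes the first alternative that matches,
# so ">=" is listed before "=".
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<INT>[0-9]+)
  | (?P<ID>[A-Za-z]+)
  | (?P<GEQ>>=)
  | (?P<Assign>=)
  | (?P<LP>\()
  | (?P<RP>\))
  | (?P<SEMI>;)
""", re.VERBOSE)


def tokenize(text: str) -> list:
    """Turn input text into a list of Token objects, skipping white space."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "WS":
            continue
        if kind == "ID" and lexeme in KEYWORDS:
            tokens.append(Token(KEYWORDS[lexeme]))   # keyword, no value
        elif kind in ("ID", "INT"):
            tokens.append(Token(kind, lexeme))       # keep the name or number
        else:
            tokens.append(Token(kind))
    return tokens


print(" ".join(str(t) for t in tokenize("if( x >= y ) y = 10;")))
```

Keywords are recognized by first matching an identifier and then looking the lexeme up in a table, which keeps the pattern list short.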
4
Parser
  • Tokens

IF LP ID(x) GEQ ID(y) RP ID(y) Assign INT(10) SEMI

IfStmt
  >=
    ID(x)
    ID(y)
  assign
    ID(y)
    INT(10)
5
Sample Grammar
  • program ::= statement | program statement
  • statement ::= assignStmt | ifStmt
  • assignStmt ::= id = expr ;
  • ifStmt ::= if ( expr ) statement
  • expr ::= id | int | expr + expr
  • id ::= a | b | ... | y | z
  • int ::= 1 | 2 | ... | 9 | 0
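A grammar of this shape can be turned into a parser almost rule by rule. The following is a minimal recursive-descent sketch in Python, not the Bison/yacc approach the course uses. Tokens are assumed to be `(kind, value)` tuples; the function names and the tuple-based tree are hypothetical. The set of binary operators is also an assumption: `PLUS` for the operator in the expr rule, and `GEQ` added so that the earlier `if` example parses.

```python
BINARY_OPS = {"PLUS", "GEQ"}  # assumed operator tokens, see the note above


def expect(tokens, pos, kind):
    """Check that tokens[pos] has the given kind and step past it."""
    if pos >= len(tokens) or tokens[pos][0] != kind:
        found = tokens[pos][0] if pos < len(tokens) else "end of input"
        raise SyntaxError(f"expected {kind}, found {found}")
    return pos + 1


def parse_program(tokens):
    """program: one or more statements (the left recursion becomes a loop)."""
    statements, pos = [], 0
    while pos < len(tokens):
        stmt, pos = parse_statement(tokens, pos)
        statements.append(stmt)
    return statements


def parse_statement(tokens, pos):
    """statement: ifStmt or assignStmt, chosen by looking at the next token."""
    if pos < len(tokens) and tokens[pos][0] == "IF":
        pos = expect(tokens, pos, "IF")
        pos = expect(tokens, pos, "LP")
        cond, pos = parse_expr(tokens, pos)
        pos = expect(tokens, pos, "RP")
        body, pos = parse_statement(tokens, pos)
        return ("IfStmt", cond, body), pos
    # assignStmt: id = expr ;
    if pos >= len(tokens) or tokens[pos][0] != "ID":
        raise SyntaxError("expected a statement")
    target = tokens[pos]
    pos = expect(tokens, pos + 1, "Assign")
    value, pos = parse_expr(tokens, pos)
    pos = expect(tokens, pos, "SEMI")
    return ("assign", target, value), pos


def parse_operand(tokens, pos):
    """An id or int leaf."""
    if pos < len(tokens) and tokens[pos][0] in ("ID", "INT"):
        return tokens[pos], pos + 1
    raise SyntaxError("expected an identifier or integer")


def parse_expr(tokens, pos):
    """expr: operands joined by binary operators, grouped left to right."""
    left, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos][0] in BINARY_OPS:
        op = tokens[pos][0]
        right, pos = parse_operand(tokens, pos + 1)
        left = (op, left, right)
    return left, pos


example = [("IF", None), ("LP", None), ("ID", "x"), ("GEQ", None), ("ID", "y"),
           ("RP", None), ("ID", "y"), ("Assign", None), ("INT", "10"), ("SEMI", None)]
print(parse_program(example))
```

The resulting tuple has the same shape as the tree on the Parser slide: an `IfStmt` node whose children are the comparison and the assignment.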

6
Why Separate the Scanner and Parser?
  • Simplicity: separation of concerns
  • Scanner hides details from the parser (comments,
    whitespace, input files, etc.)
  • Parser is easier to build; it has a simpler input
    stream
  • Efficiency
  • Scanner can use a simpler, faster design
  • (But it still often consumes a surprising amount of
    the compiler's total execution time)

7
Principle of Longest Match
  • In most languages, the scanner should pick the
    longest possible string to make up the next token
    if there is a choice.
  • Example
  • return apple != banana;
  • Should be recognized as 5 tokens
  • Not more (not parts of words or identifiers, and
    not ! and = as separate tokens)

return ID(apple) NEQ ID(banana) SEMI
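The longest-match rule can be sketched by trying every token pattern at the current position and keeping the longest result. This Python sketch only illustrates the rule; it is not how Lex is implemented internally. The pattern table and function names are hypothetical, and ties are broken in favour of the pattern listed first.

```python
import re

# A hypothetical, minimal pattern set for the example on this slide.
PATTERNS = [
    ("NOT", re.compile(r"!")),
    ("NEQ", re.compile(r"!=")),
    ("Assign", re.compile(r"=")),
    ("SEMI", re.compile(r";")),
    ("RETURN", re.compile(r"return")),
    ("ID", re.compile(r"[A-Za-z]+")),
]


def next_token(text, pos):
    """Try every pattern at pos and keep the longest match.

    On a tie the earlier pattern wins, so the keyword beats ID for "return".
    """
    best_kind, best_len = None, 0
    for kind, pattern in PATTERNS:
        m = pattern.match(text, pos)
        if m and len(m.group()) > best_len:
            best_kind, best_len = kind, len(m.group())
    if best_kind is None:
        raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
    return best_kind, text[pos:pos + best_len]


def scan(text):
    """Split text into (kind, lexeme) pairs, skipping white space."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        kind, lexeme = next_token(text, pos)
        tokens.append((kind, lexeme))
        pos += len(lexeme)
    return tokens


print(scan("return apple != banana;"))
```

With this rule `!=` becomes a single NEQ token rather than NOT followed by Assign, and an identifier such as `returns` is not split into the keyword `return` plus `s`.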
8
Scanner DFA Example (1)
  • State 0 (start): stays in 0 on white space or comments
  • 0 -> 1 on end of input: Accept EOF
  • 0 -> 2 on (: Accept LP
  • 0 -> 3 on ): Accept RP
  • 0 -> 4 on ;: Accept SEMI
9
Scanner DFA Example (2)
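  • State 0 (start): stays in 0 on white space or comments
  • 0 -> 5 on !
  • 5 -> 6 on =: Accept NEQ
  • 5 -> 7 on other: Accept NOT
  • 0 -> 8 on <
  • 8 -> 9 on =: Accept LEQ
  • 8 -> 10 on other: Accept LESS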
10
Scanner DFA Example (3)
  • State 0 (start): stays in 0 on white space or comments
  • 0 -> 11 on 0-9
  • 11 -> 11 on 0-9
  • 11 -> 12 on other: Accept INT
11
Scanner DFA Example (4)
  • State 0 (start): stays in 0 on white space or comments
  • 0 -> 13 on a-zA-Z
  • 13 -> 13 on a-zA-Z
  • 13 -> 14 on other: Accept ID or keyword
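The four DFA fragments can be combined into one table-driven scanner. The Python sketch below uses the state numbers from the diagrams; everything else is an assumption made for illustration: the names `TRANSITIONS`, `OTHER`, `ACCEPT`, `char_class` and `scan`, the `=` and `;` edge labels, and the choice that "other" edges do not consume the character (so it is rescanned as the start of the next token). Comments are not handled here, only white space.

```python
KEYWORDS = {"if", "while", "return", "goto"}

# (state, character class) -> next state; these edges consume the character.
TRANSITIONS = {
    (0, "ws"): 0, (0, "eof"): 1, (0, "("): 2, (0, ")"): 3, (0, ";"): 4,
    (0, "!"): 5, (5, "="): 6,
    (0, "<"): 8, (8, "="): 9,
    (0, "digit"): 11, (11, "digit"): 11,
    (0, "letter"): 13, (13, "letter"): 13,
}

# "other" edges: taken when no labelled edge applies, without consuming input.
OTHER = {5: 7, 8: 10, 11: 12, 13: 14}

ACCEPT = {1: "EOF", 2: "LP", 3: "RP", 4: "SEMI", 6: "NEQ", 7: "NOT",
          9: "LEQ", 10: "LESS", 12: "INT", 14: "ID"}


def char_class(ch):
    """Map a character to the label used on the DFA edges."""
    if ch == "":
        return "eof"
    if ch.isspace():
        return "ws"
    if "0" <= ch <= "9":
        return "digit"
    if ch.isascii() and ch.isalpha():
        return "letter"
    return ch


def scan(text):
    """Run the DFA repeatedly, returning (kind, lexeme) pairs ending with EOF."""
    tokens, pos = [], 0
    while True:
        state, start = 0, pos
        while state not in ACCEPT:
            ch = text[pos] if pos < len(text) else ""
            key = (state, char_class(ch))
            if key in TRANSITIONS:
                state = TRANSITIONS[key]
                pos += 1                 # consume the character
                if state == 0:
                    start = pos          # white space is not part of a lexeme
            elif state in OTHER:
                state = OTHER[state]     # leave the character for the next token
            else:
                raise ValueError(f"unexpected {ch!r} at offset {pos}")
        lexeme = text[start:pos]
        kind = ACCEPT[state]
        if kind == "ID" and lexeme in KEYWORDS:
            kind = lexeme.upper()        # state 14: ID or keyword
        tokens.append((kind, lexeme))
        if kind == "EOF":
            return tokens


print(scan("if (x <= 10) !y;"))
```

Characters that none of the four diagrams cover, such as a lone `=`, raise an error in this sketch.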
12
Lex/Flex
  • Use Flex instead of Lex
  • Use Bison instead of yacc
  • When compiling, link to the library
    (-ll; -lfl where Flex provides its own library)
  • flex file.lex
  • gcc -o object lex.yy.c -ll
  • ./object

13
Lex - Structure
  • Declarations/Definitions
  • Rules/Productions, each made up of
  • - Lex expression
  • - white space
  • - C statement (optional)
  • Additional Code/Subroutines
  • (The three sections are separated by lines
    containing only %%)

14
Lex Basic operators
  • * - zero or more occurrences
  • . - ANY character (except newline)
  • .* - matches any sequence
  • | - separator
  • + - one or more occurrences. (a+ is the same as aa*)
  • ? - zero or one of something. (b? is b or nothing)
  • [ ] - choice, so [12345] is the same as (1|2|3|4|5)
  • (Note: [*+] represents a choice between
    star and
  • plus. Inside brackets they lose their
    special meaning.)
  • - - range: [a-zA-Z] is a to z and A to Z, all the
    letters.
  • \ - escape: \* matches *, and \. matches a period or
    decimal point.
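These operators can be tried out quickly with Python's `re` module, which happens to use the same notation for the cases listed on this slide. This is only a convenience for experimenting. Lex has further features (quoted strings, named definitions, trailing context) that `re` does not share, and the `matches` helper and the example table are made up for this sketch.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None


# (pattern, text) pairs covering the operators listed on this slide.
EXAMPLES = [
    (r"a*", ""),            # * allows zero occurrences
    (r"a*", "aaa"),
    (r"a+", ""),            # + needs at least one occurrence
    (r"a+", "aa"),
    (r"b?", "b"),           # ? allows zero or one
    (r"b?", "bb"),
    (r".", "x"),            # . matches any single character
    (r".", "\n"),           # ... except a newline
    (r".*", "any text"),
    (r"1|2|3|4|5", "3"),    # | separates alternatives
    (r"[12345]", "3"),      # same choice written with brackets
    (r"[*+]", "*"),         # inside brackets * and + are ordinary characters
    (r"[a-zA-Z]", "q"),     # - gives a range inside brackets
    (r"[a-zA-Z]", "7"),
    (r"\*", "*"),           # \ removes the special meaning
    (r"\.", "."),
    (r"\.", "x"),
]

for pattern, text in EXAMPLES:
    print(f"{pattern!r:14} {text!r:12} {matches(pattern, text)}")
```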