CS 2130 - PowerPoint PPT Presentation

Provided by: ble87
1
CS 2130
  • Presentation 18
  • Tools
  • Lex

2
Tools
  • An important skill for a computer scientist is
    knowing when to code the solution to a problem
    and when to use a tool
  • It wasn't always this way
  • A key to using tools is getting past the learning
    curve
  • Some of the most useful tools are those which
    produce usable code

3
lex
  • lex is a lexical analyzer generator
  • It doesn't do lexical analysis. It writes code
    that will perform lexical analysis for you
  • A program which can perform lexical analysis is
    useful.
  • A program that can generate code for a custom
    lexical analyzer that can be embedded into an
    application you are creating is a gem

4
Why learn lex?
  • For some it will be a useful program that they
    will use over and over
  • Others may use it but infrequently
  • It appears in want ads!!!
  • http://www.appforge.com/corp/careers/sfw_engineer.html
  • The concepts involved with learning lex get at
    the core material for this course. Even if you
    never use lex again, knowledge of its operation
    will help you to better understand the
    translation process

5
More Info?
lex & yacc, 2nd Edition
By John Levine, Tony Mason & Doug Brown
2nd Edition, October 1992
ISBN 1-56592-000-7, Order Number 0007
386 pages, $29.95
http://www.oreilly.com/catalog/lex/
Note: Some code taken from O'Reilly website
6
Basic lex
7
lex process
  • Create a specification in a file called scan.l
  • Note: "scan" is an arbitrary name here
  • lex processes scan.l and produces lex.yy.c
  • lex.yy.c contains a function called yylex()
  • lex scan.l
  • cc -c lex.yy.c (produces lex.yy.o)
  • Now this object file can be linked with files
    that call yylex.

8
lex file format
  • <Definitions>
  • ...
  • %%
  • <Rules>
  • ...
  • %%
  • <Supplementary code>
  • ...

Definitions: includes, defines, RegExps
Rules: Pattern/Action Pairs
    <pattern1> <action1>
    <pattern2> <action2>
Supplementary code: Additional code (Not always needed)
9
lex
  • Text not matched is echoed as read
  • Thus, there is an implied ECHO
  • Which can be suppressed. How?
  • Lex patterns only match a given input character
    or string once
  • Lex executes the action for the longest possible
    match for the current input.

10
A Simple Example
  • %{
  • enum { ONE=1, TWO, THREE, IDENT, ENDFILE };
  • %}
  • one_a (a|A)
  • two_as (aa|AA)
  • three_as (aaa|AAA)
  • ident [A-Za-z_][0-9A-Za-z_]*
  • ignore (.|\n)
  • %%
  • <<EOF>> { return ENDFILE; }
  • {one_a} { return ONE; }
  • {two_as} { return TWO; }
  • {three_as} { return THREE; }
  • {ident} { return IDENT; }
  • {ignore} ;
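
To see what the generated scanner has to decide for this specification, here is a hand-written sketch in plain C. It is not the code lex produces; next_token is a hypothetical stand-in for yylex() that works on a string instead of yyin. It assumes the usual lex behaviour: the longest match wins, and when two rules match the same length the rule listed first wins.

```c
#include <ctype.h>
#include <string.h>

enum { ONE = 1, TWO, THREE, IDENT, ENDFILE };

/* Hypothetical stand-in for yylex(): scans the string at *p, advances *p
   past the lexeme, and returns the token code for it. */
int next_token(const char **p)
{
    for (;;) {
        const char *s = *p;
        size_t len;

        if (*s == '\0')
            return ENDFILE;                 /* the <<EOF>> rule */

        /* ident: [A-Za-z_][0-9A-Za-z_]*. Any text matched by one_a, two_as
           or three_as is a prefix of this run, so ident is never shorter. */
        if (isalpha((unsigned char)*s) || *s == '_') {
            len = 1;
            while (isalnum((unsigned char)s[len]) || s[len] == '_')
                len++;
            *p = s + len;

            /* Equal-length ties go to the rule that appears first. */
            if (len == 1 && (s[0] == 'a' || s[0] == 'A'))
                return ONE;
            if (len == 2 && (strncmp(s, "aa", 2) == 0 || strncmp(s, "AA", 2) == 0))
                return TWO;
            if (len == 3 && (strncmp(s, "aaa", 3) == 0 || strncmp(s, "AAA", 3) == 0))
                return THREE;
            return IDENT;
        }

        *p = s + 1;                         /* the ignore rule: drop one char */
    }
}
```

Called repeatedly on "a aa aaa aaaa", the sketch returns ONE, TWO, THREE, IDENT and then ENDFILE: "aaaa" is an IDENT because ident matches four characters while three_as matches only three.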

11
Build it
  • Put this code in a file called
  • scan.l
  • Run lex
  • lex scan.l
  • The output will be
  • lex.yy.c
  • Which can be compiled
  • gcc -c lex.yy.c
  • Which produces
  • lex.yy.o

12
Write a C program
  • /* Includes not shown */
  • enum { ONE=1, TWO, THREE, IDENT, ENDFILE };
  • int main(void) {
  •   int token;
  •   int i;
  •   for(i=0; i<10; i++) {
  •     token = yylex();
  •     if(token == ENDFILE) {
  •       PRINTF(("Goodbye!\n"));
  •       return EXIT_SUCCESS;
  •     }
  •     PRINTF(("Token %d\n", token));
  •   }
  •   return EXIT_SUCCESS;
  • }

13
Put it all together
  • gcc -o tester lex.yy.o tester.c -lfl

14
Warning
  • lex (and flex) as well as yacc contain more bells
    and whistles than you can shake a stick at...
  • Make sure that you understand the basic
    functionality before attempting advanced
    projects!!!

15
The Simplest Lex Program

  • %%

Put this code in a file called scan.l
Run lex: lex scan.l
Compile: gcc lex.yy.c -ll
Run by typing a.out or a.out < somefile.txt
The plain a.out form will read from stdin. To terminate, type ctrl/d
16
Why does this work?
  • Bells and whistles?

17
The Simplest Lex Program
  • %%
  • .|\n ECHO;

Put this code in a file called scan.l
Run lex: lex scan.l
Compile: gcc lex.yy.c -ll
Run by typing a.out or a.out < somefile.txt
By default this rule exists!
The plain a.out form will read from stdin. To terminate, type ctrl/d
18
The Simplest Lex Program
  • %%
  • .|\n ECHO;
  • %%
  • int main() {
  •   yylex();
  •   return 0;
  • }

We got this by default!
PLUS
19
lex example
A definition involving a regular expression
  • wspc [ \t\n]+
  • %%
  • {wspc} output( ' ' );

Reduce all whitespace to a single space.
Note: This works on acme. On linux boxes, e.g. helsinki, substitute putc(' ', yyout) for output( ' ' ).
Note the curlies, meaning: substitute the definition of wspc here.
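
For comparison, the same transformation written by hand in C might look like the sketch below. The function name squeeze_whitespace and the string-buffer interface are assumptions made for illustration; the lex version reads yyin and writes yyout instead.

```c
#include <stddef.h>

/* Copies src to dst, replacing each run of spaces, tabs and newlines
   (the wspc definition) with a single space. dst has room for n bytes. */
void squeeze_whitespace(const char *src, char *dst, size_t n)
{
    size_t j = 0;
    int in_run = 0;              /* nonzero while inside a whitespace run */

    if (n == 0)
        return;
    for (; *src != '\0' && j + 1 < n; src++) {
        if (*src == ' ' || *src == '\t' || *src == '\n') {
            if (!in_run)
                dst[j++] = ' ';  /* first char of the run: emit one space */
            in_run = 1;
        } else {
            dst[j++] = *src;     /* unmatched text is copied as read */
            in_run = 0;
        }
    }
    dst[j] = '\0';
}
```

The in_run flag does by hand what the + in the wspc definition does for free: it makes the whole run count as one match.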
20
lex example
  • %{
  • #include <stdio.h>
  • #include <ctype.h>
  • %}
  • word [-'A-Za-z]+
  • %%
  • {word} printf("%c%s", toupper(*yytext),
  •                yytext+1);
How to include c code
21
lex example
  • %{
  • #include <stdio.h>
  • #include <ctype.h>
  • %}
  • word [-'A-Za-z]+
  • %%
  • {word} printf("%c%s", toupper(*yytext),
  •                yytext+1);

yytext is an internal variable containing the text
of the word matched.
Capitalize the first letter of each word, leaving the
remainder of the text unchanged: "the 777 hits"
becomes "The 777 Hits"
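
A rough hand-coded equivalent, for comparison with the two-line lex rule. The names is_word_char and capitalize_words are hypothetical, and the sketch edits a string in place rather than reading yyin.

```c
#include <ctype.h>

/* Returns nonzero if c belongs to the word class [-'A-Za-z]. */
int is_word_char(int c)
{
    return c == '-' || c == '\'' || isalpha((unsigned char)c);
}

/* Upper-cases the first character of every word in s, in place.
   Characters outside the word class are left alone (the implied ECHO). */
void capitalize_words(char *s)
{
    int in_word = 0;

    for (; *s != '\0'; s++) {
        if (is_word_char(*s)) {
            if (!in_word)
                *s = (char)toupper((unsigned char)*s);
            in_word = 1;
        } else {
            in_word = 0;
        }
    }
}
```

Applied to "the 777 hits" this gives "The 777 Hits", as on the slide: the digits are not in the word class, so they pass through untouched.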
22
To be more specific...
  • If you don't specify a main you get one for
    free!!!
  • If you call yylex it will start scanning the
    appropriate input and as it recognizes rules do
    the specified action
  • Example
  • %%
  • AAA printf("<Found 3 A's>");
  • AA printf("<Found 2 A's>");
  • Given AAAAAAAA
  • Will print
  • <Found 3 A's><Found 3 A's><Found 2 A's>
  • The scanning continues unless a value is returned!
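
The AAA/AA example can be traced with a small hand-written sketch. scan_as is a hypothetical helper, not generated code: at each position it tries the longer literal first, which is all the longest-match rule amounts to for these two patterns, and it copies anything else through unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Scans in the way the two rules would and writes the result to out
   (capacity n). Unmatched characters are echoed as read. */
void scan_as(const char *in, char *out, size_t n)
{
    size_t used = 0;

    if (n == 0)
        return;
    out[0] = '\0';
    while (*in != '\0') {
        int w;

        if (strncmp(in, "AAA", 3) == 0) {          /* longest match first */
            w = snprintf(out + used, n - used, "<Found 3 A's>");
            in += 3;
        } else if (strncmp(in, "AA", 2) == 0) {
            w = snprintf(out + used, n - used, "<Found 2 A's>");
            in += 2;
        } else {
            w = snprintf(out + used, n - used, "%c", *in);
            in += 1;
        }
        if (w < 0 || (size_t)w >= n - used)
            break;                                 /* output buffer is full */
        used += (size_t)w;
    }
}
```

For the input AAAAAAAA the eight characters are consumed as 3 + 3 + 2, which gives the output shown on the slide.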

23
lex example
  • %{
  • #include <stdio.h>
  • static int lineno = 0;
  • %}
  • %%
  • [^\n]*\n { printf( "%5d ", ++lineno ); ECHO; }

Print out file with line numbers
24
another way
  • %{
  • #include <stdio.h>
  • static int lineno = 0;
  • %}
  • line [^\n]*\n
  • %%
  • {line} printf( "%5d %s", ++lineno, yytext );

Print out file with line numbers
25
Or
  • %{
  • #include <stdio.h>
  • static int lineno = 0;
  • %}
  • line .*\n
  • %%
  • {line} printf( "%5d %s", ++lineno, yytext );

Print out file with line numbers
26
Or even
  • %{
  • #include <stdio.h>
  • static int lineno = 0;
  • %}
  • line .*\n
  • %%
  • {line} printf("/* %5d */ %s", ++lineno, yytext );

Print out file with line numbers commented for C
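
As a cross-check on the line-numbering rules, here is a hand-written sketch of the plain "%5d %s" variant. The function number_lines and its buffer interface are assumptions for illustration. Like the patterns above, it only numbers lines that end in a newline; a final unterminated line is copied through unnumbered, which is what the implied ECHO would do.

```c
#include <stdio.h>
#include <string.h>

/* Writes text to out (capacity n), prefixing each newline-terminated
   line with a 5-wide line number and a space. */
void number_lines(const char *text, char *out, size_t n)
{
    int lineno = 0;
    size_t used = 0;

    if (n == 0)
        return;
    out[0] = '\0';
    while (*text != '\0') {
        const char *nl = strchr(text, '\n');
        int w;

        if (nl != NULL) {
            int len = (int)(nl - text) + 1;        /* line including '\n' */
            w = snprintf(out + used, n - used, "%5d %.*s",
                         ++lineno, len, text);
            text += len;
        } else {
            /* No trailing newline: the line pattern cannot match. */
            w = snprintf(out + used, n - used, "%s", text);
            text += strlen(text);
        }
        if (w < 0 || (size_t)w >= n - used)
            break;                                 /* output buffer is full */
        used += (size_t)w;
    }
}
```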
27
another example
  • %{
  • #include "defs.h"
  • static char *BigLine = NULL;
  • static int BigLineLen = -1;
  • %}
  • line [^\n]*\n
  • %%
  • {line} { if( yyleng > BigLineLen ) {
  •     free( BigLine );
  •     BigLineLen =
  •       ( BigLine = strdup( yytext ) )
  •         == NULL ? -1 : yyleng;
  •   } }
  • %%
  • int yywrap( void ) {
  •   PRINTF(
  •     ("%s",( BigLine == NULL ) ? "" : BigLine ));
  •   return 1;
  • }

yywrap gets called at the end of input
28
count chars, words, lines
  • %{
  • #include <stdio.h>
  • static int words = 0, lines = 0, chars = 0;
  • %}
  • word [-'A-Za-z]+
  • %%
  • {word} { words++; chars += yyleng; }
  • \n { lines++; chars++; }
  • . chars++;
  • %%
  • int yywrap( void ) {
  •   printf( "%8u%8u%8u\n", lines, words, chars );
  •   return 1;
  • }
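
The three counting rules map onto a short hand-written loop. This is only a sketch for comparison: struct counts, count_text and is_word_char are hypothetical names, and the input is a string rather than yyin.

```c
#include <ctype.h>

struct counts {
    unsigned lines, words, chars;
};

/* Returns nonzero if c belongs to the word class [-'A-Za-z]. */
int is_word_char(int c)
{
    return c == '-' || c == '\'' || isalpha((unsigned char)c);
}

/* Counts lines, words and characters in s using the same three cases
   as the lex rules: a word, a newline, or any other single character. */
struct counts count_text(const char *s)
{
    struct counts c = {0, 0, 0};

    while (*s != '\0') {
        if (is_word_char(*s)) {
            while (is_word_char(*s)) {   /* word rule: consume the whole run */
                c.chars++;
                s++;
            }
            c.words++;
        } else {
            if (*s == '\n')              /* newline rule */
                c.lines++;
            c.chars++;                   /* newline and "." both count a char */
            s++;
        }
    }
    return c;
}
```

Note that "42" is not a word under this definition, so the text "it's a test\n42 ok\n" counts as 2 lines, 4 words and 18 characters.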

29
Distribution of word lengths
30
  • %{
  • #include "defs.h"
  • #define MAX_WORD_LEN 100
  • static unsigned int WrdLengArr[ MAX_WORD_LEN ];
  • static unsigned int WrdLengSum, NumWords;
  • %}
  • word [^ \t\n]+
  • %%
  • {word} { if( yyleng < MAX_WORD_LEN ) {
  •     WrdLengArr[ yyleng ]++;
  •     WrdLengSum += yyleng;
  •     NumWords++;
  •   } }
  • .|\n /* do nothing */ ;

31
  • %%
  • int yywrap( void ) {
  •   int i;
  •   PRINTF(( "Length\tFrqncy\n" ));
  •   for( i = 0; i < MAX_WORD_LEN; i++ )
  •     if( WrdLengArr[ i ] != 0 )
  •       PRINTF(( "%4u\t%4u\n", i, WrdLengArr[ i ] ));
  •   PRINTF((" Avg\t%0.2f\n", (float)WrdLengSum /
  •     NumWords));
  •   return 1;
  • }
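
The bookkeeping in the word-length example can be checked with a hand-written sketch. struct word_stats and word_lengths are hypothetical names; strcspn stands in for the [^ \t\n]+ pattern by measuring the run of non-whitespace characters at the current position.

```c
#include <string.h>

#define MAX_WORD_LEN 100

struct word_stats {
    unsigned freq[MAX_WORD_LEN];   /* freq[n] = number of words of length n */
    unsigned sum;                  /* total length of all counted words */
    unsigned num;                  /* number of counted words */
};

/* Fills st with the word-length distribution of s, where a word is a
   maximal run of characters other than space, tab and newline. */
void word_lengths(const char *s, struct word_stats *st)
{
    memset(st, 0, sizeof *st);
    while (*s != '\0') {
        size_t len = strcspn(s, " \t\n");

        if (len == 0) {            /* whitespace: do nothing */
            s++;
            continue;
        }
        if (len < MAX_WORD_LEN) {  /* longer words are skipped, as in the rule */
            st->freq[len]++;
            st->sum += (unsigned)len;
            st->num++;
        }
        s += len;
    }
}
```

A caller would print the average as (float)st.sum / st.num, and should check that st.num is not zero first, since an input with no words leaves both counters at 0.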

32
Word Replace
33
  • %{
  • #include "defs.h"
  • #define ARG( n ) ( argc <= (n) ? "" : argv[ (n) ]
    )
  • static char *SearchWord;
  • static char *InsertWord;
  • %}
  • word [-a-zA-Z]+
  • num [0-9]+
  • punct [!.,()]

34
  • %%
  • {punct} |
  • {num} |
  • {word} {
  •   if( strcmp( yytext, SearchWord )
      == 0 )
  •     PRINTF(( "%s", InsertWord ));
  •   else
  •     PRINTF(( "%s", yytext ));
  • }

35
  • %%
  • int main( int argc, char *argv[] ) {
  •   const char *OutFile = "output.txt";
  •   char *InFile = ARG( 1 );
  •   SearchWord = ARG( 2 );
  •   InsertWord = ARG( 3 );
  •   if( (yyin = freopen( InFile, "r", stdin ) )
      == NULL ||
  •       (yyout = freopen( OutFile, "w", stdout ) )
      == NULL ) {
  •     ERR_MSG( freopen );
  •     return EXIT_FAILURE;
  •   }
  •   return ( yylex( ) == 0 ) ? EXIT_SUCCESS :
      EXIT_FAILURE;
  • }
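
The core of the word-replace scanner, without the file handling, can be sketched by hand as follows. replace_word is a hypothetical function that works on strings. It assumes the same word class as the definition above, [-a-zA-Z], so a hyphenated word such as "cat-nap" counts as a single word and is not changed when searching for "cat".

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Copies in to out (capacity n), replacing every whole word equal to
   search with insert. Everything else is copied through unchanged. */
void replace_word(const char *in, const char *search, const char *insert,
                  char *out, size_t n)
{
    size_t used = 0;
    size_t slen = strlen(search);

    if (n == 0)
        return;
    out[0] = '\0';
    while (*in != '\0') {
        size_t len = 0;
        int w;

        /* Measure the word starting here, if any. */
        while (in[len] == '-' || isalpha((unsigned char)in[len]))
            len++;

        if (len == 0) {                        /* not a word: echo one char */
            w = snprintf(out + used, n - used, "%c", *in);
            len = 1;
        } else if (len == slen && strncmp(in, search, len) == 0) {
            w = snprintf(out + used, n - used, "%s", insert);
        } else {
            w = snprintf(out + used, n - used, "%.*s", (int)len, in);
        }
        if (w < 0 || (size_t)w >= n - used)
            break;                             /* output buffer is full */
        used += (size_t)w;
        in += len;
    }
}
```

Comparing whole lexemes rather than searching for substrings is the point of the exercise: "concat" contains "cat" but is left alone, just as strcmp on yytext would leave it alone.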

36
Questions?