CS 2130 - PowerPoint PPT Presentation

Provided by: ble87

1
CS 2130
  • Presentation 18
  • Tools
  • Lex

2
Tools
  • An important skill for a computer scientist is
    knowing when to code the solution to a problem
    and when to use a tool
  • It wasn't always this way
  • A key to using tools is getting past the learning
    curve
  • Some of the most useful tools are those which
    produce usable code

3
lex
  • lex is a lexical analyzer generator
  • It doesn't do lexical analysis. It writes code
    that will perform lexical analysis for you
  • A program which can perform lexical analysis is
    useful.
  • A program that can generate code for a custom
    lexical analyzer that can be embedded into an
    application you are creating is a gem

4
Why learn lex?
  • For some it will be a useful program that they
    will use over and over
  • Others may use it but infrequently
  • It appears in want ads!!!
  • http://www.appforge.com/corp/careers/sfw_engineer.html
  • The concepts involved in learning lex get at
    the core material for this course. Even if you
    never use lex again, knowledge of its operation
    will help you better understand the
    translation process

5
More Info?
lex & yacc, 2nd Edition
By John Levine, Tony Mason & Doug Brown
2nd Edition, October 1992
ISBN 1-56592-000-7, Order Number 0007
386 pages, $29.95
http://www.oreilly.com/catalog/lex/
Note: Some code taken from O'Reilly website
6
lex process
  • Create a specification in a file called scan.l
  • lex processes scan.l and produces lex.yy.c
  • The C compiler can turn lex.yy.c into a.out
  • lex scan.l
  • cc lex.yy.c -ll (or gcc lex.yy.c -ll)
  • (-ll means link with the lex library; use -lfl if
    using flex)
  • lex.yy.c contains a function yylex( ) which does
    the actual lexical analysis

7
lex file format
  • <Definitions>
  • ...
  • %%
  • <Rules>
  • ...
  • %%
  • <Supplementary code>
  • ...

Definitions: includes, defines, RegExps
Rules: Pattern/Action pairs
    <pattern1> <action1>
    <pattern2> <action2>
Supplementary code: additional code (not always needed)
A line containing only %% separates each section from the next
9
The Simplest Lex Program
    %%
    .|\n    ECHO;

Put this code in a file called scan.l
Run lex: lex scan.l
Compile: gcc lex.yy.c -ll
Run by typing a.out or a.out < somefile.txt
The first form reads from stdin. To terminate,
type ctrl/d
10
The Simplest Lex Program
    %%
    .|\n    ECHO;
    %%
    int main(void)
    {
        yylex();
        return 0;
    }

We got this main by default! (The lex library linked with -ll supplies it.)
11
lex example
    wspc    [ \t\n]+
    %%
    {wspc}  output( ' ' );

The first line is a definition involving a regular expression.
Note the curlies in the rule, meaning "substitute the
definition of wspc here".
Reduce all whitespace to a single space.
Note: This works on acme. On linux boxes, e.g. helsinki,
substitute putc(' ', yyout) for output( ' ' )
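
For comparison with the lex rule, the same job can be hand-coded. The following is only a minimal C sketch, not what lex generates; the function name collapse_whitespace and the caller-supplied output buffer are assumptions made for illustration.

```c
/* Hypothetical helper (not part of lex): copies src to dst, replacing each
 * run of spaces, tabs and newlines with a single space.
 * dst must have room for strlen(src) + 1 bytes. */
void collapse_whitespace(const char *src, char *dst)
{
    int in_run = 0;                 /* are we inside a whitespace run? */

    for (; *src != '\0'; src++) {
        if (*src == ' ' || *src == '\t' || *src == '\n') {
            if (!in_run)
                *dst++ = ' ';       /* emit one space per run */
            in_run = 1;
        } else {
            *dst++ = *src;          /* everything else is copied as read */
            in_run = 0;
        }
    }
    *dst = '\0';
}
```

The lex version states the pattern in one line and leaves the state tracking to the generated scanner, which is the trade-off the Tools slide describes.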
12
lex example
    %{
    #include <stdio.h>
    #include <ctype.h>
    %}
    word    [-'A-Za-z]+
    %%
    {word}  printf("%c%s", toupper(*yytext), yytext+1);

How to include C code: place it between %{ and %}
13
lex example
    %{
    #include <stdio.h>
    #include <ctype.h>
    %}
    word    [-'A-Za-z]+
    %%
    {word}  printf("%c%s", toupper(*yytext), yytext+1);

yytext is an internal variable containing the text
of the word matched
Capitalize the first letter of each word, leaving the
remainder of the text unchanged:
"the 777 hits" becomes "The 777 Hits"
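
As a rough hand-coded counterpart to the {word} rule, the sketch below capitalizes words in place in a C string. The function name capitalize_words is hypothetical, and the word test mirrors the character class [-'A-Za-z] under the assumption of the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical helper: upper-cases the first character of every maximal run
 * of letters, hyphens and apostrophes; all other text is left unchanged. */
void capitalize_words(char *s)
{
    int in_word = 0;                        /* was the previous char in a word? */

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        int is_word_char = isalpha(c) || c == '-' || c == '\'';

        if (is_word_char && !in_word)
            *s = (char)toupper(c);          /* first character of a word */
        in_word = is_word_char;
    }
}
```

Digits are not word characters here, so they pass through untouched, just as unmatched text is echoed by the lex program.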
14
lex
  • Text not matched is echoed as read
  • Thus, there is an implied ECHO
  • Which can be suppressed. How?
  • Lex patterns only match a given input character
    or string once
  • Lex executes the action for the longest possible
    match for the current input.

15
To be more specific...
  • If you don't specify a main you get one for
    free!!!
  • If you call yylex it will start scanning the
    appropriate input and, as it recognizes patterns,
    perform the specified actions
  • Example
        %%
        AAA     printf("<Found 3 A's>");
        AA      printf("<Found 2 A's>");
  • Given AAAAAAAA
  • Will print
        <Found 3 A's><Found 3 A's><Found 2 A's>
  • The scanning continues unless an action returns a value!
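
The longest-match behaviour in the example above can be imitated by hand. This is a simplified C sketch under stated assumptions: the patterns are plain literal strings rather than regular expressions, output goes to a caller-supplied buffer, and the names rule, rules and scan are hypothetical. It is not how the generated yylex( ) is implemented.

```c
#include <string.h>

/* One rule: a literal pattern and the text its action prints. */
struct rule {
    const char *pattern;
    const char *message;
};

static const struct rule rules[] = {
    { "AAA", "<Found 3 A's>" },
    { "AA",  "<Found 2 A's>" },
};

/* Hypothetical stand-in for yylex(): at each input position the rule with
 * the longest match wins (the earlier rule wins ties); text that no rule
 * matches is echoed. Output is appended to out, which must be big enough. */
void scan(const char *input, char *out)
{
    size_t nrules = sizeof rules / sizeof rules[0];

    out[0] = '\0';
    while (*input != '\0') {
        size_t best_len = 0;
        const char *best_msg = NULL;

        for (size_t i = 0; i < nrules; i++) {
            size_t len = strlen(rules[i].pattern);
            if (len > best_len &&
                strncmp(input, rules[i].pattern, len) == 0) {
                best_len = len;
                best_msg = rules[i].message;
            }
        }
        if (best_msg != NULL) {
            strcat(out, best_msg);      /* run the winning rule's action */
            input += best_len;          /* consume the matched text */
        } else {
            strncat(out, input, 1);     /* implied ECHO of one character */
            input++;
        }
    }
}
```

Calling scan on the eight A's from the slide consumes three, three, then two characters. With four A's it takes AAA first and echoes the leftover A, which shows that matching is greedy from the current position and never revisits input it has already consumed.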

16
lex example
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    %%
    [^\n]*\n    { printf( "%5d ", ++lineno ); ECHO; }

Print out file with line numbers
17
another way
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    line    [^\n]*\n
    %%
    {line}  printf( "%5d %s", ++lineno, yytext );

Print out file with line numbers
18
Or
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    line    .*\n
    %%
    {line}  printf( "%5d %s", ++lineno, yytext );

Print out file with line numbers
19
Or even
    %{
    #include <stdio.h>
    static int lineno = 0;
    %}
    line    .*\n
    %%
    {line}  printf( "/* %5d */ %s", ++lineno, yytext );

Print out file with line numbers written as C comments
20
another example
    %{
    #include "defs.h"
    static char *BigLine = NULL;    /* copy of the longest line so far */
    static int BigLineLen = -1;
    %}
    line    [^\n]*\n
    %%
    {line}  {
                if( yyleng > BigLineLen ) {
                    free( BigLine );
                    BigLineLen =
                        ( BigLine = strdup( yytext ) )
                            == NULL ? -1 : yyleng;
                }
            }
    %%
    int yywrap( void )
    {
        PRINTF(( "%s", ( BigLine == NULL ) ? "" : BigLine ));
        return 1;
    }

yywrap gets called at the end of input
21
count chars, words, lines
    %{
    #include <stdio.h>
    static int words = 0, lines = 0, chars = 0;
    %}
    word    [-'A-Za-z]+
    %%
    {word}  { words++; chars += yyleng; }
    \n      { lines++; chars++; }
    .       chars++;
    %%
    int yywrap( void )
    {
        printf( "%8d%8d%8d\n", lines, words, chars );
        return 1;
    }
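
To make the counting rules concrete, here is a hand-written C sketch that applies the same three rules to a string instead of a file. The names counts and count_text are hypothetical, and a word is taken to be a maximal run of letters, hyphens and apostrophes, matching the word definition above in the default "C" locale.

```c
#include <ctype.h>

/* Totals kept by the hypothetical counter below. */
struct counts {
    unsigned lines;
    unsigned words;
    unsigned chars;
};

/* Hypothetical helper: counts newlines, words and characters in s. */
struct counts count_text(const char *s)
{
    struct counts n = { 0, 0, 0 };
    int in_word = 0;                    /* inside a run of word characters? */

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;

        if (isalpha(c) || c == '-' || c == '\'') {
            if (!in_word)
                n.words++;              /* a new word starts here */
            in_word = 1;
        } else {
            in_word = 0;
            if (c == '\n')
                n.lines++;
        }
        n.chars++;                      /* every character is counted once */
    }
    return n;
}
```

For the two-line text "the 777 hits\nit's ok\n" this yields 2 lines, 4 words and 21 characters, since 777 is not a word under this definition.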

22
Distribution of word lengths
23
    %{
    #include "defs.h"
    #define MAX_WORD_LEN 100
    static unsigned int WrdLengArr[ MAX_WORD_LEN ];
    static unsigned int WrdLengSum, NumWords;
    %}
    word    [^ \t\n]+
    %%
    {word}  {
                if( yyleng < MAX_WORD_LEN )
                    WrdLengArr[ yyleng ]++;
                WrdLengSum += yyleng;
                NumWords++;
            }
    .|\n    /* do nothing */ ;
    %%

24
    int yywrap( void )
    {
        int i;
        PRINTF(( "Length\tFrqncy\n" ));
        for( i = 0; i < MAX_WORD_LEN; i++ )
            if( WrdLengArr[ i ] != 0 )
                PRINTF(( "%4d\t%4u\n", i, WrdLengArr[ i ] ));
        PRINTF(( " Avg\t%0.2f\n", (float)WrdLengSum / NumWords ));
        return 1;
    }

25
Word Replace
26
    %{
    #include "defs.h"
    /* argv[ n ] if that argument was given, otherwise an empty string */
    #define ARG( n ) ( argc <= (n) ? "" : argv[ (n) ] )
    static char *SearchWord;
    static char *InsertWord;
    %}
    word    [-a-zA-Z]+
    num     [0-9]+
    punct   [!.,()]

27
    %%
    {punct} |
    {num}   |
    {word}  {
                if( strcmp( yytext, SearchWord ) == 0 )
                    PRINTF(( "%s", InsertWord ));
                else
                    PRINTF(( "%s", yytext ));
            }

28
    %%
    int main( int argc, char *argv[] )
    {
        const char *OutFile = "output.txt";
        char *InFile = ARG( 1 );
        SearchWord = ARG( 2 );
        InsertWord = ARG( 3 );
        if( ( yyin = freopen( InFile, "r", stdin ) ) == NULL
         || ( yyout = freopen( OutFile, "w", stdout ) ) == NULL ) {
            ERR_MSG( freopen );
            return EXIT_FAILURE;
        }
        return ( yylex( ) == 0 ) ? EXIT_SUCCESS : EXIT_FAILURE;
    }

29
Questions?