AWK (Alfred V. Aho, Peter J. Weinberger, Brian W. Kernighan): a powerful programming language disguised as a utility

Slides: 25
Provided by: josephl
1
AWK: Alfred V. Aho, Peter J. Weinberger, Brian W. Kernighan
a powerful programming language disguised as a utility
  • Material in this slide set is from
  • Chapter 12, awk
  • UNIX and Shell Programming
  • by Behrouz A. Forouzan and Richard F. Gilberg
  • Brooks/Cole - Thomson Learning, 2003

2
awk basics
  • awk reads an input file line by line and
    performs an action on each line
  • awk views a file as a collection of fields and
    records
  • Only two UNIX options for awk
  • The -F option specifies the input field separator
  • The -f option names the script file

3
  • awk 'pattern {action}' input-file
  • When the awk commands are coded on the command
    line, the script is enclosed in quotes
  • awk -f scriptFile.awk input-file
  • Longer scripts are placed in a separate script
    file
  • The .awk extension is not required but is a nice
    convention for identifying awk scripts
  • awk -F: -f scriptFile.awk input-file
  • Here the colon is the field separator
  • The default field separator is the space or tab
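
The invocation forms above can be sketched as follows. The file names people.txt and first.awk and their contents are hypothetical, made up for this illustration and not taken from the slides.

```shell
# Hypothetical sample data and script, created in a scratch directory.
dir=$(mktemp -d)
printf 'alice:30\nbob:25\n' > "$dir/people.txt"
printf '{ print $1 }\n' > "$dir/first.awk"

# Script coded on the command line, enclosed in single quotes.
# -F: makes the colon the input field separator.
inline=$(awk -F: '{ print $2 }' "$dir/people.txt")

# Same separator, but the script is read from the file named by -f.
fromfile=$(awk -F: -f "$dir/first.awk" "$dir/people.txt")

echo "$inline"
echo "$fromfile"
```

Single quotes keep the shell from expanding $1 and $2 before awk sees them.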

4
awk 'pattern {action}' input-file
  • awk '{print "Hello World"}'
  • Note that awk is waiting for input
  • <Ctrl>-C to end the awk program
  • echo "This is line 1" > test
  • awk '{print "Hello World"}' test
  • Note awk prints one line of output
  • The output is the argument of the print command

5
awk 'pattern {action}' input-file
  • echo "This is line 1" > test
  • echo "This is line 2" >> test
  • echo "This is line 3" >> test
  • awk '{print "Hello World"}' test
  • awk prints three lines of output,
  • one line of output for each line in the input
  • the output is the argument of the print command

6
awk 'pattern {action}' input-file
  • echo "This is line 1" > test
  • echo "This is line 2" >> test
  • echo "This is line 3" >> test
  • awk '{print}' test
  • awk prints the contents of the input file
  • one line of output for each line in the input
  • when the print command has no argument, the input
    line is printed.
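
A minimal sketch of the two print behaviours just described. It pipes the three lines into awk instead of writing them to a file named test; that substitution is a choice made for this example only.

```shell
# Three-line input, matching the lines used on the slides.
input=$(printf 'This is line 1\nThis is line 2\nThis is line 3\n')

# print with an argument: the argument is printed once per input line.
hello=$(printf '%s\n' "$input" | awk '{ print "Hello World" }')

# print with no argument: each input line is printed unchanged.
copy=$(printf '%s\n' "$input" | awk '{ print }')

echo "$hello"
echo "$copy"
```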

7
awk -f total.awk total.dat
total.dat (Input)
    22 78 44
    66 31 70
    52 30 44
    88 31 66
total.awk
    # Begin Processing
    BEGIN {print "Print Totals"}
    # Body Processing
    {total = $1 + $2 + $3}
    {print $1 " + " $2 " + " $3 " = " total}
    # End Processing
    END {print "End Totals"}
Output
    Print Totals
    22 + 78 + 44 = 144
    66 + 31 + 70 = 167
    52 + 30 + 44 = 126
    88 + 31 + 66 = 185
    End Totals
8
Buffers and Variables
  • Each field of the record is read into a field
    buffer
  • The field buffers are named $1, $2, $3, ...
  • The record buffer contains a concatenation of all
    field buffers with one field separator character
    between each field
  • The record buffer is named $0

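
As a small illustration of the buffer names, the sketch below prints two field buffers and the record buffer for one input line. The input line is borrowed from the total.dat example.

```shell
out=$(echo '22 78 44' | awk '{
    print $1      # first field buffer
    print $3      # third field buffer
    print $0      # record buffer: the whole input line
}')
echo "$out"
```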
9
awk System Variables
(a) Totally controlled by awk
10
awk System Variables (cont)
(a) Totally controlled by awk
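
As an illustration of system variables, the sketch below uses four standard ones: NR (record number), NF (number of fields in the current record), FS (input field separator) and OFS (output field separator). That the slides' table listed exactly these is an assumption, and the sample input is invented.

```shell
out=$(printf 'a:b:c\nd:e\n' | awk '
BEGIN { FS = ":"; OFS = "-" }     # set by the script before any input is read
      { print NR, NF, $1 }        # NR and NF are maintained by awk itself
')
echo "$out"
```

The commas in the print statement are replaced by OFS on output, which is why the values come out joined by hyphens.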
11
awk User-Defined Variables
  • Can be numbers, strings, or arrays
  • Variable names start with a letter and can be
    followed by any sequence of letters, digits, and
    underscores
  • Do not need to be declared
  • Come into existence the first time they are
    referenced
  • Initially created as strings and initialized to a
    null string ("").
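
A short sketch of these rules, using the hypothetical variable names name and count. Neither is declared, and an unset variable compares equal to the null string and acts as 0 in arithmetic.

```shell
out=$(awk 'BEGIN {
    # name and count have not been assigned anywhere before this point
    if (name == "") print "name is the null string"
    print count + 0        # used as a number, an unset variable is 0
    count++                # incrementing it gives 1
    print count
}')
echo "$out"
```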

12
awk Script Basics
  • awk scripts are divided into three parts
  • BEGIN
  • Initialization processing that is done only once
    before awk starts reading the file
  • BODY
  • A loop that processes each line in the file
  • END
  • End processing that is done only once after all
    lines in the file have been read
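
The three parts can be seen in a minimal line counter. The three-line input and the variable n are invented for this sketch.

```shell
out=$(printf 'x\ny\nz\n' | awk '
BEGIN { print "start" }        # runs once, before any input is read
      { n++ }                  # body: runs once for every input line
END   { print "lines:", n }    # runs once, after the last line
')
echo "$out"
```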

13
awk Script Design
    BEGIN   {Begin's Actions}
    Pattern {Action}
    Pattern {Action}
    Pattern {Action}
    END     {End's Actions}
The pattern determines whether its associated action
is executed. Essentially, a pattern is a logical
expression that will have a true or false value.
14
awk Script Design (cont)
    BEGIN   {Begin's Actions}
    Pattern {statement}
    Pattern {statement1; statement2; statement3}
    Pattern {
        statement1
        statement2
        statement3
    }
    END     {End's Actions}
Actions can be made up of one or more statements.
In awk the end of a statement is designated by a
newline, semicolon, or a closing brace.
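
A sketch of a pattern guarding an action, with statements ended both by semicolons and by newlines. The threshold of 10, the input numbers, and the variable names big and sum are arbitrary choices for this example.

```shell
out=$(printf '5\n12\n20\n' | awk '
$1 > 10 { big++; sum += $1 }     # pattern is true for 12 and 20 only
END {
    print big                    # statements ended by newlines
    print sum
}')
echo "$out"
```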
15
awk -F: -f total.awk total.dat
total.dat (Input)
    22:78:44
    66:31:70
    52:30:44
    88:31:66
total.awk
    # Begin Processing
    BEGIN {print "Print Totals"; total = 0}
    # Body Processing
    {total = $1 + $2 + $3}
    {print $1 " + " $2 " + " $3 " = " total}
    # End Processing
    END {print "End Totals"}
Output
    Print Totals
    22 + 78 + 44 = 144
    66 + 31 + 70 = 167
    52 + 30 + 44 = 126
    88 + 31 + 66 = 185
    End Totals
16
Output in awk
  • print
  • printf
  • The C formatted print statement
  • sprintf
  • String print
  • Uses the formatted print concept to combine two
    or more fields into one string that can be used
    as a variable later in the script
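
A minimal sketch of sprintf combining two fields into one string that is stored and reused. The variable name label and the "%s=%d" format are assumptions chosen for illustration.

```shell
out=$(echo 'clothing 3141' | awk '{
    label = sprintf("%s=%d", $1, $2)   # build the string and keep it in a variable
    print length(label), label         # reuse the variable later in the script
}')
echo "$out"
```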

17
printf
awk '{printf("%2d %-12s %9.2f\n", $1, $2, $3)}' sales2.dat | head -5
Field specification: %[flag][minimum width][.precision]conversion code
Conversion codes
  • c character
  • s string
  • d decimal integer
  • o, x octal/hex
  • f, e, g floating point
  • % percent sign
Flags
  • - left justify
  • + sign (+ or -)
  • 0 zero padding
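
To see the width, flag and precision at work, the sketch below formats one record with the same format string. The record is invented here to match the three-field layout the format implies; sales2.dat itself is not reproduced.

```shell
# One hypothetical sales record: number, department, amount.
out=$(echo '1 clothing 3141' | awk '{ printf("%2d %-12s %9.2f\n", $1, $2, $3) }')

# Brackets make the leading blank visible.
echo "[$out]"
```

The number is right-justified in 2 columns, the name is left-justified in 12 because of the - flag, and the amount is right-justified in 9 columns with 2 decimal places.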

18
sprintf
    NR == 1 {
        str = sprintf("%2d %-12s %9.2f\n", $1, $2, $3)
        len = length(str)
        print len
        print str
    }
Input
    1 clothing 3141
    2 computers 9161
    3 textbook 21312
Output
    27
     1 clothing 3141.00

19
Selection
    if (expression1) {
        if (expression2)
            action1
    }   # end nested if
    else
        action2
    # end of outer if
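
A runnable sketch of nested selection. The classification of numbers as positive, zero or negative is an example invented here, and it adds an inner else that the skeleton above does not have.

```shell
out=$(printf '5\n-3\n0\n' | awk '{
    if ($1 >= 0) {
        if ($1 > 0) {
            print $1, "positive"
        } else {
            print $1, "zero"
        }
    } else {
        print $1, "negative"
    }
}')
echo "$out"
```

Bracing every branch makes it explicit which if each else belongs to.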
20
Iteration: while loop
    # stuWhile.awk script
    {
        total = 0
        count = 0
        i = 2
        while (i <= NF) {
            total += $i
            count++
            i++
        }   # while
        # test for zero divide
        if (count > 0) {
            avrg = total / count
            print $1, avrg
        }   # zero divide test
    }   # body
Input                  Output
1234 87 83 91 89       1234 87.5
2345 71 78 83 81       2345 78.25
3456 93 97 89 91       3456 92.5
4567 81 82 79 89       4567 82.75
5678 78 86 81 79       5678 81
21
Iteration: for loop
    # for loop example
    {   # begin
        total = 0
        count = 0
        for (i = 2; i <= NF; i++) {
            total += $i
            count++
        }   # for
        # end of student scores: test for zero divide
        if (count > 0) {
            avrg = total / count
            print $1, avrg
        }   # zero divide test
    }   # end

Input                  Output
1234 87 83 91 89       1234 87.5
2345 71 78 83 81       2345 78.25
3456 93 97 89 91       3456 92.5
4567 81 82 79 89       4567 82.75
5678 78 86 81 79       5678 81
22
Questions
  • Do the awk field buffers use the same memory
    space as the record buffer or do the field
    buffers have their own memory space that is
    separate from the record buffer?
  • i.e. Are the field buffers just pointers into the
    record buffer?
  • How could you demonstrate your answer?

23
  • There is only one record buffer available. Its
    name is $0. It holds the whole record. In
    other words, it's the concatenation of all field
    buffers with one field separator character
    between each field.
  • As long as the contents of any field are not
    changed, $0 holds exactly the same data as found
    in the input file.
  • What happens if there are multiple blanks between
    words?
  • What about other field separators?
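
One way to explore these questions is to compare $0 before and after a field is changed. The sketch below shows only observable behaviour and says nothing about how a given awk lays out its buffers in memory. The sample inputs are invented.

```shell
# Multiple blanks between words: $0 keeps them while no field is changed.
untouched=$(echo 'a   b    c' | awk '{ print $0 }')

# Assigning to any field makes awk rebuild $0 from the fields, joined
# by the output field separator OFS (a single space by default).
rebuilt=$(echo 'a   b    c' | awk '{ $2 = "B"; print $0 }')

# With -F: the input separator is a colon, but the rebuilt record is
# still joined by OFS, so the colons are replaced by spaces.
colon=$(echo 'a:b:c' | awk -F: '{ $2 = "B"; print $0 }')

echo "[$untouched]"
echo "[$rebuilt]"
echo "[$colon]"
```

Setting OFS to ":" in a BEGIN block would keep colons in the rebuilt record.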

24
Questions
  • Is there a limit to the number of field buffers?
  • i.e., can you have $10, $100, etc.?
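
A quick probe, offered as a sketch only. It shows that a field number well above 10 works in whichever awk runs it. It does not establish an upper limit, which depends on the implementation.

```shell
out=$(echo 'x' | awk '{
    $100 = "last"       # assigning to field 100 creates fields 2 to 99 as empty strings
    print NF, $100
}')
echo "$out"
```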