Awk Utility - PowerPoint PPT Presentation

1
Math 272
AWK UTILITY
2
By A Mikati and M Shaito
3
supervised by
Dr. A Nasri
4
Awk Utility
  • Introduction
  • Some basics
  • Some samples
  • Patterns & Actions
  • Regular Expressions
  • Boolean
  • Start/End
  • BEGIN/END

5
Awk Utility (continued)
  • Awk variables
  • Control of flow statements
  • a) If-Else statement
  • b) While statement
  • c) For statement

6
Introduction
History: The name awk comes from the initials of
its designers: Alfred V. Aho, Peter J.
Weinberger, and Brian W. Kernighan. The original
version of awk was written in 1977. In 1985 a new
version made the programming language more
powerful, introducing user-defined functions,
multiple input streams, and computed regular
expressions.
7
Introduction (contd)
If you are like many computer users, you would
frequently like to make changes in various text
files wherever certain patterns appear, or
extract data from parts of certain lines while
discarding the rest. To write a program to do
this in a language such as C or Pascal is a
time-consuming inconvenience that may take many
lines of code. The job may be easier with awk.
The awk utility interprets a special-purpose
programming language that makes it possible to
handle simple data-reformatting jobs easily with
just a few lines of code.
8
Some Basics
  • The basic function of awk is to search files for
    lines (or other units of text) that contain
    certain patterns.
  • Awk recognizes the concepts of "file", "record",
    and "field".
  • A file consists of records, which by default are
    the lines of the file. One line becomes one
    record.
  • Awk operates on one record at a time.
  • A record consists of fields, which by default are
    separated by any number of spaces or tabs.
  • Field number 1 is accessed with $1, field 2 with
    $2, and so forth. $0 refers to the whole record.
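The record and field model described above can be tried on a small input. This is a minimal sketch; the sample data and the variable names (sample, fields_out) are made up for illustration.

```shell
# Hypothetical sample input: one record per line, three fields per record.
sample='alice 30 paris
bob 25 rome'

# $1 is the first field, $3 the third, and $0 the whole record.
fields_out=$(printf '%s\n' "$sample" | awk '{ print $1, $3; print $0 }')
printf '%s\n' "$fields_out"
```

Each input line produces two output lines: the selected fields, then the untouched record.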

9
Some Samples
> awk '{print $0}' filename
Perhaps the quickest way of learning awk is to
look at some sample programs. The one above will
print the file in its entirety, just like cat(1).
Here are some others, along with a quick
description of what they do.
> awk '{print $2,$1}' filename
will print the second field, then the first. All
other fields are ignored. What if you don't want
to apply the program to each line of the file?
Say, for example, that you only wanted to process
lines that had the first field greater than the
second. The following program will do that:
> awk '$1 > $2 {print $1,$2,$1-$2}' filename
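To see what the sample programs do, they can be run on a few lines of input. The sketch below assumes a made-up two-column input (pairs); the variable names are illustrative only.

```shell
# Hypothetical input: two numeric columns.
pairs='10 4
3 8
7 7'

# Swap the two fields on every line.
swapped=$(printf '%s\n' "$pairs" | awk '{ print $2, $1 }')

# Keep only lines whose first field exceeds the second,
# and print both fields followed by their difference.
bigger=$(printf '%s\n' "$pairs" | awk '$1 > $2 { print $1, $2, $1 - $2 }')

printf '%s\n' "$swapped" "$bigger"
```

Only the first line passes the comparison, since 3 is less than 8 and 7 is not greater than 7.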
10
Patterns & Actions
The part outside the curly braces is called the
"pattern", and the part inside is the "action".
The comparison operators are the ones from C:
== != < > <= >=
(the C conditional operator ?: is also available).
If no pattern is given, then the action applies to
all lines. This fact was used in the sample
programs above. If no action is given, then the
entire line is printed. If "print" is used all by
itself, the entire line is printed. Thus, the
following are equivalent:
awk '$1 > $2' filename
awk '$1 > $2 {print}' filename
awk '$1 > $2 {print $0}' filename
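The equivalence of the three forms can be checked directly. This is a small sketch on made-up input; the variable names a, b, and c are placeholders.

```shell
# Hypothetical input: two numeric columns.
pairs='10 4
3 8
9 2'

a=$(printf '%s\n' "$pairs" | awk '$1 > $2')               # pattern only
b=$(printf '%s\n' "$pairs" | awk '$1 > $2 { print }')     # bare print
c=$(printf '%s\n' "$pairs" | awk '$1 > $2 { print $0 }')  # explicit $0

[ "$a" = "$b" ] && [ "$b" = "$c" ] && echo "all three forms agree"
```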
11
Patterns & Actions (contd)
The various fields in a line can also be treated
as strings instead of numbers. To compare a field
to a string, use the following method:
> awk '$1=="foo" {print $2}' filename
There are various types of patterns and actions,
which are explained in detail on the following
slides.
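A short sketch of a string comparison on made-up input follows. Note that the comparison uses ==; a single = would assign to the field instead of testing it.

```shell
# Hypothetical input: a keyword followed by a value.
words='foo 1
bar 2
foo 3'

# Print the second field of every record whose first field is exactly "foo".
matched=$(printf '%s\n' "$words" | awk '$1 == "foo" { print $2 }')
printf '%s\n' "$matched"
```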
12
Kinds of patterns
  • /regular expression/
    A regular expression as a pattern. It matches
    when the text of the input record fits the
    regular expression.
  • expression
    A single expression. It matches when its value,
    converted to a number, is nonzero (if a number)
    or non-null (if a string).
  • BEGIN, END
    Special patterns to supply start-up or clean-up
    information to awk.
  • null
    The empty pattern matches every input record.
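The four kinds of pattern listed above can be combined in one program. The sketch below is an illustration on invented input; the labels printed and the counter n are not from the slides.

```shell
# Hypothetical input: a name and a count.
input='apple 0
banana 3
cherry 0'

kinds_out=$(printf '%s\n' "$input" | awk '
BEGIN { print "start" }          # special pattern: runs before any input
/an/  { print "regexp:", $1 }    # regular expression pattern
$2    { print "nonzero:", $1 }   # expression pattern: true when $2 is nonzero
      { n++ }                    # empty pattern: runs for every record
END   { print "records:", n }    # special pattern: runs after the last record
')
printf '%s\n' "$kinds_out"
```

Only "banana 3" contains "an" and has a nonzero second field, while the empty pattern counts all three records.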
13
Regular Expressions
A regular expression, or regexp, is a way of
describing a class of strings. A regular
expression enclosed in slashes (/) is an awk
pattern that matches every input record whose
text belongs to that class. The simplest regular
expression is a sequence of letters, numbers, or
both. Such a regexp matches any string that
contains that sequence. Thus, the regexp "foo"
matches any string containing "foo". Therefore,
the pattern /foo/ matches any input record
containing "foo". Other kinds of regexps let you
specify more complicated classes of strings:
> awk '/foo.bar/ {print $1,$3}' filename
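In the regexp /foo.bar/ the dot stands for any single character, so "foobar" alone does not match. A minimal sketch with invented input:

```shell
# Hypothetical input: three fields per line.
lines='foo-bar x one
fooXbar y two
foobar z three
nothing here now'

# The dot must match exactly one character between "foo" and "bar".
regex_out=$(printf '%s\n' "$lines" | awk '/foo.bar/ { print $1, $3 }')
printf '%s\n' "$regex_out"
```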
14
Boolean
A Boolean pattern is an expression which combines
other patterns using the Boolean operators "or"
(||), "and" (&&), and "not" (!). Whether
the Boolean pattern matches an input record
depends on whether its subpatterns match. For
example, the following command prints all records
in the input file "filename" that contain both
"2400" and "foo":
awk '/2400/ && /foo/' filename
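The three Boolean operators can be compared side by side. This sketch uses an invented four-line input; the variable names both, either, and neither are placeholders.

```shell
# Hypothetical input lines.
log='2400 foo
2400 bar
9600 foo
1200 baz'

both=$(printf '%s\n' "$log" | awk '/2400/ && /foo/')       # and
either=$(printf '%s\n' "$log" | awk '/2400/ || /foo/')     # or
neither=$(printf '%s\n' "$log" | awk '!/2400/ && !/foo/')  # not

printf 'both:\n%s\neither:\n%s\nneither:\n%s\n' "$both" "$either" "$neither"
```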
15
Start/End
There are three special forms of patterns that do
not fit the above descriptions (the other two are
BEGIN and END). One is the start-end pair of
patterns, also known as a range pattern. It is
made of two patterns separated by a comma, of the
form startpat, endpat. It matches ranges of
consecutive input records. The first pattern,
startpat, controls where the range begins, and the
second one, endpat, controls where it ends. For
example:
awk '$1 == "on", $1 == "off"' filename
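A range pattern prints the starting record, the ending record, and everything in between. A minimal sketch on invented input:

```shell
# Hypothetical input: a state word followed by a label.
events='idle a
on b
run c
off d
idle e'

# The range opens on the record whose first field is "on"
# and closes on the record whose first field is "off".
range_out=$(printf '%s\n' "$events" | awk '$1 == "on", $1 == "off"')
printf '%s\n' "$range_out"
```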
16
BEGIN/END
Any action associated with the BEGIN pattern
will happen before any line-by-line processing is
done. Actions with the END pattern will happen
after all lines are processed. But how do you
put more than one pattern-action pair into an awk
program? There are several choices. One is to
just mash them together, like so:
> awk 'BEGIN{print"fee"} \
  $1=="foo"{print"fi"} \
  END{print"fo fum"}' filename
17
BEGIN/END (contd)
Another choice is to put the program into a file,
like so:
BEGIN{print"fee"}
$1=="foo"{print"fi"}
END{print"fo fum"}
Let's say that's in the file giant.awk. Now, run
it using the "-f" flag to awk:
> awk -f giant.awk filename
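The program-file approach can be tried end to end as sketched below. The temporary file created with mktemp stands in for giant.awk, and the two input lines are invented.

```shell
# Write the three rules to a temporary program file (stand-in for giant.awk).
prog=$(mktemp)
cat > "$prog" <<'EOF'
BEGIN       { print "fee" }
$1 == "foo" { print "fi" }
END         { print "fo fum" }
EOF

# Run the program file against two hypothetical input lines.
file_out=$(printf '%s\n' 'foo 1' 'bar 2' | awk -f "$prog")
rm -f "$prog"
printf '%s\n' "$file_out"
```

The BEGIN line appears first, "fi" once for the single matching record, and the END line last.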
18
BEGIN/END (contd)
A third choice is to create a file that calls awk
all by itself. The following form will do the
trick:
#!/usr/bin/awk -f
BEGIN{print"fee"}
$1=="foo"{print"fi"}
END{print"fo fum"}
If we call this file giant2.awk, we can run it by
first giving it execute permissions,
> chmod u+x giant2.awk
and then just call it like so:
> ./giant2.awk filename
19
BEGIN/END (contd)
awk has variables that can be either real numbers
or strings. For example, the following code
prints a running total of the fifth column:
> awk '{print x+=$5,$0 }' filename
This can be used when looking at file sizes from
an "ls -l". It is also useful for balancing one's
checkbook, if the amount of the check is kept in
one column.
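A running total can be sketched on a simplified listing. The input below only imitates "ls -l" (it keeps the size in the fifth column but drops the date columns), and the file names are invented.

```shell
# Hypothetical listing: permissions, links, owner, group, size, name.
listing='-rw-r--r-- 1 user group 100 notes.txt
-rw-r--r-- 1 user group 250 data.csv
-rw-r--r-- 1 user group 50 todo.md'

# x starts out empty (treated as 0) and accumulates field 5 on every record.
totals=$(printf '%s\n' "$listing" | awk '{ print x += $5, $0 }')
printf '%s\n' "$totals"
```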
20
Actions
An awk program or script consists of a series of
rules and function definitions, interspersed. A
rule contains a pattern and an action, either of
which may be omitted. The purpose of the action
is to tell awk what to do once a match for the
pattern is found. Thus, the entire program looks
somewhat like this:
pattern { action }
pattern { action }
function name (args) { ... }
An action consists of one or more awk statements,
enclosed in curly braces ({ and }). Each
statement specifies one thing to be done. The
statements are separated by newlines or
semicolons.
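The overall layout of rules and function definitions might look like the sketch below. The helper function hyp and the input numbers are invented for illustration.

```shell
layout_out=$(printf '%s\n' '3 4' '5 12' | awk '
# Function definition (hypothetical helper, not from the slides).
function hyp(a, b) { return sqrt(a * a + b * b) }

# Rule with both a pattern and an action.
$1 > 4 { print "big:", $0 }

# Rule with no pattern: two statements separated by a semicolon.
{ print hyp($1, $2); total++ }

# Rule that runs once after all input.
END { print total, "records" }
')
printf '%s\n' "$layout_out"
```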
21
Actions (contd)
Here are the kinds of statements supported in
awk:
1) Expressions, which can call functions or
assign values to variables. Executing this kind
of statement simply computes the value of the
expression and then ignores it. This is useful
when the expression has side effects.
2) Control statements, which specify the control
flow of awk programs. The awk language gives you
C-like constructs (if, for, while, and so on) as
well as a few special ones.
3) Compound statements, which consist of one or
more statements enclosed in curly braces. A
compound statement is used in order to put
several statements together in the body of an if,
while, do or for statement.
22
Actions (contd)
4) Input control, using the getline command and
the next statement.
5) Output statements, print and printf.
6) Deletion statements, for deleting array
elements.
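Several of the statement kinds listed above can appear in one short program. The sketch below uses invented input and an invented array name (count).

```shell
stmt_out=$(printf '%s\n' '# comment' 'pen 2' 'ink 5' 'pen 3' | awk '
/^#/ { next }                  # input control: skip to the next record
     { count[$1] += $2 }       # expression statement with a side effect
END  {                         # compound statement in curly braces
    delete count["ink"]        # deletion statement: drop one array element
    for (k in count)           # control statement
        printf "%s=%d\n", k, count[k]   # output statement
}')
printf '%s\n' "$stmt_out"
```

After the deletion only the "pen" entry remains, holding the sum of its two records.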
23
Awk variables
Most awk variables are available for you to use
for your own purposes; they never change except
when your program assigns values to them, and
never affect anything except when your program
examines them. A few variables have special
built-in meanings. Some of them awk examines
automatically, so that they enable you to tell
awk how to do certain things. Others are set
automatically by awk, so that they carry
information from the internal workings of awk to
your program.
  • User-modified: built-in variables that you
    change to control awk.
  • Auto-set: built-in variables through which awk
    gives you information.
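As an illustration of the two groups, the sketch below uses four standard awk variables that the slides do not name: FS and OFS (set by the program to control field splitting and output) and NR and NF (set by awk for each record). The colon-separated input is invented.

```shell
vars_out=$(printf '%s\n' 'root:x:0' 'daemon:x:1' | awk '
BEGIN { FS = ":"; OFS = " | " }   # user-modified: input and output separators
      { print NR, NF, $1 }        # auto-set: record number and field count
')
printf '%s\n' "$vars_out"
```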
24
Control of flow statements
Control statements such as if, while, and so on
control the flow of execution in awk programs.
Most of the control statements in awk are
patterned on similar statements in C. All the
control statements start with special keywords
such as if and while, to distinguish them from
simple expressions. Many control statements
contain other statements; for example, the if
statement contains another statement which may or
may not be executed. The contained statement is
called the body. If you want to include more than
one statement in the body, group them into a
single compound statement with curly braces,
separating them with newlines or semicolons.

25
If-Else statement
The if-else statement is awk's decision-making
statement. It looks like this:
if (condition) then-body [else else-body]
condition is an expression that controls what the
rest of the statement will do. If condition is
true, then-body is executed; otherwise, else-body
is executed (assuming that the else clause is
present). The else part of the statement is
optional. The condition is considered false if
its value is zero or the null string, and true
otherwise.
awk '{ if (x % 2 == 0)
         print "x is even"
       else
         print "x is odd" }'
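A variant of the if-else example that tests the first field of each record, rather than a variable x, is sketched below on invented input.

```shell
# Hypothetical input: one integer per line.
parity=$(printf '%s\n' 4 7 10 | awk '{
    if ($1 % 2 == 0)
        print $1, "is even"
    else
        print $1, "is odd"
}')
printf '%s\n' "$parity"
```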
26
While Statement
In programming, a loop means a part of a program
that is (or at least can be) executed two or more
times in succession. The while statement is the
simplest looping statement in awk. It repeatedly
executes a statement as long as a condition is
true. It looks like this:
while (condition)
  body
This example prints the first three fields
of each record, one per line.
awk '{ i = 1
       while (i <= 3) {
         print $i
         i++
       }
}'
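The while loop can be run on a single invented record with four fields to confirm that only the first three are printed.

```shell
# Hypothetical input: one record with four fields.
while_out=$(printf '%s\n' 'a b c d' | awk '{
    i = 1
    while (i <= 3) {   # the body is a compound statement in curly braces
        print $i       # $i is the field whose number is stored in i
        i++
    }
}')
printf '%s\n' "$while_out"
```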
27
For Statement
The for statement makes it more convenient to
count iterations of a loop. The general form of
the for statement looks like this:
for (initialization; condition; increment)
  body
This statement starts by executing
initialization. Then, as long as condition is
true, it repeatedly executes body and then
increment. Here is an example of a for
statement:
awk '{ for (i = 1; i <= 3; i++)
         print $i }'
This prints the first three fields of each input
record, one field at a time.
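As a further sketch, a for loop can walk over every field of a record instead of a fixed three. This variant relies on the standard NF variable (the number of fields in the current record), which the slides do not mention; the input is invented.

```shell
# Hypothetical input: rows of numbers with differing field counts.
for_out=$(printf '%s\n' '1 2 3' '10 20 30 40' | awk '{
    sum = 0
    for (i = 1; i <= NF; i++)   # initialization; condition; increment
        sum += $i
    print sum
}')
printf '%s\n' "$for_out"
```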
28
Thanks for listening
A Mikati and M Shaito
29
For more information about the awk utility, visit
http://mshaito.tripod.com/awk/awk.html