Title: Perl
1Perl
- Major parts of this lecture adapted from http://www.scs.leeds.ac.uk/Perl/start.html
2Why Perl?
- Perl is built around regular expressions
- REs are good for string processing
- Therefore Perl is a good scripting language
- Perl is especially popular for CGI scripts
- Perl makes full use of the power of UNIX
- Short Perl programs can be very short
- "Perl is designed to make the easy jobs easy, without making the difficult jobs impossible." -- Larry Wall, Programming Perl
3Why not Perl?
- Perl is very UNIX-oriented
- Perl is available on other platforms...
- ...but isn't always fully implemented there
- However, Perl is often the best way to get some UNIX capabilities on less capable platforms
- Perl does not scale well to large programs
- Weak subroutines, heavy use of global variables
- Perl's syntax is not particularly appealing
4What is a scripting language?
- Operating systems can do many things
- copy, move, create, delete, compare files
- execute programs, including compilers
- schedule activities, monitor processes, etc.
- A command-line interface gives you access to these functions, but only one at a time
- A scripting language is a "wrapper" language that integrates OS functions
5Major scripting languages
- UNIX has sh, Perl
- Macintosh has AppleScript, Frontier
- Windows has no major scripting languages
- probably due to the weaknesses of DOS
- Generic scripting languages include
- Perl (most popular)
- Tcl (easiest for beginners)
- Python (new, Java-like, best for large programs)
6Perl versions
- To find out which version of Perl you are using, type perl -v at the command line
- Alternatively, put the following in a file version.pl:
    #!/usr/bin/perl
    print "\n\nHello from Perl $]!\n\n";
- Run this program with perl version.pl
- This may or may not work, depending on where and how Perl is located on your system
- Please use Perl 5 in your assignments
- I'm using Perl 5.8.8
7Perl Example 1
    #!/usr/local/bin/perl
    #
    # Program to do the obvious
    #
    print 'Hello world.';    # Print a message

Save this in a file named hello.pl
8Comments on Hello, World
- Perl statements end with semicolons
- Perl is case-sensitive
- Perl is compiled and run in a single operation
- Comments are # to end of line
- But the first line, #!/usr/local/bin/perl, tells where to find the Perl compiler on your system
- It's usually here or at /usr/bin/perl
- Perl files should have the .pl extension
- How to run the hello.pl Perl program:
- perl hello.pl may work, and may or may not require the first #! line
- ./hello.pl (or .\hello.pl) may work, and requires the first #! line
9Perl in EasyEclipse
- Open the Perl perspective
- Window -> Open Perspective -> Other... -> Perl
- Create a Perl Project
- File -> New -> Project... -> Perl -> Perl Project
- File -> New -> Perl Project (if in the Perl perspective)
- Create a file, and give it the .pl extension
- File -> New -> Perl File
- Write your Perl code
- Create a run configuration
- Run -> Run... -> Perl Local -> New_configuration
- In the Main tab:
- Name: enter a meaningful name (in place of new_configuration)
- Project: Browse, choose project
- File to execute: choose file
- Apply and Run
10Perl Example 2
    #!/usr/bin/perl
    # Remove blank lines from a file
    # Usage: singlespace < oldfile > newfile

    while ($line = <STDIN>) {
        if ($line eq "\n") { next; }
        print "$line";
    }
11More Perl notes
- On the UNIX command line
- < filename means to get input from this file
- > filename means to send output to this file
- In Perl, <STDIN> reads from the input file; STDOUT is the output file (print writes to it by default)
- Scalar variables start with $
- Scalar variables hold strings or numbers, and they are interchangeable
- Examples:
- $priority = 9;
- $priority = '9';
- Array variables start with @
12Perl Example 3
    #!/usr/local/bin/perl
    # Usage: fixm <filenames>
    # Replace \r with \n -- replaces input files

    foreach $file (@ARGV) {
        print "Processing $file\n";
        if (-e "fixm_temp") { die "File fixm_temp already exists!\n"; }
        if (! -e $file) { die "No such file: $file!\n"; }
        # Pipe the file through tr, writing the result to fixm_temp
        open DOIT, "| tr \'\\015' \'\\012' < $file > fixm_temp"
            or die "Can't tr '\\015' '\\012' < $file > fixm_temp\n";
        close DOIT;
        # Move the converted copy over the original
        open DOIT, "| mv -f fixm_temp $file"
            or die "Can't mv -f fixm_temp $file\n";
        close DOIT;
    }
13Comments on example 3
- In # Usage: fixm <filenames>, the angle brackets just mean to supply a list of file names here
- In UNIX text editors, the \r (carriage return) character usually shows up as ^M (hence the name fixm_temp)
- The UNIX command tr '\015' '\012' replaces all \015 characters (\r) with \012 (\n) characters
- The format of the open and close commands is:
- open fileHandle, fileName
- close fileHandle
- "| tr \'\\015' \'\\012' < $file > fixm_temp" says: Take input from $file, pipe it to the tr command, put the output on fixm_temp
14Arithmetic in Perl
    $a = 1 + 2;     # Add 1 and 2 and store in $a
    $a = 3 - 4;     # Subtract 4 from 3 and store in $a
    $a = 5 * 6;     # Multiply 5 and 6
    $a = 7 / 8;     # Divide 7 by 8 to give 0.875
    $a = 9 ** 10;   # Nine to the power of 10, that is, 9^10
    $a = 5 % 2;     # Remainder of 5 divided by 2
    ++$a;           # Increment $a and then return it
    $a++;           # Return $a and then increment it
    --$a;           # Decrement $a and then return it
    $a--;           # Return $a and then decrement it
15String and assignment operators
    $a = $b . $c;   # Concatenate $b and $c
    $a = $b x $c;   # $b repeated $c times
    $a = $b;        # Assign $b to $a
    $a += $b;       # Add $b to $a
    $a -= $b;       # Subtract $b from $a
    $a .= $b;       # Append $b onto $a
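The operators on the last two slides can be tried out in one short script. This is only a sketch: the variable names are illustrative, and `my` (which declares a variable in the current scope) is used so the script runs under `use strict`.

```perl
use strict;
use warnings;

my $n    = 5;
my $pre  = ++$n;          # increment first, then return: $pre is 6, $n is 6
my $post = $n++;          # return first, then increment: $post is 6, $n is 7

my $ratio     = 7 / 8;    # 0.875 (division is not truncated to an integer)
my $power     = 9 ** 10;  # 3486784401
my $remainder = 5 % 2;    # 1

my $word = "ab" . "cd";   # concatenation: "abcd"
my $rep  = "ab" x 3;      # repetition: "ababab"
$word   .= "!";           # append: "abcd!"

print "$pre $post $n $ratio $power $remainder $word $rep\n";
```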
16Single and double quotes
- $a = 'apples';
- $b = 'bananas';
- print $a . ' and ' . $b;
- prints: apples and bananas
- print '$a and $b';
- prints: $a and $b
- print "$a and $b";
- prints: apples and bananas
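A runnable version of the quoting rules, as a sketch. The names `$fruit1` and `$fruit2` are illustrative; they stand in for the two scalars on the slide.

```perl
use strict;
use warnings;

my $fruit1 = 'apples';
my $fruit2 = 'bananas';

my $joined       = $fruit1 . ' and ' . $fruit2;  # explicit concatenation
my $literal      = '$fruit1 and $fruit2';        # single quotes: no interpolation
my $interpolated = "$fruit1 and $fruit2";        # double quotes: variables are expanded

print "$joined\n";        # apples and bananas
print "$literal\n";       # $fruit1 and $fruit2
print "$interpolated\n";  # apples and bananas
```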
17Arrays
- @food = ("apples", "bananas", "cherries");
- But a single element is a scalar, so it is written with $:
- print $food[1];
- prints "bananas"
- @morefood = ("meat", @food);
- @morefood is now ("meat", "apples", "bananas", "cherries")
- ($a, $b, $c) = (5, 10, 20);
18push and pop
- push adds one or more things to the end of a list
- push (@food, "eggs", "bread");
- push returns the new length of the list
- pop removes and returns the last element
- $sandwich = pop(@food);
- $len = @food;   # $len gets length of @food
- $#food   # returns index of last element
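The array operations above, put together in a small script. A sketch only; the variable names other than `@food` and `$sandwich` are made up for illustration.

```perl
use strict;
use warnings;

my @food = ("apples", "bananas", "cherries");

my $new_len  = push(@food, "eggs", "bread");  # push returns the new length: 5
my $sandwich = pop(@food);                    # removes and returns "bread"
my $len      = @food;                         # array in scalar context: 4
my $last_idx = $#food;                        # index of the last element: 3

print "pushed to $new_len, popped $sandwich, $len left, last index $last_idx\n";
print "second item is $food[1]\n";            # indexes start at 0, so this is bananas
```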
19foreach
    # Visit each item in turn and call it $morsel
    foreach $morsel (@food) {
        print "$morsel\n";
        print "Yum yum\n";
    }
20Tests
- Zero is false. This includes: 0, '0', "0", '', ""
- Anything not false is true
- Use == and != for numbers, eq and ne for strings
- &&, ||, and ! are and, or, and not, respectively.
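The truth rules can be checked directly. In this sketch, `truth` is a hypothetical helper written for the example, not a Perl built-in. Note that strings such as '0.0' and '00' count as true, because only the exact string '0' and the empty string are false.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether Perl treats a value as true.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

my @results;
foreach my $value (0, '0', '', '0.0', '00', ' ', 'a') {
    push @results, truth($value);
}
print "@results\n";   # false false false true true true true

# == compares as numbers, eq compares as strings.
my $numeric_equal = (1.0 == 1)     ? "yes" : "no";   # yes
my $string_equal  = ('1.0' eq '1') ? "yes" : "no";   # no
print "$numeric_equal $string_equal\n";
```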
21for loops
- for loops are just as in C or Java
- for ($i = 0; $i < 10; $i++) { print "$i\n"; }
22while loops
    #!/usr/local/bin/perl
    print "Password? ";
    $a = <STDIN>;
    chop $a;          # Remove the last character (\n)
    while ($a ne "fred") {
        print "sorry. Again? ";
        $a = <STDIN>;
        chop $a;
    }
23do..while and do..until loops
    #!/usr/local/bin/perl
    do {
        print "Password? ";
        $a = <STDIN>;
        chop $a;
    } while ($a ne "fred");
24if statements
    if ($a) {
        print "The string is not empty\n";
    } else {
        print "The string is empty\n";
    }
25if - elsif statements
    if (!$a) {
        print "The string is empty\n";
    } elsif (length($a) == 1) {
        print "The string has one character\n";
    } elsif (length($a) == 2) {
        print "The string has two characters\n";
    } else {
        print "The string has many characters\n";
    }
26Why Perl?
- Two factors make Perl important
- Pattern matching/string manipulation
- Based on regular expressions (REs)
- REs are similar in power to those in Formal Languages
- but have many convenience features
- Ability to execute UNIX commands
- The Perl interpreter emulates these commands on non-UNIX platforms
- Often Perl is used simply for its UNIX emulation
27Basic pattern matching
- $sentence =~ /the/
- True if $sentence contains "the"
- $sentence = "The dog bites."; if ($sentence =~ /the/) is false
- because Perl is case-sensitive
- !~ is "does not contain"
28RE special characters
    .   Any single character except a newline
    ^   The beginning of the line or string
    $   The end of the line or string
    *   Zero or more of the last character
    +   One or more of the last character
    ?   Zero or one of the last character
29RE examples
    .*        matches the entire string
    hi.*bye   matches from "hi" to "bye" inclusive
    x +y      matches x, one or more blanks, and y
    ^Dear     matches "Dear" only at beginning
    bags?     matches "bag" or "bags"
    hiss+     matches "hiss", "hisss", "hissss", etc.
30Square brackets
    [qjk]      Either q or j or k
    [^qjk]     Neither q nor j nor k
    [a-z]      Anything from a to z inclusive
    [^a-z]     No lower case letters
    [a-zA-Z]   Any letter
    [a-z]+     Any non-zero sequence of lower case letters
31More examples
    [aeiou]+        matches one or more vowels
    [^aeiou]+       matches one or more nonvowels
    [0-9]+          matches an unsigned integer
    [0-9A-F]        matches a single hex digit
    [a-zA-Z]        matches any letter
    [a-zA-Z0-9_]+   matches identifiers
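A few of these patterns applied to sample strings, as a sketch. `matches` is a hypothetical helper defined only for this example, and the test strings are made up. A pattern matches anywhere in the string unless it is anchored with ^ and $, which is why the integer check adds both anchors.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $string matches $pattern, otherwise 0.
sub matches {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 1 : 0;
}

print matches("one bag", qr/bags?/),    "\n";  # 1: the s is optional
print matches("hisssss", qr/hiss+/),    "\n";  # 1: one or more trailing s
print matches("his",     qr/hiss+/),    "\n";  # 0: needs at least "hiss"
print matches("x   y",   qr/x +y/),     "\n";  # 1: one or more blanks
print matches("Oh Dear", qr/^Dear/),    "\n";  # 0: "Dear" is not at the beginning
print matches("2024",    qr/^[0-9]+$/), "\n";  # 1: digits only
print matches("20x4",    qr/^[0-9]+$/), "\n";  # 0: x is not a digit
print matches("rhythm",  qr/[aeiou]/),  "\n";  # 0: no a, e, i, o, or u
```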
32More special characters
    \n   A newline
    \t   A tab
    \w   Any alphanumeric; same as [a-zA-Z0-9_]
    \W   Any non-word char; same as [^a-zA-Z0-9_]
    \d   Any digit. The same as [0-9]
    \D   Any non-digit. The same as [^0-9]
    \s   Any whitespace character
    \S   Any non-whitespace character
    \b   A word boundary, outside [] only
    \B   No word boundary
33Quoting special characters
    \|   Vertical bar
    \[   An open square bracket
    \)   A closing parenthesis
    \*   An asterisk
    \^   A caret symbol
    \/   A slash
    \\   A backslash
34Alternatives and parentheses
    jelly|cream   Either jelly or cream
    (eg|le)gs     Either eggs or legs
    (da)+         Either da or dada or dadada or...
35Substitution
- =~ is a test, as in: $sentence =~ /the/
- !~ is the negated test, as in: $sentence !~ /the/
- =~ is also used for replacement, as in: $sentence =~ s/london/London/
- This is an expression, whose value is the number of substitutions made (0 or 1)
36The $_ variable
- Often we want to process one string repeatedly
- The $_ variable holds the current string
- If a subject is omitted, $_ is assumed
- Hence, the following are equivalent:
- if ($sentence =~ /under/) ...
- $_ = $sentence; if (/under/) ...
37Global substitutions
- s/london/London/
- substitutes London for the first occurrence of london in $_
- s/london/London/g
- substitutes London for each occurrence of london in $_
- The value of a substitution expression is the number of substitutions actually made
38Case-insensitive substitutions
- s/london/London/i
- case-insensitive substitution will replace london, LONDON, London, LoNDoN, etc.
- You can combine global substitution with case-insensitive substitution
- s/london/London/gi
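The substitution variants from the last few slides can be compared side by side. A sketch with a made-up sample string; the `(my $copy = $text) =~ s/.../.../` form simply copies the string first so the original stays unchanged.

```perl
use strict;
use warnings;

my $text = "london, London and LONDON";

(my $first = $text) =~ s/london/London/;     # first lowercase match only
(my $all   = $text) =~ s/london/London/gi;   # every match, ignoring case

my $copy  = $text;
my $count = ($copy =~ s/london/London/gi);   # value is the number of substitutions: 3

$_ = "london calling";
s/london/London/;                            # no subject given, so $_ is used
my $default = $_;

print "$first\n$all\n$count\n$default\n";
```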
39Remembering patterns
- Any part of the pattern enclosed in parentheses is assigned to the special variables $1, $2, $3, ..., $9
- Numbers are assigned according to the left (opening) parentheses
- "The moon is high" =~ /The (.*) is (.*)/
- Afterwards, $1 = "moon" and $2 = "high"
40Dynamic matching
- During the match, an early part of the match that is tentatively assigned to $1, $2, etc. can be referred to by \1, \2, etc.
- Example:
- \b.+\b matches a single word
- /(\b.+\b) \1/ matches repeated words
- "Now is the the time" =~ /(\b.+\b) \1/
- Afterwards, $1 = "the"
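Both kinds of remembering can be tried in one script. This sketch assumes the patterns are /The (.*) is (.*)/ and /(\b.+\b) \1/; the variable names are illustrative. $1 and $2 are only meaningful after a successful match, so they are copied inside the if.

```perl
use strict;
use warnings;

my ($subject, $state) = ("", "");
if ("The moon is high" =~ /The (.*) is (.*)/) {
    ($subject, $state) = ($1, $2);   # "moon" and "high"
}

my $repeated = "";
if ("Now is the the time" =~ /(\b.+\b) \1/) {
    $repeated = $1;                  # "the": the text that occurs twice in a row
}

print "$subject / $state / $repeated\n";
```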
41tr
- tr does character-by-character translation
- tr returns the number of substitutions made
- $sentence =~ tr/abc/edf/;
- replaces a with e, b with d, c with f
- $count = ($sentence =~ tr/*/*/);
- counts asterisks
- tr/a-z/A-Z/
- converts to all uppercase
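A short sketch of the three uses of tr, with made-up sample strings. Translating a character to itself leaves the string unchanged, so the return value can be used as a count.

```perl
use strict;
use warnings;

my $sentence = "a cab * and * a star *";

my $stars = ($sentence =~ tr/*/*/);         # counts asterisks: 3
(my $upper   = $sentence) =~ tr/a-z/A-Z/;   # "A CAB * AND * A STAR *"
(my $swapped = "abcabc")  =~ tr/abc/edf/;   # a->e, b->d, c->f: "edfedf"

print "$stars\n$upper\n$swapped\n";
```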
42split
- split breaks a string into parts
- $info = "Caine:Michael:Actor:14, Leafy Drive";
- @personal = split(/:/, $info);
- @personal is now ("Caine", "Michael", "Actor", "14, Leafy Drive")
43Associative arrays
- Associative arrays allow lookup by name rather than by index
- Associative array names begin with %
- Example:
- %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");
- Now, $fruit{"bananas"} returns "yellow"
- Note: braces, not parentheses
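A sketch of building and querying the %fruit array. It also uses two built-in functions, `exists` (is this key present?) and `keys` (the list of names); the extra "plums" entry and the variable names are made up for the example.

```perl
use strict;
use warnings;

my %fruit = ("apples", "red", "bananas", "yellow", "cherries", "red");

my $banana_color = $fruit{"bananas"};                    # "yellow"
$fruit{"plums"}  = "purple";                             # add a new pair
my $has_kiwis    = exists $fruit{"kiwis"} ? "yes" : "no";  # "no"
my $pairs        = scalar(keys %fruit);                  # 4 names are stored

# keys returns the names in no particular order, so sort them for stable output.
my @report;
foreach my $name (sort keys %fruit) {
    push @report, "$name is $fruit{$name}";
}
print "$_\n" foreach @report;
```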
44Associative Arrays II
- Can be converted to normal arrays: @food = %fruit;
- You cannot index an associative array, but you can use the keys and values functions:
    foreach $f (keys %fruit) {
        print ("The color of $f is " . $fruit{$f} . "\n");
    }
45Associative Arrays III
- The function each gets key-value pairs
- while (($f, $c) = each(%fruit)) { print "$f is $c\n"; }
46Calling subroutines
- Assume you have a subroutine printargs that just prints out its arguments
- Subroutine calls:
- printargs("perly", "king")
- Prints "perly king"
- printargs("frog", "and", "toad")
- Prints "frog and toad"
47Defining subroutines
- Here's the definition of printargs
- sub printargs { print "@_\n"; }
- Where are the parameters?
- Parameters are put in the array @_
- @_ has nothing to do with $_; they are unrelated
48Returning a result
- The value of a subroutine is the value of the
last expression that was evaluated
    sub maximum {
        if ($_[0] > $_[1]) {
            $_[0];
        } else {
            $_[1];
        }
    }
    $biggest = maximum(37, 24);
49Local variables
- @_ is local to the subroutine, and...
- ...so are $_[0], $_[1], $_[2], ...
- local creates local variables
50Example subroutine
    sub inside {
        local($a, $b);                 # Make local variables
        ($a, $b) = ($_[0], $_[1]);     # Assign values
        $a =~ s/ //g;                  # Strip spaces from
        $b =~ s/ //g;                  #   local variables
        ($a =~ /$b/ || $b =~ /$a/);    # Is $b inside $a or $a inside $b?
    }
    inside("lemon", "dole money");     # true
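For comparison, here is a sketch of the same two subroutines written with `my`, which declares lexically scoped variables in Perl 5, and an explicit `return`. The name `maximum_of` and the variable names are illustrative. The \Q...\E quoting is an addition not on the slide: it makes any special characters in the argument match literally.

```perl
use strict;
use warnings;

# Like maximum, but accepts any number of arguments (hypothetical name).
sub maximum_of {
    my @numbers = @_;              # copy the arguments out of @_
    my $best = shift @numbers;     # start with the first one
    foreach my $n (@numbers) {
        $best = $n if $n > $best;
    }
    return $best;
}

# inside, using my instead of local.
sub inside {
    my ($x, $y) = @_;
    s/ //g for ($x, $y);           # strip spaces from both copies
    my $found = ($x =~ /\Q$y\E/ || $y =~ /\Q$x\E/) ? 1 : 0;
    return $found;
}

print maximum_of(37, 24), "\n";            # 37
print inside("lemon", "dole money"), "\n"; # 1: "dolemoney" contains "lemon"
print inside("lemon", "lime"), "\n";       # 0
```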
51Perl 5
- Perl 5 is usually described as a whole new language
- However, Perl 5 is mostly backward compatible, and there are only a few apparent differences
- The most significant difference is that Perl can now be written in a more object-oriented fashion (like C++ compares to C)
- Perl 4 had three types of data: $scalar, @array, and %hash
- Perl 5 adds another item: the reference
- References are indicated by \
- Perl 5 interpolates arrays into double-quoted strings
- Perl 5 provides reluctant quantifiers (with ?)
- Perl 5 has modules, which are similar to classes
- Perl now provides a handful of modules (just as Java has always provided prewritten classes)
- Perl 5 has auto variables
- Variables may now be declared within a lexical scope (with my)
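Three of these Perl 5 features in a few lines, as a sketch with made-up data: a reference taken with \, an array interpolated into a double-quoted string, and a greedy versus a reluctant quantifier.

```perl
use strict;
use warnings;

my @food = ("apples", "bananas");
my $ref  = \@food;                      # \ takes a reference to the array
push @$ref, "cherries";                 # changes @food through the reference
my $sentence = "I like @food";          # elements are joined with spaces

my $html = "<b>bold</b>";
my ($greedy)    = $html =~ /(<.*>)/;    # as much as possible: "<b>bold</b>"
my ($reluctant) = $html =~ /(<.*?>)/;   # as little as possible: "<b>"

print "$sentence\n$greedy\n$reluctant\n";
```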
52Perl 6
- Whereas Perl 5 is often described as a whole new language, Perl 6 is a whole new language
- The best summary of the differences that I've found is http://perlcabal.org/syn/Differences.html
- Perl 6 status: Vaporware?
- Under development since 1999 or 2000
- Larry Wall: "We're working on it, slowly but surely...or not-so-surely in the spots we're not so sure..."
- Just interesting reading: An interview with Larry Wall, http://lwn.net/2001/features/LarryWall/
53The End