Perl Practical Extraction and Report Language - PowerPoint PPT Presentation

Slides: 49
Provided by: ICT371

1
Perl - Practical Extraction and Report Language
  • Perl is a programming language used in many
    applications, such as text processing. An
    important feature of Perl is its support for
    regular expressions.
  • Useful sites for Perl: www.cpan.org,
    www.activestate.com, etc.
  • A simple, one-line Perl program/script:
  • print "Hello World\n";
  • In Linux, add the following line at the top of
    the program file to tell the system where the Perl
    interpreter is and to enable the warning option:
  • #!/usr/bin/perl -w
  • # the symbol # introduces a comment in Perl
  • Set execute permission on the program file:
    chmod +x hello
  • and execute it: ./hello
  • In Windows, save the program file with the .pl
    extension and execute it with the Perl interpreter:
    c:\>perl hello.pl
  • To enable the warning option, add this line: use
    warnings;

2
$Scalar, @Array and %Hash
  • Scalar, a number or a string, starts with $,
    e.g. $name
  • Array, an array, starts with @, e.g. @names
  • Hash, an associative array, starts with %, e.g.
    %names

    # scalar
    $string = "snm";
    $num1 = 100;
    $num2 = 1.01;
    print "The string is $string\n";
    print "One number is $num1, another number is $num2\n";

    # array and scalar
    $myvar = "my variable";
    @myvar = ("one","two","three","four","five","six");
    print "$myvar\n";      # print value of a scalar
    print "$myvar[1]\n";   # print value of the 2nd element of @myvar
    print "@myvar\n";      # print all elements of @myvar
3
Typing - use strict
  • Typing: declaring the type of a variable is often
    skipped in short Perl programs such as the
    example on the previous page. Perl is referred to
    as a loosely typed language.
  • For longer Perl programs, it is better to add use
    strict, which requires that each variable be
    declared before being used.

    # example of use strict
    use strict;
    our ($x, $y) = split ' ', <STDIN>;   # declare global variables
    sub product {
        my ($num1, $num2) = @_;          # declare local variables
        return $num1 * $num2;
    }
    # rest of the program
    print product($x, $y), "\n";
4
Print and Variable Interpolation
  • The print function is a list operator. It
    accepts a list of things, separated by commas, to
    print.
  • Double quotes " " support variable interpolation.
  • Single quotes ' ' do not support variable
    interpolation.
  • $string = "Perl";
  • $num = 10;
  • print $string, " and ", $num;
  • print "$string and $num\n";
  • print '$string and $num\n';
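The difference between the two quoting styles can be checked with a short sketch. The variable names $double and $single are illustrative and not part of the original slide.

```perl
use strict;
use warnings;

my $string = "Perl";
my $num    = 10;

# Double quotes interpolate variables and escapes such as \n.
my $double = "$string and $num\n";

# Single quotes keep the text literally, including the backslash and the n.
my $single = '$string and $num\n';

print $double;          # prints: Perl and 10
print $single, "\n";    # prints: $string and $num\n
```

Note that the single-quoted version ends with the two characters \ and n rather than a newline, which is why an extra "\n" is passed to print.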

5
Escaping
  • Escaping means that the special functions/meanings
    of characters such as $ and @ are not applied.
    Special characters are also called
    metacharacters.
  • Escaping can also turn some non-special characters
    into something special.
  • A backslash \ is used to apply escaping, as in
    the following example.
  • $num1 = 10;
  • print "\$num1 is $num1\n";
  • # with \$, the special meaning of $ is escaped,
  • # the character $ is printed once,
  • # followed by the string "num1 is " and the value
    # of $num1
  • # with \n, a new line is printed

6
Subroutines
  • A subroutine is a user-defined function
  • can be in the beginning, middle or end of a Perl
    program
  • start with sub and then its name
  • a pair of curly brackets { } is required to
    enclose the code
  • the area between the two brackets is called a
    block
  • the prefix & is used when calling a subroutine.
  • sub print_results { print "\$num is $num\n"; }
  • $num = 10;
  • &print_results;   # call the subroutine
  • $num++;
  • &print_results;   # call the subroutine

7
Subroutines: passing & return
  • To pass parameters to the subroutine, use ( ).
    The parameters are passed through the list @_
  • At the end of the subroutine, use return to
    return the result.

    sub product {
        ($num1, $num2) = @_;   # receive two values from main
        return $num1 * $num2;
    }
    $x = 15;
    $y = 20;
    print product($x, $y), "\n";
    # product($x, $y) passes the values of $x and $y to the
    # subroutine; the result returned by the subroutine is
    # then printed.
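Because every argument arrives through @_, a subroutine is not limited to a fixed number of parameters. The following is a minimal sketch of that idea; extending product to accept any number of values is an assumption made for illustration, not something the slide does.

```perl
use strict;
use warnings;

# product() multiplies every value it receives through @_.
sub product {
    my @numbers = @_;              # copy the arguments into a local array
    my $result  = 1;
    $result *= $_ foreach @numbers;
    return $result;
}

my ($x, $y) = (15, 20);
print product($x, $y), "\n";       # prints: 300
print product(2, 3, 4), "\n";      # prints: 24
```

Copying @_ into a my variable keeps the subroutine from accidentally modifying the caller's values.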
8
Testing 'truth' in Perl
  • What is 'true' in Perl
  • Any string is true except for "" and "0".
  • Any number, positive or negative, is true except
    for 0.
  • Any undefined variable is false. An undefined
    variable is one which doesn't have a value.
  • $a = 0; $b = ""; $c = "0";
  • if (!$a) { print '$a is ',$a," which is false\n"; }
  • if (!$b) { print '$b is ',$b," which is false\n"; }
  • if (!$c) { print '$c is ',$c," which is false\n"; }
  • if (!$d) { print '$d is ',$d," which is false\n"; }
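The rules above can be tried against a few borderline values. This sketch uses a hypothetical helper, is_true, which is not part of the original slides; the sample values are also chosen only for illustration.

```perl
use strict;
use warnings;

# is_true() is a hypothetical helper that reports how Perl evaluates a value.
sub is_true {
    my ($value) = @_;
    return $value ? "true" : "false";
}

# "0.0", "00" and " " are strings other than "" and "0", so they are true.
my @samples = (0, "", "0", "0.0", "00", " ", -1, undef);
foreach my $value (@samples) {
    my $shown = defined $value ? "'$value'" : "undef";
    print "$shown is ", is_true($value), "\n";
}
```

The string "0.0" is a common surprise: it is true as a string even though it is numerically equal to zero.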

9
Comparison Operators
    Comparison                 Numeric   String
    Equal                      ==        eq
    Not equal                  !=        ne
    Greater than               >         gt
    Less than                  <         lt
    Greater than or equal to   >=        ge
    Less than or equal to      <=        le

  • $num1 = 15; $num2 = 15;
  • if ($num1 == $num2) { print "$num1 equal $num2.\n"; }
  • $name1 = "John"; $name2 = "May";
  • if ($name1 == $name2) { print "$name1 equal $name2.\n"; }
  • # to compare strings, eq should be used instead of ==
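A short sketch of why the choice of operator matters: the same two values can be equal numerically and different as strings. The values "10" and "10.0" are chosen here for illustration only.

```perl
use strict;
use warnings;

my ($first, $second) = ("10", "10.0");

# Numeric comparison converts both strings to numbers: 10 == 10.0
my $numeric_equal = ($first == $second) ? 1 : 0;

# String comparison looks at the characters: "10" differs from "10.0"
my $string_equal = ($first eq $second) ? 1 : 0;

print "numeric: $numeric_equal, string: $string_equal\n";   # numeric: 1, string: 0
```

With warnings enabled, using == on non-numeric strings such as "John" and "May" produces an "isn't numeric" warning, which is a useful hint that eq was intended.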

10
Standard Input <STDIN>, chomp & shift
  • To get input from the user through the keyboard,
    use the <STDIN> operator.
  • $name = <STDIN>;
  • print "Your input is $name\n";
  • The chomp function removes the trailing newline.
  • Compare the above example with the one below:
  • $name = <STDIN>;   # or, chomp($name = <STDIN>);
  • chomp $name;
  • print "Your input is $name\n";
  • Using shift
  • $name = shift;
  • $class = shift;
  • print "Your name is $name,\tyour class is $class\n";
  • # run this program with two arguments such as,
  • # c:\name.pl john 3A   or   ./name john 3A

11
Arrays
  • An array is an ordered list of scalar variables.
  • Each element in the array is indexed by a number
  • @names = ("Mary","Alex","John","Albert","Henry");
  • print @names;   # no space between two elements
  • print "\nThe elements of \@names are @names\n";
  • print "The first element is $names[0] \n";
  • print "The first two elements are @names[0,1]\n";
  • print "The first three elements are @names[0..2]\n";
  • print "The last element is $names[-1]\n";
  • print "The index number of the last element is $#names \n";
  • print 'There are ',scalar(@names)," elements in the array\n";
  • An array can be initialized using spaces instead
    of commas:
  • @names = qw(Mary Alex John Albert Henry);
  • # quote words: qw splits the list into words

12
Arrays: push, pop, shift & unshift
  • @names = ("Albert", "Alex", "Anna");
  • # Add a new element
  • push @names, "Bob";        # Add to the end of array
  • print "@names\n";          # "Albert Alex Anna Bob" is printed
  • unshift @names, "Betty";   # Add to the start of array
  • print "@names\n";          # Betty Albert Alex Anna Bob
  • # remove an element and return its value
  • $name1 = pop @names;       # remove from the "end" of array
  • $name2 = shift @names;     # remove from the "start" of array
  • print "$name1 $name2\n";   # "Bob Betty" is printed

13
Arrays, For, Foreach, $_
  • A simple for loop for accessing elements in an
    array.
  • @names = ("Mary","Alex","John","Albert","Henry");
  • for ($x = 0; $x <= $#names; $x++) {
  •     print "$names[$x]\n";
  • }
  • The foreach loop is used to loop through
    arrays and hashes
  • foreach $person (@names) {
  •     print "$person";
  • }
  • # elements in the array are accessed and printed
    # one by one
  • The default input $_
  • foreach (@names) {
  •     print "$_ ";   # or just, print;
  • }
  • # if no variable is specified, $_ is printed by
    # default

14
next & last in a loop
  • Stopping a while loop that never ends.
  • while (1) {   # this while loop never ends!
  •     print "press CTRL-C to stop!\n";
  • }
  • The last operator
  • @names = ('Mr Ho','Mrs Leung','Miss Lee','Dr Wong','Ms Chan');
  • foreach $person (@names) {
  •     last if $person =~ /Dr /;
  •     print "$person\n";
  • }
  • # after printing "Miss Lee", the loop detects "Dr "
    # and exits
  • The next operator
  • @names = ('Mr Ho','Mrs Leung','Miss Lee','Dr Wong','Ms Chan');
  • foreach $person (@names) {
  •     next if $person =~ /Mr /;
  •     print "$person\n";
  • }
  • # the program skips "Mr Ho", then prints the rest

15
Hashes: Associative Arrays

    # one way to define a hash
    %countries = ('CH','China','TH','Thailand','JA','Japan');
    # another way to define a hash
    %countries = (CH=>'China',TH=>'Thailand',JA=>'Japan');
    print "Enter the country code: ";
    chomp ($find = <STDIN>);
    print "\nCountry code: $find\tCountry: $countries{$find}\n";

  • A hash is formed by pairs of key and value. Each
    key is associated with a value.
  • Each key of a hash must be unique. If a key is
    defined twice, the second value overwrites the
    first.
  • The values pointed/indexed by different keys can
    be the same.
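The overwrite rule for duplicate keys can be seen in a small sketch. The value 'Switzerland' and the use of the built-in exists function are additions for illustration; neither appears in the original slide.

```perl
use strict;
use warnings;

# The key CH appears twice; the second value overwrites the first.
my %countries = (
    CH => 'China',
    TH => 'Thailand',
    CH => 'Switzerland',
);

print scalar(keys %countries), " keys\n";    # prints: 2 keys
print "CH is $countries{CH}\n";              # prints: CH is Switzerland

# exists checks whether a key is present without creating it.
foreach my $code (qw(TH JA)) {
    if (exists $countries{$code}) {
        print "$code is $countries{$code}\n";
    } else {
        print "$code is not defined\n";
    }
}
```

Checking with exists before printing avoids the "uninitialized value" warning that a lookup of a missing key would otherwise trigger under use warnings.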

16
@array and %hash
  • In an array, elements/values are accessed by
    their index number.
  • print "$myarray[1]";
  • In a hash, each value is associated with a
    unique string, a key.
  • print "$myhash{'CH'}";
  • To access an array element, use an index number
    and [ ]
  • To access a hash element, use a key and { }

17
@array or %hash
  • Using hashes
  • When you want to look something up by a keyword,
    for example, a program returns an IP address when
    given a host name.
  • To relate one set of data (keys) to another set
    of data (values).
  • Using arrays
  • Use an array when you need an ordered list of
    elements. The elements in an array always stay
    in the order in which they were created.
  • Unlike an array, a hash is not an "ordered" list.

    %countries = (CH=>'China',TH=>'Thailand',JA=>'Japan',HK=>'Hong Kong');
    print %countries;   # the print results show no recognizable sequence
18
Accessing Hash Elements
  • Assign/add a pair of key and value:
  • $countries{HK} = 'Hong Kong';
  • Remove a pair of key and value:
  • delete $countries{HK};
  • Print all keys:
  • print keys %countries;
  • Print all values:
  • print values %countries;
  • Print a slice of the hash:
  • print @countries{'CH','TH'};
  • Print how many elements are in the hash:
  • print scalar(keys %countries);

19
Hash: each, foreach
  • %countries = (CH=>'China',TH=>'Thailand',JA=>'Japan',HK=>'Hong Kong');
  • while (($code, $name) = each %countries) {
  •     print "The key $code points to $name\n";
  • }
  • # each returns one pair of key and value per call
  • foreach $code (keys %countries) {
  •     print "The key $code points to $countries{$code}\n";
  • }
  • # a "temporary" variable $code is used to store
    # the key
  • foreach (keys %countries) {
  •     print "The key $_ points to $countries{$_}\n";
  • }
  • # if a variable is not specified, the default,
    # $_, is used

20
Hash sorting
  • %countries = (CH=>'China',TH=>'Thailand',JA=>'Japan',HK=>'Hong Kong');
  • # sorting based on the keys (strings):
  • # hash elements are accessed in the order in which
    # the keys are sorted
  • # a simple sort
  • foreach (sort keys %countries) {
  •     print "The key $_ points to $countries{$_}\n";
  • }
  • # a reverse sort
  • foreach (reverse sort keys %countries) {
  •     print "The key $_ points to $countries{$_}\n";
  • }
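The slide sorts by key. Sorting by value is a closely related case that needs a custom comparison block; the sketch below is an illustrative extension rather than part of the original deck. It relies on the special sort variables $a and $b and on the string comparison operator cmp.

```perl
use strict;
use warnings;

my %countries = (CH => 'China', TH => 'Thailand', JA => 'Japan', HK => 'Hong Kong');

# Sort the keys by comparing the values they point to.
my @by_value = sort { $countries{$a} cmp $countries{$b} } keys %countries;

foreach my $code (@by_value) {
    print "The key $code points to $countries{$code}\n";
}
```

The values come out as China, Hong Kong, Japan, Thailand, so the keys are visited in the order CH, HK, JA, TH.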

21
Regular Expressions (regex)
  • What is a regular expression? A regular
    expression is simply a string that describes a
    pattern. Examples are the patterns typed into a
    search engine and the patterns used to list files
    in a directory (dir *.exe).
  • In Perl, the patterns described by regular
    expressions are used to search strings, extract
    parts of strings, search and replace, etc. For
    example,
  • if ("Hello World" =~ /World/) { print "it matches\n"; }
  • /World/ tells Perl to search for the pattern
    enclosed by the two slashes //, in this case the
    word World.
  • The =~ operator associates the string with the
    regex and produces a true value if the regex
    matched.

22
Examples of regex

    $_ = "perl for System Administration";     # the string to match with
    if ($_ =~ /perl/) { print "1 Found perl\n"; }
    if (/perl/) { print "2 Found perl\n"; }    # same as above, $_ =~ not used
    if (/PeRl/) { print "3 Found PeRl\n"; }    # fail, case sensitive
    if (/PeRl/i) { print "4 Found PeRl\n"; }   # with /i, not case sensitive
    print "5 Found!\n" if / /;                 # "if" can be put after "print"
    # if the regex / / finds a space in $_, print the string "5 Found!"
    print "6 Found!\n" unless $_ !~ / /;       # 'unless' & '!~' are both negative
    $find = " for ";                           # create some variables to search for
    if (/$find/) { print "8 Found $find in $_\n"; }
23
Character Class
  • The square brackets [ ] enclose the single
    characters to be matched.
  • To negate the entire regex, change =~ to !~, for
    example,
  • ($_ !~ /[KC]ar[el]/)
  • To negate specific characters, use the caret ^,
    for example,
  • /[KC]ar[^el]/
  • # match K or C, then ar, then anything EXCEPT e
    # or l

    @names = qw(Karl Care Karen Card);
    foreach (@names) {
        if (/[KC]ar[el]/) { print "$_ is matched!\n"; }
    }
    # matches K or C, then ar, then e or l
24
More examples of regex matching
  • Start with: use ^, e.g., /^n/ matches those
    strings that begin with n.
  • End with: use $, e.g., /s$/ matches those strings
    that end with s.
  • Matching x times
  • Zero or more times: use *
  • One or more times: use +
  • Zero or one time: use ?
  • Exactly n times: use {n}
  • At least n times: use {n,}
  • At least n but not more than m times: use {n,m}
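The anchors and quantifiers listed above can be tried on a small word list. This is a minimal sketch; the word list and the use of the built-in grep function to collect matching words are choices made for illustration.

```perl
use strict;
use warnings;

my @words = qw(n no noon spoons s);

my @starts_with_n = grep { /^n/ }    @words;   # begins with n
my @ends_with_s   = grep { /s$/ }    @words;   # ends with s
my @double_o      = grep { /o{2}/ }  @words;   # contains exactly "oo" in a row
my @n_zero_more_o = grep { /^no*$/ } @words;   # n followed by zero or more o
my @n_one_more_o  = grep { /^no+$/ } @words;   # n followed by one or more o

print "starts with n: @starts_with_n\n";   # n no noon
print "ends with s:   @ends_with_s\n";     # spoons s
print "double o:      @double_o\n";        # noon spoons
print "no*:           @n_zero_more_o\n";   # n no
print "no+:           @n_one_more_o\n";    # no
```

Combining ^ and $ in one pattern forces the whole string to match, which is why "noon" is rejected by /^no*$/ even though it starts with "no".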

25
Return the match
  • A regex can return what is found to a specified
    variable.
  • $star gets an empty value because [0-9]* matches
    0 or more characters from 0 to 9, so it succeeds
    with an empty match at the very start of the
    string.
  • $plus gets the value 2009 because [0-9]+ matches
    one or more characters from 0 to 9. If no 0 to 9
    is found the match will fail. Once a 0-9 is found,
    the match continues as long as the next character
    is 0-9, then it stops.

    $_ = 'This year is 2009';
    ($one) = /([0-9])/;      # return one digit only
    ($star) = /([0-9]*)/;    # match zero or more characters
    ($plus) = /([0-9]+)/;    # match one or more characters
    print "\$one=$one \$star=$star \$plus=$plus\n";
    # print result is: $one=2 $star= $plus=2009
26
Return the match: $1, $2
  • With parentheses ( ) around part of the regex, the
    first match is put into a variable called $1.
  • In the example below, the regex begins by matching
    <. When < has been found, the first pair of
    parentheses ( ) matches staff and puts the value
    into the variable $1.
  • The regex then continues by matching the
    character @. When @ is found, the 2nd pair of
    parentheses ( ) matches vtc.edu.hk and puts the
    value into the variable $2.

    $_ = 'My email address is <staff@vtc.edu.hk>.';
    /<(staff)\@(vtc.edu.hk)>/;
    print "$1 at $2 \n";
    # print result: staff at vtc.edu.hk
27
Match & return the unknown
  • The last example requires that the email address
    be known in advance.
  • In the example below, the regex does not know in
    advance what it will match.
  • In the /(.*)/ regex, the dot (or period) matches
    any character, and the * matches zero or
    more characters.
  • In the regex /<(.*)>/, a match starts with < and
    ends with >
  • The regex returns any characters between < and >
    into $1

    $_ = 'My email address <staff@vtc.edu.hk>.';
    print "$1 \n" if /(.*)/;      # result: My email address <staff@vtc.edu.hk>.
    print "$1 \n" if /<(.*)>/;    # result: staff@vtc.edu.hk
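The two ideas, capture groups and matching unknown text, can be combined to split an unknown address into its user and domain parts. The sketch below is one possible approach, not the deck's own code: it uses negated character classes instead of .* so that each group stops at the @ or > delimiter, and the names $user and $domain are illustrative.

```perl
use strict;
use warnings;

my $text = 'My email address is <staff@vtc.edu.hk>.';

# In list context a match returns the captured groups.
# [^\@>]+ matches one or more characters that are neither @ nor >.
my ($user, $domain) = $text =~ /<([^\@>]+)\@([^>]+)>/;

if (defined $user) {
    print "$user at $domain\n";    # prints: staff at vtc.edu.hk
} else {
    print "no address found\n";
}
```

Assigning the captures to named variables straight away avoids relying on $1 and $2, which are overwritten by the next successful match.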
28
Substitutions and Split
  • Search and replace examples
  • $_ = "your home , my home, sweet home";
  • s/home/apple/;     # replace once
  • print $_,"\n";
  • s/home/apple/g;    # replace many
  • print $_,"\n";
  • Using split
  • @fields = split /:/, "name:age:telephone";
  • @fields = split /,/, "name,age,telephone";
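A short sketch that puts substitution and split together. It relies on two facts about the built-ins: s///g returns the number of replacements made, and split accepts any regex as the separator. The variable names and the use of join are illustrative additions.

```perl
use strict;
use warnings;

my $text = "your home, my home, sweet home";

# s///g returns the number of replacements it made.
my $count = ($text =~ s/home/apple/g);
print "$count replacements: $text\n";
# prints: 3 replacements: your apple, my apple, sweet apple

# Split on a comma followed by optional whitespace.
my @parts = split /,\s*/, $text;
print scalar(@parts), " parts, last is '$parts[-1]'\n";
# prints: 3 parts, last is 'sweet apple'

# join is the reverse of split.
print join(" | ", @parts), "\n";
# prints: your apple | my apple | sweet apple
```

Using /,\s*/ rather than /,/ removes the space after each comma, so the fields need no further trimming.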

29
Calling external commands/processes
  • There are several ways for Perl to start external
    commands/processes, such as system and
    backticks/backquotes.
  • The system function
  • It calls and runs an external command/process,
    e.g.,
  • system("dir");
  • Backticks/backquotes can also start an external
    command/process
  • Note that the single quote ' is different from
    the backquote `
  • $dir = `dir`;
  • print $dir;

30
Using system
  • Using the system function together with
    external commands such as the Windows net use
    and net user commands, you can perform various
    system administration tasks, for example,
  • system("net user abc 123 /add");
  • # create a user account abc with password 123
  • system("net use drive_letter: \\\\xxx\\yyy 123 /user:abc");
  • # drive_letter, such as z, is the drive that you
    # want to map
  • # xxx is the server name or IP address
  • # yyy is the shared folder name
  • # abc is the username and 123 is the password

31
Opening Files
  • Assume you have a file called data.txt in the c:\
    directory. The following program opens the
    file and reads its content.
  • If the open operation fails, the code after or
    is evaluated.
  • die exits the program after
    printing the specified message.
  • $! provides the error message.
  • The special variable $. is the current line
    number, starting at 1.
  • The special variable $_ is the default variable.
    If there is no user-defined variable, $_ is
    assigned the content read from the input file.

    $data = "c:\\data.txt";
    open DATA, $data or die "Cannot open $data to read: $!";
    while (<DATA>) {
        print "Line $. is $_";
    }
32
Opening Files
  • Variables which represent files are called "file
    handles"; they do not begin with a special
    character. An example is the file handle DATA
    used on the previous slide.
  • The input operator, the angle brackets <>, reads
    from the beginning of the line up to and
    including the first newline. It returns one line
    from the file handle (or from standard input in
    the case of <STDIN>).
  • The data read by the <> operator goes into $_
  • On the next iteration of the loop, data is read
    from where the last read left off, up to the next
    newline, and so on until there is no more data
    in the file. When that happens the condition is
    false and the loop terminates.
  • The <> operator returns undef when there is no
    more input.
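The read loop can be tried without a real file by opening a string as an in-memory file. This is a sketch under a few assumptions: the content string is made up, and it uses the three-argument form of open with a lexical file handle ($fh) instead of the bareword handle DATA shown on the slides.

```perl
use strict;
use warnings;

# Hypothetical file content held in a string so the example needs no real file.
my $content = "first line\nsecond line\nthird line\n";

# Three-argument open with a lexical file handle; '<' means read.
open(my $fh, '<', \$content) or die "Cannot open in-memory file: $!";

my @lengths;
while (my $line = <$fh>) {       # <$fh> returns undef at end of input
    chomp $line;                 # remove the trailing newline
    print "Line $. is '$line'\n";
    push @lengths, length $line;
}
close $fh;
```

To read a real file, replace \$content with the file name; the loop itself stays the same.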

33
Writing & appending data to a File
  • To open a file for writing data, add > before the
    filename.
  • To open a file for appending data, add >> before
    the filename.
  • When printing to a file, just specify the name of
    the file handle.
  • Closing a file after use is not mandatory unless
    you wish to open another file.
  • Note that localtime returns the current
    time.

    $out = "out.txt";
    open OUT, ">$out" or die "Cannot open $out for write: $!";
    for $i (1..10) {
        $time = localtime;
        print OUT "$i The time is now ", $time, "\n";
    }
    close OUT;
34
Directory and File Tests
  • To check if the file ($filename) exists in the
    current directory:
  • print "file exists\n" if -e $filename;
  • To check how long it has been since the file was
    modified:
  • $age = -M $filename;
  • To get the size of a file:
  • $size = -s $filename;

    Test   Meaning
    -r     readable
    -w     writable
    -s     size (bytes)
    -f     a file
    -d     a directory
    -M     modification age (days)
    -A     access age (days)
35
Directory & File Operations
  • Moving around the directory tree (change
    directory)
  • chdir "/etc";
  • Globbing produces a list of files/sub-directories
    in the current directory
  • @files = glob "*";          # get names of files & sub-directories
  • @pl_files = glob "*.pl";
  • Making and removing directories
  • mkdir "temp" or warn "cannot make dir: $!";
  • rmdir "temp";               # remove empty directory only
  • Renaming files
  • rename "old.txt", "new.txt";

36
Directory Handle & readdir()
  • Opening a directory is similar to opening a file.
    The directory handle can be passed to the
    function readdir( ) to get a list of all of the
    files and sub-directories.

    chdir "c:\\Windows";
    opendir DIR, "." or die "Can't open the directory: $!\n";
    @names = readdir(DIR);
    foreach $name (@names) {
        # print the name if it is a directory.
        if (-d $name) { print "found a directory: $name\n"; }
        # print the name if it is a file.
        if (-f $name) { print "found a file: $name\n"; }
    }
    closedir(DIR);
37
Referencing Data
  • References indirectly refer to other data
  • References are like pointers, but done right
  • Backslash operator creates a reference
  • References are scalars
  • my @fruit = qw(apple banana cherry);
  • my $fruitref = \@fruit;

38
Dereferencing Data
  • Dereferencing yields the data
  • Appropriate symbol dereferences original data
  • Arrow operator (->) dereferences items
  • my @fruit = qw(apple banana cherry);
  • my $fruitref = \@fruit;
  • print "I have these fruits: @$fruitref.\n";
  • print "I want a $fruitref->[1].\n";
  • # I have these fruits: apple banana cherry.
  • # I want a banana.

39
Anonymous Data
  • References allow you to create anonymous data
  • Referencing arrays and hashes is common
  • Build unnamed arrays with brackets ([ ])
  • Build unnamed hashes with braces ({ })
  • my $fruits = ["apple", "bananas", "cherries"];
  • my $wheels = { unicycle => 1,
  •                bike => 2,
  •                tricycle => 3,
  •                car => 4,
  •                semi => 18 };
  • print "A car has $wheels->{car} wheels.\n";
  • # A car has 4 wheels.
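Anonymous arrays and hashes can be nested to build larger structures. The following sketch is illustrative only: the variable $menu and its contents are hypothetical, and it shows how the arrow operator and the @{ ... } form reach into nested data.

```perl
use strict;
use warnings;

# A hash reference whose values are themselves references (hypothetical data).
my $menu = {
    fruit  => [ "apple", "banana", "cherry" ],
    wheels => { bike => 2, car => 4 },
};

print "Second fruit: $menu->{fruit}->[1]\n";        # Second fruit: banana
print "A car has $menu->{wheels}{car} wheels.\n";   # A car has 4 wheels.

# Dereference a whole array with @{ ... } (this makes a copy).
my @fruit = @{ $menu->{fruit} };
print "There are ", scalar(@fruit), " fruits: @fruit\n";

# Add to the anonymous array through the reference.
push @{ $menu->{fruit} }, "date";
my $count = scalar @{ $menu->{fruit} };
print "Now $count fruits\n";                        # Now 4 fruits
```

The arrow between adjacent subscripts is optional, so $menu->{wheels}{car} and $menu->{wheels}->{car} refer to the same element.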

40
Perl Modules
  • Perl modules can be added to a standard Perl
    installation, for example, the Net::LDAP module
    for directory (LDAP) applications and the module
    Net::SNMP for network management applications.
  • Once a module is installed, its documentation is
    available through the standard "perldoc" command,
    e.g. perldoc Net::LDAP.
  • Two basic types of module: Function and Object.
  • Functional modules export new subroutines and
    variables into your program.
  • Using object type modules: create an object,
    then access methods through the object.
  • The Net::LDAP and Net::SNMP modules are examples
    of object type modules.

41
Perl Objects
  • Perl objects are special references that come
    bundled with subroutines (called "methods") and
    data.
  • To use Perl objects, first declare the use of an
    object type module, such as,
  • use Net::LDAP;
  • Then instantiate an object (create an instance),
    such as,
  • $ldap = Net::LDAP->new("localhost", port=>389, version=>3);
  • Refer to the examples on the next slide for how
    to use methods via the object.
  • $ldap is an object of class Net::LDAP
  • Interact with objects through object methods like
    print
  • Notice that the arrow -> is used to invoke a
    method.

42
Net::LDAP module: session & bind

    #!/usr/bin/perl -w
    use Net::LDAP;
    # on failure new() returns undef and sets $@
    $ldap = Net::LDAP->new("localhost", port=>389, version=>3)
        or die "cannot make session, $@\n";
    # Anonymous bind
    $result = $ldap->bind();
    # Bind with a DN and password
    $result = $ldap->bind("cn=ict,dc=csa-snm,dc=com", password=>"snm");
    $error = $result->error();
    if ($result->code()) {
        print "cannot bind successfully, $error\n";
        exit 1;
    }
43
Net::LDAP module - search
  • $mesg = $ldap->search(
  •     base => "dc=csa-snm,dc=com",
  •     scope => "sub",
  •     filter => "(objectClass=person)",
  •     attrs => ["cn", "sn"]
  •     # to read all attributes: attrs => ["*"]
  • );
  • foreach $myentry ($mesg->all_entries()) {
  •     $myentry->dump();                     # display all attributes
  •     $temp = $myentry->get_value("cn");
  •     print $temp;                          # get and display an attribute
  • }
  • $ldap->unbind();

44
Net::LDAP module: add, modify, delete
  • # refer to page 46 for the creation of the object
    # $ldap
  • $dn = "cn=john,ou=staff,dc=csa-snm,dc=com";
  • # Add an entry
  • $result = $ldap->add( $dn,
  •     attrs => [ cn => 'john',
  •                objectClass => 'person',
  •                sn => 'ho' ] );
  • # Modify an entry by adding an attribute value
  • $result = $ldap->modify( $dn,
  •     add => { description => "hi" } );
  • # Modify an entry by replacing an attribute value
  • $result = $ldap->modify( $dn,
  •     replace => { description => "hi, there" } );
  • $result = $ldap->delete($dn);   # Delete an entry

45
Using Net::SNMP module
  • The modules Net::SNMP, Net::SNMP::HostInfo and
    Net::SNMP::Interfaces provide an OO interface to
    SNMP.
  • One Net::SNMP object corresponds to one remote
    SNMP agent or manager.
  • Methods that require a response return a hash
    reference containing the query results. Keys in
    the hash are dotted OIDs.
  • Methods return an undefined value on failure. The
    error() method shows the cause of failure.
  • Method parameters use a dashed-option naming
    style:
  • $object->method( -argument => $value );

46
Net::SNMP session()
  • ($session, $error) = Net::SNMP->session(
  •     -hostname => $hostname, -port => $port,
        -version => $version,
  •     -community => $community,         # v1/v2c
  •     -username => $username,           # v3
  •     -authkey => $authkey,             # v3
  •     -authpassword => $authpasswd,     # v3
  •     -authprotocol => $authproto,      # v3
  •     -privkey => $privkey,             # v3
  •     -privpassword => $privpasswd,     # v3
  •     -privprotocol => $privproto,      # v3
  • );   # some arguments for session() are not shown
         # here

47
Net::SNMP: get, getnext, getbulk, set
  • $result = $session->get_request(
  •     -varbindlist => \@oids );
  • $result = $session->get_next_request(
  •     -varbindlist => \@oids );
  • $result = $session->get_bulk_request(
  •     -nonrepeaters => $non_reps,
  •     -maxrepetitions => $max_reps,
  •     -varbindlist => \@oids );
  • $result = $session->set_request(
  •     -varbindlist => \@oid_value );

48
Net::SNMP - example
  • #!/usr/bin/perl -w
  • use Net::SNMP;
  • $session = Net::SNMP->session(   # create a session
                                     # object
  •     -hostname => "localhost",
  •     -community => "snm",
  •     -port => 161,
  •     -version => 2 );
  • @array = ("1.3.6.1.2.1.1.3.0", "1.3.6.1.2.1.1.4.0");
  • $result = $session->get_request(
  •     -varbindlist => \@array );
  • foreach $oid (keys %$result) {
  •     print "OID: $oid and VALUE: $result->{$oid}\n";
  • }
  • $session->close;