Outline - PowerPoint PPT Presentation

Provided by: johngref
Transcript and Presenter's Notes

Title: Outline


1
Outline
Outline
  • Quiz 4
  • Lab 1 (Quiz 3) Solution
  • Scoping
  • Algorithm efficiency
  • Sorting
  • Hashes
  • Review for midterm
  • Programming assignment 3

2
Lab 1: A More Elegant Tester
Lab1
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • # File: regex_tester.pl
  • # Author: Jim Logan
  • # Date: 6 October 2009
  • # Fully interactive version (i.e., no recompiles required) of a regular expression
  • # tester based on a script by Fernando J. Pineda as presented to
  • # class of BINF623 by Jeff Solka on 10/5/09.
  • # Particularly useful in an Eclipse environment using its cut and paste facility.
  • # instructions for use
  • print "\nAccepts keyboard entry of a regular expression and then permits\n";
  • print "successive entry of strings to test that expression.\n";
  • print "Square brackets in output indicate the text that matched pattern\n\n";
  • print "Note: Depending upon the environment (e.g. Eclipse), you may be\n";
  • print "able to cut and paste into both the \"Next expression\" and the\n";
  • # initialization
  • my $regex = '/./';    # default regex to start and to demonstrate
  • my $string = 'This is a test string';
  • my $input = "";
  • my $stripped_regex = "";
  • while (1) {    # outer loop to sequence regular expressions
  •     print "\nCurrent regular expression: $regex\n";
  •     print "Enter a new expression to change or ENTER to continue without change.\n";
  •     print "(\"quit\" terminates the program)\n";
  •     print "New expression: ";
  •     $input = <STDIN>;
  •     chomp $input;
  •     if ($input =~ /^q.*/i) { exit; }
  •     if ($input !~ /^$/) {
  •         $regex = $input;
  •     }

3
Lab 1: A More Elegant Tester
Lab1
  •     $stripped_regex = substr($regex, 1, length($regex) - 2);
  •     # User includes the two slashes for a regular expression
  •     # but they are stripped here so that variable is just the pattern
  •     # that will be interpolated in /pattern/ context.
  •     while (1) {    # inner loop to sequence strings to test the expression
  •         print "\nCurrent test string: $string\n";
  •         print "Enter a new expression to change or ENTER to reset the regex.\n";
  •         print "New test string: ";
  •         $input = <STDIN>;
  •         chomp $input;
  •         if ($input =~ /^$/) {    # for blank line, go back to set expression
  •             last;
  •         } else {
  •             $string = $input;    # else run regex over input
  •         }
  •         if ($string =~ /$stripped_regex/) {
  •             print("$`[$&]$'\n");    # show match in context of input
  •         } else {
  •             print("no match\n");
  •         }
  •     }
  • }
  • exit;    # never used

4
Lab 1 Solution
Lab1
  • 1. What is a pattern that matches the substring world occurring
  • anywhere in the input string, e.g.
  • hello cold cruel world
  • hello world news tonight
  • helloworld.pl is a script
  • Solution
  • /world/
  • 2. What is a pattern that matches the
  • word world occurring anywhere in
  • the input string, e.g.
  • hello cold cruel world
  • hello world news tonight
  • but not
  • helloworld.pl is a script
  • Solution
  • /\bworld\b/

5
Lab 1 Solution
Lab1
  • 3. What is a pattern that matches the
  • word world only if it occurs at the end
  • of the string, i.e.
  • hello cold cruel world
  • but not
  • next is world news tonight
  • hello cold cruelworld
  • Solution
  • /\bworld$/
  • 4. What is a pattern that matches a
  • string that starts with the word hello
  • OR ends in the word world, e.g.
  • hello and good night
  • thats all for tonight world
  • Solution
  • /^hello\b|\bworld$/

6
Lab 1 Solution
Lab1
  • 5. What is a pattern that matches a
  • string that starts with the word hello
  • OR bye, AND ends with the word
  • world, e.g.
  • bye cold cruel world
  • hello cold cruel world
  • but not
  • hello cold cruel world?
  • hello cold cruelworld
  • Solution
  • /^(hello|bye)\b.*\bworld$/
  • 6. What is a pattern that matches a
  • substring world occurring 1 or more
  • times at
  • the end of the line, e.g.
  • This string ends in world
  • This string ends in worldworld
  • This string ends in worldworldworld
  • Solution
  • /(world)+$/
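
The patterns for questions 3 to 6 depend on anchors (^ and $), alternation (|) and quantifiers (+ and *). The sketch below checks one possible reading of each pattern against the example strings from the slides. The patterns are an assumed reconstruction rather than an official answer key, and the helper name matches is hypothetical.

```perl
use strict;
use warnings;

# Questions 3-6 with explicit anchors and quantifiers.
# ASSUMPTION: these readings are a reconstruction, not the official key.
my %pattern = (
    3 => qr/\bworld$/,                    # word "world" at end of string
    4 => qr/^hello\b|\bworld$/,           # starts with hello OR ends with world
    5 => qr/^(hello|bye)\b.*\bworld$/,    # starts with hello/bye AND ends with world
    6 => qr/(world)+$/,                   # one or more "world" at end of line
);

# Return 1 if $string matches the pattern for question $q, otherwise 0.
sub matches {
    my ($q, $string) = @_;
    return $string =~ $pattern{$q} ? 1 : 0;
}

my @cases = (
    [3, 'hello cold cruel world'],
    [3, 'hello cold cruelworld'],
    [4, 'hello and good night'],
    [5, 'hello cold cruel world?'],
    [6, 'This string ends in worldworld'],
);
for my $case (@cases) {
    my ($q, $string) = @$case;
    printf "Q%d %-8s %s\n", $q, (matches($q, $string) ? 'match' : 'no match'), $string;
}
```

Without the $ anchor, the pattern for question 3 would also accept "next is world news tonight", which the question rules out.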

7
Lab 1 Solution
Lab1
  • 7. What is a pattern that matches one
  • or more backslashes immediately
  • followed by one or more asterisks, e.g.
  • \\\\
  • but not
  • \\\\\
  • Solution
  • /\\+\*+/

8
Lab 1 Solution
Lab1
  • 8. What is a pattern that matches any line of input
  • that has the same word repeated
  • two or more times in a row? In this problem, words
  • can be considered to be
  • sequences of letters a to z, A to Z, digits, and
  • underscores. Whitespace between
  • words may differ, e.g.
  • Paris in the the spring
  • I thought that that was the problem
  • For this example you will need to use backreferences. A
  • backreference is a reference to a string captured with
  • parentheses. (Recall that in Perl, captured
  • strings are referred to as $1,...,$9.) In a regular expression,
  • you can refer to captured strings, while the pattern is being
  • matched, as \1,...,\9. For example,
  • /(AT)G(\1)/ matches a 5 character string ATGAT.
  • Note: Strictly speaking the inclusion of backreferences makes
  • Solution
  • /\b(\S+)\b(\s+\1\b)+/
  • Understanding this
  • \b start at a word boundary (begin letters)
  • (\S+) find chunk of nonwhite space
  • \b until another word boundary (end letters)
  • (\s+ separated by some white space
  • \1 and that very same chunk again
  • \b) until another word boundary
  • + one or more sets of these
9
Be Careful With Scope
Scoping
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my $x = 23;
  • print "value in main body is $x \n";
  • mysub($x);
  • print "value in main body is $x \n";
  • exit;
  • sub mysub {
  •     print "value in subroutine is $x \n";
  •     $x = 33;
  • }

10
Be Careful With Scope (cont.)
Scoping
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my $x = 23;
  • print "value in main body is $x \n";
  • mysub($x);
  • print "value in main body is $x \n";
  • exit;
  • sub mysub {
  •     my($x) = @_;
  •     $x = 33;
  •     print "value in subroutine is $x \n";
  • }
  • Output
  • value in main body is 23
  • value in subroutine is 33
  • value in main body is 23

11
Data Structures and Algorithm Efficiency
Algorithm Efficiency
Algorithm is O(N^2)
  • # An inefficient way to compute intersections
  • my @a = qw/ A B C D E F G H I J K X Y Z /;
  • my @b = qw/ Q R S A C D T U G H V I J K X Z /;
  • my @intersection = ();
  • for my $i (@a) {
  •     for my $j (@b) {
  •         if ($i eq $j) {
  •             push @intersection, $i;
  •             last;
  •         }
  •     }
  • }
  • print "@intersection\n";
  • exit;
  • Output
  • A C D G H I J K X Z

N = size of lists
12
Data Structures and Algorithm Efficiency
Algorithm Efficiency
  • # A better way to compute intersections
  • my @a = qw/ A B C D E F G H I J K X Y Z /;
  • my @b = qw/ Q R S A C D T U G H V I J K X Z /;
  • my @intersection = ();
  • # "mark" each item in @a
  • my %mark = ();
  • for my $i (@a) { $mark{$i} = 1 }
  • # intersection = any "marked" item in @b
  • for my $j (@b) {
  •     if (exists $mark{$j}) {
  •         push @intersection, $j;
  •     }
  • }
  • print "@intersection\n";
  • exit;

version 1
version 2
13
Demonstration
Algorithm Efficiency
  • Unix commands
  • /usr/bin/time
  • head
  • diff
  • cmp
  • wc -l list1 list2
  • 24762 list1
  • 12381 list2
  • 37143 total
  • /usr/bin/time intersect1.pl list1 list2 > out1
  • 22.91 real 22.88 user 0.02 sys
  • /usr/bin/time intersect2.pl list1 list2 > out2
  • 0.06 real 0.05 user 0.00 sys
  • 22.88/.05 ≈ 458

14
Hashes and Efficiency
Hashes
  • Hashes provide a very fast way to look up
    information associated with a set of scalar
    values (keys)
  • Examples
  • Count how many times each word appears in a file
  • Also whether or not a certain word appeared in a
    file
  • Count how many times each codon appears in a DNA
    sequence
  • Whether a given codon appears in a sequence
  • How many times an item appears in a given list
  • Intersections
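
The codon examples can be sketched with a single hash that uses each codon as a key and its count as the value. This is a minimal illustration, assuming non-overlapping codons read from the first base. The subroutine name count_codons and the sample sequence are hypothetical.

```perl
use strict;
use warnings;

# Count non-overlapping 3-letter codons, starting at the first base.
# Leftover bases at the end (fewer than 3) are ignored.
sub count_codons {
    my ($dna) = @_;
    my %count;
    for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
        $count{ substr($dna, $i, 3) }++;
    }
    return %count;
}

my %count = count_codons('ATGGCCATGTAA');    # hypothetical sequence
for my $codon (sort keys %count) {
    print "$codon\t$count{$codon}\n";
}

# "exists" answers the membership question without scanning the sequence again
print "GCC appears\n" if exists $count{GCC};
```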

15
Examples
Hashes
  • Write a subroutine get_intersection(\@a, \@b)
    that returns the intersection of two lists.
  • Write a subroutine first_list_only(\@a, \@b) that
    returns the items that are in list @a but not in
    @b.
  • Write a subroutine unique(@a) that returns the
    unique items in list @a (that is, removes the
    duplicates).
  • Write a subroutine dups($n, @a) that returns a
    list of items that appear in @a at least $n
    times.
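
One possible set of answers is sketched below, using the same hash "marking" idea as the faster intersection script. These are illustrative implementations, not official solutions. The order of the returned items is a design choice made here (each result keeps the order of the list it is filtered from).

```perl
use strict;
use warnings;

# Items of the second list that also occur in the first list.
sub get_intersection {
    my ($a_ref, $b_ref) = @_;
    my %mark;
    $mark{$_} = 1 for @$a_ref;
    return grep { exists $mark{$_} } @$b_ref;
}

# Items of the first list that do not occur in the second list.
sub first_list_only {
    my ($a_ref, $b_ref) = @_;
    my %in_b;
    $in_b{$_} = 1 for @$b_ref;
    return grep { !exists $in_b{$_} } @$a_ref;
}

# Each distinct item once, in order of first appearance.
sub unique {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Distinct items that appear at least $n times.
sub dups {
    my ($n, @items) = @_;
    my %count;
    $count{$_}++ for @items;
    my %seen;
    return grep { $count{$_} >= $n && !$seen{$_}++ } @items;
}

my @list_a = qw/ A B C D E F G H I J K X Y Z /;
my @list_b = qw/ Q R S A C D T U G H V I J K X Z /;
print join(' ', get_intersection(\@list_a, \@list_b)), "\n";
print join(' ', first_list_only(\@list_a, \@list_b)), "\n";
```

Each subroutine makes one pass to build a hash and one pass to filter, so the work grows with the total list length rather than with the product of the two lengths.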

16
Sorting
Sorting
  • sort LIST -- returns list sorted in string order
  • sort BLOCK LIST -- compares according to BLOCK
  • sort SUBNAME LIST -- compares according to
    subroutine SUBNAME

17
Sorting: Our First Attempt
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my(@unsorted) = (17, 8, 2, 111);
  • my(@sorted) = sort @unsorted;
  • print "@unsorted \n";
  • print "@sorted \n";
  • exit;
  • Output
  • 17 8 2 111
  • 111 17 2 8

18
The Comparison Operator
Sorting
  • 1. $a <=> $b returns 0 if equal, 1 if $a > $b,
    -1 if $a < $b
  • 2. The "cmp" operator gives similar results for
    strings
  • 3. $a and $b are special global variables
  • do NOT declare with "my" and do NOT modify.
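
The difference between the two operators is easiest to see on values that compare differently as numbers and as strings. A small sketch using literal values chosen for illustration:

```perl
use strict;
use warnings;

print 2 <=> 10, "\n";        # -1: as numbers, 2 is less than 10
print "2" cmp "10", "\n";    #  1: as strings, "2" sorts after "10"
print 7 <=> 7, "\n";         #  0: equal
```

This is why a plain sort, which compares as strings, places 111 and 17 ahead of 2 and 8.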

19
Sorting Numerically
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my(@unsorted) = (17, 8, 2, 111);
  • my(@sorted) = sort { $a <=> $b } @unsorted;
  • print "@unsorted \n";
  • print "@sorted \n";
  • exit;
  • Output
  • 17 8 2 111
  • 2 8 17 111

20
Sorting Using a Subroutine
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my(@unsorted) = (17, 8, 2, 111);
  • my(@sorted) = sort numerically @unsorted;
  • print "@unsorted \n";
  • print "@sorted \n";
  • exit;
  • sub numerically { $a <=> $b }
  • Output
  • 17 8 2 111
  • 2 8 17 111

21
Sorting Descending
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my(@unsorted) = (17, 8, 2, 111);
  • my(@reversesorted) = reverse sort numerically @unsorted;
  • print "@unsorted \n";
  • print "@reversesorted \n";
  • exit;
  • sub numerically { $a <=> $b }
  • Output
  • 17 8 2 111
  • 111 17 8 2

22
Sorting DNA by Length
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • # Sorting strings
  • my @dna = qw/ TATAATG TTTT GT CTCAT /;
  • # Sort @dna by length
  • @dna = sort { length($a) <=> length($b) } @dna;
  • print "@dna\n";    # Output: GT TTTT CTCAT TATAATG
  • exit;
  • Output
  • GT TTTT CTCAT TATAATG

23
Sorting DNA by Number of Ts (Largest First)
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • # Sorting strings
  • my @dna = qw/ TATAATG TTTT GT CTCAT /;
  • @dna = sort { ($b =~ tr/Tt//) <=> ($a =~ tr/Tt//) } @dna;
  • print "@dna\n";    # Output: TTTT TATAATG CTCAT GT
  • exit;
  • Output
  • TTTT TATAATG CTCAT GT

24
Sorting DNA by Number of Ts (Largest First) (Take 2)
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • # Sorting strings
  • my @dna = qw/ TATAATG TTTT GT CTCAT /;
  • @dna = reverse sort
  •     { ($a =~ tr/Tt//) <=> ($b =~ tr/Tt//) } @dna;
  • print "@dna\n";    # Output: TTTT TATAATG CTCAT GT
  • exit;
  • Output
  • TTTT TATAATG CTCAT GT

25
Sorting Strings Without Regard to Case
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • # Sort strings without regard to case
  • my(@unsorted) = qw/ mouse Rat HUMAN eColi /;
  • my(@sorted) = sort { lc($a) cmp lc($b) } @unsorted;
  • print "@unsorted \n";
  • print "@sorted \n";
  • exit;
  • Output
  • mouse Rat HUMAN eColi
  • eColi HUMAN mouse Rat

26
Sorting Hashes by Value
Sorting
  • #!/usr/bin/perl
  • use strict;
  • use warnings;
  • my(%sales_amount) = ( auto=>100, kitchen=>2000, hardware=>200 );
  • sub bysales { $sales_amount{$b} <=> $sales_amount{$a} }
  • for my $dept (sort bysales keys %sales_amount) {
  •     printf "%s\t%4d\n", $dept, $sales_amount{$dept};
  • }
  • exit;
  • Output
  • kitchen 2000
  • hardware 200
  • auto 100

27
Review for Midterm BINF634
Midterm
  • Material
  • Tisdall Chapters 1-9
  • Wall Chapter 5
  • Lecture notes
  • The exam will be open book and notes
  • You cannot work together on it
  • You cannot use outside material
  • You will have the full period to take the midterm
  • You will be asked to program

28
Some Example Questions
Midterm
  • Given two DNA fragments contained in $DNA1 and
    $DNA2 how can we concatenate these to make a
    third string $DNA3?

$DNA3 = "$DNA1$DNA2";
  • $DNA3 = $DNA1 . $DNA2;

29
Some Example Questions
Midterm
  • What does this line of code do?
  • $RNA =~ s/T/U/ig;

Substitutes every T (or t) with U, in a case insensitive
manner, globally within the string $RNA
30
Some Example Questions
Midterm
  • What does this statement do?
  • $revcom =~ tr/ACGT/TGCA/;

It performs the mapping A → T, C → G, G → C, T →
A all at once
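
The tr statement on its own produces the complement. To get a reverse complement, the string also has to be reversed. The sketch below assumes that reversal step and extends the mapping to lowercase bases. The subroutine name revcom is hypothetical.

```perl
use strict;
use warnings;

# Reverse complement of a DNA string (upper or lower case bases).
sub revcom {
    my ($dna) = @_;
    my $revcom = reverse $dna;            # scalar context: reverses the characters
    $revcom =~ tr/ACGTacgt/TGCAtgca/;     # complement every base in one pass
    return $revcom;
}

print revcom('ATGC'), "\n";    # prints GCAT
```

Because tr maps all characters in a single pass, A becomes T and T becomes A without one replacement undoing the other, which a sequence of s/// substitutions would not guarantee.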
31
Some Example Questions
Midterm
  • What do these four lines do?
  • @bases = ('A', 'C', 'G', 'T');
  • $base1 = pop @bases;
  • unshift (@bases, $base1);
  • print "@bases\n\n";

T A C G
32
Some Example Questions
Midterm
  • What does this code snippet do if COND is true?
  • unless (COND) {
  •     # do something
  • }

Nothing: the block runs only when COND is false
33
Some Example Questions
Midterm
  • What does this code fragment do?
  • $protein = join('', @protein);

Converts the array @protein into a scalar
$protein with no space between the entries
34
Some Example Questions
Midterm
  • What does this code fragment do?
  • $myfile = 'myfile';
  • open(MYFILE, ">$myfile");

Opens the file myfile with the file handle
MYFILE for writing
35
Some Example Questions
Midterm
  • What does this code fragment do?
  • while ($DNA =~ /a/ig) { $a++ }

Counts the occurrences of the letter a or
A within the string $DNA
36
Some Example Questions
Midterm
  • What is the effect of using the command
  • use strict
  • at the beginning of your program?

It insists that your programs have all their
variables declared (for example as my variables) before use.
It also disallows symbolic references and most barewords
37
Some Example Questions
Midterm
  • What is contained in the reserved variable $0 and
  • in the array @ARGV?

$0 contains the name of the program. @ARGV
contains the command line arguments for the
program
38
Some Example Questions
Midterm
  • What is the difference between pass by value
    and pass by reference ?

In pass by value you provide a subroutine with
a copy of your variable. In pass by reference
you provide a subroutine with a pointer to your
variable. In this manner the subroutine can
change the contents of the variable.
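
The difference can be sketched with two small subroutines. The names add_by_value and add_by_reference are hypothetical and chosen only for this illustration.

```perl
use strict;
use warnings;

# Works on a copy of the arguments: the caller's array is untouched.
sub add_by_value {
    my (@copy) = @_;
    push @copy, 'U';
    return scalar @copy;
}

# Works through a reference: the caller's array grows.
sub add_by_reference {
    my ($array_ref) = @_;
    push @$array_ref, 'U';
    return scalar @$array_ref;
}

my @bases = qw/A C G T/;
add_by_value(@bases);
print "after by value: @bases\n";        # A C G T
add_by_reference(\@bases);
print "after by reference: @bases\n";    # A C G T U
```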
39
Some Example Questions
Midterm
  • What is a pointer and what does it mean to
    dereference a pointer?

A pointer (in Perl, a reference) is an address in memory of a particular
variable. Dereferencing a pointer is the act of
obtaining the information that is stored at a
particular pointer location.
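
In Perl, a reference is taken with a backslash and dereferenced with a sigil or the arrow operator. A short sketch, in which the variable names and the counts are made up for illustration:

```perl
use strict;
use warnings;

my @bases = qw/A C G T/;
my %count = (A => 2, T => 1);    # hypothetical counts

my $array_ref = \@bases;    # reference to the array
my $hash_ref  = \%count;    # reference to the hash

print "@$array_ref\n";        # whole array: A C G T
print "$$array_ref[0]\n";     # first element: A
print "$array_ref->[3]\n";    # arrow form, last element: T
print "$hash_ref->{A}\n";     # hash value: 2

$array_ref->[0] = 'U';        # writing through the reference changes @bases
print "@bases\n";             # U C G T
```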
40
Some Example Questions
Midterm
  • How do you invoke perl with the debugger?

perl -d
41
Some Example Questions
Midterm
  • Given an array @verbs what is going on here?
  • $verbs[rand @verbs]

rand expects a number, so @verbs is evaluated in scalar context,
which gives the number of elements. rand then generates a random number from 0 up to
(but not including) the length of the array @verbs. This is then
truncated to an integer to index into @verbs
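
The same idiom can be written out step by step. This sketch makes the truncation explicit with int. The subroutine name random_element and the verb list are hypothetical.

```perl
use strict;
use warnings;

# Return one randomly chosen element of the list passed in.
sub random_element {
    my (@items) = @_;
    # rand sees @items in scalar context (its length) and returns a
    # fractional number >= 0 and < that length; int truncates it.
    my $index = int(rand @items);
    return $items[$index];
}

my @verbs = qw/run jump read write/;    # hypothetical list
print random_element(@verbs), "\n";
```

Since rand never returns the length itself, the truncated index always falls between 0 and the last valid index.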