Programming and Perl for Bioinformatics Part V

Provided by: duan90

Transcript and Presenter's Notes

1
Programming and Perl for Bioinformatics, Part V
2
References and Objects
3
What Are References?
  • A reference is the (starting) address of a memory
    block that stores some data; it is also called a
    pointer

[Figure: memory diagram. Cells holding G, A, T, C, followed by cells holding the references $str_ref = \$string_1, $list_ref = \@list_1 and $hash_ref = \%hash_1]
2/22/2014
4
What Good Are References?
  • An array of arrays (can do the job of a
    2-dimensional matrix)
  • Spot_num  Ch1-BKGD  Ch1    Ch2-BKGD  Ch2
  • 000       0.124     43.2   0.102     80.4
  • 001       0.113     60.7   0.091     22.6
  • 002       0.084     112.2  0.144     35.3
  • Code:
  • my @spotarray = ( [0.124, 43.2, 0.102, 80.4],
  •                   [0.113, 60.7, 0.091, 22.6],
  •                   [0.084, 112.2, 0.144, 35.3] );

Perl in a Day - Subroutines
5
What Good Are References?
  • A hash of arrays
  • Accession  Ch1-BKGD  Ch1   Ch2-BKGD  Ch2
  • AW10021    0.124     43.2  0.102     80.4
  • BE52002    0.113     60.7  0.091     22.6
  • W20209     0.0841    12.2  0.144     35.3
  • Code:
  • my %spothash = ( 'AW10021' => [0.124, 43.2, 0.102, 80.4],
  •                  'BE52002' => [0.113, 60.7, 0.091, 22.6],
  •                  'W20209'  => [0.0841, 12.2, 0.144, 35.3]
  • );
  • Hashes of hashes, and other more complex data
    structures
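
As a rough sketch of how a hash of arrays might be used, the snippet below computes a background-subtracted Ch2/Ch1 ratio for each spot. The helper name channel_ratio and the choice of calculation are assumptions made for illustration; they do not come from the slides.

```perl
use strict;
use warnings;

# Hash of arrays: accession => [Ch1-BKGD, Ch1, Ch2-BKGD, Ch2]
my %spothash = (
    'AW10021' => [0.124, 43.2, 0.102, 80.4],
    'BE52002' => [0.113, 60.7, 0.091, 22.6],
);

# Hypothetical helper (not from the slides): background-subtracted
# Ch2/Ch1 ratio for one spot.
sub channel_ratio {
    my ($spot) = @_;                        # $spot is an array reference
    my ($bg1, $ch1, $bg2, $ch2) = @$spot;   # dereference into four scalars
    return ($ch2 - $bg2) / ($ch1 - $bg1);
}

for my $accession (sort keys %spothash) {
    # $spothash{$accession} is already a reference to the inner array
    printf "%s ratio %.3f\n", $accession, channel_ratio($spothash{$accession});
}
```

Each hash value is a reference to an anonymous array, so passing $spothash{$accession} hands the subroutine one row without copying it.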

6
What Is A Reference?
  • @y = ( 1, 'a', 2.3 );
  • $ref_to_y = \@y;
  • print @y; yields 1a2.3
  • print $ref_to_y; yields
  • ARRAY(0x80cd6ac)

[Figure: the array @y with cells 1, a, 2.3]
7
Getting At The Value: De-referencing
  • Using a block:
  • @{$array_reference} %{$hash_reference}
    ${$scalar_reference}
  • print @{$ref_to_y}; yields 1a2.3
  • Or without it:
  • @x = @$ref_to_y;
  • $foo = "two humps";
  • $scalar_ref = \$foo;
  • $camel_model = $$scalar_ref;   # $camel_model is now "two humps"
  • push (@$array_ref, $filename);
  • $$hash_ref{$KEY} = $VALUE;

8
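Getting At The Value: De-referencing
  • Using the Arrow Operator
  • ${$array_ref}[0] = 1;   $$array_ref[0] = 1;
    $array_ref->[0] = 1;
  • Note that $array[3] and $array->[3] are NOT the
    same: the first is an element of @array, the second
    an element of the array that $array refers to.
  • my %hash_copy = %$hash_ref;
  • my $hash_value = $$hash_ref{'some_key'};
  • my $hash_value = $hash_ref->{'some_key'};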
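
The dereferencing forms above can be tried side by side. This is a minimal sketch; the element values ('a0', 'r0', and so on) and the key some_key's value 42 are made up for illustration.

```perl
use strict;
use warnings;

my @array     = ('a0', 'a1', 'a2', 'a3');   # a real array named @array
my $array     = ['r0', 'r1', 'r2', 'r3'];   # a scalar holding an array reference
my $array_ref = \@array;

# Three equivalent ways to read element 0 through the reference.
my @same = (${$array_ref}[0], $$array_ref[0], $array_ref->[0]);

# $array[3] reads @array; $array->[3] reads the array that $array points to.
print $array[3], " vs ", $array->[3], "\n";

# Copying a hash through its reference gives an independent hash.
my $hash_ref  = { some_key => 42 };
my %hash_copy = %$hash_ref;
$hash_copy{some_key} = 0;                   # does not touch the original
print $hash_ref->{some_key}, "\n";
```

Because @array and $array live in separate namespaces, Perl accepts both names in one scope, which is exactly why the two subscript forms are easy to confuse.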

9
Getting At The Value: De-referencing
  • Reference to subroutines:
  • $my_cool_sub = \&subroutine;
  • Dereference:
  • my $result = &{$my_cool_sub}($arg1, $arg2);
  •   # invoke $my_cool_sub with two arguments,
  •   # using block operator
  • my $result = &$my_cool_sub($arg1, $arg2);
  •   # or without it
  • my $result = $my_cool_sub->($arg1, $arg2);
  •   # or using arrow
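
A small runnable sketch of the three call styles follows. The subroutine gc_content is a hypothetical example written for this illustration, not something defined in the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of G and C bases in a DNA string.
sub gc_content {
    my ($dna) = @_;
    my $gc = ($dna =~ tr/GCgc//);   # tr returns the number of matching characters
    return $gc / length($dna);
}

my $my_cool_sub = \&gc_content;     # reference to a named subroutine

my $r1 = &{$my_cool_sub}('GATC');   # block form
my $r2 = &$my_cool_sub('GATC');     # without the block
my $r3 = $my_cool_sub->('GATC');    # arrow form

print "$r1 $r2 $r3\n";
```

All three calls reach the same subroutine, so the three results are identical; the arrow form is usually the easiest to read.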

10
Getting At The Value: De-referencing
  • @y = ( 1, 'a', 2.3 );
  • $ref_to_y = \@y;
  • print @y; yields 1a2.3
  • print $ref_to_y; yields
  • ARRAY(0x80cd6ac)

[Figure: the array @y with cells 1, a, 2.3]
11
Getting At The Value: De-referencing
  • $y[3] = 'z';
  • print @$ref_to_y; yields 1a2.3z
  • @y = (5, 6, 7);
  • print @$ref_to_y; yields 567
  • Why?
  • A regular variable holds its own data
  • A reference variable holds only the address of
    that data, so $ref_to_y always sees the current
    contents of @y
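
To make the difference concrete, the sketch below compares a reference to @y with a plain copy of @y. The @copy variable is an addition for illustration and is not part of the slide.

```perl
use strict;
use warnings;

my @y        = (1, 'a', 2.3);
my $ref_to_y = \@y;    # stores the address of @y
my @copy     = @y;     # stores an independent copy of the current contents

$y[3] = 'z';
my $via_ref  = join '', @$ref_to_y;   # sees the new element
my $via_copy = join '', @copy;        # still the old contents

@y = (5, 6, 7);
my $after = join '', @$ref_to_y;      # sees the reassigned array

print "$via_ref $via_copy $after\n";
```

The copy is frozen at the moment it was made, while the reference follows every later change to @y.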

12
Making References To Arbitrary Values From Scratch
  • Anonymous Hashes or Arrays
  • $y_gene_families =
  •     [ 'DAZ', 'TSPY', 'RBMY', 'CDY1', 'CDY2' ];
  • Note [ and ] instead of ( and )
  • $y_gene_family_counts = { 'DAZ'  => 4,
  •                           'TSPY' => 20,
  •                           'RBMY' => 10,
  •                           'CDY2' => 2 };
  • Note { and } instead of ( and )
  • $y_gene_families gets a reference to an array,
    and
  • $y_gene_family_counts gets a reference to a
    hash.

13
Making References To Arbitrary Values From Scratch
  • for (keys %$y_gene_family_counts) {
  •     print "$_\n";
  • }
  • my @a = @$y_gene_families;
  • $$y_gene_families[0]
  • $$y_gene_family_counts{'DAZ'}
  • Arrow shorthand:
  • $y_gene_families->[0] yields 'DAZ'
  • $y_gene_family_counts->{'DAZ'} yields '4'
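
As an illustrative sketch, the two anonymous structures can be combined: walk the array reference and look each family up in the hash reference. The idea of summing the counts is an assumption added here, not something the slides do.

```perl
use strict;
use warnings;

my $y_gene_families      = [ 'DAZ', 'TSPY', 'RBMY', 'CDY1', 'CDY2' ];
my $y_gene_family_counts = { 'DAZ' => 4, 'TSPY' => 20, 'RBMY' => 10, 'CDY2' => 2 };

my $total = 0;
for my $family (@$y_gene_families) {
    # Skip families with no recorded count (CDY1 in this data).
    next unless exists $y_gene_family_counts->{$family};
    $total += $y_gene_family_counts->{$family};
}

print "Families: ", scalar(@$y_gene_families), ", total copies: $total\n";
```

Using exists before reading the value avoids an "uninitialized value" warning for the family that has no entry in the hash.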

14
New Function: ref
  • ref - What kind of value does this reference
    point to?
  • print ref($y_gene_families), "\n";
  • ARRAY
  • print ref($y_gene_family_counts), "\n";
  • HASH
  • $x = 1; print ref($x), "\n";
  • (empty string) ref returns the empty string if
    its argument is not a reference.
  • Return values include SCALAR, ARRAY, HASH, CODE
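
One common use of ref is to let a subroutine accept several kinds of argument. The sketch below assumes a hypothetical helper named describe, written only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: describe any value by inspecting ref().
sub describe {
    my ($thing) = @_;
    my $type = ref $thing;          # '' when $thing is not a reference
    return 'plain scalar'                                 if $type eq '';
    return 'array of ' . scalar(@$thing) . ' items'       if $type eq 'ARRAY';
    return 'hash with ' . scalar(keys %$thing) . ' keys'  if $type eq 'HASH';
    return "reference of type $type";
}

print describe(1), "\n";
print describe([ 'DAZ', 'TSPY' ]), "\n";
print describe({ 'DAZ' => 4 }), "\n";
print describe(\&describe), "\n";
```

Checking the type before dereferencing prevents the run-time error Perl raises when, for example, a hash reference is dereferenced as an array.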

15
Two-Dimensional Arrays: Matrices
  • @probes = ( [1, 3, 2, 9],
  •             [2, 0, 8, 1],
  •             [5, 4, 6, 7],
  •             [1, 9, 2, 8] );
  • print "The probe at row 1, column 2 has value ",
  •     $probes[1][2], "\n";
  • It prints: The probe at row 1, column 2 has
    value 8
  • $probes_ref = [ [1, 3, 2, 9],
  •                 [2, 0, 8, 1],
  •                 [5, 4, 6, 7],
  •                 [1, 9, 2, 8] ];
  • print "The probe at row 1, column 2 has value ",
  •     $probes_ref->[1][2], "\n";
  • It prints: The probe at row 1, column 2 has
    value 8
  • $probes_ref->[1][2] is a shorthand for
    $probes_ref->[1]->[2]
  • it can also be written as $$probes_ref[1][2]
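
A short sketch of walking the matrix row by row is given below. Computing row sums is an illustrative choice made here, not part of the original slide.

```perl
use strict;
use warnings;

my $probes_ref = [ [1, 3, 2, 9],
                   [2, 0, 8, 1],
                   [5, 4, 6, 7],
                   [1, 9, 2, 8] ];

my @row_sums;
for my $row (@$probes_ref) {     # each $row is itself an array reference
    my $sum = 0;
    $sum += $_ for @$row;        # dereference the row to visit its cells
    push @row_sums, $sum;
}

print "Row sums: @row_sums\n";
print "Row 1, column 2: ", $probes_ref->[1][2], "\n";
```

The outer array holds only references, so iterating over it yields one row reference at a time, and each row must be dereferenced again to reach its values.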

16
Complex Data Structure
  • $gene = [
  •   # hash of basic information about the gene:
  •   # name, discoverer, discovery date and laboratory.
  •   { name       => 'antiaging',
  •     reference  => [ 'G. Mendel', '1865' ],
  •     laboratory => [ 'Dept. of Genetics', 'Cornell University', 'USA' ]
  •   },
  •   # scalar giving priority
  •   'high',
  •   # array of local work history
  •   [ 'Jim', 'Rose', 'Eamon', 'Joe' ]
  • ];
  • print "Name is ", $gene->[0]{'name'}, "\n";
  • print "Research center is ", $gene->[0]{'laboratory'}[1], "\n";
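
Deeply nested lookups can become hard to read, so one option is to unpack the top-level elements into named scalars first. The variable names $info, $priority, $history, $lab_line and $last_worker below are assumptions introduced for this sketch.

```perl
use strict;
use warnings;

my $gene = [
    {   name       => 'antiaging',
        reference  => [ 'G. Mendel', '1865' ],
        laboratory => [ 'Dept. of Genetics', 'Cornell University', 'USA' ],
    },
    'high',
    [ 'Jim', 'Rose', 'Eamon', 'Joe' ],
];

# Unpack the three top-level elements into named scalars.
my ($info, $priority, $history) = @$gene;

my $lab_line    = join ', ', @{ $info->{laboratory} };   # inner array, dereferenced
my $last_worker = $history->[-1];                        # -1 indexes the last element

print "Gene ", $info->{name}, " has priority $priority\n";
print "Lab: $lab_line\n";
print "Most recent of ", scalar(@$history), " workers: $last_worker\n";
```

$info and $history are still references into the same structure, so changes made through them are visible through $gene as well.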

17
Passing References to Subroutines
  • Perl flattens all arguments to a subroutine into a
    single list of scalars. This makes it impossible to
    distinguish between two arrays you might try to
    pass to a subroutine, as the following example
    illustrates:
  • @aminoacids1 = ('E', 'V', 'L');
  • @aminoacids2 = ('D', 'T', 'Y');
  • printacids(@aminoacids1, @aminoacids2);
  • sub printacids {
  •     my(@aa1, @aa2) = @_;
  •     print "Amino acids 1\n";
  •     print "@aa1\n";
  •     print "Amino acids 2\n";
  •     print "@aa2\n";
  • }
  • This prints out:
  • Amino acids 1
  • E V L D T Y
  • Amino acids 2
  • (an empty line: @aa1 absorbed every argument, so
    @aa2 is empty)

18
Passing References to Subroutines
  • Here is how to fix the previous example:
  • @aminoacids1 = ('E', 'V', 'L');
  • @aminoacids2 = ('D', 'T', 'Y');
  • printacids(\@aminoacids1, \@aminoacids2);
  • sub printacids {
  •     my($aa1, $aa2) = @_;
  •     print "Amino acids 1\n";
  •     print "@$aa1\n";
  •     print "Amino acids 2\n";
  •     print "@$aa2\n";
  • }
  • This prints out:
  • Amino acids 1
  • E V L
  • Amino acids 2
  • D T Y
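
The same technique works whenever a subroutine needs to compare two lists. In the sketch below, common_residues is a hypothetical helper, and the second list was changed from the slide's values so that the two lists overlap.

```perl
use strict;
use warnings;

# Hypothetical helper: residues of the second list that also occur in the first.
sub common_residues {
    my ($first, $second) = @_;                 # two array references
    my %in_first = map { $_ => 1 } @$first;    # lookup table built from list one
    return grep { $in_first{$_} } @$second;    # keeps the order of list two
}

my @aminoacids1 = ('E', 'V', 'L');
my @aminoacids2 = ('L', 'T', 'E');             # illustrative values, not the slide's

my @shared = common_residues(\@aminoacids1, \@aminoacids2);
print "@shared\n";
```

Because each array arrives as a single scalar reference, the subroutine can tell exactly where the first list ends and the second begins.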

19
Perl Object Syntax
  • Perl objects are special references that come
    bundled with a set of functions that know how to
    act on the contents of the reference.
  • For example, in BioPerl, there is a class of
    objects called Sequence. Internally, the object
    is a hash reference that has keys that point to
    the DNA string, the name and source of the
    sequence, and other attributes. The object is
    bundled with functions that know how to
    manipulate the sequence, such as revcom( ),
    translate( ), subseq( ), etc.
  • When talking about objects, the bundled functions
    are known as methods.

20
Perl Objects
  • For example, if we have a Sequence object stored
    in the scalar variable $sequence1, we can call
    its methods like this:
  • $reverse_complement = $sequence1->revcom();
  • $first_10_bases = $sequence1->subseq(1,10);
  • $protein = $sequence1->translate();
  • You will learn later from the BioPerl lecture
    that revcom() and translate() return new Sequence
    objects that themselves know how to revcom(),
    translate() and so forth (subseq() returns a plain
    string). So if you wanted to get the protein
    translation from the reverse complement, you could
    do this:
  • $reverse_complement = $sequence1->revcom();
  • $protein = $reverse_complement->translate();
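
To show how methods that return new objects can be chained, here is a toy class. MiniSeq is a hypothetical stand-in written for this sketch; it is not part of BioPerl and only mimics the behaviour described above.

```perl
use strict;
use warnings;

# MiniSeq is a hypothetical stand-in class, NOT a BioPerl module.
package MiniSeq;

sub new {
    my ($class, $dna) = @_;
    my $self = { seq => $dna };     # the object is a hash reference...
    return bless $self, $class;     # ...tagged with its class name
}

sub seq {
    my ($self) = @_;
    return $self->{seq};
}

# Returns a NEW MiniSeq object, so calls can be chained.
sub revcom {
    my ($self) = @_;
    my $rc = reverse $self->{seq};  # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # complement each base
    return MiniSeq->new($rc);
}

package main;

my $sequence1 = MiniSeq->new('gattc');
my $rc_string = $sequence1->revcom->seq;           # one method call after another
my $twice     = $sequence1->revcom->revcom->seq;   # back to the original sequence

print "$rc_string $twice\n";
```

The arrow passes the object itself as the first argument of the method, which is why each method starts by reading $self from @_.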

21
Creating Objects
  • Before you can start using objects, you must load
    their definitions from the appropriate module(s).
    For example, if we want to load the BioPerl
    Sequence definitions, we load the appropriate
    module, which in this case is called
    Bio::PrimarySeq (you learn this from reading the
    BioPerl documentation):
  • #!/usr/bin/perl -w
  • use strict;
  • use Bio::PrimarySeq;
  • Now you'll probably want to create a new object.
    There are a variety of ways to do this, and
    details vary from module to module, but most
    modules, including Bio::PrimarySeq, do it using
    the new() method:
  • my $sequence1 =
  •     Bio::PrimarySeq->new('gattcgattccaaggttccaaa');

22
Creating Objects
  • The syntax here is
  • ModuleName->new(@args)
  • where ModuleName is the name of the module that
    contains the object definitions.
  • The new( ) method will return an object that
    belongs to the ModuleName class.
  • In the example above, we get a Bio::PrimarySeq
    object, which is the simplest of BioPerl's
    various Sequence object types.

23
Creating Objects
  • When you call object methods, you can pass a list
    of parameters, just as you would to a regular
    function.
  • As methods get more complex, parameter lists can
    get quite long and have possibly dozens of
    optional parameters. To make this manageable,
    many object-oriented modules use a named
    parameter style of parameter passing, that looks
    like this
  • my $result = $object->
  •     method( -arg1 => $value1, -arg2 => $value2,
  •             -arg3 => $value3, ... );
  • In this case "-arg1", "-arg2", and so on are the
    names of parameters, and $value1, $value2 are the
    values of those named parameters. The name/value
    pairs can occur in any order.
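
On the receiving side, one simple way to support this style is to load the argument list into a hash. The function describe_oligo and its defaults are assumptions made for this sketch; BioPerl's own argument handling is more elaborate and is not shown here.

```perl
use strict;
use warnings;

# Hypothetical function, for illustration only: accepts -name => value pairs.
sub describe_oligo {
    my %args = (
        -id          => 'unnamed',   # defaults come first...
        -alphabet    => 'dna',
        -is_circular => 0,
        @_,                          # ...and the caller's pairs override them
    );
    die "-seq is required\n" unless defined $args{-seq};

    my $shape = $args{-is_circular} ? 'circular' : 'linear';
    return sprintf '%s: %d-base %s %s',
        $args{-id}, length($args{-seq}), $shape, $args{-alphabet};
}

# The pairs may be given in any order.
my $first  = describe_oligo(-seq => 'gattcgattccaaggttccaaa', -id => 'oligo23');
my $second = describe_oligo(-id => 'oligo23', -seq => 'gattcgattccaaggttccaaa');

print "$first\n";
```

Since later pairs replace earlier ones when a list is assigned to a hash, placing the defaults before @_ gives optional parameters without any position counting.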

24
Creating Objects
  • Rather than create a humungous argument list
    which forces you to remember the correct position
    of each argument, Bio::PrimarySeq lets you create
    a new Sequence this way:
  • #!/usr/bin/perl -w
  • use strict;
  • use Bio::PrimarySeq;
  • my $sequence1 = Bio::PrimarySeq->new(
  •     -seq              => 'gattcgattccaaggttccaaa',
  •     -id               => 'oligo23',
  •     -alphabet         => 'dna',
  •     -is_circular      => 0,
  •     -accession_number => 'X123'
  • );