Perl - PowerPoint PPT Presentation

Slides: 39
Provided by: Bak994

Transcript and Presenter's Notes

1
Perl
  • Part I: A Biology Primer

2
Conceptual Biology
  • H. sapiens did not create the genetic code, but
    they did invent the transistor
  • Biological life is not optimized: the modern
    synthesis
  • Nature vs. nurture
  • What are the best ways to understand the
    important differences that make the difference?

3
A Molecular Primer
  • Hierarchy of the eukaryote
  • Organism > System > Organ > Tissue > Cell >
    Organelle > Protein > RNA > DNA
  • Put simply: DNA → RNA → Protein

4
The Building Blocks
  • DNA is composed of four building blocks
  • Nucleic acids, nucleotides, bases
  • Adenine, Cytosine, Guanine, Thymine
  • RNA also has four building blocks
  • Adenine, Cytosine, Guanine, Uracil
  • Proteins are composed of 20 building blocks
  • Amino acids, residues
  • Fragments of proteins are called peptides
  • DNA, RNA and Proteins are polymers

5
Code  Nucleic Acid  w/ Sugar    w/ P
A     Adenine       Adenosine   Adenylic Acid
C     Cytosine      Cytidine    Cytidylic Acid
G     Guanine       Guanosine   Guanylic Acid
T     Thymine       Thymidine   Thymidylic Acid
U     Uracil        Uridine     Uridylic Acid

Code  Nucleic Acid(s)        Code  Nucleic Acid(s)
M     A or C (amino)         V     A or C or G
R     A or G (purine)        H     A or C or T
W     A or T (weak)          D     A or G or T
S     C or G (strong)        B     C or G or T
Y     C or T (pyrimidine)    N     A, G, C, T (any)
K     G or T (keto)
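
The code table above maps naturally onto a Perl hash. The following is a minimal sketch, not part of the original slides: the hash name `%iupac` and the helper `expand_code` are hypothetical names chosen for illustration, and the table covers only the DNA alphabet (U is left out).

```perl
use strict;
use warnings;

# Nucleotide codes mapped to the bases they stand for (DNA alphabet only).
my %iupac = (
    A => 'A',   C => 'C',   G => 'G',   T => 'T',
    M => 'AC',  R => 'AG',  W => 'AT',  S => 'CG',
    Y => 'CT',  K => 'GT',  V => 'ACG', H => 'ACT',
    D => 'AGT', B => 'CGT', N => 'ACGT',
);

# expand_code is a hypothetical helper: it returns the list of bases
# that a single code letter can represent (case-insensitive lookup).
sub expand_code {
    my ($code) = @_;
    my $bases = $iupac{ uc $code };
    die "Unknown nucleotide code: $code\n" unless defined $bases;
    return split //, $bases;
}

# Print a few codes with their expansions.
print "$_ = ", join(' or ', expand_code($_)), "\n" for qw(A Y B N);
```

Keeping the expansions as plain strings keeps the table compact; `split //` turns a string into a list of single bases only when a caller needs one.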
6
DNA DNA DNA RNA
A T ? A
C G ? C
G C ? G
C G ? C
T A ? U
T A ? U
M K ? M
W W ? ?
N N ? N
C G ? C
C G ? C
T A ? U
Y R ? ?
B V ? ?
N N ? N
K M ? ?
S S ? S
T A ? U
T A ? U
7
  • One Dimensional
  • Two Dimensional
  • Three Dimensional

9
DNA DNA DNA RNA
A T ? A
T A ? U
G C ? G
C G ? C
T A ? U
T A ? U
M K ? M
W W ? ?
N N ? N
C G ? C
C G ? C
T A ? U
Y R ? ?
B V ? ?
N N ? N
K M ? ?
S S ? S
T A ? U
T A ? U
11
One-Letter  Amino Acid                  Three-Letter    One-Letter  Amino Acid     Three-Letter
C           Cysteine                    Cys             D           Aspartic acid  Asp
E           Glutamic acid               Glu             F           Phenylalanine  Phe
G           Glycine                     Gly             H           Histidine      His
I           Isoleucine                  Ile             K           Lysine         Lys
L           Leucine                     Leu             M           Methionine     Met
N           Asparagine                  Asn             P           Proline        Pro
Q           Glutamine                   Gln             R           Arginine       Arg
S           Serine                      Ser             T           Threonine      Thr
V           Valine                      Val             W           Tryptophan     Trp
X           Unknown                     Xxx             Y           Tyrosine       Tyr
Z           Glutamic acid or Glutamine  Glx
12
DNA DNA DNA RNA
A T ? A
T A ? U
G C ? G
C G ? C
T A ? U
T A ? U
M K ? M
W W ? ?
N N ? N
C G ? C
C G ? C
T A ? U
Y R ? ?
B V ? ?
N N ? N
K M ? ?
S S ? S
T A ? U
T A ? U
Met (Start)
Leu
AA?, AU?, CA?, CU? -> Asn, Lys, Ile, Met, His, Gln, Leu
Pro
UU?, UG?, UC?, CU?, CG?, CC? -> Phe, Leu, Cys, Stop, Trp, Ser, Leu, Arg, Pro
UCU, UGU, GCU, GGU -> Ser, Cys, Ala, Gly
13
DNA DNA DNA RNA
A T ? A
T A ? U
G C ? G
C G ? C
T A ? U
T A ? U
M K ? M
W W ? ?
N N ? N
C G ? C
C G ? C
T A ? U
Y R ? ?
B V ? ?
N N ? N
K M ? ?
S S ? S
T A ? U
T A ? U
Cys
Phe, Leu
A?C, U?C -> Ile, Thr, Asn, Ser, Phe, Ser, Tyr, Cys
Leu
U?U, U?G, C?U, C?G -> Phe, Ser, Tyr, Cys, Leu, Stop, Trp, Leu, Pro, His, Arg, Gln
GUU, CUU -> Val, Leu
14
(No Transcript)
15
Lecture II
  • Part II: One-Dimensional Strings

16
Hello World
  • A few perls of wisdom
  • Concatenating Sequences
  • Making a reverse complement
  • Read sequences from data files

17
Every journey starts with a first 10bp
    #!/usr/bin/perl -w
    # Storing DNA in a variable, and printing it out

    # First, store the DNA in a variable called $DNA
    $DNA = 'CGGGCTATTC';

    # Next, print the DNA onto the screen
    print $DNA;

    # Finally, specifically tell the program to end
    exit;
21
Concatenating DNA Fragments
    #!/usr/bin/perl -w
    # Store DNA in 2 variables
    $DNA1 = 'AGTGCGTCGCTAG';
    $DNA2 = 'ACCGCATGCATTG';

    # Using string interpolation
    $DNA3 = "$DNA1$DNA2";
    print "$DNA3\n\n";

    # Using the dot operator
    $DNA3 = $DNA1 . $DNA2;
    print "$DNA3\n\n";

    # Printing both variables as a list
    print $DNA1, $DNA2, "\n";
    exit;
22
Transcription DNA to RNA
    #!/usr/bin/perl -w
    $DNA = 'ACGACTGCACGATCGTACG';

    # Print the DNA onto the screen
    print "$DNA\n\n";

    # Transcribe the DNA->RNA by substituting all Ts with Us
    $RNA = $DNA;
    $RNA =~ s/T/U/g;

    # Print the result to the screen
    print "Here is the result of DNA->RNA:\t$RNA\n\n";
    exit;
23
$RNA =~ s/T/U/g;

    $RNA    Variable
    =~      Binding operator
    s       Substitute operator
    /       Delimiters to separate the parts of the operator
    T       Pattern to be replaced
    U       Replacement text for the matched pattern
    g       Pattern modifier

Pattern modifiers:
    g       globally
    i       case insensitive
    m       multiline
    s       single line
    x       permit comments
    o       compile only once, for speed
    e       treat replacement as Perl code
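
As a rough illustration of how the modifiers change a substitution, the sketch below runs the same `s/T/U/` with no modifier, with `g`, and with `g` plus `i`. The mixed-case input string and the variable names are made up for this example and do not come from the slides.

```perl
use strict;
use warnings;

# Mixed-case input, invented for this example.
my $DNA = 'acgACTgcacGATCGtacg';

# No modifier: only the first uppercase T is replaced.
(my $rna_first = $DNA) =~ s/T/U/;

# g: every uppercase T is replaced; lowercase t is left alone.
(my $rna_g = $DNA) =~ s/T/U/g;

# g and i: T and t are both matched, and each is replaced by uppercase U.
(my $rna_gi = $DNA) =~ s/T/U/gi;

print "none: $rna_first\n";
print "g:    $rna_g\n";
print "gi:   $rna_gi\n";
```

The `(my $copy = $original) =~ s/.../.../` form copies the string first and then edits the copy, so `$DNA` itself is never changed.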
24
Calculating the Reverse Complement
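    #!/usr/bin/perl -w
    $DNA = 'ACGTCAGTCGAGCT';

    # Print the starting DNA onto the screen
    print "Here is the starting DNA:\t$DNA\n\n";

    # Calculate the reverse complement, first copying the
    # reversed DNA into a new variable called $revcom
    $revcom = reverse $DNA;

    # Substitute all bases by their complement.
    # These substitutions run one after another, so the Ts made from
    # As are turned straight back into As (and likewise for C and G).
    $revcom =~ s/A/T/g;
    $revcom =~ s/T/A/g;
    $revcom =~ s/C/G/g;
    $revcom =~ s/G/C/g;
    print "$revcom\n";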
25
Calculating the Reverse Complement
    #!/usr/bin/perl -w
    $DNA = 'ACGTCAGTCGAGCT';

    # Print the starting DNA onto the screen
    print "Here is the starting DNA:\t$DNA\n\n";

    # Calculate the reverse complement, first copying the
    # reversed DNA into a new variable called $revcom
    $revcom = reverse $DNA;

    # Substitute all bases by their complement in a single pass
    $revcom =~ tr/ACGTacgt/TGCAtgca/;
    print "$revcom\n";
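
The `tr` approach extends to the ambiguity codes from the biology primer (M pairs with K, R with Y, V with B, H with D, while W, S and N pair with themselves). The sketch below is one possible way to wrap that in a subroutine; the name `revcom` for the subroutine is an assumption, and U is deliberately not handled.

```perl
use strict;
use warnings;

# revcom is a hypothetical helper name. It reverses the sequence, then
# complements every base in one tr pass, so no replacement can undo an
# earlier one.
sub revcom {
    my ($dna) = @_;
    my $rc = reverse $dna;    # scalar assignment, so the string is reversed
    $rc =~ tr/ACGTMRWSYKVHDBNacgtmrwsykvhdbn/TGCAKYWSRMBDHVNtgcakywsrmbdhvn/;
    return $rc;
}

print revcom('ACGTCAGTCGAGCT'), "\n";
print revcom('ATGCTTMWNCCTYBNKSTT'), "\n";
```

Taking the reverse complement twice should give back the original sequence, which makes a convenient sanity check for the translation table.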
26
Reading Data from Files
Sample Data in FASTA Format

    >NM_012345 Sample Data Muppet Stuffing Protein
    MNIDDKLEFGDEMGOSSRTMV
    FGDLVRSMPHOEILAADEVL
    ISHEE
    GLOYAKLEFGDEMGOGHDDEFGVY
27
Reading Files
    #!/usr/bin/perl -w
    # The filename of the file containing the sequence data
    $proteinFilename = 'NM_012345.pep';

    # Open the file, and associate a filehandle with it
    open(PROTEINFILE, $proteinFilename)
        or die "Cannot open $proteinFilename: $!\n";

    # Read one line from the file with the input operator
    $muppetProtein = <PROTEINFILE>;

    # Print the line that was read
    print "Here is the protein:\t$muppetProtein\n\n";
    exit;
29
Let's try this again

    #!/usr/bin/perl -w
    $proteinFilename = 'NM_012345.pep';
    open(PROTEINFILE, $proteinFilename)
        or die "Cannot open $proteinFilename: $!\n";

    # Each read in scalar context returns the next line of the file
    $muppetProtein = <PROTEINFILE>;
    print "Here is the first line:\t$muppetProtein\n\n";

    $muppetProtein = <PROTEINFILE>;
    print "Here is the second line:\t$muppetProtein\n\n";

    $muppetProtein = <PROTEINFILE>;
    print "Here is the third line:\t$muppetProtein\n\n";

    close PROTEINFILE;
    exit;
30
Using Arrays to Read Files

    #!/usr/bin/perl -w
    $proteinFilename = 'NM_012345.pep';

    # Open the file
    open(PROTEINFILE, $proteinFilename)
        or die "Cannot open $proteinFilename: $!\n";

    # Read the sequence data from the file, and store it in
    # the array variable @protein (one line per element)
    @protein = <PROTEINFILE>;

    # Print the protein onto the screen
    print @protein;
    close PROTEINFILE;
    exit;
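
Once the lines are in an array, a common next step is to drop the FASTA header and join the remaining lines into one sequence string. The sketch below shows one way to do that. The helper name `sequence_from_lines` is hypothetical, and the record is read from an in-memory string (an assumption made so the example runs without a file on disk); with a real file, open it by name instead.

```perl
use strict;
use warnings;

# sequence_from_lines is a hypothetical helper: it skips the FASTA header
# and blank lines, strips whitespace, and joins the rest into one string.
sub sequence_from_lines {
    my @lines = @_;
    my $sequence = '';
    for my $line (@lines) {
        next if $line =~ /^>/;       # header line
        next if $line =~ /^\s*$/;    # blank line
        $line =~ s/\s+//g;           # newline and any stray spaces
        $sequence .= $line;
    }
    return $sequence;
}

# Sample record held in a string so the sketch runs without a file on disk.
my $record = ">NM_012345 sample record\nMNIDDKLEFGDEMGOSSRTMV\nFGDLVRSMPHOEILAADEVL\n";
open(my $fh, '<', \$record) or die "Cannot open in-memory record: $!";
my @protein = <$fh>;
close $fh;

print sequence_from_lines(@protein), "\n";
```

The three-argument `open` with a lexical filehandle (`my $fh`) is used here instead of the bareword `PROTEINFILE` style; both read lines the same way through the `<...>` input operator.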
31
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Now print each element of the array
    print "\nFirst element: ", $bases[0];
    print "\nSecond element: ", $bases[1];
    print "\nThird element: ", $bases[2];
    print "\nFourth element: ", $bases[3];
32
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Now print each element of the array in a row
    print "\nHere are all of the bases: ", @bases;
    # This prints out: Here are all of the bases: ACGT

    # But you can print them out with spaces in between
    print "\nHere they are with spaces: ", "@bases";
33
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to take an element off of the end
    $base1 = pop @bases;
    print "Here's the last element: ", $base1, "\n\n";

    # The other elements still remain
    print "\nHere are the remaining elements: ", "@bases";
34
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to take an element off of the front
    $base2 = shift @bases;
    print "Here's the first element: ", $base2, "\n\n";

    # The other elements still remain
    print "\nHere are the remaining elements: ", "@bases";
35
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how you put an element at the beginning of an array.
    # Our example will put the last element at the beginning.
    $base1 = pop @bases;
    unshift(@bases, $base1);
    print "Here's the last element put first: ", "@bases\n\n";
36
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how you put an element at the end of an array.
    # Our example will put the first element at the end.
    $base1 = shift @bases;
    push(@bases, $base1);
    print "Here's the first element put last: ", "@bases\n\n";
37
Arrays
    # Here's one way to declare an array
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to reverse an array
    @reverse = reverse @bases;

    # Here's how to get the length
    print scalar @bases, "\n\n";

    # Here's how to insert an element at an arbitrary place
    splice(@bases, 2, 0, 'X');
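
The arguments to `splice` are the array, a starting index, a number of elements to remove, and an optional list to insert. A short sketch of the three common cases follows; the variable `@removed` and the replacement value `'N'` are made up for this example.

```perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');

# Insert 'X' before index 2 without removing anything (length 0).
splice(@bases, 2, 0, 'X');
print "@bases\n";               # A C X G T

# Remove one element at index 2; splice returns what it removed.
my @removed = splice(@bases, 2, 1);
print "@removed | @bases\n";    # X | A C G T

# Replace two elements starting at index 1 with a single element.
splice(@bases, 1, 2, 'N');
print "@bases\n";               # A N T
```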
38
Arrays
    # Arrays can be evaluated as lists and scalars
    @bases = ('A', 'C', 'G', 'T');

    # Here's how to print the array
    print "@bases\n";

    # Here's how to assign it to a scalar (gives the number of elements)
    $a = @bases;
    print $a;

    # Here's how to assign an array to a list (gives the first element)
    ($a) = @bases;
    print $a;
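
The difference between the two assignments comes down to context: the parentheses on the left-hand side decide whether the array is read as a count or as a list. A minimal sketch, with variable names chosen only for this example:

```perl
use strict;
use warnings;

my @bases = ('A', 'C', 'G', 'T');

my $count      = @bases;    # scalar context: number of elements
my ($first)    = @bases;    # list context: first element
my ($x, $y)    = @bases;    # list context: first two elements
my $last_index = $#bases;   # index of the last element

print "count=$count first=$first pair=$x$y last_index=$last_index\n";
```

Forgetting the parentheses in `my ($first) = @bases;` is an easy mistake to make: without them the variable silently receives the element count instead of the first element.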