Title: Part II: Sequence Analysis
1Part II: Sequence Analysis
Paul Tan Thiam Joo (paul_at_bic.nus.edu.sg)
Department of Biochemistry, Medicine Faculty, NUS
Institute for Infocomm Research
2What is sequence analysis?
- Nucleic acids: DNA and RNA
- Proteins: amino acid composition, pI, molecular weight, hydrophobicity
3Why do sequence analysis?
- Assessing potential allergenicity (Gendel, 2002)
- Parkinson's disease (neurodegenerative) (Tversky and Fink, 2002)
- Human Genome Project (draft sequence published in 2001, completed in 2003)
4Sequence analysis of proteins
- Backtranslation
- Amino acid composition
- Molecular weights, pIs
- Hydropathy profile
5http://kr.expasy.org
6Backtranslation
- Protein -> DNA
- Used for cloning a protein of interest that may be present in low amounts.
- Beware of codon bias and the degeneracy of the genetic code.
7 UUU-Phe UCU-Ser UAU-Tyr UGU-Cys
UUC-Phe UCC-Ser UAC-Tyr UGC-Cys
UUA-Leu UCA-Ser UAA-Stop UGA-Stop
UUG-Leu UCG-Ser UAG-Stop UGG-Trp
CUU-Leu CCU-Pro CAU-His CGU-Arg
CUC-Leu CCC-Pro CAC-His CGC-Arg
CUA-Leu CCA-Pro CAA-Gln CGA-Arg
CUG-Leu CCG-Pro CAG-Gln CGG-Arg
AUU-Ile ACU-Thr AAU-Asn AGU-Ser
AUC-Ile ACC-Thr AAC-Asn AGC-Ser
AUA-Ile ACA-Thr AAA-Lys AGA-Arg
AUG-Met ACG-Thr AAG-Lys AGG-Arg
GUU-Val GCU-Ala GAU-Asp GGU-Gly
GUC-Val GCC-Ala GAC-Asp GGC-Gly
GUA-Val GCA-Ala GAA-Glu GGA-Gly
GUG-Val GCG-Ala GAG-Glu GGG-Gly
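The degeneracy warning for backtranslation can be made concrete with the codon table above. The following is a minimal sketch, not a cloning tool: it rebuilds the standard genetic code in RNA letters, inverts it, and counts how many distinct RNA sequences could encode a given peptide. The function names (`degeneracy`, `backtranslate`) are illustrative choices, and codon bias is deliberately ignored.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed with the first base varying slowest and the
# third base fastest over U, C, A, G. "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Forward map: codon -> one-letter amino acid (e.g. "AUG" -> "M")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse map: amino acid -> list of synonymous codons
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def backtranslate(peptide):
    """Return, for each residue, the list of codons that could encode it."""
    return [SYNONYMS[aa] for aa in peptide.upper()]


def degeneracy(peptide):
    """Count the distinct RNA sequences that encode the peptide."""
    count = 1
    for codons in backtranslate(peptide):
        count *= len(codons)
    return count


if __name__ == "__main__":
    print(backtranslate("MW"))   # [['AUG'], ['UGG']]
    print(degeneracy("MKFF"))    # 8
    print(degeneracy("LSR"))     # 216 (three residues with six codons each)
```

Peptide stretches rich in Met and Trp give the fewest candidate sequences, which is one reason they are attractive regions for designing probes or primers.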
8Biased codon usage
9Amino Acid Composition
- Determine the percentages of amino acid residues present in a protein molecule.
- Uses:
  - Infer the lifestyles of organisms: high percentages of glutamate (negative charge) together with lysine and arginine (positive charge) are found in hyperthermophiles but are absent in mesophiles (Tekaia et al., 2002).
  - Predict structural class (Luo et al., 2002).
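As a rough illustration of what an amino acid composition calculation involves, the sketch below counts residues in a one-letter protein sequence and reports percentages. The helper `charged_summary` and the toy sequence are hypothetical; they only show how the glutamate and lysine plus arginine percentages mentioned above could be read off a composition.

```python
from collections import Counter


def aa_composition(sequence):
    """Return {residue: percentage} for a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in sorted(counts.items())}


def charged_summary(sequence):
    """Percentages of glutamate (E) and of lysine plus arginine (K + R)."""
    comp = aa_composition(sequence)
    return {
        "E": comp.get("E", 0.0),
        "K+R": comp.get("K", 0.0) + comp.get("R", 0.0),
    }


if __name__ == "__main__":
    toy = "AAKE"  # hypothetical four-residue example
    print(aa_composition(toy))   # {'A': 50.0, 'E': 25.0, 'K': 25.0}
    print(charged_summary(toy))  # {'E': 25.0, 'K+R': 25.0}
```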
10Nonpolar amino acids (FILMWAV)
11Polar uncharged (S-QT-NY-)
12Polar charged (KHERD)
13Unique Properties
14Protein functions from specific residues
- C: Disulphide-rich proteins, zinc fingers
- G: Collagens
- H: Histidine-rich glycoprotein
- K, R: Nuclear proteins, nuclear localisation
- P: Collagen, filaments
17Molecular weights, pIs
- Aid in the design of purification experiments, e.g. SDS-PAGE, IEF, 2-dimensional gel electrophoresis, column chromatography.
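A molecular weight estimate can be sketched as the sum of average residue masses plus one water molecule for the free termini. The masses below are commonly tabulated average values in daltons; the function name is illustrative, and modifications such as disulphide bonds or glycosylation are ignored. pI is not covered here because it depends on the choice of a pKa set.

```python
# Average residue masses in daltons (amino acid minus one water)
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N-terminal H and C-terminal OH


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


if __name__ == "__main__":
    # Free glycine has a molecular weight of about 75.07 Da
    print(round(molecular_weight("G"), 2))
    print(round(molecular_weight("MKFFLMCLIIFPIMGVLG"), 1))
```

An estimate like this is usually enough to decide which gel percentage or size-exclusion range to try first.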
18Hydropathy Profiles
- Hydropathy: describes the hydrophobicity and hydrophilicity of a protein sequence.
- Hydropathy profile: a graph in which hydropathy values are calculated within a sliding window and plotted for each residue in a protein sequence.
19A sliding window
M K F F L M C L I I F P I M G V L G
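The sliding-window idea can be sketched on the sequence shown above. This example assumes the Kyte-Doolittle hydropathy scale and a window of 7 residues; neither is specified on the slides, and other scales or window sizes work the same way. Each value is the mean hydropathy of one window, so a sequence of length n yields n - window + 1 points.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    seq = "MKFFLMCLIIFPIMGVLG"
    profile = hydropathy_profile(seq, window=7)
    for start, value in enumerate(profile, start=1):
        # Report each window by the position of its central residue
        print(f"centre residue {start + 3:2d}: {value:5.2f}")
```

Plotting these values against the centre position of each window gives the hydropathy profile; a larger window smooths the curve at the cost of resolution.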
20Figure labels: signal region, alpha-helix
21A schematic representation of a 3-D structure of a scorpion toxin (labelled elements: alpha-helix, beta-sheet, alpha-helix)
22Hydropathy Profiles
- Hydropathy scale: each amino acid is assigned a value reflecting its relative hydrophobicity or hydrophilicity.
- Two broad classes of scales:
  - Environmental characteristics of protein residues.
  - Experimental measurements of amino acid physicochemical properties.
23Venn diagram of the physicochemical properties of the 20 amino acids
24Hydropathy Profiles
- Basic ranking: internal (FILMV), external (DEHKNQR), ambivalent (ACGPSTWY)
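The basic ranking can be applied directly to a sequence. The sketch below labels each residue as internal (i), external (e) or ambivalent (a) using the three groups listed above; the single-letter labels and the function name are illustrative choices, not part of the original scheme.

```python
INTERNAL = set("FILMV")
EXTERNAL = set("DEHKNQR")
AMBIVALENT = set("ACGPSTWY")


def classify_residues(sequence):
    """Return a string of i/e/a labels, one per residue."""
    labels = []
    for aa in sequence.upper():
        if aa in INTERNAL:
            labels.append("i")
        elif aa in EXTERNAL:
            labels.append("e")
        elif aa in AMBIVALENT:
            labels.append("a")
        else:
            raise ValueError(f"unknown residue: {aa}")
    return "".join(labels)


if __name__ == "__main__":
    seq = "MKFFLMCLIIFPIMGVLG"  # sequence from the sliding-window slide
    print(seq)
    print(classify_residues(seq))  # ieiiiiaiiiiaiiaiia
```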
25Hydropathy Profiles
- Detect possible transmembrane domains (runs of 20-25 consecutive hydrophobic amino acids).
- Identify hydrophobic protein cores.
- Predict neurotoxicity in snake phospholipases A2 (Kini and Iwanaga, 1986).
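The transmembrane rule of thumb above can be sketched as a simple scan for long uninterrupted runs of nonpolar residues. This example uses the nonpolar set from slide 10 (FILMWAV) and a minimum run length of 20; both are simplifying assumptions, since real transmembrane segments often contain a few polar residues and are usually detected from a windowed hydropathy profile instead.

```python
import re

NONPOLAR = "FILMWAV"  # nonpolar residues as grouped on slide 10


def hydrophobic_runs(sequence, min_len=20):
    """Return (start, end, run) for each run of at least min_len nonpolar residues.

    start and end are 0-based, with end exclusive (Python slice convention).
    """
    pattern = re.compile(f"[{NONPOLAR}]{{{min_len},}}")
    return [
        (m.start(), m.end(), m.group())
        for m in pattern.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Hypothetical sequence: 22 leucines flanked by charged residues
    toy = "DEK" + "L" * 22 + "DEK"
    for start, end, run in hydrophobic_runs(toy):
        print(f"candidate segment at {start + 1}-{end} ({len(run)} residues)")
```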
26References
- Kini RM, Iwanaga S. (1986) Toxicon 24(6):527-541.
- Rehm BH. (2001) Appl Microbiol Biotechnol 57(5-6):579-592.
- Weir M, Swindells M, Overington J. (2001) Trends Biotechnol 19(10 Suppl):S61-S66.
- Gendel SM. (2002) Ann NY Acad Sci 964:87-98.
- Luo RY, Feng ZP, Liu JK. (2002) Eur J Biochem 269(17):4219-4225.
- Tekaia F, Yeramian E, Dujon B. (2002) Gene 297:51-60.
- Tversky VN, Fink AL. (2002) FEBS Lett 522(1-3):9-13.
- EXPASY: http://cn.expasy.org/