Genomics - PowerPoint PPT Presentation

About This Presentation

Title:

Genomics

Description:

Lecture 19A: Protein-protein interactions Complexity: Multibody interaction Diversity: Various interaction types Specificity: ... – PowerPoint PPT presentation

Slides: 65

Provided by: vijeuniv8

Transcript and Presenter's Notes

Title: Genomics

1
Lecture 19
(A) Protein-protein interaction and (B) Nucleic
Acid Structure
Introduction to Bioinformatics
2
Lecture 19A: Protein-protein interactions

Complexity
Multibody interaction
Diversity
Various interaction types
Specificity
Complementarity in shape and binding properties

3
PPI Characteristics

Universal
Cell functionality based on protein-protein
interactions
Cyto-skeleton
Ribosome
RNA polymerase
Numerous
Yeast
6,000 proteins
at least 3 interactions each
18,000 interactions
Human
estimated 100,000 interactions
Network
simplest: homodimer (two)
common: hetero-oligomer (more)
holistic: protein network (all)

4
Interface Area

Contact area
usually > 1100 Å²
each partner > 550 Å²
each partner loses 800 Å² of solvent accessible
surface area
20 amino acids lose 40 Å² each
100-200 J per Å²
Average buried accessible surface area
12% for dimers
17% for trimers
21% for tetramers
83-84% of all interfaces are flat
Secondary structure
50% α-helix
20% β-sheet
20% coil
10% mixed
Less hydrophobic than core, more hydrophobic than
exterior

5
Complexation Reaction

$A + B \rightleftharpoons AB$
$K_a = \frac{[AB]}{[A][B]}$ (association)
$K_d = \frac{[A][B]}{[AB]}$ (dissociation)
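
The two constants on this slide are reciprocals of each other, and the dissociation constant equals the free partner concentration at which half of the other partner is bound. A minimal Python sketch of that relationship, assuming simple 1:1 binding; the function names and the 10 nM example value are illustrative and not taken from the lecture:

```python
def association_constant(kd_molar: float) -> float:
    """Ka = 1 / Kd for the 1:1 reaction A + B <-> AB."""
    return 1.0 / kd_molar


def fraction_bound(free_b_molar: float, kd_molar: float) -> float:
    """Fraction of A present as AB at a given free concentration of B.

    Rearranging Kd = [A][B]/[AB] gives [AB] / ([A] + [AB]) = [B] / (Kd + [B]).
    """
    return free_b_molar / (kd_molar + free_b_molar)


kd = 10e-9  # hypothetical complex with Kd = 10 nM
for conc in (1e-9, 10e-9, 100e-9):
    print(f"[B] = {conc:.0e} M -> fraction bound = {fraction_bound(conc, kd):.2f}")
```

At [B] equal to Kd the function returns exactly one half, which is why Kd is often read as a "half-saturation" concentration.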

6
Experimental Methods for determining PPI

2D (poly-acrylamide) gel electrophoresis → mass
spectrometry
Liquid chromatography
e.g. gel permeation chromatography
Binding study with one immobilized partner
e.g. surface plasmon resonance
In vivo by two-hybrid systems or FRET
Binding constants by ultra-centrifugation,
micro-calorimetry or competition
Experiments with labelled ligand
e.g. fluorescence, radioactivity
Role of individual amino acids by site directed
mutagenesis
Structural studies
e.g. NMR or X-ray

7
PPI Network
http://www.phy.auckland.ac.nz/staff/prw/biocomplexity/protein_network.htm
8
Binding vs. Localization
[Figure: interaction types placed by binding strength (weak to strong) and localization (co-expressed and at same place vs. different places)]
Obligate oligomers
Non-obligate permanent, e.g. antibody-antigen
Non-obligate triggered transient, e.g. GTP, PO4-
Non-obligate co-localised, e.g. in membrane
Non-obligate weak transient
9
Some terminology

Transient interactions
Associate and dissociate in vivo
Weak transient
dynamic oligomeric equilibrium
Strong transient
require a molecular trigger to shift the
equilibrium
Obligate PPI
protomers form no stable structures on their own (i.e.
they need to interact in complexes)
(functionally obligate)

10
Analysis of 122 Homodimers

70 interfaces are single-patched
35 have two patches
17 have three or more

11
Interfaces

30% polar
70% non-polar

12
Interface

Rim is water accessible

rim
interface
13
Interface composition

Composition of interface essentially the same as
core
But surface area can be quite different!

different surface/interface areas
14
Some preferences
[Figure: residues that interfaces prefer and residues that interfaces avoid]
15
Ribosome structure

In the nucleolus, ribosomal RNA is transcribed,
processed, and assembled with ribosomal proteins
to produce ribosomal subunits
At least 40 ribosomes must be made every second
in a yeast cell with a 90-min generation time
(Tollervey et al. 1991). On average, this
represents the nuclear import of 3100 ribosomal
proteins every second and the export of
80 ribosomal subunits out of the nucleus every
second. Thus, a significant fraction of nuclear
trafficking is used in the production of
ribosomes.
Ribosomes are made of a small and a large subunit

Large (1) and small (2) subunit fit together
(note this figure mislabels angstroms as
nanometers)
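
The production figures quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given above; the derived quantities (proteins per ribosome, ribosomes per generation) are implied values, not figures stated in the lecture:

```python
# Values quoted on the slide (Tollervey et al. 1991)
RIBOSOMES_PER_SECOND = 40
PROTEINS_IMPORTED_PER_SECOND = 3100
SUBUNITS_EXPORTED_PER_SECOND = 80
GENERATION_TIME_MIN = 90

# Implied number of ribosomal proteins needed per ribosome
proteins_per_ribosome = PROTEINS_IMPORTED_PER_SECOND / RIBOSOMES_PER_SECOND

# Each ribosome has one small and one large subunit
subunits_per_ribosome = SUBUNITS_EXPORTED_PER_SECOND / RIBOSOMES_PER_SECOND

# Ribosomes produced over one generation at this rate
ribosomes_per_generation = RIBOSOMES_PER_SECOND * GENERATION_TIME_MIN * 60

print(proteins_per_ribosome, subunits_per_ribosome, ribosomes_per_generation)
```

The export rate of 80 subunits per second is simply two subunits for each of the 40 ribosomes, and the import rate corresponds to roughly 78 proteins per ribosome.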
16
Ribosome structure

The ribosomal subunits of prokaryotes and
eukaryotes are quite similar but display some
important differences.
Prokaryotes have 70S ribosomes, each consisting
of a (small) 30S and a (large) 50S subunit,
whereas eukaryotes have 80S ribosomes, each
consisting of a (small) 40S and a bound (large)
60S subunit.
However, the ribosomes found in chloroplasts and
mitochondria of eukaryotes are 70S, this being
but one of the observations supporting the
endosymbiotic theory.
"S" means Svedberg units, a measure of the rate
of sedimentation of a particle in a centrifuge,
where the sedimentation rate is associated with
the size of the particle. Note that Svedberg
units are not additive.
Each subunit consists of one or two very large
RNA molecules (known as ribosomal RNA or rRNA)
and multiple smaller protein molecules.
Crystallographic work has shown that there are no
ribosomal proteins close to the reaction site for
polypeptide synthesis. This suggests that the
protein components of ribosomes act as a scaffold
that may enhance the ability of rRNA to
synthesise protein rather than directly
participating in catalysis.
The differences between the prokaryotic and
eukaryotic ribosomes are exploited by humans
since the 70S ribosomes are vulnerable to some
antibiotics that the 80S ribosomes are not. This
helps pharmaceutical companies create drugs that
can destroy a bacterial infection without harming
the animal/human host's cells!

17
70S structure at 5.5 Å
(Noller et al. Science 2001)
18
70S structure
19
30S-50S interface

Overall buried surface area 8500 Å²
[Figure legend, buried area per residue:]
< 37.5 Å²
37.5 Å² to 75 Å²
> 75 Å²

20
Protein-nucleic acid Interactions
21
Interactions in the Ribosome
22
Docking - ZDOCK

Protein-protein docking
3-dimensional (3D) structure of protein complex
starting from 3D structures of receptor and
ligand
Rigid-body docking algorithm (ZDOCK)
pairwise shape complementarity function
all possible binding modes
using Fast Fourier Transform algorithm
Refinement algorithm (RDOCK)
Take top 2000 predicted structures from ZDOCK
(RDOCK is too computer intensive to refine very
many possible dockings)
three-stage energy minimization
electrostatic and desolvation energies
molecular mechanical software (CHARMM)
statistical energy method (Atomic Contact Energy)
49 non-redundant unbound test cases
near-native structure (< 2.5 Å) ranked on top for 37% of test
cases
and within the top 4 for 49%

23
Protein-protein docking

Finding correct surface match
Systematic search
2 times 3D space!
Define functions
1 on surface
ρ or δ inside
0 outside

[Figure labels: δ, ρ]
24
Protein-protein docking

Correlation function
$C_{\alpha,\beta,\gamma} = \frac{1}{N^3} \sum_{o} \sum_{p} \sum_{q} \exp\left[2\pi i\,(o\alpha + p\beta + q\gamma)/N\right] \cdot C_{o,p,q}$
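
The correlation function on this slide is an inverse discrete Fourier transform, which is what lets an FFT score every translation at once instead of searching "2 times 3D space" explicitly. The following is a minimal sketch of that idea, assuming NumPy, cyclic grid boundaries and translations only (rotations would be an outer loop). It is not the ZDOCK implementation; the function names and the one-voxel toy grids are hypothetical.

```python
import numpy as np


def correlation_scores(receptor: np.ndarray, ligand: np.ndarray) -> np.ndarray:
    """Score all cyclic translations of `ligand` against `receptor`.

    scores[a, b, g] = sum over x of receptor[x] * ligand[x - (a, b, g)]
    """
    A = np.fft.fftn(receptor)
    B = np.fft.fftn(ligand)
    # Multiply in Fourier space, then apply the inverse DFT
    # (the 1/N^3 triple sum in the correlation function).
    return np.real(np.fft.ifftn(A * np.conj(B)))


def best_translation(scores: np.ndarray) -> tuple:
    """Grid shift of the ligand with the highest correlation score."""
    return tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))


# Toy example: a single "surface" voxel (value 1) in each molecule
# on an 8 x 8 x 8 grid; everything else is 0 (outside).
N = 8
receptor = np.zeros((N, N, N))
ligand = np.zeros((N, N, N))
receptor[5, 2, 3] = 1.0
ligand[1, 1, 1] = 1.0

shift = best_translation(correlation_scores(receptor, ligand))
print(shift)  # the shift that moves the ligand voxel onto the receptor voxel
```

A direct search would evaluate N³ shifts with N³ voxel products each; the FFT route needs only three transforms of size N³ per orientation. In a real scoring grid the interior values (ρ and δ) would penalise overlap of the two cores and reward surface contact.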

25
Docking Programs

ZDOCK, RDOCK
AutoDock
Bielefeld Protein Docking
DOCK
DOT
FTDock, RPScore and MultiDock
GRAMM
Hex 3.0
ICM Protein-Protein docking (Abagyan group,
currently the best)
KORDO
MolFit
MPI Protein Docking
Nussinov-Wolfson Structural Bioinformatics Group

26
Docking Programs

Issues
Rigid structures or made flexible?
Side-chains
Main-chains
Full atomic detail or simplified models?
Docking energy functions (purpose built force
fields)

27
Docking example: antibody HyHEL-63 (cyan)
complexed with Hen Egg White Lysozyme
The X-ray structure of the antibody HyHEL-63
(cyan) uncomplexed and complexed with Hen Egg
White Lysozyme (yellow) has shown that there are
small but significant, local conformational
changes in the antibody paratope on binding. The
structure also reveals that most of the charged
epitope residues face the antibody. Details are
in Li YL, Li HM, Smith-Gill SJ and Mariuzza RA
(2000) Three-dimensional structures of the free and
antigen-bound Fab from monoclonal antilysozyme
antibody HyHEL-63. Biochemistry 39: 6296-6309.
Salt links and electrostatic interactions
provide much of the free energy of binding. Most
of the charged residues face the interface in the
X-ray structure. The importance of the salt link
between Lys97 of HEL and Asp27 of the antibody
heavy chain is revealed by molecular dynamics
simulations. After 1 ns of MD simulation at
100 °C the overall conformation of the complex has
changed, but the salt link persists. Details are
described in Sinha N and Smith-Gill SJ (2002)
Electrostatics in protein binding and function.
Current Protein & Peptide Science 3: 601-614.
28
Introduction to Bioinformatics

Lecture 19B
Nucleic acid structure

29
Nucleic Acid Basics

Nucleic Acids Are Polymers
Each Monomer Consists of Three Moieties
Nucleotide: a base, a ribose sugar, a phosphate
Nucleoside: a base and a ribose sugar
A Base Can Be One of Five Rings

Pyrimidines

Purines

Pyrimidines and Purines can Base-Pair
(Watson-Crick Pairs)

32

Unlike three dimensional structures of proteins,
DNA molecules assume simple double helical
structures independent of their sequences. There
are three kinds of double helices that have been
observed in DNA: type A, type B, and type Z,
which differ in their geometries. The double
helical structure is essential to the coding
function of DNA. Watson (biologist) and Crick
(physicist) proposed the double helix
structure in 1953, based on X-ray diffraction data.
RNA, on the other hand, can have structures as
diverse as those of proteins, as well as a simple double
helix of type A. The ability to be both
informational and diverse in structure suggests
that RNA was the prebiotic molecule that could
function in both replication and catalysis (the
RNA World Hypothesis). In fact, some viruses
encode their genetic material in RNA (e.g. retroviruses).

33
Forces That Stabilize Nucleic Acid Double Helix

There are two major forces that contribute to
stability of helix formation
Hydrogen bonding in base-pairing
Hydrophobic interactions in base stacking

[Figure: same-strand stacking and cross-strand stacking, with the 5' and 3' ends of both strands labelled]
34
Types of DNA Double Helix

Type A: major conformation of RNA, minor
conformation of DNA
Type B: major conformation of DNA
Type Z: minor conformation of DNA

[Figure: Z-, A- and B-form helices with 5' and 3' strand ends labelled; captions: narrow, tight; wide, less tight; left-handed, least tight]
35
Three Dimensional Structures of Double Helices
A-DNA
Minor Groove
Major Groove
36
Secondary Structures of Nucleic Acids

DNA is primarily in duplex form.
RNA is normally single-stranded and can adopt
diverse secondary structures other than a
duplex.

37
More Secondary Structures of Nucleic Acids
Pseudoknots
Source: Cornelis W. A. Pleij in Gesteland, R. F.
and Atkins, J. F. (1993) THE RNA WORLD. Cold
Spring Harbor Laboratory Press.
38
3D Structures of RNA: Transfer RNA Structures
Secondary Structure of tRNA
Tertiary Structure of tRNA
TψC Loop
Anticodon Stem
Variable loop
D Loop
Anticodon Loop
Gm, Cm, etc., are modified bases
39
3D Structures of RNA: Ribosomal RNA Structures
Secondary Structure Of large ribosomal RNA
Tertiary Structure Of large ribosome subunit
rRNA Secondary Structure Based on Phylogenetic
Data
40
Central Dogma of Molecular Biology
[Figure labels: Replication, Transcription, Translation; DNA, mRNA, Protein]
Transcription is carried out by RNA polymerase (II)
Translation is performed on ribosomes
Replication is carried out by DNA polymerase
Reverse transcriptase copies RNA into DNA
Transcription + Translation = Expression
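
The two expression steps on this slide can be mimicked on strings. This is a minimal sketch, assuming the input is the coding (sense) strand written 5' to 3', so transcription reduces to replacing T with U; in the cell, RNA polymerase reads the complementary template strand. The example sequence is made up, and the table is the standard genetic code in the usual U, C, A, G ordering.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand (5'->3') to mRNA: same sequence with U instead of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first base in frame 0 until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCCAAATGA"  # hypothetical mini-gene
mrna = transcribe(gene)
print(mrna, translate(mrna))
```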
41
But DNA can also be transcribed into non-coding
RNA

tRNA (transfer): transfer of amino acids to
the ribosome during protein synthesis.
rRNA (ribosomal): essential component of the
ribosomes (complex with rProteins).
snRNA (small nuclear): mainly involved in
RNA splicing (removal of introns). snRNPs.
snoRNA (small nucleolar): involved in chemical
modifications of ribosomal RNAs and other RNA
genes. snoRNPs.
SRP RNA (signal recognition particle): forms
RNA-protein complex involved in protein secretion.
Further: microRNA, eRNA, gRNA, tmRNA, etc.

42
Eukaryotes have spliced genes

Promoter: involved in transcription initiation
(TF/RNApol-binding sites)
TSS: transcription start site
UTRs: un-translated regions (important for
translational control)
Exons: will be spliced together by removal of the
introns
Poly-adenylation site: important for transcription
termination (but also mRNA stability,
export of mRNA from nucleus, etc.)

43
DNA makes mRNA makes Protein
44
Some facts about human genes

There are about 20,000-25,000 genes in the
human genome (~3% of the genome)
Average gene length is 8,000 bp
Average of 5-6 exons per gene
Average exon length is 200 bp
Average intron length is 2,000 bp
8% of the genes have a single exon
Some exons can be as small as 1 or 3 bp

45
DMD: the largest known human gene

The largest known human gene is DMD, the gene
that encodes dystrophin: 2.4 million bp over 79
exons
X-linked recessive disease (affects boys)
Two variants: Duchenne-type (DMD) and Becker-type
(BMD)
Duchenne-type: more severe, frameshift mutations
Becker-type: milder phenotype, in-frame
mutations

Posture changes during progression of Duchenne
muscular dystrophy
46
Nucleic acid basics

Nucleic acids are polymers

nucleotide
nucleoside

Each monomer consists of 3 moieties

47
Nucleic acid basics (2)

A base can be one of 5 rings

Purines and Pyrimidines can base-pair
(Watson-Crick pairs)

Watson and Crick, 1953
48
Nucleic acid as hetero-polymers

Nucleosides, nucleotides

DNA and RNA strands

(Ribose sugar, RNA precursor)
(2'-deoxyribose sugar, DNA precursor)

REMEMBER
DNA: deoxyribonucleotides; RNA: ribonucleotides
(OH groups at the 2' position)
Note the directionality of DNA (5'-3' and 3'-5') or
RNA (5'-3')
DNA: A, G, C, T; RNA: A, G, C, U

(2'-deoxythymidine triphosphate, nucleotide)
49
So

RNA
50
Stability of base-pairing

C-G base pairing is more stable than A-T (A-U)
base pairing (why?)
3rd codon position has freedom to evolve
(synonymous mutations)
Species can therefore optimise their G-C content
(e.g. thermophiles are GC rich) (consequences for
codon use?)

Thermocrinis ruber, heat-loving bacteria
51
DNA compositional biases

Base compositions of genomes: GC (and therefore
also AT) content varies between different
genomes
The GC content is sometimes used to classify
organisms in taxonomy
High GC content bacteria: Actinobacteria, e.g. in
Streptomyces coelicolor it is 72%
Low GC content: Plasmodium falciparum (20%)
Other examples

Saccharomyces cerevisiae (yeast): 38%
Arabidopsis thaliana (plant): 36%
Escherichia coli (bacteria): 50%
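
GC content as used on this slide is just the percentage of G and C among all bases of a sequence. A minimal sketch; the function name and the short test sequences are illustrative, and real genome values such as those listed above come from whole-genome sequences rather than fragments like these:

```python
def gc_content(sequence: str) -> float:
    """Percentage of G and C among the A/C/G/T/U characters of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no nucleotide characters")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


for name, seq in [("GC-rich fragment", "GGCGCCGCGA"), ("AT-rich fragment", "ATTTAATATC")]:
    print(f"{name}: {gc_content(seq):.0f}% GC")
```

Ambiguity codes such as N are ignored here rather than counted, which is one possible convention among several.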

52
Let's return to DNA and RNA structure

Unlike three dimensional structures of proteins,
DNA molecules assume simple double helical
structures independent of their sequences.
There are three kinds of double helices that have
been observed in DNA: type A, type B, and type Z,
which differ in their geometries.
RNA, on the other hand, can have structures as
diverse as those of proteins, as well as a simple double
helix of type A.
The ability to be both informational and
diverse in structure suggests that RNA was the
prebiotic molecule that could function in both
replication and catalysis (the RNA World
Hypothesis).
In fact, some viruses encode their genetic
material in RNA (e.g. retroviruses)

53
Three dimensional structures of double helices
Side view: A-DNA, B-DNA, Z-DNA
Space-filling models of A-, B- and Z-DNA
Top view: A-DNA, B-DNA, Z-DNA
54
Major and minor grooves

55
Forces that stabilize nucleic acid double helix

There are two major forces that contribute to
stability of helix formation
Hydrogen bonding in base-pairing
Hydrophobic interactions in base stacking

[Figure: same-strand stacking and cross-strand stacking, with the 5' and 3' ends of both strands labelled]

56
Types of DNA double helix

Type A
major conformation RNA
minor conformation DNA
Right-handed helix

Type B
major conformation DNA
Right-handed helix

Type Z
minor conformation DNA
Left-handed helix

57
Secondary structures of Nucleic acids

DNA is primarily in duplex form
RNA is normally single-stranded and can adopt
diverse secondary structures other than a
duplex.

58
Non B-DNA Secondary structures

Cruciform DNA

Slipped DNA

Triple helical DNA

Hoogsteen basepairs
Source: Van Dongen et al. (1999), Nature
Structural Biology 6, 854-859
59
More Secondary structures

RNA pseudoknots

Cloverleaf rRNA structure

16S rRNA Secondary Structure Based
on Phylogenetic Data
Source: Cornelis W. A. Pleij in Gesteland, R. F.
and Atkins, J. F. (1993) THE RNA WORLD. Cold
Spring Harbor Laboratory Press.
60
3D structures of RNA: transfer-RNA structures

Secondary structure of tRNA (cloverleaf)

Tertiary structure of tRNA

61
3D structures of RNA: ribosomal-RNA structures

Secondary structure of large rRNA (16S)

Tertiary structure of large rRNA subunit

62
3D structures of RNA: Catalytic RNA

Secondary structure of self-splicing RNA

Tertiary structure of self-splicing RNA

63
Some structural rules

Base-pairing is stabilizing
Un-paired sections (loops) destabilize
3D conformation with interactions makes up for
this
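
The first of these rules, that base-pairing is stabilizing, is the basis of the simplest RNA folding algorithms. The sketch below uses Nussinov-style dynamic programming, a technique not named in the lecture, to count the maximum number of nested base pairs in a sequence. It assumes Watson-Crick plus G-U wobble pairs and a minimum hairpin loop of three unpaired bases, and it ignores loop energies and tertiary interactions entirely; the function name and test sequence are illustrative.

```python
# Allowed pairs: Watson-Crick plus the G-U wobble pair
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(rna: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs (Nussinov-style recursion)."""
    seq = rna.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with k, leaving >= min_loop bases between them
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))  # small hairpin: stem closed by an AAA loop
```

Energy-based folding programs refine this idea by scoring stacked pairs favourably and charging a penalty for each loop, which corresponds to the second rule on this slide.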

64
Final notes

Sense/anti-sense RNA: antisense RNA blocks
translation through hybridization with the coding
strand

Example. Tomatoes synthesize ethylene in order to
ripen. Transgenic tomatoes have been constructed
that carry in their genome an artificial gene
(DNA) that is transcribed into an antisense
RNA complementary to the mRNA for an enzyme
involved in ethylene production → tomatoes make
only 10% of the normal enzyme amount.
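
An antisense RNA hybridizes to its target because it is the reverse complement of the mRNA. A minimal sketch of that relationship; the function name and the short sequence are made up for illustration and do not correspond to the tomato gene:

```python
# Watson-Crick complements for RNA
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Reverse complement of an mRNA, written 5'->3'."""
    return mrna.upper().translate(COMPLEMENT)[::-1]


target = "AUGGCCAAA"  # hypothetical mRNA fragment, 5'->3'
print(antisense_rna(target))
```

Taking the antisense of the antisense returns the original sequence, which is a quick sanity check for any reverse-complement routine.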

Sense/anti-sense peptides
Have been therapeutically used
Especially in cancer and anti-viral therapy

Sense/anti-sense proteins
Does it make (anti)sense?
Codons for hydrophilic and
hydrophobic amino acids on the sense strand may
sometimes be complemented, in frame, by codons
for hydrophobic and hydrophilic amino acids on
the antisense strand. Furthermore, antisense
proteins may sometimes interact with high
specificity with the corresponding sense
proteins. BUT: VERY RARE; HIGHLY CONSERVED CODON
BIAS