Title: Information Storage and Processing in Biological Systems
1. Information Storage and Processing in Biological Systems
A seminar course for the Natural Sciences
Sept 16  Introduction / DNA, Gene regulation
Sept 18  Translation and Proteins
Sept 23  Enzymes and Signal transduction
Sept 25  Biochemical Networks
Sept 30  Simple Genetic Networks
Oct 2    Adventures in Multicellularity
Nov 6    Evolution, Evolvability and Robustness
2. Background
- The Thread of Life. Susan Aldridge. Chapter 2.
- Molecular Biology of the Cell. Alberts et al. Garland Press.
Suggested further reading
- Protein molecules as computational elements in living cells. D. Bray. Nature. 1995 Jul 27;376(6538):307-12.
- Signaling complexes: biophysical constraints on intracellular communication. D. Bray. Annu Rev Biophys Biomol Struct. 1998;27:59-75.
- Metabolic modeling of microbial strains in silico. M. W. Covert, et al. Trends in Biochemical Sciences Vol. 26 (2001) 179-186.
- Modelling cellular behaviour. D. Endy, R. Brent. Nature (2001) 409:391-395.
3. A - Introduction to Proteins / Translation
- The primary structure is defined as the sequence of amino acids in the protein. This is determined by, and is co-linear with, the sequence of bases (triplet codons) in the gene.
DNA      5'---CTCAGCGTTACCAT---3'
         3'---GAGTCGCAATGGTA---5'
RNA      5'---CUCAGCGUUACCAU---3'
PROTEIN  N---Leu-Ser-Val-Thr---C
DNA → (transcription) → RNA → (translation) → PROTEIN
- Note: this co-linearity is not strictly true in most eukaryotic genomes.
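The DNA → RNA → protein flow on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `transcribe` and `translate` are hypothetical, the standard genetic code is assumed, and the input is the upper (coding) strand shown on the slide.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" marks STOP).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_strand):
    """Return the mRNA matching the 5'->3' coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read complete codons from the first base; stop at a STOP codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)

mrna = transcribe("CTCAGCGTTACCAT")  # sequence from the slide
peptide = translate(mrna)
print(mrna)     # CUCAGCGUUACCAU
print(peptide)  # LSVT, i.e. Leu-Ser-Val-Thr; the trailing "AU" is an incomplete codon
```

The sketch reads from the first base only, so it assumes the reading frame is already known.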
4-7. Structure of Genes in Eukaryotic Organisms
[Figure, built up over four slides: a gene with control elements, introns and exons. Transcription yields hnRNA (heterogeneous nuclear RNA); RNA splicing yields mRNA; alternative RNA splicing yields a second, different mRNA.]
8. Structure of Genes in Eukaryotic Organisms
- Coding sequence can be discontinuous, and the gene can be composed of many introns and exons.
- The control regions (operators) can be spread over a large region of DNA and exert action-at-a-distance.
- There can be many different regulators acting on a single gene, i.e. more signal integration than in bacteria.
- Alternative splicing can give rise to more than one protein product from a single gene.
- Predicting genes (introns, exons and proper splicing) is very challenging.
- Because the control elements can be spread over a large segment of DNA, predicting the important sites and their effects on gene expression is not very feasible at this time.
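As a rough illustration of how splicing and alternative splicing turn one transcript into more than one message, the sketch below joins exons and optionally skips one. Everything here is hypothetical: the segment sequences are made up, `splice` is an illustrative helper, and exon skipping is only one of several forms of alternative splicing.

```python
# Hypothetical pre-mRNA (hnRNA) described as ordered segments; sequences are made up.
HNRNA = [
    ("exon", "AUGGCU"),
    ("intron", "GUAAGUUUUCAG"),
    ("exon", "GAAUUC"),
    ("intron", "GUGAGUCCCCAG"),
    ("exon", "CUGUAA"),
]

def splice(segments, skip_exons=()):
    """Join exons in order, dropping introns and any exons listed in skip_exons.

    skip_exons holds 0-based exon numbers (counting exons only).
    """
    mrna = []
    exon_number = 0
    for kind, seq in segments:
        if kind != "exon":
            continue
        if exon_number not in skip_exons:
            mrna.append(seq)
        exon_number += 1
    return "".join(mrna)

full_mrna = splice(HNRNA)                         # all three exons retained
alternative_mrna = splice(HNRNA, skip_exons={1})  # middle exon skipped
print(full_mrna)         # AUGGCUGAAUUCCUGUAA
print(alternative_mrna)  # AUGGCUCUGUAA
```

The two outputs differ even though they come from the same gene, which is why a single eukaryotic gene can give rise to more than one protein product.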
9. Translation
- Translation is the synthesis of a polypeptide (protein) chain using the mRNA template.
- Note the mRNA has directionality and is read from the 5' end towards the 3' end.
- Note that many ribosomes can read one message, like beads on a string, generating many polypeptide chains simultaneously.
10. Translation
- The 5' end is defined at the DNA level by the promoter, but this does not define the translation start.
- The translation start sets the register or reading frame for the message.
- The end is determined by the presence of a STOP codon (in the correct reading frame).
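The idea that the start position sets the reading frame, and that only an in-frame STOP codon ends the message, can be sketched as follows. The message is hypothetical, the helper names are illustrative, and taking the first AUG as the start is a simplification.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def codons_in_frame(mrna, offset):
    """Split the message into complete codons, starting at position `offset`."""
    return [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]

def open_reading_frame(mrna, start):
    """Return the codons from `start` up to (not including) the first in-frame STOP."""
    orf = []
    for codon in codons_in_frame(mrna, start):
        if codon in STOP_CODONS:
            break
        orf.append(codon)
    return orf

mrna = "GGAUGCUCAGCGUUACCUAAGG"  # hypothetical message, written 5' -> 3'
start = mrna.find("AUG")          # simplification: take the first AUG as the start

print(codons_in_frame(mrna, 0))         # frame set by the 5' end: no STOP codon seen
print(open_reading_frame(mrna, start))  # frame set by the AUG: ends at UAA
```

Reading from the 5' end gives a different set of codons than reading from the AUG, so the same bases encode different things depending on the register.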
11. Schematic Illustration of Translation
Protein synthesis involves specialized RNA molecules called transfer RNA or tRNA.
12. Translation Start Position
The translation start is dependent on
1) a sequence motif called a ribosome binding site (rbs)
2) an AUG start codon 5-10 bp downstream from the rbs

     3'-AUUCCUCA-//-5'                   3' end of 16S rRNA
5'-NNNNNNNAGGAGU-N(5-10)-AUG-//-3'       mRNA
           rbs          start
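The two conditions above (an rbs motif, then an AUG 5-10 positions downstream) can be turned into a small search routine. This is only a sketch under stated assumptions: the rbs is taken to be the exact motif AGGAGU from the diagram, the test message is made up, and `find_translation_start` is a hypothetical helper. Real ribosome binding sites vary in sequence and strength.

```python
def find_translation_start(mrna, rbs="AGGAGU", min_gap=5, max_gap=10):
    """Return the index of the first AUG lying min_gap..max_gap bases after an rbs.

    Returns None when no rbs is followed by a suitably spaced AUG.
    """
    pos = mrna.find(rbs)
    while pos != -1:
        rbs_end = pos + len(rbs)
        for gap in range(min_gap, max_gap + 1):
            start = rbs_end + gap
            if mrna[start:start + 3] == "AUG":
                return start
        pos = mrna.find(rbs, pos + 1)
    return None

# Hypothetical message: a decoy AUG near the 5' end, then rbs + 7-base spacer + AUG.
mrna = "GCAUGCCAGGAGUACUUCAUAUGCUCAGCGUUACCUAA"

print(mrna.find("AUG"))              # 2: the first AUG, which has no rbs upstream
print(find_translation_start(mrna))  # 20: the AUG that follows the rbs
```

The decoy AUG at the 5' end is ignored because nothing upstream of it matches the rbs, which mirrors the point that the 5' end alone does not define the translation start.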
13. In bacteria a single mRNA molecule can code for several proteins. Such messages are said to be polycistronic. Since the messages for all genes in such a transcript are present at the same concentration (they are on the same molecule), one might predict that translation levels will be the same for all the genes. This is not the case: translation efficiency can vary for the different messages within a transcript.
[Figure: DNA with Promoter (Start), Gene 1, Gene 2, Gene 3, Gene 4 and Terminator (Stop), transcribed into a single mRNA. 4 genes, 1 message.]
14. Translation Efficiency is an important part of gene expression
Polycistronic mRNA → Translation

Protein                     Tar    Tap    R      B      Y       Z
Protein monomers per cell   5000   1000   <100   1000   18000   10000

A single mRNA may encode several proteins. The final level of each protein may vary significantly and is a function of
1) translation efficiency
2) protein stability
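One simple way to see how translation efficiency and protein stability combine is a steady-state balance between synthesis and degradation. The model below is an assumption added for illustration, not something stated on the slide: it takes dP/dt = k_tl * m - k_deg * P, so the steady-state level is k_tl * m / k_deg. The gene names and rate values are hypothetical and are not fitted to the numbers in the table.

```python
def steady_state_protein(mrna_copies, translation_rate, degradation_rate):
    """Steady state of dP/dt = translation_rate * mrna_copies - degradation_rate * P."""
    return translation_rate * mrna_copies / degradation_rate

MRNA_COPIES = 2  # identical for every gene on one polycistronic message

# Hypothetical (translation_rate, degradation_rate) pairs, arbitrary time units.
GENES = {
    "gene1": (100.0, 0.25),
    "gene2": (25.0, 0.25),  # weaker translation efficiency
    "gene3": (100.0, 0.5),  # less stable protein
}

levels = {
    name: steady_state_protein(MRNA_COPIES, k_tl, k_deg)
    for name, (k_tl, k_deg) in GENES.items()
}
print(levels)  # {'gene1': 800.0, 'gene2': 200.0, 'gene3': 400.0}
```

Even with the same mRNA concentration for all three genes, the protein levels differ fourfold in this toy example.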
15. B - Introduction to Proteins / Characteristics
- The primary structure is defined as the sequence of amino acids in the protein. This is determined by, and is co-linear with, the sequence of bases (triplet codons) in the gene.
DNA      5'---CTCAGCGTTACCAT---3'
         3'---GAGTCGCAATGGTA---5'
RNA      5'---CUCAGCGUUACCAU---3'
PROTEIN  N---Leu-Ser-Val-Thr---C
DNA → (transcription) → RNA → (translation) → PROTEIN
- Note: this co-linearity is not strictly true in most eukaryotic genomes.
16-18. There are 20 naturally occurring amino acids in proteins, each with distinctive side chains that give them characteristic chemical properties.
[Figure, built up over three slides: structure of an amino acid (alanine) with the amino group, the carboxylic acid group, the α-carbon and the -CH3 (methyl) side chain labeled.]
Amino acids differ in the side chains on the α-carbon.
19. Alanine (ala) (A) + Tryptophan (trp) (W) → Dipeptide (Ala-Trp) + H2O
[Figure: the two amino acids joined by a peptide bond.]
By convention polypeptides are written from the N-terminus (amino) to the C-terminus (carboxy).
20.
Amino acid      Three-letter  One-letter
Alanine         ala           A
Arginine        arg           R
Asparagine      asn           N
Aspartic acid   asp           D
Cysteine        cys           C
Glutamine       gln           Q
Glutamic acid   glu           E
Glycine         gly           G
Histidine       his           H
Isoleucine      ile           I
Leucine         leu           L
Lysine          lys           K
Methionine      met           M
Phenylalanine   phe           F
Proline         pro           P
Serine          ser           S
Threonine       thr           T
Tryptophan      trp           W
Tyrosine        tyr           Y
Valine          val           V
[Figure labels: Glycine, Proline, Cysteine]
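The one-letter and three-letter codes in this table map directly onto each other, which makes it easy to convert a compact sequence into the N-terminus to C-terminus notation used on the slides. A minimal sketch, with `to_three_letter` as a hypothetical helper name:

```python
# One-letter to three-letter codes, taken from the table on this slide.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def to_three_letter(peptide):
    """Write a one-letter sequence as hyphenated three-letter codes, N- to C-terminus."""
    return "-".join(THREE_LETTER[residue] for residue in peptide.upper())

print(to_three_letter("AW"))    # Ala-Trp
print(to_three_letter("LSVT"))  # Leu-Ser-Val-Thr
```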
21. The Newly Synthesized Polypeptide
- The information from DNA → RNA → Protein is linear, and the final polypeptide synthesized will have a sequence of amino acids defined by the sequence of codons in the message.
- The sequence of amino acids is called the primary structure.
- Secondary structure refers to local regular/repeating structural elements.
- The folded three dimensional structure is referred to as tertiary structure.
- Protein function depends on an ordered / defined three dimensional folding. The final three dimensional folded state of the protein is an intrinsic property of the primary sequence. How the primary sequence defines the final folded conformation is generally referred to as the Protein Folding Problem.
22. Primary structure of green fluorescent protein (single letter AA codes)
SEQUENCE 238 AA; 26886 MW
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEG
DATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFK
SAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGN
ILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQN
TPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDE
LYK
The primary sequence can be derived directly from
the gene sequence but going from sequence to
structure or sequence to function is not possible
unless there is a related protein for which
structure or function is known. Likewise, the
structure alone rarely provides information about
function (only if the function of a related
protein is known).
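Length and an approximate molecular weight are two properties that do follow directly from the primary sequence. The sketch below checks the 238-residue length quoted on the slide and estimates the mass. The residue masses are rounded average values supplied here as an assumption (they are not from the slides), so the estimate should only be expected to land near the quoted 26886, not to reproduce it exactly.

```python
# Approximate average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # the free N- and C-termini together account for one water

# GFP primary structure as given on the slide.
GFP = (
    "MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT"
    "GKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF"
    "KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV"
    "YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY"
    "LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK"
)

def molecular_weight(sequence):
    """Sum the residue masses and add one water for the chain termini."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

print(len(GFP))                       # 238
print(round(molecular_weight(GFP)))   # approximate mass in daltons
print(GFP[187:193])                   # residues 188-193: IGDGPV
```

Nothing in this calculation says anything about the folded structure or the function, which is the point the slide makes about the limits of sequence information.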
23-29. Projections of the Tertiary Structure of Green Fluorescent Protein
[Figures: successive views of the same structure.]
- Backbone tracing, with the segment Ile188-Gly189-Asp190-Gly191-Pro192-Val193 labeled.
- Ribbon diagram showing secondary structures (α-helix, β-strand).
- Wireframe model showing all atoms and chemical bonds, with the segment Ile188-Gly189-Asp190-Gly191-Pro192-Val193 labeled.
- Stick model showing all atoms and chemical bonds.
- Space filling model where each atom is represented as a sphere of its van der Waals radius.
30. The final folded three dimensional (tertiary) structure is an intrinsic property of the primary structure.
Primary structure:
MSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTT
GKLPVPWPTLVTTFSYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFF
KDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNV
YIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHY
LSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
[Figure: Primary structure and Tertiary structure. Folding takes the random coil (denatured, unfolded) state to the native (folded) state; denaturation takes it back.]
In general, proteins are unstable outside of the cell and very sensitive to solvent conditions.
31.
- Active site: the region of a protein (enzyme) to which a substrate molecule binds.
- The active site is formed by the three dimensional folding of the peptide backbone and amino acid side chains (lock and key / induced fit).
- The active site is highly specific in binding interactions (stereochemical specificity).
[Figure: The three dimensional structure of CAP and the cAMP ligand-binding site (Figures 3-45 and 3-55 from Alberts).]
32-33. Conformational Change in Protein Structure
Proteins can undergo changes in their three dimensional structure in response to changing conditions or interactions with other molecules. This usually alters the activity of the protein.
Binding of the substrate (glucose) causes the protein (hexokinase) to shift from an open to a closed conformation. (Fig. 5-2, Alberts)