Title: A simple model for the evolution of molecular codes driven by the interplay of accuracy, diversity and cost
1 A simple model for the evolution of molecular codes driven by the interplay of accuracy, diversity and cost
- Tsvi Tlusty, Physical Biology
- Gidi Lasovski
2 The main idea
- Understanding molecular codes
- Their evolution and the forces that affect them
3
- What is a molecular code
- The genetic code
- The fitness of molecular codes
- The evolution and emergence of molecular codes
- Suggested experimental verification
4 The Central Dogma of Molecular Biology
- A signaling protein binds to a gene
- The RNA polymerase generates mRNA from the gene
- The mRNA exits the nucleus of the cell
- A ribosome reads the mRNA and creates a protein, with the help of tRNAs
- The tRNAs provide the ribosome with amino acids, the building blocks of the protein
5 What is a molecular code?
- The Genetic Code is a molecular code
- The symbols are A, U, C, G
- The Machine
- RNA Polymerase
- Signaling molecules (proteins)
- mRNA
- Ribosome
- The output
- Proteins
- The cost of operating the machine is the ATP and the tRNAs.
- The symbols encode amino acids redundantly
- 64 options, but only 20 amino acids
- For robustness reasons?
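The "64 options" on this slide can be checked by direct enumeration. The sketch below assumes the standard triplet reading of the four symbols (the triplet length is not stated on the slide) and ignores stop codons when computing the average redundancy.

```python
from itertools import product

# The four RNA symbols named on the slide.
SYMBOLS = "AUCG"
# Assumption: symbols are read in triplets (codons), as in the standard genetic code.
CODON_LENGTH = 3

codons = ["".join(triplet) for triplet in product(SYMBOLS, repeat=CODON_LENGTH)]
n_codons = len(codons)
n_amino_acids = 20

# Average number of codons per amino acid (stop codons are ignored here).
redundancy = n_codons / n_amino_acids
print(n_codons, redundancy)  # 64 3.2
```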
6 The genetic code
[Figure legend: Acidic, Basic, Polar, Non-polar]
7 The genetic code - similarity
8 The fitness of molecular codes
- Three parameters
- Error load
- Diversity
- Cost
- We define the fitness of the code as a linear combination of these three conflicting needs
9 Error load
- When reading a number, we can misread 3 for 8 (or vice versa) anywhere
- 3838383838383838383838
- here or here
- We want to make sure that errors are less likely where they're more important
- 3838383838383838383838
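To make the point about "where" an error happens concrete, here is a minimal sketch that misreads 3 as 8 (or 8 as 3) at each position of a short number and measures how far the value moves. The helper name `misread` and the four-digit example are illustrative choices, not part of the talk.

```python
def misread(number: str, position: int) -> str:
    """Swap 3 <-> 8 at one position of a digit string (hypothetical helper)."""
    swap = {"3": "8", "8": "3"}
    digit = number[position]
    return number[:position] + swap.get(digit, digit) + number[position + 1:]

original = "3838"
# Absolute change in value caused by a single misread at each position.
errors = [abs(int(misread(original, k)) - int(original)) for k in range(len(original))]
print(errors)  # [5000, 500, 50, 5]
```

The same reading mistake costs a thousand times more in the leading digit than in the last one, which is why accuracy matters most where the symbol carries the most meaning.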
10 Error load
- Similar meanings should go with similar (close) symbols, so that a small reading error causes only a small error in understanding.
- If this -> signifies the deviation of sugar, which code would you prefer: A or B?
11 Diversity
- Enables efficient and accurate delivery of different messages.
- A small lack of sugar: I'm hungry
- A medium lack of sugar: I'm starving
- A large lack of sugar: Let's go to San Martin, NOW!
12 Diversity
- Enables the code to transmit as many different symbols as possible, equivalent to different symbols in a UTM (universal Turing machine)
- Many different symbols -> fewer states of the machine
- More symbols also enable faster, more accurate control
13 Cost
- Car insurance: the cost of improving the robustness of your driving
- Another example is the price of ink and space in my demonstration
14 Cost
- Strong binding takes more energy to create and read
- The energy is proportional to the length of the binding site.
- The binding probability scales like $p \sim e^{-E/T}$, so $E \sim \ln p$
- Notice that diversity has its costs as well: more symbols mean longer molecules
15 Summary
- The code has to be optimized at an equilibrium of
error load, diversity and cost.
16 Quantifying the code
- Using Lagrange multipliers
- $H = -\text{Load} + w_D \cdot \text{Diversity} - w_C \cdot \text{Cost}$
- C is the reduction of entropy, so $w_C$ is equivalent to the temperature ($w_C C \sim T\,dS$)
17 The result is an Ising-like model
- $\eta$: the order parameter; H: the fitness; C: the cost; D: the diversity; L: the error load
- $w_C$ is equivalent to the temperature
- $J/w_C = 1$ is the phase transition
- liquid (the non-coding state): $J/w_C < 1$
- solid (the coding state): $J/w_C > 1$
18 Possible experiment
- Take a bacterium with the transcription factor i.
- Duplicate the gene that codes for i; let's call the duplicate j
- i and j control the response to A(t)
- If A(t) fluctuates strongly, i and j may evolve to 2 different meanings: better control
- If A(t) fluctuates weakly, maybe one of them would be deleted.
- Experiment around the critical point
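19
- $r_{ij}$: the probability to read i as j
- $p_{i\alpha}$: the probability for i to be mapped to $\alpha$
- $c_{\alpha\beta}$: the cost of misinterpreting $\alpha$ as $\beta$
- Cost: $C = \sum_{i,\alpha} p_{i\alpha} \ln(p_{i\alpha}/p_\alpha)$, with $E_{i\alpha} \sim \ln p_{i\alpha}$ and $p_\alpha = n_s^{-1} \sum_j p_{j\alpha}$
- Diversity: $D = \sum_{i,j,\alpha,\beta} (1 - \delta_{ij})\, p_{i\alpha} p_{j\beta} c_{\alpha\beta}$
- Error load: $L = \sum_{i,j,\alpha,\beta} r_{ij}\, p_{i\alpha} p_{j\beta} c_{\alpha\beta}$
- Using Lagrange multipliers
- $H = -L + w_D D - w_C C$
- C is the reduction of entropy, so $w_C$ is equivalent to the temperature ($w_C C \sim T\,dS$)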
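The three terms on this slide can be evaluated directly for any small code. The following is a minimal sketch, assuming the definitions read as L = sum r_ij p_ia p_jb c_ab, D = sum (1 - delta_ij) p_ia p_jb c_ab, C = sum p_ia ln(p_ia / p_a) with p_a = n_s^-1 sum_j p_ja, and H = -L + w_D D - w_C C. The function names and the numerical values (error rate 0.1, w_D = 0.2, w_C = 0.5) are illustrative assumptions, not taken from the talk.

```python
import math

def fitness_terms(p, r, c):
    """Return (L, D, C) for a code p[i][a], reading matrix r[i][j], cost c[a][b]."""
    n_s, n_m = len(p), len(p[0])
    # Marginal probability of each meaning: p_a = n_s^-1 * sum_j p_ja.
    p_marg = [sum(p[j][a] for j in range(n_s)) / n_s for a in range(n_m)]
    load = diversity = 0.0
    for i in range(n_s):
        for j in range(n_s):
            for a in range(n_m):
                for b in range(n_m):
                    pair = p[i][a] * p[j][b] * c[a][b]
                    load += r[i][j] * pair
                    if i != j:  # the (1 - delta_ij) factor
                        diversity += pair
    # Zero entries contribute nothing to the entropy-like cost.
    cost = sum(p[i][a] * math.log(p[i][a] / p_marg[a])
               for i in range(n_s) for a in range(n_m) if p[i][a] > 0)
    return load, diversity, cost

def fitness(p, r, c, w_d, w_c):
    """H = -L + w_D * D - w_C * C."""
    load, diversity, cost = fitness_terms(p, r, c)
    return -load + w_d * diversity - w_c * cost

# Illustrative one-bit example: two symbols, two meanings.
r = [[0.9, 0.1], [0.1, 0.9]]
c = [[0.0, 1.0], [1.0, 0.0]]
non_coding = [[0.5, 0.5], [0.5, 0.5]]
coding = [[1.0, 0.0], [0.0, 1.0]]
h_non_coding = fitness(non_coding, r, c, w_d=0.2, w_c=0.5)
h_coding = fitness(coding, r, c, w_d=0.2, w_c=0.5)
print(h_non_coding, h_coding)
```

With these assumed weights the one-to-one code has the higher fitness: its lower error load and higher diversity outweigh its cost.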
20 Additional slides for the mathematical model
21
- $H = J\eta^2 - w_C\left[(1+\eta)\ln(1+\eta) + (1-\eta)\ln(1-\eta)\right] + \text{const}$
- $\eta$: the order parameter; H: the fitness; C: the cost; D: the diversity; L: the error load
- $\eta = \tanh(J\eta/w_C)$
- $J = c\,(1 - 2r + w_D)$
- $w_C$ is equivalent to the temperature
- $J/w_C = 1$ is the phase transition
- liquid (the non-coding state): $J/w_C < 1$
- solid (the coding state): $J/w_C > 1$
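The transition at J/w_C = 1 can be seen numerically by iterating the self-consistency condition, read here as eta = tanh((J/w_C) * eta). This is a rough sketch: `eta` is a stand-in name for the slide's order parameter, and the starting guess and iteration count are arbitrary choices.

```python
import math

def order_parameter(j_over_wc: float, eta0: float = 0.5, iterations: int = 10_000) -> float:
    """Iterate eta <- tanh((J / w_C) * eta) from a positive starting guess."""
    eta = eta0
    for _ in range(iterations):
        eta = math.tanh(j_over_wc * eta)
    return eta

for ratio in (0.5, 0.9, 1.1, 2.0):
    # Below 1 the iteration decays to zero (non-coding state);
    # above 1 it settles on a non-zero value (coding state).
    print(ratio, order_parameter(ratio))
```

Simple fixed-point iteration is enough here because the map is a contraction near the stable solution on either side of the transition; very close to J/w_C = 1 convergence slows down and more iterations would be needed.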
22 Quantifying the code
- $n_s$ symbols (i, j, k, ...) mapped to $n_m$ meanings ($\alpha$, $\beta$, ...)
- $p_{i\alpha}$: the probability for i to be mapped to $\alpha$
- $\sum_\alpha p_{i\alpha} = 1$
- In the non-coding state, the probability is constant: $p_{i\alpha} = 1/n_m$
- $r_{ij}$: the probability to read i as j.
- $c_{\alpha\beta}$: the cost of misinterpreting $\alpha$ as $\beta$
- The total error load
- $L = \sum_{i,j,\alpha,\beta} r_{ij}\, p_{i\alpha} p_{j\beta} c_{\alpha\beta}$
- Just like a ferromagnet: r is the interaction, c the magnitude, p the spin
- Also prefers specific symbols: $L(r_{ii}) = 0$ only if i signifies a specific meaning
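23 Toy model (1 bit)
- p, the optimal code, can be found from the condition $\partial H_T/\partial p_{i\alpha} = 0$
- $p_{i\alpha} = z^{-1} p_\alpha \exp(-G_{i\alpha}/w_C)$, with $z = \sum_\beta p_\beta \exp(-G_{i\beta}/w_C)$
- $G_{i\alpha} = 2\sum_{j,\beta} \left(r_{ij} - w_D(1 - \delta_{ij})\right) p_{j\beta} c_{\alpha\beta}$
- $c = \begin{pmatrix} 0 & c \\ c & 0 \end{pmatrix}$
- $r = \begin{pmatrix} 1-r & r \\ r & 1-r \end{pmatrix}$
- $p = 0.5 \begin{pmatrix} 1+\eta & 1-\eta \\ 1-\eta & 1+\eta \end{pmatrix}$
- $\eta = \tanh(J\eta/w_C)$
- $J = c\,(1 - 2r + w_D)$
- Critical point: $w_C = J = (1 - 2r + w_D)\,c$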
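As a consistency check on the toy model, the full matrix condition can be iterated and compared with the scalar tanh equation. The sketch below assumes the optimality condition reads p_ia = z^-1 p_a exp(-G_ia / w_C) with G_ia = 2 sum_{j,b} (r_ij - w_D (1 - delta_ij)) p_jb c_ab and p_a = n_s^-1 sum_j p_ja. The function name `update_code`, the variable `eta` for the order parameter, and all numerical values are illustrative assumptions.

```python
import math

def update_code(p, r, c, w_d, w_c):
    """One self-consistent update p_ia <- z^-1 * p_a * exp(-G_ia / w_C)."""
    n_s, n_m = len(p), len(p[0])
    p_marg = [sum(p[j][a] for j in range(n_s)) / n_s for a in range(n_m)]
    new_p = []
    for i in range(n_s):
        weights = []
        for a in range(n_m):
            # G_ia = 2 * sum_{j,b} (r_ij - w_D * (1 - delta_ij)) * p_jb * c_ab
            g = 2.0 * sum((r[i][j] - (w_d if i != j else 0.0)) * p[j][b] * c[a][b]
                          for j in range(n_s) for b in range(n_m))
            weights.append(p_marg[a] * math.exp(-g / w_c))
        z = sum(weights)  # normalizes each row so that sum_a p_ia = 1
        new_p.append([w / z for w in weights])
    return new_p

# One-bit toy model with assumed numbers: J = cost * (1 - 2*err + w_d) = 1, J / w_c = 2.
cost, err, w_d, w_c = 1.0, 0.1, 0.2, 0.5
c = [[0.0, cost], [cost, 0.0]]
r = [[1 - err, err], [err, 1 - err]]
eta = 0.3  # arbitrary positive starting value
p = [[0.5 * (1 + eta), 0.5 * (1 - eta)], [0.5 * (1 - eta), 0.5 * (1 + eta)]]
for _ in range(500):
    p = update_code(p, r, c, w_d, w_c)

eta_matrix = 2 * p[0][0] - 1  # read the order parameter back from p
J = cost * (1 - 2 * err + w_d)
print(eta_matrix, math.tanh(J * eta_matrix / w_c))
```

For the symmetric starting point used here, each matrix update reduces to eta <- tanh(J * eta / w_C), so the two printed numbers should agree once the iteration has converged.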
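24 General criteria
- $Q_{i\alpha j\beta} = -\partial^2 H/\partial p_{i\alpha}\,\partial p_{j\beta}$ stops being positive definite
- $w_C = 2 n_m^{-1} (\lambda_r + w_D)\,|\lambda_c|$
- $\lambda_r$ is the 2nd-largest eigenvalue of r
- $\lambda_c$ is the smallest eigenvalue of c; it corresponds to the longest wavelength, i.e. the smallest error load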
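The general criterion can be checked against the one-bit toy model. This sketch assumes the criterion reads w_C = 2 n_m^-1 (lambda_r + w_D) |lambda_c|; the absolute value is an assumption made here so that the critical w_C is positive when the smallest eigenvalue of c is negative. The helper `sym2_eigenvalues` and the numbers are illustrative.

```python
import math

def sym2_eigenvalues(m):
    """Eigenvalues (ascending) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return mean - half_gap, mean + half_gap

cost, err, w_d = 1.0, 0.1, 0.2
c = [[0.0, cost], [cost, 0.0]]
r = [[1 - err, err], [err, 1 - err]]
n_m = 2

lambda_r = sym2_eigenvalues(r)[0]  # for a 2x2 matrix the 2nd-largest is the smaller one
lambda_c = sym2_eigenvalues(c)[0]  # smallest eigenvalue of c (negative here)

# Assumed reading of the criterion, with |lambda_c| (see lead-in).
w_c_critical = 2 / n_m * (lambda_r + w_d) * abs(lambda_c)
J = cost * (1 - 2 * err + w_d)
print(w_c_critical, J)
```

For these values lambda_r = 1 - 2r and lambda_c = -c, so the general expression reduces to the toy-model critical point w_C = J.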