Evolution of Protein-Coding Genes and the Generalized Mistranslation-Induced Misfolding Hypothesis

1
Evolution of Protein-Coding Genes and the
Generalized Mistranslation-Induced Misfolding
Hypothesis
Yuri Wolf (1), Irina Gopich (2), Eugene Koonin (1), David Lipman (1)
(1) NCBI and (2) NIDDK, NIH, Bethesda, MD, USA
MCCMB, July 2009, Moscow
2
Evolution Rate: Two Invariants
The concept of the "molecular clock": family-specific
evolutionary rates remain nearly constant over long
periods [Zuckerkandl E, Pauling L. Molecules as
documents of evolutionary history. J Theor Biol.
1965, 8:357-366].
[Figure: sequence distance vs. time; Protein A (faster), Protein B (slower)]
Similar shape of organism-specific distributions
of evolutionary rates [Grishin NV, Wolf YI,
Koonin EV. From complete genomes to measures of
substitution rate variability within and between
proteins. Genome Res. 2000, 10:991-1000].
[Figure panels: Bacteria, Archaea, Eukaryota]
3
Intrinsic Constraints and Importance
First explanation for differences in evolutionary
rates of protein-coding genes: the rate is determined
by intrinsic structural-functional constraints
and gene dispensability [Wilson AC, Carlson SS,
White TJ. Biochemical evolution. Annu Rev
Biochem. 1977, 46:573-639].
R_i = f(P_i) g(Q_i)
Taken for granted for 20 years; empirical
studies started in the late 1990s [Hurst LD, Smith
NG. Do essential genes evolve slowly? Curr Biol.
1999, 9:747-750; many others in 2000-2006]. A
very complex picture emerged: evolution rate
appears to be very weakly correlated with
dispensability. The strongest observed correlate
of the evolution rate is expression level.
4
Gene Status
[Diagram: "STATUS" linked to "phenomic" measures and "evolutionary" measures]
The concept of "gene status" as the most
important meta-variable, affecting phenomic
variables (directly) and evolutionary variables
(indirectly), explains the observed pattern of
correlations and allows correlations for novel
measurements to be predicted. No mechanistic
explanation is suggested, however [Wolf YI,
Carmel L, Koonin EV. Unifying measures of gene
function and evolution. Proc Biol Sci. 2006,
273:1507-1515].
5
Mistranslation-Induced Misfolding
"We propose, and demonstrate using a
molecular-level evolutionary simulation, that
selection against toxicity of misfolded proteins
generated by ribosome errors suffices to create
all of the observed covariation between
evolutionary rate and expression" [Drummond DA,
Wilke CO. Mistranslation-induced protein
misfolding as a dominant constraint on
coding-sequence evolution. Cell. 2008,
134:341-352].
6
Generalized MIM Hypothesis
[Diagram labels: intrinsic properties of the protein, expression level, overall effect]
Specific properties of the gene determine the
likelihood of a disturbance. Intrinsic properties
of the protein (structural-functional
constraints, SFC) determine the outcome of a
disturbance. Expression level amplifies the
overall effect of the disturbance (amplification
by expression, ABE). Highly expressed genes are
fine-tuned to reduce the misfolding cost.
7
Why ABE?
[Figure: efficiency 50%]
8
ABE and Sequence Evolution
[Figure: cost of misfolding vs. expression, for poor folding and robust folding]
High expression: large difference in cost, large
difference in fitness, strongly constrained.
Low expression: small difference in cost, small
difference in fitness, weakly constrained.
[Figure: robustness over sequence space]
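The amplification argument can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the cost of misfolding is the number of misfolded molecules, i.e. expression level times the fraction that fails to fold. The function names and all numbers are hypothetical and are not taken from the slides.

```python
def misfolding_cost(expression, folding_efficiency):
    """Number of misfolded molecules (illustrative cost model, an assumption)."""
    return expression * (1.0 - folding_efficiency)


def cost_gap(expression, robust_eff=0.9, poor_eff=0.5):
    """Cost difference between a poorly folding and a robustly folding variant."""
    return (misfolding_cost(expression, poor_eff)
            - misfolding_cost(expression, robust_eff))


if __name__ == "__main__":
    # The same pair of variants is compared at increasing expression levels.
    for expression in (10, 1000, 100000):
        print(expression, cost_gap(expression))
```

Under this toy cost model the gap between the two variants grows in proportion to expression, which is the sense in which a highly expressed gene is more strongly constrained.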
9
SFC vs. ABE
The relative importance of these factors (SFC strength ?
vs. ABE strength a) is unknown...
10
SFC vs. ABE
[Figure: evolution rate vs. expression level; unequal representation of different types of structures]
...because in real-world data the influence of
expression is conflated with structural-functional
influence (i.e. highly expressed proteins are
not a random subset by structure).
11
SFC vs. ABE: the first glance
Domains in multidomain proteins: different
structural/functional properties, but the same
expression level.
Can be used to investigate SFC effects when ABE
effects are controlled for [Wolf MY, Wolf YI,
Koonin EV. Comparable contributions of
structural-functional constraints and expression
level to the rate of protein sequence evolution.
Biol Direct. 2008, 3:40]. If ABE dominates,
different domains within multidomain proteins
should evolve at the same rate (rate
homogenization); if SFC dominates, domains in
multidomain proteins should evolve at their
"normal" rates (rate independence). We found that
neither is true, i.e. the two contributions are comparable.
12
New Experimental Data
Two parallel arrays of (apparently very clean)
protein MassSpec abundance data, well correlated
with evolution rate [Schrimpf SP et al.
Comparative functional analysis of the
Caenorhabditis elegans and Drosophila
melanogaster proteomes. PLoS Biol. 2009, 7:e48].
13
SFC vs. ABE: an orthogonal look
Orthologs have the same structure and function,
[Diagram: RATE and EXPR for each of the two orthologs]
so the difference between their evolution rates
is determined by the difference between their
expression levels.
Can be used to investigate ABE effects when SFC
effects are controlled for.
14
Model Assumptions
  • Evolution rate is a multiplicative function of:
    • structural and functional constraints (SFC);
    • the effect of amplification by expression (ABE);
    • other unknown factors, random noise and errors of
      rate estimation.
  • Both SFC and ABE effects are well approximated by
    power functions of a hidden SF constraint
    factor and the protein translation rate,
    respectively. The SFC effect is gene- and
    organism-independent; the ABE effect is
    gene-independent.
  • Translation rate is estimated by measuring the
    abundance of the corresponding gene product.
  • Effects of unknown factors, random noise and
    imperfections of rate estimation can be
    combined into a single random variable,
    independent from other variables.
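
A multiplicative model built from power functions becomes additive and linear on the log scale, so the assumptions above can be sketched as a small simulation. This is a minimal sketch, not the authors' procedure: the Gaussian distributions, the independence of the hidden constraint factor from translation rate, and every parameter value and name (`sfc`, `abe_x`, `abe_y`, `link`, `r_t`, `r_e`, `noise`) are assumptions made for illustration only.

```python
import math
import random


def correlated_pair(rng, r):
    """Draw two standard-normal values whose correlation is r."""
    u = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    return u, r * u + math.sqrt(1.0 - r * r) * v


def simulate(n=5000, sfc=-0.6, abe_x=-0.2, abe_y=-0.1,
             link=0.8, r_t=0.8, r_e=0.3, noise=0.7, seed=1):
    """Simulate log-scale rates (RX, RY) and abundances (AX, AY).

    All parameter values are illustrative assumptions, not fitted estimates.
    """
    rng = random.Random(seed)
    data = {"RX": [], "RY": [], "AX": [], "AY": []}
    err_scale = math.sqrt(1.0 - link * link)
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)               # hidden SF constraint, shared by orthologs
        t_x, t_y = correlated_pair(rng, r_t)  # translation rates in organisms X and Y
        e_x, e_y = correlated_pair(rng, r_e)  # abundance measurement errors
        # log(rate) = SFC term + ABE term + independent random factors
        data["RX"].append(sfc * s + abe_x * t_x + noise * rng.gauss(0.0, 1.0))
        data["RY"].append(sfc * s + abe_y * t_y + noise * rng.gauss(0.0, 1.0))
        # log(abundance) = link * translation rate + correlated error
        data["AX"].append(link * t_x + err_scale * e_x)
        data["AY"].append(link * t_y + err_scale * e_y)
    return data
```

With negative `abe_x` and `abe_y`, the simulated rates correlate negatively with abundance, while the shared constraint term makes the rates of the two orthologs correlate positively with each other.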

15
Model Structure
[Diagram: nodes S, TX, TY, RX, RY, AX, AY; edge labels aX, aY, re, ?]
Relationships between the model components.
16
Observables
[Diagram: model components S, TX, TY, RX, RY, AX, AY]
Observed correlations.
17
Model: Basic Equations
Parameters:
  • ?: strength of the SFC effect
  • aX, aY: strength of the ABE effect in two organisms
    (e.g. worm and fly)
  • ?: relationship between translation rate and
    measured abundance
  • re: correlation between errors in abundance
    measurements
Data (log scale, standardized):
  • RX,i, RY,i: evolution rates of orthologous genes
    in two organisms
  • AX,i, AY,i: abundances of orthologs in two organisms
Computed correlations: rR, rA, rRAXX, rRAYY, rRAXY, rRAYX
[Equation labels: ev. rate: SFC effect, ABE effect, random factors; abundance: translation rate, random factors]
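
The six computed correlations can be obtained from the four data vectors as in the sketch below. The index convention is an assumption: `rRAXY` is read here as the correlation between the rate in organism X and the abundance in organism Y, and `rRAYX` as the reverse. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def observed_correlations(rx, ry, ax, ay):
    """Correlations among rates (rx, ry) and abundances (ax, ay) of orthologs."""
    return {
        "rR": pearson(rx, ry),      # rate vs. rate between organisms
        "rA": pearson(ax, ay),      # abundance vs. abundance between organisms
        "rRAXX": pearson(rx, ax),   # rate vs. abundance within organism X
        "rRAYY": pearson(ry, ay),   # rate vs. abundance within organism Y
        "rRAXY": pearson(rx, ay),   # rate in X vs. abundance in Y (assumed convention)
        "rRAYX": pearson(ry, ax),   # rate in Y vs. abundance in X (assumed convention)
    }
```

Because the Pearson coefficient is invariant to shifting and rescaling, it does not matter for this step whether the inputs have already been standardized.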
18
Model Solution
aX, aY and ? (the SFC strength) are expressed using the observed
correlations (rR, rA, rRAXX, rRAYY, rRAXY, rRAYX)
and the (unknown) parameters characterizing the
experimental procedure: ? (the relationship between
translation rate and measured abundance) and re.
19
Worm and Fly Correlations
Variable          Value
rA                0.80
rR                0.52
rRAXX, rRAYY      -0.41, -0.34
rRAXY, rRAYX      -0.37, -0.32
r?                -0.09 (p-value 1.7x10^-5)
[Figure labels: rRAXX, rRAYY, Rfly, Rworm, Afly]
20
Worm and Fly Solution Area
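aX, aY and ? (the SFC strength) are expressed through ?
(the abundance-translation relationship) and re.
Not all values of ? and re between 0 and 1
satisfy the boundary conditions (e.g. on the sign of ?^2).
[Figure: the (?, re) plane, with regions of allowed values and impossible values; perfect measurement marked]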
21
Worm and Fly Solution Surfaces
[Figure: three surface plots, aX, aY and ? (the SFC strength), each over the (?, re) plane]
  • Two approaches:
    • assumption of perfect measurement (? = 1, re = 0)
    • Bayesian reasoning ("median" values of aX, aY
      and ?, such that the area where e.g. ?(?, re) lies
      below the median value is 0.5 of the total area)

Variable              "Perfect"       "Median"
aX, aY                -0.17, -0.10    -0.22, -0.13
? (SFC strength)      -0.68           -0.64
?/a                   4.0, 6.9        2.9, 4.9
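
The ?/a row can be checked against the rows above it with simple division, as in this sketch. The function name is hypothetical, and the inputs are the rounded values printed on the slide, so small discrepancies with the slide's ratios (for example 6.8 here versus 6.9 on the slide) presumably reflect rounding of the inputs.

```python
def strength_ratios(sfc_strength, abe_strengths):
    """Ratio of the SFC strength to each ABE strength (one per organism)."""
    return tuple(sfc_strength / a for a in abe_strengths)


# Rounded values as printed in the table above.
perfect = strength_ratios(-0.68, (-0.17, -0.10))
median = strength_ratios(-0.64, (-0.22, -0.13))

if __name__ == "__main__":
    print([round(v, 1) for v in perfect])
    print([round(v, 1) for v in median])
```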
22
Estimates with Other Data
?/a estimates ("Perfect"; "Median"):
  • worm/fly MassSpec: 4.0, 6.9; 2.9, 4.9
  • worm/fly MassSpec (bootstrap): 2.8, 20.7
  • worm/fly Affymetrix mRNA: 7.0, 72.5; 5.2, 24.6
  • human/mouse EST mRNA: 9.3, 51.1
  • Conclusions (based on worm/fly MassSpec data and
    median estimates):
    • correlation with abundance explains 10-17% of
      variance in evolution rate of protein-coding
      genes
    • SFC and ABE together could explain 49-57% of rate
      variance
    • the SFC effect is 3-5 times stronger than the ABE
      effect
    • SFC alone would explain 41% of rate variance;
      ABE alone would explain 2-5% of rate variance
    • SFC and ABE are correlated at r of approximately 0.37; the
      combined effect explains the remaining 8-15% of
      rate variance
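
Several of these percentages are consistent with squaring values reported on the earlier worm/fly slides, since for standardized variables a squared correlation (or a squared standardized coefficient of a single predictor) gives the fraction of variance explained. The sketch below shows that arithmetic. Reading the conclusions this way is an assumption, and the helper name is hypothetical.

```python
def percent_variance(coefficient):
    """Percent of variance explained by one standardized predictor."""
    return 100.0 * coefficient * coefficient


# Rate-abundance correlations from the worm/fly correlations slide.
rate_abundance = [-0.41, -0.34, -0.37, -0.32]
abundance_range = (min(percent_variance(r) for r in rate_abundance),
                   max(percent_variance(r) for r in rate_abundance))

# "Median" estimates from the worm/fly solution slide.
sfc_alone = percent_variance(-0.64)
abe_alone = sorted(percent_variance(a) for a in (-0.22, -0.13))

if __name__ == "__main__":
    print([round(v) for v in abundance_range])
    print(round(sfc_alone))
    print([round(v) for v in abe_alone])
```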

23
Generalized MIM Hypothesis
[Figure, two panels (naive MIM model; generalized MIM model): fitness and folding robustness across expression from high to low; in each panel, a narrow fitness peak (low R) and a wide fitness peak (high R)]
24
Acknowledgments
Irina Gopich (NIDDK)
Eugene Koonin (NCBI)
David Lipman (NCBI)
Sabine Schrimpf and Christian von Mering (of
Schrimpf et al., 2009; University of Zurich,
Switzerland)