Diversity and Design in Cellular Networks

- Prediction, Control and Design of and with Biology

Adam Arkin, University of California, Berkeley

http://genomics.lbl.gov

"Nothing in biology makes sense except in the

light of evolution." Theodosius Dobzhansky, The

American Biology Teacher, March 1973

A scientist

The Advent of Molecular Biology

Genome

Macromolecules

Metabolites

Biochemistry

Through RNA

Feedback Feedforward

Myxococcus xanthus

- Even cells as simple as bacteria are highly

social, differentiating, sensing/actuation systems

Images from Reichardt or D. Kaiser

Immune cells

- They perform amazing engineering feats under the

control of complex cellular networks

Onsum, Arkin, UCB

Mione, Redd, UCL

1/50 of the known neutrophil chemotaxis network

c5a- receptor

Fc- receptor

PIP3 control

Calcium control


Systems and Synthetic Biology

- Systems biology seeks to uncover the design and control principles of cellular systems through
- Biophysical characterization of macromolecules and other cellular structures
- Comparative genomic analysis
- Functional genomic and high-throughput phenotyping of cellular systems
- Mathematical modeling of regulatory networks and interacting cell populations.
- Synthetic biology seeks to develop new designs in the biological substrate for biotechnological, medical, and material science.
- Founded on the understanding garnered from systems biology
- New modalities for genetic engineering and directed evolution
- Scaling towards programmable biomaterials.

Systems biology is necessary

- Because of the highly interconnected nature of cellular networks
- Because it is the best way to understand what is controllable and what is not in pathway dynamics
- Because it discovers what designs evolution has arrived at to solve cellular engineering problems that we emulate in our own designs.

A broader overview

- Evolutionary Game Theory
- Ecological Modeling
- Population Biology
- Epidemiology
- Neuroscience
- Organ Physiology
- Immune Networks
- Cellular Networks
- Problems
- Static and Dynamic Representations
- Physical Picture for Representation (e.g. deterministic vs. stochastic)
- Mathematical Description of Physics (e.g. Langevin vs. Master Equation)
- Levels of abstraction: formal and ad hoc.
- Measurement: high-throughput/broad-brush/imprecise vs. low-throughput/targeted/precise

Chemical Kinetics: The short course I.

Consider a collision between two hard spheres. In a small time interval, dt, sphere 1 will sweep out a small volume, dVcol, relative to sphere 2.

If the center of sphere 2 lies within this volume at time t, then the spheres will collide within the small time interval. The probability that a given sphere of type 2 is in that volume is simply dVcol/V (where V is the containing volume). All that remains is to average this quantity over the velocity distributions of the spheres.

Chemical Kinetics: The short course II.

Given that, at time t, there are X1 type-1 spheres and X2 type-2 spheres, the probability that a 1-2 collision will occur in V in the next time interval dt is

$X_1 X_2 \,\langle dV_{col} \rangle / V$

where the angle brackets denote the average over the velocity distributions of the spheres.

Now if each collision has a probability of causing a reaction then, in analogy to the last equation, all we can say is

$X_1 X_2 c_1\,dt$ = average probability that an R1 reaction will occur somewhere in V within the interval dt.

Chemical Kinetics: The Master Equation I.

If we wish to map trajectories of chemical concentration, we want to know the probability that there will be $X_1, \ldots, X_N$ molecules of each species in the chemical mechanism at time t in V. We call that probability $P(X_1, \ldots, X_N; t)$.

This function gives complete knowledge of the stochastic state of the system at time t. The master equation is simply the time evolution of this probability. To derive it we need to derive $P(X_1, \ldots, X_N; t + dt)$, which is simply done from our previous work.

It is the sum of two terms:
1. The probability that we were at X at time t and we stayed there.
2. The probability that a reaction of type m brought us to this state.

Chemical Kinetics: The Master Equation II

The first term is given by

$P(X_1, \ldots, X_N; t)\left[1 - \sum_m a_m\,dt\right]$

where

$a_m\,dt = c_m h_m\,dt$

is the probability that a reaction of type m will occur in the interval dt given that the system is in a given state at time t, and where $h_m$ is a combinatorial function of the number of molecules of each chemical species in reaction type m.

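Chemical Kinetics: The Master Equation III

The second term is given by

$\sum_m B_m\,dt$

where $B_m\,dt$ is the probability that the system is one reaction m away from state $(X_1, \ldots, X_N)$ at time t and then undergoes a reaction of type m in the interval dt.

Plugging these terms into the equation for $P(X_1, \ldots, X_N; t + dt)$ and rearranging we arrive at the master equation:

$\frac{\partial}{\partial t} P(X_1, \ldots, X_N; t) = \sum_m \left[ B_m - a_m P(X_1, \ldots, X_N; t) \right]$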

Deterministic Kinetics I.

Deterministic kinetics may be derived with some assumptions from the master equation. The end result is simply a set of coupled ODEs

$\frac{dX}{dt} = s\,v(X)$

where X is the vector of species concentrations, s is the stoichiometric matrix and v is a vector of rate laws. Example: enzyme kinetics.

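Mathematical Representation

[Figure: reaction scheme with steps linking X + 2 Y with 2 Z, Z + E with EZ, and (enzymatic) EZ with E + P. The very simplest mass-action representation is shown as a flux vector and a stoichiometric matrix.]

Mathematical Representation

[Figure: the same scheme redrawn with the enzyme E written over the step from Z to P.]

Oftentimes the enzyme isn't represented.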

Enzyme Kinetics II.

But oftentimes we make assumptions equivalent to a singular perturbation. E.g. we assume that E, S and ES are in rapid equilibrium.

These forms are the common forms used in basic analysis.

Stationary State Analysis

Clearly, the steady state fluxes are in the null space of the stoichiometric matrix. But these are only unique if significant constraints are also applied (the system is under-determined). They are also highly dependent on representation.
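The null-space statement can be made concrete with a small numerical sketch. The pathway below (a linear chain feeding A, converting A to B, and draining B) and the function name `null_space` are assumptions chosen for illustration, not taken from the slides; the basis is computed from the singular value decomposition.

```python
import numpy as np

def null_space(S, tol=1e-10):
    """Return an orthonormal basis (as columns) for the null space of S."""
    _, sing, vt = np.linalg.svd(S)
    rank = int(np.sum(sing > tol))
    # Rows of vt beyond the rank span the null space of S.
    return vt[rank:].T

# Hypothetical pathway: -> A -> B ->
# Rows are species (A, B); columns are fluxes (v1, v2, v3).
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])

basis = null_space(S)
# Every steady-state flux vector is a combination of the basis columns.
print(basis[:, 0] / basis[0, 0])
```

For this chain the null space is one-dimensional (all three fluxes equal), so the steady state fixes only the flux ratios; the absolute flux needs an extra constraint, which is the under-determination noted above.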

The Stoichiometric Matrix

- This matrix is a description of the topology of the network.
- It is tricky to abstract into a simple incidence matrix, for example.
- Most experimental measurements can only capture a small fraction of the interactions that make up a network.
- However, it does put some limits on behavior.

Graph Theory Scale-Free networks?

- Nodes are protein domains
- Edges are interactions
- Statements are made about
- Robustness
- Signal Propagation (small world properties)
- Evolution

Stability Analysis for Deterministic Systems

- $\emptyset \to a$, $v = m$
- $a \to b$, $v = k a$
- $a + 2b \to 3b$, $v = a b^2$
- $b \to c$, $v = b$
- $da/dt = m - k a - a b^2$
- $db/dt = k a + a b^2 - b$

Stationary State

- $da/dt = m - k a - a b^2 = 0$
- $db/dt = k a + a b^2 - b = 0$
- $a_{ss} = m/(m^2 + k)$, $b_{ss} = m$
- So for any given value of m or k we can calculate the steady state. These are parameters.
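As a quick check of the steady state, the sketch below evaluates the two rate equations, da/dt = m - k a - a b^2 and db/dt = k a + a b^2 - b, at a_ss = m/(m^2 + k), b_ss = m. The parameter values are arbitrary illustrations, and the function names are hypothetical.

```python
def rates(a, b, m, k):
    """Right-hand sides da/dt and db/dt of the two-species model."""
    da = m - k * a - a * b**2
    db = k * a + a * b**2 - b
    return da, db

def steady_state(m, k):
    """Closed-form stationary state: a_ss = m / (m^2 + k), b_ss = m."""
    return m / (m**2 + k), m

# Arbitrary illustrative parameters (not from the slides).
m, k = 2.0, 0.5
a_ss, b_ss = steady_state(m, k)
da, db = rates(a_ss, b_ss, m, k)
print(a_ss, b_ss, da, db)
```

Both rates vanish (to rounding error) at the computed point, which confirms the algebra: adding the two equations gives m - b = 0, and substituting b = m into the first gives a = m/(m^2 + k).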

Stability

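- We calculate stability by figuring out if small perturbations around a stationary state grow away from the state or fall back towards the state.
- So we expand our differential equations around a steady state and ask how small perturbations in a and b grow.

To first order the perturbations obey a linear system, $d(\delta a, \delta b)^T/dt = J\,(\delta a, \delta b)^T$, where the perturbation matrix J holds the partial derivatives of the rate equations evaluated at the steady state, and the solutions are proportional to $e^{\lambda t}$.

Thus the $\lambda$ are the eigenvalues of the perturbation matrix and will determine if the perturbations grow or diminish.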
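A minimal numerical sketch of this test for the model da/dt = m - k a - a b^2, db/dt = k a + a b^2 - b follows. The Jacobian is written out by hand from these two equations; the two parameter sets are arbitrary choices made here to show one stable and one unstable case.

```python
import numpy as np

def jacobian(a, b, k):
    """Partial derivatives of (da/dt, db/dt) with respect to (a, b)."""
    return np.array([[-k - b**2, -2.0 * a * b],
                     [k + b**2, 2.0 * a * b - 1.0]])

def stability(m, k):
    """Return (is_stable, eigenvalues) at a_ss = m/(m^2 + k), b_ss = m."""
    a_ss, b_ss = m / (m**2 + k), m
    lam = np.linalg.eigvals(jacobian(a_ss, b_ss, k))
    # Stable when every eigenvalue has a negative real part.
    return bool(np.all(lam.real < 0)), lam

stable_case, lam1 = stability(m=1.0, k=1.0)
unstable_case, lam2 = stability(m=0.5, k=0.01)
print(stable_case, lam1)
print(unstable_case, lam2)
```

For this model the determinant of the Jacobian at the steady state equals k + m^2, which is always positive, so the sign of the trace alone decides whether perturbations decay or grow.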

Why is quantitative analysis important?

[Figure: reaction scheme with species A and B-p; the stationary state Ass is marked.]

E.g. Focal Adhesion Kinase Alternative Splice

Quantitative Analysis

Bistability

A simple model of the positive feedback

[Figure: stationary state of FAK-I as a function of kc, showing monostable, weakly bistable, and irreversibly bistable regimes; one panel is labeled kC1.6.]

kc: catalytic constant for the trans-autophosphorylation.

Signal Filtering

Brief Digression: Chemical Impedance

I → A →

So A is the signal inside the cell that I is outside the cell. What if A signals to downstream targets by reacting with them? A + B → C

The rates and concentrations of downstream processes degrade the signal from A.

Brief Digression: Chemical Impedance

I → A →

But what if the reaction is by reversible binding? A + B ⇌ C

The rates and concentrations of downstream processes don't affect the signal.

But... what about the ME?

Error and ORDINARY DIFFERENTIAL EQUATIONS

Ordinary Differential Equations

- A differential equation defines a relationship between an unknown function and one or more of its derivatives
- Physical problems using differential equations
- electrical circuits
- heat transfer
- motion

Ordinary Differential Equations

- The derivatives are of the dependent variable with respect to the independent variable
- A first order differential equation with y as the dependent variable and x as the independent variable would be

$\frac{dy}{dx} = f(x, y)$

Ordinary Differential Equations

- A second order differential equation would have the form

$\frac{d^2 y}{dx^2} = f\left(x, y, \frac{dy}{dx}\right)$

Ordinary Differential Equations

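- An ordinary differential equation is one with a single independent variable.
- Thus, the previous two equations are ordinary differential equations
- The following is not

Partial Differential Equations

[Equation not preserved: an equation with partial derivatives taken with respect to more than one independent variable.]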

Ordinary Differential Equations

- The analytical solution of an ordinary differential equation, as well as of partial differential equations, is called the closed form solution
- This solution requires that the constants of integration be evaluated using prescribed values of the independent variable(s).

Ordinary Differential Equations

- At best, only a few differential equations can be solved analytically in a closed form.
- Solutions of most practical engineering problems involving differential equations require the use of numerical methods.

One Step Methods

- Focus is on solving ODE in the form

$\frac{dy}{dx} = f(x, y)$

This is the same as saying: new value = old value + (slope) × (step size)

[Figure: y versus x, showing a step of size h starting from the value yi.]

Euler's Method

- The first derivative provides a direct estimate of the slope at $x_i$
- The equation is applied iteratively, or one step at a time, over a small distance in order to reduce the error
- Hence this is often referred to as Euler's One-Step Method
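The update rule "new value = old value + slope × step size" can be sketched in a few lines. The test problem here, dy/dx = y with y(0) = 1, is an assumption chosen because its exact solution (e at x = 1) is known; it is not the example used on the slides.

```python
import math

def euler(f, x0, y0, h, n_steps):
    """Advance dy/dx = f(x, y) from (x0, y0) by n_steps Euler steps of size h."""
    x, y = x0, y0
    for _ in range(n_steps):
        y += h * f(x, y)   # new value = old value + slope * step size
        x += h
    return y

def f(x, y):
    # Hypothetical test problem: dy/dx = y, exact solution y = exp(x).
    return y

coarse = euler(f, 0.0, 1.0, 0.1, 10)    # integrate to x = 1 with h = 0.1
fine = euler(f, 0.0, 1.0, 0.05, 20)     # same interval with h = 0.05
print(abs(math.e - coarse), abs(math.e - fine))
```

Halving the step size roughly halves the error at x = 1, as expected for a first-order method.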

Taylor Series

EXAMPLE

For the initial condition y(1) = 1, determine y for h = 0.1 analytically and using Euler's method, given

[Equation and worked solution not preserved; the working lists the step size and dy/dx.]

Recall the analytical solution was 1.4413. If we instead reduce the step size to 0.05 and apply Euler's method twice

[Result not preserved.]

Error Analysis of Euler's Method

- Truncation error: caused by the nature of the techniques employed to approximate values of y
- local truncation error (from Taylor Series)
- propagated truncation error
- sum of the two = global truncation error
- Round off error: caused by the limited number of significant digits that can be retained by a computer or calculator

Taylor Series

Higher Order Taylor Series Methods

Derivatives

Modification of Euler's Methods

- A fundamental error in Euler's method is that the derivative at the beginning of the interval is assumed to apply across the entire interval
- Two simple modifications will be demonstrated
- These modifications actually belong to a larger class of solution techniques called Runge-Kutta which we will explore later.

Heun's Method

- Consider our Taylor expansion

$y_{i+1} = y_i + f(x_i, y_i)\,h + \frac{f'(x_i, y_i)}{2}\,h^2$

Approximate f' as a simple forward difference

$f'(x_i, y_i) \approx \frac{f(x_{i+1}, y_{i+1}) - f(x_i, y_i)}{h}$

Substituting into the expansion

$y_{i+1} = y_i + \frac{f(x_i, y_i) + f(x_{i+1}, y_{i+1})}{2}\,h$

Heun's Method

- Determine the derivatives for the interval at
- the initial point
- the end point (based on Euler step from initial point)
- Use the average to obtain an improved estimate of the slope for the entire interval
- We can think of the Euler step as a test step

[Figure sequence on axes y versus x, over the interval from $x_i$ to $x_{i+1}$ with step size h:
1. Take the slope at $x_i$. Project to get $f(x_{i+1})$ based on the step size h.
2. Now determine the slope at $x_{i+1}$.
3. Take the average of these two slopes.
4. Use this average slope to predict $y_{i+1}$.]
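The predictor-corrector idea of Heun's method (an Euler test step, then a step with the average of the two slopes) can be sketched as below. The test problem dy/dx = y, y(0) = 1 is an assumption made here so that the result can be compared with the exact value e at x = 1.

```python
import math

def heun(f, x0, y0, h, n_steps):
    """Advance dy/dx = f(x, y) using Heun's method with step size h."""
    x, y = x0, y0
    for _ in range(n_steps):
        slope_start = f(x, y)                  # slope at x_i
        y_test = y + h * slope_start           # Euler test step to x_{i+1}
        slope_end = f(x + h, y_test)           # slope at x_{i+1}
        y += h * (slope_start + slope_end) / 2.0   # step with the average slope
        x += h
    return y

def f(x, y):
    # Hypothetical test problem: dy/dx = y, exact solution y = exp(x).
    return y

result = heun(f, 0.0, 1.0, 0.1, 10)
print(result, abs(math.e - result))
```

With the same step size h = 0.1, plain Euler gives 1.1^10 for this problem (an error of about 0.12), while the averaged slope brings the error below 0.01.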

Improved Polygon Method

- Another modification of Euler's Method
- Uses Euler's method to predict a value of y at the midpoint of the interval
- This predicted value is used to estimate the slope at the midpoint
- We then assume that this slope represents a valid approximation of the average slope for the entire interval
- Use this slope to extrapolate linearly from $x_i$ to $x_{i+1}$ using Euler's algorithm

We could also get this algorithm from substituting a forward difference in f' to i+1/2 into the Taylor expansion, i.e.

$y_{i+1} = y_i + f_i\,h + \frac{f_{i+1/2} - f_i}{h/2}\,\frac{h^2}{2} = y_i + f_{i+1/2}\,h$

[Figure sequence on axes y versus x: start from the slope $f(x_i)$ at $x_i$; step h/2 to $x_{i+1/2}$; evaluate the slope $f(x_{i+1/2})$ there; extend that slope over the full step h from $x_i$ to get $f(x_{i+1})$.]
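The improved polygon (midpoint) scheme can be sketched in the same style: a half Euler step to the midpoint, then a full step using the midpoint slope. The test problem dy/dx = y, y(0) = 1 is an assumption used only for illustration.

```python
import math

def improved_polygon(f, x0, y0, h, n_steps):
    """Advance dy/dx = f(x, y) using the midpoint slope over each step."""
    x, y = x0, y0
    for _ in range(n_steps):
        y_mid = y + (h / 2.0) * f(x, y)        # Euler half step to x_{i+1/2}
        slope_mid = f(x + h / 2.0, y_mid)      # slope at the midpoint
        y += h * slope_mid                     # full step with the midpoint slope
        x += h
    return y

def f(x, y):
    # Hypothetical test problem: dy/dx = y, exact solution y = exp(x).
    return y

result = improved_polygon(f, 0.0, 1.0, 0.1, 10)
print(result, abs(math.e - result))
```

For this linear test problem each step multiplies y by 1 + h + h^2/2, the same factor Heun's method produces, so the two modifications agree here; they differ in general for nonlinear f.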

Conclusions

- Algorithms can be more or less stable to truncation or round off error.
- Algorithms can be better or worse approximations to the math you want to do.
- Algorithms can be more or less complex.

Master Equation Simulation I

- (Based on Gillespie, D.T. (1977) JPC, 81(25), 2340)
- We are given a system in the state $(X_1, \ldots, X_N)$ at time t.
- To move the system forward in time we must ask two questions
- When will the next reaction occur?
- What kind of reaction will it be?
- In order to answer these questions we introduce
- $P(\tau, \mu)\,d\tau$ = probability that, given the state $(X_1, \ldots, X_N)$ at time t, the next reaction in V will occur in the infinitesimal time interval $(t + \tau, t + \tau + d\tau)$ and will be a reaction of type $R_\mu$.

Master Equation Simulation II

Now we can define $P(\tau, \mu)\,d\tau$ to be the probability that no reaction occurs in the interval $(t, t + \tau)$, $P_0(\tau)$, times the probability that reaction $R_\mu$ will occur in the infinitesimal time $d\tau$ following this interval, $a_\mu\,d\tau$:

$P(\tau, \mu)\,d\tau = P_0(\tau)\,a_\mu\,d\tau$

Now $a_\mu$ is simply a term related to the rate equation for a given reaction. In fact it is a transition probability, $c_\mu$, times a combinatorial term, $h_\mu$, which enumerates the number of ways n species can react in volume V given the configuration $(X_1, \ldots, X_N)$. Therefore

$1 - \sum_\mu a_\mu\,d\tau'$ = probability that no reaction will occur in time $d\tau'$ from the state $(X_1, \ldots, X_N)$

and

$P_0(\tau' + d\tau') = P_0(\tau')\left[1 - \sum_\mu a_\mu\,d\tau'\right]$

the solution of which is

$P_0(\tau) = \exp\left(-\sum_\mu a_\mu\,\tau\right)$
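The two questions (when does the next reaction occur, and which one is it) translate into a short simulation loop. The sketch below follows the direct-method idea from Gillespie (1977): the waiting time is drawn from the exponential distribution implied by P0, and the reaction type is drawn with probability proportional to its propensity. The birth-death example, its rate constants, and the function name `gillespie` are assumptions for illustration only.

```python
import math
import random

def gillespie(x0, stoich, propensities, t_end, rng):
    """Simulate one trajectory; returns lists of event times and states."""
    t, x = 0.0, list(x0)
    times, states = [t], [tuple(x)]
    while True:
        a = [p(x) for p in propensities]    # a_mu for the current state
        a0 = sum(a)
        if a0 <= 0.0:
            break                           # no reaction can fire
        # When? The waiting time is exponential with rate a0.
        tau = -math.log(1.0 - rng.random()) / a0
        if t + tau > t_end:
            break
        t += tau
        # Which? Reaction mu is chosen with probability a_mu / a0.
        r = rng.random() * a0
        mu, acc = 0, a[0]
        while acc <= r and mu < len(a) - 1:
            mu += 1
            acc += a[mu]
        x = [xi + d for xi, d in zip(x, stoich[mu])]
        times.append(t)
        states.append(tuple(x))
    return times, states

# Hypothetical birth-death process: 0 -> X at rate k, X -> 0 at rate g * X.
k, g = 10.0, 1.0
times, states = gillespie(
    x0=[0],
    stoich=[(+1,), (-1,)],
    propensities=[lambda x: k, lambda x: g * x[0]],
    t_end=50.0,
    rng=random.Random(1),
)
print(len(times), states[-1])
```

For this example the molecule count fluctuates around k/g = 10; a seeded `random.Random` is used only so that the run is reproducible.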

Master Equation Simulation III

Endogenous Noise

[Figure: promoter, gene a, and signal protein A.]

- One gene
- Growing cell, 45 minutes division time
- Average 60 seconds between transcripts
- Average 10 proteins/transcript
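As a rough back-of-the-envelope check on these numbers, the sketch below estimates the mean protein level implied by the listed rates. It assumes (this is not stated on the slide) that the protein is stable and lost only by dilution through growth, with a first-order dilution rate of ln 2 divided by the division time.

```python
import math

def mean_protein(seconds_between_transcripts=60.0,
                 proteins_per_transcript=10.0,
                 division_time_min=45.0):
    """Mean protein count when production is balanced by growth dilution.

    Assumes stable protein and dilution rate ln(2) / division time.
    """
    production = proteins_per_transcript / seconds_between_transcripts  # proteins/s
    dilution = math.log(2.0) / (division_time_min * 60.0)               # 1/s
    return production / dilution

estimate = mean_protein()
print(round(estimate, 1))
```

Under these assumptions the mean comes out at roughly 650 proteins per cell, built from only about 45 transcription events per division time, which is why the fluctuations around the mean are not negligible.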

What happens when you have bistability and noise?

Langevin equation

[Figure: reaction scheme with enzyme E.]

- But what if there is external noise on E?
- Let's start with

The compact Langevin

- Plug the conservation conditions into the equations for A-p (A)

[Equation not preserved; its terms are labeled Drift and Diffusion.]

Note that another term in 1/KA has been

introduced. There is now the possibility of a

cubic nullcline.

The Fokker-Planck equivalent.

Which yields the stationary nullcline

- Compared to the deterministic nullcline

Depending on the noise type

[Figure: Ass for each noise type.]

p = 0: normal noise; p = 1/2: chi-square noise; p = 1: log-normal noise

Validation by ME simulation

It turns out this generates log-normal noise on E

ME Simulation

With noise on E Without noise

Stationary Distribution with Noise

Stationary Distribution w/o Noise

Summary

- Adding noise to a system (in this case external noise) can qualitatively change its dynamics.
- Interestingly we can predict the effect with a compact Langevin approach AND a MM approximation pretty well compared to what's observed in a full ME simulation.
- The implications for noise-induced bistability and switching haven't been fully worked out.

But an ugly specter is raised.

Is this really a valid picture? Adding noise changes the nullcline!

Nonetheless: static noise can make things look bistable

[Figure panels: Linear; Switch.]

There is a relationship between the variance on E and the slope of the response that determines whether the stationary distribution will be bimodal.
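This point can be illustrated with a small sampling sketch. Everything specific here is an assumption made for illustration: E is drawn once per cell from a log-normal distribution (static noise), and the response is a Hill function with coefficient 1 for the graded case and 8 for the switch-like case. No bistable dynamics are involved; the output distribution simply inherits its shape from the steepness of the response.

```python
import numpy as np

def hill(e, K=1.0, n=1.0):
    """Steady-state response to input e; larger n gives a steeper switch."""
    return e**n / (K**n + e**n)

def middle_fraction(x):
    """Fraction of cells with an intermediate response (between 0.25 and 0.75)."""
    return float(np.mean((x > 0.25) & (x < 0.75)))

rng = np.random.default_rng(0)
E = rng.lognormal(mean=0.0, sigma=1.0, size=20000)   # static noise on E

graded = hill(E, n=1.0)    # shallow response
switch = hill(E, n=8.0)    # steep, switch-like response

print(middle_fraction(graded), middle_fraction(switch))
```

With the shallow response most cells sit at intermediate output, so the distribution is unimodal; with the steep response the same input noise pushes most cells to either low or high output, giving a bimodal histogram even though each cell has a single stationary state.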

[Figure: probability densities p(x) versus X for the two cases.]

Niches are Dynamic

abiotic reservoir

- Characteristic times may be spent in each environment.
- Environments themselves are variable.

Adaptability vs. Evolvability

Life Cycle

- Adaptability: adjustment on the time scale of the life cycle of the organism

- Evolvability: capacity for genetic changes to invade new life cycles

Chris Voigt

Evolvability

- In a dynamic environment, the lineage that adapts first, wins
- Fewer mutations means faster evolution
- Are some biosystems constructed to minimize the mutations required to find improvements?

- Modularity
- Robustness / Neutral drift improves functional sampling
- Shape of functionality in parameter space
- Minimize null regions in parameter space (entropy of multiple mutations)

Chris Voigt

Logic of B.subtilis stress response

Sporulation

- Network organization has a functional logic.
- There are different levels of abstraction to be

found.

Clustered Phylogenetic Profiles

[Figure: clustered phylogenetic profile, species versus genes, with numbered blocks 1 to 10 grouped under Chemotaxis, Sporulation, and Competence.]

- Clustered phylogenetic profile shows blocks of conserved genes
- methyl-processing receptors and chemotaxis genes in motile bacteria
- methyl-processing receptors and chemotaxis genes in motile Archaea
- flagellar genes in motile bacteria
- type III secretion system (virulence) in non-motile pathogenic bacteria
- motility genes in spore-forming bacteria
- late-stage sporulation genes in spore-forming bacteria
- spore coat and germination response genes in spore-forming bacteria that are not competent
- late-stage sporulation genes in spore-forming bacteria that are also competent
- DNA uptake genes in Gram positive bacteria
- DNA uptake genes in Gram negative bacteria

Consider Chemotaxis: E. coli

[Figure: E. coli chemotaxis pathway spanning the periplasm and cytoplasm.]

Integral Feedback Controller

[Figure: control loop with receptors, CheAWYZ, flagella, and cheB/cheR.]

Clusters are functionally coherent

Receptors

Signal Transduction (che)

Hook and Flagellar Body

Flagellar export/Type III secretion

Flagellar length and motor control

Hypothetical receptors

Cross-Regulation with Sporulation/Cell Cycle

Different modules for different lives

Animal pathogens

Sporulators

Archaeal Extremophiles

Plant pathogens

Endopathogens

Endopathogens

What Ontology Recovers Modules?

Systems Ontology

Color legend: sensor, controller, actuator, cross-talk between networks, unknown

Comparative analysis is especially important

Rao, C.V., Kirby, J., Arkin, A.P. (2004) PLoS Biology, 2(2), 239-252

These are the homologous chemotaxis pathways in E. coli and B. subtilis. They have the same wild-type behavior, different biochemical mechanisms, and different robustnesses!

Chris Rao/John Kirby

Two important features

Adaptation Time

Exact Adaptation

Differences in robustness

E. coli

Chris Rao/John Kirby

B. subtilis

Do these differences lead to differences in

actual fitness?


Sporulation initiation

A Motif

The SIN Operon: A recurrent motif

[Figure: SIN operon diagram with labels: Environmental and Cellular Signals; Spo0A; promoters P1 and P3; sinR; sinI; Sporulation genes (stage II), spoIIG as model.]

SIN Operon

- Vegetative (healthy) growth: constitutive SinR expression from P3

Feedback provides filtering

[Figure: response to the INPUT of Spo0A~P.]

Functional Regions in Parameter Space

[Figure: parameter-space diagrams showing regions of bistability and of oscillations (bounded by Hopf points), with axes SinR Activity and SinI Activity. Parameters listed for P1: k1, DGS, DGRNAP, DGR, AI, gI, KI; for P3: k3, AR, gR, KR.]

Chris Voigt

Full Bifurcation Analyses: Evolvability?

- Tuning the expression of SinR (AR) with respect to SinI leads to dynamical plasticity
- Transcription from P3 (k3) strengthens bistability and damps oscillations

[Figure: bifurcation diagrams at 0A = 10,000 nM and 0A = 10 nM, with regions labeled Bistability, Osc, Switch, Graded, and Pulse; axes k3 (mRNA/s), SinI (nM), and AR (protein/mRNA-s).]

Examples of Protein-Antagonist Operons

- How can complicated dynamical behavior arise from simple evolutionary events?
- What are the requirements to bias the operon to one function?
- Once established can one function evolve into another?

Chris Voigt

Comparative analysis of SinI/SinR

[Figure labels: region affecting k1; KI]

Comparison of five strains of Bacillus anthracis

Across ALL sporulators: very variable.

In anthracis: mutations mostly affect KI and k1. The threshold of the switch is most affected.

Voigt, C.A., Wolf, D.M., Arkin, A.P. (2004) Genetics, in press. PMID 15466432

Feedback induces stochastic bimodality

[Figure panels: Spo0A~P = 1 nM; Spo0A~P = 4 nM; Spo0A~P = 100 nM; axis: sinI]

Though we must be careful since the addition of

noise itself changes the qualitative dynamics.

Heterogeneity of Entry to Sporulation

A.

B.

Microscopic analysis of LF25 (amyEPspoIIE cm). Observation by DIC X60 (A.) and fluorescence (B.) of cells resuspended to induce sporulation and incubated 3 hours at 37 °C. Examples of cells not showing fluorescence are circled in panel A.

Lisa Fontaine-Bodin, Denise Wolf, Jay Keasling

Summary 1

So this motif

- Has flexible function based on parameters
- Most parameters tune response
- A couple of parameters qualitatively change the response
- Is an example of a possible Evolvable Motif
- Sometimes exhibits stochastic effects
- Are they adaptive?

Stochastic Effects Are Ubiquitous

Images

Clones

Stochastic Gene Expression in HIV-1 Derived

Lentiviruses Stable Clones

No Positive Feedback

Tat Feedback Very Bright Sort

Tat Feedback Bright Sort

Software

- MATLAB
- Mathematica
- Berkeley Madonna
- GEPASI
- TerraNode
- JDesigner

The game of life

[Figure: Organism 1 and Organism 2 in an environment that moves among states E1 to E4 over time t, with noise; labels include sensed signals S1 to SN, quorum, pi, and output signals.]

Beginning to link Game Theory to Dynamical

Cellular Strategies.

Formal Model

[Figure: model diagram, panels a) and b).]

Example: two environments, two moves, no sensor; e.g. x = pili, y = no pili; E1 = in host, E2 = out

IF E1 selects for x, against y; E2 selects against x, for y

Denise Wolf, Vijay Vazirani

With no sensor, the options are

Denise Wolf, Vijay Vazirani

- ALL cells in state x: extinction
- ALL cells in state y: extinction
- Statically mixed population (some x, some y): extinction
- Phase variation of individual cells between x and y: proliferation!
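The four options can be compared with a toy calculation. Everything numerical here is an assumption made for illustration, not the model of Wolf and Vazirani: the environment alternates between E1 and E2 every T generations, the favored state multiplies by g per generation while the disfavored state shrinks by d, and a fraction s of cells switches state each generation.

```python
import numpy as np

def growth_after_cycles(s, start, g=1.5, d=0.1, T=10, n_cycles=5):
    """Total population after n_cycles of E1 then E2, relative to the start.

    s: per-generation switching fraction between states x and y.
    start: initial amounts of (x, y) cells.
    """
    n = np.array(start, dtype=float)
    switch = np.array([[1.0 - s, s],
                       [s, 1.0 - s]])
    select_e1 = np.diag([g, d])   # E1 selects for x, against y
    select_e2 = np.diag([d, g])   # E2 selects against x, for y
    for _cycle in range(n_cycles):
        for select in (select_e1, select_e2):
            for _generation in range(T):
                n = switch @ (select @ n)
    return n.sum() / float(sum(start))

all_x = growth_after_cycles(s=0.0, start=(1.0, 0.0))
all_y = growth_after_cycles(s=0.0, start=(0.0, 1.0))
static_mix = growth_after_cycles(s=0.0, start=(0.5, 0.5))
phase_variation = growth_after_cycles(s=0.1, start=(0.5, 0.5))
print(all_x, all_y, static_mix, phase_variation)
```

With these illustrative numbers the three non-switching strategies shrink toward extinction, because each lineage loses more in its bad environment than it gains in its good one, while the switching population grows: a minority of cells is always already in the state the next environment will favor.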

Phase variation for survival

[Figure axes: rate of Y→X switching versus rate of X→Y switching.]

This is a Devil's compromise: phase-variation behavior is not optimal in any one environment but is necessary for survival with noisy sensors in a fluctuating environment.

Denise Wolf, Vijay Vazirani

Learning Environment from Cell State

Strategy | Sensor profile | Environmental profile

Random Phase Variation (RPV) | No sensors | Devil's Compromise (DC) lifecycle: time varying environment with different environmental states selecting for different cell states. Optimal switching rates a function of lifecycle asymmetries and environmental autocorrelation. Time variation required (spatial variation insufficient).

Random Phase Variation (RPV) | O: low prob. observable transitions over DC or extinction set. | Same DC lifecycle profile as the row above.

Random Phase Variation (RPV) | D: long delays relative to env. transition times. | Same DC lifecycle profile as the row above.

Random Phase Variation (RPV) | Perfect sensors | Frequency dependent growth curves with mixed ESS.

Sensor Based Mixed | O: high prob. observable transitions. A: poor accuracy. | Devil's Compromise lifecycle. Asymmetric lifecycle required. Optimal mixing probabilities biased toward selected cell-states in dominant environmental states.

Sensor Based Mixed LPF | O: high prob. observable transitions. A: poor accuracy. N: high additive noise. | Devil's Compromise lifecycle. Asymmetric lifecycle required. Optimal mixing probabilities biased toward selected cell-states in dominant environmental states.

Sensor Based Pure | O: high prob. observable transitions. A: high accuracy, or moderate accuracy and low noise N. | Temporally or spatially varying environment with each environmental state selecting for a single cell state.

Sensor Based Pure LPF | O: high prob. observable transitions. A: moderate accuracy. N: high additive noise. | Temporally or spatially varying environment with each environmental state selecting for a single cell state.

Denise Wolf, Vijay Vazirani

Robustness and Fragility

- The stratagems of a cell evolve in a given environment for robust survival.
- Evolution writes an internal model of the environment into the genome.
- But the system is fragile both
- to certain changes in the environment (though there are evolvable designs)
- and to certain random changes in its process structure.
- One of the central questions has to be: Robust on what time scale? Can evolution design for the future by learning from the past?

Summary

- The availability of large numbers of bacterial genomes and our ability to measure their expression opens a new field of Evolutionary Systems Biology or Regulatory Phylogenomics.
- Comparative genomics identifies particularly conserved motifs, parts of which are evolutionarily variable and select for different behaviors of the network.
- By understanding what evolution selects in a network context we better understand what the engineerable aspects of the network are.

Acknowledgements

- Comparative Stress Response: Amoolya Singh, Denise Wolf
- SinIR analysis: Chris Voigt, Denise Wolf
- Chemotaxis: Chris Rao, John Kirby
- HIV: Leor Weinberger, David Schaffer
- Games: Denise Wolf, Vijay V. Vazirani
- Funding
- NIGMS/NIH
- DOE Office of Science
- DARPA BioCOMP
- HHMI