Title: Cell Talk
1Cell Talk
NYU, Department of Computer Science, 09/19/2003

 Bud Mishra
 Professor of CS & Mathematics (Courant, NYU)
 Professor (Cold Spring Harbor Laboratory)
 Professor (Tata Institute of Fundamental Research, Adjunct)
 Professor of Human Genetics (Mt Sinai School of Medicine)
3Robert Hooke
 Robert Hooke (1635-1703) was an experimental scientist, mathematician, architect, and astronomer. Secretary of the Royal Society from 1677 to 1682, he is remembered for discovering the proportional relationship between the extension of a spring and the force applied to produce that extension.
 His work Micrographia of 1665 contained his microscopical investigations, which included the first identification of biological cells.
 Hooke became involved in a dispute with Isaac Newton over the priority of the discovery of the inverse square law of gravitation.
 Aubrey held his ability in high regard: "He is certainly the greatest Mechanick this day in the World." In his drafts of Book II, Newton had referred to him as "the most illustrious Hooke" (Clarissimus Hookius). Hooke was considered England's Da Vinci because of his wide range of interests.
4Newton & Hooke
 Huygens's Preface is concerning those properties of gravity which I myself first discovered and showed to this Society and years since, which of late Mr. Newton has done me the favour to print and publish as his own inventions.
 And particularly that of the oval figure of the Earth which was read by me to this Society about 27 years since upon the occasion of the carrying the pendulum clocks to sea and at two other times since, though I have had the ill fortune not to be heard, and I conceive there are some present that may very well remember and do know that Mr. Newton did not send up that addition to his book till some weeks after I had read and showed the experiments and demonstration thereof in this place and had answered the reproachful letter of Dr. Wallis from Oxford.
5Newton to Halley
 Now is this not very fine? Mathematicians that find out, settle & do all the business must content themselves with being nothing but dry calculators & drudges & another that does nothing but pretend & grasp at all things must carry away all the inventions
 Should a man who thinks himself knowing & loves to know it in correction & instructing others, come to you when you are busy, & notwithstanding your excuse, press discourse upon you & through his own mistakes correct you & multiply discourses & then make this use of it, to boast that he taught you all he spake & oblige you to acknowledge it & cry out injury & injustice if you do not
 I beleive you would think him a man of a strange unsociable temper.
6Newton to Hooke
 "If I have seen further than other men, it is because I have stood on the shoulders of giants, and you, my dear Hooke, have not." Newton to Hooke
7Image & Logic
 The great distance between
 a glimpsed truth and
 a demonstrated truth
 Christopher Wren/Alexis Claude Clairaut
8MicrographiaPrincipia
9Micrographia
10The Brain & the Fancy
 The truth is, the science of Nature has already been too long made only a work of the brain and the fancy. It is now high time that it should return to the plainness and soundness of observations on material and obvious things.
 Robert Hooke (1635-1703), Micrographia, 1665
11Principia
12Induction Hypothesis
 Truth being uniform and always the same, it is admirable to observe how easily we are enabled to make out very abstruse and difficult matters, when once true and genuine Principles are obtained.
 Halley, The true Theory of the Tides, extracted from that admired Treatise of Mr. Isaac Newton, Intituled, Philosophiae Naturalis Principia Mathematica, Phil. Trans. 226:445,447.
 This rule we must follow, that the argument of induction may not be evaded by hypotheses.
 Hypotheses non fingo. I feign no hypotheses. Principia Mathematica.
13Morphogenesis
14Alan Turing 1952
 The Chemical Basis of Morphogenesis, 1952, Phil. Trans. Roy. Soc. of London, Series B: Biological Sciences, 237:37-72.
 A reaction-diffusion model for development.
15A mathematical model for the growing embryo.
 A very general program for modeling embryogenesis: the model is "a simplification and an idealization and consequently a falsification."
 Morphogen "is simply the kind of substance concerned in this theory"; in fact, anything that diffuses into the tissue and "somehow persuades it to develop along different lines from those which would have been followed in its absence" qualifies.
16Diffusion equation
 $\partial a / \partial t = D_a \nabla^2 a$
 first temporal derivative: rate
 second spatial derivative: flux
 $a$: concentration, $D_a$: diffusion constant
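The diffusion equation on this slide can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional periodic grid and an explicit finite-difference scheme; the function name `diffuse`, the grid size, and the parameter values are illustrative choices, not taken from the talk.

```python
def diffuse(a, D, dx, dt, steps):
    """Explicit finite-difference integration of da/dt = D * d2a/dx2 (periodic 1-D grid)."""
    a = list(a)
    n = len(a)
    r = D * dt / dx ** 2  # this explicit scheme is only stable for r <= 0.5
    for _ in range(steps):
        # Discrete Laplacian: left neighbour - 2 * centre + right neighbour.
        a = [a[i] + r * (a[(i - 1) % n] - 2.0 * a[i] + a[(i + 1) % n])
             for i in range(n)]
    return a


# A single spike of concentration spreads out; the total amount is conserved.
a0 = [0.0] * 50
a0[25] = 1.0
a1 = diffuse(a0, D=1.0, dx=1.0, dt=0.2, steps=100)
```

The spike flattens into a bell-shaped profile centred on the original position, which is the "flux" behaviour the second spatial derivative encodes.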
17Reaction-Diffusion
 $\partial a / \partial t = f(a,b) + D_a \nabla^2 a, \quad f(a,b) = a(b-1) - k_1$
 $\partial b / \partial t = g(a,b) + D_b \nabla^2 b, \quad g(a,b) = -ab + k_2$
Turing, A.M. (1952). The chemical basis of morphogenesis. Phil. Trans. Roy. Soc. London B 237: 37-72.
18Reaction-diffusion: an example
 $A + 2B \to 3B, \quad B \to P$
 B extracted at rate F, decay at rate k
 A fed at rate F
Pearson, J. E. Complex patterns in simple systems. Science 261, 189-192 (1993).
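As an illustration of the reaction scheme above, here is a minimal sketch of the corresponding reaction-diffusion system (often called the Gray-Scott model) on a one-dimensional periodic grid. The rate law A·B² follows from the stoichiometry A + 2B → 3B; the grid size, diffusion constants, feed rate F, decay rate k, and the initial seed are assumed values chosen only for demonstration.

```python
def gray_scott_1d(n=100, steps=2000, Da=0.16, Db=0.08, F=0.035, k=0.065, dt=1.0):
    """Explicit Euler integration of A + 2B -> 3B, B -> P with feed F and decay k."""
    A = [1.0] * n
    B = [0.0] * n
    # Seed a small patch of B in the middle so the autocatalytic reaction can start.
    for i in range(n // 2 - 5, n // 2 + 5):
        A[i], B[i] = 0.5, 0.25
    for _ in range(steps):
        # Laplacians are computed from the old state before any cell is updated.
        lapA = [A[i - 1] - 2.0 * A[i] + A[(i + 1) % n] for i in range(n)]
        lapB = [B[i - 1] - 2.0 * B[i] + B[(i + 1) % n] for i in range(n)]
        for i in range(n):
            reaction = A[i] * B[i] * B[i]  # rate of A + 2B -> 3B
            a_new = A[i] + dt * (Da * lapA[i] - reaction + F * (1.0 - A[i]))
            b_new = B[i] + dt * (Db * lapB[i] + reaction - (F + k) * B[i])
            A[i], B[i] = a_new, b_new
    return A, B


A_final, B_final = gray_scott_1d()
```

Varying F and k in a sketch like this is the usual way to explore the different pattern regimes; which regime a given pair falls into is best checked empirically.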
19Reaction-diffusion: an example
20Genes 1952
 Since the role of genes is presumably catalytic,
influencing only the rate of reactions, unless
one is interested in comparison of organisms,
they may be eliminated from the discussion
21Crick & Watson 1953
22Genome
 Genome
 Hereditary information of an organism is encoded in its DNA and enclosed in a cell (unless it is a virus). All the information contained in the DNA of a single organism is its genome.
 A DNA molecule can be thought of as a very long sequence of nucleotides or bases: $\Sigma = \{A, T, C, G\}$
23The Central Dogma
 The central dogma (due to Francis Crick in 1958) states that these information flows are all unidirectional
 "The central dogma states that once 'information' has passed into protein it cannot get out again. The transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein, may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible. Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein."
 DNA → (Transcription) → RNA → (Translation) → Protein
24RNA, Genes and Promoters
 A specific region of DNA that determines the synthesis of proteins (through transcription and translation) is called a gene
 Originally, a gene meant something more abstract: a unit of hereditary inheritance.
 Now a gene has been given a physical molecular existence.
 Transcription of a gene to a messenger RNA, mRNA, is keyed by a transcriptional activator/factor, which attaches to a promoter (a specific sequence adjacent to the gene).
 Regulatory sequences such as silencers and enhancers control the rate of transcription
25The Brain & the Fancy
 Work on the mathematics of growth as opposed to the statistical description and comparison of growth, seems to me to have developed along two equally unprofitable lines... It is futile to conjure up in the imagination a system of differential equations for the purpose of accounting for facts which are not only very complex, but largely unknown... What we require at the present time is more measurement and less theory.
 Eric Ponder, Director, CSHL (LIBA), 1936-1941.
26Axioms of Platitudes E.B. Wilson
 Science need not be mathematical.
 Simply because a subject is mathematical it need not therefore be scientific.
 Empirical curve fitting may be without other than classificatory significance.
 Growth of an individual should not be confused with the growth of an aggregate (or average) of individuals.
 Different aspects of the individual, or of the average, may have different types of growth curves.
27Genes for Segmentation
 Fertilisation followed by cell division
 Pattern formation instructions for
 Body plan (Axes AP, DV)
 Germ layers (ecto, meso, endoderm)
 Cell movement  form gastrulation
 Cell differentiation
28PI: Positional Information
 Positional value
 Morphogen: a substance
 Threshold concentration
 Program for development
 Generative rather than descriptive
 French-Flag Model
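The French-flag idea (a cell reads a morphogen concentration and picks a fate by comparing it to thresholds) can be sketched in a few lines. This is a toy illustration only: the exponential gradient, the decay length, and the two threshold values are assumptions made for the example, not parameters from the talk.

```python
import math


def morphogen(x, c0=1.0, decay_length=0.3):
    """Assumed exponential gradient with its source at x = 0."""
    return c0 * math.exp(-x / decay_length)


def french_flag(x, high=0.5, low=0.2):
    """Map a position to a fate by thresholding the local morphogen concentration."""
    c = morphogen(x)
    if c >= high:
        return "blue"
    if c >= low:
        return "white"
    return "red"


# Ten cells spread evenly along the axis from x = 0 to x = 1.
fates = [french_flag(i / 9) for i in range(10)]
```

Because the gradient is monotone, the fates appear as three contiguous bands, which is the sense in which the model is generative rather than descriptive.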
29bicoid
 The bicoid gene provides an A-P morphogen gradient
30gap genes
 The A-P axis is divided into broad regions by gap gene expression
 The first zygotic genes
 Respond to maternally-derived instructions
 Short-lived proteins, giving a bell-shaped distribution from the source
31Transcription Factors in Cascade
 Hunchback (hb), a gap gene, responds to the dose of bicoid protein
 A concentration of bicoid above threshold activates the expression of hb
 The more bicoid transcripts, the further back hb expression goes
32Transcription Factors in Cascade
 Krüppel (Kr), a gap gene, responds to the dose of hb protein
 A concentration of hb above a minimum threshold activates the expression of Kr
 A concentration of hb above a maximum threshold inactivates the expression of Kr
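The cascade described on these two slides (bicoid sets hb, and hb within a window sets Kr) can be sketched as a pair of threshold rules. Everything quantitative here is an assumption for illustration: the exponential bicoid profile, the Hill-type response of hb, and the numerical thresholds are hypothetical.

```python
import math


def bicoid(x, length=0.2):
    """Assumed exponential bicoid gradient, anterior end at x = 0."""
    return math.exp(-x / length)


def hunchback(x, threshold=0.3, n=4):
    """Assumed Hill-type activation of hb by bicoid (switch-like near the threshold)."""
    b = bicoid(x)
    return b ** n / (threshold ** n + b ** n)


def kruppel_on(x, low=0.1, high=0.8):
    """Kr is expressed only where hb lies between a minimum and a maximum threshold."""
    return low < hunchback(x) < high


positions = [i / 20 for i in range(21)]
kr_band = [x for x in positions if kruppel_on(x)]
```

With these assumed values Kr is off at both ends of the axis and on in an interior band, showing how two thresholds on one upstream factor produce a stripe.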
33Segmentation
 Parasegments are delimited by expression of pair-rule genes in a periodic pattern
 Each is expressed in a series of 7 transverse stripes
34Pattern Formation
 Edward Lewis, of the California Institute of Technology
 Christiane Nüsslein-Volhard, of Germany's Max-Planck Institute
 Eric Wieschaus, at Princeton
 All three were involved in the early research to find the genes controlling development of the Drosophila fruit fly.
35The Network of Interaction
 Legend
 WG: wingless
 HH: hedgehog
 CID: cubitus interruptus
 CN: repressor fragment of CID
 PTC: patched
 PH: patched-hedgehog complex
 positive interactions
 negative interactions
 mRNA
 proteins
36Completeness: von Dassow, Meir, Munro & Odell, 2000
 We used computer simulations to investigate whether the known interactions among segment polarity genes suffice to confer the properties expected of a developmental module.
 Using only the solid lines in the earlier figure we found no such parameter sets despite extensive efforts. Thus the solid connections cannot suffice to explain even the most basic behavior of the segment polarity network.
 There must be active repression of en in cells anterior to the wg-expressing stripe and something that spatially biases the response of wg to Hh. There is good evidence in Drosophila for wg autoactivation.
37Completeness
 We incorporated these two remedies first (light gray lines). With these links installed there are many parameter sets that enable the model to reproduce the target behavior, so many that they can be found easily by random sampling.
38Model Parameters
39Complete Model
40Complete Model
41Is this your final answer?
 It is not uncommon to assume certain biological problems to have achieved a cognitive finality without rigorous justification.
 Rigorous mathematical models with automated tools for reasoning, simulation, and computation can be of enormous help to uncover
  cognitive flaws,
  qualitative simplification or
  overly generalized assumptions.
 Some ideal candidates for such study would include
  prion hypothesis
  cell cycle machinery
  muscle contractility
  processes involved in cancer (cell cycle regulation, angiogenesis, DNA repair, apoptosis, cellular senescence, tissue space modeling enzymes, etc.)
  signal transduction pathways, and many others.
42Systems Biology
Combining the mathematical rigor of numerology
with the predictive power of astrology.
Cyberia
Numerology
Astrology
Numeristan
HOTzone
Astrostan
Infostan
Interpretive Biology
Computational Biology
Integrative Biology
Bioinformatics
BioSpice
43ComputationalSystems Biology
How much of reasoning about biology can be
automated?
44Graphical Representation
45Graphical Representation
The reaction between X1 and X2 requires coenzyme
X3 which is converted to X4
46Glycolysis
Glycogen
P_i
Glucose-1-P
Glucose
Phosphorylase a
Phosphoglucomutase
Glucokinase
Glucose-6-P
Phosphoglucose isomerase
Fructose-6-P
Phosphofructokinase
47An Artificial Clock
 Three proteins
 LacI, TetR & λ cI
 Arranged in a cyclic manner (logically, not necessarily physically) so that the protein product of one gene is a repressor for the next gene.
 LacI → tetR (repression), tetR → TetR
 TetR → λ cI (repression), λ cI → λ cI
 λ cI → lacI (repression), lacI → LacI
48Cycles of Repression
 The first repressor protein, LacI from E. coli, inhibits the transcription of the second repressor gene, tetR from the tetracycline-resistance transposon Tn10, whose protein product in turn inhibits the expression of a third gene, cI from λ phage.
 Finally, CI inhibits lacI expression, completing the cycle.
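The cycle of repression can be explored with a small simulation. The sketch below assumes the commonly used dimensionless repressilator equations (mRNA produced at a rate repressed by the upstream protein through a Hill function, protein tracking its mRNA); the talk itself does not give equations here, and the parameter values `alpha`, `alpha0`, `beta`, `n` and the initial conditions are illustrative assumptions.

```python
def simulate_repressilator(t_end=100.0, dt=0.01, alpha=216.0, alpha0=0.0,
                           beta=5.0, n=2.0):
    """Euler integration of a three-gene repression ring.

    Index 0 = lacI, 1 = tetR, 2 = cI. Gene i is repressed by protein i-1
    (LacI represses tetR, TetR represses cI, CI represses lacI).
    """
    m = [1.0, 0.0, 0.0]  # mRNA levels (asymmetric start, assumed)
    p = [2.0, 1.0, 3.0]  # protein levels (asymmetric start, assumed)
    history = []
    for _ in range(int(t_end / dt)):
        # p[i - 1] wraps around for i = 0, closing the ring.
        dm = [-m[i] + alpha / (1.0 + p[i - 1] ** n) + alpha0 for i in range(3)]
        dp = [-beta * (p[i] - m[i]) for i in range(3)]
        m = [m[i] + dt * dm[i] for i in range(3)]
        p = [p[i] + dt * dp[i] for i in range(3)]
        history.append(tuple(p))
    return history


history = simulate_repressilator()
late_lacI = [state[0] for state in history[len(history) // 2:]]
amplitude = max(late_lacI) - min(late_lacI)
```

With these assumed parameters the symmetric steady state is unstable, so the protein levels keep oscillating rather than settling; weaker repression (smaller `alpha`) can make the oscillation die out.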
49Biological Model
 Standard molecular biology: Construct
 A low-copy plasmid encoding the repressilator and
 A compatible higher-copy reporter plasmid containing the tet-repressible promoter PLtet01 fused to an intermediate stability variant of gfp.
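50Cascade Model: Repressilator?
 $dX_2/dt = \alpha_2 X_6^{g_{26}} X_1^{g_{21}} - \beta_2 X_2^{h_{22}}$
 $dX_4/dt = \alpha_4 X_2^{g_{42}} X_3^{g_{43}} - \beta_4 X_4^{h_{44}}$
 $dX_6/dt = \alpha_6 X_4^{g_{64}} X_5^{g_{65}} - \beta_6 X_6^{h_{66}}$
 $X_1, X_3, X_5$ = const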
51SimPathica System
52Simpathica Movies
53Canonical Forms Model Building
54Systems of Differential Equations
 $dX_i/dt$
 (instantaneous) rate of change in $X_i$ at time $t$
 Function of substrate concentrations, enzymes, factors and products
 $dX_i/dt = f(S_1, S_2, \ldots, E_1, E_2, \ldots, F_1, F_2, \ldots, P_1, P_2, \ldots)$
 S-systems result in a Nonlinear Time-Invariant DAE System.
55General Form
 $dX_i/dt = V_i^{+}(X_1, X_2, \ldots, X_n) - V_i^{-}(X_1, X_2, \ldots, X_n)$
 Where the $V_i^{+}()$ term represents the production (or accumulation) rate of a particular metabolite and $V_i^{-}()$ represents the depletion rate of the same metabolite.
 Generalizing to $n$ dependent variables and $m$ independent variables, we have
 $dX_i/dt = V_i^{+}(X_1, X_2, \ldots, X_n, U_1, U_2, \ldots, U_m) - V_i^{-}(X_1, X_2, \ldots, X_n, U_1, U_2, \ldots, U_m)$
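To make the production-minus-depletion form concrete, here is a minimal sketch that assumes the power-law (S-system) choice for both terms, i.e. each term is a rate constant times a product of powers of the variables. The function names, the two-variable example, and all numbers are hypothetical and chosen so that the steady state is easy to check by hand.

```python
def s_system_rhs(x, alpha, g, beta, h):
    """dX_i/dt = alpha_i * prod_j X_j**g[i][j] - beta_i * prod_j X_j**h[i][j]."""
    rhs = []
    for i in range(len(x)):
        production = alpha[i]
        depletion = beta[i]
        for j, xj in enumerate(x):
            production *= xj ** g[i][j]
            depletion *= xj ** h[i][j]
        rhs.append(production - depletion)
    return rhs


def integrate(x, alpha, g, beta, h, dt=0.01, steps=3000):
    """Plain Euler integration; adequate for this small, non-stiff example."""
    x = list(x)
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, g, beta, h)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


# Hypothetical chain: X1 is produced at a constant rate, X2 is made from X1.
alpha = [2.0, 1.0]
g = [[0.0, 0.0], [0.5, 0.0]]   # production exponents
beta = [1.0, 1.0]
h = [[1.0, 0.0], [0.0, 1.0]]   # depletion exponents
x_final = integrate([0.5, 0.5], alpha, g, beta, h)
```

For this example the steady state satisfies X1 = 2 and X2 = X1^0.5, so the integration should approach (2, √2).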
56Canonical Forms
57S-System Automaton $A_S$
 S-System Automata Definition
 Combine snapshots of the IDs (instantaneous descriptions) of the system to create a possible world model
 Transitions are inferred from traces of the system variables
 Definition: Given an S-system $S$, the S-system automaton $A_S$ associated to $S$ is a 4-tuple $A_S = (S, \Delta, S_0, F)$, where $S \subseteq D_1 \times \cdots \times D_w$ is a set of states, $\Delta \subseteq S \times S$ is the binary transition relation, and $S_0, F \subset S$ are initial and final states respectively.
 Definition: A trace of an S-system automaton $A_S$ is a sequence $s_0, s_1, \ldots, s_n, \ldots$ such that $s_0 \in S_0$ and $\Delta(s_i, s_{i+1})$ for all $i \ge 0$.
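The definition above can be turned into a small data-structure sketch. The version below builds the 4-tuple from a list of simulated traces; quantizing each snapshot to a grid (the `resolution` parameter) and dropping repeated consecutive states are simplifying assumptions of this example, not necessarily the construction used in the talk.

```python
def build_trace_automaton(traces, resolution=0.5):
    """Return (states, delta, initial, final) inferred from traces of system variables.

    Each trace is a sequence of snapshots (tuples of variable values). Snapshots are
    quantized so that near-identical instantaneous descriptions share one state.
    """
    states, delta, initial, final = set(), set(), set(), set()
    for trace in traces:
        previous = None
        for snapshot in trace:
            state = tuple(round(value / resolution) for value in snapshot)
            states.add(state)
            if previous is None:
                initial.add(state)
            elif state != previous:
                delta.add((previous, state))  # transition seen in the trace
            previous = state
        if previous is not None:
            final.add(previous)
    return states, delta, initial, final


trace = [(0.0, 1.0), (0.1, 1.0), (0.6, 1.0), (1.0, 0.5)]
states, delta, initial, final = build_trace_automaton([trace])
```

Here the first two snapshots fall into the same quantized state, so the four snapshots yield three states and two transitions.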
58Trace Automaton
Simple one-to-one construction of the trace automaton $A_S$ for an S-system $S$
59Collapsing Algorithm
60Collapsed Automata
The effects of the collapsing construction of the trace automaton $A_S$ for an S-system $S$
61Temporal Logic Model Checking
62Models of Modal Logic: Kripke Structure
63Kripke Structure
64CTL
65Temporal Modes
66Syntax
67Semantics
68Least Fixed Point Characterization
69Model Checking
70Bisimulation
71Bisimulation
Theoretical Computer Science, 2003
72Bisimulation Lemma
73Example
74Equations in Canonical Form
75Structure of the Collapsed System
76Computational Differential Algebra
77Algebraic Approaches
78State Space Description
79Input-Output Relation
80Differential Algebra
81Example System
82Input-Output Relations
83Membership Problem
84Obstacles
85Some Remarks
 Many problems of kinetic modeling lead naturally to formulation in Differential Algebra!
 Yet, most problems in Differential Algebra remain to be solved satisfactorily!!
 Many of the tools developed in the algebraic setting (e.g., Gröbner bases, elimination theory, etc.) do not generalize.
 Complexity and solvability questions pose intriguing and challenging problems for applied mathematicians and computer scientists!!
86Purine Metabolism
87Purine Metabolism
 Purine Metabolism
 Provides the organism with building blocks for the synthesis of DNA and RNA.
 The consequences of a malfunctioning purine metabolism pathway are severe and can lead to death.
 The entire pathway is almost closed but also quite complex. It contains
  several feedback loops,
  cross-activations and
  reversible reactions
 Thus it is an ideal candidate for reasoning with computational tools.
88Simple Model
89Biochemistry of Purine Metabolism
 The main metabolite in purine biosynthesis is 5-phosphoribosyl-α-1-pyrophosphate (PRPP).
 A linear cascade of reactions converts PRPP into inosine monophosphate (IMP). IMP is the central branch point of the purine metabolism pathway.
 IMP is transformed into AMP and GMP.
 Guanosine, adenosine and their derivatives are recycled (unless used elsewhere) into hypoxanthine (HX) and xanthine (XA).
 XA is finally oxidized into uric acid (UA).
90Biochemistry of Purine Metabolism
 In addition to these processes, there appear to be two salvage pathways that serve to maintain the IMP level and thus the adenosine and guanosine levels as well.
 In these pathways, adenine phosphoribosyltransferase (APRT) and hypoxanthine-guanine phosphoribosyltransferase (HGPRT) combine with PRPP to form ribonucleotides.
91Purine Metabolism
92XML Description
. . .
<?xml version="1.0" ?>
<map xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="map.xsd">
  <substrate> <id>1</id> <concentration>5</concentration> <name>PRPP</name> </substrate>
  <substrate> <id>2</id> <concentration>100</concentration> <name>IMP</name> </substrate>
  <substrate> <id>3</id> <concentration>2500</concentration> <name>Ado</name> </substrate>
  <substrate> <id>4</id> <concentration>425</concentration> <name>GMP</name> </substrate>
  <synthesis>
    <reactant1>1</reactant1>
    <reactant2>8</reactant2>
    <product>2</product>
    <power_function1>1.1</power_function1>
    <rate1>12.570</rate1>
    <power_function2>0.48</power_function2>
    <rate2>12.570</rate2>
    <modulation>
      <enzyme>2</enzyme>
      <power_function_enzyme>0.89</power_function_enzyme>
    </modulation>
  </synthesis>
  <output>
    <reactant>11</reactant>
    <power_function>2.21</power_function>
    <rate>0.00008744</rate>
  </output>
</map>
93Queries
 Variation of the initial concentration of PRPP does not change the steady state.
 (PRPP = 10 * PRPP1) implies steady_state()   TRUE
 This query will be true when evaluated against the modified simulation run (i.e. the one where the initial concentration of PRPP is 10 times the initial concentration in the first run, PRPP1).
 Persistent increase in the initial concentration of PRPP does cause unwanted changes in the steady state values of some metabolites.
 If the increase in the level of PRPP is on the order of 70% then the system does reach a steady state, and we expect to see increases in the levels of IMP and of the hypoxanthine pool of a comparable order of magnitude.
 Always (PRPP = 1.7 * PRPP1) implies steady_state()   TRUE
94Queries
 Consider the following statement
 Eventually (Always (PRPP = 1.7 * PRPP1) implies steady_state() and Eventually (Always (IMP < 2 * IMP1)) and Eventually (Always (hx_pool < 10 * hx_pool1)))   FALSE
 where IMP1 and hx_pool1 are the values observed in the unmodified trace. The above statement turns out to be false over the modified experiment trace.
 In fact, the increase in IMP is about 6.5 fold while the hypoxanthine pool increase is about 60 fold.
 Since the above queries turn out to be false over the modified trace, we conclude that the model overpredicts the increases in some of its products and that it should therefore be amended.
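Queries of this kind can be checked mechanically against a simulation trace. The sketch below is not the query engine used in the talk; it assumes a finite trace stored as a list of dictionaries and gives `Always` and `Eventually` their usual finite-trace meanings. The trace values are synthetic, constructed only so that IMP rises about 6.5 fold as described above.

```python
def always(predicate, trace):
    """True if the predicate holds in every state of the (finite) trace."""
    return all(predicate(state) for state in trace)


def eventually(formula, trace):
    """True if the formula holds on some suffix of the trace."""
    return any(formula(trace[i:]) for i in range(len(trace)))


IMP1 = 100.0  # reference value from the unmodified run (synthetic here)

# Synthetic modified run: IMP climbs from 100 toward 650, i.e. about 6.5 fold.
trace = [{"IMP": IMP1 + 550.0 * (1.0 - 0.5 ** t)} for t in range(20)]

# Eventually(Always(IMP < 2 * IMP1)): does IMP settle below twice the reference?
settles_below_2x = eventually(
    lambda suffix: always(lambda s: s["IMP"] < 2 * IMP1, suffix), trace)

# Eventually(Always(IMP < 10 * IMP1)): a looser bound that the trace does satisfy.
settles_below_10x = eventually(
    lambda suffix: always(lambda s: s["IMP"] < 10 * IMP1, suffix), trace)
```

On a finite trace, `Eventually(Always(p))` reduces to "p holds from some point to the end of the recording", so the answer depends on how long the simulation was run.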
95Final Model
96Purine Metabolism
97Query
 This change to the model allows us to reformulate our query as shown below
 Always(PRPP > 50 * PRPP1 implies (steady_state() and Eventually(IMP > IMP1) and Eventually(HX < HX1) and Eventually(Always(IMP = IMP1)) and Eventually(Always(HX = HX1))))   TRUE
 An (instantaneous) increase in the level of PRPP will not make the system stray from the predicted steady state, even if temporary variations of IMP and HX are allowed.
98Time-Frequency: RAS Pathways
99Feedback in Biochemical Pathways
 Iyengar and Bhalla analyze a complex pathway, Science vol. 283, 1999
 The pathway presents a feedback loop involving PKC, MAPK, and Ras
 Bistability plot of PKC vs MAPK concentrations
 A is the active point, B is the basal point, and T is the threshold point
 A and B are the stable states
100PLC?PKC and Ras-Raf-MAPK Pathways
101PLC?PKC and Ras-Raf-MAPK Pathways
The trajectories of MAPK and PKC change if an EGF stimulus is provided. A 6000-second EGF stimulus is provided at 5 different levels (1, 2, 3, 5 and 7 nM). The two different modes for MAPK and PKC are observed.
102Orthonormal Bases and Projection
 The behavior of a biological process can be described by the trajectory of abundance of a particular molecule or reactant
 Time series functions and their approximate representation in terms of an M-dimensional vector in a Euclidean space.
 Projection of the time series.
 The most typical as well as robust behaviors of the system are determined by the sets of time series functions giving rise to unique clusters.
103Orthonormal Bases and Projection
 With a suitably chosen orthonormal basis $g_1, g_2, \ldots, g_M, \ldots$,
 the function can be expressed as a linear combination
 $f(t) = \sum_{i=1}^{\infty} \langle f, g_i \rangle\, g_i$
 and yields a sufficiently good approximation
 $f_M(t) = \sum_{i=1}^{M} \langle f, g_i \rangle\, g_i \approx f(t), \quad \| f(t) - f_M(t) \| \le C M^{-\alpha}$.
 Note that for a well-chosen orthonormal basis and $\epsilon > 0$,
 $\exists M, \delta:\ \| f_{1,M}(t) - f_{2,M}(t) \| < \delta \ \Rightarrow\ \| f_1(t) - f_2(t) \| < \delta + 2 C M^{-\alpha} < \epsilon$.
 Projection $P: f \mapsto (\langle f, g_i \rangle)_{i=1}^{M}$
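The projection step can be sketched for sampled time series. The talk does not say which basis is used; the example below assumes a discrete cosine basis simply because it is easy to write down and is orthonormal on N samples. The helper names and the test signal are hypothetical.

```python
import math


def cosine_basis(N, M):
    """First M vectors of an orthonormal discrete cosine basis on N samples."""
    basis = []
    for k in range(M):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / N)
        basis.append([scale * math.cos(math.pi * (n + 0.5) * k / N)
                      for n in range(N)])
    return basis


def project(f, basis):
    """P: f -> (<f, g_1>, ..., <f, g_M>), the M-dimensional representation."""
    return [sum(fi * gi for fi, gi in zip(f, g)) for g in basis]


def reconstruct(coeffs, basis):
    """f_M = sum_i <f, g_i> g_i."""
    N = len(basis[0])
    return [sum(c * g[n] for c, g in zip(coeffs, basis)) for n in range(N)]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


N = 64
f = [math.exp(-((n - 20) / 8.0) ** 2) for n in range(N)]  # a bell-shaped time series
errors = {}
for M in (4, 16, 64):
    basis = cosine_basis(N, M)
    errors[M] = distance(f, reconstruct(project(f, basis), basis))
```

The approximation error shrinks as M grows and vanishes when the full basis is used; clustering is then done on the short coefficient vectors rather than on the raw series.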
104Multiresolution Time-Frequency Analysis
Time-frequency activity induced by EGF. Two clusters are formed, corresponding to two levels of EGF stimulus: low (2 nM, red o symbols) and high (5 nM, blue x symbols) levels of MAPK/PKC. As the stimulus is applied, the system shows higher activity along the chosen discriminating vectors at high EGF stimulus. In the relaxation phase, the system shows higher time-frequency activity after the withdrawal of the lower stimulus, as it is relaxing to the stability points that existed before stimulation, thereby reversing the effect of the EGF stimulus. After the 5 nM stimulus, on the other hand, the system's active components are relaxing to higher concentration levels (memory effect), thereby displaying lower activity along the chosen time-frequency vectors.
105Multiresolution Time-Frequency Analysis
Loss of memory caused by breaking the feedback
loop involving RAS, MAPK, etc. in the RAS
pathway.
106Multiresolution Time-Frequency Analysis
107Multiresolution Time-Frequency Analysis
108Multiresolution Time-Frequency Analysis
109Time-Frequency: Cell Cycle
110The Cell Cycle
[Figure: cell cycle diagram. Phases G1, S, G2, M (metaphase), M (anaphase) and cell division; transitions start and finish; regulators Cdk, Cyclin and APC.]
111The Cell Cycle
 The chromosome cycle is divided into four phases: G1, S, G2, M
 During S phase, a new copy of each chromosome is synthesized
 During M phase (mitosis) the sister chromatids are separated so that each daughter cell receives a copy of each chromosome.
 G1 and S-G2-M are separated by two transitions: start & finish
 Cell cycle events are controlled by a network of molecular signals
 The central components are Cdks (cyclin-dependent protein kinases), cyclin molecules and APC (anaphase-promoting complex).
112The Cell Cycle
 In the G1 phase Cdk activity is low, as cyclin mRNA synthesis is inhibited and cyclin protein is degraded rapidly
 At start, cyclin synthesis is induced and cyclin degradation is inhibited, causing a rise in Cdk activity that persists throughout the S-G2-M phase
 High Cdk activity is needed for DNA replication, chromosome condensation and spindle assembly.
 At finish, the group of proteins making up the APC is activated.
 APC consists of a core complex of a dozen polypeptides plus two auxiliary proteins, Cdc20 and Cdh1.
 Together, Cdc20 and Cdh1 label cyclins for degradation at telophase, thus returning the cell to G1.
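113The interaction of Cyclin B/Cdk and Cdh1/APC
 $d[\mathrm{CycB}]/dt = k_1 - (k_2' + k_2''[\mathrm{Cdh1}])[\mathrm{CycB}]$
 $d[\mathrm{Cdh1}]/dt = \dfrac{(k_3' + k_3'' A)(1 - [\mathrm{Cdh1}])}{J_3 + 1 - [\mathrm{Cdh1}]} - \dfrac{k_4\, m\, [\mathrm{CycB}][\mathrm{Cdh1}]}{J_4 + [\mathrm{Cdh1}]}$
 A pair of nonlinear ODEs (ordinary differential equations) describing the biochemical reactions at the center.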
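The antagonism between Cyclin B/Cdk and Cdh1/APC can be explored by integrating the two equations from different starting points. In the sketch below the production term of Cdh1 is taken as positive and the Cyclin-dependent inactivation term as negative; the rate constants, the mass parameter `m`, and the activator level `A` are assumed illustrative values, and the primed constants are written `k2p`, `k2pp`, `k3p`, `k3pp`.

```python
def simulate_cycb_cdh1(cycb, cdh1, m=0.3, A=0.0, t_end=200.0, dt=0.001,
                       k1=0.04, k2p=0.04, k2pp=1.0, k3p=1.0, k3pp=10.0,
                       k4=35.0, J3=0.04, J4=0.04):
    """Euler integration of the CycB/Cdh1 pair; returns the final (CycB, Cdh1)."""
    for _ in range(int(t_end / dt)):
        d_cycb = k1 - (k2p + k2pp * cdh1) * cycb
        d_cdh1 = ((k3p + k3pp * A) * (1.0 - cdh1) / (J3 + 1.0 - cdh1)
                  - k4 * m * cycb * cdh1 / (J4 + cdh1))
        cycb += dt * d_cycb
        cdh1 += dt * d_cdh1
    return cycb, cdh1


# Same parameters, two different initial conditions.
g1_like = simulate_cycb_cdh1(cycb=0.05, cdh1=0.9)      # low CycB, active Cdh1
s_g2_m_like = simulate_cycb_cdh1(cycb=0.9, cdh1=0.05)  # high CycB, inactive Cdh1
```

With these assumed values the system is bistable: one start settles in a G1-like state (low CycB, high Cdh1) and the other in an S-G2-M-like state (high CycB, low Cdh1), which is the qualitative behaviour the start and finish transitions rely on.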
114Yeast Cell Cycle Regulations
115Simulation of Yeast Cell Cycle
116Simulation of Yeast Cell Cycle
117Simulation of Yeast Cell Cycle
118Analysis of Experimental Traces(1)
The trace is loaded
119Analysis of Experimental Traces (2)
Queries can be asked to the system
120Simulated Yeast Cell Cycle.
 Plots of Cdh1 and CycB with respect to two dominant time-frequency modes reflect the distance of the initial conditions from the two stable states
121Simulated Yeast Cell Cycle of wild type vs. CKI/SK double mutant
 The points corresponding to the trajectories of the double mutant (o symbols) are much more scattered than those corresponding to the wild type (x symbols), indicating that, although in the double mutant the oscillations of the cell cycle are restored, the system is less stable than the wild type.
122NYU SIM
123NYUSIM Trace Database
 Time course data need to be classified according to various criteria
  Parameters
  Algorithmic descriptions
  References to model used
 The objective is to avoid the "directory dump" effect and to provide a standardized way to access time-series biological data
124NYUSIM/JDesigner
Data produced with JDesigner/SBW is readily
inserted in NYUSIM
125NYUSIM/Simpathica
Simpathica/XSSYS can access the NYUSIM DB and
analyze data produced by several sources.
126NYU BioWAVE
 A tool for classification of time course data
127NYUBIOWAVE Example
 The set of functions used to test the system
 30 Beta functions with different parameters
 10 Step functions with different amplitude,
sharpness and shift
128NYUBIOWAVE Example
 The same set of functions from another viewpoint
129NYUBIOWAVE Classification
 The Matlab NYUBIOWAVE interface showing the
classification of the step functions in a single
(amplitude normalized) group
130NYUBIOWAVE Classification
 Classification of bell-shaped Beta functions by NYUBIOWAVE
131C. elegans
132Caenorhabditis elegans (C. elegans)

 An organism of exactly 959 cells.
 Two things are known about C. elegans
  the complete sequence of its DNA, and
  what every one of its 959 cells does.
 There are many things we don't know about C. elegans. One is the answer to the question: since all 959 cells come from one original cell, how does each of the 959 cells decide what sort of cell to become?
133Worm Guys
seminal discoveries concerning the genetic
regulation of organ development and programmed
cell death. By establishing and using the
nematode Caenorhabditis elegans as an
experimental model system, possibilities were
opened to follow cell division and
differentiation from the fertilized egg to the
adult. The discoveries are important for
medical research and have shed new light on the
pathogenesis of many diseases.
134Germ Line Cells in C. elegans
135More C. elegans
136Modeling Stem Cell Processes
 Mathematical Models
  Population Models for differentiation, i.e. 'state transitions'
  Diffusion Models and preliminary regulatory models for proliferation, differentiation and self-renewal
 Model Types
  Differential Equations (usually differential algebraic equations, DAE) Models
137Stem Cells: Proliferation and Differentiation
 Stem Cells Division/Proliferation (Morrison et al., Cell 88, 287-298, Feb 1997)
138Queue Model
[Figure: queue model with compartments N1, N2 and N3; arrival rate α, transfer rates β1 and β2, departure rate γ.]
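139Markov Model
 From state $\langle N_1, N_2, N_3 \rangle$ the transitions are
 $\langle N_1 + 1, N_2, N_3 \rangle$ at rate $\alpha$
 $\langle N_1 - 1, N_2 + 2, N_3 \rangle$ at rate $\beta_1$
 $\langle N_1, N_2 - 1, N_3 + 2 \rangle$ at rate $\beta_2$
 $\langle N_1, N_2, N_3 - 1 \rangle$ at rate $\gamma$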
140Simulation (without aging)
[Figure: simulated N1, N2, N3 and N1+N2+N3 over time.]
141Simulation (with aging)
[Figure: simulated N1, N2, N3 and N1+N2+N3 over time.]
142ODE Model
 Model with three generations
 $dN_1/dt = \alpha(t) - \beta_1 I_{N_1 \ge 1}$
 $dN_2/dt = 2 \beta_1 I_{N_1 \ge 1} - \beta_2 I_{N_2 \ge 1}$
 $dN_3/dt = 2 \beta_2 I_{N_2 \ge 1} - \gamma I_{N_3 \ge 1}$
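The three-generation model can be integrated directly. The sketch below reads the indicator terms as "this generation has at least one cell", takes each division as producing two cells of the next generation, and uses made-up constant rates; `alpha` is passed as a function of time so that a time-varying input could be substituted.

```python
def simulate_generations(n0, alpha, beta1, beta2, gamma, t_end=10.0, dt=0.01):
    """Euler integration of the three-generation ODE model; returns (N1, N2, N3)."""
    n1, n2, n3 = n0
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        # Indicators: a generation only divides or leaves if it holds at least one cell.
        i1 = 1.0 if n1 >= 1 else 0.0
        i2 = 1.0 if n2 >= 1 else 0.0
        i3 = 1.0 if n3 >= 1 else 0.0
        d1 = alpha(t) - beta1 * i1
        d2 = 2.0 * beta1 * i1 - beta2 * i2
        d3 = 2.0 * beta2 * i2 - gamma * i3
        n1, n2, n3 = n1 + dt * d1, n2 + dt * d2, n3 + dt * d3
        t += dt
    return n1, n2, n3


# Hypothetical constant rates; with these every population grows linearly.
result = simulate_generations((5.0, 5.0, 5.0), alpha=lambda t: 2.0,
                              beta1=1.0, beta2=1.5, gamma=2.0)
```

With these rates N1 grows by 1, N2 by 0.5 and N3 by 1 per unit time, so the result after ten time units can be checked by hand.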
143Solution to the ODEs
[Figure: solutions of the ODEs without aging and with aging.]
144Four Generations
[Figure: N1, N2, N3, N4 and N1+N2+N3+N4 over time.]
145More to Come
 Image Processing
 Experiments
 Hypothesis Testing
 Statistical Algorithms
 Narrowed down two possible hypotheses
 Most likely, the combinatorial model is incorrect
 Physical model (Protein gradient vs. Prostheses)
146Stochastic Aggregate Discrete Model
 To introduce stochastic effects in our simulations, we developed a stochastic aggregate model based on a Finite/Hybrid Automata System
 The simulation proceeds through a sequence of transitions, which increment or decrement the number of Stem Cells (Ns) or Committed Progenitors (Np)
 The simulation can be carried out with any Hybrid System Simulation tool, e.g.
  LambdaSHIFT (Simsek, UC Berkeley, 2000)
  Charon (Alur et al., UPenn, 2000)
147Stem Cell Finite/Hybrid-State Automata
Ns = no. of Stem Cells, Np = no. of Progenitor Cells
 Asymmetric Division: Np := Np + 1
 Symmetric Division 2S: Ns := Ns + 1
 Symmetric Division 2P: Np := Np + 2, Ns := Ns - 1
 die: Ns := Ns - 1
 quiescent: no change
148SpatialSim Tool
 The SpatialSim interface allows you to create
specialized Stem Cell population simulations (2D)
149Spatial Simulation Description (contd...)
 2D/3D Spatial Grid (3D grid visualization in development)
 Local Rules (e.g.)
  apoptosis for Stem Cells: $\alpha_s / (1 + \text{Stem Cell Neighbors})$
150Spatial Simulation Description (contd...)
 Other local rules
 Symmetric subdivision
 Asymmetric subdivision
 Migration
 Differentiation
151Spatial Simulation: Gillespie-like Engine
 The simulation engine achieves its efficiency by adopting a standard Poisson process assumption regarding the probability of concurrent events.
 At each simulation step a single cell is randomly chosen and made to evolve
 This is similar to the Gillespie simulations of chemical reactions
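A single-event-per-step engine of this kind can be sketched for the aggregate stem cell counts (Ns stem cells, Np progenitors). This is not the SpatialSim engine: it ignores space, and the event table, the per-cell event rate, and the probabilities passed in are hypothetical values for illustration. Waiting times are drawn from an exponential distribution, as in Gillespie-style simulation.

```python
import random

# Effect of each event on (Ns, Np); an assumed reading of the automaton's transitions.
EVENTS = {
    "asymmetric": (0, +1),     # stem cell -> stem cell + progenitor
    "symmetric_2S": (+1, 0),   # stem cell -> two stem cells
    "symmetric_2P": (-1, +2),  # stem cell -> two progenitors
    "die": (-1, 0),
    "quiescent": (0, 0),
}


def simulate(ns, np_, probs, steps, rate_per_cell=1.0, seed=0):
    """Apply one randomly chosen stem cell event per step; returns (Ns, Np, time)."""
    rng = random.Random(seed)
    names = list(probs)
    weights = [probs[name] for name in names]
    t = 0.0
    for _ in range(steps):
        if ns == 0:
            break  # no stem cells left to act
        # Poisson assumption: time to the next event is exponential in the total rate.
        t += rng.expovariate(rate_per_cell * ns)
        event = rng.choices(names, weights=weights)[0]
        d_ns, d_np = EVENTS[event]
        ns += d_ns
        np_ += d_np
    return ns, np_, t


only_asymmetric = simulate(3, 0, {"asymmetric": 1.0}, steps=50)
mixed = simulate(10, 0, {"asymmetric": 0.5, "symmetric_2S": 0.2,
                         "symmetric_2P": 0.2, "die": 0.1}, steps=200, seed=1)
```

Because only one event fires per step, no two events are ever treated as simultaneous, which is the practical content of the Poisson assumption.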
152People
 Marco Antoniotti
 Sr. Res. Scientist (CS, Courant)
 Simulation System//Simpathica
 Archisman Rudra
 Sr. Res. Scientist (CS, Courant)
 Genome Grammar//Copy Number Analysis
 Raoul Daruwala
 Sr. Res. Scientist (CS, Courant)
 Copy Number Analysis// Learning
 Salvatore Paxia
 Sr. Res. Scientist (CS, Courant)
 Software Environment//Valis
 Vera Cherepinsky
 Sr. Res. Scientist (CS, Courant)
 Software Environment//Valis
 Gilad Lerman
 Sr. Res. Scientist (Mathematics, Courant)
 Multistrip Algorithms// Normalization
 Paolo Barbano
 Saurabh Sinha
 Postdoc (Courant Rockefeller)
 Detecting CIS elements
 Marc Rejali
 Sr. Res. Scientist (CS, Courant)
 Microarray Data Analysis//MAD
 Nadia Ugel
 Jr. Res. Scientist (CS, Courant)
 Simpathica//Stem Cell Models
 Marina Spivak
 Jr. Res. Scientist (Biology CS, Courant)
 Simpathica//Stem Cell Models
 Joe McQuown
 Jr. Res. Scientist (Stat, NYU)
 Statistical Analysis
 Graduate Students
 Joey Zhou (Biology, NYU)
 Bing Sun (Computer Science, NYU)
 Jerry Huang (Biology, NYU)
153Visitors & Collaborators
 VISITORS
 Alberto Policriti
 Computer Science,
 University of Udine, Italy
 Pasquale Caianiello
 Computer Science,
 University of L'Aquila, Italy
 Haim Wolfson
 Computer Science,
 Tel Aviv University, Israel
 Chris Wiggins
 Physics Applied Mathematics
 Columbia University, USA
 Franz Winkler
 Mathematics
 Johannes Kepler University, Austria
 COLLABORATORS
 Mike Wigler, Rob Lucito & Yuri Lazebnik
 Cold Spring Harbor Lab
 Misha Gromov & Ale Carbone
 IHES & Courant
 Amir Pnueli
 Minerva Center & Courant
 Steve Burakoff
 Skirball Institute
 Harel Weinstein, Ravi Iyengar & Bob Desnick
 Mt Sinai School of Medicine
 Sanjoy Mitter & Dimitri Bertsekas
 MIT
 Charles Cantor & Jim Collins
 Boston Univ
 Mike Seul
 Bioarrays
 VISITORS
 Frank Park
 Control Theory
 University of Seoul, S. Korea
 Naomi Silver
 Computer Science
 Marco Isopi
 Applied Mathematics
 Italy
 Ilya Nemenman
 Physics & Neuroscience
 ITP, California
 David Harel
 Computer Science
 Weizmann Institute, Israel
 Carla Piazza
 Computer Science Mathematics
 Università Ca' Foscari di Venezia, Italy
154Blake's Newton
 Newton says "Doubt."
 Aye! that's the way to make all Nature out,
 "Doubt, Doubt & don't believe without experiment"
 William Blake (1757-1827),
 On the Virginity of the Virgin Mary & Johanna Southcott

155The End
 http://www.cs.nyu.edu/mishra
 http://bioinformatics.cat.nyu.edu
 Valis, Gene Grammar, NYU MAD, Cell Simulation
156Other Ongoing Projects
 OPTICAL MAPPING
 Single Molecule Genomics: Optical Mapping, Optical Sequencing & RFLP Haplotyping
 (In collaboration with Univ. Wisc.; funded by NCI)
 Valis: Bioinformatic Environment & Language
 (Funded by DOE & NYSTAR)
 ROMA (Representational Oligonucleotide Microarray Analysis)
 Microarray-based Genome Mapping
 (In collaboration with CSHL; funded by NCI/NIH)
 Expression Data Analysis
 (In collaboration with NYU Biology; funded by NSF & MHHI)
 Cell Informatics
 (Funded by DARPA & Airforce)
157Optical Mapping
158Optical Mapping
 Sizing Error (Bernoulli labeling, absorption cross-section, PSF)
 Partial Digestion
 False Optical Sites
 Orientation
 Spurious molecules, Optical chimerism, Calibration
Image of a restriction enzyme digested YAC clone: YAC clone 6H3, derived from human chromosome 11, digested with the restriction endonucleases Eag I and Mlu I, stained with a fluorochrome and imaged by fluorescence microscopy.
159Optical Mapping: Interplay between Biology and Computation
160Y
 From a gene's point of view, reshuffling is a great restorative
 The Y, in its solitary state, disapproves of such laxity. Apart from small parts near each tip which line up with a shared section of the X, it stands aloof from the great DNA swap. Its genes, such as they are, remain in purdah as the generations succeed. As a result, each Y is a genetic republic, insulated from the outside world. Like most closed societies it becomes both selfish and wasteful. Every lineage evolves an identity of its own which, quite often, collapses under the weight of its own inborn weaknesses.
 Celibacy has ruined man's chromosome.
 Steve Jones, Y: The Descent of Men, 2002.
161Mapping the DAZ locus on Y Chromosome
162Gentig Map: Deinococcus radiodurans
Nhe I map of D. radiodurans generated by Gentig
163E. coli Shotgun Map
164Gentig Maps: Plasmodium falciparum
 A. Gap-free consensus BamHI & NheI maps for all 14 chromosomes.
 B. BamHI map
 C. NheI map
 D. NheI map of Chromosome 3 displayed by ConVEx
165P. falciparum c14 Alignment
166Haplotyping: Output of the RFLP Phasing Algorithm
167Array Mapping
168Measuring distances
 A one-dimensional Buffon's needle problem.
 Take two points on a line, and drop unit-length needles of some color.
 The probability that the two points will have different colors
  monotonically increases with the distance between these two points as the distance increases from 0 to 1
  attains a fixed value for all distances longer than 1.
 One can generalize by considering
  More than two points: P points.
  Dropping a small set of bichromatic needles
[Figure: needles dropped over points p on a line.] Distance ≈ 3/6 = 0.5
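The relation between distance and the chance of being separated by a needle can be checked with a short Monte Carlo experiment. This sketch uses one simple reading of the setup, which is an assumption: among unit-length needles that cover the first point, count the fraction that fail to cover the second. Under that reading the fraction equals the distance up to 1 and is 1 beyond it.

```python
import random


def estimate_distance(d, n_needles=20000, seed=0):
    """Fraction of unit needles covering point x = 0 that do not cover y = d."""
    rng = random.Random(seed)
    x, y = 0.0, d
    separated = 0
    for _ in range(n_needles):
        # Left end uniform on [x - 1, x], so the needle [left, left + 1] covers x.
        left = rng.uniform(x - 1.0, x)
        if not (left <= y <= left + 1.0):
            separated += 1
    return separated / n_needles


near = estimate_distance(0.3)   # two points closer than one needle length
far = estimate_distance(1.5)    # two points further apart than one needle length
```

Points further apart than a needle length are always separated, which is why the estimator saturates and only short distances can be measured this way.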
169The Experiments
 Probes are points
 BACs are needles
 Hybridization on an array simulates dropping the bichromatic needles
[Figure: a High Coverage BAC Library split into cX coverage subsamples (labeled M).]
170Final Estimator
171Given Inferred Probe Positions
172Copy Number
173Amplifications & Deletions
174ROMA: Tumor vs. Normal
 Copy number can be measured by computing the fold changes
 Yellow: Copy number unchanged
 Red: Amplification (more tumor material than normal)
 Green: Deletion (less tumor material than normal)
175BglII Representation (3)
176Copy Number Fluctuation
177Detecting Amplifications & Deletions
178VALIS
179Valis
180Valis Architecture
181Valis Screenshot
182NYU MAD
183Nitrogen Pathway
184NYU MAD
185Data Analysis in NYU MAD
186Shrinkage Estimators
187JSE: James-Stein Estimator
188Simulation
189ROC Curve
190False Positives and Negatives
191Thanks . . .