Computational Discovery of Communicable Knowledge presentation

1
Computational Discovery of Explanatory Process Models
Pat Langley
Computational Learning Laboratory
Center for the Study of Language and Information
Stanford University, Stanford, California
http://cll.stanford.edu/langley
langley@csli.stanford.edu
Thanks to N. Asgharbeygi, K. Arrigo, S. Bay, S.
Dzeroski, A. Pohorille, J. Sanchez, K. Saito,
Oren Shiran, J. Shrager, and L. Todorovski for
their contributions to this research, which is
funded by a grant from the National Science
Foundation.
2
Abductive Model Construction
Most mature sciences focus their efforts not on
discovering laws or forming theories, but on
constructing models that

build upon known laws and theoretical principles
adapt this knowledge to a particular scientific setting
augment the knowledge with auxiliary assumptions
use the resulting model to explain observed phenomena.

This task involves abduction of explanatory
models from domain knowledge, though it may also
have inductive aspects. In this talk, I examine
the construction of explanatory models for
dynamical systems that change over time.
3
Time Series from the Ross Sea Ecosystem
4
Inductive Process Modeling
Our approach is to design and implement
computational methods for inductive process
modeling, which

represent scientific models as sets of quantitative processes
use these models to predict and explain observational data
search a space of process models to find good candidates
utilize background knowledge to constrain this search.

This framework has great potential both for
modeling scientific reasoning and aiding
practicing scientists.
5
A Process Model for an Aquatic Ecosystem
model AquaticEcosystem
  variables phyto, zoo, nitro, residue
  observables phyto, nitro
  process phyto_loss
    equations d[phyto,t,1] = -0.307 * phyto
              d[residue,t,1] = 0.307 * phyto
  process zoo_loss
    equations d[zoo,t,1] = -0.251 * zoo
              d[residue,t,1] = 0.251 * zoo
  process zoo_phyto_grazing
    equations d[zoo,t,1] = 0.615 * 0.495 * zoo
              d[residue,t,1] = 0.385 * 0.495 * zoo
              d[phyto,t,1] = -0.495 * zoo
  process nitro_uptake
    conditions nitro > 0
    equations d[phyto,t,1] = 0.411 * phyto
              d[nitro,t,1] = -0.098 * 0.411 * phyto
  process nitro_remineralization
    equations d[nitro,t,1] = 0.005 * residue
              d[residue,t,1] = -0.005 * residue
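To make the model above concrete, here is a minimal Python sketch that evaluates the net rate of change of each variable at a single state. It assumes that each `d...,t,1` term denotes a first derivative with respect to time, that the residue term of `zoo_loss` is `0.251 * zoo` (by analogy with `phyto_loss`), and that contributions from different processes add. The function names and the example state are hypothetical; only the coefficients come from the slide.

```python
# Each process maps a state (variable name -> value) to its contribution
# to the time derivatives. Coefficients are copied from the slide.

def phyto_loss(s):
    return {"phyto": -0.307 * s["phyto"], "residue": 0.307 * s["phyto"]}

def zoo_loss(s):
    # Assumption: the residue term is 0.251 * zoo, mirroring phyto_loss.
    return {"zoo": -0.251 * s["zoo"], "residue": 0.251 * s["zoo"]}

def zoo_phyto_grazing(s):
    return {
        "zoo": 0.615 * 0.495 * s["zoo"],
        "residue": 0.385 * 0.495 * s["zoo"],
        "phyto": -0.495 * s["zoo"],
    }

def nitro_uptake(s):
    if not s["nitro"] > 0:  # the process is active only while nitro > 0
        return {}
    return {"phyto": 0.411 * s["phyto"], "nitro": -0.098 * 0.411 * s["phyto"]}

def nitro_remineralization(s):
    return {"nitro": 0.005 * s["residue"], "residue": -0.005 * s["residue"]}

PROCESSES = [phyto_loss, zoo_loss, zoo_phyto_grazing,
             nitro_uptake, nitro_remineralization]

def net_derivatives(state):
    """Sum the contributions of all processes for each variable."""
    total = {name: 0.0 for name in state}
    for process in PROCESSES:
        for variable, rate in process(state).items():
            total[variable] += rate
    return total

# Hypothetical state, chosen only for illustration.
state = {"phyto": 1.0, "zoo": 0.5, "nitro": 2.0, "residue": 0.0}
rates = net_derivatives(state)
```

Keeping each process as a separate function mirrors the modularity of the model: a process can be added or removed without touching the others.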
6
Advantages of Quantitative Process Models
Process models offer scientists a promising
framework because

they embed quantitative relations within qualitative structure
they refer to notations and mechanisms familiar to experts
they provide dynamical predictions of changes over time
they offer causal and explanatory accounts of phenomena
they retain the modularity needed for induction/abduction.

Quantitative process models provide an important
alternative to formalisms typically used in
scientific modeling.
7
Generic Processes as Background Knowledge
We cast background knowledge as generic processes
that specify

the variables involved in a process and their types
the parameters appearing in a process and their ranges
the forms of conditions on the process and
the forms of associated equations and their parameters.

Generic processes are building blocks from which
one can compose a specific process model.
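One possible way to hold such a generic process in code is sketched below. This is an illustration rather than the authors' implementation: the `GenericProcess` class, the `instantiate` helper, and the template-string encoding of equations are all assumptions, and the parameter names `tau`, `beta`, and `mu` are ASCII placeholders for the parameter symbols of a nutrient-uptake process.

```python
import math
from dataclasses import dataclass

@dataclass
class GenericProcess:
    name: str
    variables: dict      # role name -> required variable type
    parameters: dict     # parameter name -> (lower, upper) range
    equations: dict      # role name -> right-hand side template
    conditions: tuple = ()

# Hypothetical encoding of a nutrient-uptake process.
NUTRIENT_UPTAKE = GenericProcess(
    name="nutrient_uptake",
    variables={"S": "species", "N": "nutrient"},
    parameters={"tau": (0.0, math.inf), "beta": (0.0, 1.0), "mu": (0.0, 1.0)},
    equations={"S": "mu * {S}", "N": "-1 * beta * mu * {S}"},
    conditions=("{N} > tau",),
)

def instantiate(generic, binding, var_types):
    """Bind each role to a concrete variable, enforcing the type constraints."""
    for role, required in generic.variables.items():
        actual = var_types[binding[role]]
        if actual != required:
            raise TypeError(f"{binding[role]} is a {actual}, not a {required}")
    return {
        "name": generic.name,
        "conditions": [c.format(**binding) for c in generic.conditions],
        "equations": {binding[role]: rhs.format(**binding)
                      for role, rhs in generic.equations.items()},
    }

VAR_TYPES = {"phyto": "species", "nitro": "nutrient"}
uptake = instantiate(NUTRIENT_UPTAKE, {"S": "phyto", "N": "nitro"}, VAR_TYPES)
```

The instantiated process still has free parameters; choosing their values within the stated ranges is a separate fitting step.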
8
Generic Processes for Aquatic Ecosystems
generic process exponential_loss
  variables S{species}, D{detritus}
  parameters α [0, 1]
  equations d[S,t,1] = -1 * α * S
            d[D,t,1] = α * S
generic process remineralization
  variables N{nutrient}, D{detritus}
  parameters π [0, 1]
  equations d[N,t,1] = π * D
            d[D,t,1] = -1 * π * D
generic process grazing
  variables S1{species}, S2{species}, D{detritus}
  parameters ρ [0, 1], γ [0, 1]
  equations d[S1,t,1] = γ * ρ * S1
            d[D,t,1] = (1 - γ) * ρ * S1
            d[S2,t,1] = -1 * ρ * S1
generic process constant_inflow
  variables N{nutrient}
  parameters ν [0, 1]
  equations d[N,t,1] = ν
generic process nutrient_uptake
  variables S{species}, N{nutrient}
  parameters τ [0, ∞], β [0, 1], μ [0, 1]
  conditions N > τ
  equations d[S,t,1] = μ * S
            d[N,t,1] = -1 * β * μ * S
9
Constructing Process Models
[Diagram labels: training data; generic processes; Induction, Abduction; process model]
10
A Method for Process Model Construction
The IPM algorithm constructs explanatory models
from generic components in four stages:
1. Find all ways to instantiate known generic processes with specific variables, subject to type constraints.
2. Combine instantiated processes into candidate generic models, subject to additional constraints (e.g., number of processes).
3. For each generic model, carry out search through parameter space to find good coefficients.
4. Return the parameterized model with the best overall score.
Our typical evaluation metric is squared error,
but we have also explored other measures of
explanatory adequacy.
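The four stages can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the IPM implementation: generic processes are reduced to a tuple of role types, the only structural constraint is a cap on the number of processes, and the `fit` argument stands in for the parameter search of stage 3. All names here are hypothetical.

```python
from itertools import combinations, permutations

# Generic process name -> types of its variable roles, in order.
GENERIC = {
    "exponential_loss": ("species", "detritus"),
    "grazing": ("species", "species", "detritus"),
    "nutrient_uptake": ("species", "nutrient"),
    "remineralization": ("nutrient", "detritus"),
}
VARIABLES = {"phyto": "species", "zoo": "species",
             "nitro": "nutrient", "residue": "detritus"}

def instantiations(generic, variables):
    """Stage 1: every type-consistent binding of distinct variables to roles."""
    result = []
    for name, role_types in generic.items():
        for combo in permutations(variables, len(role_types)):
            if all(variables[v] == t for v, t in zip(combo, role_types)):
                result.append((name, combo))
    return result

def candidate_structures(instances, max_processes):
    """Stage 2: every non-empty set of instances up to a size limit."""
    for size in range(1, max_processes + 1):
        yield from combinations(instances, size)

def best_model(structures, fit):
    """Stages 3 and 4: fit each structure and keep the lowest error.

    `fit` takes a structure and returns (error, parameters).
    """
    best = None
    for structure in structures:
        error, params = fit(structure)
        if best is None or error < best[0]:
            best = (error, structure, params)
    return best

instances = instantiations(GENERIC, VARIABLES)
structures = list(candidate_structures(instances, max_processes=2))
```

Because the number of candidate structures grows combinatorially with the number of instantiated processes, constraints such as the size cap matter a great deal in practice.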
11
Estimating Parameters in Process Models
To estimate the parameters for each generic model
structure, the IPM algorithm
1. Selects random initial values that fall within the ranges specified in the generic processes.
2. Improves these parameters using the Levenberg-Marquardt method until it reaches a local optimum.
3. Generates new candidate values through random jumps along dimensions of the parameter vector and continues the search.
4. Restarts the search from a new random initial point if no improvement occurs after N jumps.
This multi-level method gives reasonable fits to
time-series data from a number of domains, but it
is computationally intensive.
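The multi-level search can be illustrated with a deliberately small Python sketch that fits a single decay rate to synthetic data. Everything here is an assumption made for illustration: the one-parameter model x(t) = exp(-rate * t), the noise-free data, the Gaussian jump size, the cap on restarts, and the hand-written one-dimensional Levenberg-Marquardt step with a finite-difference Jacobian. A real application would fit several parameters at once.

```python
import math
import random

TIMES = list(range(10))
OBSERVED = [math.exp(-0.3 * t) for t in TIMES]  # synthetic, noise-free data

def residuals(rate):
    """Prediction errors of x(t) = exp(-rate * t) against the observations."""
    return [math.exp(-rate * t) - x for t, x in zip(TIMES, OBSERVED)]

def sse(errors):
    return sum(e * e for e in errors)

def clip(value, low, high):
    return min(high, max(low, value))

def levenberg_marquardt(rate, low, high, steps=100, damping=1e-3, h=1e-6):
    """One-parameter Levenberg-Marquardt descent, kept inside [low, high]."""
    error = sse(residuals(rate))
    for _ in range(steps):
        r = residuals(rate)
        jac = [(rh - ri) / h for rh, ri in zip(residuals(rate + h), r)]
        gradient = sum(j * ri for j, ri in zip(jac, r))
        curvature = sum(j * j for j in jac)
        step = gradient / (curvature * (1 + damping) + 1e-12)
        trial = clip(rate - step, low, high)
        trial_error = sse(residuals(trial))
        if trial_error < error:   # accept the step and reduce the damping
            rate, error, damping = trial, trial_error, damping / 10
        else:                     # reject the step and damp harder
            damping *= 10
    return rate, error

def estimate(low, high, max_jumps=5, restarts=3, seed=0):
    rng = random.Random(seed)
    best_rate, best_error = None, math.inf
    for _ in range(restarts):  # each pass starts from a new random point
        rate, error = levenberg_marquardt(rng.uniform(low, high), low, high)
        failures = 0
        while failures < max_jumps:  # random jumps from the local optimum
            start = clip(rate + rng.gauss(0.0, 0.1 * (high - low)), low, high)
            new_rate, new_error = levenberg_marquardt(start, low, high)
            if new_error < error - 1e-12:
                rate, error, failures = new_rate, new_error, 0
            else:
                failures += 1
        if error < best_error:
            best_rate, best_error = rate, error
    return best_rate, best_error

best_rate, best_error = estimate(0.0, 1.0)
```

On this toy problem the error surface has a single minimum, so the jumps and restarts add nothing beyond the first descent; they are there to show the control flow, which matters on error surfaces with many local optima.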
12
Uses of Inductive Process Modeling
aquatic ecosystems
population dynamics
hydrology
biochemical kinetics
13
Intellectual Influences
Our approach to explanatory model construction
draws on ideas from many traditions

computational scientific discovery (e.g., Todorovski, 2003)
methods for causal model abduction (e.g., Zupan et al., 2001)
qualitative physics and simulation (e.g., Forbus, 1984)
languages for scientific simulation (e.g., STELLA, MATLAB).

Our work combines these ideas in novel ways to
support abduction of models that explain the
behavior of dynamical systems.
14
Some Recent Extensions
In recent work, we have extended our approach to
incorporate

heuristic beam search through the space of process models
hierarchical generic processes that further constrain search
an ensemble-like method that mitigates overfitting effects
metrics for explanatory adequacy based on trajectory shapes.

We have also embedded our algorithms in an
interactive software environment for model
construction and revision.
15
End of Presentation
16
Backup Slides
17
Generating Predictions and Explanations
To utilize or evaluate a given process model, we
must simulate its behavior over time

specify initial values for input variables and time step size
on each time step, determine which processes are active
solve active algebraic/differential equations with known values
propagate values and recursively solve other active equations
when multiple processes influence the same variable, assume their effects are additive.

This performance method makes specific
predictions that we can compare to observations.
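A minimal Python sketch of this simulation loop appears below. It covers only differential equations, advanced with fixed-step forward Euler, and omits the algebraic equations and recursive propagation mentioned above. The `simulate` function, the pairing of each process with a condition function, and the initial values are assumptions made for illustration; the two example processes reuse coefficients from the aquatic ecosystem model.

```python
def simulate(processes, initial, dt, steps):
    """Forward-Euler simulation over (condition, effects) process pairs."""
    state = dict(initial)
    trajectory = [dict(state)]
    for _ in range(steps):
        deltas = {variable: 0.0 for variable in state}
        for condition, effects in processes:
            if condition(state):  # determine whether the process is active
                for variable, rate in effects(state).items():
                    deltas[variable] += rate  # influences are additive
        state = {v: value + dt * deltas[v] for v, value in state.items()}
        trajectory.append(dict(state))
    return trajectory

def always(state):
    return True

processes = [
    # Loss of phytoplankton to residue, always active.
    (always,
     lambda s: {"phyto": -0.307 * s["phyto"], "residue": 0.307 * s["phyto"]}),
    # Nutrient uptake, active only while nitro > 0.
    (lambda s: s["nitro"] > 0,
     lambda s: {"phyto": 0.411 * s["phyto"],
                "nitro": -0.098 * 0.411 * s["phyto"]}),
]

# Hypothetical initial values and step size.
initial = {"phyto": 1.0, "nitro": 0.01, "residue": 0.0}
trajectory = simulate(processes, initial, dt=0.1, steps=50)
```

With this step size the nutrient overshoots slightly below zero before the condition switches uptake off, which is an artifact of the fixed-step Euler scheme. After that point only the loss process is active and the phytoplankton concentration declines.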
18
Results on the Ross Sea Ecosystem
19
Results on Protist Predator-Prey System
20
Results on the Ringkøbing Fjord
21
Results on Biochemical Kinetics
[Figure legend: observed trajectories, predicted trajectories]
