Computational Discovery of Communicable Knowledge - PowerPoint PPT Presentation

Slides: 28
Provided by: Lang8
Learn more at: http://www.isle.org
Transcript and Presenter's Notes



1
Computational Discovery of Explanatory Process
Models
Pat Langley
School of Computing and Informatics
Arizona State University, Tempe, Arizona
http://cll.stanford.edu/langley
langley_at_asu.edu
Thanks to N. Asgharbeygi, K. Arrigo, D. Billman,
S. Borrett, W. Bridewell, S. Dzeroski, O.
Shiran, and L. Todorovski for their contributions
to this research, which is funded by a grant
from the National Science Foundation.
2
The Challenge of Systems Science
Disciplines like Earth science and computational
biology differ from traditional fields in that
they
  • focus on synthesis rather than analysis in their
    operation
  • develop system-level models with many variables /
    relations
  • rely on computational methods to aid in their
    construction.

However, the key challenge involves search
through the model space, not running rapid
simulations or handling large data sets.
3
Example: Explain Data from the Ross Sea
4
A Model of the Ross Sea Ecosystem
d[phyto,t,1] = −0.307 × phyto − 0.495 × zoo + 0.411 × phyto
d[zoo,t,1] = −0.251 × zoo + 0.615 × 0.495 × zoo
d[detritus,t,1] = 0.307 × phyto + 0.251 × zoo + 0.385 × 0.495 × zoo − 0.005 × detritus
d[nitro,t,1] = −0.098 × 0.411 × phyto + 0.005 × detritus
Differential equation models of this sort are
regularly used to explain observations and
predict future behavior.
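To make concrete how such a model predicts behavior, the four equations on this slide can be stepped forward numerically. This is a minimal sketch, not the authors' simulator: it assumes forward-Euler integration, and the step size, the initial values, and the function names are illustrative choices that do not come from the presentation.

```python
def derivatives(state):
    """Right-hand sides of the four Ross Sea equations on this slide."""
    phyto, zoo, detritus, nitro = state
    d_phyto = -0.307 * phyto - 0.495 * zoo + 0.411 * phyto
    d_zoo = -0.251 * zoo + 0.615 * 0.495 * zoo
    d_detritus = (0.307 * phyto + 0.251 * zoo
                  + 0.385 * 0.495 * zoo - 0.005 * detritus)
    d_nitro = -0.098 * 0.411 * phyto + 0.005 * detritus
    return (d_phyto, d_zoo, d_detritus, d_nitro)


def simulate(initial, step=0.1, n_steps=100):
    """Forward-Euler integration; returns every visited state in order."""
    state = tuple(initial)
    trajectory = [state]
    for _ in range(n_steps):
        rates = derivatives(state)
        state = tuple(x + step * dx for x, dx in zip(state, rates))
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Hypothetical starting concentrations (phyto, zoo, detritus, nitro).
    for row in simulate((1.0, 0.1, 0.0, 10.0), n_steps=5):
        print(["%.4f" % value for value in row])
```

A fixed-step Euler scheme is the simplest choice and is adequate for illustration; a production simulator would more likely use an adaptive solver.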
5
The Task of Model Construction
Environmental scientists are confronted with a
challenging task:
  • Given: A set of variables of interest to the
    scientist
  • Given: Observations of how these variables change
    over time
  • Find: A model that explains these variations in
    plausible terms and that generalizes well to
    future observations.

Automating such model construction is a natural
task for artificial intelligence and machine
learning. We can develop algorithms that search
the space of differential equation models, but
this space is huge, so we need constraints.
6
Another Account of the Ross Sea Ecosystem
As phytoplankton takes up nitrogen, its
concentration increases and nitrogen decreases.
This continues until the nitrogen supply is
exhausted, which leads to a phytoplankton
die-off. This produces detritus, which gradually
remineralizes to replenish the nitrogen.
Zooplankton grazes on phytoplankton, which slows
the latter's increase and also produces detritus.
d[phyto,t,1] = −0.307 × phyto − 0.495 × zoo + 0.411 × phyto
d[zoo,t,1] = −0.251 × zoo + 0.615 × 0.495 × zoo
d[detritus,t,1] = 0.307 × phyto + 0.251 × zoo + 0.385 × 0.495 × zoo − 0.005 × detritus
d[nitro,t,1] = −0.098 × 0.411 × phyto + 0.005 × detritus
7
Processes in the Ross Sea Ecosystem
Knowledge about candidate processes requires that
some terms occur either together or not at all.
d[phyto,t,1] = −0.307 × phyto − 0.495 × zoo + 0.411 × phyto
d[zoo,t,1] = −0.251 × zoo + 0.615 × 0.495 × zoo
d[detritus,t,1] = 0.307 × phyto + 0.251 × zoo + 0.385 × 0.495 × zoo − 0.005 × detritus
d[nitro,t,1] = −0.098 × 0.411 × phyto + 0.005 × detritus
Here we highlight the terms related to
phytoplankton loss, which decreases phyto
concentration and increases detritus.
8
Processes in the Ross Sea Ecosystem
Here we highlight terms related to zooplankton
grazing, which decreases phyto but increases zoo
and detritus.
d[phyto,t,1] = −0.307 × phyto − 0.495 × zoo + 0.411 × phyto
d[zoo,t,1] = −0.251 × zoo + 0.615 × 0.495 × zoo
d[detritus,t,1] = 0.307 × phyto + 0.251 × zoo + 0.385 × 0.495 × zoo − 0.005 × detritus
d[nitro,t,1] = −0.098 × 0.411 × phyto + 0.005 × detritus
We can use knowledge about processes to
reorganize models and constrain search through
the model space.
9
A Process Model for the Ross Sea
model Ross_Sea_Ecosystem
  variables: phyto, zoo, nitro, detritus
  observables: phyto, nitro
  process phyto_loss
    equations: d[phyto,t,1] = −0.307 × phyto
               d[detritus,t,1] = 0.307 × phyto
  process zoo_loss
    equations: d[zoo,t,1] = −0.251 × zoo
               d[detritus,t,1] = 0.251 × zoo
  process zoo_phyto_grazing
    equations: d[zoo,t,1] = 0.615 × 0.495 × zoo
               d[detritus,t,1] = 0.385 × 0.495 × zoo
               d[phyto,t,1] = −0.495 × zoo
  process nitro_uptake
    equations: d[phyto,t,1] = 0.411 × phyto
               d[nitro,t,1] = −0.098 × 0.411 × phyto
  process nitro_remineralization
    equations: d[nitro,t,1] = 0.005 × detritus
               d[detritus,t,1] = −0.005 × detritus
This model is equivalent to a standard
differential equation model, but it makes
explicit assumptions about which processes are
involved. For completeness, we must also make
assumptions about how to combine influences from
multiple processes.
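One way to see the equivalence is to store each process separately and add up the contributions that different processes make to the same variable. The sketch below assumes that influences combine by simple summation, which is an assumption made here for illustration; the slide only says that some combining rule is needed. The dictionary layout and the name combined_rates are hypothetical.

```python
from collections import defaultdict

# Each process maps a variable to a function that returns that process's
# contribution to the variable's rate of change. Coefficients are the
# ones given in the Ross Sea process model on this slide.
PROCESSES = {
    "phyto_loss": {
        "phyto": lambda s: -0.307 * s["phyto"],
        "detritus": lambda s: 0.307 * s["phyto"],
    },
    "zoo_loss": {
        "zoo": lambda s: -0.251 * s["zoo"],
        "detritus": lambda s: 0.251 * s["zoo"],
    },
    "zoo_phyto_grazing": {
        "zoo": lambda s: 0.615 * 0.495 * s["zoo"],
        "detritus": lambda s: 0.385 * 0.495 * s["zoo"],
        "phyto": lambda s: -0.495 * s["zoo"],
    },
    "nitro_uptake": {
        "phyto": lambda s: 0.411 * s["phyto"],
        "nitro": lambda s: -0.098 * 0.411 * s["phyto"],
    },
    "nitro_remineralization": {
        "nitro": lambda s: 0.005 * s["detritus"],
        "detritus": lambda s: -0.005 * s["detritus"],
    },
}


def combined_rates(state, processes=PROCESSES):
    """Sum every process's contribution to each variable (additive rule)."""
    rates = defaultdict(float)
    for equations in processes.values():
        for variable, contribution in equations.items():
            rates[variable] += contribution(state)
    return dict(rates)


if __name__ == "__main__":
    # Hypothetical state, chosen only to exercise the code.
    print(combined_rates({"phyto": 1.0, "zoo": 2.0,
                          "detritus": 3.0, "nitro": 4.0}))
```

Under the additive assumption, dropping a process from the dictionary removes all of its terms at once, which is the "together or not at all" constraint described earlier.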
10
The Task of Inductive Process Modeling
We can use these ideas to reformulate the
modeling problem:
  • Given: A set of variables of interest to the
    scientist
  • Given: Observations of how these variables change
    over time
  • Given: Background knowledge about plausible
    processes
  • Find: A process model that explains these
    variations and that generalizes well to future
    observations.

We can use background knowledge about candidate
processes to make search much more tractable.
Moreover, the resulting model will be consistent
with this domain knowledge, making it more
comprehensible.
11
Generic Processes as Background Knowledge
We cast background knowledge as generic processes
that specify
  • the variables involved in a process and their
    types
  • the parameters appearing in a process and their
    ranges
  • the forms of conditions on the process and
  • the forms of associated equations and their
    parameters.

Generic processes are building blocks from which
one can compose a specific process model.
12
Generic Processes for Aquatic Ecosystems
generic process exponential_loss
  variables: S{species}, D{detritus}
  parameters: α [0, 1]
  equations: d[S,t,1] = −1 × α × S
             d[D,t,1] = α × S

generic process remineralization
  variables: N{nutrient}, D{detritus}
  parameters: π [0, 1]
  equations: d[N,t,1] = π × D
             d[D,t,1] = −1 × π × D

generic process grazing
  variables: S1{species}, S2{species}, D{detritus}
  parameters: ρ [0, 1], γ [0, 1]
  equations: d[S1,t,1] = γ × ρ × S1
             d[D,t,1] = (1 − γ) × ρ × S1
             d[S2,t,1] = −1 × ρ × S1

generic process constant_inflow
  variables: N{nutrient}
  parameters: ν [0, 1]
  equations: d[N,t,1] = ν

generic process nutrient_uptake
  variables: S{species}, N{nutrient}
  parameters: τ [0, ∞], β [0, 1], μ [0, 1]
  conditions: N > τ
  equations: d[S,t,1] = μ × S
             d[N,t,1] = −1 × β × μ × S
Our current library contains about 20 generic
processes, including ones with alternative
functional forms for loss and grazing processes.
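A generic process can be treated as a typed template whose variable slots are filled by model variables of the matching type. The sketch below is one hypothetical encoding and not IPM's actual representation: the GenericProcess class, the bindings function, the requirement that slots take distinct variables, and the parameter names rho and gamma are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class GenericProcess:
    name: str
    variables: tuple   # (slot, type) pairs, e.g. ("S1", "species")
    parameters: dict   # parameter name -> (low, high) allowed range


# The grazing template: two species slots and one detritus slot.
GRAZING = GenericProcess(
    name="grazing",
    variables=(("S1", "species"), ("S2", "species"), ("D", "detritus")),
    parameters={"rho": (0.0, 1.0), "gamma": (0.0, 1.0)},
)

# Typed variables of the Ross Sea model.
ROSS_SEA_VARIABLES = {
    "phyto": "species",
    "zoo": "species",
    "detritus": "detritus",
    "nitro": "nutrient",
}


def bindings(generic, typed_variables):
    """Yield each assignment of distinct model variables to the slots of
    `generic` in which every variable has the slot's declared type."""
    slots = generic.variables
    for combo in permutations(typed_variables.items(), len(slots)):
        if all(var_type == slot_type
               for (_, var_type), (_, slot_type) in zip(combo, slots)):
            yield {slot: name for (name, _), (slot, _) in zip(combo, slots)}


if __name__ == "__main__":
    for binding in bindings(GRAZING, ROSS_SEA_VARIABLES):
        print(binding)
```

For the four Ross Sea variables this yields two grazing instances, which differ only in which species fills S1 and which fills S2. The type constraints rule out every binding that would, for example, put nitro in a species slot.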
13
Constructing Process Models
[Figure: observations and generic processes are
inputs to heuristic search, which outputs a
process model.]
14
A Method for Process Model Construction
Our initial system, IPM, constructs process
models from generic components in four stages:
  1. Find all ways to instantiate known generic
     processes with specific variables, subject to
     type constraints.
  2. Combine instantiated processes into candidate
     generic models, subject to additional
     constraints (e.g., number of processes).
  3. For each generic model, carry out search
     through parameter space to find good
     coefficients.
  4. Return the parameterized model with the best
     overall score.
Our typical evaluation metric is squared error,
but we have also explored other measures of
explanatory adequacy.
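The stages above can be sketched on a toy problem with a single variable x. This is an illustration under stated assumptions, not IPM itself: stage 1 is taken as already done (the two instantiated processes are written by hand), a coarse grid stands in for whatever parameter search the system uses, simulation is forward Euler, and all names and candidate values are hypothetical.

```python
from itertools import combinations, product

# Stage 1 stand-in: processes already instantiated for the variable x.
# Each entry holds a rate function of (x, parameter) and a grid of
# candidate parameter values.
INSTANTIATED = {
    "exponential_loss": (lambda x, a: -a * x, [0.1, 0.3, 0.5, 0.7, 0.9]),
    "constant_inflow": (lambda x, v: v, [0.1, 0.3, 0.5, 0.7, 0.9]),
}


def simulate(structure, params, x0, n_steps, step=1.0):
    """Forward-Euler simulation; process influences are summed."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        rate = sum(INSTANTIATED[name][0](x, p)
                   for name, p in zip(structure, params))
        xs.append(x + step * rate)
    return xs


def squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def induce_model(observed, max_processes=2):
    """Return (score, structure, params) for the best model found."""
    best = None
    names = sorted(INSTANTIATED)
    # Stage 2: enumerate structures, limited by the number of processes.
    for size in range(1, max_processes + 1):
        for structure in combinations(names, size):
            # Stage 3: search the parameter space (here, a plain grid).
            grids = [INSTANTIATED[name][1] for name in structure]
            for params in product(*grids):
                predicted = simulate(structure, params,
                                     observed[0], len(observed) - 1)
                score = squared_error(predicted, observed)
                if best is None or score < best[0]:
                    best = (score, structure, params)
    # Stage 4: the best-scoring parameterized model.
    return best


if __name__ == "__main__":
    # Synthetic observations generated from a known model.
    data = simulate(("exponential_loss",), (0.3,), 10.0, 8)
    print(induce_model(data))
```

Because the synthetic data come from exponential loss with rate 0.3, which lies on the grid, the search recovers that structure and parameter with zero error. Real observations are noisy and the true parameters are not on any grid, so a continuous optimizer would be needed in practice.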
15
Results on Observations from Ross Sea
We provided IPM with 188 samples of phytoplankton,
nitrate, and ice measures taken from the Ross
Sea. From 2035 distinct model structures, it
found accurate models that limited phyto growth
by the nitrate and the light available. Some
high-ranking models incorporated zooplankton,
whereas others did not.
16
Results with Inductive Process Modeling
[Figure panels: battery behavior, population
dynamics, hydrology, biochemical kinetics]
17
Extensions to Inductive Process Modeling
In recent work, we have extended our system to
incorporate
  • heuristic beam search through the space of
    process models
  • hierarchical generic processes that further
    constrain search
  • an ensemble-like method that mitigates
    overfitting effects
  • an EM-like method that deals with missing
    observations.
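As an illustration of the first extension, the following is a generic beam search over sets of processes. It is a sketch under assumptions, not the system's actual search: the presentation gives no details, so the successor rule (add one process at a time), the beam width, the tie-breaking, and the toy scoring function are all hypothetical. In real use the score would be a model's fit error after parameter estimation.

```python
def beam_search(candidates, score, beam_width=2, max_size=3):
    """Grow process sets one process at a time, keeping only the
    `beam_width` lowest-scoring sets at each level.
    Returns (best_score, best_set)."""
    beam = [frozenset()]
    best = (score(frozenset()), frozenset())
    for _ in range(max_size):
        successors = {s | {c} for s in beam for c in candidates if c not in s}
        if not successors:
            break
        # Sort by score, breaking ties by name so results are repeatable.
        ranked = sorted(successors, key=lambda s: (score(s), sorted(s)))
        beam = ranked[:beam_width]
        top_score = score(beam[0])
        if top_score < best[0]:
            best = (top_score, beam[0])
    return best


if __name__ == "__main__":
    names = ["phyto_loss", "zoo_loss", "zoo_phyto_grazing",
             "nitro_uptake", "nitro_remineralization"]
    # Toy score: how many processes differ from a hidden target set.
    target = frozenset({"nitro_uptake", "phyto_loss"})
    print(beam_search(names, lambda s: len(s ^ target)))
```

Unlike exhaustive enumeration, the beam evaluates only a bounded number of structures per level, at the cost of possibly discarding a set that would have led to the best model.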

This approach has great potential to speed the
construction of scientific models, provided that
domain users adopt it.
18
Interfacing with Scientists
Because few scientists want to be replaced, we
are developing an interactive environment,
PROMETHEUS, that lets users
  • specify a quantitative process model of the
    target system
  • display and edit the model's structure and
    details graphically
  • simulate the model's behavior over time and
    situations
  • compare the model's predicted behavior to
    observations
  • invoke a revision module in response to detected
    anomalies.

The environment offers computational assistance
in forming and evaluating models but lets the
user retain control.
19
Viewing a Process Model Graphically
20
Viewing a Process Model as Equations
21
Adding a Process Manually
22
Requesting Automatic Model Revision
23
Results of Automatic Model Revision
24
Directions for Future Research
Despite our progress to date, we need further
work in order to
  • provide better ways to visualize models, data,
    and their relation
  • offer users more natural ways to define the
    space of models
  • specify constraints on relations among
    entities and processes
  • characterize subsystems that decompose complex
    models
  • incorporate intuitive metrics like match to
    trajectory shape
  • more generally improve the usability of
    PROMETHEUS

Taken together, these will make inductive process
modeling a more robust approach to scientific
model construction.
25
Intellectual Influences
Our approach to aiding scientific model
construction incorporates ideas from many
traditions
  • computational scientific discovery (e.g., Langley
    et al., 1983)
  • theory revision in machine learning (e.g.,
    Towell, 1991)
  • qualitative physics and simulation (e.g., Forbus,
    1984)
  • languages for scientific simulation (e.g.,
    STELLA, MATLAB)
  • interactive tools for data analysis (e.g.,
    Shneiderman, 2001).

Our work combines, in novel ways, insights from
machine learning, AI, programming languages, and
human-computer interaction.
26
Contributions of the Research
In summary, our work on computational model
construction has produced an approach that
  • incorporates a formalism that is familiar to many
    scientists
  • takes into account background knowledge about the
    domain
  • produces meaningful results from small amounts of
    data
  • generates models that explain rather than
    describe observations
  • provides an interactive environment for model
    construction.

We need much more research in computational
systems science that addresses these challenges.
27
End of Presentation