1
From lab books to computational Earth science.
  • Chris Hill, MIT, cnh@mit.edu
  • Edinburgh, July 2007

2
Lab books
A lab notebook is a primary record of research.
Researchers use a lab notebook to document their
hypotheses, experiments and initial analysis or
interpretation of these experiments. The notebook
serves as an organizational tool, a memory aid,
and can also have a role in protecting any
intellectual property that comes from the
research. The guidelines for lab notebooks vary
widely between institutions and between individual
labs, but some guidelines are fairly common. The
lab notebook is usually written in as the
experiments progress, rather than at a later date.
Many say that a lab notebook should be thought of
as a diary of activities that are described in
sufficient detail to allow another scientist to
follow the same steps. To ensure that data cannot
be easily altered, notebooks with permanently
bound pages are often recommended. Researchers
are often encouraged to write only with
unerasable pen, to sign and date each page, and
to have their notebooks inspected periodically by
another scientist who can read and understand it.
All of these guidelines can be useful in proving
exactly when a discovery was made, in the case of
a patent dispute. Several companies now offer
electronic lab notebooks. This format has gained
some popularity, especially in large
pharmaceutical companies, which have large
numbers of researchers and great need to document
their experiments.
(Source: Wikipedia)
3
Lab books
  • Physical, chemical and biological scientists are
    taught lab-book discipline from an early age.
  • Reproducible results are the foundation of
    scientific and engineering disciplines, e.g.
    Michelson-Morley.
  • There is even an infamous Journal of Irreproducible
    Results.
  • In computational science the lab-book
    discipline is not so ubiquitous, maybe because:
      • a program is a formal statement of applied
        mathematical axioms
      • axioms are deterministic
      • therefore reproducibility is not an issue
  • However, a program, i.e. a complex collection of
    simple elemental statements, is hard to
    comprehend. If details are not recorded,
    reproducibility may well be an issue.

4
Some example computational Earth science
experiments.
  • Aqua-planet.
  • Eddying North Atlantic.
  • Global ocean with eddies and seaice.
  • IPCC

5
A simple GFD configuration
  • Some factors that affect the solution:
  • Initial conditions.
  • Atmosphere: clouds, radiation, dynamics, boundary
    layer, temporal and spatial discretization.
  • Seaice: thermodynamics, aging, stress-strain
    relation.
  • Ocean: dynamics, coordinate system,
    vertical/horizontal friction and mixing.
  • Coupling: time stepping, energetics.
  • External forcings: solar insolation, reference
    profiles.

Water-covered planet. Atmosphere-ocean-seaice.
Jean-Michel Campin and David Ferreira
6
An eddying, ocean only configuration
  • Some factors that affect the solution:
  • Initial conditions.
  • Atmosphere fluxes: planetary boundary layer
    scheme.
  • Ocean: dynamics, coordinate system,
    vertical/horizontal friction and mixing.
  • Coupling: time stepping, energetics.
  • External forcings: solar insolation, reference
    profiles, atmospheric reanalysis.
  • Non-linear/turbulent flow, so bitwise
    reproducibility is subject to FP round-off, parallel
    reduction operations etc.

Ocean-only, forced with atmospheric reanalysis
for Jan-Mar.
Red/blue shading: ocean heating/cooling.
Cyan/magenta line: /-17.5°C at 200 m.
Streaks: wind stress.
Green thickness: ocean mixed-layer depth.
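The round-off point on this slide can be illustrated with a minimal sketch. It assumes four made-up partial sums, as might come from four parallel processes, and shows that the order in which a reduction combines them changes the floating-point result. The values and the helper name `reduce_in_order` are illustrative assumptions, not taken from any of the models discussed.

```python
import math

# Four hypothetical partial sums, one per parallel process.
partials = [1.0e16, 1.0, -1.0e16, 1.0]

def reduce_in_order(values, order):
    """Sum values in the given order, mimicking one reduction schedule."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Left-to-right: the first 1.0 is lost when added to 1.0e16.
sum_a = reduce_in_order(partials, [0, 1, 2, 3])

# The two large terms cancel first, so both 1.0 contributions survive.
sum_b = reduce_in_order(partials, [0, 2, 1, 3])

# Correctly rounded reference sum, independent of ordering.
exact = math.fsum(partials)

print(sum_a, sum_b, exact)
```

In a non-linear, turbulent flow a difference of this kind can grow, which is why a bitwise-reproducible rerun may need the reduction order (and hence the processor layout) to be recorded.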
7
Global eddying ocean, sea-ice decadal ensemble.
50 members.
Ensemble perturbations: numerical formulation,
ocean parameters, seaice parameters, initial
conditions, boundary conditions.
8
IPCC ocean ACC transports
Couples atmosphere, ocean, seaice, land,
vegetation, chemistry etc.
Could I make this plot without too much
difficulty? Yes.
Could I rerun an IPCC scenario (possibly with
some parameter change)? No.
Diagnosing these results is possible today
(PCMDI/ESG archives) for the broad scientific
community. Rerunning experiments (with or without
small changes) is still very hard. Factors
affecting the solution range from bottom drag to
land-surface formulation to emissions profiles.
9
Examples summary
  • Way forward
  • A hand record is neither practical nor ideal (i.e.
    not as potentially useful as an electronic record).
  • Electronic information should be stored so as to
    be amenable to machine reasoning.
  • This requires defined vocabularies, precise formal
    structure, pattern matching, rules etc.
  • → W3C/semantic web technologies: XML, RDF, ...
  • In theory, using XML, RDF etc., we could describe
    model systems and enable reruns for
    extra outputs (e.g. transport of S3 by flow) or
    derived runs (e.g. modified air-sea coupling
    coefficient or formulation).
  • In practice this is hard work!
  • To reproduce an experiment, a significant
    quantity of information needs to be stored. It
    spans broad big-picture information
    (water-covered planet, atmosphere-ocean-seaice) to
    minute details (bitwise reproducibility may
    require a record of compiler, OS etc.).
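As a rough illustration of an electronic lab-book entry, the sketch below records both big-picture information and environment details in a machine-readable form. JSON is used here only for brevity (the slides point toward XML/RDF), and the function name `make_run_record`, the field names and the parameter values are all hypothetical.

```python
import json
import platform
import sys
from datetime import datetime, timezone

def make_run_record(experiment, parameters, compiler="unknown"):
    """Build a machine-readable lab-book entry for one model run.

    The layout is an assumption made for illustration, not a standard.
    """
    return {
        "experiment": experiment,
        "recorded_utc": datetime.now(timezone.utc).isoformat(),
        "parameters": dict(parameters),
        "environment": {
            "os": platform.platform(),
            "machine": platform.machine(),
            "python": sys.version.split()[0],
            # The compiler must be supplied by the build system; it
            # cannot be discovered from inside this script.
            "compiler": compiler,
        },
    }

# Hypothetical experiment description and parameter values.
record = make_run_record(
    "aqua-planet",
    {"timestep_s": 1200, "coupling": "atmosphere-ocean-seaice"},
)

# Sorted keys give a stable layout that is easy to diff between runs.
entry = json.dumps(record, indent=2, sort_keys=True)
print(entry)
```

Even this toy record hints at the "hard work" noted above: a real entry would also need source versions, input datasets and the full parameter set.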

10
Baby steps toward a computational Earth science
model repository.
  • What is working today: PCMDI/ESG
  • Steps toward the future: ESC

11
PCMDI
  • Archive of all IPCC model outputs.
  • Stored in common format (netCDF with standard
    metadata).
  • Stored on common mesh. Simplifies things, but
    can/does degrade information and even mislead
    (e.g. conservation in one coordinate system may
    be inexact in another).
  • Very limited model metadata is held.
  • Very successful and technically impressive.
    Societal utility is a func. of model quality!

Schmittner et al (2005, GRL)
12
Earth System Curator (ESC)
Can we (for better or worse!) do for models what
PCMDI does for datasets? PCMDI datasets are data
wrapped in a common/standard container
(netCDF). The PCMDI container is
self-describing. This means we can query and
even combine (to some degree) the PCMDI
datasets. A container analogy for modeling
technology is the component architecture
supported by systems like ESMF.
13
Building a coupled model oriented solution
modeling system as a component tree
  • Some mathematics component M
  • no side-effects
  • possible persistent internal state
  • Supports representation as DAG such that

e.g
14
Example of actual component tree.
  • Tree of components from the GEOS-5 modeling
    system.
  • Each box is an ESMF component.
  • Components adhere to DAG semantics.

Suarez et al.
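One practical consequence of DAG semantics is that a valid processing order always exists and cycles can be detected automatically. The sketch below uses Python's standard `graphlib` module on a made-up component tree. The component names are hypothetical and are not the GEOS-5 tree; the leaves-first ordering is just one possible use, for example initialising children before the parent that couples them.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical component tree: each key lists the components it contains.
depends_on = {
    "coupled_model": {"atmosphere", "ocean"},
    "atmosphere": {"dynamics", "physics"},
    "ocean": set(),
    "dynamics": set(),
    "physics": set(),
}

def run_order(graph):
    """Return an order in which each component follows its children.

    Raises graphlib.CycleError if the graph is not a DAG.
    """
    return list(TopologicalSorter(graph).static_order())

order = run_order(depends_on)
print(order)

# A cyclic description is rejected rather than silently accepted.
try:
    run_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```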
15
Individual components in ESMF
  • ESC builds on an ESMF-like component model.
  • ESMF Component: container for a sequence of
    computation that implements a particular algorithm
    (a physics simulation, e.g. a Navier-Stokes solver,
    or a technical function, e.g. a history manager).
    An ESMF component exposes its external interfaces
    through an ESMF state.
  • ESMF State: container data type to transport data
    between components.
  • ESMF Field: container data type that can be used to
    push/pop n-dimensional data with an associated mesh
    from an ESMF State.
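The relationship between component, state and field can be sketched in a few lines of plain Python. This is only an analogy under stated assumptions: the classes `Field`, `State` and `DoublingComponent` are hypothetical and are not the ESMF API, which is a Fortran/C library with its own interfaces.

```python
from dataclasses import dataclass, field as dc_field

@dataclass
class Field:
    """N-dimensional data plus a label for the mesh it lives on."""
    name: str
    mesh: str
    data: list

@dataclass
class State:
    """Container used to carry fields between components."""
    fields: dict = dc_field(default_factory=dict)

    def push(self, fld):
        self.fields[fld.name] = fld

    def pop(self, name):
        return self.fields.pop(name)

class DoublingComponent:
    """Toy component: reads 'sst' from its import state, writes 'sst_x2'."""

    def run(self, import_state, export_state):
        sst = import_state.pop("sst")
        doubled = [2 * v for v in sst.data]
        export_state.push(Field("sst_x2", sst.mesh, doubled))

import_state, export_state = State(), State()
import_state.push(Field("sst", "latlon_1deg", [1.0, 2.5]))
DoublingComponent().run(import_state, export_state)
print(export_state.fields["sst_x2"])
```

The component touches nothing except the two states it is handed, which is what makes its external interface describable.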

16
Given a component model, like the ESMF paradigm,
ESC
  • Describes a component in terms of:
      • parameters that control the computation
        sequence.
      • states and fields that are passed into/out of
        the component.
  • Provides two levels of description: potential and
    specific.
      • Potential is a list of all possible parameters
        and fields. It is a virtualized description in
        that it is not describing a specific instance.
      • Specific is a description of an instantiated
        component in which parameters are bound to
        specific values and fields and states are bound
        to specific values.
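The potential/specific distinction can be sketched as a description with unbound parameters plus a binding step that checks values against it. The parameter names, types and values below are hypothetical; a real ESC description would be expressed in XML schema rather than a Python dictionary.

```python
# Potential description: every parameter the component could accept,
# with its expected type. No values are bound yet.
POTENTIAL = {
    "timestep_s": float,
    "vertical_mixing": str,
    "n_levels": int,
}

def instantiate(potential, bindings):
    """Bind values to a potential description, giving a specific one.

    Rejects unknown parameters, unbound parameters and wrongly typed values.
    """
    unknown = set(bindings) - set(potential)
    missing = set(potential) - set(bindings)
    if unknown or missing:
        raise ValueError(
            f"unknown={sorted(unknown)} missing={sorted(missing)}"
        )
    for name, value in bindings.items():
        if not isinstance(value, potential[name]):
            raise TypeError(f"{name} should be {potential[name].__name__}")
    return dict(bindings)

# Specific description: every parameter bound to a concrete value.
specific = instantiate(
    POTENTIAL,
    {"timestep_s": 900.0, "vertical_mixing": "kpp", "n_levels": 50},
)
print(specific)
```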

17
ESC component descriptions are in terms of XML
schema.
  • Curator-NMM: describes numerical model parameters,
    e.g. timestep, system requirements.
  • Gridspec: describes the numerical mesh.
  • Curator-CIAO: describes component inputs and
    outputs.
  • Curator-complete: describes the wiring together of
    components.
  • A coupled component is also a component, i.e. the
    schema is recursive.
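A recursive description means the same element type can nest inside itself, so one routine can traverse a coupled system of any depth. The sketch below assumes a made-up XML layout; the element name `component` and the attribute `name` are illustrative and are not taken from the Curator schemas.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names; not the Curator schema itself.
DOC = """
<component name="coupled">
  <component name="atmosphere">
    <component name="dynamics"/>
    <component name="physics"/>
  </component>
  <component name="ocean"/>
</component>
"""

def walk(elem, depth=0):
    """Yield (depth, name) for a component and, recursively, its children."""
    yield depth, elem.get("name")
    for child in elem.findall("component"):
        yield from walk(child, depth + 1)

tree = list(walk(ET.fromstring(DOC)))
for depth, name in tree:
    print("  " * depth + name)
```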

Some details (more at
http://www.earthsystemcurator.org) ..
18
Curator-NMM
  • The Curator-NMM schema describes model
    components, their content, and their
    connections.  It is a superset of the NMM
    schema.  The main constructs in the Curator-NMM
    schema are component, potential model, and
    model.  Components are "composable" pieces of
    code that can be coupled together in various
    arrangements to form different models.  A
    potential model consists of a group of
    components, and describes the set of possible
    models that can be built from those components. 
    A model is a fully specified application based on
    a potential model and configuration choices. 

19
Curator-NMM
20
Mosaic Grid Specification
  • The Mosaic Grid Specification is a standardized
    description of multi-patch, structured grids being
    developed in coordination with CF activities.

21
Mosaic Grid Specification
22
Component-component compatibility checking.
  • ESC can describe coupled (multi-component)
    systems.
  • In principle ESC could support recombination of
    components from coupled systems e.g. couple
    component A (atmosphere dynamics) with component
    B (land-surface).
  • Ideally, for this, compatibility constraints need
    to be expressed in a standard way.
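Once inputs and outputs are described in a standard way, a first-pass compatibility check becomes a simple comparison. The sketch below assumes each interface is a mapping from field name to (units, grid); the field names, units and grid labels are hypothetical, and a real check would also need to cover time stepping, conventions and more.

```python
# Hypothetical interface descriptions: field name -> (units, grid).
atmosphere_exports = {
    "precipitation": ("kg m-2 s-1", "atm_grid"),
    "surface_air_temperature": ("K", "atm_grid"),
}
land_imports = {
    "precipitation": ("kg m-2 s-1", "atm_grid"),
    "surface_air_temperature": ("degC", "atm_grid"),
    "downward_shortwave": ("W m-2", "atm_grid"),
}

def check_compatibility(exports, imports):
    """List the reasons why exports cannot satisfy imports (empty if none)."""
    problems = []
    for name, required in sorted(imports.items()):
        if name not in exports:
            problems.append(f"{name}: not provided")
        elif exports[name] != required:
            problems.append(f"{name}: {exports[name]} != {required}")
    return problems

problems = check_compatibility(atmosphere_exports, land_imports)
for line in problems:
    print(line)
```

Here the check reports one missing field and one units mismatch, the kind of constraint violation that is hard to spot by reading source code.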

23
Service architectures
  • Standards → services
  • Developing standardized descriptions is a
    well-proven method toward a service oriented
    approach e.g.

24
Some useful URLs (an incomplete list)
Component models:
  • http://www.esmf.ucar.edu
  • http://maplcode.org
Metadata standards:
  • http://www.earthsystemcurator.org
  • http://ncas-cms.nerc.ac.uk/NMM/
  • http://www.earthsystemgrid.org/
  • http://www.cgd.ucar.edu/cms/eaton/cf-metadata/
  • http://sbml.org/index.psp
  • http://cml.sourceforge.net/wiki/index.php/Main_Page
  • http://www.w3.org/
25
Summary
  • The Earth System Curator project is an activity
    developing schema and tools to capture semantic
    information about models.
  • Such information provides a basis for formally
    recording numerical experiments: a computational
    Earth science lab book.
  • It also provides the basis for a formal approach
    to reproducible numerical results: fewer Journal
    of Irreproducible Results candidates.
  • Other efforts: SBML (systems biology), CML
    (chemistry) - already uploads to Science
    submissions.
  • Maybe soon a computational Earth science
    challenge will become how to stop people doing
    dumb things with easy-to-use modeling services,
    rather than how to get people to use obtuse
    legacy modeling systems. Maybe!

26
ESC collaboration
  • NCAR (Cecelia DeLuca, Julien Chastang), MIT
    (Chris Hill, Constantinos Evangelinos), Georgia
    Tech (Spencer Rugaber, Rocky Dunlap, Angela),
    GFDL (Balaji, Sergey), Reading UK (Lois
    Steenman-Clark, Katherine Boughton), PRISM
    (Sophie Valcke).