Title: Embedding Live Access Server into GFDL Data Portal Infrastructure
1. Embedding Live Access Server into GFDL Data Portal Infrastructure
The 6th GO-ESSP Workshop June 11-13 2007
Jussieu, Paris
- K. O'Brien (PMEL), S. Nikonov (GFDL), R. Schweitzer (PMEL), S. Hankin (PMEL), V. Balaji (GFDL)
2. Outline
- Curator metadata DB, part of the GFDL FMS Runtime Environment (FRE), is a centralized metadata store for the entire modeling process.
- Live Access Server (LAS) is an important component of the GFDL Data Portal.
- FRE -> Curator DB -> LAS metadata stream.
- Benefits of the symbiosis.
3. Live Access Server + Curator = LASurator
(not LACerater!)
- Symbiotic, self-configuring system combining LAS and the Curator metadata DB.
- Part of FRE
- Essential part of the GFDL Data Portal
4. Curator DB is an important part of FRE
- The technical part of the modeling process consists of three important parts: assembling the model, configuring it, and running the simulation.
- Two ways of conducting these stages:
  - write, once and forever, sophisticated universal scripts with many input parameters needed to run them;
  - write tools that generate disposable scripts with the configuration parameters needed for the simulation hardcoded in them.
- FRE was designed using the second way, sparing scientists the long road of configuring a model experiment every time. The 1st version offers an XML metadata file as the user interface for configuring the model and experiment.
- The 2nd version will give scientists convenience and automation in controlling the model-building process through an interface, based on the centralized Curator DB storage, that is more user-friendly than an XML file.
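The second approach above, a tool that generates a disposable script with the configuration hardcoded, can be illustrated with a minimal Python sketch. The XML layout, the element and attribute names, and the `run_model` command are hypothetical stand-ins chosen for illustration; they are not FRE's actual schema or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical experiment description; NOT the real FRE XML schema.
EXPERIMENT_XML = """
<experiment name="demo_run">
  <param name="days" value="30"/>
  <param name="npes" value="16"/>
</experiment>
"""


def generate_script(xml_text: str) -> str:
    """Return a throwaway shell script with every parameter hardcoded."""
    root = ET.fromstring(xml_text)
    lines = ["#!/bin/sh", f"# Generated for experiment {root.get('name')}"]
    for param in root.findall("param"):
        # Each XML parameter becomes a fixed assignment in the script.
        lines.append(f"{param.get('name').upper()}={param.get('value')}")
    # 'run_model' is a placeholder command, assumed for illustration.
    lines.append("run_model --days $DAYS --npes $NPES")
    return "\n".join(lines) + "\n"


script = generate_script(EXPERIMENT_XML)
print(script)
```

The generated script is meant to be discarded after the run; changing the experiment means editing the description and regenerating, not editing the script.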
5. Modes of working with Curator DB within FRE
- Research mode (component oriented): the modeler introduces new physical processes / parameterizations / algorithmizations / components from newly developed modules. New entities have to be described in the database.
- Production mode (simulation oriented): the experimenter composes a coupled model from available components described in the database, builds a scenario and a postprocessing plan, and runs the experiment. All this activity is recorded in the database.
- A thoroughly elaborated, very friendly GUI is critical for these modes; otherwise users will avoid the database load stage -> the DB will be empty.
- Automatic mode: applications write metadata into the database, harvesting it from model output data files, or query it from the DB during data searching.
- Publishing mode: metadata is extracted from the DB by publishing tools for describing public data.
6. Curator DB at different stages of the modeling process
[Diagram labels: Metadata Curator DB; Component Building; Model Assembling; Experiment Preparation; Postprocessing; FMS Runtime Environment (currently developed version); Data Portal Service; "in development currently".]
7. Curator DB Design Compartments
- Process Domains, Physical Process: decomposition of physical reality into homogeneous domains, with descriptions of the theoretical approaches to the processes considered for modeling there.
- Algorithmization: describes program modules of elementary physical processes.
- Composition: components, couplers, drivers; the technical environment needed for assembling the model as a computer application.
- Simulation: metadata on simulations and model output data.
- Publishing: all metadata on publicly available data, including descriptions needed for Data Portal software (OPeNDAP, LAS).
8.
- Curator database
  - Contains a large amount of metadata needed for the entire modeling process
  - Well-ordered information architecture
- Live Access Server
  - Configurable metadata access
  - Configurable data browse/access
9.
- The GFDL IPCC hierarchy, for example:
  - IPCC (Project)
  - GFDL (Institution)
  - CM2.0 (Model number)
  - Climate of the 20th Century (Scenario)
  - Realization 1 (Which realization)
  - Run 1 (Which run)
  - 3 hourly data (Temporal domain)
  - Jan 1991 - Dec 2000 (Dates of on-line data)
10. Metadata Tree
[Tree diagram. Labels by level: GFDL; models CM2.1 and CM2.0; scenarios "1%/year CO2 increase scenario (to quadrupling)" and "Climate of the 20th Century" (listed twice, once per model); Run 1 and Run 2 (listed twice); annual; date ranges Jan 2046 - Dec 2050, Jan 2096 - Dec 2100, Jan 2146 - Dec 2150, Jan 2196 - Dec 2200; variables Precipitation, Surface Air Temperature, Sea Level Pressure, Surface Latent Heat Flux.]
11.
- Challenge
  - How to configure LAS to allow efficient user interaction while reflecting the full variety of available data
  - How to minimize work for the LAS installer
- Solution
  - Use Curator DB to pull together pertinent metadata on experiments
  - Modify the LAS utility addXML to read metadata from a MySQL database rather than from individual or aggregated files
  - Create generic Velocity templates to present it in friendly HTML for users
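As a rough illustration of the solution above (reading metadata from a database and emitting XML configuration instead of scanning files), here is a minimal Python sketch. It makes several assumptions: an in-memory SQLite database stands in for the MySQL Curator DB, the `aggregations` table and its columns are hypothetical, the URLs are placeholders, and the XML element names are invented for illustration rather than taken from the LAS configuration format or from addXML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory SQLite stands in for the MySQL Curator DB (assumption).
conn = sqlite3.connect(":memory:")
# Hypothetical table layout; the real Curator schema is not shown here.
conn.execute(
    "CREATE TABLE aggregations (experiment TEXT, variable TEXT, url TEXT)"
)
conn.executemany(
    "INSERT INTO aggregations VALUES (?, ?, ?)",
    [
        ("Climate of the 20th Century", "Precipitation",
         "http://example.org/agg/precip"),
        ("Climate of the 20th Century", "Sea Level Pressure",
         "http://example.org/agg/slp"),
    ],
)


def build_config(connection) -> ET.Element:
    """Build one <dataset> element per experiment from database rows."""
    root = ET.Element("datasets")
    rows = connection.execute(
        "SELECT experiment, variable, url FROM aggregations "
        "ORDER BY experiment, variable"
    )
    by_experiment = {}
    for experiment, variable, url in rows:
        dataset = by_experiment.get(experiment)
        if dataset is None:
            # First row for this experiment: create its dataset element.
            dataset = ET.SubElement(root, "dataset", name=experiment)
            by_experiment[experiment] = dataset
        ET.SubElement(dataset, "variable", name=variable, url=url)
    return root


config = build_config(conn)
print(ET.tostring(config, encoding="unicode"))
```

The point of the sketch is only the data flow: one query replaces opening and inspecting each data file.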
12. Simple but effective architecture
[Architecture diagram labels: metadata Extractor / XMLGenerator; metadata; addXML; Aggregations URLs; Categories.]
13. Behind the scenes
14. Tools populating Curator
- Populating Curator DB is automatic.
- One piece of software scans data storage, following the list of public experiments from Curator DB. Traversing the data files, it extracts metadata and writes it to Curator. In the next version of FRE, metadata will populate the DB automatically over the course of all stages, from assembling the model through running the experiment.
- Another tool analyzes metadata from Curator and creates aggregation records according to the aggregation criterion. Currently, Experiment is used as the sorting criterion for the upper-level category.
- The last stage is generating THREDDS configuration XML files based on the prepared metadata in Curator.
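The aggregation step described above can be sketched in a few lines of Python. The record fields (`experiment`, `variable`, `start`, `end`) and the sample values are hypothetical; they only illustrate grouping harvested per-file metadata by a chosen criterion, not the actual Curator tools.

```python
from collections import defaultdict

# Hypothetical per-file metadata, as a storage scanner might harvest it.
files = [
    {"experiment": "Run 1", "variable": "Precipitation",
     "start": 2046, "end": 2050},
    {"experiment": "Run 1", "variable": "Surface Air Temperature",
     "start": 2096, "end": 2100},
    {"experiment": "Run 2", "variable": "Sea Level Pressure",
     "start": 2146, "end": 2150},
]


def aggregate(records, criterion="experiment"):
    """Group file records by one criterion and summarize each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[criterion]].append(rec)
    result = []
    for key in sorted(groups):
        members = groups[key]
        result.append({
            "criterion": criterion,
            "value": key,
            # The aggregation spans the earliest start to the latest end.
            "start": min(r["start"] for r in members),
            "end": max(r["end"] for r in members),
            "variables": sorted({r["variable"] for r in members}),
        })
    return result


aggregations = aggregate(files)
for agg in aggregations:
    print(agg)
```

Passing a different `criterion` (for example `"variable"`) regroups the same records without rescanning the files.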
15. Interaction FRE -> Curator -> LAS (final goal)
[Diagram labels: FMS; GUI; metadata; LAS; Config data; "in development currently".]
16. LAS-oriented Curator DB design features
- Created specific tables for:
  - Inventorying metadata about time spans and variables of available data
  - Describing projections in detail
  - Storing THREDDS aggregation descriptions, including such fields as type of averaging, domain of variable (atmos, land, ...), time limits, URL, aggregation criterion.
- Dynamic hierarchy: implemented flexible criteria for the category hierarchy, giving users freedom to choose their own interface layout preferences.
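One way to picture the dynamic hierarchy is a function that builds the category tree from flat records, with the order of levels supplied by the user. This is a minimal sketch under assumed field names (`model`, `scenario`, `run`); it does not reproduce the Curator tables or the LAS category format.

```python
def build_tree(records, levels):
    """Nest records into dictionaries, one level per requested criterion."""
    tree = {}
    for rec in records:
        node = tree
        for level in levels:
            # Descend into (or create) the branch for this record's value.
            node = node.setdefault(rec[level], {})
    return tree


# Hypothetical records; values borrowed from the hierarchy examples.
records = [
    {"model": "CM2.0", "scenario": "Climate of the 20th Century",
     "run": "Run 1"},
    {"model": "CM2.0", "scenario": "Climate of the 20th Century",
     "run": "Run 2"},
    {"model": "CM2.1", "scenario": "Climate of the 20th Century",
     "run": "Run 1"},
]

# Same data, two layouts: only the order of criteria changes.
by_model = build_tree(records, ["model", "scenario", "run"])
by_scenario = build_tree(records, ["scenario", "model", "run"])
print(by_model)
print(by_scenario)
```

Because the layout is just a list of criteria, a different user preference needs no change to the stored metadata.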
17. Curator DB table samples
18. Future development: Discovery/Navigation Interface
Imagine a search interface with Ajax talking to Curator DB, or to a machine-oriented auxiliary DB automatically designed from an adopted ontology and populated by RDF triples from the main metadata DB, Curator (as proposed and explored in the ESG project).
As the user constrains the search, output and menu lists adjust immediately.
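The narrowing behavior described above can be sketched as a small server-side function: given the user's current constraints, it returns the matching records together with the values still available for every other menu. The record fields and sample values are hypothetical, and the Ajax layer, ontology, and RDF store are left out.

```python
def search(records, **constraints):
    """Return matching records and the remaining choices for each menu."""
    matches = [
        rec for rec in records
        if all(rec.get(key) == value for key, value in constraints.items())
    ]
    menus = {}
    for rec in matches:
        for key, value in rec.items():
            # Only fields the user has not fixed yet remain as menus.
            if key not in constraints:
                menus.setdefault(key, set()).add(value)
    return matches, {key: sorted(values) for key, values in menus.items()}


# Hypothetical catalog entries for illustration.
catalog = [
    {"model": "CM2.0", "scenario": "Climate of the 20th Century",
     "variable": "Precipitation"},
    {"model": "CM2.0", "scenario": "1%/year CO2 increase",
     "variable": "Surface Air Temperature"},
    {"model": "CM2.1", "scenario": "Climate of the 20th Century",
     "variable": "Sea Level Pressure"},
]

matches, menus = search(catalog, model="CM2.0")
print(len(matches), menus)
```

Each additional constraint shrinks both the result list and the menus, which is what lets the interface update as the user clicks.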
21.
[Diagram labels: Information Products; Desktop: Matlab, IDL, IDV, Ferret, GrADS, ...; Files: netCDF, binary, spreadsheet, GIS layer, ...]
22. Benefits
The 5th GO-ESSP Workshop June 19-21 2006, LLNL
- Curator DB contains extensive metadata (aggregations, gridspec descriptions, time spans of data availability).
- LAS configuration can be pulled directly from Curator DB.
- Simplifies the configuration of the complicated information hierarchy for the LAS installer (using LAS categories).
- Simplifies the complex mosaic of datasets presented to the LAS user.
- The LAS addXML tool works faster.
24. Compartment Structure of Curator Database
[Diagram labels, in extraction order: Domains, Atmosphere, Ocean, Ice, Land, Surf_BoundaryLayer, Rivers, Lakes, Dynamics, ProcCodeBase, Convection, NumArtificies, Radiation, Components, GridSpecs, IceProc, Projects, CmpPMIOD, BoundCond, BiotaProc, Experiments, CmpDrivers, TracerModels, Hydrology, Realizations, CouplModels, NameLists, CloudProc, InitCond, Services, DataSets, Chemistry, DomConstituents, Scenarios, Versioning, ..., PostProc, Compiling, Variables, Others, PlatformEnv, Fields/Files, Aggregations.]