Title: Collaborative study of GENIEfy Earth System Models using Scripted Database Workflows in a Grid-enabled PSE
1 Collaborative study of GENIEfy Earth System Models using Scripted Database Workflows in a Grid-enabled PSE
- UK e-Science All Hands Meeting 2006
- 21st September 2006
- Andrew Price
- Southampton Regional e-Science Centre
2 The GENIE / GENIEfy Team
- Principal Investigator - GENIEfy
- Tim Lenton UEA Norwich
- Research Team and Collaborators
- James Annan FRSGC, Japan
- Chris Armstrong Manchester
- Chris Brockwell UEA Norwich
- David Cameron CEH Edinburgh
- Peter Cox Hadley Centre (UKMO)
- Neil Edwards Open University
- Sudipta Goswami UEA Norwich
- Robin Hankin NOC
- Julia Hargreaves FRSGC, Japan
- Phil Harris CEH Wallingford
- Zhuoan Jiao Southampton e-Science Centre
- Eleftheria Katsiri London e-Science Centre
- Valerie Livina UEA Norwich
- Dan Lunt Bristol
- Richard Myerscough NOC
- Principal Investigator - GENIE
- Paul Valdes Bristol
- Co-Investigators / Management team
- Peter Challenor NOC
- Trevor Cooper-Chadwick Southampton e-Sci. Centre
- Simon Cox Southampton e-Sci. Centre
- John Darlington London e-Science Centre
- Rupert Ford Manchester
- Eric Guilyardi Reading
- John Gurd Manchester
- Richard Harding CEH Wallingford
- Robert Marsh NOC
- Tony Payne Bristol
- Graham Riley Manchester
- John Shepherd NOC
- Rachel Warren UEA Norwich
- Andrew Watson UEA Norwich
3 Outline
- Introduction
- Scientific Challenge
- Grid-enabled Problem Solving Environment
- Model Tuning
- Collaborative study of THC bi-stability
- Results
- Conclusions
4 GENIE / GENIEfy
GENIEfy: Grid ENabled Integrated Earth System Model for the Community
- The GENIE project has developed a Grid-based system to
- Flexibly couple together state-of-the-art components to form a unified Earth system model
- Execute the resulting model on the Grid
- Share the distributed data produced in simulations
- Provide high-level open access to the system, creating and supporting virtual organisations of Earth system modellers
5 Scientific Aims
- Orbital parameters affect incident radiation and climate
- Biological and geological processes interact with, and feed back upon, the climate (via, for instance, CO2)
6 The target GENIE Model
7 Flexible modelling framework
- Modularity
- Swappable components throughout
- e.g. Atmosphere: 2D Energy-Moisture Balance Model or 3D Intermediate GCM
- Scalability
- Variable resolution
- e.g. Ocean: 18x18, 36x36 or 72x72 horizontal grid, 8-32 depth layers
- Traceability
- Common physics when resolution is varied
- Where a process is not resolved, parameterise it based on a resolution that does resolve it reasonably
8 Thermohaline Circulation
- The world's oceans transport heat through the global conveyor belt
- The strength of the overturning circulation is sensitive to the global hydrological cycle
9 GENIE-1 Configuration
2D atmosphere EMBM (36 x 36)
2D slab sea-ice
3D ocean (36 x 36 x 8)
2D land surface
10 Surface freshwater flux correction
- Three zones where the Atlantic-to-Pacific (A-P) flux (Fa) is applied, indicating default values (from Oort 1983)
11 Bi-Stability of the Thermohaline Circulation (THC)
- Single point in model parameter space, sensitive to initial conditions
- How close are we to collapse of the thermohaline circulation?
[Figure: THC ON and OFF states. Panels: Atlantic Meridional Overturning Circulation, MOC (Sv); Annual Average Air Temperature Difference (K) between the ON and OFF states]
Marsh, R. J. et al. (2004) Climate Dynamics
12 GENIE-2 Configuration
- 1000 model years
- GENIE-1: 1-2 hours CPU time
- GENIE-2: 4-5 days CPU time
3D atmosphere IGCM (64 x 32 x 7), replacing the 2D atmosphere EMBM (36 x 36)
2D slab sea-ice
3D ocean (36 x 36 x 8), (64 x 32 x 8) or (72 x 72 x 16)
2D land surface
13 Geodise Toolboxes
- Geodise Compute Toolbox
- Grid access from the Desktop
- Matlab and Jython interfaces
- Globus and Condor support
- Geodise Database Toolbox
- Associate metadata with data
- Programmatic and GUI access
- OptionsMatlab
- Engineering Design Optimisation
- Suite of multi-dimensional optimisation algorithms
14 Grid Computation
CCS Gateway
GRAM Gateway
Institutional Resources (GT2)
National Grid Service (GT2)
Condor Pool
Microsoft Compute Cluster Server
Southampton Condor Pool
15 Need Metadata
16 Data Management System
17 Multi-Objective Optimisation
- Single objective function
- Weighted sum of (model - observation) RMS differences
- Some objectives can be improved at the expense of others
- Little improvement in the precipitation and evaporation fields
- Multi-objective optimisation
- Employ a Pareto Front to optimise multiple objectives
- Implementation of the Non-dominated Sorting Genetic Algorithm (NSGA-II, Deb (2002))
- 3 objective functions
- Weighted sum of the RMS differences between seasonal averages of model fields and equivalent observational data
- OBJ1 (sensible heat, latent heat, net solar, net long)
- OBJ2 (precipitation rate, evaporation)
- OBJ3 (wind stress_x, wind stress_y)
- IGCM problem definition
- 32 free parameters (TXBLCNST TYBLCNST)
- 2 constraints on the parameters
- HUMCLOUDMAX > HUMCLOUDMIN
- SNOLOOK2 > ALBEDO_ICESHEET
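The objective and the Pareto idea on this slide can be illustrated with a minimal Python sketch. It is not the GENIE or OptionsMatlab implementation: the function names, the dictionary layout of the fields and the sample numbers are all assumptions made for illustration, and only the non-dominated filtering step of NSGA-II is shown, not the full genetic algorithm.

```python
import math


def weighted_rms(model, obs, weights):
    """Weighted sum of per-field RMS (model - observation) differences.

    model, obs: dicts mapping a field name to a list of values (hypothetical layout).
    weights: dict mapping the same field names to a weight.
    """
    total = 0.0
    for name, w in weights.items():
        m, o = model[name], obs[name]
        rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(m, o)) / len(m))
        total += w * rms
    return total


def dominates(a, b):
    """True if objective vector a is no worse than b in every objective
    and strictly better in at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def feasible(params, ordered_pairs):
    """Check 'greater than' constraints such as HUMCLOUDMAX > HUMCLOUDMIN."""
    return all(params[hi] > params[lo] for hi, lo in ordered_pairs)


if __name__ == "__main__":
    # Illustrative (OBJ1, OBJ2, OBJ3) values only; not results from the study.
    candidates = [(1.0, 2.0, 3.0), (2.0, 3.0, 4.0), (0.5, 5.0, 1.0)]
    print(pareto_front(candidates))
```

In this sketch the second candidate is worse than the first in all three objectives, so it is dropped, while the third survives because it trades a worse OBJ2 for better OBJ1 and OBJ3. That trade-off is what a single weighted-sum objective cannot expose.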
18 Pareto Front Progression
- 50 generations of the NSGA-II algorithm
19 Multi-objective Optimisation
- 5000 model invocations
- Southampton University Condor pool
- Iridis2 Compute Cluster
- National Grid Service
- Pareto Front driven towards origin
- 3 objective functions reduced
- Targeted improvements
- Evaporation fields improved without compromising
other fields
20 Collaborative Model Study
- 12 Ensemble Experiments
- 3 x 1D freshwater flux (FWF) adjustment
- 3 GOLDSTEIN grid resolutions (36x36, 64x32, 72x72)
- 1 x 2D FWF adjustment, Evaporation multiplier
- 36x36 GOLDSTEIN grid
- 8 x 1D FWF adjustment, restarted from output of phase 1
- 3 GOLDSTEIN grid resolutions (36x36, 64x32, 72x72)
- 72x72 model runs performed on both Linux and win32 platforms
- Resources
- National Grid Service (Oxford, Leeds, Manchester, RAL, Bristol)
- Condor Pools (Southampton, Bristol, NOC)
- Clusters
- Cluster1 (Norwich)
- Pacifica (Southampton), Iridis2 / Pacifica2 (dual processor, dual core)
- Microsoft Compute Cluster Server
21 Collaborative Study
22 Client Session
23 Workflow
24 Resource Usage
- Large ensemble of runs
- Each 1000 year simulation takes 5 days
- Total of 362 simulations
- Would take 5 years to run in series
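The serial estimate follows directly from the two figures above. A short Python check, with a hypothetical helper name and a 365.25-day year as the only assumption:

```python
def serial_runtime_years(n_runs, days_per_run, days_per_year=365.25):
    """Wall-clock years needed if every simulation ran one after another."""
    return n_runs * days_per_run / days_per_year


if __name__ == "__main__":
    # 362 simulations at 5 days each is 1810 days, just under 5 years.
    print(f"{serial_runtime_years(362, 5):.2f} years")
```

Because the ensemble members are independent, the elapsed time on the Grid is set by how many runs can execute concurrently, not by this serial total.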
25 Results
- Bi-stability of the ocean thermohaline circulation
- In a fully 3-D ocean-atmosphere-sea ice model
- Varies with surface grid and ocean resolution
THC on
64 x 32 x 8 ocean longitude-latitude grid
36 x 36 x 8 ocean equal area grid
THC off
72 x 72 x 16 ocean equal area grid
26 THC on
27 THC off
28 Two Parameter Study
29 Summary
- THC as a function of freshwater fluxes obtained for
- Two levels of atmospheric complexity (including previous studies)
- Three model grids (varying horizontal resolution)
- THC bistability is more extensive with
- More complex atmosphere
- A regular 5.625° grid
- Results tentatively support the existence of THC bistability in the real world (i.e. towards infinite complexity)