1
Climateprediction.net
NERC Annual eScience meeting, April 2006
  • Nick Faull, Carl Christensen, Myles Allen, Dave
    Frame and many others
  • Department of Physics, University of Oxford
  • nfaull@atm.ox.ac.uk

2
The project
  • Overall objective is to quantify the range of
    uncertainty of future climate change.
  • This requires 100s of thousands of climate model
    (GCM) simulations.
  • Use public-resource distributed computing to meet
    the demand: anyone can go to www.climateprediction.net
    and download the Hadley Centre climate model to
    run on their PC.

3
Volunteer Computing
  • A specialized form of distributed computing,
    which is really an old idea in computer science:
    using remote computers to perform the same or
    similar tasks
  • Was around before '99 but took off with SETI@home
  • SETI@home capacity with 500K users: about 1 PF
    (1000 TF)
  • For comparison, the Earth Simulator (Japan):
    35 TF peak
  • CPDN is running at about 60 TF (30K users, each
    with a ~2 GF machine on average, e.g. a 2 GHz
    Pentium 4)
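As a sanity check on that figure: 30,000 hosts x 2 GFLOPS
per host = 60,000 GFLOPS = 60 TF of aggregate capacity.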

4
Other Public Resource Distributed Computing
(PRDC) Projects
  • There are 100-200M PCs connected to the internet.
  • <1% are involved.
  • Project development is getting easier thanks to
    generic platforms, e.g. the Berkeley Open
    Infrastructure for Network Computing (BOINC)
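To give a flavour of what "generic platform" means here: a
BOINC application is an ordinary program linked against the
BOINC client API. A minimal worker sketch, assuming the
standard BOINC C/C++ API (boinc_init, boinc_fraction_done,
boinc_finish); model_step() and the run length are invented
placeholders, not CPDN code:

// Illustrative sketch only, not CPDN's actual code.
#include "boinc_api.h"

static void model_step(int /*step*/) { /* one model timestep would go here */ }

int main() {
    boinc_init();                      // attach to the BOINC core client
    const int total_steps = 100000;    // hypothetical run length
    for (int step = 0; step < total_steps; ++step) {
        model_step(step);
        boinc_fraction_done((double)step / total_steps);  // report progress
        if (boinc_time_to_checkpoint()) {
            // write model state to disk here, then tell BOINC it is safe
            boinc_checkpoint_completed();
        }
    }
    boinc_finish(0);                   // 0 = success; output files get uploaded
}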

5
Running an Ensemble of GCM Simulations is not a
Typical PRDC Application
  • A typical GCM simulation takes weeks or months,
    not hours.
  • GCMs use more memory.
  • GCM simulations produce data for further
    analysis. Most PRDC projects analyse data or test
    a hypothesis, so data transfer is not a problem
    for them.
  • Potentially all the simulations are useful, not
    just those which find a result.

6
CPDN Volunteer Computing Challenges...
  • Climate models (ESMs, AOGCMs, etc.) are very
    large, complex systems developed by physicists,
    sometimes over decades (and proprietary in the
    case of the UKMO)
  • ~1 million lines of Fortran code (HadSM3: 550
    files, 40 MB of text source code)
  • Little documentation (the science is well
    documented, but not the software and design of the
    system per se)
  • Also utility code written by various scientists
    and students over the years (outside of the model
    code: 220 files, 12 MB of source, 250K lines),
    often workable but hard to port to a
    cross-platform PC project
  • Meant to be run on supercomputers (primarily
    64-bit); not designed, or indeed envisioned, to
    run on anything other than a supercomputer or, at
    the very least, a Linux cluster

7
CPDN and BOINC Integration
Apologies to Bill Watterson
8
Why BOINC?
  • BOINC is based on the experiences of the
    SETI@home team in handling millions of users,
    downloads and uploads -- an investment of >US$1
    million
  • So it makes sense to use BOINC, a tried and
    tested framework, instead of continually playing
    catch-up and reinventing the wheel
  • Basically, BOINC allows us to focus on what we do
    best (or should be doing best)
  • Climate science, climate modelling, visualisation
    packages (peer-to-peer perhaps?), cross-platform
    porting of models, and grid applications to clamp
    onto the BOINC server side

9
Data nodes
  • Rely on donated server space
  • Data is federated across the nodes
  • Do not want end users to FTP the raw data
  • Instead:
  • Provide a secure, robust, efficient, scalable
    environment for data discovery and analysis
  • Distribute the analysis by enabling each data
    node to process data
  • Allow multiple interfaces to access the data

10
Distributed analysis of data
  • Design problems
  • Data set is federated across data nodes
  • Only one copy of the data set
  • Data set is large
  • Servers are donated, potentially no root access
  • Analyses are computationally expensive and may
    take several days
  • Web services to analyse data
  • Lightweight way to build grid like infrastructure
  • Open, standardised protocols
  • Security features present in software stack
  • Support from industry (Sun, Microsoft, IBM, etc.)
  • Momentum in UK academic community (WSRF, OMII,
    etc.)
  • CPDN will provide data via grid-enabled web
    services to such providers as the NERC Data Grid
    (http://ndg.badc.rl.ac.uk/)
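The slides give the design constraints (federated nodes, one
copy of the data, long-running analyses, donated servers)
but not the interface itself; as an illustrative sketch
only, a data-node analysis service shaped by those
constraints might look like this, where every type and
method name is hypothetical:

#include <string>
#include <vector>

// Hypothetical request describing a subset of the federated data set.
struct AnalysisRequest {
    std::string variable;       // e.g. surface temperature
    std::string region;         // spatial subset to analyse
    int start_year, end_year;   // temporal subset
};

class DataNodeService {
public:
    virtual ~DataNodeService() = default;

    // Discovery: which datasets does this node hold?
    virtual std::vector<std::string> listDatasets() = 0;

    // Submit a long-running analysis and get a job ID back: calls must be
    // asynchronous because an analysis may take several days.
    virtual std::string submitAnalysis(const std::string& dataset,
                                       const AnalysisRequest& request) = 0;

    // Poll for completion, then fetch only the small derived result, so the
    // raw data never has to leave the node.
    virtual bool isComplete(const std::string& jobId) = 0;
    virtual std::vector<double> fetchResult(const std::string& jobId) = 0;
};

The submit/poll split reflects the constraints above:
analyses may take days, and only small derived results --
never the raw federated data -- leave a node.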

11
Security issues
  • Threats to participants (unexpected costs of
    participation)
  • Software package is digitally signed.
  • Communications are always initiated by the
    client.
  • HTTP over a secure socket layer will be used
    where necessary to protect participant details
    and guarantee reliable data collection.
  • Digitally signed files can be used where
    necessary.
  • Threats to the experiment (falsified data)
  • Two types of run replication
  • Small number of repeated identical runs.
  • Large numbers of initial condition ensembles.
  • Checksum tracking of client package files to
    discourage casual tampering.
  • Opportunity to repeat runs as necessary.
  • Server security management and frequent backups.
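On "checksum tracking of client package files": the server
keeps a known checksum for each distributed file and the
client recomputes it before use. A minimal sketch, using
FNV-1a purely as a stand-in, since the deck does not name
the algorithm CPDN actually used:

#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>

// Recompute a file checksum to compare with the value recorded by the
// server when the package was built.
uint64_t fnv1a_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    char c;
    while (in.get(c)) {
        h ^= static_cast<unsigned char>(c);
        h *= 1099511628211ULL;             // FNV-1a 64-bit prime
    }
    return h;
}

int main() {
    // Both the file name and the expected value are hypothetical.
    const uint64_t expected = 0x1234abcd5678ef00ULL;
    if (fnv1a_file("model_package.dat") != expected)
        std::cerr << "checksum mismatch: file may have been tampered with\n";
}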

12
Climateprediction.net participants
>200,000 volunteers, >150 countries, >13M
model-years
13
Climateprediction.net: what it looks like
14
Results from our initial climateprediction.net
experiment (Stainforth et al, 2005)
  • Using a simplified model ocean to keep runs short
  • 15-year calibration phase to compute ocean heat
    transport
  • 15-year control phase with pre-industrial CO2
    (280ppm)
  • 15-year 2xCO2 phase with CO2 at 560ppm.
  • Repeat with different initial conditions to
    average out noise and quantify sampling
    uncertainty

15
Parameter perturbations
  • Critical relative humidity (RHcrit)
  • Accretion constant (CT)
  • Condensation nuclei concentration (CW)
  • Ice fall velocity (VF1)
  • Entrainment coefficient (EntCoef)
  • Empirically adjusted cloud fraction (EACF)
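Perturbing each parameter over a handful of candidate
values and running every combination -- then repeating each
combination across initial-condition ensembles -- is what
drives the run count into the hundreds of thousands. A
sketch of the combinatorics; the candidate values below are
invented for illustration, not the experiment's actual
perturbations:

#include <cstdio>
#include <vector>

int main() {
    // Candidate values per parameter -- invented for illustration only.
    std::vector<double> rhcrit = {0.6, 0.7, 0.8, 0.9};  // critical relative humidity
    std::vector<double> ct     = {5e-5, 1e-4, 4e-4};    // accretion constant
    std::vector<double> vf1    = {0.5, 1.0, 2.0};       // ice fall velocity
    // ... and likewise for CW, EntCoef and EACF ...

    long combos = 0;
    for (double r : rhcrit)
        for (double c : ct)
            for (double v : vf1) {
                // each combination defines one model configuration to distribute
                std::printf("run: RHcrit=%.2f CT=%.0e VF1=%.1f\n", r, c, v);
                ++combos;
            }
    // 4 x 3 x 3 = 36 configurations from just three parameters; each is then
    // repeated across an initial-condition ensemble, multiplying the count again.
    std::printf("configurations: %ld\n", combos);
}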

16
Frequency Distribution of Simulations
From Stainforth et al, 2005
17
Frequency distribution, eliminating drifting
control simulations
18
And at about the same time our participants
started reporting models freezing over
19
Unphysically strong low-cloud versus
surface-heat-flux feedback in the equatorial Pacific
20
Climate sensitivities from climateprediction.net
Stainforth et al, 2005
21
And having got excited about the cold ones
22
BBC Climate Change Experiment
  • Transient simulation of 1920 to 2080 with HadCM3L,
    exploring:
  • Model uncertainty in the atmosphere.
  • Model uncertainty in the ocean.
  • Uncertainty in historic forcing.
  • Some uncertainty in future forcing.
  • Natural variability.

23
Over 50,000 active participants running HadCM3L
1920-2080, see bbc.co.uk/climatechange
25
The problem
  • An error in a file header caused the model to
    read in the man-made sulphate emissions from the
    wrong point in the file.
  • This resulted in too little sulphate emission in
    the 20th century, hence models warming up too
    fast (no global dimming effect)
  • ...but can still do useful science with GHG only
    ensemble.
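The deck does not show the file format, but the failure
mode is a classic one: a header field says where records
start, so one wrong value silently shifts every subsequent
read. A sketch with an invented format, not the actual
ancillary-file layout:

#include <cstdio>
#include <vector>

// A header field records where the emissions data begin, so one wrong value
// silently shifts every subsequent read.
struct Header { long record_offset; };

std::vector<float> read_emissions(std::FILE* f, const Header& h, int n) {
    std::fseek(f, h.record_offset, SEEK_SET);  // wrong offset => wrong years
    std::vector<float> vals(n);
    size_t nread = std::fread(vals.data(), sizeof(float), n, f);
    vals.resize(nread);
    return vals;  // values still look plausible, which is why it is hard to catch
}

int main() {
    std::FILE* f = std::fopen("sulphate_emissions.anc", "rb");  // hypothetical file
    if (!f) return 1;
    Header h{4096};  // if the true data start is 8192, every record read is wrong
    std::vector<float> emissions = read_emissions(f, h, 12);
    if (!emissions.empty()) std::printf("first value: %g\n", emissions[0]);
    std::fclose(f);
}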

26
Participants' thoughts
  • "Whoops! Still, that's science for you..."
  • "I would feel better about the error if I
    thought the person/people responsible had
    been sacked!"

27
HadCM3L Attribution Project (courtesy Daithi
Stone)
28
HadCM3L Attribution Project (courtesy Daithi
Stone)
29
Sahel desert drought experiment
  • The Sahel desert drought in the 1970s and 1980s
    created a famine that killed a million people and
    afflicted more than 50 million.
  • Suggestion that the drought was likely caused by
    air pollution (global dimming) changing
    properties of clouds over the Atlantic ocean,
    disturbing the monsoons and shifting the tropical
    rains southwards.
  • With reduced sulphate aerosol in the model, we
    can test whether this had an impact on rainfall
    in this region.

30
Distributed computing is not just for
climate-resolution models
31
Distributed computing is not just for
climate-resolution models
32
The climate that might have been
http://attribution.cpdn.org
HadAM3 N144 model: 288 longitude x 217 latitude x
30 vertical gridboxes
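(For scale: 288 x 217 x 30 = 1,874,880, i.e. nearly 1.9
million gridboxes.)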
33
Educational Outreach
  • Public education via the website, media, and
    schools is an important facet of the CPDN project
  • The website has much information on climate
    change and topics related to the CPDN programme
  • Schools are running CPDN and comparing results,
    with special events at the University of Reading
  • Students will host a debate on climate change
    issues, compare and contrast their results, etc.

Currently focused on UK schools, but as projects
are added and staff resources grow, the plan is to
expand to other European and US schools.
Students at Gosford Hill School, Oxon, viewing
their CPDN model
34
Future Plans
  • Just released: 160-year HadCM3 1920-2080
    hindcast/forecast runs, alongside the BBC's
    Climate Chaos season of programmes and the
    Meltdown documentary
  • BBC World will hopefully pick up the programmes
    and push CPDN in July 2006.
  • Received funding from NERC Knowledge Transfer
    scheme for regional modelling (PRECIS) via CPDN
  • May have sister or spinoff projects in Germany
    and the US (depending on proposals/funding)