e-Science, Databases and the Grid

1
  • Tony Hey
  • Director of UK e-Science Programme
  • Tony.Hey@epsrc.ac.uk

2
e-Science and the Grid
  • "e-Science is about global collaboration in key
    areas of science, and the next generation of
    infrastructure that will enable it."
  • "e-Science will change the dynamic of the way
    science is undertaken."
  • John Taylor, Director General of Research
    Councils, Office of Science and Technology

3
NASA's IPG
  • The vision for the Information Power Grid is to
    promote a revolution in how NASA addresses
    large-scale science and engineering problems by
    providing persistent infrastructure for
  • highly capable computing and data management
    services that, on-demand, will locate and
    co-schedule the multi-Center resources needed to
    address large-scale and/or widely distributed
    problems
  • the ancillary services that are needed to support
    the workflow management frameworks that
    coordinate the processes of distributed science
    and engineering problems

4
IPG Baseline System
[Diagram: IPG baseline system linking NASA centres (ARC,
GRC, GSFC, JPL, JSC, KSC, LaRC, MSFC) and partner sites
(Boeing, CMU, EDC, NCSA, SDSC) over NREN, NGIX and
NTON-II/SuperNet, with O2000 clusters, DMF storage,
MCAT/SRB, MDS and CA services]
5
Multi-disciplinary Simulations
[Diagram: sub-system simulation models and their
characteristics]
  • Wing Models - lift capabilities, drag capabilities,
    responsiveness
  • Stabilizer Models
  • Airframe Models - deflection capabilities,
    responsiveness
  • Crew Capabilities - accuracy, perception, stamina,
    reaction times, SOPs
  • Engine Models - thrust performance, reverse thrust
    performance, responsiveness, fuel consumption
  • Landing Gear Models - braking performance, steering
    capabilities, traction, dampening capabilities

Whole system simulations are produced by coupling all
of the sub-system simulations (a conceptual sketch of
such coupling follows).
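
The slides do not show how this coupling is done; as a
purely illustrative sketch (the model classes, attributes
and constants below are invented, not from the
presentation), sub-system models can be coupled by
stepping each one and feeding its outputs into a shared
state at every time step:

    # Illustrative coupling of sub-system simulations into one
    # whole-system run. All models and constants are made-up placeholders.

    class EngineModel:
        def step(self, throttle):
            return {"thrust": 250.0 * throttle, "fuel_burn": 2.0 * throttle}

    class WingModel:
        def step(self, airspeed):
            return {"lift": 0.9 * airspeed, "drag": 0.1 * airspeed}

    def run_whole_system(steps=10, dt=1.0, mass=1000.0):
        engine, wing = EngineModel(), WingModel()
        airspeed, throttle = 100.0, 0.8
        for _ in range(steps):
            engine_out = engine.step(throttle)   # each sub-model advances...
            wing_out = wing.step(airspeed)
            net_force = engine_out["thrust"] - wing_out["drag"]
            airspeed += (net_force / mass) * dt  # ...and the shared state updates
        return airspeed

    print(run_whole_system())
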
6
Multi-disciplinary Simulations
National Air Space Simulation Environment
Stabilizer Models
GRC
44,000 Wing Runs
50,000 Engine Runs
Airframe Models
66,000 Stabilizer Runs
LaRC
ARC
Virtual National Air Space VNAS
22,000 Commercial US Flights a day
22,000 Airframe Impact Runs
  • FAA Ops Data
  • Weather Data
  • Airline Schedule Data
  • Digital Flight Data
  • Radar Tracks
  • Terrain Data
  • Surface Data

Simulation Drivers
48,000 Human Crew Runs
132,000 Landing/ Take-off Gear Runs
(Being pulled together under the NASA
AvSP Aviation ExtraNet (AEN)
Landing Gear Models
Many aircraft, flight paths, airport operations,
and the environment are combined to get a virtual
national airspace
7
The Grid as an Enabler for Virtual Organisations
  • Ian Foster and Carl Kesselman ("Take 2")
  • "The Grid is a software infrastructure that
    enables flexible, secure, coordinated resource
    sharing among dynamic collections of individuals,
    institutions and resources"
  • - includes computational systems, data storage
    resources and specialized facilities
  • Enabling infrastructure for transient Virtual
    Organisations

8
Globus Grid Middleware
  • Single Sign-On
  • - Proxy credentials, GRAM
  • Mapping to local security mechanisms
  • - Kerberos, Unix, GSI
  • Delegation
  • - Restricted proxies
  • Community authorization and policy
  • - Group membership, trust
  • File-based
  • - GridFTP gives high-performance FTP integrated
    with GSI (a usage sketch follows)
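
The slide names the mechanisms but shows no commands; as
a minimal usage sketch with the standard Globus Toolkit
command-line tools (the host names and file paths below
are hypothetical), single sign-on followed by a GridFTP
transfer looks like this:

    import subprocess

    # Single sign-on: create a short-lived proxy credential from the
    # user's GSI certificate (grid-proxy-init prompts for the passphrase).
    subprocess.run(["grid-proxy-init"], check=True)

    # File transfer: a GridFTP copy between two sites, authenticated with
    # the proxy created above instead of per-site passwords.
    subprocess.run(
        [
            "globus-url-copy",
            "gsiftp://source.example.ac.uk/data/run01.dat",
            "gsiftp://destination.example.ac.uk/archive/run01.dat",
        ],
        check=True,
    )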

9
US Grid Projects
  • NASA Information Power Grid
  • DOE Science Grid
  • NSF National Virtual Observatory
  • NSF GriPhyN
  • DOE Particle Physics Data Grid
  • NSF Distributed Terascale Facility
  • DOE ASCI Grid
  • DOE Earth Systems Grid
  • DARPA CoABS Grid
  • NEESGrid
  • DOH BIRN
  • NSF iVDGL

10
EU Grid Projects
  • DataGrid (CERN, ..)
  • EuroGrid (Unicore)
  • DataTag (TTT)
  • Astrophysical Virtual Observatory
  • GRIP (Globus/Unicore)
  • GRIA (Industrial applications)
  • GridLab (Cactus Toolkit)
  • CrossGrid (Infrastructure Components)
  • EGSO (Solar Physics)

11
National Grid Projects
  • UK e-Science Grid
  • Japan - Grid Data Farm, ITBL
  • Netherlands - VLAM, PolderGrid
  • Germany - UNICORE, Grid proposal
  • France - Grid funding approved
  • Italy - INFN Grid
  • Eire - Grid proposals
  • Switzerland - Grid proposal
  • Hungary - DemoGrid, Grid proposal
  • ApGrid

12
UK e-Science Initiative
  • £120M Programme over 3 years
  • £75M is for Grid Applications in all areas of
    science and engineering
  • £10M for Supercomputer upgrade
  • £35M for development of industrial strength
    Grid middleware
  • Require £20M additional matching funds
    from industry

13
UK e-Science Grid
[Map: UK e-Science Centres at Edinburgh, Glasgow,
Newcastle, Belfast, Manchester, DL, Cambridge, Oxford,
Hinxton, RAL, Cardiff, London and Southampton]
14
Generic Grid Middleware
  • All e-Science Centres donating resources to form
    a UK national Grid
  • Supercomputers, clusters, storage, facilities
  • All Centres will run the same Grid software
  • - Starting point is Globus, Storage Resource
    Broker and Condor
  • Work with Global Grid Forum and major computing
    companies (IBM, Oracle, Microsoft, Sun, ...)
  • Aim to industry-harden Grid software to be
    capable of realizing the secure VO vision

15
IBM Grid Press Release 2/8/01
  • Interview with Irving Wladawsky-Berger
  • "Grid computing is a set of research management
    services that sit on top of the OS to link
    different systems together"
  • "We will work with the Globus community to build
    this layer of software to help share resources"
  • "All of our systems will be enabled to work with
    the grid, and all of our middleware will
    integrate with the software"

16
Particle Physics and Astronomy e-Science Projects
  • GridPP
  • links to EU DataGrid, CERN LHC Computing
    Project, U.S. GriPhyN and PPGrid Projects, and
    iVDGL Global Grid Project
  • AstroGrid
  • links to EU AVO and US NVO projects

17
GridPP Project (1)
  • CERN LHC Machine due to be completed by 2006
  • ATLAS/CMS Experiments each involve more than 2000
    physicists from more than 100 organisations in
    USA, Europe and Asia
  • Within first year of operation from 2006 (7?)
    each project will need to store, access, process
    and analyze 10 PetaBytes of data
  • Use hierarchical Tiers of data and compute
    Centres providing 200 Tflop/s

18
GridPP Project (2)
  • LHC data volume expected to reach 1 Exabyte and
    require several PetaFlop/s compute power by 2015
  • Use simulated data production and analysis from
    2002
  • 5% complexity data challenge (Dec 2002)
  • 20% of the 2007 CPU and 100% complexity (Dec
    2005)
  • Start of LHC operation 2006 (7?)
  • Testbed Grid deployments from 2001

19
IRC e-HealthCare Grand Challenge
  • Equator - Technological innovation in physical
    and digital life
  • AKT - Advanced Knowledge Technologies
  • DIRC - Dependability of Computer-Based Systems
  • From Medical Images and Signals to Clinical
    Information

20
EPSRC e-Science Projects (1)
  • Comb-e-Chem - Structure-Property Mapping
  • Southampton, Bristol, Roche, Pfizer, IBM
  • DAME - Distributed Aircraft Maintenance
    Environment
  • York, Oxford, Sheffield, Leeds, Rolls Royce
  • RealityGrid - A Tool for Investigating Condensed
    Matter and Materials
  • QMW, Manchester, Edinburgh, IC, Loughborough,
    Oxford, Schlumberger, ...

21
EPSRC e-Science Projects (2)
  • MyGrid - Personalised Extensible Environments for
    Data Intensive in silico Experiments in Biology
  • Manchester, EBI, Southampton, Nottingham,
    Newcastle, Sheffield, GSK, AstraZeneca, IBM
  • GEODISE - Grid Enabled Optimisation and Design
    Search for Engineering
  • Southampton, Oxford, Manchester, BAE, Rolls Royce
  • Discovery Net - High Throughput Sensing
    Applications
  • Imperial College, Infosense, ...

22
Comb-e-Chem - Structure-Property Mapping
  • Goal is to integrate structure and property data
    sources within knowledge environment to find new
    chemical compounds with desirable properties
  • Accumulate, integrate and model extensive range
    of primary data from combinatorial methods
  • Support for provenance and automation including
    multimedia and metadata
  • Southampton, Bristol, Cambridge Crystallographic
    Data Centre
  • Roche Discovery, Pfizer, IBM

23
MyGrid - An e-Science Workbench
  • Goal is to develop workbench to support
  • Experimental process of data accumulation
  • Use of community information
  • Scientific collaboration
  • Provide facilities for resource selection, data
    management and process enactment
  • Bioinformatics applications
  • Functional genomics, database annotation
  • Manchester, EBI, Newcastle, Nottingham, Sheffield,
    Southampton
  • GSK, AstraZeneca, Merck, IBM, Sun, ...

24
Grid Database Requirements (1)
  • Scalability
  • Store Petabytes of data at TB/hr (a rough worked
    example follows this list)
  • Low response time for complex queries to retrieve
    data for further processing
  • Large number of clients needing high access
    throughput
  • Grid Standards for Security, Accounting, ...
  • GSI with digital certificates
  • Data from multiple DBMS
  • Co-schedule database and compute servers
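
A rough worked example of the scale involved (assuming,
for illustration only, a single ingest stream of 1 TB/hr
against the 10 Petabyte annual volume quoted earlier for
the LHC experiments):

    # Back-of-envelope check of why parallel, TB/hr-class ingest is needed.
    petabytes = 10                  # annual LHC data volume quoted earlier
    terabytes = petabytes * 1000    # 10 PB = 10,000 TB
    hours = terabytes / 1.0         # at an assumed single 1 TB/hr stream
    print(hours / 24)               # ~417 days: over a year for one stream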

25
Grid Database Requirements (2)
  • Handling Unpredictable Usage
  • Most existing DB applications have reasonably
    predictable access patterns, and usage of DB
    resources can be restricted
  • Typical commercial applications generate large
    numbers of small transactions from a large number
    of users
  • Grid applications can have a small number of large
    transactions needing more ad hoc access to DBMS
    resources
  • - much greater variations in time and resource usage

26
Grid Database Requirements (3)
  • Metadata-driven access
  • Expect to need 2-step access to data
  • Step 1: Metadata search to locate required data
    on one or more DBMS
  • Step 2: Data accessed and sent to a compute server
    for further analysis (see the sketch after this
    list)
  • Application writer does not know which specific
    DBMS is accessed in Step 2
  • Need standard API for Grid-enabled DBMS
  • Multiple Database Integration
  • - Support distributed queries and transactions
  • - Scalability requirements
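
A sketch of the 2-step pattern described above (the
catalogue and database classes, method names and endpoint
below are hypothetical, not an existing Grid API):

    # Hypothetical metadata-driven, 2-step data access. Nothing here is a
    # real Grid interface; all names are illustrative.

    class GridMetadataCatalogue:
        def find_datasets(self, query):
            # Step 1: a metadata/directory search locates the DBMS holding
            # the required data; a canned answer stands in for it here.
            return [{"endpoint": "db1.example.ac.uk", "table": "assay_results"}]

    class GridDatabase:
        def __init__(self, endpoint):
            self.endpoint = endpoint

        def fetch(self, table, predicate):
            # Step 2: query the located DBMS; the rows would then be shipped
            # to a compute server for further analysis.
            return f"rows from {table}@{self.endpoint} where {predicate}"

    catalogue = GridMetadataCatalogue()
    for hit in catalogue.find_datasets("species=human AND assay=expression"):
        db = GridDatabase(hit["endpoint"])  # chosen at run time, not hard-coded
        print(db.fetch(hit["table"], "score > 0.9"))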

27
Grid-Service Interface to DBs (Thoughts of Paul
Watson)
  • Services would include
  • Metadata
  • - Used by location and directory services
  • Query
  • - Use GridFTP, support streaming and computation
    co-scheduling?
  • Transactions
  • - Support distributed transactions via Virtual DBMS?

28
Grid-Service Interface to DBs (continued)
  • Bulk Loading
  • - Use GridFTP?
  • Scheduling
  • - Allow DBMS and compute resource to be
    co-scheduled and bandwidth pre-allocated
  • - Major challenge for DBMS to support resource
    pre-allocation and management?
  • Accounting
  • - Provide information for Grid accounting and
    capacity planning (a sketch of such an interface
    follows)
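
Pulling the service areas from the last two slides
together, a hedged sketch of what such a Grid-service
interface to a DBMS might declare (every class and method
name below is invented for illustration; this is not an
existing or proposed standard API):

    from abc import ABC, abstractmethod

    class GridDatabaseService(ABC):
        """Illustrative interface only; all method names are made up."""

        @abstractmethod
        def describe_metadata(self) -> dict:
            """Metadata used by location and directory services."""

        @abstractmethod
        def query(self, statement: str, stream_to: str) -> None:
            """Run a query, streaming results (e.g. via GridFTP) to a
            co-scheduled compute resource."""

        @abstractmethod
        def begin_transaction(self) -> str:
            """Join a distributed transaction, possibly via a virtual DBMS."""

        @abstractmethod
        def bulk_load(self, source_url: str) -> None:
            """Bulk-load a large dataset, e.g. fetched with GridFTP."""

        @abstractmethod
        def reserve(self, cpu_fraction: float, bandwidth_mbps: int) -> str:
            """Pre-allocate DBMS resources and bandwidth for co-scheduling."""

        @abstractmethod
        def usage_report(self) -> dict:
            """Accounting data for Grid accounting and capacity planning."""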

29
  • Application projects use Clusters,
    Supercomputers, Data Repositories
  • Emphasis on support for data federation and
    annotation as much as computation
  • Metadata and ontologies are key to higher-level
    Grid services
  • For commercial success, the Grid needs to have an
    interface to DBMS