The Grid: a brief briefing

1
The Grid: a brief briefing
  • Carole Goble
  • Information Management Group

2
Roadmap
  • What is the Grid?
  • Example projects
  • Relationship to the Semantic Web
  • Example architectures
  • The international programme

3
Take Home
  • The Grid is an international activity
  • The Grid has attracted high profile industrial
    and government support and funding
  • The Information/Knowledge Grid is in many ways
    indistinguishable from the Semantic Web
  • The Grid Community's understanding of generic and
    theoretical issues for the IK Grid is immature
    and hackery.

4
So what's the Grid?
  • Isn't it just High Performance Computing for High
    Energy Physicists?

5
Why Grids?
  • Large-scale science and engineering are done
    through the interaction of people, heterogeneous
    computing resources, information systems, and
    instruments, all of which are geographically and
    organizationally dispersed.
  • The overall motivation for Grids is to
    facilitate the routine interactions of these
    resources in order to support large-scale science
    and engineering.

From Bill Johnston 27 July 01
6
CERN Large Hadron Collider (LHC)
Raw data: 1 Petabyte / sec
Filtered: 100 Mbyte / sec, 1 Petabyte / year, 1 Million CD-ROMs
CMS Detector
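The figures on this slide can be cross-checked with some back-of-envelope arithmetic. The sketch below is only illustrative: the decimal units, the roughly 10^7 seconds of data taking per year, and the 700 MB CD-ROM capacity are assumptions that do not appear on the slide, and all names are made up for the example.

```python
# Back-of-envelope check of the LHC data volumes quoted on the slide.
# Assumptions (not from the slide): decimal units, about 1e7 seconds of
# data taking per year, and about 700 MB per CD-ROM.

MB = 10**6
PB = 10**15

RAW_RATE_BYTES_PER_S = 1 * PB          # "Raw data: 1 Petabyte / sec" (slide)
FILTERED_RATE_BYTES_PER_S = 100 * MB   # "Filtered: 100 Mbyte / sec" (slide)
SECONDS_OF_RUNNING_PER_YEAR = 10**7    # assumption
CD_CAPACITY_BYTES = 700 * MB           # assumption


def yearly_volume_bytes(rate_bytes_per_s, seconds):
    """Total bytes written when recording at a constant rate."""
    return rate_bytes_per_s * seconds


def cds_needed(volume_bytes, cd_capacity_bytes=CD_CAPACITY_BYTES):
    """Number of CDs needed to hold the volume (ceiling division)."""
    return -(-volume_bytes // cd_capacity_bytes)


filtered_per_year = yearly_volume_bytes(
    FILTERED_RATE_BYTES_PER_S, SECONDS_OF_RUNNING_PER_YEAR
)
reduction_factor = RAW_RATE_BYTES_PER_S // FILTERED_RATE_BYTES_PER_S

if __name__ == "__main__":
    print("filtered volume per year (PB):", filtered_per_year / PB)
    print("raw-to-filtered reduction factor:", reduction_factor)
    print("CD-ROMs per year:", cds_needed(filtered_per_year))
```

Under these assumptions the filtered stream comes to one petabyte per year, and the CD-ROM count lands in the low millions, the same order of magnitude as the slide's "1 Million".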
7
Why Grids?
  • A biochemist exploits 10,000 computers to screen
    100,000 compounds in an hour
  • A biologist combines a range of diverse and
    distributed resources (databases, tools,
    instruments) to answer complex questions
  • 1,000 physicists worldwide pool resources for
    petaop analyses of petabytes of data
  • Civil engineers collaborate to design, execute,
    analyze shake table experiments

From Steve Tuecke 12 Oct. 01
8
Why Grids? (contd.)
  • Climate scientists visualize, annotate, analyze
    terabyte simulation datasets
  • An emergency response team couples real time
    data, weather model, population data
  • A multidisciplinary analysis in aerospace couples
    code and data in four companies
  • A home user invokes architectural design
    functions at an application service provider

From Steve Tuecke 12 Oct. 01
9
Why Grids? (contd.)
  • An application service provider purchases cycles
    from compute cycle providers
  • Scientists working for a multinational soap
    company design a new product
  • A community group pools members' PCs to analyze
    alternative designs for a local road

From Steve Tuecke 12 Oct. 01
10
The Grid Vision
  • flexible, secure, coordinated resource-sharing
    among dynamic collections of individuals,
    institutions, and resources: what we refer to as
    virtual organisations
  • The Anatomy of the Grid: Enabling Scalable
    Virtual Organizations, Foster, Kesselman and
    Tuecke, 2001

11
The Grid Problem
  • Enable communities (virtual organizations) to
    share geographically distributed resources as
    they pursue common goals, assuming the absence
    of:
    • central location,
    • central control,
    • omniscience,
    • existing trust relationships.

From Steve Tuecke 12 Oct. 01
12
Large scale
  • Multi-disciplinary simulation
  • Decision support and optimization
  • Virtual prototyping
  • Collaborative analysis and visualization
  • Large scale distributed data management
  • Large scale distributed computation
  • High speed communications
  • Dynamic collaborative virtual organisations

13
What is it? Where is it? How to get it? When did
it happen?
Who knows it? Why does it? What are you doing?
interrogation
results
workflows
Governance & Control
Collaboration Grid
Technology Grid
14
Online Access to Scientific Instruments
Advanced Photon Source
wide-area dissemination
desktop VR clients with shared controls
archival storage
real-time collection
tomographic reconstruction
DOE X-ray grand challenge: ANL, USC/ISI, NIST,
U.Chicago
From Steve Tuecke 12 Oct. 01
15
Supernova Cosmology
16
GRID Software Components
  • An efficient data transfer mechanism
  • A resource broker
  • An interface for coupled applications
  • An interface for "computing-on-demand"
  • An interface for interactive use
Distributed Simulation Codes for e-Science Testbed
  • Biomolecular simulations
  • Weather prediction
  • Coupled CAE simulations
  • ASP-type services
  • Real-time data processing

17
Network for Earthquake Engineering Simulation
  • NEESgrid: national infrastructure to couple
    earthquake engineers with experimental
    facilities, databases, computers, each other
  • On-demand access to experiments, data streams,
    computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
From Steve Tuecke 12 Oct. 01
18
Home Computers Evaluate AIDS Drugs
  • Community
  • 1000s of home computer users
  • Philanthropic computing vendor (Entropia)
  • Research group (Scripps)
  • Common goal: advance AIDS research

From Steve Tuecke 12 Oct. 01
19
myGrid
  • Personalised extensible environments for
    data-intensive in silico experiments in biology
  • Straightforward discovery, interoperation,
    sharing
  • Workflow oriented
  • provenance
  • propagating change
  • Individual creativity & collaborative working
  • personalisation

20
myGrid resources
  • Question
  • Nucleotide binding protein in mouse
  • Answer
  • P12345 in Swiss-Prot is an ATPase
  • Terri Attwood is an expert on this
  • Jackson Labs have a database but you need to
    register
  • A paper has just been published in Proteins by
    the Stanford lab on this.

21
GeoDISE: engineering design optimisation
  • Access to knowledge repository
  • Access to optimisation and search tools
  • Industrial analysis codes
  • Distributed computing and data resources in
    design optimisation
  • Applied to industrial problems - large scale CFD
    codes
  • Demonstrate scalability across distributed
    computational and data resources and teams of
    designers

22
GeoDISE: Modern engineering firms are global and
distributed
How to ...?
  • CAD and analysis tools, user interfaces, PSEs,
    and Visualization: improve design environments,
    cope with legacy code / systems
  • Optimisation methods: produce optimized designs
  • Management of distributed compute and data
    resources: integrate large-scale systems in a
    flexible way
  • Data archives (e.g. design / system usage):
    archive and re-use design history
  • Knowledge repositories, knowledge capture and
    reuse tools: capture and re-use knowledge
  • Not just a problem of using HPC

23
Virtual Sky: http://virtualsky.org/
24
Broader Context
  • Grid Computing has much in common with major
    industrial thrusts
    • Business-to-business, Peer-to-peer, Application
      Service Providers, Storage Service Providers,
      Distributed Computing, Internet Computing
  • Sharing issues not adequately addressed by
    existing technologies
    • Complicated requirements: "run program X at site
      Y subject to community policy P, providing access
      to data at Z according to policy Q"
    • High performance: unique demands of advanced
      high-performance systems

From Steve Tuecke 12 Oct. 01
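The "complicated requirements" bullet on this slide can be made concrete with a small sketch. It is not how Globus or any real Grid middleware implements policy; the `Request` type, the two policy functions and all the membership data are hypothetical, invented only to show a request being checked against both a community policy P and a data policy Q.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Request:
    user: str
    program: str       # "program X"
    compute_site: str  # "site Y"
    data_site: str     # "data at Z"


Policy = Callable[[Request], bool]

# Hypothetical policy data, invented for illustration.
VO_MEMBERS = {"alice", "bob"}
APPROVED_PROGRAMS = {"site-y": {"simulate"}}
DATA_READERS = {"site-z": {"alice"}}


def community_policy_p(req: Request) -> bool:
    """Policy P: only VO members may run programs approved for the site."""
    approved = APPROVED_PROGRAMS.get(req.compute_site, set())
    return req.user in VO_MEMBERS and req.program in approved


def data_policy_q(req: Request) -> bool:
    """Policy Q: only users on the data site's reader list may access it."""
    return req.user in DATA_READERS.get(req.data_site, set())


def authorize(req: Request, policies: Iterable[Policy]) -> bool:
    """Grant the request only if every applicable policy agrees."""
    return all(policy(req) for policy in policies)


if __name__ == "__main__":
    policies = [community_policy_p, data_policy_q]
    for user in ("alice", "bob", "carol"):
        req = Request(user, "simulate", "site-y", "site-z")
        print(user, authorize(req, policies))
```

The point of the sketch is that the two policies belong to different owners (the community and the data site), so neither can be evaluated by a single central authority, which is what sets the Grid sharing problem apart from ordinary access control.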
25
Elements of the Problem
  • Resource sharing
    • Computers, storage, sensors, networks, ...
    • Sharing always conditional: issues of trust,
      policy, negotiation, payment, ...
  • Coordinated problem solving
    • Beyond client-server: distributed data analysis,
      computation, collaboration, ...
  • Dynamic, multi-institutional virtual
    organisations
    • Community overlays on classic org structures
    • Large or small, static or dynamic
  • Problem Solving Environments

From Steve Tuecke 12 Oct. 01
26
Broader Context
  • Grid Computing has much in common with major
    industrial thrusts
    • Business-to-business, Peer-to-peer, Application
      Service Providers, Storage Service Providers,
      Distributed Computing, Internet Computing
  • Sharing issues not adequately addressed by
    existing technologies
    • Complicated requirements: "run program X at site
      Y subject to community policy P, providing access
      to data at Z according to policy Q"
    • High performance: unique demands of advanced
      high-performance systems

From Steve Tuecke 12 Oct. 01
27
The Globus Project
  • Close collaboration with real Grid projects in
    science and industry
  • Development and promotion of standard Grid
    protocols to enable interoperability and shared
    infrastructure
  • Development and promotion of standard Grid
    software APIs and SDKs to enable portability and
    code sharing
  • The Globus Toolkit: open source, reference
    software base for building grid infrastructure
    and applications
  • Global Grid Forum: development of standard
    protocols and APIs for Grid computing

From Steve Tuecke 12 Oct. 01
28
Doesn't Globus solve it all?
  • Globus Toolkit is focused on the
    Data/Computational layer
  • No database connectivity
  • Little brokering, and static not dynamic
  • Weak metadata management, workflow
  • Trashes firewalls
  • No, not everything is JCL, FTP and LDAP
  • Distributed computation dominates, etc. etc.

29
Is it done?
  • NASA Information Power Grid (IPG) is the only one
    really working
  • http://www.ipg.nasa.gov
  • Linking similar supercomputers owned by the same
    organisation
  • Computation-focused
  • High Energy Physics is atypical

30
Example Application Projects
  • AstroGrid: astronomy, etc. (UK)
  • Earth Systems Grid: environment (US DOE)
  • EU DataGrid: physics, environment, etc. (EU)
  • EuroGrid: various (EU)
  • Fusion Collaboratory (US DOE)
  • GridLab: astrophysics, etc. (EU)
  • Grid Physics Network (US NSF)
  • MetaNEOS: numerical optimization (US NSF)
  • NEESgrid: civil engineering (US NSF)
  • RealityGrid (UK)
  • DAME (UK)
  • Comb-e-Chem (UK)
  • GeoDISE (UK)
  • iVDGL, StarLight (US/EU)
  • DiscoveryNet (UK)
  • myGrid (UK)
  • GridPP (UK)
  • Particle Physics Data Grid (US DOE)
  • etc

31
  • Since the early days of mankind the primary
    motivation for the establishment of communities
    has been the idea that by being part of an
    organized group the capabilities of an individual
    are improved. The great progress in the area of
    inter-computer communication led to the
    development of means by which stand-alone
    processing sub-systems can be integrated into
    multi-computer communities.

Miron Livny, Study of Load Balancing Algorithms
for Decentralized Distributed Processing
Systems, Ph.D. thesis, July 1983.
32
Every Community needs a Matchmaker!
  • Condor uses Matchmakers to build Computing
    Communities out of Commodity Components
  • "... someone has to bring together community
    members who have requests for goods and services
    with members who offer them."
  • Both sides are looking for each other
  • Both sides have constraints
  • Both sides have preferences
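The matchmaking idea on this slide (two sides, each with constraints and preferences) can be sketched in a few lines. This is a toy greedy matcher, loosely inspired by the slide's description; it is not Condor's ClassAd language, its matchmaker, or its API, and every name and attribute in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Attrs = Dict[str, Any]


@dataclass
class Ad:
    """An advertisement from either side: a request or an offer."""
    name: str
    attrs: Attrs
    constraint: Callable[[Attrs], bool]   # must hold for the other side
    preference: Callable[[Attrs], float]  # higher means more desirable


def compatible(a: Ad, b: Ad) -> bool:
    """A match is only possible when BOTH sides' constraints are met."""
    return a.constraint(b.attrs) and b.constraint(a.attrs)


def match(requests: List[Ad], offers: List[Ad]) -> List[Tuple[str, str]]:
    """Greedy matching: each request, in order, takes the free compatible
    offer it prefers most; ties are broken by the offer's own preference."""
    free = list(offers)
    pairs = []
    for req in requests:
        candidates = [o for o in free if compatible(req, o)]
        if not candidates:
            continue  # nobody suitable is available for this request
        best = max(
            candidates,
            key=lambda o: (req.preference(o.attrs), o.preference(req.attrs)),
        )
        pairs.append((req.name, best.name))
        free.remove(best)
    return pairs


def job(name: str, owner: str, needs_mb: int) -> Ad:
    """Hypothetical job request: needs Linux and enough memory,
    and prefers the machine with the most memory."""
    return Ad(
        name,
        {"owner": owner, "needs_mb": needs_mb},
        constraint=lambda m: m["os"] == "linux" and m["memory_mb"] >= needs_mb,
        preference=lambda m: m["memory_mb"],
    )


offers = [
    Ad("m1", {"os": "linux", "memory_mb": 512},
       constraint=lambda j: j["owner"] != "mallory",
       preference=lambda j: 1.0 if j["owner"] == "physics" else 0.0),
    Ad("m2", {"os": "linux", "memory_mb": 2048},
       constraint=lambda j: True, preference=lambda j: 0.0),
    Ad("m3", {"os": "windows", "memory_mb": 4096},
       constraint=lambda j: True, preference=lambda j: 0.0),
]
requests = [job("j1", "biology", 1024), job("j2", "physics", 256),
            job("j3", "mallory", 256)]

if __name__ == "__main__":
    print(match(requests, offers))
```

A greedy pass like this is the simplest choice and favours whichever request comes first; a real matchmaker would also need to handle fairness, priorities and resources that come and go.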

33
Let's look at some Architectures
34
Desiderata (adapted from Globus)
  • Software development toolkits, e.g. the Globus Toolkit
  • Standard protocols, services & APIs
  • A modular bag of technologies
  • Enable incremental development of grid-enabled
    tools and applications
  • Reference implementations
  • Learn through deployment and applications
  • Open source

Applications
Diverse global services
Core services
Local OS
35
(No Transcript)
36
Globus Layered Grid Architecture
CERN - High Energy Physics
From Steve Tuecke 12 Oct. 01
37
Keith Jeffery
38
"Reproduced by permission of the IT Innovation
Centre, University of Southampton."
http://www.it-innovation.soton.ac.uk
Three Layer Grid Abstraction
Interoperability, higher level ontologies,
reasoning, discovery, Reasoning services,
Discovery services
Fulfillment
Grid
Scientific Problems
Knowledge
Knowledge / capability
Processes
Information
Value chain
Semantics / process
Jobs and Data
Data
Data / applications
Raw Resources
39
Architecture of a Grid
Discipline Specific Portals and Scientific
Workflow Management Systems
Applications: Simulations, Data Analysis, etc.
Toolkits: Visualization, Data
Publication/Subscription, etc.
Grid Common Services: Standardized Services and
Resources Interfaces
Collaboration and Remote Instrument Services
Grid Information Service
Uniform Resource Access
Co-Scheduling
Network Cache
Authentication, Authorization
Security Services
Communication Services
Global Queuing
Global Event Services
Data Cataloguing
Uniform Data Access
Fault Management
Monitoring
Brokering
Auditing
Globus services
clusters
Distributed Resources
national supercomputer facilities
tertiary storage
national user facilities
Condor pools
network caches
High-speed Networks and Communications Services
40
Architecture of a Grid: upper layers
  • Knowledge based query
  • Tools to implement the human interfaces, e.g.
    SciRun, ECCE, WebFlow, .....
  • Mechanisms to express, organize, and manage the
    workflow of problem solutions
    (frameworks)
  • Access control

Problem Solving Environments
Applications and Supporting Tools
Grid enabled libraries (security, communication
services, data access, global event management,
etc.)
Application Development and Execution Support
Grid Common Services
Distributed Resources
41
Knowledge Based Data Grids
Ingest Services
Management
Access Services
Knowledge or Topic-Based Query / Browse
Knowledge Repository for Rules
Relationships Between Concepts
Knowledge
XTM DTD
Rules - KQL
(Model-based Access)
Information Repository
Attribute-based Query
Attributes Semantics
XML DTD
SDLIP
Information
(Data Handling System - SRB)
Data
Fields Containers Folders
Storage (Replicas, Persistent IDs)
Grids
Feature-based Query
MCAT/HDF
42
Astronomy Sky Survey Data Grid
1. Portals and Workbenches
2. Knowledge Resource Management
Bulk Data Analysis
Metadata View
Data View
Catalog Analysis
3.
Standard APIs and Protocols
Concept space
4. Grid Security, Caching, Replication, Backup, Scheduling
Information Discovery
Metadata delivery
Data Discovery
Data Delivery
5.
Standard Metadata format, Data model, Wire format
Catalog Mediator
6.
Data mediator
Catalog/Image Specific Access
Compute Resources
Catalogs
Data Archives
Derived Collections
7.
43
User Interfaces
NSDL
Usage Enhancement
Delivery Presentation Aggregation - Channels
Information about collections
Core NSDL Bus
Meta-data delivery, Data delivery, Query, Global
Ids, Security, Network
Metadata data access-based services
Virtual Collections Mediators
Collection Building
44
ERA Concept model
45
(No Transcript)
46
The De Roure Triangle
Grid Computing
?
e-Science
Agents
Web Services Semantic Web
e-Business
47
Roy Williams, Paul Messina
California Institute of Technology
48
So what is going on?
  • UK: http://www.escience-grid.org.uk/
  • International: http://www.gridforum.org/

49
E-Science Programme
DG Research Councils
Grid TAG
E-Science Steering Committee
Director
Director's Management Role
Director's Awareness and Co-ordination Role
Generic Challenges: EPSRC (£15m), DTI (£15m)
Academic Application Support Programme: Research
Councils (£74m), DTI (£5m)
  PPARC (£26m), BBSRC (£8m), MRC (£8m), NERC (£7m),
  ESRC (£3m), EPSRC (£17m), CLRC (£5m)
£80m Collaborative projects
Industrial Collaboration (£40m)
From Tony Hey 27 July 01
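The funding figures on this slide can be sanity-checked with a short calculation. The sketch assumes that the per-council amounts are the breakdown of the Research Councils figure, which the slide layout suggests but does not state; all figures are in millions and are copied from the slide, and the variable names are made up for the example.

```python
# Figures in millions, copied from the slide.
RESEARCH_COUNCIL_BREAKDOWN = {
    "PPARC": 26,
    "BBSRC": 8,
    "MRC": 8,
    "NERC": 7,
    "ESRC": 3,
    "EPSRC": 17,
    "CLRC": 5,
}
RESEARCH_COUNCILS_TOTAL = 74   # Academic Application Support: Research Councils
APPLICATION_SUPPORT_DTI = 5    # Academic Application Support: DTI
GENERIC_CHALLENGES = {"EPSRC": 15, "DTI": 15}

# Assumption: the per-council amounts break down the Research Councils total.
breakdown_sum = sum(RESEARCH_COUNCIL_BREAKDOWN.values())
application_support_total = RESEARCH_COUNCILS_TOTAL + APPLICATION_SUPPORT_DTI
generic_challenges_total = sum(GENERIC_CHALLENGES.values())

if __name__ == "__main__":
    print("per-council sum:", breakdown_sum)
    print("matches Research Councils figure:",
          breakdown_sum == RESEARCH_COUNCILS_TOTAL)
    print("application support total:", application_support_total)
    print("generic challenges total:", generic_challenges_total)
```

The per-council amounts do add up to the Research Councils figure, which supports reading them as its breakdown.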
50
Key Elements of UK Grid Development Plan
  • Development of Generic Grid Middleware
  • Network of Grid Core Programme e-Science Centres
  • National Centre: http://www.nesc.ac.uk/
  • Regional Centres: http://www.esnw.ac.uk/
  • Grid IRC Grand Challenge Project
  • Support for e-Science Pilots
  • Short term funding for e-Science demonstrators
  • Grid Network Team, Grid Engineering Team
  • Grid Support Centre, Task Forces

Adapted from Tony Hey 27 July 01
51
Take Home
  • The Grid is an international activity
  • The Grid has attracted high profile industrial
    and government support and funding
  • The Information/Knowledge Grid is in many ways
    indistinguishable from the Semantic Web
  • The Grid Community's understanding of generic and
    theoretical issues for the IK Grid is immature
    and hackery.

52
Spares
53
Supernova Cosmology
54
Home Computers Evaluate AIDS Drugs
  • Community
  • 1000s of home computer users
  • Philanthropic computing vendor (Entropia)
  • Research group (Scripps)
  • Common goal: advance AIDS research

From Steve Tuecke 12 Oct. 01
55
Grid viewpoints
What is it? Where is it? How to get it? When did
it happen?
Who knows it? Why does it? What are you doing?
interrogation
results
private
New Biology
workflows
public
Governance & Control
Access Grid
Technology Grid
56
Particle Physics and Astronomy Research Council
(PPARC)
  • GridPP (http://www.gridpp.ac.uk/)
  • to develop the Grid technologies required to meet
    the LHC computing challenge
  • ASTROGRID (http://www.astrogrid.ac.uk/)
  • a £4M project aimed at building a data-grid for
    UK astronomy, which will form the UK contribution
    to a global Virtual Observatory

57
Infrastructure Deployments
  • Institutional Grid deployments: deploying
    services and network infrastructure
  • DISCOM, IPG, TeraGrid, DOE Science Grid, DOD
    Grid, NEESgrid, ASCI (Netherlands)
  • International deployments: supporting
    international experiments and science
  • iVDGL, StarLight
  • Support centers
  • U.K. Grid Center
  • U.S. GRIDS Center