1
Grid Computing: from a solid past to a bright future?
David Groep, NIKHEF DataGrid and VL group, 2003-03-14
2
Grid: more than hype?
  • Imagine that you could plug your computer into
    the wall and have direct access to huge computing
    resources immediately, just as you plug in a lamp
    to get instant light.
  • Far from being science-fiction, this is the idea
    the XXXXXX project is about to make into reality.

from a project brochure in 2001
3
  • Grids and their (science) applications
  • Origins of the grid
  • What makes a Grid?
  • Grid implementations today
  • New standards
  • Dutch dimensions

4
Grid: a vision
Federico.Carminati@cern.ch
5
Communities and Apps
ENVISAT
  • 10 instruments on board
  • 200 Mbps data rate to ground
  • 400 Tbytes data archived/year
  • 100 standard products
  • 10 dedicated facilities in Europe
  • 700 approved science user projects

http://www.esa.int/
6
Added value for EO
  • enhance the ability to access high-level products
  • allow reprocessing of large historical archives
  • data fusion and cross-validation, ...

7
The Need for Grids: LHC
  • Physics @ CERN
  • LHC particle accelerator
  • operational in 2007
  • 5-10 Petabyte per year
  • 150 countries
  • > 10,000 users
  • lifetime 20 years

Trigger chain: 40 MHz (40 TB/sec) → level 1, special hardware → 75 kHz (75 GB/sec) → level 2, embedded → 5 kHz (5 GB/sec) → level 3, PCs → 100 Hz (100 MB/sec) → data recording and offline analysis
http://www.cern.ch/
8
And More
Bio-informatics
  • For access to data
  • Large network bandwidth to access computing
    centers
  • Support for data-bank replicas (easier and
    faster mirroring)
  • Distributed data banks
  • For interpretation of data
  • Grid-enabled algorithms: BLAST on distributed
    data banks, distributed data mining

9
Genome pattern matching
10
And even more
  • financial services, life sciences, strategy
    evaluation, ...
  • instant immersive teleconferencing
  • remote experimentation
  • pre-surgical planning and simulation

11
Why is the Grid successful?
  • Applications need large amounts of data or
    computation
  • Ever larger, distributed user community
  • Network grows faster than compute power/storage

12
Inter-networking systems
  • Continuous growth (now 180 million hosts)
  • Many protocols and APIs (3500 RFCs)
  • Focus on heterogeneity (and security)

http://www.caida.org/
http://www.isc.org/
13
Information on the net
  • Directory-style lookup tools
  • whois, finger for personal info
  • Menu-driven document retrieval (tree)
  • Gopher
  • Mesh of information, multiple media,
    collaboration support
  • WWW
  • Document management systems
  • Conferencing, VoIP

14
Remote Service
  • RPC proved hugely successful within domains
  • Network Information System (YP)
  • Network File System
  • Typical client-server stuff
  • CORBA: also intra-domain
  • Extension of RPC to the OO design model
  • Diversification
  • Web Services venturing into the
    inter-organisational domain
  • Standard service descriptions and discovery
  • Common syntax (XML/SOAP)

15
Grid beginnings - Systems
  • distributed computing research
  • Gigabit network test beds
  • Meta-supercomputing (I-WAY)
  • Condor flocking

GUSTO meta-computing test bed in 1999
16
Grid beginnings - Apps
  • Solve problems using systems in one domain
  • parameter sweeps on batch clusters
  • PIAF for (HE) physics analysis
  • Solvers using systems in multiple domains
  • SETI@home
  • Ready for the next step

17
What is the Grid about?
  • Resource sharing and coordinated problem solving
    in dynamic multi-institutional virtual
    organisations
  • Virtual Organisation (VO)
  • A set of individuals or organisations, not under
    single hierarchical control, temporarily joining
    forces to solve a particular problem at hand,
    bringing to the collaboration a subset of their
    resources, sharing those at their discretion and
    each under their own conditions.

18
What makes a Grid?
  • Coordinates resources not subject to central
    control
  • More than a cluster or centralised distributed
    computing
  • Security, AAA, billing/payment, integrity,
    procedures
  • using standard, open protocols
  • More than single-purpose solutions
  • Requires interoperability, a standards body,
    multiple implementations
  • to deliver non-trivial QoS
  • The sum is more than the individual components
    (e.g. single sign-on, transparency)

Ian Foster in Grid Today, 2002
19
Grid Architecture (v1)
20
Protocol Layers and Standards Bodies
  • Application, Presentation, Session: standards
    bodies GGF, W3C, OASIS
  • Transport, Network: standards body IETF
  • Data Link, Physical: standards body IEEE
21
Grid Architecture
Make all resources talk standard protocols. Promote
interoperability of application toolkits, similar to
the interoperability of networks achieved by Internet
standards.
  • Application Toolkits: DUROC, MPICH-G2, Condor-G,
    VLAM-G
  • Grid Services: GRAM, GridFTP, MDS, Replica
    Service, on top of the Grid Security
    Infrastructure (GSI)
  • Grid Fabric: Condor, MPI, PBS, Internet, Linux,
    SUN
22
What should the Grid provide?
  • Dependable, consistent and pervasive access
  • Interoperation among organisations
  • Challenges
  • Complete transparency for the user
  • Uniform access methods for computing, data and
    information
  • Secure, trustworthy environment for providers
  • Accounting (and billing)
  • Management-free Virtual Organizations

23
Grid Middleware
  • Globus Project started 1997
  • Focus on research only
  • Used and extended by many other projects
  • Toolkit 'bag-of-services' approach, not a
    complete architecture
  • Several middleware projects
  • EU DataGrid: production focus
  • CrossGrid, GridLAB, DataTAG, PPDG, GriPhyN
  • Condor
  • In NL: ICES/KIS Virtual Lab, VL-E

http://www.globus.org/
http://www.edg.org/
http://www.vl-e.nl/
24
Condor
  • Scavenging cycles off idle workstations
  • Leading themes
  • Make a job feel at home
  • Don't ever bother the resource owner!
  • Bypass: redirect data to process
  • ClassAds: matchmaking concept
  • DAGMan: dependent jobs (see the sketch after
    this list)
  • Kangaroo: file staging and hopping
  • NeST: allocated storage lots
  • PFS: Pluggable File System
  • Condor-G: reliable job control for the Grid

http://www.cs.wisc.edu/condor/
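DAGMan's core idea is to order dependent jobs: a child job starts only after all of its parents finish. A minimal sketch of that idea in Python; the job names and echo commands are hypothetical stand-ins for real Condor submissions, and this is not DAGMan's own input syntax:

```python
# Minimal sketch of the DAGMan idea: run dependent jobs in topological
# order, starting a job only when all of its parents have finished.
from graphlib import TopologicalSorter
import subprocess

# job -> parents: 'prepare' must finish before the two analysis jobs,
# which must both finish before 'merge'
dag = {
    "prepare": [],
    "analyse_a": ["prepare"],
    "analyse_b": ["prepare"],
    "merge": ["analyse_a", "analyse_b"],
}
commands = {name: ["echo", f"running {name}"] for name in dag}

# TopologicalSorter takes a node -> predecessors mapping
for job in TopologicalSorter(dag).static_order():
    print(f"submitting {job}")
    subprocess.run(commands[job], check=True)  # real DAGMan submits to Condor
```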
25
Application Toolkits
  • Collect and abstract services in an orderly
    fashion
  • Cactus: plug-n-play numeric simulations
  • Numeric propulsion system simulation (NPSS)
  • Commodity Grid Toolkits (CoGs): Java, CORBA, ...
  • NIMROD-G: parameter sweep simulations
  • Condor: high-throughput computing
  • GENIUS, VLAM-G, (web) portals to the Grid

26
Grids Today
27
Grid Protocols Today
  • Use the common Grid Security Infrastructure
  • Extensions to TLS for delegation (single sign-on)
  • Organisation of users in VOs
  • Currently deployed main services
  • GRAM (resource allocation): attribute/value
    pairs over HTTP
  • GridFTP (bulk file transfer): FTP with GSI and
    high-throughput extras (striping)
  • MDS (monitoring and discovery service): LDAP
    with a common resource description schema (see
    the query sketch below)
  • Next generation: Grid Services (OGSA)
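Since MDS is plain LDAP, any LDAP client can browse it. A hedged sketch using the ldap3 library; the host, port 2135, base DN, object class and attribute names are assumptions in the style of an MDS-2 GIIS, not a guaranteed schema:

```python
# Hedged sketch: MDS is plain LDAP, so any LDAP client can query it.
# Host, base DN and attribute names below are assumptions; adjust to
# the actual deployment.
from ldap3 import Server, Connection, ALL

server = Server("ldap://giis.example.org:2135", get_info=ALL)  # hypothetical GIIS
conn = Connection(server, auto_bind=True)  # MDS typically allowed anonymous bind

# look up compute-element entries and a few resource attributes
conn.search(
    search_base="mds-vo-name=local, o=grid",
    search_filter="(objectClass=ComputeElement)",  # schema-dependent
    attributes=["freecpus", "totalcpus", "hostname"],
)
for entry in conn.entries:
    print(entry.entry_dn, entry.entry_attributes_as_dict)
```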

28
Grid Security Infrastructure
  • Requirements
  • Secure
  • User identification
  • Accountability
  • Site autonomy
  • Usage control
  • Single sign-on (proxy sketch after this list)
  • Dynamic VOs any time and any place
  • Mobility (easyEverything, airport kiosk,
    handheld)
  • Multiple roles for each user
  • Easy!
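Single sign-on hinges on a short-lived proxy credential: the user's long-term key signs a certificate for a fresh key pair, and that proxy then acts on the user's behalf without exposing the long-term key. A sketch of the idea with the Python cryptography package; for brevity the user certificate is self-signed here, whereas in GSI it would be issued by a trusted CA and the proxy would follow the standardised proxy certificate profile:

```python
# Sketch of the proxy idea behind single sign-on: the long-term user key
# signs a short-lived "proxy" certificate over a fresh key pair.
from datetime import datetime, timedelta
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Jane Doe")])

proxy_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
# proxy subject = user subject plus an extra CN, as in GSI proxies
proxy_name = x509.Name([
    x509.NameAttribute(NameOID.COMMON_NAME, "Jane Doe"),
    x509.NameAttribute(NameOID.COMMON_NAME, "proxy"),
])

proxy_cert = (
    x509.CertificateBuilder()
    .subject_name(proxy_name)
    .issuer_name(user_name)              # issued (signed) by the user
    .public_key(proxy_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.utcnow())
    .not_valid_after(datetime.utcnow() + timedelta(hours=12))  # short-lived
    .sign(private_key=user_key, algorithm=hashes.SHA256())
)
print(proxy_cert.subject.rfc4514_string())
```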

29
Authentication: PKI
  • Asserting and binding identities
  • Trust issues on a global scale
  • EDG CA Coordination Group
  • 16 national certification authorities plus
    CrossGrid CAs
  • policies and procedures → mutual trust
  • users identified by CAs' certificates
  • Part of the world-wide GridPMA
  • Establishing minimum requirements
  • Includes several US and AP CAs
  • Scaling still a challenge

EDG CAs
CERN
CESNET
CNRS (3)
GermanGrid
Grid-Ireland
INFN
NIKHEF
NorduGrid
LIP
Russian DataGrid
DATAGRID-ES
GridPP
USDOE Root CA
US-DOE Sub CA
CrossGrid ()
http://marianne.in2p3.fr/datagrid/ca and
http://www.gridpma.org/
30
Authentication: PKI
  • EU DataGrid PKI: 1 PMA, 13 Certification
    Authorities
  • Automatic policy evaluation tools
  • Largest Grid PKI in the world (and growing)

31
Getting People Together: Virtual Organisations
  • The user community out there is large and highly
    dynamic
  • Applying at each individual resource does not
    scale
  • Users get together to form Virtual Organisations
  • Temporary alliance of stakeholders (users and/or
    resources)
  • Various groups and roles
  • Managed by (legal) contracts
  • Set up and dissolved at will (currently not yet
    that fast)
  • Authentication, Authorization, Accounting (AAA)

32
Authorization (today)
  • Virtual Organisation directories
  • Members are listed in a directory
  • Managed by the VO administrator
  • Sites extract access lists from the directories
    (see the grid-mapfile sketch below)
  • Only for VOs they have a contract with
  • Still need OS-local accounts
  • May also use automated tools (sysadmin level)
  • pool accounts
  • slashGrid

http://cern.ch/hep-project-grid-scg/
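What those automated tools boil down to is roughly this: fetch the VO membership list and write out a grid-mapfile mapping certificate subjects to local (pool) accounts. A hedged sketch with the VO data inlined; a real tool such as mkgridmap would fetch it over LDAP from the VO directory, and every DN below is made up:

```python
# Hedged sketch of grid-mapfile generation from VO membership data.
vo_members = {
    "atlas": [
        "/O=dutchgrid/O=users/CN=Jane Doe",
        "/O=dutchgrid/O=users/CN=John Smith",
    ],
}
contracted_vos = {"atlas"}  # only VOs this site has a contract with

with open("grid-mapfile", "w") as f:
    for vo, subjects in vo_members.items():
        if vo not in contracted_vos:
            continue
        for dn in subjects:
            # the ".vo" form asks the gatekeeper to lease a pool account
            f.write(f'"{dn}" .{vo}\n')
```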
33
Grid Security in Action
  • Key elements in the Grid Security Infrastructure
    (GSI)
  • Proxy
  • Trusted certificate store
  • Delegation: full or restricted rights
  • Access services directly
  • Establish trust between processes

34
GSI in Action: Create Processes at A and B that Communicate and Access Files at C
[Diagram: single sign-on via the grid id generates a proxy credential (or the proxy credential is retrieved from an online repository). The user proxy issues remote process creation requests to GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix); each authorizes the request, maps it to a local id, creates the process, and generates restricted proxy credentials for it. The two processes communicate with mutual authentication and use their restricted proxies for remote file access requests to the GSI-enabled FTP server at Site C (Kerberos), which authorizes, maps to a local id, and accesses the file on the storage system; at Kerberos sites a Kerberos ticket backs the local id.]
35
Large-scale production Grids
  • Until recently usually smallish
  • O(10) sites, O(20) users
  • Only one community (VO)
  • Running production Grids
  • EU DataGrid (EDG)
  • Stress testing up to 2000 jobs at any time
  • Focus on stability (>99% of jobs complete
    correctly)
  • VL-E
  • NASA IPG
  • LCG, PPDG/iVDGL

Example Grid
36
EU DataGrid
  • Middleware research project (2001-2003)
  • Driving applications
  • HE Physics
  • Earth Observation
  • Biomedicine
  • Operational testbed
  • 25 sites, 50 CEs
  • 8 VOs
  • 350 users, growing by 50/month!

http://www.eu-datagrid.org/
37
EU DataGrid Test Bed 1
  • DataGrid TB1
  • 14 countries
  • 21 major sites
  • CrossGrid: 40 more sites
  • Submitting jobs
  • Log in only once, run everywhere
  • Cross administrative boundaries in a secure and
    trusted way
  • Mutual authorization

http://marianne.in2p3.fr/
38
EDG 3-Tier Architecture
[Diagram: the Client/User Interface sends a request to the Execution Resources (Compute Element), which passes requests on to the Data Server (Storage Element) backed by a database server; data and results flow back to the client.]
39
Example: GOME
Step 8: Visualize Results
40
GOME processing cycle
[Diagram: raw satellite data from the GOME instrument (Level 1) → ESA/KNMI processing of the raw GOME data into ozone profiles with Opera and Noprego → IPSL validation of the GOME ozone profiles against ground-based measurements from a LIDAR database (Level 2) → visualization, all running on DataGrid.]
41
EDG Logical Machine Types
  • Computing Element (CE)
  • Gatekeeper
  • (Front-end Node)
  • Worker Nodes (WN)
  • Storage Element (SE)
  • Replica Catalog (RC)
  • User Interface (UI)
  • Resource Broker (RB)
  • Information Service (IS)

42
Situation on a Grid
INFORMATION SERVICES
43
Information Services (IS)
HARDWARE: fabric and storage
  • Cluster information
  • Storage capacity
  • Network connections

Today info-providers publish to the IS hierarchical
directory; next week, R-GMA, a producer-consumer
framework based on an RDBMS (see the sketch below)

DATA: files and collections
  • File replica locations

Today the Replica Catalogue (RC); in a few months
the Replica Location Service

SOFTWARE: programs and services
  • RunTime Environment tags
  • Service entries (SE, CE, RC)

Today in the IS
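The R-GMA model is relational: producers insert tuples into what behaves like one big database of the Grid, and consumers run SQL over it. A minimal sketch of that producer-consumer idea, with SQLite standing in for the R-GMA registry and mediator machinery; the table and site figures are invented:

```python
# Sketch of the R-GMA idea: producers insert monitoring tuples into a
# relational view of the Grid, consumers issue SQL queries against it.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cluster (site TEXT, cpus INTEGER, free INTEGER)")

# producer side: each site publishes its fabric information
db.executemany(
    "INSERT INTO cluster VALUES (?, ?, ?)",
    [("NIKHEF", 150, 42), ("SARA", 64, 10), ("CERN", 400, 0)],
)

# consumer side: a broker asks where there is free capacity
for site, free in db.execute(
    "SELECT site, free FROM cluster WHERE free > 0 ORDER BY free DESC"
):
    print(f"{site}: {free} free CPUs")
```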
44
Grid job submission
  • Basic protocol: GRAM
  • Job submission at an individual CE
  • Status inquiries
  • Credential delegation
  • File staging
  • Job manager (baby-sitter; see the sketch below)
  • Collective services (Workload Management System)
  • Resource broker
  • Job submission service
  • Logging and Bookkeeping
  • The EDG WMS tries to optimize the usage of
    resources
  • Will re-submit on resource failure

Many WMSs exist ...
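The baby-sitting and re-submission logic amounts to a watch-and-retry loop. A sketch of the idea; submit() and poll() are hypothetical stand-ins for the GRAM submission and status-inquiry calls, and the CE names are made up:

```python
# Sketch of the job-manager "baby-sitter" role: submit a job, watch its
# status, and re-submit elsewhere on resource failure (which is also
# what the EDG WMS does for you).
import random
import time

def submit(ce: str) -> str:
    print(f"submitting to {ce}")
    return f"job-{random.randint(1000, 9999)}"

def poll(job_id: str) -> str:
    # a real poll would do a GRAM status inquiry
    return random.choice(["done", "failed"])

candidate_ces = ["ce1.example.org", "ce2.example.org", "ce3.example.org"]

for ce in candidate_ces:               # try resources in broker order
    job = submit(ce)
    time.sleep(0.1)                    # real code would poll periodically
    status = poll(job)
    print(f"{job} on {ce}: {status}")
    if status == "done":
        break                          # success, stop re-submitting
else:
    print("all candidate resources failed")
```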
45
The EDG WMS
  • The user interacts with the Grid via a Workload
    Management System
  • The goal of the WMS is distributed scheduling
    and resource management in a Grid environment
  • What does it allow Grid users to do?
  • To submit their jobs
  • To execute them
  • To get information about their status
  • To retrieve their output

46
WMS Components
  • The WMS is currently composed of the following
    parts
  • User Interface (UI): access point for the user
    to the WMS
  • Resource Broker (RB): the broker of Grid
    resources, responsible for finding the best
    resources to submit jobs to
  • Job Submission Service (JSS): provides a
    reliable submission system
  • Information Index (II): a specialized Globus
    GIIS (LDAP server) used by the Resource Broker as
    a filter on the information service (IS) to
    select resources
  • Logging and Bookkeeping services (LB): store job
    info, available for users to query

47
Job Preparation
  • Information to be specified
  • Job characteristics
  • Requirements and Preferences of the computing
    system
  • Software dependencies
  • Job Data requirements
  • Specified using a Job Description Language (JDL)

48
Example JDL File

Executable = "gridTest";
StdError = "stderr.log";
StdOutput = "stdout.log";
InputSandbox = {"/home/joda/test/gridTest"};
OutputSandbox = {"stderr.log", "stdout.log"};
InputData = "LF:testbed0-00019";
ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/lc=test, rc=WP2 INFN Test, dc=infn, dc=it";
DataAccessProtocol = "gridftp";
Requirements = other.Architecture == "INTEL" && other.OpSys == "LINUX" && other.FreeCpus > 4;
Rank = other.MaxCpuTime;

This JDL is input to dg-job-submit
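What the Resource Broker does with such a file can be sketched as filter-then-rank: keep the CEs whose published attributes satisfy Requirements, then order the survivors by Rank, higher being better. A hedged Python sketch of that logic with invented CE data:

```python
# Sketch of Resource Broker matchmaking for the JDL above: filter CEs by
# Requirements, then order by Rank (higher is better). CE data is made up.
ces = [
    {"name": "ce1", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 12, "MaxCpuTime": 2880},
    {"name": "ce2", "Architecture": "INTEL", "OpSys": "LINUX",
     "FreeCpus": 2,  "MaxCpuTime": 4320},
    {"name": "ce3", "Architecture": "SPARC", "OpSys": "SOLARIS",
     "FreeCpus": 30, "MaxCpuTime": 1440},
]

def requirements(other: dict) -> bool:
    return (other["Architecture"] == "INTEL"
            and other["OpSys"] == "LINUX"
            and other["FreeCpus"] > 4)

def rank(other: dict) -> int:
    return other["MaxCpuTime"]

matches = sorted((ce for ce in ces if requirements(ce)), key=rank, reverse=True)
print("best CE:", matches[0]["name"] if matches else None)  # -> ce1
```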
49
Job Submission Scenario
Replica Catalogue (RC)
Information Service (IS)
Resource Broker (RB)
Storage Element (SE)
Logging Bookkeeping (LB)
Job Submission Service (JSS)
Compute Element (CE)
50-58
Example: Job Status
[Slides 50 through 58 repeat the job submission diagram (Replica Catalogue, Information Service, Resource Broker, Storage Element, Logging & Bookkeeping, Job Submission Service, Compute Element), advancing the job status step by step: submitted → waiting → ready → scheduled → running → done → outputready.]
59
Data Access and Transport
  • Requirements
  • Support single sign-on
  • Transfer large files quickly
  • Confidentiality/integrity
  • Integrated with information systems (RC)
  • Extensions to the FTP protocol: GridFTP
  • GSI, DCAU
  • Server striping, parallel streams (see the
    sketch below)
  • TCP protocol optimisation

not trivial!
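The parallel-streams trick can be illustrated outside GridFTP: split a transfer into byte ranges and fetch them over several TCP connections at once. A sketch using HTTP range requests as a stand-in for GridFTP data channels; the URL and file size are hypothetical:

```python
# Sketch of the parallel-streams idea: fetch byte ranges of one file
# concurrently and reassemble them in order.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

URL = "http://data.example.org/big.file"   # hypothetical server
SIZE = 4 * 1024 * 1024                     # pretend file size: 4 MiB
STREAMS = 4

def fetch_range(start: int, end: int) -> bytes:
    req = Request(URL, headers={"Range": f"bytes={start}-{end}"})
    with urlopen(req) as resp:
        return resp.read()

chunk = SIZE // STREAMS
ranges = [(i * chunk, min((i + 1) * chunk, SIZE) - 1) for i in range(STREAMS)]

with ThreadPoolExecutor(max_workers=STREAMS) as pool:
    parts = list(pool.map(lambda r: fetch_range(*r), ranges))

data = b"".join(parts)   # reassemble in range order
```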
60
EDG Storage Element
  • Transfer methods
  • GridFTP
  • RFIO
  • G-HTTPS
  • Replica Catalogue
  • Yesterday: LDAP directory using GDMP
  • Today: Replica Location Service and Giggle
  • Backend systems
  • Disk storage
  • HPSS via HRM
  • HPSS with explicit staging

61
Grid Databases?!
  • Database Access and Integration (DAI) WG
  • OGSA-DAI integration project
  • Data Virtualisation Services
  • Standard Data Source Services
  • Early emerging standards
  • Grid Data Service specification (GDS)
  • Grid Data Service Factory (GDSF)
  • Largely a spin-off from the UK e-Science effort
    and DataGrid

62
Grid Access to Databases
  • Spitfire (standard data source services):
    uniform access to persistent storage on the Grid
    (see the sketch below)
  • Multiple-role support
  • Compatible with GSI (single sign-on) through CoG
  • Uses standard stuff: JDBC, SOAP, XML
  • Supports various back-end databases

http://hep-proj-spitfire.web.cern.ch/hep-proj-spitfire/
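The mediation Spitfire performs can be sketched as: run a query against a backend database and hand the rows back as XML that any client can parse. A minimal sketch with SQLite in place of the real JDBC backend and with the GSI/SOAP transport omitted; table contents are invented:

```python
# Hedged sketch of the Spitfire mediation idea: query a backend database
# and serialize the rows as XML for Grid clients.
import sqlite3
from xml.etree.ElementTree import Element, SubElement, tostring

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (lfn TEXT, site TEXT)")
db.execute("INSERT INTO files VALUES ('testbed0-00019', 'NIKHEF')")

def query_to_xml(sql: str) -> bytes:
    cur = db.execute(sql)
    cols = [d[0] for d in cur.description]
    root = Element("rowset")
    for row in cur:
        row_el = SubElement(root, "row")
        for col, val in zip(cols, row):
            SubElement(row_el, col).text = str(val)
    return tostring(root)

print(query_to_xml("SELECT lfn, site FROM files").decode())
```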
63
Spitfire security model
  • Standard access to DBs
  • GSI-secured SOAP protocol
  • Strong authentication
  • Supports single sign-on
  • Local role repository
  • Connection pool to multiple backend DBs
  • Version 1.0 is out; the Web Services version is
    in alpha

64
Bringing Grids to the User
  • Core services too complex to present to
    scientists
  • design (graphical/web) portals
  • VLAM-G
  • GENIUS/EnginFrame
  • EDG GUI
  • Application-specific interfaces

65
A Bright Future?
66
Grids Around the World
  • Many different Grid projects
  • Different goals (and thus architectures)
  • Breadth of applications
  • Meta-supercomputing (origin of the Grid)
  • High-throughput computing (DataGrids)
  • Collaboratories, data fusion grids
  • Harnessing idle workstations
  • Transaction-oriented grids (industry)
  • Interoperability requires standardisation!

67
Standards Requirements
  • GGF established in 2001, a merger of the Grid
    Forum and the eGrid Forum
  • Approx. 50 working and research groups

http://www.ggf.org/
68
OGSA current directions
  • Open Grid Services Architecture: cleaning up
    the protocol mess
  • Uses standard containers (based on web services)
  • Based on common standards
  • SOAP, WSDL, UDDI
  • Running over an upgraded Grid Security
    Infrastructure (GSI)
  • New in OGSA: adding transient, manageable
    services
  • State of distributed activities
  • Workflow, multi-media, distributed data analysis

69
OGSA Roadmap
  • Introduced at GGF4 (Toronto, March 2002)
  • The OGSI definition draft went out for final
    call last week
  • First implementation: Globus Toolkit v3
  • Currently in alpha testing
  • Beta release in July
  • Significant effort towards homogeneous interfaces
  • Large commitment (world-wide and local)

70
Dutch Dimensions
71
SURFnet5 connectivity
http://www.surfnet.nl/
72
Networking Europe
http://www.dante.net/
73
DutchGrid Platform
www.dutchgrid.nl
  • DutchGrid
  • Test bed coordination
  • PKI security
  • Support
  • Participation by
  • NIKHEF, KNMI, SARA
  • DAS-2 (ASCI): TU Delft, Leiden, VU, UvA, Utrecht
  • Telematics Institute
  • FOM, NWO/NCF
  • Min. EZ (ICES/KIS)
  • IBM, KPN, ...

[Map: ASTRON/JIVE, Amsterdam, TELIN, Leiden, KNMI, Utrecht, Delft, Nijmegen]
74
Resources
  • ASCI DAS-2 (VU, UvA, Leiden, TU Delft, Utrecht)
  • 200 dual P-III 1 GHz CPUs
  • homogeneous clusters, 5 locations
  • NIKHEF DataGrid clusters
  • 75 dual P-III 1 GHz
  • 1 Gb/s IPv4, 1 Gb/s IPv6
  • NCF Grid: national computer facilities
    foundation of NWO
  • 66-node dual AMD-K7 fabric research cluster
    (NIKHEF)
  • 32-node dual production-quality cluster (SARA)
  • 10 Gb/s optical lambda test bed
  • BioASP: various smaller O(1-10 node) clusters

75
Resources (cont.)
  • SARA, the national HPC centre
  • Processing
  • SGI 1024-processor MPP
  • Mass storage
  • StorageTek NearLine tape robot
  • currently 500 TByte
  • Integrated as an EDG Storage Element
  • User expertise centre
  • SURFnet networking
  • 2.5-10 Gb/s international
  • 10 Gb/s to dedicated centres (DAS-2, ASTRON)

76
A Bright Future!
You could plug your computer into the wall and
have direct access to huge (computing) resources
almost immediately, with a little help from
toolkits and portals. It may still be science,
although not fiction, but we are working hard to
get there!