The Anatomy of the Grid: Enabling Scalable Virtual Organizations

1
The Anatomy of the Grid: Enabling Scalable
Virtual Organizations
  • Ian Foster
  • Mathematics & Computer Science Division
  • Argonne National Laboratory
  • and
  • Dept. of Computer Science
  • The University of Chicago
  • http://www.mcs.anl.gov/foster

David S. Angulo
Dept. of Computer Science, The University of Chicago
and Mathematics & Computer Science Division, Argonne National Laboratory
http://www.cs.uchicago.edu/dangulo
2nd US-Hungarian Workshop on Cluster and Grid Computing, February 6, 2002
2
Abstract
  • "Grid" computing has emerged as an important new
    field
  • Distinguished from conventional distributed
    computing by focus on
  • Large-scale resource sharing
  • Innovative applications
  • High-performance orientation (in some cases)
  • In this talk, this new field is defined
  • First, the "Grid problem" is reviewed, which Ian
    Foster defines as
  • flexible, secure, coordinated resource sharing
    among dynamic collections of individuals,
    institutions, and resources (referred to as
    virtual organizations)
  • Challenges in such settings
  • authentication
  • authorization
  • resource access
  • resource discovery
  • and other challenges

3
Abstract (Cont.)
  • This class of problem is addressed by Grid
    technologies
  • Major Grid projects worldwide reviewed
  • Describe their contributions to the realization
    of this architecture.
  • Future Architecture Overview
  • Open Grid Services Architecture is presented

4
Partial Acknowledgements
  • Globus Toolkit™
  • R&D involves
  • many fine scientists & engineers at ANL/UofC,
    USC/ISI, and elsewhere (see www.globus.org)
  • Led by
  • Ian Foster @ Argonne/UofC
  • Carl Kesselman @ USC/ISI
  • Open Grid Services Architecture work performed by
  • Ian Foster, Globus Co-PI @ Argonne/UofC
  • Carl Kesselman, Globus Co-PI @ USC/ISI
  • Steve Tuecke, Globus Toolkit Architect @ ANL
  • Jeff Nick, Steve Graham, Jeff Frey @ IBM
  • Strong collaborations with many outstanding EU,
    UK, US Grid projects
  • Support from DOE, NASA, NSF, Microsoft, IBM

5
Grid Computing
6
The Grid Problem
  • Resource sharing & coordinated problem solving
    in dynamic, multi-institutional virtual
    organizations

7
Why Grids?
  • A biochemist exploits 10,000 computers to screen
    100,000 compounds in an hour
  • 1,000 physicists worldwide pool resources for
    petaflop analyses of petabytes of data
  • Civil engineers collaborate to design, execute,
    & analyze shake table experiments
  • Climate scientists visualize, annotate, & analyze
    terabyte simulation datasets
  • A home user invokes architectural design
    functions at an application service provider
  • An application service provider purchases cycles
    from compute cycle providers

8
Elements of the Problem
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Sharing always conditional: issues of trust,
    policy, payment, ...
  • Coordinated problem solving
  • Beyond client-server: distributed data analysis,
    computation, ...
  • Dynamic, multi-institutional virtual orgs
  • Community overlays on classic org structures
  • Large or small, static or dynamic

9
Grids: Why Now?
  • Moore's law improvements in computing produce
    highly functional end systems
  • The Internet and burgeoning wired and wireless
    provide universal connectivity
  • Network exponentials produce dramatic changes in
    geometry and geography

10
Grids: Why Now?
  • Moore's law improvements in computing produce
    highly functional end systems
  • The Internet and burgeoning wired and wireless
    provide universal connectivity
  • Network exponentials produce dramatic changes in
    geometry and geography
  • 9-month doubling: double Moore's law!
  • 1986-2001: x340,000; 2001-2010: x4000?
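
The projected network growth above follows from compounding a fixed doubling period. A minimal sketch of the arithmetic, assuming an 18-month doubling period for Moore's law (an assumption; the slide only says the 9-month network rate is double it). The function name `growth_factor` is hypothetical.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth after `years` if capacity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)

# 2001-2010 is 9 years: 12 doublings at 9 months each.
network_2001_2010 = growth_factor(9, 9)
# Same period at an assumed 18-month Moore's-law doubling: 6 doublings.
compute_2001_2010 = growth_factor(9, 18)

print(f"network, 9-month doubling over 9 years:  x{network_2001_2010:,.0f}")
print(f"compute, 18-month doubling over 9 years: x{compute_2001_2010:,.0f}")
```

Twelve doublings give 2^12 = 4096, which is consistent with the slide's "x4000?" estimate; under the assumed 18-month period, computing grows by 2^6 = 64 over the same years.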

11
A Little History
  • Early 90s
  • Gigabit testbeds, metacomputing
  • Mid to late 90s
  • Early experiments (e.g., I-WAY), software
    projects (e.g., Globus), application experiments
  • 2002
  • Major application communities emerging
  • Major infrastructure deployments are underway
  • Rich technology base has been constructed
  • Global Grid Forum: >1000 people on mailing lists,
    192 orgs at last meeting, 28 countries

12
The Grid World: Current Status
  • Dozens of major Grid projects in scientific &
    technical computing/research & education
  • Deployment, application, technology
  • Considerable consensus on key concepts and
    technologies
  • Globus Toolkit has emerged as de facto standard
    for major protocols & services
  • Global Grid Forum has emerged as a significant
    force
  • And first Grid proposals at IETF

13
Selected Major Grid Projects
Name | URL & Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid; DOE, NSF | Create & deploy group collaboration systems using commodity technologies
BlueGrid | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom; DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org; DOE Office of Science | Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG) | earthsystemgrid.org; DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org; European Union | Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics
14
Selected Major Grid Projects
Name | URL/Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org; DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org; DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab | gridlab.org; European Union | Grid technologies and applications
GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education
15
Selected Major Grid Projects
Name | URL/Sponsor | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads; NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org; NSF | Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov; NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org; NSF | Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org; NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net; DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments
16
Selected Major Grid Projects
Name | URL/Sponsor | Focus
TeraGrid | teragrid.org; NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK eScience Grid | grid-support.ac.uk; U.K. eScience | Support center for Grid projects within the U.K.
Unicore | BMBFT | Technologies for remote access to supercomputers
Also many technology R&D projects: e.g., Condor,
NetSolve, Ninf, NWS
See also www.gridforum.org
17
Grid Communities & Applications: Data Grids for
High Energy Physics
www.griphyn.org
www.ppdg.net
www.eu-datagrid.org
18
Grid Communities and Applications: Mathematicians
Solve NUG30
  • Community: an informal collaboration of
    mathematicians and computer scientists
  • Condor-G delivers 3.46E8 CPU seconds in 7 days
    (peak 1009 processors) in U.S. and Italy (8
    sites)
  • Solves NUG30 quadratic assignment problem

14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22, 13, 26,
17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23
Argonne, Iowa, NWU, Wisconsin
www.mcs.anl.gov/metaneos
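
For readers unfamiliar with the quadratic assignment problem (QAP), the sketch below evaluates the standard QAP objective and brute-forces a toy instance. The 3x3 `flow` and `dist` matrices are hypothetical data, not the NUG30 matrices, and exhaustive search is only an illustration: it is not the method used in the run described above. The last lines check two facts that do come from the slide: the printed solution is a permutation of 1..30, and 3.46E8 CPU seconds over 7 days corresponds to an average processor count.

```python
from itertools import permutations

def qap_cost(flow, dist, assignment):
    """Cost of placing facility i at location assignment[i] (0-based indices)."""
    n = len(assignment)
    return sum(flow[i][j] * dist[assignment[i]][assignment[j]]
               for i in range(n) for j in range(n))

def brute_force_qap(flow, dist):
    """Try every permutation; feasible only for tiny n (n! candidates)."""
    n = len(flow)
    return min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))

# Toy instance (hypothetical data for illustration only).
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
dist = [[0, 1, 4],
        [1, 0, 2],
        [4, 2, 0]]

best = brute_force_qap(flow, dist)
print(best, qap_cost(flow, dist, best))

# The solution printed on the slide uses each of 1..30 exactly once.
nug30 = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25, 22,
         13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]
is_permutation = sorted(nug30) == list(range(1, 31))

# Average number of busy processors implied by the slide's figures.
avg_processors = 3.46e8 / (7 * 24 * 3600)
print(is_permutation, round(avg_processors))
```

With 30 facilities there are 30! candidate assignments, which is why exhaustive search is hopeless and a large pool of processors was needed. The average of roughly 572 busy processors sits well below the reported peak of 1009.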
19
Grid Communities and Applications: Network for
Earthquake Eng. Simulation
  • NEESgrid: national infrastructure to couple
    earthquake engineers with experimental
    facilities, databases, computers, & each other
  • On-demand access to experiments, data streams,
    computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
www.neesgrid.org
20
The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Figure: TeraGrid site diagram. Four sites (Caltech; Argonne; NCSA/PACI, 8 TF, 240 TB; SDSC, 4.1 TF, 225 TB), each shown with site resources, archival storage (HPSS or UniTree), and external network connections.]
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
21
Intl. Virtual Data Grid Lab.
www.ivdgl.org
22
Access Grid
  • Collaborative work among large groups
  • 50 sites worldwide
  • Use Grid services for discovery, security
  • www.scglobal.org

Access Grid: Argonne, others
www.accessgrid.org
23
Grid Architecture & Globus Toolkit
  • The question
  • What is needed for resource sharing & coordinated
    problem solving in dynamic virtual organizations
    (VOs)?
  • The answer
  • Major issues identified: membership, resource
    discovery & access, ...
  • Grid architecture captures core elements,
    emphasizing pre-eminent role of protocols
  • Globus Toolkit has emerged as de facto standard
    for major protocols & services

24
The Critical Role of Protocols
  • Need for interoperability when different groups
    want to share resources
  • E.g., IP lets me talk to your computer, but how
    do we establish & maintain sharing?
  • How do I discover, authenticate, authorize,
    describe what I want to do, etc., etc.?
  • Need for shared infrastructure services to avoid
    repeated development, installation, e.g.
  • One port/service for remote access to computing,
    not one per tool/application
  • X.509 enables sharing of Certificate Authorities

25
Grid Architecture
For more info: www.globus.org/research/papers/anatomy.pdf
26
Globus Project and Toolkit
  • Globus Project
  • R&D project at ANL, U.Chicago, USC/ISI
  • Emphasis on identifying and defining core
    protocols and services
  • O(40) researchers & developers
  • Globus Toolkit
  • A major product of the Globus Project
  • Open source software reference implementation of
    core protocols & services
  • Growing open source developer community

27
Globus Architecture (1): Fabric Layer
  • Diverse resources that may be shared
  • Computers, clusters, Condor pools, file systems,
    archives, metadata catalogs, networks, sensors,
    etc., etc.
  • Speak connectivity, resource protocols
  • The neck of the protocol hourglass
  • May implement standard behaviors
  • Reservation, pre-emption, virtualization
  • Grid operation can have profound implications for
    resource behavior

[Figure: a Grid resource and its registration, enquiry, management, and access protocol(s)]
28
Globus Architecture (2): Connectivity Layer
Protocols & Services
  • Communication
  • Internet protocols: IP, DNS, routing, etc.
  • Security: Grid Security Infrastructure (GSI)
  • Uniform authentication & authorization mechanisms
    in multi-institutional setting
  • Single sign-on, delegation, identity mapping
  • Public key technology, SSL, X.509, GSS-API
    (several Internet drafts document extensions)
  • Supporting infrastructure: Certificate
    Authorities, key management, etc.

29
GSI in Action: Create Processes at A and B that
Communicate & Access Files at C
[Figure: a user creates processes on computers at Site A (Kerberos) and Site B (Unix) that communicate and access a storage system at Site C (Kerberos)]
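
The identity-mapping step named on the connectivity-layer slide can be illustrated with a small sketch: one global certificate subject is mapped to a different local account at each site, so the user signs on once while every site keeps its own account names. The mapping-file format, the subject names, and the account names below are assumptions made for illustration; this is not the GSI implementation.

```python
import shlex

def parse_mapfile(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, account = shlex.split(line)
        mapping[subject] = account
    return mapping

def local_account(mapping, subject):
    """Return the site-local account for an authenticated subject, or None."""
    return mapping.get(subject)

# Hypothetical per-site maps: same global identity, different local accounts.
SITE_A_MAP = parse_mapfile('''
# subject name                        local account
"/O=Grid/OU=Example/CN=Jane Doe"      jdoe
''')
SITE_B_MAP = parse_mapfile('''
"/O=Grid/OU=Example/CN=Jane Doe"      u4711
''')

user = "/O=Grid/OU=Example/CN=Jane Doe"
print(local_account(SITE_A_MAP, user), local_account(SITE_B_MAP, user))
```

A subject that a site has not listed maps to `None`, which a real service would treat as an authorization failure.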
30
Globus Architecture (3): Resource Layer
Protocols & Services
  • Resource management: GRAM
  • Remote allocation, reservation, monitoring,
    control of compute resources
  • Data access: GridFTP
  • High-performance data access & transport
  • Information: MDS (GRRP, GRIP)
  • Access to structure & state information
  • Others emerging: database access, code
    repository access, accounting, ...
  • All integrated with GSI

31
GRAM Resource Management Protocol
  • Grid Resource Allocation & Management
  • Allocation, monitoring, control of computations
  • Secure remote access to diverse schedulers
  • Current evolution
  • Immediate and advance reservation
  • Multiple resource types: manage anything
  • Recoverable requests, timeout, etc.
  • Evolve to Web Services
  • Policy evaluation points for restricted proxies

Karl Czajkowski, Steve Tuecke, others
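
The "allocation, monitoring, control" cycle can be pictured as a small job state machine. The sketch below is a generic illustration under stated assumptions: the state names, the `Job` class, and its methods are hypothetical and do not reproduce the GRAM protocol or any Globus Toolkit API.

```python
from enum import Enum

class JobState(Enum):
    PENDING = "pending"   # request accepted, waiting for the local scheduler
    ACTIVE = "active"     # running on the allocated resource
    DONE = "done"         # finished successfully
    FAILED = "failed"     # failed or cancelled

# Legal transitions; terminal states have no successors.
ALLOWED = {
    JobState.PENDING: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.DONE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}

class Job:
    def __init__(self, executable, count=1):
        self.executable = executable
        self.count = count
        self.state = JobState.PENDING      # allocation request submitted
        self.history = [self.state]        # monitoring: every observed state

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}")
        self.state = new_state
        self.history.append(new_state)

    def cancel(self):
        """Control: cancelling a non-terminal job marks it FAILED."""
        if ALLOWED[self.state]:
            self.transition(JobState.FAILED)

job = Job("/bin/hostname", count=4)
job.transition(JobState.ACTIVE)
job.transition(JobState.DONE)
print([s.name for s in job.history])
```

Keeping the legal transitions in one table means the same client-side logic can sit in front of very different local schedulers, which is the point of a uniform resource management protocol.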
32
Data Access & Transfer
  • GridFTP: extended version of popular FTP protocol
    for Grid data access and transfer
  • Secure, efficient, reliable, flexible,
    extensible, parallel, concurrent, e.g.
  • Third-party data transfers, partial file
    transfers
  • Parallelism, striping (e.g., on PVFS)
  • Reliable, recoverable data transfers
  • Reference implementations
  • Existing clients and servers: wuftpd, ncftp
  • Flexible, extensible libraries

Bill Allcock, Joe Bester, John Bresnahan, Steve
Tuecke, others
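
Partial file transfers and parallelism both rest on addressing a file by byte range. The sketch below shows only that idea: it splits a file into contiguous ranges, one per stream, and reassembles them in arbitrary order. The function names are hypothetical, the "transfer" is simulated in memory, and nothing here speaks the GridFTP protocol.

```python
def split_ranges(file_size, streams):
    """Split [0, file_size) into contiguous (offset, length) ranges, one per stream."""
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)  # spread the remainder
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges

def reassemble(source, ranges):
    """Simulate a receiver writing each block at its own offset, in any order."""
    out = bytearray(len(source))
    for offset, length in reversed(ranges):      # deliberately out of order
        out[offset:offset + length] = source[offset:offset + length]
    return bytes(out)

data = bytes(range(256)) * 4      # 1024 bytes standing in for a remote file
ranges = split_ranges(len(data), 3)
print(ranges)
print(reassemble(data, ranges) == data)
```

Because each block carries its own offset, blocks may arrive out of order, and a failed transfer can resume by requesting only the missing ranges.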
33
Grid Services Architecture (4): Collective Layer
Protocols & Services
  • Community membership & policy
  • E.g., Community Authorization Service
  • Index/metadirectory/brokering services
  • E.g., Globus GIIS, Condor Matchmaker
  • Replica management and replica selection
  • Optimize aggregate data access performance
  • Co-reservation and co-allocation services
  • End-to-end performance
  • Middle tier services
  • MyProxy credential repository, portal services

34
Data Grids
  • Grid infrastructures, tools, and applications
    focused on enabling distributed access to,
    analysis of, large amounts of data
  • A specialization and extension of standard Grid
    technologies
  • Current application domains include high energy
    & nuclear physics, climate data analysis,
    astronomy, bioinformatics

35
Grid Physics Network (GriPhyN)
  • Enabling R&D for advanced data grid systems,
    focusing in particular on Virtual Data concept

ATLAS, CMS, LIGO, SDSS
Paul Avery, Ian Foster, Co-PIs
www.griphyn.org
36
Future Directions
  • Initial exploration (1996-1999, Globus 1.0)
  • Extensive application experiments; core protocols
  • Data Grids (1999-??, Globus 2.0)
  • Large-scale data management and analysis
  • Open Grid Services Architecture (2001-??, Globus
    3.0)
  • Integration w/ Web services, hosting envs.
  • Integration with databases
  • Integrated set of higher-level services
  • Scalable systems (2003-??)
  • Sensors, wireless, ubiquitous computing

37
Summary
  • The Grid problem: Resource sharing & coordinated
    problem solving in dynamic, multi-institutional
    virtual organizations
  • Grid architecture: Protocol, service definition
    for interoperability & resource sharing
  • Globus Toolkit: a source of protocol and API
    definitions and reference implementations
  • And many projects applying Grid concepts (&
    Globus technologies) to important problems
  • Timely to start applying technologies to
    industrial problems, within & outside STC

38
For More Information
  • The Globus Project
  • www.globus.org
  • Global Grid Forum
  • www.gridforum.org
  • Grid architecture
  • www.globus.org/research/papers/anatomy.pdf
  • Open Grid Services Architecture (soon)
  • www.globus.org/research/papers/ogsa.pdf
  • www.globus.org/research/papers/gsspec.pdf