CS 267: Applications of Parallel Computers - Grid Computing
1
CS 267: Applications of Parallel Computers
Grid Computing
  • Kathy Yelick
  • Material based on lectures by Ian Foster and Carl Kesselman

2
Grid Computing is NOT
Slide source: Jim Napolitano @ RPI
3
Problem Overview
  • Many activities in research and education are
    collaborative
  • sharing of data and code
  • sharing of computing and experimental facilities
  • Goal of grids is to simplify these activities
  • Approach
  • Create advanced middleware services that enable
    large-scale, flexible resource sharing on a
    national and international scale.

Slide source: Ian Foster @ ANL
4
Outline
  • Problem statement: What are Grids?
  • Architecture and technologies
  • Projects
  • Future

5
The Grid Problem
  • Resource sharing and coordinated problem solving
    in dynamic, multi-institutional virtual
    organizations

Slide source: Ian Foster @ ANL
6
What is a Grid?
  • The "Grid problem" is defined by Ian Foster as:
  • flexible, secure, coordinated resource sharing
  • among dynamic collections of individuals,
    institutions, and resources (referred to as
    virtual organizations)
  • Terminology from analogy to the electric power
    grid
  • Computing as a (public) utility
  • The key technical challenges in grid computing
  • authentication
  • authorization
  • resource access
  • resource discovery
  • Grids are sometimes categorized as
  • Data grids, computational grids, business grids, ...

Slide derived from Sara Murphy @ HP
7
The evolving notion of the Grid
  • A computational grid is a hardware and software
    infrastructure that provides dependable,
    consistent, pervasive, and inexpensive access to
    high-end computing capabilities.
  • Ian Foster and Carl Kesselman, editors, The
    Grid: Blueprint for a New Computing
    Infrastructure (Morgan Kaufmann Publishers, SF,
    1999), 677 pp., ISBN 1-55860-8
  • The Grid is an infrastructure to enable virtual
    communities to share distributed resources to
    pursue common goals
  • The Grid infrastructure consists of protocols,
    application programming interfaces, and software
    development kits to provide authentication,
    authorization, and resource location/access
  • Foster, Kesselman, Tuecke, "The Anatomy of the
    Grid: Enabling Scalable Virtual Organizations",
    http://www.globus.org/research/papers.html
  • The Grid integrates services across distributed,
    heterogeneous, dynamic virtual organizations
    formed from the disparate resources within a
    single enterprise and/or from external resource
    sharing and service provider relationships in
    both e-business and e-science
  • Foster, Kesselman, Nick, Tuecke, "The Physiology
    of the Grid",
    http://www.globus.org/research/papers/ogsa.pdf

8
Elements of the Problem
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Sharing is always conditional: issues of trust,
    policy, negotiation, payment, ...
  • Coordinated problem solving
  • Beyond client-server: distributed data analysis,
    computation, collaboration, ...
  • Dynamic, multi-institutional virtual
    organizations
  • Community overlays on classic org structures
  • Large or small, static or dynamic

Slide source: Ian Foster @ ANL
9
Why Grids?
  • A biochemist exploits 10,000 computers to screen
    100,000 compounds in an hour
  • 1,000 physicists worldwide pool resources for
    petaflop analyses of petabytes of data
  • Civil engineers collaborate to design, execute,
    analyze shake table experiments
  • Climate scientists visualize, annotate, analyze
    terabyte simulation datasets
  • A home user invokes architectural design
    functions at an application service provider
  • Service provider buys cycles from compute cycle
    providers

Slide source: Dave Angulo @ U Chicago
10
Why Grids Now?
Conventional Wisdom, General Terms
  • Deployed internet bandwidth is increasing at a
    faster rate than either CPU speed or memory or
    data storage size
  • Therefore, it makes more sense to plan for
    computing that is portable so that it can be
    distributed worldwide.

11
Why Grids Now?
  • CPU speed doubles every 18 months
  • Moore's Law
  • Data storage doubles every 12 months
  • Deployed network bandwidth doubles every 9 months
  • 1986-2001: x340,000
  • Gilder's Law
  • Internet bandwidth increasing faster than CPU
    speed or memory or data storage size
  • Therefore, plan for computing that is portable
    so that it can be distributed worldwide.
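A quick back-of-the-envelope check of what these doubling periods imply over the 15-year window quoted above (a sketch; the periods are the slide's rough figures, so the computed bandwidth factor overshoots the quoted x340,000):

```python
# Growth factor implied by a doubling period, compared across technologies.
def growth(years, doubling_months):
    return 2 ** (years * 12 / doubling_months)

span_years = 15  # 1986-2001
for name, months in [("CPU speed", 18), ("data storage", 12), ("network bandwidth", 9)]:
    print(f"{name:18s} ~x{growth(span_years, months):,.0f}")
# CPU speed          ~x1,024
# data storage       ~x32,768
# network bandwidth  ~x1,048,576  (the compounding gap behind "plan for
#                                  portable, distributed computing")
```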

12
The Grid World: Current Status
  • Dozens of major Grid projects in scientific and
    technical computing, research, and education
  • Deployment, application, technology
  • Considerable consensus on key concepts and
    technologies
  • Globus Toolkit has emerged as the de facto
    standard for major protocols and services
  • Although there are still competing alternatives
  • Global Grid Forum is a significant force
  • Cross-project community

Slide derived from Dave Angulo @ U Chicago
13
Science Grids: Really Big Science
  • The process of large-scale science is changing
  • Large-scale science and engineering problems
  • require collaborative use of many compute, data,
    and instrument resources, all of which must be
    integrated with application components
  • efficient use of large resources is important
  • data sets that are
  • developed by independent teams of researchers
  • or are obtained from multiple instruments
  • at different geographic locations

Slide derived from Bill Johnston @ LBNL
14
Grid Applications in Physics
  • GriPhyN
  • CS and Physics collaboration to develop virtual
    data concept
  • Physics: ATLAS, CMS, LIGO, Sloan Digital Sky
    Survey
  • CS: build on existing technologies such as
    Globus, Condor, fault tolerance, storage
    management (SRB), plus new computer science
    research
  • Main goal is to develop a virtual data catalog
    and data language (VDC, VDL, VDLI)
  • iVDGL
  • International Virtual Data Grid Laboratory
  • Platform to design, implement, integrate and test
    grid software
  • Infrastructure for ATLAS and CMS prototype Tier2
    centers
  • Forum for grid interoperability; collaborate
    with EU DataGrid, DataTag, etc.

Slide source: Rob Gardner @ Indiana U
15
Data Grids for High Energy Physics
The Compact Muon Solenoid (CMS) at CERN
[Diagram: the tiered CMS data distribution model. A bunch crossing occurs
every 25 nsec; there are ~100 triggers per second, and each triggered event
is ~1 MByte. The detector's ~PBytes/sec stream is filtered to ~100 MBytes/sec
into the Tier 0 CERN Computer Centre and an offline processor farm (~20 TIPS).
Data moves at 622 Mbits/sec (or by air freight, deprecated) to Tier 1 regional
centres (FermiLab ~4 TIPS, plus centres in France, Italy, and Germany), then
at 622 Mbits/sec to Tier 2 centres and on to institutes (~0.25 TIPS) with
physics data caches (~1 MByte/sec) and Tier 4 physicist workstations.]
Slide source: Ian Foster @ ANL; image courtesy Harvey Newman, Caltech
16
Network for Earthquake Eng. Simulation
  • NEESgrid: national infrastructure to couple
    earthquake engineers with experimental
    facilities, databases, computers, and each other
  • On-demand access to experiments, data streams,
    computing, archives, collaboration

Slide source: Ian Foster @ ANL
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
17
What is a Grid Architecture?
  • Descriptive
  • Provide a common vocabulary for use when
    describing Grid systems
  • Guidance
  • Identify key areas in which services are required
  • Prescriptive
  • Define standard Intergrid protocols and APIs to
    facilitate creation of interoperable Grid systems
    and portable applications

Slide source: Ian Foster @ ANL
18
What Sorts of Standards?
  • Need for interoperability when different groups
    want to share resources
  • E.g., IP lets me talk to your computer, but how
    do we establish and maintain sharing?
  • How do I discover, authenticate, authorize,
    describe what I want to do, etc., etc.?
  • Need for shared infrastructure services to avoid
    repeated development, installation, e.g.
  • One port/service for remote access to computing,
    not one per tool/application
  • X.509 enables sharing of Certificate Authorities

Slide source: Ian Foster @ ANL
19
A Grid Architecture Must Address
  • Development of Grid protocols and services
  • Protocol-mediated access to remote resources
  • New services, e.g., resource brokering
  • "On the Grid" means speaking Intergrid protocols
  • Mostly (extensions to) existing protocols
  • Development of Grid APIs and SDKs
  • Facilitate application development by supplying
    higher-level abstractions
  • The model is the Internet and Web

Slide source: Ian Foster @ ANL
20
Grid Services (aka Middleware) and Tools
[Diagram: Grid services (middleware) and tools layered over the network.]
Slide source: Ian Foster @ ANL
21
Layered Grid Architecture
A Grid architecture is organized in layers,
analogous to the layered IP protocol stack.
[Diagram: the layered Grid architecture, with the Application layer at the
top, above the Collective, Resource, and Connectivity layers.]
Slide source: Ian Foster @ ANL
22
Where Are We With Architecture?
  • No official standards exist
  • But
  • Globus Toolkit has emerged as the de facto
    standard for several important Connectivity,
    Resource, and Collective protocols
  • GGF has an architecture working group
  • Technical specifications are being developed for
    architecture elements e.g., security, data,
    resource management, information
  • Internet drafts submitted in security area

Slide source: Ian Foster @ ANL
23
Grid Services Architecture (2): Connectivity Layer
  • Communication
  • Internet protocols: IP, DNS, routing, etc.
  • Security: Grid Security Infrastructure (GSI)
  • Uniform authentication and authorization
    mechanisms in a multi-institutional setting
  • Single sign-on, delegation, identity mapping
  • Public key technology, SSL, X.509, GSS-API
    (several Internet drafts document extensions)
  • Supporting infrastructure: Certificate
    Authorities, key management, etc.

Slide source: Ian Foster @ ANL
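GSI's single sign-on and delegation are built on certificate-based mutual authentication over SSL/TLS with X.509 credentials. A minimal sketch of just that mutual-authentication step, using Python's standard ssl module; the host, port, and credential paths are placeholders, and proxy-certificate delegation is not shown:

```python
import socket
import ssl

# Trust the grid CA and present our own X.509 credential so both ends
# authenticate each other (the core of GSI's security model).
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH,
                                 cafile="/etc/grid-security/ca-cert.pem")
ctx.load_cert_chain(certfile="usercert.pem", keyfile="userkey.pem")

with socket.create_connection(("gatekeeper.example.org", 2119)) as raw_sock:
    with ctx.wrap_socket(raw_sock, server_hostname="gatekeeper.example.org") as tls:
        print("authenticated peer:", tls.getpeercert()["subject"])
```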
24
GSI in Action: Create Processes at A and B that
Communicate and Access Files at C
[Diagram: a user with a single GSI credential starts processes on computers
at Site A (Kerberos) and Site B (Unix); the processes communicate with each
other and access a storage system at Site C (Kerberos), with GSI mapping the
user's identity onto each site's local security mechanism.]
Slide source: Ian Foster @ ANL
25
Grid Services Architecture (3): Resource Layer
  • The Resource Layer has protocols and services
  • Resource management: GRAM
  • Remote allocation, reservation, monitoring, and
    control of compute resources
  • Data access: GridFTP
  • High-performance data access and transport
  • Information: MDS (GRRP, GRIP)
  • Access to structure and state information
  • Others emerging: catalog access, code
    repository access, accounting, ...
  • All integrated with GSI

Slide source: Ian Foster @ ANL
26
GRAM Resource Management Protocol
  • Grid Resource Allocation and Management
  • Allocation, monitoring, and control of
    computations
  • Simple HTTP-based RPC
  • Job request: returns an opaque, transferable
    "job contact" string for access to the job
  • Job cancel, job status, job signal
  • Event notification (callbacks) for state changes
  • Protocol/server addresses robustness (exactly-once
    execution), authentication, authorization
  • Servers for most schedulers; C and Java APIs

Slide source: Ian Foster @ ANL
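A sketch of the "simple HTTP-based RPC" pattern described above: submit a job, get back an opaque contact string, and poll it for status. The endpoint, JSON fields, and state names are illustrative placeholders, not the actual GRAM wire protocol.

```python
import json
import urllib.request

def submit_job(gatekeeper_url, executable, arguments):
    body = json.dumps({"executable": executable, "arguments": arguments}).encode()
    req = urllib.request.Request(gatekeeper_url + "/jobs", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        # The server returns an opaque, transferable "job contact" string.
        return json.load(resp)["job_contact"]

def job_status(job_contact):
    with urllib.request.urlopen(job_contact + "/status") as resp:
        return json.load(resp)["state"]   # e.g. PENDING, ACTIVE, DONE, FAILED

# contact = submit_job("https://gatekeeper.example.org:2119", "/bin/hostname", [])
# print(job_status(contact))
```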
27
Resource Management
  • Advance reservations
  • As prototyped in GARA over the previous two years
  • Multiple resource types
  • Manage anything: storage, networks, etc.
  • Recoverable requests, timeout, etc.
  • Build on early work with the Condor group
  • Use of SOAP (RPC using HTTP and XML)
  • First step towards Web Services
  • Policy evaluation points for restricted proxies

Slide source: Ian Foster @ ANL
Karl Czajkowski, Steve Tuecke, and others
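What "RPC using HTTP and XML" looks like on the wire: a hand-built SOAP envelope POSTed to a service. The operation name, namespace, and endpoint are illustrative only, not an actual GARA or GRAM interface.

```python
import urllib.request

envelope = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <reserveResource xmlns="urn:example:resource-mgmt">
      <resourceType>network</resourceType>
      <startTime>2003-06-01T12:00:00Z</startTime>
      <durationMinutes>30</durationMinutes>
    </reserveResource>
  </soap:Body>
</soap:Envelope>"""

req = urllib.request.Request(
    "https://broker.example.org/soap",
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "urn:example:reserveResource"})
# response_xml = urllib.request.urlopen(req).read()
```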
28
Data Access and Transfer
  • GridFTP: an extended version of the popular FTP
    protocol for Grid data access and transfer
  • Secure, efficient, reliable, flexible,
    extensible, parallel, concurrent; e.g.:
  • Third-party data transfers, partial file
    transfers
  • Parallelism, striping (e.g., on PVFS)
  • Reliable, recoverable data transfers
  • Reference implementations
  • Existing clients and servers: wuftpd, ncftp
  • Flexible, extensible libraries

Slide source: Ian Foster @ ANL
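Plain FTP already supports restarting a transfer at a byte offset (the REST command); GridFTP builds on that model and adds security, parallel streams, striping, and third-party transfers. A sketch of offset-based (partial or resumed) retrieval using Python's standard ftplib; the host and paths are placeholders:

```python
import os
from ftplib import FTP

def resume_download(host, remote_path, local_path):
    """Continue a partially downloaded file from where it left off."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    with FTP(host) as ftp, open(local_path, "ab") as out:
        ftp.login()  # anonymous login, for the sketch
        # rest=offset asks the server to start sending at that byte offset
        ftp.retrbinary(f"RETR {remote_path}", out.write, rest=offset)

# resume_download("ftp.example.org", "/pub/dataset.bin", "dataset.bin")
```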
29
Grid Services Architecture (4): Collective Layer
  • Index servers aka metadirectory services
  • Custom views on dynamic resource collections
    assembled by a community
  • Resource brokers (e.g., Condor Matchmaker)
  • Resource discovery and allocation
  • Replica management and replica selection
  • Optimize aggregate data access performance
  • Co-reservation and co-allocation services
  • End-to-end performance
  • Etc.

Slide source: Ian Foster @ ANL
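A toy resource broker in the spirit of the matchmaking idea above: jobs advertise requirements, resources advertise properties, and the broker pairs them. The attribute names are illustrative, not Condor ClassAd syntax.

```python
def matches(job, resource):
    return (resource["arch"] == job["arch"]
            and resource["free_cpus"] >= job["cpus"]
            and resource["memory_mb"] >= job["memory_mb"])

def broker(jobs, resources):
    """Greedy first-fit matching of jobs to advertised resources."""
    assignments = []
    for job in jobs:
        for res in resources:
            if matches(job, res):
                res["free_cpus"] -= job["cpus"]   # account for the allocation
                assignments.append((job["id"], res["name"]))
                break
    return assignments

jobs = [{"id": "sim-1", "arch": "x86", "cpus": 4, "memory_mb": 2048}]
resources = [{"name": "cluster-a", "arch": "x86", "free_cpus": 16, "memory_mb": 8192}]
print(broker(jobs, resources))   # [('sim-1', 'cluster-a')]
```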
30
The Grid Information Problem
  • Large numbers of distributed sensors with
    different properties
  • Need for different views of this information,
    depending on community membership, security
    constraints, intended purpose, sensor type

Slide source: Ian Foster @ ANL
31
The Globus Toolkit Solution: MDS-2
  • Registration and enquiry protocols, information
    models, query languages
  • Provides standard interfaces to sensors
  • Supports different directory structures
    supporting various discovery/access strategies

Slide source: Ian Foster @ ANL
Karl Czajkowski, Steve Fitzgerald, and others
32
GriPhyN/PPDG Data Grid Architecture
[Diagram: an application submits a DAG to a planner, which consults catalog
services, monitoring, information services, and replica management to
produce a concrete DAG for an executor; the executor drives a reliable
transfer service and the compute and storage resources, all under common
policy/security. An initial solution is operational.]
Slide source: Ian Foster @ ANL
Ewa Deelman, Mike Wilde, and others
www.griphyn.org
33
The Network Weather Service
  • A distributed system for producing short-term
    deliverable performance forecasts
  • Goal: dynamically measure and forecast the
    performance deliverable at the application level
    from a set of network resources
  • Measurements currently supported:
  • Available fraction of CPU time
  • End-to-end TCP connection time
  • End-to-end TCP network latency
  • End-to-end TCP network bandwidth

Slide source: Rich Wolski @ UCSB
34
NWS System Architecture
  • Design objectives
  • Scalability: scales to any metacomputing
    infrastructure
  • Predictive accuracy: provides accurate
    measurements and forecasts
  • Non-intrusiveness: shouldn't load the resources
    it monitors
  • Execution longevity: available at all times
  • Ubiquity: accessible from everywhere, monitors
    all resources

Slide source: Rich Wolski @ UCSB
35
System Components
  • Four different component processes
  • Persistent State process: handles storage of
    measurements
  • Name Server process: directory server for the
    system
  • Sensor processes: measure current performance of
    different resources
  • Forecaster process: predicts deliverable
    performance of a resource during a given time
Slide source: Rich Wolski @ UCSB
36
NWS Processes
Slide source: Rich Wolski @ UCSB
37
NWS Components
  • Persistent State Management
  • Naming Server
  • Performance Monitoring: NWS Sensors
  • CPU Sensor
  • Network Sensor
  • Sensor Control
  • Cliques: hierarchy and contention
  • Adaptive time-out discovery
  • Forecasting
  • Forecaster and forecasting models
  • Sample forecaster results

Slide source: Rich Wolski @ UCSB
38
Persistent State Management
  • All NWS processes are stateless
  • The system state (measurements) is managed by
    the PS process
  • Storage and retrieval of measurements
  • Measurements are time-stamped plain-text strings
  • Measurements are written to disk immediately and
    acknowledged
  • Measurements are stored in a circular queue of
    tunable size

Slide source: Rich Wolski @ UCSB
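A sketch of the persistent-state idea: time-stamped plain-text records, acknowledged after being written to disk and kept in a bounded (circular) store of tunable size. The file name and record format are illustrative.

```python
import time
from collections import deque

class MeasurementStore:
    def __init__(self, capacity=1000):
        self.records = deque(maxlen=capacity)   # oldest entries are overwritten

    def append(self, resource, value):
        record = f"{time.time():.3f} {resource} {value}"
        with open("nws_state.log", "a") as f:   # write to disk, then acknowledge
            f.write(record + "\n")
        self.records.append(record)
        return "OK"

    def fetch(self, resource):
        return [r for r in self.records if f" {resource} " in r]

store = MeasurementStore(capacity=4)
store.append("cpu.host1", 0.82)
print(store.fetch("cpu.host1"))
```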
39
Naming Server
  • Primitive text string directory service for the
    NWS system
  • The only component known system-wide
  • Information stored includes:
  • Name-to-IP binding information
  • Group configuration
  • Parameters for various processes
  • Each process must refresh its registration with
    the name server periodically
  • Centralized

Slide source: Rich Wolski @ UCSB
40
Performance Monitoring
  • Actual monitoring is performed by a set of
    sensors
  • Accuracy vs. Intrusiveness
  • A sensor's life:

Register with the NS
Query the NS for parameters
Generate conditional test
Forever:
    if conditions are met then
        perform test
        time-stamp results and send them to the PS
    refresh registration with the NS
Slide source: Rich Wolski @ UCSB
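A runnable sketch of that sensor loop under stated assumptions: the name-server and persistent-state interactions are stubbed out, and the "test" is simply the Unix 1-minute load average.

```python
import os
import time

def register_with_ns(name):        # stub: refresh registration with the NS
    pass

def send_to_ps(record):            # stub: ship the measurement to the PS
    print(record)

def sensor_loop(name="cpu.localhost", period_s=10, refresh_s=60):
    register_with_ns(name)
    last_refresh = time.time()
    while True:
        load1, _, _ = os.getloadavg()                    # perform the test
        send_to_ps(f"{time.time():.0f} {name} {load1:.2f}")
        if time.time() - last_refresh > refresh_s:
            register_with_ns(name)                       # keep registration alive
            last_refresh = time.time()
        time.sleep(period_s)

# sensor_loop()   # runs forever; interrupt to stop
```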
41
CPU Sensor
  • Measures available CPU fraction
  • Testing tools
  • Unix uptime: reports the load average over the
    past x minutes
  • Unix vmstat: reports idle, user, and system time
  • Active probes
  • Accuracy
  • Results assume a full-priority job
  • Doesn't know the priority of jobs in the queue

Slide source: Rich Wolski @ UCSB
42
Active Probing Improvements
[Plots: measurements produced using vmstat vs. measurements produced using
uptime.]
Slide source: Rich Wolski @ UCSB
43
Network Sensor
  • Carries out network-related measurements
  • Testing using active network probes
  • Establish and release TCP connections
  • Moving large (small) data to measure bandwidth
    (delay)
  • Measures connections with all peer sensors
  • Problems
  • Accuracy depends on the socket interface
  • Complexity: N^2 - N tests; collisions (contention)

Slide source: Rich Wolski @ UCSB
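A sketch of an active TCP probe: time connection establishment as a latency proxy and a bulk send as a bandwidth proxy. The peer host and port are placeholders (a real peer sensor would sink the data), and socket buffering makes the bandwidth figure only a rough estimate.

```python
import socket
import time

def probe(host, port, payload_bytes=1_000_000):
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=5)
    connect_s = time.perf_counter() - start        # end-to-end connection time

    start = time.perf_counter()
    sock.sendall(b"\0" * payload_bytes)            # bulk transfer
    elapsed = time.perf_counter() - start
    sock.close()
    bandwidth_mbps = payload_bytes * 8 / elapsed / 1e6
    return connect_s, bandwidth_mbps

# latency_s, bw_mbps = probe("peer-sensor.example.org", 8060)
```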
44
Network Sensor Control
  • Sensors are organized into sensor sets called
    cliques
  • Each clique is configurable and has one leader
  • Clique sets are logical, but can be based on
    physical topology
  • Leaders are elected using a distributed election
    protocol
  • A sensor can participate in many cliques
  • Advantages
  • Scalability: by organizing cliques in a hierarchy
  • Reduces the N^2 - N test count
  • Accuracy: by more frequent tests

Slide source: Rich Wolski @ UCSB
45
Clique Hierarchy
Slide source: Rich Wolski @ UCSB
46
Contention
  • Each leader maintains a clique token (and time
    between tokens)
  • The sensor that has the token performs all its
    tests then passes the token to the next sensor in
    the list
  • Adaptive time-out discovery
  • Tokens have time-out field
  • Tokens have sequence numbers
  • The leader adaptively controls the time-out

Slide source: Rich Wolski @ UCSB
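A toy illustration of the clique-token idea: only the token holder runs its probes, so tests within a clique do not collide; sequence numbers and a time-out let the leader detect a lost token. Leader election and adaptive time-out control are not shown.

```python
import time

class CliqueToken:
    def __init__(self, members, timeout_s=30.0):
        self.members = members        # ordered list of sensor names
        self.sequence = 0
        self.timeout_s = timeout_s    # adjusted over time by the clique leader

def run_round(token, run_tests):
    """One circulation of the token around the clique."""
    for sensor in token.members:
        started = time.time()
        run_tests(sensor)             # only the current token holder probes
        token.sequence += 1
        if time.time() - started > token.timeout_s:
            break                     # leader would regenerate the token

token = CliqueToken(["sensor-a", "sensor-b", "sensor-c"])
run_round(token, lambda s: print(f"{s}: running network probes"))
```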
47
Forecaster Process
  • A forecasting driver plus compile-time
    prediction modules
  • Forecasting process:
  • Fetching required measurements from the PS
  • Passing the time series to each prediction module
  • Choosing the best returned prediction
  • Incorporate sophisticated prediction techniques?

[Plots: recorded vs. forecasted bandwidth between UC Santa Barbara and
Kansas State U.]
Slide source: Rich Wolski @ UCSB
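A sketch of the forecasting process described above: fetch the measurement series, pass it to each prediction module, and keep the module with the lowest recent error. The three predictors here are simple stand-ins for the NWS prediction modules.

```python
from statistics import mean, median

PREDICTORS = {
    "last_value": lambda xs: xs[-1],
    "mean_5":     lambda xs: mean(xs[-5:]),
    "median_5":   lambda xs: median(xs[-5:]),
}

def forecast(series, window=20):
    """Return (best_module, prediction), ranking modules by one-step error."""
    errors = {name: 0.0 for name in PREDICTORS}
    for t in range(max(1, len(series) - window), len(series)):
        history, actual = series[:t], series[t]
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(history) - actual)
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](series)

bandwidth_mbps = [8.2, 8.1, 8.4, 7.9, 6.0, 8.3, 8.2, 8.5, 8.1, 8.0]
print(forecast(bandwidth_mbps))
```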
48
Sample Graph
49
Selected Major Grid Projects
Name | URL | Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid | DOE, NSF | Create and deploy group collaboration systems using commodity technologies
BlueGrid | - | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom | DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org | DOE Office of Science | Create operational Grid providing access to resources and applications at U.S. DOE science laboratories and partner universities
Earth System Grid (ESG) | earthsystemgrid.org | DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org | European Union | Create and apply an operational grid for applications in high energy physics, environmental science, and bioinformatics
Slide source: Ian Foster @ ANL
50
Selected Major Grid Projects
Name | URL / Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org / European Union | Create technologies for remote access to supercomputer resources and simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org / DOE Office of Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org / DARPA, DOE, NSF, NASA, Microsoft | Research on Grid technologies; development and support of the Globus Toolkit; application and deployment
GridLab | gridlab.org / European Union | Grid technologies and applications
GridPP | gridpp.ac.uk / U.K. eScience | Create and apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. Support Center | grids-center.org / NSF | Integration, deployment, and support of the NSF Middleware Infrastructure for research and education
Slide source: Ian Foster @ ANL
51
Selected Major Grid Projects
Name | URL / Sponsor | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads / NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org / NSF | Technology R&D for data analysis in physics experiments: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov / NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org / NSF | Create an international Data Grid to enable large-scale experimentation on Grid technologies and applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org / NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net / DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments
Slide source: Ian Foster @ ANL
52
Selected Major Grid Projects
Name | URL / Sponsor | Focus
TeraGrid | teragrid.org / NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK Grid Support Center | grid-support.ac.uk / U.K. eScience | Support center for Grid projects within the U.K.
Unicore | BMBFT | Technologies for remote access to supercomputers
Also many technology R&D projects, e.g., Condor,
NetSolve, Ninf, NWS. See also www.gridforum.org
Slide source: Ian Foster @ ANL
53
The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Diagram: the TeraGrid/DTF sites - NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF,
225 TB), Caltech, and Argonne - each with site resources and archival storage
(HPSS or UniTree), interconnected at 40 Gb/s and linked to external networks.]
Slide source: Ian Foster @ ANL
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
54
iVDGL: International Virtual Data Grid Laboratory
Slide source: Ian Foster @ ANL
U.S. PIs: Avery, Foster, Gardner, Newman, Szalay
www.ivdgl.org
55
NSF GRIDS Center
  • Grid Research, Integration, Deployment, Support
    (GRIDS) Center
  • Develop, deploy, support
  • Middleware infrastructure for national-scale
    collaborative science and engineering
  • Integration platform for experimental middleware
    technologies
  • UC, USC/ISI, UW, NCSA, SDSC
  • Partner with Internet-2, SURA, Educause in NSF
    Middleware Initiative

Slide source: Ian Foster @ ANL
www.grids-center.org
www.nsf-middleware.org
56
The State of Grids: Some Case Studies
  • Further, Grids are becoming a critical element
    of many projects, e.g.:
  • The High Energy Physics problem of managing and
    analyzing petabytes of data per year has driven
    the development of Grid Data Services
  • The National Earthquake Engineering Simulation
    Grid has developed a highly application oriented
    approach to using Grids
  • The Astronomy data federation problem has
    promoted work in Web Services based interfaces

57
High Energy Physics Data Management
  • Petabytes of data per year must be distributed to
    hundreds of sites around the world for analysis
  • This involves
  • Reliable, wide-area, high-volume data management
  • Global naming, replication, and caching of
    datasets
  • Easily accessible pools of computing resources
  • Grids have been adopted as the infrastructure for
    this HEP data problem

58
High Energy Physics Data Management: CERN / LHC Data
One of science's most challenging data management problems
[Diagram: the CERN LHC CMS detector (15m x 15m x 22m, 12,500 tons, ~$700M;
a 2m human figure shown for scale) feeds an online system at ~PByte/sec.
Tier 0/1 at CERN performs event reconstruction and event simulation
(~100 MBytes/sec into HPSS) and sends data at 2.5 Gbits/sec to Tier 1
regional centers (FermiLab USA, France, Germany, Italy), which feed Tier 2
analysis centers and Tier 3 institutes (~0.25 TIPS) at 0.6-2.5 Gbps, with
physics data caches and Tier 4 workstations served at 100-1000 Mbits/sec.]
CERN/CMS data goes to 6-8 Tier 1 regional centers, and from each of these
to 6-10 Tier 2 centers. Physicists work on analysis channels at 135
institutes. Each institute has ~10 physicists working on one or more
channels. 2000 physicists in 31 countries are involved in this 20-year
experiment, in which DOE is a major player.
Courtesy Harvey Newman, Caltech
59
High Energy Physics Data Management
  • Virtual data catalogues and on-demand data
    generation have turned out to be an essential
    aspect
  • Some types of analysis are pre-defined and
    catalogued prior to generation - and then the
    data products are generated on demand when the
    virtual data catalogue is accessed
  • Sometimes regenerating derived data is faster and
    easier than trying to store and/or retrieve that
    data from remote repositories
  • For similar reasons this is also of great
    interest to the EOS (Earth Observing Satellite)
    community
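A toy sketch of the virtual-data idea: the catalogue stores the recipe (transformation plus inputs) for each derived product, and the product is materialized on demand, and cached, when the catalogue entry is accessed. The names and transformations are illustrative placeholders, not GriPhyN's VDL.

```python
from collections import Counter

materialized = {}   # logical name -> concrete data (the "storage")
catalogue = {}      # logical name -> (transformation, input names)

def define(name, transformation, inputs):
    catalogue[name] = (transformation, inputs)

def get(name):
    """Return a data product, regenerating it from its recipe if needed."""
    if name not in materialized:
        transformation, inputs = catalogue[name]
        materialized[name] = transformation(*[get(i) for i in inputs])
    return materialized[name]

materialized["raw_events"] = [1.0, 2.5, 3.1, 0.7]
define("calibrated", lambda raw: [x * 1.02 for x in raw], ["raw_events"])
define("histogram",  lambda cal: Counter(round(x) for x in cal), ["calibrated"])
print(get("histogram"))   # derived on demand from its virtual-data recipe
```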

60
US-CMS/LHC Grid Data Services Testbed
(International Virtual Data Grid Laboratory)
[Diagram: interactive user tools sit above data generation request planning
and scheduling tools, request execution and management tools, and virtual
data tools (metadata catalogues and virtual data catalogues holding metadata
descriptions of analyzed data). These are built on core Grid services
(security and policy, resource management, other Grid services) and
transforms, over distributed resources (code, storage, CPUs, networks) and
the raw data source.]
61
CMS Event Simulation Using GriPhyN
  • Production Run on Integration Testbed (400 CPUs
    at 5 sites)
  • Simulate 1.5 million full CMS events for physics
    studies
  • 2 months continuous running across 5 testbed
    sites
  • Managed by a single person at the US-CMS Tier 1
    site
  • 30 CPU-years delivered 1.5 million events to
    CMS physicists

62
National Earthquake Engineering Simulation Grid
  • NEESgrid will link earthquake researchers across
    the U.S. with leading-edge computing resources
    and research equipment, allowing collaborative
    teams (including remote participants) to plan,
    perform, and publish their experiments
  • Through the NEESgrid, researchers will
  • perform tele-observation and tele-operation of
    experiments: shake tables, reaction walls, etc.
  • publish to, and make use of, a curated data
    repository using standardized markup
  • access computational resources and open-source
    analytical tools
  • access collaborative tools for experiment
    planning, execution, analysis, and publication

63
NEES Sites
  • Large-Scale Laboratory Experimentation Systems
  • University at Buffalo, State University of New
    York
  • University of California at Berkeley
  • University of Colorado, Boulder
  • University of Minnesota-Twin Cities
  • Lehigh University
  • University of Illinois, Urbana-Champaign
  • Field Experimentation and Monitoring
    Installations
  • University of California, Los Angeles
  • University of Texas at Austin
  • Brigham Young University
  • Shake Table Research Equipment
  • University at Buffalo, State University of New
    York
  • University of Nevada, Reno
  • University of California, San Diego
  • Centrifuge Research Equipment
  • University of California, Davis
  • Rensselaer Polytechnic Institute
  • Tsunami Wave Basin
  • Oregon State University, Corvallis, Oregon
  • Large-Scale Lifeline Testing
  • Cornell University

64
NEESgrid Earthquake Engineering Collaboratory
[Diagram: the NEESgrid collaboratory links instrumented structures and sites,
laboratory and field equipment, a curated data repository, a simulation tools
repository, and large-scale computation over high-performance networks, with
global connections to remote users, including K-12 faculty and students.]
65
NEESgrid Approach
  • Package a set of application-level services and
    the supporting Grid software in a single point
    of presence (POP)
  • Deploy the POP to a select set of earthquake
    engineering sites to provide the applications,
    data archiving, and Grid services
  • Assist in developing common metadata so that the
    various instruments and simulations can work
    together
  • Provide the required computing and data storage
    infrastructure

66
NEESgrid Multi-Site Online Simulation (MOST)
  • A partnership between the NEESgrid team, UIUC and
    Colorado Equipment Sites to showcase NEESgrid
    capabilities
  • A large-scale experiment conducted in multiple
    geographical locations which combines physical
    experiments with numerical simulation in an
    interchangeable manner
  • The first integration of NEESgrid services with
    application software developed by Earthquake
    Engineers (UIUC, Colorado and USC) to support a
    real EE experiment
  • See http://www.neesgrid.org/most/

67
NEESgrid Multi-Site Online Simulation (MOST)
[Photos: the UIUC and U. Colorado experimental setups.]
68
Multi-Site, On-Line Simulation Test (MOST)
[Diagram: the Colorado experimental model, the UIUC experimental model, and
an NCSA computational model coupled through a simulation coordinator.]
  • UIUC MOST-SIM: Dan Abrams, Amr Elnashai, Dan
    Kuchma, Bill Spencer, and others
  • Colorado FHT: Benson Shing and others
69
1994 Northridge Earthquake Simulation Requires a
Complex Mix of Data and Models
[Diagram: a multi-pier structural model (Piers 5, 6, 7, and 8).]
NEESgrid provides the common data formats, uniform
data archive interfaces, and computational services
needed to support this multidisciplinary simulation.
Amr Elnashai, UIUC
70
NEESgrid Architecture
[Diagram: user interfaces (web browser, Java applet) for multidisciplinary
simulations, collaborations, and experiments sit above the NEESpop point of
presence, which hosts NEES operations, E-Notebook services, metadata
services, the CompreHensive collaborativE Framework (CHEF), NEESgrid
monitoring, and video services. Grid services (GridFTP, the NEESgrid
streaming data system, accounts, MyProxy) and the simulation coordinator
connect the NEESpop to data acquisition systems, the curated data repository,
the simulation tools repository, and the NEES distributed resources:
instrumented structures and sites, large-scale storage, large-scale
computation, and laboratory equipment.]
71
The Changing Face of Observational Astronomy
  • Large digital sky surveys are becoming the
    dominant source of data in astronomy: > 100 TB,
    growing rapidly
  • Current examples: SDSS, 2MASS, DPOSS, GSC,
    FIRST, NVSS, RASS, IRAS; CMBR experiments;
    microlensing experiments; NEAT, LONEOS, and other
    searches for Solar System objects
  • Digital libraries: ADS, astro-ph, NED, CDS, NSSDC
  • Observatory archives: HST, CXO, space and
    ground-based
  • Future: QUEST2, LSST, and other synoptic surveys;
    GALEX, SIRTF, astrometric missions, GW detectors
  • Data sets orders of magnitude larger, more
    complex, and more homogeneous than in the past

72
The Changing Face of Observational Astronomy
  • Virtual Observatory: a federation of N archives
  • Possibilities for new discoveries grow as O(N^2)
  • Current sky surveys have proven this
  • Very early discoveries from Sloan (SDSS),
    2-micron (2MASS), Digital Palomar (DPOSS)
  • see http://www.us-vo.org

73
Sky Survey Federation
74
Mining Data is Often a Critical Aspect of Doing
Science
  • The ability to federate survey data is enormously
    important
  • Studying the Cosmic Microwave Background, a key
    tool in studying the cosmology of the universe,
    requires combined observations from many
    instruments in order to isolate the extremely
    weak signals of the CMB
  • The datasets that represent the material
    between us and the CMB are collected from
    different instruments and are stored and curated
    at many different institutions
  • This is immensely difficult without approaches
    like the National Virtual Observatory, which
    provide a uniform interface for all of the
    different data formats and locations

(Julian Borrill, NERSC, LBNL)
75
NVO Approach
  • Focus is on adapting emerging information
    technologies to meet the astronomy research
    challenges
  • Metadata, standards, protocols (XML, http)
  • Interoperability
  • Database federation
  • Web Services (SOAP, WSDL, UDDI)
  • Grid-based computing (OGSA)
  • Federating data bases is difficult, but very
    valuable
  • An XML-based mark-up for astronomical tables and
    catalogs - VOTable
  • Developed metadata management framework
  • Formed international registry, dm (data
    models), semantics, and dal (data access
    layer) discussion groups
  • As with NEESgrid, Grids are helping to unify the
    community
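For concreteness, an illustrative VOTable fragment parsed with Python's standard library; the table name, fields, UCD strings, and values below are made up.

```python
import xml.etree.ElementTree as ET

votable = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="bright_sources">
      <FIELD name="RA"  datatype="double" unit="deg" ucd="pos.eq.ra"/>
      <FIELD name="DEC" datatype="double" unit="deg" ucd="pos.eq.dec"/>
      <FIELD name="mag" datatype="float" ucd="phot.mag"/>
      <DATA><TABLEDATA>
        <TR><TD>180.4682</TD><TD>-1.0921</TD><TD>14.2</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

root = ET.fromstring(votable)
fields = [f.get("name") for f in root.iter("FIELD")]
rows = [[td.text for td in tr] for tr in root.iter("TR")]
print(fields, rows)   # ['RA', 'DEC', 'mag'] [['180.4682', '-1.0921', '14.2']]
```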

76
NVO Image Mosaicking
  • Specify box by position and size
  • SIAP server returns relevant images
  • Footprint
  • Logical Name
  • URL

Can choose a standard URL (http://...) or an SRB
URL (srb://nvo.npaci.edu/...)
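A sketch of that query pattern: the client supplies a sky position and box size, and the image-access service returns a VOTable listing matching images (footprint, logical name, URL). The parameter names follow the Simple Image Access convention, but the base URL is a placeholder.

```python
import urllib.parse
import urllib.request

def siap_query(base_url, ra_deg, dec_deg, size_deg):
    params = urllib.parse.urlencode({
        "POS": f"{ra_deg},{dec_deg}",   # box center, decimal degrees
        "SIZE": size_deg,               # box width, degrees
        "FORMAT": "image/fits",
    })
    with urllib.request.urlopen(f"{base_url}?{params}") as resp:
        return resp.read()              # VOTable describing the matching images

# votable_bytes = siap_query("https://nvo.example.org/siap", 180.47, -1.09, 0.25)
```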
77
Atlasmaker Virtual Data System
[Diagram: higher-level Grid services span metadata repositories federated by
OAI and data repositories federated by SRB; core Grid services dispatch
computation to compute resources federated by TeraGrid/IPG (step 2c: compute
on TG/IPG; step 2d: store result and return result).]
78
Background Correction
[Images: the same field uncorrected and corrected.]
79
NVO Components
[Diagram: NVO components. Visualization tools and a cross-correlation engine
sit above web services (simple image access services and cone search
services) described in resource/service registries; results flow as VOTables
annotated with UCDs, with streaming and Grid services connecting to data
archives and computing resources.]
80
International Virtual Observatory Collaborations
  • German AVO
  • Russian VO
  • e-Astronomy Australia
  • IVOA (International Virtual Observatory
    Alliance)
  • Astrophysical Virtual Observatory (European
    Commission)
  • AstroGrid, UK e-science program
  • Canada
  • VO India
  • VO Japan
  • (leading the work on VO query language)
  • VO China

US contacts: Alex Szalay <szalay@jhu.edu>, Roy
Williams <roy@cacr.caltech.edu>, Bob Hanisch
<hanisch@stsci.edu>
81
And What's This Got To Do With ...
  • CORBA?
  • Grid-enabled CORBA underway
  • Java, Jini, Jxta?
  • Java CoG Kit; Jini, Jxta: future uncertain
  • Web Services, .NET, J2EE?
  • Major Globus focus (GRAM-2, SOAP, WSDL)
  • Workflow/choreography services
  • Q: What can the Grid offer to Web services?
  • Next revolutionary technology of the month?
  • They'll need Grid technologies too

Slide source: Ian Foster @ ANL
82
The Future: All Software is Network-Centric
  • We don't build or buy "computers" anymore; we
    borrow or lease required resources
  • When I walk into a room, need to solve a problem,
    need to communicate
  • A computer is a dynamically, often
    collaboratively constructed collection of
    processors, data sources, sensors, networks
  • Similar observations apply for software

Slide source: Ian Foster @ ANL
83
And Thus
  • Reduced barriers to access mean that we do much
    more computing, and more interesting computing,
    than today => many more components (and
    services), massive parallelism
  • All resources are owned by others => sharing (for
    fun or profit) is fundamental: trust, policy,
    negotiation, payment, ...
  • All computing is performed on unfamiliar systems
    => dynamic behaviors, discovery, adaptivity,
    failure

Slide source: Ian Foster @ ANL
84
Acknowledgments
  • Globus R&D is joint work with numerous people
  • Carl Kesselman, Co-PI
  • Steve Tuecke, principal architect at ANL
  • Others to be acknowledged
  • GriPhyN R&D is joint work with numerous people
  • Paul Avery, Co-PI; Newman, Lazzarini, Szalay
  • Mike Wilde, project coordinator
  • Carl Kesselman, Miron Livny: CS leads
  • ATLAS, CMS, LIGO, SDSS participants and others
  • Support: DOE, DARPA, NSF, NASA, Microsoft

Slide source: Ian Foster @ ANL
85
Summary
  • The Grid problem: resource sharing and
    coordinated problem solving in dynamic,
    multi-institutional virtual organizations
  • Grid architecture: emphasize protocol and service
    definition to enable interoperability and
    resource sharing
  • Globus Toolkit: a source of protocol and API
    definitions and reference implementations
  • See globus.org, griphyn.org, gridforum.org,
    grids-center.org, nsf-middleware.org

Slide source: Ian Foster @ ANL