1
Grid Computing
  • Advanced Computer Architecture
  • CSE 8383

2
Outline
  • The Motivation Behind Grid
  • Existing Grid Infrastructures
  • Integral Components of a Grid System
  • Open Issues

3
Back to the Future
  • "The complexity for minimum component costs has
    increased at a rate of roughly a factor of two
    per year"
  • i.e., circuit densities in chips would double
    every 12-18 months
  • Gordon Moore, Electronics Magazine, April 19,
    1965

4
Processor Evolution
  • From Processor Generation n to n+1:
  • Gate delay reduces by 1/√2 (basic frequency goes
    up by √2)
  • Number of transistors in a constant area (cost)
    goes up by 2x
  • Additional transistors enable a further increase
    in performance by a factor of √2
  • Deeper pipelines, offset pipeline penalties
    (branch prediction), increased number and/or size
    of caches, exploiting Instruction Level
    Parallelism (ILP)
  • Result: 2x performance at roughly equal cost (see
    the sketch below)
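
  • A quick numeric check of this model, as a minimal
    Python sketch (the scaling factors are the
    slide's idealized assumptions, not measured
    data):

    import math

    freq_scale       = math.sqrt(2)  # gate delay shrinks by 1/sqrt(2)
    transistor_scale = 2             # 2x transistors at constant area/cost
    uarch_scale      = math.sqrt(2)  # extra transistors buy ~sqrt(2) more

    print(f"transistor budget: {transistor_scale}x at constant cost")
    print(f"performance per generation: {freq_scale * uarch_scale:.2f}x")
    # prints ~2.00x, matching the "2x at roughly equal cost" claim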

5
Processor Evolution
  • Over the past 37 years, Moore's law has
    successfully predicted the exponential growth of
    component densities, as the table below shows

Year Transistors
1970 2,300
1975 6,000
1980 29,000
1985 275,000
1990 5,500,000
2000 42,000,000
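
  • As a rough check, a minimal Python sketch that
    estimates the doubling time implied by the first
    and last rows of the table above:

    import math

    counts = {1970: 2_300, 2000: 42_000_000}
    doublings = math.log2(counts[2000] / counts[1970])  # ~14.2 doublings
    print(f"one doubling every {(2000 - 1970) / doublings:.1f} years")
    # prints ~2.1 years per doubling over the 30-year span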
6
Processor Evolution
  • Moore's law held
  • Mainly because of advances in micro-architecture
    that exploited the huge growth in transistors per
    area and that overcame interconnect limitations
  • Scarcity of new micro-architecture ideas
  • Pipelining, branch prediction, ILP, and caching
    have reached a point of maturity
  • Power / current issues
  • Power density limitations
  • Voltage and doping (less switching energy
    available)

7
From single-CPU to Multi-Core
  • Gate delay no longer reduces much (basic
    frequency goes up only a little)
  • Number of transistors in a constant area (cost)
    still goes up by 2x
  • Global wiring is not stressed
  • Result: 2x performance at roughly equal cost

8
From single-CPU to Multi-Core
  • On the order of 1,000 classic pipelined CPUs can
    fit in the area of a state-of-the-art high-end
    CPU
  • Each can match the performance of a high-end CPU,
    but over a more limited scope
  • Result: 1000x performance at roughly equal cost
  • If dual-core is 2x and quad-core is 4x, then why
    not?

9
Multi-Core: The Pitfalls
  • Amdahl's Law
  • A portion of every parallel execution is
    serialized
  • Communication: dependence on uncompleted parallel
    execution
  • Conflicts: shared resources
  • Coordination: synchronization with other
    executions
  • Overhead: for the above mechanisms (semaphores,
    locks, etc.)
  • Speedup = 1 / (serial + (1 - serial) / CPUs)
  • Assumes all parallel computations are uniform and
    take equal time (see the sketch below)
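
  • A minimal Python sketch of the speedup formula
    above, showing how even a small serial fraction
    caps the benefit of adding cores (the 5% serial
    fraction is an illustrative assumption):

    def amdahl_speedup(serial, cpus):
        # Speedup = 1 / (serial + (1 - serial) / CPUs)
        return 1.0 / (serial + (1.0 - serial) / cpus)

    for n in (2, 4, 16, 1000):
        print(f"{n:5d} CPUs -> {amdahl_speedup(0.05, n):5.2f}x")
    # 1000 CPUs yield only ~19.6x when 5% of the work is serial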

10
Costs
  • "The cost of manufacturing facilities doubles
    every generation. In the late 1980s,
    billion-dollar plants seemed like something a
    long way in the future. They seemed almost
    inconceivable. But now, Intel has two plants that
    will cost more than $2.5 billion. If we double it
    for a couple of generations, we're looking at $10
    billion plants. I don't think there's any
    industry in the world that builds $10 billion
    plants, although oil refineries probably come
    close"
  • - "Moore's Law Repealed, Sort of", Wired, 1997
  • In 1995, the cost of a wafer fabrication plant
    was US$1 billion, approximately 1 percent of the
    annual microprocessor market. By 2011, the cost
    will increase to US$70 billion per plant,
    approximately 13 percent of the annual
    microprocessor market

11
Multi-Core: The Pitfalls
  • Lack of a single parallel system organization:
  • Client/Server architectures
  • Data parallel
  • Pipelining
  • Streaming
  • One size does not fit all

12
Lessons from History - Summary
  • Moore's law will not continue to provide 2x
    performance every other year using the usual
    techniques in CPU architecture and process
    technology
  • It is time for software to play its part in
    providing performance
  • Such an architecture must support all kinds of
    parallelism models
  • Parallel programming must move to a higher level
    of abstraction to become more pervasive

13
Parallelism through Software - Grid
  • Grid infrastructure will provide us with the
    ability to dynamically link together resources as
    an ensemble to support the execution of
    large-scale, resource-intensive, and distributed
    applications.

14
Grid Infrastructure
  • Catalysts for Grid evolution:
  • Increases in processing power, storage capacity,
    and interconnect capacity
  • Falling cost of ownership
  • Off-the-shelf components

15
Grid Infrastructure
  • The Grid is inherently distributed,
    heterogeneous, and dynamic
  • compared to a cluster, which is centralized,
    whose resources are homogeneous, and whose
    composition is static (including bandwidth)
  • High-bandwidth connectivity makes resource
    sharing easy

16
Grid Infrastructure
  • Resources can reside in multiple domains and can
    be utilized on an as-needed basis

17
Grid Components - Hardware
  • Raw Resources
  • Interconnection Networks
  • Storage

18
Grid Components - Software
  • Security / Usage / Fairness Policies
  • Scheduling and Resource Management
  • Program Execution
  • Data movement
  • Program migration
  • Programming Models

19
Grid Components - Stack
20
Grid Workflow
21
Grid Workflow
22
Grid Resource Management
  • Resource Discovery
  • Resource Allocation
  • Resource Maintenance
  • Administrative Hierarchy
  • Communication Services
  • Information Services
  • Naming Services
  • Distributed File System and Caching
  • Security and Authorization
  • System Status and Fault Tolerance

23
Grid Resource Management - Globus
  • Coined the term "Virtual Organization"
  • Globus was envisioned as a system for sharing
    computational resources
  • A resource manager capable of collecting
    resources and matching them with potential
    requests
  • For scheduling, Globus relies on third-party
    schedulers, e.g. Condor(-G) and AppLeS

24
Grid Resource Management - Globus
  • Provides a framework for
  • Discovering resources
  • Resource Information Base
  • Co-Allocation
  • Monitoring Resources and Online Control
  • Service Level Agreements (SLAs)

25
Grid Resource Management - Globus
  • [Diagram: an application's RSL request is
    specialized, via queries to the Information
    Service, into ground RSL; GRAM instances then
    translate the simple ground RSL for local
    resource managers such as LSF, EASY-LL, and NQE]
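
  • For illustration, a request in the classic Globus
    RSL syntax; the attribute values here are
    hypothetical, but the ampersand-prefixed
    conjunction of (attribute = value) relations is
    RSL's form:

    & (executable = /home/user/my_sim)
      (arguments  = -n 100)
      (count      = 4)
      (maxTime    = 60)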
26
Grid Resource Management - Globus
  • Advance Reservation Mechanism
  • Given the dynamic nature of the Grid, a mechanism
    ensuring the (future) availability of resources
    is necessary
  • Service Level Agreements (SLAs):
  • Resource Service Level Agreements (RSLAs):
    promise the capability of a resource
  • Task Service Level Agreements (TSLAs): describe
    the performance of a task
  • Binding Service Level Agreements (BSLAs): bind a
    task to a promised resource

27
Grid Scheduling
  • How can we execute a set of tasks T on a set of
    processors P, subject to some set of optimizing
    criteria C? (see the sketch after this list)
  • Distributed vs. Parallel
  • The true test is the support for autonomy of the
    individual node (not communication!)
  • Design autonomy, communication autonomy,
    execution autonomy, and administrative autonomy
  • By this test, a hypercube is a parallel machine
  • A network of workstations is distributed
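
  • As a toy instance of this formulation, a minimal
    Python sketch where the criterion C is makespan
    and the policy is the longest-processing-time-
    first greedy heuristic (the task set is
    hypothetical):

    import heapq

    def lpt_schedule(task_times, num_procs):
        # Longest task first onto the least-loaded processor
        loads = [(0.0, p) for p in range(num_procs)]
        heapq.heapify(loads)
        assignment = {}
        for task in sorted(task_times, key=task_times.get, reverse=True):
            load, p = heapq.heappop(loads)
            assignment[task] = p
            heapq.heappush(loads, (load + task_times[task], p))
        return assignment, max(load for load, _ in loads)

    tasks = {"t1": 8.0, "t2": 5.0, "t3": 4.0, "t4": 3.0, "t5": 3.0}
    print(lpt_schedule(tasks, 2))  # makespan 12.0 on 2 processors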

28
Grid Scheduling
  • Distributed Architectures
  • Network of Workstations (Commodity Grid
    Computing)
  • Grid dedicated machines
  • Grid of Clusters

29
Grid Scheduling
  • Efficient application performance and efficient
    system performance are not necessarily the same
  • It may not be possible to obtain optimal
    performance for multiple applications
    simultaneously
  • Load balancing may not provide the optimal
    scheduling policy
  • Application and system environment must be
    modeled in some detail in order to determine a
    performance-efficient schedule

30
Grid Scheduling
  • Authorization Filtering
  • Application Definition
  • Min. Requirement Filtering
  • Information Gathering
  • System Selection
  • Advance Reservation
  • Job Submission
  • Preparation Tasks
  • Monitoring Progress
  • Job Completion
  • Clean-up Tasks

31
Grid Scheduling - Condor
  • "Modern processing environments that consist of
    large collections of workstations interconnected
    by high capacity network raise the following
    challenging question: can we satisfy the needs of
    users who need extra capacity without lowering
    the quality of service experienced by the owners
    of under utilized workstations? ... The Condor
    scheduling system is our answer to this
    question."
  • Michael Litzkow, Miron Livny, and Matt Mutka,
    "Condor: A Hunter of Idle Workstations", IEEE 8th
    Intl. Conf. on Dist. Comp. Sys., June 1988

32
Grid Scheduling - Condor
  • ClassAds (classified advertisements): declarative
    descriptions of jobs and machines (see the sketch
    below)
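
  • An illustration in the ClassAd style of the
    Condor papers (attribute values hypothetical): a
    job ad and a machine ad each carry a Requirements
    expression, and the matchmaker pairs ads whose
    Requirements evaluate to true against each other:

    // Job ClassAd (hypothetical values)
    MyType       = "Job"
    Owner        = "alice"
    Requirements = (Arch == "X86_64" && OpSys == "LINUX"
                    && Memory >= 1024)
    Rank         = Memory

    // Machine ClassAd (hypothetical values)
    MyType       = "Machine"
    Arch         = "X86_64"
    OpSys        = "LINUX"
    Memory       = 2048
    Requirements = (LoadAvg < 0.3) && (KeyboardIdle > 15 * 60)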

33
Grid Scheduling - Condor
  • Matching / Matchmaking / Task Allocation

34
Grid Scheduling - Condor
  • Schedulers
  • Master-Worker (see the sketch below)
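
  • A minimal Python sketch of the master-worker
    pattern, using multiprocessing as a
    single-machine stand-in for Condor's distributed
    workers (the work function is hypothetical):

    from multiprocessing import Pool

    def work(task):
        # Stand-in for one unit of grid work
        return task * task

    if __name__ == "__main__":
        tasks = range(20)
        with Pool(processes=4) as pool:      # the master farms tasks out
            results = pool.map(work, tasks)  # 4 workers drain the task list
        print(results)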

35
Grid Scheduling - Condor
  • Schedulers
  • DAGMan (see the sketch below)
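
  • DAGMan takes a plain-text input file naming each
    job's submit file and the parent/child edges
    between them. A minimal diamond-shaped example
    (file names hypothetical):

    # diamond.dag
    JOB  A  a.sub
    JOB  B  b.sub
    JOB  C  c.sub
    JOB  D  d.sub
    PARENT A   CHILD B C
    PARENT B C CHILD D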

36
Grid Scheduling - Condor
  • Prepare the job to run unattended (batch
    processing)
  • Select the Condor runtime environment (universe):
    serial jobs, parallel jobs, grid, and
    meta-scheduler
  • Create a submit description file (see the sketch
    below)
  • Submit the job
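
  • A minimal submit description file for a serial
    (vanilla universe) job, with hypothetical file
    names; it is then submitted with condor_submit
    analyze.sub:

    # analyze.sub
    universe   = vanilla
    executable = analyze
    arguments  = input.dat
    output     = analyze.out
    error      = analyze.err
    log        = analyze.log
    queue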

37
Grid Scheduling - Condor
  • Preemptive-resume scheduling
  • Takes advantage of resources that may only be
    available occasionally
  • Handling of job priority
  • Fair sharing
  • Checkpoint
  • Preempt
  • Run elsewhere

38
Open Issues
  • How to study a grid?
  • Actual implementation
  • Simulation
  • Simulation / Emulation
  • Managing workstations / workers
  • Ganglia
  • Availability-based models
  • Failure-based models

39
Open Issues
  • Resource managers
  • Maintaining the state of the resources
  • Push-based models
  • Pull-based models
  • Sharing / Fairness
  • Scheduling
  • The core of the Grid
  • Parallel programs come in many flavors
  • Parameter-sweep applications (massively parallel)
  • Dependency-oriented graphs (DAGs)

40
Open Issues
  • Application profiling
  • Taking the guesswork out of predicting
    application performance on a grid workstation
  • Data staging
  • How to minimize data transfer?
  • How to transfer data in the minimum amount of
    time?
  • A network flow problem
  • esp. in data-parallel applications

41
Applications
  • Geometry-Grid Generation
  • Scientific Visualization
  • Computational Structural Mechanics
  • Computational Electromagnetics
  • Computational Fluid Dynamics
  • Computational Ocean Modeling
  • Computational Chemistry
  • Computational Astrophysics
  • Computational Biology

42
Acknowledgement
  • The introduction part of this presentation is
    taken from Mike Johnson's talk on Jan. 20, 2006,
    at a meeting of the IEEE Dallas Chapter