Grid Scheduling - PowerPoint PPT Presentation

About This Presentation
Title:

Grid Scheduling

Description:

Grid Scheduling: A Distributed Computational Economy and the Nimrod-G Grid Resource Broker. Rajkumar Buyya, Grid Computing and Distributed Systems (GRIDS) Laboratory. – PowerPoint PPT presentation

Slides: 125
Provided by: cloudbusO
Learn more at: http://www.cloudbus.org

Transcript and Presenter's Notes

Title: Grid Scheduling


1
Grid Scheduling
A Distributed Computational Economy and the
Nimrod-G Grid Resource Broker
  • Rajkumar Buyya

Grid Computing and Distributed Systems (GRIDS)
Lab, The University of Melbourne, Melbourne,
Australia. www.gridbus.org
2
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • Drug Design Application Case Study
  • GridSim Toolkit and Simulations
  • Conclusions

3
(No Transcript)
4
Virtual Lab
5
The Gridbus Vision: To Enable Service-Oriented
Grid Computing Business!
WW Grid
Nimrod-G
World Wide Grid!
6
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • GridSim Toolkit and Simulations
  • Conclusions

7
A Typical Grid Computing Environment
Grid Information Service
Grid Resource Broker
Application
R2
R3
R4
R5
RN
Grid Resource Broker
R6
R1
Resource Broker
Grid Information Service
8
Need Grid tools for managing
Application Development Tools
9
What do users want? Users in the Grid Economy and
Their Strategies
  • Grid Consumers
  • Execute jobs for solving problems of varying size and
    complexity
  • Benefit by selecting and aggregating resources
    wisely
  • Trade off timeframe and cost
  • Strategy: minimise expenses
  • Grid Providers
  • Contribute (idle) resources for executing
    consumer jobs
  • Benefit by maximizing resource utilisation
  • Trade off local requirements and market opportunity
  • Strategy: maximise return on investment

10
Sources of Complexity in Grid Resource
Management and Scheduling
  • Size (large number of nodes, providers,
    consumers)
  • Heterogeneity of resources (PCs, workstations,
    clusters, supercomputers, instruments,
    databases, software)
  • Heterogeneity of fabric management systems
    (single system image OS, queuing systems, etc.)
  • Heterogeneity of fabric management policies
  • Heterogeneity of application requirements (CPU,
    I/O, memory, and/or network intensive)
  • Heterogeneity in resource demand patterns (peak,
    off-peak, ...)
  • Applications need different QoS at different
    times (time-critical results). The utility of
    experimental results varies from time to time.
  • Geographical distribution of users located in
    different time zones
  • Differing goals (producers and consumers have
    different objectives and strategies)
  • Insecure and unreliable environment

11
Why are traditional approaches to resource management
and scheduling NOT useful for the Grid?
  • They use a centralised policy that needs
  • complete state information and
  • a common fabric management policy, or a decentralised
    consensus-based policy.
  • Due to too many heterogeneous parameters in the
    Grid, it is impossible to define/get
  • a system-wide performance metric and
  • a common fabric management policy that is
    acceptable to all.
  • The economic paradigm has proved an effective
    institution for managing the decentralisation and
    heterogeneity present in human economies!
  • Hence, we propose/advocate the use of
    computational economy principles in the
    management of resources and the scheduling of
    computations on the Grid.

12
Benefits of Computational Economies
  • It provides a nice paradigm for managing
    self-interested and self-regulating entities (resource
    owners and consumers)
  • Helps in regulating supply and demand for
    resources.
  • Services can be priced in such a way that
    equilibrium is maintained.
  • User-centric / utility-driven: value for money!
  • Scalable
  • No need for a central coordinator (during
    negotiation)
  • Resources (sellers) and users (buyers) can
    make their own decisions and try to maximize
    utility and profit.
  • Adaptable
  • It helps in offering different QoS (quality of
    service) to different applications depending on the
    value users place on them.
  • It improves the utilisation of resources.
  • It offers an incentive for resource owners to be
    part of the Grid!
  • It offers an incentive for resource consumers to
    be good citizens.
  • There is a large body of proven economic principles
    and techniques available that we can easily
    leverage.

13
New Challenges of Computational Economy
  • Resource Owners
  • How do I decide prices? (economic models?)
  • How do I specify them?
  • How do I enforce them?
  • How do I advertise and attract consumers?
  • How do I do accounting and handle payments?
  • ...
  • Resource Consumers
  • How do I decide expenses?
  • How do I express QoS requirements?
  • How do I trade off between timeframe and cost?
  • ...
  • Are any tools, traders, and brokers available to
    automate the process?

14
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for next
    generation Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod-G -- Grid Resource Broker
  • Deadline and Budget Constrained (DBC) Scheduling
    Experiments on World Wide Grid testbed
  • Conclusions

15
Grid Computing Approaches
(Diagram: object-oriented, network-enabled solvers,
Internet/partial-P2P, market/computational economy
(Nimrod-G), and mix-and-match.)
16
Many Grid Projects Initiatives
  • Australia
  • Nimrod-G
  • GridSim
  • Virtual Lab
  • Active Sheets
  • DISCWorld
  • ..new coming up
  • Europe
  • UNICORE
  • MOL
  • UK eScience
  • Poland MC Broker
  • EU Data Grid
  • EuroGrid
  • MetaMPI
  • Dutch DAS
  • XW, JaWS
  • Japan
  • Ninf
  • USA
  • Globus
  • Legion
  • OGSA
  • Javelin
  • AppLeS
  • NASA IPG
  • Condor-G
  • Jxta
  • NetSolve
  • AccessGrid
  • and many more...
  • Cycle Stealing / .com Initiatives
  • Distributed.net
  • SETI@Home, ...
  • Entropia, UD, Parabon, ...
  • Public Forums
  • Global Grid Forum
  • P2P Working Group

http://www.gridcomputing.com
17
Many Testbeds? Who pays? Who regulates
supply and demand?
GUSTO (decommissioned)
World Wide Grid
Legion Testbed
NASA IPG
18
Testbeds so far -- observations
  • Who contributed resources and why?
  • Volunteers: for fun, challenge, fame, charismatic
    apps, and the public good, like the distributed.net
    and SETI@Home projects.
  • Collaborators: sharing resources while developing
    new technologies of common interest (Globus,
    Legion, Ninf, Gridbus, Nimrod-G, etc.). Unless you
    know the lab leaders, it is impossible to get
    access!
  • How long?
  • Short term: excitement is lost, too much
    admin overhead (Globus installation), no incentive,
    policy changes, ...
  • What do we need? A Grid Marketplace!
  • It regulates supply and demand, offers incentives
    for being players, and is a simple, scalable,
    quasi-deterministic, proven model in the real world.

19
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • GridSim Toolkit and Simulations
  • Conclusions

20
Building a Grid Economy (Next Generation Grid
Computing!)
To enable the creation and promotion of a Grid
Marketplace (competitive ASP / Service-Oriented
Computing) . . . and let users focus on their own
work (science, engineering, or commerce)!
21
GRACE: A Reference Grid Architecture for
Computational Economy
(Architecture diagram.) Grid Consumer side: applications and
programming environments over a Grid Resource Broker (job
control agent, schedule advisor, trade manager, grid explorer,
deployment agent) with secure sign-on and QoS. Grid Market
Services: information service, health monitor, Grid Bank.
Grid Service Provider side (Grid Node 1 .. Grid Node N): trade
server with pricing algorithms, trading, accounting, resource
reservation, resource allocation, job execution, storage, and
misc. services, over Grid Middleware Services and resources
R1, R2, .. Rm.
22
Grid Components
  • Grid Apps.: Applications and Portals -- scientific,
    engineering, collaboration, problem-solving environments,
    web-enabled apps.
  • Grid Tools: Development Environments and Tools --
    languages, libraries, debuggers, web tools, resource
    brokers, monitoring.
  • Grid Middleware: Distributed Resource Coupling Services --
    security, information, process, QoS, resource trading,
    market info.
  • Grid Fabric: Local Resource Managers (operating systems,
    queuing systems, TCP/IP, UDP, libraries, app kernels) and
    Networked Resources across Organisations -- computers,
    clusters, storage systems, data sources, scientific
    instruments.
23
Economy Grid = Globus + GRACE
  • Grid Apps.: Applications (science, engineering, commerce)
    and Portals (ActiveSheet).
  • Grid Tools: High-level Services and Tools -- Cactus,
    MPI-G, C/C++, Nimrod Parametric Language, Nimrod-G Broker,
    Higher-Level Resource Aggregators.
  • Grid Middleware: Core Services -- GRAM, GASS, GTS, GARA,
    GBank, GMD, DUROC, MDS, Globus Security Interface (GSI).
  • Grid Fabric: Local Services -- GRD, QBank, JVM, Condor,
    TCP, UDP, eCash, LSF, PBS, Solaris, Irix, Linux.
24
Economic Models
  • Price-based: supply, demand, value, and wealth of the
    economic system
  • Commodity Market Model
  • Posted Price Model
  • Bargaining Model
  • Tendering (Contract Net) Model
  • Auction Model
  • English, first-price sealed-bid, second-price
    sealed-bid (Vickrey), and Dutch auctions
    (consumer: low to high rate; producer: high to
    low rate)
  • Proportional Resource Sharing Model
  • Monopoly (one provider) and Oligopoly (few
    players)
  • consumers may not have any influence on prices.
  • Bartering
  • Shareholder Model
  • Partnership Model

See the SPIE ITCom 2001 paper (with Heinz
Stockinger, CERN)!
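Of the models above, proportional resource sharing is the easiest to make concrete: each user's share of a resource is their payment divided by the total paid. A minimal sketch (function and field names are illustrative, not code from GRACE or Nimrod-G):

```python
def proportional_shares(payments, capacity):
    """Split a resource's capacity among users in proportion to
    what each pays (the proportional resource sharing model above).
    'payments' maps user -> amount paid in grid currency (G)."""
    total = sum(payments.values())
    return {user: capacity * paid / total
            for user, paid in payments.items()}

# Two users paying 2 G and 6 G for a 100-CPU-hour slot:
shares = proportional_shares({"alice": 2, "bob": 6}, capacity=100)
# -> {'alice': 25.0, 'bob': 75.0}
```

Under this model consumers influence their share, but not the unit price, by bidding more or less.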
25
Grid Open Trading Protocols
Trade Manager
Get Connected
Reply to Bid (DT)
API
Trade Server
Pricing Rules
Negotiate Deal(DT)
.
Confirm Deal(DT, Y/N)
DT - Deal Template:
  - resource requirements (TM)
  - resource profile (TS)
  - price (any one can set)
  - status
  - change the above values
  - negotiation can continue
  - accept/decline
  - validity period
Cancel Deal(DT)
Change Deal(DT)
Get Disconnected
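The Deal Template exchange above can be sketched as a toy bargaining loop. This is a hypothetical illustration of the protocol's shape: the `DealTemplate` fields mirror the DT description on the slide, but the 10%-per-round concession rule and all names are invented, not the GRACE trading API.

```python
from dataclasses import dataclass

@dataclass
class DealTemplate:
    resource_requirements: str   # set by the Trade Manager
    resource_profile: str        # set by the Trade Server
    price: float                 # any one can set
    status: str = "open"         # open -> accepted / declined
    validity_period: int = 60    # seconds the offer stays valid

def negotiate(dt, consumer_limit, rounds=5, concession=0.9):
    """Toy bargaining loop: the seller concedes 10% per round until
    the price falls within the consumer's limit or rounds run out.
    The concession rule is invented for illustration."""
    for _ in range(rounds):
        if dt.price <= consumer_limit:
            dt.status = "accepted"     # Confirm Deal(DT, Y)
            return dt
        dt.price *= concession         # Change Deal(DT): counter-offer
    dt.status = "declined"             # Cancel Deal(DT)
    return dt
```

For example, a consumer willing to pay at most 80 G for an offer that starts at 100 G reaches agreement after a few counter-offers.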
26
Cost Model
  • Without a cost model, any shared system becomes
    unmanageable
  • Charge users more for remote facilities than
    their own
  • Choose cheaper resources before more expensive
    ones
  • Cost units (G) may be
  • Dollars
  • Shares in global facility
  • Stored in bank

27
Cost Matrix @ Grid site X
  • Non-uniform costing
  • Encourages use of local resources first
  • Real accounting system can control machine usage

Resource Cost = Function (cpu, memory, disk,
network, software, QoS, current demand, etc.)
Simple price based on peak time, off-peak time,
a discount when there is less demand, ...
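A cost function of this kind might look like the following sketch. All rates, hours, and thresholds here are invented for illustration; a real Grid site would plug in its own pricing policy.

```python
def resource_cost(cpu_secs, hour, demand, base_rate=10):
    """Illustrative per-job cost in the spirit of the slide: a base
    rate per CPU-second, a peak-time surcharge, and a discount when
    demand is low. All constants are made up."""
    rate = base_rate
    if 9 <= hour < 17:      # peak time: double the rate
        rate *= 2
    if demand < 0.3:        # low demand: 25% discount
        rate *= 0.75
    return cpu_secs * rate

# 300 CPU-seconds at 10am under heavy demand:
resource_cost(300, hour=10, demand=0.9)   # -> 300 * 20 = 6000 G
```

Because local users would see a lower base rate than remote ones, such a function also encourages use of local resources first, as the slide notes.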
28
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • GridSim Toolkit and Simulations
  • Conclusions

29
Nimrod/G: A Grid Resource Broker
  • A resource broker for managing, steering, and
    executing task farming (parameter sweep/SPMD
    model) applications on the Grid based on deadline
    and computational economy.
  • Based on users' QoS requirements, our broker
    dynamically leases services at runtime depending
    on their quality, cost, and availability.
  • Key Features
  • A single window to manage and control an experiment
  • Persistent and Programmable Task Farming Engine
  • Resource Discovery
  • Resource Trading
  • Scheduling and Predictions
  • Generic Dispatcher and Grid Agents
  • Transportation of data and results
  • Steering and data management
  • Accounting

30
Parametric Computing (What Users Think of
Nimrod's Power)
Parameters
Magic Engine
Multiple Runs, Same Program, Multiple Data
Killer Application for the Grid!
Courtesy Anand Natrajan, University of Virginia
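"Multiple runs, same program, multiple data" is just the cross-product of parameter values, one job per combination. A minimal sketch in Python (the parameter names are illustrative):

```python
from itertools import product

# One job per combination of parameter values.
params = {"X": range(1, 4), "Y": [5]}
jobs = [dict(zip(params, combo)) for combo in product(*params.values())]
# jobs -> [{'X': 1, 'Y': 5}, {'X': 2, 'Y': 5}, {'X': 3, 'Y': 5}]
```

The farming engine's role is then to run the same program once per such parameter dictionary.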
31
Sample P-Sweep/Task Farming Applications
  • Bioinformatics: Drug Design / Protein Modelling
  • Combinatorial Optimization: Meta-heuristic
    parameter estimation
  • Ecological Modelling: Control Strategies for
    Cattle Tick
  • Sensitivity experiments on smog formation
  • Data Mining
  • Electronic CAD: Field Programmable Gate Arrays
  • High Energy Physics: Searching for Rare Events
  • Computer Graphics: Ray Tracing
  • Finance: Investment Risk Analysis
  • VLSI Design: SPICE Simulations
  • Civil Engineering: Building Design
  • Network Simulation
  • Automobile: Crash Simulation
  • Aerospace: Wing Design
  • Astrophysics
32
Drug Design Data Intensive Computing on Grid
Chemical Databases (legacy, in .MOL2 format)
  • It involves screening millions of chemical
    compounds (molecules) in the Chemical DataBase
    (CDB) to identify those having potential to serve
    as drug candidates.

33
MEG (MagnetoEncephaloGraphy) Data Analysis on the
Grid: Brain Activity Analysis
64-sensor MEG
Analysis: all pairs (64x64) of MEG data, shifting
the temporal region of the MEG data over
time 0 to 29750: 64x64x29750 jobs
(Diagram: data flows from the MEG sensors through
Nimrod-G to data analysis on the World-Wide Grid;
Life-electronics laboratory, AIST.)
  • Provision of expertise in
  • the analysis of brain function
  • Provision of MEG analysis

Collaboration with Osaka University, Japan
34
P-study Applications -- Characteristics
  • Code (Single Program sequential or threaded)
  • Long-running Instances
  • Numerous Instances (Multiple Data)
  • High Resource Requirements
  • High Computation-to-Communication Ratio
  • Embarrassingly/Pleasantly Parallel

35
Thesis
  • Perform a parameter sweep (bag of tasks), utilising
    distributed resources, within T hours or earlier,
    at a cost not exceeding M.
  • Three Options/Solutions
  • Use pure Globus commands
  • Build your own distributed app scheduler
  • Use Nimrod-G (Resource Broker)

36
Remote Execution Steps
Choose Resource
Transfer Input Files
Set Environment
Start Process
Pass Arguments
Monitor Progress
Summary View Job View Event View
Read/Write Intermediate Files
Transfer Output Files
Resource Discovery, Trading, Scheduling,
Predictions, Rescheduling, ...
37
Using Pure Globus/Legion Commands
Do it all yourself! (manually)
Total Cost???
38
Build a Distributed Application Scheduler
Build the app on a case-by-case basis: complicated
construction
E.g., AppLeS/MPI based
Total Cost???
39
Nimrod-G Broker Automating Distributed Processing
Compose, Submit, Play!
40
Nimrod and Associated Family of Tools
Remote Execution Server (on-demand Nimrod Agent)
  • P-sweep App. Composition: Nimrod / enFuzion
  • Resource Management and Scheduling: Nimrod-G Broker
  • Design Optimisations: Nimrod-O
  • App. Composition and Online Visualization: Active Sheets
  • Grid Simulation in Java: GridSim
  • Drug Design on Grid: Virtual Lab
File Transfer Server
41
A Glance at Nimrod-G Broker
Nimrod/G Client
Nimrod/G Client
Nimrod/G Client
Nimrod/G Engine
Schedule Advisor
Trading Manager
Grid Store
Grid Dispatcher
Grid Explorer
Grid Middleware
TM TS
Globus, Legion, Condor, etc.
GE GIS
Grid Information Server(s)
RM TS
RM TS
RM TS
G
C
L
G
Legion enabled node.
Globus enabled node.
L
G
C
L
RM = Local Resource Manager, TS = Trade Server
Condor enabled node.
See HPCAsia 2000 paper!
42
Nimrod/G Grid Broker Architecture
(Layered architecture diagram.)
  • Nimrod-G Clients: legacy applications, customised apps
    (Active Sheet), monitoring and steering portals, P-Tools
    (GUI/scripting, parameter modeling).
  • Nimrod-G Broker: Farming Engine (programmable entities
    management -- resources, jobs, tasks, channels -- with job
    server and database), Meta-Scheduler (schedule advisor
    running scheduling algorithms 1..N), trading manager, grid
    explorer, and dispatcher with actuators (Condor-A,
    Globus-A, Legion-A, P2P-A, ...). The broker forms the
    "IP hourglass" waist of the stack.
  • Middleware: Condor, Globus, Legion, P2P, GMD, GTS, G-Bank,
    agents (agent scheduler), ...
  • Fabric: local schedulers (Condor/LL/NQS),
    PCs/workstations/clusters, radio telescope, storage,
    networks, instruments, databases.
43
A Nimrod/G Monitor
Deadline
Legion hosts
Globus Hosts
Bezek is in both Globus and Legion Domains
44
User Requirements Deadline/Budget
45
Active SheetMicrosoft Excel Spreadsheet
Processing on Grid
46
(No Transcript)
47
Nimrod/G Interactions
Grid Node
Compute Node
User Node
48
Adaptive Scheduling Steps
  1. Discover Resources
  2. Establish Rates
  3. Compose Schedule
  4. Distribute Jobs
  5. Evaluate and Reschedule
  6. Meet requirements? (remaining jobs, deadline,
     budget?) If not, discover more resources and
     repeat.
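The cycle above can be sketched as a loop. The callables stand in for broker services (resource discovery, trading, dispatching); their names and signatures are illustrative, not Nimrod-G's API.

```python
def adaptive_schedule(discover, establish_rates, compose, distribute,
                      evaluate, deadline, budget, jobs):
    """Skeleton of the adaptive scheduling cycle. Returns the
    number of scheduling rounds taken to drain the job list."""
    resources = discover()                   # Discover Resources
    rounds = 0
    while jobs:
        rounds += 1
        rates = establish_rates(resources)   # Establish Rates
        plan = compose(jobs, resources, rates, deadline, budget)
        jobs = distribute(plan)              # Distribute Jobs; pending ones come back
        if jobs and not evaluate(jobs, deadline, budget):
            resources = discover()           # Discover More Resources
    return rounds
```

With stub callables that complete one job per round, three jobs take three rounds; in the real broker, each round re-evaluates the deadline/budget constraints against fresh resource and rate information.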
49
Deadline and Budget Constrained Scheduling
Algorithms

Algorithm/Strategy | Execution Time (Deadline, D) | Execution Cost (Budget, B)
Cost Opt | Limited by D | Minimize
Cost-Time Opt | Minimize when possible | Minimize
Time Opt | Minimize | Limited by B
Conservative-Time Opt | Minimize | Limited by B, but all unprocessed jobs have a guaranteed minimum budget
50
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • GridSim Toolkit and Simulations
  • Conclusions

51
The World Wide Grid Sites
Cardiff/UK, Portsmouth/UK, Manchester/UK
TI-Tech/Tokyo, ETL/Tsukuba, AIST/Tsukuba
EUROPE: ZIB/Germany, PC2/Germany, AEI/Germany,
Lecce/Italy, CNR/Italy, Calabria/Italy,
Poznan/Poland, Lund/Sweden, CERN/Switzerland,
CUNI/Czech R., Vrije/Netherlands
ANL/Chicago, USC-ISI/LA, UTK/Tennessee,
UVa/Virginia, Dartmouth/NH, BU/Boston,
UCSD/San Diego
Kasetsart/Bangkok
Singapore
Monash/Melbourne, VPAC/Melbourne
Santiago/Chile
52
World Wide Grid (WWG)
Australia
North America
ANL SGI/Sun/SP2 USC-ISI SGI UVa Linux
Cluster UD Linux cluster UTK Linux
cluster UCSD Linux PCs BU SGI IRIX
Melbourne U. Cluster VPAC Alpha
Nimrod-GGridbus
GlobusLegion GRACE_TS
Solaris WS
Globus/Legion GRACE_TS
Internet
Europe
Asia
ZIB T3E/Onyx AEI Onyx Paderborn
HPCLine Lecce Compaq SC CNR Cluster Calabria
Cluster CERN Cluster CUNI/CZ Onyx Poznan
SGI/SP2 Vrije U Cluster Cardiff Sun
E6500 Portsmouth Linux PC Manchester O3K
Tokyo I-Tech. Ultra WS AIST, Japan Solaris
Cluster Kasetsart, Thai Cluster NUS, Singapore
O2K
Globus GRACE_TS
Chile Cluster
Globus GRACE_TS
Globus GRACE_TS
South America
53
Experiment-1: Peak and Off-peak
  • Workload
  • 165 jobs, each needing 5 minutes of CPU time
  • Deadline: 1 hr. Budget: 800,000 units
  • Strategy: minimize cost and meet the deadline
  • Execution cost with cost optimisation
  • AU peak time: 471,205 (G)
  • AU off-peak time: 427,155 (G)

54
Application Composition Using Nimrod Parameter
Specification Language

Parameters Declaration:
parameter X integer range from 1 to 165 step 1
parameter Y integer default 5

Task Definition:
task main
  (Copy necessary executables depending on node type)
  copy calc.$OS node:calc
  (Execute program with parameter values on remote node)
  node:execute ./calc $X $Y
  (Copy results file to user home node with jobname as extension)
  copy node:output ./output.$jobname
endtask

  • calc 1 5 → output.j1
  • calc 2 5 → output.j2
  • calc 3 5 → output.j3
  • ...
  • calc 165 5 → output.j165

55
Resources Selected and Price/CPU-sec.

Resource Type & Size | Owner and Location | Grid Services | Peak-time Cost (G) | Off-peak Cost (G)
Linux cluster (60 nodes) | Monash, Australia | Globus/Condor | 20 | 5
IBM SP2 (80 nodes) | ANL, Chicago, US | Globus/LL | 5 | 10
Sun (8 nodes) | ANL, Chicago, US | Globus/Fork | 5 | 10
SGI (96 nodes) | ANL, Chicago, US | Globus/Condor-G | 15 | 15
SGI (10 nodes) | ISI, LA, US | Globus/Fork | 10 | 20
56
Deadline- and Budget-based Cost Minimization
Scheduling
  1. Sort resources by increasing cost.
  2. For each resource in that order, assign as many
     jobs as possible to the resource without
     exceeding the deadline.
  3. Repeat all steps until all jobs are processed.
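The steps above can be sketched as follows. The `(price_per_job, n_cpus)` resource shape is an illustrative simplification, not Nimrod-G's real data structures.

```python
def cost_opt_schedule(n_jobs, job_minutes, deadline_minutes, resources):
    """Sketch of DBC cost-minimisation. 'resources' is a list of
    (price_per_job, n_cpus) pairs; each CPU can finish
    deadline_minutes // job_minutes jobs before the deadline."""
    jobs_per_cpu = deadline_minutes // job_minutes
    plan, remaining = {}, n_jobs
    # Step 1: cheapest resources first.
    for price, cpus in sorted(resources):
        if remaining == 0:
            break
        # Step 2: as many jobs as fit within the deadline.
        take = min(remaining, cpus * jobs_per_cpu)
        if take:
            plan[price] = take
            remaining -= take
    return plan, remaining   # remaining > 0 means the deadline is infeasible

# Experiment-1-like numbers: 165 five-minute jobs, 60-minute deadline.
cost_opt_schedule(165, 5, 60, [(20, 60), (5, 10)])
# -> ({5: 120, 20: 45}, 0): the cheap 10-CPU resource takes 120 jobs.
```

Cheap resources fill up first, and more expensive ones are used only to the extent the deadline demands.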

57
Execution @ AU Peak Time
58
Execution @ AU Off-peak Time
59
Experiment-2: Setup
  • Workload
  • 165 jobs, each needing 5 minutes of CPU time
  • Deadline: 2 hrs. Budget: 396,000 G
  • Strategies: 1. Minimise cost. 2. Minimise time.
  • Execution
  • Optimise Cost: 115,200 (G) (finished in 2 hrs.)
  • Optimise Time: 237,000 (G) (finished in 1.25 hrs.)
  • In this experiment, the time-optimised scheduling
    run cost roughly double the cost-optimised run.
  • Users can now trade off between time and cost.

60
Resources Selected and Price/CPU-sec.

Resource | Location | Grid Services & Fabric | Cost/CPU-sec. or unit | Jobs (Time_Opt) | Jobs (Cost_Opt)
Linux Cluster (Monash) | Melbourne, Australia | Globus, GTS, Condor | 2 | 64 | 153
Linux, Prosecco (CNR) | Pisa, Italy | Globus, GTS, Fork | 3 | 7 | 1
Linux, Barbera (CNR) | Pisa, Italy | Globus, GTS, Fork | 4 | 6 | 1
Solaris/Ultra2 (TITech) | Tokyo, Japan | Globus, GTS, Fork | 3 | 9 | 1
SGI (ISI) | LA, US | Globus, GTS, Fork | 8 | 37 | 5
Sun (ANL) | Chicago, US | Globus, GTS, Fork | 7 | 42 | 4
Total Experiment Cost (G) | | | | 237,000 | 115,200
Time to Complete Exp. (Min.) | | | | 70 | 119
61
Deadline and Budget Constraint (DBC) Time
Minimization Scheduling
  1. For each resource, calculate the next completion
    time for an assigned job, taking into account
    previously assigned jobs.
  2. Sort resources by next completion time.
  3. Assign one job to the first resource for which
    the cost per job is less than the remaining
    budget per job.
  4. Repeat all steps until all jobs are processed.
    (This is performed periodically or at each
    scheduling event.)
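The four steps above can be sketched with a heap keyed by each resource's next completion time. The `(cost_per_job, speed_factor)` resource shape is an illustrative simplification, not the broker's real structures; a job is assumed to take `job_minutes * speed_factor` on a resource.

```python
import heapq

def time_opt_schedule(n_jobs, job_minutes, budget, resources):
    """Sketch of DBC time-minimisation. Returns jobs assigned per
    resource and the total amount spent."""
    # Steps 1-2: heap ordered by each resource's next completion time.
    heap = [(job_minutes * speed, cost, i)
            for i, (cost, speed) in enumerate(resources)]
    heapq.heapify(heap)
    counts, spent = [0] * len(resources), 0.0
    for done in range(n_jobs):
        per_job_budget = (budget - spent) / (n_jobs - done)
        popped, chosen = [], None
        # Step 3: earliest-finishing resource the budget still allows.
        while heap:
            t, cost, i = heapq.heappop(heap)
            if cost <= per_job_budget:
                chosen = (t, cost, i)
                break
            popped.append((t, cost, i))
        for item in popped:
            heapq.heappush(heap, item)
        if chosen is None:
            break                      # budget cannot cover the rest
        t, cost, i = chosen
        counts[i] += 1
        spent += cost
        # Step 4: push the resource back with its next completion time.
        heapq.heappush(heap, (t + job_minutes * resources[i][1], cost, i))
    return counts, spent
```

With a generous budget the fast, expensive resource absorbs most jobs; as the remaining budget per job shrinks, work shifts to cheaper resources, which is exactly the trade-off the experiments above measure.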

62
Resource Scheduling for DBC Time Optimization
63
Resource Scheduling for DBC Cost Optimization
64
Virtual Laboratory
  • Molecular Modeling for Drug Discovery on the
    World-Wide Grid
  • -- Application Case Study --

65
Drug Design Data Intensive Computing on Grid
Chemical Databases (legacy, in .MOL2 format)
  • It involves screening millions of chemical
    compounds (molecules) in the Chemical DataBase
    (CDB) to identify those having potential to serve
    as drug candidates.

66
DataGrid Brokering
Screen 2K molecules in 30 min. for $10
Nimrod/G Computational Grid Broker
Algorithm1
Data Replica Catalogue
CDB Broker
. . .
AlgorithmN
3
CDB replicas please?
advise CDB source?
5
1
4
2
Grid Info. Service
process send results
selection advise use GSP4!
Screen mol.5 please?
Is GSP4 healthy?
7
6
mol.5 please?
CDB Service
CDB Service
GSP1
GSP2
GSPm
GSP4
GSP3(Grid Service Provider)
GSPn
67
Software Tools
  • Molecular Modelling Application (DOCK)
  • Parameter Modelling Tools (Nimrod/enFusion)
  • Grid Resource Broker (Nimrod-G)
  • Data Grid Broker
  • Chemical DataBase (CDB) Management and
    Intelligent Access Tools
  • PDB database Lookup/Index Table Generation.
  • PDB and associated index-table Replication.
  • PDB Replica Catalogue (that helps in Resource
    Discovery).
  • PDB Servers (that serve PDB clients requests).
  • PDB Brokering (Replica Selection).
  • PDB Clients for fetching Molecule Record (Data
    Movement).
  • Grid Middleware (Globus and GrACE)
  • Grid Fabric Management (Fork/LSF/Condor/Codine/)

68
The Virtual Lab. Software Stack
APPLICATIONS
PROGRAMMING TOOLS
USER LEVEL MIDDLEWARE: Nimrod-G and CDB Data Broker
(task farming engine, scheduler, dispatcher, agents),
CDB (chemical database) server
CORE MIDDLEWARE: Globus (security, information, job
submission)
FABRIC: Worldwide Grid --
distributed computers and databases with
different architectures, OSes, and local resource
management systems
69
V-Lab Components Interaction
Grid Node
Compute Node
User Node
70
DOCK code (Enhanced by WEHI, U. of Melbourne)
  • A program to evaluate the chemical and geometric
    complementarity between a small molecule and a
    macromolecular binding site.
  • It explores ways in which two molecules, such as
    a drug and an enzyme or protein receptor, might
    fit together.
  • Compounds which dock to each other well, like
    pieces of a three-dimensional jigsaw puzzle, have
    the potential to bind.
  • So, why is it important to be able to identify small
    molecules which may bind to a target
    macromolecule?
  • A compound which binds to a biological
    macromolecule may inhibit its function, and thus
    act as a drug.
  • E.g., disabling the ability of the HIV virus to
    attach itself to a molecule/protein!
  • With system-specific code changes, we have been
    able to compile it for Sun Solaris, PC Linux, SGI
    IRIX, and Compaq Alpha/OSF1.

Original Code: University of California, San
Francisco. http://www.cmpharm.ucsf.edu/kuntz/
71
Dock input file
  • score_ligand yes
  • minimize_ligand yes
  • multiple_ligands no
  • random_seed 7
  • anchor_search no
  • torsion_drive yes
  • clash_overlap 0.5
  • conformation_cutoff_factor 3
  • torsion_minimize yes
  • match_receptor_sites no
  • random_search yes
  • . . . . . .
  • . . . . . .
  • maximum_cycles 1
  • ligand_atom_file S_1.mol2
  • receptor_site_file ece.sph
  • score_grid_prefix ece
  • vdw_definition_file parameter/vdw.defn
  • chemical_definition_file parameter/chem.defn

72
Parameterize Dock Input File (use Nimrod Tools:
GUI/language)

score_ligand                 ${score_ligand}
minimize_ligand              ${minimize_ligand}
multiple_ligands             ${multiple_ligands}
random_seed                  ${random_seed}
anchor_search                ${anchor_search}
torsion_drive                ${torsion_drive}
clash_overlap                ${clash_overlap}
conformation_cutoff_factor   ${conformation_cutoff_factor}
torsion_minimize             ${torsion_minimize}
match_receptor_sites         ${match_receptor_sites}
random_search                ${random_search}
. . . . . .
. . . . . .
maximum_cycles               ${maximum_cycles}
ligand_atom_file             ${ligand_number}.mol2
receptor_site_file           $HOME/dock_inputs/${receptor_site_file}
score_grid_prefix            $HOME/dock_inputs/${score_grid_prefix}
vdw_definition_file          vdw.defn
chemical_definition_file     chem.defn
chemical_score_file          chem_score.tbl
flex_definition_file         flex.defn
flex_drive_file              flex_drive.tbl
ligand_contact_file          dock_cnt.mol2
ligand_chemical_file         dock_chm.mol2
ligand_energy_file           dock_nrg.mol2
73
Create Dock PlanFile: 1. Define Variables and their
values

parameter database_name label "database_name" text select oneof
  "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc"
  "asinex_pre" "available_chemicals_directory" "inter_bioscreen_s"
  "inter_bioscreen_n" "inter_bioscreen_n_300" "inter_bioscreen_n_500"
  "biomolecular_research_institute" "molecular_science"
  "molecular_diversity_preservation" "national_cancer_institute"
  "IGF_HITS" "aldrich_300" "molecular_science_500" "APP" "ECE"
  default "aldrich_300"
parameter CDB_SERVER text default "bezek.dstc.monash.edu.au"
parameter CDB_PORT_NO text default "5001"
parameter score_ligand text default "yes"
parameter minimize_ligand text default "yes"
parameter multiple_ligands text default "no"
parameter random_seed integer default 7
parameter anchor_search text default "no"
parameter torsion_drive text default "yes"
parameter clash_overlap float default 0.5
parameter conformation_cutoff_factor integer default 5
parameter torsion_minimize text default "yes"
parameter match_receptor_sites text default "no"
parameter random_search text default "yes"
. . . . . . . . . . . .
parameter maximum_cycles integer default 1
parameter receptor_site_file text default "ece.sph"
parameter score_grid_prefix text default "ece"
parameter ligand_number integer range from 1 to 2000 step 1
Molecules to be screened
74
Create Dock PlanFile: 2. Define the Task that jobs
need to do

task nodestart
  copy ./parameter/vdw.defn node:.
  copy ./parameter/chem.defn node:.
  copy ./parameter/chem_score.tbl node:.
  copy ./parameter/flex.defn node:.
  copy ./parameter/flex_drive.tbl node:.
  copy ./dock_inputs/get_molecule node:.
  copy ./dock_inputs/dock_base node:.
endtask

task main
  node:substitute dock_base dock_run
  node:substitute get_molecule get_molecule_fetch
  node:execute sh ./get_molecule_fetch
  node:execute $HOME/bin/dock.$OS -i dock_run -o dock_out
  copy node:dock_out ./results/dock_out.$jobname
  copy node:dock_cnt.mol2 ./results/dock_cnt.mol2.$jobname
  copy node:dock_chm.mol2 ./results/dock_chm.mol2.$jobname
  copy node:dock_nrg.mol2 ./results/dock_nrg.mol2.$jobname
endtask
75
Nimrod / TurboLinux enFuzion: GUI Tools for
Parameter Modeling
76
Docking Experiment Preparation
  • Setup PDB DataGrid
  • Index PDB databases
  • Pre-stage (all) Protein Data Bank (PDB) on
    replica sites
  • Start PDB Server
  • Create Docking GridScore (receptor surface
    details) for a given receptor on home node.
  • Pre-Staging Large Files required for Docking
  • Pre-stage Dock executables and PDB access client
    on Grid nodes, if required (e.g., dock.Linux,
    dock.SunOS, dock.IRIX64, and dock.OSF1 on Linux,
    Sun, SGI, and Compaq machines respectively). Use
    globus-rcp.
  • Pre-stage/Cache all data files (3-13MB each)
    representing receptor details on Grid nodes.
    This could be done on demand by Nimrod/G for each
    job, but a few input files are too large and they
    are required for all jobs. So,
    pre-staging/caching at the http-cache or broker
    level is necessary to avoid the overhead of
    copying the same input files again and again!

77
Chemical DataBase (CDB)
  • Databases consist of small molecules from
    commercially available organic synthesis
    libraries, and natural product databases.
  • There is also the ability to screen virtual
    combinatorial databases, in their entirety.
  • This methodology allows only the required
    compounds to be subjected to physical screening
    and/or synthesis, reducing both time and expense.

78
Target Test Case
  • The target for the test case is endothelin
    converting enzyme (ECE), which is involved in
    heart stroke and other transient ischemia.
  • Ischemia: a decrease in the blood supply to a
    bodily organ, tissue, or part caused by
    constriction or obstruction of the blood vessels.

79
Scheduling Molecular Docking Application on Grid:
Experiment
  • Workload: Docking 200 molecules with ECE
  • 200 jobs, each needing on the order of 3 minutes,
    depending on molecule weight.
  • Deadline: 60 min. Budget: 50,000 G/tokens
  • Strategy: minimise time / cost
  • Execution cost with cost optimisation
  • Optimise Cost: 14,277 (G) (finished in 59.30
    min.)
  • Optimise Time: 17,702 (G) (finished in 34 min.)
  • In this experiment, time-optimised scheduling
    cost an extra 3.5K G compared to cost-optimised
    scheduling.
  • Users can now trade off between time and cost.

80
Resources Selected and Price/CPU-sec.

Resource | Location | Grid Services & Fabric | Cost/CPU-sec. or unit | Jobs (Time_Opt) | Jobs (Cost_Opt)
Sun Ultra01, Monash (master node) | Melbourne, Australia | Globus, Nimrod-G, GTS | -- | -- | --
Ultra-4, AIST | Tokyo, Japan | Globus, GTS, Fork | 1 | 44 | 102
Ultra-4, AIST | Tokyo, Japan | Globus, GTS, Fork | 2 | 41 | 41
Ultra-4, AIST | Tokyo, Japan | Globus, GTS, Fork | 1 | 42 | 39
Ultra-2, AIST | Tokyo, Japan | Globus, GTS, Fork | 3 | 11 | 4
Ultra-8, Sun-ANL | Chicago, US | Globus, GTS, Fork | 1 | 62 | 14
Total Experiment Cost (G) | | | | 17,702 | 14,277
Time to Complete Exp. (Min.) | | | | 34 | 59.30
81
DBC Time Opt. Scheduling
82
DBC Scheduling for Time Optimization No. of
Jobs in Exec.
83
DBC Scheduling for Time Optimization No. of
Jobs Finished
84
DBC Scheduling for Time Optimization Budget
Spent
85
DBC Cost Opt. Scheduling
86
DBC Scheduling for Cost Optimization No. of
Jobs in Exec.
87
DBC Scheduling for Cost Optimization No. of
Jobs Finished
88
DBC Scheduling for Cost Optimization Budget
Spent
89
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • GridSim Toolkit and Simulations
  • Conclusions

90
Grid Simulation Using the GridSim Toolkit
  • Grid Resource Modelling and Application
    Scheduling Simulation

91
Performance Evaluation With Large Scenarios
  • Varying the number of:
  • Resources (1 to 100s to 1000s)
  • Resource capability
  • Cost (access price)
  • Users
  • Deadline
  • Budget
  • Workload
  • Different times (peak/off-peak)
  • We need a repeatable and controllable environment.
  • Can this be achieved on a real Grid testbed?

92
Grid Environment
  • Dynamic
  • Resource conditions, availability, load, and
    users vary with time.
  • Experiments cannot be repeated.
  • Resources and users are distributed and owned by
    different organizations.
  • It is hard to create a controllable environment.
  • Grid testbed size is limited.
  • Also, creating even a moderate testbed is
    resource-intensive, time-consuming, and
    expensive, and requires dealing with many
    political problems (access permissions).
  • Hence, scheduling-algorithm developers turn to
    simulation.

93
Discrete-Event Simulation
  • A proven technique
  • Used in modeling and simulation of real-world
    systems: business processes, factory assembly
    lines, computer systems design.
  • Allows creation of a scalable, repeatable, and
    controllable environment for large-scale
    evaluation.
  • Language- and library-based simulation tools are
    available:
  • Simscript, Parsec
  • Bricks, MicroGrid, SimGrid, GridSim
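To make the technique concrete, here is a minimal discrete-event kernel of the kind these tools build on -- a toy sketch, not GridSim's or SimJava's actual API. Events sit in a priority queue ordered by timestamp, and the simulation clock jumps from event to event rather than ticking in real time:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Minimal discrete-event simulation kernel: a future-event list ordered by
// timestamp, and a clock that advances to each event as it is processed.
public class MiniDes {
    public static class Event implements Comparable<Event> {
        final double time; final String name;
        Event(double time, String name) { this.time = time; this.name = name; }
        public int compareTo(Event o) { return Double.compare(time, o.time); }
    }
    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private double clock = 0.0;

    // schedule an event 'delay' time units into the simulated future
    public void schedule(double delay, String name) { queue.add(new Event(clock + delay, name)); }

    // process all events in timestamp order, returning a trace of "name@time"
    public List<String> run() {
        List<String> trace = new ArrayList<>();
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            clock = e.time;            // jump the virtual clock to the event
            trace.add(e.name + "@" + clock);
        }
        return trace;
    }
}
```

Because the clock jumps between events, a thousand-resource, multi-hour Grid scenario can be replayed in seconds, identically on every run -- precisely the repeatability a real testbed cannot offer.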

94
The GridSim Toolkit: A Java-based tool for Grid
Scheduling Simulations
Layered architecture (top to bottom):
  • Application, user, and Grid scenario input and
    results: application configuration, resource
    configuration, user requirements, Grid scenario,
    output.
  • Grid resource brokers or schedulers.
  • GridSim Toolkit: application modeling,
    information services, resource allocation,
    statistics, job management, resource entities;
    resource modeling and simulation (with time- and
    space-shared schedulers) for clusters, single
    CPUs, SMPs, reservation, load patterns, and
    networks.
  • Basic discrete-event simulation infrastructure:
    SimJava, distributed SimJava.
  • Virtual machine (Java, cJVM, RMI) over
    distributed resources: PCs, workstations,
    clusters, SMPs.
95
GridSim Entities
  • User, Broker, Resource, Grid Information
    Service, Shutdown/Signal Manager, Statistics,
    and Report Writer entities communicate over a
    simulated Internet via Input and Output ports.
  • Each Broker holds a resource list and submits
    its user's application jobs; each Resource runs
    a scheduler over a job-in queue, process queue,
    and job-out queue.
96
GridSim Entities Communication Model
97
Time-Shared Multitasking and Multiprocessing
(Gantt chart: three Gridlets on two PEs, time axis
2 to 26)
  • Gridlet1 (10 MIPS), Gridlet2 (8.5 MIPS), and
    Gridlet3 (9.5 MIPS) arrive at different times
    (G1, G2, G3) and share the PEs.
  • Each arrival or completion changes each
    Gridlet's share of PE power, so completion times
    are re-predicted (P1, P2, P3 mark successive
    predictions).
  • P1-G2: Gridlet2 did not finish at the 1st
    predicted time; P2-G2: Gridlet2 finishes at the
    2nd predicted time.
  • G1F, G2F, G3F mark the Gridlets' finish events.
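The re-prediction behaviour in the chart can be sketched numerically. The toy model below (an illustration, not GridSim code) assumes a single PE shared equally among all active Gridlets, so every arrival or completion changes each Gridlet's progress rate and hence its predicted finish time:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Toy time-shared model: one PE of rating 'mips', shared equally by all
// active Gridlets. Returns each Gridlet's actual finish time.
public class TimeShared {
    public static double[] finishTimes(double[] lengthMi, double[] arrival, double mips) {
        int n = lengthMi.length;
        double[] left = lengthMi.clone(), finish = new double[n];
        double clock = 0;
        Set<Integer> active = new HashSet<>(), pending = new HashSet<>();
        for (int i = 0; i < n; i++) pending.add(i);
        while (!active.isEmpty() || !pending.isEmpty()) {
            double nextArr = Double.POSITIVE_INFINITY;
            for (int i : pending) nextArr = Math.min(nextArr, arrival[i]);
            // predicted completion of the Gridlet with least work remaining
            double rate = active.isEmpty() ? 0 : mips / active.size();
            double nextFin = Double.POSITIVE_INFINITY; int who = -1;
            for (int i : active)
                if (clock + left[i] / rate < nextFin) { nextFin = clock + left[i] / rate; who = i; }
            double next = Math.min(nextArr, nextFin);
            for (int i : active) left[i] -= rate * (next - clock);  // progress everyone
            clock = next;
            if (who >= 0 && left[who] <= 1e-9) { active.remove(who); finish[who] = clock; }
            // new arrivals invalidate the old prediction: the share changes
            for (Iterator<Integer> it = pending.iterator(); it.hasNext(); ) {
                int i = it.next();
                if (arrival[i] <= clock) { active.add(i); it.remove(); }
            }
        }
        return finish;
    }
}
```

For example, a 10 MI Gridlet alone on a 1 MIPS PE is first predicted to finish at t=10; if a second 10 MI Gridlet arrives at t=5, the share halves and the first actually finishes at t=15 -- exactly the stale-prediction correction the P1/P2 markers denote.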
98
Space-Shared Multicomputing
(same three Gridlets and two PEs, time axis 2 to
26)
  • Each Gridlet gets a PE to itself: Gridlet1 and
    Gridlet2 start on the free PEs; Gridlet3 waits
    until a PE is released.
  • P1-G2: Gridlet2 finishes as per the 1st
    prediction, since a dedicated PE's rate never
    changes.
  • Gridlet lengths: Gridlet1 (10 MIPS), Gridlet2
    (8.5 MIPS), Gridlet3 (9.5 MIPS); G1F, G2F, G3F
    mark the finish events.
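By contrast, space-shared (batch-style) allocation gives each Gridlet a whole PE, so a first-come-first-served sketch suffices and the first completion prediction always holds. Again a toy model for illustration, not GridSim code:

```java
// Toy space-shared model: each Gridlet runs alone on the earliest-free PE
// (FCFS). Lengths in MI; identical PEs, each rated 'mips'.
public class SpaceShared {
    public static double[] finishTimes(double[] lengthMi, int numPe, double mips) {
        double[] freeAt = new double[numPe];       // when each PE next becomes free
        double[] finish = new double[lengthMi.length];
        for (int i = 0; i < lengthMi.length; i++) {
            int pe = 0;                            // pick the earliest-free PE
            for (int j = 1; j < numPe; j++) if (freeAt[j] < freeAt[pe]) pe = j;
            finish[i] = freeAt[pe] + lengthMi[i] / mips;  // the rate never changes
            freeAt[pe] = finish[i];
        }
        return finish;
    }
}
```

Because a running Gridlet's PE is never shared, its completion time can be computed once at dispatch and never revised -- which is why the space-shared event diagram later needs no re-prediction.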
99
Simulating an Economic Grid Scheduler
The broker entity sits between the user entity and
the Grid resources (R1..Rn, registered with the
GIS):
  • An experiment interface accepts the user's
    experiment (broker resource list and Gridlet
    queue).
  • Resource discovery and trading consults the GIS
    for available Grid resources.
  • The scheduling flow manager applies the selected
    strategy: time optimise, cost-time (CT)
    optimise, cost optimise, or no optimisation.
  • A dispatcher submits scheduled Gridlets to the
    resources; a Gridlet receptor collects the
    completed Gridlets back.
100
Interactions and Events (Time-Shared)
Entities: Grid Resource, Grid Information Service,
Grid Shutdown, User1's Grid Broker, Grid User1,
Report Writer, Grid Statistics.
  • Resources register with the GIS; the broker gets
    the resource list and resource characteristics.
  • The user submits the experiment; the broker
    submits Gridlet1, Gridlet2, and Gridlet3.
  • The resource schedules internal asynchronous
    events predicting Gridlet completion times. Each
    new arrival changes the resource scenario, so a
    superseded prediction event is ignored and the
    completion time is re-predicted (Gridlet1's
    completion time is predicted three times before
    "Gridlet1 Finished" is delivered); only the most
    recently scheduled internal asynchronous event
    indicates a Gridlet's completion.
  • As each Gridlet finishes it is returned to the
    broker and user; after "Done Expt." the user
    records its statistics and signals "I am Done".
  • When all users are done, the shutdown entity
    terminates the resource, GIS, and statistics
    entities; the report writer gets the statistics
    and creates the report.
(Solid arrows in the diagram are synchronous
events; dashed arrows are asynchronous events.)
101
Interactions and Events (Space-Shared)
Same entities and flow as the time-shared case,
with one difference:
  • Under space-shared allocation, each internal
    asynchronous event is scheduled once and
    delivered as-is to indicate the Gridlet's
    completion; no re-prediction is needed, because
    later arrivals do not change a running Gridlet's
    dedicated PE.
  • Registration, experiment submission, Gridlet
    completion, statistics recording, and shutdown
    proceed as in the time-shared diagram.
(Solid arrows in the diagram are synchronous
events; dashed arrows are asynchronous events.)
102
Experiment-3 Setup Using GridSim
  • Workload synthesis:
  • 200 jobs, each with a processing requirement of
    10K MI (or SPEC-rated equivalent), with random
    variation of 0-10.
  • Exploration of many scenarios:
  • Deadline: 100 to 3600 simulation time units, in
    steps of 500
  • Budget: 500 to 22,000 G, in steps of 1,000
  • DBC strategies:
  • Cost Optimisation
  • Time Optimisation
  • Resources: simulated WWG resources

103
Simulated WWG Resources
104
Deadline and Budget-based Cost-Time Opt Scheduling
  • A combination of the Cost and Time Optimisation
    algorithms:
  • Create resource groups (RGs), each containing
    resources with the same access cost.
  • Sort RGs by increasing cost.
  • For each RG in order, assign as many jobs as
    possible to its resources using time-optimised
    scheduling, without exceeding the deadline.
  • Repeat all steps until all jobs are processed.

105
DBC Cost Optimisation
  • No. of jobs: 200 (heterogeneous)
  • Job length: 100 SPEC/MIPS on a standard CPU,
    with random variation of 0-10.

106
DBC Time Optimisation
107
Comparison: D = 3100, B varied
Cost Opt
Time Opt
Execution Time vs. Budget
Execution Cost vs. Budget
108
WWG Resources in Cost Time Opt
109
Cost-Time Opt Scheduling
Deadline is High
Budget is High
110
CT Opt Time and Budget Spent
111
DBC Conservative Time Min. Scheduling
  1. Split resources into two sets by whether their
    cost per job is less than the budget per job.
  2. For the cheaper resources, assign jobs in
    inverse proportion to the job completion time
    (e.g., a resource with completion time 5 gets
    twice as many jobs as a resource with completion
    time 10).
  3. For the dearer resources, repeat all steps
    (with a recalculated budget per job) until all
    jobs are assigned.
  4. Schedule/reschedule: repeat all steps until all
    jobs are processed.
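A sketch of steps 1-2, with a hypothetical `Resource` type and per-job cost/time figures assumed known up front (the real broker measures them and reschedules):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of DBC conservative time-minimisation: only resources whose cost
// per job fits the per-job budget get work, and faster resources receive
// proportionally more jobs (load inversely proportional to completion time).
public class ConservativeTimeMin {
    public static class Resource {
        final String name; final double costPerJob, timePerJob;
        public Resource(String name, double costPerJob, double timePerJob) {
            this.name = name; this.costPerJob = costPerJob; this.timePerJob = timePerJob;
        }
    }

    public static Map<String, Integer> assign(List<Resource> rs, int jobs,
                                              double budgetPerJob) {
        List<Resource> affordable = new ArrayList<>();
        for (Resource r : rs) if (r.costPerJob <= budgetPerJob) affordable.add(r);
        double totalRate = 0;                       // rate = 1 / completion time
        for (Resource r : affordable) totalRate += 1.0 / r.timePerJob;
        Map<String, Integer> plan = new LinkedHashMap<>();
        int assigned = 0;
        for (Resource r : affordable) {
            int n = (int) Math.round(jobs * (1.0 / r.timePerJob) / totalRate);
            plan.put(r.name, n);
            assigned += n;
        }
        // leftovers (rounding, or no affordable resource) would be retried on
        // the dearer resources with a recalculated budget per job (steps 3-4)
        plan.put("unassigned", jobs - assigned);
        return plan;
    }
}
```

With the text's own example -- completion times 5 and 10 at equal cost -- the faster resource receives exactly twice as many jobs.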

112
Selected GridSim Users!
113
Agenda
  • A quick glance at today's Grid computing
  • Resource Management challenges for
    Service-Oriented Grid computing
  • A Glance at Approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Nimrod/G -- Grid Resource Broker
  • Scheduling Experiments on World Wide Grid testbed
  • GridSim Toolkit and Simulations
  • Conclusions

114
Conclude with a Comparison to the Electrical
Grid...
  • Where are we?
Courtesy Domenico Laforenza
115
Alessandro Volta, in Paris in 1801 at the French
National Institute, demonstrates the battery in
the presence of Napoleon I
  • Fresco by N. Cianfanelli (1841)
  • (Zoological Section "La Specola" of the Natural
    History Museum of Florence University)

116
...and in the future, I imagine a worldwide power
(electrical) Grid...
Oh, mon Dieu! ("Oh, my God!")
What?! This is a madman!
117
2002 - 1801 = 201 years
2002
118
Electric Grid management and delivery methodology
is highly advanced: production utilities feed a
central Grid, which delivers power through
regional and local Grids to the point of
consumption.
Whereas our Computational Grid is still in a
primitive, infant state.
119
Grid Computing: A New Wave?
Can We Predict Its Future?
"I think there is a world market for about five
computers." - Thomas J. Watson Sr., IBM Founder,
1943
120
Summary and Conclusion
  • Grid computing is emerging as a next-generation
    computing platform for solving large-scale
    problems through sharing of geographically
    distributed resources.
  • Resource management is a complex undertaking, as
    systems need to be adaptive, scalable,
    competitive, and driven by QoS.
  • We proposed a framework based on computational
    economies for resource allocation and for
    regulating supply and demand for resources.
  • Scheduling experiments on the World Wide Grid
    demonstrate our Nimrod-G broker's ability to
    dynamically lease services at runtime based on
    their quality, cost, and availability, depending
    on consumers' QoS requirements.
  • Easy-to-use tools for creating Grid applications
    are essential to attract the application
    community and get it on board.
  • The use of an economic paradigm for resource
    management and scheduling is essential for
    pushing Grids into mainstream computing and
    weaving the World-Wide Grid Marketplace!

121
Download Software Information
  • Nimrod Parametric Computing
  • http://www.csse.monash.edu.au/davida/nimrod/
  • Economy Grid Nimrod/G
  • http://www.buyya.com/ecogrid/
  • Virtual Laboratory Toolset for Drug Design
  • http://www.buyya.com/vlab/
  • Grid Simulation (GridSim) Toolkit (Java based)
  • http://www.buyya.com/gridsim/
  • World Wide Grid (WWG) testbed
  • http://www.buyya.com/ecogrid/wwg/
  • Cluster and Grid Info Centres
  • www.buyya.com/cluster/ and www.gridcomputing.com

122
Further Information
  • Books
  • High Performance Cluster Computing, V1, V2,
    R. Buyya (Ed.), Prentice Hall, 1999.
  • The GRID, I. Foster and C. Kesselman (Eds.),
    Morgan Kaufmann, 1999.
  • IEEE Task Force on Cluster Computing
  • http://www.ieeetfcc.org
  • Global Grid Forum
  • www.gridforum.org
  • IEEE/ACM CCGrid'xy: www.ccgrid.org
  • CCGrid 2002, Berlin: ccgrid2002.zib.de
  • Grid workshop: www.gridcomputing.org

123
Further Information
  • Cluster Computing Info Centre
  • http://www.buyya.com/cluster/
  • Grid Computing Info Centre
  • http://www.gridcomputing.com
  • IEEE DS Online - Grid Computing area
  • http://computer.org/dsonline/gc
  • Compute Power Market Project
  • http://www.ComputePower.com

124
Final Word?