GangSim: A Simulator for Grid Scheduling Studies

1
GangSim: A Simulator for Grid Scheduling Studies
  • Catalin L. Dumitrescu
  • The University of Chicago
  • Ian Foster
  • Argonne National Laboratory / The University of Chicago
2
Talk Outline / Part I
  • Part I
      • Introduction
      • Our Approach: GangSim, a discrete simulator
      • Motivating Scenarios
      • Architecture
      • Evaluation Criteria
  • Part II
      • Simulation and Validation Results
      • Conclusions and Questions

3
Introduction
  • Large distributed Grid systems pose new
    challenges
      • Overwhelming resource characteristics
      • Complex workload characteristics
      • Complex interactions and resource allocations
  • Analytical modeling is either impractical or
    impossible

4
Our Approach: GangSim
  • Derived from Ganglia Monitoring Toolkit
  • Real-time simulator
  • Focus on local VO interactions
  • Mixing simulations with real testbeds
  • Provide simple means for result visualization
  • Interactions with various Resource Managers (RMs)

5
GangSim Novelty
  • Simulates (and handles)
      • Sites with RMs
      • VOs and groups
      • Submission hosts
  • Models usage allocations (SLAs) at several levels
  • Can combine simulated results with real
    results collected from a real Grid
  • Useful for simulations of future trends

6
Environment Overview
7
Environment Details
  • Simulations target environments with
      • large number of resources
      • resource owners
      • VOs
  • A few examples are
      • Grid3
      • OSG
      • TeraGrid
      • DataGrid

8
Initial Research Problems
  • What site usage policies are appropriate in a
    Grid environment, and how do these policies
    impact achieved site and VO performance?
  • What usage policy may be applied at the VO
    level?
  • What site selection policies are best suited for
    various Grid environments?

9
GangSim Details
10
GangSim Concepts
  • Site: characterized by various metrics about CPU,
    disk space, and network connectivity
  • VO: composed of groups and users
  • External Schedulers, Local Schedulers, and Data
    Schedulers: scheduling decision points at various
    levels in the grid
  • Policy enforcement points (S-PEP and V-PEP):
    responsible for gathering usage and allocation
    information and for controlling how many jobs
    should run
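
The role of a site policy enforcement point can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Site` class, the `vo_shares` fractions, and the `spep_free_slots` helper are hypothetical names, not GangSim's actual data structures. The sketch only shows the idea of comparing a VO's current usage against its allocation to decide how many more jobs may run.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical site record: CPU count plus per-VO policy and usage."""
    name: str
    cpus: int
    vo_shares: dict = field(default_factory=dict)  # VO name -> allowed CPU fraction (assumed policy form)
    running: dict = field(default_factory=dict)    # VO name -> jobs currently running


def spep_free_slots(site: Site, vo: str) -> int:
    """Assumed S-PEP check: how many more jobs the VO may start at this site."""
    limit = int(site.cpus * site.vo_shares.get(vo, 0.0))
    return max(0, limit - site.running.get(vo, 0))


site = Site(
    name="site_a",
    cpus=100,
    vo_shares={"vo1": 0.25, "vo2": 0.5},
    running={"vo1": 20, "vo2": 60},
)
vo1_slots = spep_free_slots(site, "vo1")  # limit 25, 20 running
vo2_slots = spep_free_slots(site, "vo2")  # limit 50, already over it
```

In this toy setup `vo1` may start a few more jobs while `vo2`, which is over its share, may start none. A V-PEP would apply the same kind of check at the VO level.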

11
GangSim Strategies
  • Various algorithms can be used for scheduling
  • Site usage policy
      • Simple fair share
      • Extensible fair share
      • Commitment fair share
      • Others
  • External Scheduler (ES) task assignment strategies
      • Least recently used (according to available
        allocations)
      • Least used (according to available allocations)
      • Round robin / random assignment
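
Two of the task assignment strategies named above can be sketched in a few lines of Python. This is an illustrative sketch, not GangSim code: the function names, the site names, and the choice of "used CPUs divided by allocated CPUs" as the measure of how used a site is are all assumptions.

```python
from itertools import cycle


def round_robin_selector(sites):
    """Return a callable that hands out sites in a fixed rotating order."""
    rotation = cycle(sites)

    def pick():
        return next(rotation)

    return pick


def least_used_site(used_cpus, allocated_cpus):
    """Pick the site with the lowest used/allocated ratio (assumed usage measure)."""
    return min(used_cpus, key=lambda s: used_cpus[s] / allocated_cpus[s])


sites = ["site_a", "site_b", "site_c"]
pick = round_robin_selector(sites)
rr_order = [pick() for _ in range(4)]  # wraps around after the third site

used = {"site_a": 40, "site_b": 10, "site_c": 25}
alloc = {"site_a": 50, "site_b": 50, "site_c": 30}
lu_choice = least_used_site(used, alloc)  # ratios: 0.8, 0.2, 0.83
```

Round robin ignores load entirely, whereas the least-used rule needs up-to-date usage information for every site, which is the kind of data a monitoring layer would have to supply.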

12
Implementation Details
  • Ganglia (and VO-Centric Ganglia): various
    components were replaced
  • New components
      • Simulator modules: track client and provider
        states
      • Task assignment policies: various algorithms
        invoked at run time
      • Metric aggregators: monitoring sub-components
        used for scheduling decisions
      • Grid components: internal data structures
      • Interfaces: a set of remotely accessible
        CGI scripts

13
Interface Screenshot Example
14
Talk Outline / Part II
  • Part I
      • Introduction
      • Our Approach: GangSim, a discrete simulator
      • Motivating Scenarios
      • Architecture
      • Evaluation Criteria
  • Part II
      • Simulation and Validation Results
      • Conclusions and Questions

15
Achievable Results
  • Interested in three main aspects
      • Task Assignment and Policies
      • Simulated Architecture Variations
      • Simulator Performance

16
Task Assignment and Policies
[Figure panels: Round Robin Assignment Policy; Least Used Site Assignment Policy; Round Robin Assignment Policy; Least Used Site Assignment Policy]
17
Analytical Results
  • Automated performance metric computation
  • Example: average response time (ART)
      • $ART = \sum_{i=1}^{N} RT_i / N$
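
The ART example is a plain arithmetic mean of the per-job response times, which a short Python sketch makes concrete. The function name and the sample response times are made up for illustration and are not taken from the GangSim runs.

```python
def average_response_time(response_times):
    """ART: sum of the per-job response times RT_i divided by the job count N."""
    if not response_times:
        raise ValueError("at least one response time is required")
    return sum(response_times) / len(response_times)


# Three hypothetical jobs with response times in arbitrary time units.
art = average_response_time([10.0, 12.5, 7.5])  # (10.0 + 12.5 + 7.5) / 3 = 10.0
```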

Table 2: Unsynchronized Workloads ART
Table 1: Synchronized Workloads ART

Policy/Limit   No limit   Fix-limit   Ext-limit
Round Robin    11.09      19.39       11.32
Least Used     13.25      15.14       15.06

Policy/Limit   No limit   Fix-limit   Ext-limit
Round Robin    7.78       14.82       9.34
Least Used     10.57      13.68       11.37
18
Simulated Architectures
  • Various architectures can be simulated
  • Requires changing only a few parameters
  • New algorithms can be considered

[Figure panels: Analytical Approach in Site Selection; Observational Approach in Selection]
19
Simulator Performance
  • Important to find simulator limits
  • 15 VOs and 100 sites on a single GangSim instance
    are achievable

15 VOs and 100 sites (6 VOs drawn)
20
Validation Results
  • Results comparison: GangSim vs. Grid3
      • Site Level Comparisons
      • VO Level Comparisons
      • Quantitative Comparisons

21
Site Level Comparisons
  • GangSim and Grid3 on a single site (FermiLab)
  • 4 identical workloads
  • The GangSim and FermiLab executions both
    completed in close to the same time, but show
    rather different execution behavior

Per-VO, FermiLab (Grid3)
Per-VO, FermiLab (GangSim)
22
VO Level Comparisons
  • GangSim and Grid3 runs across 12 sites
  • Starting times: iVDGL-1 at 20 seconds, BTEV-1 and
    USATLAS-1 at 200, LIGO-1 at 700, BTEV-2 at
    800, iVDGL-2 at 1000, USATLAS-2 at 1500, and
    LIGO-2 at 1700
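
A discrete simulator typically processes such workload arrivals in timestamp order from an event queue. The sketch below uses the start times listed above with Python's standard `heapq` module. It is only a generic illustration of event ordering under that assumption, not GangSim's actual event loop, and the variable names are hypothetical.

```python
import heapq

# Workload start times in seconds, as listed on the slide.
start_times = [
    ("iVDGL-1", 20), ("BTEV-1", 200), ("USATLAS-1", 200), ("LIGO-1", 700),
    ("BTEV-2", 800), ("iVDGL-2", 1000), ("USATLAS-2", 1500), ("LIGO-2", 1700),
]

# Build a min-heap keyed on time; ties are broken by workload name.
events = [(t, name) for name, t in start_times]
heapq.heapify(events)

order = []
while events:
    t, name = heapq.heappop(events)  # always the earliest pending event
    order.append((t, name))
```

After the loop, `order` holds the eight workload arrivals from earliest to latest, which is the sequence in which a simulator would inject them.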

Per-VO, 12 sites (Grid3)
Per-VO, 12 sites (GangSim)
23
Quantitative Comparisons
  • Aggregated resource utilization (ARU)
  • Average response time (ART)
      • $ART = \sum_{i=1}^{N} RT_i / N$
  • Average starvation factor (ASF)
      • $ASF = \sum \min(ST_i, RT_i) / \sum ET_i$
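
The ASF formula can be evaluated mechanically once the three per-job quantities are available. The slides do not define ST_i, RT_i, or ET_i, so the Python sketch below simply treats them as three equal-length numeric sequences. The function name and sample values are hypothetical.

```python
def average_starvation_factor(st, rt, et):
    """ASF: sum over jobs of min(ST_i, RT_i), divided by the sum of ET_i.

    The meaning of ST, RT, and ET is not spelled out on the slide; this
    function only assumes they are per-job numbers of equal length.
    """
    if not (len(st) == len(rt) == len(et)):
        raise ValueError("st, rt and et must have the same length")
    total_et = sum(et)
    if total_et == 0:
        raise ValueError("sum of ET values must be non-zero")
    return sum(min(s, r) for s, r in zip(st, rt)) / total_et


# Hypothetical values for three jobs: min terms are 5, 15, 8; ET sum is 14.
asf = average_starvation_factor([5, 20, 8], [10, 15, 12], [4, 6, 4])  # 28 / 14 = 2.0
```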

Table 3: Simulation (S) vs. Grid3 (G) Metrics

Metric   Site (S)   Site (G)   VO (S)    VO (G)
ARU      0.12       0.16       0.07      0.06
ASF      2.36       3.9        9.88      5.09
ART      1521.31    1100.7     1824.25   639.5
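
One way to read Table 3 is to divide each simulated value by its Grid3 counterpart, so that a ratio near 1 means close agreement. The short Python sketch below does this with the values from the table. The ratio is an illustrative choice of comparison, not a metric defined in the talk, and the dictionary layout is hypothetical.

```python
# (simulation, Grid3) pairs copied from Table 3.
table3 = {
    "ARU": {"site": (0.12, 0.16), "vo": (0.07, 0.06)},
    "ASF": {"site": (2.36, 3.9), "vo": (9.88, 5.09)},
    "ART": {"site": (1521.31, 1100.7), "vo": (1824.25, 639.5)},
}

# Ratio of simulated to measured value, rounded to two decimals.
ratios = {
    metric: {level: round(sim / grid, 2) for level, (sim, grid) in levels.items()}
    for metric, levels in table3.items()
}

for metric, levels in ratios.items():
    print(metric, levels)
```

On these numbers the site-level ratios stay closer to 1 than the VO-level ratios for ASF and ART, with the VO-level ART showing the largest gap between simulation and Grid3.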
24
Conclusions about GangSim
  • A Grid simulator for analyzing different
    scheduling policies in multi-site, multi-VO
    environments
  • Designed around discrete simulation techniques and
    the modeling of important system components
  • Demonstrated through studies of different
    VO-level scheduling policies in the presence of
    different local site resource allocation policies

25
Addressed Questions
  • What site usage policies are appropriate in a
    Grid environment, and how do these policies
    impact achieved site and VO performance?
  • What usage policy may be applied at the VO
    level?
  • What site selection policies are best suited for
    various Grid environments?

26
Thanks
  • Questions?