Application of Methods of Queuing Theory to Scheduling in GRID - PowerPoint PPT Presentation

Slides: 28
Provided by: artsand5
Transcript and Presenter's Notes


1
Application of Methods of Queuing Theory to
Scheduling in GRID
  • A Queuing Theory-based mathematical model is
    presented, and an explicit form of the optimal
    control procedure obtained as the solution to
    the problem of maximizing the system throughput.

2
Why Queuing Theory?
  • Indeed, there are queues in real GRIDs
  • The services GRIDs offer to end users closely
    resemble the services offered by telephone
    networks, the typical subject of study in Queuing
    Theory
  • The complexity of the associated processes leaves
    little option but to use probabilistic
    techniques

3
Complexity: The Principal Limiting Factor to
Modeling
  • GRIDs are very complicated systems themselves
  • GRIDs are composed of smaller complicated systems
      • Computer hardware
      • Networks
      • Software
  • GRIDs are embedded into larger complicated
    systems
      • Scientific organizations
      • R&D activities
      • Globalization processes

4
Stopping Decomposition as Soon as Possible to
Avoid Unnecessary Complexity
  • Separate the phenomena specific to scheduling in
    GRID from the generic phenomena
  • Model complicated behavior of the components with
    probabilistic techniques
  • Find the most general expression of the effects

5
Ultimate Stopper of Decomposition
  • No entity in the modeled system should be
    decomposed, if the system persists when that
    entity is replaced with another similar one.

6
Implications
  • There is no need to develop detailed models of
    computers, networks, software or interaction
    external to GRID
  • There is no need to model the intra-GRID
    interaction, which does not directly affect
    scheduling
  • Information about how long it will take to
    process a demand on each node is all we need to
    know about the demand.

7
Mathematical Concepts Involved
  • Probability
  • Poisson Process
  • Multivariate Distribution
  • Linear Programming
  • Convergence in Law

8
Simplified Model
  • There is a finite number of classes of demands
    (all demands from the same class have equal
    complexity)
  • Sub-Model of Structure
      • Set of N nodes with queues
  • Sub-Model of Flow of Demands
      • Poisson process of arrivals with intensity $\lambda$
      • M classes of demands
  • Sub-Model of Scheduling Procedure
      • Recognizes distinct classes of demands and routes
        the demands to the nodes it chooses

9
Sub-Model: Structure

10
Sub-Model: Flow of Demands
  • Demands from class j arrive with intensity
    $\lambda_j = \lambda p_j$ ($\lambda_1 + \dots + \lambda_M = \lambda$)
  • Upon arrival, a demand from class j is routed to
    node i with probability $s_{i,j}$
  • A demand from class j requires $\tau_{i,j}$ units of
    processing time, if routed to node i
  • The computing time is incompressible:
    processing two demands with complexities $T_1$ and
    $T_2$ at a particular node requires $T_1 + T_2$ time units
    independently of the order (or level of
    parallelism) in which they are processed

11
Two Important Facts About Poisson Processes
  • Let $X_1$ and $X_2$ be independent Poisson processes
    with intensities $\lambda_1$ and $\lambda_2$. Then $X_1 + X_2$ is a Poisson
    process with intensity $\lambda_1 + \lambda_2$.
  • Suppose a Poisson process X with intensity $\lambda$ is
    split into $X_1$ and $X_2$: each event is passed,
    independently of the others, to $X_1$ with probability p
    and otherwise to $X_2$. Then $X_1$ and $X_2$ are Poisson
    processes with intensities $p\lambda$ and $(1-p)\lambda$.
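
The two facts above can be checked empirically. The following is a minimal simulation sketch, not part of the original slides: the rates 2 and 3, the probability 0.25, the horizon, and the helper name `poisson_arrivals` are all illustrative assumptions. It only compares empirical event rates with the predicted intensities; it does not test that the resulting streams are actually Poisson.

```python
import random


def poisson_arrivals(rate, horizon, rng):
    """Arrival times of a Poisson process with the given rate on [0, horizon)."""
    times = []
    t = rng.expovariate(rate)  # inter-arrival gaps are exponential with mean 1/rate
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


rng = random.Random(0)           # fixed seed so the run is reproducible
horizon = 10_000.0               # illustrative observation window
lam1, lam2, p = 2.0, 3.0, 0.25   # illustrative intensities and split probability

# Superposition: merge two independent streams into one.
merged = sorted(poisson_arrivals(lam1, horizon, rng)
                + poisson_arrivals(lam2, horizon, rng))
merged_rate = len(merged) / horizon   # should be close to lam1 + lam2

# Splitting: each event independently goes to X1 with probability p, else to X2.
x1 = [t for t in merged if rng.random() < p]
x1_rate = len(x1) / horizon                    # close to p * (lam1 + lam2)
x2_rate = (len(merged) - len(x1)) / horizon    # close to (1 - p) * (lam1 + lam2)
```

In the model, these two facts are what make the stream of demands arriving at each node a Poisson process again: the arrivals are split by class and by routing decision, then merged per node.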

12
Flow of Demands and Scheduling Procedure

13
Sub-Model: Scheduling Procedure
  • The GRID operates in a stable environment
  • Routing of any demand at each moment depends on
    the current state of the system only
  • For all nodes, the load $\rho_i < 1$
  • $\Rightarrow$
  • The system can operate in the stationary mode
  • The stationary mode is stable

14
Stationary Mode

15
Implications of Stationary Operation
  • Incoming demands of class j are routed to node i
    with stationary probability $s_{i,j}$
  • The load of node i has the form
    $\rho_i = \lambda \sum_j s_{i,j} \tau_{i,j} p_j < 1$
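
As a small illustration of the load formula, the sketch below computes the load of every node from the total arrival intensity, the class probabilities, a routing matrix and a matrix of processing times, then checks the condition that every load is below 1. The function names and the two-node, two-class numbers are hypothetical and chosen only for the example.

```python
def node_loads(lam, p, s, tau):
    """Load of each node i: lam * sum_j s[i][j] * tau[i][j] * p[j].

    lam -- total arrival intensity
    p   -- p[j], probability that an arriving demand belongs to class j
    s   -- s[i][j], probability that a class-j demand is routed to node i
    tau -- tau[i][j], processing time of a class-j demand on node i
    """
    return [
        lam * sum(s_ij * t_ij * p_j for s_ij, t_ij, p_j in zip(s_i, tau_i, p))
        for s_i, tau_i in zip(s, tau)
    ]


def is_stable(loads):
    """Stationary operation requires every node load to be strictly below 1."""
    return all(rho < 1.0 for rho in loads)


# Hypothetical system: each node is fast on one class and slow on the other.
p = [0.5, 0.5]
tau = [[1.0, 4.0],
       [4.0, 1.0]]
s_uniform = [[0.5, 0.5],   # ignore classes, split every class evenly
             [0.5, 0.5]]
s_matched = [[1.0, 0.0],   # send each class to the node that is fast on it
             [0.0, 1.0]]

loads_uniform = node_loads(1.0, p, s_uniform, tau)
loads_matched = node_loads(1.0, p, s_matched, tau)
```

With these numbers and an intensity of 1, the even split overloads both nodes while the class-aware routing keeps both loads below 1, which is the effect the optimization problem exploits.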

16
Optimization Problem

17
Linear Programming
  • It is possible to rewrite the constraints in the
    following form:
      • $\theta_i = \sum_j s_{i,j} \tau_{i,j} p_j$
      • $\theta_i \le \theta$
      • $\theta \to \min$
  • Now it is an LP problem
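
For reference, one way to write the complete linear program is sketched below. The auxiliary variable is called $\theta$ here, and the processing times $\tau_{i,j}$ follow the notation of the load formula. The normalization constraints (every demand is routed to exactly one node, and routing probabilities are non-negative) are not visible in this transcript; they are stated as an assumption that follows from the $s_{i,j}$ being routing probabilities.

```latex
\begin{aligned}
\text{minimize}\quad   & \theta \\
\text{subject to}\quad & \sum_{j=1}^{M} s_{i,j}\,\tau_{i,j}\,p_j \le \theta, && i = 1,\dots,N, \\
                       & \sum_{i=1}^{N} s_{i,j} = 1,                        && j = 1,\dots,M, \\
                       & s_{i,j} \ge 0,                                     && i = 1,\dots,N,\ j = 1,\dots,M.
\end{aligned}
```

If $\theta^*$ is the optimal value, the load of node i under the optimal routing is at most $\lambda\theta^*$, so every load stays below 1 whenever $\lambda < 1/\theta^*$. Under these assumptions $1/\theta^*$ is the least upper bound on the sustainable arrival intensity.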

18
From Simplified to Real-World Model
  • How to handle non-discrete distributions of
    demands?
  • How to handle errors in classification (imperfect
    information)?
  • What about non-stationary modes?
      • Short-term excesses are not fatal because of
        stability
      • Long-term changes in the distribution of demands can
        render the scheduling procedure non-optimal

19
Approximating the Actual Distribution of Demands with
a Discrete Distribution

20
A Better Approximation

21
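What Happens When $M \to \infty$?
  • Simplified
      • s is a matrix
      • $s: N \times M \to [0,1]$
      • $\tau: N \times M \to [0,\infty)$
      • $\rho_i = \lambda \sum_j s_{i,j} \tau_{i,j} p_j$
  • Marginal
      • s is a function
      • $s_i: \mathbb{R}^M \to [0,1]$
      • $\tau$ is a multivariate random value (in $\mathbb{R}^M$)
      • $\rho_i = \lambda E[\tau_i s_i(\tau)]$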

22
Handling Imperfect Information
  • Average values of $\tau_{i,j}$ can be used
  • The scheduling procedure should be iteratively
    re-evaluated when more information becomes
    available
  • In real-world applications, the exact
    distribution of demands is unknown, but can be
    approximated from the history of the system
    operation
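
One simple way to maintain such averages from the operation history is an incremental running mean per (node, class) pair, sketched below. The class name `ProcessingTimeEstimator` and its methods are hypothetical; the slides do not prescribe any particular estimator.

```python
class ProcessingTimeEstimator:
    """Running average of observed processing times per (node, demand class)."""

    def __init__(self):
        self._count = {}
        self._mean = {}

    def observe(self, node, demand_class, duration):
        """Fold one completed demand into the running mean for its pair."""
        key = (node, demand_class)
        n = self._count.get(key, 0) + 1
        mean = self._mean.get(key, 0.0)
        self._count[key] = n
        # Incremental mean update: no need to store the full history.
        self._mean[key] = mean + (duration - mean) / n

    def mean(self, node, demand_class, default=None):
        """Current estimate, or `default` if the pair was never observed."""
        return self._mean.get((node, demand_class), default)


est = ProcessingTimeEstimator()
est.observe(0, "short", 2.0)
est.observe(0, "short", 4.0)
est.observe(1, "short", 10.0)
```

The estimates can then be used in place of the exact processing times whenever the scheduling procedure is re-evaluated.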

23
A Comparison
  • Let $\xi$ be an exponentially distributed random
    value with average 1
  • $\tau_{i,j} = 1 + \xi$
  • The trivial procedure distributes demands with equal
    probability to any node
  • An optimized procedure is obtained as shown
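
A comparison of this kind can be approximated by simulation. The sketch below rests on assumptions that go beyond the slides: each demand gets an independent processing time of 1 plus an exponential value with mean 1 on every node, and a "route to the fastest node" rule is used as a stand-in for the optimized procedure, not the LP solution itself. It estimates the throughput bound as 1 divided by the largest expected work per demand assigned to any node.

```python
import random


def max_throughput(n_nodes, route, samples=50_000, seed=0):
    """Monte Carlo estimate of 1 / max_i E[tau_i * s_i(tau)].

    route(tau) returns the routing probabilities s_i(tau), one per node.
    """
    rng = random.Random(seed)
    work = [0.0] * n_nodes
    for _ in range(samples):
        # Assumed model: processing time on each node is 1 + Exp(1), independent.
        tau = [1.0 + rng.expovariate(1.0) for _ in range(n_nodes)]
        for i, s_i in enumerate(route(tau)):
            work[i] += s_i * tau[i]
    # Expected work per demand on the busiest node is max(work) / samples.
    return samples / max(work)


def trivial(tau):
    """Equal routing probability for every node, ignoring processing times."""
    return [1.0 / len(tau)] * len(tau)


def fastest_node(tau):
    """Route the demand to the node with the smallest processing time."""
    best = min(range(len(tau)), key=tau.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(tau))]
```

Under these assumptions the trivial procedure has an expected per-node work of 2/N per demand, so its bound is N/2; calling `max_throughput(n, fastest_node)` for a few values of `n` shows how much an informed routing rule gains over it.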

24
Scheduling: Trivial vs. Optimized
[Figure: maximum throughput versus number of nodes, for the optimized and trivial procedures]

25
Conclusions
  • The exact upper bound of throughput for a given
    GRID can be estimated
  • A scheduling procedure which achieves this limit
    can be constructed from a solution of an LP
    problem
  • The optimal scheduling procedure should be
    non-deterministic
  • Trivial and deterministic schedulers are
    generally unlikely to achieve the theoretical
    limit

26
References
  • L. Kleinrock, Queueing Systems, 1976
  • Andrei Dorokhov, Simulation simple models and
    comparison with queueing theory,
    http://csdl.computer.org/comp/proceedings/hpdc/2003/1965/00/19650034abs.htm
  • Atsuko Takefusa, Osamu Tatebe, Satoshi Matsuoka,
    Youhei Morita, Performance Analysis of
    Scheduling and Replication Algorithms on Grid
    Datafarm Architecture for High-Energy Physics
    Applications
  • GNU Linear Programming Kit, http://www.fsf.org

27
My Special Thanks To
  • Dr. V.A. Ilyin for directing my work in the field
    of GRID systems
  • Prof. A.N. Shiryaev for directing my work in the
    Theory of Probability