Coordinated Scheduling of Jobs in Distributed Systems

1
Coordinated Scheduling of Jobs in Distributed
Systems
  • Dr. Dimitrios S. Nikolopoulos
  • CSL/UIUC

2
Outline
  • Motivation for coordinated scheduling
  • Gang scheduling
  • Coscheduling with implicit information
  • Dynamic coscheduling
  • Dynamic coscheduling with tight memory resources

3
Motivation
  • Processes in distributed systems need to
    communicate
  • Parallel jobs
  • Client/server
  • Peer-to-peer
  • What happens if we send a message to a process
    that is not scheduled to receive the message?

4
Uncoordinated scheduling
[Timeline: P1 sends to P2 while P2 is not scheduled; P1 blocks, leaving unutilized processor time until P2 runs]
5
Cascaded effect
[Timeline: P1 blocks sending to the unscheduled P2; when P2 finally runs and replies, P1 is unscheduled in turn, so the idle time cascades]
6
Purpose of coscheduling
  • Try to schedule communicating processes
    simultaneously
  • If for any reason the peer is not scheduled,
    release the processors/resources for other jobs

7
Ideal coscheduling
[Timeline: P1 sends to P2 and blocks; whenever P1 or P2 is not running, its processor is released to run another job]
8
Forms of coscheduling
  • Explicit coscheduling, a.k.a. gang scheduling
  • Implicit coscheduling
  • Dynamic coscheduling

9
Gang scheduling
[Matrix: time slices T01 through T015 cycle through job1, job2, and job3, so all of a job's processes run in the same time slice across the processors]
10
Gang scheduling
  • Good for tightly synchronized programs
  • Inappropriate for client/server or peer-to-peer
  • Hard to implement even on tightly coupled
    multiprocessors
  • Requires a centralized controller
  • Even distributed or hierarchical control schemes
    for gang scheduling don't scale well enough for
    large systems
  • Requires time quanta on the order of tens of
    seconds

11
Coscheduling with implicit information
  • Try to coschedule processes controlled by
    different controllers (i.e. operating systems)
    using implicit information available locally
  • Types of implicit information
  • The turnaround time of a message
  • The frequency of arrivals of messages

12
Basic Idea
  • Try to infer whether your peer is scheduled by
    checking
  • The time it takes your peer to respond to a
    message
  • The number of messages sent by/to your peer
    recently

13
Implicit vs. dynamic coscheduling
  • Implicit coscheduling is based on the
    responsiveness of the peer
  • The difficulty of implicit coscheduling
    algorithms is to determine how much time a job
    should wait for its peer to respond
  • Dynamic coscheduling is based on message
    send/receive frequencies
  • The difficulty of dynamic coscheduling is the
    implementation, which requires modifying the
    operating system

14
Basic impl. coscheduling algorithm
  • Wait for a predefined amount of time
  • This interval may vary according to measurements
    obtained at runtime, but the way we compute it is
    predefined
  • If a message does not arrive within the
    predefined time interval, release the processor
    (see the sketch below)
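
A minimal sketch of this spin-then-block loop, assuming a hypothetical non-blocking poll() and a made-up waiting interval; real systems implement this inside the communication library:

    # Sketch of the basic implicit coscheduling loop.
    # poll() and WAIT_S are hypothetical stand-ins, not a real API.
    import os
    import time

    WAIT_S = 0.0005            # predefined waiting interval (assumed value)
    message_pending = False    # would be set by the messaging layer

    def poll():
        """Non-blocking check for a pending message (stub)."""
        return message_pending

    def spin_then_block():
        start = time.monotonic()
        while not poll():
            if time.monotonic() - start > WAIT_S:
                # No message within the predefined interval:
                # release the processor so another job can run.
                os.sched_yield()
                start = time.monotonic()
        # A message arrived while we were scheduled; receive it.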

15
Waiting Interval
  • Simple scenario: ping-pong, or protocols with
    handshaking
  • Wait at least 2l + 2c + o (the turnaround time of
    a message, including overhead)
  • Competitive solution: wait for another 2l + 2c + o
  • This limits the cost we pay for waiting
    needlessly to 4l + 4c + 2o
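
The 2-competitive argument behind these numbers, written out as a sketch (l, c, o are the latency, per-message cost, and overhead terms from the slide):

    % One round trip costs T = 2l + 2c + o. Waiting one extra T before
    % releasing the processor means we never wait more than 2T:
    \[
      T = 2l + 2c + o, \qquad
      T_{\text{wasted}} \le T + T = 4l + 4c + 2o = 2T
    \]
    % i.e., at most twice what an omniscient scheduler would pay.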

16
Competitive waiting
[Timeline: P1 sends to P2 and blocks; the l and c+o segments trace the round-trip that defines the waiting interval]
17
Problems with competitive waiting
  • Difficult to compute analytically even for simple
    communication patterns
  • Example: hard to find an analytical solution for
    barriers (Arpaci-Dusseau, ACM TOCS)
  • Impossible to compute analytically for
    unstructured or complex communication patterns
  • Personalized, broadcast, shuffles, etc.
  • Requires modifications in the communication
    libraries and the operating system.

18
Dynamic coscheduling
  • Proposed to cope with the difficulties of
    implicit coscheduling
  • Implementation
  • Computing the waiting interval
  • Main idea
  • If a process is not scheduled when a message
    arrives for it, schedule the process asap! (see
    the sketch below)
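
A sketch of that idea as a run-queue operation; Process, RunQueue, and the boost amount are hypothetical simulation constructs, not a real kernel interface:

    # Sketch of dynamic coscheduling: on message arrival, boost the
    # receiver so the local scheduler picks it as soon as possible.
    import heapq
    from dataclasses import dataclass, field

    @dataclass(order=True)
    class Process:
        priority: int                     # lower value = runs sooner
        name: str = field(compare=False)
        running: bool = field(default=False, compare=False)

    class RunQueue:
        def __init__(self):
            self.heap = []
        def add(self, p):
            heapq.heappush(self.heap, p)
        def pick_next(self):
            return heapq.heappop(self.heap)

    BOOST = 10  # assumed boost amount

    def on_message_arrival(rq, receiver):
        """Called from the message-delivery path for each arrival."""
        if not receiver.running:
            receiver.priority -= BOOST    # jump toward the queue head
            heapq.heapify(rq.heap)        # restore order after the change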

19
Intrusive dynamic coscheduling
  • Upon message arrival, schedule the receiving
    process
  • Intrusive, because it might mess up the
    scheduling queue
  • Frequently communicating jobs are treated
    favorably compared to non-communicating jobs

20
Less intrusive dynamic coscheduling
  • Periodic priority boost
  • In each time slice
  • Boost the priority of communicating processes
    according to their recent message send/receive
    history (see the sketch below)
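
A sketch of the per-slice boost, with an assumed boost scale and history decay (both made-up parameters):

    # Sketch of the less intrusive variant: once per time slice, boost
    # priorities in proportion to recent message traffic.
    def periodic_boost(priorities, msg_counts, scale=2, decay=0.5):
        """priorities: name -> priority (lower = better);
        msg_counts: name -> messages sent/received this slice."""
        for name, count in msg_counts.items():
            priorities[name] -= scale * count      # reward communicators
            msg_counts[name] = int(count * decay)  # age the history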

21
Benefits of dynamic coscheduling
  • Effective coscheduling if the communication
    pattern is unstructured, unknown, or hard to
    analyze for determining the right waiting
    interval
  • Simple implementation
  • You still have to modify the operating system
  • But the information/variables you have to access
    are already there (message buffers, priority)

22
Coscheduling and thrashing prevention
  • Adaptive scheduling under memory pressure on
    multiprogrammed clusters (Nikolopoulos and
    Polychronopoulos, IEEE/ACM CCGrid '02)
  • partly based on
  • Adaptive scheduling under memory pressure on
    multiprogrammed SMPs (Nikolopoulos and
    Polychronopoulos, IEEE IPDPS '02)

23
Thrashing prevention
  • Thrashing: the situation in which a computer
    pages data to/from the disk without doing useful
    work
  • Reason: the running programs consume more memory
    than the system has
  • Impact: severe (slowdowns by a factor of 100x),
    because I/O from the disk is hundreds of times
    slower than accessing memory

24
Adaptive algorithm for thrashing prevention
  • Programs prevent thrashing by
  • Dynamically detecting paging at memory allocation
    points
  • Backing off until paging stops (see the sketch
    below)
  • Memory-hungry jobs make room for memory-resident
    jobs
  • The algorithm works for dynamic workloads (random
    arrivals and departures of jobs)
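
A sketch of the detect-and-back-off step; the paging probe reads the swap-out counter in Linux's /proc/vmstat, and the polling interval is an assumption:

    # Sketch: at a memory-allocation point, check whether the system
    # is paging and back off until it stops.
    import time

    def swapped_out_pages():
        """Cumulative pages swapped out (pswpout in /proc/vmstat)."""
        with open("/proc/vmstat") as f:
            for line in f:
                if line.startswith("pswpout "):
                    return int(line.split()[1])
        return 0

    def back_off_while_paging(interval=0.1):
        before = swapped_out_pages()
        time.sleep(interval)
        while swapped_out_pages() > before:  # swap-outs still increasing
            before = swapped_out_pages()
            time.sleep(interval)             # keep backing off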

25
Thrashing prevention on an SMP
26
Coscheduling under memory constraints
  • If one node of a distributed system thrashes, the
    effect may be felt throughout the system
    (busy-waiting jobs)
  • Coscheduling solves the problem of needless
    waiting, but not the problem of thrashing. We
    either need a large number of jobs to sustain
    utilization, or we must prevent thrashing (the
    proposed solution)

27
Combining coscheduling with thrashing prevention
  • Scenarios
  • Non-communicating jobs that fit in memory ->
    nothing to do
  • Communicating jobs that fit in memory ->
    coscheduling takes priority
  • Non-communicating jobs that do not fit in memory
    -> thrashing prevention takes priority
  • Communicating jobs that do not fit in memory ->
    coscheduling or thrashing prevention first? (see
    the decision sketch below)
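
The four scenarios as a decision function, with the tie-break from the next slide (thrashing prevention first) already filled in:

    # Sketch of the scenario table as a policy choice.
    def pick_policy(communicates: bool, fits_in_memory: bool) -> str:
        if fits_in_memory:
            return "coscheduling" if communicates else "nothing to do"
        # Under memory pressure, thrashing prevention wins, even for
        # communicating jobs (the delayed job's priority is boosted
        # later, once it fits in memory).
        return "thrashing prevention"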

28
Combining coscheduling with thrashing prevention
  • If coscheduling takes priority
  • The communicating job will consume the message,
    but the system will page
  • If thrashing prevention takes priority
  • The communicating job will be delayed
  • Paging will be avoided
  • The priority of the delayed job is boosted, so as
    soon as the job fits in memory it consumes the
    message
  • We choose the second option, due to the
    unpredictable latency of paging

29
Details
  • Implementation of paging prevention in Linux uses
    a shared-memory interface to communicate memory
    utilization info
  • Memory requirements of jobs are estimated
    dynamically by intercepting system calls (see the
    sketch below)
  • Dynamic coscheduling requires no more than 10
    lines of code added to the Linux scheduler
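
A sketch of the dynamic estimation; the real implementation intercepts system calls, which is approximated here with a Python-level wrapper around a hypothetical allocator:

    # Sketch: estimate a job's memory requirement by interposing on
    # allocation calls and keeping a running total.
    import functools

    estimated_bytes = 0

    def track_allocations(alloc_fn):
        @functools.wraps(alloc_fn)
        def wrapper(nbytes):
            global estimated_bytes
            estimated_bytes += nbytes  # running requirement estimate
            return alloc_fn(nbytes)
        return wrapper

    @track_allocations
    def allocate(nbytes):
        return bytearray(nbytes)       # stand-in for the real allocator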

30
Details
  • Starvation is avoided by running jobs that have
    been suspended for more than 10 times the time
    needed to load their resident set (just a
    heuristic; see the sketch below)
  • Large jobs are stalled at memory allocation
    points, before the beginning of computation (i.e.
    once a job establishes its working set it runs,
    and only jobs submitted later may stall due to
    memory pressure)
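
The starvation heuristic as a predicate; the factor 10 comes from the slide, while the disk-bandwidth figure used to estimate the load time is an assumption:

    # Sketch: unstall a suspended job once it has waited more than
    # ~10x the time needed to page in its resident set.
    def should_unstall(now, suspended_since, resident_bytes,
                       disk_bw_bytes_per_s=100e6, factor=10):
        load_time = resident_bytes / disk_bw_bytes_per_s
        return (now - suspended_since) > factor * load_time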

31
Results
32
Results