Distributed Scheduling In Sombrero, A Single Address Space Distributed Operating System

1
Distributed Scheduling In Sombrero, A Single
Address Space Distributed Operating System
Milind Patil
2
Contents

Distributed Scheduling
Features of Sombrero
Goals
Related Work
Platform for Distributed Scheduling
Distributed Scheduling Algorithm (Simulation)
Scaling of the Algorithm (Simulation)
Initiation of Porting to Sombrero Prototype
Testing
Conclusion
Future Work

3
Distributed Scheduling

A distributed scheduling algorithm provides for
sharing as well as better usage of resources
across the system.
The algorithm will allow threads in the
distributed system to be scheduled among the
different processors in such a manner that CPU
usage is balanced.

4
Features of Sombrero
Distributed scheduling in Sombrero takes
advantage of the distributed SASOS features:

The shared memory inherent to a distributed SASOS
provides an excellent mechanism to distribute
load information of the nodes in the system
(information policy).
The ability of threads to migrate in a simple
manner across machines has a potentially
far-reaching effect on the performance of the
distributed scheduling mechanism.

5
Features of Sombrero (contd.)

The granularity of migration is a thread, not a
process. This allows the distributed scheduling
algorithm to have a flexible selection policy
(determines which thread is to be transferred to
achieve load balancing).
This feature also reduces the software complexity
of the algorithm.

6
Goals

Platform for Distributed Scheduling
Simulation of Distributed Scheduling Algorithm
Scaling of the Algorithm (Simulation)
Initiation of Porting to Sombrero Prototype

7
Related Work

Load-balancing algorithms for:
Sprite
PVM
Condor
UNIX

8
Requirements

A working prototype of Sombrero is needed that
has the ability to manage extremely large data
sets across a network in a distributed single
address space.
A functional prototype is needed which implements
essential features such as protection domains,
Sombrero thread support, token tracking support,
etc.
The prototype is under construction and not yet
available as a development platform. Windows NT is
used instead, since the prototype is being
developed on it.

9
Sombrero Node
Architecture of Sombrero Nodes
10
Sombrero Clusters
The Sombrero system is organized into hierarchies
of clusters for scalable distributed scheduling.
11
Sombrero Router
Architecture of Sombrero Routers
12
Inter-node Communication
Sombrero nodes communicate with each other
through the routers.
13
Router Tables
Router 0x1
14
Router Tables (contd.)
Router 0x3
15
Address Space Allocation
This project implements an address space
allocation mechanism to distribute the 2^64-byte
address space amongst the nodes in the system.
Example: consider a system of four Sombrero nodes
(A, B, C and D). The nodes come online for the
very first time in the order A, B, C and D.
16
Address Space Allocation (contd.)

The address space allocated to the nodes when A
is initialized will be:
A 0x0000000000000000 0xffffffffffffffff
The address space allocated to the nodes when B
is initialized will be:
A 0x0000000000000000 0x7fffffffffffffff
B 0x8000000000000000 0xffffffffffffffff

17
Address Space Allocation (contd.)

The address space allocated to the nodes when C
is initialized will be:
A 0x0000000000000000 0x3fffffffffffffff
B 0x8000000000000000 0xffffffffffffffff
C 0x4000000000000000 0x7fffffffffffffff
The address space allocated to the nodes when D
is initialized will be:
A 0x0000000000000000 0x3fffffffffffffff
B 0x8000000000000000 0xffffffffffffffff
C 0x4000000000000000 0x5fffffffffffffff
D 0x6000000000000000 0x7fffffffffffffff
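
The allocation example can be reproduced with a small sketch. It assumes that a joining node receives the upper half of one existing node's range. The slides do not say how that donor node is chosen, so the hypothetical `add_node` helper takes the donor as an explicit argument (A, A, then C, matching the tables).

```python
FULL_SPACE = (0, 2**64 - 1)  # the whole 64-bit single address space


def add_node(alloc, new_node, donor=None):
    """Give new_node the upper half of donor's range.

    The halving rule is inferred from the example tables; the choice of
    donor is an assumption and is supplied by the caller.
    """
    if not alloc:
        # The very first node owns the entire address space.
        alloc[new_node] = FULL_SPACE
        return alloc
    lo, hi = alloc[donor]
    mid = lo + (hi - lo + 1) // 2
    alloc[donor] = (lo, mid - 1)   # donor keeps the lower half
    alloc[new_node] = (mid, hi)    # newcomer takes the upper half
    return alloc


alloc = {}
add_node(alloc, "A")
add_node(alloc, "B", donor="A")
add_node(alloc, "C", donor="A")
add_node(alloc, "D", donor="C")

for name in sorted(alloc):
    lo, hi = alloc[name]
    print(f"{name} 0x{lo:016x} 0x{hi:016x}")
```

Printing the final dictionary gives the same four ranges as the last table for nodes A to D.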

18
Load Measurement

A node's workload can be estimated from
measurable parameters such as:
Total number of threads on the node at the time
of load measurement.
Instruction mixes of these threads (I/O bound or
CPU bound).

19
Load Measurement (contd.)
Work Load = $\sum_i (p_i \times f_i)$
where p_i is the processor utilization of thread i
and f_i is a heuristic factor (adjusts the
importance of a thread depending on how it is
being used).
The heuristic factor f should have a large value
for I/O intensive threads and a small value for
CPU intensive threads. The values of the
heuristic factor can be empirically determined by
using a fully functional Sombrero prototype.
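
As a minimal sketch, the workload measure can be computed as a sum over threads, reading the formula as the sum of p_i times f_i. The sample utilizations and heuristic factors below are made up for illustration; the slides leave the real values to be determined empirically.

```python
def work_load(threads):
    """Sum of (processor utilization * heuristic factor) over all threads.

    Each thread is a (p, f) pair. The numeric values used below are
    illustrative assumptions, not measured data.
    """
    return sum(p * f for p, f in threads)


# One busier thread with a small factor, one idler thread with a large factor.
threads = [(0.5, 2.0), (0.25, 4.0)]
print(work_load(threads))
```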
20
Load Measurement - Simulation

In the simulation we assume that the processor
utilization of all threads is the same.
This is sufficient to prove the correctness of
the algorithm.
The measure of load at the node level is the
number of Sombrero threads.
A threshold policy has been defined:
high: number of Sombrero threads ≥ HIGHLOAD
low: number of Sombrero threads < MEDIUMLOAD
medium: number of Sombrero threads < HIGHLOAD and
number of Sombrero threads ≥ MEDIUMLOAD
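
The threshold policy maps a thread count to one of three load levels. The sketch below assumes the comparison against HIGHLOAD is "greater than or equal" and uses placeholder values for both constants, since the slides do not give numbers.

```python
MEDIUMLOAD = 4  # placeholder value, not from the slides
HIGHLOAD = 8    # placeholder value, not from the slides


def classify(num_threads):
    """Return the load level for a node with num_threads Sombrero threads."""
    if num_threads >= HIGHLOAD:
        return "high"
    if num_threads < MEDIUMLOAD:
        return "low"
    return "medium"  # MEDIUMLOAD <= num_threads < HIGHLOAD


for n in (3, 4, 7, 8):
    print(n, classify(n))
```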

21
Load Tables

Shared memory is used to distribute load
information. (In Sombrero the shared memory
consistency is managed by the token tracking
mechanism)
One load table is needed for each cluster.
Thresholds of load have been established to
minimize the exchange of load information in the
network. Only threshold crossings are recorded in
the load table.
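
One way to picture "only threshold crossings are recorded" is a table that ignores reports which do not change a node's level. This is an in-process sketch with hypothetical names (`LoadTable`, `report`); in Sombrero the table would live in shared memory kept consistent by token tracking, which is not modeled here.

```python
class LoadTable:
    """Per-cluster table of node load levels (illustrative only)."""

    def __init__(self):
        self.levels = {}  # node name -> "low" / "medium" / "high"
        self.writes = 0   # number of recorded threshold crossings

    def report(self, node, level):
        """Record level for node only if it differs from the stored one."""
        if self.levels.get(node) == level:
            return False  # no threshold crossed, nothing to write
        self.levels[node] = level
        self.writes += 1
        return True


table = LoadTable()
table.report("A", "low")     # first report is recorded
table.report("A", "low")     # same level, ignored
table.report("A", "medium")  # crossing, recorded
print(table.levels, table.writes)
```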

22
Distributed Scheduling Algorithm
Highly loaded nodes in the minority: sender-initiated
algorithm
Lightly loaded nodes in the minority: receiver-
initiated algorithm
23
Distributed Scheduling Algorithm
The algorithm used is dynamic, i.e. sender
initiated at lower loads and receiver initiated
at higher loads.
1. Nodes loaded in the medium range do not
participate in load balancing.
2. Load balancing is not to be done if the node
belongs to the majority (larger of the groups of
highly or lightly loaded nodes).
24
Distributed Scheduling Algorithm

3. Load balancing is to be done if the node belongs
to the minority (smaller of the groups of highly
or lightly loaded nodes):
If the node is heavily loaded, the algorithm is
sender initiated: choose a lightly loaded node
at random and follow the RGETTHREADS message
protocol for thread migration.
If the node is lightly loaded, the algorithm is
receiver initiated: choose a highly loaded node
at random and follow the GETTHREADS message
protocol for thread migration.
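
The three rules can be sketched as a single decision function run by each node against its cluster's load table. The function name `plan_migration` and the dictionary layout are hypothetical. The slides do not say what happens when the two groups are the same size; this sketch assumes highly loaded nodes initiate in that case.

```python
import random


def plan_migration(node, table, rng=random):
    """Return (protocol, target) for node, or None if it should stay idle.

    table maps node names to "low", "medium" or "high".
    """
    level = table[node]
    if level == "medium":
        return None  # rule 1: medium nodes do not participate
    high = [n for n, lvl in table.items() if lvl == "high"]
    low = [n for n, lvl in table.items() if lvl == "low"]
    if not high or not low:
        return None  # nobody to exchange threads with
    # Rules 2 and 3: only the minority group acts.
    # Assumption: on a tie, the highly loaded nodes initiate.
    if level == "high" and len(high) <= len(low):
        return ("RGETTHREADS", rng.choice(low))   # sender initiated
    if level == "low" and len(low) < len(high):
        return ("GETTHREADS", rng.choice(high))   # receiver initiated
    return None


table = {"A": "high", "B": "low", "C": "low", "D": "medium"}
print(plan_migration("A", table))  # A is in the minority, so it initiates
print(plan_migration("B", table))  # B is in the majority, so it stays idle
```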

25
Scaling the Algorithm

Aggregating the clusters provides scalability.
Thresholds for clusters are defined as follows:
high - no cluster members are lightly loaded
and at least one member is highly loaded
low - no cluster members are highly loaded and
at least one member is lightly loaded
medium - all other cases, i.e. where load
balancing can occur within the cluster members or
when all members of the cluster are medium loaded
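
The cluster thresholds translate directly into a function over the members' load levels. A minimal sketch, with the hypothetical name `cluster_level`:

```python
def cluster_level(member_levels):
    """Aggregate member levels ("low"/"medium"/"high") into a cluster level."""
    has_high = "high" in member_levels
    has_low = "low" in member_levels
    if has_high and not has_low:
        return "high"
    if has_low and not has_high:
        return "low"
    # Either both kinds are present (balancing can happen inside the
    # cluster) or every member is medium loaded.
    return "medium"


print(cluster_level(["high", "medium"]))
print(cluster_level(["low", "low"]))
print(cluster_level(["high", "low"]))
```

Because a cluster's level has the same three values as a node's level, the same function can be applied again one level up, with clusters as members.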

26
Scaling the Algorithm
1. At any level of cluster only the nodes
belonging to the minority group at that level
will be active.
2. Load balancing at an nth level cluster will be
attempted every (n × SOMECONSTANT) times the
number of unsuccessful attempts at the node level.
3. A suitable nth level target cluster is found
through the corresponding load table and the
TRANSFERREQUEST message protocol is followed for
thread migration.
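
Rule 2 can be read in more than one way. The sketch below takes one possible reading: a node escalates to level n whenever its count of unsuccessful node-level attempts reaches a multiple of n times SOMECONSTANT. Both this reading and the value of the constant are assumptions.

```python
SOMECONSTANT = 3  # placeholder value, not from the slides


def levels_to_try(failed_attempts, max_level):
    """Cluster levels at which to attempt balancing after failed_attempts
    unsuccessful node-level attempts (one interpretation of rule 2)."""
    if failed_attempts <= 0:
        return []
    return [
        n
        for n in range(1, max_level + 1)
        if failed_attempts % (n * SOMECONSTANT) == 0
    ]


for failures in (3, 4, 6):
    print(failures, levels_to_try(failures, max_level=2))
```

Under this reading, higher levels are tried less often, which keeps most balancing traffic inside the lowest-level cluster.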
27
Testing Eight Nodes
Cluster of highly loaded nodes, of medium
loaded nodes, of lightly loaded nodes
28
Testing Three Clusters
29
Testing Six Clusters at Two Levels
30
Conclusion

The testing of distributed scheduling using the
simulator verifies that the algorithm functions
correctly.
It is observed that the increase in number of
messages is proportional to the increase in
number of heavily loaded nodes.
The number of messages required for load
balancing at the first level and above is the
same if the ratio of heavily and lightly loaded
nodes is kept constant at both levels.

31
Conclusion (contd.)

Only one additional load table is required per
additional cluster. Hence, the required number of
messages is expected to increase by a small
constant factor as the level of clustering
increases.
It can be concluded that the algorithm's
complexity is O(n), where n is the number of
highly loaded nodes.

32
Future Work

Porting of code from NT to Sombrero for the
Sombrero node - communication code.
Changing definition of load measurement to the
more general formula.
Reuse code from the Sombrero router.
Adaptive cluster forming algorithm.

33
Acknowledgements
Dr. Donald Miller
Dr. Rida Bazzi
Dr. Bruce Millard
Mr. Alan Skousen
Mr. Raghavendra Hebbalalu
Mr. Ravikanth Nasika
Mr. Tom Boyd
34
