Title: Distributed Scheduling In Sombrero, A Single Address Space Distributed Operating System
Milind Patil
2 Contents
- Distributed Scheduling
- Features of Sombrero
- Goals
- Related Work
- Platform for Distributed Scheduling
- Distributed Scheduling Algorithm (Simulation)
- Scaling of the Algorithm (Simulation)
- Initiation of Porting to Sombrero Prototype
- Testing
- Conclusion
- Future Work
3 Distributed Scheduling
- A distributed scheduling algorithm provides for sharing as well as better usage of resources across the system.
- The algorithm will allow threads in the distributed system to be scheduled among the different processors in such a manner that CPU usage is balanced.
4 Features of Sombrero
Distributed scheduling in Sombrero takes advantage of the distributed SASOS features:
- The shared memory inherent to a distributed SASOS provides an excellent mechanism to distribute load information of the nodes in the system (information policy).
- The ability of threads to migrate in a simple manner across machines has a potentially far-reaching effect on the performance of the distributed scheduling mechanism.
5 Features of Sombrero (contd.)
- The granularity of migration is a thread, not a process. This allows the distributed scheduling algorithm to have a flexible selection policy (the policy that determines which thread is to be transferred to achieve load balancing).
- This feature also reduces the software complexity of the algorithm.
6 Goals
- Platform for Distributed Scheduling
- Simulation of Distributed Scheduling Algorithm
- Scaling of the Algorithm (Simulation)
- Initiation of Porting to Sombrero Prototype
7 Related Work
- Load-balancing algorithms for:
  - Sprite
  - PVM
  - Condor
  - UNIX
8 Requirements
- A working prototype of Sombrero is needed that has the ability to manage extremely large data sets across a network in a distributed single address space.
- A functional prototype is needed which implements essential features such as protection domains, Sombrero thread support, token tracking support, etc.
- The prototype is under construction and not yet available as a development platform. Windows NT is used instead, since the prototype is being developed on it.
9 Sombrero Node
[Figure: Architecture of Sombrero Nodes]
10 Sombrero Clusters
The Sombrero system is organized into hierarchies
of clusters for scalable distributed scheduling.
11 Sombrero Router
[Figure: Architecture of Sombrero Routers]
12 Inter-node Communication
Sombrero nodes communicate with each other
through the routers.
13 Router Tables
[Table: router table for Router 0x1]
14 Router Tables (contd.)
[Table: router table for Router 0x3]
15 Address Space Allocation
This project implements an address space allocation mechanism to distribute the 2^64-byte address space amongst the nodes in the system.
Example: Consider a system of four Sombrero nodes (A, B, C and D). The nodes come online for the very first time in the order A, B, C and D.
16 Address Space Allocation (contd.)
- The address space allocated for the nodes when A is initialized will be:
  Node  Start               End
  A     0x0000000000000000  0xffffffffffffffff
- The address space allocated for the nodes when B is initialized will be:
  Node  Start               End
  A     0x0000000000000000  0x7fffffffffffffff
  B     0x8000000000000000  0xffffffffffffffff
17 Address Space Allocation (contd.)
- The address space allocated for the nodes when C is initialized will be:
  Node  Start               End
  A     0x0000000000000000  0x3fffffffffffffff
  B     0x8000000000000000  0xffffffffffffffff
  C     0x4000000000000000  0x7fffffffffffffff
- The address space allocated for the nodes when D is initialized will be:
  Node  Start               End
  A     0x0000000000000000  0x3fffffffffffffff
  B     0x8000000000000000  0xffffffffffffffff
  C     0x4000000000000000  0x5fffffffffffffff
  D     0x6000000000000000  0x7fffffffffffffff
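The four-node example is consistent with a rule in which each new node takes the upper half of an existing node's range. The following Python sketch assumes that rule. The slides do not say how the donor node is chosen, so the donor is passed in explicitly to match the example; `split_range` and `join` are hypothetical names, not part of Sombrero.

```python
FULL_RANGE = (0x0000000000000000, 0xFFFFFFFFFFFFFFFF)  # the whole 2^64-byte space

def split_range(start, end):
    """Split the inclusive range [start, end] into two equal halves."""
    mid = start + (end - start) // 2
    return (start, mid), (mid + 1, end)

def join(allocation, new_node, donor=None):
    """Assumed rule: the new node takes the upper half of the donor's range."""
    if not allocation:
        allocation[new_node] = FULL_RANGE  # the first node owns everything
        return allocation
    lower, upper = split_range(*allocation[donor])
    allocation[donor] = lower
    allocation[new_node] = upper
    return allocation

# Donors chosen to reproduce the example on the slides (an assumption).
allocation = {}
join(allocation, "A")
join(allocation, "B", donor="A")
join(allocation, "C", donor="A")
join(allocation, "D", donor="C")

for node, (start, end) in sorted(allocation.items()):
    print(f"{node} 0x{start:016x} 0x{end:016x}")
```

With these donors, the printed ranges match the final table for nodes A to D.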
18 Load Measurement
- A node's workload can be estimated based on some measurable parameters:
  - Total number of threads on the node at the time of load measurement.
  - Instruction mixes of these threads (I/O bound or CPU bound).
19 Load Measurement (contd.)
$\text{Work Load} = \sum_i (p_i \times f_i)$
- $p_i$: processor utilization of thread $i$
- $f_i$: heuristic factor (adjusts the importance of thread $i$ depending on how it is being used)
The heuristic factor f should have a large value for I/O intensive threads and a small value for CPU intensive threads. The values of the heuristic factor can be empirically determined by using a fully functional Sombrero prototype.
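As a small illustration of the load formula, the sketch below sums the product of utilization and heuristic factor over a node's threads. The function name `work_load` and all numeric values are assumptions for illustration; the slides leave the actual factors to be determined empirically.

```python
def work_load(threads):
    """Sum p * f over (utilization, heuristic_factor) pairs, one per thread."""
    return sum(p * f for p, f in threads)

# Hypothetical threads: the second one is I/O intensive, so it gets a larger f.
threads = [(0.5, 1.0), (0.25, 2.0), (0.25, 1.0)]
print(work_load(threads))

# With p = 1 and f = 1 for every thread, the load reduces to the thread count.
print(work_load([(1, 1)] * 5))
```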
20 Load Measurement - Simulation
- In the simulation we assume that the processor utilization of all threads is the same.
  - This is sufficient to prove the correctness of the algorithm.
  - The measure of load at the node level is the number of Sombrero threads.
- A threshold policy has been defined:
  - high: number of Sombrero threads ≥ HIGHLOAD
  - low: number of Sombrero threads < MEDIUMLOAD
  - medium: number of Sombrero threads < HIGHLOAD and number of Sombrero threads ≥ MEDIUMLOAD
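The threshold policy can be written as a small classifier. This is a sketch only: the values given to HIGHLOAD and MEDIUMLOAD are assumed for illustration, and `classify` is a hypothetical name.

```python
HIGHLOAD = 8    # assumed value, not taken from the slides
MEDIUMLOAD = 4  # assumed value, not taken from the slides

def classify(num_threads):
    """Map a node's Sombrero thread count to a load level."""
    if num_threads >= HIGHLOAD:
        return "high"
    if num_threads < MEDIUMLOAD:
        return "low"
    return "medium"  # MEDIUMLOAD <= num_threads < HIGHLOAD

for count in (2, 4, 7, 8):
    print(count, classify(count))
```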
21 Load Tables
- Shared memory is used to distribute load information. (In Sombrero, the shared memory consistency is managed by the token tracking mechanism.)
- One load table is needed for each cluster.
- Thresholds of load have been established to minimize the exchange of load information in the network. Only threshold crossings are recorded in the load table.
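The effect of recording only threshold crossings can be sketched as follows. A plain dictionary stands in for the shared memory that Sombrero would provide, and the class name `LoadTable` and its methods are hypothetical.

```python
class LoadTable:
    """Per-cluster table of node load levels (a dict stands in for shared memory)."""

    def __init__(self):
        self.levels = {}  # node id -> "low" / "medium" / "high"
        self.writes = 0   # number of updates actually written to the table

    def report(self, node, level):
        """Write the level only when it differs from the one already stored."""
        if self.levels.get(node) == level:
            return False  # no threshold crossed, so nothing is written
        self.levels[node] = level
        self.writes += 1
        return True

table = LoadTable()
for level in ["low", "low", "medium", "medium", "high"]:
    table.report("node-1", level)
print(table.levels, table.writes)
```

Five measurements produce only three writes here: the initial entry and the two crossings.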
22 Distributed Scheduling Algorithm
- Highly loaded nodes in minority: sender-initiated algorithm
- Lightly loaded nodes in minority: receiver-initiated algorithm
23 Distributed Scheduling Algorithm
The algorithm used is dynamic, i.e. sender-initiated at lower loads and receiver-initiated at higher loads.
1. Nodes loaded in the medium range do not participate in load balancing.
2. Load balancing is not done if the node belongs to the majority (the larger of the groups of highly or lightly loaded nodes).
24 Distributed Scheduling Algorithm
3. Load balancing is done if the node belongs to the minority (the smaller of the groups of highly or lightly loaded nodes).
   - If the node is highly loaded, the algorithm is sender-initiated: a lightly loaded node is chosen at random and the RGETTHREADS message protocol is followed for thread migration.
   - If the node is lightly loaded, the algorithm is receiver-initiated: a highly loaded node is chosen at random and the GETTHREADS message protocol is followed for thread migration.
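The three rules can be combined into a single decision function, sketched below. The function `choose_action` and the node names are hypothetical, and the messages are represented only by their names. The slides do not say what happens when the two groups are the same size; this sketch arbitrarily treats a tie as sender-initiated.

```python
import random

def choose_action(me, levels, rng=random):
    """Return (protocol, target) for node `me`, or None if it stays passive.

    `levels` maps node id -> "low" / "medium" / "high", as read from the load table.
    """
    high = [n for n, lvl in levels.items() if lvl == "high"]
    low = [n for n, lvl in levels.items() if lvl == "low"]
    if levels[me] == "medium" or not high or not low:
        return None  # rule 1, or there is no counterpart to balance with
    # Assumption: a tie counts as "highly loaded nodes in minority".
    if levels[me] == "high" and len(high) <= len(low):
        return ("RGETTHREADS", rng.choice(low))   # sender-initiated
    if levels[me] == "low" and len(low) < len(high):
        return ("GETTHREADS", rng.choice(high))   # receiver-initiated
    return None  # rule 2: the node belongs to the majority

levels = {"n1": "high", "n2": "low", "n3": "low", "n4": "medium"}
print(choose_action("n1", levels))  # n1 is in the minority, so it acts
print(choose_action("n2", levels))  # n2 is in the majority, so it does not
```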
25 Scaling the Algorithm
- Aggregating the clusters provides scalability.
- Thresholds for clusters are defined as follows:
  - high: no cluster members are lightly loaded and at least one member is highly loaded
  - low: no cluster members are highly loaded and at least one member is lightly loaded
  - medium: all other cases, i.e. loads where load balancing can occur within the cluster members, or when all members of the cluster are medium loaded
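The cluster-level thresholds translate directly into a function over the load levels of a cluster's members. The name `classify_cluster` is hypothetical; the rules are the three cases listed above.

```python
def classify_cluster(member_levels):
    """Derive a cluster's load level from the levels of its members."""
    has_high = "high" in member_levels
    has_low = "low" in member_levels
    if has_high and not has_low:
        return "high"
    if has_low and not has_high:
        return "low"
    # Either both kinds are present (balancing can happen inside the cluster)
    # or every member is medium loaded.
    return "medium"

print(classify_cluster(["high", "medium"]))
print(classify_cluster(["low", "medium", "low"]))
print(classify_cluster(["high", "low"]))
```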
26 Scaling the Algorithm
1. At any level of cluster, only the nodes belonging to the minority group at that level will be active.
2. Load balancing at an nth-level cluster will be attempted once every (n × SOMECONSTANT) unsuccessful attempts at the node level.
3. A suitable nth-level target cluster is found through the corresponding load table, and the TRANSFERREQUEST message protocol is followed for thread migration.
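One possible reading of rule 2 is that an nth-level attempt is made once for every n × SOMECONSTANT unsuccessful attempts at the node level. The sketch below implements only that reading, which is an assumption; the value of SOMECONSTANT and the name `levels_to_try` are also assumed.

```python
SOMECONSTANT = 3  # assumed value, not taken from the slides

def levels_to_try(failed_attempts, max_level):
    """Cluster levels at which to attempt balancing after the given number of
    unsuccessful node-level attempts (assumed interpretation of rule 2)."""
    if failed_attempts <= 0:
        return []
    return [n for n in range(1, max_level + 1)
            if failed_attempts % (n * SOMECONSTANT) == 0]

for failures in (3, 4, 6):
    print(failures, levels_to_try(failures, max_level=2))
```

Under this reading, higher cluster levels are tried progressively less often than lower ones.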
27 Testing: Eight Nodes
[Figure legend: cluster of highly loaded nodes, cluster of medium loaded nodes, cluster of lightly loaded nodes]
28 Testing: Three Clusters
29 Testing: Six Clusters at Two Levels
30 Conclusion
- The testing of distributed scheduling using the simulator verifies that the algorithm functions correctly.
- It is observed that the increase in the number of messages is proportional to the increase in the number of heavily loaded nodes.
- The number of messages required for load balancing at the first level and above is the same if the ratio of heavily to lightly loaded nodes is kept constant at both levels.
31 Conclusion (contd.)
- Only one additional load table is required per additional cluster. Hence, the required number of messages is expected to increase by a small constant factor as the level of clustering increases.
- It can be concluded that the algorithm's complexity is O(n), where n is the number of highly loaded nodes.
32 Future Work
- Porting of code from NT to Sombrero for the Sombrero node: communication code.
- Changing the definition of load measurement to the more general formula.
- Reuse code from the Sombrero router.
- Adaptive cluster forming algorithm.
33 Acknowledgements
- Dr. Donald Miller
- Dr. Rida Bazzi
- Dr. Bruce Millard
- Mr. Alan Skousen
- Mr. Raghavendra Hebbalalu
- Mr. Ravikanth Nasika
- Mr. Tom Boyd