Title: Green HPC: Power Aware Scheduling of Bag-of-Tasks Applications with Deadline Constraints on DVS-Enabled Data Centers
1Green HPC: Power Aware Scheduling of Bag-of-Tasks
Applications with Deadline Constraints on
DVS-Enabled Data Centers
- Kyong Hoon Kim (1), Rajkumar Buyya (1), and Jong Kim (2)
- (1) Grid Computing and Distributed Systems (GRIDS) Laboratory, Dept. of Computer Science and Software Engineering, The University of Melbourne, Australia, www.gridbus.org
- (2) POSTECH, Korea
2Outline
- Introduction
- Related Work
- System Model
- Job Admission Control
- DVS-based Cluster Scheduling
- EDF-based scheduling
- Proportional share-based scheduling
- Simulation Results
- Summary
3Background
- Traditionally, the high-performance computing (HPC) community has focused on performance (speed).
- At the same time, microprocessor vendors have not only doubled the number of transistors (and speed) every 18-24 months, but they have also doubled the power densities.
- Moore's Law for Power Consumption
4Research Motivations of Power Aware/Energy Efficient High Performance Computing (HPC)
- Rapid uptake of HPC-architecture-based data centers for hosting industrial applications
- Reducing the operational costs of powering and cooling HPC systems
- The tremendous increase in computer performance has come with an even greater increase in power usage.
- According to Eric Schmidt, CEO of Google, what matters most to Google is not speed but power, because data centers can consume as much electricity as a city.
- Improving reliability
- As a rule of thumb, for every 10 °C increase in temperature, the failure rate of a system doubles.
- The computing environment affects the correctness of the results.
- An 18-node Linux cluster produced an answer outside the residual (i.e., a silent error) when running in a dusty 85 °F warehouse but produced the correct answer when running in a 65 °F machine-cooled room.
5Reliability/Implications
- Reliability of leading-edge supercomputers (D. Reed, 2004)
- Estimated cost of an hour of system downtime (W. Feng, ACM Queue, 2003)
6Power Aware Computing
- Power Aware (PA) computing/communications
- The objective of PA computing/communications is to improve power management and consumption using the awareness of power consumption of devices.
- Power consumption is one of the most important considerations in mobile devices due to the limitation of the battery life.
- System level power management
- Recent devices (CPU, disk, communication links, etc.) support multiple power modes.
- The system scheduler can use these multiple power modes to reduce the power consumption.
8Related Work (1/3)
- Research on power reduction for scientific applications
- Hsu and Feng (SC 2005), Los Alamos National Lab, USA
- β-adaptation algorithm
- Automatic adaptation of CPU frequencies
- β: the intensity level of off-chip accesses
- Ge, Feng, and Cameron (2005)
- Three DVS scheduling strategies
- Software framework to implement scheduling techniques
- Hotta et al. (2006)
- Profile-based power-performance optimization
- Selection of an appropriate gear using DVS scheduling
- Development of a power-profiling system called PowerWatch
9Related Work (2/3)
- Energy reduction for MPI programs
- Kappiah et al. (2005), NC State and University of Georgia, USA
- Inter-node bottleneck problem in MPI programs
- Selection of an appropriate gear based on slack time
- Lim et al. (2006), NC State and University of Georgia, USA
- Adaptive DVS of communication phases in MPI programs
- Son et al. (2006)
- Two approaches for building power-aware cluster platforms
- Design and develop systems with consideration of energy consumption, e.g., BlueGene/L, Green Destiny
- Use DVS-enabled commodity systems, e.g., clusters with AMD Athlon64s, Pentium Ms, AMD Opterons
10Related Work (3/3)
- DVS (Dynamic Voltage Scaling) technique
- Reducing the dynamic energy consumption by lowering the supply voltage at the cost of performance degradation
- Recent processors support the ability to adjust the supply voltage dynamically.
- The dynamic energy consumption $\propto V_{dd}^2 \cdot N_{cycle}$
- $V_{dd}$: the supply voltage
- $N_{cycle}$: the number of clock cycles
- An example
[Figure: power versus time for one task with the deadline marked, in two panels. Power levels are $5.0^2$ and $2.0^2$; time marks are at 10 msec and 25 msec. (a) Supply voltage 5.0 V. (b) Supply voltage 2.0 V.]
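The two supply voltages in the example can be compared numerically. The following is a minimal sketch that assumes only the proportionality stated on this slide (dynamic energy proportional to the squared supply voltage times the cycle count); the function name `relative_dynamic_energy` and the cycle count are hypothetical and chosen for illustration.

```python
def relative_dynamic_energy(vdd: float, n_cycles: int) -> float:
    """Dynamic energy up to a constant factor: E is proportional to Vdd^2 * Ncycle."""
    return vdd ** 2 * n_cycles


# The same task (same number of cycles) at the two supply voltages of the example.
# The cycle count is an arbitrary illustrative value; it cancels in the ratio.
n_cycles = 1_000_000
e_high = relative_dynamic_energy(5.0, n_cycles)
e_low = relative_dynamic_energy(2.0, n_cycles)

ratio = e_low / e_high      # (2.0 / 5.0)^2
saving = 1.0 - ratio
print(f"energy ratio = {ratio:.2f}, saving = {saving:.0%}")
```

Under this model the lower voltage uses 16% of the energy of the higher one, at the cost of a longer execution time that must still fit within the deadline.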
11DVS-based Power Aware Cluster Scheduling
- Research motivation
- Previous work has focused on the development of DVS-enabled cluster systems.
- Few works have considered the scheduling problem in power-aware clusters.
- Problem to solve
- To provide scheduling algorithms for DVS-enabled cluster systems that minimize the energy consumption while meeting the job deadlines.
- Exploit the industry's move towards the utility model / SLA-based resource allocation.
13System Model (1/2)
- Cluster model
- A cluster system is defined as $(N, Q)$.
- $N$: the number of processors
- $Q$: the processing performance of each PE in terms of MIPS
- Job model
- A job is considered to be a bag-of-tasks application.
- The deadline is used as a QoS parameter of a job.
- A job: $(p, \{l_1, l_2, \ldots, l_p\}, d)$
- $p$: the number of sub-tasks
- $l_i$: the length in MI of the i-th task
- $d$: the job deadline
14System Model (2/2)
- Energy model
- Energy consumption of a task execution
- $E = \alpha V^2 L$
- $L$: the task length
- $V$: the supply voltage
- $\alpha$: a proportional constant
- Dynamic Voltage Scaling
- $\{V_1, \ldots, V_m\}$: m different voltage levels
- $Q_i$: the processor speed (MIPS) under the associated voltage level $V_i$
- $S_i$: the normalized speed of each voltage level $V_i$ ($S_i = Q_i/Q_m$)
- An example
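The energy model and the normalized speeds defined on this slide can be sketched in a few lines. The operating points below are hypothetical values chosen for illustration (the slide's own example table is not available here), and the names `OperatingPoint`, `normalized_speeds`, and `task_energy` are assumptions, not part of the original work. The sketch also assumes that $Q_m$ is the speed at the highest voltage level.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OperatingPoint:
    voltage: float  # supply voltage V_i in volts
    mips: float     # processor speed Q_i in MIPS at V_i


def normalized_speeds(points: List[OperatingPoint]) -> List[float]:
    """S_i = Q_i / Q_m, taking Q_m as the speed of the fastest level."""
    q_max = max(p.mips for p in points)
    return [p.mips / q_max for p in points]


def task_energy(alpha: float, voltage: float, length_mi: float) -> float:
    """E = alpha * V^2 * L for a task of L million instructions."""
    return alpha * voltage ** 2 * length_mi


# Hypothetical operating points, for illustration only.
points = [OperatingPoint(0.9, 800.0), OperatingPoint(1.1, 1200.0),
          OperatingPoint(1.3, 1600.0), OperatingPoint(1.5, 2000.0)]
speeds = normalized_speeds(points)

# Energy of the same task at the lowest level relative to the highest level.
ratio = task_energy(1.0, 0.9, 1000.0) / task_energy(1.0, 1.5, 1000.0)
print(speeds, round(ratio, 2))
```

Because the energy depends on the task length and the squared voltage but not on the speed, running the same task at a lower voltage level always costs less energy in this model; the deadline is what limits how low the scheduler can go.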
16Proposed Cluster RMS System Architecture with Energy-Efficient Resource Allocation
- (1) Job submission
- (2) Schedulability test and energy estimation
- (3) Acknowledgement of schedulability and energy amount
- (4) Selection of PEs
17Application Admission and Resource Allocation Algorithm
Algorithm Admission_Resource_Allocation (J = (p, {l1, ..., lp}, d))
  for i from 1 to p do
    PEalloc ← null
    energymin ← MAX_VALUE
    for k from 1 to N do
      if schedulable (PEk, li, d) = true then
        energyk ← energy_estimate (PEk, li, d)
        if energyk < energymin then
          energymin ← energyk
          PEalloc ← PEk
        endif
      endif
    endfor
    if PEalloc ≠ null then
      Allocate the i-th task of J to PEalloc
    else
      Cancel all tasks of J.
      return reject
    endelse
  endfor
  return accept
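The admission loop can be illustrated with a small runnable sketch. Everything below is a simplified stand-in under stated assumptions: `ToyPE`, the utilization-based `schedulable` test, and the `energy_estimate` formula are hypothetical placeholders, not the schedulability test or energy estimate of the original work. Tasks are given as execution times at maximum speed rather than lengths in MI.

```python
from typing import List, Tuple


class ToyPE:
    """Hypothetical processing element holding (exec_time, deadline) pairs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tasks: List[Tuple[float, float]] = []

    def load(self) -> float:
        return sum(e / d for e, d in self.tasks)


def schedulable(pe: ToyPE, e: float, d: float) -> bool:
    # Stand-in test: total demand must fit at maximum speed.
    return pe.load() + e / d <= 1.0


def energy_estimate(pe: ToyPE, e: float, d: float) -> float:
    # Stand-in estimate: a busier PE needs a higher speed, hence more energy.
    return (pe.load() + e / d) ** 2 * e


def admit(exec_times: List[float], deadline: float, pes: List[ToyPE]) -> bool:
    """Place each task on the schedulable PE with the lowest estimated energy."""
    placed: List[Tuple[ToyPE, Tuple[float, float]]] = []
    for e in exec_times:
        candidates = [pe for pe in pes if schedulable(pe, e, deadline)]
        if not candidates:
            for pe, task in placed:      # cancel all tasks of the job
                pe.tasks.remove(task)
            return False                 # reject
        best = min(candidates, key=lambda pe: energy_estimate(pe, e, deadline))
        best.tasks.append((e, deadline))
        placed.append((best, (e, deadline)))
    return True                          # accept


pes = [ToyPE("PE1"), ToyPE("PE2")]
big_job = admit([2.0] * 5, deadline=4.0, pes=pes)      # fifth task cannot fit
clean_after_reject = all(not pe.tasks for pe in pes)
small_job = admit([2.0, 1.0], deadline=4.0, pes=pes)
print(big_job, clean_after_reject, small_job)
```

The point of the sketch is the all-or-nothing behavior: a job is accepted only if every one of its tasks finds a PE, and a rejection leaves no partial allocation behind.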
19EDF-based DVS scheduling (1/4)
- Basics
- $T_k = \{\tau_{k,i}(e_{k,i}, d_{k,i}) \mid i = 1, \ldots, n_k\}$: the current available task set in the k-th PE
- $n_k$: the current number of tasks
- $\tau_{k,i}(e_{k,i}, d_{k,i})$: the i-th task in $T_k$
- $e_{k,i}$: the remaining execution time
- $d_{k,i}$: the remaining deadline
- EDF (Earliest Deadline First) policy
- $T_k$ is sorted by deadline so that $d_{k,i} \le d_{k,i+1}$.
- The scheduler always executes the earliest-deadline task in the queue.
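20EDF-based DVS scheduling (2/4)
- The temporary utilization, $u_{k,i}$: the processor utilization required for task $\tau_{k,i}$ by EDF, $u_{k,i} = \left(\sum_{j=1}^{i} e_{k,j}\right) / d_{k,i}$
- The continuous speed level of the highest-priority task, $s_k = \max_{1 \le i \le n_k} u_{k,i}$
- The supply voltage level of the highest-priority task, $v_k = \min \{V_j \mid S_j \ge s_k\}$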
21EDF-based DVS scheduling (3/4)
- An example
- $T_k = \{\tau_{k,1}(1, 4), \tau_{k,2}(2, 6), \tau_{k,3}(2, 10)\}$
- Temporary utilizations at time 0
- $u_{k,1} = 1/4$
- $u_{k,2} = (1 + 2)/6 = 1/2$
- $u_{k,3} = (1 + 2 + 2)/10 = 1/2$
- Scaling factors
- At time 0
- $s_k = \max\{u_{k,1}, u_{k,2}, u_{k,3}\} = 1/2$
- $v_k = 1.1$ V
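[Figure: speed level versus time for the example. $\tau_{k,1}$ runs at speed level 0.6 ($v_k$ = 1.1 V) from 0 to 10/6, $\tau_{k,2}$ at speed level 0.6 ($v_k$ = 1.1 V) from 10/6 to 5, and $\tau_{k,3}$ at speed level 0.4 ($v_k$ = 0.9 V) from 5 to 10.]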
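The example can be replayed with a short simulation. This is a minimal sketch, not the authors' implementation: the table `LEVELS` of (voltage, normalized speed) pairs is an assumption chosen to be consistent with the speed and voltage values that appear in the example, and the function names are hypothetical. Execution times are taken at maximum speed, so a task with execution time e takes e/s at normalized speed s.

```python
from typing import List, Tuple

# Assumed operating points (voltage in V, normalized speed).
LEVELS = [(0.9, 0.4), (1.1, 0.6), (1.3, 0.8), (1.5, 1.0)]


def pick_level(s_required: float) -> Tuple[float, float]:
    """Lowest operating point whose normalized speed covers s_required."""
    for voltage, speed in LEVELS:
        if speed >= s_required - 1e-12:   # tolerance for floating-point noise
            return voltage, speed
    raise ValueError("task set is not schedulable at maximum speed")


def edf_dvs_trace(tasks: List[Tuple[float, float]]) -> List[Tuple[float, float, float]]:
    """Return (voltage, speed, finish_time) per task, executed in EDF order.

    tasks: (execution time at maximum speed, deadline), measured from time 0.
    """
    remaining = sorted(tasks, key=lambda t: t[1])
    now, trace = 0.0, []
    while remaining:
        # Temporary utilizations: cumulative work over time left to each deadline.
        work, utilizations = 0.0, []
        for e, d in remaining:
            work += e
            utilizations.append(work / (d - now))
        voltage, speed = pick_level(max(utilizations))
        e, _ = remaining.pop(0)
        now += e / speed                  # running slower stretches execution
        trace.append((voltage, speed, now))
    return trace


trace = edf_dvs_trace([(1, 4), (2, 6), (2, 10)])
for voltage, speed, finish in trace:
    print(f"{voltage} V, speed {speed}, finishes at t = {finish:.3f}")
```

With the assumed levels, the three tasks finish at 10/6, 5, and 10 under 1.1 V, 1.1 V, and 0.9 V respectively. The speed is recomputed each time a task completes, which is why the last task can drop to the lowest level.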
22EDF-based DVS scheduling (4/4)
- Schedulability test of EDF
Algorithm schedulable_EDF (PEk, l, d)
  Tk ← Tk ∪ {(l/Qm, d)}
  Sort Tk in the order of deadline.
  for i from 1 to nk+1 do
    uk,i ← (ek,1 + ... + ek,i) / dk,i
    if uk,i > 1 then
      return false
  endfor
  return true
- Energy estimation of EDF
Algorithm energy_estimate_EDF (PEk, l, d)
  Ecurrent ← energy_consumption (Tk, nk)
  Tk ← Tk ∪ {(l/Qm, d)}
  Enew ← energy_consumption (Tk, nk+1)
  return (Enew − Ecurrent)
function energy_consumption (T, n)
  Energy ← 0
  time ← the current time
  for i from 1 to n do
    for j from i to n do
      uj ← (ei + ... + ej) / dj
    s ← max {uj}
    v ← min {Vj | Sj ≥ s}
    s ← min {Sj | Sj ≥ s}
    Energy ← Energy + α v² ei Qm
    time ← time + ei/s
    for j from i to n do
      dj ← dj − ei/s
  endfor
  return Energy
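The schedulability test and the energy estimate can be sketched in Python as follows. This is an illustrative translation under assumptions: `LEVELS` is an assumed table of (voltage, normalized speed) pairs, the defaults `alpha=1.0` and `q_max=1.0` are arbitrary, and the function names are hypothetical. Tasks are (execution time at maximum speed, remaining deadline) pairs, and `energy_consumption` expects a task set that has already passed the schedulability test.

```python
from typing import List, Tuple

Task = Tuple[float, float]   # (execution time at max speed, remaining deadline)
LEVELS = [(0.9, 0.4), (1.1, 0.6), (1.3, 0.8), (1.5, 1.0)]   # assumed (V, S)
EPS = 1e-12                  # tolerance for floating-point noise


def schedulable_edf(queue: List[Task], new_task: Task) -> bool:
    """EDF test at maximum speed: cumulative work must fit before each deadline."""
    work = 0.0
    for e, d in sorted(queue + [new_task], key=lambda t: t[1]):
        work += e
        if work / d > 1.0 + EPS:
            return False
    return True


def energy_consumption(queue: List[Task], alpha: float = 1.0,
                       q_max: float = 1.0) -> float:
    """Energy to drain the queue in EDF order, re-selecting the level per task."""
    tasks = sorted(queue, key=lambda t: t[1])
    energy, elapsed = 0.0, 0.0
    for i, (e_i, _) in enumerate(tasks):
        work, s_req = 0.0, 0.0
        for e_j, d_j in tasks[i:]:
            work += e_j
            s_req = max(s_req, work / (d_j - elapsed))
        voltage, speed = next((v, s) for v, s in LEVELS if s >= s_req - EPS)
        energy += alpha * voltage ** 2 * e_i * q_max   # e_i * q_max = length in MI
        elapsed += e_i / speed
    return energy


def energy_estimate_edf(queue: List[Task], new_task: Task) -> float:
    """Additional energy caused by accepting new_task on this PE."""
    return energy_consumption(queue + [new_task]) - energy_consumption(queue)


queue = [(1, 4), (2, 6)]
fits = schedulable_edf(queue, (2, 10))
too_much = schedulable_edf(queue, (4, 6))
estimate = energy_estimate_edf(queue, (2, 10))
print(fits, too_much, round(estimate, 2))
```

In this toy run the new task (2, 10) is schedulable and adds only the energy of its own execution at the lowest level, because it does not force the two queued tasks to a higher voltage.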
23Proportional Share-based DVS scheduling (1/2)
- The proportional share scheme
- Multiple tasks share the processor performance in proportion to each task's weight.
- Each task should be given at least $e_{k,i}/d_{k,i}$ of the maximum processor speed to meet its deadline.
- The continuous processor speed level, $s_k = \sum_{i=1}^{n_k} e_{k,i}/d_{k,i}$
- The supply voltage level of the highest-priority task, $v_k = \min \{V_j \mid S_j \ge s_k\}$
- The proportional share of each task, $share_{k,i} = (e_{k,i}/d_{k,i}) / \sum_{j=1}^{n_k} (e_{k,j}/d_{k,j})$
24Proportional Share-based DVS scheduling (2/2)
- An example
- $T_k = \{\tau_{k,1}(1, 4), \tau_{k,2}(2, 6), \tau_{k,3}(2, 10)\}$
- Schedulability test and energy estimation are similar to the EDF algorithm.
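[Figure: speed level versus time for the example, with the share of each task ($\tau_{k,i}$: $share_{k,i}$) per interval.
From 0 to 3.906: speed level 0.8, $v_k$ = 1.3 V; $\tau_{k,1}$: 0.32, $\tau_{k,2}$: 0.425, $\tau_{k,3}$: 0.255.
From 3.906 to 5.712: speed level 0.6, $v_k$ = 1.1 V; $\tau_{k,2}$: 0.62, $\tau_{k,3}$: 0.38.
From 5.712 to 7.69: speed level 0.4, $v_k$ = 0.9 V; $\tau_{k,3}$: 1.0.]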
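The proportional-share example can also be replayed step by step. The sketch below is an illustration under assumptions, not the original implementation: `LEVELS` is an assumed table of (voltage, normalized speed) pairs, the function name `pshare_dvs` is hypothetical, and the task set is assumed to be already admitted. Each task receives the fraction `share` of the selected speed, and the plan is recomputed whenever a task completes.

```python
from typing import Dict, List, Tuple

LEVELS = [(0.9, 0.4), (1.1, 0.6), (1.3, 0.8), (1.5, 1.0)]   # assumed (V, S)
EPS = 1e-9

Interval = Tuple[float, float, float, Dict[int, float]]      # start, V, S, shares


def pshare_dvs(tasks: List[Tuple[float, float]]) -> Tuple[List[Interval], Dict[int, float]]:
    """tasks: (execution time at max speed, deadline), measured from time 0."""
    remaining = {i + 1: float(e) for i, (e, _) in enumerate(tasks)}
    deadline = {i + 1: float(d) for i, (_, d) in enumerate(tasks)}
    now = 0.0
    intervals: List[Interval] = []
    finish: Dict[int, float] = {}
    while remaining:
        # Weight of each task: remaining work over time left to its deadline.
        weights = {i: e / (deadline[i] - now) for i, e in remaining.items()}
        s_required = sum(weights.values())
        voltage, speed = next((v, s) for v, s in LEVELS if s >= s_required - EPS)
        shares = {i: w / s_required for i, w in weights.items()}
        # Advance to the first completion at rate speed * share per task.
        dt = min(remaining[i] / (speed * shares[i]) for i in remaining)
        intervals.append((now, voltage, speed, shares))
        now += dt
        for i in list(remaining):
            remaining[i] -= dt * speed * shares[i]
            if remaining[i] <= EPS:
                finish[i] = now
                del remaining[i]
    return intervals, finish


intervals, finish = pshare_dvs([(1, 4), (2, 6), (2, 10)])
for start, voltage, speed, shares in intervals:
    rounded = {i: round(s, 3) for i, s in shares.items()}
    print(f"t = {start:.3f}: {voltage} V, speed {speed}, shares {rounded}")
print({i: round(t, 3) for i, t in finish.items()})
```

With the assumed levels the run selects 1.3 V, 1.1 V, and 0.9 V in turn, and the completion times come out close to the 3.906, 5.712, and 7.69 shown in the example; the small differences appear to come from the example using shares rounded to two or three digits.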
26Simulation Environment
- Using the GridSim toolkit
- A cluster system with 32 DVS-enabled processors
- Operating points of simulated processors based on Athlon-64
- 1000 bag-of-tasks applications
- Task characteristics
- Task length: 600,000 MIs to 7,200,000 MIs
- The number of tasks: 2 to 32
- Deadline: 20% to 100% more than the average execution time
27Simulated algorithms
- DVS-based scheduling
- EDF-DVS
- PShare-DVS
- Scheduling at maximum processor speed
- EDF-1.5V
- PShare-1.5V
- Scheduling at minimum processor speed
- EDF-0.9V
- PShare-0.9V
28Job Acceptance Rate
29Energy consumption
- Normalized to EDF-1.5V at an inter-arrival time of 2 mins.
[Figure: normalized value versus inter-arrival time (min)]
30Normalized performance of DVS
31Impact of granularity/number of controllable voltage levels
- Normalized performance of EDF
- Normalized performance of PShare
[Figures: normalized value for each of the two panels]
33Summary
- Two primary drivers for Power-Aware HPC
- Operational cost
- Reliability
- Power-aware scheduling with deadline constraints
- Reducing energy consumption
- Meeting job deadlines
- The proposed scheduling algorithms
- DVS-based scheduling based on
- Space-shared policy: EDF
- Time-shared policy: Proportional Resource Sharing
- Minimizing cost under the constraint of the job deadline
- Future work
- Budget-constrained power-aware scheduling
- Power-aware workflow scheduling
34Thanks for your attention!
We welcome cooperation in research and development!
http://www.gridbus.org
eScience2007.org