Power Management: Research Review
1
Power Management: Research Review
  • Bithika Khargharia
  • Aug 5th, 2005

2
Single data-center rack: Some figures
  • Cost of power and cooling equipment: $52,800 over a 10-yr lifespan
  • Electricity costs for a typical 300W server:
  • - energy consumption/year: 2,628 kWh
  • - cooling/year: 748 kWh
  • - electricity: $0.10/kWh
  • - total: $338/year
  • Excludes energy costs due to air-circulation and power-delivery sub-systems
  • Electricity cost over 10 years for a typical data-center rack: $22,800
3
Motivation: Reduce TCO
  • Power equipment: 36%
  • Cooling equipment: 8%
  • Electricity: 19%
  • -----------------------------
  • Total: 63% of the TCO of a data-center's physical infrastructure

4
Some Objectives
  • Explore possible power savings areas
  • Reduce TCO by operating within a reduced power
    budget.
  • Develop QoS aware power management techniques.
  • Develop power-aware resource scheduling and resource-partitioning techniques.

5
Power management: Problem Domains
  • Battery-operated devices
  • Server systems: app servers, storage servers, front-end servers
  • - Local schemes (per server)
  • - Partition-wide schemes
  • - Component-wide schemes
  • Whole data centers: server systems, interconnect switches, power supplies, disk arrays
  • - Heterogeneous cluster-wide schemes
  • - Homogeneous cluster-wide schemes

6
Power management: Problem Domains (outline repeated)

7
Battery-operated devices: Power management
  • Transition hardware components between high- and low-power states (Hsu & Kremer, '03, Rutgers; Weiser, '94, Xerox PARC)
  • Deactivation decisions involve power-usage prediction:
  • - periods of inactivity, e.g. time between disk accesses (Douglis, Krishnan & Marsh, '94; Li, '94, UCB)
  • - other high-level information (Heath, '02, Rutgers; Weissel et al., '02, University of Erlangen)
  • Mechanism supported by ACPI technology
  • Usually incurs both energy and performance penalties

8
Power management: Problem Domains (outline repeated)

9
Power management Schemes: Server Systems (outline repeated)

10
Power management Schemes: Server Systems (outline repeated)

11
Server Power management: Local Schemes
  • Attacks processor power usage (Elnozahy, Kistler & Rajamony, '03, IBM Austin)
  • DVS
  • - extends DVS to server environments with concurrent tasks (Flautner, Reinhardt & Mudge, '01, UMich)
  • - conserves the most energy at intermediate load intensities
  • Request batching
  • - the processor awakens when the accumulated pending time of batched requests exceeds the batch timeout
  • - conserves the most energy at low load intensities
  • Combination of both
  • - conserves energy over a wide range of load intensities

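The policy split above — batching at low load, DVS at intermediate-to-high load — can be sketched as a simple mode selector. Everything concrete here (the load threshold, the frequency steps, the default timeout) is an illustrative assumption, not a value from the cited work:

```python
# Hypothetical sketch of the combined DVS + request-batching policy:
# at low load, hold requests until a batch timeout expires; at higher
# load, pick a CPU frequency that scales with the load (DVS).

FREQ_STEPS = [0.6, 0.8, 1.0, 1.2, 1.4]  # GHz, illustrative steps

def choose_policy(load, batch_timeout_ms=50):
    """Return (mode, setting) for a given load in [0, 1]."""
    if load < 0.2:
        # Low intensity: request batching conserves the most energy.
        return ("batch", batch_timeout_ms)
    # Intermediate/high intensity: scale frequency with load (DVS).
    idx = min(int(load * len(FREQ_STEPS)), len(FREQ_STEPS) - 1)
    return ("dvs", FREQ_STEPS[idx])
```

At full load the selector simply runs the CPU at its top frequency, which matches the intuition that neither technique saves energy when the server is saturated.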
12
Server Power management: QoS-driven Local Schemes
[Figure: feedback-driven control framework — the specified QoS is compared against the computed actual QoS, and QoS-aware management strategies are applied based on the difference]
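The feedback loop in the figure can be sketched as a one-step proportional controller: measure the actual QoS (here, a 90th-percentile response time), compare it with the specified target, and adjust the CPU frequency. The gain and frequency bounds are assumptions for illustration:

```python
# Sketch of one iteration of the feedback-driven control framework:
# raise CPU frequency when the QoS target is violated, lower it when
# there is slack. Gain and bounds are invented for the example.

def adjust_frequency(freq, actual_ms, target_ms=50.0,
                     f_min=0.6, f_max=1.4, gain=0.1):
    """One control step; returns the new frequency in [f_min, f_max]."""
    error = (actual_ms - target_ms) / target_ms  # relative QoS error
    freq += gain * error * freq                  # proportional update
    return max(f_min, min(f_max, freq))
```

Running this step periodically drives the frequency toward the lowest value that still keeps the measured response time near the target.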
13
Server Power management: QoS-driven Local Schemes
  • Some results (Elnozahy, Kistler & Rajamony, '03, IBM Austin)
  • Measured QoS is a 90th-percentile response time of 50ms
  • Validated Web-server simulator
  • Web workloads from real Web server systems:
  • - Nagano Olympics '98 server
  • - financial-services company site
  • - disk-intensive workload

14
Server Power management: QoS-driven Local Schemes
[Charts: some results for the Finance and disk-intensive workloads — savings increase with workload, stabilize, and then decline]
15
Server Power management: QoS-driven Local Schemes
  • Results summary:
  • DVS saves 8.7% to 38% of CPU energy
  • Request batching saves 3.1% to 27% of CPU energy
  • The combined technique saves 17% to 42% for all three workload types at different load intensities.

16
Power management Schemes: Server Systems (outline repeated)

17
Server Power management: Local Schemes
  • Storage servers: attacks disk power usage
  • Multi-speed disks for servers (Carrera, Pinheiro & Bianchini, '02, Rutgers; Gurumurthi et al., '03, PennState / IBM T.J. Watson)
  • - dynamically adjust speed according to the load imposed on the disk
  • - performance and power models exist for multi-speed disks
  • - transition speeds dynamically, based on disk response time
  • - results with simulation and synthetic workloads: energy savings of up to 60%

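The response-time-driven speed transition described above can be sketched for a two-speed disk. The response-time limit and the slack factor are invented for the example; the papers derive such thresholds from disk performance models:

```python
# Illustrative sketch of dynamic speed selection for a two-speed disk
# (15,000 rpm / 10,000 rpm): spin up when observed response time
# approaches the limit, spin down when there is ample slack.

HIGH_RPM, LOW_RPM = 15000, 10000

def next_speed(current_rpm, resp_ms, limit_ms=20.0, slack=0.5):
    """Pick the disk speed for the next interval from response time."""
    if resp_ms > limit_ms and current_rpm == LOW_RPM:
        return HIGH_RPM          # load too high for the low speed
    if resp_ms < slack * limit_ms and current_rpm == HIGH_RPM:
        return LOW_RPM           # enough slack to save spindle energy
    return current_rpm           # otherwise stay put (avoid thrashing)
```

The dead band between `slack * limit_ms` and `limit_ms` is what keeps the disk from oscillating between speeds, which matters because transitions are expensive.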

18
Server Power management: Local Schemes
  • Storage servers: attacks disk power usage (Carrera, Pinheiro & Bianchini, '02, Rutgers)
  • Four disk energy management techniques:
  • - combines laptop and SCSI disks
  • - results with a kernel-level implementation and real workloads: up to 41% energy savings for over-provisioned servers
  • - two-speed disks (15,000 rpm and 10,000 rpm)
  • - results with emulation and the same real workloads: energy savings of up to 20% for properly provisioned servers.


19
Server Power management: Local Schemes
  • Alternation of server load peaks and valleys
  • Lighter weekend loads
  • 22% energy savings
  • Switched to 15,000 rpm only 3 times
20
Server Power management: Local Schemes
  • Storage servers: attacks database-server power usage
  • Effect of RAID parameters for disk-array-based servers (Gurumurthi, '03, PennState)
  • - parameters: RAID level, stripe size, number of disks
  • - effect of varying these parameters on performance and energy consumption for database servers running transaction workloads


21
Server Power management: Local Schemes
  • Storage servers: attacks disk power usage
  • Storage-cache replacement techniques (Zhu, '04, UIUC)
  • - increase disk idle time by selectively keeping certain disk blocks in the main-memory cache
  • Dynamically adjusted memory partitions for caching disk data (Zhu, Shankar & Zhou, '04, UIUC)


22
Server Power management: Local Schemes
  • Storage servers: attacks disk power usage, involves data movement
  • Using a MAID (massive array of idle disks) (Colarelli & Grunwald, '02, U of Colorado, Boulder)
  • - replace old tape back-up archives
  • - copy accessed data to cache disks, spin down all other disks
  • - LRU to implement cache-disk replacement
  • - write back when dirty
  • - sacrifice access time in favor of energy conservation

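The MAID behavior described above — serve reads from always-on cache disks under LRU, spin up a data disk only on a miss — can be sketched as follows. The `MaidCache` class, its capacity, and the spin-up counter are hypothetical illustrations, not the original system's interfaces:

```python
from collections import OrderedDict

# Hedged sketch of the MAID idea: a small set of always-on cache disks
# absorbs reads; idle data disks are "spun up" only on a cache miss.

class MaidCache:
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.cache = OrderedDict()   # block id -> data on cache disks
        self.spinups = 0             # data-disk spin-ups (energy cost)

    def read(self, block, backing):
        if block in self.cache:
            self.cache.move_to_end(block)    # LRU hit: mark most recent
            return self.cache[block]
        self.spinups += 1                    # miss: spin up a data disk
        data = backing[block]
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least-recently-used
        return data
```

Counting spin-ups makes the energy/latency trade-off visible: a skewed access pattern keeps `spinups` low, so the data disks stay down almost all the time.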
23
Server Power management: Local Schemes
  • Storage servers: attacks disk power usage, involves data movement
  • Popular data concentration (PDC) technique (Pinheiro & Bianchini, '04, Rutgers)
  • - file-access frequencies are heavily skewed for server workloads
  • - concentrate the most popular disk data on a subset of disks
  • - the other disks stay idle longer
  • - sacrifice access time in favor of energy conservation

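The PDC layout idea above can be sketched directly: rank files by access frequency and fill disks in popularity order, so the tail disks hold only cold data and can idle. Expressing disk capacity as a file count is a simplification for the example:

```python
# Illustrative sketch of Popular Data Concentration (PDC): lay files
# out on disks in decreasing order of access frequency, so the last
# disks see few accesses and stay idle longer.

def pdc_layout(access_counts, n_disks, per_disk):
    """Map files to disks, most popular first.
    access_counts: {file: hit count}; per_disk: files per disk."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    layout = [[] for _ in range(n_disks)]
    for i, f in enumerate(ranked):
        layout[i // per_disk].append(f)   # fill disk 0 first, then 1, ...
    return layout
```

With a heavily skewed popularity distribution, most accesses land on disk 0, which is exactly the property PDC exploits to let the remaining disks spin down.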
24
Server Power management: Local Schemes
  • Some results comparing MAID and PDC (Pinheiro & Bianchini, '04, Rutgers)
  • MAID and PDC can only conserve energy when server load is very low
  • Using 2-speed disks, MAID and PDC can conserve 30-40% of disk energy with a small fraction of delayed requests
  • Overall, PDC is more consistent and robust than MAID

25
Power management Schemes: Server Systems (outline repeated)

26
Server Power management: Local Schemes
  • Power management schemes for application servers have not been much explored.

27
Power management Schemes: Server Systems (outline repeated)

28
Server Power management Partition-wide Schemes
  • No known work done so far

29
Power management Schemes: Server Systems (outline repeated)

30
Server Power management: Component-wide Schemes
  • The power management schemes in this space are mostly the ones used by battery-operated devices
  • A scheme applies to transitioning a single device (CPU, memory, NIC, etc.) into different power modes
  • These schemes normally work independently of each other, even when applied to server power management at the local level

31
Power management Schemes: Server Systems (outline repeated)

32
Server Power management: Heterogeneous Cluster-wide Schemes
  • Not much work done in this space

33
Power management Schemes: Server Systems (outline repeated)

34
Server Power management: Homogeneous Cluster-wide Schemes
  • Front-end Web servers (Pinheiro, '03, Rutgers; Chase, '01, Duke)
  • Load concentration (LC) technique
  • - dynamically redistributes the load offered to a server cluster under light load
  • - idles some hardware and puts it into a low-power mode
  • - under heavy load, the system brings resources back to high-power mode

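The LC technique above amounts to sizing the active node set to the offered load. A minimal sketch, where the headroom factor (spare capacity kept to absorb load spikes) is an assumption:

```python
import math

# Sketch of Load Concentration: keep only as many nodes active as the
# offered load requires (plus headroom); the rest go to low-power mode.

def active_nodes(offered_load, node_capacity, total_nodes, headroom=1.2):
    """Number of nodes to keep in high-power mode."""
    needed = math.ceil(offered_load * headroom / node_capacity)
    # Always keep at least one node up; never exceed the cluster size.
    return max(1, min(total_nodes, needed))
```

Re-evaluating this periodically reproduces the behavior in the results slide: the number of active nodes tracks the load curve, and the idle remainder accounts for the energy savings.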
35
Server Power management: Cluster-wide Schemes
  • Some results: as the load increases, the number of active nodes increases
  • 38% energy savings
36
Server Power management: Homogeneous Cluster-wide Schemes
  • Front-end Web server clusters: attacks CPU power usage (Elnozahy, Kistler & Rajamony, '03, IBM Austin)
  • Independent voltage scaling (IVS)
  • - each server independently decides its CPU operating point (voltage, frequency) at runtime
  • Coordinated voltage scaling (CVS)
  • - servers coordinate to determine CPU operating points (voltage, frequency) for overall energy conservation.

37
Server Power management: Homogeneous Cluster-wide Schemes
  • Hot server clusters: thermal management (Weissel, Bellosa, Virginia)
  • Throttling processes to keep CPU temperatures down in server clusters
  • - CPU performance counters are used to infer the energy each process consumes
  • - CPU halt cycles are introduced if energy consumption exceeds what is permitted
  • Results
  • - implementation in the Linux kernel for a server cluster with one Web, one factorization, and one database server
  • - can schedule client requests according to pre-established energy allotments when throttling the CPU

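The counter-based throttling above can be sketched as an energy estimator plus an allotment check. The per-event energy weights here are invented for the example; the real scheme calibrates such weights from measured CPU events:

```python
# Hedged sketch of energy-allotment throttling: estimate a process's
# energy from performance-counter readings, and trigger halt cycles
# when the estimate exceeds its allotment. Weights are assumptions.

WEIGHTS = {"instructions": 1e-9, "cache_misses": 5e-8}  # joules/event

def energy_from_counters(counters):
    """Estimated energy (J) from {event name: count} readings."""
    return sum(WEIGHTS[k] * v for k, v in counters.items())

def should_throttle(counters, allotment_j):
    """True if the process spent more energy than permitted."""
    return energy_from_counters(counters) > allotment_j
```

In the real system the throttling action is inserting CPU halt cycles; the sketch only shows the decision that triggers it.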
38
Server Power management: Homogeneous Cluster-wide Schemes
  • Hot server clusters: thermal management for data centers (Moore et al., HP Labs)
  • Hot spots can develop in certain places irrespective of cooling
  • - temperature-modeling work by HP Labs
  • Temperature-aware load-distribution policies
  • - adjust load distribution to racks according to the temperature distribution between racks on the same row
  • - move load away from regions directly affected by failed air conditioners

39
Challenges
  • No existing tool models power and energy consumption.
  • Develop schemes that intelligently exploit SLAs, such as request priorities, to increase savings.
  • Develop accurate workload-based power-usage prediction.
  • Partition-wide power management schemes are not yet explored.
  • Power management schemes for application servers have not been much explored.
  • - application servers use CPU and memory intensively
  • - they store state that is typically not replicated
  • - the challenge is to correctly trade off energy savings against performance overheads

40
Challenges
  • No previous work on energy conservation in memory servers
  • - the challenge is to properly lay out data across main-memory banks and chips to exploit low-power states more extensively.
  • Power management for interconnects and interfaces
  • - a 32-port gigabit Ethernet switch consumes 700W when idle.
  • Thermal management
  • - requires a very good understanding of components and system layouts, and of air flow in server enclosures and data centers.
  • - accurate temperature-monitoring mechanisms

41
Challenges
  • Peak power management
  • - dynamic power management can limit over-provisioning of cooling
  • - the challenge is to provide the best performance under a fixed, smaller power budget
  • - IBM Austin is doing some related work on memory
  • - the power-shifting project dynamically redistributes the power budget between active and inactive components:
  • - lightweight mechanisms to control the power and performance of different system components
  • - automatic workload-characterization techniques
  • - algorithms for allocating power among components

42
Discussion
43
Power related decision making
QoS-aware adaptive power-management schemes
  • 1. Translate a certain power envelope into compute/IO power.
  • 2. Add a new parameter to workload-requirements characterization: power.
  • 3. Power-usage prediction for different devices (CPU, memory, disks, etc.) and server systems under different kinds of workloads (compute-intensive, IO-intensive, etc.)
  • 4. Global power states for servers and data-center systems, like ACPI (ACPI has only rudimentary global states right now)
Additional power savings:
  • 1. For devices that exist in the battery-operated world: CPU, NIC, memory, etc.
  • 2. For new devices introduced by data centers: disk arrays, interconnect switches, etc.
  • 3. Relate power consumption with the ability to self-optimize a platform to achieve promised QoS:
  • - power/QoS-aware scheduling
  • - power/QoS-aware resource aggregation to provision platforms on demand
  • - power/QoS-aware resource partitioning
  • 4. Exploit SLAs, such as request priorities, to increase savings.
  • 5. Exploit server characteristics to increase power savings: workloads, replication, frequency of access for disk-array servers
44-50
Power related decision making (slides 44-50 repeat animation builds of slide 43)

51
Performance Metrics?
52
End
53
  • Here follow some very good examples of how QoS for different devices can be related to power:
  • A hard drive that provides levels of maximum throughput that correspond to levels of power consumption.
  • An LCD panel that supports multiple brightness levels that correspond to levels of power consumption.
  • A graphics component that scales performance between 2D and 3D drawing modes that correspond to levels of power consumption.
  • An audio subsystem that provides multiple levels of maximum volume that correspond to levels of maximum power consumption.
  • A Direct RDRAM™ controller that provides multiple levels of memory-throughput performance, corresponding to multiple levels of power consumption, by adjusting the maximum bandwidth throttles.

back
54
Power related decision making
  • Can new workload be scheduled, given the power
    budget and peak power requirements?
  • What power management algorithms to apply to meet
    specified QoS?
  • How to partition a blade for maximum power
    savings?
  • How to aggregate resources for maximum power
    savings?

back
55
Extra Slides
56
Research Issues
  • How do you translate a certain power envelope into compute/IO power?
  • How do you enforce that power envelope given dynamic runtime changes, such as workload?
  • How do you maintain that power envelope given a certain QoS (best effort, guaranteed bandwidth)?
  • Given that chipsets and processors are going to become very cheap, it makes more sense to translate an application's compute/IO requirements into power consumption in the platform, instead of into frequency for the processor and bandwidth for I/O.

57
Research Issues
  • How do you relate power consumption with the ability to self-optimize a platform to achieve promised QoS?
  • Can you come up with new energy-conservation states beyond the ACPI states, and with techniques possibly beyond the DVS (dynamic voltage scaling) techniques that processors deploy?
  • Can you come up with schemes to optimize power consumption through resource allocation, given that you are dealing not only with the processor but also with memory, network, and storage?
  • - This is interesting: not only the processor but also memory, network, and storage are treated as resources. This can also translate into activities such as partitioning of a single blade.

58
  • Workload-requirements characterization
  • Combined power states for servers and data-center systems, like ACPI (look at the ACPI global states for comparison)
  • Power-aware scheduling and partitioning
  • What is the relationship between workload and power consumption for different devices?

59
Power management: Microprocessors
  • Dynamic voltage scaling (DVS)
  • - supported by Transmeta Crusoe
  • - power proportional to (voltage)² × frequency
  • - slower program execution
  • Halting or deactivation
  • - supported by Intel's Pentium 4
  • - halting stops the processor from executing any instructions
  • - deactivation puts it into a deeper sleep state
  • - transition costs vary

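The DVS relation on this slide, dynamic power proportional to V² × f, in a one-function sketch (the constant C lumps switched capacitance and activity factor, and is set to 1 for illustration):

```python
# Why DVS pays off: dynamic CPU power follows P = C * V^2 * f, so
# lowering voltage and frequency together reduces power cubically.

def dynamic_power(voltage, freq, c=1.0):
    """Dynamic CPU power, P = C * V^2 * f (arbitrary units)."""
    return c * voltage ** 2 * freq
```

For example, halving both voltage and frequency cuts dynamic power to one eighth of the original, at the cost of roughly doubled execution time — which is the trade-off the slide's "slower program execution" bullet refers to.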
60
Power management: Disks
  • Transition to multiple inactive modes
  • - a disk consumes the most power during accesses and while spinning
  • - low-power modes involve reduced spin speed
  • - high transition overheads

61
Server Systems: Power management Challenges
  • Introduce new components
  • - power supplies, disk arrays, interconnection switches
  • - few management techniques for these devices so far
  • - power supplies exhibit high power losses (they hold spare capacity for load peaks)
  • Server workloads are different
  • - power-mode transitioning may incur high overhead
  • - sometimes power-mode transitioning may not be possible at all
  • Widespread replication of resources for high availability and bandwidth
  • - turning devices on/off may involve state migration as well.


62
Server Power management: Motivations from server characteristics
  • Turning resources on/off addresses high base power consumption
  • - exploits wide load variations
  • - exploits resource replication
  • - concentrate load onto a subset of resources, turn off the idle resources
  • Energy management for disk-array-based servers
  • - exploits the frequency of server requests
  • Request batching: degrade response time in exchange for energy conservation
  • - exploits the wide-area network delays involved in accessing servers
