PB-LRU: A Self-Tuning Power Aware Storage Cache Replacement Algorithm for Conserving Disk Energy


1
PB-LRU: A Self-Tuning Power Aware Storage Cache
Replacement Algorithm for Conserving Disk Energy
  • Qingbo Zhu, Asim Shankar and Yuanyuan Zhou

Presented by: Hang Zhao, Chiu Tan
2
PB-LRU: Partition-Based LRU
  • Storage is a major energy consumer, taking up 27% of
    the power budget in a data center.
  • PB-LRU is a power-aware, online cache
    management algorithm.
  • PB-LRU dynamically partitions the cache at run time
    to give each disk an energy-optimal cache size.
  • It is a practical algorithm that dynamically adapts to
    workload changes with little tuning.

3
Outline
  • Motivation
  • Background
  • Why do we need PB-LRU?
  • Main Idea
  • Energy Estimation at Run Time
  • Solving MCKP
  • Evaluation: Simulation
  • Conclusion

4
Motivation
  • Why is power conservation important?
  • Data centers are an important component of the
    Internet infrastructure.
  • Power needs for a data center are increasing at
    25% a year, with storage taking up 27%.
  • How to reduce power in storage?
  • Simple: spin down disks when not in use.

5
Motivation (II)
  • But...
  • There is a performance and energy penalty when a disk
    moves from a low-power to a high-power mode.
  • Data center request volume is high, so idle periods
    are small, making full spin-up/spin-down impractical.
  • Solution: a multi-speed disk architecture.
  • PB-LRU targets multi-speed disks.

6
Background
  • Break-even time: the minimum idle time needed to
    justify spinning the disk down and back up.
  • Oracle DPM: knows the length of the next idle period
    and uses it to regulate power modes.
  • Practical DPM: uses idle-time thresholds to decide
    when to power down or up (see the sketch below).
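
As a rough illustration of these two ideas, the sketch below computes a
break-even time from a simple two-mode power model and applies a
threshold-based spin-down rule. The function names and the power/energy
numbers are placeholders for illustration, not the disk parameters used in
the paper.

    # Illustrative only: placeholder power/energy numbers, not the paper's disk model.
    def break_even_time(p_high, p_low, e_transition, t_transition):
        """Idle length at which spinning down starts to pay off:
        staying up costs p_high*T; spinning down costs the transition
        energy plus p_low power for the rest of the idle period."""
        return (e_transition - p_low * t_transition) / (p_high - p_low)

    def should_spin_down(idle_so_far, threshold):
        # Practical DPM: spin down once the observed idle time exceeds a
        # threshold; a threshold equal to the break-even time is 2-competitive.
        return idle_so_far >= threshold

    t_be = break_even_time(p_high=13.5, p_low=2.5, e_transition=135.0, t_transition=10.9)
    print(f"break-even time ~ {t_be:.1f} s")   # ~9.8 s with these placeholder numbers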

7
Why do we need PB-LRU?
  • Earlier work: PA-LRU.
  • Idea: keep blocks from less active disks in the
    cache, which extends their idle periods.
  • Cost: more misses go to the active disks.
  • Justification: since active disks are already
    spinning, those misses are cheaper in terms of power.

8
However
  • PA-LRU requires complicated parameter tuning: four
    parameters are needed.
  • There is no intuitive link between the parameters and
    disk power consumption or I/O times.
  • This makes it difficult to adopt simple extensions or
    heuristics for a real-world implementation.
  • PB-LRU is a practical implementation!

9
PB-LRU: Main Idea
  • Divide the cache into partitions, one per disk.
  • Each partition is managed individually.
  • Resize the partitions periodically.
  • Workloads are not equally distributed across the
    different disks.

10
Main Idea (II)
  • So what do we need?
  • Estimate, for each disk, the energy it would consume
    with a particular cache size (the estimation problem).
  • Use these estimates to find the partitioning that
    minimizes total energy consumption across all disks
    (the MCKP problem). See the sketch below.
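
A minimal sketch of how these two pieces fit together in an epoch-based
loop. The names repartition, estimate_energy, and solve_mckp are ours for
illustration, not the paper's interfaces, and the stubs at the end exist
only to make the wiring runnable.

    def repartition(disks, total_cache, candidate_sizes, estimate_energy, solve_mckp):
        # 1. Estimation problem: for each disk, predict the energy it would
        #    have used over the past epoch under every candidate partition size.
        energy = {d: {s: estimate_energy(d, s) for s in candidate_sizes}
                  for d in disks}
        # 2. MCKP: choose one size per disk so that the sizes fit in the total
        #    cache and the predicted total energy is minimized.
        return solve_mckp(energy, total_cache)

    # Illustrative wiring only: a fake estimator and a trivial "solver" that
    # hands every disk an equal share.
    sizes = repartition(
        disks=["disk0", "disk1"], total_cache=128, candidate_sizes=[32, 64, 96],
        estimate_energy=lambda d, s: 1000.0 / s,
        solve_mckp=lambda energy, cap: {d: cap // len(energy) for d in energy})
    print(sizes)   # each partition is then managed by plain LRU until the next epoch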

11
Estimation Problem
  • Q How to estimate energy consumption per disk
    for different cache sizes at run time?
  • Use simulators. One (multi-disk) simulator for
    every cache size.
  • Requires (NumCacheSizes X NumDisks) simulators.
    Impractical!

12
Estimation Problem (II)
  • Mattson's stack.
  • It takes advantage of the inclusion property: the
    contents of a cache of k blocks are a subset of a
    cache of k+1 blocks, so an access found at stack
    position i is a miss for all caches smaller than i.
  • PB-LRU uses Mattson's stack to predict a hit or a
    miss for every partition size at once (see the sketch
    below).
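
A minimal sketch of Mattson's stack algorithm under LRU ordering (the
function and variable names are ours for illustration): a single pass over
the trace yields, for every candidate cache size at once, how many accesses
would have missed.

    def mattson_misses(trace, sizes):
        stack = []                       # index 0 = most recently used block
        misses = {s: 0 for s in sizes}   # per-cache-size miss counts
        for block in trace:
            if block in stack:
                depth = stack.index(block) + 1   # stack distance (1-based)
                stack.remove(block)
            else:
                depth = float("inf")             # cold miss: misses at every size
            stack.insert(0, block)               # block becomes most recently used
            for s in sizes:
                if depth > s:                    # inclusion property: a miss for
                    misses[s] += 1               # every cache smaller than depth
        return misses

    # The trace from the next two slides: five cold misses, then a re-access
    # of block 4 that hits only for cache sizes 4 and 5.
    print(mattson_misses([5, 4, 3, 2, 1, 4], sizes=[1, 2, 3, 4, 5]))
    # {1: 6, 2: 6, 3: 6, 4: 5, 5: 5}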

13
Estimation Problem (III)
  • In addition, for each candidate cache size PB-LRU
    keeps track of the time of the previous miss (Pre_miss)
    and the energy consumed so far.
  • With these pieces of information, the energy
    consumption for each candidate cache size is estimated.

14
Example (before): state after the cold accesses at T1-T5
Cache accesses:
  Time:   T1 T2 T3 T4 T5
  Access: 5  4  3  2  1
5 possible cache sizes.
Mattson's stack (most recent on top): 1, 2, 3, 4, 5
Per-size estimates:
  Cache Size  Pre_miss  Energy
  1           T5        E5
  2           T5        E5
  3           T5        E5
  4           T5        E5
  5           T5        E5
Existing (real) cache, RCache: 1, 2, 3
15
Example: at T6, block 4 is accessed
Cache accesses:
  Time:   T1 T2 T3 T4 T5 T6
  Access: 5  4  3  2  1  4
Block 4 is the 4th element of the stack, so the access is a miss for cache
sizes < 4 and a hit for sizes >= 4.
Mattson's stack after the access (most recent on top): 4, 1, 2, 3, 5
Per-size estimates:
  Cache Size  Pre_miss  Energy
  1           T6        E6
  2           T6        E6
  3           T6        E6
  4           T5        E5
  5           T5        E5
RCache (LRU): 4, 2, 3
E6 = E5 + E(T6 - T5) + 10 ms × ActivePower   (see the sketch below)
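
A rough sketch of the per-size bookkeeping shown above. The class and
constant names are ours, the power numbers are placeholders, and the
interval energy E(T6 - T5) is simplified to idle power; the real estimator
charges that interval according to the power mode the underlying DPM would
have used.

    IDLE_POWER = 2.5       # W, placeholder
    ACTIVE_POWER = 13.5    # W, placeholder
    MISS_SERVICE = 0.010   # s, the "10 ms" of active time charged per miss

    class SizeEstimate:
        """The Pre_miss / Energy pair kept for one candidate cache size.
        Hits leave it untouched, which is why sizes 4 and 5 above still
        show T5 / E5 after the access at T6."""
        def __init__(self):
            self.pre_miss = 0.0    # time of the previous miss
            self.energy = 0.0      # estimated energy so far

        def on_miss(self, now):
            # E_new = E_old + E(now - pre_miss) + 10 ms * ActivePower
            self.energy += IDLE_POWER * (now - self.pre_miss)   # simplified E(interval)
            self.energy += ACTIVE_POWER * MISS_SERVICE          # miss service cost
            self.pre_miss = now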
16
Solving MCKP
  • MCKP is NP-hard, but a modified version of the problem
    is solvable with dynamic programming (see the sketch
    below).
  • General result: increase the cache size for less
    active disks and decrease it for active disks.
  • Why? The penalty for shrinking an active disk's cache
    is small, while the energy saved by enlarging an
    inactive disk's cache is large.
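
A minimal dynamic-programming sketch for this multiple-choice knapsack:
pick exactly one (size, estimated energy) option per disk so that the sizes
fit in the total cache and the summed energy is minimal. The names and toy
numbers are ours; sizes are assumed to be in coarse cache units so the
capacity dimension stays small.

    def solve_mckp(options, capacity):
        """options: {disk: [(size, energy), ...]}; returns {disk: chosen size}."""
        # best[c] = (minimum energy using exactly c cache units, chosen sizes)
        best = {0: (0.0, {})}
        for disk, choices in options.items():
            nxt = {}
            for used, (energy, picks) in best.items():
                for size, e in choices:              # exactly one choice per disk
                    c = used + size
                    if c > capacity:
                        continue
                    cand = (energy + e, {**picks, disk: size})
                    if c not in nxt or cand[0] < nxt[c][0]:
                        nxt[c] = cand
            best = nxt
        return min(best.values(), key=lambda t: t[0])[1]

    # Two disks, 4 cache units: the mostly idle disk gets the larger share,
    # because that is where the estimated energy drops fastest.
    print(solve_mckp({"busy": [(1, 100.0), (3, 95.0)],
                      "idle": [(1, 80.0), (3, 40.0)]}, capacity=4))
    # {'busy': 1, 'idle': 3}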

17
Evaluation Methodology
  • The integrated simulator:
  • Disk power model
  • CacheSim
  • DiskSim
  • Multi-speed disk model:
  • Similar to the IBM Ultrastar 36Z15
  • Adds 4 lower-speed modes: 12k, 9k, 6k and 3k RPM
  • Power model: 2-competitive thresholds

18
Evaluation Methodology cont.
  • The traces
  • Real system traces:
  • OLTP: database storage system (21 disks, 128 MB
    cache)
  • Cello96: Cello file server from HP (19 disks,
    32 MB cache)
  • Synthetic traces:
  • Generated based on storage system workloads
  • Zipf distribution to spread requests among 24
    disks and among the blocks within each disk
  • Hill shape to reflect temporal locality
  • Inter-request arrival distributions: exponential,
    Pareto

19
Simulation results
  • Algorithms compared:
  • Infinite cache
  • LRU
  • PA-LRU
  • PB-LRU

Chart callouts: "Limited savings due to a high cold-miss rate (64%)",
"PB-LRU saves 9%", "Outperforms LRU by 22%".
20
Simulation results cont.
Chart callouts: "PB-LRU has 5% better response time", "saves 40% in
response time".
  • Oracle DPM does not slow down the average response
    time, because it always spins the disk up in time for
    a request.
  • All PB-LRU results are insensitive to the epoch
    length.

21
Accuracy of Energy Estimation
  • OLTP, 21 disks with Practical DPM
  • The largest deviation of estimated energy from real
    energy is 1.8%.

22
Cache partition sizes
Chart callouts: "11-12 MB", "1 MB".
  • MCKP partitioning tendency:
  • gives small sizes to disks which remain active
  • increases the sizes assigned to relatively
    inactive disks

23
Effects of spin-up cost
  • Disks stay longer at the low-power mode
  • The break-even time increases
24
Sensitivity Analysis on Epoch Length
  • The epoch length just needs to be large enough to
    accommodate the warm-up period after
    re-partitioning.

25
Conclusion
  • PB-LRU is an online storage cache replacement
    algorithm that partitions the total system cache
    among the individual disks.
  • It focuses on multiple disks with data center
    workloads.
  • It achieves similar or better energy savings and
    response time improvements with significantly less
    parameter tuning.

26
Future work
  • Take prefetching into consideration when investigating
    the role of cache management in energy conservation.
  • Optimally divide the total cache space between caching
    and prefetching buffers.
  • Implement the disk power modeling component in a real
    storage system.

27
Impact of PB-LRU
  • 5 citations found on Google Scholar:
  • Energy conservation techniques for disk
    array-based servers (ICS'04)
  • Performance Directed Energy Management for Main
    Memory and Disks (ASPLOS'04)
  • Power Aware Storage Cache Management
  • Power and Energy Management for Server Systems
  • Management Issues