CS 7810 Lecture 12

Provided by: RajeevBala4
1
CS 7810 Lecture 12
Power-Aware Microarchitecture: Design and Modeling Challenges
for Next-Generation Microprocessors
D. Brooks et al., IEEE Micro, Nov/Dec 2000
2
Power/Energy Basics
  • Energy = Power × time
  • Dynamic power = $\alpha C V^2 f$
  • $\alpha$: switching activity factor
  • C: capacitances being charged
  • V: voltage swing
  • f: processor frequency
  • Current trends: f and C are rising, V is dropping;
    overall dynamic power is increasing
  • Leakage energy is also increasing
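
As a quick illustration of the dynamic power expression above, here is a minimal Python sketch. The function names and the numeric values (activity factor, capacitance, voltage, frequency) are made up for illustration and do not come from the lecture.

```python
def dynamic_power(activity, capacitance_f, voltage_v, frequency_hz):
    """Dynamic power in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


def energy(power_w, time_s):
    """Energy in joules: power * time."""
    return power_w * time_s


# Illustrative (assumed) values: alpha = 0.5, C = 1 nF, f = 2 GHz.
base = dynamic_power(0.5, 1e-9, 1.2, 2e9)
half_voltage = dynamic_power(0.5, 1e-9, 0.6, 2e9)

print(f"baseline power: {base:.2f} W")
print(f"power at half the voltage: {half_voltage:.2f} W")
print(f"energy over 10 s at baseline: {energy(base, 10.0):.1f} J")
```

Because power depends on the square of V, halving the voltage alone cuts dynamic power to a quarter in this model.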

3
Processor Breakdowns
Alpha 21264
  • Caches: 16%
  • Out-of-order issue logic: 19%
  • Memory management unit: 9%
  • FP unit: 11%
  • Integer unit: 11%
  • Clock power: 34%
Pentium Pro (figure; breakdown not included in the transcript)
4
Metrics
  • Performance $\propto f \propto 1/D$ (D is delay or execution
    time)
  • Delay of a circuit $\propto 1/(V - V_t)$: a lower frequency
    tolerates longer delays and hence allows a reduced voltage
  • Power $\propto C V^2 f$; since f is roughly proportional
    to voltage, $P \propto V^3 \propto f^3$
  • Since V and f are variable, remove them from the
    expression: $P D^3$ = constant (regardless of V and f)
  • This is the best metric to compare processors:
    any other metric (say, perf/watt) can be fudged
    by changing voltage or frequency
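
The step from the proportionalities to a constant $P D^3$ can be written out as follows. This is only a sketch under the slide's assumption that V scales linearly with f; the constants $k$ and $c$ are introduced here for illustration and stand for design-dependent factors.

```latex
\begin{align*}
P &= k\,f^{3}
  && \text{(from } P \propto C V^{2} f \text{ with } V \propto f\text{)} \\
D &= \frac{c}{f}
  && \text{(from performance } \propto f \propto 1/D\text{)} \\
P\,D^{3} &= k\,f^{3} \cdot \frac{c^{3}}{f^{3}} = k\,c^{3}
  && \text{(independent of } V \text{ and } f\text{)}
\end{align*}
```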

5
Metric Example
                          Proc-A       Proc-B
V = 1.25, f = 1 GHz
  Perf                    1000 MIPS    800 MIPS
  Power                   100 W        80 W
  MIPS/W                  10           10
V = 1.0, f = 0.8 GHz
  Perf                    800 MIPS     640 MIPS
  Power                   51.2 W       41 W
  MIPS/W                  15.6         15.6
V = 1.5, f = 1.2 GHz
  Perf                    1200 MIPS    960 MIPS
  Power                   172.8 W      138.2 W
  MIPS/W                  6.9          6.9
Power/f^3                 100          80
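
The numbers in this example can be reproduced with a short Python sketch. It assumes, as the lecture does, that V scales linearly with f, so performance scales with f and power with f cubed. The helper name `scale_operating_point` is hypothetical; the 1 GHz baseline values are the ones given on the slide.

```python
def scale_operating_point(perf_mips, power_w, f_ghz, base_f_ghz=1.0):
    """Scale a baseline (perf, power) point to a new frequency.

    Assumes V is scaled linearly with f, so performance scales with f
    and dynamic power scales with f**3.
    """
    s = f_ghz / base_f_ghz
    return perf_mips * s, power_w * s ** 3


# Baseline points at 1 GHz, taken from the slide.
baselines = {"Proc-A": (1000.0, 100.0), "Proc-B": (800.0, 80.0)}

for name, (perf, power) in baselines.items():
    for f_ghz in (1.0, 0.8, 1.2):
        p, w = scale_operating_point(perf, power, f_ghz)
        print(f"{name} f={f_ghz} GHz: {p:.0f} MIPS, {w:.1f} W, "
              f"{p / w:.1f} MIPS/W, power/f^3 = {w / f_ghz ** 3:.0f}")
```

MIPS/W changes with every operating point, while power/f^3 stays fixed for each processor, which is the point of the example.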
6
Metrics
  • $P D^3$ gives the ratio of power if two processors were
    tuned to yield the same performance
  • $(P D^3)^{1/3}$ gives the ratio of performance if two
    processors were tuned to yield the same power
  • Tuning is done through voltage and frequency
    scaling, and it is assumed that a linear relationship
    exists between V and f
  • Note that in modern processors this is not true, and
    $P D^x$ is the right metric, where $x > 3$ (x can be 1 or 2
    in markets where performance is not very critical)
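
A minimal Python sketch of how the $P D^x$ metric could be used to compare two processors. It reuses the Proc-A and Proc-B baseline numbers from the Metric Example slide and takes delay as the reciprocal of MIPS, which is an assumption made here for illustration. The function name `pd_metric` is hypothetical.

```python
def pd_metric(power_w, perf_mips, x=3):
    """Return P * D**x, taking delay D as 1 / performance."""
    delay = 1.0 / perf_mips
    return power_w * delay ** x


# Baseline operating points from the Metric Example slide.
pd3_a = pd_metric(100.0, 1000.0)  # Proc-A: 100 W, 1000 MIPS
pd3_b = pd_metric(80.0, 800.0)    # Proc-B: 80 W, 800 MIPS

# Ratio of PD^3 values: power of B relative to A at equal performance.
power_ratio = pd3_b / pd3_a
# Cube root of that ratio: performance of A relative to B at equal power.
perf_ratio = power_ratio ** (1.0 / 3.0)

print(f"B needs {power_ratio:.2f}x the power of A at equal performance")
print(f"A delivers {perf_ratio:.2f}x the performance of B at equal power")
```

Passing a different `x` gives the generalized metric; a larger `x` weights performance more heavily relative to power.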

7
Commercial Examples
8
Global Power Saving Strategies
  • Dynamic frequency scaling (DFS): trivially reduces
    power, worsens performance, no effect on energy
  • If off-chip components (memory) dominate, there
    will be an energy reduction with DFS
  • Leakage power is unaffected by DFS, so if leakage
    dominates, overall energy increases
  • Montecito: 20 MHz changes in frequency can
    happen in a single cycle
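
The energy claims above can be checked with a simple model of a task that needs a fixed number of cycles. This is a sketch under stated assumptions: the activity factor and capacitance are lumped into one hypothetical parameter `eff_cap_f`, leakage power is treated as constant under DFS, off-chip components are not modeled, and all numbers are made up.

```python
def task_energy(cycles, freq_hz, volts, eff_cap_f, leak_power_w):
    """Return (dynamic_energy_j, leakage_energy_j) for a fixed-cycle task."""
    time_s = cycles / freq_hz
    dynamic_power_w = eff_cap_f * volts ** 2 * freq_hz
    return dynamic_power_w * time_s, leak_power_w * time_s


# Same task and voltage, run at full and at half frequency.
full = task_energy(1e9, 1.0e9, 1.0, 10e-9, 2.0)
half = task_energy(1e9, 0.5e9, 1.0, 10e-9, 2.0)

print(f"full speed: dynamic {full[0]:.1f} J, leakage {full[1]:.1f} J")
print(f"half speed: dynamic {half[0]:.1f} J, leakage {half[1]:.1f} J")
```

In this model dynamic energy is the same at both frequencies, because the lower power is offset by the longer run time, while leakage energy grows with the run time.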

9
Global Power Saving Strategies
  • Dynamic voltage scaling (DVS): since we are changing
    frequency, we can also combine it with voltage scaling,
    as each circuit has longer slack
  • This has a more than quadratic effect on dynamic power,
    a linear effect on leakage power, and a more than linear
    effect on energy
  • Intel XScale: roughly 50 ms to scale from 1.65 V to 0.75 V
  • DVS opportunities are reducing: with lower voltage
    margins, error rates may increase
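
A small Python sketch of the scaling behaviour described above, assuming V and f are scaled together by the same factor `s`, dynamic power follows $V^2 f$, leakage power is linear in V, and run time is inversely proportional to f. The function name and the baseline numbers are illustrative assumptions, not values from the lecture.

```python
def dvs_scale(s, dyn_power_w, leak_power_w, time_s):
    """Scale voltage and frequency together by a factor s."""
    dyn = dyn_power_w * s ** 3   # V^2 * f, both scaled by s
    leak = leak_power_w * s      # leakage power linear in V
    t = time_s / s               # run time inversely proportional to f
    return {
        "power_w": dyn + leak,
        "time_s": t,
        "dynamic_energy_j": dyn * t,
        "leakage_energy_j": leak * t,
        "energy_j": (dyn + leak) * t,
    }


base = dvs_scale(1.0, 10.0, 2.0, 1.0)
slow = dvs_scale(0.8, 10.0, 2.0, 1.0)

print(f"baseline: {base['power_w']:.2f} W, {base['energy_j']:.2f} J")
print(f"s = 0.8:  {slow['power_w']:.2f} W, {slow['energy_j']:.2f} J")
```

Under these assumptions dynamic energy falls with the square of `s`, while leakage energy stays flat because the lower leakage power is offset by the longer run time.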

10
Localized Power Saving Strategies
  • When a processor structure is not used in a cycle,
    gate off its clock for that cycle; gating can happen
    in a single cycle, but increases complexity
  • Leakage energy can be reduced by gating off the
    supply voltage V during periods of inactivity; this
    takes more time to take effect
  • Body biasing can also reduce leakage power
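
To make the clock gating idea concrete, here is a toy Python model. It assumes a per-cycle usage trace for one structure and a fixed clock energy per clocked cycle; both the trace and the energy value are hypothetical, and the model ignores the overhead of the gating logic itself.

```python
def clock_energy(usage_trace, energy_per_clocked_cycle, gated):
    """Clock energy of one structure over a per-cycle usage trace.

    Without gating the structure is clocked every cycle; with gating it
    is clocked only in cycles where it is actually used.
    """
    if gated:
        clocked_cycles = sum(1 for used in usage_trace if used)
    else:
        clocked_cycles = len(usage_trace)
    return clocked_cycles * energy_per_clocked_cycle


# Hypothetical trace: the structure is used in 2 of 4 cycles.
trace = [True, False, False, True]
ungated = clock_energy(trace, 1.0, gated=False)
gated = clock_energy(trace, 1.0, gated=True)

print(f"ungated: {ungated} units, gated: {gated} units")
```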

11
Localized Power Saving Strategies
Dynamically adjust frequency/voltage and size for
each domain, based on throughput rates
12
Leakage Power
Leakage is a linear function of supply voltage,
a linear function of the number of transistors,
and an exponential function of threshold voltage.
From Butts and Sohi, MICRO 2000
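
The three dependencies can be illustrated with a small Python sketch. The functional form below (a decaying exponential in threshold voltage) and the constants `k` and `slope_v` are assumptions chosen for illustration; they are not the model or the parameters from the cited paper.

```python
import math


def leakage_power(vdd, n_transistors, vt, k=1e-6, slope_v=0.1):
    """Toy leakage model: linear in Vdd and transistor count,
    exponential in threshold voltage.

    k and slope_v are hypothetical fitting constants.
    """
    return k * vdd * n_transistors * math.exp(-vt / slope_v)


base = leakage_power(1.0, 1e6, 0.3)
double_vdd = leakage_power(2.0, 1e6, 0.3)
double_n = leakage_power(1.0, 2e6, 0.3)
lower_vt = leakage_power(1.0, 1e6, 0.2)

print(f"2x supply voltage: {double_vdd / base:.2f}x leakage")
print(f"2x transistors:    {double_n / base:.2f}x leakage")
print(f"Vt lowered 0.1 V:  {lower_vt / base:.2f}x leakage")
```

In this toy model, lowering the threshold voltage by one `slope_v` multiplies leakage by e, whereas doubling the supply voltage or the transistor count only doubles it.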
13
Power-Performance Trade-Offs
14
Power-Performance Trade-Offs
Caches and branch predictors are doubled at each point below,
while the x-axis represents the sizes of issue
queues, registers, ROB, etc.
Argues against going to wider/larger superscalars.
15
Other Observations
  • Clustered architectures have better power
    scalability (since the complexity of each cluster
    remains unchanged)
  • CMP and SMT can employ complexity-effective
    designs: power consumption is low (little wasted
    work) and multi-threaded performance continues
    to be high

From ISPASS 2006