Instruction Level Power Analysis

1
Instruction Level Power Analysis
  • Manoj Gupta, 2001119
  • Mayank Gupta, 2001120

2
Layout
  • Introduction
  • Components of Power Consumption
  • Power Characterization
  • Instruction Level Power Analysis for RISC
    processors
  • Extensions for VLIW/EPIC processors
  • Register Files
  • Caches

3
Introduction
  • Why has the power consumption of nano-electronics
    become so important?
  • Because Moore's law still holds true, leading to
    more complex applications
  • Mobile systems: battery bottleneck
  • High-performance computation: heat extraction
  • Operating cost and reliability
  • A data warehouse of an ISP with 8000 servers needs
    2 MW

4
Introduction
  • Power or energy? Don't they go hand in hand?
  • Power varies significantly with time!
  • A given battery has a fixed amount of energy
  • Average power consumption = Energy / Execution time
  • Decides average chip and junction temperature
  • Decides battery life (if peak current < rated
    current)
  • Peak power and current
  • Affect voltage drops, hot spots, rate of battery
    discharge
  • Power-efficient, energy-efficient, and
    battery-efficient design paradigms do exist!
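
The distinction between energy, average power, and peak power can be illustrated with a small sketch. It assumes a uniformly sampled power trace; the function names and the two sample traces are hypothetical and not taken from the slides.

```python
def energy_joules(power_w, dt_s):
    """Total energy of a uniformly sampled power trace (rectangle rule)."""
    return sum(power_w) * dt_s


def average_power_w(power_w, dt_s):
    """Average power = energy / execution time."""
    return energy_joules(power_w, dt_s) / (len(power_w) * dt_s)


def peak_power_w(power_w):
    """Largest instantaneous power seen in the trace."""
    return max(power_w)


# Two made-up traces: same energy and average power, different peaks.
flat = [1.0, 1.0, 1.0, 1.0]
bursty = [0.5, 0.5, 2.5, 0.5]
```

Both traces drain the same energy from a battery, but only the bursty one stresses the peak-current limit, which is why the two quantities are tracked separately.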

5
Components of Power Consumption
  • System = hardware platform + software (system +
    application)
  • Software impacts hardware power consumption
  • Static power
  • Sub-threshold leakage + reverse-biased junction
    leakage
  • Quiescent biasing power (in case of non-CMOS
    circuits)
  • Dynamic power
  • Charging and discharging of capacitance
    (switching activity)
  • Short-circuit power during transitions (depends
    on rate of change, delay)
  • Alternative grouping (used at component/cell
    level)
  • Switching power at the boundaries of cells
  • Internal cell power
  • Short-circuit power
  • Switching power at internal nodes
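
The slide does not give a formula for the switching component. As an illustration, the standard first-order CMOS expression P = a * C * Vdd^2 * f can be sketched as follows; the function and parameter names are assumptions made for this example.

```python
def switching_power_w(activity, capacitance_f, vdd_v, freq_hz):
    """First-order dynamic switching power: a * C * Vdd^2 * f.

    activity      -- average fraction of the capacitance switched per cycle
    capacitance_f -- total switched capacitance in farads
    vdd_v         -- supply voltage in volts
    freq_hz       -- clock frequency in hertz
    """
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical operating point: halving Vdd quarters the switching power.
p_full = switching_power_w(0.5, 1e-9, 2.0, 1e8)
p_half = switching_power_w(0.5, 1e-9, 1.0, 1e8)
```

Leakage and short-circuit power are deliberately left out of this sketch.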

6
System Abstractions - Power
  • Abstraction levels, from highest to lowest:
    Functional Specifications and Constraints,
    System Level Netlist, Register Transfer Level
    (RTL) Netlist, Component/Cell Level Netlist,
    Layout or Configuration-bits, Chip
  • Compared across these levels: time complexity,
    accuracy of power characterization,
    opportunities for optimization

7
Power Characterization
  • Measurement (Chip/Board Level)
  • Most accurate
  • Perhaps the fastest, if setup and tools exist
  • Too late to change hardware details
  • Software/Load control is still possible
  • Typically used for software optimizations

8
Power Characterization (cont)
  • Transistor Level (estimation)
  • SPICE simulation of the transistor-level netlist
  • Most accurate in the simulation world
  • Requires complete implementation details
  • Unmanageable time complexity even for simpler
    designs
  • Typically used for cell/component
    characterization
  • Synopsys PowerMill (said to provide SPICE-like
    accuracy)

9
Power Characterization (cont)
  • Cell Level (estimation)
  • After logic synthesis
  • Requires RTL implementation
  • Simulation to capture switching activity
  • Requires delay simulation if glitches need to be
    accounted for
  • Characterized cells: empirical formulas or table
    look-up
  • Interconnect power is
  • Either unaccounted for, or
  • Estimated using wire load models (typically based
    on experience), or
  • Taken from the extracted layout (if done after
    physical synthesis)
  • Still unmanageable time complexity, especially
    for use in design space exploration
  • Synopsys PrimePower
  • Inputs: netlist, interconnect capacitance, VCD
    traces, cell power library
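
A minimal sketch of the table look-up style of cell-level estimation: toggle counts captured by simulation are multiplied by a per-cell energy taken from a characterized library. This is not how any particular tool works internally; the data layout, names, and numbers are hypothetical, and interconnect and glitch power are ignored.

```python
def cell_level_power_w(netlist, toggles, energy_per_toggle_j, sim_time_s):
    """Average power from per-instance toggle counts and a cell library.

    netlist             -- maps instance name to cell type
    toggles             -- maps instance name to output toggle count
    energy_per_toggle_j -- maps cell type to energy per output toggle
    sim_time_s          -- simulated time covered by the toggle counts
    """
    energy = sum(
        toggles.get(inst, 0) * energy_per_toggle_j[cell]
        for inst, cell in netlist.items()
    )
    return energy / sim_time_s


# Made-up three-gate design and library values.
netlist = {"u1": "NAND2", "u2": "INV", "u3": "NAND2"}
toggles = {"u1": 4, "u2": 2}  # u3 never toggled in this run
library = {"NAND2": 2.0, "INV": 1.0}
```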

10
Power Characterization (cont)
  • Register Transfer Level (estimation)
  • Requires conceptual RTL description (detailed
    micro-architecture)
  • Data-path is modeled as a netlist of macro cells,
    which are characterized offline
  • Control path and glue logic
  • Either unaccounted for or estimated based on I/O
  • Simulation to capture switching activity
  • Glitches are typically not considered, but
    methods to do so exist
  • Interconnect power
  • Typically unaccounted for, but can be estimated
    through floor-planning
  • Typically used in design space exploration
    (DSE), mostly with in-house tools

11
System Level Power Estimation
  • For Design Space Exploration
  • Least accurate, but the uncertainty of
    exploration results can be reduced if the models
    have good fidelity
  • Purpose, target architecture and available system
    details govern the system-level estimation models
  • Selecting an algorithm, or designing hardware for
    a given algorithm?
  • ASIC based or processor based?
  • Is the ISA fixed or extensible?
  • Typically, system-level power estimation models
    are specific to a macro-architecture template
  • Major constituents of power consumption
  • Computation, communication, storage units, and
    peripherals

12
Power Estimation Models
  • Activity Based Models
  • Instruction Level Energy Models

13
Activity Based Models
  • Fixed Activity Model
  • N-Transition Model
  • Dual Bit Type Model

14
Fixed Activity Model
  • $P = \sum_i k_i G_i f_i$
  • Where
  • $k_i$: PFA proportionality constant, extracted
    empirically from past designs
  • $G_i$: measure of hardware complexity
  • $f_i$: activation frequency
  • Disadvantage: does not model the influence of
    data activity on power consumption
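
A minimal sketch of the fixed activity model, computing the sum of k_i * G_i * f_i over functional blocks. The tuple layout and the sample values are assumptions for illustration.

```python
def fixed_activity_power(blocks):
    """Sum k_i * G_i * f_i over all functional blocks.

    blocks -- iterable of (k_i, G_i, f_i) tuples: proportionality
              constant, hardware complexity, activation frequency
    """
    return sum(k * g * f for k, g, f in blocks)


# Two hypothetical blocks: (k, complexity, activation frequency).
blocks = [(2.0, 10, 0.5), (1.0, 4, 1.0)]
```

Nothing in the inputs depends on the data being processed, which is the disadvantage the slide points out.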

15
N-Transition Model
  • $P = P_{const} + n \cdot P_{change}$
  • Disadvantage
  • It does not differentiate between transitions
    on different inputs.
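
The N-transition model, P = P_const + n * P_change with n the number of inputs that changed, can be sketched as below. Representing the inputs as tuples of bits is an assumption of this example.

```python
def n_transition_power(prev_inputs, curr_inputs, p_const, p_change):
    """P = p_const + n * p_change, n = number of inputs that toggled."""
    n = sum(1 for a, b in zip(prev_inputs, curr_inputs) if a != b)
    return p_const + n * p_change


# Toggling inputs 0 and 2, or inputs 1 and 3, costs exactly the same,
# because the model only counts transitions.
p_a = n_transition_power((0, 1, 1, 0), (1, 1, 0, 0), 1.0, 0.5)
p_b = n_transition_power((0, 1, 1, 0), (0, 0, 1, 1), 1.0, 0.5)
```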

16
Dual Bit Type Model
  • Drawbacks of the previous approaches
  • Less accurate
  • They characterize the module on the basis of
    Uniform White Noise (UWN) input
  • This leads to high error if the input dynamic
    range does not fully occupy the word length
17
Dual Bit Type Model: The Approach
  • Combines reduced complexity of the architecture
    level with the accuracy of gate and circuit level
  • Black box model of capacitance switched in each
    module for various types of inputs
  • Easy to parameterize capacitance models to take
    into account size, etc.

18
Dual Bit Type Model: Modeling Complexity
  • Power consumed by a module is a function of its
    complexity, as large modules contain more
    circuitry
  • Example
  • Capacitance of an N-bit ripple-carry subtractor
  • $C_T = C_{eff} \cdot N$
  • Not restricted to linear models; more complex
    models can also be specified

19
Dual Bit Type Model: Capacitive Data Coefficients
  • Describe the average amount of capacitance
    switched within a module during an input
    transition
  • LSB regions suffer random transitions and hence
    can be characterized by a single capacitive
    coefficient $C_{UU}$
  • MSB region experiences sign transitions and so is
    characterized by capacitive sign coefficients
    $C_{+-}$, $C_{++}$, etc.
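
A simplified sketch of how the two kinds of coefficients might be combined for a stream of signed input words. It assumes a fixed split between the LSB (UWN) region and the MSB (sign) region, per-bit coefficients, and that zero counts as positive; none of these details come from the slides, and all names and values are hypothetical.

```python
def sign_transition(prev, curr):
    """Classify a transition as '++', '+-', '-+' or '--' (zero is '+')."""
    def sign(value):
        return "+" if value >= 0 else "-"
    return sign(prev) + sign(curr)


def dbt_capacitance(samples, n_uwn_bits, n_sign_bits, c_uu, c_sign):
    """Total capacitance switched over consecutive input transitions.

    Each transition switches c_uu per LSB-region bit plus the sign
    coefficient for its transition type per MSB-region bit.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        total += n_uwn_bits * c_uu
        total += n_sign_bits * c_sign[sign_transition(prev, curr)]
    return total


# Hypothetical coefficients: sign changes switch the whole MSB region.
c_sign = {"++": 0, "--": 0, "+-": 5, "-+": 5}
total = dbt_capacitance([3, -2, -1, 4], 8, 4, 2, c_sign)
```

A signal that rarely changes sign contributes little from the MSB region, which a pure white-noise characterization would overestimate.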

20
Instruction Level Power Estimation
  • First introduced to characterize processor power
    consumption to drive software optimizations
  • Each instruction is associated with some current
  • Inter-instruction effects are included for
    better accuracy

21
Instruction Level Power Estimation
  • $E = \sum_i (B_i \times N_i) + \sum_{i,j} (O_{i,j} \times N_{i,j}) + \sum_k E_k$
  • $B_i$: base energy cost
  • $O_{i,j}$: inter-instruction effect energy cost
  • $E_k$: additional energy penalties due to
    resource constraints
  • Requires a cost for every pair of instructions:
    $O(N^2)$, where N is the number of instructions
    in the ISA
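
A minimal sketch of evaluating this model on an executed instruction trace: base costs are weighted by instruction counts, inter-instruction overheads by counts of adjacent pairs, and the extra penalties are added at the end. The trace format, the cost values, and the choice to treat missing pair overheads as zero are assumptions of this example.

```python
from collections import Counter


def program_energy(trace, base_cost, pair_overhead, penalties=()):
    """E = sum(B_i * N_i) + sum(O_ij * N_ij) + sum(E_k).

    trace         -- executed instructions, in order
    base_cost     -- maps instruction to base energy B_i
    pair_overhead -- maps (previous, current) to overhead O_ij;
                     pairs not listed are treated as zero overhead
    penalties     -- extra energies E_k (stalls, cache misses, ...)
    """
    counts = Counter(trace)
    pair_counts = Counter(zip(trace, trace[1:]))
    base = sum(base_cost[i] * n for i, n in counts.items())
    inter = sum(pair_overhead.get(p, 0.0) * n for p, n in pair_counts.items())
    return base + inter + sum(penalties)


# Made-up costs for a two-instruction ISA.
trace = ["add", "mul", "add", "add"]
base_cost = {"add": 1.0, "mul": 3.0}
pair_overhead = {("add", "mul"): 0.5, ("mul", "add"): 0.5}
energy = program_energy(trace, base_cost, pair_overhead, penalties=[2.0])
```

The `pair_overhead` table is the part that grows quadratically with the size of the instruction set.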

22
JouleTrack
  • Experiments on StrongARM by Amit Sinha and
    A. P. Chandrakasan
  • Current per instruction: 0.2 A (averaged over all
    instructions)
  • Min-max variation of 38% of the average current
  • Address-mode and data-dependent variation is
    smaller
  • But the maximum current variation across
    benchmarks is < 8%!
  • Concluded that the first-order energy model of a
    given processor is $E = V \cdot I(V, f) \cdot T$
  • Second-order effects can be significant for
    data-path dominated processors such as DSP, VLIW
23
Instruction Level Power Estimation
  • Impractical for CISC processors with a very
    large instruction set
  • Higher average instruction energy
  • Low energy-per-instruction variance
  • Do not consider inter-instruction effects
  • Cluster similar instructions as a single class
  • Exponential storage problem for VLIW
    architectures
  • Number of long-instruction costs (one per pair of
    long instructions) when combining N operations
    into a K-wide VLIW: $O(N^{2K})$
24
Modified Energy Model for VLIW
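  • Assume independent energy dissipation for
    different execution slots
  • Consider NOP as the base energy
  • $E(W) = \sum_n U(w_n \mid w_{n-1}) + m \times p \times S + l \times q \times M$
  • $U(w_n \mid w_{n-1}) = U(0 \mid 0) + \sum_k v(w_n^k \mid w_{n-1}^k)$
  • $w_n^k$: operation issued on lane k by instruction
    $w_n$
  • Example
  • $w_n$ = [ALU, NOP, NOP, NOP], $w_{n-1}$ = [LS, NOP,
    ALU, NOP]
  • $U(w_n \mid w_{n-1}) = U(0 \mid 0) + v(\mathrm{ALU} \mid \mathrm{LS}) + v(\mathrm{NOP} \mid \mathrm{ALU})$
  • Memory requirement
  • $O(K N^2)$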
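
The per-lane decomposition can be sketched as follows, using the slide's example of [ALU, NOP, NOP, NOP] issued after [LS, NOP, ALU, NOP]. The energy values are made up, lane pairs missing from the table (such as NOP after NOP) are assumed to cost nothing beyond the base, and the two table-size helpers simply compare N^(2K) against K * N^2.

```python
def lane_model_energy(w_n, w_prev, u00, v):
    """U(w_n | w_prev) = U(0|0) + sum over lanes of v(op_n | op_prev).

    v maps (current op, previous op) to an extra energy; pairs that
    are not listed, e.g. ("NOP", "NOP"), contribute zero.
    """
    return u00 + sum(v.get((curr, prev), 0.0) for curr, prev in zip(w_n, w_prev))


def table_entries_black_box(n_ops, k_lanes):
    """One cost per pair of complete long instructions."""
    return n_ops ** (2 * k_lanes)


def table_entries_lane_model(n_ops, k_lanes):
    """One cost per pair of operations, per lane."""
    return k_lanes * n_ops ** 2


# Slide example with hypothetical energy values.
w_n = ["ALU", "NOP", "NOP", "NOP"]
w_prev = ["LS", "NOP", "ALU", "NOP"]
v = {("ALU", "LS"): 2.0, ("NOP", "ALU"): 1.0}
u = lane_model_energy(w_n, w_prev, 10.0, v)
```

For 4 operation types on a 4-wide machine, the black-box table needs 65536 entries while the lane model needs 64.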

25
Modified Energy Model for VLIW
  • Cluster similar instructions based on cost
  • $T = \{e_1, e_2, \ldots, e_t\}$
  • $e_t$: energy consumption of instruction t
  • Partition T into K clusters $(C_1, C_2, \ldots, C_K)$
    s.t.
  • $\sum_j \sum_i (x_{i,j} - c_j)^2$ is minimum
  • Large number of clusters
  • Good accuracy
  • Huge number of experiments
  • Small number of clusters
  • Small number of experiments
  • High variance between clusters
  • Reduced accuracy
  • Memory requirement
  • $O(C N^2)$
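
One way to approximate such a partition is one-dimensional k-means (Lloyd's algorithm) over the per-instruction energies. The slides do not say which clustering algorithm was used, so this is only a sketch; it reads x as an energy assigned to a cluster and c as that cluster's mean, and the initialization and sample energies are arbitrary choices.

```python
def assign(data, centroids):
    """Put every energy value into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for e in data:
        nearest = min(range(len(centroids)), key=lambda j: (e - centroids[j]) ** 2)
        clusters[nearest].append(e)
    return clusters


def kmeans_1d(energies, k, iterations=50):
    """Lloyd's algorithm on scalar energies; returns (centroids, clusters)."""
    data = sorted(energies)
    # Deterministic start: k evenly spaced picks from the sorted data.
    centroids = [data[(2 * j + 1) * len(data) // (2 * k)] for j in range(k)]
    for _ in range(iterations):
        clusters = assign(data, centroids)
        updated = [
            sum(c) / len(c) if c else centroids[j]  # keep empty clusters in place
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(data, centroids)


def sse(centroids, clusters):
    """Sum of squared distances of each energy to its cluster centroid."""
    return sum((x - c) ** 2 for c, cl in zip(centroids, clusters) for x in cl)


# Hypothetical per-instruction energies with two obvious groups.
energies = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
cents2, clus2 = kmeans_1d(energies, 2)
cents1, clus1 = kmeans_1d(energies, 1)
```

Comparing `sse` for different cluster counts shows the trade-off on the slide: more clusters reduce the error but require more characterization experiments.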

26
Limitations of ILPA
  • Does not provide any insight on the causes of
    power consumption within the processor core
  • Does not account for the power consumed in the
    memory system, which is often dominant
  • To address the second limitation, power
    estimation frameworks which integrate processor
    and memory models are built around instruction
    set simulators

27
MicroArchitecture ILPA
  • Pipeline-aware instruction-level energy model
  • Divide the design into smaller architectural
    blocks
  • Usually the processor's pipeline stages
  • Fetch, Decode, RF, Execute, WB
  • $E(w_n \mid w_{n-1}) = \sum_s A_s(w_n \mid w_{n-1}) + I(w_n \mid w_{n-1})$
  • $A_s$: energy consumed per stage s when executing
    $w_n$ after $w_{n-1}$
  • $I(w_n \mid w_{n-1})$: inter-stage connection energy
    (pipeline registers + buses)
  • Provides better insight into power bottlenecks
  • Smoother energy behaviour than the black-box model
  • Requires a pipeline-structure-aware ISS
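
A small sketch of the per-stage decomposition: the energy of an instruction pair is the sum of the stage contributions plus the inter-stage connection energy, and the per-stage breakdown makes the dominant stage visible. The stage energies used here are invented for illustration.

```python
def pair_energy(stage_energy, interstage_energy):
    """Sum of per-stage energies plus inter-stage connection energy."""
    return sum(stage_energy.values()) + interstage_energy


def hottest_stage(stage_energy):
    """Name of the pipeline stage that contributes the most energy."""
    return max(stage_energy, key=stage_energy.get)


# Hypothetical per-stage energies for one instruction pair.
stage_energy = {
    "fetch": 2.0,
    "decode": 1.0,
    "rf": 1.5,
    "execute": 3.0,
    "wb": 0.5,
}
total = pair_energy(stage_energy, interstage_energy=1.0)
```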

28
Energy Models for Register File
  • Assume linear power behaviour for accesses across
    different ports
  • $P_{RF} = P_i + \frac{1}{T} \sum_n (E_{r,n} + E_{w,n})$
  • $E_{r,n} = \sum_i H(RR_{i,n}, RR_{i,n-1}) \cdot E_{rb}$
  • $E_{w,n} = \sum_i H(RW_{i,n}, old_{i,n}) \cdot E_{wb}$
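
A sketch of this register-file model under several assumptions: H is read as the Hamming distance, the first term is treated as an idle power, E_rb and E_wb as per-bit read and write energies, and T as the total time of the trace. The trace format and all values are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def read_energy(curr_ports, prev_ports, e_rb):
    """Bits toggled on each read port since the previous cycle, times e_rb."""
    return sum(hamming(c, p) for c, p in zip(curr_ports, prev_ports)) * e_rb


def write_energy(writes, e_wb):
    """Bits changed in the written registers, times e_wb.

    writes -- list of (new value, old value) pairs, one per write port
    """
    return sum(hamming(new, old) for new, old in writes) * e_wb


def rf_average_power(reads, writes, e_rb, e_wb, p_idle, total_time):
    """Idle power plus access energy averaged over the trace duration."""
    energy = 0.0
    for n in range(1, len(reads)):  # cycle 0 has no predecessor
        energy += read_energy(reads[n], reads[n - 1], e_rb)
        energy += write_energy(writes[n], e_wb)
    return p_idle + energy / total_time


# Three cycles, two read ports, at most one write per cycle.
reads = [(0b0000, 0b1111), (0b0011, 0b1111), (0b0011, 0b0000)]
writes = [[], [(0b1010, 0b0000)], []]
power = rf_average_power(reads, writes, 1.0, 2.0, 0.5, 2.0)
```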

29
Energy Model for Caches
  • Power consumption depends on mode of operation
    (read, write, idle)
  • Energy consumed in a given clock cycle is a
    function of the mode transition between the
    previous and current cycle.
  • Characterize energy as a function of state
    transitions (read-read, read-write, etc.).
  • For a given transition, the energy also depends
    on the transitions on the address lines.
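
The state-transition view can be sketched as a lookup keyed by (previous mode, current mode), plus a term proportional to the number of address lines that toggled. The linear address-line term, the table values, and the access format are assumptions of this example.

```python
def cache_energy(accesses, transition_energy, e_addr_bit):
    """Total energy over a sequence of (mode, address) cache accesses.

    transition_energy -- maps (previous mode, current mode) to energy
    e_addr_bit        -- extra energy per toggled address line
    """
    total = 0.0
    for (prev_mode, prev_addr), (mode, addr) in zip(accesses, accesses[1:]):
        total += transition_energy[(prev_mode, mode)]
        total += bin(prev_addr ^ addr).count("1") * e_addr_bit
    return total


# Hypothetical characterization and a three-cycle access sequence.
transition_energy = {("idle", "read"): 3.0, ("read", "write"): 4.0}
accesses = [("idle", 0b0000), ("read", 0b0101), ("write", 0b0100)]
total = cache_energy(accesses, transition_energy, 0.5)
```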

30
Thank You