Tortola: Addressing Tomorrow's Computing Challenges through Hardware/Software Symbiosis

1
Tortola: Addressing Tomorrow's Computing
Challenges through Hardware/Software Symbiosis
  • Kim Hazelwood
  • September 29, 2006

2
Modern Computing Challenges
  • Performance
  • Power: energy consumption, max instantaneous power, di/dt
  • Temperature: total heat output, hot spots
  • Reliability: neutron strikes, alpha particles, MTBF, design flaws
  • Approaches: circuit, microarchitecture, compiler
  • Constraint: fixed HW-SW interface (e.g., x86)

3
Typical Approaches
  • Optimize using SW or HW techniques in isolation
  • Performance
  • SW: Compile-time optimizations
  • HW: Architectural improvements, VLSI technology
  • Reliability: Code/data duplication (HW or SW)
  • Power & Temperature
  • HW control mechanisms
  • Profile and recompile cycle

4
Modern Design Constraints
  • Compilers: Compile once, run anywhere
  • Cannot ship MS Office for 1Q05 batch of
    Pentium-4 3GHz, > 1GB RAM, BrandX power supply,
    located at high altitudes
  • Microarchitecture: Limited window of application
    knowledge (past must predict the future)
  • VLSI: Guaranteed correctness, reliability
  • We currently must optimize for the common case
    (but must design for the worst case)

5
The Power of Virtualization
  • A HW-SW interface layer

[Figure: layered stack with labels SW Applications, Binary Modifier, HW]

6
Dynamic Binary Modification
  • Creates a modified code image at run time
  • Examples
  • Dynamo (HP)
  • DAISY/BOA (IBM)
  • CMS (Transmeta)
  • Mojo (Microsoft)
  • Strata (UVa)
  • Pin (Intel)

[Figure: diagram with labels EXE, Transform, Code Cache, Profile, Execute]

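The transform / code cache / profile / execute cycle named on this slide can be sketched with a toy interpreter. This is only an illustration under stated assumptions: the block format, the `HOT_THRESHOLD` value, and the "drop adds of zero" transformation are all hypothetical stand-ins, not how Dynamo, Pin, or any listed system works internally.

```python
from collections import Counter

# Hypothetical toy "binary": block id -> (list of (op, arg), next block id).
# A next block id of None means the program exits.
PROGRAM = {
    "entry": ([("add", 1), ("add", 0)], "loop"),
    "loop": ([("add", 2), ("add", 0), ("add", 3)], "check"),
    "check": ([("add", 0)], None),
}

HOT_THRESHOLD = 3  # assumed: re-transform a block on its 3rd execution


def translate(block_id, optimize=False):
    """Copy a block into the code cache, optionally dropping no-op adds."""
    ops, next_id = PROGRAM[block_id]
    if optimize:
        ops = [(op, arg) for op, arg in ops if not (op == "add" and arg == 0)]
    return list(ops), next_id


def run(entry, iterations):
    code_cache = {}      # block id -> translated block
    profile = Counter()  # block id -> execution count
    optimized = set()    # blocks that were re-transformed
    acc = 0
    for _ in range(iterations):
        block_id = entry
        while block_id is not None:
            if block_id not in code_cache:          # transform on first use
                code_cache[block_id] = translate(block_id)
            profile[block_id] += 1                  # profile
            if profile[block_id] == HOT_THRESHOLD:  # re-transform hot code
                code_cache[block_id] = translate(block_id, optimize=True)
                optimized.add(block_id)
            ops, next_id = code_cache[block_id]     # execute from the cache
            for _, arg in ops:
                acc += arg
            block_id = next_id
    return acc, profile, optimized
```

Calling `run("entry", 5)` executes the original program semantics throughout; only the cached copy of each hot block changes.
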
7
Dynamic Instrumentation Demo
  • Pin
  • Four architectures: IA32, EM64T, IPF, XScale
  • Four OSes: Linux, FreeBSD, MacOS, Windows
  • http://rogue.colorado.edu/pin/

8
Dynamic Optimization Demo
  • DynamoRIO
  • Windows and Linux for IA32
  • http://www.cag.lcs.mit.edu/dynamorio/

9
Dynamic Binary Modification
  • Creates a modified code image at run time
  • Always triggered by software events (until now)
  • Examples
  • Dynamo (HP)
  • DAISY/BOA (IBM)
  • CMS (Transmeta)
  • Mojo (Microsoft)
  • Strata (UVa)
  • Pin (Intel)

[Figure: diagram with labels EXE, Transform, Code Cache, Profile, Execute]

10
Tortola Symbiotic Optimization
  • Enable HW/SW Communication

[Figure: layered stack with labels SW Applications, Binary Modifier, HW]

11
Simulation Methodology
  • SimpleScalar 4.0 for x86
  • Wattch 1.02 power extensions
  • Pin dynamic instrumentation system (x86/Linux
    version)

[Figure: layered stack. SW Application: Benchmarks; Binary Modifier: Pin; HW: Wattch, SimpleScalar/x86]

12
Tortola Applications
  • Combine global program information with run-time
    feedback
  • System-specific power usage
  • Application-specific heat anomalies
  • Workload/input-specific performance optimization
  • Reduce hardware complexity
  • No more backwards compatibility warts
  • Fix bugs after shipment
  • Reduce time to market for new architectures
  • One such application: The di/dt problem

13
The Di/dt Problem
  • Voltage stability is important for reliability,
    performance
  • Low-power techniques have a negative side effect:
    current variation
  • Dips (undershoots) in supply voltage can cause
    incorrect values to be calculated or stored
  • Spikes (overshoots) in supply voltage can cause
    reliability problems

14
The Di/dt Problem
  • ITRS cites noise management as a Grand Challenge
    for the 5-10 year time frame
  • Several trends are aggravating the issue
  • Voltage is scaling down with technology
  • Current draw is increasing
  • Package impedance is not scaling as quickly
  • Aggressive clock gating causes large swings in
    processor current draw (di/dt)

15
Di/dt Solutions
  • Software: Compiler optimizations
  • MicroArch: Sensor/actuator mechanisms
  • Circuit-Level: Decoupling capacitors, more Vdd/Gnd pins on package

16
Sensor-Actuator Mechanisms
  • On-chip voltage sensors detect abnormally
    high/low voltage levels
  • On-chip actuator then attempts to quickly
    raise/lower the processor's current draw
  • Phantom firing: increases current (at the expense of power)
  • Resource throttling: reduces current (at the expense of performance)

17
Detecting Imminent Emergencies
[Figure: operating voltage range with marks at 1.05V, 1.03V, 1V, 0.97V, and 0.95V, and labels Control Threshold, Soft Emergency, and Hard Emergency]

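A minimal sketch of how a sensor reading might be classified and mapped to the two actuator actions named on the sensor-actuator slide. The mapping of the figure's voltage marks to levels (0.97V/1.03V as soft, 0.95V/1.05V as hard) is an assumption, since the transcript does not say which mark carries which label, and the separate control threshold is left out. The function names are hypothetical.

```python
# Assumed mapping of the figure's voltage marks to emergency levels.
SOFT_LOW, SOFT_HIGH = 0.97, 1.03
HARD_LOW, HARD_HIGH = 0.95, 1.05


def classify(voltage):
    """Label a supply-voltage sample as 'ok', 'soft', or 'hard'."""
    if voltage < HARD_LOW or voltage > HARD_HIGH:
        return "hard"
    if voltage < SOFT_LOW or voltage > SOFT_HIGH:
        return "soft"
    return "ok"


def actuate(voltage):
    """Pick a corrective action for a supply-voltage sample."""
    if voltage < SOFT_LOW:
        return "throttle"      # undershoot: reduce current draw
    if voltage > SOFT_HIGH:
        return "phantom_fire"  # overshoot: increase current draw
    return "none"


for sample in (1.00, 0.96, 1.04, 0.94):
    print(sample, classify(sample), actuate(sample))
```

Acting at the soft level rather than the hard level reflects the goal of heading off an emergency before stored or computed values are at risk; where exactly a real design triggers is not stated in the slides.
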
18
Targeting Mid-Frequency Di/dt
  • Problematic: wide current spike
  • Worst case: pulse at the resonant frequency

[Figure: two pairs of plots showing Processor Current (A) and Supply Voltage (V) against Time (Cycles). From Joseph et al., HPCA-9]

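Why a pulse train at the resonant frequency is the worst case can be illustrated with a second-order (series R-L feeding a decoupling capacitance) model of the power supply. This is a rough sketch, not the methodology of Joseph et al. or of this talk: the impedance, Q factor, current levels, and 50-cycle resonant period are all assumed values chosen only for illustration.

```python
import math


def simulate_swing(pulse_period, resonant_period=50, cycles=2000):
    """Return peak-to-peak supply voltage for a square-wave load current."""
    vdd, z0, q = 1.0, 0.001, 5.0        # assumed supply, impedance, Q factor
    w0 = 2 * math.pi / resonant_period  # resonant frequency in rad/cycle
    ind, cap, res = z0 / w0, 1 / (w0 * z0), z0 / q
    i_low, i_high = 10.0, 30.0          # assumed load current levels (A)
    i_l, v = i_low, vdd - res * i_low   # start at the low-current steady state
    samples = []
    for t in range(cycles):
        load = i_high if (t % pulse_period) < pulse_period // 2 else i_low
        i_l += (vdd - v - res * i_l) / ind  # inductor current, dt = 1 cycle
        v += (i_l - load) / cap             # die voltage across the decap
        if t >= cycles - 500:               # skip the start-up transient
            samples.append(v)
    return max(samples) - min(samples)


for period in (10, 50, 400):
    print(period, round(simulate_swing(period), 4))
```

In this model the same 20 A swing produces a much larger voltage swing when the pulse period matches the resonant period than when the pulses are much faster or much slower.
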
19
A Di/dt Stressmark
BEGIN_LOOP
    ldt f1, (4)
    divt f1, f2, f3
    divt f3, f2, f3
    stt f3, 8(4)
    ldq 7, 8(4)
    cmovne 31, 7, 3
    stq 3, (4)
    stq 3, (4)
    stq 3, (4)
    stq 3, (4)
    JUMP BEGIN_LOOP
  • But: the actuator engages every loop iteration,
    degrading performance
  • Why not correct the problem in the code?

[Figure annotations: Sequential = Low Current; Parallel = High Current]

20
Proposed Solution
  • Leverage our additional software layer to
    supplement existing solutions
  • Microarchitecture provides feedback to our
    software-based virtual layer

[Figure: Executable -> Binary Modifier (VL) -> Altered Executable on the SW side; Microprocessor with Sensor/Actuator extensions on the HW side]

21
Required Investigations
  • Characterizing emergencies
  • How often do we see di/dt emergency loops?
  • Communication between the microarchitecture and
    the virtual layer
  • What information should be passed to the virtual
    layer during an emergency?
  • Fixing di/dt via binary modification
  • Will existing techniques help?
  • New algorithms?

22
Static vs. Dynamic Instances
Data suggests modifying a few code sequences will
eliminate many voltage emergencies
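The static-versus-dynamic distinction can be made concrete with a small coverage calculation: count how many dynamic emergencies are attributable to the few most frequent static code locations. The trace below is made up purely for illustration and is not data from the talk; the function name is hypothetical.

```python
from collections import Counter


def coverage_by_static_site(emergency_pcs, top_k):
    """Fraction of dynamic emergencies covered by the top_k static sites."""
    if not emergency_pcs:
        return 0.0
    counts = Counter(emergency_pcs)
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(emergency_pcs)


# Hypothetical trace: program counters at which emergencies were observed.
trace = [0x400] * 70 + [0x480] * 20 + [0x500] * 5 + [0x520] * 3 + [0x5A0] * 2

print(len(set(trace)), "static sites,", len(trace), "dynamic emergencies")
print("top-2 coverage:", coverage_by_static_site(trace, 2))
```

If a real trace is skewed like this one, patching a handful of code sequences in the binary modifier would address most emergencies.
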
23
Possible Compiler Optimizations
  • Our goal is to
  • Smooth out current profile, or
  • Knock pulses off of the resonant frequency
  • Some existing options
  • Software pipelining, code motion, instruction
    padding

[Figure: Executable -> Binary Modifier (Apply Optimizations) -> Altered Executable; Microprocessor with Sensor/Actuator extensions]

24
Loop Unrolling & SW Pipelining
[Figure: current profile of a problematic loop, with blocks labeled A A B B]

25
Unrolling the Di/dt Stressmark
[Figure: diagram with high-current labels H, H1, H2 and low-current labels L, L1, L2]

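One way to see how rescheduling can knock pulses off the resonant frequency is to measure the current waveform's frequency component at resonance before and after a transformation. The sketch below is an interpretation, not the talk's algorithm: the 48-cycle resonant period, the current levels, and the two rearrangements (grouping phases as A A B B, or splitting them as H1 L1 H2 L2) are assumptions about what the figures depict.

```python
import cmath


def resonant_component(current, resonant_period):
    """Magnitude of the waveform's DFT term at the resonant frequency."""
    total = sum(c * cmath.exp(-2j * cmath.pi * t / resonant_period)
                for t, c in enumerate(current))
    return abs(total) / len(current)


def schedule(phases, repeats):
    """Build a per-cycle current trace from (level, cycles) phases."""
    one_iteration = [level for level, cycles in phases for _ in range(cycles)]
    return one_iteration * repeats


HIGH, LOW = 30.0, 10.0  # assumed current levels (A)
RESONANT_PERIOD = 48    # assumed, in cycles

original = schedule([(HIGH, 24), (LOW, 24)], repeats=10)     # H L
grouped = schedule([(HIGH, 48), (LOW, 48)], repeats=5)       # A A B B
split = schedule([(HIGH, 12), (LOW, 12)] * 2, repeats=10)    # H1 L1 H2 L2

for name, trace in (("original", original), ("grouped", grouped), ("split", split)):
    print(name, round(resonant_component(trace, RESONANT_PERIOD), 6))
```

All three traces do the same amount of high-current and low-current work over 480 cycles; only the original concentrates energy at the resonant frequency.
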
26
Summary
  • Symbiotic program optimization is a powerful
    approach
  • The di/dt problem is well suited for a symbiotic
    solution
  • The Tortola design can also target power
    reduction, temperature reduction, reliability,
    etc.
  • http://www.tortolaproject.com/