1
DynamoSim: A Trace-based Dynamically Compiled
Instruction Set Simulator
  • Wai Sum Mong, Jianwen Zhu
  • ECE, University of Toronto
  • Nov 8th, 2004

ICCAD 04
2
Outline
  • Instruction Set Simulation Techniques
  • Our Strategies
  • Experiments
  • Conclusion

3
Interpretive Simulation
  • Pro: simple
  • Con: slow
  • Software decoding is slow and redundant
  • Each target instruction is simulated by multiple host instructions

E.g. SimpleScalar (Wisconsin)
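
To make the "software decoding is slow and redundant" point concrete, here is a minimal sketch of an interpretive simulator loop. The 16-bit instruction encoding and the opcode values are invented for this illustration; they are not SimpleScalar's instruction set.

```python
# Toy ISA, assumed for illustration only:
# opcode in bits 12-15, rd in bits 8-11, rs in bits 4-7, rt in bits 0-3.
OP_ADD, OP_MUL, OP_HALT = 0x1, 0x2, 0xF

def decode(word):
    """Software decoding: repeated every time an instruction is executed."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF

def interpret(program, regs):
    """Fetch, decode and execute one target instruction per loop iteration."""
    pc = 0
    while True:
        op, rd, rs, rt = decode(program[pc])  # redone even for code seen before
        if op == OP_ADD:
            regs[rd] = regs[rs] + regs[rt]
        elif op == OP_MUL:
            regs[rd] = regs[rs] * regs[rt]
        elif op == OP_HALT:
            return regs
        else:
            raise ValueError(f"unknown opcode {op:#x} at pc={pc}")
        pc += 1

# r3 = r1 + r2; r3 = r3 * r2; halt
program = [0x1312, 0x2332, 0xF000]
result = interpret(program, [0, 2, 5, 0])
```

Each simulated instruction costs a decode call plus a chain of comparisons on the host, which is the overhead the compiled approaches on the next slides try to avoid.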
4
Static Compiled Simulation
  • Pro: avoids expensive and redundant decoding
  • Con: cannot handle dynamic program code

E.g. Compiled simulator for DSP architectures
(Zivojnovic95); ultra-fast instruction set
simulator (Zhu99)
5
Dynamic Compiled Simulation
  • Translation
  • Native code dynamically compiled from a chunk of
    target instructions
  • Caches translation for reuse

[Figure: PC lookup in the translation cache, the simulation compiler, and the translation cache holding translated code]
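
The lookup-then-translate flow can be sketched roughly as follows. This is only an illustration: Python's `exec` stands in for native code generation, and the tuple instruction format, `translate`, and `run_chunk` are hypothetical names, not part of DynamoSim.

```python
def translate(chunk):
    """Turn a chunk of (op, rd, rs, rt) target instructions into one host function."""
    symbols = {"add": "+", "mul": "*"}
    lines = ["def translated(regs):"]
    for op, rd, rs, rt in chunk:
        lines.append(f"    regs[{rd}] = regs[{rs}] {symbols[op]} regs[{rt}]")
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)  # stand-in for emitting native host code
    return namespace["translated"]

translation_cache = {}  # start PC -> translated host function
compile_count = 0       # how many times the simulation compiler actually ran

def run_chunk(pc, chunk, regs):
    """Look up the PC in the translation cache; translate only on a miss."""
    global compile_count
    fn = translation_cache.get(pc)
    if fn is None:
        fn = translate(chunk)
        translation_cache[pc] = fn
        compile_count += 1
    return fn(regs)

chunk = [("add", 3, 1, 2), ("mul", 3, 3, 2)]
first = run_chunk(0x400, chunk, [0, 2, 5, 0])   # miss: translate, then run
second = run_chunk(0x400, chunk, [0, 1, 1, 0])  # hit: reuse cached translation
```

The second call pays no decoding or translation cost, which is where the benefit of caching translations comes from.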
6
Dynamic Compiled Simulation (Cont'd)
  • Examples: Shade (Cmelik94), Embra (Witchel96)
  • Other solutions
  • JIT-CCS (Nohl02): caches decoded information
  • IS-CS (Reshadi03): only decodes dynamic code at runtime
  • Pro: flexibility of interpretive simulation
  • Pro: performance approaches static compilation if
    translations are repeatedly reused

We propose three techniques to improve dynamic
compiled simulation
7
Our Strategies
  • Selective Compilation
  • Applying dynamic compilation only on selected
    parts of program code
  • Widening Translation Scope
  • Extending compilation region beyond the scope of
    basic blocks
  • Register Allocation
  • Mapping target registers directly to host
    registers in translations

8
Selective Compilation: Why?
  • Dynamic compilation: high compilation overhead, low variable cost;
    lower average cost if the translation is reused
  • Interpretation: high variable cost; the cheaper approach for
    infrequently used instructions

Solution: Interpretation + Compilation
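
One way to make this trade-off concrete is a simple break-even model. The symbols are assumptions introduced here, not taken from the slides: $C_{\text{comp}}$ is the one-time cost of compiling a piece of code, $c_{\text{exec}}$ the cost of running its translation once, $c_{\text{int}}$ the cost of interpreting it once, and $n$ the number of times it is executed.

```latex
\[
n \, c_{\text{int}} > C_{\text{comp}} + n \, c_{\text{exec}}
\quad\Longleftrightarrow\quad
n > \frac{C_{\text{comp}}}{c_{\text{int}} - c_{\text{exec}}},
\qquad \text{assuming } c_{\text{int}} > c_{\text{exec}} .
\]
```

With made-up numbers $C_{\text{comp}} = 1000$, $c_{\text{int}} = 50$ and $c_{\text{exec}} = 10$, compilation pays off only for code executed more than $1000 / 40 = 25$ times; anything colder is cheaper to interpret.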
9
Selective Compilation: How?
  • Interpret: observe program behavior; switch to compile when a
    profitable code portion is detected; switch to execute when a
    translation is found
  • Compile: compile the profitable code, cache the translation, then
    switch to execute
  • Execute: execute a translation
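
The three modes can be read as a small state machine. The sketch below is one possible reading; the slide does not say where control goes after Execute, so returning to Interpret when a translation exits is an assumption, and `next_mode` is a hypothetical helper.

```python
from enum import Enum, auto

class Mode(Enum):
    INTERPRET = auto()
    COMPILE = auto()
    EXECUTE = auto()

def next_mode(mode, translation_found=False, profitable=False):
    """Return the simulator's next mode given what the current mode observed."""
    if mode is Mode.INTERPRET:
        if translation_found:   # a cached translation exists for this PC
            return Mode.EXECUTE
        if profitable:          # a profitable code portion was detected
            return Mode.COMPILE
        return Mode.INTERPRET
    if mode is Mode.COMPILE:    # translation was just compiled and cached
        return Mode.EXECUTE
    # Mode.EXECUTE: assumed to fall back to the interpreter on exit
    return Mode.INTERPRET
```

Checking for an existing translation before checking profitability avoids recompiling code that is already in the cache.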

10
Widening Translation Scope: Why?
  • Less overhead from prologues and epilogues
  • Less traffic between interpreter and executor
  • Fewer cache lookup operations
  • More instruction-level parallelism

[Figure: translations and the simulator]
11
Widening Translation Scope - Trace
  • Trace: a sequence of dynamically executed instructions that spans
    multiple basic blocks, with a single entry and multiple exits
  • Hot trace: a frequently executed path

[Figure: a trace over basic blocks A, B, C, D]
12
Trace-based Selective Compilation
  • Interpret: watch for a hot trace; switch when a hot trace is
    identified
  • Interpret + Compile: work out the trace, interpreting each
    instruction as it is compiled; then switch back to interpret

How can a hot trace be identified from partial execution?
13
Hot Trace Prediction Methodology - Dynamo
  • Hot trace prediction using partial execution
    profile
  • Dynamo (PLDI 2000, HP Labs)
  • A transparent dynamic optimization system
  • Interprets and optimizes native code in software
    at runtime
  • This is NOT a simulator!
  • Methodology
  • Interprets application in user-mode
  • Watches for a hot trace
  • Optimizes the hot trace and caches for reuse

Predicting hot traces from interpretation
14
Adapting Dynamo's Hot Trace Prediction Method to
DynamoSim
  • Identifies hot traces in loops
  • Associates a counter with the start of a potential
    hot trace
  • Increments the counter each time the
    start-of-trace condition is satisfied
  • If counter > threshold, marks the address as the
    start of a hot trace
  • Compilation engine starts
  • Interprets and compiles until the end-of-trace
    condition is satisfied
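
The counter scheme above might look roughly like this. It is a sketch under assumptions: the class and method names are hypothetical, only the "target of a taken backward branch" start-of-trace condition is modelled, and the default threshold of 3 is the value reported in the experiment setup.

```python
from collections import defaultdict

HOT_THRESHOLD = 3  # threshold value reported in the experiment setup

class HotTraceDetector:
    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counters = defaultdict(int)  # start-of-trace address -> count

    def observe_branch(self, branch_pc, target_pc, taken):
        """Return True when target_pc should be marked as the start of a hot trace."""
        if not taken or target_pc > branch_pc:
            return False  # only targets of taken backward branches are counted
        self.counters[target_pc] += 1
        return self.counters[target_pc] > self.threshold

detector = HotTraceDetector()
# A loop whose back edge at 0x120 jumps to its header at 0x100.
hits = [detector.observe_branch(0x120, 0x100, taken=True) for _ in range(4)]
```

With "counter > threshold" and a threshold of 3, the loop header is flagged on the fourth trip around the back edge.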

15
Dynamo's Hot Trace Prediction Method: Hot Trace
Definition
  • Start-of-trace candidates (each gets a counter)
  • Targets of taken backward branches: potential loop headers
  • Exits of previously identified hot traces: statistically likely
    that the subsequent trace is hot
  • End-of-trace conditions (the compiler stops)
  • Backward taken branches: signal the end of a loop
  • Taken branches whose target address hits the translation cache:
    potential start of a translation; a lookup for every instruction
    would be expensive

[Figure: basic blocks A, B, C, D]
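
The two end-of-trace conditions can be sketched as a trace recorder. Everything here is illustrative: `step` is a hypothetical interpreter hook that returns `(next_pc, taken_branch)` for a given PC, and `max_len` is a safety bound added for this sketch rather than something stated in the slides.

```python
def record_trace(start_pc, step, translation_cache, max_len=64):
    """Collect PCs from start_pc until an end-of-trace condition holds."""
    trace = []
    pc = start_pc
    while len(trace) < max_len:
        trace.append(pc)
        next_pc, taken = step(pc)
        if taken and next_pc <= pc:
            break  # backward taken branch: signals the end of a loop
        if taken and next_pc in translation_cache:
            break  # target already translated; cache is probed only on taken branches
        pc = next_pc
    return trace

# Toy control flow: 0 -> 1 -> 2, forward branch 2 -> 5, back edge 5 -> 0.
edges = {0: (1, False), 1: (2, False), 2: (5, True), 5: (0, True)}
loop_trace = record_trace(0, lambda pc: edges[pc], translation_cache={})
cut_trace = record_trace(0, lambda pc: edges[pc], translation_cache={5: "cached"})
```

Probing the cache only at taken branches keeps the per-instruction cost of trace formation low, which matches the slide's remark that a lookup for each instruction would be expensive.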
16
Register Allocation: Why?
  • Pro: reduces memory traffic
  • Pro: reduces translation size
  • Con: there may not be enough usable host registers for
    each target register
17
Register Allocation: How?
  • Trace-based
  • Host registers are lazily allocated during
    compilation
  • Allocated host registers are not released until
    the translation ends
  • If no host register can be allocated, use a
    scratch register (a reserved host register that
    provides a temporary mapping)
  • Commit values in dirty registers back to the
    simulated registers at the end of the translation
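
A minimal sketch of this allocation policy, emitting pseudo host instructions as strings. The class, its method names, the host register names and the `+` operator in the usage example are assumptions made for illustration, not DynamoSim's actual code generator.

```python
class TraceRegisterAllocator:
    def __init__(self, host_regs, scratch_regs):
        self.free = list(host_regs)        # allocatable host registers
        self.scratch = list(scratch_regs)  # reserved; two needed for two sources
        self.mapping = {}                  # target register -> host register
        self.dirty = set()                 # mapped target registers that were written
        self.code = []                     # emitted pseudo host instructions

    def _allocate(self, target):
        """Lazily map target to a free host register; None if none is left."""
        if target not in self.mapping and self.free:
            self.mapping[target] = self.free.pop(0)
        return self.mapping.get(target)

    def _read(self, target, scratch):
        if target in self.mapping:         # already live in a host register
            return self.mapping[target]
        host = self._allocate(target) or scratch  # fall back to temporary mapping
        self.code.append(f"load simRegs[{target}] to {host}")
        return host

    def emit(self, dst, op, src_a, src_b):
        a = self._read(src_a, self.scratch[0])
        b = self._read(src_b, self.scratch[1])
        d = self._allocate(dst)
        if d is None:                      # no mapping: write through scratch now
            d = self.scratch[0]
            self.code.append(f"{d} = {a} {op} {b}")
            self.code.append(f"store {d} into simRegs[{dst}]")
        else:
            self.code.append(f"{d} = {a} {op} {b}")
            self.dirty.add(dst)            # committed at the end of the translation

    def finish(self):
        """Commit dirty host registers back to the simulated register file."""
        for target in sorted(self.dirty):
            self.code.append(f"store {self.mapping[target]} into simRegs[{target}]")
        return self.code

ra = TraceRegisterAllocator(["h2", "h3", "h4"], ["h0", "h1"])
ra.emit(3, "+", 1, 2)   # t3 = t1 + t2 (operator assumed)
ra.emit(3, "*", 3, 2)   # t3 = t3 * t2
code = ra.finish()
```

In this example the intermediate value of t3 stays in a host register between the two operations, so the store and reload through `simRegs[3]` disappear and only one store remains at the end.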

18
Register Allocation: How?
Host registers: h0, h1, h2, h3, h4 (including scratch registers)
Allocation table: t1 Ø; t1 t2
Target code: t3 = t1 (op) t2; t3 = t3 * t2
Translation:
  load simRegs[1] to h3
  load simRegs[2] to h4
  h2 = h3 (op) h4
  store h2 into simRegs[3]
  load simRegs[3] to h0
  h2 = h0 * h4
  store h2 into simRegs[3]
[Figure: translation with RA]
19
Experiment Setup
  • SimpleScalar
  • Translation cache: 4-way associative cache
  • Trace prediction: threshold of 3
  • Dynamic compilation: VCODE (Engler95), a fast dynamic code
    generation system
  • SPEC2000 testbenches
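
For readers unfamiliar with the term, a 4-way associative translation cache could be organised along these lines. The slides give only the associativity; the number of sets, the index function (which assumes 4-byte instructions) and the LRU replacement policy below are assumptions of this sketch.

```python
class SetAssociativeCache:
    def __init__(self, num_sets=256, ways=4):
        self.num_sets = num_sets
        self.ways = ways
        # Each set is a list of (pc, translation), least recently used first.
        self.sets = [[] for _ in range(num_sets)]

    def _index(self, pc):
        return (pc >> 2) % self.num_sets  # assumes 4-byte aligned instructions

    def lookup(self, pc):
        entries = self.sets[self._index(pc)]
        for i, (tag, translation) in enumerate(entries):
            if tag == pc:
                entries.append(entries.pop(i))  # mark as most recently used
                return translation
        return None

    def insert(self, pc, translation):
        entries = self.sets[self._index(pc)]
        entries[:] = [e for e in entries if e[0] != pc]  # replace an existing entry
        if len(entries) >= self.ways:
            entries.pop(0)                               # evict least recently used
        entries.append((pc, translation))

cache = SetAssociativeCache(num_sets=2, ways=4)
for pc in (0, 8, 16, 24):        # all four map to set 0
    cache.insert(pc, f"t{pc}")
cache.lookup(0)                  # touch pc 0 so that pc 8 becomes the LRU entry
cache.insert(32, "t32")          # fifth entry for set 0 evicts pc 8
```

Only the four entries of one set are compared per lookup, so the cost of a probe stays bounded regardless of how many translations are cached.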

20
Experimental Result
  • Compiled: traditional dynamic compiled simulation
  • Hybrid: dynamic compiled + selective compilation + trace-based
  • Hybrid RA: dynamic compiled + selective compilation + trace-based
    + register allocation
21
Experimental Result: Statistics of 181.mcf
22
Conclusions
  • We propose 3 techniques to improve dynamic
    compiled instruction-set simulation
  • Selective compilation: interpretation + compilation
  • Widening translation scope: extends from basic block to trace
  • Register allocation: maps target registers directly to host
    registers
  • Our experimental results show that the proposed
    techniques are effective