1
Fully Dynamic Specialization
  • AJ Shankar
  • OSQ Lunch
  • 9 December 2003

2
That's Why They Play the Game
  • Programs are executed because we can't determine
    their behavior statically!
  • Idea: Optimize programs dynamically to take
    advantage of runtime information we can't get
    statically
  • Look at portions of the program for predictable
    inputs that we can optimize for

3
Specialization
  • Recompile portions of the program, using known
    runtime values as constants
  • Possibly many variants of the same code
  • Allow for fallback to original code when
    assumptions are not met
  • Predictable recurrent

4
How It Works
  • Choose a good region of code to specialize after
    a good predictable instruction
  • Insert dispatch that checks the result of the
    chosen instruction
  • Recompile code for different results of the
    instruction
  • During execution, jump to appropriate specialized
    code

[Diagram: LOAD pc produces X; Dispatch(X) branches to
Spec1, Spec2, or Default, followed by Rest of Code.]

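
The dispatch step described on this slide can be sketched roughly as follows. This is a minimal Python illustration, not the presentation's implementation: `make_dispatch`, `scale`, and the `SPECIALIZED` table are hypothetical names, and the "recompiled" variants are written by hand rather than generated at run time.

```python
def make_dispatch(generic, specialized):
    """Build a dispatcher that jumps to specialized code for hot trigger values.

    generic:     fallback function taking (x, arg)
    specialized: dict mapping a hot value of x to a one-argument variant
    """
    def dispatch(x, arg):
        variant = specialized.get(x)
        if variant is None:
            # Assumption not met: fall back to the original code.
            return generic(x, arg)
        return variant(arg)
    return dispatch


def scale(x, arg):
    # Original (unspecialized) code: x is only known at run time.
    return x * arg + x


# Hypothetical variants with x folded in as a constant.
SPECIALIZED = {
    2: lambda arg: 2 * arg + 2,
    10: lambda arg: 10 * arg + 10,
}

scale_dispatch = make_dispatch(scale, SPECIALIZED)
```

Calling `scale_dispatch(2, 5)` takes the variant for the hot value 2, while `scale_dispatch(7, 3)` falls through to the default code.
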
5
Tying Things Together
  • If Foo is specialized on X
  • And because of X, Y is constant
  • And Foo calls Bar with param Y
  • And Bar is specialized on Y
  • Foo can jump straight to that specialized version
    of Bar

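[Diagram: Method Foo and Method Bar, each with a Dispatch
and specialized variants (Spec_X, Spec_Y, Spec_Z); the
call Bar(Y) in Foo links directly to a specialized
variant of Bar.]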


6
When Is This a Good Idea?
  • Any app whose execution is heavily dependent on
    input
  • For instance:
  • Interpreters
  • Raytracers
  • Dynamic content producers (CGI scripts, etc.)

7
Specialization Is Hard!
  • Specializing code at runtime is costly
  • Can even slow the program down
  • Existing specializers rely on static annotations
    to clue them in about profitable areas
  • Difficult to get right
  • Limits specialization potential

8
Existing: DyC, Cyclone, etc.
  • Explicitly annotate static data
  • No support for automatic specialization of
    frequently-executed code
  • Could compile lots of useless stuff
  • No concrete store information
  • Doesn't take advantage of the fact that memory
    location X is constant for the lifetime of the
    program

9
Existing: Calpa
  • Mock et al., 2000. Extension to DyC.
  • Profile execution on sample input to derive
    annotations
  • But converting a concrete profile to an abstract
    annotation means:
  • Still unable to detect concrete memory constants
  • Frequently executed code for arbitrary input?
  • Still needs source, is offline!

10
Motivating Example: Interpreter

    while (1) {
        i = instrs[pc];
        switch (i.opcode) {
        case ADD:
            env[i.res] = env[i.op1] + env[i.op2];
            pc++;
            break;
        case BNEQ:
            if (env[i.op1] != 0)
                pc = env[i.op2];
            else pc++;
            break;
        ...
        }
    }

Sample interpreted program:

    X = 10
    WHILE (Z != 0)
        Y = X + Z

  • X is constant after initialization
  • concrete memory location
  • Y = X + Z executed frequently
11
Motivating Example: Interpreter

    while (1) {
        i = instrs[pc];
        switch (i.opcode) {
        case ADD:
            env[i.res] = env[i.op1] + env[i.op2];
            pc++;
            break;
        case BNEQ:
            if (env[i.op1] != 0)
                pc = env[i.op2];
            else pc++;
            break;
        ...
        }
    }

Sample interpreted program:

    X = 10
    WHILE (Z != 0)
        Y = X + Z

Specialized interpreter loop:

    while (1) {
        while (pc == 15) {
            // Y = X + Z
            env[3] = 10 + env[2];
            // Z != 0 ?
            if (env[2] == 0) pc = 19;
        }
        // else: normal loop
    }

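
To make the comparison concrete, here is a small runnable Python sketch of a generic interpreter loop next to a hand-specialized one. It rests on several assumptions that are not in the slides: the `Instr` encoding, the `HALT` opcode, a literal jump target in `op2`, and a three-instruction program that (unlike the slide's sample) decrements Z so that it terminates. The hot value is pc == 0 here rather than pc == 15.

```python
from collections import namedtuple

# Hypothetical encoding: ADD res, op1, op2 and BNEQ op1, target (op2 holds the target).
Instr = namedtuple("Instr", "opcode res op1 op2")
ADD, BNEQ, HALT = "ADD", "BNEQ", "HALT"
X, Y, Z, M = 0, 1, 2, 3   # indices into env


def interpret(instrs, env, pc=0):
    """Generic interpreter loop, following the structure on the slide."""
    while True:
        i = instrs[pc]
        if i.opcode == ADD:
            env[i.res] = env[i.op1] + env[i.op2]
            pc += 1
        elif i.opcode == BNEQ:
            pc = i.op2 if env[i.op1] != 0 else pc + 1
        else:  # HALT
            return env


def interpret_specialized(instrs, env, pc=0):
    """Hand-written variant for the hot value pc == 0.

    Assumes instrs, env[X] == 10 and env[M] == -1 are never written again.
    """
    while pc == 0:
        env[Y] = 10 + env[Y]             # ADD Y, X, Y with env[X] folded to 10
        env[Z] = env[Z] - 1              # ADD Z, Z, M with env[M] folded to -1
        pc = 0 if env[Z] != 0 else 3     # BNEQ Z, 0
    return interpret(instrs, env, pc)    # any other pc: normal loop


PROGRAM = [
    Instr(ADD, Y, X, Y),            # pc 0: Y = X + Y
    Instr(ADD, Z, Z, M),            # pc 1: Z = Z + M
    Instr(BNEQ, None, Z, 0),        # pc 2: if Z != 0 goto 0
    Instr(HALT, None, None, None),  # pc 3
]
```

Both functions produce the same final environment for the same input; the specialized one does so without fetching or decoding any instruction inside the hot loop.
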
12
A More Concrete Approach
  • Do everything at runtime!
  • Specialize on execution-time hot values
  • Know which concrete memory locations are constant
  • Other benefits of this approach:
  • Specialize temporally, as execution progresses
  • Specialize dynamically loaded libraries as well
  • No annotations or source code necessary

13
A Quick Recap
  • Choose a good region of code to specialize
  • Insert dispatch that checks the result of the
    chosen instruction (the trigger)
  • Recompile code for different values of a hot
    instruction
  • During execution, jump to appropriate specialized
    code

[Diagram: LOAD pc produces X; Dispatch(X) branches to
Spec1, Spec2, or Default, followed by Rest of Code. For
the interpreter, Dispatch(pc) branches to variants for
pc = 15 and pc = 27, or to the default while(1) loop.]

14
The Details
  • Need to identify the best predictable instruction
  • Specializing on its result should provide the
    greatest benefit
  • To find it, gather profile information about all
    instructions
  • Need to actually do the specializing

15
Instrumentation: Hot Values
  • What's a hot value? One that occurs frequently as
    the result of an instruction
  • x % 2 has two very hot values, 0 and 1
  • Good candidate instructions are predictable:
    result in (only) a few hot values
  • For instance, small_constant_table[x], but not
    rand(x)
  • Case study: Interpreter
  • Predictable instructions: LOAD pc, instr.opcode
  • instr = instrs[pc]
  • switch(instr.opcode)
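
One way to picture hot-value instrumentation is a per-instruction counter of results. The Python sketch below is only an illustration: the `ValueProfile` class, its method names, and the "top-k coverage" measure of predictability are assumptions, not something the slides define.

```python
from collections import Counter, defaultdict


class ValueProfile:
    """Counts the results produced by each instrumented instruction."""

    def __init__(self):
        self.counts = defaultdict(Counter)   # instruction id -> Counter of results

    def record(self, instr_id, value):
        self.counts[instr_id][value] += 1
        return value                         # lets record() wrap an expression

    def hot_values(self, instr_id, k=2):
        return [v for v, _ in self.counts[instr_id].most_common(k)]

    def predictability(self, instr_id, k=2):
        """Fraction of executions covered by the k most frequent results."""
        counter = self.counts[instr_id]
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return sum(n for _, n in counter.most_common(k)) / total


profile = ValueProfile()
for x in range(100):
    profile.record("mod2", x % 2)               # only two results ever occur
    profile.record("spread", (x * 37) % 101)    # a different result every time
```

Here the instruction labelled "mod2" is fully covered by its two hot values, while "spread" behaves like the slide's rand(x) and is a poor candidate.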

16
Instrumentation: Store Profile
  • Keep track of memory locations that have been
    written to
  • Idea: if a location hasn't been written to yet,
    it probably won't be later, either
  • Case study: Interpreter
  • Store profile says env[Y] written to a lot, but
    env[X], instrs never written to
  • regs[instr.res] = regs[instr.op1] +
    regs[instr.op2]
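
A store profile can be as simple as a table of write counts per location. The following Python sketch assumes locations are identified by hashable keys such as `("env", "Y")` and that profiling starts after initialization; `StoreProfile` and its methods are hypothetical names.

```python
class StoreProfile:
    """Records which memory locations have been written since profiling began."""

    def __init__(self):
        self.write_counts = {}

    def record_store(self, location):
        self.write_counts[location] = self.write_counts.get(location, 0) + 1

    def probably_constant(self, location):
        # Heuristic from the slide: never written so far, so probably not later either.
        return location not in self.write_counts


store_profile = StoreProfile()
for _ in range(1000):
    store_profile.record_store(("env", "Y"))   # Y = X + Z runs on every iteration
```

With this profile, `("env", "X")` is reported as probably constant and `("env", "Y")` is not. The guess can be wrong, which is why specialized code needs an invalidation mechanism.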

17
Invalidating Specialized Code
  • Memory locations may not really be constant
  • When constant memory is overwritten, must
    invalidate or modify specializations that
    depended on it
  • How does Calpa handle invalidation?
  • Computes points-to set
  • Inserts invalidation calls at all appropriate
    points (offline)
  • Too costly an approach, without modification

18
Invalidation Options
    class Interpreter {
        private Instruction[] instrs;
        void SetInstrs(Instruction[] is) {
            instrs = is;
        }
    }

  • Write barrier
  • Still feasible if field is private
  • On-entry checks
  • Feasible if specialization depends on a small
    number of memory locations
  • e.g. Factor(BigInt x)
  • Hardware support
  • e.g. Mondrian
  • Ideal solution
  • Possible to simulate?

[Diagram: Hot Instruction, then CheckMem, which leads to
Dispatch (Spec1 or Default) or to Invalidate.]

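
The write-barrier option can be sketched in Python along the lines of the `Interpreter` class on this slide. Everything beyond the private field and its setter is an assumption made for illustration: the `_specialized` cache, `specialize`, and `lookup` are hypothetical, and "recompilation" is reduced to storing a tuple.

```python
class Interpreter:
    """Write-barrier sketch: every store to the private field drops dependent code."""

    def __init__(self, instrs):
        self._instrs = list(instrs)
        self._specialized = {}     # hot pc value -> specialized code (here: a tuple)

    def specialize(self, pc):
        # Stand-in for recompilation: capture the instruction as a constant.
        self._specialized[pc] = ("specialized for", self._instrs[pc])

    def set_instrs(self, instrs):
        # Write barrier: the only path that modifies _instrs also invalidates.
        self._instrs = list(instrs)
        self._specialized.clear()

    def lookup(self, pc):
        return self._specialized.get(pc)   # None means "use the default code"


interp = Interpreter(["ADD", "BNEQ"])
interp.specialize(0)
before = interp.lookup(0)
interp.set_instrs(["BNEQ"])
after = interp.lookup(0)
```

Because the field is private, the setter is the only writer, so a single barrier is enough. That is the property the "still feasible if field is private" bullet relies on.
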
19
Specialization Procedure
  • Recap: We know which instructions are good
    candidates, what their hot values are, and what
    parts of memory are likely to be invariant
  • Want to compile different versions of the same
    block of code relative to a chosen trigger
    instruction
  • Each version is keyed on a hot value of that
    instruction
  • What instruction, if any, should be a basis for
    specialization?

20
Specialization Algorithm
  • Find good candidate instructions
  • Predictable
  • Frequently executed
  • For each candidate instruction
  • Simultaneously evaluate method using constant
    propagation for some of its hot values
  • Compute overall cost/benefit
  • Choose the best instruction

21
Algorithm Pseudo-code
    foreach (value v in hot values)
        worklist.push(<start node, v>)
    previously_emitted = <unspecialized nodes, default state>
    while (<n, s> = pop worklist)
        <n', s'> = evaluate(<n, s>)  // uses store information, fixes jumps
        foreach (n'' in succ(n'))
            // have we already seen this node/state pair before?
            prev_instr = previously_emitted[<n'', s'>]
            if (prev_instr)  // if so, link to it
                n'.modify_jump_to(n'' -> prev_instr)
            else  // otherwise, keep evaluating
                worklist.push(<n'', s'>)
        instr = emit_instruction(n')
        // remember this pair in case we see it again
        previously_emitted[<n', s'>] = instr
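
A runnable approximation of this worklist loop, in Python, might look like the sketch below. It is an adaptation under stated assumptions rather than the algorithm as presented: the three-opcode node format, `evaluate`, and `specialize` are hypothetical, emitted code is keyed on the incoming (node, state) pair, and jump targets are resolved in a final pass instead of by `modify_jump_to`. It has no widening, so it only terminates when the set of reachable states is finite.

```python
def evaluate(node, state):
    """Constant-propagate one node. Returns (text, new_state, successors)."""
    op = node[0]
    known = dict(state)
    if op == "add":                     # ("add", dst, src, const, next)
        _, dst, src, const, nxt = node
        if src in known:
            known[dst] = known[src] + const
            text = f"{dst} = {known[dst]}"
        else:
            known.pop(dst, None)
            text = f"{dst} = {src} + {const}"
        return text, frozenset(known.items()), [nxt]
    if op == "branch_eq":               # ("branch_eq", var, const, if_true, if_false)
        _, var, const, if_true, if_false = node
        if var in known:                # branch decided at specialization time
            target = if_true if known[var] == const else if_false
            return "goto", state, [target]
        return f"if {var} == {const}", state, [if_true, if_false]
    return "ret", state, []             # ("ret",)


def specialize(cfg, start, hot_values, var):
    emitted = {}                        # (node id, incoming state) -> label
    code = []                           # (label, text, [(node id, state), ...])
    worklist = [(start, frozenset({(var, v)})) for v in hot_values]
    while worklist:
        n, s = worklist.pop()
        if (n, s) in emitted:           # seen this node/state pair: link, don't re-emit
            continue
        label = len(code)
        emitted[(n, s)] = label
        text, s2, succs = evaluate(cfg[n], s)
        code.append((label, text, [(m, s2) for m in succs]))
        worklist.extend((m, s2) for m in succs)
    # Resolve jump targets to labels now that every reachable pair is emitted.
    return [(label, text, [emitted[t] for t in targets])
            for label, text, targets in code]


CFG = {
    "loop": ("add", "x", "x", 1, "test"),
    "test": ("branch_eq", "x", 3, "done", "loop"),
    "done": ("ret",),
}
unrolled = specialize(CFG, "loop", [1], "x")
```

Specializing on the hot value x = 1 unrolls the loop completely: both branches are decided at specialization time, so the emitted code contains only constant assignments and unconditional jumps.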

22
Specializing the Interpreter

    while (1) {
        i = instrs[pc];
        switch (i.opcode) {
        case ADD:
            env[i.res] = env[i.op1] + env[i.op2];
            pc++;
            break;
        case BNEQ:
            if (env[i.op1] != 0)
                pc = env[i.op2];
            else pc++;
            break;
        ...
        }
    }

Candidates:
  • instr.opcode: executed very frequently; a small
    handful of values
  • pc: executed very frequently; more values, but
    still reasonable

23
Specializing on instr.opcode

Dispatch(opcode), evaluated with the known state i.opcode = ADD:

    Original code                         Specialized code                      Benefit
    LOOP: i = instrs[pc]
    switch (i.opcode)                     switch (ADD)                          1
    case ADD:                             case ADD:                             2
    env[i.res] = env[i.op1] + env[i.op2]  env[i.res] = env[i.op1] + env[i.op2]
    pc = pc + 1                           pc = pc + 1
    goto LOOP                             goto LOOP                             3
    LOOP: i = instrs[pc]

Other values of opcode have similar results

24
Specializing on pc

Interpreted statement: Y = X + Z

Dispatch(pc), evaluated starting from the known state pc = 15:

    Known state                Original code                         Specialized code      Benefit
    pc = 15                    LOOP: i = instrs[pc]                  LOOP: i = instrs[15]  1
    pc = 15, i = ADD Y, X, Z   switch (i.opcode)                     switch (ADD)          2
    pc = 15, i = ADD Y, X, Z   case ADD:                             case ADD:             3
    pc = 15, i = ADD Y, X, Z   env[i.res] = env[i.op1] + env[i.op2]  env[Y] = 10 + env[Z]  6
    pc = 15, i = ADD Y, X, Z   pc = pc + 1                           pc = 15 + 1           7
    pc = 16, i = ADD Y, X, Z   goto LOOP                             LOOP: i = instrs[16]  8
    pc = 16, i = BNEQ Z, 15                                          switch (BNEQ)         9
    pc = 16, i = BNEQ Z, 15                                          if (env[Z] != 0)      10
    pc = 16, i = BNEQ Z, 15                                          pc

25
Final Result
  • Choose to specialize on pc because benefit is far
    greater than for instr.opcode
  • Generate different versions for each of the
    hottest values of pc
  • Terminate loop unrolling either naturally (when
    we don't know what pc is anymore) or with a
    simple heuristic

26
Heuristics
  • Algorithm may not terminate when unrolling loops
  • Simple heuristic: widen variables when we've seen
    the same node, say, 10 times (or use frequency
    statistics)
  • Algorithm may generate lots of code
  • Need to only look at parts of state that matter
  • Widen somewhere
  • Other issues: Algorithm may be slow
  • Need better way to prune off bad candidates

27
Implementation Ideas
  • Use Dynamo
  • Hot trace as basis for specialization
  • Intuitively, follow the lifetime of an object as
    it travels through the program across function
    boundaries
  • Unfortunately, closed-source, and API isn't
    expressive enough

28
Implementation Ideas
  • JikesRVM
  • Java VM written in Java
  • Has a primitive framework for sampling
  • Has a fairly sophisticated framework for dynamic
    recompilation
  • Does aggressive inlining
  • Only instrument hot traces (but compiler is slow)