Procedure Optimizations - PowerPoint PPT Presentation

1
  • Procedure Optimizations
  • Timo Rasi

2
What's a Procedure Optimization?
  • Optimizations applying to whole procedures instead
    of procedure contents.
  • Tail Call: f() calls g() and returns directly
    after the call, enabling transformation of the
    call into a branch.
  • Tail Recursion: Special case of tail call where
    f() = g(), enabling transformation of the call
    into a loop.
  • Procedure Integration: Replacement of a procedure
    call with the procedure contents.
  • In-Line Expansion: Hand-crafted assembly language
    Procedure Integration.
  • Leaf Routine Optimization: Elimination of
    unnecessary procedure prologue and epilogue code
    from leaf procedures.
  • Shrink Wrapping: Deferred procedure prologue and
    epilogue code.

3
Procedure Body Lookup Schemes
  • Depending on the case, the optimizer needs to
    have access to the whole callee contents or at
    least the stack frame size.
  • The easy part: Same compilation unit.
  • Other ways:
      • Saved intermediate code,
      • Link time,
      • Manual labor: Bundling of several disjoint
        source code files into one.

4
Tail Call Optimization
  • Tail call optimization replaces a procedure call
    with a branch, and the callee then returns on
    behalf of the caller.
  • Tail call optimization cannot be done in source
    code form: a branch from one procedure into
    another would violate almost any (?) high-level
    language semantics.
  • Caller's stack frame is larger than the callee's:
    the callee's procedure epilogue deallocates the
    caller's whole stack frame.
  • Caller's stack frame is smaller than the callee's:
    before entering the callee, either allocate the
    remainder or release the stack frame totally and
    use the standard procedure prologue.
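
As a minimal sketch of what qualifies as a tail call, the C fragment below contrasts a call in tail position with one that is not. The function names (g, f_tail, f_not_tail) are hypothetical and chosen only for illustration; whether a given compiler actually turns the first call into a branch depends on the compiler and its options.

```c
/* Hypothetical callee used by both callers below. */
static int g(int x)
{
    return x * 2;
}

/* Tail call: f_tail has nothing left to do once g returns, so a
 * compiler may replace the call with a branch and let g return
 * directly to f_tail's caller. */
static int f_tail(int x)
{
    return g(x + 1);
}

/* Not a tail call: the addition runs after g returns, so f_not_tail
 * still needs its own return address and frame during the call. */
static int f_not_tail(int x)
{
    return g(x) + 1;
}
```

Calling f_tail(3) and f_not_tail(3) yields 8 and 7 respectively; the results are the same with or without the optimization, only the call sequence differs.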

5
Tail Call and Tail Recursion Book Example, a Curious One
6
Tail Call Book Example, a Buggy One
7
HP PA-RISC aCC Tail Call Optimization
  • Despite experimentation, the author was not able
    to make tail call optimization manifest in
    practice.
  • The compiler insisted on adhering to the PA-RISC
    procedure call convention:
      • Caller: Branch to a procedure and save the
        return address to a link register.
      • Callee: Optionally, save the return address
        into the stack frame.
      • Callee: Optionally, save callee-save registers.
      • Callee: Execute procedure body.
      • Callee: Optionally, restore callee-save
        registers.
      • Callee: Optionally, restore the return address.
      • Callee: Branch back to caller via the link
        register.

8
Tail Recursion Optimization
  • Tail Recursion optimization transforms a
    recursive call into a loop.
  • The procedure call is replaced by parameter
    renaming, a branch and deletion of a return.
  • Can usually be done relatively simply in a
    high-level code form.
  • A blind procedure call offers little to be
    optimized, but tail recursion optimization opens
    doors to other optimizations.
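
The transformation can be sketched at source level in C. The example below uses Euclid's algorithm, which is an assumption made for illustration and not the book example from the slides; gcd_rec and gcd_loop are hypothetical names.

```c
/* Tail-recursive form: the recursive call is the last action. */
static unsigned gcd_rec(unsigned a, unsigned b)
{
    if (b == 0)
        return a;
    return gcd_rec(b, a % b);
}

/* The same procedure after tail recursion elimination, written out
 * by hand: the call becomes parameter renaming plus a branch, and
 * the return that followed the call disappears. */
static unsigned gcd_loop(unsigned a, unsigned b)
{
top:
    if (b == 0)
        return a;
    {
        unsigned t = a % b;   /* evaluate the new arguments first */
        a = b;                /* parameter renaming */
        b = t;
    }
    goto top;                 /* branch replaces call and return */
}
```

Both versions compute the same value, for example 6 for the arguments 48 and 18. Once the recursion is a loop, ordinary loop optimizations can be applied to it.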

9
Tail Recursion Book Example
10
HP PA-RISC aCC Tail Recursion Optimization
  • The compiler did a really good job of optimizing
    a void function.
  • Surprise: The compiler refused to optimize even a
    trivial leaf int function. Is there a good reason
    for this? There was: the last operation in the
    example function was an addition instead of a
    call.
  • Loop unrolling did not manifest.
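
The pitfall described above can be reproduced with a small C sketch. The functions below are hypothetical and are not the author's test code: sum_to ends in an addition, so its recursive call is not in tail position, while sum_acc carries the partial result in an extra accumulator parameter so that the call becomes the last operation.

```c
/* Not tail recursive: the addition happens after the recursive call
 * returns, so the call cannot simply become a branch. */
static int sum_to(int n)
{
    if (n == 0)
        return 0;
    return n + sum_to(n - 1);
}

/* Tail recursive: the addition is done before the call and passed
 * along in the (hypothetical) accumulator parameter acc. */
static int sum_acc(int n, int acc)
{
    if (n == 0)
        return acc;
    return sum_acc(n - 1, acc + n);
}
```

sum_to(10) and sum_acc(10, 0) both return 55. Some compilers perform this accumulator rewrite themselves, but that is not something to rely on.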

11
What is Inlining? What is Inlining Not?
  • Yes: Substitution of a procedure call with the
    procedure body.
  • No: Definition of a procedure body simultaneously
    with or separately from its declaration in a C++
    header file, although that is equivalent to the
    inline keyword.
  • No: Inline C assembly.

12
What's Procedure Integration Good For?
  • Eliminates procedure call overhead.
  • Produces larger basic blocks.
  • Enables other optimizations, for example Constant
    Propagation, Loop Unrolling etc.
  • Eliminates branch pipeline penalty.
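
A minimal C sketch of how integration enables constant propagation is shown below. The functions scale, triple and triple_integrated are hypothetical, and the last one is a hand-written picture of what an optimizer could arrive at, not the output of any particular compiler.

```c
/* Small helper with a branch that depends on its second argument. */
static inline int scale(int x, int factor)
{
    if (factor == 0)
        return 0;
    return x * factor;
}

/* Call site with a constant argument. */
static int triple(int x)
{
    return scale(x, 3);
}

/* After integrating scale into triple, factor is known to be 3:
 * the test against 0 folds away and only the multiply remains. */
static int triple_integrated(int x)
{
    return x * 3;
}
```

triple(7) and triple_integrated(7) both return 21; the integrated form has no call, no branch and a single basic block.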

13
What's Procedure Integration Not So Good For?
  • Protests revolve around code bloat and caches.
  • The maintenance scheme may play a role:
      • Shared library delivery: no-no.
      • Recompilation of a subproduct: maybe.
      • Recompilation of the whole product: OK.
  • Bad compilers might generate bad code.
  • Procedures with local state may generate
    surprises.
  • Uninlined procedures may end up as a static copy
    in every compilation unit, although the linker is
    supposed to remove duplicates and/or dead code.
  • Integration of a procedure is not a property of
    the procedure itself but a property of all call
    sites within an executable after the applied
    optimizations.
  • Some considerations: Procedure size, number of
    calls to the procedure, resulting loop(s),
    number of constant arguments.
  • Very likely the compiler and runtime profiling
    tools know this stuff better than humans do.

14
HP PA-RISC aCC Procedure Integration
  • Elaborate scheme: rough rules and finer control,
    implicit inlining, run-time inline adviser
    profiler.
  • Keen to inline, even when explicitly instructed
    not to.
  • Effective for short C++ methods, all the more so
    given the processor's inability to predict
    procedure return branches.
  • The compiler did a really good job of loop
    unrolling and procedure integration. One test
    demonstrated a 60x improvement over a naive call
    into a shared library.
  • One experiment: compilation of 350,000 lines of
    legacy C from a single file with aggressive
    inlining and elaborate optimizations -> 46,000
    inlined functions, 18 hours CPU, 5.5 days
    wall-clock time.

15
In-Line Expansion
  • In essence, a hand-written assembly language code
    sequence.
  • Used for operations that the compiler cannot do:
      • Operations outside the mainstream code
        generation scheme,
      • Difficult processor-specific optimizations,
      • Operating system-specific operations,
      • Augmentation of missing language features.
  • The compiler has no idea about the contents of
    asm(statement). Effects need to be communicated
    somehow, or the compiler, for example, takes the
    safest bet and throws out all optimizations.
  • Microsoft has compiler intrinsics compiling
    directly into MMX instructions. Those could be
    regarded as compiler-assisted in-line expansion
    operations.
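
One way such effects can be communicated is the operand and clobber lists of GNU extended asm. The sketch below assumes a GNU-compatible compiler such as GCC or Clang; this syntax is not ISO C and is not the HP aCC syntax. The function name opaque is hypothetical, and the asm template is deliberately empty so that the fragment stays independent of the target processor.

```c
/* GNU extension (assumed: GCC or Clang). The empty template emits
 * no instruction, but the lists tell the compiler what a real asm
 * sequence placed here would touch:
 *   "+r"(x)   x is both read and written, in a register
 *   "memory"  memory may be read or written arbitrarily
 * With this information the compiler only has to be careful about
 * x and cached memory values instead of giving up on optimization
 * altogether. */
static int opaque(int x)
{
    __asm__ volatile("" : "+r"(x) : : "memory");
    return x;
}
```

Because the template is empty, opaque(5) still returns 5; the compiler merely has to assume that x could have been changed.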

16
HP PA-RISC aCC In-Line Expansion
  • Not much to tell in this case either:
  • Warning 669 "expansion.cc", line 10 The asm
    declaration is ignored.
  • asm("nop")

17
Leaf Routine Optimization
  • Lowering the overhead of a leaf procedure call.
  • Highly desirable with little effort.
  • Are there enough leaf routines in the first
    place? Yes, for example a binary tree has more
    leaves than non-leaves.
  • Removal of procedure prologue and epilogue code
    associated with preparation for a subprocedure
    call.
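
The counting argument can be checked with a small C sketch. The procedure walk and its counters are hypothetical: walk makes two further calls whenever depth is above zero, so its activations form a full binary call tree, and the counters record how many activations make no further call. It illustrates only the tree argument, not the optimization itself.

```c
/* Counters for activations that call nothing further (leaves of
 * the call tree) and for activations that do (inner nodes). */
static unsigned long leaf_calls;
static unsigned long inner_calls;

/* Every activation with depth > 0 makes exactly two further calls,
 * so the call tree is a full binary tree of the given depth. */
static void walk(int depth)
{
    if (depth == 0) {
        leaf_calls++;
        return;
    }
    inner_calls++;
    walk(depth - 1);
    walk(depth - 1);
}
```

After walk(4), starting from zeroed counters, leaf_calls is 16 and inner_calls is 15: a full binary tree always has one more leaf than it has inner nodes.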

18
HP PA-RISC aCC Compiler Leaf Routine Optimization
  • A lightweight leaf routine call was one original
    PA-RISC design goal due to the simple hardwired
    RISC instructions.
  • A millicode call such as multiply only requires a
    branch-and-link instruction and utilizes
    caller-save registers.
  • Separate schemes for millicode and high-level
    language leaf routines.
  • Actions a leaf routine can usually or always do
    without:
      • Saving of the return address register into the
        stack frame,
      • Allocation of a stack frame,
      • Saving and restoring of scratch registers
        (input parameter registers are also eligible
        as scratch registers),
      • Deallocation of the stack frame,
      • Restoration of the return address register.

19
Shrink Wrapping
  • In essence: Lazy stack frame construction.
  • Moving of prologue and epilogue code to enclose
    only minimal appropriate code segments.
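
Shrink wrapping happens in generated code, so it can only be hinted at in source form. The C sketch below uses a hypothetical function, find_index, and marks in comments where a shrink-wrapping compiler could place the save and restore code; whether a given compiler does so is an assumption.

```c
#include <stddef.h>

/* Hypothetical example: index of key in table, or -1 if absent. */
static int find_index(const int *table, int n, int key)
{
    int i;

    /* Early exit: needs no callee-save registers, so with shrink
     * wrapping this path pays for no saves and no restores. */
    if (table == NULL || n <= 0)
        return -1;

    /* A shrink-wrapping compiler could place the register saves
     * here, just before the only region that needs the registers. */
    for (i = 0; i < n; i++) {
        if (table[i] == key)
            break;
    }
    /* ... and the matching restores here, just after that region. */

    return i < n ? i : -1;
}
```

Without shrink wrapping, the saves would sit at the very top of the procedure and would be executed even when the early exit is taken.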

20
Shrink Wrapping Data Flow Analysis
  • A register is anticipatable at a point in a
    flowgraph if all execution paths from that point
    contain definitions or uses of it. In other words,
    the register is being taken into use.
  • A register is available at a point in a
    flowgraph if all execution paths to that point
    include definitions or uses of the register. In
    other words, the register is released from use.
  • Main idea: Register saving code is inserted where
    the register is anticipatable and register
    restoring code is inserted where the register is
    available.
  • In other words: Saving code is inserted just
    before the first use and restore code is inserted
    just after the final use.
  • Legend for the example:
      • RANTin(i): Registers anticipatable on entry
        to block i.
      • RANTout(i): Registers anticipatable on exit
        from block i.
      • RAVin(i): Registers available on entry to
        block i.
      • RAVout(i): Registers available on exit from
        block i.
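
The definitions above can be turned into a small iterative analysis. The C sketch below is an illustration under stated assumptions: the formula slides did not survive in this transcript, so the set equations in the comments are one common textbook formulation and may differ in detail from the ones presented. The four-block diamond flowgraph, the two registers and all identifiers are hypothetical.

```c
enum { NBLOCKS = 4, MAXEDGE = 2 };
#define ALL_REGS 0x3u            /* bit 0 = r0, bit 1 = r1 */

typedef struct {
    unsigned use;                /* registers defined or used here */
    int npred, pred[MAXEDGE];
    int nsucc, succ[MAXEDGE];
} Block;

/* Diamond: B0 -> B1, B0 -> B2, B1 -> B3, B2 -> B3.
 * r0 is used in B0 and B3, r1 only in B1. */
static const Block cfg[NBLOCKS] = {
    { 0x1u, 0, {0, 0}, 2, {1, 2} },
    { 0x2u, 1, {0, 0}, 1, {3, 0} },
    { 0x0u, 1, {0, 0}, 1, {3, 0} },
    { 0x1u, 2, {1, 2}, 0, {0, 0} },
};

static unsigned rant_in[NBLOCKS], rant_out[NBLOCKS];
static unsigned rav_in[NBLOCKS], rav_out[NBLOCKS];
static unsigned save_at[NBLOCKS], restore_at[NBLOCKS];

static void shrink_wrap(void)
{
    int i, k, changed;

    for (i = 0; i < NBLOCKS; i++)
        rant_in[i] = rant_out[i] = rav_in[i] = rav_out[i] = ALL_REGS;

    do {
        changed = 0;
        /* Backward: RANTout(i) = intersection of RANTin(successors),
         * empty at exit blocks; RANTin(i) = use(i) | RANTout(i). */
        for (i = NBLOCKS - 1; i >= 0; i--) {
            unsigned out = cfg[i].nsucc ? ALL_REGS : 0u;
            for (k = 0; k < cfg[i].nsucc; k++)
                out &= rant_in[cfg[i].succ[k]];
            unsigned in = cfg[i].use | out;
            if (out != rant_out[i] || in != rant_in[i])
                changed = 1;
            rant_out[i] = out;
            rant_in[i] = in;
        }
        /* Forward: RAVin(i) = intersection of RAVout(predecessors),
         * empty at the entry; RAVout(i) = use(i) | RAVin(i). */
        for (i = 0; i < NBLOCKS; i++) {
            unsigned in = cfg[i].npred ? ALL_REGS : 0u;
            for (k = 0; k < cfg[i].npred; k++)
                in &= rav_out[cfg[i].pred[k]];
            unsigned out = cfg[i].use | in;
            if (in != rav_in[i] || out != rav_out[i])
                changed = 1;
            rav_in[i] = in;
            rav_out[i] = out;
        }
    } while (changed);

    for (i = 0; i < NBLOCKS; i++) {
        unsigned pred_not_ant = ALL_REGS, succ_not_av = ALL_REGS;
        for (k = 0; k < cfg[i].npred; k++)
            pred_not_ant &= ~rant_in[cfg[i].pred[k]];
        for (k = 0; k < cfg[i].nsucc; k++)
            succ_not_av &= ~rav_out[cfg[i].succ[k]];
        /* Save on entry where a register becomes anticipatable. */
        save_at[i] = rant_in[i] & ~rav_in[i] & pred_not_ant;
        /* Restore on exit where it stops being anticipatable. */
        restore_at[i] = rav_out[i] & ~rant_out[i] & succ_not_av;
    }
}
```

After shrink_wrap() runs on this flowgraph, r0 is saved on entry to B0 and restored on exit from B3, because every path uses it, while r1 is saved and restored around B1 only, so the path through B2 pays nothing for it.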

21
Shrink Wrapping Formulas, Theory
22
Shrink Wrapping Example
23
Shrink Wrapping Formulas, Practice