Pro64: Performance Compilers For IA64

1
Pro64: Performance Compilers For IA-64
  • Jim Dehnert
  • Principal Engineer
  • 5 June 2000

2
Outline
  • IA-64 Features
  • Organization and infrastructure
  • Components and technology
  • Where we are going
  • Opportunities for cooperation

3
IA-64 Features
  • It is all about parallelism
    • at the process/thread level for the programmer
    • at the instruction level for the compiler
  • Explicit parallel instruction semantics
  • Predication and Control/Data Speculation
  • Massive Resources (registers, memory)
  • Register stack and its engine
  • Software pipelining support
  • Memory hierarchy management support

4
Structure
  • Logical compilation model
  • Base compilation model
  • IPA compilation model
  • DSO structure

5
Logical Compilation Model

6
Base Compilation Model

7
IPA Compilation Model

8
DSO Structure

9
Intermediate Representation
  • IR is called WHIRL
  • Common interface between components
  • Multiple languages and multiple targets
  • Same IR, 5 levels of representation
  • Continuous lowering as compilation progresses
  • Optimization strategy tied to level

10
Components
  • Front ends
  • Interprocedural analysis and optimization
  • Loop nest optimization and parallelization
  • Global optimization
  • Code generation

11
Front ends
  • C front end based on gcc
  • C++ front end based on g++
  • Fortran90/95 front end

12
IPA
  • Two stage implementation
  • Local: gather local information at end of front
    end process
  • Main: analysis and optimization

13
IPA Main Stage
  • Two phases in main stage
  • Analysis
    • PIC symbol analysis
    • Constant global identification
    • Scalar mod/ref
    • Array section
    • Code layout for locality
  • Optimization
    • Inlining
    • Intrinsic function library inlining
    • Cloning for constants, locality
    • Dead function, variable elimination
    • Constant propagation
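
The "dead function elimination" item can be pictured as a reachability pass over the call graph. The following is a minimal sketch only: the `call_graph` dictionary, the function names in it, and the helpers `reachable_functions` and `dead_functions` are all hypothetical and are not Pro64 data structures or APIs.

```python
from collections import deque

def reachable_functions(call_graph, roots):
    """Breadth-first walk of the call graph starting from the root functions."""
    seen = set(roots)
    worklist = deque(roots)
    while worklist:
        caller = worklist.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                worklist.append(callee)
    return seen

def dead_functions(call_graph, roots):
    """Functions that appear in the graph but are never reached from a root."""
    all_funcs = set(call_graph)
    for callees in call_graph.values():
        all_funcs.update(callees)
    return all_funcs - reachable_functions(call_graph, roots)

# Hypothetical whole-program call graph: caller -> list of callees.
call_graph = {
    "main": ["parse", "solve"],
    "parse": ["read_token"],
    "solve": ["dot"],
    "debug_dump": ["dot"],
    "old_solver": ["debug_dump"],
}

print(sorted(dead_functions(call_graph, ["main"])))
```

In a real whole-program analysis the root set would also have to include anything that can be called from outside the analyzed code, such as exported symbols and functions whose address is taken; the sketch assumes `main` is the only root.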

14
IPA Engineering
  • User transparent
    • Additional command line option (-ipa)
    • Object files (*.o) contain WHIRL
    • IPA in ld invokes backend
  • Integrated into compiler
    • Provides information to loop nest optimizer,
      global optimizer, and code generator
    • Not disabled by normal .o or DSO object
    • Can analyze DSO objects

15
Loop Nest Optimizer/Parallelizer
  • All languages
  • Loop level dependence analysis
  • Uniprocessor loop level transformations
  • OpenMP
  • Automatic parallelization

16
Loop Level Transformations
  • Based on unified cost model
  • Heuristics integrated with software pipelining
  • Fission
  • Fusion
  • Unroll and jam
  • Loop interchange
  • Peeling
  • Tiling
  • Vector data prefetching
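
As an illustration of two of the transformations listed above (loop interchange and tiling), here is a minimal sketch on a matrix multiply. It is written in Python for readability and only shows that the transformed loop nest computes the same result; the function names and the tile size are assumptions for the example, and nothing here reflects how the Pro64 loop nest optimizer chooses or applies transformations.

```python
def matmul_naive(a, b, n):
    """Reference i-j-k loop nest."""
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c

def matmul_tiled(a, b, n, tile):
    """Same computation after interchange (i-k-j order) and tiling.

    The three outer loops step over tile origins; the three inner loops
    work inside one tile, so each block of a, b and c is reused while it
    is still likely to be in cache.
    """
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i][k]  # invariant in the innermost loop
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c

n = 5
a = [[i + j for j in range(n)] for i in range(n)]
b = [[i * j + 1 for j in range(n)] for i in range(n)]
print(matmul_tiled(a, b, n, tile=2) == matmul_naive(a, b, n))
```

The `min(...)` bounds handle the case where the tile size does not divide the matrix size evenly, which is the same cleanup problem that peeling addresses in a compiled loop nest.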

17
Parallelization
  • Automatic
    • Array privatization
    • Doacross parallelization
    • Array section analysis
  • Directive based
    • OpenMP
    • Integrated with automatic methods

18
Global optimization
  • Static Single Assignment is unifying technology
  • Industrial strength SSA
    • All traditional optimizations implemented
    • SSA-preserving transformations
    • Deals with aliasing and calls
    • Uniformly handles indirect loads/stores
  • Benefits over bit vector techniques
    • More efficient setup and use
    • More natural algorithms => robustness
    • Allows selective transformation

19
Code Generation
  • Inner loops
    • IF conversion
    • Software pipelining
    • Recurrence breaking
    • Predication and rotating registers
  • Elsewhere
    • Hyperblock formation
    • Frequency based block reordering
    • Global code motion
    • Peephole optimization
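
IF conversion replaces a branch with predicated instructions, so both arms of a conditional become straight-line code guarded by complementary predicates. The sketch below shows the idea on instruction strings in an IA-64-like notation. The helper `if_convert`, the predicate register names, and the instruction text are assumptions for illustration, not output of the Pro64 code generator.

```python
def if_convert(cond, then_ops, else_ops, p_true="p1", p_false="p2"):
    """Turn an if/else diamond into one predicated instruction sequence.

    cond is (relation, lhs, rhs). A single compare sets two complementary
    predicates; then-arm ops are guarded by p_true, else-arm ops by p_false.
    """
    relation, lhs, rhs = cond
    out = [f"cmp.{relation} {p_true}, {p_false} = {lhs}, {rhs}"]
    out += [f"({p_true}) {op}" for op in then_ops]
    out += [f"({p_false}) {op}" for op in else_ops]
    return out

# if (r1 == r2) r3 = r4 + r5; else r3 = r4 - r5;
for line in if_convert(("eq", "r1", "r2"),
                       ["add r3 = r4, r5"],
                       ["sub r3 = r4, r5"]):
    print(line)
```

After conversion the loop body has no internal branch, which is what allows it to be treated as a single block for software pipelining.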

20
Technology
  • Target description tables (targ_info)
  • Feedback
  • Parallelization
  • Static Single Assignment
  • Software pipelining
  • Global code motion

21
Target description tables
  • Isolate machine attributes from compiler code
  • Resources: functional units, buses
  • Literals: sizes, ranges, excluded bits
  • Registers: classes, supported types
  • Instructions: opcodes, operands, attributes,
    scheduling, assembly, object code
  • Scheduling: resources, latencies
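
The point of a table-driven target description is that compiler code asks the table questions instead of hard-coding machine facts. A minimal sketch of that separation follows. The class names, the unit classes, and every number in `TOY_TARGET` are invented for the example; they are not the targ_info format and not real IA-64 resource counts or latencies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OpInfo:
    unit: str      # functional unit class the opcode needs
    latency: int   # cycles until the result is available

@dataclass(frozen=True)
class TargetInfo:
    name: str
    units: dict    # unit class -> number available per cycle
    ops: dict      # opcode -> OpInfo

    def latency(self, opcode):
        return self.ops[opcode].latency

    def can_issue_together(self, opcodes):
        """True if the opcodes fit in one cycle's functional units."""
        needed = {}
        for opcode in opcodes:
            unit = self.ops[opcode].unit
            needed[unit] = needed.get(unit, 0) + 1
        return all(count <= self.units[unit] for unit, count in needed.items())

# Made-up machine description, for illustration only.
TOY_TARGET = TargetInfo(
    name="toy",
    units={"M": 2, "I": 2, "F": 1},
    ops={"ld": OpInfo("M", 2), "add": OpInfo("I", 1), "fma": OpInfo("F", 4)},
)

print(TOY_TARGET.can_issue_together(["ld", "ld", "add"]))
print(TOY_TARGET.can_issue_together(["fma", "fma"]))
```

Retargeting then means supplying a different table; the scheduler code that calls `latency` and `can_issue_together` stays unchanged.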

22
Feedback
  • Used throughout the compiler
  • Instrumentation can be added at any stage
  • Explicit instrumentation data incorporated where
    inserted
  • Instrumentation data maintained and checked for
    consistency through program transformations
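
The slides do not say what consistency checks are applied to feedback data. One plausible check, sketched here purely as an assumption, is flow conservation on edge counts: for every block other than entry and exit, the counts entering the block should equal the counts leaving it. The function name and the edge-count dictionary layout are hypothetical.

```python
def inconsistent_blocks(edge_counts, entry, exit_block):
    """Return blocks whose incoming and outgoing edge counts disagree."""
    inflow, outflow = {}, {}
    for (src, dst), count in edge_counts.items():
        outflow[src] = outflow.get(src, 0) + count
        inflow[dst] = inflow.get(dst, 0) + count
    blocks = (set(inflow) | set(outflow)) - {entry, exit_block}
    return sorted(b for b in blocks if inflow.get(b, 0) != outflow.get(b, 0))

# Hypothetical profile of a loop entered once and iterated ten times.
profile = {
    ("entry", "loop"): 1,
    ("loop", "loop"): 9,
    ("loop", "exit"): 1,
}
print(inconsistent_blocks(profile, "entry", "exit"))
```

A transformation that duplicates or merges blocks has to redistribute counts so that a check of this kind still passes afterwards.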

23
SSA Advantages
  • Built-in use-def edges
  • Sparse representation of data flow information
  • Sparse data flow propagation based on SSA graph
  • Linear or near-linear algorithms
  • Every optimization is global
  • Transform one construct at a time, customize to
    context
  • Handle second order effects
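
To make "built-in use-def edges" concrete: once every name has exactly one definition, looking up a name is the use-def edge, and facts such as constant values can be attached to names rather than recomputed per program point. The sketch below handles straight-line code only, so it needs no phi-functions; the statement format, the `_N` naming scheme, and both helper functions are assumptions for the example, not the Pro64 optimizer's representation.

```python
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def to_ssa(stmts):
    """Rename straight-line (dest, op, args) statements into SSA form.

    Variables read before any definition get version 0 (incoming values).
    """
    version = {}
    ssa = []
    for dest, op, args in stmts:
        new_args = tuple(
            f"{a}_{version.get(a, 0)}" if isinstance(a, str) else a
            for a in args
        )
        version[dest] = version.get(dest, 0) + 1
        ssa.append((f"{dest}_{version[dest]}", op, new_args))
    return ssa

def fold_constants(ssa):
    """Map each SSA name with a known constant value to that value."""
    consts = {}
    for dest, op, args in ssa:
        vals = [consts.get(a) if isinstance(a, str) else a for a in args]
        if op == "const":
            consts[dest] = vals[0]
        elif all(v is not None for v in vals):
            consts[dest] = OPS[op](*vals)
    return consts

program = [
    ("x", "const", (2,)),
    ("y", "add", ("x", 3)),
    ("x", "mul", ("y", "n")),   # n is an unknown input
    ("z", "add", ("y", "y")),
]
ssa = to_ssa(program)
print(ssa)
print(fold_constants(ssa))
```

Because `x` is split into `x_1` and `x_2`, the constant value of the first definition is not lost when `x` is later reassigned from an unknown input.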

24
SSA as IR for optimizer
  • SSA constructed only once at set-up time
  • Use-def info inherently part of SSA
  • Use only optimization algorithms that preserve
    SSA form
  • Transformations do not invalidate SSA info
  • Full set of SSA-preserving algorithms
  • No SSA construction overhead between phases
  • Can arbitrarily repeat a phase for newly
    exposed optimization opportunities
  • Extended to uniformly handle indirect memory
    references

25
Software Pipelining
  • Technology evolved from Cydra compilers
  • Powerful preliminary loop processing
  • Effective minimization of loop overhead code
  • Highly efficient backtracking for scheduling
  • Integrated register allocation, interface with CG
  • Integrated with LNO loop nest transformations
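
A software pipeliner starts a new loop iteration every II cycles, and the textbook lower bound on II (from the modulo scheduling literature, not from these slides) is the larger of a resource bound and a recurrence bound. The sketch below computes that bound. The function names, the unit classes, and the example numbers are assumptions; the slides do not state how Pro64 computes its starting II.

```python
from math import ceil

def res_mii(op_units, unit_counts):
    """Resource bound: busiest unit class, uses divided by units available."""
    uses = {}
    for unit in op_units:
        uses[unit] = uses.get(unit, 0) + 1
    return max(ceil(n / unit_counts[u]) for u, n in uses.items())

def rec_mii(recurrences):
    """Recurrence bound over dependence cycles given as (latency, distance).

    latency is the total delay around the cycle, distance the number of
    iterations the cycle spans. No recurrences means no constraint (0).
    """
    return max((ceil(lat / dist) for lat, dist in recurrences), default=0)

def min_initiation_interval(op_units, unit_counts, recurrences):
    return max(res_mii(op_units, unit_counts), rec_mii(recurrences))

# Hypothetical loop body: three memory ops, one FP op, one integer op,
# on a made-up machine with two units of each class.
body = ["M", "M", "M", "F", "I"]
machine = {"M": 2, "F": 2, "I": 2}

print(min_initiation_interval(body, machine, recurrences=[]))
print(min_initiation_interval(body, machine, recurrences=[(4, 1)]))
```

In the second call the recurrence dominates, which is the situation the "recurrence breaking" step in code generation is meant to improve.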

26
Global Code Motion
  • Moves instructions between basic blocks
  • Purpose: balance resources, improve critical
    paths
  • Uses program structure to guide motion
  • Uses feedback or estimated frequency to
    prioritize motion
  • No artificial barriers, no exclusively-optimized
    paths

27
Where are we going?
  • Open source compiler suite
  • Target description for IA-64
  • Available via usual Linux distributions and
    www.oss.sgi.com
  • Beta version in June
  • MR version when Intel ships systems
  • OpenMP for C/C++ (later)
  • OpenMP extensions for NUMA (later)

28
Areas for collaboration
  • Target descriptions for other ISAs
    • real or prototype
  • Additional optimizations
  • Generate information for performance analysis
    tools
  • Extensions to OpenMP
  • Surprise me

29
(No Transcript)