Code optimization by partial redundancy elimination using Eliminatability paths (E-paths) - PowerPoint PPT Presentation

About This Presentation
Title:

Code optimization by partial redundancy elimination using Eliminatability paths (E-paths)

Description:

Computer Science & Engineering, Indian Institute of Technology, Bombay ... Computer Science & Engineering, Indian Institute of Technology, Bombay. A brief ... – PowerPoint PPT presentation

Number of Views:317
Avg rating:3.0/5.0
Slides: 60
Provided by: ProfDMDh
Category:

less

Transcript and Presenter's Notes

Title: Code optimization by partial redundancy elimination using Eliminatability paths (E-paths)


1
Code optimization bypartial redundancy
eliminationusing Eliminatability paths (E-paths)
  • Prof. Dhananjay M Dhamdhere

2
These slides are based on
  • D. M. Dhamdhere E-path_PRE---Partial redundancy
  • elimination made easy, SIGPLAN Notices, v
    37, n 8
  • (2002), 53-65.
  • D. M. Dhamdhere Eliminatability path---A
    versatile basis
  • for partial redundancy elimination, 2002
  • Dheeraj Kumar Syntactic and Semantic Partial
  • Redundancy elimination, M. Tech.
    dissertation,
  • I.I.T. Bombay, 2006.

3
Partial redundancy elimination
  • Partial redundancy
  • An expression e in statement s is partially
    redundant if its value is
  • identical with value of e in some path from
    start of program to s
  • Partial redundancy elimination
  • -- A partially redundant occurrence of e is
    made totally redundant by
  • inserting evaluations of e in some path(s)
    from start of the
  • program to s
  • -- The totally redundant occurrence of e is
    now eliminated

4
An example of PRE
tab
ab
tab
2
1
2
1
ab
t
3
3
-- Insert ab in node 2
-- Delete ab from node 3
5
Partial redundancy elimination
PRE subsumes 3 important classical optimizations
  • Common subexpression elimination (CSE)
  • - Expression e is computed along all paths
    reaching its occurrence
  • Loop invariant movement
  • - A loop-invariant expression is available
    along the looping edge.
  • Hence it is partially redundant.
  • Classical code motion
  • - A less known optimization. It is in fact
    partial redundancy
  • elimination in specific situations.

6
PRE subsumes 3 optimizations
1. CSE
1
- ab of node 5 is a CSE.
2. Loop invariant movement
a..
ab
2
4
  • ab of node 4 is partially
  • redundant

ab
3
5
3. Code movement
  • ab of node 6 can be
  • moved to node 3.

ab
6
7
Benefits and costs of PRE
  • Benefits
  • Execution efficiency through a reduction in
    the number of
  • expression occurrences along a graph path
  • Costs
  • - Use of compiler generated temporaries to
    hold values
  • of expressions
  • - Lifetimes of compiler generated temporaries
    increase
  • register pressure
  • - Insertion of new blocks due to edge
    placement
  • Desirable goals
  • Computational optimality and lifetime
    optimality

8
Data flow concepts used in partial redundancy
elimination
  • Availability An expression e is available at a
    program
  • point p if its value is computed along ALL
    paths from
  • start of the program to p
  • Partial availability An expression e is
    partially available
  • at a program point p if its value is computed
    along SOME
  • path from start of the program to p
  • Availability Total redundancy
  • Partial availability Partial redundancy

9
Data flow concepts used in partial redundancy
elimination
  • Anticipatability An expression e is
    anticipatable (that is,
  • very busy) at a program point p if it is
    computed along
  • ALL paths from p to an exit of the program

10
Data flow concepts used in partial redundancy
elimination
  • Anticipatability An expression e is
    anticipatable (that is,
  • very busy) at a program point p if it is
    computed along
  • ALL paths from p to an exit of the program
  • Safety of a computation (Kennedy 1972) An
    expression
  • e is safe at a program point p if it is
    either available or
  • anticipatable at p
  • - Insertion of e at p is a new computation
    if e is not
  • safe at p.
  • - It increases the execution time of the
    program. It may
  • also raise new exceptions

11
Safe insertion of computations
ab
ab
11
12
21
22
ab
ab
13
23

-- ab is anticipatable in node 12, but not
anticipatable in node 22
-- Insertion of ab in node 12 is safe, however
in 22 it is unsafe
-- Insertion in edge (22,23) is safe!
12
Some partial redundancies cannot be eliminated
through safe code insertion
ab
i
-- Insertion in the in-edge of node n is
unsafe because ab is not anticipatable
m
tab
ab available
n
ab available, anticipatable
ab anticipatable
ab
k
13
Performing Partial Redundancy Elimination
  • Identify partially redundant occurrences of an
    expression e in a program
  • Insert occurrences of e at some program points
    where e is safe
  • Delete partially redundant occurrences of e which
    have become totally
  • redundant
  • Classical PRE Elimination of partial
    redundancies in a program through safe
  • insertion of computations.
  • - Can be looked upon as code movement from
    the point of original
  • occurrence to the point of insertion
  • - It cannot eliminate all partial
    redundancies in a program!

14
A brief history of PRE
  • Morel, Renvoise (1979) Bidirectional data flows
    for code placement in nodes (MRA).
  • Lacks both computational and lifetime
    optimality.
  • Dhamdhere (1988) Computational optimality and
    reduced lifetimes of temporaries
  • than Morel-Renvoise through placement in
    nodes and edges (EPA).
  • Knoop, Ruthing, Steffen (1992) Lazy code motion
    (LCM) offering computational
  • optimality and lifetime optimality through
    a priori edge splitting and placement in
  • nodes. Drechsler and Stadel (1993)
    reformulated LCM to handle basic blocks.
  • Bodik, Gupta, Soffa (1998) Complete elimination
    of partial redundancies through
  • selective code expansion (ComPRE). Based on
    the work by Steffen (1996).
  • Kennedy et al (1999) PRE in SSA representation
    of programs (SSAPRE).
  • Dhamdhere (2002) Eliminatability path --- A
    versatile basis for PRE
  • (E-path_PRE). Develops a concept originating
    in Dhaneshwar, Dhamdhere (1995)
  • and uses it for evaluation of PRE algorithms
    and development of new ones.

15
Morel-Renvoise Algorithm (MRA)
  • Performs insertions strictly in nodes of the
    program graph
  • Placement possibility (PP) of e at entry/exit of
    basic blocks
  • whether it is feasible and safe to place
    expression e at entry/exit
  • of a block
  • Insert e at the exit of a basic block b if it can
    be placed at the
  • exit of b but not at its entry
  • Delete an existing occurrence of e in a basic
    block if it can be
  • placed at the entry of that block

16
Morel-Renvoise Algorithm (MRA)(simplified
equations)
17
Morel-Renvoise Algorithm (MRA)
1
1
a..
a..
ab
tab
2
4
2
4
tab
ab
t
3
5
3
5
ab
6
6
t
1. ab is inserted in node 2. Insertion in node 3
would have been lifetime optimal.
2. ab of node 4 cannot be optimized because it
cannot be inserted in node 1.
3. ab is saved in t in nodes 2 and 4. ab of
node 6 is replaced by use of t.
18
Edge placement algorithm (Dhamdhere 1988)
  • Performs insertions both in nodes and along edges
    in the program graph
  • An expression is hoisted as far up as possible to
    obtain computational optimality
  • It is then subjected to sinking (without
    affecting computational optimality) to obtain
    lifetime optimality
  • It is placed along an edge only if it cannot be
    placed in a node
  • It is performed only along a critical edge, i.e.,
    an edge from a branch node to a join node

19
Edge placement algorithm (Dhamdhere 1988)
20
Edge placement algorithm (Dhamdhere 1988)
  • A. Computational optimality
  • The ? term of PPIN is dropped. Hence PPIN can be
    true even if
  • PPOUT of a predecessor is false.
  • If PP is true for entry of a basic block i but
    PP is false for exit of a
  • predecessor j, e is placed along the edge
    (j,i).
  • -- It is called edge placement. A basic block
    is inserted in the
  • edge if e is to be placed along it.
  • -- Edge placement performed only along a
    critical edge, i.e.
  • along an edge from a branch node to a
    join node.
  • Placement into nodes is done as in MRA.

21
Edge placement algorithm(Dhamdhere 1988)
  • B. Reducing lifetimes of expression variables
  • Move insertion points as far down as possible
    without sacrificing
  • computational optimality (it is achieved by
    the ? term)

22
Edge placement algorithm (Dhamdhere 1988)
  • EPA solution technique (hoisting-followed-by-sin
    king approach)
  • Solve the unidirectional data flow problem
    obtained by omitting the
  • ? term from the PPIN equation.

It hoists e as far up as possible. Provides
computational optimality.
2. Now a second data flow is solved to
incorporate the ? term We examine all
predecessors of a block i and change PPIN of
block i from true to false if the ? term is
false for its predecessors.
It sinks the hoisted expression as far down as
possible without compromising computational
optimality.
23
Edge placement algorithm (EPA)
1
1
tab
a..
ab
2
4
2
4
t
a..
ab
tab
t
3
5
3
5
ab
6
t
6
1. ab is inserted in node 3. However, EPA does
not provide lifetime optimality in some cases.
2. ab is inserted in edge (1,4). This is
computationally optimal.
24
Lazy code motion (KRS 92)
  • All join edges are split a priori by inserting
    blocks along them
  • D-Safe-earliest points An expression is placed
    at the earliest points
  • where it is anticipatable.
  • Evaluation of an expression is delayed to the
    latest point where it
  • can be placed without losing computational
    optimality.
  • Thus, it conceptually performs hoisting-followed-
    by-sinking, as in the edge placement algorithm.
  • Insertion and saving is performed uniformly.
  • Data flow equations are not given here.
    (Drechsler and Stadel
  • reformulated them.)

25
Lazy code motion (KRS)
1
1
tab
(1,4)
tab
a..
ab
2
4
2
4
t
a..
ab
t
3
5
3
5
tab
(3,6)
ab
6
t
6
1. Edges (1,4), (3,6), (5,6) and (5,4) are split
a priori
2. ab is inserted in edge (3,6). LCM provides
lifetime optimality
3. ab is inserted in edge (1,4). As in EPA, this
is computationally optimal
4. Empty blocks removed
26
Eliminatability paths offer ..
  • A conceptual basis for PRE
  • - Identifies partial redundancies which can
    be
  • eliminated through insertion of code in
    safe places
  • We call them eliminatable partial
    redundancies
  • - A simple method for identifying safe
    insertion points
  • which offer lifetime optimality
  • - Thus, no hoisting-followed-by-sinking

27
Eliminatability paths offer ..
  • Computationally optimal PRE
  • - Elimination of all eliminatable partial
    redundancies
  • identified by E-paths through appropriate
  • insertions provides computational
    optimality

28
Eliminatability paths offer ..
  • PRE with lifetime optimality
  • - Insertions performed using the notion of
    E-paths
  • provides lifetime optimality

29
Eliminatability paths offer ..
  • A versatile basis for PRE
  • - Classical PRE PRE performed by insertion,
    deletion and
  • saving of expressions over a program graph
  • - PRE over SSA representations of programs

30
Eliminatability paths offer ..
  • Simplicity
  • - Insertion, deletion and save points are
    identified using
  • simple and well-known data flow concepts of
    availability
  • and anticipatability

31
Eliminatability paths offer ..
  • A basis for evaluating effectiveness of an
    approach to
  • PRE
  • - Does the approach provide computational
    optimality?
  • (i.e. does it eliminate all partial
    redundancies which can
  • be eliminated?)
  • - Does the approach provide lifetime
    optimality?

32
Eliminatability Paths (E-paths)
  • A path i .. k in a program control flow graph is
    an E-path for an
  • expression e if
  • - Node i contains a locally available
    occurrence of e and node k
  • contains a locally anticipatable occurrence
    of e
  • - Nodes in the path (i .. k) are empty wrt e,
    i.e. they do not contain
  • an occurrence of e or a definition of any of
    its operands
  • - e is safe at the exit of each node in i ..
    k), i.e., it is either available
  • or anticipatable at the exit of each node in
    i .. k).

Path i .. k) includes node i, but excludes node
k. Path (i .. k) excludes nodes i and k.
33
Eliminatability Path
ab
i
- ab available at exit of i .. m
- ab anticipatable at exit of n .. k)
m
  • Occurrence of ab in node k

n
is said to be eliminatable
ab
k
Dhaneshwar, Dhamdhere (1995) used
eliminatability of exps, but did not define
or use E-paths explicitly.
34
Properties of E-paths 1
  • PRE using E-paths provides computational
    optimality
  • Use of this property
  • - Use it to evaluate computational
  • optimality of a PRE algorithm.
  • - A PRE algorithm possesses computational
    optimality if it can
  • eliminate partial redundancy of e in EACH
    node k such that
  • an E-path i .. k exists in G.

35
Properties of E-paths 2
  • If i .. k is an E-path and j is a node in (i ..
    k
  • - For each in-edge (g, j) such that node g is
    not in an E-path
  • if node g has a successor s which is not
    in an E-path
  • then insert e in edge (g, j)
  • else insert e in node g
  • - Such insertion provides lifetime optimality
    of the temporary variable
  • used to hold value of e
  • Use of the property
  • - Check whether a PRE algorithm provides
    lifetime optimality by comparing
  • program points where insertions are made

36
Lifetime optimality using E-paths
ab
i
m
g1
tab
tab
j
g2
- i .. k is an E-path
- Insertion in edge (g1, j) and
node g2 is lifetime optimal
ab
k
37
Evaluating MRA using E-paths
1
1
a..
a..
ab
tab
2
4
2
4
tab
ab
t
3
5
3
5
ab
6
t
6
0. Three E-paths exist 4 .. 5, 5 .. 4 and 5 ..
6.
1. 5 .. 6 is an E-path. Insertion node 3 would
have been lifetime optimal.
2. 5 .. 4 is an E-path. Hence ab of node 4 is
eliminatable, but not eliminated!
38
PRE using E-paths
  • For an E-path i .. k
  • a) Insertions For a node j in (i .. k
  • - Insert e in edge (g, j) if g is not in
    an E-path and has a
  • successor which is not in an E-path
  • - Insert e in predecessor g if g is not in
    an E-path and all its
  • successors are in E-paths
  • b) Save Save the computation of e in node i,
    unless i is the
  • end-node of some E-path h .. i (in which
    case it would be
  • deleted).
  • c) Deletion Delete the occurrence of e in
    node k.

39
PRE using E-paths
  • E-path i .. k may contain 3 kinds of segments
  • - Avail . Ant segment
  • - Avail . Ant segment
  • - Avail . Ant segment This is called
    the E-path suffix.
  • Find a node m Avail(m) . Anticipatable(m). ?
    Avail(p), ppred
  • This is the start node of the E-path suffix.
  • - Trace Avail . Ant segment backwards from m
    to find node i, the
  • start of the E-path and perform a save in it
  • - Trace Avail . Ant segment forward from m
  • a) to perform appropriate insertion for
    in-edges
  • b) to find k and perform a deletion

40
Segments in an E-Path
ab
1
a) 1 .. 2 Avail Ant.
2
b) 3 .. 4 Avail Ant.
c) 5 .. 10 Avail ?Ant (E-path
suffix).
3
4
Start node Of E-path suffix
5
E-path suffix insertions may be needed
in paths joining it
?
ab
10
41
Simple data flows for E-path_PRE_at_
Comp e is locally available (i.e. downwards
exposed) in node
Antloc e is locally anticipatable (i.e.
upwards exposed) in node
Transp node does not contain definitions of es
operands
_at_ Terminology is from Morel-Renvoise algorithm
42
Simple data flows for E-path_PRE
  • Availability and Anticipatability (i.e. very busy
    exps.)
  • Eps-in/Eps-out (Node is in E-path suffix)

43
Simple data flows for E-path_PRE
  • Availability and Anticipatability
  • Eps-in/Eps-out (Node is in E-path suffix)
  • SA_in/SA_out (A save should be performed above)

44
Efficiency of E-path_PRE data flows
  • The generalized theory of bit-vector data flow
    analysis by Khedker,
  • Dhamdhere (1994) defines two concepts for
    determining the cost of data
  • flow analysis
  • - Information flow path (ifp) A graph path
    along which data flow
  • information may flow during data flow
    analysis.
  • (Information flow Values of data flow
    properties change fromlattice
  • top to lattice bot during iterative
    data flow analysis)
  • - Width of a graph (reduces to depth of a
    graph for unidirectional data
  • flows)

45
Efficiency of E-path_PRE data flows
  • The generalized theory of bit-vector data flow
    analysis by Khedker,
  • Dhamdhere (1994) defines two concepts for
    determining the cost of data
  • flow analysis
  • - Information flow path (ifp) A graph path
    along which data flow
  • information may flow during data flow
    analysis.
  • (Information flow Values of data flow
    properties change fromlattice
  • top to lattice bot during iterative
    data flow analysis)
  • - Width of a graph (reduces to depth of a
    graph for unidirectional data
  • flows)
  • Number of bit-vector operations during work-list
    iterative df analysis
  • depend on length of an ifp, and the number
    of iterations during
  • round-robin iterative df analysis depend on
    width of an ifp

46
Efficiency of E-path_PRE data flows
  • The Eps_in/out data flow of E-path_PRE has been
    designed to have
  • short information flow paths. This fact may
    also lead to small
  • width of a program graph.
  • Short information flow paths and small width
    leads to smaller
  • solution times of data flows.
  • This fact is borne out by experimentation ---
    comparison with the
  • later data flow of Drechsler, Stadel (1993)
    (Dhamdhere 2002)
  • - In worklist solution No. of bit vector
    operations is 80 smaller
  • - In round-robin iterative solution No. of
    iterations is 37 smaller

47
Code placement models in PRE
  • Node model
  • - Simple node model
  • Each node contains a single statement
  • - Basic block model
  • Each node is a basic block
  • Insertion and Saving model
  • - Saving in situ
  • Value of an expression is saved in the
    place where it is located
  • - Saving in entry/exit of node
  • An expression is moved to node
    entry/exit if its value is to be saved
  • - Insertion at entry/exit of node
  • - Unified insertion and saving
  • This is possible only when saving is
    done at node entry/exit

48
Code placement models in PRE
  • Morel-Renvoise Algorithm (MRA)
  • - Basic blocks, saving in situ, insertion at
    exit
  • Edge placement algorithm (EPA)
  • - Basic blocks, saving in situ, insertions
    at node exit and in critical edges
  • (edge splitting performed on a needs
    basis)
  • Lazy Code Motion (LCM)
  • - Simple nodes, unified saving and
    insertion, insertion at node entries and
  • in blocks inserted in join edges in a
    priori edge splitting
  • E_path-PRE
  • - Basic blocks, saving in situ, insertions
    at node exit and in critical edges
  • SIM-PRE
  • - Basic blocks, saving in situ, insertion
    strictly along edges

49
Evaluation of code placement models using E-paths
  • Morel-Renvoise algorithm (MRA)
  • Missed opportunities of optimization (seen
    before)
  • Lazy code motion (LCM)
  • Performs insertion in a join edge (p,j)
    even if it could have been
  • performed in node p

ab
1
2
ab inserted
3
ab
50
Evaluation of code placement models using E-paths
  • Optimal code motion (OCM) Knoop et al 1994
  • - Basic blocks, Hybrid model, Insertions at
    node entry and exit
  • - Hybrid Uniform insertion and saving model
    but saving is
  • performed in situ
  • No insertions and savings will be
    performed at entry to a node
  • (Lemmas 19 and 23). Hence this feature is
    redundant.

51
Evaluations of code placement models using E-paths
  • Complete elimination of partial redundancies
    (ComPRE)
  • Bodik, Gupta and Soffa (1998) (when adapted
    to classical PRE)
  • - Simple nodes, unified saving and insertion
    only in edges
  • An expression in a node is redundantly
    hoisted into its entry-edges
  • - Addressing this problem will require an
    additional data flow
  • problem, making it less efficient than
    E-path_PRE.

52
Later work
  • SIM-PRE by J. Xue, J. Knoop (2006) inserts only
    along edges

53
SIM-PRE by Xue and Knoop
  • This data flow traces an E-path !

54
SIM-PRE by Xue and Knoop
ab
i
m
g1
tab
j
g2
- i .. k is an E-path
- Insertion in edge (g1, j) and
t ab
node g2 is lifetime optimal
l
  • SIM-PRE inserts in edges
  • (g1, j) and (g2, l )

ab
k
55
SIM-PRE by Xue and Knoop
  • SIM-PRE performs better than E-path_PRE in bit
    vector operations
  • (Graphic is from J. Xue, J. Knoop (2006))
  • However, it adds almost 50 more new blocks than
    E-path_PRE
  • (Dheeraj Kumar, 2006)

56
Work by Dheeraj Kumar (2006)_at_
  • Simplified the data flows of E-path_PRE
  • Eps_in/out data flow finds nodes i that belong
    to an E-path and have Antouti true
  • SA_in/out data flow finds nodes i that belong
    to an E-path and have Avouti true

_at_ M. Tech. dissertation, IIT Bombay
57
Work by Dheeraj Kumar (2006)
  • Simplified data flows of E-path_PRE (Proposal 2)


58
Work by Dheeraj Kumar (2006)
  • Experimental results
  • - SPECcpu2000 benchmark under GCC 3.4.3
  • - Proposal 2 performance
  • Bit map operations 5.5 smaller than
    SIM-PRE
  • in worklist and 15.8 smaller in
    iterative
  • Introduced 30 fewer blocks than
    SIM-PRE

59
Thus, eliminatability paths offer ..
  • A conceptual basis for PRE
  • A versatile basis for PRE
  • A basis for evaluating effectiveness of an
    approach to PRE
  • (Efficiency is a bonus!)
Write a Comment
User Comments (0)
About PowerShow.com