Title: Control Dependences
1Control Dependences
Chapter 7
2Control Dependences
- Roadmap
- If-conversion
- Control dependence
3Control Dependences
- Constraints posed by control flow
- DO 100 I = 1, N
- S1 IF (A(I-1).GT. 0.0) GO TO 100
- S2 A(I) = A(I) + B(I)*C
- 100 CONTINUE
- If we vectorize by...
- S2 A(1:N) = A(1:N) + B(1:N)*C
- DO 100 I = 1, N
- S1 IF (A(I-1).GT. 0.0) GO TO 100
- 100 CONTINUE
- we get the wrong answer
- We are missing dependences
- There is a dependence from S1 to S2 - a control dependence
- Data dependence analysis alone finds only S2 δ1 S1
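The failure can be reproduced with a small simulation. This is only an illustrative sketch in Python (not part of the original slides); the function names and the sample data are made up, and index 0 of the list plays the role of A(0).

```python
def scalar_loop(a, b, c):
    """Original loop: S2 is skipped whenever A(I-1) > 0."""
    a = list(a)
    for i in range(1, len(a)):
        if a[i - 1] > 0.0:        # S1: branch around S2
            continue
        a[i] = a[i] + b[i] * c    # S2
    return a


def naive_vectorized(a, b, c):
    """Incorrect version: S2 is hoisted out and applied to every element."""
    a = list(a)
    for i in range(1, len(a)):
        a[i] = a[i] + b[i] * c    # S2 as an unconditional vector statement
    return a                      # the leftover loop holding S1 alone changes nothing


a = [-1.0, -2.0, 3.0, -4.0]       # hypothetical data
b = [0.0, 1.0, 1.0, 1.0]
print(scalar_loop(a, b, 2.0))
print(naive_vectorized(a, b, 2.0))
```

The two results differ in the last element, because the scalar loop skips S2 there while the hoisted vector statement does not.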
4Control Dependences
- Two strategies to deal with control dependences
- If-conversion: expose by converting to data dependences. Used for vectorization
- Explicitly expose as control dependences. Used for automatic parallelization
5If-conversion
- Underlying idea: convert statements affected by branches to conditionally executed statements
- DO 100 I = 1, N
- S1 IF (A(I-1).GT. 0.0) GO TO 100
- S2 A(I) = A(I) + B(I)*C
- 100 CONTINUE
- can be converted to
- DO I = 1, N
- IF (A(I-1).LE. 0.0) A(I) = A(I) + B(I)*C
- ENDDO
6If-conversion
- DO 100 I = 1, N
- S1 IF (A(I-1).GT. 0.0) GO TO 100
- S2 A(I) = A(I) + B(I) * C
- S3 B(I) = B(I) + A(I)
- 100 CONTINUE
- can be converted to
- DO 100 I = 1, N
- S2 IF (A(I-1).LE. 0.0) A(I) = A(I) + B(I) * C
- S3 IF (A(I-1).LE. 0.0) B(I) = B(I) + A(I)
- 100 CONTINUE
- vectorize using the Fortran WHERE statement
- DO 100 I = 1, N
- S2 IF (A(I-1).LE. 0.0) A(I) = A(I) + B(I) * C
- 100 CONTINUE
- S3 WHERE (A(0:N-1).LE. 0.0) B(1:N) = B(1:N) + A(1:N)
7If-conversion
- If-conversion assumes a target notation of guarded execution in which each statement implicitly contains a logical expression controlling its execution
- S1 IF (A(I-1).GT. 0.0) GO TO 100
- S2 A(I) = A(I) + B(I)*C
- 100 CONTINUE
- with guards in place
- S1 M = A(I-1).GT. 0.0
- S2 IF (.NOT. M) A(I) = A(I) + B(I)*C
- 100 CONTINUE
8Branch Classification
- Forward branch: transfers control to a target that occurs lexically after the branch but at the same level of nesting
- Backward branch: transfers control to a statement occurring lexically before the branch but at the same level of nesting
- Exit branch: terminates one or more loops by transferring control to a target outside a loop nest
9If-conversion
- If-conversion is a composition of two different transformations
- 1. Branch relocation
- 2. Branch removal
10Branch removal
- Basic idea
- Make a pass through the program.
- Maintain a Boolean expression cc that represents the condition that must be true for the current statement to be executed
- On encountering a branch, conjoin the controlling expression into cc
- On encountering the target of a branch, disjoin its controlling expression into cc
11Branch Removal Forward Branches
- Remove forward branches by inserting appropriate guards
- DO 100 I = 1,N
- C1 IF (A(I).GT.10) GO TO 60
- 20 A(I) = A(I) + 10
- C2 IF (B(I).GT.10) GO TO 80
- 40 B(I) = B(I) + 10
- 60 A(I) = B(I) + A(I)
- 80 B(I) = A(I) - 5
- ENDDO
- after inserting guards
- DO 100 I = 1,N
- m1 = A(I).GT.10
- 20 IF(.NOT.m1) A(I) = A(I) + 10
- IF(.NOT.m1) m2 = B(I).GT.10
- 40 IF(.NOT.m1.AND..NOT.m2) B(I) = B(I) + 10
- 60 IF(.NOT.m1.AND..NOT.m2.OR.m1) A(I) = B(I) + A(I)
- 80 IF(.NOT.m1.AND..NOT.m2.OR.m1.OR..NOT.m1.AND.m2) B(I) = A(I) - 5
- ENDDO
12Branch Removal Forward Branches
- We can simplify to
- DO 100 I = 1,N
- m1 = A(I).GT.10
- 20 IF(.NOT.m1) A(I) = A(I) + 10
- IF(.NOT.m1) m2 = B(I).GT.10
- 40 IF(.NOT.m1.AND..NOT.m2) B(I) = B(I) + 10
- 60 IF(m1.OR..NOT.m2) A(I) = B(I) + A(I)
- 80 B(I) = A(I) - 5
- ENDDO
- vectorize to
- m1(1:N) = A(1:N).GT.10
- 20 WHERE(.NOT.m1(1:N)) A(1:N) = A(1:N) + 10
- WHERE(.NOT.m1(1:N)) m2(1:N) = B(1:N).GT.10
- 40 WHERE(.NOT.m1(1:N).AND..NOT.m2(1:N)) B(1:N) = B(1:N) + 10
- 60 WHERE(m1(1:N).OR..NOT.m2(1:N)) A(1:N) = B(1:N) + A(1:N)
- 80 B(1:N) = A(1:N) - 5
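As a sanity check on the guards, the loop body can be written both ways and compared on sample values. This is an illustrative Python sketch, not code from the slides: `body_with_branches` mirrors the original GO TO structure using nested conditionals, `body_if_converted` mirrors the simplified guarded form, and both function names are invented here.

```python
from itertools import product


def body_with_branches(a, b):
    """One iteration of the original loop body (labels 20..80)."""
    if not a > 10:            # C1: IF (A.GT.10) GO TO 60
        a = a + 10            # 20
        if b > 10:            # C2: IF (B.GT.10) GO TO 80
            return a, a - 5   # 80
        b = b + 10            # 40
    a = b + a                 # 60
    b = a - 5                 # 80
    return a, b


def body_if_converted(a, b):
    """The same iteration after branch removal and guard simplification."""
    m1 = a > 10
    m2 = False                # m2 is only consulted when m1 is false
    if not m1:
        a = a + 10            # 20
    if not m1:
        m2 = b > 10
    if not m1 and not m2:
        b = b + 10            # 40
    if m1 or not m2:
        a = b + a             # 60
    b = a - 5                 # 80, guard simplified to true
    return a, b


samples = list(product(range(0, 25, 5), repeat=2))
print(all(body_with_branches(a, b) == body_if_converted(a, b) for a, b in samples))
```

The sample grid covers all three paths through the body (C1 taken, C2 taken, neither taken).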
13Branch Removal Forward Branches
- To show correctness we must establish
- The guard for a statement instance in the new program is true if and only if the corresponding statement in the old program is executed, unless the statement has been introduced to capture a guard variable value, which must be executed at the point the conditional expression would have been evaluated
- The order of execution of statements in the new program with true guards is the same as the order of execution of those statements in the original program
- Any expression with side effects is evaluated exactly as many times in the new program as in the old program
14Exit Branches
- DO J = 1, M
- DO I = 1, N
- A(I,J) = B(I,J) + X
- S IF (L(I,J)) GO TO 200
- C(I,J) = A(I,J) + Y
- ENDDO
- D(J) = A(N,J)
- 200 F(J) = C(10,J)
- ENDDO
- Exit branches are more complicated because they terminate a loop
- Solution: relocate exit branches and convert them to forward branches
15Exit Branches
- DO J = 1, M
- DO I = 1, N
- A(I,J) = B(I,J) + X
- S IF (L(I,J)) GO TO 200
- C(I,J) = A(I,J) + Y
- ENDDO
- D(J) = A(N,J)
- 200 F(J) = C(10,J)
- ENDDO
- after relocating the exit branch
- DO J = 1, M
- DO I = 1, N
- IF (C1) A(I,J) = B(I,J) + X
- Sa Code to set C1 and C2
- IF (C2) C(I,J) = A(I,J) + Y
- ENDDO
- Sb IF (.NOT.C1.OR..NOT.C2) GO TO 200
- D(J) = A(N,J)
- 200 F(J) = C(10,J)
- ENDDO
16Exit Branches
- Statements in the inner loop should be executed only if the exit branch was not taken on any previous iteration
- For the ith iteration, C1 and C2 should be
- lm = AND( ¬L(k, J) ), 1 ≤ k ≤ i-1
- DO J = 1, M
- lm = .TRUE.
- DO I = 1, N
- IF (lm) A(I,J) = B(I,J) + X
- IF (lm) m1 = .NOT. L(I,J)
- lm = lm .AND. m1
- IF (lm) C(I,J) = A(I,J) + Y
- ENDDO
- m2 = lm
- IF (m2) D(J) = A(N,J)
- 200 F(J) = C(10,J)
- ENDDO
17Exit Branches
- After forward substitution and expansion of lm, we get
- DO J = 1, M
- lm(0,J) = .TRUE.
- DO I = 1, N
- IF (lm(I-1,J)) A(I,J) = B(I,J) + X
- IF (lm(I-1,J)) m1 = .NOT.L(I,J)
- lm(I,J) = lm(I-1,J) .AND. m1
- IF (lm(I,J)) C(I,J) = A(I,J) + Y
- ENDDO
- IF (lm(N,J)) D(J) = A(N,J)
- 200 F(J) = C(10,J)
- ENDDO
- codegen will produce four vectorized loops
18Exit Branches
- After running codegen
- DO J = 1, M
- lm(0,J) = .TRUE.
- DO I = 1, N
- IF (lm(I-1,J)) m1 = .NOT.L(I,J)
- lm(I,J) = lm(I-1,J) .AND. m1
- ENDDO
- ENDDO
- WHERE(lm(0:N-1,1:M)) A(1:N,1:M) = B(1:N,1:M) + X
- WHERE(lm(1:N,1:M)) C(1:N,1:M) = A(1:N,1:M) + Y
- WHERE(lm(N,1:M)) D(1:M) = A(N,1:M)
- 200 F(1:M) = C(10,1:M)
- Procedure relocate_branches()
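The effect of replacing the exit branch with the guard array lm can be checked with a small simulation. The sketch below is an illustration in Python rather than the relocate_branches procedure itself: the function names are invented, arrays are 0-based nested lists, and the unconditional statement at label 200 is left out because it does not depend on the exit branch.

```python
def run_with_exit_branch(B, L, x, y):
    """Original nest: the inner loop is left as soon as L(I,J) is true."""
    n, m = len(B), len(B[0])
    A = [[0] * m for _ in range(n)]
    C = [[0] * m for _ in range(n)]
    D = [0] * m
    for j in range(m):
        exited = False
        for i in range(n):
            A[i][j] = B[i][j] + x
            if L[i][j]:                 # S: exit branch to label 200
                exited = True
                break
            C[i][j] = A[i][j] + y
        if not exited:
            D[j] = A[n - 1][j]
    return A, C, D


def run_with_guards(B, L, x, y):
    """Guarded form: lm[i][j] is true while no exit has occurred before row i."""
    n, m = len(B), len(B[0])
    A = [[0] * m for _ in range(n)]
    C = [[0] * m for _ in range(n)]
    D = [0] * m
    lm = [[True] * m for _ in range(n + 1)]   # row 0 is the initial .TRUE.
    for j in range(m):                        # recurrence on lm stays sequential
        for i in range(1, n + 1):
            lm[i][j] = lm[i - 1][j] and not L[i - 1][j]
    for j in range(m):                        # the remaining loops carry no dependence
        for i in range(n):
            if lm[i][j]:
                A[i][j] = B[i][j] + x
    for j in range(m):
        for i in range(n):
            if lm[i + 1][j]:
                C[i][j] = A[i][j] + y
    for j in range(m):
        if lm[n][j]:
            D[j] = A[n - 1][j]
    return A, C, D


B = [[1, 2], [3, 4], [5, 6]]                          # hypothetical data
L = [[False, False], [True, False], [False, False]]   # column 0 exits in row 1
print(run_with_exit_branch(B, L, 10, 100) == run_with_guards(B, L, 10, 100))
```

Only the loop computing lm remains sequential; the three guarded loops correspond to the WHERE statements.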
19Backward Branches
- Problems
- Create implicit loops. Backward control flow cannot be simulated by simple guards
- Complicate removal of forward branches - may create loops into which forward branches jump
- IF (P) GO TO 200
- ...
- 100 S1
- ...
- 200 S2
- ...
- IF (Q) GO TO 100
- Applying forward if-conversion
- m1 = .NOT. P
- ...
- 100 IF (m1) S1
- ...
- 200 S2
- ...
- IF (Q) GO TO 100
20Backward Branches
- Solutions?
- Avoid converting the region spanned by a backward control flow edge
- Eliminate backward branches through a variant of if-conversion
- Note that
- S1 is executed on the first pass through the code only if P is false
- S1 is always executed when the backward branch is taken
- Use a backward branch guard!
21Backward Branches
- Using a backward branch guard
- IF (P) GO TO 200
- ...
- 100 S1
- ...
- 200 S2
- ...
- IF (Q) GO TO 100
- converted to
- m = P
- ...
- bb = .FALSE.
- 100 IF (.NOT.m .OR. (m.AND.bb)) S1
- ...
- 200 S2
- ...
- IF (Q) THEN
- bb = .TRUE.
- GO TO 100
- ENDIF
22Backward Branches
- In general, there are two ways a target of a backward branch can be reached
- Fall through
- Branch around the statement but reach it via a backward branch
- Thus, if the current condition just prior to target y is cc, the branch condition is m, and the backward branch condition is bb, the guard at y should be cc OR (m AND bb)
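The guard cc OR (m AND bb) can be exercised on the two-statement example. This is a Python sketch for illustration only: the GO TO version is simulated with an explicit program counter, S1 and S2 are reduced to trace entries, and `q_values` (an invented parameter) supplies the successive outcomes of Q and is assumed to end with a false value so that the loop terminates.

```python
def run_with_gotos(p, q_values):
    """Original code: IF (P) GO TO 200 / 100 S1 / 200 S2 / IF (Q) GO TO 100."""
    trace, qs = [], iter(q_values)
    pc = 200 if p else 100
    while True:
        if pc == 100:
            trace.append("S1")
            pc = 200
        else:
            trace.append("S2")
            if next(qs):               # backward branch taken
                pc = 100
            else:
                return trace


def run_with_backward_guard(p, q_values):
    """Guarded code: S1 runs when .NOT.m .OR. (m .AND. bb)."""
    trace, qs = [], iter(q_values)
    m = p
    bb = False
    while True:                        # the backward branch itself remains
        if (not m) or (m and bb):
            trace.append("S1")
        trace.append("S2")
        if next(qs):
            bb = True
        else:
            return trace


print(run_with_gotos(True, [True, False]))
print(run_with_backward_guard(True, [True, False]))
```

With P true, S1 is skipped on the first pass and executed on the second, in both versions.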
23Complete Forward Branch Removal
- Statement is a branch target: combine (disjoin) the set of conditions associated with branches to that target with the current condition passed from the lexical predecessor
- Statement is any type except DO, ENDDO, CONTINUE: the current condition is conjoined to the guard for the current statement
- Statement is a DO: invoke relocate_branches to remove exit branches. Recur on the body of the loop. May generate some statements before the loop, which should be guarded by the current condition
24Complete Forward Branch Removal
- Statement is a conditional branch: 2 copies of the current condition cc are made.
- The compiler-generated variable associated with the new condition is conjoined with cc and the result is appended to the list associated with the branch target
- The negation of the variable is conjoined to cc and is the current condition for the next statement
- Statement is an unconditional branch: the current condition, cc, is appended to the list of conditions for the branch target. The current condition for the next statement is set to false
- Continue processing at step 1 for the next statement
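The bookkeeping for conditional and unconditional forward branches can be sketched as follows. This is a simplified illustration in Python, not the book's procedure: DO statements and exit branches are ignored, the tuple encoding of the program is invented here, and each condition is represented semantically as the set of truth assignments to the mask variables that satisfy it, so no Boolean simplifier is needed.

```python
from itertools import product


def remove_forward_branches(program, masks):
    """Return {label: guard}; a guard is the set of mask assignments
    (tuples ordered like `masks`) under which the statement executes."""
    universe = set(product([False, True], repeat=len(masks)))
    holds = {m: {v for v in universe if v[k]} for k, m in enumerate(masks)}
    pending = {}              # branch target -> disjunction of branch conditions
    cc = set(universe)        # current condition, initially true
    guards = {}
    for kind, label, arg in program:
        if label in pending:                  # branch target: disjoin into cc
            cc = cc | pending.pop(label)
        if kind == "if":                      # arg = (mask variable, target)
            mask, target = arg
            pending[target] = pending.get(target, set()) | (cc & holds[mask])
            cc = cc - holds[mask]             # fall through: cc AND NOT mask
        elif kind == "goto":                  # arg = target
            pending[arg] = pending.get(arg, set()) | cc
            cc = set()                        # condition for next statement: false
        else:                                 # ordinary statement
            guards[label] = set(cc)
    return guards


# Hypothetical encoding of the loop body with labels 20, 40, 60, 80.
program = [
    ("if", "C1", ("m1", 60)),
    ("stmt", 20, None),
    ("if", "C2", ("m2", 80)),
    ("stmt", 40, None),
    ("stmt", 60, None),
    ("stmt", 80, None),
]
guards = remove_forward_branches(program, ["m1", "m2"])
print(sorted(guards[60]))   # assignments (m1, m2) satisfying m1 .OR. .NOT. m2
```

On this input the guard computed for statement 80 contains every assignment, which matches the simplification of its guard to true.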
25Simplification
- Boolean simplification is NP-complete
- Use Simplify, an O(N^2) algorithm obtained by tailoring the simplification process to if-conversion
26Iterative Dependences
- Iterative statements can also create control dependences
- 20 DO I = 1, 100
- 40 L = 2*I
- 60 DO J = 1, L
- 80 A(I,J) = 0
- ENDDO
- ENDDO
- If we vectorize as
- 20 DO I = 1, 100
- 40 L = 2*I
- 100 ENDDO
- 80 A(1:100,1:L) = 0
- Incorrect!
- Must capture the notion that the DO statement controls the number of times a particular statement is executed.
27Iterative Dependences
- Notation used
- A(I, J) (irange)
- where irange is a compiler-generated scalar which holds the iteration range
- Using this notation, the example will be converted to
- 20 irange1 = (1,100)
- DO I = irange1
- 40 L = 2*I (irange1)
- 60 irange2 = (1,L) (irange1)
- DO J = irange2
- 80 A(I,J) = 0 (irange2)
- ENDDO
- ENDDO
28Iterative Dependences
- Forward substituting constants and loop-independent variables
- 20 DO I = 1,100
- 40 L = 2*I (1,100)
- 60 DO J = 1,L (1,100)
- 80 A(I,J) = 0 (1,L) (1,100)
- ENDDO
- ENDDO
- which vectorizes to
- 20 DO I = 1, 100
- 40 L = 2*I
- 80 A(I,1:L) = 0
- ENDDO
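The difference between the incorrect and the correct vectorization shows up on a small instance. The following Python sketch is illustrative only: the bound 100 is shrunk to a hypothetical N = 4, the array is pre-filled with ones so that untouched elements are visible, and list slices stand in for the Fortran array sections.

```python
N = 4  # stands in for the bound 100


def fresh():
    return [[1] * (2 * N) for _ in range(N)]


def scalar_nest():
    A = fresh()
    for i in range(1, N + 1):
        L = 2 * i
        for j in range(1, L + 1):
            A[i - 1][j - 1] = 0          # 80: A(I,J) = 0
    return A


def wrong_vectorization():
    A = fresh()
    for i in range(1, N + 1):
        L = 2 * i
    for row in A:                        # A(1:N,1:L) = 0 using only the final L
        row[0:L] = [0] * L
    return A


def correct_vectorization():
    A = fresh()
    for i in range(1, N + 1):
        L = 2 * i
        A[i - 1][0:L] = [0] * L          # A(I,1:L) = 0 stays inside the I loop
    return A


print(scalar_nest() == correct_vectorization())
print(scalar_nest() == wrong_vectorization())
```

Keeping statement 80 inside the I loop preserves the fact that the inner range depends on the value of L computed in each outer iteration.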
29If-reconstruction
- If-conversion may degrade performance when vectorization is not possible
- DO 100 I = 1, N
- IF (A(I) .GT. 0) GOTO 100
- B(I) = A(I) * 2.0
- A(I+1) = B(I) + 1
- 100 CONTINUE
- After if-conversion
- DO 100 I = 1, N
- m1 = (A(I) .GT. 0)
- IF (.NOT. m1) B(I) = A(I) * 2.0
- IF (.NOT. m1) A(I+1) = B(I) + 1
- 100 CONTINUE
30If-reconstruction
- On a machine without predicated execution
- DO 100 I = 1, N
- m1 = (A(I) .GT. 0)
- IF (m1) GOTO 10
- B(I) = A(I) * 2.0
- 10 IF (m1) GOTO 20
- A(I+1) = B(I) + 1
- 20 CONTINUE
- 100 CONTINUE
- Overheads!
- If-reconstruction: replace sections of guarded code with a minimal set of branches that enforce the guarded execution
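One simple way to picture if-reconstruction is to merge consecutive statements that carry the same guard under a single test. The Python sketch below is a deliberately naive illustration, not the algorithm from the book: the function name and the (guard, statement) encoding are invented, and it assumes that no statement inside a run modifies the variables its guard reads.

```python
from itertools import groupby


def reconstruct(guarded):
    """guarded: (guard, statement) pairs in program order, guard is a string
    or None for an unguarded statement. Returns pseudo-Fortran lines."""
    out = []
    for guard, run in groupby(guarded, key=lambda pair: pair[0]):
        stmts = [stmt for _, stmt in run]
        if guard is None:
            out.extend(stmts)
        else:
            out.append(f"IF ({guard}) THEN")   # one branch for the whole run
            out.extend("  " + stmt for stmt in stmts)
            out.append("ENDIF")
    return out


guarded_body = [
    (None, "m1 = (A(I) .GT. 0)"),
    (".NOT. m1", "B(I) = A(I) * 2.0"),
    (".NOT. m1", "A(I+1) = B(I) + 1"),
]
print("\n".join(reconstruct(guarded_body)))
```

For the loop body above this yields one test of m1 per iteration instead of two.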
31Control Dependence
- Disadvantages of if-conversion
- Unnecessarily complicates code when code cannot be vectorized
- Cannot a priori analyze code to decide whether if-conversion will lead to parallel code.
- Alternate approach: explicitly expose constraints due to control flow as control dependences
32Control Dependence
- A node x in a directed graph G with a single exit node postdominates node y in G if any path from y to the exit node of G must pass through x.
- A statement y is said to be control dependent on another statement x if
- there exists a non-trivial path from x to y such that every statement z ≠ x in the path is postdominated by y and
- x is not postdominated by y.
- In other words, a control dependence exists from S1 to S2 if one branch out of S1 forces execution of S2 and another doesn't
- Note that control dependences can be viewed as a property of basic blocks
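The definition translates fairly directly into a computation over postdominator sets. The sketch below is an illustration in Python, not the ConstructCD procedure: it uses a simple iterative fixed-point for postdominators, assumes every node other than the exit has at least one successor, and is applied to a small acyclic CFG made up for this example.

```python
def postdominators(succ, exit_node):
    """pdom[n] = set of nodes that postdominate n (including n itself)."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(succ, exit_node):
    """Pairs (x, y) such that y is control dependent on x."""
    pdom = postdominators(succ, exit_node)
    deps = set()
    for x in succ:
        for s in succ[x]:
            for y in pdom[s]:          # y postdominates a successor of x ...
                if y not in pdom[x]:   # ... but does not postdominate x
                    deps.add((x, y))
    return deps


# Hypothetical CFG: 30 branches either to 40 or around it to 50.
cfg = {20: [30], 30: [40, 50], 40: [50], 50: ["exit"], "exit": []}
print(control_dependences(cfg, "exit"))
```

Only node 40 is control dependent on the branch at 30; node 50 postdominates 30 and is therefore executed regardless of the branch outcome.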
33Control Dependence Example
34Control Dependence Example
- n nodes and O(n^2) control dependences.
- Control dependence graphs can thus get much larger than the corresponding CFG
- Procedure ConstructCD constructs the control dependence relation
35Control Dependence Loops
- Loops can be converted to a CFG and then ConstructCD can be applied
- Want to treat loops as special cases to help in transforming loops
- Use a loop control node to represent the loop
- 10 DO I = 1, 100
- 20 A(I) = A(I) + B(I)
- 30 IF (A(I).GT.0) GO TO 50
- 40 A(I) = -A(I)
- 50 B(I) = A(I) + C(I)
- ENDDO
36Execution Model
- In Chapter 2, we annotated each statement S with the corresponding iteration vector i
- S(i) could execute whenever every statement instance that it depended on had already executed
- However, control flow can prevent a statement instance from executing at all
- DO I = 1, N
- S0 IF (P) GO TO S2
- S1 ...
- S2 ...
- ENDDO
37Execution Model
- Solution: use a doit flag for each statement instance, S(i).doit
- Statement instances that are not control dependent on any other statement: doit = True
- For all other statements: doit = False
- How does doit get set to True?
- When a conditional is executed, all statements that are control dependent on it and whose execution is forced by the sense of the condition get doit = True
- Execute statement instance S(i) if its doit flag is set to True and every statement instance it depends on either has a false doit flag or has been executed
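A much reduced version of this model can be simulated for a single iteration. The Python sketch below is an illustration under simplifying assumptions that are not in the slides: statements are visited in their original order (so data dependences are trivially respected), only forward control dependences are handled, and the function name and dictionary encoding are invented.

```python
def run_doit_model(statements, control_deps, outcomes):
    """statements: names in original order.
    control_deps: {stmt: (controlling stmt, outcome that forces stmt)}.
    outcomes: {conditional stmt: truth value it evaluates to}."""
    # Statements with no control dependence predecessor start with doit = True.
    doit = {s: s not in control_deps for s in statements}
    executed = []
    for s in statements:
        if not doit[s]:
            continue
        executed.append(s)
        if s in outcomes:
            # Executing a conditional sets doit for the statements it forces.
            for t, (pred, label) in control_deps.items():
                if pred == s and label == outcomes[s]:
                    doit[t] = True
    return executed


# S0: IF (P) GO TO S2, so S1 is control dependent on S0 with label False.
deps = {"S1": ("S0", False)}
print(run_doit_model(["S0", "S1", "S2"], deps, {"S0": True}))
print(run_doit_model(["S0", "S1", "S2"], deps, {"S0": False}))
```

When P is true, S1 never has its doit flag set and is skipped, while S2 runs in both cases because it is not control dependent on S0.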
38Execution Model
- Note that if doit is true for S, then there is a sequence of control statements S0, S1, ..., Sm = S such that S0 is executed unconditionally and the decision taken at Sk forces the execution of Sk+1, 0 ≤ k < m
- The sequence of control dependences defines a unique execution path
39Execution Model
- Behavior of loop control nodes under this model
- Case 1: evaluation of the iteration range does not depend on quantities computed in the loop
- Set doit for the loop node to True
- Range of iteration can be completely evaluated
- Create a collection of statement instances for the loop body, one for each iteration of the loop
- Set doit flags of statements control dependent on the loop header to True, all other doit flags to False
40Execution Model
- Case 2: evaluation of the iteration range depends on quantities computed in the loop
- If the range is non-empty, create a new instance of the loop header, adjusting the range to the remainder of the iterations
- DO.doit = True if the dependence back to the DO is a data dependence and False if it is a control dependence
- Set doit flags of statements control dependent on the loop header to True, all other doit flags to False
41Execution Model
- Theorem 7.1. Dependence graphs that are executed according to the execution model are equivalent in meaning to the programs from which they are created.
- Proof
- Show that the doit flag of a statement is true iff it is executed in the original program
- Proof by contradiction: consider the shortest sequence S0, S1, ..., Sm-1, Sm such that Sm is the first statement to get the wrong doit flag
- Focus on Sm-1
- All statements executed leading to Sm-1 in the original program must be executed in this model
- Statements that are not executed leading to Sm-1 in the original program cannot be executed in this model
42Control Dependence and Parallelization
- For simplicity, we shall only consider
- Forward branches - they create loop-independent control dependences
- Control dependences due to loops
- From Chapter 2: most loop transformations are unaffected by loop-independent dependences
- Loop reversal, loop skewing, strip mining, index-set splitting, loop interchange do not affect loop-independent dependences
- Might be problematic: loop fusion, loop distribution
- However, since exit branches are excluded, loop fusion is not a problem
43Loop Distribution
- DO I = 1, N
- S1 IF (A(I).LT.B(I)) GOTO 20
- S2 B(I) = B(I) + C(I)
- 20 CONTINUE
- ENDDO
- Data dependence: S1 δ⁻¹ S2
- Distributing
- DO I = 1, N
- S1 IF (A(I).LT.B(I)) GOTO 20
- ENDDO
- DO I = 1, N
- S2 B(I) = B(I) + C(I)
- ENDDO
- 20 CONTINUE
- Incorrect!
44Loop Distribution
- Problem: control dependences crossing between distributed loops
- Solution: keep a history of the evaluated conditions (similar to if-conversion).
- DO I = 1, N
- S1 IF (A(I).LT.B(I)) GOTO 20
- S2 B(I) = B(I) + C(I)
- 20 CONTINUE
- ENDDO
- Convert to
- DO I = 1, N
- S1 e(I) = A(I).LT.B(I)
- ENDDO
- DO I = 1, N
- S2 IF (e(I).EQ..FALSE.) B(I) = B(I) + C(I)
- ENDDO
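The distributed form can be checked against the original loop on sample data. This Python sketch is only an illustration: the function names and data are invented, and `None` is used as an assumed stand-in for the "undefined" state of an execution variable before the first loop has run.

```python
def loop_original(A, B, C):
    B = list(B)
    for i in range(len(A)):
        if A[i] < B[i]:            # S1: GOTO 20
            continue
        B[i] = B[i] + C[i]         # S2
    return B


def loop_distributed(A, B, C):
    B = list(B)
    n = len(A)
    e = [None] * n                 # execution variable, initially undefined
    for i in range(n):
        e[i] = A[i] < B[i]         # S1 records the branch outcome
    for i in range(n):
        if e[i] is False:          # S2 runs only where the branch was not taken
            B[i] = B[i] + C[i]
    return B


A = [1, 5, 3]                      # hypothetical data
B = [2, 2, 3]
C = [10, 10, 10]
print(loop_original(A, B, C), loop_distributed(A, B, C))
```

Recording the outcome in e(I) turns the control dependence between the two loops into a data dependence on the array e.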
45Loop Distribution
- More complex example
- DO I = 1, N
- 1 IF (A(I).NE.0) THEN
- 2 IF (B(I)/A(I).GT.1) GOTO 4
- ENDIF
- 3 A(I) = B(I)
- GOTO 8
- 4 IF (A(I).GT.T) THEN
- 5 T = (B(I) - A(I)) + T
- ELSE
- 6 T = (T + B(I)) + A(I)
- 7 B(I) = A(I)
- ENDIF
- 8 C(I) = B(I) + C(I)
- ENDDO
46Loop Distribution
- Fusion into "like" regions
- Needs two execution variables, E2(I) and E4(I), to hold the results of the branches at statements 2 and 4 respectively
47Loop Distribution
- Consider the branch at node 2
- 3 cases may hold
- Statement 2 is executed and the true branch to statement 4 is taken
- Statement 2 is executed and the false branch to statement 3 is taken
- Statement 2 is never executed because the false branch is taken at statement 1
- Corresponds to the condition for the doit variable to be set
- A control dependence exists from S0 to S.
- S0 has its doit flag set
- Value of the conditional expression is the label on the branch
48Loop Distribution
- Use three corresponding values: True, False, Undefined
- Procedure DistributeCDG implements these ideas. It inserts execution variables at appropriate places in the code and selectively converts control dependences to data dependences
49Code Generation
- Problem: mapping the arbitrary control flow represented in the control dependence graph to real machines
- DO I = 1, N
- S1 IF (p1) GOTO 3
- S2 ...
- GOTO 4
- 3 IF (p3) GOTO 5
- 4 S4
- 5 S5
- ENDDO
- Loop distribution
50Code Generation
- Code generated for first partition
- DO I = 1, N
- E1(I) = p1
- IF (E1(I).EQ..FALSE.) THEN
- S2 ...
- ENDIF
- S5 ...
- ENDDO
- For second partition
- DO I = 1, N
- IF (((E1(I).EQ..TRUE.).AND..NOT.p3).OR.
- (E1(I).EQ..FALSE.)) THEN
- S4 ...
- ENDIF
- ENDDO
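The guards in the two partitions can be compared with the original control flow by recording which statement instances execute. The Python sketch below is illustrative and makes several assumptions: p1 and p3 are given as hypothetical per-iteration truth values, statements are reduced to (name, iteration) pairs, and only the set of executed instances is compared, since distribution changes their order.

```python
def instances_original(p1, p3):
    done = set()
    for i in range(len(p1)):
        if p1[i]:                       # S1: GOTO 3
            if p3[i]:                   # 3: IF (p3) GOTO 5
                done.add(("S5", i))
                continue
        else:
            done.add(("S2", i))         # S2, then GOTO 4
        done.add(("S4", i))             # 4
        done.add(("S5", i))             # 5
    return done


def instances_partitioned(p1, p3):
    n = len(p1)
    done = set()
    E1 = [None] * n                     # execution variable for the branch at S1
    for i in range(n):                  # first partition
        E1[i] = p1[i]
        if E1[i] is False:
            done.add(("S2", i))
        done.add(("S5", i))
    for i in range(n):                  # second partition
        if (E1[i] is True and not p3[i]) or E1[i] is False:
            done.add(("S4", i))
    return done


p1 = [True, True, False, False]         # hypothetical branch outcomes
p3 = [True, False, True, False]
print(instances_original(p1, p3) == instances_partitioned(p1, p3))
```

The four iterations cover every combination of p1 and p3, and S4 is skipped only when both are true.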
51Code Generation
- Observation: generating code for graphs in which every vertex has at most one control dependence predecessor is relatively easy
- Thus, transform the graph into a canonical form consisting of a set of control dependence trees with the following properties
- each statement is control dependent on at most one other statement, i.e., each statement is a member of at most one tree
- the trees can be ordered so that all data dependences between trees flow from trees earlier in the order to trees that are later in the order
52Code Generation
53Code Generation
54Code Generation
- How can the statements be organized into groups of statements that are part of the same conditional statement?
- Statements can be grouped together if there is no dependence path between them that passes through a statement that is not a child of the same conditional node with the same label
- Typed fusion!
- Each statement is typed by (p, l) where
- p = its unique control dependence predecessor
- l = the truth label of the edge from p to the statement
55Code Generation
- Simple recursive procedure
- Generate code for each of the subtrees in an order consistent with the data dependences
- Roughly linear in the size of the original dependence graph