CMPUT680 Winter 2001 - PowerPoint PPT Presentation

CMPUT 680 - Compiler Design and Optimization. http://www.cs.ualberta.ca/~amaral/courses/680

Provided by: josenels
1
CMPUT680 - Winter 2001
  • Register Minimization X Register Saturation
  • José Nelson Amaral
  • http://www.cs.ualberta.ca/~amaral/courses/680

2
Reading List
  • Touati, Sid Ahmed Ali, Register Saturation in
    Superscalar and VLIW Codes, 10th International
    Conference on Compiler Construction, Genova,
    Italy, April 2001, pp. 213-228.
  • Touati, S.-A.-A., Thomasset, F., Register
    Saturation in Data Dependence Graphs, Research
    Report RR-3978, INRIA, July 2000.
  • Touati, S.-A.-A., Optimal Register Saturation in
    Acyclic Superscalar and VLIW Codes, Research
    Report, INRIA, Nov. 2000.

3
Minimum Register Instruction Sequence (MRIS)
Problem
Given the Data Dependence Graph G for a basic
block, derive an instruction sequence S for G
that is optimal in the sense that its register
requirement is minimum.
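To make the objective concrete, the register requirement of a given sequence can be measured by counting live values. The sketch below is illustrative only: `register_need` and `GOVIND_DDG` are hypothetical names, the graph is the nine-node example used later in these slides (edges taken from its potential-killer table), and it assumes that a value dies when its last consumer issues and that this consumer may reuse the freed register.

```python
def register_need(ddg, sequence):
    """Return the peak number of simultaneously live values for `sequence`.

    `ddg` maps each instruction to the instructions that consume its value.
    Assumption: a value dies when its last consumer issues, and that
    consumer may reuse the freed register for its own result.
    """
    position = {n: i for i, n in enumerate(sequence)}
    for u in ddg:
        for v in ddg[u]:
            if position[u] >= position[v]:
                raise ValueError(f"{v} is scheduled before its operand {u}")
    remaining = {u: len(vs) for u, vs in ddg.items()}   # unissued consumers
    producers = {v: [u for u in ddg if v in ddg[u]] for v in ddg}
    live, peak = set(), 0
    for n in sequence:
        for u in producers[n]:
            remaining[u] -= 1
            if remaining[u] == 0:
                live.discard(u)        # last use: the register is released
        if ddg[n]:                     # n defines a value that is read later
            live.add(n)
        peak = max(peak, len(live))
    return peak

# Hypothetical encoding of the running example (node -> consumers).
GOVIND_DDG = {
    "a": ["b", "c", "d", "e"], "b": ["f"], "c": ["f"], "d": ["g"],
    "e": ["g"], "f": ["h"], "g": ["h"], "h": ["i"], "i": [],
}

print(register_need(GOVIND_DDG, "adegcbfhi"))  # 3
print(register_need(GOVIND_DDG, "abcdefghi"))  # 4
```

Under these assumptions the sequence a, d, e, g, c, b, f, h, i needs three registers, while a, b, c, d, e, f, g, h, i needs four.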
4
Intuition for Our Solution
Our intuition is to find subsets of nodes that
can definitely share a register, to inform
the instruction sequencing algorithm.
[Figure: Data Dependence Graph]
5
Instruction Lineages
An instruction lineage is a sequence of
instructions in which a single register is passed
from instruction to instruction (except for the
last).
How can we ensure that instructions a, b, f, and
h will be able to share the same register?
[Figure: Data Dependence Graph]
6
Sequencing Edges
Lineage formation imposes a scheduling
restriction on the DDG: the selected heir of a
node must be the last node listed among
its siblings.
[Figure: Augmented Data Dependence Graph]
7
Node Height
L1 = [a, b, f, h, i)
If the introduction of sequencing edges were to
produce a cycle in the DDG, it would be
impossible to find a legal instruction sequence.
Thus we use the height of the nodes, recomputed
after each lineage formation, to select the
heir. Ties are broken arbitrarily.
[Figure: Augmented Data Dependence Graph]
8
Lineage Formation
L1 = [a, b, f, h, i)
For the next lineage, the highest nodes not in a
lineage are c, d, and e, all with a height of 5.
[Figure: Augmented Data Dependence Graph]
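The lineage formation walked through so far can be sketched in Python. This is a minimal sketch, not the authors' implementation: it assumes the heir is the consumer of lowest height (the slides only say that heights are used to select the heir), breaks ties alphabetically, and recomputes heights before every heir selection. The names `heights`, `form_lineages` and `DDG` are hypothetical; the edges are those of the nine-node example, as listed in its potential-killer table later in the slides.

```python
def heights(succ):
    """Height of a node = number of nodes on its longest path to a sink."""
    memo = {}

    def h(n):
        if n not in memo:
            memo[n] = 1 + max((h(s) for s in succ[n]), default=0)
        return memo[n]

    return {n: h(n) for n in succ}


def form_lineages(ddg):
    """Return (lineages, augmented_ddg) for a DDG given as node -> consumers."""
    aug = {n: set(cs) for n, cs in ddg.items()}   # DDG plus sequencing edges
    owned = set()        # nodes whose value already travels in a lineage
    lineages = []
    while True:
        free = [n for n in ddg if ddg[n] and n not in owned]
        if not free:
            return lineages, aug
        ht = heights(aug)
        node = min(free, key=lambda n: (-ht[n], n))   # highest free node
        lineage = [node]
        while True:
            owned.add(node)
            ht = heights(aug)
            # Assumption: the heir is the consumer with the lowest height.
            heir = min(ddg[node], key=lambda c: (ht[c], c))
            for sibling in ddg[node]:
                if sibling != heir:
                    aug[sibling].add(heir)   # sequencing edge: heir goes last
            lineage.append(heir)
            if heir in owned or not ddg[heir]:
                break        # the last node closes the half-open lineage
            node = heir
        lineages.append(lineage)


DDG = {
    "a": ["b", "c", "d", "e"], "b": ["f"], "c": ["f"], "d": ["g"],
    "e": ["g"], "f": ["h"], "g": ["h"], "h": ["i"], "i": [],
}
lineages, augmented = form_lineages(DDG)
print(lineages[0])   # ['a', 'b', 'f', 'h', 'i']
```

With this tie-breaking the first lineage is a, b, f, h, i as on the slide. Later lineages may pair d and e with g and h in the opposite order from the slides, since c, d and e tie on height.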
9
Lineage Interference
L1 = [a, b, f, h, i)
L2 = [c, f)
L3 = [e, g, h)
L4 = [d, g)
Two lineages Lu = [u1, u2, ..., um) and Lv = [v1,
v2, ..., vn) definitely overlap if (i) u1
reaches vn, and (ii) v1 reaches um.
[Figure: Augmented Data Dependence Graph]
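The overlap test can be sketched directly from the reach relation. The code below is an illustration under stated assumptions: `reaches`, `definitely_overlap`, `AUGMENTED` and `LINEAGES` are hypothetical names, and the augmented DDG is taken to be the example DDG plus the sequencing edges c, d, e to b that result from choosing b as the heir of a.

```python
from itertools import combinations


def reaches(succ, src, dst):
    """True if there is a path from src to dst (depth-first search)."""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ[n])
    return False


def definitely_overlap(succ, lu, lv):
    """Lu and Lv overlap if u1 reaches vn and v1 reaches um."""
    return reaches(succ, lu[0], lv[-1]) and reaches(succ, lv[0], lu[-1])


# Each string lists the successors of a node in the augmented DDG.
AUGMENTED = {
    "a": "bcde", "b": "f", "c": "fb", "d": "gb", "e": "gb",
    "f": "h", "g": "h", "h": "i", "i": "",
}
# Each string lists the nodes of a lineage, last (excluded) node included.
LINEAGES = {"L1": "abfhi", "L2": "cf", "L3": "egh", "L4": "dg"}

lig_edges = {
    frozenset((p, q))
    for p, q in combinations(LINEAGES, 2)
    if definitely_overlap(AUGMENTED, LINEAGES[p], LINEAGES[q])
}
print(sorted(sorted(e) for e in lig_edges))
```

In this encoding every pair of lineages interferes except L2 and L4, because c cannot reach g.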
10
Lineage Interference Graph
L1 = [a, b, f, h, i)
L2 = [c, f)
L3 = [e, g, h)
L4 = [d, g)
Which lineages does lineage L1 definitely overlap
with?
How about lineages L2 and L4?
[Figure: Augmented Data Dependence Graph]
[Figure: Lineage Interference Graph with nodes L1, L2, L3, L4]
11
Lineage Fusion Condition
L1 = [a, b, f, h, i)
L2 = [c, f)
L3 = [e, g, h)
L4 = [d, g)
Two lineages Lu = [u1, u2, ..., um) and Lv = [v1,
v2, ..., vn) can be fused into a single lineage
if (i) u1 reaches vn, and (ii) v1 does not
reach um.
[Figure: Augmented Data Dependence Graph and Lineage Interference Graph with nodes L1, L2, L3, L4]
12
Lineage Fusion Condition
L1 = [a, b, f, h, i)
L2 = [c, f)
L3 = [e, g, h)
L4 = [d, g)
Which lineages can be fused in the example?
d reaches f, and c does not reach g.
Thus L4 can be fused with L2 to form L5 = [d, g) ∪ [c, f).
[Figure: Augmented Data Dependence Graph and Lineage Interference Graph with nodes L1, L2, L3, L4]
13
Lineage Fusion
L1 = [a, b, f, h, i)
L2 = [c, f)
L3 = [e, g, h)
L4 = [d, g)
When Lu = [u1, u2, ..., um) and Lv = [v1, v2, ...,
vn) are fused: (1) a scheduling edge from um to
v1 is introduced in the augmented DDG; (2)
Lu and Lv are removed from the LIG; (3) a new
lineage Lw = Lu ∪ Lv is inserted in the LIG.
[Figure: Augmented Data Dependence Graph and Lineage Interference Graph with nodes L1, L2, L3, L4]
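The fusion condition and step (1) of the fusion can be sketched as follows. This is only a sketch: `can_fuse`, `fuse` and `AUG` are hypothetical names, the graph is the example's augmented DDG with the sequencing edges c, d, e to b, and the LIG bookkeeping of steps (2) and (3) is left out.

```python
def reaches(succ, src, dst):
    """True if there is a path from src to dst (depth-first search)."""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ[n])
    return False


def can_fuse(succ, lu, lv):
    """Fusion condition: u1 reaches vn, and v1 does not reach um."""
    return reaches(succ, lu[0], lv[-1]) and not reaches(succ, lv[0], lu[-1])


def fuse(succ, lu, lv):
    """Add the scheduling edge um -> v1 and return the fused lineage."""
    if not can_fuse(succ, lu, lv):
        raise ValueError("fusion condition does not hold")
    succ[lu[-1]].add(lv[0])     # step (1): sequencing edge from um to v1
    return [lu, lv]             # fused lineage, kept as its two parts


AUG = {
    "a": {"b", "c", "d", "e"}, "b": {"f"}, "c": {"f", "b"},
    "d": {"g", "b"}, "e": {"g", "b"}, "f": {"h"}, "g": {"h"},
    "h": {"i"}, "i": set(),
}
L2, L4 = ["c", "f"], ["d", "g"]
L5 = fuse(AUG, L4, L2)
print(L5, "c" in AUG["g"])   # [['d', 'g'], ['c', 'f']] True
```

Because v1 does not reach um before the fusion, the new edge from um to v1 cannot close a cycle in the augmented DDG.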
14
Lineage Fusion Condition
L1 = [a, b, f, h, i)
L3 = [e, g, h)
L5 = [d, g) ∪ [c, f)
Thus the fusion of L4 with L2 forms L5 = [d, g) ∪
[c, f).
How many colors do we need to color the LIG?
[Figure: Augmented Data Dependence Graph and Lineage Interference Graph with nodes L1, L3, L5]
15
Lineage Fusion Condition
L1 = [a, b, f, h, i)
L3 = [e, g, h)
L5 = [d, g) ∪ [c, f)
We need three colors.
Can we find an instruction sequence?
[Figure: Augmented Data Dependence Graph and Lineage Interference Graph with nodes L1, L3, L5]
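Counting colors on the LIG can be illustrated with a simple greedy coloring. This is a swapped-in heuristic, not necessarily how the HRB is derived in the original work; `greedy_coloring` and `LIG` are hypothetical names, and the graph assumes that L1, L3 and L5 all interfere with each other, as the three-color answer on the slide implies.

```python
def greedy_coloring(graph):
    """Color nodes in descending-degree order with the smallest free color."""
    colors = {}
    for node in sorted(graph, key=lambda n: -len(graph[n])):
        used = {colors[m] for m in graph[node] if m in colors}
        colors[node] = next(c for c in range(len(graph)) if c not in used)
    return colors


# Lineage interference graph after fusing L4 and L2 into L5.
LIG = {
    "L1": {"L3", "L5"},
    "L3": {"L1", "L5"},
    "L5": {"L1", "L3"},
}
colors = greedy_coloring(LIG)
registers = {lineage: "R" + "ABC"[c] for lineage, c in colors.items()}
print(registers)   # {'L1': 'RA', 'L3': 'RB', 'L5': 'RC'}
```

Greedy coloring gives an upper bound on the number of colors in general; for a triangle such as this one the bound is exact.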
16
Sequencing by List Scheduling
Lineages: L1 = [a, b, f, h, i), L3 = [e, g, h), L5 = [d, g) ∪ [c, f)
Registers: RA, RB, RC
[Figure: Augmented Data Dependence Graph and Lineage Interference Graph with nodes L1, L3, L5]
Sequence: (none yet)
17
Sequencing by List Scheduling
Sequence: a
18
Sequencing by List Scheduling
Sequence: a, d
19
Sequencing by List Scheduling
Sequence: a, d, e
20
Sequencing by List Scheduling
Sequence: a, d, e, g
21
Sequencing by List Scheduling
Sequence: a, d, e, g, c
22
Sequencing by List Scheduling
Sequence: a, d, e, g, c, b
23
Sequencing by List Scheduling
Sequence: a, d, e, g, c, b, f
24
Sequencing by List Scheduling
Sequence: a, d, e, g, c, b, f, h
25
Sequencing by List Scheduling
Sequence: a, d, e, g, c, b, f, h, i
26
Summary of Our Solution Method
  • A good construction algorithm for LIG (dynamic)
  • An effective heuristic method to calculate the
    HRB
  • An efficient scheduling method (do not backtrack)

DDG → Form Lineage Interference Graph (LIG) → Derive HRB → Extended list-scheduling guided by HRB → A good instruction sequence
27
Register Saturation (Touati)
Given a data dependence graph G, the register
saturation (RS) of G is the maximal register
need over all schedules of G.
Touati's strategy is to compute the RS of G
and, if RS exceeds the number of available
registers, to reduce the RS by introducing new
arcs in G.
The intuition is that by using either (1) all
available registers or (2) the maximal number of
registers that G can use, instruction-level
parallelism is maximized.
28
The HRB and the RS
Govind, Gao, Yang, Amaral, and Zhang had
earlier proposed an alternative method to find
a heuristic register bound (HRB) to be used as
guidance in a modified list scheduling. Their
goal is to find a schedule that uses a minimum
number of registers.
To compare both methods we will apply
Touati's method to Govind et al.'s example, and
Govind's method to Touati's example.
29
Potential Killers
To find RS(G), we need to know which
operation must kill each value generated.
Touati defines the set of operations that are
potential killers of the value generated by an
operation u ∈ G.
Thus a node v is a potential killer of the value
generated by a node u if and only if v consumes
u and no descendant of v consumes u.
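The definition translates almost literally into code. In this sketch `descendants` and `pkill` are hypothetical names, every DDG edge is assumed to be a flow (consumer) edge, and the three-node `CHAIN` graph is an invented illustration of a consumer that is not a potential killer.

```python
def descendants(succ, n):
    """All nodes reachable from n by a path of at least one edge."""
    seen, stack = set(), list(succ[n])
    while stack:
        m = stack.pop()
        if m not in seen:
            seen.add(m)
            stack.extend(succ[m])
    return seen


def pkill(succ, u):
    """Consumers of u none of whose descendants also consume u."""
    consumers = set(succ[u])
    return {v for v in consumers if not (descendants(succ, v) & consumers)}


# The nine-node example DDG (node -> consumers).
DDG = {
    "a": ["b", "c", "d", "e"], "b": ["f"], "c": ["f"], "d": ["g"],
    "e": ["g"], "f": ["h"], "g": ["h"], "h": ["i"], "i": [],
}
# Invented graph: v and w both consume u, and w is a descendant of v.
CHAIN = {"u": ["v", "w"], "v": ["w"], "w": []}

print(sorted(pkill(DDG, "a")))   # ['b', 'c', 'd', 'e']
print(pkill(CHAIN, "u"))         # {'w'}
```

In `CHAIN`, v cannot be the last reader of u because its descendant w reads u later in every schedule.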
30
Potential Killing Graph
The edges of the Potential Killing Graph of a DDG
G, PK(G) = (V, E_PK), are defined as follows:
$E_{PK} = \{(u, v) \mid u \in V_R \wedge v \in pkill_G(u)\}$
V_R is the set of operations that define a
value, i.e., operations that need a register.
Govind's Example: Data Dependence Graph
(a) t1 = ld(x)
(b) t2 = t1 … 4
(c) t3 = t1 … 8
(d) t4 = t1 - 4
(e) t5 = t1 / 2
(f) t6 = t2 … t3
(g) t7 = t4 - t5
(h) t8 = t6 … t7
(i) st(y, t8)
[Figure: DDG G]
32
Govind's Example: Potential Kill Graph
pkillG(a) = {b, c, d, e}
pkillG(b) = {f}
pkillG(c) = {f}
pkillG(d) = {g}
pkillG(e) = {g}
pkillG(f) = {h}
pkillG(g) = {h}
pkillG(h) = {i}
[Figure: DDG G]
33
Govind's Example: Potential Kill Graph
[Figure: DDG G and PK(G)]
In this example the DDG G and the potential
kill graph PK(G) are identical. In general that
is not the case.
34
Choosing the Killer
If a node u has more than one potential killer,
Touati defines a killing function, k(u), that
specifies which one among the potential killers
of u will actually kill u.
A killing function imposes a scheduling order on
the DDG: all other consumers of u, Cons(u), must
be scheduled before k(u) is scheduled.
To represent these scheduling constraints, Touati
defines an extended DAG, G→k, induced by the
killing function k.
35
Govind's Example: Killing Function
In this example, node a is the only node with
multiple potential killers.
pkillG(a) = {b, c, d, e}
pkillG(b) = {f}
pkillG(c) = {f}
pkillG(d) = {g}
pkillG(e) = {g}
pkillG(f) = {h}
pkillG(g) = {h}
pkillG(h) = {i}
[Figure: PK(G)]
36
Govind's Example: Killing Function
If we choose k(a) = b, we obtain the G→k shown in
the figure.
pkillG(a) = {b, c, d, e}
pkillG(b) = {f}
pkillG(c) = {f}
pkillG(d) = {g}
pkillG(e) = {g}
pkillG(f) = {h}
pkillG(g) = {h}
pkillG(h) = {i}
[Figure: G→k]
37
Selecting a Good Set of Killers...
If the killing function for multiple nodes with
multiple potential killers is chosen
arbitrarily, it might induce cycles in G→k.
A valid killing function is one that does
not induce cycles in G→k.
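Validity of a killing function can be checked by building the extended graph and testing it for cycles. The sketch below is illustrative: `extend_with_killers`, `is_acyclic` and the four-node `SHARED` graph are hypothetical, and the extended graph is assumed to be the DDG plus an arc from every other consumer of u to k(u).

```python
def extend_with_killers(ddg, k):
    """Return the DDG plus arcs from every other consumer of u to k(u)."""
    ext = {n: set(cs) for n, cs in ddg.items()}
    for u, killer in k.items():
        for consumer in ddg[u]:
            if consumer != killer:
                ext[consumer].add(killer)
    return ext


def is_acyclic(succ):
    """Kahn's algorithm: a graph is a DAG iff every node can be removed."""
    indegree = {n: 0 for n in succ}
    for n in succ:
        for m in succ[n]:
            indegree[m] += 1
    ready = [n for n in succ if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    return removed == len(succ)


# Invented example: u and w are both consumed by x and y.
SHARED = {"u": ["x", "y"], "w": ["x", "y"], "x": [], "y": []}

bad = extend_with_killers(SHARED, {"u": "x", "w": "y"})   # y->x and x->y
good = extend_with_killers(SHARED, {"u": "x", "w": "x"})  # only y->x
print(is_acyclic(bad), is_acyclic(good))   # False True
```

Choosing x to kill u and y to kill w forces each of x and y to run after the other, so that killing function is not valid.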
38
Avoiding Vengeance...
A killer must kill before it has children, thus...
An edge (u,v) in DVk(G) means that the live
interval of u is always before the live interval
of v in any schedule of G→k.
39
Govind's Example: Disjoint Value Graph
k(a) = b, k(b) = f, k(c) = f, k(d) = g,
k(e) = g, k(f) = h, k(g) = h, k(h) = i
[Figure: G→k and DVk(G), simplified by transitive reduction]
40
Register Need and Maximal Antichains
The register need of any schedule of G→k is
always less than or equal to the size of a maximal
antichain in DVk(G).
Here Ec is the transitive closure of G: (u, v) ∈ Ec
iff there exists a path p = (u, ..., v) in G.
41
Govind's Example: Maximal Antichain
The maximal antichain in this example is
AMk = {a, c, d, e}
Thus this graph, with this killing function, can
use at most 4 registers.
[Figure: DVk(G)]
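The antichain for this example can be reproduced by brute force. This sketch rests on an assumption that the slides do not spell out: DVk(G) is taken to contain an edge (u, v) whenever k(u) is v or reaches v in G→k, so that u is always dead before v is defined. The names `disjoint_value_edges`, `maximal_antichain`, `EXT` and `K` are hypothetical, and the exhaustive search only suits small graphs.

```python
from itertools import combinations


def reaches(succ, src, dst):
    """True if src equals dst or there is a path from src to dst."""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ[n])
    return False


def disjoint_value_edges(ext, k, values):
    """Assumed rule: (u, v) is an edge when k(u) is v or reaches v."""
    return {(u, v) for u in values for v in values
            if u != v and reaches(ext, k[u], v)}


def maximal_antichain(values, edges):
    """Largest set of pairwise unordered values (exhaustive search)."""
    for size in range(len(values), 0, -1):
        for subset in combinations(values, size):
            if not any((u, v) in edges or (v, u) in edges
                       for u, v in combinations(subset, 2)):
                return set(subset)
    return set()


# G->k for k(a) = b: the example DDG plus arcs c->b, d->b, e->b.
EXT = {"a": "bcde", "b": "f", "c": "fb", "d": "gb", "e": "gb",
       "f": "h", "g": "h", "h": "i", "i": ""}
K = {"a": "b", "b": "f", "c": "f", "d": "g",
     "e": "g", "f": "h", "g": "h", "h": "i"}
VALUES = "abcdefgh"          # i is a store and defines no value

edges = disjoint_value_edges(EXT, K, VALUES)
print(sorted(maximal_antichain(VALUES, edges)))   # ['a', 'c', 'd', 'e']
```

Under the assumed edge rule the relation is transitive, so checking pairs is enough to recognise an antichain.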
42
Register Saturating Scheduling
Touati proves that, for every valid killing
function k, there is always a
schedule that makes all the values in
the maximal antichain of the disjoint
value DAG DVk(G) simultaneously alive.
43
Saturating Killing Function
To find the register saturation of a DDG, we need
to find a killing function that maximizes the
maximal antichain in DVk(G).
In other words, we need to find a killing
function that maximizes the number of nodes that
are not connected by a path in DVk(G).
Touati calls this the maximizing maximal
antichain (MMA) problem. A solution to the MMA
problem is a saturating killing function. MMA is
NP-complete.
44
Heuristic to Compute Register Saturation
To compute the register saturation, Touati
starts by decomposing the potential kill graph
PK(G) into connected bipartite components.
A bipartite component, cb = (Scb, Tcb, Ecb), is a
graph with a set of source nodes Scb, a set of
target nodes Tcb, and a set of edges Ecb. cb must
obey the following conditions.
If e ∈ EPK, e' ∈ Ecb, and e and e' share an endpoint,
then e ∈ Ecb.
45
Bipartite Decomposition of PK(G)
A bipartite decomposition of the potential
killing graph PK(G) is a set of bipartite
components such that for every edge e ∈ PK(G),
there is a bipartite component cb in the
decomposition such that e ∈ Ecb.
Touati proves that given a DDG G, there is only
one bipartite decomposition of G.
46
Govind's Example: Bipartite Decomposition
[Figure: PK(G) and its Bipartite Decomposition]
47
Saturating Killing Set
Touati defines the Saturating Killing Set of a
connected bipartite component cb, SKS(cb), as a
subset of the target nodes, T'cb ⊆ Tcb, such
that (1) all the source nodes, Scb, are
contained in the union of all predecessors
of the nodes in T'cb; (2) T'cb contains a
minimum number of nodes. Computing the SKS is an
NP-complete problem.
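Read this way, the SKS is a minimum cover of the source nodes, which can be found exhaustively for small components. The sketch below is not Touati's heuristic: `saturating_killing_set` is a hypothetical name, the search is exponential by design, and the second component (p, q, r, x, y, z) is an invented illustration.

```python
from itertools import combinations


def saturating_killing_set(sources, preds):
    """Smallest set of targets whose predecessors cover all sources.

    `preds` maps each target node to the set of its predecessors inside
    the bipartite component. Exhaustive search, small components only.
    """
    targets = sorted(preds)
    for size in range(1, len(targets) + 1):
        for subset in combinations(targets, size):
            covered = set().union(*(preds[t] for t in subset))
            if set(sources) <= covered:
                return set(subset)
    return None


# Top component of the nine-node example: source a, targets b, c, d, e.
top = saturating_killing_set({"a"}, {t: {"a"} for t in "bcde"})
print(top)   # {'b'}

# Invented component where two targets are needed.
other = saturating_killing_set(
    {"p", "q", "r"}, {"x": {"p", "q"}, "y": {"q", "r"}, "z": {"r"}})
print(sorted(other))   # ['x', 'y']
```

Any single target covers the only source in the first component, so the alphabetical choice of b is arbitrary.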
48
Govind's Example: Saturating Killing Set
In this example the computation of SKS is
trivial. The only component with a
non-unitary target set is the top one.
The selection of any single node in the set Tcb =
{b, c, d, e} covers the set Scb = {a}. Thus the
selection can be arbitrary.
[Figure: Bipartite Decomposition]
49
Govind's Example
As we saw earlier, with k(a) = b the
register saturation in Govind's example is 4, and
a schedule that has four values alive at the same
time can be found.
Using the lineage method, Govind et al. found
a schedule for their example that uses three
registers. What does Touati's method do if only
three registers are available?
50
Reducing RS
Touati proposes an algorithm to reduce the
register saturation while trying not to increase
the length of the critical path.
The algorithm starts by computing the maximal
antichain AMk. Then it starts an iterative
process in which the first step is to construct
the set Uk of all admissible serializations
between the saturating values in AMk, with their
costs.
51
Admissible Serializations
A serialization u → v means that the kill of
u must always be carried out before the
definition of v.
If v is one of the potential killers of u, then
to produce the serialization u → v we must add
arcs from all other potential killers of u to v.
This way we ensure that the live ranges of u and
v will not overlap.
If v is not a potential killer of u, then to
produce the serialization u → v we must add
arcs from all nodes u' ∈ pkillG(u) to v, as long
as there is no path from v to u'.
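Both rules add arcs from the potential killers of u (other than v itself) to v, which the following sketch makes explicit. It is illustrative only: `is_admissible`, `serialization_arcs`, `EXT` and `PKILL` are hypothetical names, the graph is the nine-node example's G→k for k(a) = b, and admissibility is taken to mean that v has no path to any potential killer of u.

```python
def reaches(succ, src, dst):
    """True if src equals dst or there is a path from src to dst."""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(succ[n])
    return False


def is_admissible(ext, pkill, u, v):
    """u -> v is admissible when v reaches no other potential killer of u."""
    return not any(killer != v and reaches(ext, v, killer)
                   for killer in pkill[u])


def serialization_arcs(pkill, u, v):
    """Arcs that force the kill of u to precede the definition of v."""
    return {(killer, v) for killer in pkill[u] if killer != v}


# G->k for k(a) = b: the example DDG plus arcs c->b, d->b, e->b.
EXT = {"a": "bcde", "b": "f", "c": "fb", "d": "gb", "e": "gb",
       "f": "h", "g": "h", "h": "i", "i": ""}
PKILL = {"a": "bcde", "b": "f", "c": "f", "d": "g",
         "e": "g", "f": "h", "g": "h", "h": "i"}

saturating = "acde"
admissible = {(u, v) for u in saturating for v in saturating
              if u != v and is_admissible(EXT, PKILL, u, v)}
print(sorted(admissible))                     # [('d', 'c'), ('e', 'c')]
print(serialization_arcs(PKILL, "d", "c"))    # {('g', 'c')}
```

For the saturating values a, c, d, e only the serializations d → c and e → c pass the test, and both add the single arc (g, c).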
52
Cost of Serializations
The cost function of a serialization is defined as
$\omega(u \to v) = (\omega_1, \omega_2)$
$\omega_1$ predicts the reduction in the saturation
value produced by the serialization; it is
computed by
$\omega_1 = \mu_1 - \mu_2$
$\mu_1$ is the number of saturating values serialized
after u if this serialization is carried out.
$\mu_2$ is the number of descendants of u that can
become simultaneously alive with u.
$\omega_2$ is the increase in the critical path.
53
Govind's Example: Reducing RS
With the killing function k(a) = b, the
saturating values are
AMk = {a, c, d, e}
pkillG(a) = {b, c, d, e}
For a serialization u → v to be admissible, the
following condition must be true: ∀v' ∈ pkill(u),
¬(v < v'), i.e., there are no paths from v
to any potential killer of u.
[Figure: G→k]
54
Govind's Example: Reducing RS
With the killing function k(a) = b, the
saturating values are
AMk = {a, c, d, e}
pkillG(a) = {b, c, d, e}
Thus, there is no admissible serialization from a
to any of the other saturating values, because b
∈ pkillG(a) and there are paths from c, d, and e
to b in G→k.
[Figure: G→k]
55
Govind's Example: Reducing RS
With the killing function k(a) = b, the
saturating values are
AMk = {a, c, d, e}
pkillG(a) = {b, c, d, e}
c → d and c → e are not admissible
serializations either, because f ∈ pkillG(c) and
d < f, e < f.
[Figure: G→k]
56
Govind's Example: Reducing RS
With the killing function k(a) = b, the
saturating values are
AMk = {a, c, d, e}
pkillG(a) = {b, c, d, e}
d → e is not admissible because g ∈ pkillG(d)
and e < g; e → d is not admissible because g ∈
pkillG(e) and d < g.
[Figure: G→k]
57
Govind's Example: Reducing RS
With the killing function k(a) = b, the
saturating values are
AMk = {a, c, d, e}
pkillG(a) = {b, c, d, e}
Thus the admissible serializations in this
example are d → c and e → c.
[Figure: G→k]
58
Govind's Example: Reducing RS
In this example both serializations will cause
the scheduling edge (g, c) to be added to the
graph. Thus their costs are equivalent.
Note that, for this example, reducing RS is
equivalent to the lineage fusion technique in
Govind et al.'s approach.
[Figure: G→k]
59
Govind's Algorithm in Touati's Example
Now we will apply the lineage based
method proposed by Govind et al. to the DDG
presented by Touati.
In the next slide we transcribe the code and
the DDG as presented by Touati.
60
A Trivial Example
[Figure: DDG and PKG with nodes x, y, z, k, t]
61
A Trivial Example (cont.)
There are no choices to be made, as each node has
only one potential killer.
[Figure: DDG and PKG with nodes x, y, z, k, t]
62
A Trivial Example (cont.)
The DV graph is identical to the PKG in this
case, and the solution is trivial: the maximal
antichain in the DV graph is {x, y, z}.
[Figure: DDG and DV graph with nodes x, y, z, k, t]
63
A Non-Trivial Example
pkillG(a) = {f}
pkillG(b) = {d, e}
pkillG(c) = {d, e}
pkillG(d) = {g}
pkillG(e) = {f}
pkillG(f) = {g}
[Figure: DDG]
64
A Non-Trivial Example
DVk1: k1 = {(b, d), (c, d)}
DVk2: k2 = {(b, d), (c, e)}
DVk3: k3 = {(b, e), (c, d)}
DVk4: k4 = {(b, e), (c, e)}
[Figure: the four DV graphs DVk1 to DVk4]
65
A Non-Trivial Example
DVk1: k1 = {(b, d), (c, d)}
DVk2: k2 = {(b, d), (c, e)}
DVk3: k3 = {(b, e), (c, d)}
DVk4: k4 = {(b, e), (c, e)}
[Figure: the four DV graphs DVk1 to DVk4 over nodes a to g]
66
There are eight killing functions (DV Graphs)
k = {(a, b), (b, d), (c, d)}
k = {(a, b), (b, d), (c, e)}
k = {(a, b), (b, e), (c, d)}
k = {(a, b), (b, e), (c, e)}
[Figure: the eight DV graphs]
67
Maximal antichains
[Figure: the eight DV graphs over nodes a to f, with their maximal antichains]
68
A More Non-Trivial Example
pkillG(a) = {b, c, g}
pkillG(b) = {d, e}
pkillG(c) = {e, j, k}
pkillG(d) = {f}
pkillG(e) = {m}
pkillG(f) = {n}
pkillG(g) = {d, j, k}
pkillG(j) = {f}
pkillG(k) = {m}
[Figure: DDG]
There are 323318 killing functions
69
Govind's Algorithm in Touati's Example
(a) fload i1, fRa
(b) fload i2, fRb
(c) fload i3, fRc
(d) fmult fRa, fRb, fRd
(e) imultadd fRa, fRb, fRc, iRe
(g) ftoint fRc, iRg
(i) iadd iRg, 4, iRi
(f) fmultadd_setz fRb, iRi, fRc, fRf, gf
(h) fdiv fRd, iRe, fRh
(j) gf ? fadd_setbnz fRj, 1 , fRj, gj
(k) gf gj ? fsub fRk, 1 , fRk
[Figure: DDG with source and sink nodes; edges labeled fRc, iRg, iRi, gf, iRe, fRd, gj]
Touati concentrates on the blue edges, which
represent the flow of floating-point values.
70
Govind's Algorithm in Touati's Example
We will also concentrate on the floating-point
value flow. The simplified DDG is shown in
the figure.
Although the modified list scheduling requires a
source and a sink node, the lineage formation
process does not consider the source and the sink
nodes.
[Figure: simplified DDG]
71
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
[Figure: DDG annotated with node heights]
72
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
[Figure: DDG annotated with node heights]
73
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
Step 3: Second lineage formation
L2 = [b, f)
[Figure: DDG annotated with node heights]
74
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
Step 3: Second lineage formation
L2 = [b, f)
Recompute heights
[Figure: DDG annotated with recomputed node heights]
75
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
Step 3: Second lineage formation
L2 = [b, f)
Recompute heights
Step 4: Third lineage formation
L3 = [c, f)
[Figure: DDG annotated with recomputed node heights]
76
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
Step 3: Second lineage formation
L2 = [b, f)
Recompute heights
Step 4: Third lineage formation
L3 = [c, f)
Recompute heights
[Figure: DDG annotated with recomputed node heights]
77
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
Step 3: Second lineage formation
L2 = [b, f)
Recompute heights
Step 4: Third lineage formation
L3 = [c, f)
Recompute heights
Step 5: Fourth lineage formation
L4 = [d, h)
[Figure: DDG annotated with recomputed node heights]
78
Govind's Algorithm in Touati's Example
Step 1: Compute the heights
Step 2: First lineage formation
L1 = [a, e)
Step 3: Second lineage formation
L2 = [b, f)
Recompute heights
Step 4: Third lineage formation
L3 = [c, f)
Recompute heights
Step 5: Fourth lineage formation
L4 = [d, h)
[Figure: DDG annotated with recomputed node heights]
79
Govind's Algorithm in Touati's Example
Lineage source nodes: a, b, c, d
Lineage end nodes: e, f, h
[Figure: DDG annotated with node heights, and the Reach Relation matrix]
80
Govind's Algorithm in Touati's Example
Because d can reach f, but c cannot reach h, we
can fuse lineages L4 and L3 to create a new
lineage L5 = [d, h) ∪ [c, f). This fusion requires a
sequencing edge from h to c.
[Figure: DDG annotated with node heights, and the Reach Relation matrix]
81
Govind's Algorithm in Touati's Example
L1 = [a, e)
L2 = [b, f)
L5 = [d, h) ∪ [c, f)
Because there are no more 0s in the Reach
relation matrix, no further lineage fusion is
possible.
[Figure: DDG annotated with node heights, and the Reach Relation matrix]
82
Govind's Algorithm in Touati's Example
L1 = [a, e)
L2 = [b, f)
L5 = [d, h) ∪ [c, f)
We need three colors: L1 → RA, L2 → RB, L5 → RC
[Figure: DDG annotated with node heights, the Reach Relation matrix, and the Lineage Interference Graph with nodes L1, L2, L5]
83
Govind's Algorithm in Touati's Example
Registers: RA, RB, RC
[Figure: DDG with source and sink nodes and nodes a to k]
Sequence: (none yet)
84
Govind's Algorithm in Touati's Example
Sequence: a
85
Govind's Algorithm in Touati's Example
Sequence: a, b
86
Govind's Algorithm in Touati's Example
Sequence: a, b, d
87
Govind's Algorithm in Touati's Example
Sequence: a, b, d, h
88
Govind's Algorithm in Touati's Example
Sequence: a, b, d, h, c
89
Govind's Algorithm in Touati's Example
Sequence: a, b, d, h, c, e
90
Govind's Algorithm in Touati's Example
Sequence: a, b, d, h, c, e, g
91
Govind's Algorithm in Touati's Example
Sequence: a, b, d, h, c, e, g, f
92
Comparing the Methods
Touati's method allows the creation of
schedules that use from 7 down to 3 registers (in his
CC2001 paper he reduced from 7 to 4) according to
the number of registers available for the basic
block.
Govind et al.'s method will always create a
schedule using three registers for this basic
block, regardless of the number of registers
available for the basic block.
93
Conjecture
If the scheduler in an out-of-order instruction
issue processor is optimal and the register
renaming has an infinite number of hidden
registers, both methods should be equivalent, and
the lineage-based one is simpler.
With a limited number of hidden registers for
renaming, and a sub-optimal runtime scheduler,
Touati's method is likely to produce better
results because it makes better use of the
available registers.
94
Research Questions
How well do the two methods compare in an actual
superscalar processor such as the MIPS R12K?
Touati claims that his method will work well in
VLIW machines too. How would it compare with the
lineage method in the IA-64?
The allocation of registers to basic blocks by the
global register scheduler might affect Touati's
method significantly. How can his LRA be
integrated with a GRA?
95
Summary of Our Solution Method
  • A good construction algorithm for LIG (dynamic)
  • An effective heuristic method to calculate the
    HRB
  • An efficient scheduling method (do not backtrack)

DDG → Form Lineage Interference Graph (LIG) → Derive HRB → Extended list-scheduling guided by HRB → A good instruction sequence