A Retargetable Preprocessor for Multimedia Instructions (work in progress), INRIA, F. Bodin, G. Pokam, J. Simonnet

1
A Retargetable Preprocessor for Multimedia Instructions (work in progress)
INRIA
F. Bodin, G. Pokam, J. Simonnet
Partially supported by ST Microelectronics
2
Multimedia Instructions
  • Instruction set extension to achieve high
    performance
  • many different ones
  • crucial for embedded systems
  • difficult to use
  • Retargetability is an issue

3
Multimedia Instructions
  • Exploit sub-word parallelism
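
Sub-word parallelism can be illustrated in portable C, without any multimedia instruction. The sketch below treats a 32-bit word as four independent 8-bit lanes and adds them lane by lane. The function name add_4x8 is hypothetical and not taken from the slides; a real multimedia instruction would do the same work in a single operation.

```c
#include <stdint.h>

/* Add four unsigned 8-bit lanes packed in a 32-bit word. Each lane wraps
   modulo 256 and no carry crosses a lane boundary. */
static uint32_t add_4x8(uint32_t a, uint32_t b)
{
    const uint32_t low7 = 0x7F7F7F7Fu;      /* low 7 bits of every lane */
    const uint32_t top  = 0x80808080u;      /* top bit of every lane    */
    uint32_t sum = (a & low7) + (b & low7); /* carries stay inside each lane */
    return sum ^ ((a ^ b) & top);           /* fold the top bits back in     */
}
```

For example, add_4x8(0x01FF7F10, 0x0101010F) yields 0x0200801F, whereas a plain 32-bit addition gives 0x0300801F because the carry out of the second lane leaks into the first.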

4
An Example (Trimedia)
Scalar version:

    char *back, *forward, *idct, *destination;
    for (i = 0; i < 64; i += 1)
        destination[i] = ((back[i] + forward[i] + 1) >> 1) + idct[i];

Version using Trimedia custom operations:

    int *i_back = (int *) back;
    int *i_forward = (int *) forward;
    int *i_idct = (int *) idct;
    int *i_dest = (int *) destination;
    for (i = 0; i < 16; i += 1) {
        temp = QUADAVG(i_back[i], i_forward[i]);
        i_dest[i] = DSPUQUADADDUI(temp, i_idct[i]);
    }
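
The step from 64 byte-sized iterations to 16 word-sized ones can be checked in portable C. The sketch below is only an emulation under stated assumptions: quad_avg_u8 and quad_add_u8 are hypothetical stand-ins written for this illustration, the data are taken as unsigned bytes, and the add wraps modulo 256. The real Trimedia operations may behave differently, for instance with respect to clipping.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a packed rounding average:
   per byte lane, (a + b + 1) >> 1, computed without widening. */
static uint32_t quad_avg_u8(uint32_t a, uint32_t b)
{
    return (a | b) - (((a ^ b) & 0xFEFEFEFEu) >> 1);
}

/* Hypothetical stand-in for a packed byte add (each lane wraps modulo 256). */
static uint32_t quad_add_u8(uint32_t a, uint32_t b)
{
    uint32_t sum = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    return sum ^ ((a ^ b) & 0x80808080u);
}

/* Scalar reference: one byte per iteration. */
static void predict_scalar(const uint8_t *back, const uint8_t *forward,
                           const uint8_t *idct, uint8_t *dest)
{
    for (int i = 0; i < 64; i += 1)
        dest[i] = (uint8_t)(((back[i] + forward[i] + 1) >> 1) + idct[i]);
}

/* Packed version: four bytes per iteration, 16 iterations instead of 64.
   memcpy performs the word loads and stores, which avoids the alignment
   and aliasing pitfalls of casting a byte pointer to an int pointer. */
static void predict_packed(const uint8_t *back, const uint8_t *forward,
                           const uint8_t *idct, uint8_t *dest)
{
    for (int i = 0; i < 16; i += 1) {
        uint32_t b, f, d, r;
        memcpy(&b, back + 4 * i, 4);
        memcpy(&f, forward + 4 * i, 4);
        memcpy(&d, idct + 4 * i, 4);
        r = quad_add_u8(quad_avg_u8(b, f), d);
        memcpy(dest + 4 * i, &r, 4);
    }
}
```

Because both helpers work lane by lane, the result does not depend on the byte order of the machine.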
5
MMI Automatic Exploitation
[Diagram: Src code → Pre-processing → Idioms/MMI Recognition → Code Generation, with phases marked machine independent or machine dependent]
  • Vectorization [Bik01, Krall00, ...]
  • Alignment
  • Loop Unrolling [Larsen00, Leupers2000]
6
The MMI Recognition Phase
  • Find the instructions available on the machine
  • after vectorization
  • after unrolling
  • User interaction
  • fast retargetability
  • not only for compiler writer
  • no compiler recompilation needed

7
An MMI Example
    temp[i] = (back[i] + forward[i] + 1) >> 1;
rather than
    t1[i] = (back[i] + forward[i]);
    t2[i] = t1[i] + 1;
    temp[i] = t2[i] >> 1;
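
Recognition is simplest when the idiom is available as a single expression tree. The sketch below shows one possible matcher for the shape ((x + y) + 1) >> 1. The Expr type, the AVG node kind, and recognize_avg are all hypothetical names introduced for this illustration; they do not describe how SWARecog represents or matches code.

```c
#include <stddef.h>

typedef enum { VAR, CONST, ADD, SHR, AVG } Kind;

typedef struct Expr {
    Kind kind;
    int value;                 /* constant value, or variable id */
    struct Expr *lhs, *rhs;    /* operands, NULL for leaves */
} Expr;

static int is_const(const Expr *e, int v)
{
    return e != NULL && e->kind == CONST && e->value == v;
}

/* If e has the shape ((x + y) + 1) >> 1, rewrite it in place to AVG(x, y)
   and return 1; otherwise leave it untouched and return 0. */
static int recognize_avg(Expr *e)
{
    if (e == NULL || e->kind != SHR || !is_const(e->rhs, 1))
        return 0;
    Expr *plus1 = e->lhs;
    if (plus1 == NULL || plus1->kind != ADD || !is_const(plus1->rhs, 1))
        return 0;
    Expr *sum = plus1->lhs;
    if (sum == NULL || sum->kind != ADD)
        return 0;
    e->kind = AVG;
    e->lhs = sum->lhs;
    e->rhs = sum->rhs;
    return 1;
}
```

With the three-statement form, the same matcher would first have to follow the temporaries t1 and t2 back to their definitions, which is why the combined form is the convenient one to match.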
8
SWARecog: a Retargetable Engine for MMI
  • Front-end independent
  • CoSy
  • Sage
  • Retargetable
  • configurable intermediate form
  • Uses a rewriting system based on U. Assmann's
    work [Assmann96]

9
An Overview of SWARecog
CoSy Sage
IR description based
10
The Intermediate Format
  • Identical for code and rules
  • Attributes declaration
  • Node declaration
  • Edge declaration

NODES Operator:ENUM cast, mul, add, sright, assg, ...
NODES VariableName:STRING
EDGES distance:INTEGER DEFAULT 0

NODELabel Assign: NodeType = {operator} Operator = {assg} ValueType = {int}
(flowdep ObjectAddr:14 Assign:8 1)
(flowdep Plus:9 Assign:8 2)
[negated] (same ObjectAddr:14 ObjectAddr:11 0)
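
To make the format concrete, the sketch below stores such edges in a small C structure and queries them. Every type and function name here is hypothetical. In particular, the meaning of the trailing integer on each edge is not stated on the slide, so it is kept as an opaque attribute (operand position is one plausible reading).

```c
#include <string.h>

#define MAX_EDGES 16

typedef struct { const char *label; int id; } NodeRef;   /* e.g. Assign:8 */

typedef struct {
    const char *kind;   /* "flowdep", "same", ... */
    NodeRef src, dst;
    int attr;           /* trailing integer of the edge; meaning assumed */
    int negated;        /* 1 if the edge is marked [negated] */
} Edge;

typedef struct { Edge edges[MAX_EDGES]; int count; } Graph;

/* Append an edge; returns 0 when the fixed-size table is full. */
static int add_edge(Graph *g, const char *kind, NodeRef src, NodeRef dst,
                    int attr, int negated)
{
    if (g->count >= MAX_EDGES)
        return 0;
    Edge e = { kind, src, dst, attr, negated };
    g->edges[g->count++] = e;
    return 1;
}

/* Count the edges of a given kind that enter the node label:id. */
static int count_incoming(const Graph *g, const char *kind,
                          const char *label, int id)
{
    int n = 0;
    for (int i = 0; i < g->count; i++) {
        const Edge *e = &g->edges[i];
        if (strcmp(e->kind, kind) == 0 &&
            strcmp(e->dst.label, label) == 0 && e->dst.id == id)
            n++;
    }
    return n;
}
```

Loading the three edges of the slide and asking for the flowdep edges entering Assign:8 returns 2, one for each operand of the assignment.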
11
A Rule Description Example
b = a + a
b = a << 1

NODELabel v: NodeType = {scalar} Operator = {obj} Aliased = {0} ValueType = {int} VariableName LoopSector = {body}

[Diagram: pattern graph v + v and replacement graph v << 1]

RULE 1 MulToShift
(flowdep v1 Plus6 0)
(flowdep v2 Plus6 0)
(flowdep Plus6 Exp0 ) 1
(same v1 v2 0)
(same v2 v1 0)
(flowdep v1 Shift7 1)
(flowdep IntConst18 Shift7 2)
(flowdep Shift7 Exp0 ) 1
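
The effect of this rule can be sketched on a small expression tree in C. This is an illustration only: the Node type and plus_to_shift are hypothetical, and the real engine works on the attributed graph format rather than on a tree like this one. The check in same_scalar plays the role of the rule's "same" condition.

```c
#include <stddef.h>

typedef enum { VAR, INT_CONST, PLUS, SHIFT_LEFT } Op;

typedef struct Node {
    Op op;
    int value;               /* variable id for VAR, literal for INT_CONST */
    struct Node *in1, *in2;  /* operands, NULL for leaves */
} Node;

/* Both operands must denote the same scalar variable. */
static int same_scalar(const Node *a, const Node *b)
{
    return a != NULL && b != NULL &&
           a->op == VAR && b->op == VAR && a->value == b->value;
}

/* Rewrite n from (v + v) to (v << 1) in place. `one` points to storage
   supplied by the caller for the new constant node. Returns 1 if the
   rule fired, 0 if n does not match. */
static int plus_to_shift(Node *n, Node *one)
{
    if (n == NULL || n->op != PLUS || !same_scalar(n->in1, n->in2))
        return 0;
    one->op = INT_CONST;
    one->value = 1;
    one->in1 = one->in2 = NULL;
    n->op = SHIFT_LEFT;
    n->in2 = one;
    return 1;
}
```

An addition of two different variables is left untouched, since the "same" condition fails.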
12
Example-1
/*pragmaVectorLoop("NoAlias")*/
for (i = xa; i < xb; i = i + 4) {
    sum = sum + (s[i] * om[i]);
    sum = sum + (s[i + 1] * om[i + 1]);
    sum = sum + (s[i + 2] * om[i + 2]);
    sum = sum + (s[i + 3] * om[i + 3]);
}

for (i = xa; i < xb; i = i + 4) {
    sum = sum + dualDotProd(packCont(s[i], s[i + 1]),
                            packCont(om[i], om[i + 1]));
    sum = sum + dualDotProd(packCont(s[i + 2], s[i + 3]),
                            packCont(om[i + 2], om[i + 3]));
}
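
The equivalence of the two loops can be checked with a portable emulation. The slides do not define packCont or dualDotProd, so the sketch below assumes 16-bit signed elements, two lanes per 32-bit word, and a dual dot product that multiplies the lanes pairwise and adds the two products. The names pack_cont, lane, and dual_dot_prod are hypothetical, and overflow of the 32-bit sum is not handled.

```c
#include <stdint.h>

/* Assumed semantics: pack two 16-bit values into one 32-bit word,
   first argument in the low half. */
static uint32_t pack_cont(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Extract lane k (0 or 1) as a signed 16-bit value, portably. */
static int32_t lane(uint32_t w, int k)
{
    uint32_t u = (w >> (16 * k)) & 0xFFFFu;
    return u >= 0x8000u ? (int32_t)u - 0x10000 : (int32_t)u;
}

/* Assumed semantics: x.lo * y.lo + x.hi * y.hi. */
static int32_t dual_dot_prod(uint32_t x, uint32_t y)
{
    return lane(x, 0) * lane(y, 0) + lane(x, 1) * lane(y, 1);
}

/* Original loop, unrolled by four; xb - xa must be a multiple of 4. */
static int32_t dot_scalar(const int16_t *s, const int16_t *om, int xa, int xb)
{
    int32_t sum = 0;
    for (int i = xa; i < xb; i = i + 4) {
        sum = sum + s[i] * om[i];
        sum = sum + s[i + 1] * om[i + 1];
        sum = sum + s[i + 2] * om[i + 2];
        sum = sum + s[i + 3] * om[i + 3];
    }
    return sum;
}

/* Rewritten loop: two dual dot products per iteration. */
static int32_t dot_packed(const int16_t *s, const int16_t *om, int xa, int xb)
{
    int32_t sum = 0;
    for (int i = xa; i < xb; i = i + 4) {
        sum = sum + dual_dot_prod(pack_cont(s[i], s[i + 1]),
                                  pack_cont(om[i], om[i + 1]));
        sum = sum + dual_dot_prod(pack_cont(s[i + 2], s[i + 3]),
                                  pack_cont(om[i + 2], om[i + 3]));
    }
    return sum;
}
```

Under these assumptions both functions return the same sum for any input whose products fit in 32 bits.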
13
Example-2
for (i = xa; i < xb; i = i + 4) {
    /*pragmaVectorLoop("NoAlias")*/
    d[i] = j 4 (s[i] + om[i]);
    d[i + 1] = j 4 (s[i + 1] + om[i + 1]);
    d[i + 2] = j 4 (s[i + 2] + om[i + 2]);
    d[i + 3] = j 4 (s[i + 3] + om[i + 3]);
}

instance number (INSTANCE)

for (i = xa; i < xb; i = i + 4) {
    NEWVAR_temp2_1 = dualAdd(packCont(s[i + 2], s[i + 3]),
                             packCont(om[i + 2], om[i + 3]));
    NEWVAR_temp2_2 = dualAdd(packCont(s[i], s[i + 1]),
                             packCont(om[i], om[i + 1]));
    NEWVAR_temp1_1 = unpackCont(NEWVAR_temp2_1, 0);
    d[i + 2] = j 4 (NEWVAR_temp1_1);
    NEWVAR_temp3_1 = unpackCont(NEWVAR_temp2_1, 1);
    d[i + 3] = j 4 (NEWVAR_temp3_1);
    NEWVAR_temp1_2 = unpackCont(NEWVAR_temp2_2, 0);
    d[i] = j 4 (NEWVAR_temp1_2);
    NEWVAR_temp3_2 = unpackCont(NEWVAR_temp2_2, 1);
    d[i + 1] = j 4 (NEWVAR_temp3_2);
}
14
Combining the Rules
  • Strata or alternative based
  • normalization based

[Diagram: C code is converted to IR Form and passed through a chain of Rewriting Engines, each driven by a Rule Desc., before being converted back to C code]
15
Rule Generation
  • C rules description

[Diagram labels: C code, Front-end, LHS Generator, RHS Generator, Rule description, SWARecog]
16
A Rule Description Example
the engine generates same_address_1
defines the properties of the leaf expressions to
match.
for (i = 0; i < LOOP_BOUND1 - 1; i = i + 2) {
    /*pragmaLHS()*/
    ROOT_1(LEAF_3(tab1[i]) + LEAF_4(tab2[i]));
    ROOT_2(LEAF_5(tab1[i + 1]) + LEAF_6(tab2[i + 1]));
}

for (i = 0; i < LOOP_BOUND1 - 1; i = i + 2) {
    /*pragmaRHS()*/
    NEWVAR_temp2 = dualAdd(packCont(LEAF_3(tab1[i]), LEAF_5(tab1[i + 1])),
                           packCont(LEAF_4(tab2[i]), LEAF_6(tab2[i + 1])));
    NEWVAR_temp1 = unpackCont(NEWVAR_temp2, 0);
    NEWVAR_temp3 = unpackCont(NEWVAR_temp2, 1);
    ROOT_1(NEWVAR_temp1);
    ROOT_2(NEWVAR_temp3);
}

17
A Rule Description Example
for (i = 0; i < LOOP_BOUND1 - 1; i = i + 2) {
    /*pragmaLHS()*/
    ROOT_1(LEAF_7(sum) = LEAF_8(sum) + (LEAF_3(tab1[i]) * LEAF_4(tab2[i])));
    ROOT_2(LEAF_9(sum) = LEAF_10(sum) + (LEAF_5(tab1[i + 1]) * LEAF_6(tab2[i + 1])));
}

for (i = 0; i < LOOP_BOUND1 - 1; i = i + 2) {
    /*pragmaRHS()*/
    ROOT_2(LEAF_9(sum) = LEAF_10(sum) +
           dualDotProd(packCont(LEAF_3(tab1[i]), LEAF_5(tab1[i + 1])),
                       packCont(LEAF_4(tab2[i]), LEAF_6(tab2[i + 1]))));
}

18
Conclusion and Perspectives
  • The prototype is running
  • Vectorization and alignment phases are under
    development
  • Next step: study the tradeoff between
    unrolling and vectorization

19
Bibliography
  • [Assmann96] Graph Rewrite Systems for Program
    Optimization, U. Assmann, Technical Report RR2955,
    INRIA Rocquencourt, 1996
  • [Bik01] Experiments with Automatic Vectorization
    for the Pentium 4 Processor, A. Bik, M. Girkar,
    P. Grey and X. Tian, CPC, 2001
  • [Cheong97] An Optimizer for Multimedia
    Instruction Sets, G. Cheong and M. Lam,
    Proceedings of the Second SUIF Compiler Workshop,
    1997

20
Bibliography (cont.)
  • [Krall00] Compilation Techniques for Multimedia
    Processors, A. Krall and S. Lelait, IJPP, vol.
    28, No 4, 2000
  • [Larsen00] Exploiting Superword Level Parallelism
    with Multimedia Instruction Sets, S. Larsen and
    S. Amarasinghe, PLDI 2000
  • [Leupers2000] Code Selection for Media Processors
    with SIMD Instructions, R. Leupers, DATE 2000
  • [Sreraman00] A Vectorizing Compiler for
    Multimedia Extensions, N. Sreraman and R.
    Govindarajan, IJPP, vol. 28, No 4, 2000