Loop Optimizations - PowerPoint PPT Presentation

Title: Loop Optimizations


1
Loop Optimizations
  • Neeraj Kr. Singh
  • csa02030_at_cse.iitd.ac.in

2
Outline
  • Loop Optimizations-why?
  • Loop Optimizations in Compilers Context
  • Loop Optimizations in HLS Context

3
Loop optimization-why?
  • Loops execute many times
  • They take most of the execution time

4
Illustration
  • for (I = 0; I < 20; I++) {
  •     A = 10 * 5;
  •     /* Rest of the code doesn't modify A */
  • }

5
Loop Optimizations in compiler context
  • Induction Variables and Cost Reduction
  • Loop Invariant Code Detection and Motion
  • Loop Fusion
  • Loop Fission
  • Loop Unrolling

6
Induction Variables and Cost Reduction
  • for (I = 0; I < 20; I++) {
  •     T = 4 * I;
  •     /* Rest of the code doesn't modify T */
  • }

7
Induction
  • for (I = 0; I < 20; I++) {
  •     T = T + 4;
  •     /* Rest of the code doesn't modify T */
  • }
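
The two loops on slides 6 and 7 can be compared side by side. The following is a minimal sketch, assuming T starts at 0 before the rewritten loop (the slides do not show the initialization) and that each value of T is stored into a hypothetical output array so the two forms can be checked against each other.

```c
/* Original form: one multiplication per iteration (T = 4 * I). */
void fill_with_multiply(int out[], int n) {
    for (int i = 0; i < n; i++) {
        int t = 4 * i;
        out[i] = t;
    }
}

/* Strength-reduced form: T is updated with an addition instead. */
void fill_with_add(int out[], int n) {
    int t = 0;                 /* assumed initial value, equal to 4 * 0 */
    for (int i = 0; i < n; i++) {
        out[i] = t;            /* same value as 4 * i */
        t = t + 4;             /* addition replaces the multiplication */
    }
}
```

Calling both functions with n = 20 should fill the two arrays with identical values.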

8
Loop Invariant Code
  • for (I = 0; I < T - 2; I++) {
  •     /* The code doesn't modify T */
  • }

9
Loop
  • T1 = T - 2;
  • for (I = 0; I < T1; I++) {
  •     /* The code doesn't modify T */
  • }
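
A small sketch of the same transformation in compilable form. The loop body (counting iterations) and the function names are assumptions added only so that the two versions can be compared; the slides leave the body unspecified.

```c
/* Before: the invariant expression t - 2 is evaluated at every loop test. */
int count_before(int t) {
    int count = 0;
    for (int i = 0; i < t - 2; i++) {
        count++;               /* stand-in body that does not modify t */
    }
    return count;
}

/* After: the invariant expression is computed once, outside the loop. */
int count_after(int t) {
    int t1 = t - 2;            /* hoisted loop-invariant computation */
    int count = 0;
    for (int i = 0; i < t1; i++) {
        count++;
    }
    return count;
}
```

The hoisting is only valid because nothing inside the loop changes t, which is the condition stated in the slide comment.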

10
Loop Fusion
  • for (I = 0; I < 20; I++)
  •     A[I] = I;
  • for (I = 0; I < 20; I++)
  •     B[I] = 2 * I;

11
Loop
  • for (I = 0; I < 20; I++) {
  •     A[I] = I;
  •     B[I] = 2 * I;
  • }

12
Implication of Loop Fusion
  • Fusion saves the loop tests and increments of the
    second loop.
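
The fusion of slides 10 and 11 can be sketched as two functions that should produce the same arrays. The right-hand side of the B assignment is read here as 2 * I, which is an assumption, since the operator did not survive in the transcript; the function names are hypothetical.

```c
#define N 20

/* Two separate loops: the loop test and increment run for each loop. */
void separate_loops(int a[N], int b[N]) {
    for (int i = 0; i < N; i++) {
        a[i] = i;
    }
    for (int i = 0; i < N; i++) {
        b[i] = 2 * i;          /* assumed form of the B assignment */
    }
}

/* Fused loop: a single test and increment per index serves both bodies. */
void fused_loop(int a[N], int b[N]) {
    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2 * i;
    }
}
```

Fusion is safe here because the two bodies are independent: neither loop reads what the other writes. Running the same pair in the opposite direction gives the fission of slides 13 and 14.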

13
Loop Fission
  • for (I = 0; I < 20; I++) {
  •     A[I] = I;
  •     B[I] = 2 * I;
  • }

14
Loop
  • for (I = 0; I < 20; I++)
  •     A[I] = I;
  • for (I = 0; I < 20; I++)
  •     B[I] = 2 * I;

15
Motivation for Loop Fission
  • Cache memory: each smaller loop touches fewer arrays,
    which can improve locality

16
Loop unrolling
  • for (I = 0; I < 20; I++)
  •     A[I] = I;

17
Loop unrolling
  • for (I = 0; I < 20; I += 4) {
  •     A[I] = I;
  •     A[I+1] = I+1;
  •     A[I+2] = I+2;
  •     A[I+3] = I+3;
  • }

18
Loop Unrolling-Types
  • Partial
  • Full

19
Partial Loop unrolling
  • for (I = 0; I < 20; I++)
  •     A[I] = I;

20
Partial Loop Unrolling
  • for (I = 0; I < 20; I += 4) {
  •     A[I] = I;
  •     A[I+1] = I+1;
  •     A[I+2] = I+2;
  •     A[I+3] = I+3;
  • }
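
The slide's loop works because 20 is a multiple of 4. A hedged sketch of the more general case follows: it unrolls by 4 and adds a remainder loop for trip counts that are not a multiple of 4. The remainder loop and the function names are additions for illustration, not part of the slides.

```c
/* Rolled reference version. */
void fill_rolled(int a[], int n) {
    for (int i = 0; i < n; i++) {
        a[i] = i;
    }
}

/* Unrolled by 4: one loop test and one increment per four assignments. */
void fill_unrolled4(int a[], int n) {
    int i = 0;
    for (; i + 3 < n; i += 4) {
        a[i]     = i;
        a[i + 1] = i + 1;
        a[i + 2] = i + 2;
        a[i + 3] = i + 3;
    }
    for (; i < n; i++) {       /* remainder: at most 3 leftover iterations */
        a[i] = i;
    }
}
```

With n = 20 the remainder loop never runs, matching the slide; with n = 22 it handles the last two elements.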

21
Advantages
  • The number of tests is reduced
  • Parallelism

22
Disadvantages
  • Cache

23
Full Loop unrolling
  • for (I = 0; I < 8; I++)
  •     A[I] = I;

24
Full Loop Unrolling
  • A[0] = 0;
  • A[1] = 1;
  • A[2] = 2;
  • A[3] = 3;
  • A[4] = 4;
  • A[5] = 5;
  • A[6] = 6;
  • A[7] = 7;

25
Advantages
  • No tests
  • Parallelism
  • No cache problem
  • For array references, addresses may be calculated
    beforehand.

26
Loop Optimizations in HLS Context
  • Loop Unrolling
  • Induction Variables and Cost Reduction
  • Loop Invariant Code Detection and Motion
  • Loop Fission
  • Loop Fusion

27
Loop Unrolling
  • for (I = 0; I < 8; I++)
  •     A[I] = I;

28
Loop
  • Adder takes 3 ns
  • Clock time is 30 ns
  • We have 8 adders

29
Loop
  • A[0] = 0;
  • A[1] = 1;
  • A[2] = 2;
  • A[3] = 3;
  • A[4] = 4;
  • A[5] = 5;
  • A[6] = 6;
  • A[7] = 7;

30
Loop
  • All eight assignments can be done in one clock cycle,
    compared with 8 clock cycles for the rolled loop.

31
Arrays in HLS
  • Can be mapped to memory or a sequence of wires
  • Depends on the designer

32
Arrays mapped to wires
  • Let A be an array of 8 elements
  • Array access:
  • tmp = A[I]

[Figure: array elements A_0 through A_7 and index I as inputs, tmp as output]

33
Arrays
  • Non-constant array index assignment:
  • A[I] = tmp

[Figure: a decoder driven by I routes tmp to one of A_0 through A_7]

34
Full Unrolling and Arrays
  • No further hardware is required for array
    indexing, as the array indices are constant.
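
A software model of slides 32 to 34 may help show why a non-constant index costs hardware. The sketch below assumes the read tmp = A[I] behaves like an 8-to-1 multiplexer (the transcript does not name the read-side circuit) and models the write A[I] = tmp with a one-hot decoder, as the figure on slide 33 indicates. The function names are hypothetical.

```c
/* Variable-index read tmp = A[I]: modeled as an 8-to-1 selection on I. */
int mux8(const int a[8], unsigned sel) {
    switch (sel & 7u) {
        case 0: return a[0];
        case 1: return a[1];
        case 2: return a[2];
        case 3: return a[3];
        case 4: return a[4];
        case 5: return a[5];
        case 6: return a[6];
        default: return a[7];
    }
}

/* Variable-index write A[I] = tmp: a decoder enables exactly one element. */
void decoded_write(int a[8], unsigned sel, int tmp) {
    unsigned enable = 1u << (sel & 7u);    /* one-hot decoder output */
    for (int k = 0; k < 8; k++) {
        if (enable & (1u << k)) {
            a[k] = tmp;                    /* only the enabled element loads */
        }
    }
}
```

After full unrolling every index is a constant, so an access such as A[3] refers directly to one element and neither the selection logic nor the decoder is needed.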

35
Loop Unrolling in HLS- Epitome
  • Loop:
  • for (I = 0; I < 8; I++)
  •     tmp = tmp + A[I];
  • Unlimited resources
  • Clock period is long enough to schedule 5
    operations (increment, comparison, add)

36
Loop
  • Rolled
  • Schedule: 1 addition, 1 comparison and 1 increment per
    cycle.
  • Resources: 1 adder, 1 comparator, 1 incrementer,
    indexing logic.
  • Latency: 8 clock cycles

37
Loop
  • Fully unrolled
  • Schedule: 4 additions per clock
  • Resources: 4 adders
  • Latency: 2 clock cycles

38
Loop
  • Partially unrolled, 2 times
  • Schedule: 2 additions, 2 increments, 1 comparison
    per clock.
  • Resources: 2 adders, 1 comparator, 2 incrementers
  • Latency: 4 clock cycles
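
The three latencies on slides 36 to 38 follow a simple pattern. As a rough sketch, assuming the latency is the number of loop additions divided by the additions scheduled per clock, rounded up (a simplification not stated on the slides), the figures can be reproduced as follows. The function names are hypothetical; sum_unrolled2 shows the loop body that the 2-times unrolled schedule executes.

```c
/* Clock cycles needed when adds_per_cycle of the additions fit in each clock. */
int latency_cycles(int iterations, int adds_per_cycle) {
    return (iterations + adds_per_cycle - 1) / adds_per_cycle;  /* ceiling */
}

/* Rolled accumulation: tmp = tmp + A[I], one addition per iteration. */
int sum_rolled(const int a[8]) {
    int tmp = 0;
    for (int i = 0; i < 8; i++) {
        tmp = tmp + a[i];
    }
    return tmp;
}

/* Unrolled 2 times: two additions and two index updates per loop test. */
int sum_unrolled2(const int a[8]) {
    int tmp = 0;
    for (int i = 0; i < 8; i += 2) {
        tmp = tmp + a[i];
        tmp = tmp + a[i + 1];
    }
    return tmp;
}
```

Under this assumption, latency_cycles(8, 1), latency_cycles(8, 4) and latency_cycles(8, 2) give 8, 2 and 4 cycles, matching the rolled, fully unrolled and partially unrolled cases above.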

39
References
  • Elliott, John P., Understanding Behavioral
    Synthesis: A Practical Guide to High-Level
    Design.
  • Aho, Alfred V., Sethi, Ravi, Ullman, Jeffrey D.,
    Compilers: Principles, Techniques, and Tools.

40
Questions?
  • Thanks.