1
Dynamic Reconfiguration with Binary Translation
Breaking the ILP Barrier with Software Compatibility
  • Antonio Carlos S. Beck Filho (caco_at_inf.ufrgs.br)
  • Luigi Carro (carro_at_inf.ufrgs.br)
  • Informatics Institute - LSE
  • Federal University of Rio Grande do Sul
  • Brazil

2
Motivation
  • Complex

4
Motivation
  • How to increase performance with low energy
    consumption?
  • Using a reconfigurable array!

Binary Translation (BT)
5
Motivation
  • How to increase performance with low energy
    consumption?
  • Using a reconfigurable array!
  • Special tools and/or compilers are needed
  • No software compatibility!
  • What happens to the design cycle?

6
Outline
  • Introduction
  • The Java processors
  • The reconfigurable array
  • How the BT algorithm works
  • Results
  • Conclusions and Future Work

8
Introduction
1
  • Advantages of using a reconfigurable array
  • Speeds up sequences of instructions that are not
    necessarily data independent

9
Introduction
1
  • Parallel Architecture

Time (clock cycles)
Data-dependent instructions = same color
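The limit that data dependence puts on parallel execution can be made concrete with a small sketch. It assumes a hypothetical list of instructions, each naming the instructions it depends on, and computes the minimum number of cycles for an idealized machine that can issue any number of independent instructions per cycle. The instruction names and the one-cycle latency are illustrative assumptions, not taken from the slides.

```python
def min_cycles(deps):
    """Return the length of the longest dependence chain.

    deps maps an instruction name to the names it depends on.
    Every instruction is assumed to take one cycle (illustrative).
    """
    depth = {}

    def level(name):
        # An instruction can issue one cycle after its latest producer.
        if name not in depth:
            depth[name] = 1 + max((level(p) for p in deps[name]), default=0)
        return depth[name]

    return max((level(n) for n in deps), default=0)


# Hypothetical program: a chain of three dependent instructions (a -> b -> c)
# plus two instructions (d, e) that depend on nothing.
program = {"a": [], "b": ["a"], "c": ["b"], "d": [], "e": []}
cycles = min_cycles(program)
print(cycles)
```

The five instructions still need three cycles however many functional units are available, because the chain a, b, c is data dependent.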
10
Introduction
1
  • Dynamic Analysis (BT)
  • Finds, at run time, sequences of instructions that can
    be executed in the array

11
Introduction
1
  • Dynamic analysis in recent works
    • Fine-grain arrays and FPGAs
    • Increases the complexity of detection
    • Increases the size of the cache that holds the
      configurations
    • Applied only to critical parts of the software
  • In this work
    • Coarse-grain array
    • Detection of the instructions becomes simpler
    • Only a small amount of memory is required
    • Optimizes any sequence in the software
    • Technology independent

12
Introduction
1
  • Java processor as case study
  • Object Oriented
  • Modeling
  • Programming
  • Validation
  • Multiplatform
  • Widely used
  • Moreover, it makes the detection algorithm and the
    routing simpler

13
Introduction
1
  • Java Processor

Coarse-Grain Reconfigurable Array
Dynamic Detection (Binary Translation)
Performance
Energy Consumption
15
Femtojava Pipelined
2
  • Five Stages

Instruction Fetch
Decoding
Operand Fetch
Execution
Write Back
16
Femtojava VLIW
2
  • The VLIW version is basically an extension of the
    pipelined one
  • 2, 4 or 8 instructions per packet
  • The VLIW packet has a variable size
  • Functional units are replicated

21
The Reconfigurable Array
3
  • The coarse-grain array is tightly coupled
  • Decreases the overhead in the communication
  • No external access to the array
  • It is formed by one or more basic cells
  • With one multiplier
  • A sequence of seven sets of basic functional units

22
How does it work?
3
Instruction flow: 5A 32 B7 4C 63
Instruction Fetch
Decoding
Operand Fetch
Execution
Write Back
23
How does it work?
3
Sequence found!
Instruction Fetch
Decoding
Operand Fetch
Execution
Write Back
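A minimal sketch of the behavior these slides suggest, under stated assumptions: a detected sequence is saved as a configuration indexed by the address of its first instruction, and a later fetch of that address sends the whole sequence to the array instead of the pipeline. The class name, the dictionary-based cache, the address-based indexing, and counting addresses in instructions rather than bytes are all hypothetical and not confirmed by the slides.

```python
class ReconfigurationCache:
    """Toy model: maps a start address to a saved instruction sequence."""

    def __init__(self):
        self._configs = {}

    def save(self, start_pc, sequence):
        # Store the sequence found by the detection logic.
        self._configs[start_pc] = tuple(sequence)

    def lookup(self, pc):
        # Return the saved sequence, or None on a miss.
        return self._configs.get(pc)


def dispatch(cache, pc):
    """Decide where the code starting at pc runs and where pc goes next.

    Addresses count instructions, not bytes (an assumption of this sketch).
    """
    config = cache.lookup(pc)
    if config is None:
        return ("pipeline", pc + 1)      # normal execution, one instruction
    return ("array", pc + len(config))   # the whole sequence runs in the array


cache = ReconfigurationCache()
first_pass = dispatch(cache, 0)          # nothing saved yet: pipeline
cache.save(0, ["bipush 10", "bipush 5", "imul"])
second_pass = dispatch(cache, 0)         # sequence found: array
```

In this model the first pass over a sequence always runs on the pipeline, and only repeated executions benefit from the array.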
28
How BT Works
4
Sequence of instructions
Bipush 10, Bipush 5, Imul, Bipush 3, Bipush 4, Ishl, Iadd, Istore, Bipush 6, Bipush 7, Imul, Istore
Stack after each instruction (top of stack first)
  • Bipush 10: 10
  • Bipush 5: 5, 10
  • Imul: 50
  • Bipush 3: 3, 50
  • Bipush 4: 4, 3, 50
  • Ishl: 48, 50
  • Iadd: 98
  • Istore: empty
  • These instructions depend on each other!
  • Bipush 6: 6
  • Bipush 7: 7, 6
  • Imul: 42
  • Istore: empty
Operand Block 1: First Sequence
Operand Block 2: Second Sequence
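The stack walkthrough above can be reproduced with a short stack-machine sketch. It models only the bytecodes used on the slides and ignores 32-bit wraparound. The local-variable indices given to `istore` (0 and 1) are assumptions, since the slides omit them, and splitting the stream wherever the operand stack becomes empty is an illustrative way to separate the two sequences, not necessarily the authors' detection algorithm.

```python
import operator

# Binary bytecodes used on the slides (32-bit wraparound is ignored here).
BINARY_OPS = {"imul": operator.mul, "iadd": operator.add, "ishl": operator.lshift}


def run(program):
    """Execute a tiny subset of Java bytecode.

    Returns (trace, local_vars, sequences): trace holds the operand stack
    after each instruction (top of stack first), and sequences groups the
    instructions between points where the stack is empty.
    """
    stack, local_vars = [], {}
    trace, sequences, current = [], [], []
    for op, *args in program:
        if op == "bipush":
            stack.append(args[0])
        elif op == "istore":
            local_vars[args[0]] = stack.pop()
        elif op in BINARY_OPS:
            right = stack.pop()   # top of stack is the second operand
            left = stack.pop()
            stack.append(BINARY_OPS[op](left, right))
        else:
            raise ValueError(f"unsupported bytecode: {op}")
        trace.append(stack[::-1])  # copy of the stack, top first
        current.append(op)
        if not stack:
            # Empty stack: no stack value is carried into later instructions.
            sequences.append(current)
            current = []
    if current:
        sequences.append(current)
    return trace, local_vars, sequences


PROGRAM = [
    ("bipush", 10), ("bipush", 5), ("imul",),
    ("bipush", 3), ("bipush", 4), ("ishl",), ("iadd",),
    ("istore", 0),  # index 0 is an assumption
    ("bipush", 6), ("bipush", 7), ("imul",),
    ("istore", 1),  # index 1 is an assumption
]

trace, local_vars, sequences = run(PROGRAM)
for (op, *args), stack_state in zip(PROGRAM, trace):
    print(op, *args, "->", stack_state)
```

Under these assumptions the stream splits into a first sequence of eight instructions and a second sequence of four, matching the two operand blocks on the last slide of the walkthrough.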
52
Results - Benchmarks
5
  • A set of algorithms was executed on the
    architectures
    • Sine calculation
    • Bubble sort
    • Selection sort
    • Quicksort (10 and 100 elements)
    • Binary search
    • Sequential search
    • IMDCT (plus three unrolled versions)
    • Floating-point sum emulation
    • Full MP3 player

Results - Performance
5
  • Bubble 10: 3.4x faster
  • Bubble 100: 5.5x faster
65
Energy Consumption - ROM
5
66
Energy Consumption - RAM
5
67
Energy Consumption - Core
5
68
Energy Consumption - Total
5
69
Results - Area
5
  • Pipelined
  • VLIW 2
  • VLIW 4
  • VLIW 8
  • Femtojava with reconfigurable array (4 cells, 5 different
    reconfigurations)
70
Results - Area
5
Low Power
Reconfiguration Cache
VLIW 2
Femtojava Pipelined
BT Logic
Data Cache
VLIW 4
VLIW 8
Reconfigurable Array
72
Conclusions
6
  • With BT, a reconfigurable array, and Java, we
    achieve at the same time:
    • Software portability
    • Performance
    • Low energy consumption

73
Future Work
6
  • Use dynamic analysis with CMPs
    • Detect at run time which core is best suited to
      execute the software at a given moment
  • Implement the BT and the reconfigurable array in
    traditional RISC machines
    • What are the implementation differences?
    • Trade-off analysis

74
The end
  • Thank you!!!
  • caco_at_inf.ufrgs.br
  • carro_at_inf.ufrgs.br