Synthesis of synchronous elastic architectures

1
Synthesis of synchronous elastic architectures
  • Jordi Cortadella (Universitat Politècnica de
    Catalunya)
  • Mike Kishinevsky (Intel Corp.)
  • Bill Grundmann (Intel Corp.)

2
Network of Computing Units
Out
In
B3
B1
B2
5
Latency-insensitive (elastic) system
Out
In
B3
B1
B2
Every block makes a step only when all of its
inputs are valid
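The firing rule above can be sketched in a few lines of Python. This is only an illustration: `step_block` is a hypothetical helper, and `None` is an assumed encoding of an invalid (empty) input.

```python
from typing import Callable, Optional, Sequence


def step_block(inputs: Sequence[Optional[int]],
               compute: Callable[..., int]) -> Optional[int]:
    """Fire the block once if every input channel holds a valid token.

    None models an invalid input; returning None models "no step taken".
    """
    if any(token is None for token in inputs):
        return None            # at least one input is not valid: wait
    return compute(*inputs)    # all inputs valid: make exactly one step


def add(x: int, y: int) -> int:
    return x + y


fired = step_block([3, 4], add)       # both inputs valid, the block steps
stalled = step_block([3, None], add)  # one input missing, the block waits
```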
6
Why
  • Scalable
  • Modular (plug & play)
  • Tolerance to variable latency
      • Communication
      • Computation
  • Not asynchronous
      • Use existing design paradigms
      • CAD tools

7
Outline
  • The cost of latency insensitivity
  • SELF: an elastic protocol
  • Basic implementation (linear pipelines)
  • General netlists (forks and joins)
  • Formal models and verification
  • Synthesis of elastic architectures
  • Related work

8
Latency-insensitive block
What's the cost of latency insensitivity?
Core
Data
Data
9
Communication channel
receiver
sender
Data
Data
Long wires: slow transmission
10
Pipelined communication
sender
receiver
Data
12
Pipelined communication
sender
receiver
Data
What if the sender does not always send
valid data?
13
The Valid bit
sender
receiver
Data
Data
Valid
Valid
17
The Valid bit
sender
receiver
Data
Valid
What if the receiver is not always ready?
18
The Stop bit
21
The Stop bit
Back-pressure
22
The Stop bit
Long combinational path
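A minimal sketch of back-pressure with a single global Stop bit, assuming a three-stage pipeline in which `None` stands for Valid = 0. The function `clock` and the token names are made up for illustration. Because every stage must see the same Stop value in the same cycle, the signal has to cross the whole pipeline combinationally, which is the long path noted above.

```python
from typing import List, Optional, Tuple

Token = Optional[str]  # None models Valid = 0 (an empty stage)


def clock(stages: List[Token], new_token: Token,
          stop: bool) -> Tuple[List[Token], Token]:
    """Advance a linear pipeline by one clock cycle.

    stages[0] is next to the sender, stages[-1] next to the receiver.
    With stop asserted, every stage holds its contents (global stall).
    Returns the new stage contents and the token handed to the receiver.
    """
    if stop:
        return list(stages), None
    return [new_token] + stages[:-1], stages[-1]


pipe: List[Token] = ["b", "a", None]
pipe, out1 = clock(pipe, "c", stop=False)  # shifts; the receiver gets nothing valid
pipe, out2 = clock(pipe, "d", stop=True)   # stalled: nothing moves, "d" must wait
pipe, out3 = clock(pipe, "d", stop=False)  # resumes; the receiver gets "a"
```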
23
Carloni's relay stations (double storage)
31
Carloni's relay stations (double storage)
  • Handshakes with short wires
  • Double storage required

32
Proposal: an elastic protocol
  • SELF (Synchronous ELastic Flow)
  • Simple and provably correct
  • Data-path with
      • No area overhead
      • No latency overhead
      • Minimum energy
  • Negligible control overhead
  • Fine-grain elasticity

33
Flip-flops vs. latches
sender
receiver
FF
FF
1 cycle
34
Flip-flops vs. latches
sender
receiver
H
L
H
L
1 cycle
40
Flip-flops vs. latches
sender
receiver
H
L
H
L
1 cycle
Flip-flops already have double-storage
capability, but ...
41
Flip-flops vs. latches
sender
receiver
H
L
H
L
1 cycle
Not allowed in conventional FF-based design!
42
Flip-flops vs. latches
sender
receiver
1 cycle
Let's make the master/slave latches independent
43
Flip-flops vs. latches
sender
receiver
½ cycle
½ cycle
Let's make the master/slave latches independent
Only half of the latches (H or L) can move tokens
44
Elastic buffer keeps data while stop is in flight
Cannot be done with single-edge flops without
double pumping
Use latches inside MS
W1R1
W2R1
W1R2
Carloni's relay station belongs to this class
W2R2
45
Shorthand notation (clock lines not shown)

D
Q
En
En
clk
46
SELF (linear communication)
[Figure: sender and receiver connected by four latch stages with enables (En) on the Data path, each stage with a Valid (V) and a Stop (S) bit; four annotations read 1]
47
SELF
[Figure, animation frames 47-51: sender, four En/V/S latch stages, receiver; annotated 1 at Valid and 0 at Stop]
52
SELF
[Figure, animation frames 52-56: sender, four En/V/S latch stages, receiver; annotated 0 at Valid and 0 at Stop]
57
SELF
[Figure, animation frames 57-65: sender, four En/V/S latch stages, receiver; annotated 1 at Valid and 1 at Stop]
66
SELF
[Figure, animation frames 66-73: sender, four En/V/S latch stages, receiver; annotated 1 at Valid and 0 at Stop]
74
The protocol
Data
Valid
Sender
Receiver
Stop
75
The protocol
D
Data
1
Valid
Sender
Receiver
0
Stop
Transfer cycle: $\mathit{Valid} = 1 \land \mathit{Stop} = 0$
76
The protocol
D
Data
1
Valid
Sender
Receiver
1
Stop
Retry cycle: $\mathit{Valid} = 1 \land \mathit{Stop} = 1$
Persistency: $G\,\big((V \land S \land \mathit{Data} = D) \Rightarrow \mathit{Next}\,(V \land \mathit{Data} = D)\big)$
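A small sketch of the cycle types and the persistency rule, assuming a trace is a list of (Valid, Stop, Data) samples, one per clock cycle. The name "idle" for Valid = 0 and both helper functions are illustrative choices, not taken from the slides.

```python
from typing import List, Tuple

Cycle = Tuple[int, int, str]  # (Valid, Stop, Data) sampled in one clock cycle


def classify(valid: int, stop: int) -> str:
    """Name the kind of cycle seen on a channel."""
    if valid and not stop:
        return "transfer"   # Valid = 1 and Stop = 0
    if valid and stop:
        return "retry"      # Valid = 1 and Stop = 1
    return "idle"           # Valid = 0 (label assumed for this sketch)


def is_persistent(trace: List[Cycle]) -> bool:
    """Check that every retry is followed by Valid = 1 with the same Data."""
    for (v, s, d), (v_next, _, d_next) in zip(trace, trace[1:]):
        if v and s and not (v_next and d_next == d):
            return False
    return True


good = [(1, 1, "D"), (1, 0, "D"), (0, 0, "-")]  # retry, then transfer of D
bad = [(1, 1, "D"), (0, 0, "-")]                # sender withdraws D after a retry
```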
77
The protocol
Sender to Receiver trace (one column per cycle; Data aligned with the cycles where Valid = 1)
Data    -  D  D  -  C  C  C  B  -  A
Valid   0  1  1  0  1  1  1  1  0  1
Stop    0  0  1  0  0  1  1  0  0  0
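The trace above can be replayed in Python to see which tokens the receiver accepts. Two assumptions are made here: the seven Data symbols belong to the seven cycles with Valid = 1, and the rightmost column is the oldest cycle. Under that reading every retry is followed by the same data, as persistency requires.

```python
valid = [0, 1, 1, 0, 1, 1, 1, 1, 0, 1]
stop = [0, 0, 1, 0, 0, 1, 1, 0, 0, 0]
data = list("DDCCCBA")  # one symbol per cycle with Valid = 1

# Re-align the data symbols with the cycles in which Valid = 1.
symbols = iter(data)
aligned = [next(symbols) if v else None for v in valid]

# Assumption: the oldest cycle is the rightmost column, so reverse for time order.
in_time_order = list(reversed(list(zip(valid, stop, aligned))))

# A token is received only in transfer cycles (Valid = 1 and Stop = 0).
received = [d for v, s, d in in_time_order if v and not s]
retries = sum(1 for v, s, _ in in_time_order if v and s)
```

With these assumptions `received` lists each token once, in order, and `retries` counts the cycles in which the sender had to hold its data.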
78
Elastic Half Buffer
[Figure: elastic half buffer (EHB): a data latch with enable En_i, valid bits V_{i-1} and V_i, and stop bits S_{i-1} and S_i]
79
Join

V1
V
EHB
S1
EHB
S
V2
S2
EHB
80
(Lazy) Fork
V
V1
S1
V2
S
S2
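The slides show the join and lazy fork only as pictures. The sketch below uses one common Boolean formulation of such controllers; the equations are an assumption, not read off the figures. It checks exhaustively that a transfer (Valid = 1, Stop = 0) happens on all channels of a controller in the same cycle or on none.

```python
from itertools import product
from typing import Tuple


def join(v1: bool, v2: bool, s: bool) -> Tuple[bool, bool, bool]:
    """Assumed join control: returns (V, S1, S2)."""
    v = v1 and v2                # output valid only when both inputs are valid
    s_in = not (v and not s)     # stop both inputs unless the output transfers
    return v, s_in, s_in


def lazy_fork(v: bool, s1: bool, s2: bool) -> Tuple[bool, bool, bool]:
    """Assumed lazy fork control: returns (V1, V2, S)."""
    v1 = v and not s2            # offer data to branch 1 only if branch 2 is ready
    v2 = v and not s1            # and vice versa
    s = s1 or s2                 # stop the input if any branch stops
    return v1, v2, s


def transfers_are_synchronized() -> bool:
    """Exhaustively check that all channels of a controller transfer together."""
    for a, b, c in product([False, True], repeat=3):
        v, s1, s2 = join(a, b, c)
        out = v and not c
        if (a and not s1) != out or (b and not s2) != out:
            return False
        v1, v2, s = lazy_fork(a, b, c)
        inp = a and not s
        if (v1 and not b) != inp or (v2 and not c) != inp:
            return False
    return True
```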
81
Eager Fork
S1

V1
V
V2

S
S2
82
Elastic combinational paths
Fork
Join
Join / Fork
83
Elastic combinational paths
Enable signal to data latches
Fork
Join
Join / Fork
84
Elastic combinational paths
Datapath
Fork
Join
Control layer
Join / Fork
85
Elastic buffer formal model
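[Figure: buffer cells i, i+1, ..., i+k between Din and Dout, with read pointer rd, write pointer wr, and the Vin/Vout and Sin/Sout control ports]
Buffer: $0..\infty$
Initial state: $rd = wr = 0$
Invariant: $wr \ge rd$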
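A deterministic Python sketch of the pointer model, assuming an unbounded store indexed by the `rd` and `wr` counters. The class and method names are hypothetical, and the nondeterministic latencies of the abstract model are left out.

```python
from typing import Dict


class ElasticBufferModel:
    """Unbounded FIFO described by a read counter rd and a write counter wr."""

    def __init__(self) -> None:
        self.cells: Dict[int, str] = {}
        self.rd = 0   # index of the next item to read
        self.wr = 0   # index of the next free cell

    def write(self, item: str) -> None:
        self.cells[self.wr] = item
        self.wr += 1

    def read(self) -> str:
        if self.rd == self.wr:
            raise IndexError("buffer is empty")  # reading here would break wr >= rd
        item = self.cells.pop(self.rd)
        self.rd += 1
        return item

    def invariant(self) -> bool:
        return self.wr >= self.rd


buf = ElasticBufferModel()
buf.write("a")
buf.write("b")
first = buf.read()   # items come out in the order they were written
```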
86
Elastic buffer formal model
  • Liveness properties (finite unbounded latencies)
      • Finite forward latency: $G\,(rd < wr \Rightarrow F\,V_{out})$
      • Finite backward latency: $G\,(\lnot S_{out} \Rightarrow F\,\lnot S_{in})$

87
Formal verification
i i1
ik

Din
Dout
Vin
Vout
rd
wr
Sout
Sin
?
Din
Dout
Implementation
Vin
Vout
Sin
Sout
88
Formal verification
  • The abstract FSM model is appropriate for
    compositional verification
  • Verification of implementations with model
    checking (1-bit abstractions of the datapath)
  • Buffer is a refinement of the spec
  • In-order data-transmission
  • Correct synchronization of fork/join structures
  • Absence of deadlocks
  • LTL specs, SMV model checker

89
Formal verification
Dout
Din
Abstract model (NFSM)
Abstract model (NFSM)
Vin
Vout
Sin
Sout
?
Din
Dout
Abstract model (NFSM)
Vin
Vout
Sin
Sout
91
Formal verification
Dout
Din
Abstract model (NFSM)
Abstract model (NFSM)
Vin
Vout
Sin
Sout
?
Assuming the same initial contents (e.g. empty)
Din
Dout
Abstract model (NFSM)
Vin
Vout
Sin
Sout
92
Flow equivalence
Synchronous
D:  a b c d e f g h i j k
Elastic
D:  a a b b b c d e e f g g h i i i j k
En: 1 0 1 0 0 1 1 1 0 1 1 0 1 1 0 0 1 1
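The two traces above can be compared directly: keeping only the elastic values sampled when En = 1 should give back the synchronous stream. A minimal sketch, with `valid_stream` as a hypothetical helper name:

```python
from typing import List, Sequence


def valid_stream(data: Sequence[str], enable: Sequence[int]) -> List[str]:
    """Keep only the values observed in cycles where En = 1."""
    return [d for d, en in zip(data, enable) if en]


synchronous = list("abcdefghijk")
elastic_d = list("aabbbcdeefgghiiijk")
elastic_en = [1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1]

# Flow equivalence for this example: same values in the same order.
flow_equivalent = valid_stream(elastic_d, elastic_en) == synchronous
```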
93
Elasticization
Synchronous
Elastic
94
CLK
95
[Figure: pipeline with PC and the IF/ID, ID/EX, EX/MEM and MEM/WB registers, connected through FORK and JOIN controllers; CLK]
96
[Figure, animation frames 96-98: FORK and JOIN controllers of the pipeline; CLK]
99
Elastic control layer: generation of gated
clocks
CLK
100
Bubble (empty latch) insertion
  • Does not change functionality (flow equivalence)
  • May affect performance (recycling)
      • Throughput may decrease (tokens / cycle)
      • Cycle time may be shortened
      • Performance (tokens / time unit) ???
  • New dimension for architectural exploration
    (fine granularity!)
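The trade-off in the bullets above comes down to one ratio: tokens per time unit equals tokens per cycle divided by the cycle time. The numbers below are invented purely to show that a bubble can help or hurt; they are not measurements.

```python
def performance(tokens_per_cycle: float, cycle_time_ns: float) -> float:
    """Tokens per nanosecond = throughput (tokens/cycle) / cycle time (ns/cycle)."""
    return tokens_per_cycle / cycle_time_ns


# Hypothetical design points, for illustration only.
baseline = performance(tokens_per_cycle=1.0, cycle_time_ns=2.0)
bubble_helps = performance(tokens_per_cycle=0.8, cycle_time_ns=1.5)
bubble_hurts = performance(tokens_per_cycle=0.5, cycle_time_ns=1.5)
```

In the first bubbled case the shorter cycle more than pays for the lost throughput; in the second it does not.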

101
Variable-latency Units
0 - k cycles
done
go
102
Variable-latency units
  • Telescopic units
  • 1 cycle for fast operations
  • 2 cycles for slow operations
  • Examples
  • Short / long additions (carry propagation)
  • A * 0, A / 1
  • Dynamic changes in latency (fast if cold, slow if
    hot)
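A sketch of a telescopic adder along the lines of the short / long addition example: 1 cycle when the carry chain is short, 2 cycles otherwise. The chain-length measure, the limit of 8 bits and the 32-bit width are assumptions chosen for illustration.

```python
def longest_carry_chain(a: int, b: int, width: int = 32) -> int:
    """Length of the longest run that starts at a generated carry and
    continues through positions that propagate it."""
    longest = chain = 0
    for i in range(width):
        bit_a, bit_b = (a >> i) & 1, (b >> i) & 1
        if bit_a & bit_b:
            chain = 1              # carry generated at this position
        elif (bit_a ^ bit_b) and chain:
            chain += 1             # incoming carry propagates further
        else:
            chain = 0              # no carry, or the carry is absorbed
        longest = max(longest, chain)
    return longest


def add_latency(a: int, b: int, fast_chain_limit: int = 8,
                width: int = 32) -> int:
    """Cycles needed by the assumed telescopic adder: 1 (fast) or 2 (slow)."""
    return 1 if longest_carry_chain(a, b, width) <= fast_chain_limit else 2


fast = add_latency(1, 1)        # carry dies after one position
slow = add_latency(0xFFFF, 1)   # carry ripples through 16 positions
```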

103
Some related work
  • Latency-insensitive design
      • Carloni and a few follow-ups (large overhead)
      • Wire pipelining (Svensson, Nookala, Casu, ...)
      • Interlock pipelines (H. Jacobson et al.)
  • Asynchronous design
      • Micropipelines (Sutherland)
      • Rings (Williams, Sparsø)
      • CHP and slack elasticity (Martin, Burns, Manohar
        et al.)
  • De-synchronization
      • J. Cortadella et al.
      • V. Varshavsky
  • Synchronous implementation of CSP
      • J. O'Leary et al.
      • A. Peeters et al.

104
Summary
  • SELF adds discrete-time handshaking to the clock
    with a very small buffering overhead
  • Gives a last-minute extra tick to anybody who
    wants it
  • Compositional theory proving correctness (Krstic
    et al., FMCAD'06)
  • Libraries of controllers have been designed and
    their correctness verified
  • Elasticization CAD in progress
  • New micro-architectural opportunities based on
    variable latency units