Title: Design and Analysis of an NoC Architecture from Performance, Reliability and Energy Perspective
1Design and Analysis of an NoC Architecture from
Performance, Reliability and Energy Perspective
- J. Kim, D. Park, C. Nicopoulos, N. Vijaykrishnan, C. R. Das
- Dept. of Computer Science and Engineering
- Pennsylvania State University
2Outline
- Motivation
- A New Router Architecture
- An Analytical Model for NoC interconnections
- Fault-Tolerance and Energy Analysis
- Conclusions
3Motivation
The SoC Era → Network-on-Chip
[Figure: SoC blocks: ALU core, VGA core, DSP, ADC/DAC, analog]
4Motivation (cont)
- A chip-wide network: Processing Elements (PEs) are interconnected via a packet-based network in an NoC architecture.
[Figure: packetized message (MSG) and decoded message (MSG)]
5Motivation (cont)
NoC Design Issues
- NoCs are critical for supporting hundreds of functional units.
- They need high performance, low energy consumption and reliable data transfer.
- Designing NoCs for performance, energy and fault tolerance is challenging with a limited silicon budget in deep sub-micron technology.
- Prior work mostly considered performance and energy issues.
6Motivation (cont)
- We focus on
- Designing a new router architecture
- Developing a queuing model for performance and energy analysis
- Investigating possible fault scenarios and proposing suitable fault-tolerant techniques
7A Generic Router Architecture
On-chip Virtual Channel Router
8A Typical Router Pipeline
[Figure: typical router pipeline from FLIT IN to FLIT OUT; labels: PV1, routing, buffers, VC allocation, arbitration, switch traversal]
9The Proposed Router Pipeline
- Routing performed in two steps
- Partition output paths into 2 choices (next-hop quadrants)
- In previous node: look-ahead routing (route for current node) determines the output quadrant
- In current node: select the output channel
[Figure: proposed router pipeline from FLIT_IN to FLIT_OUT; labels: buffering, next-node routing, pre-selection (with credit from next router), arbitration, X-bar traversal]
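The look-ahead step can be illustrated with a minimal sketch. It assumes a 2-D mesh with (x, y) coordinates, x growing east and y growing north, and four quadrants (NE, NW, SE, SW) plus local ejection to the PE. The names `Quadrant` and `lookahead_quadrant` and the tie-breaking rule for destinations lying exactly on an axis are illustrative assumptions, not taken from the slides.

```python
from enum import Enum


class Quadrant(Enum):
    NE = "NE"
    NW = "NW"
    SE = "SE"
    SW = "SW"
    PE = "PE"  # destination reached: eject to the local processing element


def lookahead_quadrant(next_node, dest):
    """Quadrant of `dest` as seen from `next_node`.

    Computed one hop early, so the next router already knows the
    quadrant when the flit arrives. Assumed convention: a destination
    on the same row or column falls into the north/east side.
    """
    nx, ny = next_node
    dx, dy = dest[0] - nx, dest[1] - ny
    if dx == 0 and dy == 0:
        return Quadrant.PE
    vertical = "N" if dy >= 0 else "S"
    horizontal = "E" if dx >= 0 else "W"
    return Quadrant[vertical + horizontal]


if __name__ == "__main__":
    # Flit is about to move to node (1, 1); its destination is (3, 3).
    print(lookahead_quadrant((1, 1), (3, 3)))
```

Computing the quadrant in the previous node leaves only the choice between the quadrant's candidate output channels for the current node.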
10Pre-Selection Mechanism
- The Pre-Selection unit is updated every cycle with local congestion information.
- Example
- Packet comes from West
- Previous node sets which of the two quadrants (NE, SE) or the PE it will go to
- Pre-Selection Mechanism in the current node determines the final output channel based on network status.
[Figure: NE and SE quadrants]
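A pre-selection table of this kind might look as follows. This is a sketch under stated assumptions: each quadrant offers two candidate output ports, congestion is measured by the number of free credits reported by the neighbouring router, and ties go to the first listed port. `QUADRANT_OUTPUTS`, `preselect` and `select_output` are hypothetical names.

```python
# Candidate output ports per quadrant (assumed mapping for a 2-D mesh).
QUADRANT_OUTPUTS = {
    "NE": ("N", "E"),
    "NW": ("N", "W"),
    "SE": ("S", "E"),
    "SW": ("S", "W"),
}


def preselect(free_credits):
    """Recomputed every cycle: pick, per quadrant, the port with more free credits."""
    return {
        quadrant: max(ports, key=lambda port: free_credits[port])
        for quadrant, ports in QUADRANT_OUTPUTS.items()
    }


def select_output(quadrant, preselected):
    """Final output channel for a flit whose quadrant was set by the previous node."""
    return "PE" if quadrant == "PE" else preselected[quadrant]


if __name__ == "__main__":
    credits = {"N": 1, "E": 3, "S": 2, "W": 0}
    table = preselect(credits)
    # A packet arriving from the West and tagged "NE" takes the less congested port.
    print(select_output("NE", table))
```

Because the table is refreshed ahead of time, the arriving flit only needs a lookup rather than a full routing computation.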
11Path-Sensitive Router Architecture
12Initial Performance Comparison with a 2-stage router
[Figures: comparison under uniform traffic and under self-similar traffic]
13An Analytical Model for NoCs
- The average network latency consists of two parts: the actual message transfer time and the blocking time. Thus, the network latency $T_j$ of an M-flit packet at router j is
$T_j = (M + B_j) W_j + P - 1$,
where $M$ is the number of flits per packet, $P$ the number of pipeline stages, $B_j$ the average blocking length per flit, and $W_j$ the average waiting time per flit.
- The waiting time depends on channel contention and finite buffer blocking. Contention occurs at two modules, virtual channel allocation (VA) and switch allocation (SA):
$P_{con} = 1 - (1 - P_{con\_va})(1 - P_{con\_sa})$
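As a worked example, the two expressions can be evaluated directly. The sketch reads the latency expression as T_j = (M + B_j) W_j + P - 1 and the contention probability as P_con = 1 - (1 - P_con_va)(1 - P_con_sa). The function names and the numeric inputs are illustrative assumptions, not values from the slides.

```python
def contention_probability(p_con_va, p_con_sa):
    """Probability of contention at VA or SA, treating the two stages as independent."""
    return 1.0 - (1.0 - p_con_va) * (1.0 - p_con_sa)


def router_latency(m_flits, pipeline_stages, blocking_len, wait_per_flit):
    """T_j = (M + B_j) * W_j + P - 1 for one router j."""
    return (m_flits + blocking_len) * wait_per_flit + pipeline_stages - 1


if __name__ == "__main__":
    # Illustrative numbers only: 4-flit packet, 3-stage pipeline.
    print(contention_probability(0.1, 0.2))
    print(router_latency(m_flits=4, pipeline_stages=3,
                         blocking_len=0.5, wait_per_flit=1.2))
```

With no blocking (B_j = 0) and a waiting time of one cycle per flit, the expression reduces to M + P - 1, the usual pipelined transfer time for an M-flit packet.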
14Modeling the Finite size Buffer in NoCs
The average queuing length ($B$) and the buffer unavailability probability ($P_{block}$) can be iteratively estimated from the contention probability and the finite-buffer state diagram.
[Figure: state diagram with buffer-occupancy states 0, 1, 2, ..., D. Each transition to the next higher state has probability $(1-(1-P_{con})(1-P_{block}))P_c$; each transition to the next lower state has probability $(1-P_{con})(1-P_{block})(1-P_c)$.]
The average waiting time can be estimated from the steady-state traffic intensity $\rho$:
$\rho = (1-(1-P_{con})(1-P_{block}))P_c \, [(1-P_{con})(1-P_{block})(1-P_c)]^{-1}$
$P_c$ is the flit arrival probability.
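One way to carry out such an iterative estimate is sketched below. It treats the state diagram as a birth-death chain over occupancies 0..D whose steady-state probabilities are proportional to powers of the traffic intensity. Two points are assumptions of this sketch rather than statements from the slides: the blocking probability is taken to be the probability of the full state D, and the fixed point is found by damped iteration. All function names are hypothetical.

```python
def traffic_intensity(p_con, p_block, p_c):
    """Ratio of the upward to the downward transition probability."""
    service = (1.0 - p_con) * (1.0 - p_block)
    return (1.0 - service) * p_c / (service * (1.0 - p_c))


def steady_state(rho, depth):
    """Steady-state probabilities of a birth-death chain with states 0..depth."""
    weights = [rho ** k for k in range(depth + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def estimate_buffer(p_con, p_c, depth, iterations=100, damping=0.5):
    """Return (average queue length B, blocking probability P_block).

    Requires p_con < 1 and p_c < 1. Assumption: P_block equals the
    probability that the buffer is full (state `depth`).
    """
    p_block = 0.0
    for _ in range(iterations):
        rho = traffic_intensity(p_con, p_block, p_c)
        pi = steady_state(rho, depth)
        p_block = damping * p_block + (1.0 - damping) * pi[-1]
    avg_len = sum(k * p for k, p in enumerate(pi))
    return avg_len, p_block


if __name__ == "__main__":
    print(estimate_buffer(p_con=0.2, p_c=0.3, depth=4))
```

The damping factor only stabilises the iteration; it does not change the fixed point being sought.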
15Path-Sensitive Router Model
The Generic Router Queuing Systems
The Path-Sensitive Queuing Model
- Early ejection
- Fewer competing channels for an output port
- Lower contention probability
- Blocking time is decreased in the Path-Sensitive queuing model.
16Comparison of Analytical Model and Simulation
Virtual Channel Router
Path-Sensitive Router
8x8 2-D Mesh Network
17Model Utility
[Figure: the analytical model between its input and output; labels: system workload parameters, link error, model analysis, utilization of different components]
18Fault-Tolerance and Energy Analysis
- Possible soft faults that could afflict a network architecture can be grouped in two main categories:
- Link errors that occur during the traversal of flits from router to router (channel disturbances such as cross-talk, coupling noise and transient faults)
- Router errors that occur within the router hardware components
19Link Errors
- Link errors have so far been considered the dominant source of network infrastructure errors and have been given a great deal of attention in current reliability schemes.
- Five different retransmission/error correction schemes are analyzed:
- End-to-End (E2E)
- Hop-by-Hop (HBH)
- Forward Error Correction (FEC)
- Header E2E (HE2E)
- Header FEC (HFEC)
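To give a feel for why the choice of scheme matters, the sketch below compares the expected number of link traversals for hop-by-hop and end-to-end retransmission. It is a simplified textbook-style model, not the analysis behind the slides: it assumes independent errors with the same probability on every hop, perfect error detection, no cost for acknowledgements, and that a corrupted packet in the end-to-end case still travels the whole path before being resent. The function names are hypothetical.

```python
def expected_traversals_hbh(hops, p_err):
    """Hop-by-hop: each hop is retried on its own until it succeeds (mean 1 / (1 - p))."""
    return hops / (1.0 - p_err)


def expected_traversals_e2e(hops, p_err):
    """End-to-end: the whole path is resent until every hop succeeds in one attempt."""
    return hops / (1.0 - p_err) ** hops


if __name__ == "__main__":
    for p in (0.00001, 0.0001, 0.001, 0.01):
        print(p, expected_traversals_hbh(8, p), expected_traversals_e2e(8, p))
```

Under these assumptions the two schemes cost about the same at low error probability, and the end-to-end cost grows faster as errors become more frequent or paths get longer.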
20Router Errors
- Until very recently, not much attention had been given to the effects of transient errors (e.g. soft errors) occurring within a router. The susceptibility of circuits to such errors increases exponentially with technology scaling in the deep sub-micron regime.
- Soft errors within the router would escape the error detecting/correcting blanket because they do not actually corrupt the data but instead cause erroneous behavior in the routing process.
21Router Errors (Cont)
Faults are considered in the following five units:
- Routing Unit
- Virtual Channel Allocation
- Switch Arbiter
- Crossbar
- Valid/Ready Handshaking Signal Errors
22Routing Unit Errors
23Routing Unit Errors
24Simulation Results
Latency with Data Logic Errors
[Plot: latency (cycles, axis 25 to 50) vs. error probability (0.00001 to 0.01); series: LINK-HBH, LINK-HE2E, LINK-HFEC, ROUTE, SW-ARB]
25Simulation Results
Number of Errors Corrected
[Plot: number of errors corrected (thousands, axis 0 to 60) vs. error probability (0.00001 to 0.01); series: LINK-HBH, LINK-HE2E, LINK-HFEC, ROUTE, SW-ARB]
26Simulation Results
Energy Consumed per Packet
[Plot: energy (nJ, axis 0 to 6) vs. error probability (0.00001 to 0.01); series: LINK-HBH, LINK-HE2E, LINK-HFEC, ROUTE, SW-ARB]
27Conclusions
- Proposed a Path-Sensitive Router to enhance the overall performance and adaptivity of on-chip networks.
- Average latency can be reduced by up to 30%.
- Proposed a queuing-theory-based analytical model for performance, power and fault-tolerance analysis.
- Investigated fault-tolerance aspects of on-chip link errors and intra-router errors.
- A separate error coding technique for the header flits reduces packet misrouting probability.
- Fault protection techniques tackle soft errors in router components.
28Thank You!