1
Dynamic Interconnection Networks

CEG 4131 Computer Architecture III
Miodrag Bolic

2
Quiz 1

NIOS II processor basics
FPGA basics
Caches
Performance
Size, number of bits
Block placement
Block identification
Block replacement
Write strategy

3
Quiz 1 (Cont.)

Key terms
Flynn's taxonomy
Shared memory architectures
Cache coherence
NUMA, UMA, COMA
Symmetric Multiprocessors
Distributed memory systems
Classification based on communication
Classification based on type of parallelism
Chapter 1 from the textbook

4
Quiz 1 (Cont.)

Amdahl's law
Speedup, Efficiency
Parallelism profile, average parallelism, MIPS
Scalability
Understanding the performance of the program for
parallel addition
Chapters 3.1, 3.2.2, 3.4, 3.5

5
Overview

Network properties
Switches
Single and multistage Interconnection networks
Crossbar

6
Network properties

Node degree d: the number of edges incident on a
node.
In degree: the number of channels into a node
Out degree: the number of channels out of a node
Diameter D of a network is the maximum, over all
pairs of nodes, of the shortest path between them.
The network is symmetric if it looks the same
from any node.
The network is scalable if it is expandable, with
performance that scales as the machine resources
are increased.
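
The two numeric properties above can be checked on a small example. The following is only a sketch: the 8-node bidirectional ring and the helper names (ring, node_degree, diameter) are assumptions chosen for illustration, not part of the original slides.

```python
from collections import deque

def ring(n):
    """Adjacency lists of an n-node bidirectional ring (hypothetical example topology)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def node_degree(adj):
    """Largest number of edges incident on any node."""
    return max(len(neighbours) for neighbours in adj.values())

def diameter(adj):
    """Maximum over all node pairs of the shortest path length (BFS from every node)."""
    longest = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        longest = max(longest, max(dist.values()))
    return longest

net = ring(8)
print(node_degree(net), diameter(net))  # 2 4
```

For the 8-node ring every node has degree 2 and the farthest node is 4 hops away, so d = 2 and D = 4.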

7
Bisection width

Bisection width is the minimum number of wires
that must be cut to divide the network into two
equal halves.
Small bisection width -> low bandwidth
Large bisection width -> a lot of extra wires
A cut of a network C(N1,N2) is a set of channels
that partitions the set of all nodes into two
disjoint sets N1 and N2. Each element of C(N1,N2)
is a channel with a source in N1 and destination
in N2 or vice versa.
A bisection of a network is a cut that partitions
the entire network nearly in half, such that
|N2| <= |N1| <= |N2| + 1. Here |N2| means the
number of nodes that belong to the partition N2.
The channel bisection of a network is the minimum
channel count over all bisections of the network.
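
For very small networks the channel bisection can be found by trying every near-equal split. This is a brute-force sketch only (exponential in the number of nodes); the function name channel_bisection and the ring example are assumptions made for illustration.

```python
from itertools import combinations

def channel_bisection(nodes, channels):
    """Minimum number of channels crossing any split with |N2| <= |N1| <= |N2| + 1."""
    nodes = list(nodes)
    half = len(nodes) // 2              # |N2|; N1 receives the extra node when the count is odd
    best = None
    for candidate in combinations(nodes, half):
        n2 = set(candidate)
        # A channel belongs to the cut when exactly one of its endpoints lies in N2.
        crossing = sum(1 for a, b in channels if (a in n2) != (b in n2))
        if best is None or crossing < best:
            best = crossing
    return best

ring8 = [(i, (i + 1) % 8) for i in range(8)]   # 8-node ring, one channel per neighbouring pair
print(channel_bisection(range(8), ring8))      # 2
```

Cutting a ring into two contiguous halves removes exactly two channels, which is why the ring's bisection is 2 regardless of its size.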

8
Factors Affecting Performance

Functionality: how the network supports data
routing, interrupt handling, synchronization,
request/message combining, and coherence
Network latency: worst-case time for a unit
message to be transferred
Bandwidth: maximum data rate
Hardware complexity: implementation costs for
wire, logic, switches, connectors, etc.

9
2 x 2 Switches
From Advanced Computer Architectures, K. Hwang,
1993.
10
Switches

Permutation function: each input can be connected
to only a single output.
Legitimate state: each input can be connected to
multiple outputs, but each output can be
connected to only a single input.
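
The difference between the two definitions can be made concrete by enumerating switch settings. This sketch assumes that every output is driven by exactly one input (no idle outputs); the function names are hypothetical.

```python
from itertools import product

def legitimate_states(n):
    """All settings of an n x n switch, written as a tuple: state[output] = input driving it."""
    return list(product(range(n), repeat=n))

def permutation_states(n):
    """Legitimate states in which every input is also connected to exactly one output."""
    return [state for state in legitimate_states(n) if len(set(state)) == n]

print(len(legitimate_states(2)), len(permutation_states(2)))  # 4 2
```

For a 2 x 2 switch the four legitimate states are straight, exchange, upper broadcast and lower broadcast; only straight and exchange are permutations. In general the counts under this assumption are n^n and n!.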

11
Single-stage networks

Single-stage Shuffle-Exchange IN (left)
Perfect shuffle mapping function (right)
Perfect shuffle operation: cyclic shift 1 place
left, e.g. 101 -> 011
Exchange operation: invert least significant bit,
e.g. 101 -> 100
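
Both operations are simple bit manipulations on the node address. A minimal sketch, assuming addresses are stored as integers of a fixed width n_bits (the function names are chosen here for illustration):

```python
def perfect_shuffle(addr, n_bits):
    """Cyclic left shift of an n_bits-wide address by one place."""
    mask = (1 << n_bits) - 1
    return ((addr << 1) | (addr >> (n_bits - 1))) & mask

def exchange(addr):
    """Invert the least significant bit of the address."""
    return addr ^ 1

print(format(perfect_shuffle(0b101, 3), "03b"))  # 011
print(format(exchange(0b101), "03b"))            # 100
```

Applying the shuffle n_bits times returns every address to its starting value, since the bits have been rotated through a full cycle.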

From Ben Macey at http://www.ee.uwa.edu.au/maceyb/aca319-2003
12
Multistage Interconnection Networks

The capability of a single-stage network is
limited, but if we cascade enough of them
together, they form a completely connected MIN
(Multistage Interconnection Network).
Switches can perform their own routing or can be
controlled by a central router.
This type of network can be classified into the
following categories:
Nonblocking
A network is called strictly nonblocking if it
can connect any idle input to any idle output
regardless of what other connections are
currently in progress.
Rearrangeable nonblocking
In this case a network should be able to
establish all possible connections between inputs
and outputs by rearranging its existing
connections.
Blocking interconnection
A network is said to be blocking if it can
perform many, but not all, possible connections
between terminals.
Example: the Omega network

13
Omega networks

A multistage IN using 2 x 2 switch boxes and a
perfect shuffle interconnect pattern between the
stages
In the Omega MIN there is one unique path from
each input to each output.
No redundant paths -> no fault tolerance and the
possibility of blocking

Example
Connect input 101 to output 001
Use the bits of the destination address, 001,
for dynamically selecting a path
Routing
- 0 means use upper output
- 1 means use lower output

From Ben Macey at http://www.ee.uwa.edu.au/maceyb/aca319-2003
14
Omega networks

log2(N) stages of 2 x 2 switches
N/2 switches per stage
S = (N/2) log2(N) switches
Number of permutations in an Omega network: 2^S
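
A quick calculation shows how these counts compare with the N! permutations a nonblocking network would support. The sketch assumes N is a power of two and that each 2 x 2 switch is set either straight or exchange; omega_size is a hypothetical helper name.

```python
from math import factorial

def omega_size(n_inputs):
    """Return (stages, switches per stage, total switches S, permutations 2**S)."""
    stages = n_inputs.bit_length() - 1          # log2(N) for a power of two
    if n_inputs < 2 or 2 ** stages != n_inputs:
        raise ValueError("n_inputs must be a power of two, at least 2")
    per_stage = n_inputs // 2
    total = stages * per_stage
    return stages, per_stage, total, 2 ** total

print(omega_size(8))     # (3, 4, 12, 4096)
print(factorial(8))      # 40320
```

For N = 8 the network has S = 12 switches and realizes 2^12 = 4096 permutations in one pass, about a tenth of the 8! = 40320 possible ones, which is another way of seeing that the Omega network is blocking.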

15
Baseline networks

The network can be generated recursively
The first stage is N x N, the second (N/2) x (N/2)
Networks are topologically equivalent if one
network can be reproduced from the other
simply by rearranging the nodes at each
stage.

From Advanced Computer Architectures, K. Hwang,
1993.
16
Crossbar Network

Each junction is a switching component
connecting the row to the column.
Can only have one connection in each column

From Advanced Computer Architectures, K. Hwang,
1993.
17
Crossbar Network

The major advantage of the crossbar switch is
its potential for speed.
In one clock cycle, a connection can be made
between source and destination.
The diameter of the crossbar is one.
Blocking occurs if the destination is in use.
Because of its complexity, the cost of the
crossbar switch can become the dominant factor
for a large multiprocessor system.
Crossbars can be used to implement the a x b
switches used in MINs. In this case each
crossbar is small, so costs are kept down.
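
The "blocking if the destination is in use" behaviour can be sketched with a crosspoint matrix. The fixed-priority policy (lowest-numbered source wins) and the name arbitrate are assumptions for illustration; the slides do not specify an arbitration scheme.

```python
def arbitrate(requests, n):
    """requests maps source row -> destination column.

    Returns (c, blocked): c[i][j] == 1 when crosspoint (i, j) is closed,
    blocked lists the sources whose destination was already in use.
    """
    c = [[0] * n for _ in range(n)]
    busy = set()                      # columns that already carry a connection
    blocked = []
    for src in sorted(requests):      # assumed policy: lowest-numbered source wins
        dst = requests[src]
        if dst in busy:
            blocked.append(src)
        else:
            c[src][dst] = 1
            busy.add(dst)
    return c, blocked

c, blocked = arbitrate({0: 2, 1: 2, 3: 0}, 4)
print(blocked)                        # [1]
```

Sources 0 and 3 are connected in the same cycle because they target different columns; source 1 is blocked only because column 2 is already taken, never because of an internal path conflict.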

18
Problem

Use two-input AND and OR gates to construct an
N x N crossbar switch network between N processors
and N memory modules. Use the c_ij signal as the
enable signal for the switch in the ith row and
jth column.
Let the width of each crosspoint be w bits.
Estimate the total number of AND and OR gates
needed as a function of N and w.
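
One possible way to set up the estimate is sketched below. It is not taken from the slides and rests on explicit assumptions: data flows in one direction only (processor to memory), each crosspoint gates every data bit with c_ij using one two-input AND gate, and each memory column merges the N gated bits per bit position with a chain or tree of N - 1 two-input OR gates. Gates for arbitration, addresses or the return path are not counted.

```python
def crossbar_gate_count(n, w):
    """Estimated (AND gates, OR gates) for an n x n crossbar with w-bit crosspoints."""
    and_gates = n * n * w          # one AND per data bit per crosspoint (bit AND c_ij)
    or_gates = n * (n - 1) * w     # n - 1 two-input ORs merge n gated bits, per column and bit
    return and_gates, or_gates

print(crossbar_gate_count(4, 8))   # (128, 96)
```

Under these assumptions the total is N^2 w AND gates plus N(N - 1) w OR gates, so the gate count grows quadratically with N and linearly with w.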

19
Problem (cont.)
20
Problem (cont.)
21
Problem (cont.)
22
Performance Comparison
23
Some Commercial Solutions [3]

System-on-chip crossbar networks
Nexus from Fulcrum Microsystems
The core is used in PMC-Sierra dual MIPS
processor RM9000

24
References

[1] H. El-Rewini and M. Abd-El-Barr, Advanced
Computer Architecture and Parallel Processing,
John Wiley and Sons, 2005.
[2] K. Hwang, Advanced Computer Architecture:
Parallelism, Scalability, Programmability,
McGraw-Hill, 1993.
[3] A. Lines, Nexus: an asynchronous crossbar
interconnect for synchronous system-on-chip
designs, Proc. of High Performance
Interconnects, pp. 2-7, 2003.