Chapter 10: Scalable Interconnection Networks - PowerPoint PPT Presentation

Slides: 57
Provided by: trevor60
Transcript and Presenter's Notes

Title: Chapter 10: Scalable Interconnection Networks


1
Chapter 10: Scalable Interconnection Networks
2
Goals
  • Low latency
  • to neighbors
  • to average/furthest node
  • High bandwidth
  • per-node and aggregate
  • bisection bandwidth ..
  • Low cost
  • switch complexity, pin count
  • wiring cost (connectors!)
  • Scalability

3
Requirements from Above
  • Communication-to-computation ratio implies
    bandwidth needs
  • local or global?
  • regular or irregular (arbitrary)?
  • bursty or uniform?
  • broadcasts? multicasts?
  • Programming Model
  • protocol structure
  • transfer size
  • importance of latency vs. bandwidth

4
Basic Definitions
  • switch: a device capable of transferring data
    from input ports to output ports in an arbitrary
    pattern
  • network: a graph of V = {switches/nodes} connected
    by communication channels (aka links) $C \subseteq V \times V$
  • direct: a node at each switch
  • indirect: nodes only on the edge (like the internet)
  • route: the sequence of links a message follows from
    node A to node B

5
What characterizes a network?
  • Topology
  • interconnection structure of the graph
  • Switching Strategy
  • circuit vs. packet switching
  • store-and-forward vs. cut-through
  • virtual cut-through vs. wormhole, etc.
  • Routing Algorithm
  • how is route determined
  • Control Strategy
  • centralized vs. distributed
  • Flow Control Mechanism
  • when does a packet (or a portion of it) move
    along its route
  • can't send two packets down same link at same
    time

6
Topologies
  • Topology determines many critical parameters
  • degree: number of input (output) ports per switch
  • distance: number of links in route from A to B
  • average distance: average over all (A,B) pairs
  • diameter: max distance between two nodes (using
    shortest paths)
  • bisection: min number of links to separate into
    two halves
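The definitions above can be checked on small graphs. The following is a minimal sketch, assuming an adjacency-list representation and bidirectional links; the helper names (`linear_array`, `ring`, `diameter_and_average`) are illustrative and not from the slides.

```python
from collections import deque

def linear_array(n):
    """Adjacency lists for n nodes in a line (bidirectional links)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

def ring(n):
    """Adjacency lists for n nodes in a bidirectional ring."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def distances_from(adj, src):
    """Breadth-first search: hop count from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_and_average(adj):
    """Diameter and mean shortest-path distance over all ordered pairs A != B."""
    total, longest, pairs = 0, 0, 0
    for a in adj:
        for b, d in distances_from(adj, a).items():
            if a != b:
                total += d
                longest = max(longest, d)
                pairs += 1
    return longest, total / pairs
```

For an 8-node linear array this gives a diameter of 7 and an average distance of 3.0; for an 8-node bidirectional ring the diameter drops to 4.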

7
Bus
  • Fully connected network topology
  • bus plus interface logic is a form of switch
  • Parameters
  • diameter = average distance = 1
  • degree = N
  • switch cost: O(N)
  • wire cost: constant
  • bisection bandwidth: constant (or worse)
  • Broadcast is free

8
Crossbar
  • Another fully connected network topology
  • one big switch of degree N connects all nodes
  • Parameters
  • diameter = average distance = O(1)
  • degree = N
  • switch cost: $O(N^2)$
  • wire cost: 2N
  • bisection bandwidth: O(N)
  • Most switches in other topologies are crossbars
    inside

9
How to build a crossbar
10
Switches
11
Linear Arrays and Rings
  • Linear Array
  • Diameter? N - 1
  • Avg. Distance? N/3
  • Bisection? 1
  • Torus (ring) links may be unidirectional or
    bidirectional
  • Examples: FDDI, SCI, Fibre Channel, KSR1

12
Multidimensional Meshes and Tori
  • d-dimensional k-ary mesh: $N = k^d$, $k = \sqrt[d]{N}$
  • k nodes in each of d dimensions
  • torus has wraparound, array doesn't; "mesh" is
    ambiguous
  • general d-dimensional mesh
  • $N = k_{d-1} \times \dots \times k_0$ nodes

13
Properties
  • ($N = k^d$, $k = \sqrt[d]{N}$)
  • Diameter?
  • Average Distance?
  • $d \times k/3$ for mesh
  • Degree?
  • Switch Cost?
  • Wire Cost?
  • Bisection?
  • $k^{d-1}$

14
Hypercubes
15
Trees
  • Usually indirect, occasionally direct
  • Diameter and avg. distance are logarithmic
  • k-ary tree, height $d = \log_k N$
  • Fixed degree
  • Route up to common ancestor and down
  • benefit for local traffic
  • Bisection?

16
Fat-Trees
  • Fatter links (really more of them) as you go up,
    so bisection BW scales with N

17
Butterflies
  • Tree with lots of roots!
  • N log N (actually $N/2 \times \log N$)
  • Exactly one route from any source to any dest
  • R = A xor B; at level i use straight edge if
    $r_i = 0$, otherwise cross edge
  • Bisection: N/2 vs $N^{(d-1)/d}$
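The XOR routing rule above can be sketched in a few lines. This is an illustrative sketch only: it assumes that level i corrects bit i of the row index (least significant bit first), which is one possible wiring convention, and the function names are hypothetical.

```python
def butterfly_route(src, dst, levels):
    """Return the edge type taken at each level of a 2**levels-input butterfly.

    R = src XOR dst; bit i of R selects the edge at level i:
    0 -> straight edge, 1 -> cross edge.
    """
    r = src ^ dst
    return ["cross" if (r >> i) & 1 else "straight" for i in range(levels)]

def follow(src, route):
    """Apply a route: a cross edge at level i flips bit i of the row index."""
    row = src
    for i, edge in enumerate(route):
        if edge == "cross":
            row ^= 1 << i
    return row
```

Because each bit of R is consumed exactly once, every (source, destination) pair has exactly one route, and the route always has `levels` hops even when source and destination are equal.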

18
Benes Networks and Fat Trees
  • Back-to-back butterfly can route all permutations
  • off line
  • What if you just pick a random mid point?

19
Relationship of Butterflies to Hypercubes
  • Wiring is isomorphic
  • Except that Butterfly always takes log n steps

20
Summary of Topologies
  Topology      Degree      Diameter          Ave Dist               Bisection    Diam/Ave Dist @ N=1024
  1D Array      2           N-1               N/3                    1            1023/341
  1D Ring       2           N/2               N/4                    2            512/256
  2D Array      4           $2(N^{1/2}-1)$    $\frac{2}{3}N^{1/2}$   $N^{1/2}$    63/21
  3D Array      6           $3(N^{1/3}-1)$    $N^{1/3}$              $N^{2/3}$    ~30/~10
  2D Torus      4           $N^{1/2}$         $\frac{1}{2}N^{1/2}$   $2N^{1/2}$   32/16
  k-ary n-cube  2n          nk/2              nk/4                   nk/4         15/7.5 @ n=3
  Hypercube     n = log N   n                 n/2                    N/2          10/5
  2D Tree       3           $2\log_2 N$       ~$2\log_2 N$           1            20/~20
  4D Tree       5           $2\log_4 N$       ~$2\log_4 N - 2/3$     1            10/~9
  2D fat tree   4           $\log_2 N$        ~$2\log_2 N$           N            20/~20
  2D butterfly  4           $\log_2 N$        $\log_2 N$             N/2          20/20

21
Choosing a Topology
  • Cost vs. performance
  • For fixed cost which topology provides best
    performance?
  • best performance on what workload?
  • message size
  • traffic pattern
  • define cost
  • target machine size
  • Simplify tradeoff to dimension vs. radix
  • restrict to k-ary d-cubes
  • what is best dimension?

22
How Many Dimensions in a Network?
  • d = 2 or d = 3
  • Short wires, easy to build
  • Many hops, low bisection bandwidth
  • Benefits from traffic locality
  • d >= 4
  • Harder to build, more wires, longer average
    length
  • Higher switch degree
  • Fewer hops, better bisection bandwidth
  • Handles non-local traffic better
  • Effect of number of hops on latency depends on
    switching strategy...

23
Store-and-Forward vs. Cut-Through Routing
  • messages typ. fragmented into packets and pipelined
  • cut-through pipelines on flits

24
Handling Output Contention
  • What if output is blocked?
  • virtual cut-through
  • switch w/blocked output buffers entire packet
  • degenerates to store-and-forward under heavy contention
  • requires lots of buffering in switches
  • wormhole
  • leave flits strung out over network (in buffers)
  • minimal switch buffering
  • one blocked packet can tie up lots of channels

25
Traditional Scaling: Latency(P)
  • Assumes equal channel width
  • independent of node count or dimension
  • dominated by average distance

26
Average Distance
average distance = d(k-1)/2
  • but equal channel width is not equal cost!
  • Higher dimension => more channels

27
In the 3-D world
For n nodes, bisection area is $O(n^{2/3})$
  • For large n, bisection bandwidth is limited to
    $O(n^{2/3})$
  • Dally, IEEE TPDS, Dal90a
  • For fixed bisection bandwidth, low-dimensional
    k-ary n-cubes are better (otherwise higher is
    better)
  • i.e., a few short fat wires are better than many
    long thin wires
  • What about many long fat wires?

28
Equal cost in k-ary n-cubes
  • Equal number of nodes?
  • Equal number of pins/wires?
  • Equal bisection bandwidth?
  • Equal area? Equal wire length?
  • What do we know?
  • switch degree: d; diameter = d(k-1)
  • total links = Nd
  • pins per node = 2wd
  • bisection = $k^{d-1} = N/k$ links in each direction
  • 2Nw/k wires cross the middle
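The quantities listed above can be tabulated for any radix k, dimension d, and channel width w. This is a small sketch that transcribes the slide's formulas directly; the function name and dictionary keys are illustrative.

```python
def kary_dcube_costs(k, d, w):
    """Cost figures for a k-ary d-cube with w-bit-wide channels."""
    n = k ** d
    return {
        "nodes": n,
        "switch_degree": d,
        "diameter": d * (k - 1),
        "total_links": n * d,
        "pins_per_node": 2 * w * d,
        "bisection_links_per_direction": k ** (d - 1),
        "bisection_wires": 2 * n * w // k,
    }
```

For example, `kary_dcube_costs(32, 2, 32)` describes a 1024-node 2D network with 128 pins per node and 2048 wires crossing the middle.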

29
Latency(d) for P with Equal Width
  • total links(N) = Nd

30
Latency with Equal Pin Count
  • Baseline d = 2 has w = 32 (128 wires per node)
  • fix 2dw pins => w(d) = 64/d
  • distance down with increasing d, but channel time
    up

31
Latency with Equal Bisection Width
  • N-node hypercube has N bisection links
  • 2D torus has $2N^{1/2}$
  • Fixed bisection => $w(d) = N^{1/d}/2 = k/2$
  • 1M nodes, d = 2 has w = 512!
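The two width-scaling rules (equal pin count and equal bisection width) can be compared numerically. The sketch below assumes the baseline of d = 2 with w = 32 for the equal-pin case and assumes N is a perfect d-th power for the equal-bisection case; the function names are illustrative.

```python
def width_equal_pins(d, baseline_d=2, baseline_w=32):
    """Hold 2*d*w pins constant, so w(d) = baseline_d * baseline_w / d."""
    return baseline_d * baseline_w / d

def width_equal_bisection(n_nodes, d):
    """Fixed-bisection rule: w(d) = N**(1/d) / 2 = k / 2."""
    k = round(n_nodes ** (1.0 / d))  # assumes N is a perfect d-th power
    return k / 2
```

With a million nodes (N = 2**20), `width_equal_bisection(2**20, 2)` gives 512, while `width_equal_pins(4)` shows the channel width halving to 16 when the dimension doubles from the baseline.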

32
Larger Routing Delay (w/equal pin)
  • ratio of routing to channel time is key

33
Topology Summary
  • Rich set of topological alternatives with deep
    relationships
  • Design point depends heavily on cost model
  • nodes, pins, area, …
  • Need for downward scalability tends to fix
    dimension
  • high-degree switches wasted in small
    configuration
  • grow machine by increasing nodes per dimension
  • Need a consistent framework and analysis to
    separate opinion from design
  • Optimal point changes with technology
  • store-and-forward vs. cut-through
  • non-pipelined vs. pipelined signaling

34
Real Machines
  • Wide links, smaller routing delay
  • Tremendous variation

35
What characterizes a network?
  • Topology
  • interconnection structure of the graph
  • Switching Strategy
  • circuit vs. packet switching
  • store-and-forward vs. cut-through
  • virtual cut-through vs. wormhole, etc.
  • Routing Algorithm
  • how is route determined
  • Control Strategy
  • centralized vs. distributed
  • Flow Control Mechanism
  • when does a packet (or a portion of it) move
    along its route
  • can't send two packets down same link at same
    time

36
Typical Packet Format
  • Two basic mechanisms for abstraction
  • encapsulation
  • fragmentation

37
Routing
  • Recall routing algorithm determines
  • which of the possible paths are used as routes
  • how the route is determined
  • $R: N \times N \to C$, which at each switch maps the
    destination node $n_d$ to the next channel on the
    route
  • Issues
  • Routing mechanism
  • arithmetic
  • source-based port select
  • table driven
  • general computation
  • Properties of the routes
  • Deadlock free

38
Routing Mechanism
  • need to select output port for each input packet
  • in a few cycles
  • Simple arithmetic in regular topologies
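  • ex: $\Delta x$, $\Delta y$ routing in a grid
  • west (-x): $\Delta x < 0$
  • east (+x): $\Delta x > 0$
  • south (-y): $\Delta x = 0$, $\Delta y < 0$
  • north (+y): $\Delta x = 0$, $\Delta y > 0$
  • processor: $\Delta x = 0$, $\Delta y = 0$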
  • Reduce relative address of each dimension in
    order
  • Dimension-order routing in k-ary d-cubes
  • e-cube routing in n-cube
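The port-selection rules above translate directly into a short routine. This is a minimal sketch assuming (x, y) integer coordinates on a 2D grid without wraparound; `next_port`, `route`, and the `STEP` table are illustrative names.

```python
def next_port(here, dest):
    """Pick the output port at switch `here` for a packet bound for `dest`.

    Coordinates are (x, y); x is corrected fully before y (dimension order).
    """
    dx = dest[0] - here[0]
    dy = dest[1] - here[1]
    if dx < 0:
        return "west"
    if dx > 0:
        return "east"
    if dy < 0:
        return "south"
    if dy > 0:
        return "north"
    return "processor"

# Coordinate change caused by leaving through each port.
STEP = {"west": (-1, 0), "east": (1, 0), "south": (0, -1), "north": (0, 1)}

def route(src, dest):
    """List of ports taken from src to dest, ending with delivery."""
    here, ports = src, []
    while True:
        port = next_port(here, dest)
        ports.append(port)
        if port == "processor":
            return ports
        here = (here[0] + STEP[port][0], here[1] + STEP[port][1])
```

Each decision needs only two subtractions and sign tests, which is why this style of routing fits in a few switch cycles.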

39
Routing Mechanism (cont)
[Figure: header fields P3, P2, P1, P0]
  • Source-based
  • message header carries series of port selects
  • used and stripped en route
  • CRC? header length …
  • CS-2, Myrinet, MIT Arctic
  • Table-driven
  • message header carries index for next port at
    next switch
  • o = R[i]
  • table also gives index for following hop
  • o, I' = R[i]
  • ATM, HIPPI
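The two mechanisms can be contrasted in a toy model. This is a sketch under simplifying assumptions: the header is a plain list of port numbers, each switch's table is a dictionary from index to (port, next index), and `links` maps (switch, port) to the next switch. None of these structures reflect the actual formats used by the machines named above.

```python
def source_route(header_ports, payload):
    """Source-based: each switch uses the first port select and strips it."""
    hops = []
    header = list(header_ports)
    while header:
        hops.append(header.pop(0))   # port chosen at this switch
    return hops, payload

def table_route(tables, first_switch, index, links):
    """Table-driven: header index i -> (output port o, next index i')."""
    switch, hops = first_switch, []
    while switch is not None:
        port, index = tables[switch][index]
        hops.append((switch, port))
        switch = links.get((switch, port))  # None means delivered to a node
    return hops
```

In the source-based case the header shrinks at every hop and the switches hold no routing state; in the table-driven case the header stays a fixed size and the route lives in the switch tables.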

40
Properties of Routing Algorithms
  • Deterministic
  • route determined by (source, dest), not
    intermediate state (i.e. traffic)
  • Adaptive
  • route influenced by traffic along the way
  • Minimal
  • only selects shortest paths
  • Deadlock free
  • no traffic pattern can lead to a situation where
    no packets move forward

41
Deadlock Freedom
  • How can it arise?
  • necessary conditions
  • shared resource
  • incrementally allocated
  • non-pre-emptible
  • think of a channel as a shared resource that is
    acquired incrementally
  • source buffer then dest. buffer
  • channels along a route
  • How do you avoid it?
  • constrain how channel resources are allocated
  • ex: dimension order
  • How do you prove that a routing algorithm is
    deadlock free?

42
Proof Technique
  • Resources are logically associated with channels
  • Messages introduce dependences between resources
    as they move forward
  • Need to articulate possible dependences between
    channels
  • Show that there are no cycles in Channel
    Dependence Graph
  • find a numbering of channel resources such that
    every legal route follows a monotonic sequence
  • => no traffic pattern can lead to deadlock
  • Network need not be acyclic, only channel
    dependence graph

43
Example: k-ary 2D array
  • Theorem: dimension-order x,y routing is deadlock
    free
  • Numbering
  • +x channel (i,y) -> (i+1,y) gets i
  • similarly for -x with 0 as most positive edge
  • +y channel (x,j) -> (x,j+1) gets N+j
  • similarly for -y channels
  • Any routing sequence (x direction, turn, y
    direction) is increasing
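The numbering argument can be checked exhaustively on a small array. The sketch below assumes N is the total node count k*k (any value of at least k would serve) and interprets "0 as most positive edge" as numbering -x channels upward from the largest x coordinate; the function names are illustrative.

```python
from itertools import product

def channel_number(src, dst, k):
    """Number a mesh channel src -> dst following the scheme above."""
    n = k * k                       # assumption: N is the node count
    (x0, y0), (x1, y1) = src, dst
    if x1 == x0 + 1:                # +x channel leaving column i gets i
        return x0
    if x1 == x0 - 1:                # -x: the most positive edge gets 0
        return (k - 1) - x0
    if y1 == y0 + 1:                # +y channel leaving row j gets N + j
        return n + y0
    return n + (k - 1) - y0         # -y mirrors +y

def xy_channels(src, dst):
    """Channels used by dimension-order routing: all x hops, then all y hops."""
    x, y = src
    path = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        path.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        path.append(((x, y), (x, ny)))
        y = ny
    return path

def all_routes_increase(k):
    """True if every x-then-y route uses strictly increasing channel numbers."""
    nodes = list(product(range(k), repeat=2))
    for src, dst in product(nodes, repeat=2):
        nums = [channel_number(a, b, k) for a, b in xy_channels(src, dst)]
        if any(p >= q for p, q in zip(nums, nums[1:])):
            return False
    return True
```

If every route acquires channels in strictly increasing order, no set of packets can wait on each other in a cycle, which is the acyclic channel dependence graph condition stated as the proof technique.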

44
Channel Dependence Graph
45
More examples
  • Why is the obvious routing on X deadlock free?
  • butterfly?
  • tree?
  • fat tree?
  • Any assumptions about routing mechanism? amount
    of buffering?
  • What about wormhole routing on a ring?

46
Deadlock free wormhole networks?
  • Basic dimension-order routing doesn't work for
    k-ary d-cubes
  • only for k-ary d-arrays (bi-directional, no
    wrap-around)
  • Idea add channels!
  • provide multiple virtual channels to break
    dependence cycle
  • good for BW too!
  • Don't need to add links or xbar, only buffer
    resources
  • This adds nodes to the CDG, remove edges?
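A unidirectional ring makes the effect of virtual channels concrete. The sketch below builds the channel dependence graph (CDG) with and without a second virtual channel. It assumes a dateline rule in which a packet moves to virtual channel 1 once it has crossed the wraparound link; this is one common scheme, not necessarily the one used on the slides, and the function names are illustrative.

```python
def ring_dependences(k, use_virtual_channels):
    """Channel dependence edges for minimal routing on a unidirectional ring.

    Channel (i, vc) is the link from node i to node (i + 1) % k on virtual
    channel vc. A packet holding one channel while requesting the next
    creates a dependence edge between them.
    """
    edges = set()
    for src in range(k):
        for dst in range(k):
            if src == dst:
                continue
            hops = [(src + h) % k for h in range((dst - src) % k)]
            # Assumed dateline rule: switch to VC 1 after the wraparound link.
            chans = [(i, 1 if use_virtual_channels and i < src else 0)
                     for i in hops]
            edges.update(zip(chans, chans[1:]))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    state = {}                      # node -> "active" or "done"
    def visit(u):
        state[u] = "active"
        for v in graph[u]:
            if state.get(v) == "active":
                return True
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False
    return any(u not in state and visit(u) for u in graph)
```

With one channel per link the CDG contains the full ring as a cycle. With the second virtual channel the graph gains nodes, and the edge that closed the cycle is redirected onto the new channel, so no cycle remains.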

47
Breaking deadlock with virtual channels
48
Turn Restrictions in X, Y
  • XY routing forbids 4 of 8 turns and leaves no
    room for adaptive routing
  • Can you allow more turns and still be deadlock
    free?

49
Minimal turn restrictions in 2D
[Figure: turn diagrams on axes +y, +x, -x, -y; panels: north-last, negative-first]
50
Example legal west-first routes
  • Can route around failures or congestion
  • Can combine turn restrictions with virtual
    channels
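A west-first route is easy to validate mechanically: under the west-first turn model, any westward hops must come before all other hops, because turns into the west direction are the ones forbidden. The checker below is a minimal sketch of that rule only; it does not check minimality or 180-degree reversals, and the function name is illustrative.

```python
def is_west_first(moves):
    """True if no move turns into west after a non-west move.

    `moves` is a sequence of hops drawn from "west", "east", "north", "south".
    """
    seen_other = False
    for move in moves:
        if move == "west":
            if seen_other:
                return False
        else:
            seen_other = True
    return True
```

A route such as west, west, north, east, south is legal and can detour around a failed or congested link, while north followed by west is rejected.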

51
Adaptive Routing
  • $R: C \times N \times \Sigma \to C$
  • Essential for fault tolerance
  • at least multipath
  • Can improve utilization of the network
  • Simple deterministic algorithms easily run into
    bad permutations
  • Fully/partially adaptive, minimal/non-minimal
  • Can introduce complexity or anomalies
  • Little adaptation goes a long way!

52
Contention
  • Two packets trying to use the same link at same
    time
  • limited buffering
  • drop?
  • Most parallel machine networks block in place
  • link-level flow control
  • tree saturation
  • Closed system - offered load depends on delivered

53
Flow Control
  • What do you do when push comes to shove?
  • Ethernet: collision detection and retry after
    delay
  • TCP/WAN: buffer, drop, adjust rate
  • any solution must adjust to output rate
  • Link-level flow control

54
Example: T3D
  • 3D bidirectional torus, dimension order (NIC
    selected), virtual cut-through, packet sw
  • 16 bit x 150 MHz, short, wide, synch
  • rotating priority per output
  • logically separate request/response
  • 3 independent, stacked switches
  • 8 16-bit flits on each of 4 VC in each direction

55
Routing and Switch Design Summary
  • Routing Algorithms restrict the set of routes
    within the topology
  • simple mechanism selects turn at each hop
  • arithmetic, selection, lookup
  • Deadlock-free if channel dependence graph is
    acyclic
  • limit turns to eliminate dependences
  • add separate channel resources to break
    dependences
  • combination of topology, algorithm, and switch
    design
  • Deterministic vs adaptive routing
  • Switch design issues
  • input/output/pooled buffering, routing logic,
    selection logic
  • Flow control
  • Real networks are a package of design choices

56
Example: SP
  • 8-port switch, 40 MB/s per link, 8-bit phit,
    16-bit flit, single 40 MHz clock
  • packet sw, cut-through, no virtual channel,
    source-based routing
  • variable packet < 255 bytes, 31 byte FIFO per
    input, 7 bytes per output, 16 phit links
  • 128 8-byte chunks in central queue, LRU per
    output