The fat-tree topology and its performance issues

Provided by: rache82

1
The fat-tree topology and its performance issues
  • Different names for fat-tree
  • folded Clos networks
  • constant bisection bandwidth (CBB) networks
  • Trees with multiple roots
  • multi-stage networks
  • Fat-tree is the de facto topology in high-speed
    system area networks.
  • Almost all medium and large clusters (> 100
    ports) are connected with some kind of fat-tree
    topology.

2
Why is fat-tree so popular?
  • Clusters with nodes connected by one centralized
    switch have many desirable properties for
    building large (scalable) systems.
  • Constant latency
  • Bisection bandwidth scales linearly with the
    number of nodes
  • Fat-trees approximate a centralized switch with a
    large number of ports (the Clos network).

3
Fat-tree construction
  • The fat-tree, as originally defined by C.E.
    Leiserson, is very flexible regarding bisection
    bandwidth.
  • C.E. Leiserson, Fat-trees: Universal Networks
    for Hardware-Efficient Supercomputing, IEEE
    Transactions on Computers, 34(10):892-901, Oct.
    1985.
  • The ones used in current system area networks
    are mostly constant bisection bandwidth (CBB)
    networks.
  • We will introduce two subclasses of fat-trees.

4
Some example fat-trees
5
General idea in fat-tree construction
  • A perfect fat-tree has the same functionality as
    a crossbar.
  • Use smaller switches to approximate large
    switches.
  • Connectivity is reduced, but the topology is
    implementable
  • Constant bisection bandwidth is maintained by
    having the same number of links in each level.

6
FT(m, n): m-port n-tree
  • Reference: X. Lin, Y. Chung and T. Huang, A
    Multiple LID Routing Scheme for Fat-tree Based
    InfiniBand Networks, IEEE IPDPS 2004.
  • Fat-trees built with m-port switches: all
    internal nodes are of degree m.
  • FT(m, n) is built over sub fat-trees (SUBFT): fat
    trees with open up-links.

7
SUB-fat-trees
  • SUBFT(m, h) has (m/2)^h open up-links and
    connects (m/2)^h machines (leaves).
  • (m/2)^(h-1) top-level switches
  • m/2 SUBFT(m, h-1)s

8
FT(m, h)
  • (m/2)^(h-1) top-level switches
  • m SUBFT(m, h-1)s

9
FT(m, h)
  • Number of machines: m(m/2)^(h-1)
  • Number of switches: (2h-1)(m/2)^(h-1)
  • Typical value for m: 24
  • Typical value for h: 2 or 3
  • FT(24, 3): 3456 ports, 720 switches
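
The counts above can be checked against the recursive construction from the previous slides. The following is a minimal sketch, not taken from the slides: the function names are hypothetical, and the base case (SUBFT(m, 1) is a single switch with m/2 machines and m/2 open up-links) is an assumption consistent with the (m/2)^h formula.

```python
def subft_switches(m, h):
    """Switches in SUBFT(m, h), counted by following the recursive construction."""
    if h == 1:
        # Assumed base case: one switch, m/2 machines, m/2 open up-links.
        return 1
    # (m/2)^(h-1) top-level switches plus m/2 copies of SUBFT(m, h-1).
    return (m // 2) ** (h - 1) + (m // 2) * subft_switches(m, h - 1)


def ft_size(m, h):
    """Return (machines, switches) for FT(m, h) built from m-port switches."""
    if m < 2 or m % 2 or h < 1:
        raise ValueError("m must be a positive even port count and h >= 1")
    machines = m * (m // 2) ** (h - 1)
    if h == 1:
        return machines, 1  # a single m-port switch
    # (m/2)^(h-1) top-level switches plus m copies of SUBFT(m, h-1).
    switches = (m // 2) ** (h - 1) + m * subft_switches(m, h - 1)
    return machines, switches


def ft_switches_closed_form(m, h):
    """Closed form quoted on the slide: (2h-1)(m/2)^(h-1)."""
    return (2 * h - 1) * (m // 2) ** (h - 1)


print(ft_size(24, 3))  # (3456, 720)
```

The recursive count and the closed form agree for the typical values m = 24 and h = 2 or 3.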

10
FT(4, 3)
11
Generalized fat-tree: GFT(h, m, w)
  • Reference: S. R. Ohring, M. Ibel, S. K. Das, M.
    J. Kumar, On Generalized Fat Trees, IEEE IPPS
    1995.
  • FT(m, n) is a constant bisection bandwidth (CBB)
    network:
  • each node has m/2 children and m/2 parents.
  • GFT(h, m, w):
  • Each node has m children and w parents.
  • m:w is the bisection bandwidth ratio.
  • m:w = 1:1 is sometimes called full bisectional
    bandwidth (FBB).

12
GFT(h, m, w)
  • GFT(0, m, w): a single node
  • GFT(h+1, m, w):
  • w^(h+1) top-level switches (each having m
    children)
  • m GFT(h, m, w)s
  • Similar to how FT(m, n) is constructed.
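
The recursive definition can be turned into a small size calculator. This is a sketch under stated assumptions: the function names are hypothetical, level 0 is taken to be the leaves, and every node below the top level is assumed to have exactly w up-links.

```python
def gft_level_sizes(h, m, w):
    """Node count per level of GFT(h, m, w), level 0 (leaves) first."""
    if h == 0:
        return [1]  # GFT(0, m, w) is a single node
    below = gft_level_sizes(h - 1, m, w)
    # m copies of GFT(h-1, m, w), plus w^h new top-level switches.
    return [m * count for count in below] + [w ** h]


def gft_links_between_levels(h, m, w):
    """Links between level i and level i+1, for i = 0 .. h-1."""
    sizes = gft_level_sizes(h, m, w)
    # Assumption: each node below the top level has w parents.
    return [sizes[i] * w for i in range(h)]


print(gft_level_sizes(2, 2, 2))           # [4, 4, 4]
print(gft_links_between_levels(2, 2, 2))  # [8, 8]
print(gft_links_between_levels(2, 4, 2))  # [32, 16]
```

With m:w = 1:1 the link count is the same at every level, which is the constant bisection bandwidth property. With m:w = 2:1 the link count halves at each level going up.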

13
GFT(x, 2, 1)
[Figure: GFT(0, 2, 1), GFT(1, 2, 1), GFT(2, 2, 1)]
14
GFT(x, 2, 2)
[Figure: GFT(0, 2, 2), GFT(1, 2, 2), GFT(2, 2, 2)]
15
GFT(3, 2, 2)
[Figure labels: GFT(2, 2, 2), GFT(1, 2, 2)]
How is this different from FT(4, 3)?
16
GFT(2, 4, 4)
17
GFT(2, 3, 3)
18
GFT(2, 2, 3)
19
GFT(2, 4, 2)
20
Performance issues in fat-trees
  • A Clos network is non-blocking when n >= 2m-1
    (n middle-stage switches, m inputs per
    first-stage switch).
  • 2-level fat-trees (e.g. GFT(1, 2, 4)) are
    equivalent to Clos networks, thus the name folded
    Clos.
  • Can 2-level fat-trees achieve non-blocking
    communication?

21
Can 2-level fat-trees achieve non-blocking
communication?
  • Clos networks are non-blocking when:
  • The system knows all ongoing traffic.
  • Needs a centralized controller.
  • The source can use any path to the
    destination.
  • Needs to support a large number of paths.
  • Are these conditions practical in large computer
    clusters?

22
Practical fat trees
  • 2-level CBB networks or folded Clos (n = m) are the
    minimum required to achieve non-blocking
    (rearrangeable non-blocking) communication.
  • Network contention is possible due to the lack of
    a centralized controller.
  • Techniques are needed to minimize the possibility
    of network contention.
  • What kind of techniques can do this?
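
The two Clos conditions discussed on the last few slides can be summarized in a few lines of code. This is only a sketch: it assumes n counts middle-stage switches and m counts inputs per first-stage switch, reads the strict condition as n >= 2m-1 and the rearrangeable one as n >= m, and uses hypothetical function and parameter names.

```python
def clos_blocking_class(n_middle, m_inputs):
    """Classify a three-stage Clos network by its blocking behaviour.

    n_middle: number of middle-stage switches (n on the slides, assumed).
    m_inputs: inputs per first-stage switch (m on the slides, assumed).
    """
    if n_middle < 1 or m_inputs < 1:
        raise ValueError("both parameters must be positive")
    if n_middle >= 2 * m_inputs - 1:
        # Any new connection can be routed without touching existing ones.
        return "strictly non-blocking"
    if n_middle >= m_inputs:
        # Any permutation is routable, but existing paths may need to move,
        # which is where the centralized controller comes in.
        return "rearrangeably non-blocking"
    return "blocking"


print(clos_blocking_class(7, 4))  # strictly non-blocking
print(clos_blocking_class(4, 4))  # rearrangeably non-blocking
print(clos_blocking_class(3, 4))  # blocking
```

A 2-level CBB fat-tree sits in the middle case, so contention-free operation depends on how routes are chosen.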

23
Practical fat trees
  • What kind of techniques can reduce contention?
  • Routing: spread traffic among all links.
  • Adaptive routing (Quadrics)
  • Requires multiple paths; avoids links currently
    in use.
  • Limited applicability: used in up-links, but not
    down-links.
  • Source routing: similar idea to adaptive routing,
    but less flexible. (Myrinet)
  • Deterministic routing: worst performer, but
    simple to implement. (InfiniBand)
  • Congestion control: slow down when the network is
    in trouble.
  • Reactive approach: is this good for high-speed
    networks?
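
To see why deterministic routing can perform poorly, consider a toy 2-level fat-tree in which each leaf switch has p hosts and one up-link to each of p top-level switches. The sketch below assumes a destination-based static rule (destination d always goes through top-level switch d mod p); this rule, the host numbering, and the function name are illustrative assumptions, not the routing of any particular system.

```python
from collections import Counter


def max_uplink_load(perm, hosts_per_leaf):
    """Largest number of flows sharing one up-link under static routing.

    perm[src] is the destination of host src; host i sits on leaf
    i // hosts_per_leaf. Assumed rule: traffic to destination d uses
    top-level switch d % hosts_per_leaf.
    """
    p = hosts_per_leaf
    load = Counter()
    for src, dst in enumerate(perm):
        if src // p == dst // p:
            continue  # same leaf switch, no up-link needed
        load[(src // p, dst % p)] += 1  # up-link: (source leaf, top switch)
    return max(load.values(), default=0)


hosts, p = 16, 4
shift = [(s + p) % hosts for s in range(hosts)]       # leaf i sends to leaf i+1
transpose = [(s % p) * p + s // p for s in range(hosts)]

print(max_uplink_load(shift, p))      # 1: every flow has its own up-link
print(max_uplink_load(transpose, p))  # 3: three flows share one up-link
```

Both patterns are permutations that a crossbar would serve at full rate, yet under this static rule the transpose pattern puts three flows on a single up-link. Adaptive routing could move two of them to the idle up-links.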

24
Routing issue in fat trees
Can we compute routes that achieve non-blocking
communication for any permutation?
25
A case study for the current fat-tree
interconnection networks
  • Reference: T. Hoefler, T. Schneider, and A.
    Lumsdaine, Multistage Interconnection Networks
    are not Crossbars: Effects of static routing in
    high performance networks, IEEE Cluster, 2008.
  • Many large scale fat-tree based networks have
    been built. How are they doing?

26
Performance metrics
  • User-perceived bisection bandwidth
  • 4X DDR InfiniBand: 20 Gbps between each pair.
  • What happens when half of the machines send to
    the other half simultaneously?
  • In a crossbar, all pairs should get 20 Gbps!
  • How about a fat-tree?
  • Due to routing constraints, the user-perceived
    bisection bandwidth depends on the permutation.

27
User perceived bisection bandwidth on some systems
  • Results obtained using simulation: average of
    many random permutations
  • Ranger (3908 nodes): 57.5%
  • Atlas (1142 nodes): 55.6%
  • Thunderbird (4390 nodes): 40.6%
  • 40% to 60% of a crossbar's bandwidth seems not too
    bad.
  • But the results are the average case, not the
    worst case.
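
The idea of averaging over many random patterns can be illustrated with a toy simulation. This is not the simulator used in the cited study and will not reproduce the numbers above. It assumes a 2-level fat-tree with p hosts and p up-links per leaf switch, a static rule that sends traffic for destination d through top-level switch d mod p, and equal sharing of a link among the flows that cross it. All names are hypothetical.

```python
import random
from collections import Counter


def effective_bisection_fraction(num_leaves, hosts_per_leaf, trials=200, seed=0):
    """Average per-flow bandwidth as a fraction of link rate (crossbar = 1.0)."""
    rng = random.Random(seed)
    p = hosts_per_leaf
    hosts = list(range(num_leaves * p))
    total = 0.0
    for _ in range(trials):
        # Random bisection: shuffle, then pair the first half with the second.
        rng.shuffle(hosts)
        half = len(hosts) // 2
        pairs = list(zip(hosts[:half], hosts[half:]))
        # Count flows per up-link (source leaf, top-level switch).
        load = Counter()
        for src, dst in pairs:
            if src // p != dst // p:
                load[(src // p, dst % p)] += 1
        # Destinations are distinct, so with this rule each down-link carries
        # at most one flow; only up-links are shared.
        share = 0.0
        for src, dst in pairs:
            if src // p == dst // p:
                share += 1.0
            else:
                share += 1.0 / load[(src // p, dst % p)]
        total += share / len(pairs)
    return total / trials


print(effective_bisection_fraction(8, 8))
```

The printed value is the toy model's average case. As the slide notes, the worst case for a particular pattern can be considerably lower.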

28
Other effects of network contention
  • Bandwidth varies with communication pattern.
  • Performance prediction and modeling is not easy.
  • Message latency is also affected.

29
Conclusion
  • Fat-trees can only approximate a crossbar.
  • Are there better topologies than fat-trees under
    practical constraints?
  • In the current fat-tree topology, what are the
    best routing schemes for adaptive, source, and
    single-path routing?
  • It is commonly believed that adaptive routing is
    good for fat-trees, but is adaptive routing good
    enough?

30
References
  • Fat-tree origins
  • C.E. Leiserson, Fat-trees: Universal Networks
    for Hardware-Efficient Supercomputing, IEEE
    Transactions on Computers, 34(10):892-901, Oct.
    1985.
  • Fat-tree construction
  • S. R. Ohring, M. Ibel, S. K. Das, M. J. Kumar,
    On Generalized Fat Trees, IEEE IPPS 1995.
  • X. Lin, Y. Chung and T. Huang, A Multiple LID
    Routing Scheme for Fat-tree Based InfiniBand
    Networks, IEEE IPDPS 2004.
  • Fat-tree routing and performance issues
  • T. Hoefler, T. Schneider, and A. Lumsdaine,
    Multistage Interconnection Networks are not
    Crossbars: Effects of static routing in high
    performance networks, IEEE Cluster, 2008.
  • P. Geoffray and T. Hoefler, Adaptive Routing
    Strategies for Modern High Performance Networks,
    16th Annual IEEE Symposium on High Performance
    Interconnects (HOTI 2008), pages 165-172, Aug.
    2008.