1
Dolphin Wulfkit and Scali software: The Supercomputer interconnect
Amal D'Silva, amal@summnet.com
  • Summation Enterprises Pvt. Ltd.
  • Preferred System Integrators since 1991.

2
Agenda
  • Dolphin Wulfkit hardware
  • Scali Software / some commercial benchmarks
  • Summation profile

3
Interconnect Technologies
Application areas
WAN
LAN
I/O
Memory
Processor
Cache
FibreChannel
SCSI
Myrinet, cLan
Design space for different technologies
Proprietary Busses
Ethernet
Dolphin SCI Technology
ATM
Bus
Network
Cluster Interconnect Requirements
Application requirements
Distance
100 000
10 000
1 000
100
10
1
Bandwidth
50 000
100 000
100 000
100
10 000
1
20
1
1
100 000
1 000
8
Latency
4
Interconnect impact on cluster performance
  • Some real-world examples from the Top500 May 2004 list
  • Intel, Bangalore cluster
  • 574 Xeon 2.4 GHz CPUs / GigE interconnect
  • Rpeak 2755 GFLOPs, Rmax 1196 GFLOPs
  • Efficiency: 43%
  • Kabru, IMSc, Chennai
  • 288 Xeon 2.4 GHz CPUs / Wulfkit 3D interconnect
  • Rpeak 1382 GFLOPs, Rmax 1002 GFLOPs
  • Efficiency: 72%
  • Simply put, Kabru gives 84% of the performance
    with HALF the number of CPUs! (worked check below)
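For reference, the efficiency figures above are simply Rmax as a fraction of Rpeak; working through the numbers quoted on this slide:

    \[
    \text{Efficiency} = \frac{R_{\text{max}}}{R_{\text{peak}}}:\qquad
    \frac{1196}{2755} \approx 43\%,\qquad
    \frac{1002}{1382} \approx 72\%,\qquad
    \frac{1002}{1196} \approx 84\%
    \]

The last ratio compares the two machines' Rmax values directly and is the source of the "84% of the performance with half the CPUs" claim.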

5
Commodity interconnect limitations
  • Cluster performance depends primarily on two
    factors: bandwidth and latency
  • Gigabit Ethernet speed is limited to 1000 Mbps
    (approx. 80 Megabytes/s in the real world). This is
    fixed irrespective of processor power
  • With increasing processor speeds, latency (the time
    taken to move data from one node to another) plays
    an increasing role in cluster performance
  • Gigabit typically gives an internode latency of
    120-150 microseconds. As a result, CPUs in a node
    are often idle, waiting for data from another node
  • In any switch-based architecture, the switch
    becomes the single point of failure. If the
    switch goes down, so does the cluster.

6
Dolphin Wulfkit advantages
  • Internode bandwidth of 260 Megabytes/s on Xeon
    (over three times faster than Gigabit)
  • Latency under 5 microseconds (over TWENTY-FIVE
    times quicker than Gigabit); see the sketch after this list
  • Matrix-type internode connections: no switch,
    hence no single point of failure
  • Cards can be moved across processor generations,
    which provides investment protection
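To see what these figures mean in practice, here is a minimal sketch of the usual first-order transfer-time model, time = latency + size / bandwidth, plugging in the approximate numbers quoted on this slide and the previous one (about 80 MB/s and 120-150 microseconds for Gigabit Ethernet, 260 MB/s and 5 microseconds for SCI). The figures are nominal slide values, not measurements.

    /* First-order message-transfer model: time = latency + size / bandwidth.
       Interconnect figures are the approximate numbers quoted on the slides
       (nominal values, not measurements). */
    #include <stdio.h>

    struct link { const char *name; double latency_us; double bandwidth_MBps; };

    /* estimated transfer time in microseconds for a message of `bytes` bytes;
       with 1 MB taken as 10^6 bytes, 1 MB/s equals 1 byte per microsecond */
    static double transfer_us(struct link l, double bytes)
    {
        return l.latency_us + bytes / l.bandwidth_MBps;
    }

    int main(void)
    {
        struct link gige = { "Gigabit Ethernet", 135.0,  80.0 }; /* ~120-150 us, ~80 MB/s */
        struct link sci  = { "Dolphin SCI",        5.0, 260.0 }; /* <5 us, 260 MB/s */
        double sizes[] = { 64.0, 1024.0, 65536.0, 1048576.0 };   /* message sizes in bytes */

        for (int i = 0; i < 4; i++)
            printf("%9.0f bytes:  %s %10.1f us   %s %8.1f us\n",
                   sizes[i], gige.name, transfer_us(gige, sizes[i]),
                   sci.name, transfer_us(sci, sizes[i]));
        return 0;
    }

For small messages the estimate is dominated almost entirely by latency, which is why the twenty-five-fold latency advantage matters at least as much as the three-fold bandwidth advantage for fine-grained MPI traffic.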

7
Dolphin Wulfkit advantages (contd.)
  • Linear scalability: adding 8 nodes to a 16-node
    cluster involves known, fixed costs (eight nodes
    and eight Dolphin SCI cards). With any switch-based
    architecture, there are additional issues to
    consider, such as unused ports on the switch; for
    Gigabit, one has to throw away the 16-port switch
    and buy a 32-port switch
  • Real-world performance on par with or better than
    proprietary interconnects like Memory Channel
    (HP) and NUMAlink (SGI), at cost-effective price
    points

8
Wulfkit: The Supercomputer Interconnect
  • Wulfkit is based on the Scalable Coherent
    Interface (SCI); the ANSI/IEEE 1596-1992 standard
    defines a point-to-point interface and a set of
    packet protocols.
  • Wulfkit is not a networking technology, but a
    purpose-designed cluster interconnect.
  • The SCI interface has two unidirectional links
    that operate concurrently.
  • Bus-imitating protocol with packet-based
    handshakes and guaranteed data delivery.
  • Up to 667 Megabytes/s internode bandwidth.

9
PCI-SCI Adapter Card: 1 slot, 2 dimensions
  • SCI ADAPTERS (64 bit - 66 MHz)
  • PCI / SCI ADAPTER (D335)
  • D330 card with LC3 daughter card
  • Supports 2 SCI ring connections
  • Switching over B-Link
  • Used for WulfKit 2D clusters
  • PCI 64/66
  • D339 2-slot version

[Diagram: 2D adapter card block layout showing the PCI interface, the PSB bridge chip and two LC link controllers]
10
System Interconnect
11
System Architecture
12
3D Torus topology (for more than 64-72 nodes)
13
Linköping University - NSC - SCI Clusters
Also in Sweden: Umeå University, 120 Athlon nodes
  • Monolith: 200 nodes, 2x Xeon 2.2 GHz, 3D SCI
  • INGVAR: 32 nodes, 2x AMD 900 MHz, 2D SCI
  • Otto: 48 nodes, 2x P4 2.26 GHz, 2D SCI
  • Commercial cluster under installation: 40 nodes, 2x Xeon, 2D SCI
  • Total: 320 SCI nodes

14
MPI Connect middleware and MPI Manage cluster
setup/management tools
http://www.scali.com
15
Scali Software Platform
  • Scali MPI Manage
  • Cluster Installation / Management
  • Scali MPI Connect
  • High Performance MPI Libraries

16
Scali MPI Connect
  • Fault Tolerant
  • High Bandwidth
  • Low Latency
  • Multi-Thread safe
  • Simultaneous Inter-/Intra-node operation
  • UNIX command line replicated
  • Exact message size option
  • Manual/debugger mode for selected processes
  • Explicit host specification
  • Job queuing
  • PBS, DQS, LSF, CCS, NQS, Maui
  • Conformance to MPI-1.2 verified through 1665 MPI
    tests
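The ping-pong latency comparison shown later in this deck can be reproduced with ordinary MPI code, since Scali MPI Connect is an MPI-1.2 conformant library. Below is a minimal sketch using only standard MPI-1 calls; compile it with your MPI compiler wrapper (e.g. mpicc) and run two processes on two different nodes. The buffer size, repetition count and launcher details are illustrative assumptions, not Scali-specific settings.

    /* Minimal MPI ping-pong between ranks 0 and 1, using only standard
       MPI-1 calls.  Illustrative sketch; run with two processes placed
       on two different nodes. */
    #include <mpi.h>
    #include <stdio.h>

    #define REPS 1000
    #define SIZE 8              /* message size in bytes */

    int main(int argc, char **argv)
    {
        int rank;
        char buf[SIZE] = {0};
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        for (int i = 0; i < REPS; i++) {
            if (rank == 0) {
                MPI_Send(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
            } else if (rank == 1) {
                MPI_Recv(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
                MPI_Send(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("Half round-trip latency: %.2f us\n",
                   (t1 - t0) / (2.0 * REPS) * 1e6);

        MPI_Finalize();
        return 0;
    }

Dividing the measured round-trip time by two gives a one-way latency figure comparable to the Gigabit and SCI numbers quoted earlier in the deck.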

17
Scali MPI Manage features
  • System Installation and Configuration
  • System Administration
  • System Monitoring, Alarms and Event Automation
  • Work Load Management
  • Hardware Management
  • Heterogeneous Cluster Support

18
Fault Tolerance
2D Torus topology gives more routing options. With the
XY routing algorithm, if node 33 fails, the nodes on
node 33's ringlets become unavailable: the cluster is
fractured with the current routing setting.
19
Fault Tolerance
Scali's advanced routing algorithm, from the Turn
Model family of routing algorithms: all nodes but
the failed one (node 33) can be utilised as one big
partition (illustrated in the sketch below).
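To make the routing discussion concrete, here is a minimal sketch of plain dimension-order (XY) routing on a small 2D torus. The 4x4 grid and the position of the failed node are illustrative assumptions, and the Turn Model routing that Scali uses to route around the failure is not implemented here; the sketch only shows why, under the default XY scheme, routes that must cross the failed node's ringlets become unusable.

    /* Dimension-order (XY) routing on an N x N 2D torus: a packet first moves
       along its row ring to the destination column, then along the column ring
       to the destination row.  Grid size and failed-node position are
       illustrative assumptions only. */
    #include <stdio.h>

    #define N 4                               /* 4 x 4 torus, coordinates 0..3 */

    /* one shortest-wraparound hop from a towards b on a ring of size N */
    static int step(int a, int b)
    {
        int fwd = (b - a + N) % N;
        return (fwd <= N - fwd) ? (a + 1) % N : (a - 1 + N) % N;
    }

    /* print the XY route from (sr,sc) to (dr,dc); flag it if it crosses (fr,fc) */
    static void route(int sr, int sc, int dr, int dc, int fr, int fc)
    {
        int r = sr, c = sc, blocked = 0;
        printf("(%d,%d)", r, c);
        while (c != dc) {                     /* X phase: reach the destination column */
            c = step(c, dc);
            printf(" -> (%d,%d)", r, c);
            if (r == fr && c == fc) blocked = 1;
        }
        while (r != dr) {                     /* Y phase: then reach the destination row */
            r = step(r, dr);
            printf(" -> (%d,%d)", r, c);
            if (r == fr && c == fc) blocked = 1;
        }
        printf("%s\n", blocked ? "   [crosses failed node 33]" : "");
    }

    int main(void)
    {
        /* with node (3,3) down, the XY route (3,0) -> (0,3) must first cross
           row 3 all the way to column 3, i.e. through the failed node */
        route(3, 0, 0, 3, 3, 3);
        route(0, 0, 2, 2, 3, 3);              /* an unaffected route, for contrast */
        return 0;
    }

With node (3,3) down, the first route printed is forced through the failed node, while the second never touches its row or column; Scali's Turn Model based routing avoids the failed node so the remaining nodes stay usable as one partition.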
20
Scali MPI Manage GUI
21
Monitoring (contd.)
22
System Monitoring
  • Resource Monitoring
  • CPU
  • Memory
  • Disk
  • Hardware Monitoring
  • Temperature
  • Fan Speed
  • Operator alarms on selected parameters at
    specified thresholds

23
Events/Alarms
24
SCI vs. Myrinet 2000: Ping-Pong comparison
25
Itanium vs Cray T3E Bandwidth
26
Itanium vs T3E Latency
27
Some Reference Customers
  • Max Planck Institute für Plasmaphysik, Germany
  • University of Alberta, Canada
  • University of Manitoba, Canada
  • Etnus Software, USA
  • Oracle Inc., USA
  • University of Florida, USA
  • deCODE Genetics, Iceland
  • Uni-Heidelberg, Germany
  • GMD, Germany
  • Uni-Giessen, Germany
  • Uni-Hannover, Germany
  • Uni-Düsseldorf, Germany
  • Linux NetworX, USA
  • Magmasoft AG, Germany
  • University of Umeå, Sweden
  • University of Linköping, Sweden
  • PGS Inc., USA
  • US Naval Air, USA
  • Spacetec/Tromsø Satellite Station, Norway
  • Norwegian Defense Research Establishment
  • Parallab, Norway
  • Paderborn Parallel Computing Center, Germany
  • Fujitsu Siemens computers, Germany
  • Spacebel, Belgium
  • Aerospatiale, France
  • Fraunhofer Gesellschaft, Germany
  • Lockheed Martin TDS, USA
  • University of Geneva, Switzerland
  • University of Oslo, Norway
  • Uni-C, Denmark
  • University of Lund, Sweden
  • University of Aachen, Germany
  • DNV, Norway
  • DaimlerChrysler, Germany
  • AEA Technology, Germany
  • BMW AG, Germany

28
Some more Reference Customers
  • Rolls Royce Ltd., UK
  • Norsk Hydro, Norway
  • NGU, Norway
  • University of Santa Cruz, USA
  • Jodrell Bank Observatory, UK
  • NTT, Japan
  • CEA, France
  • Ford/Visteon, Germany
  • ABB AG, Germany
  • National Technical University of Athens, Greece
  • Medasys Digital Systems, France
  • PDG Linagora S.A., France
  • Workstations UK, Ltd., England
  • Bull S.A., France
  • The Norwegian Meteorological Institute, Norway
  • Nanco Data AB, Sweden
  • Aspen Systems Inc., USA
  • Atipa Linux Solution Inc., USA
  • Intel Corporation Inc., USA
  • Iowa State University, USA
  • Los Alamos National Laboratory, USA
  • Penguin Computing Inc., USA
  • Times N Systems Inc., USA
  • University of Alberta, Canada
  • Monash University, Australia
  • University of Southern Mississippi, USA
  • Jacusiel Acuna Ltda., Chile
  • University of Copenhagen, Denmark
  • Caton Sistemas Alternativos, Spain
  • Mapcon Geografical Inform, Sweden
  • Fujitsu Software Corporation, USA
  • City Team OY, Finland
  • Falcon Computers, Finland
  • Link Masters Ltd., Holland
  • MIT, USA
  • Paralogic Inc., USA

29
Application Benchmarks
  • With Dolphin SCI and Scali MPI

30
NAS parallel benchmarks (16 CPUs / 8 nodes)
31
Magma (16 CPUs / 8 nodes)
32
Eclipse (16 CPUs / 8 nodes)
33
FEKO Parallel Speedup
34
Acusolve (16 CPUs / 8 nodes)
35
Visage (16 CPUs / 8 nodes)
36
CFD scaling: MM5 linear to 400 CPUs
37
Scaling: Fluent on the Linköping cluster
38
Dolphin Software
  • All Dolphin software is free open source (GPL or LGPL)
  • SISCI
  • SCI-SOCKET
  • Low Latency Socket Library
  • TCP and UDP Replacement
  • User and Kernel level support
  • Release 2.0 available
  • SCI-MPICH (RWTH Aachen)
  • MPICH 1.2 and some MPICH 2 features
  • New release is being prepared, beta available
  • SCI Interconnect Manager
  • Automatic failover recovery.
  • No single point of failure in 2D and 3D networks.
  • Other
  • SCI Reflective Memory, Scali MPI, Linux Labs SCI
    Cluster (Cray-compatible shmem) and Clugres
    PostgreSQL, MandrakeSoft Clustering HPC solution,
    Xprime X1 Database Performance Cluster for
    Microsoft SQL Server, ClusterFrame from Qlusters
    and SunCluster 3.1 (Oracle 9i), MySQL Cluster

39
Summation Enterprises Pvt. Ltd.
  • Brief Company Profile

40
  • Our expertise: Clustering for High Performance
    Technical Computing, Clustering for High
    Availability, Terabyte Storage solutions, SANs
  • O.S. skills: Linux (Alpha 64-bit, x86 32- and
    64-bit), Solaris (SPARC and x86), Tru64 UNIX,
    Windows NT/2K/2003 and the QNX Realtime O.S.

41
Summation milestones
  • Working with Linux since 1996
  • First in India to deploy/support 64-bit Alpha
    Linux workstations (1999)
  • First in India to spec, deploy and support a
    26-processor Alpha Linux cluster (2001)
  • Only company in India to have worked with
    Gigabit, SCI and Myrinet interconnects
  • Involved with the design, setup and support of
    many of the largest HPTC clusters in India.

42
Exclusive Distributors / System Integrators in
India
  • Dolphin Interconnect AS, Norway
  • SCI interconnect for Supercomputer performance
  • Scali AS, Norway
  • Cluster management tools
  • Absoft, Inc., USA
  • FORTRAN Development tools
  • Steeleye Inc., USA
  • High Availability Clustering and Disaster
    Recovery Solutions for Windows and Linux
  • Summation is the sole distributor, consulting
    services and technical support partner for
    Steeleye in India

43
Partnering with Industry leaders
  • Sun Microsystems, Inc.
  • Focus on the Education and Research segments
  • High Performance Technical Computing, Grid
    Computing initiative with Sun Grid Engine
    (SGE/SGEE)
  • HPTC Competency Centre

44
Wulfkit / HPTC users
  • Institute of Mathematical Sciences, Chennai
  • 144 node Dual Xeon Wulfkit 3D cluster,
  • 9 node Dual Xeon Wulfkit 2D cluster
  • 9 node Dual Xeon Ethernet cluster
  • 1.4 TB RAID storage
  • Bhabha Atomic Research Centre, Mumbai
  • 64 node Dual Xeon Wulfkit 2D cluster
  • 40 node P4 Wulfkit 3D cluster
  • Alpha servers / Linux OpenGL workstations /
    Rackmount servers
  • Harish Chandra Research Institute, Allahabad
  • Forty-two node Dual Xeon Wulfkit cluster,
  • 1.1 TB RAID Storage

45
Wulfkit / HPTC users (contd.)
  • Intel Technology India Pvt. Ltd., Bangalore
  • Eight-node Dual Xeon Wulfkit clusters (ten clusters in all)
  • NCRA (TIFR), Pune
  • 4 node Wulfkit 2D cluster
  • Bharat Forge Ltd., Pune
  • Nine node Dual Xeon Wulfkit 2D cluster
  • Indian Rare Earths Ltd., Mumbai
  • 26 Processor Alpha Linux cluster with RAID
    storage
  • Tata Institute of Fundamental Research, Mumbai
  • RISC/Unix servers, Four node Xeon cluster
  • Centre for Advanced Technology, Indore
  • Alpha/ Sun Workstations

46
Questions?
  • Amal D'Silva, email: amal@summnet.com
  • GSM: 98202 83309