1
SOS7: Machines Already Operational
NSF's Terascale Computing System
  • SOS-7, March 4-6, 2003
  • Mike Levine, PSC

2
Outline
  • Overview of TCS, the US NSF's Terascale Computing System.
  • Answering 3 questions:
  • Is your machine living up to performance
    expectations?
  • What is the MTBI?
  • What is the primary complaint, if any, from
    users?
  • See also PSC web pages & Rolf's info.

3
Q1: Performance
  • Computational and communications performance is very good!
  • Alpha processors & ES45 servers very good
  • Quadrics bandwidth & latency very good
  • 74% of peak on Linpack; >76% on LSMS (rough check below)
  • More work needed on disk I/O.
  • This has been a very easy port for most users.
  • Easier than some Cray-to-Cray upgrades.
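A rough back-of-envelope check of the quoted Linpack fraction, assuming 3,000 EV68 CPUs at 1 GHz with 2 floating-point ops per cycle (as implied by the 6 Tf figure on the later slides):

    # Peak and approximate Linpack rate implied by "74% of peak".
    cpus = 3000
    peak_gf_per_cpu = 1.0 * 2            # 1 GHz x 2 flops/cycle
    peak_tf = cpus * peak_gf_per_cpu / 1000
    linpack_tf = 0.74 * peak_tf
    print(peak_tf, linpack_tf)           # ~6.0 Tf peak, ~4.4 Tf Linpack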

4
Q2: MTBI (Monthly Average)
  • Compare with theoretical prediction of 12 hrs (rough scaling sketch below).
  • Expect further improvement (fixing systematic
    problems).
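Where a ~12-hour system-level figure can come from: with independent node failures, the system MTBI is roughly the per-node MTBF divided by the node count. A minimal sketch; the per-node MTBF below is a hypothetical value chosen only to illustrate the scaling, not a number from the talk:

    # System MTBI ~= per-node MTBF / number of nodes (independent failures).
    n_nodes = 765                 # 750 compute + 13 spares + 2 login
    mtbf_node_hours = 9000        # hypothetical per-node MTBF (~1 year)
    mtbi_system_hours = mtbf_node_hours / n_nodes
    print(mtbi_system_hours)      # ~11.8 hours, near the 12 hr prediction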

5
Time Lost to Unscheduled Events
  • (Chart: time lost per month; purple = nodes requiring cleanup)
  • Worst case is ~3%

6
Q3: Complaints
  • #1: "I need more time" (not a complaint about performance)
  • Actual usage >80% of wall clock
  • Some structural improvements still in progress.
  • Not a whole lot more is possible!
  • Work needed on:
  • Rogue OS activity (recall Prof. Kale's comment; see the jitter sketch after this list)
  • MPI global reduction libraries (ditto)
  • System debugging and fragility.
  • IO performance.
  • We have delayed full disk deployment to avoid data corruption instabilities.
  • Node cleanup
  • We detect and hold out problem nodes until staff clean them.
  • All in all, the users have been VERY pleased (ditto).
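Why rogue OS activity and MPI global reductions come up together: a reduction (or barrier) finishes only when the slowest node arrives, so even rare per-node OS interruptions are felt on almost every collective at 750-node scale. A minimal simulation sketch; the interruption probability and duration are illustrative assumptions, not TCS measurements:

    import random

    # One collective: each node arrives after a base time, plus a long delay
    # if an OS interruption hits it; the collective waits for the slowest node.
    def collective_time_us(n_nodes, base=50.0, p_noise=0.001, noise=1000.0):
        return max(base + (noise if random.random() < p_noise else 0.0)
                   for _ in range(n_nodes))

    random.seed(0)
    times = [collective_time_us(750) for _ in range(1000)]
    delayed = sum(t > 100.0 for t in times) / len(times)
    print(delayed)   # ~0.5: with p = 0.1% per node, 1 - 0.999**750 ~= 0.53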

7
Full Machine Job
  • This system is capable of doing big science

8
TCS (Terascale Computing System) ETF
  • Sponsored by the U.S. National Science Foundation
  • Serving the very high end for US academic
    computational science and engineering
  • Designed to be used, as a whole, on single
    problems. (recall full machine job)
  • Full range of scientific and engineering
    applications.
  • Compaq AlphaServer SC hardware and software
    technology
  • In general production since April, 2002
  • #6 in Top 500 (largest open facility in the world, Nov 2001)
  • TCS-1 in general production since April, 2002
  • Integrated into the PACI program (Partnerships for Advanced Computational Infrastructure)
  • DTF project to build and integrate multiple systems
  • NCSA, SDSC, Caltech, Argonne. Multi-lambda, transcontinental interconnect
  • ETF aka TeraGrid (Extensible Terascale Facility)
    integrating TCS with DTF forming
  • A heterogeneous, extensible scientific/engineering
    cyberinfrastructure Grid

9
Infrastructure: PSC TCS machine room (@ Westinghouse)
(Did not require a new building, just a pipe & wire upgrade; not maxed out)
  • 8k ft²
  • Use 2.5k ft²
  • Existing room.
  • (16 yrs old.)

10
Floor Layout
Full System Physical Structure
  • Geometrical constraints invariant 'twixt US and Japan

11
Terascale Computing System
Compute Nodes
  • 750 ES45 4-CPU servers
  • +13 inline spares
  • (+2 login nodes)
  • 4 EV68s/node
  • @ 1 GHz, 2 Gf each → 6 Tf (tally below)
  • 4 GB memory/node → 3.0 TB
  • 3 × 18.2 GB disk/node → 41 TB
  • System
  • User temporary
  • Fast snapshots
  • 90 GB/s
  • Tru64 Unix

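The headline totals are just the per-node figures scaled by the node count; a quick tally using the numbers on this slide:

    # Aggregate totals from the per-node figures above.
    nodes = 750
    cpus = nodes * 4                        # 3,000 EV68 CPUs
    peak_tf = cpus * 2 / 1000               # 2 Gf/CPU @ 1 GHz -> ~6 Tf
    memory_tb = nodes * 4 / 1000            # 4 GB/node -> ~3.0 TB
    disk_tb = nodes * 3 * 18.2 / 1000       # 3 x 18.2 GB disks/node -> ~41 TB
    print(cpus, peak_tf, memory_tb, disk_tb)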
12
  • ES45 nodes
  • 5 nodes per cabinet
  • 3 local disks /node

13
Terascale Computing System
Quadrics Network
  • 2 rails
  • Higher bandwidth
  • (250 MB/s/rail)
  • Lower latency
  • 2.5 µs put latency
  • 1 NIC/node/rail
  • Federated switch (/rail)
  • 'Fat-tree' (bisection bw ~0.2 TB/s; rough check below)

  • User virtual memory mapped
  • Hardware retry
  • Heterogeneous
  • (Alpha Tru64 Linux, Intel Linux)
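A rough consistency check on the quoted bisection bandwidth, assuming (as an idealized upper bound) that half the nodes drive both rails at the full 250 MB/s link rate across the bisection:

    # Idealized bisection bandwidth of the two-rail fat-tree.
    nodes = 750
    rails = 2
    link_mb_s = 250
    bisection_tb_s = (nodes / 2) * rails * link_mb_s / 1e6
    print(bisection_tb_s)   # ~0.19 TB/s, consistent with "~0.2 TB/s"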

14
Central Switch Assembly
  • 20 cabinets in center
  • Minimize max internode distance
  • 3 out of 4 rows shown
  • 21st LL switch, outside (not shown)

15
Quadrics wiring overhead (view towards ceiling)
16
Terascale Computing System
Management & Control
  • Quadrics switch control
  • Internal SBC Ethernet
  • Insight Manager on PCs
  • Dedicated systems
  • Cluster/node monitoring & control
  • RMS database
  • Ethernet
  • Serial Link

(Diagram: control systems attached via LAN to the Quadrics-connected compute nodes)
17
Terascale Computing System
Interactive Nodes
  • Dedicated 2 × ES45
  • +8 on compute nodes
  • Shared function nodes
  • User access
  • Gigabit Ethernet to WAN
  • Quadrics connected
  • /usr indexed store (ISMS)

(Diagram: interactive nodes added; /usr served; Gigabit Ethernet to WAN/LAN)
18
Terascale Computing System
File Servers
  • 64, on compute nodes
  • 0.47 TB/server → 30 TB
  • 500 MB/s/server → 32 GB/s (tally below)
  • Temporary user storage
  • Direct IO
  • /tmp
  • Each server has:
  • 24 disks on
  • 8 SCSI chains on
  • 4 controllers
  • to sustain full drive bandwidth

(Diagram: file servers added, serving /tmp over Quadrics)
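The aggregate /tmp capacity and bandwidth are the per-server figures times the server count; a quick tally using the numbers on this slide:

    # Aggregate /tmp storage and streaming bandwidth across the file servers.
    servers = 64
    tb_per_server = 0.47
    mb_s_per_server = 500
    print(servers * tb_per_server)             # ~30 TB
    print(servers * mb_s_per_server / 1000)    # 32 GB/s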
19
Terascale Computing System
Summary
  • 750 ES45 Compute Nodes
  • 3000 EV68 CPUs @ 1 GHz
  • 6 Tf
  • 3.0 TB memory
  • 41 TB node disk, 90 GB/s
  • Multi-rail fat-tree network
  • Redundant monitor/ctrl
  • WAN/LAN accessible
  • File servers: 30 TB, 32 GB/s
  • Buffer disk store, 150 TB
  • Parallel visualization
  • Mass store, 1 TB/hr, > 1 PB
  • ETF coupled (hetero)

(Diagram: full system: Quadrics fabric, control, LAN, compute nodes, file servers /tmp, interactive nodes /usr, WAN/LAN)
20
Terascale Computing System
Visualization
  • Intel/Linux
  • Newest software
  • 16 nodes
  • Parallel rendering
  • HW/SW compositing
  • Quadrics connected
  • Image output
  • → Web pages

(Diagram: Quadrics fabric; 340 GB/s (1520 links) into TCS; 4.5 GB/s (20 links) to Buffer Disk; 3.6 GB/s (16 links) each to Viz and Application Gateways; WAN coupled)
21
Buffer Disk & HSM
Terascale Computing System
  • Quadrics coupled (225 MB/s/link)
  • Intermediate between TCS & HSM (rough drain-time estimate below)
  • Independently managed.
  • Private transport from TCS.

(Same diagram; >360 MB/s to tape into the HSM (LSCi); WAN/LAN to SDSC)
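One way to see why a large buffer sits between TCS and the HSM: draining a full buffer to tape at the quoted minimum rate takes several days, so the buffer absorbs machine-speed bursts while tape catches up. A rough estimate using the 150 TB buffer figure from the summary slide:

    # Time to drain a full 150 TB buffer to tape at >360 MB/s.
    buffer_tb = 150
    tape_mb_s = 360
    drain_days = buffer_tb * 1e6 / tape_mb_s / 86400
    print(drain_days)   # ~4.8 days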
22
Application Gateways
Terascale Computing System
  • Quadrics coupled (225 MB/s/link; consistency check below)
  • Coupled to ETF backbone by multiple GigE @ 30 Gb/s

(Same diagram; multiple GigE from the Application Gateways to the ETF backbone @ 30 Gb/s)
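The aggregate bandwidths annotated on the diagram are consistent with the 225 MB/s Quadrics link rate; a quick check using the link counts shown:

    # Quadrics link counts vs. annotated aggregate bandwidths.
    link_mb_s = 225
    print(1520 * link_mb_s / 1e3)   # ~342 GB/s ("340 GB/s" into TCS)
    print(20 * link_mb_s / 1e3)     # 4.5 GB/s (20-link attachment)
    print(16 * link_mb_s / 1e3)     # 3.6 GB/s (16-link attachments)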
23
The Front Row
  • Yes, those are Pittsburgh sports colors.