Supercomputing: Directions in Technology, Architecture and Applications

About This Presentation

Title:

Supercomputing: Directions in Technology, Architecture and Applications

Description:

Supercomputing: Directions in Technology, Architecture and Applications. Keynote Talk to Supercomputer '98 in Mannheim, Germany, June 18, 1998

Slides: 32

Provided by: LarryS166

Transcript and Presenter's Notes

1
Supercomputing: Directions in Technology,
Architecture and Applications

Keynote Talk to Supercomputer '98 in Mannheim,
Germany
June 18, 1998

2
Supercomputing: Directions in Technology,
Architecture and Applications
Abstract: "By using the results of the Top500 over the last five years, one can easily trace out the complete transformation of the supercomputer industry. In 1993, none of the Top500 was made by a broadly based, market-driven company, while today over 3/4 of the Top500 are made by SGI, IBM, HP, or Sun. Similarly, vector architectures have been replaced in market share by microprocessor-based SMPs. We now see a strong move to replace many MPPs and SMPs by the new architecture of Distributed Shared Memory (DSM), such as the SGI Origin or HP SPP series. A key trend is the move toward clusters of DSMs instead of monolithic MPPs. The next major change will be the emergence of Intel processors replacing RISC processors, particularly the Intel Merced processor, which should become dominant shortly after 2000. A major battle will shape up between UNIX and Microsoft's NT operating systems, particularly at the lower end of the Top500. Finally, with each new architecture comes a new set of applications we can now attack. I will discuss how DSM will enable the dynamic load balancing needed to support the multi-scale problems that teraflop machines will enable us to tackle."
3
NCSA is the Leading Edge Site for the National
Computational Science Alliance
www.ncsa.uiuc.edu
4
Scientific Applications Continue to Require
Exponential Growth in Capacity
From Bob Voigt, NSF
5
The Promise of the Teraflop - From Thunderstorm
to National-Scale Simulation
Simulation by Wilhelmson, et al. Figure from
Supercomputing and the Transformation of Science,
Kaufmann and Smarr, Freeman, 1993
6
Accelerated Strategic Computing Initiative is
Coupling DOE Defense Labs to Universities

Access to ASCI Leading Edge Supercomputers
Academic Strategic Alliances Program
Data and Visualization Corridors

http://www.llnl.gov/asci-alliances/centers.html
7
Comparison of the DoE ASCI and the NSF PACI
Origin Array Scale Through FY99
Los Alamos Origin System FY99: 5-6000 processors
NCSA Proposed System FY99: 6x128 and 4x64 = 1024
processors
www.lanl.gov/projects/asci/bluemtn/Hardware/schedule.html
8
NCSA Combines Shared Memory Programming with
Massive Parallelism
Future Upgrade Under Negotiation with NSF
9
The Exponential Growth of NCSA's SGI Shared
Memory Supercomputers
Doubling Every Nine Months!
[Chart: systems shown are Challenge, Power Challenge, Origin, SN1]
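As a rough illustration of what "doubling every nine months" implies, the sketch below converts a doubling period into an annual growth factor and a cumulative factor over several years. The function name and the five-year horizon are illustrative assumptions, not figures from the slide.

```python
def growth_factor(months_elapsed, doubling_months=9.0):
    """Capacity multiple after months_elapsed, given a fixed doubling period."""
    return 2.0 ** (months_elapsed / doubling_months)

# One year at a nine-month doubling time is roughly 2.5x.
annual = growth_factor(12)
# Five years (an assumed horizon) is roughly 100x.
five_year = growth_factor(60)

print(f"per year: {annual:.2f}x, over five years: {five_year:.0f}x")
```

For comparison, the same function with doubling_months=18 gives the slower growth rate often quoted for processor performance.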
10
TOP500 Systems by Vendor
[Chart: Number of Systems (0 to 500) by vendor for each TOP500 list from Jun-93 through Jun-98. Vendors shown: CRI, SGI, IBM, Convex, HP, Sun, TMC, Intel, DEC, Japanese, Other]
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
11
Why NCSA Switched From Vector to RISC Processors
NCSA 1992 Supercomputing Community
Cray Y-MP4/64, March 1992 - February 1993
Average Speed 70 MFLOPS
[Chart: Number of Users (0 to 150) vs. Average User MFLOPS (0 to 300); Average Performance, Users > 0.5 CPU Hour; markers for Peak Speed Y-MP1 and Peak Speed MIPS R8000]
12
Replacement of Shared Memory Vector
Supercomputers by Microprocessor SMPs
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
13
Top500 Shared Memory Systems
Vector Processors
Microprocessors
TOP500 Reports: http://www.netlib.org/benchmark/top500.html
14
Simulation of the Evolution of the Universe on a
Massively Parallel Supercomputer
4 Billion Light Years
12 Billion Light Years
Virgo Project - Evolving a Billion Pieces of Cold
Dark Matter in a Hubble Volume - 688-processor
CRAY T3E at Garching Computing Centre of the
Max-Planck-Society
http://www.mpg.de/universe.htm
15
Limitations of Uniform Grids for Complex
Scientific and Engineering Problems
Gravitation Causes Continuous Increase in Density
Until There is a Large Mass in a Single Grid Zone
512x512x512 Run on 512-node CM-5
Source: Greg Bryan, Mike Norman, NCSA
16
Use of Shared Memory Adaptive Grids To Achieve
Dynamic Load Balancing
64x64x64 Run with Seven Levels of Adaption on SGI
Power Challenge, Locally Equivalent to
8192x8192x8192 Resolution
Source: Greg Bryan, Mike Norman, John Shalf, NCSA
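The "locally equivalent" resolution can be checked with simple arithmetic, assuming each level of adaption refines the mesh by a factor of two per dimension. The slide does not state the refinement factor; two is the value that reproduces 8192. The helper name below is hypothetical, and the zone counts are only meant to show why a uniform grid at that resolution is out of reach.

```python
def effective_resolution(base_cells, levels, refine_factor=2):
    """Cells per dimension a uniform grid would need to match the finest level."""
    return base_cells * refine_factor ** levels

base, levels = 64, 7
finest = effective_resolution(base, levels)   # 64 * 2**7

base_zones = base ** 3        # zones in the 64x64x64 root grid
uniform_zones = finest ** 3   # zones in an equivalent uniform grid

print(finest, base_zones, uniform_zones)
```

An adaptive run only pays for fine zones where the density demands them, so its cost sits far closer to the root-grid count than to the uniform-grid count.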
17
Extreme and Large PIs Dominant Usage of NCSA
Origin
January thru April, 1998
18
Disciplines Using the NCSA Origin 2000: CPU-Hours
in March 1995
19
A Variety of Discipline Codes - Single Processor
Performance, Origin vs. T3E
20
Solving 2D Navier-Stokes Kernel - Performance
of Scalable Systems
Preconditioned Conjugate Gradient Method With
Multi-level Additive Schwarz Richardson
Pre-conditioner (2D 1024x1024)
Source: Danesh Tafti, NCSA
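For readers unfamiliar with the method named on this slide, here is a minimal sketch of preconditioned conjugate gradient. It is not the NCSA kernel: it solves a small 2D Poisson-type system (16x16 rather than 1024x1024), and a simple Jacobi (diagonal) preconditioner stands in for the multi-level additive Schwarz Richardson preconditioner. All function names are illustrative.

```python
def apply_laplacian(u, n):
    """Matrix-free 5-point Laplacian on an n x n grid, zero boundary values."""
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            k = i * n + j
            s = 4.0 * u[k]
            if i > 0:
                s -= u[k - n]
            if i < n - 1:
                s -= u[k + n]
            if j > 0:
                s -= u[k - 1]
            if j < n - 1:
                s -= u[k + 1]
            out[k] = s
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pcg(b, n, tol=1e-10, max_iter=500):
    """Preconditioned CG for A x = b with Jacobi preconditioner M = diag(A) = 4 I."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    z = [ri / 4.0 for ri in r]       # z = M^-1 r
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for iteration in range(1, max_iter + 1):
        Ap = apply_laplacian(p, n)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = [ri / 4.0 for ri in r]   # apply the preconditioner again
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

n = 16
b = [1.0] * (n * n)
x, iterations = pcg(b, n)
residual = [bi - ai for bi, ai in zip(b, apply_laplacian(x, n))]
print(iterations, max(abs(v) for v in residual))
```

The preconditioner is the only part that would change in a production solver: a stronger choice, such as the Schwarz method named on the slide, cuts the iteration count at the price of more work per iteration.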
21
Alliance PACS Origin2000 Repository
Kadin Tseng, BU; Gary Jensen, NCSA; Chuck Swanson, SGI
John Connolly, U Kentucky, Developing Repository for HP Exemplar
http://scv.bu.edu/SCV/Origin2000/
22
High-End Architecture 2000 - Scalable Clusters of
Shared Memory Modules
Each is 4 Teraflops Peak

NEC SX-5
32 x 16 vector processor SMP
512 Processors
8 Gigaflop Peak Vector Processor
IBM SP
256 x 16 RISC Processor SMP
4096 Processors
1 Gigaflop Peak RISC Processor
SGI Origin Follow-on
32 x 128 RISC Processor DSM
4096 Processors
1 Gigaflop Peak EPIC Processor
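
The "4 Teraflops Peak" figure follows from multiplying out each configuration. The short check below uses only the numbers on the slide, reading "32 x 16" as modules times processors per module (an assumption, though it matches the stated processor totals). The dictionary layout and variable names are illustrative.

```python
# (modules, processors per module, peak gigaflops per processor)
systems = {
    "NEC SX-5": (32, 16, 8.0),
    "IBM SP": (256, 16, 1.0),
    "SGI Origin Follow-on": (32, 128, 1.0),
}

processors = {}
peak_gflops = {}
for name, (modules, per_module, gflops_each) in systems.items():
    processors[name] = modules * per_module
    peak_gflops[name] = processors[name] * gflops_each
    print(f"{name}: {processors[name]} processors, "
          f"{peak_gflops[name]:.0f} gigaflops peak")
```

All three designs reach the same peak of 4096 gigaflops: the vector machine gets there with one eighth as many processors, each eight times faster.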

23
Emerging Portable Computing Standards

HPF
MPI
OpenMP
Hybrids of MPI and OpenMP

24
Basket of Applications Average Performance as
Percentage of Linpack Performance
Applications Codes: CFD, Biomolecular, Chemistry, Materials, QCD
[Chart: percentage values shown are 22, 25, 14, 19, 33, 26]
25
Harnessing Distributed UNIX Workstations -
University of Wisconsin Condor Pool
Condor Cycles
CondorView, Courtesy of Miron Livny, Todd
Tannenbaum (UWisc)
26
NT Workstation Shipments Rapidly Surpassing UNIX
Source: IDC, Wall Street Journal, 3/6/98
27
First Scaling Testing of ZEUS-MP on CRAY T3E and
Origin vs. NT Supercluster
"Supercomputer performance at mail-order prices" - Jim Gray, Microsoft
access.ncsa.uiuc.edu/CoverStories/SuperCluster/super.html

Alliance Cosmology Team
Andrew Chien, UIUC
Rob Pennington, NCSA

Zeus-MP Hydro Code Running Under MPI
28
NCSA NT Supercluster Solving Navier-Stokes
Kernel
Single Processor Performance: MIPS R10k
117 MFLOPS, Intel Pentium II 80 MFLOPS
Preconditioned Conjugate Gradient Method With
Multi-level Additive Schwarz Richardson
Pre-conditioner (2D 1024x1024)
Danesh Tafti, Rob Pennington, Andrew Chien NCSA
29
Near Perfect Scaling of Cactus - 3D Dynamic
Solver for the Einstein GR Equations
Cactus was Developed by Paul Walker,
MPI-Potsdam, UIUC, NCSA
Ratio of GFLOPs: Origin = 2.5x NT SC
Danesh Tafti, Rob Pennington, Andrew Chien NCSA
30
NCSA Symbio - A Distributed Object Framework
Bringing Scalable Computing to NT Desktops