Title: Using PC clusters for scientific computing: do they really work?
1. Using PC clusters for scientific computing: do they really work?
- Roldan Pozo
- Mathematics and Computational Sciences Division
- NIST
2. Alternate Title
- Supercomputing: the view from below
3. NIST Activities
- Fire Dynamics
- Applied Economics
- Polymer Combustion Research
- DNA Chemistry
- Applied Computational Chemistry
- Reacting Flow Simulation
- Microbeam analysis
- Atmospheric and Chemometric Research
- Analytical Mass Spectrometry for Biomolecules
- Trace-Gas analysis
- Neutron Activation Analysis
- Plasma Chemistry
- Thin-Film Process Metrology
- Nanoscale Cryoelectronics
- Computer Security
- Computer-aided Manufacturing
- Polymer Characterization
- ...
4. NIST Computing
- Cray C-90
- IBM SP2
- SGI Origin 2000
- Convex C3820
- small workstation (Alpha, RS6K, etc.) clusters
- small PC clusters
5. Parallel / Scalable / Distributed Computing?
- Thanks, but parallel computing is (still) hard...
- don't have the
  - time
  - resources
  - development cycle
  - economic justification
- don't really need it
6. Flops is not the issue...
- Development time
- Turn-around time
7. The Big Supercomputing Maxim
- The bigger the machine, the more you share it
8. From Big Iron to clusters...
- Migration of conventional supercomputer users (Cray, etc.) to less expensive platforms
- are small clusters the answer?
- care to parallelize your apps?
9. User Responses
- Go away.
- Been there, done that.
- Parallel computing failed.
- Can't the compiler do that?
10. Alternate approach...
- MASSIVELY SEQUENTIAL COMPUTING
- personal compute server
- mainly sequential jobs
- some occasional small (2-8 processor) parallel jobs
11. Sequential rules
- Big applications are hardly ever run once.
- Most simulations consist of many runs of the same code with different input data.
- Memory constraints? Buy more memory!
12. Benefits of a personal supercomputer
- Don't have to share it with anyone!
- Often reduced turn-around time
- No batch queues, CPU limits, disk quotas, etc.
- direct control over the resource
- You get to decide how best to use it
13. JazzNet
- JazzNet I
  - 9 processors
  - Intel BaseTX Express Hub
- JazzNet II
  - 18 processors
  - Myrinet gigabit network
  - 8-port 3Com SuperStack II 3000TX fast ethernet switch
  - 16-port Bay Networks BayStack 350T fast ethernet switch
14. Parallel adaptive multigrid: PHAML (William F. Mitchell, MCSD)
- Adaptive multigrid for finite element modeling
- 2D elliptic partial differential equations
- uses Fortran 90 and PVM/MPI
- originally developed on the IBM SP2
15. PHAML performance
16. 3D Helmholtz equation solver (Karin A. Remington, MCSD)
- Fast, direct method for solving elliptic PDEs via matrix decomposition
- Handles Dirichlet, Neumann, or periodic boundary conditions
17. Helmholtz solver implementation
- 1D decomposition, f77/C, PVM/MPI
- FFT across processors
- personalized all-to-all communication
18. 3D Helmholtz performance
19. Optimal wing shape in viscous flows (Anthony J. Kearsley, MCSD)
- Optimization problem to minimize vorticity
- CFD around trial shapes with constrained shape methods
- uses domain decomposition and domain embedding
- hybrid constrained direct search method
20. Optimal wing shape performance
21. Phase-field algorithm for solidification modeling (Bruce Murray, NIST/SUNY)
- set of two time-dependent, nonlinear parabolic PDEs
- Fortran 77 Cray application
- finite difference / ADI method
22. Solidification modeling performance (1200x600 grid, 50 steps)
23. Solidification modeling performance (1200x600 grid, 50 steps)
24. JazzNet Pentium II nodes
- ASUS KN97X motherboard (440FX PCI chipset)
- 266MHz Pentium IIs (512KB cache)
- 128 MB RAM (60ns SIMMs)
- integrated EIDE controller
- 2GB EIDE disk
- Kingston Tech. EtherRx 10/100 NIC
25. JazzNet Networking Hardware
- Myrinet
- 3Com 905 10/100 NIC
- 8-port 3Com SuperStack II 3000 switch
- 16-port Bay Networks BayStack 350T switch
- Intel EtherExpress 100 Hub
26. Myrinet bandwidth: TCP/IP and MPI (LAM 6.1), Myrinet M2F-SW8 switch
27. Tools and Libraries
- Matlab
- LAPACK
- BLAS
- POSIX threads
- OpenGL
- Java (JDK 1.1)
28. Example Configuration (8 nodes, fast ethernet switch)
- 8 nodes, 2 GB RAM total, 64 GB disk total (~$25,000)
- 400 MHz Pentium IIs, rack-mount cases
- 256 MB RAM each
- 8 GB Ultra-ATA disks
- 16-port fast ethernet switch
- 4 UPS units
- DDS-3 SCSI DAT backup
- monitor, cables, etc.
29. A few things to keep in mind...
- Parallel computing is not a general solution
- support and maintenance vary from site to site
- use a reliable vendor
- find a good sys admin
- turn-key systems are just now appearing...
30. PC clusters will work if...
- you have many independent jobs to run (compute server)
- supercomputing resources are busy
- you have ready-to-run parallel applications
- you have portable Unix f77/C codes
- your apps are not highly vectorizable
- you are willing to use Linux/PC
31. PC clusters will not work if...
- a proprietary library/app you need is not available
- you expect parallel computing to be easy and solve all your problems
- you have extreme memory bandwidth requirements
- you need more RAM/disk space than is physically available on PC architectures
32. Recommendations
- Don't invest in 2- or 4-processor boards: memory bandwidth is too limited
- go with fast ethernet (cheap, easy)
- use brand-name, quality components
- buy pre-configured systems; don't bother building these yourself
- have a Linux-friendly sys admin
33. Who is supporting Linux clusters?
- Linux user community
  - Extreme Linux consortium
  - cluster workshops
  - 1,600 listings at SAL
  - Linux Journal
- Hardware vendors
  - SWT, Atlas, VA Research, PromoX, ...
- Software vendors
  - Red Hat
  - Caldera
  - PGI
34. Related Projects
- NIST Scalable Computing Testbed Project
- Beowulf
- Berkeley NOW
- Illinois HPVM
- DAISy (Sandia)
- Grendel
- TORC (ORNL/Tenn.)
- FermiLab
- Brahma
- Aenes
- PACET
- MadDog
- and many more
35. From Big Iron to clusters...
- Migration of conventional supercomputer users (Cray, etc.) to less expensive platforms
- are small clusters the answer?
- care to parallelize your apps?
36. What could we do?
- Give each user their own personal server
- help them port their apps
- provide some consultation
- for jobs too big, contract out.
37. Departing thoughts
- The ultra-high-end is sexy, but the end-user audience shrinks to zero.
- The real opportunities for the greatest influence are at the low/middle level.
- That is where the other 99.9% of the needs are, and users there feel ignored.