Title: Building Beowulfs for High Performance Computing
1. Building Beowulfs for High Performance Computing
- Duncan Grove
- Department of Computer Science
- University of Adelaide
- http://dhpc.adelaide.edu.au/projects/beowulf
2. Anatomy of a Beowulf
- Cluster of networked PCs
- Intel PentiumII or Compaq Alpha
- Switched 100Mbit/s Ethernet or Myrinet
- Linux
- Parallel and batch software support
[Diagram: compute nodes n1, n2, ..., nN connected by switching infrastructure to a front-end node, which links to the outside world]
3. Why build Beowulfs?
- Science
  - Some problems take lots of processing
  - Many supercomputers are used as batch processing engines
  - Traditional supercomputers are wasteful for high throughput computing
- Beowulfs
  - Useful computational cycles at the lowest possible price
  - Suited to high throughput computing
  - Effective at an increasingly large set of parallel problems
4. Three Computational Paradigms
- Data Parallel
  - Regular grid based problems
  - Parallelising compilers, e.g. HPF
  - E.g. physicists running lattice gauge calculations
- Message Passing
  - Unstructured parallel problems
  - MPI, PVM (see the sketch after this list)
  - E.g. chemists running molecular dynamics simulations
- Task Farming
  - High throughput computing: batch jobs
  - Queuing systems
  - E.g. chemists running Gaussian
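To make the message-passing paradigm concrete, here is a minimal MPI sketch in C. It is illustrative, not from the talk, though it would build against either MPICH or LAM/MPI, the implementations that appear later for Perseus; it passes a single integer from one process to another:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);                  /* start the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* which process am I?   */

        if (rank == 0) {
            int msg = 42;
            /* send one int to rank 1, message tag 0 */
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int msg;
            MPI_Status status;
            /* receive one int from rank 0, message tag 0 */
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("rank 1 received %d\n", msg);
        }

        MPI_Finalize();                          /* shut down cleanly */
        return 0;
    }

Compiled with mpicc and launched with mpirun -np 2, the runtime places the two processes on separate nodes and the send/receive pair crosses the cluster network.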
5. A Brief Cluster History
- Caltech Prehistory
- Berkeley NOW
- NASA Beowulf
- Stone SouperComputer
- USQ Topcat
- UIUC NT Supercluster
- LANL Avalon
- SNL Cplant
- AU Perseus?
6. Beowulf Wishlist
- Single System Image (SSI)
- Unified process space
- Distributed shared memory
- Distributed file system
- Performance easily extensible
  - Just add more bits
- Fault tolerant
- Simple to administer and use
7. Current Sophistication?
- Shrinkwrapped solutions or do-it-yourself
- Not much more than a nicely installed network of PCs
- A few kernel hacks to improve performance
- No magical software for making the cluster transparent to the user
- Queuing software and parallel programming software can create the appearance of a more unified machine
8. Stone SouperComputer
9. Iofor
- Learning platform
- Program development
- Simple benchmarking
- Simple performance evaluation of real applications
- Teaching machine
- Money lever
10. iMacwulf
- Student lab by day, Beowulf by night?
- MacOS with Appleseed
- LinuxPPC 4.0, soon LinuxPPC 5.0
- MacOS/X
11. Gigaflop harlotry

  Machine                     Cost               Processors   Peak Speed
  Cray T3E                    10s of millions    1084         1300 Gflop/s
  SGI Origin 2000             10s of millions    128          128 Gflop/s
  IBM SP2                     10s of millions    512          400 Gflop/s
  Sun HPC                     1s of millions     64           50 Gflop/s
  TMC CM5                     5 million (1992)   128          20 Gflop/s
  SGI PowerChallenge          1 million (1995)   20           20 Gflop/s
  Beowulf cluster (Myrinet)   1 million          256          120 Gflop/s
  Beowulf cluster             300K               256          120 Gflop/s
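Taking the table's rough costs at face value, the point is price/performance: the plain Beowulf delivers 120 Gflop/s / $0.3M ≈ 400 Gflop/s per million dollars and the Myrinet Beowulf 120, while the Cray T3E manages somewhere between roughly 40 and 130 depending on where in the "tens of millions" its price falls.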
12. The obvious, but important
- In the past
  - Commodity processors way behind supercomputer processors
  - Commodity networks way, way, way behind supercomputer networks
- In the now
  - Commodity processors only just behind supercomputer processors
  - Commodity networks still way, way behind supercomputer networks
  - More exotic networks still way behind supercomputer networks
- In the future
  - Commodity processors will be supercomputer processors
  - Will the commodity networks catch up?
13. Hardware possibilities
14. OS possibilities
15. Open Source
- The good...
  - Lots of users, active development
  - Easy access to make your own tweaks
  - Aspects of Linux are still immature, but recently:
    - SGI has released XFS as open source
    - Sun has released its HPC software as open source
- And the bad...
  - There's a lot of bad code out there!
16. Network technologies
- So many choices!
  - Interfaces, cables, switches, hubs
  - ATM, Ethernet, Fast Ethernet, Gigabit Ethernet, FireWire, HiPPI, Serial HiPPI, Myrinet, SCI
- The important issues (see the model after this list)
  - latency
  - bandwidth
  - availability
  - price
  - price/performance
  - application type!
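A standard first-order model, not from the slides but useful for weighing these issues against one another: the time to deliver an n-byte message is roughly

  t(n) ≈ α + n/β

where α is the per-message latency and β the sustained bandwidth. With illustrative Fast Ethernet figures of α ≈ 100 µs and β ≈ 12.5 MB/s, a 1 KB message costs about 180 µs: small messages are latency-bound, large transfers bandwidth-bound, which is why the "right" network really does depend on the application type.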
17. Disk subsystems
- I/O is a problem in parallel systems
  - Data not local to compute nodes is a performance hit
- Distributed file systems
  - CacheFS
  - CODA
- Parallel file systems
  - PVFS
- On-line bulk data is interesting in itself
  - Beowulf Bulk Data Server
  - cf. slow, expensive tape silos...
18. Perseus
- Machine for chemistry simulations
  - Mainly high throughput computing
- RIEF grant in excess of 300K
- 128 nodes, for < 2K per node (a budget check follows this list)
  - Dual processor PII450
  - At least 256MB RAM (some nodes up to 1GB)
  - 6GB local disk each
- 5x24 (2x4) port Intel 100Mbit/s switches
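A quick sanity check on those figures, assuming the per-node bound covers each complete node: 128 nodes x $2K/node = $256K, which sits inside the 300K grant and leaves some headroom for the switches and racking.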
19. Perseus Phase 1
- Prototype
- 16 dual processor PII
- 100Mbit/s switched Ethernet
20. Perseus: installing a node
[Diagram: compute nodes n1, n2, ..., nN connected by switching infrastructure to the front-end node, which links to the outside world]
- Front-end node: user node, administration, compilers, queues, nfs, dns, NIS, /etc/, bootp/dhcp, kickstart, ...
- Compute nodes are installed by booting from floppy disk or bootrom
21. Software on Perseus
- Software to support the three computational paradigms
- Data Parallel
  - Portland Group HPF
- Message Passing
  - MPICH, LAM/MPI, PVM
- High throughput computing
  - Condor, GNU Queue
  - Gaussian94, Gaussian98
22. Expected parallel performance
- Loki, 1996
  - 16 Pentium Pro processors, 10Mbit/s Ethernet
  - 3.2 Gflop/s peak; achieved 1.2 real Gflop/s on the Linpack benchmark
- Perseus, 1999
  - 256 PentiumII processors, 100Mbit/s Ethernet
  - 115 Gflop/s peak
  - 40 Gflop/s on the Linpack benchmark? (see the estimate after this list)
- Compare with the Top 500!
  - Would get us to about 200 currently
- Other Australian machines?
  - NEC SX/4 @ BOM at 102
  - Sun HPC at 181, 182, 255
  - Fujitsu VPP @ ANU at 400
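The Perseus figures are consistent with a back-of-envelope estimate, assuming one floating-point result per cycle per processor: 256 x 450 MHz x 1 flop/cycle ≈ 115 Gflop/s peak. The 40 Gflop/s Linpack guess is then about 35% of peak, in line with the 1.2/3.2 ≈ 38% of peak that Loki achieved.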
27. Reliability in a large system
- Build it right!
- Is the operating system and software running ok?
- Is heat dissipation going to be a problem?
- Monitoring daemon (a minimal sketch follows this list)
- Normal features
- CPU, network, memory, disk
- More exotic features
- Power supply and CPU fan speeds
- Motherboard and CPU temperatures
- Do we have any heisen-cabling?
- Racks and lots of cable ties!
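To give a flavour of such a daemon, here is a minimal sketch in C, assuming a Linux /proc filesystem. Reading the one-minute load average from /proc/loadavg is standard; the temperature and fan-speed files exposed by lm_sensors vary with the sensor chip and driver version, so that part is left as a comment, and the load threshold is an arbitrary placeholder:

    #include <stdio.h>
    #include <unistd.h>

    /* Read the 1-minute load average from /proc/loadavg. */
    static double read_load(void)
    {
        double load = -1.0;
        FILE *f = fopen("/proc/loadavg", "r");
        if (f) {
            fscanf(f, "%lf", &load);
            fclose(f);
        }
        return load;
    }

    int main(void)
    {
        for (;;) {
            double load = read_load();
            if (load > 4.0)   /* placeholder threshold for a dual-CPU node */
                fprintf(stderr, "warning: load average %.2f\n", load);
            /* CPU temperature, fan speed and power supply checks would
               read the lm_sensors /proc entries here; the exact paths
               depend on the motherboard and driver version. */
            sleep(60);        /* poll once a minute */
        }
        return 0;
    }

In practice each node would run something like this and report to a collector on the front-end node rather than to stderr.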
28. The limitations...
- Scalability
- Load balancing
- Effects of machine capabilities
- Desktop machines vs. dedicated machines
- Resource allocation
- Task Migration
- Distributed I/O
- System monitoring and control tools
- Maintenance requirements
- Installation, upgrading, versioning
- Complicated scripts
- Parallel interactive shell?
29. ...and the opportunities
- A large proportion of the current limitations compared with traditional HPC solutions are merely systems integration problems
- Some contributions to be made in:
  - HOWTOs
  - Monitoring and maintenance
  - Performance modelling and real benchmarking