Building Beowulfs for High Performance Computing

1
Building Beowulfs for High Performance Computing
  • Duncan Grove
  • Department of Computer Science
  • University of Adelaide
  • http://dhpc.adelaide.edu.au/projects/beowulf

2
Anatomy of a Beowulf
  • Cluster of networked PCs
  • Intel PentiumII or Compaq Alpha
  • Switched 100Mbit/s Ethernet or Myrinet
  • Linux
  • Parallel and batch software support

(Diagram: the front-end node connects the outside world, through the switching infrastructure, to the compute nodes n1 .. nN)
3
Why build Beowulfs?
  • Science
  • Some problems take lots of processing
  • Many supercomputers are used as batch processing engines
  • Traditional supercomputers are wasteful for high throughput computing
  • Beowulfs
  • Provide useful computational cycles at the lowest possible price
  • Suited to high throughput computing
  • Effective at an increasingly large set of parallel problems

4
Three Computational Paradigms
  • Data Parallel
  • Regular grid-based problems
  • Parallelising compilers, e.g. HPF
  • e.g. physicists running lattice gauge calculations
  • Message Passing
  • Unstructured parallel problems
  • MPI, PVM (see the sketch after this list)
  • e.g. chemists running molecular dynamics simulations
  • Task Farming
  • High throughput computing - batch jobs
  • Queuing systems
  • e.g. chemists running Gaussian
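To make the message-passing paradigm concrete, here is a minimal MPI sketch in C (an illustrative example, not code from the presentation): rank 0 sends a single integer to rank 1, which prints it. Compile with an MPI wrapper compiler such as mpicc and launch across two nodes with mpirun.

/* Minimal message-passing sketch (illustrative only): rank 0 sends an
 * integer to rank 1, which prints it. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}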

5
A Brief Cluster History
  • Caltech Prehistory
  • Berkeley NOW
  • NASA Beowulf
  • Stone SouperComputer
  • USQ Topcat
  • UIUC NT Supercluster
  • LANL Avalon
  • SNL Cplant
  • AU Perseus?

6
Beowulf Wishlist
  • Single System Image (SSI)
  • Unified process space
  • Distributed shared memory
  • Distributed file system
  • Performance easily extensible
  • Just add more bits
  • Is fault tolerant
  • Is simple to administer and use

7
Current Sophistication?
  • Shrinkwrapped solutions or do-it-yourself
  • Not much more than a nicely installed network of
    PCs
  • A few kernel hacks to improve performance
  • No magical software for making the cluster
    transparent to the user
  • Queuing software and parallel programming
    software can create the appearance of a more
    unified machine

8
Stone SouperComputer
9
Iofor
  • Learning platform
  • Program development
  • Simple benchmarking
  • Simple performance evaluation of real applications
  • Teaching machine
  • Money lever

10
iMacwulf
  • Student lab by day, Beowulf by night?
  • MacOS with Appleseed
  • LinuxPPC 4.0, soon LinuxPPC 5.0
  • MacOS/X

11
Gigaflop harlotry
  Machine                     Cost               Processors   Peak Speed
  Cray T3E                    10s of millions    1084         1300 Gflop/s
  SGI Origin 2000             10s of millions    128          128 Gflop/s
  IBM SP2                     10s of millions    512          400 Gflop/s
  Sun HPC                     1s of millions     64           50 Gflop/s
  TMC CM5                     5 million (1992)   128          20 Gflop/s
  SGI PowerChallenge          1 million (1995)   20           20 Gflop/s
  Beowulf cluster (Myrinet)   1 million          256          120 Gflop/s
  Beowulf cluster             300K               256          120 Gflop/s

12
The obvious, but important
  • In the past
  • Commodity processors way behind supercomputer
    processors
  • Commodity networks way, way, way behind
    supercomputer networks
  • In the now
  • Commodity processors only just behind
    supercomputer processors
  • Commodity networks still way, way behind
    supercomputer networks
  • More exotic networks still way behind
    supercomputer networks
  • In the future
  • Commodity processors will be supercomputer
    processors
  • Will the commodity networks catch up?

13
Hardware possibilities
14
OS possibilities
15
Open Source
  • The good...
  • Lots of users, active development
  • Easy access to make your own tweaks
  • Aspects of Linux are still immature, but recently:
  • SGI has released XFS as open source
  • Sun has released its HPC software as open source
  • And the bad...
  • There's a lot of bad code out there!

16
Network technologies
  • So many choices!
  • Interfaces, cables, switches, hubs: ATM,
    Ethernet, Fast Ethernet, Gigabit Ethernet,
    FireWire, HiPPI, serial HiPPI, Myrinet, SCI
  • The important issues (a ping-pong latency/bandwidth sketch follows this list)
  • latency
  • bandwidth
  • availability
  • price
  • price/performance
  • application type!
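To weigh latency and bandwidth for a given interconnect, a simple MPI ping-pong test between two nodes is often enough. The sketch below is illustrative only; the 1 MB message size and repetition count are arbitrary choices, not figures from the slides.

/* Ping-pong sketch: rank 0 and rank 1 bounce a message back and forth,
 * then rank 0 reports one-way time and bandwidth. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NREPS  100
#define NBYTES (1 << 20)   /* 1 MB messages for the bandwidth estimate */

int main(int argc, char **argv)
{
    int rank;
    char *buf = malloc(NBYTES);
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    for (int i = 0; i < NREPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double elapsed = MPI_Wtime() - t0;

    if (rank == 0) {
        double per_msg = elapsed / (2.0 * NREPS);   /* one-way time */
        printf("one-way time: %g s, bandwidth: %g MB/s\n",
               per_msg, (NBYTES / per_msg) / 1e6);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}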

17
Disk subsystems
  • I/O is a problem in parallel systems
  • Data not local on compute nodes is a performance
    hit
  • Distributed file systems
  • CacheFS
  • CODA
  • Parallel file systems
  • PVFS (see the MPI-IO sketch after this list)
  • On-line bulk data is interesting in itself
  • Beowulf Bulk Data Server
  • cf. slow, expensive tape silos...
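As a rough illustration of how an application might reach a parallel file system such as PVFS, here is a hedged MPI-IO sketch (it assumes an MPI implementation with MPI-IO support, e.g. ROMIO; the file name and block size are made up for the example): every rank writes its own block of integers at a rank-dependent offset in one shared file.

/* Parallel I/O sketch via MPI-IO: each rank writes COUNT ints to its own
 * region of a shared file. */
#include <stdio.h>
#include <mpi.h>

#define COUNT 1024

int main(int argc, char **argv)
{
    int rank, buf[COUNT];
    MPI_File fh;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < COUNT; i++)
        buf[i] = rank;                  /* dummy data to write */

    MPI_File_open(MPI_COMM_WORLD, "striped.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* each rank writes at its own offset in the shared file */
    MPI_Offset offset = (MPI_Offset)rank * COUNT * sizeof(int);
    MPI_File_write_at(fh, offset, buf, COUNT, MPI_INT, &status);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}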

18
Perseus
  • Machine for chemistry simulations
  • Mainly high throughput computing
  • RIEF grant in excess of 300K
  • 128 nodes, for less than 2K per node
  • Dual processor PII450
  • At least 256MB RAM
  • Some nodes up to 1GB
  • 6GB local disk each
  • 5x24 (2x4) port Intel 100Mbit/s switches

19
Perseus Phase 1
  • Prototype
  • 16 dual processor PII
  • 100Mbit/s switched Ethernet

20
Perseus: installing a node
(Diagram: as on slide 2, the front-end node sits between the outside world and the compute nodes n1 .. nN. The front-end acts as the user node and provides administration, compilers, queues, nfs, dns, NIS, /etc/, bootp/dhcp, kickstart, ...; a new node is installed by booting from a floppy disk or bootrom.)
21
Software on perseus
  • Software to support the three computational
    paradigms
  • Data Parallel
  • Portland Group HPF
  • Message Passing
  • MPICH, LAM/MPI, PVM
  • High throughput computing
  • Condor, GNU Queue
  • Gaussian94, Gaussian98

22
Expected parallel performance
  • Loki, 1996
  • 16 Pentium Pro processors, 10Mbit/s Ethernet
  • 3.2 Gflop/s peak, achieved 1.2 real Gflop/s on
    Linpack benchmark
  • Perseus, 1999
  • 256 PentiumII processors, 100Mbit/s Ethernet
  • 115 Gflop/s peak (see the arithmetic after this list)
  • 40 Gflop/s on Linpack benchmark?
  • Compare with top 500!
  • Would get us to about 200 currently
  • Other Australian machines?
  • NEC SX/4 at BOM, ranked 102
  • Sun HPC systems at 181, 182 and 255
  • Fujitsu VPP at ANU, ranked 400
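As a sanity check on the quoted peak figures, assuming roughly one floating-point result per processor clock cycle (an assumption, not a number from the slides): Perseus has 256 processors at 450 MHz, so 256 x 450 MHz x 1 flop/cycle = 115.2 Gflop/s, matching the 115 Gflop/s peak above. The same arithmetic for Loki's 16 processors at 200 MHz reproduces its 3.2 Gflop/s peak.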

23
(No Transcript)
24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
Reliability in a large system
  • Build it right!
  • Is the operating system and software running ok?
  • Is heat dissipation going to be a problem?
  • Monitoring daemon (a minimal sampling sketch follows this list)
  • Normal features
  • CPU, network, memory, disk
  • More exotic features
  • Power supply and CPU fan speeds
  • Motherboard and CPU temperatures
  • Do we have any heisen-cabling?
  • Racks and lots of cable ties!
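A minimal sketch of the sampling loop such a monitoring daemon might run (illustrative only, not the actual perseus monitor): it reads the 1-minute load average from /proc/loadavg once a minute. Fan speeds and temperatures would be read in the same way from the lm_sensors entries under /proc, whose exact paths depend on the kernel and sensor chip.

/* Sampling-loop sketch for a node monitor: print the 1-minute load
 * average once a minute. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        FILE *fp = fopen("/proc/loadavg", "r");
        if (fp != NULL) {
            double load1;
            if (fscanf(fp, "%lf", &load1) == 1)
                printf("1-minute load average: %.2f\n", load1);
            fclose(fp);
        }
        sleep(60);   /* sample once a minute */
    }
    return 0;
}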

28
The limitations...
  • Scalability
  • Load balancing
  • Effects of machines' capabilities
  • Desktop machines vs. dedicated machines
  • Resource allocation
  • Task Migration
  • Distributed I/O
  • System monitoring and control tools
  • Maintenance requirements
  • Installation, upgrading, versioning
  • Complicated scripts
  • Parallel interactive shell?

29
...and the opportunities
  • A large proportion of the current limitations
    compared with traditional HPC solutions are
    merely systems integration problems
  • Some contributions to be made in
  • HOWTOs
  • Monitoring and maintenance
  • Performance modelling and real benchmarking