Technologies for the Future: CLUSTERS



1
Technologies for the Future: CLUSTERS
  • Anne C. Elster
  • Dept. of Computer and Information Science (IDI)
  • Norwegian Univ. of Science and Technology (NTNU)
  • Trondheim, Norway

NOTUR 2003
2
Clusters (Networks of PCs/Workstations)
  • Are they suitable for HPC?
  • Advantage
  • Cost-effective hardware, since they use COTS
    (Commercial Off-The-Shelf) parts
  • BUT
  • Typically much slower processor interconnects
    than traditional HPC systems
  • What about usability?

NTNU IDI's 40-node AMD 1.46 GHz cluster: 2 GB RAM,
40 GB disk, Fast Ethernet
3
Cluster Technologies: NOTUR Emerging Technology
project. Collaboration between NTNU and Univ. of
Tromsø
  • Goal
  • Analyze the suitability of cluster technologies
    for HPC by looking at some of the most
    interesting NOTUR applications
  • The results will provide a foundation for
    decisions regarding future HPC programs

4
Main Collaborators include
  • Anne C. Elster (IDI, NTNU), project leader
  • Otto Anshus and Tore Larsen (CS, U of Tromsø)
  • Tor Johansen and staff (CC, U of Tromsø)
  • Torbjørn Hallgren (IDI, NTNU)
  • Einar Rønquist (IMF, NTNU)
  • Master's and Ph.D. students and postdocs at NTNU
    and Univ. of Tromsø

5
General Issues to Consider
  • Why cluster vs. Powerful desktop vs. Large SMPs?
  • What are the total costs associated with clusters
    (hardware, software, support, usability)?
  • 32-bit vs. 64-bit architectures

6
Cluster Project ACTIVITIES
  • A.1 Profiling and Tuning Selected Applications
  • A.1.a/b Physics and Chemistry Codes
  • (Elster and students, Dept. of Computer Science,
    NTNU)
  • A.1.2a Profiling and User Analysis of Amber,
    Dalton and Gaussian
  • (Tor Johansen and staff, Comp. Center, U of
    Tromsø)
  • A.1.2b Optimization tool analysis of Dalton
  • (Anshus and PostDoc/student, Dept. of Comp. Sci.,
    U of Tromsø)

7
Cluster Project ACTIVITIES continued
  • A.2 Execution Monitoring
  • (Anshus, Tore Larsen and students, CS, U of T)
  • A.3 Visualization servers, etc.
  • (Hallgren, Elster and students, CS, NTNU)
  • A.4 Impact of future numerical algorithms
  • (Rønquist and student, Dept. of Mathematics, NTNU)
  • A.5 Interface with NOTUR ET Grid Project
  • (Elster, Harald Simonsen and colleagues, staff
    and students associated with the NOTUR ET Cluster
    and Grid projects)

8
A.1.a/b Physics and Chemistry Codes (Elster and
students, Dept. of CS, NTNU)
Lessons learned so far -- Paul Sack's work on a
physics application (report available on the
Web)
  • FORTRAN problems
  • Different FORTRAN implementations have
    non-standard add-ons (e.g. FORTRAN 90)
  • This makes it very difficult to port code to a
    different platform with a different Fortran
    compiler (e.g. from a different vendor)

9
A.1.a/b Physics and Chemistry Codes, continued
  • Performance of individual programs can vary
    across machines
  • Åsmund Østvold wrote a project report on
  • porting PROTOMOL from an SMP with MPI one-sided
    communication primitives (MPI put/get) to a
    cluster (available on the WWW)
  • He also did an MS study with SCALI on various
  • MPI broadcast algorithms and benchmarking
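
The slide does not say which broadcast algorithms were studied. As a rough illustration of why the choice matters on a cluster, the sketch below compares two textbook schedules: a linear broadcast, where the root sends to each rank in turn, and a binomial-tree broadcast, where every rank that already holds the message forwards it in each round. The function names are made up for this example, and the code only counts communication steps; it does not call MPI.

```python
def linear_broadcast(p, root=0):
    """Root sends to every other rank, one message per step."""
    return [[(root, dst)] for dst in range(p) if dst != root]


def binomial_broadcast(p):
    """Binomial-tree broadcast from rank 0: the set of holders doubles each round."""
    rounds = []
    have = 1  # ranks 0..have-1 already hold the message
    while have < p:
        # Every current holder r forwards to rank r + have, if that rank exists.
        sends = [(r, r + have) for r in range(have) if r + have < p]
        rounds.append(sends)
        have *= 2
    return rounds


if __name__ == "__main__":
    for p in (8, 32, 40):
        print(p, "ranks:", len(linear_broadcast(p)), "linear steps,",
              len(binomial_broadcast(p)), "tree rounds")
```

Under this step-counting model the tree needs about log2(p) rounds instead of p - 1 steps, which is one reason broadcast implementations are worth benchmarking on slow interconnects.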

10
A.1.a/b Physics and Chemistry Codes, continued 2
  • Ongoing work with Snorre Boasson and Jan
    Christian Meyer on porting a PIC code from
    Pthreads (SMP primitives) to MPI.
  • Preliminary report will be available later this
    week.
  • "Recent Trends in Cluster Computing", presented at
    ParCo 2003 by Elster et al., includes hardware
    trends and a survey of libraries and performance
    tools.
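
To make the Pthreads-to-MPI porting effort more concrete, here is a minimal sketch of the same reduction written in both styles. It is not the PIC code and it does not use Pthreads or MPI: Python threads stand in for SMP threads, and `queue.Queue` objects stand in for MPI send/receive. Both function names are hypothetical.

```python
import queue
import threading


def shared_memory_sum(data, n_workers):
    """Pthreads style: workers update one shared total under a lock."""
    total = [0]
    lock = threading.Lock()

    def work(chunk):
        partial = sum(chunk)
        with lock:  # shared state needs explicit synchronization
            total[0] += partial

    threads = [threading.Thread(target=work, args=(data[i::n_workers],))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(data, n_workers):
    """MPI style: no shared data; chunks and results travel as messages."""
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()

    def work(rank):
        chunk = inboxes[rank].get()  # "receive" this worker's part of the data
        results.put(sum(chunk))      # "send" the partial result to the master

    threads = [threading.Thread(target=work, args=(r,)) for r in range(n_workers)]
    for t in threads:
        t.start()
    for r in range(n_workers):
        inboxes[r].put(data[r::n_workers])  # master "scatters" the data
    total = sum(results.get() for _ in range(n_workers))  # master "gathers"
    for t in threads:
        t.join()
    return total
```

In the second version the data distribution and result collection have to be written out explicitly, which is typically where most of the porting work goes.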

11
A.1.2a Profiling and User Analysis of Amber, Dalton
and Gaussian (Tor Johansen and staff, Comp.
Center, U of Tromsø)
  • Coordination work
  • Travel to NOTUR 2003
  • Porting and testing of Amber and Scali software


12
A.1.2b Optimization tool analysis of Dalton
(Anshus and PostDoc/students, CS, U of Tromsø)
  • Performance measurements done on DALTON
    (Ytelsesmålinger gjort på DALTON)
  • A Report for the NOTUR Project Emerging
    Technologies Cluster
  • Daniel Stødle, Otto J. Anshus, John Markus
    Bjørndalen
  • Survey of optimizing techniques for parallel
    programs running on computer clusters
  • Espen S. Johnsen, Otto J. Anshus, John Markus
    Bjørndalen, Lars Ailo Bongo (September 29, 2003)

13
A.1.2b Optimization tool analysis of Dalton
(Anshus and PostDoc/student, IFI, U of Tromsø)
CONTINUED
  • RESULTS
  • Dalton scales well: about 25x speedup on 32
    nodes
  • NOTE: Only without caching of temporary data.
    With caching, only 3-5x speedup on 32 nodes!
  • Even though the 8-way cluster had no local disk
    (only a network file system), the sequential
    Dalton code was significantly faster.
  • This indicates that network bandwidth may not
    be a problem if caching is used in the parallel
    version
  • Communication pattern: master-slave,
    "bag-of-tasks" oriented program with little
    communication and synchronization and generally
    good utilization of the slave nodes.
  • The master does relatively little work and is
    blocked most of the time
  • Finally, checked whether the master node could be
    a bottleneck, but could not detect differences in
    execution time when the master was put on a slow
    node vs. a fast node. NOTE: Only tested up to 32
    nodes; using a larger number of nodes may limit
    performance by overloading the master node.
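
The master-slave "bag-of-tasks" pattern described above can be sketched as follows. This is not Dalton's implementation: threads stand in for cluster nodes, a `queue.Queue` stands in for the master's task pool, and `run_bag_of_tasks` and `work_fn` are names invented for the example.

```python
import queue
import threading


def run_bag_of_tasks(tasks, n_workers, work_fn):
    """Master fills a bag of tasks; idle workers pull the next task themselves."""
    bag = queue.Queue()
    results = queue.Queue()
    for task_id, task in enumerate(tasks):
        bag.put((task_id, task))

    def worker():
        while True:
            try:
                task_id, task = bag.get_nowait()
            except queue.Empty:
                return  # bag is empty: this worker is done
            results.put((task_id, work_fn(task)))

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    # The master only waits here, mirroring "blocked most of the time".
    for w in workers:
        w.join()

    # Collect results in task order once all workers have finished.
    out = [None] * len(tasks)
    while not results.empty():
        task_id, value = results.get()
        out[task_id] = value
    return out


if __name__ == "__main__":
    print(run_bag_of_tasks([1, 2, 3, 4, 5], n_workers=3, work_fn=lambda x: x * x))
```

Because workers fetch tasks only when they are idle, faster workers simply process more tasks, which fits the good slave utilization noted above. With many more workers, every fetch still goes through the single bag, which is the potential master bottleneck mentioned in the last bullet.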

14
A.1.2b Optimization tool analysis of Dalton
(Anshus and PostDoc/student, IFI, U of Tromsø)
CONTINUED 2
  • Thanks to
  • Kenneth Ruud, Chemistry, UiT
  • Roy Dragseth, CC UiT, for support on the Itanium
    at U of Tromsø.

15
A.2 Execution Monitoring (Anshus, Tore Larsen and
students, CS, U of T)
  • Survey of execution monitoring tools for
    computer clusters
  • Espen S. Johnsen, Otto J. Anshus, John Markus
    Bjørndalen, Lars Ailo Bongo, Sept. 2003
  • Performance Monitoring
  • Lars Ailo Bongo, Otto J. Anshus, John Markus
    Bjørndalen

16
A.3 Visualization servers, etc. (Hallgren,
Elster and students, CS, NTNU)
  • Ongoing work with Torbjørn Vik
  • Preliminary report on a survey of how clusters
    are currently used in visualization
  • Two types of cluster usage
  • Off-line (non-real-time rendering). Often called
    "render farms": many nodes that each work on one
    frame of a larger animation.
  • Typically used in the film industry and other
    areas where interactivity and/or real-time
    rendering is not needed.
  • All larger 3D modelling programs, such as
    Lightwave, 3DStudio and Maya, have functionality
    for this.
  • On-line (real-time). Most interesting from a
    technical viewpoint...

17
A.3 Visualization servers, etc. - Continued
  • Clusters are used in interactive visualization
    software to
  • increase performance,
  • enable larger data sets,
  • avoid limitations of local hardware.
  • Most visualization clusters work, in principle,
    with the user sitting at a client machine that
    has little capacity of its own. The cluster
    handles all computation and sends only the
    finished images to the client. The client machine
    also receives input from the user and forwards it
    to the cluster. Data sets for such visualization
    are often very large and, depending on the
    situation, both polygon-based and voxel-based
    rendering are used.
  • The main problem in making clusters usable for
    interactive visualization programs is delay
    caused by the network. This is addressed by
    reducing the time spent transferring images
    between cluster and client, either by
  • reducing the amount of data (compression methods)
    or
  • increasing network performance. Or both.
  • Parallelism within the cluster itself is based on
    independence between different data. There may be
    independence between different parts of the same
    data set, or between different frames of a 4D
    data set.
  • Load balancing often becomes a problem in such
    settings and is an important research area.
  • Which load-balancing method is used is usually
    highly context dependent.
  • Cluster software for visualization still
    lacking??
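
The trade-off between reducing the amount of data and increasing network performance can be illustrated with a simple transfer-time model. This sketch makes several assumptions that are not on the slide: a per-frame cost of latency plus size divided by bandwidth, lossless `zlib` compression, a synthetic frame, and no accounting for the time spent compressing and decompressing.

```python
import zlib


def transfer_time(n_bytes, bandwidth_bytes_per_s, latency_s):
    """Latency plus size/bandwidth model for sending one frame."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def frame_times(frame, bandwidth_bytes_per_s, latency_s=0.0001):
    """Return (raw time, compressed time, compressed size) for one frame."""
    compressed = zlib.compress(frame, level=1)  # fast, lossless compression
    raw = transfer_time(len(frame), bandwidth_bytes_per_s, latency_s)
    packed = transfer_time(len(compressed), bandwidth_bytes_per_s, latency_s)
    return raw, packed, len(compressed)


if __name__ == "__main__":
    # Synthetic 640x480 RGB frame with one uniform colour (an assumption;
    # real rendered images compress far less well).
    frame = bytes([30, 60, 90]) * (640 * 480)
    fast_ethernet = 100e6 / 8  # 100 Mbit/s expressed in bytes per second
    raw, packed, size = frame_times(frame, fast_ethernet)
    print(f"raw: {raw * 1000:.1f} ms, compressed: {packed * 1000:.2f} ms "
          f"({size} bytes)")
```

In this model an uncompressed 640x480 RGB frame takes roughly 74 ms over Fast Ethernet, which already limits the achievable frame rate. How much compression helps in practice depends on the image content and on the compression cost, which this model ignores.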

18
A.4 Impact of future numerical algorithms
(Rønquist and student, Dept. of Mathematics, NTNU)
  • Rønquist's student Staff (now at Simulasenteret)
    wrote a report based on his summer job
  • May add in experiences from Elster's group, fall
    2003

19
A.5 Interface with NOTUR ET Grid
Project (Elster, Harald Simonsen and colleagues,
staff and students associated with the NOTUR ET
Cluster and Grid projects)
  • Test node established at NTNU
  • Andreas Botnen (USIT) and
  • Robin Holtet (IDI, now ITEA)
  • May use IDI's 30-40-node cluster in the test grid
  • Meetings
  • Between Elster's and Simonsen's groups
  • Robin Holtet and Elster's student Thorvald Natvig
    to attend the Linköping meeting this month.
  • Collaborations re. National GRID and EGEE
  • Students from NTNU and UiO at CERN

20
Main cluster issues
  • Global operations have a more severe impact on
    cluster performance than on traditional
    supercomputers, since communication between
    processors takes relatively more of the total
    execution time
  • SCALABILITY!!
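
One way to put numbers on scalability is to convert a measured speedup into parallel efficiency and into the Karp-Flatt experimentally determined serial fraction. The sketch below applies these standard formulas to the Dalton figures reported earlier (about 25x, and 3-5x with caching, on 32 nodes). The function names are chosen for this example, and the serial fraction here lumps together true serial work and communication overhead.

```python
def efficiency(speedup, p):
    """Parallel efficiency: speedup divided by the number of nodes."""
    return speedup / p


def karp_flatt(speedup, p):
    """Experimentally determined serial fraction (Karp-Flatt metric)."""
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)


def amdahl_speedup(serial_fraction, p):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)


if __name__ == "__main__":
    for s in (25.0, 5.0, 3.0):
        e = karp_flatt(s, 32)
        print(f"speedup {s:4.1f} on 32 nodes: efficiency {efficiency(s, 32):.2f}, "
              f"serial fraction {e:.3f}, "
              f"Amdahl estimate on 128 nodes {amdahl_speedup(e, 128):.1f}x")
```

A 25x speedup on 32 nodes corresponds to about 78% efficiency and a serial fraction just under 1%. A 3x speedup corresponds to a serial fraction of about 31%, so adding nodes would help very little in that case.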

21
Lessons learned
  • Clusters generally have cheap hardware, but may
    incur hidden costs regarding:
  • More incompatible compilers, especially Fortran
    90 (also C)
  • Some applications are non-trivial to port from a
    shared-memory paradigm to a distributed-memory
    paradigm
  • Some applications require high-bandwidth
    interconnects, which drive up costs (e.g. SGI
    Altix)
  • Power and cooling costs (ref. Brian Vinter)
  • Stability, recovery
  • Overall costs and scalability should be studied
    further

22
The Ideal Cluster -- Hardware
  • High-bandwidth network
  • Low-latency network
  • Low operating system overhead (TCP causes slow
    start)
  • Great floating-point performance
  • (64-bit processors or more?)
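
The remark that TCP causes slow start can be illustrated with a deliberately simplified model: the congestion window starts small and doubles every round trip, so a short message pays several round trips before the link is used fully. The values for `mss` and `initial_window` are assumptions, and the model ignores packet loss, the slow-start threshold, receiver windows and delayed ACKs.

```python
def slow_start_round_trips(message_bytes, mss=1460, initial_window=2):
    """Round trips needed to send a message while cwnd doubles each RTT."""
    segments_left = -(-message_bytes // mss)  # ceiling division
    cwnd = initial_window                     # congestion window, in segments
    rtts = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window in this round trip
        cwnd *= 2              # exponential growth during slow start
        rtts += 1
    return rtts


if __name__ == "__main__":
    for size in (1_000, 64 * 1024, 1024 * 1024):
        print(size, "bytes:", slow_start_round_trips(size), "round trips")
```

On a low-latency cluster network these extra round trips can dominate the cost of small and medium messages, which is one motivation for low-overhead communication layers.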

23
The Ideal Cluster -- Software
  • Compiler that is
  • Portable
  • Optimizing
  • Do extra work to save communication
  • Self-tuning / load-balanced
  • Automatic selection of best algorithm
  • One-sided communication support?
  • Optimized middleware

24
For more information
  • A dozen or more reports associated with this
    project will be made available on the web at
  • http://www.idi.ntnu.no/elster
  • Email: elster_at_idi.ntnu.no