Provided by: unc
Learn more at: http://its2.unc.edu
Title: Getting Started on Topsail


1
Getting Started on Topsail
  • Charles Davis
  • ITS Research Computing
  • October 26, 2009

2
Outline
  • History of Topsail
  • Structure of Topsail
  • File Systems on Topsail
  • Compiling on Topsail
  • Topsail and LSF

3
Initial Topsail Cluster
  • Initially 1040 CPU Dell Linux Cluster
  • 520 dual socket, single core nodes
  • Infiniband interconnect
  • Intended for capability research
  • Housed in ITS Franklin machine room
  • Fast and efficient for large computational jobs

4
Topsail Upgrade 1
  • Topsail upgraded to 4,160 CPU
  • replaced blades with dual socket, quad core
  • Intel Xeon 5345 (Clovertown) Processors
  • Quad-Core with 8 CPU/node
  • Increased number of processors, but decreased
    individual processor speed (was 3.6 GHz, now
    2.33 GHz)
  • Decreased energy usage and necessary resources
    for cooling system
  • Summary: slower clock speed, better memory
    bandwidth, less heat
  • Benchmarks tend to run at the same speed per core
  • Topsail shows a net 4X improvement
  • Of course, this number is VERY application
    dependent

5
Topsail Upgraded blades
  • 52 chassis: basis of node names
  • Each holds 10 blades -> 520 blades total
  • Nodes: cmp-chassis-blade
  • Old Compute Blades: Dell PowerEdge 1855
  • 2 single-core Intel Xeon EM64T 3.6 GHz procs
  • 800 MHz FSB
  • 2MB L2 Cache per socket
  • Intel NetBurst MicroArchitecture
  • New Compute Blades: Dell PowerEdge 1955
  • 2 quad-core Intel 2.33 GHz procs
  • 1333 MHz FSB
  • 4MB L2 Cache per socket
  • Intel Core 2 MicroArchitecture
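
The blade and core counts quoted so far can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the counts come from the slides, but the exact node-name format (1-based numbering, no zero padding) is an assumption, since the slides only give the pattern cmp-chassis-blade.

```python
# Counts taken from the slides; the exact name format below is an assumption.
CHASSIS = 52
BLADES_PER_CHASSIS = 10
SOCKETS_PER_BLADE = 2


def total_cores(cores_per_socket):
    """Total cores in the cluster for a given number of cores per socket."""
    return CHASSIS * BLADES_PER_CHASSIS * SOCKETS_PER_BLADE * cores_per_socket


def node_names():
    """Enumerate cmp-<chassis>-<blade> names (1-based numbering is assumed)."""
    return [
        f"cmp-{chassis}-{blade}"
        for chassis in range(1, CHASSIS + 1)
        for blade in range(1, BLADES_PER_CHASSIS + 1)
    ]


old_cores = total_cores(cores_per_socket=1)  # PowerEdge 1855, single core
new_cores = total_cores(cores_per_socket=4)  # PowerEdge 1955, quad core

print(len(node_names()))       # number of blades
print(old_cores, new_cores)    # cores before and after the upgrade
print(new_cores // old_cores)  # growth factor in core count
```

The factor of 4 in core count matches the "net 4X improvement" quoted for the upgrade, given that per-core benchmark speed stayed roughly the same.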

6
Topsail Upgrade 2
  • Most recent Topsail upgrade (Feb/Mar 09)
  • Refreshed much of the infrastructure
  • Improved IBRIX filesystem
  • Replaced and improved Infiniband cabling
  • Moved cluster to ITS-Manning building
  • Better cooling and UPS

7
Current Topsail Architecture
  • Login node: 8 CPU @ 2.3 GHz Intel EM64T, 12 GB
    memory
  • Compute nodes: 4,160 CPU @ 2.3 GHz Intel EM64T,
    12 GB memory
  • Shared disk: 39TB IBRIX Parallel File System
  • Interconnect: Infiniband 4x SDR
  • 64bit Linux Operating System

8
Multi-Core Computing
  • Processor Structure on Topsail
  • 500 nodes
  • 2 sockets/node
  • 1 processor/socket
  • 4 cores/processor (Quad-core)
  • 8 cores/node
  • http://www.tomshardware.com/2006/12/06/quad-core-xeon-clovertown-rolls-into-dp-servers/page3.html

9
Multi-Core Computing
  • The trend in High Performance Computing is
    towards multi-core or many core computing.
  • More cores at slower clock speeds for less heat
  • Now, dual and quad core processors are becoming
    common.
  • Soon 64 core processors will be common
  • And these may be heterogeneous!

10
The Heat Problem
Taken from Jack Dongarra, UT

11
More Parallelism
Taken from Jack Dongarra, UT

12
Infiniband Connections
  • Connection comes in single (SDR), double (DDR),
    and quad (QDR) data rates.
  • Topsail is SDR.
  • Single data rate is 2.5 Gbit/s in each direction
    per link.
  • Links can be aggregated: 1x, 4x, 12x.
  • Topsail is 4x.
  • Links use 8B/10B encoding: every 10 bits sent carry
    8 bits of data, so the useful data transmission rate
    is four-fifths of the raw rate. Thus single, double,
    and quad data rates carry 2, 4, or 8 Gbit/s per 1x
    link, respectively.
  • Data rate for Topsail is 8 Gbit/s (4x SDR), i.e.
    about 1 GB/s.
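
The link-rate arithmetic can be written out as a short calculation. This is a minimal sketch using only the figures stated above (2.5 Gbit/s raw per 1x SDR link, 8B/10B encoding); the function and constant names are illustrative. Integer Mbit/s units are used to avoid floating-point rounding.

```python
SDR_RAW_MBIT = 2500  # raw signalling rate of one 1x SDR link, per direction


def effective_mbit(rate_multiplier, link_width):
    """Useful data rate in Mbit/s after 8B/10B encoding overhead.

    rate_multiplier: 1 for SDR, 2 for DDR, 4 for QDR
    link_width:      1, 4 or 12 aggregated links
    """
    raw = SDR_RAW_MBIT * rate_multiplier * link_width
    return raw * 8 // 10  # 8 data bits for every 10 bits on the wire


# Per 1x link: SDR, DDR, QDR
for name, mult in (("SDR", 1), ("DDR", 2), ("QDR", 4)):
    print(name, effective_mbit(mult, 1), "Mbit/s")

# Topsail: 4x SDR
topsail_mbit = effective_mbit(1, 4)
print("Topsail:", topsail_mbit, "Mbit/s =", topsail_mbit // 8, "MB/s")
```

For 4x SDR this gives 8000 Mbit/s of useful bandwidth per direction, which corresponds to 1000 MB/s.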

13
Topsail Network Topology

14
Infiniband Benchmarks
  • Point-to-point (PTP) intranode communication on
    Topsail for various MPI send types
  • Peak bandwidth
  • 1288 MB/s
  • Minimum Latency (1-way)
  • 3.6 µs

15
Infiniband Benchmarks
  • Scaled aggregate bandwidth for MPI Broadcast on
    Topsail
  • Note good scaling throughout the tested range
    (from 24-1536 cores)

16
Login to Topsail
  • Use ssh to connect
  • ssh topsail.unc.edu
  • SSH Secure Shell with Windows
  • For using interactive programs with X-Windows
    Display
  • ssh -X topsail.unc.edu
  • ssh -Y topsail.unc.edu
  • Off-campus users (i.e. domains outside of
    unc.edu) must use VPN connection

17
Topsail File Systems
  • 39TB IBRIX Parallel File System
  • Split into Home and Scratch Space
  • Home: /ifs1/home/my_onyen
  • Scratch: /ifs1/scr/my_onyen
  • Only Home is backed up
  • Mass Storage: /ifs1/home/my_onyen/ms

18
File System Limits
  • 500GB Total Limit per User
  • Home: 15GB limit for backups
  • Scratch:
  • No limit except 500GB total
  • Not backed up
  • Periodically cleaned
  • Few installed packages/programs

19
Compiling on Topsail
  • Modules
  • Serial Programming
  • Intel Compiler Suite for Fortran77, Fortran90, C
    and C++ - Recommended by Research Computing
  • GNU
  • Parallel Programming
  • MPI
  • OpenMP
  • Must use Intel Compiler Suite
  • Compiler flag: -openmp
  • Must set OMP_NUM_THREADS in submission script

20
Compiling Modules
  • Module commands
  • module: list commands
  • module avail: list available modules
  • module add: add module temporarily
  • module list: list modules being used
  • module clear: remove module temporarily
  • Add module using startup files

21
Available Compilers
  • Intel: ifort, icc, icpc
  • GNU: gcc, g++, gfortran
  • Libraries - BLAS/LAPACK
  • MPI
  • mpicc/mpiCC
  • mpif77/mpif90
  • mpixx is just a wrapper around the Intel or GNU
    compiler
  • Adds location of MPI libraries and include files
  • Provided as a convenience

22
Test MPI Compile
  • Copy cpi.c to scratch directory
  • cp /ifs1/scr/cdavis/Topsail/cpi.c /ifs1/scr/my_onyen/.
  • Add Intel module
  • module load hpc/mvapich-intel-11
  • Confirm Intel module
  • which mpicc
  • Compile code
  • mpicc -o cpi cpi.c

23
MPI/OpenMP Training
  • Courses are taught throughout the year by Research
    Computing: http://learnit.unc.edu/workshops
  • Next course
  • MPI: Spring
  • OpenMP: Spring

24
Running Programs on Topsail
  • Upon ssh to Topsail, you are on the Login node.
  • Programs SHOULD NOT be run on Login node.
  • Submit programs to the Compute nodes (4,160 CPUs).
  • Submit jobs using Load Sharing Facility (LSF).

25
Job Scheduling Systems
  • Allocates compute nodes to job submissions based
    on user priority, requested resources, execution
    time, etc.
  • Many types of schedulers
  • Load Sharing Facility (LSF): used by Topsail
  • IBM LoadLeveler
  • Portable Batch System (PBS)
  • Sun Grid Engine (SGE)

26
Load Sharing Facility (LSF)

27
Submitting a Job to LSF
  • For a compiled MPI job
  • bsub -n "<number CPUs>" -o out.%J -e err.%J -a mvapich mpirun ./mycode
  • bsub: LSF command that submits job to compute
    node
  • bsub -o and bsub -e
  • Job output saved to file in submission directory
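
When many jobs are submitted with slightly different settings, it can help to assemble the bsub command line programmatically. The sketch below is one possible way to do that; the helper name bsub_command is hypothetical, and it assumes the output files are named out.%J and err.%J, where LSF substitutes the job ID for %J. Only options that appear in these slides (-q, -n, -o, -e, -a) are used, and the command is printed rather than executed.

```python
import shlex


def bsub_command(num_cpus, program, queue=None, launcher="mpirun",
                 mpi_app="mvapich"):
    """Assemble a bsub command line from the options shown on the slides."""
    args = ["bsub"]
    if queue is not None:
        args += ["-q", queue]
    args += [
        "-n", str(num_cpus),
        "-o", "out.%J",  # LSF replaces %J with the job ID
        "-e", "err.%J",
        "-a", mpi_app,
        launcher, program,
    ]
    # shlex.join quotes any argument that needs it for the shell
    return shlex.join(args)


print(bsub_command(8, "./mycode"))
print(bsub_command(4, "./cpi", queue="week"))
```

The returned string can be pasted into a shell on the login node, or the argument list could be passed to subprocess.run on a machine where LSF is installed.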

28
Queue System on Topsail
  • Topsail uses queues to distribute jobs.
  • Specify queue with -q in bsub
  • bsub -q week
  • No -q specified: default queue (week)
  • Queues vary depending on size and required time
    of jobs
  • See listing of queues
  • bqueues

29
Topsail Queues
Queue    Time Limit   Jobs/User   CPU Range
int      2 hrs        128         ---
debug    2 hrs        128         ---
day      24 hrs       1024        4 - 1024
week     1 week       1024        4 - 256
month    1 month      512         4 - 128
512cpu   4 days       1024        128 - 512
128cpu   4 days       512         32 - 128
32cpu    2 days       1024        4 - 32
chunk    4 days       512         Batch Jobs
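
The queue limits lend themselves to a small lookup. The following sketch lists which queues would accept a job of a given size and run time. It rests on assumptions: the CPU Range column is read as a minimum and maximum CPU count, "1 week" and "1 month" are taken as 168 and 720 hours, and the int, debug and chunk queues are left out because the table gives no CPU range for them. The function name candidate_queues is hypothetical.

```python
# (queue, time limit in hours, min CPUs, max CPUs), transcribed from the table.
# "1 week" and "1 month" are taken as 168 and 720 hours (an assumption).
QUEUES = [
    ("day", 24, 4, 1024),
    ("week", 168, 4, 256),
    ("month", 720, 4, 128),
    ("512cpu", 96, 128, 512),
    ("128cpu", 96, 32, 128),
    ("32cpu", 48, 4, 32),
]


def candidate_queues(num_cpus, hours):
    """Return the names of queues whose limits fit the requested job."""
    return [
        name
        for name, limit_hours, min_cpus, max_cpus in QUEUES
        if min_cpus <= num_cpus <= max_cpus and hours <= limit_hours
    ]


print(candidate_queues(64, 72))   # 64 CPUs for 3 days
print(candidate_queues(512, 12))  # 512 CPUs for 12 hours
```

The bqueues command on Topsail remains the authoritative source for the current limits; the table values here are only a snapshot.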
30
Submission Scripts
  • Easier to write submission script that can be
    edited for each job submission.
  • Example script file: run.hpl
  • #BSUB -n "<number CPUs>"
  • #BSUB -e err.%J
  • #BSUB -o out.%J
  • #BSUB -a mvapich
  • mpirun ./mycode
  • Submit with: bsub < run.hpl
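
A submission script like the one above can also be generated from a template, which is convenient when only the CPU count or the program changes between runs. This is a minimal sketch: the function name render_submission_script is hypothetical, and it assumes the directive lines start with #BSUB and that the output files are named with %J, which LSF replaces with the job ID.

```python
def render_submission_script(num_cpus, program, mpi_app="mvapich"):
    """Return the text of an LSF submission script for an MPI program."""
    lines = [
        f"#BSUB -n {num_cpus}",
        "#BSUB -e err.%J",   # %J is replaced by the job ID
        "#BSUB -o out.%J",
        f"#BSUB -a {mpi_app}",
        f"mpirun {program}",
    ]
    return "\n".join(lines) + "\n"


script = render_submission_script(16, "./mycode")
print(script, end="")
```

Saving the printed text to a file such as run.hpl and submitting it with bsub < run.hpl follows the workflow described on the slide.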

31
More bsub options
  • bsub -x (NO LONGER USED!)
  • Exclusive use of a node
  • Use extensively when first testing code
  • bsub -n 4 -R "span[ptile=4]"
  • Forces all 4 processors to be on same node
  • Similar to -x
  • bsub -J job_name
  • see man pages for a complete description
  • man bsub

32
Performance Test
  • Gromacs MD simulation of bulk water
  • Simulation setups
  • Case 1: -n 8 -R "span[ptile=1]"
  • Case 2: -n 8 -R "span[ptile=8]"
  • Simulation times (1ns MD)
  • Case 1: 1445 sec
  • Case 2: 1255 sec
  • Using 1 node improved speed by only 13%
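
The 13% figure follows directly from the two timings. The short calculation below reproduces it; the variable names are illustrative, and the comments reflect the usual LSF meaning of ptile (processes placed per host).

```python
case1_seconds = 1445  # ptile=1: one process per node, spread over 8 nodes
case2_seconds = 1255  # ptile=8: all 8 processes on a single node

# Fraction of run time saved by packing the job onto one node
time_saved_pct = (case1_seconds - case2_seconds) / case1_seconds * 100

# Equivalent speedup factor
speedup = case1_seconds / case2_seconds

print(f"run time reduced by {time_saved_pct:.1f}%")
print(f"speedup factor {speedup:.2f}x")
```

As with the 4X estimate for the upgrade, this result is specific to the tested Gromacs workload and may differ for other applications.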

33
Following Job After Submission
  • bjobs
  • bjobs -l JobID
  • Shows current status of job
  • bhist
  • bhist -l JobID
  • More detailed information regarding job history
  • bkill
  • bkill -r JobID
  • Ends job prematurely

34
Submit Test MPI Job
  • Submit the test MPI program on Topsail
  • bsub -q week -n 4 -o out.%J -e err.%J -a mvapich mpirun ./cpi
  • Follow submission: bjobs
  • Output stored in out.%J file

35
Pre-Compiled Programs on Topsail
  • Some applications are precompiled for all users
  • /ifs1/apps
  • Amber, Gaussian, Gromacs, NetCDF, NWChem, R
  • Add module to path using module commands
  • module avail: shows available applications
  • module add: add specific application
  • Once module command is used, executable is added
    to the full path

36
Test Gaussian Job on Topsail
  • Add Gaussian Application to path
  • module add apps/gaussian-03e01
  • module list
  • Copy input com file
  • cp /ifs1/scr/cdavis/Topsail/water.com .
  • Check that executable has been added to path
  • echo $PATH
  • Submit job
  • bsub -q week -n 4 -e err.%J -o out.%J g03 water.com

37
Common Error 1
  • If job immediately dies, check err.%J file
  • err.%J file has error
  • Can't read MPIRUN_HOST
  • Problem: MPI environment settings were not
    correctly applied on compute node
  • Solution: Include mpirun in bsub command

38
Common Error 2
  • Job immediately dies after submission
  • err.%J file is blank
  • Problem: ssh passwords and keys were not
    correctly set up at initial login to Topsail
  • Solution:
  • cd ~/.ssh/
  • mv id_rsa id_rsa-orig
  • mv id_rsa.pub id_rsa.pub-orig
  • Log out of Topsail
  • Log in to Topsail and accept all defaults

39
Interactive Jobs
  • To run long shell scripts on Topsail, use int
    queue
  • bsub -q int -Ip /bin/bash
  • This bsub command provides a prompt on compute
    node
  • Can run program or shell script interactively
    from compute node
  • Totalview debugger can also be run interactively
    from Topsail

40
Further Help with Topsail
  • More details about using Topsail can be found on
    the Getting Started on Topsail help document
  • http://help.unc.edu/?id=6214
  • http://keel.isis.unc.edu/wordpress/ - ON CAMPUS
  • For assistance with Topsail, please contact the
    ITS Research Computing group
  • Email: research@unc.edu
  • For immediate assistance, see manual pages on
    Topsail
  • man <command>