Title: Computing Workshop for Users of NCAR's SCD machines

Transcript and Presenter's Notes
1
Computing Workshop for Users of NCAR's SCD
machines
  • Christiane Jablonowski (cjablono@ucar.edu) NCAR
    ASP/SCD
  • 31 January 2006

ML: Mesa Lab, Chapman Room; video conference
facilities: FL EOL Atrium and CG1 3150
2
Overview
  • Current machine architectures at NCAR (SCD)
  • Some basics on parallel computing
  • Batch queuing systems at NCAR
  • GAU resources & how to obtain a GAU account
  • Insights into GAU charges
  • The Mass Storage System
  • How to monitor the GAUs
  • Some practical tips on benchmarks, debugging
    tools, restarts
  • ???

3
Computer architectures
  • SCD's machines are UNIX-based parallel computing
    architectures
  • Two types:
  • Hybrid (shared and distributed memory) machines
    like bluesky (IBM Power4), bluevista (IBM
    Power5), lightning (AMD Opteron system running
    Linux)
  • Shared memory system like tempest (SGI, 128
    CPUs), predominantly used for post-processing jobs

4
Parallel Programming
  • Parallel machines require parallel programming
    techniques in the user application
  • MPI (Message Passing Interface) for distributed
    memory systems; can also be used on shared memory
    systems
  • OpenMP for shared memory systems
  • Hybrid (MPI + OpenMP) programming technique,
    common on the IBMs at NCAR
  • Pure MPI parallelization is often the fastest
    option; the computational domain is split into pieces
    that can communicate over the network (via
    messages)
  • OpenMP: parallelization of (mostly) loops via
    compiler directives
  • Parallelization is provided in CAM/CCSM/WRF

5
Most common Hybrid hardware architectures
  • Combined shared and distributed memory
    architecture
  • Shared-memory symmetric multiprocessor (SMP)
    nodes; processors on a node have direct access to
    memory
  • Nodes are connected via the network (distributed
    memory)

6
MPI example
Processors communicate via messages
7
MPI Example
  • Initialize & finalize MPI in your program via
    function/subroutine calls to the MPI library.
    Examples include MPI_Init, MPI_Comm_rank,
    MPI_Comm_size, MPI_Finalize

Example from previous page in C
notation (unoptimized)
Important to note: such an operation (computing a
global sum) is very common, therefore MPI
provides a highly optimized function, a so-called
reduction operation, MPI_Reduce(), that can
replace the example above
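The C code itself did not survive the transcript. A minimal sketch of the unoptimized global sum the slide describes, assuming each rank contributes one partial value (variable names are illustrative, not the original slide's):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double local = (double)(rank + 1);   /* each rank's partial sum */
    double global = local;

    if (rank == 0) {
        /* unoptimized: rank 0 receives one message per other rank */
        for (int src = 1; src < size; src++) {
            double tmp;
            MPI_Recv(&tmp, 1, MPI_DOUBLE, src, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            global += tmp;
        }
        printf("global sum = %f\n", global);
    } else {
        MPI_Send(&local, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    /* the optimized one-liner the slide recommends instead:
       MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
                  MPI_COMM_WORLD); */

    MPI_Finalize();
    return 0;
}
```

Run with e.g. mpirun -np 4 after compiling with mpicc; requires an MPI installation.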
8
Example domain decompositions for MPI
Each color represents a processor
9
OpenMP Example
  • Parallel loops via compiler directives (here in
    Fortran notation)
  • Before the program is called, set:
  • setenv OMP_NUM_THREADS proc
  • Add compiler directives in your code:
  • !$OMP PARALLEL DO
  • DO i = 1, n
  •   a(i) = b(i) + c(i)
  • END DO
  • !$OMP END PARALLEL DO

master thread
team
master thread
Assume n=1000, proc=4: the loop will be split
into 4 threads that run in parallel with loop
indices 1-250, 251-500, 501-750, 751-1000
10
SCD's machines
  • Bluesky (web page)
  • Oldest machine at NCAR (2002)
  • Lots of user experience at NCAR, easy access to
    help
  • CAM/CCSM/WRF are set up for this architecture
    (Makefiles)
  • Batch queuing system: LoadLeveler; short
    interactive runs possible
  • Batch queues are listed under
    http://www.cisl.ucar.edu/computers/bluesky/queue.charge.html
  • Lots of additional software available, e.g. math
    libraries, graphics packages, Totalview debugger

11
SCD's machines
  • Bluevista (web page)
  • Newest machine on the floor (Jan. 2006)
  • CAM/CCSM/WRF are (probably) set up for this
    architecture
  • Batch queuing system: LSF (Load Sharing Facility)
  • Queue names different from bluesky: premium,
    regular, economy, standby, debug, share
    http://www.cisl.ucar.edu/computers/bluevista/queue.charge.html
  • Some additional software available, e.g. math
    libraries, Totalview debugger

12
SCD's machines
  • Lightning (web page)
  • Linux cluster
  • Compilers different from the IBMs: Portland Group
    or Pathscale
  • Batch queuing system: LSF
  • Same queue names as bluevista
  • Some support software
  • Tempest (web page)
  • For data post-processing, with yet another batch
    queuing system: NQS
  • Lots of support software
  • Interactive use possible

13
Example of a LoadLeveler job script
Parallel job with 32 MPI processes, com_reg32
queue (32-way node)
  • # @ class            = com_reg32
  • # @ node             = 1
  • # @ tasks_per_node   = 32
  • # @ output           = out.$(jobid)
  • # @ error            = out.$(jobid)
  • # @ job_type         = parallel
  • # @ wall_clock_limit = 00:20:00
  • # @ network.MPI      = csss,not_shared,us
  • # @ node_usage       = not_shared
  • # @ account_no       = 54042108
  • # @ ja_report        = yes
  • # @ queue
  • setenv OMP_NUM_THREADS 1

regular queue
32 MPI processes per 32-way node
Submit the job via: llsubmit job_script
14
Example of a LoadLeveler job script
Hybrid parallel job with 8 MPI processes and 4
OpenMP threads
  • # @ class            = com_ec32
  • # @ node             = 1
  • # @ tasks_per_node   = 8
  • # @ output           = out.$(jobid)
  • # @ error            = out.$(jobid)
  • # @ job_type         = parallel
  • # @ wall_clock_limit = 00:20:00
  • # @ network.MPI      = csss,not_shared,us
  • # @ node_usage       = not_shared
  • # @ account_no       = 54042108
  • # @ ja_report        = yes
  • # @ queue
  • setenv OMP_NUM_THREADS 4

economy queue
8 MPI processes per 32-way node
Submit the job via: llsubmit job_script
15
Example of an LSF job script (lightning)
Parallel job with 8 MPI processes (on 4 2-way
nodes)
  • #!/bin/csh
  • #BSUB -a 'mpich_gm'
  • #BSUB -P 54042108
  • #BSUB -q regular
  • #BSUB -W 0:30
  • #BSUB -x
  • #BSUB -n 8
  • #BSUB -R "span[ptile=2]"
  • #BSUB -o fvcore_amr.out.%J
  • #BSUB -e fvcore_amr.err.%J
  • #BSUB -J test0.path
  • mpirun.lsf -v ./dycore

select mpich_gm on lightning
regular queue
wallclock limit 30 min
8 MPI processes (total)
2 MPI processes per node
name of the job (listed in the SCD Portal)
Submit the job via: bsub < job_script
16
Example of an LSF job script (bluevista)
Parallel job with 8 MPI processes (on 1 8-way
node)
  • #!/bin/csh
  • #BSUB -a poe
  • #BSUB -P 54042108
  • #BSUB -q economy
  • #BSUB -W 0:30
  • #BSUB -x
  • #BSUB -n 8
  • #BSUB -R "span[ptile=8]"
  • #BSUB -o fvcore_amr.out.%J
  • #BSUB -e fvcore_amr.err.%J
  • #BSUB -J test0.path
  • mpirun.lsf -v ./dycore

select poe on bluevista
economy queue
exclusive use (not shared)
Allows up to 8 MPI processes on a node
Submit the job via: bsub < job_script
17
More information on SCD's machines
  • Web page: SCD's Support and Consulting services
  • SCD's customer support: sometimes you even get
    help on the weekends or in the evenings
  • Email: consult1@ucar.edu
  • Phone: 303-497-1278
  • Walk-in support at the Mesa Lab
  • Check out SCD's Daily Bulletin (scheduled machine
    downtimes, etc.)
  • Subscribe to the hpcstatus mailing list (short
    e-mails about machine status, system updates)

18
GAU resources
  • ASP has a monthly allocation of 3850 GAUs
    (General Accounting Units)
  • A GAU is a measure of compute time on the
    supercomputers maintained by NCAR's Scientific
    Computing Division (SCD): http://www.cisl.ucar.edu/
  • Access to these machines requires:
  • an SCD login account (dbs@ucar.edu or
    303-497-1225)
  • a GAU account (for ASP contact Maura, otherwise
    contact your division / apply for a university
    account)
  • an ssh environment
  • and a crypto card (for secure access)
  • SCD contacts: Dick Valent & Mike Page (here
    today), Juli Rew, Siddhartha Gosh, Ginger
    Caldwell (GAUs)

19
GAU resources
  • GAUs: a "use it or lose it" strategy
  • In ASP we share the resource among the ASP
    postdocs & graduate fellows
  • Distribution is flexible and will be discussed
    occasionally, e.g. monthly, either via meetings
    or e-mail discussions: email asp-gau-users@asp.ucar.edu
  • GAUs are also charged for
  • storing files in the Mass Storage System (MSS)
  • file transfers from MSS to other machines

20
ASP GAU account
  • ASP GAU account number: 54042108 (also the
    project number)
  • Needs to be specified in the batch job scripts
  • The ASP account number is not your default account
    number
  • Therefore everybody needs a second (default) GAU
    account:
  • divisional GAU account
  • so-called University account (small request form
    for 1500 GAUs: http://www.cisl.ucar.edu/resources/compServ.html);
    these GAUs do not expire every month, one-time
    allocation
  • The second GAU account should be used for the
    accumulating MSS charges
  • automatic when using CAM / CCSM's MSS option

21
GAU charges on SCD's supercomputers
  • You are charged GAUs for how much time you use a
    processor (on bluesky, bluevista, lightning,
    tempest)
  • On bluesky, there are actually two formulas:
  • Shared-node usage: GAUs charged = CPU hours used
    × computer factor × class charging factor
  • Dedicated-node usage: GAUs charged = wallclock
    hours used × number of nodes used ×
    number of processors in that node ×
    computer factor × class charging factor

Slides on GAU charges modified from an earlier
presentation by George Bryan, NCAR MMM
22
"Number of nodes used" and "Number of processors
in that node"
  • Self explanatory (?)
  • Bluesky
  • 76 8-way (processors) nodes
  • 25 32-way (processors) nodes
  • Bluevista
  • 78 8-way (processors) nodes
  • Lightning
  • 128 2-way (processors) nodes

23
"CPU hours used" and "Wallclock hours used"
  • A measure of how long you used a processor
  • NOTE: This includes all time you were allocated
    the use of a processor, whether you actually used
    it or not
  • Example: you used two 8-processor nodes on
    bluesky. The job started at 1:00 PM and finished
    at 2:30 PM. You are charged for 1.5 hrs

24
Computer factor
  • A measure of how powerful a computer is
  • Bluesky: 0.24
  • Bluevista: 0.5
  • Lightning: 0.34
  • This levels the playing field

25
Class charging factor
  • Tied to the queuing system: how quickly do you want
    your results, and how much are you willing to pay
    for it?
  • Current settings on all SCD supercomputers:
  • Premium: 1.5 (highest priority, fastest
    turnaround)
  • Regular: 1.0
  • Economy: 0.5
  • Standby: 0.1 (lowest priority, slow turnaround)

26
Example
  • Recall dedicated-node usage on bluesky:
  • GAUs charged = wallclock hours used × number of
    nodes used × number of processors in that node ×
    computer factor × class charging factor
  • 1.5 hours using two 8-processor nodes
  • Bluesky, regular queue
  • GAUs used = 1.5 × 2 × 8 × 0.24 × 1.0 = 5.76
    GAUs
  • In the premium queue, this would be 8.64 GAUs
  • In the standby queue, this would be 0.576 GAUs

27
Recommendations Queuing systems
  • Check the queue before you submit any job
  • If the queue is not busy, try using the standby
    or economy queues
  • The queue tends to be emptier evenings,
    weekends, and holidays
  • Jobs will start sooner when you specify a wallclock
    limit in the job script (the scheduler tries to
    squeeze in short jobs)
  • The fewer processors you request, the sooner you
    start
  • Use the premium queue sparingly:
  • Short debug jobs (there is also a special debug
    queue on lightning)
  • When that conference paper is due

28
Recommendations: number of processors vs. run times
  • If you are using more processors, you might wait
    longer in the queue, but usually the actual
    runtime of your job is reduced
  • Caveat: it usually costs more GAUs
  • Example: you run the same job twice
  • Using 8 processors, the job ran in 24 hours
  • Using 64 processors, the job ran in 4 hours
  • The 1st example used 46 GAUs
  • The 2nd example used 61 GAUs

29
The Mass Storage System
  • MSS: Mass Storage System (disks and cartridges)
    for your big data sets
  • The MSS is connected to the SCD machines, sometimes
    also to divisional computers
  • MSS users have directories like mss/LOGIN_NAME/
  • Quick online reference (MSS commands):
    http://www.cisl.ucar.edu/docs/mss/mss-commandlist.html
  • You are charged GAUs for using the MSS
  • The GAU equation for MSS is more complicated ...

30
MSS Charges
  • GAUs charged = 0.0837 × R + 0.0012 × A + N ×
    (0.1195 × W + 0.2050 × S)
  • where
  • R = gigabytes read
  • W = gigabytes created or written
  • A = number of disk drive or tape cartridge
    accesses
  • S = data stored, in gigabyte-years
  • N = number of copies of the file: 1 if economy
    reliability selected, 2 if standard reliability
    selected

31
Recommendations The MSS
  • MSS charges seem small, but they add up!
  • Examples of FY04 MSS usage:
  • ACD: 24,000 of 60,000 GAUs
  • CGD: 94,500 of 181,000 GAUs
  • HAO: 22,000 of 122,000 GAUs
  • MMM: 34,000 of 139,000 GAUs
  • RAP: 32,000 of 35,000 GAUs

32
Recommendations The MSS
  • Recommendation for ASP users:
  • use an account in your home division or your
    so-called university account (1500 GAUs for
    postdocs, you need to apply) for MSS charges
  • leave ASP GAUs for supercomputing

33
GAU Usage Strategy 30-day and 90-day averages
  • The allocation actually works through 30-day and
    90-day averages
  • Limits: 120% for 30-day use, 105% for 90-day use
  • It is helpful to spread usage out evenly
  • How to check GAU usage:
  • Type "charges" on the command line of a
    supercomputer
  • Check the daily summary output (next page)
  • SCD Portal: look for the link on SCD's main page
    http://www.cisl.ucar.edu/

34
Web page: http://www.cisl.ucar.edu/dbsg/dbs/ASP/

ASP 30 Day Percent:  57.0      ASP 90 Day Percent:  48.3
30 Day Allocation:   3850      90 Day Allocation:   11550
30 Day Use:          2193      90 Day Use:          5575
90 DAY ST: 01-NOV-05   30 DAY ST: 31-DEC-05   LAST DAY: 29-JAN-06

ASP GAUs Used by Day:
01-NOV-05    9.36
03-NOV-05    0.03
04-NOV-05    141.45
...
22-JAN-06    0.04
23-JAN-06    44.29
24-JAN-06    170.83
25-JAN-06    120.30
26-JAN-06    91.67
27-JAN-06    41.97
28-JAN-06    15.59
29-JAN-06    16.95
35
What happens when we use too many GAUs?
  • Your jobs will be thrown into a very low
    priority: the dreaded hold queue
  • It will be hard to get work done
  • But jobs will still run
  • ASP users: you can use more than 3850 GAUs /
    month
  • Experience says it's better to use too many than
    not enough

36
What happens when we use too many/too few GAUs?
  • Too many:
  • Recommendation: when the 30- and 90-day averages
    are running high, use the economy or standby
    queue ... conserve GAUs
  • But don't worry about going over
  • Too few:
  • ASP's allocation will be cut in the long run if
    the 3850 GAUs per month allocation is not used

37
How to catch up when behind
  • Be wasteful
  • Use the premium queue
  • Use more processors than you need
  • Have fun
  • Try something you always wanted to do, but never
    had the resources

38
How to conserve GAUs
  • Be frugal
  • Use the economy and standby queues
  • Use fewer processors
  • Use divisional GAUs (if possible) or your
    university GAU account

39
How to share & monitor GAUs in ASP
  • Communicate!
  • Occasionally, we (ASP postdocs) use the e-mail
    list asp-gau-users@asp.ucar.edu
  • to announce a busy GAU period
  • Keep watching the ASP GAU usage on the webpage
    http://www.cisl.ucar.edu/dbsg/dbs/ASP/ or in the
    SCD Portal
  • Look for the SCD Portal link on the SCD
    page: http://www.cisl.ucar.edu/

40
SCD Portal
  • Online tool that helps you monitor the GAU
    charges and the current machine status (e.g.
    batch queues); the display can be customized
  • Information on the machine status requires a
    setup command on roy.scd.ucar.edu via the
    crypto-card access: just enter "scdportalkey
    hostname" (e.g. lightning) after logging on with
    the crypto-card
  • At this time (Jan/31/2006) the GAU charges on
    bluevista are not itemized; they will be included
    in the next release in Spring 2006

41
Other IBM resources
  • Sources of information on the IBM machine
    bluesky (from the command line); batchview also
    works on bluevista & lightning
  • batchview: for an overview of jobs with their
    rankings
  • llq: for a list of all submitted jobs, no ranking
  • spinfo: queue limits, memory quotas on the home
    file system and the temporary file system /ptmp
  • Useful IBM LoadLeveler keywords in the
    script: @ account_no = 54042108 -> ASP account,
    @ ja_report = yes -> job report (see
    example on the next page)
  • Useful LoadLeveler commands: llsubmit
    script_file, llcancel job_id

42
Example IBM Job Report
  • If selected, one email per job is sent to you at
    midnight. Output on the IBM machines, here
    blackforest (meanwhile decommissioned):

Job Accounting - Summary Report
Operating System:                 blackforest AIX51
User Name (ID):                   cjablono (7568)
Group Name (ID):                  ncar (100)
Account Name:                     54042108
Job Name:                         bf0913en.26921
Job Sequence Number:              bf0913en.26921
Job Starts:                       12/20/04 17:56:33
Job Ends:                         12/20/04 23:26:34
Elapsed Time (Wall-Clock x CPUs): 633632 s
Number of Nodes (not_shared):     8
Number of CPUs:                   32
Number of Steps:                  1

43
IBM Job Report (continued)
  • Charge Components:

Wall-clock Time:                     5:30:01
Wall-clock x CPU hours:              176.00889 hrs
Multiplier for com_ec Queue:         0.50
Charge before Computer Factor:       88.00444 GAUs
Multiplier for computer blackforest: 0.10
Charged against Allocation:          8.80044 GAUs
Project GAUs Allocated:              5000.00 GAUs
Project GAUs Used, as of 12/16/04:   1889.20 GAUs
Division GAUs 30-Day Average:        103.3
Division GAUs 90-Day Average:        58.6

44
How to increase efficiency
  • Get a feel for the GAUs: for long jobs, benchmark
    the application on the target machine
  • Run a short but relevant test problem and measure
    the run time (wall clock time) via MPI commands
    (function MPI_WTIME) or UNIX timing commands like
    time or timex (output formats are shell-script
    dependent)
  • Vary the number of processors to assess the scaling
  • If the application scales poorly, avoid using a
    large number of processors (a waste of GAUs);
    instead use a smaller number with numerous restarts
  • Make sure your job fits into the queue (finishes
    before the max. time is up)
  • Use compiler options, especially the optimization
    options
  • In case of programming problems the Totalview
    debugger can save you days, weeks or even
    months; on the IBMs compile your program with
    the compiler options -g -qfullpath -d

45
Restarts
  • Restart files are important for long simulations
  • Queue limits are up to 6 wallclock hours (hard
    limit, the job fails afterwards); then a restart
    becomes necessary
  • Get information on the queue limits (SCD web
    page) and select the job's integration time
    accordingly
  • Restarts are built into CAM/CCSM/WRF and must
    only be activated
  • Restarts for other user applications will
    probably need to be programmed

46
Questions ?