Title: LSF for Users
1. LSF for Users
- Mike Page 
- mpage_at_ucar.edu 
- SCD Consulting Services Group 
- SCD/HSS/CSG
2. What is LSF?
- LSF: Load Sharing Facility
- Batch management subsystem for multi-host, multi-vendor complexes
- Same role as LoadLeveler or NQE, with the capability to manage computing resources across multiple platforms
- LSF runs on the Lightning cluster
- Documentation: /usr/local/docs/LSF/6.0/.pdf
- Hardware description: http://www.scd.ucar.edu/docs/lightning/overview.html
- At a lightning command line, enter: man lsfintro
- Further reading: http://accl.grc.nasa.gov/lsf/about.html
3. To be able to access LSF
- This has been added to your login processing:
    . /usr/local/lsf/conf/profile.lsf     (sh users)
    source /usr/local/lsf/conf/cshrc.lsf  (csh users)
- These commands are executed before you receive a command prompt.
- There is no need for you to add anything to your login files in order to use LSF.
- These commands define the LSF environment: LSF_SERVERDIR, LSF_BINDIR, LSF_LIBDIR, XLSF_UIDDIR, LSF_ENVDIR, PATH, MANPATH
- Check: env | grep -i lsf
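The check on this slide can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name check_lsf_env is made up for illustration and is not an LSF command. It reports which of the LSF-specific variables listed above are unset or empty.

```shell
#!/bin/sh
# Report which of the LSF variables named on this slide are unset or empty.
# check_lsf_env is a hypothetical helper name, not part of LSF.
check_lsf_env() {
    missing=""
    for v in LSF_SERVERDIR LSF_BINDIR LSF_LIBDIR XLSF_UIDDIR LSF_ENVDIR; do
        # Indirect lookup: copy the value of the variable named in $v into val.
        eval "val=\${$v:-}"
        if [ -z "$val" ]; then
            missing="$missing $v"
        fi
    done
    if [ -n "$missing" ]; then
        echo "not set:$missing"
        return 1
    fi
    echo "LSF environment looks complete"
}
```

Calling check_lsf_env at a prompt returns a non-zero status when any variable is missing, so it can guard later steps in a script.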
4. Essential Commands for Users
- bhosts 
- bqueues 
- bsub 
- bjobs 
- bhist 
- bpeek 
- bmod 
- bbot/btop 
- bswitch 
- bstop/bresume 
- bkill
5. Essential Commands: Purpose
- bhosts - information about available hosts 
 (lshosts)
- bqueues - information about available queues 
- bsub - submit jobs to batch subsystem 
- bjobs - list jobs in the batch subsystem 
- bhist - displays historical information about a user's jobs
- bpeek - displays stdout and stderr of a user's unfinished job
- bmod - modifies job submission options for a user's job
6. Essential Commands: Purpose (cont'd)
- bbot/btop - moves a pending job relative to the user's last/first job in a queue
- bswitch - switches a user's unfinished jobs from one queue to another
- bstop/bresume - suspends/resumes a user's unfinished jobs
- bkill - kills, suspends, or resumes a user's jobs
7. Essential Commands: bhosts
- bhosts [-w | -l] [-R "res_req"] [host_name | host_group]
- Displays information about hosts/platforms
- lshosts [-w | -l] [-R "res_req"] [host_name | cluster_name]
- lshosts -s [shared_resource_name ...]
- Displays hosts and their static resource information
- Example output of bhosts, run on ln0126en:
    HOST_NAME   STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
    ln0126en    ok      -     2    0      0    0      0      0
    ln0127en    ok      -     2    0      0    0      0      0
    ln0128en    ok      -     2    0      0    0      0      0
    ...
    ln0440en    ok      -     2    0      0    0      0      0
    ln0441en    ok      -     2    0      0    0      0      0
    ln0442en    ok      -     2    0      0    0      0      0
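The bhosts columns lend themselves to simple filtering. As a rough sketch (the helper name free_slots is made up, and the column positions are assumed to match the listing above), the following prints each host whose STATUS is ok together with its free job slots, computed as MAX minus NJOBS.

```shell
#!/bin/sh
# List hosts whose STATUS is "ok" with their free job slots (MAX - NJOBS).
# free_slots is a hypothetical helper name. It reads bhosts output on stdin
# and skips hosts that show "-" (no limit) in the MAX column.
free_slots() {
    awk 'NR > 1 && $2 == "ok" && $4 != "-" { print $1, $4 - $5 }'
}

# Sample input taken from the listing above. On a system with LSF
# installed, use instead:  bhosts | free_slots
free_slots <<'EOF'
HOST_NAME   STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
ln0126en    ok      -     2    0      0    0      0      0
ln0127en    ok      -     2    0      0    0      0      0
EOF
```

With the sample input, both hosts are reported with 2 free slots.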
8. Essential Commands: bqueues
- bqueues [-w | -l | -r] [-m host_name | -m all] [-u user_name | -u all] [queue_name ...]
- Displays information about queues.
- By default, returns the following information about all queues: queue name, queue priority, queue status, job slot statistics, and job state statistics.
- Example output of bqueues, run on ln0126en:
    QUEUE_NAME  PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN  SUSP
    special     500   Open:Active  -    -     -     -     0      0     0    0
    premium     300   Open:Active  -    -     -     -     0      0     0    0
    regular     200   Open:Active  -    -     -     -     0      0     0    0
    economy     160   Open:Active  -    -     -     -     0      0     0    0
    hold        104   Open:Active  -    -     -     -     0      0     0    0
    standby     100   Open:Active  -    -     -     -     0      0     0    0
    share       100   Open:Active  -    -     -     -     0      0     0    0
9. Essential Commands: bsub
- bsub [options] command [cmd_args]
- Submits a job for batch execution
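When a job is accepted, bsub prints a confirmation line of the form "Job <1234> is submitted to queue <regular>." Scripts often need the job ID from that line for later bjobs, bpeek, or bkill calls. The sketch below shows one way to extract it. The helper name job_id_from_bsub is made up, and it assumes the confirmation arrives on standard output.

```shell
#!/bin/sh
# Extract the numeric job ID from the confirmation line that bsub prints,
# e.g. "Job <1234> is submitted to queue <regular>."
# job_id_from_bsub is a hypothetical helper name, not an LSF command.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Demonstration with a canned confirmation line (the job ID 1234 is made up).
echo 'Job <1234> is submitted to queue <regular>.' | job_id_from_bsub
```

On a system with LSF, the same helper could be used as: jobid=$(bsub -q regular a.out | job_id_from_bsub)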
12. The Importance of Being <
- LSF usage is different from LL/NQS
- bsub a.out
- bsub -n 2 a.out
- bsub myscript
- bsub -q queuename a.out
- bsub -i infile -o outfile -e errfile a.out
- bsub < myscript
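The redirection in the last form is what makes embedded options work: when the script arrives on bsub's standard input, its #BSUB lines are read as submission options, whereas a script named as the command is simply run and its options must be given on the bsub command line. The sketch below writes a small job script and submits it only if bsub is found. The file name hello.lsf and the job name hello are invented for illustration; the queue name regular comes from the queue listing earlier in this deck.

```shell
#!/bin/sh
# Write a tiny job script whose options live in #BSUB lines.
# The file name hello.lsf and the job name hello are made up for this sketch.
cat > hello.lsf <<'EOF'
#!/bin/ksh
#BSUB -n 1
#BSUB -J hello
#BSUB -o hello.out
#BSUB -e hello.err
#BSUB -q regular
echo "hello from $(hostname)"
EOF

# Submit only where LSF is installed. The "<" matters: the script is fed to
# bsub on standard input, so the #BSUB lines are read as options.
if command -v bsub >/dev/null 2>&1; then
    bsub < hello.lsf
else
    echo "bsub not found; would run: bsub < hello.lsf"
fi
```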
13. Sample LSF script: Serial Job
    #!/bin/ksh
    #
    # LSF batch script to run a serial code
    #
    #BSUB -P 93300070                  # Project 93300070
    #BSUB -n 1                         # number of tasks
    #BSUB -J seriallsf.test            # job name
    #BSUB -o seriallsf.out             # output filename
    #BSUB -e seriallsf.err             # error filename
    #BSUB -q regular                   # queue

    # Fortran example
    pgf90 -o samp_f -Mextend samp.f
    ./samp_f

    # C example
    pgcc -o samp_c samp.c
    ./samp_c

    # C++ example
    pgCC --no_auto_instantiation -o samp_cc samp.cc
    ./samp_cc

Submit with: bsub < serial.lsf
14. Sample LSF script: MPI Job
    #!/bin/ksh
    #
    # LSF batch script to run the test MPI code
    #
    #BSUB -P 93300070                  # Project 93300070
    #BSUB -a mpich_gm                  # select the mpich-gm elim
    #BSUB -x                           # exclusive use of node (not_shared)
    #BSUB -n 2                         # number of total tasks
    #BSUB -R "span[ptile=1]"           # run 1 task per node
    #BSUB -J mpilsf.test               # job name
    #BSUB -o mpilsf.out                # output filename
    #BSUB -e mpilsf.err                # error filename
    #BSUB -q regular                   # queue

    # Fortran example
    mpif90 -o mpi_samp_f mpisamp.f
    mpirun.lsf ./mpi_samp_f

    # C example
    mpicc -o mpi_samp_c mpisamp.c
    mpirun.lsf ./mpi_samp_c

    # C++ example
    mpicxx -o mpi_samp_cc mpisamp.cc
    mpirun.lsf ./mpi_samp_cc

Submit with: bsub < mpi.lsf
15. Sample LSF script: OpenMP Job
    #!/bin/ksh
    #
    # LSF script to run the test OMP codes
    #
    #BSUB -P 93300070                  # Proposal group 2 - Project 93300070
    #BSUB -a mpich_gm                  # select the mpich-gm elim
    #BSUB -x                           # exclusive use of node
    #BSUB -n 2                         # number of tasks
    #BSUB -R "span[hosts=1]"           # jobs run on one host
    #BSUB -J omplsf.test               # job name
    #BSUB -o omplsf.out                # output filename
    #BSUB -e omplsf.err                # error filename
    #BSUB -q regular                   # queue

    # Fortran example
    pgf90 -o samp_f -Mextend -mp samp.f
    export OMP_NUM_THREADS=1
    ./samp_f
    export OMP_NUM_THREADS=2
    ./samp_f

    # C example
    pgcc -mp -o samp_c samp.c
    export OMP_NUM_THREADS=1
    ./samp_c
    export OMP_NUM_THREADS=2
    ./samp_c

    # C++ example
    pgCC --no_auto_instantiation -mp -o samp_cc samp.cc
    export OMP_NUM_THREADS=1
    ./samp_cc
    export OMP_NUM_THREADS=2
    ./samp_cc

Submit with: bsub < omp.lsf
16. Sample LSF script: MPMD Job
    #!/bin/ksh
    #
    # LSF batch script to run the test MPMD codes
    #
    #BSUB -P 93300070                  # Project 93300070
    #BSUB -a mpich_gm
    #BSUB -n 2
    #BSUB -x
    #BSUB -R "span[ptile=1]"
    #BSUB -o mpmdlsf.out               # output filename
    #BSUB -e mpmdlsf.err               # error filename
    #BSUB -J mpmdlsf.test              # job name
    #BSUB -q regular                   # queue

    # Build pgfile for mpmd run
    rm -f pgfile
    touch pgfile

    EXE=../bin/itmpmd

    j=0
    for h in `echo $LSB_HOSTS`
    do
      echo $h" "$j" "$EXE$j >> pgfile
      j=`expr $j + 1`
    done
    cat pgfile

    # Fortran example
    mpif90 -Mextend -o $EXE'0' ../src/mpmd/itmpmd.f
    mpif90 -Mextend -o $EXE'1' ../src/mpmd/itmpmd.f
    mpirun -pg pgfile /bin/pwd

    # C example
    mpicc -o $EXE'0' ../src/mpmd/itmpmd.c
    mpicc -o $EXE'1' ../src/mpmd/itmpmd.c
    mpirun -pg pgfile /bin/pwd

    # C++ example
    mpicxx --no_auto_instantiation -o $EXE'0' ../src/mpmd/itmpmd.cc
    mpicxx --no_auto_instantiation -o $EXE'1' ../src/mpmd/itmpmd.cc
    mpirun -pg pgfile /bin/pwd

    rm $EXE'0' $EXE'1' pgfile

Submit with: bsub < mpmd.lsf
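The pgfile loop in the script above writes one line per host in LSB_HOSTS: the host name, a counter, and the executable path with the counter appended. The sketch below reproduces that loop outside of a batch job so the resulting file can be inspected. The host names used as a fallback are stand-ins taken from the bhosts listing; inside a real job, LSF supplies LSB_HOSTS. The function name build_pgfile is made up for illustration.

```shell
#!/bin/sh
# Sketch of the pgfile loop: one line per host in LSB_HOSTS, of the form
#   <host> <index> <executable><index>
# The fallback host names are stand-ins; inside a real job LSF sets LSB_HOSTS.
LSB_HOSTS=${LSB_HOSTS:-"ln0126en ln0127en"}
EXE=../bin/itmpmd

# build_pgfile is a hypothetical helper name.
build_pgfile() {
    j=0
    for h in $LSB_HOSTS; do
        echo "$h $j $EXE$j"
        j=$((j + 1))
    done
}

build_pgfile > pgfile
cat pgfile
```

With the fallback host list, pgfile holds two lines, one ending in itmpmd0 and one ending in itmpmd1, matching the two executables the script builds.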
17. Sample LSF script: Hybrid Job
    #!/bin/ksh
    #
    # LSF batch script to run the test mixed MPI/OMP codes
    #
    #BSUB -a mpich_gm                  # select mpich_gm elim
    #BSUB -x                           # exclusive use of node
    #BSUB -n 2                         # sum of number of tasks
    #BSUB -R "span[ptile=1]"           # number of processes per node
    #BSUB -o mixlsf.out                # output filename
    #BSUB -e mixlsf.err                # error filename
    #BSUB -J mixlsf.test               # job name
    #BSUB -q regular                   # queue

    # Build pgfile for mix run
    rm -f pgfile
    touch pgfile

    EXE=$PWD/mix

    echo $LSB_HOSTS
    j=0
    for h in `echo $LSB_HOSTS`
    do
      echo $h" "$j" "$EXE >> pgfile
      j=`expr $j + 1`
    done

    # Fortran example
    mpif90 -Mextend -mp -lmp -o mix mix.f
    export OMP_NUM_THREADS=1
    mpirun-env.pl -pg pgfile $EXE
    export OMP_NUM_THREADS=2
    mpirun-env.pl -pg pgfile $EXE

    # C example
    mpicc -mp -o mix mix.c
    export OMP_NUM_THREADS=1
    mpirun-env.pl -pg pgfile $EXE
    export OMP_NUM_THREADS=2
    mpirun-env.pl -pg pgfile $EXE

    # C++ example
    mpicxx --no_auto_instantiation -mp -o mix mix.cc
    export OMP_NUM_THREADS=1
    mpirun-env.pl -pg pgfile $EXE
    export OMP_NUM_THREADS=2
    mpirun-env.pl -pg pgfile $EXE

    rm pgfile

Submit with: bsub < mix.lsf
18. Essential Commands: bjobs
- bjobs - Displays information about LSF jobs 
- bjobs -u user_name 
- bjobs -u all 
- bjobs -l 
- bjobs -r 
- bjobs -s 
- bjobs -q queue_name 
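For a quick overview of many jobs, the bjobs listing can be summarized by state. The sketch below counts jobs per state, assuming the default bjobs layout in which STAT is the third column (JOBID, USER, STAT, QUEUE, ...). The helper name count_states is made up, and the sample rows are invented purely to exercise it.

```shell
#!/bin/sh
# Count jobs per state from bjobs output read on stdin.
# Assumes the default bjobs layout, where STAT is the third column.
# count_states is a hypothetical helper name, not an LSF command.
count_states() {
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Invented sample rows; on a system with LSF use:  bjobs -u all | count_states
count_states <<'EOF'
JOBID USER  STAT QUEUE   FROM_HOST EXEC_HOST JOB_NAME       SUBMIT_TIME
101   userA RUN  regular ln0126en  ln0127en  mpilsf.test    Jan  1 10:00
102   userA RUN  regular ln0126en  ln0128en  omplsf.test    Jan  1 10:05
103   userB PEND economy ln0126en            seriallsf.test Jan  1 10:07
EOF
```

For the sample rows this prints one line per state: PEND 1 and RUN 2.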
19. Essential Commands: bhist
- bhist - displays historical information about jobs
- bhist -J job_name
- bhist -C start_time,end_time
- bhist -D start_time,end_time
- bhist -S start_time,end_time
- bhist -T start_time,end_time
20. Essential Commands: bpeek
- bpeek - displays stdout and stderr of a user's selected, unfinished job
- bpeek -f uses tail -f to display output instead of cat
- bpeek [-f] [-q queue_name | -m host_name | -J job_name | job_ID | "job_ID[index_list]"]
21Essential Commands bmod
 bmod - modifies job submission options of a 
job bmod bsub options job_ID  
"job_IDindex" bmod -g job_group_name  -gn 
job_ID bmod -sla service_class_name  -slan 
job_ID bmod -h  -V 
22. Essential Commands: bbot, btop
- bbot - moves a pending job relative to the last job in the queue
- bbot job_ID | "job_ID[index_list]" [position]
- bbot [-h | -V]
- btop - moves a pending job relative to the first job in the queue
- btop job_ID | "job_ID[index_list]" [position]
- btop [-h | -V]
23. Essential Commands: bswitch
- bswitch - switches unfinished jobs from one queue to another
- bswitch [-J job_name] [-m host_name | -m host_group] [-q queue_name] [-u user_name | -u user_group | -u all] destination_queue [0]
- bswitch destination_queue [job_ID | "job_ID[index_list]"] ...
- bswitch [-h | -V]
24. Essential Commands: bstop/bresume
- bstop - suspends unfinished jobs
- bstop [-a] [-d] [-g job_group_name | -sla service_class_name] [-J job_name] [-m host_name | -m host_group] [-q queue_name] [-u user_name | -u user_group | -u all] [0] [job_ID | "job_ID[index]"] ...
- bstop [-h | -V]
- bresume - resumes one or more suspended jobs
- bresume [-g job_group_name] [-J job_name] [-m host_name] [-q queue_name] [-u user_name | -u user_group | -u all] [0]
- bresume [job_ID | "job_ID[index_list]"] ...
- bresume [-h | -V]
25. Essential Commands: bkill
- bkill - sends signals to kill, suspend, or resume unfinished jobs
- bkill [-l] [-g job_group_name | -sla service_class_name] [-J job_name] [-m host_name | -m host_group] [-q queue_name] [-r | -s (signal_value | signal_name)] [-u user_name | -u user_group | -u all] [job_ID ... | 0 | "job_ID[index]" ...]
- bkill [-h | -V]
26. Questions? Comments?