Transcript and Presenter's Notes

Title: STG WW Blue Gene


1
STG WW Blue Gene HPC Benchmark Centers Tutorial
Introduction to the Blue Gene Facility in Rochester, Minnesota
Carlos P Sosa
Chemistry and Life Sciences Group, Advanced Systems Software Development
Rochester, MN
2
Rochester Blue Gene Center Team
  • Cindy Mestad, Certified PMP, STG WW Blue Gene
    HPC Benchmark Centers
  • Steve M Westerbeck, System Administrator, STG WW
    Blue Gene HPC Benchmark Centers

3
Chemistry and Life Sciences Applications Team
  • Carlos P Sosa, Chemistry and Life Sciences
    Applications, Advanced Systems Software
    Development

4
Preface
  • This tutorial provides a brief introduction to
    the environment of the IBM Blue Gene facility in
    Rochester, Minnesota
  • Customers should be mindful of their own security
    requirements; the following points should be
    considered
  • Sharing of userids is not an accepted practice,
    in order to maintain proper authentication controls
  • Additional encryption of data and source code on
    the filesystem is encouraged
  • Housekeeping procedures on your assigned front-end
    node and filesystem are recommended
  • Report any security breaches or concerns to the
    Rochester Blue Gene System Administration
  • Changing permissions on user-generated files for
    resource sharing is the responsibility of the
    individual user
  • Filesystem cleanup at the end of the engagement
    is the responsibility of the customer

5
1. Blue Gene Hardware Overview
6
Blue Gene System Modularity
[Diagram: How is BG/P configured? Service and front-end (login) nodes on a 1 GbE service network; file servers and storage subsystem on a 10 GbE functional network; software stack: SLES10, DB2, XLF, XLC/C++, GPFS, ESSL, TWS, LoadLeveler (LL)]
7
Hierarchy (looking inside Blue Gene)
Compute nodes are dedicated to running user
applications, and almost nothing else; they run a
simple compute node kernel (CNK).
I/O nodes run Linux and provide a more complete
range of OS services: files, sockets, process
launch, debugging, and termination.
The service node performs system management services
(e.g., heartbeating, monitoring errors) and is
largely transparent to application/system software.
8
Blue Gene Environment
9
IBM System Blue Gene/P
[Chip diagram: System-on-Chip (SoC) with quad PowerPC 450 cores and double FPU, memory controller with ECC, L2/L3 cache, DMA, PMU, torus network, collective network, global barrier network, 10 GbE, control network, JTAG monitor]
10
BG/P Application-Specific Integrated Circuit (ASIC)
Diagram
11
Blue Gene/P Job Modes Allow Flexible Use of Node
Memory
What's new?
  • Virtual Node (VN) mode: previously called Virtual Node Mode. All four cores run one MPI process each; no threading. Memory per MPI process: ¼ of node memory. MPI programming model.
  • Dual (DUAL) mode: two cores run one MPI process each; each process may spawn one thread on the core not used by the other process. Memory per MPI process: ½ of node memory. Hybrid MPI/OpenMP programming model.
  • SMP mode: one core runs one MPI process; the process may spawn threads on each of the other cores. Memory per MPI process: full node memory. Hybrid MPI/OpenMP programming model.
  (A mode-selection sketch follows below; the mpirun -mode option is covered later in this tutorial.)
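A minimal sketch of requesting each mode at launch, assuming the mpirun -mode flag described later in this deck and a 32-node partition; the executable name and process counts are illustrative only:

  # Virtual Node mode: 4 MPI processes per node, 1/4 node memory each
  mpirun -np 128 -partition R000 -mode VN   -cwd `pwd` -exe ./myapp
  # Dual mode: 2 MPI processes per node, each may spawn one extra thread
  mpirun -np 64  -partition R000 -mode DUAL -cwd `pwd` -exe ./myapp
  # SMP mode: 1 MPI process per node, threads on the remaining cores
  mpirun -np 32  -partition R000 -mode SMP  -cwd `pwd` -exe ./myapp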
12
Blue Gene Integrated Networks
  • Torus
  • Interconnect to all compute nodes
  • Used for point-to-point communication
  • Collective
  • Interconnects compute and I/O nodes
  • One-to-all broadcast functionality
  • Reduction operations functionality
  • Barrier
  • Compute and I/O nodes
  • Low latency barrier across the system (< 1 µs for a
    72-rack system)
  • Used to synchronize timebases
  • 10Gb Functional Ethernet
  • I/O nodes only
  • 1Gb Private Control Ethernet
  • Provides JTAG, I2C, etc., access to hardware.
    Accessible only from the Service Node system
  • Boot, monitoring, and diagnostics
  • Clock network
  • Single clock source for all racks

13
HPC Software Tools for Blue Gene
  • Other Software Support
  • Parallel File Systems
  • Lustre at LLNL, PVFS2 at ANL
  • Job Schedulers
  • SLURM at LLNL, Cobalt at ANL
  • Altair PBS Pro, Platform LSF (for BG/L only)
  • Condor HTC (porting for BG/P)
  • Parallel Debugger
  • Etnus TotalView (for BG/L as of now, porting for
    BG/P)
  • Allinea DDT and OPT (porting for BG/P)
  • Libraries
  • FFT Library - Tuned functions by TU-Vienna
  • VNI (porting for BG/P)
  • Performance Tools
  • HPC Toolkit MP_Profiler, Xprofiler, HPM,
    PeekPerf, PAPI
  • Tau, Paraver, Kojak
  • IBM Software Stack
  • XL (Fortran, C, and C++) compilers
  • Externals preserved
  • Optimized for specific BG functions
  • OpenMP support
  • LoadLeveler scheduler
  • Same externals for job submission and system
    query functions
  • Backfill scheduling to achieve maximum system
    utilization
  • GPFS parallel file system
  • Provides high performance file access, as in
    current pSeries and xSeries clusters
  • Runs on I/O nodes and disk servers
  • ESSL/MASSV libraries
  • Optimization library and intrinsics for better
    application performance
  • Serial Static Library supporting 32-bit
    applications
  • Callable from Fortran, C, and C++
  • MPI library
  • Message passing interface library, based on
    MPICH2, tuned for the Blue Gene architecture

14
High-Throughput Computing (HTC) modes on Blue
Gene/P
  • BG/P with HTC looks like a cluster for serial and
    parallel apps
  • Hybrid environment: standard HPC (MPI) apps plus,
    now, HTC apps
  • Enables a new class of workloads that use many
    single-node jobs
  • Easy administration using web-based Navigator

HTC
15
2. IBM Rochester Center Overview
16
Rochester Blue Gene Infrastructure
17
Shared GPFS Filesystem
18
Understanding Performance on Blue Gene/P
  • Theoretical floating-point performance
  • 1 fpmadd per cycle (on the double FPU)
  • Total of 4 floating-point operations per cycle
  • 4 floating-point operations/cycle x 850 x 10^6 cycles/s
    = 3,400 x 10^6 flop/s = 3.4 GFlop/s per core
  • Peak performance: 13.6 GFlop/s per node (4 cores);
    worked out below
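Restating the arithmetic above as a worked equation (LaTeX notation, same numbers as on the slide):

  4 \,\tfrac{\text{flop}}{\text{cycle}} \times 850 \times 10^{6}\,\tfrac{\text{cycles}}{\text{s}} = 3.4\ \text{GFlop/s per core},
  \qquad 4\ \text{cores} \times 3.4\ \text{GFlop/s} = 13.6\ \text{GFlop/s per node}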

19
Two Generations BG/L and BG/P
20
3. How to Access Your Frontend Node
21
How to Login to the Frontend
bcssh.rochester.ibm.com
22
Gateway
gateway
ssh to your assigned front-end (a login sketch follows below)
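A minimal login sketch, assuming ssh access through the bcssh gateway shown on the previous slide; the userid and front-end name are illustrative:

  ssh myaccount@bcssh.rochester.ibm.com   # log in to the gateway
  ssh frontend-1                          # then ssh to your assigned front-end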
23
Your front-end
gateway
24
Transferring Files
  • Transferring Files into the Rochester IBM Blue
    Gene Center

WinSCP
25
Transferring to the Front-end
  • Use scp (a reverse-transfer sketch follows below)
  • bcssh:/codhome/myaccount> scp conf_gen.cpp frontend-1:
  • conf_gen.cpp        100%  46KB  45.8KB/s  00:00
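Copying results back to the gateway follows the same pattern in reverse; a minimal sketch (the output filename is illustrative, not from the slides):

  bcssh:/codhome/myaccount> scp frontend-1:output.dat .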

26
Current Disk Space Limits
  • bcssh gateway
  • /codhome/userid directories on bcssh are limited
    to 300GB (shared, no quota)
  • Used for transferring files in and out of the
    environment
  • Frontend node
  • /home directories have 10GB for all users, no
    quotas
  • The /gpfs file system is 400GB in size, there are
    no quotas as the file space is shared between all
    users on that frontend node

27
4. Compilers for Blue Gene
28
IBM Compilers
  • Compilers for Blue Gene are located in the
    front-end (/opt/ibmcmp)
  • Fortran
  • /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf
  • /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf90
  • /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf95
  • C
  • /opt/ibmcmp/vac/bg/9.0/bin/bgxlc
  • C++
  • /opt/ibmcmp/vacpp/bg/9.0/bin/bgxlC (example
    invocations are sketched below)
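A minimal sketch of invoking these compilers on the front-end; the -qarch/-qtune flags appear later in this tutorial, and the source file names are illustrative only:

  /opt/ibmcmp/xlf/bg/11.1/bin/bgxlf90 -O3 -qarch=450 -qtune=450 prog.f90 -o prog_f
  /opt/ibmcmp/vac/bg/9.0/bin/bgxlc    -O3 -qarch=450 -qtune=450 prog.c   -o prog_c
  /opt/ibmcmp/vacpp/bg/9.0/bin/bgxlC  -O3 -qarch=450 -qtune=450 prog.cpp -o prog_cpp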

29
GNU Compilers
  • The standard GNU compilers and libraries, which
    are also located on the front-end node, will NOT
    produce Blue Gene compatible binary code. The
    standard GNU compilers can only be used for
    utility or front-end code development that your
    application may require.
  • GNU compilers (Fortran, C, C++) for Blue Gene are
    located in /opt/blrts-gnu/
  • Fortran
  • /opt/gnu/powerpc-bgp-linux-gfortran
  • C
  • /opt/gnu/powerpc-bgp-linux-gcc
  • C++
  • /opt/gnu/powerpc-bgp-linux-g++
  • It is recommended not to use the GNU compilers for
    Blue Gene, as the IBM XL compilers offer
    significantly higher performance. The GNU
    compilers do, however, offer more flexible support
    for things like inline assembler.

30
5. MPI on Blue Gene
31
MPI Library Location
  • MPI implementation on Blue Gene is based on
    MPICH-2 from Argonne National Laboratory
  • Include files mpi.h and mpif.h are found at this
    location (see the compile sketch below):
  • -I/bgsys/drivers/ppcfloor/comm/include
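A minimal compile-only sketch using that include path directly; it assumes the bgxlc compiler from the previous section, and linking a full MPI program would normally go through the mpixl* wrapper scripts shown later:

  /opt/ibmcmp/vac/bg/9.0/bin/bgxlc -c -I/bgsys/drivers/ppcfloor/comm/include hello.c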

32
6 & 7. Compilation and Execution on Blue Gene
33
Copying Executables and Input
  • Step 1: Copy input files and executables to a
    shared directory
  • Place data and executables in a directory under
    /gpfs
  • Example
  • cd /gpfs/fs2/frontend-1
  • mkdir myaccount
  • cp myaccount/sander /gpfs/fs2/frontend-1/myaccount
  • cp myaccount/input.tar /gpfs/fs2/frontend-1/myaccount

34
Compiling on Blue Gene: C
  • /gpfs/fs2/frontend-11/myaccount/hello 0> make -f
    make.hello
  • mpixlc_r -O3 -qarch=450 -qtune=450 hello.c -o
    hello

> cat make.hello
XL_CC   = mpixlc_r
OBJ     = hello
SRC     = hello.c
FLAGS   = -O3 -qarch=450 -qtune=450
LIBS    =

$(OBJ): $(SRC)
	$(XL_CC) $(FLAGS) $(SRC) -o $(OBJ) $(LIBS)

clean:
	rm *.o hello
35
Hello World: C
  • > cat hello.c

#include <stdio.h>     /* Headers */
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])    /* Function main */
{
    int rank, size, tag, rc, i;
    MPI_Status status;
    char message[20];

    rc = MPI_Init(&argc, &argv);
    rc = MPI_Comm_size(MPI_COMM_WORLD, &size);
    rc = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    tag = 100;
    if (rank == 0) {
        strcpy(message, "Hello, world");
        for (i = 1; i < size; i++)
            rc = MPI_Send(message, 13, MPI_CHAR, i, tag, MPI_COMM_WORLD);
    } else {
        /* Standard completion of the truncated listing: other ranks receive */
        rc = MPI_Recv(message, 13, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
    }
    printf("node %d: %s\n", rank, message);
    MPI_Finalize();
    return 0;
}
36
Compiling on Blue Gene: C++
  • > cat make.hello

XL_CC   = mpixlcxx_r
OBJ     = hello
SRC     = hello.cc
FLAGS   = -O3 -qarch=450 -qtune=450
LIBS    =

$(OBJ): $(SRC)
	$(XL_CC) $(FLAGS) $(SRC) -o $(OBJ) $(LIBS)

clean:
	rm *.o hello

37
Hello World: C++
  • > cat hello.cc

// Include the MPI version 2 C++ bindings
#include <mpi.h>
#include <iostream>
#include <string.h>
using namespace std;

int
main(int argc, char *argv[])
{
    MPI::Init(argc, argv);
    int rank = MPI::COMM_WORLD.Get_rank();
    int size = MPI::COMM_WORLD.Get_size();
    char name[MPI_MAX_PROCESSOR_NAME];
    int len;

    // Standard completion of the truncated listing: report rank/size, then finalize
    MPI::Get_processor_name(name, len);
    cout << "Hello, world! I am " << rank << " of " << size
         << " on " << name << endl;
    MPI::Finalize();
    return 0;
}
https://spaces.umbc.edu/pages/viewpage.action?pageId=5245461#C%2B%2BHelloWorldProgram-parallel
38
Running Programs (applications) on Blue Gene
  • Job execution is managed via LoadLeveler
  • LoadLeveler is a job scheduler written by IBM to
    control the scheduling of batch jobs
  • mpirun is invoked via LoadLeveler

39
Script to Emulate Syntax of mpirun
40
llrun
pts/0 0> llrun
41
mpirun
  • Step 2: Job submission using mpirun
  • Users can use mpirun to submit jobs
  • The Blue Gene mpirun is located in
    /usr/bin/mpirun
  • Typical use of mpirun:
  • mpirun -np <# of processes> -partition <block id>
    -cwd `pwd` -exe <executable>
  • Where:
  • -np: number of processes to be used; must fit
    in the available partition
  • -partition: a partition of the Blue Gene rack on
    which a given executable will execute, e.g., R000
  • -cwd: the current working directory, generally
    used to specify where any input and output files
    are located
  • -exe: the actual binary program which the user
    wishes to execute
  • Example
  • mpirun -np 32 -partition R000 -cwd
    /gpfs/fs2/frontend-11/myaccount -exe
    /gpfs/fs2/frontend-11/myaccount/hello

42
mpirun Selected Options
  • Selected options:
  • -args: list of arguments to the executable, in
    double quotes
  • -env: list of environment variables, in double
    quotes, as VARIABLE=value
  • -mode: SMP, VN, or DUAL
  • For more details, run the following at the
    command prompt (a combined example follows below):
  • mpirun -h
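A minimal sketch combining these options; the application name, its -in argument, and the OMP_NUM_THREADS setting are illustrative, not from the slides:

  mpirun -np 32 -partition R000 -mode SMP -env "OMP_NUM_THREADS=4" \
         -cwd `pwd` -args "-in input.dat" -exe ./myapp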

43
mpirun Selected Example in an sh Script
#!/bin/sh
# --------- User options start here --------------------
MPIRUN="mpirun"
MPIOPT="-np 32"
PARTITION="-partition R000_J203_128"
WDIR="-cwd /FS1/myaccount/amber/IIsc/b4amber_mod/data1_32"
SANDER="-exe /FS1/myaccount/amber/exe/sander_bob_noBTREE"
time_ends=1600    # till many picoseconds after 150ps
# ---------- User options end here ---------------------
. . .
$MPIRUN $MPIOPT $PARTITION -args "-O -i trna.md.in -o trna.FRST_LAST.out -p trna.prm.top -c trna.PRIV_FRST.res -r trna.FRST_LAST.res -x trna.FRST_LAST.crd -v trna.FRST_LAST.vel -e trna.FRST_LAST.en -inf trna.FRST_LAST.info" $WDIR $SANDER
44
Invoking llrun
  • pts/0 /gpfs/fs2/frontend-11/myaccount/test 0> llrun
    -np 32 -cwd /gpfs/fs2/frontend-11/myaccount/test
    -exe /gpfs/fs2/frontend-11/myaccount/test/hello
  • Output:
  • Submitted job frontend-11.rchland.ibm.com.1675
  • Command file: llrun.myaccount.090704.1040.cmd
  • Output: stdout myaccount.frontend-11.$(jobid).out
  •         stderr myaccount.frontend-11.$(jobid).err
  •         path   /gpfs/fs2/frontend-11/myaccount/test/
  • Files created:
  • myaccount@frontend-11
  • pts/0 /gpfs/fs2/frontend-11/myaccount/test 1> ls
  • myaccount.frontend-11.1675.err
    myaccount.frontend-11.1675.out
    llrun.myaccount.090704.1040.cmd

45
llrun cmd file
  • # @ job_type = bluegene
  • # @ requirements = (Machine == "$(host)")
  • # @ class = medium
  • # @ job_name = myaccount.frontend-11
  • # @ comment = "llrun generated jobfile"
  • # @ error = myaccount.frontend-11.$(jobid).err
  • # @ output = myaccount.frontend-11.$(jobid).out
  • # @ environment = COPY_ALL
  • # @ wall_clock_limit = 00:30:00
  • # @ notification = always
  • # @ notify_user =
  • # @ bg_connection = prefer_torus
  • # @ bg_size = 32
  • # @ initialdir = /gpfs/fs2/frontend-11/myaccount/test
  • # @ queue
  • /bgsys/drivers/ppcfloor/bin/mpirun -np 32 -cwd
    /gpfs/fs2/frontend-11/myaccount/test -exe
    /gpfs/fs2/frontend-11/myaccount/test/hello

46
ll Command Script
  • pts/0 /gpfs/fs2/frontend-11/myaccount/namd_test 0>
    cat llrun_namd.cmd
  • # @ job_type = bluegene
  • # @ requirements = (Machine == "$(host)")
  • # @ class = medium
  • # @ job_name = myaccount.frontend-11
  • # @ comment = "LoadLeveler llrun script"
  • # @ error = $(job_name).$(jobid).err
  • # @ output = $(job_name).$(jobid).out
  • # @ environment = COPY_ALL
  • # @ wall_clock_limit = 00:60:00
  • # @ notification = never
  • # @ notify_user =
  • # @ bg_connection = prefer_torus
  • # @ bg_size = 256
  • # @ initialdir = /gpfs/fs2/frontend-11/myaccount/namd_test
  • # @ queue
  • /bgsys/drivers/ppcfloor/bin/mpirun -np 256
    -verbose 1 -mode SMP -env "BG_MAPPING=TXYZ" -cwd
    /gpfs/fs2/frontend-11/myaccount/namd_test -exe
    ./namd2 -args "apoa1.namd"

LL section
mpirun section specific to the application
47
mpirun Standalone Versus mpirun in LL Environment
  • Comparison between the standalone mpirun command
    and the LoadLeveler llsubmit command

The job_type and requirements tags must ALWAYS be
specified as listed above. If the command file
listing above were contained in a file named
my_job.cmd, the job would then be submitted to the
LoadLeveler queue using llsubmit my_job.cmd (a
short sketch follows below).
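A minimal submit-and-monitor sketch; llsubmit and llq are standard LoadLeveler commands rather than anything specific to this center:

  llsubmit my_job.cmd     # queue the command file shown earlier
  llq -u $USER            # list your jobs and their current state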
48
Blue Gene Monitoring Jobs bgstatus
  • Monitor the status of jobs executing on Blue Gene
  • bgstatus

49
Blue Gene Monitoring Jobs lljobq
50
Avoid Firewall inactivity timeout issues
  • Before starting work on the front-end:
  • screen <enter>
  • After logging back in:
  • screen -r <enter>
  • More information (the full detach/reattach
    sequence is sketched below):
  • http://www.kuro5hin.org/story/2004/3/9/16838/14935
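A minimal sketch of the full sequence; the Ctrl-a d detach keystroke is standard GNU screen behavior, not shown on the slide:

  screen          # start a session on the front-end
  # ... work as usual; press Ctrl-a d to detach before the firewall idle timeout hits
  screen -r       # reattach to the session after logging back in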

51
Appendix Blue Gene Specific LL Keywords - 1
52
Appendix Blue Gene Specific LL Keywords - 2
53
Appendix Blue Gene Specific LL Keywords - 3
54
Appendix Understanding Job Status - 1
55
Appendix Understanding Job Status - 2
56
Appendix Understanding Job Status - 3
57
Appendix Hardware Naming Convention 1
http://www.redbooks.ibm.com/redbooks/SG247417/wwhelp/wwhimpl/js/html/wwhelp.htm
58
Appendix Hardware Naming Convention 2
http://www.redbooks.ibm.com/redbooks/SG247417/wwhelp/wwhimpl/js/html/wwhelp.htm
59
Appendix Understanding Job Status - 4
60
Help?
  • Where to submit questions related to the
    Rochester IBM Center:
  • bgcod@us.ibm.com

61
References Blue Gene/L
  • Blue Gene/L System Administration, SG24-7178-03
    Redbooks, published 27 October 2006, last updated
    30 October 2006
  • Blue Gene/L Safety Considerations, REDP-3983-01
    Redpapers, published 29 June 2006
  • Blue Gene/L Hardware Overview and Planning,
    SG24-6796-02 Redbooks, published 11 August 2006
  • Blue Gene/L Application Development,
    SG24-7179-03 Redbooks, published 27 October 2006,
    last updated 18 January 2007
  • Unfolding the IBM eServer Blue Gene Solution,
    SG24-6686-00 Redbooks, published 20 September
    2005, last updated 1 February 2006
  • GPFS Multicluster with the IBM System Blue Gene
    Solution and eHPS Clusters, REDP-4168-00
    Redpapers, published 24 October 2006
  • Blue Gene/L Performance Analysis Tools,
    SG24-7278-00 Redbooks, published 18 July 2006
  • IBM System Blue Gene Solution Problem
    Determination Guide, SG24-7211-00 Redbooks,
    published 11 October 2006
  • http://www.redbooks.ibm.com/