Transcript and Presenter's Notes

Title: Parallel computing on nanco - an introductory course


1
Parallel computing on nanco - an introductory
course
  • Anne Weill Zrahia
  • Technion, Computer Center
  • July 2007

2
Parallel Programming on the Nanco
  • 1) Parallelization Concepts
  • 2) Nanco Computer Design
  • 3) Orientation on Nanco
  • 4) Parallel Programming - MPI
  • 5) Queuing system - SGE

3
  • 1) Parallelization Concepts

4
Parallel Power for HPC
  • A closely coupled, scalable set of
    interconnected computer systems, sharing common
    hardware and software infrastructure, providing a
    parallel set of resources to applications for
    improved performance.

5
Resources needed for applications arising from
Nanotechnology
  • Large memory - Tbytes
  • High floating-point computing speed - Tflops
  • High data throughput - state of the art

6
Parallel classification
  • Parallel architectures
  • Shared Memory /
  • Distributed Memory
  • Programming paradigms
  • Data parallel /
  • Message passing

7
Shared Memory
  • Each processor can access any part of the memory
  • Access times are uniform (in principle)
  • Easier to program (no explicit message passing)
  • Bottleneck when several tasks access same
    location

8
SMP architecture
(Diagram: four processors P connected to a shared Memory)
9
Distributed Memory
  • Processor can only access local memory
  • Access times depend on location
  • Processors must communicate via explicit message
    passing

10
Distributed Memory
(Diagram: Processor-Memory pairs connected by an Interconnection network)
11
Message Passing Programming
  • Separate program on each processor
  • Local Memory
  • Control over distribution and transfer of data
  • Additional complexity of debugging due to
    communications

12
Why not a cluster
  • Single SMP system easier to purchase/maintain
  • Ease of programming in SMP systems

13
Why a cluster
  • Scalability
  • Total available physical RAM
  • Reduced cost
  • But

14
Performance issues
  • Concurrency - ability to perform actions
    simultaneously
  • Scalability - performance is not impaired by
    an increasing number of processors
  • Locality - high ratio of local memory
    accesses to remote memory accesses (or low
    communication)

15
SP2 Benchmark
  • Goal: checking performance of real-world
    applications on the SP2
  • Execution time (seconds) - CPU time for the
    application
  • Speedup = execution time for 1 processor /
    execution time for p processors
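  • For example (illustrative numbers only): a run taking
    200 seconds on 1 processor and 25 seconds on 8
    processors has a speedup of 200/25 = 8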

16
(No Transcript)
17
2) Nanco design
18
Nanco architecture
19
Configuration
(Diagram: compute nodes node1, node2, ..., node64, each with
processors P and local memory M, connected by an Infiniband Switch)
20
Configuration
  • 64 compute nodes, each with two dual-core
    Opteron Rev. F processors
  • 8 GB RAM per node
  • 2 master nodes for H/A, also Opterons
  • Infiniband interconnect (switch and HCAs)
  • Netapp storage

21
(No Transcript)
22
AMD Opteron processor
23
Memory bottleneck
24
AMD Hypertransport
25
(No Transcript)
26
How is this reflected in performance?

27
Performance
  • Access to local memory - 1 hop
  • Access to 2nd processor's memory - 2 hops
  • Prefetch can be useful for predictable patterns
  • Multithreading can be used at node level

28
Infiniband interconnect
29
3) Orientation on nanco
30
Getting started
  • Security
  • Logging in
  • Shell environment
  • Transferring files

31
System access-security
  • Secure access
  • X-tunnelling (for graphics)
  • Can use ssh -X for tunnelling (see the example below)
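  • For example: ssh -X courseXX@nanco.technion.ac.il
    (replace courseXX with your own user name)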

32
Working on nanco
  • Because of high-availability, we have 2 master
    nodes (masternode1 and masternode2) as points of
    entry to the cluster.
  • Login with ssh nanco.technion.ac.il and you will be
    redirected to one of the masters

33
Login Environment
  • Paths and environment variables have been set up
    (change things with care)
  • TCSH is the default (you can switch to bash if you
    like)
  • User modifiable environment variables are in
    .cshrc in home directory
  • Home directory is in /u/courseXX

34
Compilers
  • Options are gcc, gcc4, suncc for C
  • g++, sunCC for C++
  • g77 (no F90), gfortran, sunf90 for
    Fortran77/Fortran90

35
Useful commands
  • ssh-key - a script to allow ssh to all nodes
  • top - to see your processes. Attention: you
    must log in to the actual machine to see your
    process
  • ps -u <username> - to see processes

36
Useful commands(cont.)
  • parps - a script to see running processes
    on a set of nodes. Usage:
  • parps n1 n2 - from node n1 to node n2
  • parshow - a script to see where a particular
    executable is running

37
Flags for compilation
  • sunf90 -fast -xO5 -xarch=amd64a myprog.f -o myprog
  • gcc -O3 -march=opteron myprog.c -o myprog

38
Compilation with MPI
  • Most MPI implementations support C, C++, Fortran77
    and Fortran90 bindings.
  • Scripts for compilation of the type mpif77, mpif90,
    mpicc etc. (see the example below)
  • You can specify generic compiler options
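  • For example (the usual compiler flags pass through):
    mpicc -O3 myprog.c -o myprog
    mpif90 -fast myprog.f -o myprog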

39
4) Parallel programming with MPI
40
WHAT is MPI?
  • A message-passing library specification
  • Extended message-passing model
  • Not specific to implementation or computer

41
BASICS of MPI PROGRAMMING
  • MPI is a message-passing library
  • Assumes a distributed memory architecture
  • Includes routines for performing communication
    (exchange of data and synchronization) among the
    processors.

42
Message Passing
  • Data transfer + synchronization
  • Synchronization - the act of bringing one or more
    processes to known points in their execution
  • Distributed memory - memory split up into
    segments, each of which may be accessed by only one
    process.

43
Message Passing
(Diagram: handshake between two processes - "May I send?",
"yes", then "Send data")
44
MPI STANDARD
  • Standard by consensus, designed in an open forum
  • Introduced by the MPI FORUM in May 1994, updated
    in June 1995.
  • MPI-2 (1998) provides extensions to the MPI
    standard

45
Why use MPI ?
  • Standardization
  • Portability
  • Performance
  • Richness
  • Designed to enable libraries

46
Writing an MPI Program
  • If there is a serial version, make sure it is
    debugged
  • If not, try to write a serial version first
  • When debugging in parallel, start with a few
    nodes first.

47
Format of MPI routines
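  • In C, MPI routines return an integer error code:
    err = MPI_Xxxx(parameters)
  • In Fortran the error code is returned as the last
    argument: call MPI_XXXX(parameters, ierror)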
48
Six useful MPI functions
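  • The six basic routines usually meant here are:
  • MPI_Init - initialize the MPI environment
  • MPI_Comm_size - number of processes in a communicator
  • MPI_Comm_rank - rank of the calling process
  • MPI_Send - send a message
  • MPI_Recv - receive a message
  • MPI_Finalize - terminate the MPI environment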
49
Communication routines
50
End MPI part of program
51
The simplest MPI program
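A minimal sketch in C - initialize MPI, query size and rank,
print, finalize:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);                /* start MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();                        /* end MPI */
    return 0;
}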
52
Exercise 1: running a simple MPI program
  • 1.

53
Exercise 2: modifying and using send/receive
  • 1.

54
MPI Messages
  • DATA - the data to be sent
  • ENVELOPE - information to route the data
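  • (The envelope consists of the source, destination, tag
    and communicator.)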

55
Description of MPI_Send (MPI_Recv)
56
Description of MPI_Send (MPI_Recv)
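For reference, the standard C bindings are:

int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm);
int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
             int source, int tag, MPI_Comm comm,
             MPI_Status *status);

buf, count and datatype describe the data; dest (or source), tag
and comm form the envelope; status reports the actual source and
tag of the received message.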
57
      program hello
      include 'mpif.h'
      integer rank, size, tag, i, ierror
      integer status(MPI_STATUS_SIZE)
      character*12 message
      call MPI_INIT(ierror)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
      tag = 100
      if (rank .eq. 0) then
         message = 'Hello, world'
         do i = 1, size-1
            call MPI_SEND(message, 12, MPI_CHARACTER, i, tag,
     &                    MPI_COMM_WORLD, ierror)
         enddo
      else
         call MPI_RECV(message, 12, MPI_CHARACTER, 0, tag,
     &                 MPI_COMM_WORLD, status, ierror)
      endif
      print *, 'node', rank, ':', message
      call MPI_FINALIZE(ierror)
      end

58
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int tag = 100;
    int rank, size, i;
    MPI_Status status;
    char message[12];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    strcpy(message, "Hello,world");
    if (rank == 0) {
        for (i = 1; i < size; i++)
            MPI_Send(message, 12, MPI_CHAR, i, tag, MPI_COMM_WORLD);
    } else {
        MPI_Recv(message, 12, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
    }
    printf("node %d %s\n", rank, message);
    MPI_Finalize();
    return 0;
}
59
Hellosend
60
Some useful remarks
  • Source: MPI_ANY_SOURCE means that any source is
    acceptable
  • Tags specified by sender and receiver must match,
    or use MPI_ANY_TAG - any tag is acceptable
  • Communicator must be the same for send/receive,
    usually MPI_COMM_WORLD

61
Computing pi using MPI
62
Computing pi using MPI(2)
63
Computing pi using MPI(3)
64
Computing pi using MPI(4)
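A minimal sketch in C of the classic midpoint-rule pi computation
(in the style of Using MPI by Gropp et al.), using MPI_Bcast and
MPI_Reduce; the interval count n is illustrative:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int n = 100000;                 /* number of intervals (illustrative) */
    int rank, size, i;
    double h, sum, x, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* rank 0 broadcasts the number of intervals to everyone */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* each process sums its share of the midpoint rule for
       the integral of 4/(1+x^2) on [0,1], which equals pi */
    h = 1.0 / (double) n;
    sum = 0.0;
    for (i = rank + 1; i <= n; i += size) {
        x = h * ((double) i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    /* combine the partial sums on rank 0 */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}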
65
Broadcast
  • Sends data on one node to all other nodes in the
    communicator.
  • Fortran: MPI_Bcast(buffer, count, datatype, root, comm, ierr)
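  • In C (no ierr argument), broadcasting an integer n
    from rank 0 looks like:
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);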

66
Broadcast
(Diagram: broadcast - data item A0, initially on P0, is copied
to P0, P1, P2 and P3)
67
Performance evaluation
  • Fortran:
  • real*8 t1
  • t1 = MPI_Wtime()  ! returns elapsed wall-clock time
  • C:
  • double t1;
  • t1 = MPI_Wtime();
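  • To time a code section, call MPI_Wtime() before and
    after it and subtract, e.g. in C:
    double t1, t2;
    t1 = MPI_Wtime();
    /* ... work to be timed ... */
    t2 = MPI_Wtime();
    printf("elapsed %f seconds\n", t2 - t1);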

68
MPI References
  • The MPI Standard:
    www-unix.mcs.anl.gov/mpi/index.html
  • Parallel Programming with MPI, Peter S. Pacheco,
    Morgan Kaufmann, 1997
  • Using MPI, W. Gropp, Ewing Lusk, Anthony Skjellum,
    The MIT Press, 1999.

69
5) Queuing system - Sun Grid Engine
70
Sun Grid Engine
  • Open-source batch queuing system similar to PBS
    or LSF
  • Automatically runs jobs on less loaded nodes
  • Queues jobs for later execution to avoid
    overloading the system

71
Queues definition
  • System job execution policy
  • Resource allocation
  • Resource limits
  • Accounting

72
SGE properties
  • Can schedule serial or MPI jobs
  • - serial jobs run in individual host queues
  • - parallel jobs must include a parallel
    environment request

73
Working with SGE jobs
  • There are commands for querying or modifying the
    status of a job running or queued by SGE
  • - qsub - submit a job
  • - qstat - query the status of a job
  • - qdel - delete a job from SGE

74
Submitting a serial job
  • Create a submit script (basic.sh):
  • #!/bin/sh
  • # scalar example
  • echo This code is running on `hostname` `date`
  • # end of script

75
Submitting a serial job
  • The job is submitted to SGE using the qsub
    command
  • qsub basic.sh

76
2 ways of submitting
  • With arguments:
  • qsub -o outputfile -j y -cwd basic.sh
  • In the submit script (see the example below)
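  • For example, the same options can be embedded in
    basic.sh as SGE directives (lines starting with #$):
    #$ -o outputfile
    #$ -j y
    #$ -cwd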

77
Monitoring a job - QSTAT
  • To list the status and node properties
  • qstat

78
Monitoring a job - qstat
  • qstat output - important fields:
  • Job identifier
  • Job status:
  • - qw - queued and waiting
  • - t - job transferring and about to start
  • - r - job running on listed hosts
  • - d - job has been marked for deletion

79
Deleting a job - QDEL
  • Single job: qdel 151
  • List of jobs:
  • qdel 151 152 153
  • All jobs under a user:
  • qdel -u artemis

80
Output produced by jobs
  • By default, we get 2 files:
  • <script>.o.<jobid> - std output
  • <script>.e.<jobid> - error messages
  • For parallel jobs, also:
  • <script>.po.<jobid> - list of processors the
    job ran on

81
Debugging job failures
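  • For example, start by checking the <script>.e.<jobid>
    and <script>.o.<jobid> files of the failed job for
    error messages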
82
Script for submitting parallel jobs
  • mpisub gets as input the number of processors and a
    script
  • Ex: mpisub 8 <myscript.sh>

83
Parallel MPI jobs and SGE
  • SGE uses the concept of a parallel environment
    (PE)
  • Several PEs can coexist on the machine
  • Each host has an associated queue and resource
    list (time, memory)
  • A PE is a list of hosts along with a set number
    of job slots (see the example below)
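  • For example, with plain qsub a parallel job requests a
    PE and a number of slots (the PE name here is a
    placeholder): qsub -pe <pe_name> 8 myscript.sh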

84
List of queues
85
Qstat options
86
Thanks for your attention!!