MPI Communication

Transcript and Presenter's Notes
1
MPI Communication
  • (2002.8.22)
  • Kyung-Lang Park
  • Yonsei Univ. Super Computing Lab.

2
Contents
  • Backup Slides
  • Misc. about MPI and our project
  • Communication Overview
  • Collective Operation Overview
  • Analysis of MPI_Bcast

3
Solaris and Linux
  • Problem
  • Can't run MPI between Solaris and Linux
  • Origin
  • MAXHOSTNAMELEN is 64 on Linux but 256 on the other machines
  • How to patch?
  • Edit MPI_DIR/mpid/globus2/mpi2.h:
  • - #define COMMWORLDCHANNELSNAMELEN (MAXHOSTNAMELEN + 20)
  • + #define G2_MAXHOSTNAMELEN 256
  • + #define COMMWORLDCHANNELSNAMELEN (G2_MAXHOSTNAMELEN + 20)

4
Job Scheduler
  • Misunderstanding
  • "We do not need to install a job scheduler such as LSF or PBS on the cluster, because of the private IP"
  • The job scheduler is not related to the private-IP problem
  • We should install a job scheduler to examine how MPI works on a cluster

5
Resource Management Architecture
[Diagram: the Globus resource management architecture. An application submits an RSL specification; RSL specialization, driven by queries to the Information Service, produces ground RSL / simple ground RSL that is handed to GRAM instances, each of which fronts a local resource manager (LSF, Condor, or fork/default).]
6
Job Scheduler Problem
[Diagram: DUROC co-allocates two subjobs (SUBJOB_INDEX 0 and 1, Subjob_size 4 each) through GRAM; one GRAM forwards to LSF and the other to PBS, and the local schedulers spread each subjob's ranks (0-3) across nodes node101-node104 and node201-node202.]
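For orientation, a minimal C check of where each co-allocated process landed. The MPI calls are standard; GLOBUS_DUROC_SUBJOB_INDEX is assumed here to be the per-subjob environment variable that DUROC-based launches typically carry, so treat that name as an assumption rather than a guarantee.

```c
/* Minimal sketch: report global rank, host, and (assumed) DUROC subjob index
 * so a co-allocated run like the diagram above can be sanity-checked. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, namelen;
    char host[MPI_MAX_PROCESSOR_NAME];
    const char *subjob;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &namelen);

    /* Assumption: DUROC/MPICH-G2 launches export this variable per subjob. */
    subjob = getenv("GLOBUS_DUROC_SUBJOB_INDEX");

    printf("rank %d/%d on %s, subjob %s\n",
           rank, size, host, subjob ? subjob : "(unknown)");

    MPI_Finalize();
    return 0;
}
```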
7
NGrid MPI Team Testbed
[Diagram: the team testbed across KISTI, Yonsei, Sogang, and KUT. Resources shown include a Compaq SMP(82) and a Linux cluster under LSF and PBS; a 20-node private-IP Linux cluster (sdd111-sdd114); Sparc-Solaris and Linux hosts (cybercs, supercom, imap, parallel) behind private IP, using LSF or fork; two 8-node Linux clusters (one fork, one LSF); an 80-node cluster; and mercury/venus/mars/jupitor plus intel/alpha machines under LSF16/LSF24.]
8
MPI Communication (cont.)
  • Preparing Communication - MPI_Init()
  • Get basic information
  • Gather information of each process
  • Create ChannelTable
  • Make passive socket
  • Register listen_callback() function
  • Sending Message - MPI_Send()
  • Get protocol information from the ChannelTable (C.T.)
  • Open socket using globus_io
  • Write data to socket using globus_io

9
MPI Communication
  • Receiving Message - listen_callback()
  • Accept socket connection
  • Read data from socket
  • Copy data into recv-queue
  • Receiving Message - MPI_Recv(..buf..)
  • Search recv-queue
  • Copy data from recv-queue to buf
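A minimal C program exercising the user-visible calls whose internal paths slides 8 and 9 describe (MPI_Init, MPI_Send, MPI_Recv); the message contents are arbitrary.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, msg;
    MPI_Status status;

    MPI_Init(&argc, &argv);   /* builds the ChannelTable, opens the passive
                                 socket, registers listen_callback (slide 8) */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        msg = 123;
        /* Send path: protocol looked up in the ChannelTable, socket opened
           and written via globus_io (slide 8). */
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Receive path: data delivered to the recv-queue by listen_callback
           and copied into msg here (slide 9). */
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d\n", msg);
    }

    MPI_Finalize();
    return 0;
}
```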

10
Making Channel Table
[Diagram: DUROC co-allocation of four subjobs (SUBJOB_INDEX 0-3, Subjob_size 4, 2, 2, 4) on Grid1.yonsei.ac.kr, Grid2.yonsei.ac.kr, Grid1.sogang.ac.kr, and Grid2.sogang.ac.kr; each subjob's local ranks are merged, in subjob order, into one channel table covering all processes.]
11
Commworldchannels
[Diagram: layout of the CommworldChannels table. struct channel_t CommworldChannels holds one entry per process (nprocs entries, indexed 0, 1, ...); each entry carries a struct miproto_t proto_list and a struct miproto_t selected_proto. A miproto_t node has a type (tcp, mpi, unknown), a void info pointer, and a next link. For the TCP protocol, info holds: hostname, port, handlep, whandle, header, to_self, connection_lock, connection_cond, attr, cancel_head, cancel_tail, send_head, send_tail.]
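A C sketch of these structures, reconstructed only from the field names on the slide; it is illustrative rather than the exact MPICH-G2 globus2 source, and the concrete globus_io handle/attr/mutex types are elided behind void pointers.

```c
/* Sketch of the CommworldChannels layout from the diagram above.
 * Field names follow the slide; illustrative only. */

#define G2_MAXHOSTNAMELEN 256

typedef enum { PROTO_TCP, PROTO_MPI, PROTO_UNKNOWN } miproto_type_t;

struct miproto_t {
    miproto_type_t    type;   /* tcp, mpi, or unknown           */
    void             *info;   /* protocol-specific block, below */
    struct miproto_t *next;   /* next candidate protocol        */
};

/* TCP-specific info block (per the fields listed on the slide). */
struct tcp_proto_info_t {
    char            hostname[G2_MAXHOSTNAMELEN];
    unsigned short  port;
    void           *handlep;          /* globus_io read handle (assumed)  */
    void           *whandle;          /* globus_io write handle (assumed) */
    void           *header;
    int             to_self;          /* message addressed to own rank?   */
    void           *connection_lock;  /* mutex guarding the connection    */
    void           *connection_cond;  /* condition variable               */
    void           *attr;             /* globus_io attributes (assumed)   */
    void           *cancel_head, *cancel_tail;  /* queued cancel requests */
    void           *send_head,   *send_tail;    /* queued send requests   */
};

/* One entry per process; CommworldChannels has nprocs entries. */
struct channel_t {
    struct miproto_t *proto_list;      /* every protocol the process offers */
    struct miproto_t *selected_proto;  /* the protocol chosen to reach it   */
};
```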
12
MPI Communication
[Diagram: point-to-point flow between Process A (rank 3, sender) and Process B (rank 5, receiver), using the COMMWORLDCHANNEL table and the MPID_recvs posted/unexpected queues.]
  1. B creates a passive socket (at MPI_Init)
  2. A gets information of Process B
  3. A gets B's protocol information (rank 5, selected protocol tcp, link ->)
  4. A makes a socket for writing
  5. Connection
  6. B accepts the socket
  7. A sends the data
  8. B calls listen_callback
  9. The data is copied into the unexpected queue
  10. The data is moved to the posted queue
  11. The original buffer is deleted
  12. MPI_Recv reads the data from the posted queue
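An illustrative C sketch of the posted/unexpected queue logic behind steps 8-12: the listener parks arriving messages in the unexpected queue, and the receive side either matches one there or (in the real code) posts a request and waits. This is not MPICH-G2's actual code; the helper names enqueue_unexpected and match_receive are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative only -- messages are matched on a simplistic (source, tag) key. */
struct msg {
    int         src, tag;
    void       *buf;
    size_t      len;
    struct msg *next;
};

/* Messages that arrived before a matching receive was posted. */
static struct msg *unexpected_q;

/* Steps 8-9: listen_callback copies an arriving message into the
 * unexpected queue. */
static void enqueue_unexpected(int src, int tag, const void *data, size_t len)
{
    struct msg *m = malloc(sizeof *m);
    m->src = src;
    m->tag = tag;
    m->len = len;
    m->buf = malloc(len);
    memcpy(m->buf, data, len);
    m->next = unexpected_q;
    unexpected_q = m;
}

/* Steps 10-12: the receive searches the unexpected queue; on a match the
 * data is moved into the user buffer and the temporary copy is deleted.
 * If nothing matches, the real code instead posts the request on the
 * posted queue and waits for the data to arrive. */
static int match_receive(int src, int tag, void *recvbuf, size_t maxlen)
{
    struct msg **pp;
    for (pp = &unexpected_q; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->src == src && (*pp)->tag == tag) {
            struct msg *m = *pp;
            size_t n = m->len < maxlen ? m->len : maxlen;
            memcpy(recvbuf, m->buf, n);
            *pp = m->next;
            free(m->buf);
            free(m);
            return 1;
        }
    }
    return 0;
}
```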
13
Collective Operation
  • Communication among multiple processes simultaneously
  • Patterns
  • Root sends data to all processes
  • broadcast and scatter
  • Root receives data from all processes
  • gather
  • Every process communicates with every other process
  • allgather and alltoall
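A minimal C example of the first two patterns (broadcast and gather) using standard MPI calls; the values exchanged are arbitrary.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, value = 0;
    int *all = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Pattern 1: root sends data to all processes. */
    if (rank == 0) value = 42;
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* Pattern 2: root receives data from all processes. */
    if (rank == 0) all = malloc(size * sizeof *all);
    value += rank;                               /* something per-rank */
    MPI_Gather(&value, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < size; i++)
            printf("rank %d contributed %d\n", i, all[i]);
        free(all);
    }

    MPI_Finalize();
    return 0;
}
```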

14
Basic Concept (cont.)
  • Flat tree vs. binomial
  • When Tr - Ts is small, binomial is better
  • When Tr - Ts is large, flat tree is better
  • (With a flat tree the root sends all P-1 messages itself; a binomial tree spreads the sends over about log2 P rounds.)
  • M. Bernaschi et al., "Collective Communication Operations: Experimental Results vs. Theory," April 1998

15
Basic Concept
  • Exploiting Hierarchy
  • WAN_TCP < LAN_TCP < intra TCP < vendor MPI

[Diagram: four-level hierarchy example across m1.utech.edu and c1.nlab.gov, processes P0-P29.
  1. WAN_TCP level: P0, P20
  2. LAN_TCP level: P0, P10
  3. Intra TCP level: P10, ..., P19
  4. Vendor MPI level: P0, ..., P9 and P20, ..., P29]
16
MPI_Bcast (cont.)
[Flowchart: the MPI_Bcast call path.
  • MPI_Bcast(buf, comm)
  • comm_ptr = MPIR_To_Pointer(comm)
  • comm_ptr->collops->Bcast(buf)
  • communicator type MPI_INTRA → Intra_Bcast(buf); inter-communicator Bcast (Inter_Bcast) is not supported yet
  • if MPID_Bcast() is #defined → MPID_FN_Bcast(buf), the topology-aware bcast; otherwise → Intra_Bcast(buf), the binomial bcast]
17
MPI_Bcast (cont.)
[Flowchart: inside MPID_FN_Bcast(buf).
  • involve(comm, set_info)
  • allocate request
  • for all sets in set_info: level 0 → flat_tree_bcast(buf), otherwise → binomial_bcast(buf)
  • within each set, the branch "am I root in this set?" decides between sending (MPI_Isend(buf) / MPI_Send(buf)) and receiving from the parent (MPI_Recv(buf))]
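As a rough illustration of the two per-set strategies named above (not MPICH-G2's implementation), here are a flat-tree and a binomial broadcast written with plain MPI point-to-point calls. The parameters are assumptions for the sketch: ranks lists the members of one set, ranks[0] is the set root, and me_idx is this process's index within the set.

```c
#include <mpi.h>
#include <stdlib.h>

/* Flat tree: the root sends to every member directly; members receive once. */
static void flat_tree_bcast(void *buf, int count, MPI_Datatype type,
                            const int *ranks, int n, int me_idx, MPI_Comm comm)
{
    if (me_idx == 0) {
        MPI_Request *req = malloc((n - 1) * sizeof *req);
        for (int i = 1; i < n; i++)
            MPI_Isend(buf, count, type, ranks[i], 0, comm, &req[i - 1]);
        MPI_Waitall(n - 1, req, MPI_STATUSES_IGNORE);
        free(req);
    } else {
        MPI_Recv(buf, count, type, ranks[0], 0, comm, MPI_STATUS_IGNORE);
    }
}

/* Binomial tree: non-roots receive from a parent, then forward to children;
 * the whole set is covered in about log2(n) rounds. */
static void binomial_bcast(void *buf, int count, MPI_Datatype type,
                           const int *ranks, int n, int me_idx, MPI_Comm comm)
{
    int mask = 1;

    /* Receive phase: the lowest set bit of me_idx identifies our parent. */
    while (mask < n) {
        if (me_idx & mask) {
            MPI_Recv(buf, count, type, ranks[me_idx - mask], 0, comm,
                     MPI_STATUS_IGNORE);
            break;
        }
        mask <<= 1;
    }

    /* Send phase: forward to children in the remaining rounds. */
    mask >>= 1;
    while (mask > 0) {
        if (me_idx + mask < n)
            MPI_Send(buf, count, type, ranks[me_idx + mask], 0, comm);
        mask >>= 1;
    }
}
```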
18
struct multiple_set_t
[Diagram: set_info contents for two processes from the slide-15 example. Each entry in set_info records size, level, root_index, my_rank_index, and the member ranks.
  • Rank 0: num = 3 sets — {0, 20}, {0, 10}, and {0, ..., 9}
  • Rank 10: num = 2 sets — {0, 10} and {10, ..., 19}]
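A C sketch of what these structures might look like, based only on the field names shown on the slide; the real MPICH-G2 definitions may differ, and the level numbering in the comment is an assumption.

```c
/* Sketch reconstructed from the field names on the slide; illustrative only. */

struct set_t {
    int  size;            /* number of ranks in this set                    */
    int  level;           /* hierarchy level; assumed 0 = WAN_TCP downwards */
    int  root_index;      /* index of the set root within set[]             */
    int  my_rank_index;   /* this process's index within set[]              */
    int *set;             /* the global ranks belonging to this set         */
};

struct multiple_set_t {
    int           num;    /* how many sets this process participates in     */
    struct set_t *set;    /* one entry per set, ordered by level            */
};

/* Example from the diagram above: for rank 0, num == 3 with sets {0, 20},
 * {0, 10}, and {0..9}; for rank 10, num == 2 with {0, 10} and {10..19}. */
```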
19
Project ?? ??
  • Testbed ?? ???
  • ???? ??? ?? ?????? ??
  • ???? MPI ????? ??
  • CPI, Matrix Multiplication
  • ???? ???? MPI?? ??

20
Future Work
  • ??? ?? ?? ??
  • ???? ?? ??
  • ???? ??? ??
  • ?? ????
  • CPI, Matrix Multiplication
  • NAS Parallel Benchmarks
  • ???? ???? ??
  • Isend, Bsend, Ssend
  • ?? ?? ?? ?? ? ?????