1
Parallel Programming Models: Shared Memory Programming, Intro to Message Passing, and Shared Objects Programming in Charm++
  • Laxmikant Kale

2
Writing parallel programs
  • Programming model
  • How should a programmer view the parallel machine?
  • Sequential programming: the von Neumann model
  • Parallel programming models:
  • Shared memory (shared address space) model
  • Message passing model
  • Shared objects model

3
Shared Address Space Model
  • All memory is accessible to all processes
  • Processes are mapped to processors, typically by a symmetric OS
  • Coordination among processes:
  • by sharing variables
  • Avoid stepping on toes:
  • using locks and barriers

4
Matrix multiplication
for (i=0; i<M; i++)
  for (j=0; j<N; j++)
    for (k=0; k<L; k++)
      C[i][j] += A[i][k]*B[k][j];
In a shared memory style, this program is trivial to parallelize: just have each processor deal with a different range of i (or j? or both?)
5
Programming Models 2
  • Basics of Shared Address Space Programming and
    Message-passing

6
Shared Address Space Model
  • All memory is accessible to all processes
  • Processes are mapped to processors, typically by a symmetric OS
  • Coordination among processes:
  • by sharing variables
  • Avoid stepping on toes:
  • using locks and barriers

7
Matrix multiplication
for (i=0; i<M; i++)
  for (j=0; j<N; j++)
    for (k=0; k<L; k++)
      C[i][j] += A[i][k]*B[k][j];
In a shared memory style, this program is trivial to parallelize: just have each processor deal with a different range of i (or j? or both?)
8
SAS version pseudocode
size = M/numPEs();
myStart = myPE() * size;
for (i=myStart; i<myStart+size; i++)
  for (j=0; j<N; j++)
    for (k=0; k<L; k++)
      C[i][j] += A[i][k]*B[k][j];
9
Running Example: computing pi
  • Area of circle = πr²
  • Ratio of the area of a circle to that of the enclosing square: π/4
  • Method: compute a set of random number pairs (in the range 0-1) and count the number of pairs that fall inside the circle
  • The ratio gives us an estimate for π/4
  • In parallel: let each processor compute a different set of random number pairs (in the range 0-1) and count the number of pairs that fall inside the circle

10
Pi on shared memory
int count;
Lock countLock;

piFunction(int myProcessor) {
  seed s = makeSeed(myProcessor);
  for (i=0; i<100000/P; i++) {
    x = random(s); y = random(s);
    if (x*x + y*y < 1.0) { lock(countLock); count++; unlock(countLock); }
  }
  barrier();
  if (myProcessor == 0) printf("pi=%f\n", 4*count/100000);
}
11
main() {
  countLock = createLock();
  parallel(piFunction);
}
The system needs to provide the functions for locks, barriers, and thread (or process) creation.
12
Pi on shared memory: efficient version
int count;
Lock countLock;

piFunction(int myProcessor) {
  int c = 0;
  seed s = makeSeed(myProcessor);
  for (i=0; i<100000/P; i++) {
    x = random(s); y = random(s);
    if (x*x + y*y < 1.0) c++;
  }
  lock(countLock);
  count += c;
  unlock(countLock);
  barrier();
  if (myProcessor == 0) printf("pi=%f\n", 4*count/100000);
}
Each thread now accumulates into a private counter c and takes the lock only once, after its loop, rather than once per sample; this removes almost all lock contention.
13
Real SAS systems
  • Posix threads (Pthreads) is a standard for threads-based shared memory programming
  • Shared memory calls: just a few, normally standard, calls
  • In addition, lower-level calls: fetch-and-inc, fetch-and-add
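As an illustration (ours, not from the slides), C11 exposes fetch-and-add directly through <stdatomic.h>, so the shared counter from the pi example can be updated without a lock; a minimal sketch:

#include <stdatomic.h>

/* Shared counter; atomic_fetch_add is a hardware fetch-and-add,
   so no lock is needed around the update. */
atomic_int count;

void addTally(int c) {
  atomic_fetch_add(&count, c);   /* atomically: count += c */
}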

14
Message Passing
  • Assume that processors have direct access only to their own memory
  • Each processor typically executes the same executable, but may be running a different part of the program at any given time

15
Message passing basics
  • Basic calls: send and recv
  • send(int proc, int tag, int size, char *buf)
  • recv(int proc, int tag, int size, char *buf)
  • recv may return the actual number of bytes received in some systems
  • tag and proc may be wildcarded in a recv:
  • recv(ANY, ANY, 1000, buf)
  • broadcast
  • Other global operations (reductions)
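A minimal sketch of how these calls compose (using the hypothetical send/recv above, plus the myProcessorNum() call that appears in the later pi example):

int x;
if (myProcessorNum() == 0) {
  x = 17;
  send(1, 0, sizeof(int), (char *) &x);   /* ship x to processor 1, tag 0 */
} else if (myProcessorNum() == 1) {
  recv(0, 0, sizeof(int), (char *) &x);   /* wait for the matching send */
}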

16
Posix Threads on Origin 2000
  • Shared memory programming on Origin 2000: important calls
  • Thread creation and joining
  • pthread_create(pthread_t *threadID, pthread_attr_t *attr, functionName, (void *) arg)
  • pthread_join(pthread_t threadID, void **result)
  • Locks
  • pthread_mutex_t lock
  • pthread_mutex_lock(&lock)
  • pthread_mutex_unlock(&lock)
  • Condition variables
  • pthread_cond_t cv
  • pthread_cond_init(&cv, (pthread_condattr_t *) 0)
  • pthread_cond_wait(&cv, &cv_mutex)
  • pthread_cond_broadcast(&cv)
  • Semaphores, and other calls

17
Declarations
/* pgm.c */
#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>

#define nThreads 4
#define nSamples 1000000

typedef struct _shared_value {
  pthread_mutex_t lock;
  int value;
} shared_value;

shared_value sval;
18
Function in each thread
void *doWork(void *id) {
  size_t tid = (size_t) id;
  int nsucc, ntrials, i;
  ntrials = nSamples/nThreads;
  nsucc = 0;
  srand48((long) tid);
  for (i=0; i<ntrials; i++) {
    double x = drand48();
    double y = drand48();
    if ((x*x + y*y) < 1.0) nsucc++;
  }
  pthread_mutex_lock(&(sval.lock));
  sval.value += nsucc;
  pthread_mutex_unlock(&(sval.lock));
  return 0;
}
19
Main function
int main(int argc, char *argv[]) {
  pthread_t tids[nThreads];
  size_t i;
  double est;
  pthread_mutex_init(&(sval.lock), NULL);
  sval.value = 0;
  printf("Creating Threads\n");
  for (i=0; i<nThreads; i++)
    pthread_create(&tids[i], NULL, doWork, (void *) i);
  printf("Created Threads... waiting for them to complete\n");
  for (i=0; i<nThreads; i++)
    pthread_join(tids[i], NULL);
  printf("Threads Completed...\n");
  est = 4.0 * ((double) sval.value / (double) nSamples);
  printf("Estimated Value of PI = %lf\n", est);
  exit(0);
}
20
Compiling: Makefile
# for solaris:    FLAGS = -mt
# for Origin2000: FLAGS =
pgm: pgm.c
	cc -o pgm $(FLAGS) pgm.c -lpthread
clean:
	rm -f pgm *.o
21
Message Passing
  • Program consists of independent processes,
  • each running in its own address space
  • Processors have direct access only to their own memory
  • Each processor typically executes the same executable, but may be running a different part of the program at any given time
  • Special primitives exchange data: send/receive
  • Early theoretical systems:
  • CSP: Communicating Sequential Processes
  • send and matching receive from another processor: both wait
  • OCCAM on Transputers used this model
  • Performance problems due to (unnecessary?) waiting
  • Current systems:
  • send operations don't wait for receipt on the remote processor

22
Message Passing
[Diagram: PE0 calls send and PE1 calls receive; the data is copied from PE0's memory into PE1's.]
23
Basic Message Passing
  • We will describe a hypothetical message passing system,
  • with just a few calls that define the model
  • Later, we will look at real message passing models (e.g. MPI), with a more complex set of calls
  • Basic calls:
  • send(int proc, int tag, int size, char *buf)
  • recv(int proc, int tag, int size, char *buf)
  • recv may return the actual number of bytes received in some systems
  • tag and proc may be wildcarded in a recv:
  • recv(ANY, ANY, 1000, buf)
  • broadcast
  • Other global operations (reductions)

24
Pi with message passing
int count, c;
main() {
  seed s = makeSeed(myProcessor);
  for (i=0; i<100000/P; i++) {
    x = random(s); y = random(s);
    if (x*x + y*y < 1.0) count++;
  }
  send(0, 1, 4, (char *) &count);
25
Pi with message passing
  if (myProcessorNum() == 0) {
    for (i=0; i<maxProcessors(); i++) {
      recv(i, 1, 4, (char *) &c);
      count += c;
    }
    printf("pi=%f\n", 4*count/100000);
  }
} /* end function main */
26
Collective calls
  • Message passing is often, but not always, used for the SPMD style of programming
  • SPMD: Single Program, Multiple Data
  • All processors execute essentially the same program, and the same steps, but not in lockstep
  • All communication is almost in lockstep
  • Collective calls:
  • global reductions (such as max or sum)
  • syncBroadcast (often just called broadcast)
  • syncBroadcast(whoAmI, dataSize, dataBuffer)
  • whoAmI: sender or receiver
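A sketch (ours, not from the slides) of how a global sum reduction could be built from the hypothetical send/recv calls, with every other processor sending its value to processor 0:

int globalSum(int myValue) {            /* hypothetical helper */
  int sum = myValue, v, p;
  if (myProcessorNum() == 0) {
    for (p=1; p<maxProcessors(); p++) {
      recv(p, 2, sizeof(int), (char *) &v);   /* tag 2 reserved for the reduction */
      sum += v;
    }
  } else {
    send(0, 2, sizeof(int), (char *) &myValue);
  }
  return sum;   /* meaningful only on processor 0 */
}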

27
Standardization of message passing
  • Historically:
  • nxlib (on Intel hypercubes)
  • nCUBE variants
  • PVM
  • Everyone had their own variants
  • MPI standard:
  • vendors, ISVs, and academics got together
  • with the intent of standardizing current practice
  • Ended up with a large standard
  • Popular, due to vendor support
  • Support for:
  • communicators (avoiding tag conflicts, ...)
  • data types
  • ...
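For comparison, a minimal sketch of the running pi example written against the real MPI standard (this version is ours, not from the slides):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
  int rank, size, i, local = 0, total = 0, n = 100000;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  srand48(rank);                          /* per-process random stream */
  for (i=0; i<n/size; i++) {
    double x = drand48(), y = drand48();
    if (x*x + y*y < 1.0) local++;
  }
  /* a collective call replaces the hand-written send/recv loop */
  MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
  if (rank == 0) printf("pi = %f\n", 4.0 * total / ((n/size) * size));
  MPI_Finalize();
  return 0;
}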

28
Parallel Programming tasks
  • Decomposition (what to do in parallel)
  • Mapping
  • Scheduling (sequencing)
  • Machine-dependent expression

29
Spectrum of parallel languages
[Diagram: parallel languages arranged along two axes, level (of abstraction) and degree of specialization, with MPI marked on the spectrum.]
30
Charm++
  • Data-driven objects
  • Asynchronous method invocation
  • Prioritized scheduling
  • Object arrays
  • Object groups:
  • a global object with a representative on each PE
  • Information sharing abstractions

31
Data Driven Execution
[Diagram: on each processor, a scheduler picks messages from its message queue and delivers each one to the object it targets.]
32
CkChareID mainhandle;

main::main(CkArgMsg *m) {   // execution begins here; m carries argc/argv
  int i, low = 0;
  for (i=0; i<100; i++)
    new CProxy_piPart();
  responders = 100;
  count = 0;
  mainhandle = thishandle;  // readonly initialization
}

void main::results(DataMsg *msg) {
  count += msg->count;
  if (0 == --responders) {
    CkPrintf("pi = %f \n", 4.0*count/100000);
    CkExit();               // exit scheduler after this method returns
  }
}
33
piPart::piPart() {
  // declarations...
  CProxy_main mainproxy(mainhandle);
  srand48((long) this);
  mySamples = 100000/100;
  for (i=0; i<mySamples; i++) {
    x = drand48();
    y = drand48();
    if ((x*x + y*y) < 1.0) localCount++;
  }
  DataMsg *result = new DataMsg;
  result->count = localCount;
  mainproxy.results(result);
  delete this;
}
34
Chares (Data-driven Objects)
  • Regular C++ classes,
  • with some methods designated as remotely invokable (called entry methods)
  • Entry methods have only one parameter:
  • of type message
  • Creation of an instance of chare class C:
  • new CProxy_C(msg)
  • Creates an instance of C on a specified processor pe:
  • new CProxy_C(msg, pe)
  • CProxy_C: a proxy class generated by Charm++ for chare class C declared by the user

35
Messages
  • A user-defined C++ class
  • inherits from a system-defined class
  • Messages can be communicated to others as parameters
  • Has regular data fields
  • Declaration: normal C++,
  • inheriting from a system-defined class
  • Creation (just the usual C++):
  • MsgType *m = new MsgType;
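A sketch of what such a declaration might look like, using the DataMsg type from the pi example (the base class name CMessage_DataMsg follows current Charm++ conventions, where it is generated from a "message DataMsg;" line in the interface file):

// Regular data fields; the system-defined base class comes from the translator
class DataMsg : public CMessage_DataMsg {
 public:
  int count;               // samples that fell inside the circle
};

// Creation is just the usual C++:
DataMsg *result = new DataMsg;
result->count = localCount;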

36
Remote method invocation
  • Proxy classes:
  • For each chare class C, the system generates a proxy class
  • (C → CProxy_C)
  • Each chare has a global ID (ChareID)
  • Global in the sense of being valid on all processors
  • thishandle (analogous to this) gets you the ChareID
  • You can send thishandle in messages
  • Given a handle h, you can create a proxy:
  • CProxy_C p(h);      // or q = new CProxy_C(h);
  • p.method(msg);      // or q->method(msg);

37
Object Arrays
  • A collection of chares,
  • with a single global name for the collection, and
  • each member addressed by an index
  • Mapping of element objects to processors is handled by the system

[Diagram: user's view: a single array A[0], A[1], A[2], A[3], ...; system view: the elements (e.g. A[0] and A[3]) live on different processors.]
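A sketch of creating and invoking such an array (the ckNew syntax is current Charm++ API and postdates this deck; A and work are hypothetical names):

// Assumes the .ci file declares: array [1D] A { entry A(); entry void work(int n); };
CProxy_A a = CProxy_A::ckNew(4);   // create A[0]..A[3]; the system maps them to processors
a[2].work(100);                    // asynchronous invocation on element 2
a.work(100);                       // broadcast to every element of the array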
38
Object Groups
  • A group of objects (chares)
  • with exactly one representative on each processor
  • A single ID for the group as a whole
  • Invoke methods in a branch (asynchronously), in all branches (broadcast), or in the local branch
  • Creation:
  • groupId = new CProxy_C(msg);
  • Remote invocation:
  • CProxy_C p(groupId);
  • p.methodName(msg);      // or p.methodName(msg, peNum);
  • p.LocalBranch->f(...);

39
Information sharing abstractions
  • Observation:
  • information is shared in several specific modes in parallel programs
  • Other models support only a limited set of modes:
  • shared memory: everything is shared (a sledgehammer approach)
  • message passing: messages are the only method
  • Charm++ identifies and supports several modes:
  • readonly / writeonce
  • tables (hash tables)
  • accumulators
  • monotonic variables

40
Compiling Charm++ programs
  • Need to define an interface specification file:
  • mod.ci for each module mod
  • Contains declarations that the system uses to produce proxy classes
  • These generated classes must be included in your mod.C file
  • See examples provided on the class web site.
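A sketch of what a minimal pgm.ci for the pi example might contain (the syntax follows current Charm++; pgm is a hypothetical module name):

// pgm.ci: processed by the Charm++ translator, which generates
// CProxy_main, CProxy_piPart, and CMessage_DataMsg for inclusion in pgm.C
mainmodule pgm {
  readonly CkChareID mainhandle;
  message DataMsg;
  mainchare main {
    entry main(CkArgMsg *m);          // execution begins here
    entry void results(DataMsg *msg);
  };
  chare piPart {
    entry piPart();
  };
};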