Shared-Memory Programming with Threads

Provided by: math50

1
Shared-Memory Programming with Threads
  • Adapted by Aleksey Zimin from
  • http://navet.ics.hawaii.edu/casanova/courses/ics632_fall07/slides/ics632_threads.ppt

2
The concept of a process
  • Processes are the very basic elements in O.S.
  • Unit of resources ownership
  • Allocated a virtual address space and control of
    other resources such as I/O and files.
  • Unit of dispatching (allocating computer
    resources)
  • Execution path and state, dispatching priority.
  • Controlled by OS

3
What is a thread?
  • A thread is an execution path in the code segment
  • O.S. provides an individual Program Counter (PC)
    for each execution path

4
Comments
  • Traditional program is one thread per process.
  • The main thread starts with main()
  • Only one thread or program counter (PC) is
    allowed to execute the code segment
  • To add a new PC, you need to fork() to have
    another PC to execute in another process address
    space.

5
Unix's fork() revisited
  • Process management
  • pid = fork(): create a child process identical
    to the parent
  • pid = waitpid(pid, &statloc, options): wait for
    child to terminate
  • exit(status): terminate process execution and
    return status

6
fork() example
  • #include <stdio.h>
  • #include <unistd.h>
  • int main(void) {
  •   if (fork() == 0)
  •     printf("in the child process\n");
  •   else
  •     printf("in the parent process\n");
  •   return 0;
  • }
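The three calls from the previous slide (fork(), waitpid(), exit()) can be combined into one round trip. The following is a minimal sketch, not part of the original slides: the helper name run_child_and_wait is hypothetical, and the child uses _exit() rather than exit() so that it terminates without flushing the parent's buffered output a second time.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, wait for it,
 * and return the exit status the parent observed (or -1 on error). */
int run_child_and_wait(int code)
{
    int statloc;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    /* parent: block until this particular child terminates */
    if (waitpid(pid, &statloc, 0) < 0)
        return -1;

    /* statloc encodes how the child ended; decode the exit status */
    return WIFEXITED(statloc) ? WEXITSTATUS(statloc) : -1;
}
```

Calling run_child_and_wait(7) from a main() should hand back 7 in the parent, since the status passed to _exit() is what waitpid() reports through statloc.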

7
Key benefits of multithreading
  • Less time to create a thread than a process
  • Less time to terminate a thread than a process
  • Less time to switch between threads than between
    processes
  • More efficient communication: no need for the
    kernel to intervene
  • Smaller chance of driving you crazy while writing
    code / debugging

8
Shared memory programming
  • The easiest form of parallel programming
  • Can be used to parallelize a sequential code in
    an incremental way
  • take a sequential code
  • parallelize a small section
  • check that it works
  • check that it speeds things up a bit
  • move on to another section

9
Thread
  • A thread is a stream of instructions that can be
    scheduled as an independent unit.
  • A process is created by an operating system
  • contains information about resources
  • process id, file descriptors, ...
  • contains information on the execution state
  • program counter, stack, ...

10
Thread
  • The concept of a thread requires that we make a
    separation between these two kinds of information
    in a process
  • resources available to the entire process
  • program instructions, global data, working
    directory
  • schedulable entities
  • program counters and stacks.
  • A thread is an entity within a process which
    consists of the schedulable part of the process.

11
The process is still there; what's new with threads?
  • With process
  • Virtual address space (holding process image)
  • Protected access to CPU, files, and I/O
    resources
  • With thread (each thread has its own..)
  • Thread execution state
  • Saved thread context (an independent PC within a
    process)
  • An execution stack
  • Per-thread static storage for local variables
  • Access to memory and resources of its process,
    shared with all other threads in that process

12
Possible combination of thread and processes
  • One process, one thread
  • One process, multiple threads
  • Multiple processes, multiple threads per process
  • Multiple processes, one thread per process

13
Parallelism with Threads
  • Create threads within a process
  • Each thread does something (hopefully) useful
  • Threads may be working truly concurrently
  • Multi-processor
  • Multi-core
  • Or just pseudo-concurrently
  • Single-proc, single-core

14
Example
  • Say I want to compute the sum of two arrays
  • I can just create N threads, each of which sums
    1/Nth of both arrays and then combine their
    results
  • I can also create N threads that each increment
    some sum variable element-by-element, but then
    I've got to make sure they don't step on each
    other's toes
  • The first version is a bit less shared-memory,
    but is probably more efficient
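The first version (each thread sums its own slice, then the results are combined) can be sketched as follows. This is an illustration under stated assumptions rather than the slides' own code: the names struct chunk, sum_chunk, parallel_sum and the MAX_THREADS limit are hypothetical, and the sketch sums a single array. It uses pthread_create() and pthread_join(), which are introduced later in the deck.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64          /* arbitrary limit for this sketch */

struct chunk {
    const long *data;           /* first element of this thread's slice */
    size_t len;                 /* number of elements in the slice */
    long partial;               /* written only by the owning thread */
};

static void *sum_chunk(void *p)
{
    struct chunk *c = p;
    long s = 0;
    for (size_t i = 0; i < c->len; i++)
        s += c->data[i];
    c->partial = s;             /* no lock needed: nobody else writes here */
    return NULL;
}

/* Stores the sum of a[0..n-1] in *out; returns 0 on success, -1 on error. */
int parallel_sum(const long *a, size_t n, int nthreads, long *out)
{
    pthread_t tid[MAX_THREADS];
    struct chunk chunks[MAX_THREADS];
    int created = 0, rc = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    size_t base = n / nthreads, extra = n % nthreads, start = 0;
    for (int t = 0; t < nthreads; t++) {
        chunks[t].data = a + start;
        chunks[t].len = base + ((size_t)t < extra ? 1 : 0);
        chunks[t].partial = 0;
        start += chunks[t].len;
        if (pthread_create(&tid[t], NULL, sum_chunk, &chunks[t]) != 0) {
            rc = -1;
            break;
        }
        created++;
    }

    /* combine the partial results once each thread has finished */
    long total = 0;
    for (int t = 0; t < created; t++) {
        pthread_join(tid[t], NULL);
        total += chunks[t].partial;
    }
    if (rc == 0)
        *out = total;
    return rc;
}
```

Because every thread writes only to its own partial field, the threads never touch the same memory location, which is why this version needs no locking.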

15
Multi-threading issues
  • There are really two main issues when writing
    multi-threaded code
  • Issue 1: Load balancing
  • Make sure that no processor/core is left idle
    when it could be doing useful work
  • Issue 2: Correct access to shared variables
  • Implemented via mutual exclusion: create sections
    of code that only a single thread can be in at a
    time
  • Called critical sections
  • Classical variable update example
  • Done via locks and unlocks
  • Warning: locks are NOT on variables, but on
    sections of code

16
Threads in Practice
  • Pthreads
  • Popular C library
  • Flexible
  • Will discuss these
  • OpenMP
  • Java Threads

17
Pthreads
  • A POSIX standard (IEEE 1003.1c) API for thread
    creation and synchronization
  • The API specifies the standard behavior
  • Implementation choices are up to developers
  • Implementations vary from system to system,
    and some are better than others
  • Common in all UNIX operating systems
  • Some people have written it for Win32
  • The most portable threading library out there
  • What do threads look like in UNIX?

18
User-level / Kernel-level
  • User-level threads: many-to-one thread mapping
  • Implemented by user-level runtime libraries
  • Create, schedule, synchronize threads at
    user-level
  • OS is not aware of user-level threads
  • OS thinks each process contains only a single
    thread of control

19
User-level / Kernel-level
  • Advantages
  • Does not require OS support: portable
  • Can tune scheduling policy to meet application
    demands
  • Lower-overhead thread operations, since no system
    calls are needed
  • Disadvantages
  • Cannot leverage multiprocessors
  • Entire process blocks when one thread blocks

20
User-level / Kernel-level
  • Kernel-level threads: one-to-one thread mapping
  • OS provides each user-level thread with a kernel
    thread
  • Each kernel thread scheduled independently
  • Thread operations (creation, scheduling,
    synchronization) performed by OS

21
User-level / Kernel-level
  • Advantages
  • Each kernel-level thread can run in parallel on a
    multiprocessor
  • When one thread blocks, other threads from
    process can be scheduled
  • Disadvantages
  • Higher overhead for thread operations
  • OS must scale well with increasing number of
    threads

22
Using the Pthread Library
  • Pthread library typically uses kernel-threads
  • Programs must include the file pthread.h
  • Programs must be linked with the pthread library
    (-lpthread)
  • The API contains functions to
  • create threads
  • control threads
  • manage threads
  • synchronize threads

23
pthread_self()
  • Returns the thread identifier for the calling
    thread
  • At any point in its instruction stream a thread
    can figure out which thread it is
  • Convenient to be able to write code that says
    "If you're thread 1 do this, otherwise do that"
  • #include <pthread.h>
  • pthread_t pthread_self(void);
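A minimal sketch of the "if you're thread X do this" pattern follows. It is not from the original slides: the names main_id, am_i_main and run_check are hypothetical, and it relies on pthread_equal(), a standard Pthreads call the deck does not cover, because pthread_t is an opaque type that should not be compared with ==.

```c
#include <pthread.h>

static pthread_t main_id;   /* id of the thread that called run_check() */

/* Reports (through its argument) whether the caller is the recorded
 * "main" thread. Works both as a plain function and as a thread routine. */
static void *am_i_main(void *arg)
{
    int *answer = arg;
    *answer = pthread_equal(pthread_self(), main_id) ? 1 : 0;
    return NULL;
}

/* Returns 0 if the identity checks behave as expected, -1 otherwise. */
int run_check(void)
{
    pthread_t worker;
    int in_main = 0, in_worker = 1;

    main_id = pthread_self();
    am_i_main(&in_main);            /* plain call: still the same thread */
    if (pthread_create(&worker, NULL, am_i_main, &in_worker) != 0)
        return -1;
    pthread_join(worker, NULL);     /* wait before reading in_worker */
    return (in_main == 1 && in_worker == 0) ? 0 : -1;
}
```

The same routine gives different answers depending on which thread executes it, which is exactly what pthread_self() makes possible.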

24
pthread_create()
  • Creates a new thread of control
  • #include <pthread.h>
  • int pthread_create(
  •     pthread_t *thread,
  •     const pthread_attr_t *attr,
  •     void *(*start_routine)(void *),
  •     void *arg);
  • Returns 0 to indicate success, otherwise returns
    error code
  • thread: output argument that will contain the
    thread id of the new thread
  • attr: input argument that specifies the
    attributes of the thread to be created (NULL
    means default attributes)
  • start_routine: function to use as the start of
    the new thread; must have prototype
    void *foo(void *)
  • arg: argument to pass to the new thread routine
  • If the thread routine requires multiple
    arguments, they must be passed bundled up in an
    array or a structure

25
pthread_create() example
  • Want to create a thread to compute the sum of the
    elements of an array
  • void *do_work(void *arg);
  • Needs three arguments
  • the array, its size, where to store the sum
  • we need to bundle them in a structure
  • struct arguments {
  •   long int *array;
  •   long int size;
  •   long int *sum;
  • };

26
pthread_create() example
  • int main(void) {
  •   long int array[ARRAY_SIZE], sum, i;
  •   pthread_t worker_thread;
  •   struct arguments *arg;
  •   for (i = 0; i < ARRAY_SIZE; i++) array[i] = 1;
  •   arg = calloc(1, sizeof(struct arguments));
  •   arg->array = array;
  •   arg->size = ARRAY_SIZE;
  •   arg->sum = &sum;
  •   if (pthread_create(&worker_thread, NULL,
        do_work, (void *)arg)) {
  •     fprintf(stderr, "Error while creating thread\n");
  •     exit(1);
  •   }
  •   ...
  •   exit(0);
  • }

27
pthread_create() example
  • void *do_work(void *arg) {
  •   long int i, size;
  •   long int *array;
  •   long int *sum;
  •   size = ((struct arguments *)arg)->size;
  •   array = ((struct arguments *)arg)->array;
  •   sum = ((struct arguments *)arg)->sum;
  •   *sum = 0;
  •   for (i = 0; i < size; i++)
  •     *sum += array[i];
  •   return NULL;
  • }

28
Comments about the example
  • The parent thread continues its normal
    execution after creating the child thread
  • Memory is shared by the parent and the child (the
    array, the location of the sum)
  • nothing prevents the parent from doing something
    to it while the child is still executing
  • which may lead to a wrong computation
  • The bundling and unbundling of arguments is a bit
    tedious, but nothing compared to what's needed
    with shared memory segments and processes

29
pthread_exit()
  • Terminates the calling thread
  • #include <pthread.h>
  • void pthread_exit(
  •     void *retval);
  • The return value is made available to another
    thread calling a pthread_join() (see later)
  • The previous example had the thread just return
    from function do_work()
  • In this case the call to pthread_exit() is
    implicit
  • The return value of the function serves as the
    argument to the (implicitly called)
    pthread_exit().

30
pthread_join()
  • Causes the calling thread to wait for another
    thread to terminate
  • #include <pthread.h>
  • int pthread_join(
  •     pthread_t thread,
  •     void **value_ptr);
  • thread: input parameter, id of the thread to wait
    on
  • value_ptr: output parameter, value given to
    pthread_exit() by the terminating thread (which
    happens to always be a void *)
  • returns 0 to indicate success, error code
    otherwise
  • multiple simultaneous calls for the same thread
    are not allowed
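How a value travels from pthread_exit() to pthread_join() can be sketched as follows. This is an illustrative example, not code from the slides: the names square and square_in_thread are hypothetical, and the result is placed on the heap because a pointer to a variable on the exiting thread's stack would no longer be valid after the thread ends.

```c
#include <pthread.h>
#include <stdlib.h>

/* Thread routine: squares the long pointed to by arg and hands the
 * result back on the heap, so it outlives the thread's stack. */
static void *square(void *arg)
{
    long x = *(long *)arg;
    long *result = malloc(sizeof *result);
    if (result != NULL)
        *result = x * x;
    pthread_exit(result);       /* same effect as: return result; */
}

/* Hypothetical helper: stores x*x in *out; returns 0 on success. */
int square_in_thread(long x, long *out)
{
    pthread_t tid;
    void *retval = NULL;

    /* &x stays valid because this function waits for the thread below */
    if (pthread_create(&tid, NULL, square, &x) != 0)
        return -1;
    if (pthread_join(tid, &retval) != 0)    /* retval receives the void * */
        return -1;
    if (retval == NULL)
        return -1;                          /* malloc failed in the thread */
    *out = *(long *)retval;
    free(retval);
    return 0;
}
```

The joining thread takes ownership of the returned pointer, so it is also the one that frees it.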

31
pthread_kill()
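  • Sends a signal to a thread (despite the name, it
    does not by itself terminate the thread)
  • #include <signal.h>
  • int pthread_kill(
  •     pthread_t thread,
  •     int sig);
  • thread: input parameter, id of the thread to
    signal
  • sig: signal number
  • returns 0 to indicate success, error code
    otherwise
  • Note: a signal whose action is to terminate ends
    the whole process, not just the target thread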

32
pthread_join() example
  • int main(void) {
  •   long int array[100];
  •   long int sum;
  •   pthread_t worker_thread;
  •   struct arguments *arg;
  •   arg = (struct arguments *)calloc(1, sizeof(struct
        arguments));
  •   arg->array = array;
  •   arg->size = 100;
  •   arg->sum = &sum;
  •   if (pthread_create(&worker_thread, NULL,
  •       do_work, (void *)arg)) {
  •     fprintf(stderr, "Error while creating
        thread\n");
  •     exit(1);
  •   }
  •   ...
  •   if (pthread_join(worker_thread, NULL)) {
  •     fprintf(stderr, "Error while waiting for
        thread\n");
  •     exit(1);
  •   }
  •   ...
  • }

33
Synchronizing pthreads
  • As we've seen earlier, we need a system to
    implement locks to create mutual exclusion for
    variable access, via critical sections
  • Lock creation
  • int pthread_mutex_init(
  •     pthread_mutex_t *mutex,
        const pthread_mutexattr_t *attr);
  • returns 0 on success, an error code otherwise
  • mutex: output parameter, lock
  • attr: input, lock attributes
  • NULL: default
  • There are functions to set the attribute (look at
    the man pages if you're interested)

34
Synchronizing pthreads
  • Locking a lock
  • If the lock is already locked, then the calling
    thread is blocked
  • If the lock is not locked, then the calling thread
    acquires it
  • int pthread_mutex_lock(
  •     pthread_mutex_t *mutex);
  • returns 0 on success, an error code otherwise
  • mutex: input parameter, lock
  • Just checking
  • Returns instead of blocking
  • int pthread_mutex_trylock(
  •     pthread_mutex_t *mutex);
  • returns 0 on success, EBUSY if the lock is
    already locked, an error code otherwise
  • mutex: input parameter, lock
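The difference between blocking and "just checking" can be sketched in a few lines. This example is an illustration, not part of the slides: the helper name trylock_when_held is hypothetical, and it assumes a mutex with default attributes, for which a second trylock on an already-held lock returns EBUSY instead of blocking.

```c
#include <errno.h>      /* defines EBUSY, the value trylock returns */
#include <pthread.h>

/* Returns the result of calling trylock on a mutex that is already held,
 * or -1 if the setup itself failed. */
int trylock_when_held(void)
{
    pthread_mutex_t m;
    int first, second;

    if (pthread_mutex_init(&m, NULL) != 0)
        return -1;
    first = pthread_mutex_trylock(&m);    /* free lock: acquires it, returns 0 */
    second = pthread_mutex_trylock(&m);   /* held lock: returns, does not block */
    if (first == 0)
        pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return first != 0 ? -1 : second;
}
```

Had the second call been pthread_mutex_lock() instead, the thread would have waited on a lock that it holds itself and never made progress.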

35
Synchronizing pthreads
  • Releasing a lock
  • int pthread_mutex_unlock(
  •     pthread_mutex_t *mutex);
  • returns 0 on success, an error code otherwise
  • mutex: input parameter, lock
  • With locking, trylocking, and unlocking, one can
    protect access to shared variables and so avoid
    race conditions on them

36
Mutex Example
  • ...
  • pthread_mutex_t mutex;
  • pthread_mutex_init(&mutex, NULL);
  • ...
  • pthread_mutex_lock(&mutex);
  • count++;   /* critical section */
  • pthread_mutex_unlock(&mutex);
  • To lock variable count, just put a
    pthread_mutex_lock() and pthread_mutex_unlock()
    around all sections of the code that access
    (read or write) variable count
  • Again, you're really locking code, not variables
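A complete version of the shared-counter pattern might look like the sketch below. The names increment, count_with_threads, count_lock and MAX_WORKERS are hypothetical, and the mutex is set up with the static initializer PTHREAD_MUTEX_INITIALIZER, a standard alternative to calling pthread_mutex_init() that the slides do not mention.

```c
#include <pthread.h>

#define MAX_WORKERS 16          /* arbitrary limit for this sketch */

static long count;              /* shared variable */
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static long increments_per_thread;

static void *increment(void *unused)
{
    (void)unused;
    for (long i = 0; i < increments_per_thread; i++) {
        pthread_mutex_lock(&count_lock);
        count++;                            /* critical section */
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Runs nthreads threads that each add per_thread to count.
 * Returns the final count, or -1 on error. */
long count_with_threads(int nthreads, long per_thread)
{
    pthread_t tid[MAX_WORKERS];
    int created = 0;

    if (nthreads < 1 || nthreads > MAX_WORKERS)
        return -1;
    count = 0;
    increments_per_thread = per_thread;
    for (int t = 0; t < nthreads; t++) {
        if (pthread_create(&tid[t], NULL, increment, NULL) != 0)
            break;
        created++;
    }
    for (int t = 0; t < created; t++)
        pthread_join(tid[t], NULL);
    return created == nthreads ? count : -1;
}
```

Without the lock and unlock calls around count++, two threads could read the same old value and both write back old + 1, so the final total would usually come out too small.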

37
Cleaning up memory
  • Releasing memory for a mutex
  • int pthread_mutex_destroy(
  •     pthread_mutex_t *mutex);
  • Releasing memory for a mutex attribute
  • int pthread_mutexattr_destroy(
  •     pthread_mutexattr_t *attr);

38
Signaling
  • Allows a thread to wait until some other thread
    signals that some condition is met
  • provides a more sophisticated way to synchronize
    threads than just mutex locks
  • Done with condition variables
  • Example
  • You have to implement a server with a main thread
    and many threads that can be assigned work (e.g.,
    an incoming request)
  • You want to be able to tell a thread "there is
    work for you to do"
  • Inconvenient to do with mutex locks
  • the main thread must carefully manage a lock for
    each worker thread
  • everybody must constantly be polling locks

39
Condition Variables
  • Condition variables are used in conjunction with
    mutexes
  • Create a condition variable
  • Create an associated mutex
  • We will see why it's needed later
  • Waiting on a condition
  • lock the mutex
  • wait on condition variable
  • unlock the mutex
  • Signaling
  • Lock the mutex
  • Signal on the condition variable
  • Unlock mutex

40
pthread_cond_init()
  • Creating a condition variable
  • int pthread_cond_init(
  •     pthread_cond_t *cond,
  •     const pthread_condattr_t *attr);
  • returns 0 on success, an error code otherwise
  • cond: output parameter, condition
  • attr: input parameter, attributes (default:
    NULL)

41
pthread_cond_wait()
  • Waiting on a condition variable
  • int pthread_cond_wait(
  •     pthread_cond_t *cond,
  •     pthread_mutex_t *mutex);
  • returns 0 on success, an error code otherwise
  • cond: input parameter, condition
  • mutex: input parameter, associated mutex

42
pthread_cond_signal()
  • Signaling a condition variable
  • int pthread_cond_signal(
  •     pthread_cond_t *cond);
  • returns 0 on success, an error code otherwise
  • cond: input parameter, condition
  • Wakes up one thread out of the possibly many
    threads waiting for the condition
  • The thread is chosen non-deterministically

43
pthread_cond_broadcast()
  • Signaling a condition variable
  • int pthread_cond_broadcast(
  •     pthread_cond_t *cond);
  • returns 0 on success, an error code otherwise
  • cond: input parameter, condition
  • Wakes up ALL threads waiting for the condition

44
Condition Variable example
  • Say I want to have multiple threads wait until a
    counter reaches a maximum value and be awakened
    when it happens
  • pthread_mutex_lock(&lock);
  • while (count < MAX_COUNT)
  •   pthread_cond_wait(&cond, &lock);
  • pthread_mutex_unlock(&lock);
  • Locking the lock so that we can read the value of
    count without the possibility of a race condition
  • Calling pthread_cond_wait() in a loop to avoid
    spurious wake-ups
  • When going to sleep the pthread_cond_wait()
    function implicitly releases the lock
  • When waking up the pthread_cond_wait() function
    implicitly acquires the lock (and may thus sleep)
  • Unlocking the lock after exiting from the loop

45
pthread_cond_timedwait()
  • Waiting on a condition variable with a timeout
  • int pthread_cond_timedwait(
  •     pthread_cond_t *cond,
  •     pthread_mutex_t *mutex,
  •     const struct timespec *abstime);
  • returns 0 on success, ETIMEDOUT if the timeout
    expires, an error code otherwise
  • cond: input parameter, condition
  • mutex: input parameter, associated mutex
  • abstime: input parameter, absolute time at which
    the wait gives up, not a relative delay (a struct
    timespec holds seconds and nanoseconds, whereas
    the struct timeval of gettimeofday() holds
    seconds and microseconds)
46
PThreads Conclusion
  • A popular way to write multi-threaded code
  • If you know pthreads, you'll have no problem
    adapting to other multi-threading techniques
  • Condition variables are a bit odd, but very
    useful
  • For your project you may want to use pthreads
  • More information
  • Man pages
  • PThread Tutorial: http://www.llnl.gov/computing/tutorials/pthreads/