Parallel Computing Project (OPENMP using LINUX for Parallel application) - PowerPoint PPT Presentation


1
Parallel Computing Project (OPENMP using LINUX for Parallel application)
  • Summer 2008 Group Project
  • Instructor: Prof. Nagi Mekhiel
  • August 12th, 2008
  • Ravi Illapani
  • Kyunghee Ko
  • Lixiang Zhang

2
OpenMP Parallel Computing Solution Stack

3
Recall Basic Idea of OpenMP
  • The program generated by the compiler is executed
    by multiple threads
  • One thread per processor or core
  • Each thread performs part of the work
  • Parallel parts executed by multiple threads
  • Sequential parts executed by single thread
  • Dependences in parallel parts require
    synchronization between threads

4
Recall Basic Idea: How OpenMP Works
  • User must decide what is parallel in the program
  • Makes any changes needed to the original source code
  • E.g. to remove any dependences in parts that
    should run in parallel
  • User inserts directives telling the compiler how
    statements are to be executed
  • What parts of the program are parallel
  • How to assign code in parallel regions to threads
  • Specifies data-sharing attributes: shared,
    private, threadprivate

5
How The User Interacts with Compiler
  • Compiler generates explicit threaded code
  • Shields user from many details of the
    multithreaded code
  • Compiler figures out details of code each thread
    needs to execute
  • Compiler does not check that programmer
    directives are correct!!!
  • Programmer must be sure the required
    synchronization is inserted
  • The result is a multithreaded object program

6
OpenMP Compilers and Platforms
  • Intel C and Fortran Compilers from Intel
  • Intel IA32 Linux/Windows Systems
  • Intel Itanium-based Linux/Windows Systems
  • Fujitsu/Lahey Fortran, C and C++
  • Intel Linux Systems, Fujitsu Solaris Systems
  • HP HP-UX PA-RISC/Itanium, HP Tru64 Unix
  • Fortran/C/C++
  • IBM XL Fortran and C from IBM
  • IBM AIX Systems
  • Guide Fortran and C/C++ from Intel's KAI Software
    Lab
  • Intel Linux/Windows Systems
  • PGF77 / PGF90 Compilers from The Portland Group
    (PGI)
  • Intel Linux/Solaris/Windows/NT Systems
  • Freeware: Omni, OdinMP, OMPi, OpenUH...
  • Check information at http://www.compunity.org

7
Structure of a Compiler
  • Front End
  • Read in source program, ensure that it is
    error-free, build the intermediate
    representation (IR)
  • Middle End
  • Analyze and optimize program as much as possible.
    Lower IR to machine-like form
  • Back End
  • Determine layout of program data in memory.
    Generate object code for the target architecture
    and optimize it

8
OpenMP Implementation

9
OpenMP Implementation (cont)
  • If program is compiled sequentially
  • OpenMP comments and pragmas are ignored
  • If code is compiled for parallel execution
  • Comments and/or pragmas are read, and
  • Drive translation into parallel program
  • Ideally, one source for both sequential and
    parallel program (big maintenance plus)
  • Usually this is accomplished by choosing a
    specific compiler option

10
OpenMP Implementation (cont)
  • Transforms OpenMP programs into multi-threaded
    code
  • Figures out the details of the work to be
    performed by each thread
  • Arranges storage for different data and performs
    their initializations: shared, private...
  • Manages threads: creates, suspends, wakes up,
    terminates threads
  • Implements thread synchronization

11
Implementation-Defined Issues
  • OpenMP leaves some issues to the implementation
  • Default number of threads
  • Default schedule and default for
    schedule(runtime)
  • Number of threads to execute nested parallel
    regions
  • Behaviour in case of thread exhaustion
  • And many others....
  • Despite many similarities, each implementation is
    a little different from all others

12
Butterfly effect
  • The butterfly effect is a phrase that
    encapsulates the more technical notion of
    sensitive dependence on initial conditions in
    chaos theory. Small variations of the initial
    condition of a dynamical system may produce large
    variations in the long term behavior of the
    system
  • As the butterfly effect describes, we changed the
    parameters a little and got totally different
    results.

13
System Overview
  • The classical model assumes having a magnetic
    pendulum which is attracted by three magnets with
    each magnet having a distinct color.
  • The magnets are located underneath the pendulum
    on a circle centered at the pendulum mount-point.
    They are strong enough to attract the pendulum in
    a way that it will not come to rest in the center
    position

14
System Overview (cont)
15
Beeman Integration Algorithm
  • The formula used to compute the positions at time
    t + Δt is
  • and this is the formula used to update the
    velocities

16
(No Transcript)
17
Simulation results
  • Exp 1
  • Single core vs. dual core
  • Performance w.r.t. number of threads
  • Serial vs. parallel
  • 32 tests were conducted

18
(No Transcript)
19
  • Exp 2
  • Simulation when the number of magnets is changed
  • Simulation of the behavior of the pendulum
  • 5 tests were conducted

20
(No Transcript)
21
(No Transcript)
22
  • Exp 3
  • In this experiment, we simulate the pendulum in a
    field of 2 magnets with varying values of
    friction and gravitation forces.
  • A total of 63 simulations were run

23
(No Transcript)
24
  • Exp 4
  • In this experiment, we simulate the pendulum in a
    field of 3 magnets with varying values of
    friction and gravitation forces.
  • A total of 63 simulations were run

25
(No Transcript)
26
  • Exp 5
  • In this experiment, we simulate the pendulum in a
    field of 8 magnets with varying values of
    friction and gravitation forces.
  • A total of 26 simulations were run

27
(No Transcript)
28
Conclusion
  • Even though the hardware is available, effective
    programming is required to maximize code
    efficiency.
  • Complex simulations can be performed faster using
    parallel architecture.
  • OpenMP helps!
  • Simple: everybody can learn it in 11 weeks
  • Not so simple: don't stop learning! Keep learning
    it for better performance
