Programming Multicore Processors
1
Programming Multicore Processors
  • Aamir Shafi
  • High Performance Computing Lab
  • http://hpc.seecs.nust.edu.pk

2
Serial Computation
  • Traditionally, software has been written for
    serial computation
  • To be run on a single computer having a single
    Central Processing Unit (CPU)
  • A problem is broken into a discrete series of
    instructions

3
Parallel Computation
  • Parallel computing is the simultaneous use of
    multiple compute resources to solve a
    computational problem
  • Also known as High Performance Computing (HPC)
  • The prime focus of HPC is performance: the
    ability to solve the biggest possible problems in
    the least possible time

4
Traditional Usage of Parallel Computing: Scientific Computing
  • Traditionally, parallel computing has been used
    to solve challenging scientific problems by
    running simulations
  • For this reason, it is also called Scientific
    Computing or Computational Science

5
Emergence of Multi-core Processors
  • In the last decade, processor performance has no
    longer been enhanced by increasing clock speed
  • Increasing clock speed directly increases power
    consumption
  • Power is dissipated as heat, and it is not
    practical to cool processors beyond a point
  • Intel canceled a project to produce a 4 GHz
    processor!
  • This led to the emergence of multi-core
    processors
  • Performance is increased by adding processing
    cores that run at a lower clock speed
  • This implies better power usage

Disruptive Technology!
6
Moore's Law is Alive and Well
7
Power Wall
8
Why Multi-core Processors Consume Less Power
  • Dynamic power is proportional to V²fC
  • Increasing frequency (f) also requires raising
    the supply voltage (V), a more-than-linear effect
  • Adding cores increases capacitance (C), but this
    has only a linear effect

9
Software in the Multi-core Era
  • The challenge has been thrown to the software
    industry
  • Parallelism is perhaps the answer
  • "The Free Lunch Is Over: A Fundamental Turn
    Toward Concurrency in Software"
  • http://www.gotw.ca/publications/concurrency-ddj.htm
  • Some excerpts:
  • "The biggest sea change in software development
    since the OO revolution is knocking at the door,
    and its name is Concurrency"
  • This essentially means every software programmer
    will be a parallel programmer
  • This is the main motivation behind conducting this
    Programming Multicore Processors workshop

10
About the Programming Multicore Processors Workshop
11
Instructors
  • This workshop will be taught by:
  • Akbar Mehdi (http://hpc.seecs.nust.edu.pk/akbar/)
  • Master's from Stanford University, USA
  • NVIDIA CUDA API, POSIX Threads, Operating
    Systems, Algorithms
  • Mohsan Jameel (http://hpc.seecs.nust.edu.pk/mohsan/)
  • Master's from KTH, Sweden
  • Scientific Computing, Parallel Computing
    Languages, OpenMP

12
Course Contents: A Little Background on Parallel Computing Approaches
13
Parallel Hardware
  • Three main classifications:
  • Shared Memory Multi-processors
    • Symmetric Multi-Processors (SMP)
    • Multi-core Processors
  • Distributed Memory Multi-processors
    • Massively Parallel Processors (MPP)
    • Clusters (commodity and custom)
  • Hybrid Multi-processors
    • Mixture of shared and distributed memory
      technologies

14
First Type: Shared Memory Multi-processors
  • All processors have access to shared memory
  • Notion of a Global Address Space

15
Symmetric Multi-Processors (SMP)
  • An SMP is a parallel processing system with a
    shared-everything approach
  • The term signifies that each processor shares the
    main memory and possibly the cache
  • Typically an SMP has 2 to 256 processors
  • Also called Uniform Memory Access (UMA)
  • Examples include the AMD Athlon, AMD Opteron 200
    and 2000 series, Intel Xeon, etc.

16
Multi-core Processors
17
Second Type: Distributed Memory
  • Each processor has its own local memory
  • Processors communicate with each other by message
    passing over an interconnect

18
Cluster Computers
  • A group of PCs, workstations, or Macs (called
    nodes) connected to each other via a fast (and
    private) interconnect
  • Each node is an independent computer
  • Each cluster has one head-node and multiple
    compute-nodes
  • Users log on to the head-node and start parallel
    jobs on the compute-nodes
  • Two popular cluster classifications:
  • Beowulf Clusters (http://www.beowulf.org)
  • Rocks Clusters (http://www.rocksclusters.org)

19
Cluster Computer
(diagram: a cluster of nodes containing processors Proc 0 through Proc 7, connected by an interconnect)
20
Third Type: Hybrid
  • Modern clusters have hybrid architecture
  • Distributed memory for inter-node (between nodes)
    communications
  • Shared memory for intra-node (within a node)
    communications

21
SMP and Multi-core clusters
  • Most modern commodity clusters have SMP and/or
    multi-core nodes
  • Processors not only communicate via the
    interconnect; shared memory programming is also
    required
  • This trend is likely to continue
  • A new name, "constellations", has even been
    proposed

22
Classification of Parallel Computers
  • Parallel Hardware
    • Shared Memory Hardware
      • SMPs
      • Multicore Processors
    • Distributed Memory Hardware
      • Clusters
      • MPPs
In this workshop, we will learn how to program
shared memory parallel hardware (Parallel
Hardware → Shared Memory Hardware)
23
Writing Parallel Software
  • There are mainly two approaches to writing
    parallel software
  • The first approach is to use libraries (packages)
    written in already existing languages
  • Economical
  • The second, more radical approach is to provide
    new languages
  • Parallel computing has a history of novel
    parallel languages
  • These languages provide high-level parallelism
    constructs

24
Shared Memory Languages and Libraries
  • Designed to support parallel programming on
    shared memory platforms
  • OpenMP
  • Consists of a set of compiler directives, library
    routines, and environment variables
  • The runtime uses a fork-join model of parallel
    execution
  • Cilk
  • A design goal was to support asynchronous
    parallelism
  • A set of keywords:
  • cilk_for, cilk_spawn, cilk_sync
  • POSIX Threads (PThreads)
  • Threading Building Blocks (TBB)

25
Distributed Memory Languages and Libraries
  • Libraries
  • Message Passing Interface (MPI): the de facto
    standard
  • PVM
  • Languages
  • High Performance Fortran (HPF)
  • Fortran M
  • HPJava

26
Our Focus
  • Shared Memory and Multi-core Processor Machines
  • Using POSIX Threads
  • Using OpenMP
  • Using Cilk (covered briefly)
  • Disruptive Technology
  • Using Graphics Processing Units (GPUs) by NVIDIA
    for general-purpose computing

We are assuming that all of us know the C
programming language
27
Day One
Timings | Topic | Presenter
10:00 to 10:30 | Introduction to multicore computing | Aamir Shafi
10:30 to 11:30 | Background discussion: review of processes, threads, and architecture; speedup analysis | Akbar Mehdi
11:30 to 11:45 | Break
11:45 to 12:55 PM | Introduction to POSIX Threads | Akbar Mehdi
12:55 PM to 1:25 PM | Prayer break
1:25 PM to 2:30 PM | Practical Session: run a hello world PThreads program, introduce Linux, top, Solaris; also introduce the first coding assignment | Akbar Mehdi
28
Day Two
Timings | Topic | Presenter
10:00 to 11:00 | POSIX Threads continued | Akbar Mehdi
11:00 to 12:55 PM | Introduction to OpenMP | Mohsan Jameel
12:55 PM to 1:25 PM | Prayer break
1:25 PM to 2:30 PM | OpenMP continued: lab session | Mohsan Jameel
29
Day Three
Timings | Topic | Presenter
10:00 to 12:00 | Parallelizing the Image Processing Application using PThreads and OpenMP (Practical Session) | Akbar Mehdi and Mohsan Jameel
12:00 to 12:55 PM | Introduction to Intel Cilk | Aamir Shafi
12:55 PM to 1:25 PM | Prayer break
1:25 PM to 2:30 PM | Introduction to NVIDIA CUDA | Akbar Mehdi
2:30 PM to 2:35 PM | Concluding Remarks | Aamir Shafi
30
Learning Objectives
  • To become aware of the multicore revolution and
    its impact on the computer software industry
  • To program multicore processors using POSIX
    Threads
  • To program multicore processors using OpenMP and
    Cilk
  • To program Graphics Processing Units (GPUs) for
    general purpose computation (using NVIDIA CUDA
    API)

You may download the tentative agenda from
http://hpc.seecs.nust.edu.pk/aamir/res/mc_agenda.pdf
31
Next Session
  • Review of important and relevant Operating
    Systems and Computer Architecture concepts by
    Akbar Mehdi