1
Chapter 6 Multiprocessor System
2
Introduction
  • Each processor in a multiprocessor system can be
    executing a different instruction at any time.
  • The major advantages of MIMD systems:
  • Reliability
  • High performance
  • The overheads involved with MIMD:
  • Communication between processors
  • Synchronization of the work
  • Waste of processor time if any processor runs out
    of work to do
  • Processor scheduling

3
Introduction (continued)
  • Task
  • An entity to which a processor is assigned
  • A program, a function, or a procedure in execution
  • Process
  • Another word for a task
  • Processor (or processing element)
  • The hardware resource on which tasks are executed

4
Introduction (continued)
  • Thread
  • The sequence of tasks performed in succession by
    a given processor
  • The path of execution of a processor through a
    number of tasks.
  • Multiprocessors provide for the simultaneous
    presence of a number of threads of execution in
    an application.
  • Refer to Example 6.1 (degree of parallelism = 3).

5
R-to-C ratio
  • A measure of how much overhead is produced per
    unit of computation.
  • R = the run time of the task (its computation
    time)
  • C = the communication overhead
  • This ratio signifies task granularity
  • A high R-to-C ratio implies that communication
    overhead is insignificant compared to computation
    time.
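  • A worked example (illustrative numbers, not from
    the text): a task that computes for R = 100 ms and
    spends C = 5 ms communicating has an R-to-C ratio
    of 20, so overhead is negligible; with R = 2 ms
    and C = 5 ms the ratio is 0.4, and communication
    dominates the useful work.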

6
Task granularity
  • Task granularity
  • Coarse grain parallelism
  • High R-to-C ratio
  • Fine grain parallelism
  • Low R-to-C ratio
  • The general tendency toward maximum performance is
    to resort to the finest possible granularity →
    providing for the highest degree of parallelism.
  • Maximum parallelism does not, however, lead to
    maximum performance, since overhead also grows →
    a trade-off is required to reach an optimum level.

7
6.1 MIMD Organization (Figure 6.2)
  • Two popular MIMD organizations
  • Shared memory (or tightly coupled ) architecture
  • Message passing (or loosely coupled) architecture
  • Shared memory architecture
  • UMA (uniform memory access)
  • Rapid memory access
  • Memory contention

8
6.1 MIMD Organization (continued)
  • Message-passing architecture
  • Distributed memory MIMD system
  • NUMA (nonuniform memory access)
  • Heavy communication overhead for remote memory
    access
  • No memory contention problem
  • Other models
  • A mix of the two

9
6.2 Memory Organization
  • Two parameters of interest in MIMD memory system
    design
  • Bandwidth
  • Latency
  • Memory latency is reduced by increasing the
    memory bandwidth.
  • By building the memory system with multiple
    independent memory modules (banked and interleaved
    memory architectures; a sketch follows this list)
  • By reducing the memory access and cycle times
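
A minimal sketch of low-order interleaving (the module count and
helper names below are illustrative, not from the text):

    /* Low-order interleaving: with N independent modules, consecutive
       addresses fall in different modules, so sequential accesses can
       overlap and the effective memory bandwidth rises. */
    #define NMODULES 4   /* illustrative module count */

    static inline int module_of(unsigned addr) { return addr % NMODULES; }
    static inline int offset_of(unsigned addr) { return addr / NMODULES; }

Consecutive words 0, 1, 2, 3 then land in modules 0, 1, 2, 3 and can
be fetched concurrently.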

10
Multi-port memories
  • Figure 6.3 (b)
  • Each memory module is a three-port memory device.
  • All three ports can be active simultaneously.
  • The only restriction is that only one port can
    write into a given memory location at a time.

11
Cache incoherence
  • The problem wherein the value of a data item is
    not consistent throughout the memory system.
  • Write-through
  • A processor updates the cache and also the
    corresponding entry in the main memory.
  • Updating protocol
  • Invalidating protocol
  • Write-back
  • An updated cache-block is written back to the
    main memory just before that block is replaced in
    the cache.

12
6.2 Memory Organization (continued)
  • Cache coherence schemes
  • Do not use private caches (Figure 6.4)
  • Use a private-cache architecture, but cache only
    non-sharable data items
  • Cache flushing
  • Shared data are allowed to be cached only when it
    is known that only one processor will be accessing
    the data

13
6.2 Memory Organization (continued)
  • Cache coherence schemes (continued)
  • Bus watching (or bus snooping) (Figure 6.5)
  • Bus watching schemes incorporate hardware in each
    processor's cache controller that monitors the
    shared bus for data LOAD and STORE operations (a
    minimal sketch follows this list).
  • Write-once
  • The first STORE causes a write-through to the
    main memory.
  • Ownership protocol
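
A minimal sketch in C of an invalidating write-through snoop, with a
toy direct-mapped cache per processor; the structure and names are
illustrative, not the text's Figure 6.5:

    #include <stdbool.h>
    #include <stdio.h>

    #define NCACHES 4    /* one toy cache per processor */
    #define NWORDS  8    /* toy address space           */

    typedef struct { int data; bool valid; } Line;

    static int  memory[NWORDS];           /* shared main memory */
    static Line cache[NCACHES][NWORDS];

    /* STORE, write-through plus invalidate: update the local cache and
       main memory, then let every other cache "watch the bus" and
       invalidate its copy of the addressed word. */
    static void store(int cpu, int addr, int value) {
        cache[cpu][addr] = (Line){ value, true };
        memory[addr] = value;                            /* write-through */
        for (int c = 0; c < NCACHES; c++)
            if (c != cpu) cache[c][addr].valid = false;  /* snoop */
    }

    /* LOAD: a hit returns the cached word; a miss, or a line that a
       snoop invalidated, re-fetches the fresh value from memory. */
    static int load(int cpu, int addr) {
        if (!cache[cpu][addr].valid)
            cache[cpu][addr] = (Line){ memory[addr], true };
        return cache[cpu][addr].data;
    }

    int main(void) {
        store(0, 3, 42);               /* CPU 0 writes word 3       */
        printf("%d\n", load(1, 3));    /* CPU 1 reads 42, not stale */
        return 0;
    }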

14
6.3 Interconnection Network
  • Bus (Figure 6.6)
  • Bus window (Figure 6.7(a))
  • Fat tree (Figure 6.7 (b))
  • Loop or ring
  • token ring standard
  • Mesh

15
6.3 Interconnection Network (continued)
  • Hypercube
  • Routing is straightforward (a sketch follows this
    list).
  • The number of nodes must be increased by powers
    of two.
  • Crossbar
  • It offers multiple simultaneous communications
    but at a high hardware complexity.
  • Multistage switching networks
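
As a hedged illustration of why hypercube routing is straightforward
(dimension-order, or "e-cube", routing; the function is mine, not the
text's):

    #include <stdio.h>

    /* XOR of the current and destination node numbers marks the
       differing address bits; correcting one bit per hop delivers the
       message in at most log2(N) hops. */
    static void route(unsigned src, unsigned dst, unsigned dims) {
        unsigned node = src;
        printf("%u", node);
        for (unsigned d = 0; d < dims; d++)
            if ((node ^ dst) & (1u << d)) {  /* bit d differs         */
                node ^= 1u << d;             /* hop along dimension d */
                printf(" -> %u", node);
            }
        printf("\n");
    }

    int main(void) {
        route(0, 5, 3);   /* 3-cube: prints 0 -> 1 -> 5 */
        return 0;
    }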

16
6.4 Operating System Considerations
  • The major functions of the multiprocessor
    operating system
  • Keeping track of the status of all the resources
    at all times
  • Assigning tasks to processors in a justifiable
    manner
  • Spawning and creating new processes such that
    they can be executed in parallel or independently
    of each other
  • Collecting their individual results when all the
    spawned processes are completed and passing them
    to other processors as required

17
6.4 Operating System Considerations (continued)
  • Synchronization mechanisms
  • Processes in an MIMD system operate in a
    cooperative manner, and a sequence control
    mechanism is needed to ensure the ordering of
    operations.
  • Processes compete with each other to gain access
    to shared data items.
  • An access control mechanism is needed to maintain
    orderly access.

18
6.4 Operating System Considerations (continued)
  • Synchronization mechanisms
  • The most primitive synchronization techniques (two
    are sketched after this list)
  • Test-and-set
  • Semaphores
  • Barrier synchronization
  • Fetch-and-add
  • Heavy-weight process and Light-weight process
  • Scheduling
  • Static
  • Dynamic load balancing
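
A minimal sketch, using C11 atomics, of two of the primitives above:
a test-and-set spin lock and a fetch-and-add arrival counter. This is
an illustration, not the text's implementation:

    #include <stdatomic.h>

    static atomic_flag lock    = ATOMIC_FLAG_INIT;
    static atomic_int  arrived = 0;

    /* Test-and-set spin lock: loop until the flag was previously
       clear, i.e. until this caller is the one that set it. */
    void acquire(void) {
        while (atomic_flag_test_and_set(&lock))
            ;                      /* busy-wait */
    }

    void release(void) {
        atomic_flag_clear(&lock);
    }

    /* Fetch-and-add: atomically bump the arrival count and return the
       old value; the caller that sees nprocs - 1 is the last process
       to reach the barrier point. */
    int arrive(void) {
        return atomic_fetch_add(&arrived, 1);
    }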

19
6.5 Programming
  • Four main structures of parallel programming
  • Parbegin / parend
  • Fork / join (a sketch follows this list)
  • Doall
  • Processes, tasks, procedures, and so on can be
    declared for parallel execution.
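
A hedged sketch of the fork/join structure using POSIX threads (a
doall over loop iterations looks the same, one thread per iteration);
the worker body is illustrative:

    #include <pthread.h>
    #include <stdio.h>

    #define NWORKERS 4

    static void *body(void *arg) {
        long i = (long)arg;                  /* one parallel task */
        printf("worker %ld running\n", i);
        return NULL;
    }

    int main(void) {
        pthread_t t[NWORKERS];
        for (long i = 0; i < NWORKERS; i++)  /* fork */
            pthread_create(&t[i], NULL, body, (void *)i);
        for (int i = 0; i < NWORKERS; i++)   /* join: wait for all */
            pthread_join(t[i], NULL);
        return 0;
    }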

20
6.6 Performance Evaluation and Scalability
  • Performance evaluation
  • Speed-up S = Ts / Tp, where Ts is the serial run
    time and Tp the run time on P processors
  • Total overhead To = Tp·P - Ts → Tp = (To + Ts)/P
  • S = Ts·P / (To + Ts)
  • Efficiency E = S/P = Ts/(Ts + To) = 1/(1 + To/Ts)
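  • A worked example (illustrative numbers): with
    Ts = 100, P = 8, and To = 60, Tp = (60 + 100)/8 =
    20, so S = 100/20 = 5 and E = 5/8 =
    1/(1 + 60/100) = 0.625.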

21
Scalability
  • Scalability: the ability to increase speedup as
    the number of processors increases.
  • A parallel system is scalable if its efficiency
    can be maintained at a fixed value by increasing
    the number of processors as the problem size
    increases.
  • Time-constrained scaling
  • Memory-constrained scaling

22
Isoefficiency function
  • E = 1/(1 + To/Ts)
  • → To/Ts = (1 - E)/E
  • Hence, Ts = E·To/(1 - E)
  • For a given value of E, E/(1 - E) is a constant,
    K.
  • Then Ts = K·To (the isoefficiency function)
  • A small isoefficiency function indicates that
    small increments in problem size are sufficient to
    maintain efficiency when P is increased.
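  • A worked example (illustrative): to hold E = 0.8,
    K = E/(1 - E) = 4, so serial work must grow as
    Ts = 4·To; if an algorithm's total overhead grows
    as Θ(P log P), the problem size must grow at that
    same rate to keep E at 0.8.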

23
6.6 Performance Evaluation and Scalability
(continued)
  • Performance models
  • The basic model (a hedged sketch follows this
    list)
  • Each task is equal and takes R time units to be
    executed on a processor.
  • If two tasks on different processors wish to
    communicate with each other, they do so at a cost
    of C time units.
  • Model with linear communication overhead
  • Model with overlapped communication
  • Stochastic model
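
A heavily hedged sketch of evaluating the basic model; the accounting
(tasks spread evenly, k communications each costing C, not overlapped
with computation) is an assumption, and the text's exact model may
differ:

    /* Basic-model estimate: M equal tasks of R time units on P
       processors, plus k inter-processor communications of C time
       units each (assumed fully serialized). */
    double basic_model_time(int M, int P, double R, int k, double C) {
        int tasks_per_proc = (M + P - 1) / P;   /* ceiling division */
        return tasks_per_proc * R + k * C;
    }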

24
Examples
  • Alliant FX series
  • Figure 6.17
  • Parallelism
  • Instruction level
  • Loop level
  • Task level