Title: UMass Lowell Computer Science 91.503 Analysis of Algorithms Prof. Karen Daniels Fall, 2001
1 UMass Lowell Computer Science 91.503 Analysis
of Algorithms Prof. Karen Daniels Fall, 2001
- Lecture 9
- Tuesday, 11/20/01
- Parallel Algorithms
- Chapters 28, 30
2 Relevant Sections of Chapters 28-30
Ch28 Sorting Networks
You're responsible for material in this chapter that we discuss in lecture. (Note that this includes all sections 28.1 - 28.5.)
Ch29 Arithmetic Circuits
You're not responsible for any of the material in this chapter. We will not be discussing it in lecture.
Ch30 Algorithms for Parallel Computers
You're responsible for material in this chapter that we discuss in lecture. (Note that this includes only sections 30.1 - 30.2.)
3 Overview
- Sorting Networks
  - Comparison Networks
  - 0-1 Principle
  - Bitonic Sorting Network
  - Merging Network
  - Sorting Network
- Algorithms for Parallel Computers
  - PRAM Model
  - Pointer Jumping
  - CRCW Algorithms vs. EREW Algorithms
4 Sorting Networks: Chapter 28
- Comparison Networks
- 0-1 Principle
- Bitonic Sorting Network
- Merging Network
- Sorting Network
5 Comparison Networks: Definition
A comparison network performs only comparisons.
Comparisons may occur in parallel.
A comparison network contains only comparators and wires.
[Figure: 2-input comparator with its input wires and output wires]
source 91.503 textbook Cormen et al.
6 Comparison Networks: Definition (continued)
Running Time
A comparator uses Θ(1) time.
Define time using wire depth.
The graph of interconnections must be acyclic.
An input wire has depth 0.
A comparator with input wire depths d_x, d_y has output wire depths max(d_x, d_y) + 1.
Depth of a comparison network = maximum depth of a comparator.
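The depth rule above can be sketched in a few lines. This is only an illustration: the function name `network_depth`, the representation of a network as a list of `(i, j)` wire-index pairs, and the 4-wire `example` network are assumptions made here, not part of the slides.

```python
def network_depth(n_wires, comparators):
    """Return the depth of a comparison network.

    comparators is a list of (i, j) wire-index pairs, listed in an order
    consistent with the acyclic interconnection graph (assumed layout).
    """
    depth = [0] * n_wires                  # every input wire has depth 0
    for i, j in comparators:
        d = max(depth[i], depth[j]) + 1    # output depth of this comparator
        depth[i] = depth[j] = d
    return max(depth)


# Hypothetical 4-wire network with 5 comparators.
example = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(network_depth(4, example))  # 3
```

The first two comparators share no wires, so they sit at depth 1 together, which is why 5 comparators give depth 3 rather than 5.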
7 Sorting Network: Definition
- Sorting Network
  - Comparison network for which the output sequence is monotonically increasing for every input sequence
[Figure: example sorting network]
8 Sorting Network Structure
Families of Comparison Networks
Recursive Structure
Parallel MergeSort Strategy
Sort n values in O(lg^2 n) time
9 0-1 Principle
If a sorting network works correctly for all inputs drawn from {0,1}, it works correctly on arbitrary input numbers.
This allows us to limit attention to {0,1} inputs.
The proof relies on function monotonicity.
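One practical reading of the principle: to test an n-wire network it is enough to try the 2^n inputs over {0,1} instead of all n! orderings. The sketch below assumes a network is a list of `(i, j)` wire pairs with the minimum routed to wire `i`; the names `apply_network`, `sorts_all_zero_one`, `sorts_all_permutations`, `good`, and `bad` are illustrative only.

```python
from itertools import permutations, product


def apply_network(comparators, values):
    """Run values through a comparison network given as (i, j) pairs, i < j."""
    out = list(values)
    for i, j in comparators:
        if out[i] > out[j]:        # comparator: min goes to wire i, max to wire j
            out[i], out[j] = out[j], out[i]
    return out


def sorts_all_zero_one(n, comparators):
    """True if every 0-1 input of length n comes out sorted."""
    return all(apply_network(comparators, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))


def sorts_all_permutations(n, comparators):
    """Exhaustive check on distinct values, for comparison only."""
    return all(apply_network(comparators, p) == sorted(p)
               for p in permutations(range(n)))


good = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]   # hypothetical 4-wire network
bad = good[:-1]                                    # same network, last comparator dropped

print(sorts_all_zero_one(4, good), sorts_all_permutations(4, good))  # True True
print(sorts_all_zero_one(4, bad), sorts_all_permutations(4, bad))    # False False
```

For both networks the 16-input 0-1 check and the 24-input permutation check agree, as the principle predicts.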
10 0-1 Principle (continued)
If f is monotonically increasing, then a comparator with inputs f(x), f(y) produces outputs f(min(x,y)), f(max(x,y)).
Induction on wire depth extends this from a single comparator to the whole network.
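The commuting property can be checked numerically on a small case. This is a sketch under assumptions: the helper names `apply_network` and `commutes`, the 4-wire `network` (which need not sort), and the sample functions are all chosen here for illustration.

```python
def apply_network(comparators, values):
    """Run values through a comparison network given as (i, j) pairs, i < j."""
    out = list(values)
    for i, j in comparators:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def commutes(comparators, values, f):
    """Check that applying f before the network equals applying f after it."""
    before = apply_network(comparators, [f(v) for v in values])
    after = [f(v) for v in apply_network(comparators, values)]
    return before == after


network = [(0, 2), (1, 3), (0, 1)]          # arbitrary network, not a sorter
a = [9, 4, 7, 1]

print(apply_network(network, a))            # [1, 7, 9, 4]
print(commutes(network, a, lambda v: v // 3))           # True
print(commutes(network, a, lambda v: 0 if v <= 4 else 1))  # True
```

The second function is a threshold map to {0,1}, the kind of monotonically increasing function the proof of the 0-1 principle uses.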
11 0-1 Principle (continued)
- Example applying Lemma 28.1
13 Sorting Network Structure
Families of Comparison Networks
- COMPARATORs
- HALF-CLEANERs
- BITONIC-SORTERs (Bitonic Sorting Networks)
- MERGERs (Merging Networks)
- SORTERs (Sorting Networks)
Recursive Structure
Parallel MergeSort Strategy
Sort n values in O(lg^2 n) time
14 Bitonic Sorting Network
- Bitonic Sequence
  - monotonically increases then monotonically decreases
  - or can be circularly shifted to conform to this
  - Example: <1, 4, 6, 8, 3, 2>
  - {0,1} bitonic sequence has structure
  - 0^i 1^j 0^k or 1^i 0^j 1^k
- Bitonic Sorter
  - comparison network that sorts bitonic {0,1} sequences
  - will be used to construct Sorting Network
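A bitonic sequence can be recognized by walking around it as a cycle and counting how often the direction of change flips: at most two flips means some circular shift rises and then falls. The function name `is_bitonic` and this particular test are assumptions for illustration, not taken from the slides.

```python
def is_bitonic(seq):
    """True if seq, or some circular shift of it, rises and then falls."""
    n = len(seq)
    signs = []
    for i in range(n):
        d = seq[(i + 1) % n] - seq[i]   # difference to the circular successor
        if d != 0:                      # equal neighbours do not change direction
            signs.append(d > 0)
    # signs[k - 1] with k = 0 wraps to the last entry, so the count is circular.
    changes = sum(signs[k] != signs[k - 1] for k in range(len(signs)))
    return changes <= 2


print(is_bitonic([1, 4, 6, 8, 3, 2]))   # True
print(is_bitonic([6, 8, 3, 2, 1, 4]))   # True (circular shift of the above)
print(is_bitonic([0, 1, 0, 1]))         # False
```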
15 Bitonic Sorting Network
- Bitonic Sorter uses HALF-CLEANERs
- HALF-CLEANER
  - comparison network of depth 1
  - input line i compared with line i + n/2 for i = 1, 2, ..., n/2
[Figure: sample inputs and outputs]
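A half-cleaner can be simulated directly from its description. The sketch below uses 0-based indices (line i is compared with line i + n/2 for i = 0, ..., n/2 - 1); the function name `half_cleaner` and the sample input are assumptions made here.

```python
def half_cleaner(seq):
    """Depth-1 network: compare line i with line i + n/2 (n even, 0-based)."""
    n = len(seq)
    half = n // 2
    out = list(seq)
    for i in range(half):              # the n/2 comparators act in parallel
        if out[i] > out[i + half]:
            out[i], out[i + half] = out[i + half], out[i]
    return out


print(half_cleaner([0, 0, 1, 1, 1, 0, 0, 0]))  # [0, 0, 0, 0, 1, 0, 1, 1]
```

On this bitonic 0-1 input the top half comes out all 0s (clean), the bottom half is still bitonic, and no top element exceeds any bottom element, which is the behaviour the bitonic sorter relies on.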
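18 Bitonic Sorting Network
BITONIC-SORTER[n] = HALF-CLEANER[n] followed by two copies of BITONIC-SORTER[n/2], one on each half of the outputs.
Recurrence for depth of BITONIC-SORTER[n]:
D(n) = 0 if n = 1
D(n) = D(n/2) + 1 if n = 2^k and k >= 1
Solution: D(n) = lg n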
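The recursive structure of the bitonic sorter (a half-cleaner followed by two half-size bitonic sorters) can be sketched as a value-level simulation. The names `half_cleaner`, `bitonic_sorter`, and `bitonic_depth` are assumptions for this sketch, and the input length is assumed to be a power of 2.

```python
def half_cleaner(seq):
    """Depth-1 network: compare line i with line i + n/2 (0-based)."""
    half = len(seq) // 2
    out = list(seq)
    for i in range(half):
        if out[i] > out[i + half]:
            out[i], out[i + half] = out[i + half], out[i]
    return out


def bitonic_sorter(seq):
    """Sort a bitonic sequence whose length is a power of 2."""
    if len(seq) <= 1:
        return list(seq)
    cleaned = half_cleaner(seq)
    half = len(seq) // 2
    # The two halves are independent, so a network handles them in parallel.
    return bitonic_sorter(cleaned[:half]) + bitonic_sorter(cleaned[half:])


def bitonic_depth(n):
    """Depth recurrence: one half-cleaner level per halving of n."""
    return 0 if n == 1 else bitonic_depth(n // 2) + 1


print(bitonic_sorter([1, 4, 6, 8, 7, 5, 3, 2]))  # [1, 2, 3, 4, 5, 6, 7, 8]
print(bitonic_depth(8))                           # 3
```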
20 Merging Network
Merge 2 sorted input sequences into 1 sorted output sequence.
Use a modification of BITONIC-SORTER.
KEY IDEA: For sorted input sequences X, Y, the sequence X Y^R (X followed by the reversal of Y) is bitonic.
Can merge X, Y using BITONIC-SORTER(X Y^R).
Challenge: perform the reversal implicitly.
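One way to perform the reversal implicitly, assumed here rather than taken from the slides, is to change only the first stage: compare position i of the concatenation X Y with position n - 1 - i (0-based) instead of i + n/2, then run an ordinary bitonic sorter on each half. The names `half_cleaner`, `bitonic_sorter`, and `merger` are illustrative, and both inputs are assumed to have the same power-of-2 length.

```python
def half_cleaner(seq):
    half = len(seq) // 2
    out = list(seq)
    for i in range(half):
        if out[i] > out[i + half]:
            out[i], out[i + half] = out[i + half], out[i]
    return out


def bitonic_sorter(seq):
    if len(seq) <= 1:
        return list(seq)
    cleaned = half_cleaner(seq)
    half = len(seq) // 2
    return bitonic_sorter(cleaned[:half]) + bitonic_sorter(cleaned[half:])


def merger(x, y):
    """Merge sorted x and y without reversing y explicitly."""
    seq = list(x) + list(y)
    n = len(seq)
    half = n // 2
    for i in range(half):
        j = n - 1 - i                  # mirrored partner replaces i + n/2
        if seq[i] > seq[j]:
            seq[i], seq[j] = seq[j], seq[i]
    # Both halves are now bitonic, so ordinary bitonic sorters finish the job.
    return bitonic_sorter(seq[:half]) + bitonic_sorter(seq[half:])


print(merger([1, 4, 6, 8], [2, 3, 5, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The mirrored first stage yields the same top half as half-cleaning X Y^R and the same bottom half in reverse order, which is still bitonic.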
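23 Sorting Network
SORTER[n] = two copies of SORTER[n/2] in parallel, followed by MERGER[n].
Recurrence for depth of SORTER[n]:
D(n) = 0 if n = 1
D(n) = D(n/2) + lg n if n = 2^k and k >= 1
Solution: D(n) = Θ(lg^2 n)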
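To see the depth recurrence at work, the whole sorter can be generated as an explicit comparator list and its depth measured. This sketch assumes n is a power of 2, represents a network as `(i, j)` wire pairs, and assumes the merger's first stage pairs wire i with wire n - 1 - i; all function names are illustrative.

```python
from itertools import product


def bitonic_sorter(lo, n):
    """Comparators of a bitonic sorter on wires lo .. lo + n - 1."""
    if n == 1:
        return []
    half = n // 2
    stage = [(lo + i, lo + i + half) for i in range(half)]       # half-cleaner
    return stage + bitonic_sorter(lo, half) + bitonic_sorter(lo + half, half)


def merger(lo, n):
    if n == 1:
        return []
    half = n // 2
    stage = [(lo + i, lo + n - 1 - i) for i in range(half)]      # implicit reversal
    return stage + bitonic_sorter(lo, half) + bitonic_sorter(lo + half, half)


def sorter(lo, n):
    if n == 1:
        return []
    half = n // 2
    return sorter(lo, half) + sorter(lo + half, half) + merger(lo, n)


def apply_network(comparators, values):
    out = list(values)
    for i, j in comparators:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def network_depth(n_wires, comparators):
    depth = [0] * n_wires
    for i, j in comparators:
        depth[i] = depth[j] = max(depth[i], depth[j]) + 1
    return max(depth)


net = sorter(0, 8)
print(network_depth(8, net))                                     # 6
print(all(apply_network(net, b) == sorted(b)
          for b in product((0, 1), repeat=8)))                   # True
```

For n = 8 the measured depth is 1 + 2 + 3 = 6, matching D(n) = D(n/2) + lg n, and the 0-1 principle justifies checking only the 256 binary inputs.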
24 Algorithms for Parallel Computers: Chapter 30
- PRAM Model
- Pointer Jumping
- CRCW Algorithms vs. EREW Algorithms
25 PRAM Model
- Need a model for parallel computing
  - RAM model is serial
  - Sorting network (Ch28) too restrictive
- Popular model: PRAM
  - Parallel Random Access Machine
26 PRAM Model
- Memory Access Policies
- Common-CRCW model
  - When processors write simultaneously to the same memory location, they write the same value
  - Alternatives
Section 30.1
Section 30.2
27 Pointer Jumping: List Ranking
List Ranking Problem: Given a singly-linked list of n objects, compute, for each object, its distance from the end of the list.
Correctness Invariant: At the start of each iteration of the while loop, for each object i, the sum of the d values for the sublist headed by i equals the correct distance d[i] from i to the end of the original list.
Running-Time Invariant: Each step of pointer jumping transforms each list into 2 interleaved lists (even, odd).
O(lg n) time
Work = time x processors
Θ(n lg n) work
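Pointer jumping for list ranking can be simulated sequentially by reading all old values before writing any new ones, which mimics one synchronous PRAM step. The function name `list_rank`, the array `nxt` of successor indices (None at the tail), and the returned round count are assumptions for this sketch.

```python
def list_rank(nxt):
    """Return (d, rounds): distance of each object from the end of the list."""
    n = len(nxt)
    nxt = list(nxt)
    d = [0 if nxt[i] is None else 1 for i in range(n)]
    rounds = 0
    while any(p is not None for p in nxt):
        new_d, new_nxt = list(d), list(nxt)   # synchronous step: read old, write new
        for i in range(n):                    # "for each processor i, in parallel"
            if nxt[i] is not None:
                new_d[i] = d[i] + d[nxt[i]]
                new_nxt[i] = nxt[nxt[i]]      # pointer jump
        d, nxt = new_d, new_nxt
        rounds += 1
    return d, rounds


# Hypothetical 5-object list stored out of order: 3 -> 0 -> 4 -> 1 -> 2
print(list_rank([4, 2, None, 0, 1]))  # ([3, 1, 0, 4, 2], 3)
```

Five objects finish in 3 rounds, and each round uses one processor per object, which is where the Θ(n lg n) work figure comes from.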
28 Pointer Jumping: Prefix
Start with x[i] = x_k in each object i of the list, where i is the k-th object from the head.
Correctness Invariant: At the end of the t-th iteration of the while loop, the k-th processor stores [max(1, k - 2^t + 1), k], where [i, j] denotes x_i ⊗ x_(i+1) ⊗ ... ⊗ x_j.
Differences from LIST-RANK
At each step, if we perform a prefix computation on each existing list, each object obtains its correct value.
O(lg n) time
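The prefix computation differs from list ranking in that each object pushes its value forward to its successor rather than pulling from it. The sketch below simulates synchronous steps; the function name `list_prefix`, the `nxt` successor array, and the default operator are assumptions made for illustration. String concatenation is used in the example because it is associative but not commutative, so operand order matters.

```python
import operator


def list_prefix(x, nxt, op=operator.add):
    """Prefix computation over a linked list by pointer jumping."""
    n = len(x)
    y, nxt = list(x), list(nxt)
    while any(p is not None for p in nxt):
        new_y, new_nxt = list(y), list(nxt)   # read old values, write new ones
        for i in range(n):                    # "for each processor i, in parallel"
            if nxt[i] is not None:
                # Each object has at most one predecessor, so this write is exclusive.
                new_y[nxt[i]] = op(y[i], y[nxt[i]])
                new_nxt[i] = nxt[nxt[i]]
        y, nxt = new_y, new_nxt
    return y


print(list_prefix(["a", "b", "c", "d", "e"], [1, 2, 3, 4, None]))
# ['a', 'ab', 'abc', 'abcd', 'abcde']
```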
29 Pointer Jumping: Euler Tour
Problem: Compute the depth of each node in an n-node binary tree.
1) Construct Euler Tour of a graph (cycle traversing each edge exactly once): O(1) time, 3 processors per node
2) Initialize a value for each processor
3) Parallel prefix computation: O(lg n) time
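The three steps can be sketched as follows. Several details here are assumptions not stated on the slide: the three processors of a node are called A, B, C; they are initialized to +1, 0, -1; the prefix operation is ordinary addition; and a node's depth is read from its C processor. The function name `tree_depths` and the array-based tree representation are also illustrative.

```python
def tree_depths(left, right, parent, root):
    """Depth of every node; left/right/parent hold node indices or None."""
    n = len(left)
    A, B, C = 0, 1, 2                      # assumed names for the 3 processors per node

    def pid(node, kind):
        return 3 * node + kind

    nxt = [None] * (3 * n)                 # successor of each processor in the tour
    val = [0] * (3 * n)
    for i in range(n):                     # steps 1 and 2: O(1) per node, in parallel
        val[pid(i, A)], val[pid(i, B)], val[pid(i, C)] = 1, 0, -1
        nxt[pid(i, A)] = pid(left[i], A) if left[i] is not None else pid(i, B)
        nxt[pid(i, B)] = pid(right[i], A) if right[i] is not None else pid(i, C)
        if i != root:
            p = parent[i]
            nxt[pid(i, C)] = pid(p, B) if left[p] == i else pid(p, C)

    y = list(val)                          # step 3: prefix sums by pointer jumping
    while any(p is not None for p in nxt):
        new_y, new_nxt = list(y), list(nxt)
        for i in range(3 * n):
            if nxt[i] is not None:
                new_y[nxt[i]] = y[i] + y[nxt[i]]
                new_nxt[i] = nxt[nxt[i]]
        y, nxt = new_y, new_nxt
    return [y[pid(i, C)] for i in range(n)]


# Hypothetical tree: 0 is the root with children 1 (left) and 2 (right); 3 is the left child of 1.
print(tree_depths([1, 3, None, None], [2, None, None, None], [None, 0, 0, 1], 0))
# [0, 1, 1, 2]
```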
30 CRCW vs. EREW Algorithms
- Problem where concurrent reads help
  - Find identities of tree roots in a forest
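A pointer-jumping sketch of root finding shows where concurrent reads arise: several nodes may share the same parent and read its `root` and `parent` entries in the same step. The function name `find_roots` and the `parent` array representation (None at a root) are assumptions for this illustration.

```python
def find_roots(parent):
    """Return root[i], the index of the root of the tree containing node i."""
    n = len(parent)
    parent = list(parent)
    root = [i if parent[i] is None else None for i in range(n)]
    while any(p is not None for p in parent):
        new_root, new_parent = list(root), list(parent)
        for i in range(n):                       # "for each processor i, in parallel"
            if parent[i] is not None:
                new_root[i] = root[parent[i]]    # siblings read the same cell concurrently
                new_parent[i] = parent[parent[i]]
        root, parent = new_root, new_parent
    return root


# Hypothetical forest: tree rooted at 0 with nodes 1, 2, 3; chain 4 <- 5 <- 6.
print(find_roots([None, 0, 0, 1, None, 4, 5]))  # [0, 0, 0, 0, 4, 4, 4]
```

Because the parent pointers double their reach each round, the number of rounds depends on the maximum tree depth rather than on the number of nodes.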
31 CRCW vs. EREW Algorithms
- Problem where concurrent writes help
  - Find maximum element in array of real numbers
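A constant-time maximum on a common-CRCW PRAM can be sketched with one processor per pair (i, j). The nested loops below stand in for n^2 processors acting in a single step; the function name `fast_max` and the flag array `is_max` are assumptions for this illustration.

```python
def fast_max(a):
    """Simulate a common-CRCW maximum: O(1) parallel time with n^2 processors."""
    n = len(a)
    is_max = [True] * n
    for i in range(n):                # conceptually, all (i, j) pairs run at once
        for j in range(n):
            if a[i] < a[j]:
                is_max[i] = False     # concurrent writers all store the same value
    result = None
    for i in range(n):                # every surviving i writes the same maximum
        if is_max[i]:
            result = a[i]
    return result


print(fast_max([3.5, -1.0, 7.25, 7.25, 0.0]))  # 7.25
```

Both write phases satisfy the common-CRCW rule: processors that write to the same location always write the same value, even when the maximum occurs more than once.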