Array Maximum Problem

About This Presentation

Title:

Array Maximum Problem

Slides: 63

Provided by: meirb

Transcript and Presenter's Notes

Title: Array Maximum Problem

1
Concurrent Programming
2
Model

The model we are discussing is a computer with multiple processors but only one memory unit.
All the processors are synchronized by the same clock.
The processors are all connected to each other and to the memory.
If more than one processor writes the same value to the same memory address at the same time, the value is written correctly. If the values differ, an arbitrary one of them is written.

3
Model

More than one processor can read the same memory address at the same time.
Other models:
The processors are on different computers.
There is no shared memory for all the processors.
The processors do not use the same clock.

4
Array Maximum Problem

On a computer with one processor:
Time: O(N).
Algorithm: Go over the array while keeping the maximum seen so far.
On a computer with K processors:
Time: O(N/K) for the local work, plus the time to combine the K partial results.
Algorithm: Each processor handles N/K elements of the array and computes their maximum. The maxima of the parts are then combined into the overall maximum.

5
Array Maximum Problem

On a computer with O(N) processors:
Time: O(log(N)).
Algorithm: In the first round N/2 processors each combine 2 items (take the larger of the two, or add them when computing a sum), so after the first round we have N/2 values. In the next round N/4 processors each combine 2 of those values, so only N/4 results remain after the second round. After log(N) rounds a single value remains: the maximum (or the sum) of the array.

6
Array Maximum Problem

Example: 8 elements (1 2 3 4 5 6 7 8).
Time: 3 = log2(8) rounds.
7
Array Maximum Problem

The number of computations performed is 7 (4 in the first round, 2 in the second and 1 in the last). This is the same number of computations as in the serial algorithm, but they are done in less time.
This algorithm works for many other functions besides Max, such as Min and Sum (and Avg, by dividing the sum by N).
It works for every associative function.
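
The round-by-round reduction can be sketched sequentially as below. This is a minimal illustration, not the presenter's code: the name tree_reduce is hypothetical, and the list comprehension stands in for the processors that would each handle one pair within the same round.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def tree_reduce(items: List[T], op: Callable[[T, T], T]) -> Tuple[T, int]:
    """Combine items pairwise, round by round; return (result, rounds)."""
    if not items:
        raise ValueError("items must be non-empty")
    level = list(items)
    rounds = 0
    while len(level) > 1:
        # Each adjacent pair would be handled by its own processor in this round.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # an unpaired item passes through unchanged
        level = nxt
        rounds += 1
    return level[0], rounds


print(tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], max))
print(tree_reduce([1, 2, 3, 4, 5, 6, 7, 8], lambda a, b: a + b))
```

Because only adjacent values are combined and their order is preserved, any associative operation can be passed as op.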

8
Finding The Two Greatest Numbers

Simple solution for O(N) processors:
Algorithm: Find the maximum, remove it from the array and find the maximum again.
Time: 2 Log(N).
Smart algorithm for O(N) processors:
Algorithm:
First round: each processor handles 2 items, finds the max and puts the other item in a candidate group for that max.
Rounds 2..log(N): each processor handles 2 results of the previous round. It compares the 2 max values and takes the larger as the new max. It then takes the candidate group of the new max and adds the max of the other group to it, giving the new candidate group.

9
Finding The Two Greatest Numbers

After the last round, the max that remains is the maximum of the array, and the second maximum is the maximum of its candidate group.
Sample:
Array: 7, 10, 1, 3, 100, 8, 55, 6.

10
Finding The Two Greatest Numbers
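Input:   7  10  1  3  100  8  55  6
Round 1: max 10 (candidates 7) | max 3 (candidates 1) | max 100 (candidates 8) | max 55 (candidates 6)
Round 2: max 10 (candidates 7, 3) | max 100 (candidates 8, 55)
Round 3: max 100 (candidates 8, 55, 10)
Results: The maximum is the maximum of the array (100) and the second maximum is the maximum of the candidate group (55).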
11
Finding The Two Greatest Numbers

Time:
Log(N) + LogLog(N).
Log(N) to find the first maximum and its candidate group.
LogLog(N) to find the maximum in the candidate group.
The candidate group grows by 1 in each round (the max of the other group is added), so at the end its size is Log(N).
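
A sequential sketch of the tournament with candidate groups follows. The function name two_greatest and the tuple representation (max, candidates) are assumptions made for illustration; each loop iteration in a round corresponds to one processor, and the final max over the candidates is done directly rather than by a second parallel reduction.

```python
from typing import List, Tuple


def two_greatest(values: List[int]) -> Tuple[int, int]:
    """Return (maximum, second maximum) using a tournament with candidate groups."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Each entry is (current max, values that lost directly to that max).
    level = [(v, []) for v in values]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            (m1, c1), (m2, c2) = level[i], level[i + 1]
            if m1 >= m2:
                nxt.append((m1, c1 + [m2]))  # m2 joins the winner's candidates
            else:
                nxt.append((m2, c2 + [m1]))
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # unpaired entry passes to the next round
        level = nxt
    best, candidates = level[0]
    return best, max(candidates)


print(two_greatest([7, 10, 1, 3, 100, 8, 55, 6]))
```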

12
Merge problem

Description: We have 2 sorted arrays B and C, each of size N, and we need to divide their elements into 2 new arrays A1 and A2 of size N, so that the N largest items from B and C together are in A1 and the N smallest are in A2.
Simple solution: We can merge B and C into one sorted array A and copy the N largest elements to A1 and the N smallest elements to A2. But this algorithm cannot use multiple processors, so the cost is still O(N).

13
Merge problem

Smart algorithm for O(N) processors:
Processor i compares B_i with C_{n+1-i}. The larger of the two goes to A1 and the other to A2.
Correctness proof:
If B_i > C_{n+1-i}, then B_i >= B_1..B_{i-1} (B is sorted) and B_i > C_{n+1-i} >= C_1..C_{n-i} (C is sorted), so B_i is at least as large as N elements (i - 1 from B and N - i + 1 from C), so B_i needs to be in A1.
If C_{n+1-i} > B_i, then by the same argument C_{n+1-i} is at least as large as N elements (N - i from C and i from B), so C_{n+1-i} needs to be in A1.

14
Merge problem

Example:
B = 1, 8, 10, 17
C = 9, 12, 67, 100
Pairs: (B_1, C_n), (B_2, C_{n-1}), (B_3, C_{n-2}), (B_4, C_{n-3}).
A1 = 100, 67, 12, 17.
A2 = 1, 8, 10, 9.
Time: We can do all the comparisons at the same time, so the cost is O(1).
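
The pairing can be sketched as below. The name split_largest_smallest is hypothetical, indices are 0-based (so the slide's pair B_i, C_{n+1-i} becomes b[i], c[n-1-i]), and the loop stands in for N processors that would each do one comparison at the same time.

```python
from typing import List, Tuple


def split_largest_smallest(b: List[int], c: List[int]) -> Tuple[List[int], List[int]]:
    """Split two ascending arrays of size N into (N largest, N smallest)."""
    n = len(b)
    if len(c) != n:
        raise ValueError("b and c must have the same length")
    a1 = [0] * n
    a2 = [0] * n
    for i in range(n):  # every iteration is independent: one processor each
        x, y = b[i], c[n - 1 - i]
        a1[i], a2[i] = (x, y) if x > y else (y, x)
    return a1, a2


print(split_largest_smallest([1, 8, 10, 17], [9, 12, 67, 100]))
```

Note that A1 and A2 are not themselves sorted; the algorithm only guarantees which array each element lands in.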

15
Prefix Problem

Description: Find the sums of all prefixes of the elements:
S_{1,1} = X_1
S_{1,2} = X_1 + X_2
...
S_{1,n} = X_1 + X_2 + ... + X_{n-1} + X_n
Simple solution: Compute each sum separately with N processors. Time O(N Log N): N sums, each taking O(Log N).

16
Prefix Problem

Algorithm:
for i = 0 to n-1 doip
    S_i = X_i
for j = 0 to log n do
    for i = 2^j to n-1 doip
        S_i = S_i + S_{i-2^j}
"doip" means "do in parallel" on the different processors.
At the end the results are in the array S.
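
A sequential simulation of this doubling scheme is sketched below. The name parallel_prefix_sums is hypothetical. The per-round snapshot (prev) is an assumption that models the synchronous machine: every processor reads the old values before any processor writes.

```python
from typing import List, Tuple


def parallel_prefix_sums(x: List[int]) -> Tuple[List[int], int]:
    """Return (prefix sums of x, number of doubling rounds used)."""
    s = list(x)
    n = len(s)
    step = 1  # step plays the role of 2^j
    rounds = 0
    while step < n:
        prev = list(s)  # snapshot: all reads happen before all writes
        for i in range(step, n):  # "doip": one processor per index i
            s[i] = prev[i] + prev[i - step]
        step *= 2
        rounds += 1
    return s, rounds


print(parallel_prefix_sums([1, 2, 3, 4, 5, 6, 7, 8]))
```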

17
Prefix Problem

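Example: With 8 numbers X_1..X_8, where S_{i,j} is X_i + X_{i+1} + ... + X_j.

Input:         X_1     X_2     X_3     X_4     X_5     X_6     X_7     X_8
After round 1: S_{1,1} S_{1,2} S_{2,3} S_{3,4} S_{4,5} S_{5,6} S_{6,7} S_{7,8}
After round 2: S_{1,1} S_{1,2} S_{1,3} S_{1,4} S_{2,5} S_{3,6} S_{4,7} S_{5,8}
After round 3: S_{1,1} S_{1,2} S_{1,3} S_{1,4} S_{1,5} S_{1,6} S_{1,7} S_{1,8}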
18
Prefix Problem

Time: Each round doubles the number of final results S_{1,i}, so after log(n) rounds we have all the results.
In order to use this algorithm each processor needs to be connected to log(n) other processors.

19
Prefix Problem

Usage example:
Problem: We have an arithmetic expression and we need to test whether its bracket arrangement is legal.
Algorithm: Create an array X with 1 for each ( and -1 for each ), and run the prefix algorithm. The results need to satisfy S_{1,1} = 1, S_{1,1}..S_{1,n-1} >= 0 and S_{1,n} = 0.
Time: With N processors O(log N): O(log N) for the prefix algorithm and O(1) for the test.
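
The bracket test can be sketched as follows. The name brackets_are_legal is hypothetical, itertools.accumulate is a sequential stand-in for the parallel prefix algorithm, and the check assumes the condition is that no prefix sum is negative and the last one is zero.

```python
from itertools import accumulate


def brackets_are_legal(expression: str) -> bool:
    """Check bracket balance through prefix sums of +1 / -1 values."""
    # Only brackets contribute to the array X; other characters are skipped.
    x = [1 if ch == "(" else -1 for ch in expression if ch in "()"]
    prefix = list(accumulate(x))  # stand-in for the parallel prefix step
    if not prefix:
        return True  # no brackets at all
    # Each comparison is independent, so the test is O(1) with N processors.
    return all(s >= 0 for s in prefix) and prefix[-1] == 0


print(brackets_are_legal("(a+b)*(c)"))
print(brackets_are_legal("())("))
```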

20
Partition Problem

Description: We have an array X in which some of the elements are signed (marked). We need to move all the signed elements to one array and the non-signed elements to another array.
Simple solution: Take 2 stacks; push the signed elements onto one stack and the non-signed elements onto the other. It takes O(N) time.
Simple solution 2: Take two indexes, one at the start of the array and one at the end. The first searches for a signed element and the second for a non-signed one; when both have found one, they exchange the items they point to and move on, until they meet. This also takes O(N) time, but it is more parallel.

21
Partition Problem

Smart algorithm for O(N) processors:
Create a new array B in which B_i = 1 if element i is signed, else B_i = 0.
Create an array C with the prefix sums of B, that is C_i = B_1 + B_2 + ... + B_i.
If X_i is signed then Y1[C_i] = X_i.
If X_i is not signed then Y2[i - C_i] = X_i.

22
Partition Problem

Example:

X  = 2, 4, 7, 8, 1, 3, 10, 12, 15
B  = 0, 1, 0, 0, 0, 1, 1, 0, 1
C  = 0, 1, 1, 1, 1, 2, 3, 3, 4
Y1 = 4, 3, 10, 15
Y2 = 2, 7, 8, 1, 12
23
Partition Problem

Time with O(N) processors:
Computing B: O(1).
Computing C: O(log(n)) using the prefix algorithm.
Computing Y1 and Y2: O(1).
Total: O(log(n)).
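
A sketch of the three steps (B, C, then Y1 and Y2) follows. The name partition_by_flag and the is_signed predicate are hypothetical, the prefix sums use itertools.accumulate as a sequential stand-in, and indices are 0-based, so the slide's Y1[C_i] becomes y1[c[i] - 1].

```python
from itertools import accumulate
from typing import Callable, List, Tuple


def partition_by_flag(
    x: List[int], is_signed: Callable[[int], bool]
) -> Tuple[List[int], List[int]]:
    """Stable split of x into (signed, non-signed) using prefix sums."""
    b = [1 if is_signed(v) else 0 for v in x]  # O(1) with N processors
    c = list(accumulate(b))  # prefix sums; O(log N) in parallel
    total = c[-1] if c else 0
    y1 = [0] * total
    y2 = [0] * (len(x) - total)
    for i, v in enumerate(x):  # every write goes to a distinct slot
        if b[i]:
            y1[c[i] - 1] = v
        else:
            y2[i - c[i]] = v  # i - c[i] non-signed items precede position i
    return y1, y2


marked = {4, 3, 10, 15}
print(partition_by_flag([2, 4, 7, 8, 1, 3, 10, 12, 15], lambda v: v in marked))
```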

24
Sorting Algorithm

Description: Sort array A using O(N^2) processors and put the result into array C.
Simple algorithm: The serial algorithm for sorting an array takes at least O(N log(N)) time.
Smart algorithm:
Create a matrix B of size N x N and initialize all its cells with zeroes.
We look at the N^2 processors as a matrix of processors. Processor P_{i,j} computes A_i >= A_j; if true then B_{i,j} = 1.
The elements of A are assumed to be distinct.

25
Sorting Algorithm

For each i from 1 to N: C[Sum(i)] = A_i, where Sum(i) is the sum of B_{i,1} to B_{i,N}.
Example: A = 3, 5, 2, 9, 1
Matrix B:
     1  2  3  4  5
1    1  0  1  0  1
2    1  1  1  0  1
3    0  0  1  0  1
4    1  1  1  1  1
5    0  0  0  0  1

26
Sorting Algorithm

C = 1, 2, 3, 5, 9.
Time:
Using O(N^2) processors, finding matrix B takes O(1) and finding C costs O(log(N)). So the total cost of the algorithm is O(log(N)).
Using O(N) processors, finding B takes O(N) time and finding C takes O(N) time, so the total is O(N).
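
The comparison-matrix sort can be sketched as below. The name rank_sort is hypothetical. The sketch assumes distinct elements and counts an element as "greater or equal" to itself, so the row sum of element i is its 1-based position in the sorted output.

```python
from typing import List


def rank_sort(a: List[int]) -> List[int]:
    """Sort distinct values by counting, for each one, how many values it is >= to."""
    n = len(a)
    # Processor P(i, j) fills one cell of B; all N^2 cells are independent.
    b = [[1 if a[i] >= a[j] else 0 for j in range(n)] for i in range(n)]
    c = [0] * n
    for i in range(n):
        rank = sum(b[i])  # row sum; a parallel reduction takes O(log N)
        c[rank - 1] = a[i]  # 1-based rank mapped to a 0-based slot
    return c


print(rank_sort([3, 5, 2, 9, 1]))
```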

27
Sorting Algorithm

Description: Sort array A using O(N^2) processors and put the result into array C.
Algorithm: Merge sort. The largest cost in the merge sort algorithm is the cost of the merge. With a serial algorithm the cost of merging 2 sorted arrays is O(N) and the cost of the merge sort algorithm is O(N log(N)). We use the regular algorithm but with a smarter merge algorithm.

28
Sorting Algorithm

Smart merge algorithm:
Description: We need to merge two sorted arrays A and B into a sorted array R.
Algorithm: We describe a recursive algorithm Merge.
C = Merge(odd(A), even(B)).
D = Merge(even(A), odd(B)).
Where odd(A) is all the items in A with an odd index, and even(A) is all the items in A with an even index (indices start at 1).

29
Sorting Algorithm

Let C = C_0, C_1, C_2, ..., C_n and D = D_0, D_1, D_2, ..., D_n.
E = C_0, D_0, C_1, D_1, ..., C_n, D_n.
Compare each pair C_i, D_i, and if C_i > D_i then swap C_i and D_i in array E.
Array E is then the merge of C and D.

30
Sorting Algorithm

Example:
A = 3, 5, 8, 10    B = 4, 7, 9, 12
Even(A) = 5, 10    Odd(A) = 3, 8
Even(B) = 7, 12    Odd(B) = 4, 9
C = 3, 7, 8, 12
D = 4, 5, 9, 10
E = 3, 4, 7, 5, 8, 9, 12, 10
After the swaps in E:
E = 3, 4, 5, 7, 8, 9, 10, 12
Time: Using O(N) processors the merge takes O(log(N)) time. The merge sort runs the merge algorithm on log(N) levels, so the total cost of the merge sort is O(log^2(N)).
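
A recursive sketch of this merge is given below. The name odd_even_merge is hypothetical, and the sketch assumes both inputs are sorted and have the same length, a power of two. With 0-based slices, the slide's odd(A) (1-based indices 1, 3, ...) is a[0::2] and even(A) is a[1::2]. C is built from odd(A) and even(B), as in the example.

```python
from typing import List


def odd_even_merge(a: List[int], b: List[int]) -> List[int]:
    """Merge two sorted lists of equal power-of-two length."""
    n = len(a)
    if n != len(b) or n == 0 or n & (n - 1):
        raise ValueError("inputs must have the same power-of-two length")
    if n == 1:
        return [min(a[0], b[0]), max(a[0], b[0])]
    # The two recursive merges are independent and could run in parallel.
    c = odd_even_merge(a[0::2], b[1::2])  # odd(A) with even(B)
    d = odd_even_merge(a[1::2], b[0::2])  # even(A) with odd(B)
    e: List[int] = []
    for ci, di in zip(c, d):  # interleave, swapping each out-of-order pair
        e.extend((ci, di) if ci <= di else (di, ci))
    return e


print(odd_even_merge([3, 5, 8, 10], [4, 7, 9, 12]))
```

All the pair comparisons in the last step are independent, which is what lets each level of the recursion finish in O(1) time with O(N) processors.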

31
Find Algorithm

Description: If array X contains the value Val then Res needs to be True, else Res needs to be False.
Simple algorithm: A serial algorithm takes O(N) time.
Smart algorithm: Using O(N) processors.
Res = False. Each processor i tests if X_i = Val; if true it sets Res = True.
Time: O(1).

32
Model Description

Many processors.
Processors can send messages to each other
through communication.
We want each processor to have a unique identification.
Since we have O(n) processors we need O(log n) bits to represent the Id.

33
Model Description

Clean Net: a network in which a processor doesn't know anything about its neighbors, not even their Ids. It only knows how many neighbors it has.
We will explicitly mention when we are dealing with a Clean Net; otherwise every processor has a unique Id.

34
Model Description

A message should include the sender and receiver Ids and some information - O(log n) bits in total.
If X wants to send a message to Y through Z, it costs 2 steps to send the message.

(Figure: X connected to Z, Z connected to Y.)
35
Model Description

Local computation doesn't take time.
We will analyze:
Time complexity - the number of steps the algorithm takes in the worst case.
Communication complexity - the total number of messages sent during the execution of the algorithm in the worst case.

36
Distributed vs. Sequential

Communication - needed in the distributed model but not in the sequential one.
Partial knowledge - together the processors know everything, but no single processor necessarily knows everything.
Processors or communication channels can be down.

37
Distributed vs. Sequential

Synchronization - we need to synchronize the processors.

38
Synchronous Model

There is a global clock.
In every clock cycle each processor:
- sends messages to its neighbors.
- receives messages from its neighbors.
- makes local computations in 0 time.
- changes state.

39
Asynchronous Model

There is no global clock.
If a message was sent it will eventually arrive at its destination (there are no failures), but we can't assume anything about the arrival time.
Time is measured from the beginning of the execution until the last processor stops.

40
Asynchronous Model

For time complexity calculations we assume that every message arrives within one time unit in the worst case.

41
Model Representation

We can represent the network of processors with a graph.
Each node in the graph is a processor.
There is an edge between two nodes if there is a
direct communication channel between the two
processors they represent.

42
Complexity

C(Π, G, I) - communication complexity: the total number of messages sent during the execution in the worst case.
T(Π, G, I) - time complexity: the number of clock cycles the execution takes in the worst case.
Where Π is the protocol, G is the graph and I is the input.

43
Complexity - examples

The following examples are on a complete graph with nodes 1, 2, ..., n.

44
Complexity - example 1

Protocol A: node 1 sends the message m to node 2.
C(A, G, I) = 1.
T(A, G, I) = 1.

45
Complexity - example 2

Protocol B: node 1 sends the message m_i to node i, for every i in G.
C(B, G, I) = n.
T(B, G, I) = 1.

46
Complexity - example 3

Protocol C: every node i in G sends the message m_i to node i+1.
C(C, G, I) = n.
T(C, G, I) = 1.

47
Complexity - example 4

Protocol D: node i sends the message m to node i+1 in cycle i (node 1 to node 2 in cycle 1, node 2 to node 3 in cycle 2, and so on).
C(D, G, I) = n.
T(D, G, I) = n.

48
Transmission Problem

Input: there is a message m in the node V0.
Output: the message m is written in all the nodes of the graph.
dG(x,y) - the length of the shortest path from x to y in graph G.
D = Diameter(G) = max over x,y in V of dG(x,y).

49
Algorithms for the Transmission Problem

Direct Delivery.
Spanning Tree.
DFS.
Flooding.

50
Direct Delivery

Based on the assumptions:
- there is a routing system, such that messages are sent along the shortest path.
- V0 knows the addresses of all other nodes in the graph.
V0 sends the message m n-1 times, each time to a different node.

51
DD Communication Complexity

V0 sends n-1 messages.
Each message takes O(D) steps.
C(DD, G, I) = O(nD).

52
DD Time Complexity

Under the assumptions:
1. synchronous model.
2. V0 sends one new message in every clock cycle.
There won't be collisions between messages, because messages travel along shortest paths, and therefore we can't have more than one message at a given distance from V0.

53
DD Time Complexity

The last message is sent in cycle n-1.
It takes O(D) steps for the last message to arrive.
T(DD, G, I) = O(n + D).

54
DD Time Complexity

We can show the same time complexity even without assumption 2.
If two messages in a node compete for the same edge, we send the message destined for the node with the smaller Id.
The message for node i, at time t, must be at distance t-i+1 from V0 (or already in Vi).

55
Spanning Tree

Assumptions: We have a spanning tree in the graph that all the nodes are aware of (each node knows which of its edges are part of the spanning tree).
Each node that receives the message sends it on its spanning tree edges.

56
Spanning Tree Complexity

We send the message once for each spanning tree edge.
C(ST) = n-1.
We need tree-depth rounds until the last node receives the message.
T(ST) = O(Depth(tree, V0)).
If we choose a BFS tree, T(ST) = O(D).

57
Building a Spanning Tree

If we don't have a spanning tree, we can build one using any algorithm A for Transmission.
Execute algorithm A.
Each node V chooses as its parent the node W from which it received the message for the first time.

58
Building a Spanning Tree

V informs W that W is its parent.
The edge E(W,V) is marked as a spanning tree edge.
Since the transmission algorithm delivers the message to all nodes, we know that all the nodes are in the spanning tree.
We have no cycles since each node V chooses only one parent.
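
The parent-selection rule can be sketched as below, using synchronous flooding as the transmission algorithm A. The name build_spanning_tree and the adjacency-dictionary graph are assumptions for illustration. When several copies of the message reach a node in the same round, this sketch picks the sender with the smallest Id, which is a tie-breaking choice not stated in the slides.

```python
from typing import Dict, List, Optional


def build_spanning_tree(graph: Dict[int, List[int]], root: int) -> Dict[int, Optional[int]]:
    """Return a parent map: each node's parent is the first node it heard from."""
    parent: Dict[int, Optional[int]] = {root: None}
    frontier = [root]
    while frontier:
        nxt: List[int] = []
        for w in sorted(frontier):  # smallest Id wins ties within a round
            for v in graph[w]:
                if v not in parent:  # first time v receives the message
                    parent[v] = w  # V tells W that W is its parent
                    nxt.append(v)
        frontier = nxt
    return parent


# A cycle on 4 nodes: 0 - 1 - 2 - 3 - 0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(build_spanning_tree(ring, 0))
```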

59
DFS

We traverse the graph in DFS order.
If we reach a new node we leave a copy of the message, mark the node and continue the traversal.
If we reach a marked node we go back.

60
DFS Complexity

In the DFS algorithm we traverse each edge exactly twice.
C(DFS) = T(DFS) = O(|E|).

61
Flooding

Each node that receives the message for the first time sends it to all of its neighbors.
When a node receives the message again, it just discards it.
Flooding is also effective in a Clean Net.

62
Flooding Complexity

On each edge the message passes twice, once in each direction.
C(Flood) = O(|E|).
After t time units the message has reached all the nodes whose distance from V0 is smaller than or equal to t.
T(Flood) = O(D).
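
A round-by-round simulation of flooding in the synchronous model is sketched below. The name flood and the adjacency-dictionary graph are assumptions for illustration. The function counts every message sent and the number of rounds until the last node first holds m.

```python
from typing import Dict, List, Tuple


def flood(graph: Dict[int, List[int]], source: int) -> Tuple[int, int]:
    """Return (messages sent, rounds until every reachable node has the message)."""
    informed = {source}
    frontier = [source]  # nodes that received m for the first time last round
    messages = 0
    rounds = 0
    while frontier:
        newly = set()
        for node in frontier:
            for neighbor in graph[node]:  # send to all neighbors, even the sender
                messages += 1
                if neighbor not in informed:
                    newly.add(neighbor)  # repeated copies are simply discarded
        if newly:
            rounds += 1
        informed |= newly
        frontier = sorted(newly)
    return messages, rounds


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(flood(path, 0))
print(flood(ring, 0))
```

In both sample graphs the message count equals 2|E| and the round count equals the distance from the source to the farthest node, which matches the bounds above.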
