On-line adaptive parallel prefix computation - PowerPoint PPT Presentation

1
On-line adaptive parallel prefix computation
  • Jean-Louis Roch, Daouda Traoré and Julien Bernard
  • Presented by Andreas Söderström, ITN

2
The prefix problem
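  • Given $X = x_1, x_2, \dots, x_n$, compute the $n$ products
    $p_k = x_1 \circ x_2 \circ \dots \circ x_k$ for $1 \le k \le n$, where $\circ$ is some
    associative operation
  • Example: $\circ = +$ (i.e. addition), $X = 1, 3, 5, 7$:
    $p_1 = 1$, $p_2 = 1 + 3 = 4$, $p_3 = 1 + 3 + 5 = 9$, $p_4 = 1 + 3 + 5 + 7 = 16$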
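The definition above can be sketched as a short sequential scan. This is a minimal illustration rather than code from the presentation; the function name `sequential_prefix` and the use of Python are assumptions.

```python
from operator import add

def sequential_prefix(xs, op=add):
    """Return [p_1, ..., p_n] where p_k = x_1 op x_2 op ... op x_k.

    `op` must be associative; the loop applies it exactly n - 1 times.
    """
    prefixes = []
    for x in xs:
        if prefixes:
            prefixes.append(op(prefixes[-1], x))
        else:
            prefixes.append(x)  # p_1 = x_1, no operation needed
    return prefixes

print(sequential_prefix([1, 3, 5, 7]))  # [1, 4, 9, 16]
```

The n - 1 applications of the operation are the sequential cost that the parallel variants on the following slides are compared against.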

3
Parallel prefix sum (first pass)
Step 3: 36
Step 2: 10 26
Step 1: 3 7 11 15
Step 0: 1 2 3 4 5 6 7 8

4
Parallel prefix sum (second pass)
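  • For every even position, use the value of the
    parent node
  • For every odd position $n$, compute $p_{n-1} + p_n$, i.e. add the
    prefix value at position $n-1$ to the node's own first-pass value
    (the first position keeps its own value)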
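The two passes can be sketched in a few lines. This is an illustrative, single-threaded sketch, not the presenter's code: the name `two_pass_prefix` is hypothetical, and the input length is assumed to be a power of two so that the tree is complete. Every node within a level depends only on the level above, so each inner loop could run in parallel.

```python
from operator import add

def two_pass_prefix(xs, op=add):
    """Tree-based prefix computation (sketch; len(xs) must be a power of two)."""
    n = len(xs)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two length"

    # First pass (bottom-up): each parent combines its two children.
    levels = [list(xs)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([op(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])

    # Second pass (top-down): turn first-pass values into prefix values.
    prefix = list(levels[-1])          # the root already holds the full result
    for level in reversed(levels[:-1]):
        new = []
        for i, value in enumerate(level):   # i is 0-based, position is i + 1
            if i % 2 == 1:
                new.append(prefix[i // 2])                   # even position: parent's value
            elif i == 0:
                new.append(value)                            # first position: unchanged
            else:
                new.append(op(prefix[(i - 1) // 2], value))  # odd position: left prefix + own value
        prefix = new
    return prefix

print(two_pass_prefix([1, 2, 3, 4, 5, 6, 7, 8]))
# [1, 3, 6, 10, 15, 21, 28, 36]
```

The first pass applies the operation n - 1 times and the second pass roughly n/2 times per level pair, which is why the tree algorithm performs more total operations than the sequential scan.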

Step 3: 36
Step 2: 10 36
Step 1: 3 10 21 36
Step 0: 1 3 6 10 15 21 28 36

5
Parallel prefix computation
  • Parallel time: $2n/p + O(\log n)$ for $p < n/\log n$
  • Lower bound for parallel time: $2n/(p+1)$ for
    $n > p(p+1)/2$
  • Assumes identical processors!
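As a rough worked example of these bounds, the following plugs in concrete numbers. The values $n = 1024$ and $p = 4$ and the base-2 logarithm are assumptions chosen only for illustration; they do not come from the presentation.

```latex
\begin{align*}
\frac{n}{\log_2 n} &= \frac{1024}{10} = 102.4 > p = 4, &
\frac{p(p+1)}{2} &= \frac{4 \cdot 5}{2} = 10 < n = 1024, \\
\frac{2n}{p} &= \frac{2048}{4} = 512, &
\frac{2n}{p+1} &= \frac{2048}{5} = 409.6 .
\end{align*}
```

Both side conditions hold for these values, so the leading term of the parallel time is 512 against a lower bound of about 410, compared with $n - 1 = 1023$ operations for a sequential prefix computation.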

6
Parallel prefix computation
  • Potential practical problems:
  • The processor setup may be heterogeneous
  • Processor load may vary due to other users
    computing on the same machine
  • A schedule that is optimal off-line may no longer
    be optimal!
  • Solution:
  • Use on-line scheduling!

7
The basic idea
  • Combine a sequentially optimal algorithm with
    fine-grained parallelism using work stealing

Figure: processes P0, P1, P2, ..., Pn, annotated "Steal work".

8
The algorithm
  • Sequential process Ps
  • The sequential process Ps starts working on $p_1, \dots, p_k$,
    i.e. value indices $[1, k]$, where indices
    $[k+1, m]$ have been stolen
  • When Ps reaches index $k$, it communicates $p_k$ to
    the parallel process Pv that has stolen $[k+1, m]$
    and retrieves the last index $n$ computed by Pv
    together with the local prefix result $r_n$
  • Ps uses associativity to calculate $p_n = p_k \circ r_n$
    and continues with the computation from index $n+1$
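The jump performed by Ps can be illustrated with a single-threaded simulation. This is a sketch under stated assumptions, not the authors' implementation: the function name `sequential_with_jump` is hypothetical, the thief's work is simulated in place, and the local result is taken to be the combination of the stolen values at indices k+1 through n.

```python
from operator import add

def sequential_with_jump(xs, k, n, op=add):
    """Simulate Ps: compute p_1..p_k, jump over indices k+1..n, continue from n+1.

    Indices are 1-based as on the slide, so xs[0] holds x_1.
    Returns a dict {index: prefix value} of the prefixes Ps itself produces.
    """
    assert 1 <= k < n <= len(xs)
    prefixes = {1: xs[0]}
    acc = xs[0]
    for i in range(2, k + 1):            # Ps works on indices 1..k
        acc = op(acc, xs[i - 1])
        prefixes[i] = acc

    # Simulated work of the thief Pv: r_n combines x_{k+1}, ..., x_n.
    r_n = xs[k]
    for i in range(k + 2, n + 1):
        r_n = op(r_n, xs[i - 1])

    acc = op(acc, r_n)                   # associativity: p_n = p_k op r_n
    prefixes[n] = acc
    for i in range(n + 1, len(xs) + 1):  # Ps continues from index n+1
        acc = op(acc, xs[i - 1])
        prefixes[i] = acc
    return prefixes

print(sequential_with_jump([1, 2, 3, 4, 5, 6, 7, 8], k=3, n=6))
# {1: 1, 2: 3, 3: 6, 6: 21, 7: 28, 8: 36}
```

The prefixes for the skipped indices 4 and 5 are missing from the result on purpose: they remain the responsibility of the process that stole them.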

9
The algorithm
  • Parallel process Pv
  • Pv scans for active processes (can be Ps or
    another Pv) and steals part of the work from that
    process.
  • Pv computes the local prefix operation on the
    stolen interval
  • The computation of Pv depends on a previous value
    and needs to be finalized when that value is known
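The three bullets above can be sketched as three small functions. This is an illustration only: the names `steal_back_half`, `local_prefix` and `finalize` are hypothetical, and stealing the second half of the victim's remaining interval is an assumed policy, since the slide only says that part of the work is stolen.

```python
from operator import add

def steal_back_half(lo, hi):
    """Assumed policy: the victim keeps [lo, mid], the thief takes [mid+1, hi].

    The stolen interval is empty (mid + 1 > hi) when lo == hi.
    """
    mid = (lo + hi) // 2
    return (lo, mid), (mid + 1, hi)

def local_prefix(xs, lo, hi, op=add):
    """Local results r_j = x_lo op ... op x_j for lo <= j <= hi (1-based)."""
    acc = xs[lo - 1]
    local = {lo: acc}
    for j in range(lo + 1, hi + 1):
        acc = op(acc, xs[j - 1])
        local[j] = acc
    return local

def finalize(local, p_before, op=add):
    """Once the prefix just before the interval is known, fix up every r_j."""
    return {j: op(p_before, r_j) for j, r_j in local.items()}

xs = [1, 2, 3, 4, 5, 6, 7, 8]
kept, stolen = steal_back_half(1, 8)   # (1, 4) and (5, 8)
local = local_prefix(xs, *stolen)      # {5: 5, 6: 11, 7: 18, 8: 26}
p_4 = 1 + 2 + 3 + 4                    # value the victim eventually provides
print(finalize(local, p_4))            # {5: 15, 6: 21, 7: 28, 8: 36}
```

Finalization costs one extra application of the operation per stolen element, so stolen work is roughly twice as expensive as work done by the sequential process.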

10
The algorithm
Figure: example execution with processes P0, P1 and P2; index labels 1, 2, 3 and 13, 14, 15, 16.

11
Performance
  • If a processor is or becomes slow, part of its
    work can be stolen by an idle processor
  • Asymptotic optimality (proof provided in the
    paper)

12
Performance
  • P homogeneous processors

13
Performance
  • P heterogeneous processors

14
Questions?