Title: Applications of Probabilistic Quorums to Iterative Algorithms
1Applications of Probabilistic Quorums to Iterative Algorithms
- HyunYoung Lee, University of Denver
- Jennifer L. Welch, Texas A&M University
- presented at ICDCS 2001
2Outline
- The Probabilistic Quorum Algorithm (PQA)
- Abstracting PQA into Random Register (RR)
- Using RRs in Iterative Convergent Algorithms
- Monotone RRs and their Performance
- Simulation Results
- Conclusions
4Distributed Shared Memory
- Provides the illusion of shared variables for inter-process communication on top of a message-passing distributed system.
- Benefits of the shared memory paradigm:
- familiar from the uniprocessor case
- supports good software development practice
- Examples: TreadMarks [Amza], DASH [Gharachorloo], ...
5Distributed Shared Memory
[Figure: application processes 1 to r invoke read(Y) and write(X,3) on their clients and receive return(Y,5) and ack(X). Clients 1 to r and servers 1 to n exchange messages (send/recv) over the network, which together implement shared variables X, Y, Z, ...]
6Replicated Data with Quorums
- Keep a copy of the shared variable at n replica servers that communicate by messages.
- A quorum is a subset of the replica servers.
- To write: the client updates the copies in a quorum with the new value plus a timestamp.
- To read: the client receives copies from a quorum and returns the value with the latest timestamp.
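The write and read rules above can be sketched as follows. This is a minimal in-memory sketch, not the authors' implementation: the `Replica` class, the `write` and `read` functions, and the integer timestamps are hypothetical names chosen for illustration, and message passing is replaced by direct method calls.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    """One replica server's copy of the shared variable (hypothetical model)."""
    value: object = None
    timestamp: int = 0  # assumption: 0 means "never written"

def write(replicas, quorum, value, timestamp):
    """Update the copy at every replica in the write quorum."""
    for j in quorum:
        # Keep the existing copy if the replica already holds a later timestamp.
        if timestamp > replicas[j].timestamp:
            replicas[j].value = value
            replicas[j].timestamp = timestamp

def read(replicas, quorum):
    """Collect copies from the read quorum; return the one with the latest timestamp."""
    latest = max((replicas[j] for j in quorum), key=lambda r: r.timestamp)
    return latest.value, latest.timestamp

replicas = [Replica() for _ in range(5)]
write(replicas, quorum={0, 1, 2}, value=3, timestamp=1)
write(replicas, quorum={2, 3, 4}, value=7, timestamp=2)
fresh = read(replicas, quorum={1, 2})  # overlaps the latest write quorum at replica 2
stale = read(replicas, quorum={0, 1})  # misses the latest write quorum entirely
```

Here `fresh` carries the latest write, while `stale` carries the older one because its quorum does not intersect the quorum of the latest write.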
7Quorum Intersection
- To ensure each read obtains latest value written,
every read quorum must intersect every write
quorum.
[Figure: replica servers labeled 4,900, 10,800, 4,900, 4,900, and 12,700, with a write quorum and a read quorum marked.]
8Performance Measures for Quorum Systems
- Availability: minimum number of servers that must fail to disable every quorum [Peleg & Wool].
- Optimal (largest) availability is $\Theta(n)$.
- Achieved when every set of size $\lfloor n/2 \rfloor + 1$ is a quorum.
- Load: probability of accessing the busiest server, in the best case [Naor & Wool].
- Optimal (smallest) load is $\Theta(1/\sqrt{n})$.
- Tradeoff Theorem [Naor & Wool]: For any quorum system, if load is optimal, $\Theta(1/\sqrt{n})$, then availability is at most $O(\sqrt{n})$.
9Breaking the Tradeoff with Probabilistic Quorums [Malkhi, Reiter & Wright]
- Relax the requirement that every read quorum overlap every write quorum.
- Instead, choose each quorum uniformly at random from the set of all k-sized subsets of the n replica servers, for $k < n/2$.
- Theorem: If $k = \Theta(\sqrt{n})$, then
- availability is $n - k = \Theta(n)$
- load is $\Theta(1/\sqrt{n})$
- To handle server failures: keep trying until enough responses to form a quorum are received.
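Choosing quorums uniformly at random, and the resulting load, can be illustrated with a short sketch. The function names `random_quorum` and `empirical_load`, the seed, and the sizes n = 100, k = 10 are assumptions made for illustration only. The sketch relies on the fact that a fixed server belongs to a uniformly random k-subset with probability k/n, which equals $1/\sqrt{n}$ when $k = \sqrt{n}$.

```python
import random
from collections import Counter

def random_quorum(n, k, rng):
    """Pick a quorum uniformly at random among all k-sized subsets of n servers."""
    return frozenset(rng.sample(range(n), k))

def empirical_load(n, k, accesses, rng):
    """Fraction of accesses that hit the busiest server."""
    hits = Counter()
    for _ in range(accesses):
        hits.update(random_quorum(n, k, rng))
    return max(hits.values()) / accesses

rng = random.Random(2001)  # fixed seed so the sketch is repeatable
n, k = 100, 10             # k = sqrt(n)
quorum = random_quorum(n, k, rng)
load = empirical_load(n, k, accesses=20000, rng=rng)
expected_load = k / n      # each server is in a random quorum with probability k/n
```

With enough accesses, `load` should sit slightly above `expected_load`, since it measures the busiest server rather than an average one.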
10Probabilistic Quorums [MRW]
- Drawback: A read quorum might not overlap the most recent write quorum, causing a read to return an out-of-date value.
[Figure: a write quorum among the replica servers.]
- Theorem: Probability of not overlapping is $< e^{-h^2}$, when $k = h\sqrt{n}$.
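The bound can be checked numerically. The sketch below assumes both quorums are independent uniform k-subsets, in which case the exact probability that they are disjoint is $\binom{n-k}{k} / \binom{n}{k}$. The function names and the sample sizes are hypothetical.

```python
import math

def non_overlap_probability(n, k):
    """Exact probability that two independent uniform k-subsets of n servers are disjoint."""
    return math.comb(n - k, k) / math.comb(n, k)

def mrw_bound(n, k):
    """The bound e^{-h^2}, where h is defined by k = h * sqrt(n)."""
    h = k / math.sqrt(n)
    return math.exp(-h * h)

# (n, k, exact probability, bound) for a few illustrative sizes
rows = [(n, k, non_overlap_probability(n, k), mrw_bound(n, k))
        for n, k in [(100, 10), (100, 20), (400, 40)]]
```

In every row the exact probability stays below the bound, and doubling h from 1 to 2 shrinks both sharply.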
11Programming with PQs
- What are the semantics of the shared variable (register) implemented by the PQA?
- What kind of applications can tolerate reads returning, with low probability, out-of-date values?
13Definition of Random Register
- One writer and multiple readers.
- R1: Every read or write invocation has a response.
- R2: Every read R reads from some write W such that:
- (1) W begins before R ends.
- (2) R's value is the same as W's value.
- (3) W is the latest such write.
[Figure: a read R(c).]
14Definition of Random Register
- R3: For every finite execution ending with a write W, the probability that W is read from infinitely often is 0 (over all extensions with an infinite number of writes).
- Related work:
- Most work on randomized shared objects concerns termination, not correct responses.
- Afek and Jayanti assumed a fixed subset of shared objects that can return incorrect values.
15PQA Implements an RR
- Theorem 1: PQA implements an RR.
- Proof:
- R1: Each invocation gets a response, since no messages are lost and servers suffer only crash failures.
- R2: Each read reads a value written by a previous or overlapping write, since there is no data corruption.
16PQA Implements an RR
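- R3: Show that the probability that at least one replica in a write quorum is never overwritten is 0:
- $\Pr(\geq 1 \text{ replica survives } h \text{ writes})$
- $\leq k \cdot \Pr(\text{replica } j \text{ survives } h \text{ writes})$
- $= k \cdot \Pr(j \notin Q_1 \wedge \cdots \wedge j \notin Q_h)$
- $= k \cdot \prod_{i=1}^{h} \Pr(j \notin Q_i)$
- $= k \left( (n-k)/n \right)^h$
- $\to 0$ as $h \to \infty$.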
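To see how quickly the final expression in this derivation vanishes, it can be evaluated for growing h. The sizes n = 100, k = 10 and the function name `survival_bound` are assumptions for illustration.

```python
def survival_bound(n, k, h):
    """Union bound k * ((n - k) / n)**h that some replica of a write quorum survives h later writes."""
    return k * ((n - k) / n) ** h

n, k = 100, 10
bounds = [survival_bound(n, k, h) for h in (1, 10, 50, 100, 200)]
```

For small h the bound exceeds 1 and says nothing, but it decays geometrically in h, which is all the R3 argument needs.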
18Iterative Convergent Algorithms [Uresin & Dubois]
- Repeatedly apply a function to a vector to produce another vector until reaching a fixed point.
- Responsibility for vector components is distributed across several processes.
- Vector component updates are based on possibly out-of-date views of the vector components.
19Iterative Convergent Algorithms [UD]
- Requirements:
- A1: All views come from the past.
- A2: Every component is updated infinitely often.
- A3: Each view is used only finitely often.
[Figure: vector components plotted against time (0, 1, 2, 3). Red views are updated ones. Arrows indicate the views used in the last update.]
20Iterative Convergent Algorithms [UD]
- A1, A2, and A3 are equivalent to the existence of a partition of the update sequence into pseudocycles (p.c.'s), each with:
- at least one update per component, and
- every view used was created in the current or previous p.c.
[Figure: p.c. i-1 and p.c. i, with an X mark.]
21Asynchronously Contracting Operators
- Theorem [UD]: Sufficient condition on F for convergence to a fixed point, if the update sequence satisfies A1-A3: there exist an integer M and a sequence of sets $D_0, D_1, \ldots$ such that
- each $D_K$ is a Cartesian product of m sets (independence)
- $D_0 \supseteq D_1 \supseteq \cdots \supseteq D_M = D_{M+1} = \{\text{fixed point}\}$
- if $x \in D_K$, then $F(x) \in D_{K+1}$, for all K.
- Why? At the end of the K-th p.c., the computed vector is in $D_K$.
[Figure: nested sets of m-vectors $D_0, D_1, \ldots, D_{M-1}, D_M$, with the fixed point innermost.]
22Example: All Pairs Shortest Path
- G is a weighted directed graph with n nodes.
- Compute the n x n vector x; process i updates the i-th row of x, $1 \leq i \leq n$.
- Initially x is the adjacency matrix for G.
- F(x) computes y, where $y_{ij} = \min_{1 \leq k \leq n} (x_{ik} + x_{kj})$.
- Shown to be an ACO by [UD].
- Claim: Worst-case number of pseudocycles for F to converge is $\lceil \log_2 \mathrm{diameter}(G) \rceil$.
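The operator F can be sketched sequentially, with every view fully up to date, so that one application of F plays the role of one pseudocycle. The sketch assumes the adjacency matrix has 0 on the diagonal and infinity for missing edges, and it uses a unit-weight directed chain as a hypothetical input. The names `apply_F`, `iterate_to_fixed_point`, and `chain_adjacency` are illustrative.

```python
import math

INF = math.inf

def apply_F(x):
    """y[i][j] = min over k of x[i][k] + x[k][j] (one min-plus squaring step)."""
    n = len(x)
    return [[min(x[i][k] + x[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def iterate_to_fixed_point(x):
    """Apply F until nothing changes; return the fixed point and the number of changing steps."""
    steps = 0
    while True:
        y = apply_F(x)
        if y == x:
            return x, steps
        x, steps = y, steps + 1

def chain_adjacency(n):
    """Directed chain 0 -> 1 -> ... -> n-1 with unit weights (assumed input)."""
    x = [[INF] * n for _ in range(n)]
    for i in range(n):
        x[i][i] = 0
        if i + 1 < n:
            x[i][i + 1] = 1
    return x

dist, steps = iterate_to_fixed_point(chain_adjacency(34))
```

On the 34-node chain the longest shortest path has 33 edges, and each step doubles the path length covered, so `steps` matches $\lceil \log_2 33 \rceil$.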
23ACOs Correct with RRs
- Theorem 2: If F is an ACO, then every iterative execution using RRs for the vector components converges with probability 1.
- Proof: Show that the sequence of updates in the execution satisfies A1, A2, and A3 with probability 1.
- A1: All views are from the past, by R2.
- A2: The application ensures every component is updated infinitely often.
- A3: Holds with probability 1, since each view is used finitely often with probability 1, by R3.
24Implications
- RRs can be used to implement any ACO, which includes algorithms for:
- APSP
- transitive closure
- constraint satisfaction
- solving systems of linear equations
- If PQA is used for the RRs, improved load and availability are provided.
- Convergence is guaranteed with probability 1.
- But how long does it take to converge?
25Measuring Time with Rounds
- A round finishes when every process has, at least once:
- read all the vector components
- applied the function
- updated its own vector components
- How many (expected) rounds per p.c.?
- We don't know with the current RR definition, so modify the definition...
27Monotone RR Definition
- R1-R3, plus:
- R4: If a read R by process i reads from W and a later read R' by i reads from W', then W' does not precede W.
[Figure: writes W'(b) and W(c), reads R(c) and R'(b), with an X mark.]
28Monotone RR Definition (contd)
- R5: There exists q such that, for all r,
- $\Pr[\,r \text{ reads are needed until } W \text{ or a later write is read from}\,] \leq (1-q)^{r-1} q$.
- So q is the probability of a successful read (w.r.t. W).
29Monotone RR Algorithm
- Same as the previous probabilistic quorum algorithm, except:
- The read client keeps track of the value with the latest timestamp that it has seen so far.
- This value is returned if its timestamp is later than all those obtained from the current quorum.
- Theorem 3: Attains $q = 1 - \binom{n-k}{k} / \binom{n}{k}$.
- W's or a later value is read if a subsequent read quorum overlaps W's quorum.
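The client-side change for the monotone variant can be sketched as a small wrapper around whatever copies a read quorum returns. The class name `MonotoneReader`, the `(value, timestamp)` pair layout, and the convention that timestamp 0 means "nothing seen yet" are assumptions for illustration, not the authors' code.

```python
class MonotoneReader:
    """Read client that never returns a value older than one it has already seen."""

    def __init__(self):
        self.value = None
        self.timestamp = 0  # assumption: 0 means "nothing seen yet"

    def read(self, copies):
        """copies: (value, timestamp) pairs obtained from the current read quorum."""
        value, timestamp = max(copies, key=lambda c: c[1])
        if timestamp > self.timestamp:
            # The quorum returned something newer: remember it.
            self.value, self.timestamp = value, timestamp
        # Otherwise the remembered value is at least as recent as the quorum's.
        return self.value, self.timestamp

reader = MonotoneReader()
first = reader.read([("b", 2), ("c", 3)])   # quorum overlaps the write of c
second = reader.read([("a", 1), ("b", 2)])  # quorum misses it; the cached c is returned
```

The second read would have returned the older value b under the non-monotone algorithm. Here it returns c again, which is the behavior R4 requires.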
30Monotone RR Rounds per P.C.
- Theorem 4: Expected number of rounds per pseudocycle, when implementing an ACO with monotone RRs, is at most 1/q.
- Proof: For p.c. h to end, each process i must read from the first write in p.c. h-1 or a later write.
- Once this read occurs for i, every later read by i is at least as recent, by monotonicity.
- Expected rounds for the first such read is $\leq 1/q$, by R5.
31Messages vs. Rounds for ACOs
- Corollary: For monotone PQA, expected rounds per p.c. is $\leq \left(1 - ((n-k)/n)^k\right)^{-1}$.
- The expression is between 1 and 2 when $k = \sqrt{n}$.
- A strict quorum system has 1 round per p.c.
- Monotone PQA has $> 1$ expected round per p.c. but may have fewer messages per p.c.
- Which has better message complexity?
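The corollary's expression is easy to evaluate. The sketch below also compares it with 1/q under the assumption that q is the exact probability that two independent uniform k-quorums overlap, $1 - \binom{n-k}{k} / \binom{n}{k}$. The function names and the chosen values of n are hypothetical.

```python
import math

def overlap_probability(n, k):
    """Assumed q: probability that a random read quorum overlaps a given write quorum."""
    return 1 - math.comb(n - k, k) / math.comb(n, k)

def rounds_bound(n, k):
    """Corollary bound (1 - ((n - k) / n)**k)**-1 on expected rounds per pseudocycle."""
    return 1 / (1 - ((n - k) / n) ** k)

# (n, k, 1/q, corollary bound) with k = sqrt(n) for a few perfect squares
table = [(n, math.isqrt(n),
          1 / overlap_probability(n, math.isqrt(n)),
          rounds_bound(n, math.isqrt(n)))
         for n in (16, 100, 400, 10000)]
```

For each n the bound lies strictly between 1 and 2, and 1/q is no larger than the bound, since $\binom{n-k}{k} / \binom{n}{k} \leq ((n-k)/n)^k$.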
32Message Complexity for ACOs
- Messages per round in the synchronous case:
- Each of the m vector components is read by each of the p processes and written by one.
- Each operation generates two messages to each of the k quorum members.
- Total: $2m(p+1)k$.
- MPQA: When $k = \sqrt{n}$, expected messages per p.c. is $c \cdot 2m(p+1)\sqrt{n}$, $1 < c < 2$.
33Comparing Message Complexity
- Recall: when $k = \sqrt{n}$, expected messages per p.c. for MPQA is $c \cdot 2m(p+1)\sqrt{n}$, $1 < c < 2$.
- High availability, $\Theta(n)$:
- Strict: $k = \lfloor n/2 \rfloor + 1$, so messages per p.c. is $2m(p+1)(\lfloor n/2 \rfloor + 1)$. Worse.
- Low load, $\Theta(1/\sqrt{n})$:
- Strict: $k = \sqrt{n}$ (e.g., rows and columns of a grid), so messages per p.c. is $2m(p+1)\sqrt{n}$. Asymptotically the same.
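The comparison can be made concrete with a small calculation. The sizes m, p, and n below are hypothetical, and the factor c is taken to be the corollary's round bound $(1 - ((n-k)/n)^k)^{-1}$, so the MPQA figure is an upper estimate rather than a measured value.

```python
import math

def messages_per_round(m, p, k):
    """2 messages per quorum member, for p reads and 1 write of each of m components."""
    return 2 * m * (p + 1) * k

def mpqa_messages_per_pc(m, p, n):
    """Upper estimate for monotone PQA with k = sqrt(n), using the round bound as c."""
    k = math.isqrt(n)
    c = 1 / (1 - ((n - k) / n) ** k)
    return c * messages_per_round(m, p, k)

def strict_majority_messages_per_pc(m, p, n):
    """Strict quorums of size floor(n/2) + 1, one round per pseudocycle."""
    return messages_per_round(m, p, n // 2 + 1)

m, p, n = 34, 34, 100  # hypothetical sizes, chosen only for illustration
mpqa = mpqa_messages_per_pc(m, p, n)
majority = strict_majority_messages_per_pc(m, p, n)
grid = messages_per_round(m, p, math.isqrt(n))  # strict quorums of size sqrt(n)
```

With these sizes `mpqa` falls between `grid` and twice `grid`, and well below `majority`, which mirrors the "asymptotically the same" and "worse" verdicts above.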
35Simulation Purpose
- Simulated non-monotone and monotone RR implementations using PQs, with the APSP application, to study:
- the difference between synchronous and asynchronous cases
- expected convergence time in the non-monotone case (no analysis)
- actual expected convergence time in the monotone case, compared to the computed upper bound
36Simulation Details
- Input graph: [Figure: nodes labeled 1, 2, ..., 34, with edge labels 1.]
- $\lceil \log_2 33 \rceil = 6$ pseudocycles to converge.
- Measured rounds till convergence (when the simulated results equaled the precomputed actual answer).
- Each plotted point is the average of 7 runs.
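A synchronous experiment of this kind can be sketched as follows. This is not the authors' simulator: it assumes a unit-weight directed chain as input, one register per matrix row, a process that always knows its own latest row, and reads that happen before writes within each round. All names and parameters (`simulate_apsp`, `num_nodes`, `max_rounds`) are hypothetical.

```python
import math
import random

INF = math.inf

def simulate_apsp(num_nodes, n, k, monotone, rng, max_rounds=10_000):
    """Synchronous rounds of APSP on a unit-weight chain; returns rounds until convergence."""
    # Row i of the distance matrix is one register, owned by process i.
    rows = [[0 if j == i else 1 if j == i + 1 else INF for j in range(num_nodes)]
            for i in range(num_nodes)]
    target = [[j - i if j >= i else INF for j in range(num_nodes)]
              for i in range(num_nodes)]
    # servers[s][i] = (timestamp, row): the copy of register i held at replica server s.
    servers = [[(0, rows[i]) for i in range(num_nodes)] for _ in range(n)]
    # cache[p][i] = latest copy of register i seen by process p (monotone case only).
    cache = [[(0, rows[i]) for i in range(num_nodes)] for _ in range(num_nodes)]

    def read(p, i):
        quorum = rng.sample(range(n), k)
        copy = max((servers[s][i] for s in quorum), key=lambda c: c[0])
        if monotone:
            copy = cache[p][i] = max(cache[p][i], copy, key=lambda c: c[0])
        return copy[1]

    for rnd in range(1, max_rounds + 1):
        # Every process reads all rows, then applies F to its own row.
        new_rows = []
        for p in range(num_nodes):
            view = [read(p, i) for i in range(num_nodes)]
            view[p] = rows[p]  # assumption: a process knows its own latest row
            new_rows.append([min(view[p][m] + view[m][j] for m in range(num_nodes))
                             for j in range(num_nodes)])
        # Then every process writes its row to a fresh random quorum.
        for p in range(num_nodes):
            rows[p] = new_rows[p]
            for s in rng.sample(range(n), k):
                servers[s][p] = (rnd, rows[p])
        if rows == target:
            return rnd
    return None  # did not converge within max_rounds

strict_rounds = simulate_apsp(34, n=16, k=16, monotone=False, rng=random.Random(1))
monotone_rounds = simulate_apsp(34, n=16, k=4, monotone=True, rng=random.Random(1))
```

With k = n every read overlaps every write, so the strict run needs exactly the number of pseudocycles. With k = 4 the count depends on the random quorums and can only be larger.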
37Simulation Results
Computed upper bound is not tight. Synch and asynch are very similar. Monotone is better than non-monotone.
39Summary
- Proposed two specifications of randomized shared variables that can return wrong answers: monotone and non-monotone random read-write registers.
- Both specs can be implemented with the PQA of [MRW].
- Our specs can be used to implement a significant class of iterative convergent algorithms, characterized by [UD]; the algorithms converge with probability 1.
- Computed bounds on convergence time and message complexity for ACOs in the monotone case.
- Simulation results indicate that monotone is faster than non-monotone, asynch and synch are similar, and the computed upper bound is not tight.
40Future Work
- Are our specs of more general interest? Other good algs that implement them? Different specs better?
- Useful applications for other shared data structures (e.g., stack) with errors? How to specify and implement them?
- How to tolerate client failures? Approximate agreement as an application?