Transcript and Presenter's Notes

Title: Distributed Algorithms


1
Distributed Algorithms
Luc J. B. Onana Alima, Seif Haridi
2
Introduction
  • What is a distributed system?
  • A set of autonomous processors interconnected in some way.
  • What is a distributed algorithm (protocol)?
  • Concurrently executing components, each on a separate processor.
  • Distributed algorithms can be extremely complex:
  • many components run concurrently, locality, failures, non-determinism, independent inputs, no global clock, uncertain message delivery, uncertain message ordering, ...
  • Can we understand everything about their executions?

3
Ch9 Models of Distributed Computation
  • Preliminaries
  • Notations
  • Assumptions
  • Causality
  • Lamport Timestamps
  • Vector Timestamps
  • Causal Communication
  • Distributed Snapshots
  • Modeling a Distributed Computation
  • Execution DAG Predicates
  • Failures in Distributed Systems

4
Ch9 Models: Preliminaries
Assumptions:
A1: No shared variables among processors.
A2: On each processor there are a number of executing threads.
A3: Communication by sending and receiving messages; send(dest, action, param) is non-blocking.
A4: Event-driven algorithms: reaction upon receipt of a declared event. Events: sending or receiving a message, etc. An event is buffered until it is handled. A dedicated thread handles some events at any time.
5
Ch9 Models: Preliminaries
Notations
Waiting for events:
  wait for A1, A2, ..., An
    on Ai(source: param) do
      code to handle Ai, 1 ≤ i ≤ n
Waiting for an event from p up to T seconds:
  wait until p sends (event: param), timeout T
    on timeout do
      timeout action
    end
    on event(param) from p do
      successful response actions
6
Ch9 Models: Preliminaries
Notations
Waiting for events:
  wait for A1, A2, ..., An
    on Ai(source: param) do
      code to handle Ai, 1 ≤ i ≤ n
    end
  end
Waiting for an event from p up to T seconds:
  wait for p
    on timeout do
      time-out action
    end
    on Ai(param) from p do
      action
    end
  end
7
Ch9 Models: Preliminaries
Notations
Waiting for responses from a set of processors up to T seconds:
  wait up to T seconds for (event: param) messages
    Event: <message handling code>
To be considered if necessary.
8
Ch9 Models: Preliminaries
Concurrency control within an instance of a protocol.
Definition: Let P be a protocol. If the instance of P at processor q consists of threads T1, T2, T3, ..., Tn, we say that T1, T2, ..., Tn are in the same family.
They access the same set of variables, hence the need for concurrency control.
Assumption used:
A5: Once a thread gains control of the processor, it does not release control to a thread of the same family until it is blocked.
9
Ch9 Models: Causality
There is no global time in a distributed system ⇒ processors cannot make simultaneous observations of global states.
Causality serves as a supporting property: provided traveling backward in time is excluded, distributed systems are causal.
The cause precedes the effect. The sending of a message precedes the receipt of that message.
10
Ch9 Models: Causality
System composition: we assume a distributed system composed of the set of processors P = {p1, ..., pM}. Each processor reacts upon receipt of an event.
Two classes of events:
External/communication events: sending a message, receiving a message.
Internal events: local input/output, raising of a signal, decision on a commit point (database), etc.
11
Ch9 Models: Causality
Notations:
E: the set of all possible events in our system.
Ep: the set of all events in E that occur at processor p.
We are interested in defining orders between events. Why? In many cases, orders are necessary for coordinating distributed activities (e.g., many concurrency control algorithms use ordering of events; we'll see this later).
12
Ch9 Models: Causality
Orders between events, 1) on the same processor p.
Order <p: e <p e' ⇔ e occurs before e' at p.
If e and e' occur on the same processor p, then either e <p e' or e' <p e, i.e., on the same processor events are totally ordered.
[Diagram: the timeline of processor p with events e and e', where e <p e'.]
13
Ch9 Models: Causality
Orders between events, 2) of sending message m and receiving message m.
Order <m: if e is the sending of message m and e' is the receipt of message m, then e <m e'.
14
Ch9 Models: Causality
Orders between events, 3) in general (i.e., all events in E are considered).
Order <H: happens-before, or "can causally affect".
Definition: <H is the union of <p and <m (for all p, m), closed under transitivity (i.e., if e1 <H e2 and e2 <H e3 then e1 <H e3).
Definition: we define a causal path from e to e' as a sequence of events e1, e2, ..., en such that
1) e = e1 and e' = en;
2) for each i in {1, ..., n-1}, ei <H ei+1.
Thus, e <H e' if and only if there is a causal path from e to e'.
15
Ch9 Models: Causality
Happens-before is a partial order.
It is possible to have two events e and e' (e ≠ e') such that neither e <H e' nor e' <H e.
If two events e and e' are such that neither e <H e' nor e' <H e, then e and e' are concurrent and we write e ∥ e'.
The possibility of concurrent events implies that the happens-before relation <H is a partial order.
16
Ch9 Models: Causality
Space-time diagram / happens-before DAG.
There is no causal path from e1 to e2 nor from e2 to e1: e1 and e2 are concurrent.
There is no causal path from e1 to e6 nor from e6 to e1: e1 and e6 are concurrent.
There is no causal path from e2 to e6 nor from e6 to e2: e2 and e6 are concurrent.
Dependencies must point forward in time.
17
Ch9 Models: Causality
Space-time diagram / happens-before DAG.
Compare: e1 and e7; e1 and e8; e5 and e2; e4 and e6.
18
Ch9 Models: Causality
Global Logical Clock (timestamps).
Although there is no global time in a distributed system, a Global Logical Clock (GLC) that assigns a total order to the events in a distributed system is very useful. Such a global logical clock can be used to arbitrate requests for resources in a fair manner, break deadlocks, etc.
A GLC should assign a timestamp t(e) to each event e such that t(e) < t(e') or t(e') < t(e) for e ≠ e'; furthermore, the order imposed by the GLC should be consistent with <H, that is, if e <H e' then t(e) < t(e').
19
Ch9 Models: Causality
Lamport's Algorithm: gives a Global Logical Clock consistent with <H.
Each event e receives an integer e.TS such that e <H e' ⇒ e.TS < e'.TS.
Concurrent events (unrelated by <H) are ordered according to the processor address (assume these are integers).
Timestamps: t(e) = (e.TS, p) when e occurs at processor p.
Ordering of timestamps: (e.TS, p) < (e'.TS, q) iff e.TS < e'.TS, or e.TS = e'.TS and p < q.
20
Ch9 Models: Causality
Lamport's Algorithm (cont.)
Each processor p maintains a local timestamp my_TS.
Each processor attaches its timestamp to all messages that it sends.
21
Ch9 Models: Causality
Lamport's timestamp algorithm:
Initially, my_TS := 0
wait for any event e
  on e do
    if e is the receipt of message m then
      my_TS := max(m.TS, my_TS) + 1
      e.TS := my_TS
    elseif e is an internal event then
      my_TS := my_TS + 1
      e.TS := my_TS
    elseif e is the sending of message m then
      my_TS := my_TS + 1
      e.TS := my_TS
      m.TS := my_TS
    end
  end
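The pseudocode above maps directly onto a small per-processor class. Below is a minimal Python sketch of a Lamport clock, written to illustrate the rules on this slide; the class and method names are illustrative, not from the slides.

```python
class LamportClock:
    """Minimal sketch of Lamport's timestamp rules (illustrative names)."""

    def __init__(self, proc_id: int):
        self.proc_id = proc_id   # processor address, used only to break ties
        self.my_ts = 0           # local logical clock my_TS

    def on_internal_event(self) -> tuple[int, int]:
        self.my_ts += 1
        return (self.my_ts, self.proc_id)            # e.TS as (counter, address)

    def on_send(self) -> tuple[int, tuple[int, int]]:
        self.my_ts += 1
        # the message carries m.TS = my_TS; the send event is stamped the same way
        return self.my_ts, (self.my_ts, self.proc_id)

    def on_receive(self, m_ts: int) -> tuple[int, int]:
        self.my_ts = max(m_ts, self.my_ts) + 1
        return (self.my_ts, self.proc_id)

# Tie-breaking order (e.TS, p) < (e'.TS, q): Python tuple comparison
# implements it directly, e.g. (3, 1) < (3, 2) is True.
```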
22
Ch9 Models: Causality
Lamport's Algorithm (cont.)
Lamport's algorithm ensures that e <H e' ⇒ e.TS < e'.TS.
Reason: if e1 <p e2 or e1 <m e2, then e2 is assigned a higher timestamp than e1.
Note: it is easy to see that the algorithm presented does not, by itself, assign a total order to the events in the system ⇒ the processor address is used to break ties.
23
Ch9 Models: Causality
Lamport's timestamps illustrated.
[Space-time diagram: events e1..e8 labeled with Lamport timestamps (1,1), (1,2), (2,2), (2,1), (3,2), (1,3), (3,1), (4,3).]
Why is e7 labeled (3,1)? Why is e8 labeled (4,3)?
24
Ch9 Models: Causality
Lamport's timestamp algorithm has the following properties:
Completely distributed
Simple
Fault tolerant
Minimal overhead
Many applications
25
Ch9 Models: Causality
Vector Timestamps.
Lamport timestamps guarantee that if e <H e' then e.TS < e'.TS, but there is no guarantee that if e.TS < e'.TS then e <H e'.
Problem: given two arbitrary events e and e' in E, we want to determine whether they are causally related.
Why is this problem interesting?
26
Ch9 Models: Causality
Knowing when two events are causally related is useful.
To see this, consider the following H-DAG in which O is a mobile object.
[Space-time diagram over p1, p2, p3: p1 migrates O to p2 (message m1, "Migrate O on p2"); p3 asks p1 "Where is O?" and p1 answers "On p2" (m2); p3 then asks p2 "Where is O?" (m3), and p2 answers "I don't know" — Error!]
When you debug the system after the red line, you will find that the object is at p2. So why doesn't p2 know where the object is?
27
Ch9 Models: Causality
Causally-precedes relation <c between messages.
Let s(m) be the event of sending message m and r(m) the event of receiving message m.
Definition: m1 <c m2 if s(m1) <H s(m2).
A causality violation occurs when there are messages m1 and m2 and a processor p such that s(m1) <H s(m2) and r(m2) <p r(m1).
[Diagram: the simplest form of causality violation, where the sending events are on the same processor p1 and the receiving events are on the same processor p2: s(m1) <p1 s(m2) but r(m2) <p2 r(m1).]
28
Ch9 Models: Causality
Causality violation (example: distributed object system).
When p3 receives the "I don't know" message from p2, p3 has inconsistent information: from p1, p3 knows O is on p2, but from p2, p3 knows O is not on p2!
The source of the problem is that m1 <c m3 but r(m3) <p2 r(m1), i.e., there is a causality violation.
Thus, for two events e and e', if we know exactly whether e <H e', then we can detect causality violations. Vector timestamps give us this.
29
Ch9 Models: Causality
Vector Timestamps.
Idea: each event e indicates, for each processor p, all events at p that are causally before e.
30
Ch9 Models: Causality
The idea illustrated.
[Space-time diagram over processors p1..p4: the events at each processor are numbered 1, 2, 3, ...; e is an event at p2.]
31
Ch9 Models: Causality
Vector Timestamps.
Idea: each event e indicates which events at each processor p causally precede e.
Each event e has a vector timestamp e.VT such that e.VT <V e'.VT ⇔ e <H e'.
e.VT is an array with an entry for each processor p. For any processor p, e.VT[p] is an integer, and e.VT[p] = k means that e causally follows the first k events that occur at p (one assumes that each event follows itself).
32
Ch9 Models: Causality
The meaning of e.VT[p] illustrated.
[Space-time diagram over p1..p4: for the event e at p2, e.VT[p1] = 3, e.VT[p2] = 6, e.VT[p3] = 4, e.VT[p4] = 2.]
33
Ch9 Models: Causality
Vector Timestamps.
The ordering <V on vector timestamps is defined as: e.VT <V e'.VT iff
  a) e.VT[i] ≤ e'.VT[i] for all i in {1,..,M}, and
  b) there is a j in {1,..,M} such that e.VT[j] < e'.VT[j].
Examples: (1,0,3) <V (2,0,5); (1,1,3) <V (2,1,3); (1,1,3) ≮V (1,0,3); (1,1,3) ≮V (1,1,3).
Property: e.VT <V e'.VT only if e' causally follows every event that e causally follows.
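As a small sketch, the <V test can be written componentwise; the function names below are illustrative, and timestamps are assumed to be equal-length integer lists.

```python
def vt_less(vt1: list[int], vt2: list[int]) -> bool:
    """Return True iff vt1 <V vt2: componentwise <= and strictly < in some component."""
    assert len(vt1) == len(vt2)
    return all(a <= b for a, b in zip(vt1, vt2)) and any(a < b for a, b in zip(vt1, vt2))

def concurrent(vt1: list[int], vt2: list[int]) -> bool:
    """Two events are concurrent when neither vector timestamp precedes the other."""
    return not vt_less(vt1, vt2) and not vt_less(vt2, vt1)

# The examples from the slide:
assert vt_less([1, 0, 3], [2, 0, 5])
assert vt_less([1, 1, 3], [2, 1, 3])
assert not vt_less([1, 1, 3], [1, 0, 3])
assert not vt_less([1, 1, 3], [1, 1, 3])
```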
34
Ch9 Models: Causality
Comparison of vector timestamps illustrated.
[Space-time diagram: e1.VT = (5,4,1,3), e2.VT = (3,6,4,2), e3.VT = (0,0,1,3); e3.VT <V e1.VT. There is no causal path from e1 to e2 nor from e2 to e1: e1 and e2 are concurrent.]
35
Ch9 Models: Causality
The property illustrated.
[Space-time diagram over p1..p4: e.VT = (0,1,4,2), e'.VT = (3,6,4,2), so e.VT <V e'.VT; e' causally follows every event that e causally follows.]
36
Ch9 Models: Causality
Vector timestamp algorithm:
Initially, my_VT := [0,...,0]
wait for any event e
  on e do
    if e is the receipt of message m then
      for i := 1 to M do
        my_VT[i] := max(m.VT[i], my_VT[i])
      my_VT[self] := my_VT[self] + 1
      e.VT := my_VT
    elseif e is an internal event then
      my_VT[self] := my_VT[self] + 1
      e.VT := my_VT
    elseif e is the sending of message m then
      my_VT[self] := my_VT[self] + 1
      e.VT := my_VT
      m.VT := my_VT
    end
  end
Here we assume that each processor knows the names of all the processors in the system. How can we achieve this assumption? We'll see later.
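A minimal Python sketch of these rules, assuming M processors indexed 0..M-1; the class and method names are illustrative, not from the slides.

```python
class VectorClock:
    """Sketch of the vector timestamp rules for processor `self_id` out of M processors."""

    def __init__(self, self_id: int, m: int):
        self.self_id = self_id
        self.my_vt = [0] * m

    def on_internal_event(self) -> list[int]:
        self.my_vt[self.self_id] += 1
        return list(self.my_vt)                      # e.VT

    def on_send(self) -> tuple[list[int], list[int]]:
        self.my_vt[self.self_id] += 1
        return list(self.my_vt), list(self.my_vt)    # e.VT and m.VT

    def on_receive(self, m_vt: list[int]) -> list[int]:
        # componentwise maximum with the incoming timestamp, then advance own entry
        self.my_vt = [max(a, b) for a, b in zip(self.my_vt, m_vt)]
        self.my_vt[self.self_id] += 1
        return list(self.my_vt)                      # e.VT
```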
37
Ch9 Models: Causality
Vector timestamp algorithm: ensures e <H e' ⇒ e.VT <V e'.VT.
Reason:
1) e <p e' (the case of internal events at processor p): e.VT <V e'.VT.
2) e <m e' (the case of the receipt of message m): e.VT <V e'.VT.
38
Ch9 Models: Causality
Vector timestamp algorithm: ensures e.VT <V e'.VT ⇒ e <H e'.
Reason: assume ¬(e <H e'); then two cases are to be considered:
1) if e' <H e, then e'.VT <V e.VT (from the previous slide), and e.VT[p] = l > k, which implies that ¬(e.VT <V e'.VT).
[Diagram: processor p with events e and e' at positions k and l.]
39
Ch9 Models: Causality
Vector timestamp algorithm: ensures (cont.) e.VT <V e'.VT ⇒ e <H e'.
Reason: assume ¬(e <H e'); then two cases are to be considered:
2) if e and e' are concurrent (neither e <H e' nor e' <H e), then ¬(e.VT <V e'.VT) and ¬(e'.VT <V e.VT).
40
Ch9 Models: Causality
Detecting causality violation in the distributed object system example.
If we know, for every pair of events, whether they are causally related, we can detect causality violations in the distributed object system example by installing a causality-violation detector at every processor.
If we attach a vector timestamp to each event (and message) of the distributed object system example, then each processor can detect a causality violation; e.g., p2 can detect that a causality violation occurs when it receives m1: m1 <c m3 but r(m3) <p2 r(m1).
[Space-time diagram over p1, p2, p3 as before, annotated with the vector timestamps (1,0,0), (0,0,1), (2,0,1), (3,0,1), (3,0,2), (3,0,3), (3,1,3), (3,2,3), (3,2,4), (3,3,3): p2 receives m3 ("Where is O?") before m1 ("Migrate O on p2") and answers "I don't know" — Error!]
41
Ch9 Models: Causality
Causal communication.
Causality violations can lead to undesirable situations. A processor usually cannot choose the order in which messages arrive, but a processor can decide the order in which the applications executing on it have messages delivered to them.
This leads to the need for communication subsystems with specified properties; e.g., one may require a communication subsystem that delivers messages in causal order.
Advantage: the design of many distributed algorithms would be easier (e.g., the simple object migration protocol).
42
Ch9 Models: Causality
Causal communication.
Can we build a communication subsystem that guarantees delivery of messages in causal order? No for unicast message sending; yes for multicast.
43
Ch9 Models: Causality
Causal communication (an attempted solution).
Idea: hold back messages that arrive too soon. Deliver a held-back message m only when you are assured that you will not receive a message m' such that m' causally precedes m.
The implementation of this idea is similar to the implementation of FIFO communication.
44
Ch9 Models: Causality
FIFO communication (TCP): the problem.
Assume:
1) p and q are connected by an oriented communication line from p to q that satisfies: messages sent are eventually received; messages sent by p can arrive at q in any order.
2) q delivers messages received from p to an application A running at q.
The problem is to devise a distributed algorithm that enables processor q to deliver to A the messages received from p in the order p sent them.
45
Ch9 Models: Causality
FIFO communication: implementation (idea).
The solution consists of one algorithm for p and one for q.
Algorithm for p: p sequentially numbers each message it sends to q; q knows that messages should be sequentially numbered.
Algorithm for q (idea): upon receipt of a message m with sequence number x, if q has not yet received the message with sequence number x-1, q delays the delivery of m until m can be delivered in sequence.
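A short sketch of q's side in Python, buffering out-of-order messages until the hole is filled; it assumes sequence numbers start at 0, and the names are illustrative.

```python
class FifoReceiver:
    """Deliver messages from p to the application in the order p sent them."""

    def __init__(self):
        self.expected = 0     # next sequence number to deliver (assumed to start at 0)
        self.buffer = {}      # sequence number -> message, for messages that arrived too soon

    def on_receive(self, seq: int, msg) -> list:
        """Buffer the message, then deliver everything that is now in sequence."""
        self.buffer[seq] = msg
        delivered = []
        while self.expected in self.buffer:          # no hole: deliver
            delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        return delivered
```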
46
Ch9 Models: Causality
FIFO communication: implementation (idea).
Algorithm for q (idea, cont.):
[Diagram: on receipt of message number x, if there is a hole in the sequence, buffer it; if there is no hole, deliver it.]
47
Ch9 Models: Causality
Causal communication: implementation (idea).
Assumption (PTP): all point-to-point messages are delivered in the order sent.
Instead of using sequence numbers (as for the FIFO implementation), we use timestamps; Lamport timestamps or vector timestamps can be used.
Idea: whenever processor q receives a message m from processor p, q holds back m until it is assured that no message m' <c m will be delivered from any other processor.
48
Ch9 Models: Causality
Causal communication: implementation (idea, variables used).
blocked[i]: queue of blocked messages received from pi.
earliest[i]: (head(blocked[i])).timestamp, or 1_i if blocked[i] is empty (1_i denotes the vector with 1 in component i and 0 elsewhere).
Messages in delivery_list are causally ordered.
49
Ch9 Models: Causality
Causal communication: implementation (idea, variable updates).
When processor self receives a message m from p, it performs the following steps in order:
Step 1: if blocked[p] is empty, then earliest[p] is set to m.timestamp /* because assumption (PTP) guarantees that no earlier message can be received from p */
Step 2: enqueue message m in blocked[p].
Step 3: unblock, one after another, all blocked messages that can be unblocked; add each unblocked message to delivery_list; update earliest if necessary. How do we determine when a message can be unblocked?
Step 4: deliver the messages in delivery_list.
50
Ch9 Models: Causality
Causal communication: implementation (idea, variable updates). Step 3 detailed.
Assume we use vector timestamps. Step 3 refined: unblock, one after another, all blocked messages that can be unblocked. The message m at the head of the holding queue for processor k can be unblocked only if the time of processor k according to message m is smaller than the time of processor k according to any other message m', if any, at the head of a holding queue.
More precisely, blocked[k] can be unblocked only if
  (∀ i ∈ {1,..,M}: i ≠ k ∧ i ≠ self ⇒ earliest[k][i] < earliest[i][i])
Thus, the details of Step 3 are:
51
Ch9 Models: Causality
Causal communication: implementation (idea, variable updates). Step 3 detailed (cont.)
blocked[k] can be unblocked only if (∀ i ∈ {1,..,M}: i ≠ k ∧ i ≠ self ⇒ earliest[k][i] < earliest[i][i]).
Combining the above condition with the fact that messages are unblocked one after another, we obtain a while loop:
while (∃ k ∈ {1,..,M}: blocked[k] ≠ empty ∧ (∀ i ∈ {1,..,M}: i ≠ k ∧ i ≠ self ⇒ earliest[k][i] < earliest[i][i])) do
  remove the first message of blocked[k] and add this message to delivery_list
  if blocked[k] ≠ empty then
    earliest[k] := (head(blocked[k])).timestamp   /* vector timestamp */
  else
    earliest[k] := earliest[k] + 1_k
end
Deliver the messages in delivery_list.
52
Ch9 Models: Causality
Causal communication: implementation (the complete scheme).
Initially, for each k in {1,..,M}: earliest[k] := 1_k; blocked[k] := empty.
Wait for a message from any processor
  on the receipt of message m from processor p do
    delivery_list := empty
    Step 1; Step 2; Step 3; Step 4
  end
end
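Putting Steps 1-4 together, here is a hedged Python sketch of the hold-back scheme, assuming vector timestamps, the PTP assumption, and M processors indexed 0..M-1. The helper `unit(k, m)` plays the role of 1_k, and all names are illustrative.

```python
from collections import deque

def unit(k: int, m: int) -> list[int]:
    """The vector 1_k: 1 in component k, 0 elsewhere (assumed meaning of 1_k)."""
    return [1 if i == k else 0 for i in range(m)]

class CausalReceiver:
    def __init__(self, self_id: int, m: int):
        self.self_id, self.m = self_id, m
        self.blocked = [deque() for _ in range(m)]          # blocked[k]
        self.earliest = [unit(k, m) for k in range(m)]      # earliest[k]

    def on_receive(self, p: int, ts: list[int], msg) -> list:
        # Step 1: if blocked[p] is empty, earliest[p] := m.timestamp (PTP: no earlier msg from p)
        if not self.blocked[p]:
            self.earliest[p] = list(ts)
        # Step 2: enqueue m in blocked[p]
        self.blocked[p].append((ts, msg))
        delivery_list = []
        # Step 3: unblock, one after another, all messages that can be unblocked
        progress = True
        while progress:
            progress = False
            for k in range(self.m):
                if self.blocked[k] and all(
                    self.earliest[k][i] < self.earliest[i][i]
                    for i in range(self.m) if i != k and i != self.self_id
                ):
                    _, msg_k = self.blocked[k].popleft()
                    delivery_list.append(msg_k)
                    if self.blocked[k]:
                        self.earliest[k] = list(self.blocked[k][0][0])
                    else:
                        self.earliest[k] = [a + b for a, b in
                                            zip(self.earliest[k], unit(k, self.m))]
                    progress = True
        # Step 4: return the messages to deliver, in causal order
        return delivery_list
```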

53
Ch9 Models: Causality
Detecting causality violation in the distributed object system example.
If we know, for every pair of events, whether they are causally related, we can detect causality violations in the distributed object system example by installing a causality-violation detector at every processor.
[Space-time diagram over p1, p2, p3 as on the earlier slide, with the vector timestamps (1,0,0), (0,0,1), (2,0,1), (3,0,1), (3,0,2), (3,0,3), (3,1,3), (3,2,3), (3,2,4), (3,3,3) attached to the events: p2 receives m3 ("Where is O?") before m1 ("Migrate O on p2") and answers "I don't know" — Error!]
54
Ch9 Models: Causality
A problem with the causal communication implementation previously given.
One problem with the algorithm presented for causal communication is that the communication subsystem at processor self might never deliver some messages.
55
Ch9 Models: Causality
The causal communication problem illustrated.
Message M is never delivered by the communication subsystem running at processor p2:
blocked[p3] ≠ empty, M = head(blocked[p3]), earliest[p3][p1] = 3,
while blocked[p1] = empty, earliest[p1][p1] = 1, and blocked[p4] = empty, earliest[p4][p1] = 1.
(Here self is processor p2.)
56
Ch9 Models: Distributed Snapshots
Assumptions/definitions:
The system is connected, that is, there is a path between every pair of processors.
C[i,j]: channel from pi to pj.
Communication channels are reliable and FIFO: messages sent are eventually received, in order.
The state of C[i,j] is the ordered list of messages sent by pi but not yet received at pj (we will soon make this definition precise).
The state of a processor (at an instant) is the assignment of a value to each variable of that processor.
57
Ch9 Models: Distributed Snapshots
Assumptions (cont.):
Global state of the system: (S, L), where S = (s1,..,sM) are the processor states and L the channel states.
A global state cannot be taken instantaneously; it must be computed in a distributed manner.
The problem: devise a distributed algorithm that computes a consistent global state.
What do we mean by a consistent global state?
58
Ch9 Models: Distributed Snapshots
Meaning of consistent global state. Example 1:
Two possible states for each processor: s0, s1. In s0 the processor does not have the token; in s1 the processor has the token.
The system contains exactly one token, which moves back and forth between p and q over the channels Cp,q and Cq,p. Initially, p has the token. Events: sending/receiving the token.
59
Ch9 Models: Distributed Snapshots
Meaning of consistent global state: global states of the system of Example 1.
[Diagram: the possible global states of the token-passing system, with processors p and q and channels Cp,q and Cq,p.]
60
Ch9 Models: Distributed Snapshots
Meaning of consistent global state (informal): a global state G is consistent if it is one that could have occurred.
Consider a system with two possible runs (non-determinism).
[Diagram: the actual transitions of the two runs and a global state G; the output of the snapshot algorithm can be G!]
61
Ch9 Models: Distributed Snapshots
Consistent global state (formal):
S = {s1,..,sM}; oi = the event of observing si at pi; O(S) = {o1,..,oM}.
Definition: S is a consistent cut iff {o1,..,oM} is consistent with causality.
Definition: {o1,..,oM} is consistent with causality iff
  ∀ e, oi: (e ∈ Ei ∧ e <H oi) ⇒ (∀ e': (e' ∈ Ej ∧ e' <H e) ⇒ e' <H oj)
Notation: s(m) = the event of sending m; r(m) = the event of receiving m.
[Intuition diagram: processors pi and pj with events e, e' and observation events oi, oj.]
62
Ch9 Models: Distributed Snapshots
Precision about "message sent but not yet received".
Definition: given O(S) = {o1,..,oM} and a message m, if s(m) <pi oi and oj <pj r(m), then m is sent but not yet received (relative to O).
[Diagram: p2 observes its state (o2), then asks p1 to do the same (o1); messages m1, m2, m3 are in transit. The global state resulting from o1 and o2 must contain m1, m2, m3.]
63
Ch9 Models: Distributed Snapshots
Meaning of consistent global state (cont.)
Definition: a global state (S, L) is consistent if
  S is a consistent cut, and
  L contains all messages sent but not yet received (relative to O(S)).
64
Ch9 Models: Distributed Snapshots
Examples of global states (questions).
[Two space-time diagrams over p1, p2, p3 with observation events o1, o2, o3.]
Is O = {o1, o2, o3} consistent with causality?
65
Ch9 Models: Distributed Snapshots
Why a consistent global state is useful (an example).
Processors p1 and p2 make use of resources r1 and r2. A deadlocked global state of a distributed system is one in which there is a cycle in the wait-for graph.
Deadlock property: once a distributed system enters a deadlock state, all subsequent global states are deadlock states.
Assume we have a "deadlock detector" whose goal is to observe the processors and the resources at some points of their processing and then check whether there is a cycle in the wait-for graph; if so, it claims that there is a deadlock.
Our detector observes the processors and the resources at the points marked 1 through 4.
[Space-time diagram: p1, r1, r2, p2 exchanging Req / Ok / Rel messages, with observation points 1-4.]
66
Ch9 Models: Distributed Snapshots
Why a consistent global state is useful (example, cont.)
The deadlock detector observes the processors and the resources at the points marked 1 through 4 and finds a cycle in the wait-for graph.
To see why, assume that a correct transaction for using a resource consists of three steps: Req, Ok, Rel.
[Space-time diagram as before, with observation points 1-4.]
67
Ch9 Models: Distributed Snapshots
Why a consistent global state is useful (example, cont.)
Is there actually a deadlock in the system? The answer is NO. There is only a phantom deadlock. The detector's claim is due to the fact that it made an inconsistent observation, which led to a wrong result!
[Space-time diagram as before, with observation points 1-4.]
68
Ch9 Models: Distributed Snapshots
The snapshot algorithm (informal).
Uses special messages: snapshot tokens (stok). There are two types of participating processors: the initiating processor and the others.
The algorithm for the initiating processor:
  Records its state.
  Sends a stok on each outgoing channel.
  Starts to record the state of its incoming channels. Recording of the state of an incoming channel c is finished when a stok is received along it.
69
Ch9 Models: Distributed Snapshots
The snapshot algorithm (informal, cont.)
Uses special messages: snapshot tokens (stok). Types of participating processors: initiating, others.
The algorithm for any other processor:
  Records its state on receipt of a stok for the first time (assume the first stok is received along channel c). Records the state of c as empty.
  Sends one stok on each outgoing channel.
  Starts to record the state of all other incoming channels. Recording of the state of an incoming channel c' ≠ c is finished when a stok is received along it.
70
Ch9 Models: Distributed Snapshots
The snapshot algorithm (idea, cont.) Notation:
T(p, state): the time at p when p records its state.
T(p, stok, c): the time at p when p receives a stok along c.
The state of an incoming channel c of p is the sequence of messages that p receives in the interval (T(p, state), T(p, stok, c)). Recall that the state of c is recorded by p.
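A hedged Python sketch of one processor's part of the snapshot rules above, phrased as a state machine; the message transport, channel names, and the `send` callback are assumptions, not something given by the slides.

```python
class SnapshotProcess:
    """Sketch of one processor's role in the snapshot algorithm (illustrative names)."""

    def __init__(self, incoming: list[str], outgoing: list[str], send, local_state):
        self.incoming = incoming          # names of incoming channels
        self.outgoing = outgoing          # names of outgoing channels
        self.send = send                  # send(channel, message) -- assumed transport hook
        self.local_state = local_state    # callable returning the processor's current state
        self.recorded_state = None
        self.channel_state = {}           # channel -> list of messages recorded for it
        self.done = set()                 # incoming channels whose recording is finished

    def start_snapshot(self):
        """Initiator: record own state, send stok on every outgoing channel, start recording."""
        self.recorded_state = self.local_state()
        self.channel_state = {c: [] for c in self.incoming}
        for c in self.outgoing:
            self.send(c, "stok")

    def on_message(self, channel: str, msg):
        if msg == "stok":
            if self.recorded_state is None:
                # first stok: record state, propagate stok; this channel stays recorded as empty
                self.start_snapshot()
            # recording of `channel` is finished when a stok arrives along it
            self.done.add(channel)
        elif self.recorded_state is not None and channel not in self.done:
            # message received between T(p, state) and T(p, stok, channel)
            self.channel_state[channel].append(msg)

    def finished(self) -> bool:
        return self.recorded_state is not None and self.done == set(self.incoming)
```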
71
Ch9 Models: Distributed Snapshots
The snapshot algorithm illustrated: taking a snapshot of a token-passing system.
p records its state s0 and sends stok on Cp,q.
q receives stok, records its state and the state of Cp,q, then sends stok on Cq,p.
p receives the token, and when stok arrives, p records the state of Cq,p.
Recorded global state: S = {s0, s0}, L = {Lpq, Lqp}.
[Diagram: the three stages of the snapshot over processors p and q and channels Cp,q and Cq,p.]
72
Ch9 Models: Distributed Snapshots
Applications of snapshots: detecting stable state predicates (or properties).
A state predicate P is said to be stable if P(G) ⇒ P(G') for every G' that is reachable from G.
Examples: deadlock, termination, loss of the token, etc.
73
Ch9 Models: Distributed Snapshots
The snapshot algorithm (in the book) accounts for the possibility of different concurrent snapshots. To achieve this, each snapshot is identified by the name of the initiating processor.
A processor might initiate a new snapshot while the first is still being collected. To achieve this, version numbers are used (for simplicity, when a processor r requests a new version of the snapshot, the old snapshot is cancelled).
Diffusing computation: one useful technique for designing distributed algorithms.
74
Ch9 Models: Diffusing Computation
Diffusing computation: assume a connected network (i.e., for each pair of processors in the system there is a path connecting them) and that messages sent are eventually received.
The problem: a processor p has some information Info that it wants to send to all other processors.
Processors that are directly connected are called neighbors; each processor knows its neighbors.
75
Ch9 Models: Diffusing Computation
Diffusing computation (a solution).
The algorithm for the initiator i:
  for each neighbor k: send(k, Info)
The algorithm for any other processor:
  wait for a message from any neighbor
    on receipt of Info from some neighbor p do
      for each neighbor k ≠ p: send(k, Info)
    end
  end
There are two problems with this algorithm:
Problem 1: there might be unprocessed messages left in some channels.
Problem 2: processor p does not know if and when all other processors have received Info.

76
Ch9 Models: Diffusing Computation
Diffusing computation (a solution, cont.)
Solution to problems 1 and 2: we want the initiator to be informed of the fact that all the processors have received Info.
The algorithm for the initiator i:
Step 1: for each k in my_neighbors: send(k, Info)
Step 2: my_wlist := my_neighbors
  while my_wlist is not empty do
    wait for a message from any k in my_wlist
      on receipt of Info from k in my_wlist do
        my_wlist := my_wlist \ {k}
      end
  end
Variables used:
my_neighbors: the set of identities of all my neighbors.
my_wlist: the list of neighbors from which I am waiting for a message containing Info.
77
Ch9 Models: Diffusing Computation
(A solution, cont.: the algorithm for a non-initiating processor consists of three steps, Step 1, Step 2 and Step 3, executed in that order, where Step 3 is: send(my_parent, Info).)
Why is this distributed algorithm correct (i.e., each processor receives Info, the initiator eventually learns that each processor has received Info, and there is no deadlock)? See the sketch below.
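The non-initiator's Steps 1 and 2 are only shown in the slide's figure; a common reading, which is an assumption here, is: on first receipt of Info, remember the sender as my_parent and forward Info to all other neighbors (Step 1), wait for Info from every neighbor other than my_parent (Step 2), then report back with Step 3. The Python sketch below follows that reading; all names are illustrative.

```python
class Diffusing:
    """Sketch of the diffusing computation with termination reporting (assumed reading)."""

    def __init__(self, my_id, my_neighbors, send):
        self.my_id = my_id
        self.my_neighbors = set(my_neighbors)
        self.send = send                  # send(dest, info) -- assumed transport hook
        self.my_parent = None
        self.my_wlist = set()
        self.is_initiator = False
        self.terminated = False

    def initiate(self, info):
        self.is_initiator = True
        self.my_wlist = set(self.my_neighbors)          # Step 2 (initiator)
        for k in self.my_neighbors:                     # Step 1 (initiator)
            self.send(k, info)

    def on_receive(self, sender, info):
        if not self.is_initiator and self.my_parent is None:
            # Step 1 (non-initiator, assumed): remember parent, forward to other neighbors
            self.my_parent = sender
            self.my_wlist = self.my_neighbors - {sender}    # Step 2 (assumed)
            for k in self.my_wlist:
                self.send(k, info)
        else:
            self.my_wlist.discard(sender)
        if not self.my_wlist and not self.terminated:
            self.terminated = True
            if self.is_initiator:
                pass  # the initiator now knows every processor has received Info
            else:
                self.send(self.my_parent, info)             # Step 3: report to parent
```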
78
Ch9 Models: Diffusing Computation
Spanning tree construction.
A spanning tree of a graph is a tree whose nodes are all those in the graph and whose edges are a subset of those in the graph.
[Diagram: the channels along which processors received Info for the first time form a spanning tree rooted at p.]
79
Ch9 Models: Distributed Computation
Formal models (non-deterministic interleaving): understand how distributed computations actually occur.
Intuition: a distributed system has:
Global states (S, L) (see Snapshots). Initially, each processor is in an initial local state and each communication channel is empty.
Events: the occurrence of an event causes a transition of the system from the current global state to a new global state.
Computations: sequences of events from initial global states.
80
Ch9 Models: Distributed Computation
More precisely, an event e = (p, s, s', m, c):
  p ∈ P; s, s' are local states of p;
  m ∈ M ∪ {NULL} (M = the set of all possible messages);
  c ∈ C ∪ {NULL} (C = the set of all channels).
Interpretation of e = (p, s, s', m, c): e takes p from s to s' and possibly sends or receives m on c.
If m (and c) is NULL, then e is an internal event: no channel is affected by the occurrence of e.
Otherwise: if c is an incoming channel, then m is removed from c; if c is an outgoing channel, then m is added to c.
81
Ch9 Models: Distributed Computation
Occurrence of an event (execution of an event):
An event e = (p, s, s', m, c) can occur in a global state G only if some condition, termed the enabling condition of e, is satisfied in G. The enabling condition of e = (p, s, s', m, c) is a condition on the state of p and the channels attached to p (example: the program counter has a specific value).
Transition of the system: if e = (p, s, s', m, c) can occur in G, then the execution of e by p changes the global state by changing only the state of p and possibly the state of one channel attached to p.
82
Ch9 Models: Distributed Computation
More precisely (cont.), two functions. Let G be a global state and e an event:
Ready(G): the set of all events that can occur in G.
Next(G, e): the global state just after the occurrence of e.
Assume G0 is the initial global state, Gi the global state when event ei occurs, and seq = <e0, e1, ..., en> a sequence of events.
Definition: seq is a computation of the system if
1) ∀ i ∈ {0,..,n}: ei ∈ Ready(Gi)
2) ∀ i ∈ {0,..,n}: Gi+1 = Next(Gi, ei)
Note the non-deterministic selection in Ready(Gi). A toy model of Ready/Next is sketched below.
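A tiny illustrative Python model of Ready/Next (the representation of the global state as a dict and all names are assumptions): events carry an enabling condition and a transition function, and a computation is built by repeatedly picking any ready event.

```python
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Event:
    name: str
    enabled: Callable[[dict], bool]   # enabling condition on the global state G
    apply: Callable[[dict], dict]     # Next(G, e): the global state after e occurs

def ready(g: dict, events: list[Event]) -> list[Event]:
    """Ready(G): all events whose enabling condition holds in G."""
    return [e for e in events if e.enabled(g)]

def one_computation(g0: dict, events: list[Event], steps: int) -> list[str]:
    """Build one computation by non-deterministically choosing from Ready(G) at each step."""
    g, seq = dict(g0), []
    for _ in range(steps):
        candidates = ready(g, events)
        if not candidates:
            break
        e = random.choice(candidates)      # non-deterministic interleaving
        g = e.apply(g)                     # G_{i+1} = Next(G_i, e_i)
        seq.append(e.name)
    return seq

# Toy example: the token passed from p to q over channel Cp,q.
send = Event("p sends token",
             lambda g: g["p_has_token"],
             lambda g: {**g, "p_has_token": False, "Cpq": g["Cpq"] + ["token"]})
recv = Event("q receives token",
             lambda g: bool(g["Cpq"]),
             lambda g: {**g, "Cpq": g["Cpq"][1:], "q_has_token": True})
print(one_computation({"p_has_token": True, "q_has_token": False, "Cpq": []},
                      [send, recv], steps=5))
```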
83
Ch9 Models: Distributed Computation
Correctness.
State predicate: an assertion on global states. Correctness property: an assertion on computations.
Definition: a distributed algorithm is correct if each of its computations satisfies the correctness property.
Proving correctness: show that each global state reachable from the initial global state satisfies some well-defined state predicate. In general, one uses invariant assertions.
84
Ch9 Models: Distributed Computation
"Eventually" and "always" properties.
Let G0 be an initial global state, R(G0) the set of all computations that start in G0, A a state predicate, and Q an assertion on computations.
eventually(A, G0, Q) means: starting from G0, for any computation for which Q holds, there is a global state that satisfies A (something good will eventually happen).
always(A, G0, Q) means: A is always true starting from G0, for any computation for which Q holds.
85
Ch9 Models: Distributed Computation
Failures in a distributed system.
In a distributed system, failures occur: an additional complication in designing distributed algorithms. For a distributed system to be dependable, fault tolerance must be incorporated; a fault-tolerant algorithm is one which minimizes the impact of certain faults on the service provided by the system.
Fault classification: fail-stop, timing faults, byzantine, transient faults, etc.