Title: What is a Distributed State? Observations from a top-down view
1 What is a Distributed State? Observations from a top-down view
- Yoram Moses
- EE Dept., Technion, Israel
- Joint work with Ron van der Meyden and Kai Engelhardt
- U. of New South Wales, Sydney
2 Long-term Goal
- Design a refinement calculus for distributed programs, incorporating knowledge, time, etc.
- Refinement is well established in the sequential case (Back, Morgan).
3 Stepwise Refinement
- A process by which a program is repeatedly replaced by a more detailed one that satisfies the same requirements.
- Intuitively,
- P refines Q if
- every execution of P is an execution of Q.
4 Example: Two-Phase Commit
- n processes, complete reliable asynchronous network
- Initially each process j has a binary vote vj
- Goal: all votes are 1 => every process decides 1;
- otherwise, every process decides 0
- Two-Phase Commit (2PC) is refined by:
- Collect votes at coordinator;
- Compute result;
- Distribute result to all.
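The three-step refinement can be sketched as a sequential simulation. This is only an illustration: the network is not modeled, and the function names (collect, compute, distribute, two_phase_commit) are hypothetical, not taken from the talk.

```python
def collect(votes):
    """Phase 1 (assumed model): the coordinator gathers every process's vote."""
    return list(votes)

def compute(collected):
    """The coordinator decides 1 only if all votes are 1, otherwise 0."""
    return 1 if all(v == 1 for v in collected) else 0

def distribute(result, n):
    """Phase 2 (assumed model): every one of the n processes learns the result."""
    return [result] * n

def two_phase_commit(votes):
    """Collect; Compute; Distribute, composed sequentially."""
    collected = collect(votes)
    result = compute(collected)
    return distribute(result, len(votes))
```

For instance, two_phase_commit([1, 0, 1]) gives every process the decision 0.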
5 Why Refinement?
- Resulting programs are
- correct by design
- well-documented
- easier to modify and maintain
But: rules, specification language, and semantics can be complex.
6 Bottom-up Treatments
- Basic distributed problems are difficult: mutual exclusion, snapshot, leader election, consensus.
- Many techniques are devoted to the formal treatment of individual problems: algorithms, verification, complexity analysis.
- Less attention has been paid to top-down techniques.
7 Refinement is top-down
- Can we extend it to the distributed case?
- What changes may be needed?
- Many new issues:
- state and composition
- specification language
- termination
- modularity
8 Today's Focus
- Can we extend it to the distributed case?
- What changes may be needed?
- Many new issues:
- state and composition
- specification language
- termination
- modularity
9 Sequential Specifications
- A specification is a pair [φ, ψ] of state formulas.
- This specifies a program that, if started in a state satisfying φ, will terminate in a final state satisfying ψ.
- If φ doesn't hold initially, all bets are off.
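A minimal sketch of checking a program against such a specification, under simplifying assumptions: the program is a deterministic, terminating Python function from states to states, and the state space is a small finite set. The name satisfies is hypothetical.

```python
def satisfies(program, pre, post, states):
    """Check a precondition/postcondition pair over a finite set of states.

    States where pre fails are skipped: there, all bets are off.
    Non-termination is not modeled in this sketch.
    """
    return all(post(program(s)) for s in states if pre(s))

def increment(x):
    """Example program: x := x + 1."""
    return x + 1

STATES = range(-5, 6)

# Started with x >= 0, the program ends with x > 0.
ok = satisfies(increment, lambda x: x >= 0, lambda x: x > 0, STATES)

# With precondition true, the same postcondition is not guaranteed.
too_strong = satisfies(increment, lambda x: True, lambda x: x > 0, STATES)
```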
10 Programs
- We extend the space of programs to include intermediate-level programs, containing both program code and specification statements:
- x := 3; [x < 0, y > 1]; y := y + 1; [x > 1, y = 2]
- Additional technical syntactic features
11 Refinement Rules
- Splitting rule:
- [φ, ψ] is refined by [φ, χ]; [χ, ψ]
- Action rules:
- x := 1 refines [true, x = 1]
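Both rules can be checked on a toy model. The sketch below assumes a deliberately simplified semantics that is not the calculus from the talk: a program is the set of (initial, final) state pairs it allows, a violated precondition allows any final state, and "refines" is set inclusion. All names are hypothetical.

```python
from itertools import product

STATES = range(4)  # assumed toy state space: the value of one variable x

def spec(pre, post):
    """Pairs allowed by [pre, post]; if pre fails, any final state is allowed."""
    return {(s, t) for s, t in product(STATES, STATES) if not pre(s) or post(t)}

def seq(r1, r2):
    """Sequential composition: the final state of r1 is the initial state of r2."""
    return {(s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2}

def refines(p, q):
    """P refines Q if every execution of P is an execution of Q."""
    return p <= q

phi = lambda s: s == 0
chi = lambda s: s == 1
psi = lambda s: s == 3

# Splitting rule: [phi, chi]; [chi, psi] refines [phi, psi].
split_ok = refines(seq(spec(phi, chi), spec(chi, psi)), spec(phi, psi))

# Action rule: x := 1 refines [true, x = 1].
assign_one = {(s, 1) for s in STATES}
action_ok = refines(assign_one, spec(lambda s: True, lambda s: s == 1))

# Non-example: [phi, chi] alone does not refine [phi, psi].
not_ok = refines(spec(phi, chi), spec(phi, psi))
```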
12 Sequential Computation
Execution: an alternating sequence of states and actions.
A program executes between an initial state and a final state.
13 Sequential Composition
- Sequential programs compose at a state
- In P;Q: final(P) = initial(Q)
- Distributed case?
[Figure: programs P and Q]
14 Distributed Computation
Execution: an alternating sequence of global states and actions.
- Natural generalization
- Powerful tool (state machine model)
- Well developed
- Problem with sequential composition
15 Two-Phase Commit Again
- Programs don't end at a global state!
16 Composition
- Assuming initial states is incompatible with having no final states
- => solutions to distributed problems do not automatically compose
[Figure: programs P, Q and R]
17 Composition
- Assuming initial states is incompatible with having no final states
- => PODC, FOCS, STOC, ICALP, DISC, SODA solutions to distributed problems do not automatically compose
- Solutions are not directly comparable
[Figure: programs P, Q and R]
18 Termination Detection (Francez, Dijkstra)
- Guarantees P ends before Q begins
- Useful for synchronizing tasks
- Added costs
[Figure: programs P and Q]
19 Firing Squad
- The firing squad problem (McCarthy, ...) provides synchronization at a global state => enables starting at an initial state.
- At a cost.
[Figure: programs P and Q with a firing squad (FS) between them]
20 Other Remedies
- Consistent cuts (Lamport) are internally indistinguishable from global states in asynchronous computations and are used as substitutes.
- Communication-closed layers (Francez and Elrad, Zwiers, Janssen) relax the global state assumption much more. (Main property: all messages sent in a layer arrive in that layer.)
21 Programs Execute Between Cuts
[Figure: programs P, Q and R]
22 Cuts
- A cut is a tuple (c1, c2, ..., cn) of local time points,
- with ci ≤ ∞ for all i.
- A non-terminating sequential program diverges.
- A distributed program can have processes that never terminate!
- Therefore ci ≤ ∞.
[Figure: a cut through local time points c1, c2, c3, c4]
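A cut can be represented directly as a tuple of local time points, with math.inf standing for a process that never terminates. The sketch below also checks Lamport-style consistency (every message received inside the cut was sent inside the cut). The message format and the name is_consistent are assumptions made for illustration.

```python
import math

def is_consistent(cut, messages):
    """cut[i] is the local time point of process i (math.inf is allowed).

    messages: tuples (sender, send_time, receiver, receive_time).
    An event of process i lies inside the cut if its local time is <= cut[i].
    """
    return all(
        send_time <= cut[sender]
        for sender, send_time, receiver, receive_time in messages
        if receive_time <= cut[receiver]
    )

# One message: process 0 sends at local time 2, process 1 receives at local time 3.
MESSAGES = [(0, 2, 1, 3)]

inconsistent = is_consistent((1, 3), MESSAGES)       # received but not yet sent
consistent = is_consistent((2, 3), MESSAGES)         # sent and received
never_ends = is_consistent((math.inf, 3), MESSAGES)  # process 0 never terminates
```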
23 Sequential Composition Regained
- We can now denote by P;Q the layering of Q after P: P's final cut is Q's initial cut.
- Comments:
- Initial and final cuts need not be consistent cuts!
- The layer defined by an initial cut and a final cut of a program is typically not communication closed!
- Specified carefully, programs can now be designed separately and then be sequentially composed.
24 What is a Distributed State?
- The state plays two roles in the sequential case:
- Transformation: actions modify the state
- Control: programs start and end at a state
- Claim: in the distributed case,
- Transformation: actions modify the global state
- Control: programs start and end on cuts
25 Why Worry?
- Our standard assumptions are too strong. Typically:
- The initial state provides perfect synchronization; timing is free (firing squad).
- Empty channels guarantee no collisions with out-of-phase messages.
- E.g.:
- 2PC can be implemented with 1-bit messages. What if we want to finish Collect after receiving the first 0? Undelivered messages may arrive in a later phase.
- Both Collect and Distribute can be implemented with 1-bit messages even when channels are lossy but fair. The composed solution is incorrect.
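The out-of-phase collision can be reproduced in a small simulation. This is a hedged sketch, not a protocol from the talk: it runs two consecutive Collect phases over FIFO channels modeled as deques, with a coordinator that stops at the first 0. The round tag used in the tagged variant is a hypothetical disambiguation mechanism.

```python
from collections import deque

def collect(channels, round_id, tagged):
    """Coordinator side of Collect: read one vote per channel, stop at the first 0."""
    for ch in channels:
        msg = ch.popleft()
        if tagged:
            # Skip messages left over from earlier rounds.
            while msg[0] != round_id:
                msg = ch.popleft()
            vote = msg[1]
        else:
            vote = msg  # a bare bit: its round cannot be told apart
        if vote == 0:
            return 0    # early exit; the remaining channels are not drained
    return 1

def run_two_rounds(votes_round1, votes_round2, tagged):
    """Run Collect twice on the same channels and return both results."""
    channels = [deque() for _ in votes_round1]
    decisions = []
    for round_id, votes in enumerate([votes_round1, votes_round2]):
        for ch, v in zip(channels, votes):
            ch.append((round_id, v) if tagged else v)
        decisions.append(collect(channels, round_id, tagged))
    return decisions

# Round 1 votes (0, 1), round 2 votes (1, 0): both results should be 0.
bare = run_two_rounds([0, 1], [1, 0], tagged=False)        # stale 1 is misread
with_tags = run_two_rounds([0, 1], [1, 0], tagged=True)
```

With bare bits, the vote left undelivered in round 1 is read as a round 2 vote, so the second result is wrong. The tagged variant is correct, but its messages are no longer 1 bit.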
26 Reminder: Two-Phase Commit
- n processes, complete reliable asynchronous network
- Initially each process j has a binary vote vj
- Goal: all votes are 1 => every process decides 1;
- otherwise, every process decides 0
- Two-Phase Commit is refined by:
- Collect votes at coordinator;
- Compute result;
- Distribute result to all.
27 Problem / Challenge
- Some protocols make strong use of unstated implicit assumptions resulting from the initial state assumption; others don't.
- Some protocols require strong disambiguation properties from earlier/later phases (padding messages); others don't.
- Our formalisms are often insensitive to these distinctions. They shouldn't be!
28 Problem / Challenge
- Provide a theory of programs that will account for composition costs.
- Classify problems and solutions accordingly.
- Consider solutions that are aware of the initial cut rather than initial global states.
- Example:
- Suppose computing the vote in 2PC is moderately hard.
- => we are naturally led to checking whether the coordinator has already Distributed the answer before computing the vote.
29 Example: Minimum Spanning Tree
- An MST protocol
- Starts with every site being aware of its outgoing links (edges) and their weights.
- Ends when every process knows
- which of its links are in the MST, and
- which link points to its parent.
30 Composing MST
Suppose we want to perform MST; Leader Election; 2PC. MST ends on a cut.
Observation (Lynch 96): Leader Election doesn't need to wait until MST completes.
31 Distributed Specifications
- A cut formula corresponds to a set of pairs (r, c), where r is a run and c is a cut in r.
- Can include local state information
- synchronization information: global state, consistent cut, empty channels, etc.
- temporal and knowledge information
- A specification is [φ, ψ], where φ and ψ represent properties of cuts.
32 Specification Language
- Cut formulae can represent non-trivial facts:
- properties of local states: x = 0 ∧ y = 2
- temporal properties (local and cut-related): ?? sendikm => ? receivek m
- knowledge formulas: K1(vote1) K2(vote2)
- Temporal operators: local LTL reasoning; cut-based ?, ?.
33 High-level programs
- We can define big-IF and big-WHILE:
- If φ then P
- While φ do P
- where P is a distributed program!
- E.g., While MST not complete do
- Add an MST edge to forest
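The shape of such a big-WHILE can be sketched sequentially. This is only an illustration of the loop structure: a centralized Kruskal-style procedure stands in for the distributed body P, and the graph is assumed to be connected. The name mst_by_big_while and the edge format are hypothetical.

```python
def mst_by_big_while(n, edges):
    """edges: list of (weight, u, v) over nodes 0..n-1; graph assumed connected."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    candidates = sorted(edges)
    while len(forest) < n - 1:            # guard: "MST not complete"
        weight, u, v = candidates.pop(0)  # body: try the lightest remaining edge
        root_u, root_v = find(u), find(v)
        if root_u != root_v:              # the edge joins two trees of the forest
            parent[root_u] = root_v
            forest.append((weight, u, v))
    return forest

EDGES = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
tree = mst_by_big_while(4, EDGES)
```

In the distributed setting, the guard and the body are themselves distributed programs that start and end on cuts, which this sequential sketch does not capture.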
34 Termination Revisited
- Sequential programs terminate or diverge.
- Good distributed programs may run forever.
- Koo and Toueg: in a system with fair lossy channels and asynchronous processes, nontrivial problems do not have terminating solutions!
- Example: transmitting a bit from Sender to Receiver
- Problem: can we compose programs in this model?
35 Forking
- We find it useful to distinguish
- foreground activities from
- background activities
- by using a fork operator (à la Havelund).
- fork(P) means that P starts now, but
- on a separate control line (thread).
36 fork(Pi)
[Figure: process i forking thread Pi]
E.g., Pi could be the program Acking: Repeat forever: If received(m) then send(ack(m))
fork(Pi) satisfies [true, □(receive(m) => send(ack(m)))]
37 Bit Transmission
- Sender:
- Repeat send(bit) until receive(ack)
- Receiver:
- fork(Acking); await(receive(bit))
- This is a terminating program!
- (although Acking never terminates)
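A deterministic sketch of this exchange, under stated assumptions: the fair lossy channel is modeled as dropping a fixed number of initial messages in each direction, and threads are simulated step by step rather than run concurrently. The name transmit and its parameters are hypothetical.

```python
def transmit(bit, drops_to_receiver, drops_to_sender, max_steps=1000):
    """Return (value received, step the Receiver's foreground ended, step the Sender ended)."""
    received = None       # result of the Receiver's await(receive(bit))
    receiver_done = None  # step at which the Receiver's foreground terminated
    acks_sent = 0
    for step in range(1, max_steps + 1):
        # Sender: send(bit). The channel loses the first few copies.
        if step <= drops_to_receiver:
            continue
        if received is None:
            received, receiver_done = bit, step
        # Background Acking thread: acknowledge every copy that arrives.
        acks_sent += 1
        if acks_sent > drops_to_sender:
            return received, receiver_done, step  # Sender receives an ack and stops
    raise RuntimeError("no ack delivered within max_steps")

result = transmit(1, drops_to_receiver=2, drops_to_sender=3)
```

In this run the Receiver's foreground ends at step 3, while the forked Acking thread keeps answering until the Sender stops at step 6.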
38 Forking, cont'd
- It is often important to determine what point in an execution of a program defines its final cut for sequential composition.
- Can distinguish terminating threads of an activity from nonterminating ones.
- By using fork, we can distinguish foreground from background activities. Separate the concerns of a process from its service to the community. (Consensus)
39 Forking and Parallel Composition
- We can represent the parallel composition operator || in a number of ways:
- (P || Q) as fork(Q); P
- or as fork(P); Q
- or as fork(P); fork(Q)
- These differ in the way they sequentially compose with later programs: X; R
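The difference shows up in the final cut that a later program R would start from. The sketch below is a toy model, not the semantics from the talk: a program is a nested tuple, a basic program advances each process's local time by a fixed, made-up number of steps, and fork leaves the foreground cut unchanged.

```python
def final_cut(program, start):
    """program: ('run', durations) | ('fork', program) | ('seq', first, second)."""
    kind = program[0]
    if kind == "run":
        # Each process i moves forward by durations[i] local steps.
        return tuple(s + d for s, d in zip(start, program[1]))
    if kind == "fork":
        # The forked program runs in the background; the foreground does not wait.
        return start
    if kind == "seq":
        # The first program's final cut is the second program's initial cut.
        return final_cut(program[2], final_cut(program[1], start))
    raise ValueError(kind)

P = ("run", (5, 1))  # hypothetical durations on processes 0 and 1
Q = ("run", (2, 4))
START = (0, 0)

cut_a = final_cut(("seq", ("fork", Q), P), START)            # fork(Q); P
cut_b = final_cut(("seq", ("fork", P), Q), START)            # fork(P); Q
cut_c = final_cut(("seq", ("fork", P), ("fork", Q)), START)  # fork(P); fork(Q)
```

A later program R would thus start at a different cut in each of the three encodings.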
40 Conclusion
- The view from the top down differs from the view from the bottom up.
- Distributed programs operate between cuts.
- Composition deserves closer attention.
- There is a whole range of related questions to be explored.
- (The road to refinement is long, but promises to offer insights along the way.)
41 A Simple Language of Programs
- null program: ε
- basic actions: a, b, ...
- sequential composition: P;Q
- nondeterministic choice: P + Q
- iteration (finite or infinite): P^ω
- assertions: {φ}
- constraints
- specifications: [φ, ψ]
- coercions: [φ]
42 Constructs and Termination
- We can represent while and if-then-else:
- if φ then P else Q = ([φ]; P) + ([¬φ]; Q)
- while φ do P = ([φ]; P)^ω; [¬φ]
- [true, true] is an arbitrary terminating program
43 Distributed Specifications
- A cut formula is evaluated at a cut.
- It corresponds to a set of pairs (r, c),
- where r is a run and c is a cut in r.
- Examples:
- x = 0 ∧ y = 2,
- ◇(x = 2) (eventually x = 2)
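Evaluating the two example formulas at cuts of a single finite run can be sketched as follows. The run representation is an assumption: one list of local states per process, with x owned by process 0 and y by process 1. Infinite local time points are not modeled, and all names are hypothetical.

```python
from itertools import product

# One run: the local history of each process.
RUN = (
    [{"x": 0}, {"x": 2}, {"x": 1}],  # process 0
    [{"y": 2}, {"y": 3}],            # process 1
)

def x0_and_y2(run, cut):
    """x = 0 and y = 2, read off the local states at the cut."""
    return run[0][cut[0]]["x"] == 0 and run[1][cut[1]]["y"] == 2

def eventually_x2(run, cut):
    """Eventually x = 2: holds at some local time point of process 0 at or after the cut."""
    return any(state["x"] == 2 for state in run[0][cut[0]:])

def denotation(formula, run):
    """The cuts c of this run such that (run, c) satisfies the formula."""
    all_cuts = product(*(range(len(history)) for history in run))
    return {cut for cut in all_cuts if formula(run, cut)}

cuts_local = denotation(x0_and_y2, RUN)
cuts_eventually = denotation(eventually_x2, RUN)
```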