Principles of Reliable Distributed Systems Recitation 11: State Machine Replication with Paxos


1
Principles of Reliable Distributed Systems
Recitation 11: State Machine Replication with
Paxos; Sequential Consistency

Spring 2009
Alex Shraer

2
Replicated State Machines

Data is replicated at n servers.
Operations are initiated by clients.
Operations need to be performed at all correct
servers in the same order.
Goal: ensure that all the copies are the same
after the ith operation.

3
Client-Server Interaction

Leader-based: each process (client/server) has an
estimate of who is the current leader.
A client sends a request to its current leader.
The leader sends the response to the client.

4
Sequence of Paxos Instances

A sequence of separate instances of Paxos.
The value chosen by instance i is the ith
operation.
Clients send operations to the current leader.
The leader decides where in the sequence each
operation should appear.
If the leader decides that a certain operation
should appear as the 135th operation, it tries to
have that operation as the value of the 135th
instance of Paxos.
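
The idea that instance i fixes the ith operation can be sketched as follows. This is a minimal illustration, not the recitation's code: the `Replica` class, its key-value state, and the `learn` method are hypothetical names, and the Paxos instances themselves are abstracted away (we simply assume each instance has already chosen a value). It shows that replicas which learn the chosen values in different orders still end in the same state, because each one executes only the gap-free prefix of the sequence.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A deterministic key-value state machine fed by chosen operations."""
    state: dict = field(default_factory=dict)
    chosen: dict = field(default_factory=dict)  # instance number -> operation
    next_to_execute: int = 1                    # first instance not yet executed

    def learn(self, instance, op):
        """Record the value chosen by one instance, then execute what we can."""
        self.chosen[instance] = op
        # Execute the longest gap-free prefix, strictly in instance order.
        while self.next_to_execute in self.chosen:
            key, value = self.chosen[self.next_to_execute]
            self.state[key] = value
            self.next_to_execute += 1


# Hypothetical chosen values for instances 1..3.
ops = {1: ("x", 0), 2: ("x", 1), 3: ("y", 5)}

a, b = Replica(), Replica()
for i in (1, 2, 3):      # replica a learns the values in order
    a.learn(i, ops[i])
for i in (3, 1, 2):      # replica b learns them out of order
    b.learn(i, ops[i])
```

Replica b buffers instance 3 until instances 1 and 2 arrive, so both replicas apply the same operations in the same order.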

5
Safety and Liveness

Reasons a leader's proposal may fail:
The leader fails
A different node believes it is the leader
Safety is always preserved, even in the worst case
Performance can be optimized during non-faulty
periods

6
Replication with Fast Paxos

Non-optimized version
New leader learns the entire history
Observations
No value is chosen until phase 2 of Paxos.
At the end of phase 1, either the value to be
proposed is determined, or else the proposer is
free to choose any value.

7
Normal Operation

Normal operation: the previous leader has just
failed and a new one has been selected.
The new leader knows most of the operations that
have already been chosen, since it participated
in the protocol before it became the leader.
Suppose it knows operations 1-134, 138 and 139.

8
Normal Operation (contd)

The leader executes phase 1 of instances 135-137
and of all instances > 139.
Suppose the outcome of this phase determines the
value to be proposed in instances 135 and 140,
and is unconstrained in the other instances.
The new leader now executes phase 2 for instances
135 and 140 (Why does it have to?)

9
Normal Operation (contd)

Every server knows commands 1-135, so the leader
can execute them.
It cannot execute commands 138-140 before 136 and
137.
Two options:
Use the next two client requests as commands 136
and 137.
Fill the gap using no-op operations.
Which one is better?
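
The running example (operations 1-134, 138 and 139 known, then 135 and 140 recovered in phase 1) can be traced with a short sketch. The helper names `find_gaps` and `executable_prefix`, the `NOOP` marker, and the operation labels are assumptions made for illustration. The sketch takes the second option above and fills the gap with no-ops.

```python
NOOP = "no-op"  # hypothetical marker for an operation that changes nothing


def find_gaps(known):
    """Instance numbers with no known value below the highest known instance."""
    return [i for i in range(1, max(known) + 1) if i not in known]


def executable_prefix(known):
    """Largest n such that instances 1..n all have a known value."""
    n = 0
    while n + 1 in known:
        n += 1
    return n


# The new leader knows operations 1-134, 138 and 139.
known = {i: f"op{i}" for i in list(range(1, 135)) + [138, 139]}
gaps_before = find_gaps(known)            # instances that need phase 1

# Phase 1 determines the values of instances 135 and 140; phase 2 completes them.
known[135] = "recovered-135"
known[140] = "recovered-140"
prefix_after_recovery = executable_prefix(known)

# Instances 136 and 137 are unconstrained: fill them with no-ops.
for i in find_gaps(known):
    known[i] = NOOP
prefix_after_noops = executable_prefix(known)
```

After recovery only commands up to 135 can be executed; once 136 and 137 are filled, the executable prefix reaches 140.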

10
Normal Operation (contd)

Operations 1-140 have now been chosen, and all
servers can execute them.
The leader has also completed phase 1 for all
instances > 140.
It can start working in express mode:
it can propose any value in phase 2 of these
instances immediately.

11
How can gaps occur?

The leader can propose operation 142 before it
knows that operation 141, which it proposed
earlier, has been chosen.
Bad scenario:
All messages it sent proposing operation 141 are
lost, and operation 142 is chosen before any
server learns about operation 141.
The leader fails before 141 is chosen.

12
Phase 1 for infinity?

A new leader executes phase 1 for infinitely many
instances of Paxos (135-137 and all instances >
139).
It uses the same BallotNum for all of the instances.
A response to a prepare message needs to include
a value only for the instances for which the
server has already accepted a value (in phase 2).
In the example: 135 and 140.
So the servers can respond with a "reasonably
short" message.
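
A minimal sketch of how a server might answer one prepare message that covers infinitely many instances. The `Acceptor` class, its method names, and the choice to pass the set of instances the leader already knows are assumptions for illustration, not the recitation's code. The point is that the reply lists only instances with an accepted value, so its size is bounded by what the server has actually accepted.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0    # highest ballot promised, shared by all instances
        self.accepted = {}   # instance -> (ballot, value), set in phase 2

    def on_prepare(self, ballot, known):
        """Phase 1 for every instance the new leader does not already know.

        Returns None to reject, otherwise the accepted (ballot, value)
        pairs for the instances outside `known`.
        """
        if ballot <= self.promised:
            return None
        self.promised = ballot
        return {i: bv for i, bv in self.accepted.items() if i not in known}

    def on_accept(self, ballot, instance, value):
        """Phase 2: accept unless a higher ballot has been promised."""
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted[instance] = (ballot, value)
        return True


acc = Acceptor()
for i in list(range(1, 136)) + [138, 139, 140]:
    acc.on_accept(1, i, f"op{i}")        # values accepted under the old leader

leader_knows = set(range(1, 135)) | {138, 139}
reply = acc.on_prepare(2, leader_knows)  # one message, one ballot, all instances
```

Here `reply` contains entries only for instances 135 and 140, matching the example on the slide; every other unknown instance is implicitly reported as unconstrained.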

13
Abnormal Operation

We assumed that there is a single leader, so
only phase 2 needs to be executed for each
instance.
What happens if that is not the case?
Safety is preserved (why?).
A single leader is needed for liveness.

14
Sequential-Linearization

A sequential-linearization π of a concurrent
execution σ is:
A sequential execution
Each invocation is immediately followed by its
response
Satisfies the object's sequential specification
Looks like σ
Responses to all invocations are the same as in σ
Responses to pending invocations in σ may be
added
Preserves local real-time order
If the completion of operation o1 at process pi
occurs in σ before the invocation of operation
o2 at process pi, then o1 appears before o2 in π
Can be written as π|i = σ|i for all i

15
Sequential Consistency

A concurrent execution that has a
sequential-linearization is sequentially
consistent
What's different from linearizability?

16
Sequential Consistency

A concurrent execution that has a
sequential-linearization is sequentially
consistent
What is the difference from linearizability?
Both linearizability and sequential consistency
are strong consistency conditions: all
processes must agree on the order in which all
operations occur
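
The two conditions can be compared with a brute-force checker for tiny register histories. This is a sketch under stated assumptions: the `Op` record, the helper names, and the use of integer positions for invocation and response times are all hypothetical, every operation is assumed to be complete (pending invocations are not handled), and trying every permutation is only practical for a handful of operations. The only difference between the two checks is which real-time precedences the sequential order must respect: those within a single process, or those between any two operations.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    proc: int      # process that invoked the operation
    reg: str       # register name
    kind: str      # "write" or "read"
    value: object  # value written, or value returned by the read
    inv: int       # position of the invocation in the execution
    resp: int      # position of the response in the execution


def legal(seq, initial=None):
    """Register specification: each read returns the latest written value."""
    regs = {}
    for op in seq:
        if op.kind == "write":
            regs[op.reg] = op.value
        elif regs.get(op.reg, initial) != op.value:
            return False
    return True


def respects(seq, same_process_only):
    """If a completed before b was invoked, a must come first in seq."""
    pos = {op: k for k, op in enumerate(seq)}
    for a in seq:
        for b in seq:
            ordered = a.resp < b.inv
            relevant = a.proc == b.proc or not same_process_only
            if ordered and relevant and pos[a] > pos[b]:
                return False
    return True


def has_order(ops, same_process_only):
    return any(legal(p) and respects(p, same_process_only)
               for p in permutations(ops))


def sequentially_consistent(ops):
    return has_order(ops, same_process_only=True)


def linearizable(ops):
    return has_order(ops, same_process_only=False)


# p1 writes 1, then p2 writes 2, then p3 reads 1 (no two operations overlap).
history = [
    Op(1, "x", "write", 1, inv=0, resp=1),
    Op(2, "x", "write", 2, inv=2, resp=3),
    Op(3, "x", "read", 1, inv=4, resp=5),
]
```

For this history `sequentially_consistent(history)` holds, because the read may be placed between the two writes, while `linearizable(history)` does not, because the real-time order forces the read after the write of 2.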

17
Some notations

x.writei(v): invocation by process pi of a write
operation with value v to register x
x.acki: completion of a write operation to
register x by process pi
x.readi: invocation by process pi of a read
operation from register x
x.reti(v): completion of a read operation from
register x by process pi, with v being the
returned value

18
Sequentially consistent local-writes algorithm

The algorithm emulates a sequentially consistent
shared register using message passing.
abcast and adeliver: reliable atomic broadcast
xi is the local copy of the shared register x at
pi

upon x.readi
  if num = 0 then
    invoke x.reti(xi)
upon x.writei(v)
  num ← num + 1
  abcast(⟨"write", x, v⟩)
  invoke x.acki
upon adeliveri(j, ⟨"write", x, v⟩)
  xi ← v
  if (i = j) then
    num ← num − 1
    if num = 0 and a read on x is pending then
      invoke x.reti(xi)

The algorithm is taken from the Attiya and Welch
book (second edition), page 197
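
The three handlers can be exercised with a small single-threaded simulation. This is a sketch, not the book's code: the `Process` and `Network` classes are hypothetical, atomic broadcast is replaced by one global FIFO queue whose head is delivered to every process in a single step, registers start at `None`, and a pending read is answered whichever register it targets.

```python
from collections import deque


class Network:
    """Toy atomic broadcast: one global queue, same delivery order everywhere."""
    def __init__(self):
        self.queue = deque()
        self.processes = []

    def abcast(self, sender, msg):
        self.queue.append((sender, msg))

    def deliver_next(self):
        sender, msg = self.queue.popleft()
        for p in self.processes:
            p.adeliver(sender, msg)


class Process:
    def __init__(self, pid, network):
        self.pid = pid
        self.network = network
        self.copy = {}            # local copies of the shared registers
        self.num = 0              # own writes broadcast but not yet delivered
        self.pending_read = None  # register of a read waiting for num == 0
        self.log = []             # responses produced at this process

    def read(self, x):
        if self.num == 0:
            self.log.append(("ret", x, self.copy.get(x)))
        else:
            self.pending_read = x  # wait until our own writes come back

    def write(self, x, v):
        self.num += 1
        self.network.abcast(self.pid, ("write", x, v))
        self.log.append(("ack", x))  # local write: acknowledged immediately

    def adeliver(self, sender, msg):
        _, x, v = msg
        self.copy[x] = v
        if sender == self.pid:
            self.num -= 1
            if self.num == 0 and self.pending_read is not None:
                r, self.pending_read = self.pending_read, None
                self.log.append(("ret", r, self.copy.get(r)))


net = Network()
p1, p2 = Process(1, net), Process(2, net)
net.processes = [p1, p2]

p1.write("x", 1)    # acked before any process has delivered the write
p2.read("x")        # p2 has no outstanding writes, so it answers locally
p1.read("x")        # p1 must wait for its own write to be delivered
net.deliver_next()  # the write reaches both processes
```

In this run p2's read returns the initial value even though p1's write was already acknowledged, while p1's own read is delayed until the write is delivered and then returns 1.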
19
Question 2

For each of the following executions, determine
whether it is linearizable, sequentially
consistent, or neither, and explain (assume that
the initial value in all registers is ??)
x.write1(0), x.write2(1), x.ack1, x.read1, x.ack2, x.ret1(0), x.read2, x.ret2(1)
x.write1(1), x.ack1, x.read2, x.ret2(-), x.read2, x.ret2(1)
x.write1(0), x.write2(1), x.ack1, x.ack2, x.read1, x.read2, x.ret1(0), x.ret2(1)
x.write1(1), x.ack1, x.write3(2), x.ack3, x.read4, x.read2, x.ret2(1), x.ret4(2), x.read4, x.ret4(1), x.read2, x.ret2(2)
x.write1(1), y.write2(1), y.ack2, x.ack1, y.read1, x.read2, x.ret1(-), y.ret2(-)
Hint: it always helps to draw the execution as in
the lectures, and your explanation should use the
requirements made by the definition