# A Sqrt(N) Algorithm for Mutual Exclusion in Decentralized Systems - PowerPoint PPT Presentation


Slides: 64
Provided by: moosamu
Transcript and Presenter's Notes


1
A Sqrt(N) Algorithm for Mutual Exclusion in
Decentralized Systems
M. Maekawa, University of Tokyo
2
The Problem of Mutual Exclusion
• Exclusive access to a shared resource is
sometimes essential for consistency.
• The problem of mutual exclusion involves granting
such access when it is desired by a requestor.

3
Previous Work
• Ricart and Agrawala
• O(N) messages.
• Each node communicates with every other node.
• Thomas
• Majority voting, still O(N) messages
• Gifford and Skeen
• Majority voting with a non-uniform distribution

4
Introduction
• Maekawa's algorithm uses c * Sqrt(N) messages to
obtain Mutual Exclusion.
• c is a constant between 3 and 5
• Symmetric
• Fully parallel operation

5
Network assumptions
• Error-free
• FIFO channels
• Messages between two nodes are delivered in the
order sent

6
Decentralization
• Two criteria for decentralized Mutual Exclusion
• Equal Responsibility: Each node in the network
bears an equal amount of responsibility to
control Mutual Exclusion.
• Equal Work: Each node must perform an equal
amount of work to obtain Mutual Exclusion.

7
Intuition behind Voting Sets
• Any pair of mutual exclusion requests must be
arbitrated so that only one of the requesting nodes
is given access.
• The system consists entirely of identical nodes,
which must share the responsibility for mutual
exclusion (decentralization).
• Thus, any two requests must reach a
certain common node.
• This implies that the voting sets of any two
nodes i and j, given by Si and Sj, must have a
non-empty intersection.

8
Voting Set Rules
9
Voting Set Rules
Every pair of Voting Sets should have at least
one shared node.
Each node's Voting Set should contain the node
itself.
10
Voting Set Rule Details
All Voting Sets should be of the same size. This
ensures that each node does an equal amount of
work for obtaining mutual exclusion.
Each node should appear a constant number of
times in all the Voting Sets. This ensures that
each node is equally responsible for mutual
exclusion.
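
The four rules on the two slides above can be checked mechanically. The following is a minimal sketch, not code from the paper: the function name `check_voting_sets` and the 7-node example are assumptions made for illustration (the example uses the sets {i, i+1, i+3} mod 7, which form a projective plane of 7 points with K = 3).

```python
from itertools import combinations


def check_voting_sets(sets):
    """Check the four voting set rules; sets[i] is the voting set of node i."""
    n = len(sets)
    # Rule 1: every pair of voting sets shares at least one node.
    intersect = all(sets[i] & sets[j] for i, j in combinations(range(n), 2))
    # Rule 2: each node belongs to its own voting set.
    contains_self = all(i in sets[i] for i in range(n))
    # Rule 3: all voting sets have the same size (equal work).
    same_size = len({len(s) for s in sets}) == 1
    # Rule 4: every node appears in the same number of sets (equal responsibility).
    counts = [sum(node in s for s in sets) for node in range(n)]
    same_load = len(set(counts)) == 1
    return intersect, contains_self, same_size, same_load


# Hypothetical 7-node assignment with K = 3.
EXAMPLE = [{i, (i + 1) % 7, (i + 3) % 7} for i in range(7)]
print(check_voting_sets(EXAMPLE))
```

For the example assignment all four checks pass; two disjoint sets such as `[{0}, {1}]` fail the intersection rule.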
11
Optimal Voting Set Size
• The general idea is to express the maximum
number of Voting Sets (the size of the set of
Voting Sets) in terms of D and K, guided by the
established set of rules. This evaluates to
• (D - 1)K + 1
• This should be equal to the number of nodes, N,
since we want neither fewer nor more voting sets.
• D is the number of Voting Sets each node appears
in, K is the size of a Voting Set, and KN is the
total number of memberships, so N = KN/D. Thus
D = K, and N = K(K - 1) + 1.
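
As a small worked example, reading the slide's count as N = (D - 1)K + 1 with D = K gives N = K(K - 1) + 1, which can be solved for K. The helper name `voting_set_size` is hypothetical and the sketch only covers values of N for which an exact integer K exists.

```python
import math


def voting_set_size(n):
    """Return the integer K with K*(K-1) + 1 == n, or None if there is none."""
    # Positive root of K^2 - K + (1 - n) = 0 is (1 + sqrt(4n - 3)) / 2.
    k = (1 + math.isqrt(4 * n - 3)) // 2
    return k if k * (k - 1) + 1 == n else None


for n in (7, 13, 21, 10):
    print(n, voting_set_size(n))
```

For N = 7, 13 and 21 this yields K = 3, 4 and 5, so K grows roughly as Sqrt(N); N = 10 has no exact solution.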

12
More Math
The problem of finding a set of Si's which
satisfy all rules exactly is equivalent to
finding a finite projective plane of N
points. The finite projective plane result
implies that a set of Si's complying with all
the rules exists if (K-1) is a power of a prime. For
other cases some of the rules can be relaxed. In
general
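
The projective plane connection can be illustrated for the simplest case, where K-1 is itself a prime q. This is a sketch of the standard construction over the integers mod q, not code from the paper; the function name is an assumption, prime powers that are not primes are not handled, and the final step of assigning each line to a node that it contains (so that every node is in its own Voting Set) is left out.

```python
from itertools import combinations, product


def projective_plane_lines(q):
    """Lines of the projective plane of prime order q, as sets of point indices."""
    def is_normalized(v):
        # Keep one representative per point: first nonzero coordinate equals 1.
        return next(x for x in v if x) == 1

    points = [v for v in product(range(q), repeat=3) if any(v) and is_normalized(v)]
    index = {p: i for i, p in enumerate(points)}
    # By duality, lines use the same coordinates; a point lies on a line
    # when their dot product is 0 mod q.
    return [
        frozenset(index[p] for p in points
                  if sum(a * b for a, b in zip(line, p)) % q == 0)
        for line in points
    ]


lines = projective_plane_lines(3)   # N = 13 candidate voting sets, K = 4
print(len(lines), {len(s) for s in lines})
print(all(len(a & b) == 1 for a, b in combinations(lines, 2)))
```

With q = 3 this produces 13 sets of size 4, and any two of them share exactly one node.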
13
Algorithm Intuition
• If node i can lock all members of its Voting Set
Si, then no other node can capture all of its own
members, since the intersection of its Voting Set
with that of i contains at least one node.
• If a node fails to capture all its members, it
waits until they are freed and then locks them.
• To prevent deadlocks, requests get a priority based
on their timestamp.

14
Messages
• REQUEST
• The message sent by a node to request mutual
exclusion
• REQUEST messages are time-stamped and earlier
ones get higher priority.
• INQUIRE
• The INQUIRE message is sent by a member node j to a
node i whose request currently locks j, if j receives
another request that predates that of i.
• The purpose of the INQUIRE message is to ask
node i whether it can indeed lock all its members. It
is only sent once.
• RELINQUISH
• A reply to INQUIRE if the originating node cannot
lock all its members.
• RELEASE
• The message sent by a node after it has completed
its critical section
• LOCKED
• The message sent from a member node to a
requesting node if it is not currently locked by
another request.
• FAILED
• The message sent from a member node to a
requestor when it is currently locked by a higher
priority request.
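
The message types above can be tied together with a sketch of the member-node side. This is a simplified illustration under stated assumptions, not the paper's pseudocode: the class name `Arbiter`, the `(timestamp, node_id)` request tuples (node id as tie-breaker) and the return convention (a list of `(message, destination)` pairs) are all made up for this example, and the requesting-node side is omitted.

```python
import heapq


class Arbiter:
    """Member node: holds one lock and grants it to one request at a time."""

    def __init__(self):
        self.locked_by = None   # request currently holding this member's lock
        self.waiting = []       # min-heap of pending (timestamp, node_id) requests
        self.inquired = False   # has INQUIRE already been sent to the holder?

    def on_request(self, timestamp, node_id):
        request = (timestamp, node_id)
        if self.locked_by is None:
            self.locked_by = request
            return [("LOCKED", node_id)]
        heapq.heappush(self.waiting, request)
        if request > self.locked_by or self.waiting[0] != request:
            # An earlier request holds the lock or is already waiting.
            return [("FAILED", node_id)]
        if not self.inquired:
            # The new request predates the holder's: ask the holder once.
            self.inquired = True
            return [("INQUIRE", self.locked_by[1])]
        return []

    def on_relinquish(self):
        # The holder gave up: requeue it and grant the earliest request.
        heapq.heappush(self.waiting, self.locked_by)
        return self._grant_next()

    def on_release(self):
        # The holder has left its critical section.
        self.locked_by = None
        return self._grant_next()

    def _grant_next(self):
        self.inquired = False
        if not self.waiting:
            self.locked_by = None
            return []
        self.locked_by = heapq.heappop(self.waiting)
        return [("LOCKED", self.locked_by[1])]
```

In this sketch, a request with timestamp 5 from node 2 is LOCKED, a later request (7, node 3) gets FAILED, and an earlier one (1, node 1) triggers an INQUIRE to node 2; if node 2 relinquishes, node 1 receives LOCKED.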

15
Example (1)
16
Example (2)
17
Example (3)
18
Example (4)
19
Example (5)
20
Proof of Mutual Exclusion
• Paper contains a proof.
• Essentially, as long as the network assumptions
hold and the algorithm observes its specification,
mutual exclusion is guaranteed.

21
• Deadlocks are eliminated in Maekawa's algorithm
by attacking the circular wait condition
• For any cycle, there must be one node in the
cycle whose REQUEST timestamp is preceded by those
of both its adjacent nodes in the circular wait. The
removal of such a node breaks the circular wait

22
Starvation
• Starvation is a state of no progress.
• For a node i in Maekawa's algorithm, starvation
would occur if i's REQUESTs were continuously
blocked by preceding REQUEST messages at various
members of Si.
• This is however impossible because there can be
at most (K-1) preceding outstanding requests for
any request by a node and therefore, in finite
time, i's request will be accommodated.

23
Message Traffic: Light Demand
• Contention is rare
• For an instance of mutual exclusion
• (K-1) REQUEST messages
• (K-1) LOCKED messages
• (K-1) RELEASE messages

24
Message Traffic: Heavy Demand
• At most (K-1) messages for each of REQUEST,
INQUIRE, FAILED, RELEASE, RELINQUISH
• Thus, a maximum of 5(K-1) messages.
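
To make the counts on the two Message Traffic slides concrete, the sketch below evaluates them for a given K. The function names are hypothetical, and the 2(N-1) figure for Ricart and Agrawala is the commonly cited count for that algorithm rather than a number taken from these slides.

```python
def maekawa_messages(k):
    """Messages per critical-section entry for voting sets of size k."""
    light = 3 * (k - 1)   # REQUEST + LOCKED + RELEASE
    heavy = 5 * (k - 1)   # upper bound quoted for heavy demand
    return light, heavy


def ricart_agrawala_messages(n):
    """Commonly cited count for Ricart and Agrawala: 2(N-1) messages."""
    return 2 * (n - 1)


# N = 13 nodes with K = 4 (since 4 * 3 + 1 = 13).
print(maekawa_messages(4), ricart_agrawala_messages(13))
```

For N = 13 and K = 4 this gives 9 messages under light demand and at most 15 under heavy demand, against 24 for Ricart and Agrawala.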

25
Node Failure
• Algorithm assumes that failures can be detected
by other nodes and failed nodes are removed from
the system.
• A simple approach to deal with failure is to
allow another node to take over the
responsibilities of the failed node.

26
Comparison
27
Variations
• A simplified version of the algorithm can achieve
mutual exclusion in 2 Sqrt(N) messages if greater
delays are tolerated.
• This entails sending REQUEST messages one-by-one
to Voting Set members only if all REQUESTs so far
have been LOCKED, in cyclic order.
• An additional Sqrt(N) messages are required to
release the mutual exclusion.

28
Critique
• Message complexity is O(Sqrt(N)), which is
better than Ricart and Agrawala's O(N) for a
decentralized protocol.
• Symmetry: The responsibility on nodes to
control mutual exclusion and the work needed to
attain mutual exclusion are balanced.
• -- Does not address the problems associated with
churn. Changes to Voting Sets under large churn will
lead to rapid degeneration from the ideal Maekawa
setup, depending on the policies for removal and
addition of nodes.

29
A Practical Distributed Mutual Exclusion Protocol
in Dynamic Peer-to-Peer Systems
Authors: Shi-Ding Lin (Microsoft Research Asia),
Zheng Zhang (Microsoft Research Asia),
Qiao Lian (Tsinghua University),
Ming Chen (Tsinghua University)
30
Motivation
• Emerging p2p applications (such as the Grid)
built on top of DHTs introduce several new
challenges, when it comes to resource sharing
• An important challenge that needs to be tackled
in this context is providing mutual exclusion
• Techniques used to provide mutual exclusion in
traditional distributed systems are not
completely applicable to p2p systems
• Enforcing concurrency using stable transaction
servers is out of the question

31
Quorum Consensus Background
• A quorum is a subgroup of replica managers whose
size gives it the right to carry out operations
• Quorum consensus is a replication scheme that
provides all the benefits of replication, along
with being able to handle network partitions

32
Introduction
• This paper proposes the Sigma protocol which is
implemented inside a dynamic p2p DHT
• It uses queuing and cooperation between clients
and replicas, to enforce a quorum consensus
scheme
• Demonstrated the scalability of this protocol,
resilience to network latency variance, and
fault-tolerance

33
Challenges
• The open and dynamic nature of a wide-area p2p
environment
• Random resets, meaning the logical replica
crashes and is replaced by one of its neighbors
in the DHT

34
System Model
• The replicas are always available, but their
internal states may be randomly reset
• The number of clients is unpredictable and can be
very large
• Clients and replicas communicate via messages
across unreliable channels. Messages can be
replicated and lost

35
System Model: Majority consensus in p2p DHT
36
System Model: Assumptions
• Clients are not malicious and are fail-stop
• Messages cannot be forged
• The typical lifetime of a DHT node is long enough
so that a client can talk directly to the current
logical replica

37
Strawman Protocol
• A simple, ALOHA-like protocol

(Figure: strawman protocol diagram with a Client, the Replicas, and the Critical Section)
38
Strawman Protocol Discussion
• Purpose behind this is to show that high variance
of network latency between clients and replicas
is responsible for the large performance
degradation
• Observed that robust mutual exclusion was
entirely feasible, with m = 24, n = 32 (i.e. m/n =
0.75)
• Chance of breaking exclusivity is 10^-40
• The performance, on the other hand, is very poor,
as shown by the following graph

39
Strawman Protocol Performance
40
Strawman Protocol Poor Performance Reasoning
• Due to the variance of network latency between
one client and each replica, requests will reach
different replicas at different times
• Problem 1: Difficult to build a consistent view
of competing clients
• Problem 2: Greedy behavior of clients

41
SIGMA Protocol
• Addresses both problems faced by the strawman
protocol
• Problem 1: Solved by installing a queue at the
replica, which is shuffled to reach a consistent
view, in case of high contention
• Problem 2: Solved by placing clients into an
active waiting state

42
SIGMA Protocol Architecture
43
SIGMA Protocol YIELD Operation
• This is an important performance optimizing
operation and shows the collaborative side of
this protocol
• YIELD = RELEASE + REQUEST
• By issuing the YIELD request, clients are
collectively offering the replicas a chance to
build a more consistent view and choose the right
winner
• This continues until a winner is chosen

44
SIGMA Protocol With failures (1 of 2)
• After a crash, a replica may vote a second time.
Solved by raising m/n ratio
• Replicas grant permission to a client using a
renewable lease, to handle the case when a client
crashes while in the CS
• Message loss due to unreliable communication is
treated the same as a client/replica crash
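
The renewable lease idea can be sketched as follows. This is an illustrative assumption about how such a lease might behave on a single replica, not the Sigma implementation: the class name `Lease`, its methods and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are all hypothetical.

```python
class Lease:
    """Replica-side permission that expires unless the holder renews it."""

    def __init__(self, duration):
        self.duration = duration
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, client, now):
        # Grant if nobody holds the lease or the previous lease has expired
        # (for example because the holder crashed inside the CS).
        if self.holder is None or now >= self.expires_at:
            self.holder, self.expires_at = client, now + self.duration
            return True
        return False

    def renew(self, client, now):
        # Only the current holder may extend an unexpired lease.
        if self.holder == client and now < self.expires_at:
            self.expires_at = now + self.duration
            return True
        return False


lease = Lease(duration=10)
print(lease.acquire("a", now=0), lease.acquire("b", now=5))
print(lease.renew("a", now=8), lease.acquire("b", now=18))
```

If client "a" stops renewing, its lease lapses and client "b" can obtain permission, so a crashed client does not block the critical section indefinitely.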

45
SIGMA Protocol With failures (2 of 2)
• If the queue is destroyed, the replica's state can be
rebuilt using the following informed backoff
mechanism
• Replica predicts expected waiting time and
notifies client to retry after that time, using
the empirical formula Tw = TCS * (P + 1/2)
• P: client's position in the queue
• TCS: average CS duration
• Tw is always updated upon the reception of a retry
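
A minimal sketch of the retry estimate, reading the slide's formula as Tw = TCS * (P + 1/2). The function name is hypothetical, and whether P counts from 0 or 1 is not stated on the slide; the example assumes 0 for the client at the head of the queue.

```python
def expected_wait(position, avg_cs_duration):
    """Estimated time before a queued client should retry.

    position: the client's place in the replica's queue (assumed 0-based).
    avg_cs_duration: average critical-section duration, in any time unit.
    """
    return avg_cs_duration * (position + 0.5)


for p in range(4):
    print(p, expected_wait(p, avg_cs_duration=2.0))
```

With an average CS duration of 2 time units, the client at position 3 would be told to retry after 7 units.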

46
SIGMA Protocol Analysis
• Service Policy: since client requests can take
arbitrarily long to reach the replica,
FCFS is not guaranteed
• Safety: guaranteed with high probability, since
the chance of a safety violation is made negligible
by choosing an appropriate m/n ratio
• Liveness: ensured through the lease mechanism

47
Experimental Results (1 of 3)
• Sigma protocol is fully implemented and deployed
in a distributed testbed
• Assume an infinite pool of clients, each firing
requests to enter CS based on a Poisson
distribution, with λ as the incoming request rate
• After 5 minutes warm-up, tested 10 minutes during
which throughput in terms of number of serviced
requests per second is measured. Repeated for
different incoming request rates
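
The workload described above can be approximated with a simple generator. This sketch is not the authors' test harness: the function name, the seed and the chosen rate are assumptions. It uses the fact that a Poisson arrival process with rate λ has exponentially distributed gaps between arrivals.

```python
import random


def poisson_arrivals(rate, duration, seed=0):
    """Arrival times in [0, duration) of a Poisson process with the given rate."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        # Inter-arrival gaps of a Poisson process are exponential with mean 1/rate.
        t += rng.expovariate(rate)
        if t >= duration:
            return arrivals
        arrivals.append(t)


# Hypothetical run: 5 requests per second over a 10-minute measurement window.
arrivals = poisson_arrivals(rate=5.0, duration=600.0)
print(len(arrivals) / 600.0)   # observed request rate, close to 5
```

Repeating the run for several values of `rate` mirrors the slide's procedure of measuring throughput at different incoming request rates.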

48
Experimental Results (2 of 3)
49
Experimental Results (3 of 3)
50
Novel Contributions
• Showed that high variation of network latency
between clients and replicas causes performance
degradation in an ALOHA-like strawman protocol
• Demonstrated that a cooperative strategy between
clients and replicas is necessary for overcoming
the above problem and also to achieve scalability
and robustness
• Proposed an informed backoff mechanism to rebuild
a replica's state (after a crash)

51
Critique of the Sigma Algorithm
• The YIELD operation (which is a big performance
optimization for their protocol) might not be
suitable for a file-sharing p2p application
• How to come up with the best lease expiration
period?
• Effect on performance by changing m or n
parameters?

52
Future Ideas
• Analyze the performance of Sigma protocol by
removing the YIELD operation
• Extending the concept of logical replicas to
devise mechanisms to handle compromised and
malicious nodes

53
Conclusion
• Proposed a practical, efficient and
fault-tolerant protocol for distributed mutual
exclusion inside p2p DHT
• Utilized logical replicas and quorum consensus to
deal with system dynamisms
• Dealt with failures using informed backoff and
lease mechanisms
• Presented some basic performance analysis in this
paper. More can be found in their Technical
Report

54
Scalable and Dynamic Quorum Systems
Authors: Moni Naor, Udi Wieder. Department of
Computer Science, The Weizmann Institute of
Science (Some slides borrowed from the authors'
website)
55
Introduction
• Definition: A quorum system is a collection of
quorums, every two of which intersect
• Investigated two aspects of quorum systems
• Algorithmic complexity of finding a quorum in
case of random failures
• Dynamic Paths

56
Motivation
• For Dynamic quorums: Popularity of p2p
applications
• Churn
• Large number of nodes entering and leaving
arbitrarily
• Mutual exclusion and data replication, using
quorum systems
• Validate assumptions

57
system and its probe complexity for non-adaptive
algorithms
• Proved that the non-adaptive probe complexity is
at least log n divided by the load
• Proved in another paper that load of a quorum
system

58
(No Transcript)
59
(No Transcript)
60
(No Transcript)
61
Dynamic Path Quorum Systems - Voronoi Diagram
• Dynamic: Processors are joining and leaving the
system arbitrarily.
• The grid is substituted with a continuous unit
square.
• This square can be decomposed into Voronoi cells
where each processor is associated with a cell.
• Adding a processor involves computing a new
boundary.

62
The Dynamic Paths Quorum System
A quorum is the union of the vertices that form
right-left and top-bottom paths in the Delaunay
graph.
63
Thank you