Title: Recap
Recap
- Fault Tolerance
- Process Resilience

Today
- Reliable Client-Server Communication
- Reliable Group Communication
Reliable Communication
- There are multiple types of communication failure
  - Crashes - the communication channel breaks in some way
  - Omission - messages are dropped
  - Timing - messages arrive too slowly (or too quickly)
  - Arbitrary - messages are duplicated, corrupted, etc.
- Two primary types of communication
  - Point-to-Point
  - RPC
Point-to-Point Communication
- Reliable point-to-point communication typically takes the form of TCP sockets
- Omission failures are masked using a system of acknowledgements and retransmissions
- Arbitrary failures are masked using packet numbering and the reliability of the underlying Internet Protocol
- Crash failures and timing failures cannot always be masked - the way to get reliability in the face of crash failures is to have the system automatically re-establish broken connections
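The reconnect-on-crash idea can be sketched as a small helper (Python for illustration; `connect_with_retry` and its parameters are hypothetical names, not a standard API):

```python
import socket
import time

def connect_with_retry(host, port, attempts=5, delay=0.1):
    # Mask a crash of the peer or channel by re-establishing the TCP
    # connection, with a short pause between attempts.
    last_err = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=1.0)
        except OSError as exc:
            last_err = exc
            time.sleep(delay)  # back off, then retry
    raise ConnectionError("could not re-establish connection") from last_err
```

A real client would wrap every send/receive this way, reconnecting whenever an operation fails mid-stream; note that reconnection alone says nothing about what happened to requests in flight when the connection broke.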
Remote Procedure Calls
- The goal of RPC is to hide communication by making remote calls look local
- As long as the client and server are functioning perfectly, and the network is reasonably speedy, it does a good job
- When errors occur in communication, the differences between local and remote calls aren't always easy to mask
Remote Procedure Calls
- Five main classes of failure can occur in RPC systems
  - The client is unable to locate the server
  - The request message from the client to the server is lost
  - The server crashes after receiving a request
  - The reply message from the server to the client is lost
  - The client crashes after sending a request
- Each of these has its own set of problems
Remote Procedure Calls: Client Cannot Locate the Server
- This can happen if the server is down, or if the server has been changed since the client was built (so the interface isn't compatible anymore)
- One solution is to raise an exception on the client side that must be dealt with by an exception handler
- Drawbacks: not every language has exceptions, and this destroys the transparency
- We pretty much can't maintain transparency in this case
Remote Procedure Calls: Lost Request Messages
- This can happen for many reasons, and is the easiest failure to deal with
- We have the client (or OS) start a timer when sending the request; if there's no reply before the timer runs out, we send the request again
- If the message was really lost, everything's OK because the server never saw the first one
- If the message wasn't lost, as long as the server can detect that it's a duplicate, everything is still OK
- It's possible for the client to incorrectly conclude that the server is down, which isn't good, but can't be avoided
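The timer-and-retransmit scheme can be sketched as follows; the transport is abstracted into a `send` callable and a reply queue, and all names are illustrative rather than any real RPC library's API:

```python
import queue

def call_with_retries(send, reply_queue, request, timeout=0.05, max_tries=3):
    # Start a timer after each transmission; if no reply arrives before
    # it expires, retransmit the same request.
    for _ in range(max_tries):
        send(request)
        try:
            return reply_queue.get(timeout=timeout)  # wait for the reply
        except queue.Empty:
            continue  # timer ran out: send the request again
    # Possibly a wrong conclusion (the server may just be slow), but the
    # client has no way to tell the difference.
    raise TimeoutError("server presumed down")
```

This is exactly the mechanism that makes server-side duplicate detection necessary: the second transmission may be a duplicate of a request the server already executed.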
Remote Procedure Calls: Server Crashes
- There are multiple places where the server can crash, all of which look the same to the client (it doesn't get a reply)
Remote Procedure Calls: Server Crashes
- There are three schools of thought on what the RPC system should do in these scenarios
  - Keep trying until a reply has been received (on the assumption that the server will restart eventually), then return that reply to the client - at-least-once semantics, which guarantees the call was executed one or more times
  - Give up immediately and report the failure - at-most-once semantics, which guarantees the call was executed one time or not at all
  - Don't guarantee anything (very easy to implement)
Remote Procedure Calls: Server Crashes
- None of those options is what we really want - we really want exactly-once semantics, but unfortunately, we can't have it
- No matter what strategy is used by the client to reissue unanswered requests, or by the server to send completion messages, duplicate executions (or no execution) can result
Remote Procedure Calls: Lost Reply Messages
- One solution to lost reply messages is to just rely on a timer again, as with lost request messages
- The problem is that this may cause trouble if the request is not idempotent (an idempotent request is one for which executing it more than once has the same effect as executing it once)
- We can structure many calls idempotently, but with some it simply isn't possible
Remote Procedure Calls: Lost Reply Messages
- One solution is to use sequence numbering, or some other scheme, to let the server detect duplicates
- However, the server still has to respond to the requests and track the sequence numbers, which might substantially increase its processing overhead
- Another solution is to have a bit in the message header that distinguishes originals from duplicates - originals can always be processed safely (this doesn't help too much, though)
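Sequence-number duplicate detection might look like the sketch below: the server caches the reply for each (client, sequence number) pair and replays it for retransmissions instead of re-executing, which is what makes non-idempotent requests safe. `DedupServer` is a hypothetical name; a real server would also bound or age out the cache.

```python
class DedupServer:
    # Server-side duplicate filter keyed on (client_id, seq_no).
    def __init__(self, handler):
        self.handler = handler
        self.seen = {}  # (client_id, seq_no) -> cached reply

    def handle(self, client_id, seq_no, request):
        key = (client_id, seq_no)
        if key in self.seen:
            # Retransmitted request: replay the old reply, do NOT
            # execute the operation a second time.
            return self.seen[key]
        reply = self.handler(request)
        self.seen[key] = reply
        return reply
```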
Remote Procedure Calls: Client Crashes
- A client can send a request to a server, but then crash before it receives the response - this leaves an orphan computation running on the server
- Orphans can cause problems such as wasting CPU cycles, locking files, or otherwise using resources - also, if the client resends the request and receives a response from the orphan, chaos can ensue
- Nelson (1981) proposed four solutions to the problem of orphans
Remote Procedure Calls: Client Crashes
- Extermination
  - The client stub logs all RPC transmissions to disk; when the machine comes back up after a crash, it explicitly cancels any RPCs that were in progress
- Disadvantages of this approach
  - It's expensive to keep a log
  - The orphans themselves may make RPC calls that are difficult to cancel
  - It's possible that the network will be partitioned in a way such that the cancellation doesn't make it to the server
Remote Procedure Calls: Client Crashes
- Reincarnation
  - Time is divided into sequentially numbered epochs, and the epoch number is incremented on every reboot
  - When a client boots, it broadcasts its epoch number to all machines, and they cancel any RPCs that have an old epoch number
- Disadvantages of this approach
  - It requires a broadcast to the entire network
  - If the network is partitioned, some orphans may survive (though they can be detected once they communicate)
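The epoch bookkeeping on each machine is simple enough to sketch directly (illustrative names throughout; the broadcast transport itself is elided):

```python
class Machine:
    # Reincarnation sketch: computations are tagged with the epoch of
    # the client that started them, and an epoch broadcast kills any
    # computation from an earlier epoch of that client.
    def __init__(self):
        self.computations = []  # (owner, epoch, name)

    def start(self, owner, epoch, name):
        self.computations.append((owner, epoch, name))

    def on_epoch_broadcast(self, owner, new_epoch):
        # Cancel orphans: anything this owner started in an older epoch.
        self.computations = [
            c for c in self.computations
            if not (c[0] == owner and c[1] < new_epoch)
        ]
```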
Remote Procedure Calls: Client Crashes
- Gentle Reincarnation
  - Like reincarnation, but less draconian
  - When an epoch broadcast comes in, each machine kills only those computations for which it cannot locate the owner on the network
  - This mainly addresses the possible situation where a false epoch message is received from some faulty (or malicious) process on the network
Remote Procedure Calls: Client Crashes
- Expiration
  - Each RPC is given a quantum of time T to run to completion, and must explicitly ask for another quantum if it can't finish
  - After a crash, a client has to wait for at least one quantum to pass before coming back online, by which time all its orphans will have disappeared
  - The main problem with this method is deciding on a reasonable value for T, balancing the need to clean up orphans quickly against the communication overhead that renewal requests can cause
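Expiration is essentially a lease: a sketch under that reading (hypothetical `Lease` class, not any particular RPC system's mechanism) might look like this.

```python
import time

class Lease:
    # Each RPC runs under a quantum T; it must call renew() before the
    # quantum lapses, or the server is free to kill it as an orphan.
    def __init__(self, quantum):
        self.quantum = quantum
        self.expires = time.monotonic() + quantum

    def renew(self):
        # Explicitly ask for another quantum of time.
        self.expires = time.monotonic() + self.quantum

    def alive(self):
        return time.monotonic() < self.expires
```

A crashed client stops renewing, so its computations lapse within one quantum; the tension in choosing T is visible here as the trade-off between how often `renew()` must cross the network and how long a dead orphan can linger.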
Remote Procedure Calls: Client Crashes
- In practice, none of these solutions is particularly desirable
- Killing an orphan may also have unforeseen consequences, such as database corruption or files staying locked forever
- An orphan may have taken various actions, such as setting timers to start other processes at future times, that make removing all traces of it from the system impossible
Reliable Group Communication
- Because process resilience by replication is so important, reliable multicast services are important as well
- It turns out to be rather difficult to multicast reliably - some of the difficulty lies in defining exactly what "reliably" means in terms of multicast communication
- We distinguish between reliable multicast in the presence of faulty processes and reliable multicast when processes are assumed to operate correctly
Reliable Multicast
- If there are faulty processes, multicasting is considered reliable if it is guaranteed that all non-faulty group members receive the messages
- However, agreement needs to be reached on what the group looks like before messages can be delivered
- If there are no faulty processes, and the group membership doesn't change during communication, multicasting is considered reliable if every message is delivered to every group member - we get agreement for free
Reliable Multicast Implementations
- It's (relatively) easy to implement reliable multicast with non-faulty processes if we don't require messages to be delivered in the same order to all group members
- Unfortunately, the easy solution isn't scalable to large groups
- There are harder solutions that do scale to large groups, of which we will discuss two categories
The Easy Reliable Multicast Implementation
Scalability of the Easy Reliable Multicast Implementation
- If there are N receivers, the sender has to accept at least N acknowledgements - a feedback implosion
- One solution is to have receivers send only negative acknowledgements - when they receive a message and detect they've missed one, they ask for the one they missed
- In theory, the sender then has to keep all messages forever
- This still isn't guaranteed to prevent feedback implosions
- We need more sophisticated solutions
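The negative-acknowledgement idea rests on gap detection in sequence numbers; a minimal sketch (illustrative names, and simplified in that it delivers across the gap instead of buffering until the retransmission arrives):

```python
class NackReceiver:
    # Detects missing multicast messages by gaps in the sequence
    # numbers and records a NACK for each skipped message.
    def __init__(self):
        self.expected = 0    # next sequence number we expect
        self.delivered = []
        self.nacks = []      # sequence numbers to request again

    def receive(self, seq, payload):
        if seq > self.expected:
            # A gap: ask the sender for every message we skipped.
            self.nacks.extend(range(self.expected, seq))
        if seq >= self.expected:
            self.delivered.append((seq, payload))
            self.expected = seq + 1
```

Note the cost the slide mentions: since the receiver may NACK arbitrarily old messages, the sender cannot in general discard anything it has sent.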
Nonhierarchical Feedback Control
- The goal is to reduce the number of feedback messages - we use a technique called feedback suppression
- This technique underlies the Scalable Reliable Multicasting (SRM) protocol (Floyd et al., 1997)
- Receivers never acknowledge the successful delivery of a message
- Negative acknowledgements are multicast, not sent just to the message sender
Nonhierarchical Feedback Control
- This allows other receivers that missed the same message to suppress their feedback, because the replacement message will be multicast when the original sender gets one negative acknowledgement
- The negative acknowledgements are scheduled with random delays, to prevent feedback implosions
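Feedback suppression in this style might be sketched as below. This is a toy in the spirit of SRM, not the protocol itself (the class and method names are invented, and real SRM derives its delay window from measured distances to the sender):

```python
import random

class SuppressingReceiver:
    # A receiver that misses message m schedules a NACK after a random
    # delay, and cancels it if it overhears another receiver's
    # multicast NACK for the same m first.
    def __init__(self, max_delay=0.5):
        self.max_delay = max_delay
        self.pending = {}  # msg_id -> time at which to multicast a NACK

    def missed(self, msg_id, now):
        # Random delay: with luck only one of the receivers that missed
        # msg_id fires before the others hear it and suppress theirs.
        self.pending[msg_id] = now + random.uniform(0.0, self.max_delay)

    def overheard_nack(self, msg_id):
        # Someone else already asked; the retransmission will be
        # multicast, so our own NACK is redundant.
        self.pending.pop(msg_id, None)

    def due_nacks(self, now):
        due = sorted(m for m, t in self.pending.items() if t <= now)
        for m in due:
            del self.pending[m]
        return due
```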
Nonhierarchical Feedback Control
- Drawbacks
  - Feedback messages must be scheduled accurately to prevent feedback implosion
  - Receivers that received a message are forced to receive it again if other receivers missed it
Nonhierarchical Feedback Control
- One workaround is to let receivers that didn't get a particular message m join a separate multicast group for m - but this requires very efficient group management
- Receivers can assist in recovery to increase scalability - if a receiver has successfully received m and then sees a negative acknowledgement for m, it can multicast m itself before the negative acknowledgement reaches the original sender
Hierarchical Feedback Control
- To scale to very large groups, we need some sort of hierarchical organization
- Assume we have one sender that needs to multicast to a very large group of receivers
- We can partition the receivers into subgroups, within which any multicast method that scales to small groups can be used, and elect a local coordinator for each subgroup
Hierarchical Feedback Control
- Within each subgroup, the coordinator handles the negative acknowledgements of subgroup members by retransmitting to the subgroup
- If the coordinator misses a message, it can request it from the coordinator of its parent group
- If we base the implementation on acknowledgements rather than negative acknowledgements, the coordinator doesn't need to keep too large a buffer
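The coordinator's role can be sketched as a buffer with an escalation path (hypothetical `Coordinator` class; subgroup membership and the multicast transport are elided):

```python
class Coordinator:
    # Subgroup coordinator: buffers messages and serves retransmission
    # requests from its subgroup; if it missed a message itself, it
    # asks the coordinator of its parent group.
    def __init__(self, parent=None):
        self.parent = parent
        self.buffer = {}  # seq -> payload

    def store(self, seq, payload):
        self.buffer[seq] = payload

    def retransmit(self, seq):
        if seq in self.buffer:
            return self.buffer[seq]
        if self.parent is not None:
            payload = self.parent.retransmit(seq)  # escalate up the tree
            if payload is not None:
                self.buffer[seq] = payload  # cache for our own subgroup
            return payload
        return None  # root has nothing either
```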
Hierarchical Feedback Control
- The main problem with this scheme is the construction of the tree
- This often has to be done dynamically
  - One way is to make use of the multicast tree in the underlying network, if one exists, by adding extra software to multicast routers - but it's not easy to make that kind of change to routers already deployed on existing networks
Atomic Multicast
- Often, we need to guarantee that, in the presence of process failures, a message is delivered either to all processes in a group or to none at all - this is the atomic multicast problem
- We can define reliable multicast in the presence of process failures in terms of process groups and changes to group membership
Atomic Multicast: Communication Model
- We distinguish between message receipt and message delivery, for the purpose of modeling communication in such a system
Atomic Multicast
- Each multicast message m is associated with a list of processes to which it should be delivered - this list corresponds to the group view that the sender had at the time m was sent
- This group view is shared by the rest of the processes on the list - so each process on the list believes that m should be delivered to all processes on the list, and to no other processes
Atomic Multicast
- Suppose m is multicast when its sender has group view G
- Another process joins or leaves the group while the multicast of m is taking place - this causes a view change (which is communicated with a multicast message vc)
- There are now two messages in transit (m and vc) - we need to guarantee either that m is delivered before vc to all processes, or that m is not delivered at all
Virtual Synchrony
- In principle, the only case where m should not be delivered at all is when the group membership change is caused by the sender of m crashing - either all members of G should hear that the sender crashed before m was sent, or none should
- A reliable multicast that satisfies the requirement that a message multicast to group view G is delivered to each nonfaulty process in G is called virtually synchronous
Virtual Synchrony and Message Ordering
- Virtual synchrony is much like using a synchronization variable in a data store - view changes are barriers that messages cannot cross
- There are four different possible message orderings for virtually synchronous multicast
  - Unordered
  - FIFO-ordered
  - Causally-ordered
  - Totally-ordered
Virtual Synchrony and Message Ordering
- In reliable, unordered multicast, no guarantees are given about the order in which received messages are delivered by different processes
- In reliable, FIFO-ordered multicast, all messages from each individual process are delivered to all other processes in the same order, but no guarantees are made about the relative delivery order of messages from different processes
Virtual Synchrony and Message Ordering
- In reliable, causally-ordered multicast, messages are delivered so that potential causality among messages is preserved (this can be done with vector timestamps)
- In reliable, totally-ordered multicast, all messages are delivered in the same order to all group members (this is generally combined with a requirement of causal or FIFO ordering)
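The vector-timestamp delivery test for causal ordering can be sketched as follows (a minimal version of the classic scheme; the class name and the buffering of not-yet-deliverable messages are left as illustrative assumptions):

```python
class CausalReceiver:
    # A message from sender j carrying vector timestamp v is
    # deliverable at this process iff it is the next message in
    # sequence from j (v[j] == clock[j] + 1) and we have already
    # delivered everything j had delivered when it sent the message
    # (v[k] <= clock[k] for all k != j).
    def __init__(self, n_processes, my_id):
        self.clock = [0] * n_processes  # deliveries seen from each process
        self.my_id = my_id

    def deliverable(self, j, v):
        if v[j] != self.clock[j] + 1:
            return False
        return all(v[k] <= self.clock[k]
                   for k in range(len(v)) if k != j)

    def deliver(self, j, v):
        assert self.deliverable(j, v)
        self.clock[j] = v[j]  # record the delivery from j
```

Messages that fail the test are held back and re-checked after each delivery, which is what preserves potential causality.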
Next Class
- Distributed commit, a general distributed-systems problem of which atomic multicast is an example
- Recovery