Title: Recap

1
Recap
  • Fault Tolerance
  • Process Resilience

2
Today
  • Reliable Client-Server Communication
  • Reliable Group Communication

3
Reliable Communication
  • There are multiple types of communication
    failure
  • Crashes - the communication channel breaks in
    some way
  • Omission - messages are dropped
  • Timing - messages arrive too slowly (or too
    quickly)
  • Arbitrary - messages are duplicated, corrupted,
    etc.
  • Two primary types of communication
  • Point-to-Point
  • RPC

4
Point-to-Point Communication
  • Reliable point-to-point communication is
    typically in the form of TCP sockets
  • Omission failures are masked using a system of
    acknowledgements and retransmissions
  • Arbitrary failures are masked using packet
    numbering and the reliability of the underlying
    Internet Protocol
  • Crash failures and timing failures cannot always
    be masked - the way to get reliability in the
    face of crash failures is to have the system
    automatically re-establish broken connections
    (sketched below)
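
A minimal sketch of the reconnect-on-crash idea in Python; the request/reply exchange, retry count, and backoff are illustrative assumptions, not from the slides:

```python
import socket
import time

def connect_with_retry(host, port, attempts=5, backoff=1.0):
    """(Re-)establish a TCP connection, masking crash failures by retrying."""
    for attempt in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=5.0)
        except OSError:
            time.sleep(backoff * (attempt + 1))  # simple linear backoff
    raise ConnectionError("server unreachable after %d attempts" % attempts)

def reliable_request(host, port, payload):
    """Send a request; if the connection breaks, re-establish it and resend."""
    while True:
        sock = connect_with_retry(host, port)
        try:
            sock.sendall(payload)
            return sock.recv(4096)  # reply (assumes it fits in one read)
        except OSError:
            continue  # connection broke mid-call: reconnect and resend
        finally:
            sock.close()
```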

5
Remote Procedure Calls
  • The goal of RPC is to hide communication, by
    making remote calls look local
  • As long as the client and server are functioning
    perfectly, and the network is reasonably speedy,
    it does a good job
  • When errors occur in communication, the
    differences between local and remote calls aren't
    always easy to mask

6
Remote Procedure Calls
  • Five main classes of failure can occur in RPC
    systems
  • The client is unable to locate the server
  • The request message from the client to the server
    is lost
  • The server crashes after receiving a request
  • The reply message from the server to the client
    is lost
  • The client crashes after sending a request
  • Each of these has its own set of problems

7
Remote Procedure Calls: Client Cannot Locate the Server
  • This can happen if the server is down, or if the
    server has been changed since the client was
    built (so the interface isn't compatible anymore)
  • One solution is to raise an exception on the
    client side that must be dealt with by an
    exception handler
  • Drawbacks: not every language has exceptions, and
    this destroys the transparency
  • We pretty much can't maintain transparency in
    this case

8
Remote Procedure Calls: Lost Request Messages
  • This can happen for many reasons, and is the
    easiest failure to deal with
  • We have the client (or OS) start a timer when
    sending the request, and if there's no reply
    before the timer runs out, we send the request
    again (see the sketch below)
  • If the message was really lost, everything's OK
    because the server never saw the first one
  • If the message wasn't lost, as long as the server
    can detect that it's a duplicate everything is
    still OK
  • It's possible for the client to incorrectly
    conclude that the server is down, which isn't
    good, but can't be avoided
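
A minimal sketch of the timer-and-retransmit scheme, assuming a connected socket and a reply that fits in one read; the timeout value and retry count are illustrative:

```python
import socket

def rpc_with_retry(sock, request, timeout=2.0, max_retries=3):
    """Resend the request whenever the reply timer expires.

    This yields at-least-once behaviour: the server must be able to
    detect duplicates if re-executing the request is not safe.
    """
    sock.settimeout(timeout)
    for _ in range(max_retries):
        sock.sendall(request)
        try:
            return sock.recv(4096)  # reply arrived before the timer expired
        except socket.timeout:
            continue  # no reply in time: assume the request was lost, resend
    # After repeated timeouts the client may (possibly incorrectly)
    # conclude that the server is down.
    raise TimeoutError("no reply after %d attempts" % max_retries)
```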

9
Remote Procedure Calls: Server Crashes
  • There are multiple places where the server can
    crash, all of which look the same to the client
    (it doesn't get a reply)

10
Remote Procedure Calls: Server Crashes
  • There are three schools of thought on what the
    RPC system should do in these scenarios
  • Keep trying until a reply has been received (on
    the assumption that the server will restart
    eventually), then return that reply to the client
    - at-least-once semantics, guarantees the call
    was executed one or more times
  • Give up immediately and report the failure -
    at-most-once semantics, guarantees the call was
    executed one time or not at all
  • Don't guarantee anything (very easy to implement)

11
Remote Procedure Calls: Server Crashes
  • None of those options are what we really want -
    we really want exactly-once semantics, but
    unfortunately, we can't have it
  • No matter what strategy is used by the client to
    reissue unanswered requests, or by the server to
    send completion messages, duplicate executions
    (or no execution) can result

12
Remote Procedure Calls: Lost Reply Messages
  • One solution to lost reply messages is to just
    rely on a timer again, just like lost request
    messages
  • The problem is, this may cause trouble if the
    request is not idempotent (an idempotent request
    is one where executing it more than once has the
    same effect as executing it once)
  • We can structure many calls idempotently, but
    with some it simply isn't possible (see the
    example below)
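
A small illustration of the distinction, using hypothetical account operations that are not from the slides: setting a balance is idempotent, while depositing into it is not.

```python
balance = 100

def set_balance(value):
    """Idempotent: replaying the request leaves the same final state."""
    global balance
    balance = value

def deposit(amount):
    """Not idempotent: every replayed duplicate adds the amount again."""
    global balance
    balance += amount

set_balance(150); set_balance(150)   # balance == 150, duplicates harmless
deposit(50); deposit(50)             # balance == 250 instead of 200
```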

13
Remote Procedure Calls: Lost Reply Messages
  • One solution is to use sequence numbering, or
    some other scheme, to let the server detect
    duplicates (sketched below)
  • However, it still has to respond to the requests
    and track the sequence numbers, and this might
    substantially increase the processing overhead on
    the server
  • Another solution is to have a bit in the message
    header that distinguishes originals from
    duplicates - originals can always be processed
    safely (this doesn't help too much though)
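
A minimal sketch of server-side duplicate detection, assuming per-client sequence numbers and an in-memory table of cached replies; the table layout and the execute callback are illustrative:

```python
# last_seen[client_id] = (sequence_number, cached_reply)
last_seen = {}

def handle_request(client_id, seq, request, execute):
    """Execute each (client, sequence number) pair at most once; for a
    duplicate, resend the cached reply instead of re-executing."""
    prev = last_seen.get(client_id)
    if prev is not None and prev[0] == seq:
        return prev[1]             # duplicate request: replay stored reply
    reply = execute(request)       # first time this sequence number is seen
    last_seen[client_id] = (seq, reply)
    return reply
```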

14
Remote Procedure Calls: Client Crashes
  • A client can send a request to a server, but then
    crash before it receives the response - this
    leaves an orphan computation running on the
    server
  • Orphans can cause problems such as wasting CPU
    cycles, locking files, or otherwise using
    resources - also, if the client resends the
    request and receives a response from the orphan,
    chaos can ensue
  • Nelson (1981) proposed four solutions to the
    problem of orphans

15
Remote Procedure Calls: Client Crashes
  • Extermination
  • The client stub logs all RPC transmissions to
    disk, and explicitly cancels any that were in
    progress before a crash when the machine comes
    back up
  • Disadvantages of this approach
  • It's expensive to keep a log
  • The orphans themselves may make RPC calls that
    are difficult to cancel
  • It's possible that the network will be
    partitioned in a way such that the cancellation
    doesn't make it to the server

16
Remote Procedure Calls: Client Crashes
  • Reincarnation
  • Time is divided into sequentially-numbered
    epochs, and the epoch number is incremented on
    every reboot
  • When a client boots, it broadcasts its epoch
    number to all machines, and they cancel any RPCs
    that have an old epoch number (see the sketch
    below)
  • Disadvantages of this approach
  • It requires a broadcast to the entire network
  • If the network is partitioned, some orphans may
    survive (though they can be detected once they
    communicate)
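
A minimal sketch of reincarnation's server-side bookkeeping, assuming an in-memory table of running computations tagged with the epoch of the client that started them; the broadcast transport and the cancel callback are elided assumptions:

```python
# running[rpc_id] = (owner_client, owner_epoch, worker_handle)
running = {}

def on_epoch_broadcast(client_id, new_epoch, cancel):
    """When a rebooted client broadcasts its new epoch number, kill its
    computations from any earlier epoch (those are the orphans)."""
    for rpc_id, (owner, epoch, worker) in list(running.items()):
        if owner == client_id and epoch < new_epoch:
            cancel(worker)          # terminate the orphan computation
            del running[rpc_id]
```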

17
Remote Procedure Calls: Client Crashes
  • Gentle Reincarnation
  • Like reincarnation, but less draconian
  • When an epoch broadcast comes in, each machine
    kills only those computations for which it cannot
    locate the owner on the network
  • This mainly addresses the possible situation
    where a false epoch message is received from some
    faulty (or malicious) process on the network

18
Remote Procedure Calls: Client Crashes
  • Expiration
  • Each RPC is given a quantum of time T to run to
    completion, and must explicitly ask for another
    quantum if it can't finish (see the sketch below)
  • After a crash a client has to wait for at least
    one quantum to pass before going back online, and
    all its orphans will have disappeared
  • The main problem with this method is deciding
    what a reasonable value for T is, balancing the
    need to clean up orphans quickly with the
    communication overhead that can result
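
A minimal sketch of expiration as lease renewal, assuming the server tracks an expiry time per computation; the value of T and the cancel callback are illustrative:

```python
import time

T = 30.0     # quantum length in seconds; choosing T well is the hard part
leases = {}  # rpc_id -> expiry time on the monotonic clock

def grant_quantum(rpc_id):
    """Grant (or renew, when the RPC explicitly asks) a quantum of time T."""
    leases[rpc_id] = time.monotonic() + T

def reap_expired(cancel):
    """Kill computations whose quantum ran out without renewal; after a
    client crash, its orphans disappear within one quantum."""
    now = time.monotonic()
    for rpc_id, expiry in list(leases.items()):
        if now > expiry:
            cancel(rpc_id)
            del leases[rpc_id]
```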

19
Remote Procedure Calls: Client Crashes
  • In practice, none of these solutions are
    particularly desirable
  • Killing an orphan may also have unforeseen
    consequences, such as database corruption or
    files staying locked forever
  • An orphan may have taken various actions, such as
    setting timers to start other processes at future
    times, which make removing all traces of it from
    the system impossible

20
Reliable Group Communication
  • Because process resilience by replication is so
    important, reliable multicast services are
    important as well
  • It turns out to be rather difficult to multicast
    reliably - some of the difficulty lies in
    defining exactly what "reliably" means in terms
    of multicast communication
  • We distinguish between reliable multicast in the
    presence of faulty processes and reliable
    multicast when processes are assumed to operate
    correctly

21
Reliable Multicast
  • If there are faulty processes, multicasting is
    considered reliable if it is guaranteed that all
    non-faulty group members receive the messages
  • However, agreement needs to be reached on what
    the group looks like before messages can be
    delivered
  • If there are no faulty processes, and the group
    membership doesn't change during communication,
    multicasting is considered reliable if every
    message is delivered to every group member - we
    get agreement for free

22
Reliable Multicast Implementations
  • It's (relatively) easy to implement reliable
    multicast with non-faulty processes, if we don't
    require messages to be delivered in the same
    order to all group members
  • Unfortunately, the easy solution isn't scalable
    to large groups
  • There are harder solutions that are scalable to
    large groups, of which we will discuss two
    categories

23
The Easy Reliable Multicast Implementation
24
Scalability of the Easy Reliable Multicast
Implementation
  • If there are N receivers, the sender has to
    accept at least N acknowledgements - feedback
    implosion
  • One solution is to only have receivers send
    negative acknowledgements - when they receive a
    message and detect they've missed one, they ask
    for the one they missed (sketched below)
  • In theory, the sender then has to keep all
    messages forever
  • This still isn't guaranteed to prevent feedback
    implosions
  • We need more sophisticated solutions
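
A minimal sketch of the negative-acknowledgement idea on the receiver side, assuming sequence-numbered messages and hypothetical send_nack/deliver callbacks; no delivery ordering is enforced here:

```python
received = set()   # sequence numbers already delivered
expected_seq = 0   # one past the highest sequence number seen so far

def on_receive(seq, payload, send_nack, deliver):
    """Deliver each message once (no ordering is required here); when a
    gap in the sequence numbers appears, NACK only the missed messages."""
    global expected_seq
    if seq in received:
        return                          # retransmitted duplicate: drop it
    if seq > expected_seq:
        for missing in range(expected_seq, seq):
            if missing not in received:
                send_nack(missing)      # ask for each message we missed
    received.add(seq)
    expected_seq = max(expected_seq, seq + 1)
    deliver(payload)
```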

25
Nonhierarchical Feedback Control
  • The goal is to reduce the number of feedback
    messages - we use a technique called feedback
    suppression
  • This technique underlies the Scalable Reliable
    Multicasting (SRM) protocol (Floyd et al., 1997)
  • Receivers never acknowledge the successful
    delivery of a message
  • Negative acknowledgements are multicast, not sent
    just to the message sender

26
Nonhierarchical Feedback Control
  • This allows other receivers that missed the same
    message to suppress their feedback, because the
    replacement message will be multicast when the
    original sender gets one negative acknowledgement
  • The negative acknowledgements are scheduled with
    random delays, to prevent feedback implosions
    (see the sketch below)
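
A minimal sketch of feedback suppression, assuming hypothetical multicast_nack and overhear hooks; the random delay bound is illustrative:

```python
import random
import threading

pending_nacks = {}  # seq -> timer for our own scheduled NACK

def schedule_nack(seq, multicast_nack, max_delay=0.5):
    """Delay our NACK by a random amount so that, with luck, a single
    receiver's multicast NACK suppresses everyone else's."""
    delay = random.uniform(0.0, max_delay)
    timer = threading.Timer(delay, multicast_nack, args=(seq,))
    pending_nacks[seq] = timer
    timer.start()

def on_overheard_nack(seq):
    """Another receiver already asked for seq: cancel our own NACK."""
    timer = pending_nacks.pop(seq, None)
    if timer is not None:
        timer.cancel()
```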

27
Nonhierarchical Feedback Control
  • Drawbacks
  • Feedback messages must be scheduled accurately to
    prevent feedback implosion
  • Receivers that received a message are forced to
    receive it again if other receivers missed it

28
Nonhierarchical Feedback Control
  • One workaround is to let receivers that didn't
    get a particular message m join a separate
    multicast group for m - but this requires very
    efficient group management
  • Receivers can assist in recovery to increase
    scalability - if a receiver has successfully
    received m and then gets a negative
    acknowledgement for m, it can multicast m before
    the negative acknowledgement gets to the original
    sender

29
Hierarchical Feedback Control
  • To scale to very large groups, we need some sort
    of hierarchical organization
  • Assume we have one sender that needs to multicast
    to a very large group of receivers
  • We can partition the receivers into subgroups,
    within which any multicast method that scales to
    small groups can be used, and elect a local
    coordinator for each subgroup

30
Hierarchical Feedback Control
  • Within each subgroup, the coordinator handles the
    negative acknowledgements of subgroup members by
    retransmitting to the subgroup (a coordinator
    sketch follows this list)
  • If the coordinator misses a message, it can
    request it from the coordinator of its parent
    group
  • If we base the implementation on acknowledgements
    rather than negative acknowledgements, the
    coordinator doesn't need to keep too large a
    buffer
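
A minimal sketch of a subgroup coordinator, assuming hypothetical forwarding and retransmission callbacks; the parent lookup is simplified to a direct method call:

```python
class Coordinator:
    """Subgroup coordinator: buffers messages so it can retransmit them
    locally, and falls back to its parent coordinator for messages it
    missed itself."""

    def __init__(self, parent=None):
        self.parent = parent   # coordinator of the parent group, if any
        self.buffer = {}       # seq -> message, kept for retransmission

    def on_message(self, seq, msg, forward_to_subgroup):
        self.buffer[seq] = msg
        forward_to_subgroup(seq, msg)

    def on_subgroup_nack(self, seq, retransmit):
        """A subgroup member missed message seq: retransmit it, fetching
        it from the parent coordinator if we missed it too."""
        msg = self.buffer.get(seq)
        if msg is None and self.parent is not None:
            msg = self.parent.fetch(seq)
            if msg is not None:
                self.buffer[seq] = msg
        if msg is not None:
            retransmit(seq, msg)

    def fetch(self, seq):
        return self.buffer.get(seq)
```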

31
Hierarchical Feedback Control
32
Hierarchical Feedback Control
  • The main problem with this scheme is the
    construction of the tree
  • This often has to be done dynamically
  • One way is to make use of the multicast tree in
    the underlying network, if such exists, by adding
    extra software to multicast routers - but it's
    not easy to make that kind of change to the
    routers that are already deployed on existing
    networks

33
Atomic Multicast
  • Often, we need to guarantee that, in the presence
    of process failures, a message is delivered
    either to all processes in a group or to none at
    all - this is the atomic multicast problem
  • We can define reliable multicast in the presence
    of process failures in terms of process groups
    and changes to group membership

34
Atomic Multicast: Communication Model
  • We distinguish between message receipt and
    message delivery, for the purpose of modeling
    communication in such a system - a message is
    received from the network by the communication
    layer, but is delivered to the application only
    when the communication layer decides it is safe
    to do so

35
Atomic Multicast
  • Each multicast message m is associated with a
    list of processes to which it should be delivered
    - this list corresponds to the group view that
    the sender had at the time m was sent
  • This group view is shared by the rest of the
    processes on the list - so each process on the
    list believes that m should be delivered to all
    processes on the list, and to no other processes

36
Atomic Multicast
  • Suppose m is multicast when its sender has group
    view G
  • Another process joins or leaves the group while
    the multicast of m is taking place - this causes
    a view change (which is communicated with a
    multicast message vc)
  • There are now two messages in transit (m, vc) -
    we need to guarantee either that m is delivered
    before vc to all processes, or that m is not
    delivered at all

37
Virtual Synchrony
  • In principle, the only case where m should not be
    delivered at all is when the group membership
    change is caused by the sender of m crashing -
    either all members of G should hear that the
    sender crashed before m was sent, or none should
  • A reliable multicast that satisfies the
    requirement that a message multicast to group
    view G is delivered to each nonfaulty process in
    G is called virtually synchronous

38
Virtual Synchrony
39
Virtual Synchrony and Message Ordering
  • Virtual synchrony is much like using a
    synchronization variable in a data store - view
    changes are barriers that messages cannot cross
  • There are four different possible message
    orderings for virtually synchronous multicast
  • Unordered
  • FIFO-ordered
  • Causally-ordered
  • Totally-ordered

40
Virtual Synchrony and Message Ordering
  • In reliable, unordered multicast, no guarantees
    are given about the order in which received
    messages are delivered by different processes
  • In reliable, FIFO-ordered multicast, all messages
    from each individual process are delivered to all
    other processes in the same order, but no
    guarantees are made about the relative delivery
    orders of messages from different processes

41
Virtual Synchrony and Message Ordering
  • In reliable, causally-ordered multicast, messages
    are delivered so that potential causality among
    messages is preserved (this can be done with
    vector timestamps, as sketched below)
  • In reliable, totally-ordered multicast, all
    messages are delivered in the same order to all
    group members (this is generally combined with a
    requirement of causal or FIFO ordering)
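
A minimal sketch of the vector-timestamp delivery rule for causally-ordered multicast, for a group of N processes; the queueing of messages that are not yet deliverable is left out, and the group size is illustrative:

```python
N = 3            # group size (illustrative)
clock = [0] * N  # clock[i] = number of messages delivered from process i

def can_deliver(sender, ts):
    """Causal delivery rule: deliver m from `sender` with vector
    timestamp `ts` only if m is the next message from that sender and
    we have already delivered everything m could causally depend on."""
    if ts[sender] != clock[sender] + 1:
        return False                      # not the next message from sender
    return all(ts[k] <= clock[k] for k in range(N) if k != sender)

def on_deliver(sender, ts):
    """After delivering m, record that we have now seen it."""
    clock[sender] = ts[sender]
```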

42
Next Class
  • Distributed commit, a general distributed systems
    problem of which atomic multicast is an example
  • Recovery