1
EEC 688/788 Secure and Dependable Computing
  • Lecture 14
  • Wenbing Zhao
  • Department of Electrical and Computer Engineering
  • Cleveland State University
  • wenbing_at_ieee.org

2
Outline
  • Group communication systems
  • Ordered multicast
  • Techniques to implement ordered multicast
  • Group membership service
  • Agreed and safe delivery
  • Checkpointing and recovery
  • Reference:
  • Reliable Distributed Systems, by K. P. Birman,
    Springer, Chapters 14-16

3
Group Communication System
  • Services provided by the GCS
  • Membership service: who is up and who is down
  • Deals with failure detection and more
  • Reliable, ordered, multicast service
  • FIFO, causal, total
  • Virtual synchrony service
  • Virtual synchrony synchronizes membership change
    with multicasts
  • GCS is often used to build fault tolerant systems

4
Reliable Multicast
  • Reliable multicast: a message is addressed to
    multiple receivers, and every receiver receives
    it reliably
  • Positive or negative acknowledgement
  • Need to avoid ack/nack implosion
  • Distinguish receiving from delivery!

[Figure: layers Application and Middleware, with arrows labeled Receiving and Delivering]
5
Ordered Reliable Multicast
  • Ordered reliable multicast: if many messages are
    multicast by many senders, in what order are the
    messages delivered at the receivers?
  • First in, first out (FIFO)
  • Causal: the causal relationship among messages
    is preserved
  • Total: all messages are delivered at all
    receivers in the same order

6
FIFO Ordered Multicast
  • FIFO or sender ordered multicast
  • Messages are delivered in the order they were
    sent (by any single sender)

[Figure: timelines of processes p, q, r, s with multicasts a, b, c, d, e. Delivery of c to p is delayed until after b is delivered.]
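
A minimal sketch of FIFO delivery, assuming each sender numbers its own multicasts starting from 0. The class name `FifoReceiver`, the sender names, and the sequence numbers are illustrative assumptions, not part of the lecture. The middleware receives messages in any order but holds each one back until all earlier messages from the same sender have been delivered.

```python
from collections import defaultdict


class FifoReceiver:
    """Hold-back queue that delivers each sender's messages in send order."""

    def __init__(self):
        self.next_seq = defaultdict(int)  # next expected sequence number per sender
        self.held = defaultdict(dict)     # sender -> {seq: payload}, received but held back
        self.delivered = []               # payloads handed to the application, in order

    def receive(self, sender, seq, payload):
        """Called when a message arrives from the network (receiving, not delivery)."""
        if seq < self.next_seq[sender]:
            return  # duplicate of a message that was already delivered
        self.held[sender][seq] = payload
        # Deliver as many consecutive messages from this sender as possible.
        while self.next_seq[sender] in self.held[sender]:
            self.delivered.append(self.held[sender].pop(self.next_seq[sender]))
            self.next_seq[sender] += 1


# Hypothetical scenario: b and c come from the same sender r, a from sender q.
p = FifoReceiver()
p.receive("r", 1, "c")  # c arrives first and is held back
p.receive("q", 0, "a")  # a different sender is not blocked
p.receive("r", 0, "b")  # b arrives, so b and then c are delivered
print(p.delivered)
```

Note that FIFO ordering constrains only messages from the same sender; the relative order of a and b here is decided purely by arrival order.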
7
Causally Ordered Multicast
  • Causal or happens-before ordering
  • If send(a) → send(b) then deliver(a) occurs
    before deliver(b) at common destinations

[Figure: timelines of processes p, q, r, s with multicasts a and b]
8
Causally Ordered Multicast
  • Causal or happens-before ordering
  • If send(a) → send(b) then deliver(a) occurs
    before deliver(b) at common destinations

[Figure: timelines of processes p, q, r, s with multicasts a, b, c. Delivery of c to p is delayed until after b is delivered.]
9
Causally Ordered Multicast
  • Causal or happens-before ordering
  • If send(a) → send(b) then deliver(a) occurs
    before deliver(b) at common destinations

[Figure: timelines of processes p, q, r, s with multicasts a, b, c, e. Delivery of c to p is delayed until after b is delivered. e is sent (causally) after b.]
10
Causally Ordered Multicast
  • Causal or happens-before ordering
  • If send(a) → send(b) then deliver(a) occurs
    before deliver(b) at common destinations

[Figure: timelines of processes p, q, r, s with multicasts a, b, c, d, e. Delivery of c to p is delayed until after b is delivered. Delivery of e to r is delayed until after b and c are delivered.]
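
One common way to enforce the happens-before rule above is with vector timestamps; the lecture does not say which mechanism it has in mind, so the following is a sketch under that assumption. `CausalReceiver` and the timestamps are hypothetical. A message from sender j is deliverable when it is the next multicast from j and every multicast it causally depends on has already been delivered locally.

```python
class CausalReceiver:
    """Delays delivery until all causally preceding multicasts are delivered."""

    def __init__(self, names):
        self.clock = {n: 0 for n in names}  # multicasts delivered so far, per sender
        self.pending = []                   # received but not yet deliverable
        self.delivered = []

    def _deliverable(self, sender, stamp):
        for name, count in self.clock.items():
            if name == sender:
                if stamp[name] != count + 1:  # must be the sender's next multicast
                    return False
            elif stamp[name] > count:         # depends on a multicast not yet delivered
                return False
        return True

    def receive(self, sender, stamp, payload):
        self.pending.append((sender, stamp, payload))
        progress = True
        while progress:  # one delivery may unblock others
            progress = False
            for item in list(self.pending):
                s, v, m = item
                if self._deliverable(s, v):
                    self.pending.remove(item)
                    self.clock[s] += 1
                    self.delivered.append(m)
                    progress = True


names = ["p", "q", "r", "s"]
p = CausalReceiver(names)
# Assumed scenario: q multicasts b; r delivers b and then multicasts c.
b_stamp = {"p": 0, "q": 1, "r": 0, "s": 0}
c_stamp = {"p": 0, "q": 1, "r": 1, "s": 0}
p.receive("r", c_stamp, "c")   # c depends on b, so it is held back
held_back = list(p.delivered)  # still empty at this point
p.receive("q", b_stamp, "b")   # b is delivered, which releases c
print(p.delivered)
```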
11
Totally Ordered Multicast
  • Total ordering
  • Messages are delivered in same order to all
    recipients (including the sender)

[Figure: timelines of processes p, q, r, s with multicasts a, b, c, d, e. All processes deliver a, b, c, d, then e.]
12
Implementing Total Ordering
  • Use a token that moves around
  • The token carries a sequence number
  • When you hold the token you can send the next
    burst of multicasts
  • Use a sequencer to order all multicasts
  • A message is first multicast to all, including
    the sequencer; the sequencer then determines the
    order for the message and informs all
  • Or send the message to the sequencer, and the
    sequencer multicasts it with total order
    information
  • Each sender can take turns serving as the
    sequencer
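
The first sequencer variant above can be sketched as follows. This is an illustration under simplifying assumptions (no failures, reliable channels), and the names `Sequencer`, `TotalOrderReceiver`, and the message identifiers are hypothetical. Receivers hold a message until both the message and the sequencer's ordering announcement for it have arrived.

```python
import itertools


class Sequencer:
    """Assigns consecutive global sequence numbers to message identifiers."""

    def __init__(self):
        self._counter = itertools.count()

    def order(self, msg_id):
        return next(self._counter), msg_id


class TotalOrderReceiver:
    def __init__(self):
        self.payloads = {}  # msg_id -> payload, received but not yet delivered
        self.orders = {}    # global sequence number -> msg_id
        self.next_seq = 0   # next global sequence number to deliver
        self.delivered = []

    def on_message(self, msg_id, payload):
        self.payloads[msg_id] = payload
        self._try_deliver()

    def on_order(self, seq, msg_id):
        self.orders[seq] = msg_id
        self._try_deliver()

    def _try_deliver(self):
        # Deliver while the next sequence number is known and its message is here.
        while (self.next_seq in self.orders
               and self.orders[self.next_seq] in self.payloads):
            msg_id = self.orders.pop(self.next_seq)
            self.delivered.append(self.payloads.pop(msg_id))
            self.next_seq += 1


seq = Sequencer()
o1 = seq.order("m1")
o2 = seq.order("m2")

# The two receivers see the network traffic in different orders.
r1, r2 = TotalOrderReceiver(), TotalOrderReceiver()
r1.on_message("m1", "a"); r1.on_message("m2", "b")
r1.on_order(*o1); r1.on_order(*o2)
r2.on_order(*o2); r2.on_message("m2", "b")
r2.on_message("m1", "a"); r2.on_order(*o1)
print(r1.delivered, r2.delivered)
```

Both receivers end up delivering in the sequencer's order, regardless of arrival order.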

13
Group membership service
  • Input
  • Process join events
  • Process leave events
  • Apparent failures
  • Output
  • Membership views for group(s) to which those
    processes belong
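
The input/output contract above can be pictured as a function from events to numbered views. The sketch below ignores fault tolerance of the service itself (covered on the following slides); `MembershipService` and its method names are hypothetical.

```python
class MembershipService:
    """Turns join, leave, and suspected-failure events into numbered views."""

    def __init__(self, initial):
        self.view_id = 0
        self.members = list(initial)  # kept in join order, oldest first
        self.history = [(0, tuple(self.members))]

    def apply(self, joins=(), leaves=(), suspected=()):
        gone = set(leaves) | set(suspected)
        members = [m for m in self.members if m not in gone]
        members += [j for j in joins if j not in members]
        if members != self.members:  # only a real change produces a new view
            self.view_id += 1
            self.members = members
            self.history.append((self.view_id, tuple(members)))
        return self.view_id, tuple(self.members)


gms = MembershipService(["p", "q", "r"])
v1 = gms.apply(suspected=["q"])  # an apparent failure of q
v2 = gms.apply(joins=["s"])      # s joins
print(gms.history)
```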

14
Issues?
  • The service itself needs to be fault-tolerant
  • Otherwise our entire system could be crippled by
    a single failure!
  • Hence Group Membership Service (GMS) must run
    some form of protocol (GMP)

15
Approach
  • Assume that the GMS has members p, q, r at time t
  • Designate the oldest of these as the protocol
    leader
  • To initiate a change in GMS membership, the
    leader will run the GMP
  • Others can't run the GMP; they report events to
    the leader

16
GMP Example
[Figure: timelines of processes p, q, r]
  • Example
  • Initially, the GMS consists of p, q, r
  • Then q is believed to have crashed

17
Unreliable Failure Detection
  • Recall that failures are hard to distinguish from
    network delay
  • So we accept the risk of a mistake
  • If p is running a protocol to exclude q because
    q has failed, all processes that hear from p
    will cut channels to q
  • Avoids messages from the dead
  • q must rejoin to participate in GMS again

18
Basic GMP
  • Someone reports that q has failed
  • Leader (process p) runs a 2-phase commit protocol
  • Announces a proposed new GMS view
  • Excludes q, or might add some members who are
    joining, or could do both at once
  • Waits until a majority of members of current view
    have voted ok
  • Then commits the change

19
GMP Example
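[Figure: timelines of p, q, r. p sends "Proposed V1 = {p, r}", receives "OK", then sends "Commit V1". View V0 = {p, q, r} is replaced by V1 = {p, r}.]
  • Proposes new view: {p, r} (-q)
  • Needs majority consent: p itself, plus one more
    (current view had 3 members)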
  • Can add members at the same time

20
Special Concerns?
  • What if someone doesn't respond?
  • p can tolerate failures of a minority of members
    of the current view
  • A new first round can overlap the previous commit:
  • "Commit that q has left. Propose: add s and drop
    r"
  • p must wait if it can't contact a majority
  • Avoids risk of partitioning
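
The majority rule behind the basic GMP can be sketched as below. This is a simplified model, not the full protocol: `run_view_change` and the `reachable` set (the members that answer the proposal with OK) are hypothetical, and message exchange is reduced to set arithmetic.

```python
def majority(view):
    """Smallest number of members that forms a majority of the view."""
    return len(view) // 2 + 1


def run_view_change(leader, current_view, proposed_view, reachable):
    """Return the committed view, or None if the leader has to wait."""
    # Phase 1: the leader proposes the new view and collects OK votes from
    # members of the CURRENT view; its own vote counts.
    votes = {leader} | {m for m in current_view if m in reachable}
    if len(votes) < majority(current_view):
        return None  # no majority: wait rather than risk a partition
    # Phase 2: commit the change.
    return tuple(proposed_view)


v0 = ("p", "q", "r")
committed = run_view_change("p", v0, ("p", "r"), reachable={"r"})
blocked = run_view_change("p", v0, ("p",), reachable=set())
print(committed, blocked)
```

With three members in the current view, p needs itself plus one more vote, which matches the example above. If p can reach nobody, it blocks instead of installing a view on its own.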

21
What If Leader Fails?
  • Here we use a 3-phase protocol
  • The new leader identifies itself based on age
    ranking (oldest surviving process)
  • It runs an inquiry phase:
  • "The adored leader has died. Did he say anything
    to you before passing away?"
  • Note that this causes participants to cut
    connections to the adored previous leader
  • It then runs the normal 2-phase protocol, but
    terminates any interrupted view changes the old
    leader had initiated
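
The first two steps of leader takeover, choosing the new leader by age rank and running the inquiry, might look like this. The function names and the `last_heard` map (what each survivor last heard the old leader propose) are assumptions made for illustration.

```python
def new_leader(view_by_age, failed):
    """Oldest surviving member; view_by_age lists members oldest first."""
    for member in view_by_age:
        if member not in failed:
            return member
    raise RuntimeError("no surviving member")


def inquire(survivors, last_heard):
    """Inquiry phase: collect the distinct proposals the old leader left pending."""
    pending = []
    for member in survivors:
        proposal = last_heard.get(member)
        if proposal is not None and proposal not in pending:
            pending.append(proposal)
    return pending


view = ["p", "q", "r"]  # oldest first
leader = new_leader(view, failed={"p"})
survivors = [m for m in view if m != "p"]

nothing_pending = inquire(survivors, last_heard={})
# Hypothetical case: r had heard p propose the view (p, r) before p failed.
interrupted = inquire(survivors, last_heard={"r": ("p", "r")})
print(leader, nothing_pending, interrupted)
```

Any proposals returned by the inquiry would be terminated first, before the new leader proposes its own view with the normal 2-phase protocol.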

22
GMP Example
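[Figure: timelines of p, q, r. After p fails, the new leader sends "Inquire -p" and receives "OK: nothing was pending"; it then sends "Proposed V1 = {r, s}", receives "OK", and sends "Commit V1". View V0 = {p, q, r} is replaced by V1 = {r, s}.]
  • New leader first sends an inquiry
  • Then proposes new view: {r, s} (-p)
  • Needs majority consent: q itself, plus one more
    (current view had 3 members)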
  • Again, can add members at the same time

23
Safe and Agreed Delivery
  • For totally ordered reliable multicast, there are
    two delivery policies
  • Safe delivery: a message is delivered only when
    all correct processes have received it
  • Agreed delivery: a message is delivered as soon
    as it is the next message in the total order

24
Safe and Agreed Delivery
  • Safe delivery guarantees the uniformity of
    multicast
  • If a message is delivered at any process, it is
    delivered at all correct processes
  • Agreed delivery does not
  • It is possible that a message is delivered at one
    (or more) processes, but is not delivered at some
    correct process
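
The difference between the two policies comes down to one extra condition at delivery time. In this sketch `can_deliver` is a hypothetical helper, and `received_by` stands for the set of processes known (for example through acknowledgements) to have received the message.

```python
def can_deliver(policy, seq, next_seq, received_by, correct):
    """Decide whether the message with total-order number seq may be delivered."""
    if seq != next_seq:
        return False  # not yet the next message in the total order
    if policy == "agreed":
        return True   # being next in the total order is enough
    if policy == "safe":
        # Also wait until every correct process is known to have received it.
        return set(correct) <= set(received_by)
    raise ValueError(f"unknown policy: {policy}")


correct = {"p", "q", "r"}
acks = {"p", "q"}  # r has not acknowledged the message yet

agreed_now = can_deliver("agreed", 4, 4, acks, correct)
safe_now = can_deliver("safe", 4, 4, acks, correct)
safe_later = can_deliver("safe", 4, 4, acks | {"r"}, correct)
print(agreed_now, safe_now, safe_later)
```

Safe delivery pays for uniformity with extra latency: the message waits for the missing acknowledgement, while agreed delivery proceeds immediately.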

25
Checkpointing and Recovery
  • Faults occur over time. How do we ensure that a
    fault-tolerant system remains operational for an
    extended period of time?
  • Recover failed replicas, or replace failed
    replicas with new ones => recovery is needed
  • How do we recover a failed replica or install a
    new replica?
  • Checkpoint a correct replica and transfer its
    state to the recovering replica

26
Checkpointing
  • Checkpointing: the act of taking a snapshot of an
    entity so that we can restore it later
  • A replica is a process running in an operating
    system. The state of a process includes:
  • The process's memory, stack, and registers
  • Threads
  • Open or mmap'ed files
  • Current working directory
  • Interprocess communication
  • Semaphores, shared memory, pipes, sockets
  • Dynamically loaded libraries

27
Checkpointing
  • Many tools are available to perform checkpointing
    transparently or semi-transparently
  • http://www.checkpointing.org/
  • Condor, libckpt, etc.
  • Checkpoints taken in general are not portable
  • Checkpoint size might be big

28
Checkpointing of Application State
  • Sometimes it is more efficient to save and store
    the application state only
  • Checkpoints can be very portable and compact in
    size
  • class Counter {
        int counter;
        Counter(int initVal) { counter = initVal; }
        void increment() { counter++; }
        void decrement() { counter--; }
        // setState/getState restore and capture the application state
        void setState(int c) { counter = c; }
        int getState() { return counter; }
    }
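
To illustrate why an application-level checkpoint is portable and compact, here is the same counter sketched in Python, with its state serialized as JSON. The JSON encoding and the snake_case method names are choices made for this example, not something the lecture prescribes.

```python
import json


class Counter:
    def __init__(self, init_val):
        self.counter = init_val

    def increment(self):
        self.counter += 1

    def decrement(self):
        self.counter -= 1

    def get_state(self):
        """Capture the application state (the checkpoint content)."""
        return self.counter

    def set_state(self, c):
        """Restore the application state from a checkpoint."""
        self.counter = c


primary = Counter(5)
primary.increment()
primary.increment()
primary.decrement()

# The checkpoint is a few bytes of text, independent of platform and process layout.
checkpoint = json.dumps({"counter": primary.get_state()})

replica = Counter(0)
replica.set_state(json.loads(checkpoint)["counter"])
print(checkpoint, replica.get_state())
```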

29
Logging
  • Logging of messages
  • Checkpointing in general is expensive
  • Logging of messages is cheaper
  • => we can checkpoint periodically, or checkpoint
    on demand, and log all messages in between
  • Logging of other non-deterministic activities
  • Access order to shared data
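
A sketch of combining an occasional checkpoint with a message log, assuming message processing is deterministic. `StableStorage`, `apply_msg`, and the "inc"/"dec" messages are hypothetical. Recovery restores the last checkpoint and replays the messages logged since then.

```python
def apply_msg(state, msg):
    """Deterministic state transition for a counter replica."""
    if msg == "inc":
        return state + 1
    if msg == "dec":
        return state - 1
    raise ValueError(f"unknown message: {msg}")


class StableStorage:
    """Stands in for storage that survives a crash of the replica."""

    def __init__(self, initial_state):
        self.checkpoint = initial_state
        self.log = []

    def log_message(self, msg):
        self.log.append(msg)

    def save_checkpoint(self, state):
        self.checkpoint = state
        self.log.clear()  # messages before the checkpoint are no longer needed


def recover(storage):
    state = storage.checkpoint
    for msg in storage.log:  # replay in the original order
        state = apply_msg(state, msg)
    return state


storage = StableStorage(0)
state = 0
for msg in ["inc", "inc"]:
    storage.log_message(msg)
    state = apply_msg(state, msg)
storage.save_checkpoint(state)  # the expensive step, done only occasionally
for msg in ["inc", "dec", "inc"]:
    storage.log_message(msg)    # the cheap step, done for every message
    state = apply_msg(state, msg)

recovered = recover(storage)    # what a restarted replica would compute
print(state, recovered)
```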

30
Roll-Forward Recovery
  • With replication in space, it is possible to
    recover from a fault while the system keeps
    making progress
  • Roll-forward recovery is made possible by
  • Checkpointing of replica state
  • Logging of incoming messages
  • Reliable, totally ordered group communication
    system

31
Roll-Forward Recovery
  • We want the newly admitted replica to start with
    a state consistent with the other replicas
  • Steps for adding a new replica to a group (with
    on-demand checkpointing):
  • A recovered (or new) replica joins the group
  • A join message is multicast in total order
  • On receipt, the join message is placed in the
    incoming message queue to wait for processing
  • When the join message reaches the head of the
    queue, a checkpoint is taken and transferred to
    the new replica

32
Roll-Forward Recovery
  • The new replica starts queueing messages once it
    receives the join message (sent by itself)
  • When the checkpoint arrives at the new replica,
    its state is restored from that checkpoint (the
    checkpoint is delivered out of order!)
  • The queued messages are then delivered in order
    at the new replica
  • Other replicas do not stop and wait for the new
    replica
  • The steps for adding a new replica to a group
    with periodic checkpointing are similar
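
The join procedure with on-demand checkpointing can be simulated in a few lines. The sketch assumes a single join, a totally ordered message stream seen identically by both replicas, and deterministic processing; `Replica`, `JoiningReplica`, and the message names are hypothetical.

```python
JOIN = "join"


def apply_msg(state, msg):
    if msg == "inc":
        return state + 1
    if msg == "dec":
        return state - 1
    raise ValueError(f"unknown message: {msg}")


class Replica:
    """An existing replica; it never stops to wait for the newcomer."""

    def __init__(self, state=0):
        self.state = state

    def deliver(self, msg):
        if msg == JOIN:
            return self.state  # checkpoint taken at the join's position in the order
        self.state = apply_msg(self.state, msg)
        return None


class JoiningReplica:
    def __init__(self):
        self.state = None
        self.seen_join = False
        self.installed = False
        self.queue = []

    def receive(self, msg):
        if self.installed:
            self.state = apply_msg(self.state, msg)
        elif msg == JOIN:
            self.seen_join = True   # start queueing from here on
        elif self.seen_join:
            self.queue.append(msg)  # ordered after the join: keep for later
        # Messages ordered before the join are covered by the checkpoint.

    def install(self, checkpoint):
        self.state = checkpoint
        for msg in self.queue:      # replay queued messages in order
            self.state = apply_msg(self.state, msg)
        self.queue.clear()
        self.installed = True


stream = ["inc", "inc", JOIN, "inc", "dec", "inc"]  # total order
old, new = Replica(), JoiningReplica()
checkpoint = None
for msg in stream:
    new.receive(msg)
    taken = old.deliver(msg)
    if taken is not None:
        checkpoint = taken
new.install(checkpoint)  # the checkpoint arrives late, outside the total order
print(checkpoint, old.state, new.state)
```

Because the checkpoint reflects exactly the messages ordered before the join, replaying the queued messages brings the new replica to the same state as the old one.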

33
Steps of Roll-Forward Recovery
34
Steps of Roll-Forward Recovery
35
Steps of Roll-Forward Recovery
36
Steps of Roll-Forward Recovery
37
Roll-backward Recovery
  • Roll-backward recovery is used for systems
    relying on replication in time for fault
    tolerance
  • When a failure occurs, roll back using the most
    recent checkpoint (and retry)
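
A single-process sketch of roll-backward recovery: checkpoint after each successful step, and on a failure restore the most recent checkpoint and retry. `run_with_rollback`, the retry limit, and the flaky step are hypothetical, and failures are modeled as `RuntimeError`.

```python
import copy


def run_with_rollback(state, steps, max_retries=3):
    """Run steps in order; on failure, roll back to the last checkpoint and retry."""
    checkpoint = copy.deepcopy(state)
    for step in steps:
        for _attempt in range(max_retries + 1):
            try:
                step(state)
                checkpoint = copy.deepcopy(state)  # new checkpoint after success
                break
            except RuntimeError:
                state = copy.deepcopy(checkpoint)  # discard the partial update
        else:
            raise RuntimeError("step kept failing after retries")
    return state


attempts = {"n": 0}


def flaky_transfer(s):
    s["a"] -= 10  # partial update happens before the fault
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise RuntimeError("transient fault")
    s["b"] += 10


final = run_with_rollback({"a": 100, "b": 0}, [flaky_transfer])
print(final)
```

Without the rollback, the failed first attempt would have left 10 units deducted from `a` and never credited to `b`.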

38
Roll-backward Recovery in a Distributed System
  • Performing roll-backward recovery in a
    distributed system is non-trivial
  • Need to solve the distributed snapshot problem
  • It is easy to perform a local checkpoint of a
    process, but in a distributed system, when one
    process rolls back, other processes must also
    roll back to a consistent state

39
Distributed Snapshot Problem
  • Goal: determine the global system state
  • e.g., the total amount of money
  • Assumptions
  • Each process records its own state
  • No shared clock/memory
  • Imagine a group of photographers taking snapshots
    of different portions of a scene and trying to
    combine them to get the overall picture

40
Distributed Snapshot
  • A distributed snapshot reflects a state in which
    the distributed system might have been
  • What constitutes a consistent global state?
  • If we have recorded that process P has received a
    message from another process Q, then we should
    also have recorded that process Q had actually
    sent the message
  • The reverse condition (Q has sent a message that
    P has not yet received) is allowed

41
Distributed Snapshot
  • A pair of mutually consistent checkpoints

42
Distributed Snapshot
  • A missing message
  • => need to log messages (i.e., consider channel
    state in addition to process state)

43
Distributed Snapshot
  • An orphan message
  • The two checkpoints are definitely not consistent
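
The consistency condition can be checked mechanically. In this sketch a checkpoint is represented by the number of local events it includes, and each message by the event indices of its send and receive; this representation and the function names are assumptions for illustration.

```python
def is_consistent(cut, messages):
    """cut: process -> number of local events covered by its checkpoint.
    messages: tuples (sender, send_event, receiver, receive_event), 1-based."""
    for sender, send_ev, receiver, recv_ev in messages:
        received = recv_ev <= cut[receiver]
        sent = send_ev <= cut[sender]
        if received and not sent:
            return False  # orphan message: received but never sent
    return True


def in_transit(cut, messages):
    """Messages sent before the cut but not yet received (channel state)."""
    return [m for m in messages if m[1] <= cut[m[0]] and m[3] > cut[m[2]]]


# Q sends a message as its 2nd event; P receives it as its 3rd event.
msgs = [("Q", 2, "P", 3)]

orphan = is_consistent({"P": 3, "Q": 1}, msgs)   # receipt recorded, send not
missing = is_consistent({"P": 2, "Q": 2}, msgs)  # send recorded, receipt not
both = is_consistent({"P": 3, "Q": 2}, msgs)     # both recorded
to_log = in_transit({"P": 2, "Q": 2}, msgs)
print(orphan, missing, both, to_log)
```

The missing-message case is still consistent, but the message shows up in `in_transit`, which is why it has to be logged as channel state.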

44
Chandy and Lamport's Algorithm
  • Assumptions
  • FIFO, unidirectional, reliable channels (A
    bidirectional channel is modelled as two
    unidirectional channels)
  • No process fails during the snapshot
  • System state consists of process state and
    channel state (messages sent but not received)
  • Any process P can initiate taking a distributed
    snapshot

45
Chandy and Lamport's Algorithm
  • P starts by recording its own local state and
    sends a marker along each of its outgoing
    channels
  • When Q receives a marker through channel C, its
    action depends on whether it had already recorded
    its local state
  • Not yet recorded:
  • It records its local state, and sends the marker
    along each of its outgoing channels
  • It starts recording incoming messages on OTHER
    channels
  • Already recorded: the marker on C indicates that
    the state of channel C is now complete
  • The channel state consists of all messages
    received on C before this marker and after Q
    recorded its own state

46
Chandy and Lamport's Algorithm
  • Q is finished when it has received a marker along
    each of its incoming channels
  • The recorded local state as well as the state it
    recorded for each incoming channel, can be
    collected and sent to the process that initiated
    the snapshot
  • The global state can be subsequently constructed

47
Chandy and Lamport's Algorithm
[Figure: process Q with incoming channels C1 and C2 and a marker M]
  • Process Q receives a marker for the first time
    (from C1) and records its local state
  • Q records all incoming messages on C2 (and other
    incoming channels except C1, if any)
  • Q receives a marker on its incoming channel C2
    and finishes recording the state of the incoming
    channel C2
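
The marker rules can be exercised in a small two-process simulation. This is a sketch under the stated assumptions (FIFO reliable channels, no failures); the `Process` class, the money-transfer messages, and the hand-driven schedule are illustrative. Channels are modeled as FIFO queues shared between sender and receiver.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.incoming = {}       # sender name -> FIFO channel into this process
        self.outgoing = {}       # receiver name -> FIFO channel out of this process
        self.recorded_state = None
        self.channel_state = {}  # sender name -> messages recorded for that channel
        self.recording = set()   # incoming channels still being recorded

    def send(self, to, amount):
        self.balance -= amount
        self.outgoing[to].append(amount)

    def start_snapshot(self):
        self._record_and_send_markers(skip=None)

    def _record_and_send_markers(self, skip):
        self.recorded_state = self.balance
        for sender in self.incoming:
            self.channel_state[sender] = []
            if sender != skip:   # the channel the marker came in on is empty
                self.recording.add(sender)
        for channel in self.outgoing.values():
            channel.append(MARKER)

    def receive(self, sender):
        msg = self.incoming[sender].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self._record_and_send_markers(skip=sender)
            else:
                self.recording.discard(sender)  # this channel's state is complete
        else:
            self.balance += msg
            if sender in self.recording:
                self.channel_state[sender].append(msg)

    def done(self):
        return self.recorded_state is not None and not self.recording


def connect(a, b):
    channel = deque()
    a.outgoing[b.name] = channel
    b.incoming[a.name] = channel


P, Q = Process("P", 100), Process("Q", 50)
connect(P, Q)
connect(Q, P)

P.send("Q", 10)     # in flight from P to Q
Q.send("P", 20)     # in flight from Q to P
P.start_snapshot()  # P records its state and sends a marker to Q
Q.receive("P")      # Q gets the 10
Q.receive("P")      # Q gets the marker, records its state, sends a marker to P
P.receive("Q")      # P gets the 20 and records it as channel state
P.receive("Q")      # P gets the marker: its snapshot is finished

total = (P.recorded_state + Q.recorded_state
         + sum(sum(msgs) for proc in (P, Q) for msgs in proc.channel_state.values()))
print(P.recorded_state, Q.recorded_state, P.channel_state, total)
```

The recorded process states plus the recorded channel state add up to the initial total of 150, even though no single instant of the run had exactly this combination of balances.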