CTIS 490 DISTRIBUTED SYSTEMS - PowerPoint PPT Presentation

Title:

CTIS 490 DISTRIBUTED SYSTEMS

Slides: 17
Provided by: cneyt
Transcript and Presenter's Notes

1
CTIS 490 DISTRIBUTED SYSTEMS
  • WEEK 10
  • RECOVERY
  • REPLICA MANAGEMENT

2
RECOVERY
  • Another aspect of fault tolerance is recovery
    from an error.
  • The idea of error recovery is to replace an
    erroneous state with an error-free state.
  • The most widely used recovery method is
    backward recovery.
  • Backward recovery brings the system from its
    present erroneous state back into a previously
    correct state. To do so, it is necessary to
    record the system's state from time to time and
    to restore such a recorded state when things go
    wrong.
  • Each time the system's present state is recorded,
    a checkpoint is said to be made.
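
The checkpoint-and-restore cycle can be sketched in a few lines of Python. This is only an illustration: the class name `CheckpointedProcess` and the in-memory list standing in for stable storage are assumptions, not part of the slides.

```python
import copy


class CheckpointedProcess:
    """Toy process that records its state and can roll back to it."""

    def __init__(self, state):
        self.state = state
        self.checkpoints = []  # assumption: a list stands in for stable storage

    def checkpoint(self):
        # Record a deep copy so later changes cannot alter the saved state.
        self.checkpoints.append(copy.deepcopy(self.state))

    def rollback(self):
        # Backward recovery: restore the most recently recorded state.
        if not self.checkpoints:
            raise RuntimeError("no checkpoint to roll back to")
        self.state = copy.deepcopy(self.checkpoints[-1])


p = CheckpointedProcess({"balance": 100})
p.checkpoint()
p.state["balance"] = -999  # the process enters an erroneous state
p.rollback()
print(p.state)  # {'balance': 100}
```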

3
CHECKPOINTING
  • In distributed systems, a consistent global state
    is also called a distributed snapshot.
  • In a distributed snapshot, if a process P has
    recorded the receipt of a message, then there
    should be a process Q that has recorded the
    sending of that message.
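
The consistency condition can be written as a small check. The sketch below assumes each local state is a dictionary with the sets of message identifiers the process has recorded as sent and received; that representation and the function name are hypothetical.

```python
def is_distributed_snapshot(local_states):
    """Return True if every recorded receipt has a matching recorded send.

    local_states maps a process name to {"sent": set, "received": set},
    where the sets hold message identifiers (an assumed representation).
    """
    sent = set()
    received = set()
    for state in local_states.values():
        sent |= state["sent"]
        received |= state["received"]
    # A receipt without a recorded send makes the global state inconsistent.
    return received <= sent


consistent = {
    "P": {"sent": set(), "received": {"m"}},
    "Q": {"sent": {"m"}, "received": set()},
}
inconsistent = {
    "P": {"sent": set(), "received": {"m"}},
    "Q": {"sent": set(), "received": set()},
}
print(is_distributed_snapshot(consistent))    # True
print(is_distributed_snapshot(inconsistent))  # False
```

Note that a message recorded as sent but not yet received does not violate the condition; it is simply still in transit.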

4
INDEPENDENT CHECKPOINTING
  • Each process saves its state from time to time to
    a locally available stable storage (which is
    designed to survive anything except major
    disasters), and we have to construct a consistent
    global state from these local states.
  • A recovery line corresponds to the most recent
    consistent collection of checkpoints.
  • The distributed nature of checkpointing may make
    it difficult to find a recovery line. Discovering
    a recovery line requires that each process be
    rolled back to its most recently saved state.
  • If these local states jointly do not form a
    distributed snapshot, further rolling back is
    necessary.
  • This process of cascaded rollback may lead to
    the domino effect.
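
One way to picture the search for a recovery line is the sketch below. It assumes each checkpoint stores the cumulative sets of messages sent and received so far, and that checkpoint 0 of every process is the empty initial state; the function name and data layout are hypothetical.

```python
def find_recovery_line(checkpoints):
    """Return {process: checkpoint index} for the most recent consistent cut.

    checkpoints[p] is a list of (sent, received) sets, oldest first.
    Assumption: index 0 is the initial state, with both sets empty.
    """
    # Start from the most recently saved state of every process.
    line = {p: len(cps) - 1 for p, cps in checkpoints.items()}
    while True:
        sent = set()
        for p, i in line.items():
            sent |= checkpoints[p][i][0]
        rolled_back = False
        for p, i in list(line.items()):
            # A receipt with no recorded send forces this process back.
            if checkpoints[p][i][1] - sent:
                line[p] = i - 1
                rolled_back = True
        if not rolled_back:
            return line


# P2 sends m2, P1 receives it and checkpoints, then P1 sends m1,
# and P2 receives m1 and checkpoints.
checkpoints = {
    "P1": [(set(), set()), (set(), {"m2"})],
    "P2": [(set(), set()), ({"m2"}, {"m1"})],
}
print(find_recovery_line(checkpoints))  # {'P1': 0, 'P2': 0}
```

In this run P2 is rolled back first because the send of m1 was never recorded. That removes the recorded send of m2, which forces P1 back as well: a small instance of the domino effect.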

5
INDEPENDENT CHECKPOINTING
  • The state saved by P2 indicates the receipt of a
    message m, but no other process can be identified
    as its sender. So, P2 needs to be rolled back to
    an earlier state.
  • P1, in turn, has recorded the receipt of a
    message, but there is no recorded event of this
    message being sent, so P1 must be rolled back
    as well.
  • In this example, the recovery line is the initial
    state of the system.

6
COORDINATED CHECKPOINTING
  • In coordinated checkpointing, all processes
    synchronize to jointly write their state to local
    stable storage. The main advantage is that the
    saved state is globally consistent.
  • A coordinator first multicasts a
    CHECKPOINT_REQUEST message to all processes. When
    a process receives such a message, it takes a
    local checkpoint, queues any subsequent messages
    handed to it by the application, and
    acknowledges that it has taken a checkpoint.
  • When the coordinator has received an
    acknowledgment from all processes, it multicasts
    a CHECKPOINT_DONE message to allow the blocked
    processes to continue.
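
The two rounds of this protocol can be simulated in a single program. The sketch below is a simplification under stated assumptions: method calls stand in for multicast messages, and the class and method names are hypothetical.

```python
class Process:
    def __init__(self, name, state=0):
        self.name = name
        self.state = state
        self.saved = None     # local checkpoint
        self.blocked = False
        self.outbox = []      # messages queued while blocked
        self.sent = []        # messages actually sent

    def send(self, msg):
        # While a checkpoint is in progress, outgoing messages are queued.
        (self.outbox if self.blocked else self.sent).append(msg)

    def on_checkpoint_request(self):
        self.saved = self.state
        self.blocked = True
        return "ACK"

    def on_checkpoint_done(self):
        self.blocked = False
        self.sent.extend(self.outbox)
        self.outbox.clear()


def coordinated_checkpoint(processes):
    # Round 1: CHECKPOINT_REQUEST to everyone, collect acknowledgments.
    acks = [p.on_checkpoint_request() for p in processes]
    if not all(a == "ACK" for a in acks):
        return False
    # Round 2: CHECKPOINT_DONE lets the blocked processes continue.
    for p in processes:
        p.on_checkpoint_done()
    return True


p1, p2 = Process("P1", state=3), Process("P2", state=7)
print(coordinated_checkpoint([p1, p2]), p1.saved, p2.saved)  # True 3 7
```

Because no process sends application messages between taking its checkpoint and receiving CHECKPOINT_DONE, no saved state can record a receipt whose send falls after the sender's checkpoint.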

7
REPLICA MANAGEMENT
  • Replica management involves two issues: where to
    place replicas and which mechanisms to use for
    keeping them consistent.
  • Placing replica servers: finding the best
    locations to place a server that can host part
    of a data store.
  • Placing content: finding the best servers for
    placing content.

8
REPLICA-SERVER PLACEMENT
  • Replica management involves two issues: where to
    place replicas and which mechanisms to use for
    keeping them consistent.
  • Analysis of client and network properties is
    useful for making informed decisions.
  • One approach is to consider the topology of the
    Internet as formed by Autonomous Systems
    (ASes).
  • An AS can best be viewed as a network in which
    the nodes all run the same routing protocol and
    which is managed by a single organization,
    typically an Internet Service Provider (ISP).

9
CONTENT REPLICATION PLACEMENT
  • There are three types of replicas: permanent,
    server-initiated, and client-initiated.

10
PERMANENT REPLICAS
  • Permanent replicas can be considered as the
    initial set of replicas that constitute a data
    store.
  • For example, distribution of a Web site generally
    comes in two forms:
  • First, the files that constitute a site are
    replicated across a number of servers at a single
    location. Whenever a request comes in, it is
    forwarded to one of the servers, for instance
    using a round-robin method.
  • Second, mirroring can be used. In this case, a
    Web site is copied to a limited number of
    servers, called mirror sites, which are
    geographically spread across the Internet.
    Clients choose one of the sites offered to them.
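
Round-robin forwarding, as used in the first form, can be sketched as follows. The server names and the `dispatch` function are placeholders chosen for illustration.

```python
from itertools import cycle

servers = ["server-a", "server-b", "server-c"]  # hypothetical server names
next_server = cycle(servers)


def dispatch(request):
    # Forward each incoming request to the next server in turn.
    return next(next_server), request


assigned = [dispatch(f"GET /page{i}")[0] for i in range(5)]
print(assigned)
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b']
```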

11
SERVER-INITIATED REPLICAS
  • Server-initiated replicas are used to enhance
    performance by dynamically placing temporary
    replicas to handle sudden bursts of requests.
  • Used mainly by the Web hosting services.
  • Each server keeps track of access counts per file
    and where access requests come from.
  • Given a client C, each server can determine which
    of the servers in the Web hosting service is
    closest to C (such information can be obtained
    from routing databases).
  • If clients C1 and C2 share the same closest server
    P, all access requests for file F from C1 and C2
    are jointly registered as a single access count
    for P.
  • When the number of requests for a specific file F
    drops below a certain threshold, that file can be
    removed from the server.
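
The bookkeeping described on this slide might look like the sketch below. The class name, the client-to-closest-server table, and the threshold value are assumptions made for the example; the slide does not specify how they are obtained or chosen.

```python
from collections import Counter


class HostingServer:
    def __init__(self, closest_server_of, deletion_threshold):
        # Assumed lookup table: client -> closest server in the service.
        self.closest_server_of = closest_server_of
        self.deletion_threshold = deletion_threshold
        self.files = set()
        self.counts = Counter()  # (file, closest server) -> access count

    def record_access(self, client, file):
        # Clients sharing a closest server are counted together.
        self.counts[(file, self.closest_server_of[client])] += 1

    def total_requests(self, file):
        return sum(n for (f, _), n in self.counts.items() if f == file)

    def drop_cold_files(self):
        # Remove files whose request count fell below the threshold.
        cold = {f for f in self.files
                if self.total_requests(f) < self.deletion_threshold}
        self.files -= cold
        return cold


q = HostingServer({"C1": "P", "C2": "P", "C3": "R"}, deletion_threshold=2)
q.files = {"F", "G"}
q.record_access("C1", "F")
q.record_access("C2", "F")
q.record_access("C3", "G")
print(q.counts[("F", "P")])  # 2
print(q.drop_cold_files())   # {'G'}
```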

12
SERVER-INITIATED REPLICAS
  • Server-initiated replicas are generally used for
    placing read-only copies.

13
CLIENT-INITIATED REPLICAS
  • Client-initiated replicas are more commonly known
    as client caches. A cache is a local storage
    facility that is used by a client to temporarily
    store a copy of data.
  • Managing the cache is left to the client. However,
    the client can rely on the server to inform it
    when the cache has become stale.
  • When most operations involve only reading data,
    performance can be improved by letting the client
    store requested data in a nearby cache. Such a
    cache can be located on the client's machine or
    on a separate machine in the same LAN.
  • Whenever requested data can be fetched from the
    local cache, a cache hit is said to have occurred.
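
A client cache with server-driven invalidation can be sketched as below. The `ClientCache` class and its `fetch` callback are hypothetical; a dictionary plays the role of the server.

```python
class ClientCache:
    def __init__(self, fetch):
        self.fetch = fetch   # assumed callback that reads from the server
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.hits += 1   # cache hit: served from local storage
            return self.entries[key]
        self.misses += 1
        value = self.fetch(key)
        self.entries[key] = value
        return value

    def invalidate(self, key):
        # Called when the server reports that the cached copy is stale.
        self.entries.pop(key, None)


server_data = {"x": 1}
cache = ClientCache(server_data.__getitem__)
cache.get("x")         # miss, fetched from the server
cache.get("x")         # hit
server_data["x"] = 2
cache.invalidate("x")  # server informs the client of the update
print(cache.get("x"), cache.hits, cache.misses)  # 2 1 2
```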

14
CONTENT DISTRIBUTION
  • There are three ways to propagate updated
    content to the replica servers:
  • Propagate only a notification of an update:
    other copies are informed that an update has
    taken place and that the data they contain is no
    longer valid. The main advantage is that it uses
    little network bandwidth. It works best when
    there are many update operations compared to read
    operations, that is, when the read-to-write ratio
    is relatively small.
  • Transfer data from one copy to another: this is
    useful when the read-to-write ratio is relatively
    high. In that case, the probability is high that
    an update will be effective, in the sense that
    the modified data will be read before the next
    update takes place.

15
CONTENT DISTRIBUTION
  • Propagate the update operation from one copy to
    another: tell each replica which update
    operation it should perform (sending only the
    parameter values that the operation needs).
    This approach, also referred to as active
    replication, assumes that each replica is
    represented by a process capable of actively
    keeping its associated data up to date.
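
The three propagation options can be contrasted on a toy replica. The class and method names below are hypothetical, and a single integer stands in for the replicated data.

```python
import operator


class Replica:
    def __init__(self, value=0):
        self.value = value
        self.valid = True

    def on_invalidate(self):
        # Option 1: only a notification arrives; the copy is marked stale.
        self.valid = False

    def on_data(self, new_value):
        # Option 2: the modified data itself is transferred.
        self.value = new_value
        self.valid = True

    def on_operation(self, op, *args):
        # Option 3: the replica performs the update operation locally,
        # receiving only the operation and its parameter values.
        self.value = op(self.value, *args)
        self.valid = True


a, b, c = Replica(10), Replica(10), Replica(10)
a.on_invalidate()
b.on_data(15)
c.on_operation(operator.add, 5)
print(a.valid, b.value, c.value)  # False 15 15
```

The invalidated replica `a` still holds the old value and would have to fetch the new one before serving a read.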

16
PULL versus PUSH PROTOCOLS
  • Yet another design issue is whether updates are
    pulled or pushed.
  • In a push-based approach, also referred to as
    server-based protocols, updates are propagated to
    other replicas without those replicas asking for
    them. A push-based approach is used when
    replicas need to maintain a high degree of
    consistency, i.e., replicas need to be kept
    identical. The server needs to keep track of all
    client caches.
  • In a pull-based approach, a server or client
    requests another server to send it any updates it
    has at that moment. This approach, also called
    client-based protocols, is often used by client
    caches, for example by Web caches.
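
The difference between the two approaches can be sketched as follows. The `Server` and `Replica` classes are hypothetical, and an update log of strings stands in for real data.

```python
class Server:
    def __init__(self):
        self.log = []          # history of updates
        self.subscribers = []  # push requires tracking every replica

    def update(self, value, push=False):
        self.log.append(value)
        if push:
            # Push: send the update without being asked.
            for replica in self.subscribers:
                replica.apply(self.log)

    def updates_since(self, n):
        return self.log[n:]


class Replica:
    def __init__(self):
        self.seen = []

    def apply(self, log):
        self.seen = list(log)

    def pull(self, server):
        # Pull: ask the server for whatever updates it has right now.
        self.seen.extend(server.updates_since(len(self.seen)))


server, pushed, pulled = Server(), Replica(), Replica()
server.subscribers.append(pushed)
server.update("v1", push=True)
server.update("v2", push=True)
print(pushed.seen, pulled.seen)  # ['v1', 'v2'] []
pulled.pull(server)
print(pulled.seen)               # ['v1', 'v2']
```

The pushed replica is current after every update, at the cost of the server maintaining a subscriber list; the pulled replica is stale until it asks.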