CTIS 490 DISTRIBUTED SYSTEMS - PowerPoint PPT Presentation

Provided by: cneyt

1
CTIS 490 DISTRIBUTED SYSTEMS

2
RECOVERY

Another issue in fault tolerance is recovery
from an error.
The idea of error recovery is to replace an
erroneous state with an error-free state.
The most widely used recovery method is
backward recovery.
Backward recovery brings the system from its
present erroneous state back into a previously
correct state. To do so, it is necessary to
record the system's state from time to time and
to restore such a recorded state when things go
wrong.
Each time the system's present state is recorded,
a checkpoint is said to be made.
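
To make the checkpoint and restore cycle concrete, here
is a minimal single-process sketch in Python. The class
CheckpointedCounter and its in-memory list of
checkpoints are hypothetical; a real system would write
checkpoints to stable storage.

```python
import copy


class CheckpointedCounter:
    """Toy process whose state is a dict; checkpoints are deep copies."""

    def __init__(self):
        self.state = {"total": 0}
        self.checkpoints = []  # assumption: stands in for stable storage

    def checkpoint(self):
        # Record the present state so it can be restored later.
        self.checkpoints.append(copy.deepcopy(self.state))

    def add(self, value):
        self.state["total"] += value

    def recover(self):
        # Backward recovery: replace the current (possibly erroneous)
        # state with the most recently recorded one.
        if not self.checkpoints:
            raise RuntimeError("no checkpoint to roll back to")
        self.state = copy.deepcopy(self.checkpoints[-1])


proc = CheckpointedCounter()
proc.add(5)
proc.checkpoint()
proc.add(-999)  # pretend this update was erroneous
proc.recover()
print(proc.state)  # {'total': 5}
```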

3
CHECKPOINTING

In distributed systems, a consistent global state
is also called a distributed snapshot.
In a distributed snapshot, if a process P has
recorded the receipt of a message, then there
should be a process Q that has recorded the
sending of that message.
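
The consistency condition can be checked mechanically.
The sketch below assumes each local state is
represented as two sets of message identifiers, "sent"
and "received"; that representation and the function
name is_consistent are illustrative choices, not part
of the slides.

```python
def is_consistent(snapshot):
    """Return True if every recorded receipt has a recorded send.

    snapshot maps a process name to a dict with two sets of
    message ids: "sent" and "received".
    """
    all_sent = set()
    for local in snapshot.values():
        all_sent |= local["sent"]
    # Each process may only have received messages that some
    # process in the snapshot has recorded as sent.
    return all(local["received"] <= all_sent for local in snapshot.values())


consistent = {
    "P": {"sent": set(), "received": {"m1"}},
    "Q": {"sent": {"m1"}, "received": set()},
}
inconsistent = {
    "P": {"sent": set(), "received": {"m1"}},
    "Q": {"sent": set(), "received": set()},  # Q never recorded sending m1
}
print(is_consistent(consistent), is_consistent(inconsistent))  # True False
```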

4
INDEPENDENT CHECKPOINTING

Each process saves its state from time to time to
a locally available stable storage (which is
designed to survive anything except major
disasters), and we have to construct a consistent
global state from these local states.
A recovery line corresponds to the most recent
consistent collection of checkpoints.
The distributed nature of checkpointing may make
it difficult to find a recovery line. Discovering
a recovery line requires that each process is
rolled back to its most recently saved state.
If these local states jointly do not form a
distributed snapshot, further rolling back is
necessary.
This process of cascaded rollback may lead to
the domino effect.
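
The search for a recovery line can be sketched as a
loop that keeps rolling back any process whose saved
state records a receipt without a matching send. The
data layout (cumulative "sent" and "received" sets per
checkpoint, index 0 being the initial state) and the
name find_recovery_line are assumptions made for this
example.

```python
def find_recovery_line(checkpoints):
    """Return, per process, the index of the checkpoint on the recovery line.

    checkpoints maps a process name to a list of local states, oldest
    first. Each state is {"sent": set, "received": set}, cumulative up
    to that checkpoint. Index 0 is assumed to be the initial state,
    in which nothing has been sent or received.
    """
    line = {p: len(states) - 1 for p, states in checkpoints.items()}
    while True:
        sent = set()
        for p, i in line.items():
            sent |= checkpoints[p][i]["sent"]
        # Processes whose saved state records a receipt with no recorded send.
        orphaned = [
            p for p, i in line.items()
            if not checkpoints[p][i]["received"] <= sent
        ]
        if not orphaned:
            return line
        for p in orphaned:
            line[p] -= 1  # roll back one checkpoint and re-check


# P2 saved the receipt of m2, but P1 sent m2 only after its checkpoint.
# Rolling P2 back discards the send of m1, which P1 has recorded as received.
example = {
    "P1": [{"sent": set(), "received": set()},
           {"sent": set(), "received": {"m1"}}],
    "P2": [{"sent": set(), "received": set()},
           {"sent": {"m1"}, "received": {"m2"}}],
}
print(find_recovery_line(example))  # {'P1': 0, 'P2': 0}
```

In this constructed example both processes end up at
their initial state, which is the domino effect in
miniature.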

5
INDEPENDENT CHECKPOINTING

The state saved by P2 indicates the receipt of a
message m, but no other process can be identified
as its sender. So, P2 needs to be rolled back to
an earlier state.
After that rollback, P1 has recorded the receipt
of a message for which there is no recorded send
event, so P1 must be rolled back as well.
In this example, the recovery line is the initial
state of the system.

6
COORDINATED CHECKPOINTING

In coordinated checkpointing, all processes
synchronize to jointly write their state to local
stable storage. The main advantage is that the
saved state is automatically globally consistent.
A coordinator first multicasts a
CHECKPOINT_REQUEST message to all processes. When
a process receives such a message, it takes a
local checkpoint, queues any subsequent messages
handed to it by the application, and acknowledges
that it has taken a checkpoint.
When the coordinator has received an
acknowledgment from all processes, it multicasts
a CHECKPOINT_DONE message to allow the blocked
processes to continue.
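
The two phases above can be simulated in a single
Python program. This is only a sketch: the Process
class, its method names and the direct method calls
that stand in for multicast messages are all
assumptions, and failures or lost messages are not
modeled.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.state = 0
        self.saved = None    # assumption: stands in for local stable storage
        self.blocked = False
        self.outbox = []     # messages queued while blocked
        self.delivered = []  # messages actually handed to the network

    def send(self, msg):
        # While a checkpoint is in progress, outgoing messages are queued.
        if self.blocked:
            self.outbox.append(msg)
        else:
            self.delivered.append(msg)

    def on_checkpoint_request(self):
        self.saved = self.state  # take the local checkpoint
        self.blocked = True
        return "ACK"

    def on_checkpoint_done(self):
        self.blocked = False
        while self.outbox:       # release queued messages in order
            self.delivered.append(self.outbox.pop(0))


def coordinated_checkpoint(processes):
    """Phase 1: request and collect acks. Phase 2: announce completion."""
    acks = [p.on_checkpoint_request() for p in processes]
    if all(a == "ACK" for a in acks):
        for p in processes:
            p.on_checkpoint_done()
        return True
    return False


# Show the queueing behaviour of one process between the two phases.
p = Process("P1")
p.state = 42
p.on_checkpoint_request()
p.send("hello")
queued_while_blocked = list(p.outbox)  # ['hello']
p.on_checkpoint_done()
print(p.saved, queued_while_blocked, p.delivered)
```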

7
REPLICA MANAGEMENT

Replica management involves two issues: where to
place replicas and which mechanisms to use for
keeping them consistent.
Placing replica servers is concerned with finding
the best locations to place a server that can
host part of a data store.
Placing content is concerned with finding the
best servers for placing content.

8
REPLICA-SERVER PLACEMENT

Analysis of client and network properties is
useful for making informed placement decisions.
One approach is to consider the topology of the
Internet as formed by the Autonomous Systems
(AS).
An AS can best be viewed as a network in which
the nodes all run the same routing protocol and
which is managed by a single organization,
typically an Internet Service Provider (ISP).

9
CONTENT REPLICATION PLACEMENT

10
PERMANENT REPLICAS

Permanent replicas can be considered as the
initial set of replicas that constitute a data
store.
For example, distribution of a Web site generally
comes in two forms:
First, the files that constitute a site are
replicated across a number of servers at a single
location. Whenever a request comes in, it is
forwarded to one of the servers, for instance
using a round-robin method.
Second, mirroring can be used. In this case, a
Web site is copied to a limited number of
servers, called mirror sites, which are
geographically spread across the Internet.
Clients choose one of the sites offered to them.
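
As an illustration of the first form, round-robin
forwarding can be sketched with itertools.cycle from
the Python standard library. The class name
RoundRobinDispatcher and the server names are made up
for this example.

```python
from itertools import cycle


class RoundRobinDispatcher:
    """Forward each incoming request to the next server in turn."""

    def __init__(self, servers):
        self._next_server = cycle(servers)

    def forward(self, request):
        server = next(self._next_server)
        return server, request


dispatcher = RoundRobinDispatcher(["s1", "s2", "s3"])
targets = [dispatcher.forward(f"GET /page{i}")[0] for i in range(5)]
print(targets)  # ['s1', 's2', 's3', 's1', 's2']
```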

11
SERVER-INITIATED REPLICAS

Server-initiated replicas are used to enhance
performance by dynamically placing temporary
replicas to handle sudden bursts of requests.
They are used mainly by Web hosting services.
Each server keeps track of access counts per file
and where access requests come from.
Given a client C, each server can determine which
of the servers in the Web hosting service is
closest to C (such information can be obtained
from routing databases).
If clients C1 and C2 share the same closest server
P, all access requests for file F from C1 and C2
are jointly registered as coming from P.
When the number of requests for a specific file F
drops below a certain threshold, that file can be
removed from the server.
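
The bookkeeping described above might look as follows.
This is a sketch under stated assumptions: the
closest_server lookup table stands in for the routing
database, deletion_threshold is a hypothetical
parameter, and counts are kept forever rather than per
time window as a real hosting service would do.

```python
from collections import Counter


class ReplicaServer:
    def __init__(self, closest_server, deletion_threshold):
        # Assumption: client -> closest server, as if read from routing data.
        self.closest_server = closest_server
        self.deletion_threshold = deletion_threshold
        self.counts = {}  # file -> Counter keyed by closest server

    def record_access(self, client, file):
        # Requests from clients with the same closest server are
        # registered jointly under that server.
        origin = self.closest_server[client]
        self.counts.setdefault(file, Counter())[origin] += 1

    def removable_files(self):
        # Files whose total request count is below the threshold.
        return sorted(
            f for f, per_origin in self.counts.items()
            if sum(per_origin.values()) < self.deletion_threshold
        )


closest = {"C1": "P", "C2": "P", "C3": "Q"}
server = ReplicaServer(closest, deletion_threshold=3)
for client, file in [("C1", "F"), ("C2", "F"), ("C3", "F"), ("C3", "G")]:
    server.record_access(client, file)

print(server.counts["F"]["P"])   # 2 (C1 and C2 counted together)
print(server.removable_files())  # ['G']
```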

12
SERVER-INITIATED REPLICAS

13
CLIENT-INITIATED REPLICAS

Client-initiated replicas are more commonly known
as client caches. A cache is a local storage
facility that is used by a client to temporarily
store a copy of data.
Managing the cache is left to the client. However,
the client can rely on the server to inform it
when cached data has become stale.
When most operations involve only reading data,
performance can be improved by letting the client
store requested data in a nearby cache. Such a
cache can be located on the client's machine or
on a separate machine in the same LAN.
Whenever requested data can be fetched from the
local cache, a cache hit is said to have occurred.
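
A client cache with hit counting and server-triggered
invalidation can be sketched as below. The ClientCache
class, the fetch callback and the invalidate method are
hypothetical names; the dictionary called origin plays
the role of the server.

```python
class ClientCache:
    def __init__(self, fetch):
        self.fetch = fetch   # function that retrieves data from the server
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.hits += 1   # cache hit: served from the local copy
            return self.entries[key]
        self.misses += 1     # cache miss: fetch from the server and keep a copy
        value = self.fetch(key)
        self.entries[key] = value
        return value

    def invalidate(self, key):
        # Called when the server reports that the cached copy is stale.
        self.entries.pop(key, None)


origin = {"page": "v1"}
cache = ClientCache(origin.__getitem__)
first = cache.get("page")    # miss
second = cache.get("page")   # hit
origin["page"] = "v2"
cache.invalidate("page")     # the server tells the client its copy is stale
third = cache.get("page")    # miss, refetched
print(first, second, third, cache.hits, cache.misses)  # v1 v1 v2 1 2
```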

14
CONTENT DISTRIBUTION

15
CONTENT DISTRIBUTION

16
PULL versus PUSH PROTOCOLS

Yet another design issue is whether updates are
pulled or pushed.
In a push-based approach, also referred to as
server-based protocols, updates are propagated to
other replicas without those replicas asking for
the updates. The push-based approach is used when
replicas need to maintain a high degree of
consistency, i.e., replicas need to be kept
identical. The server needs to keep track of all
client caches.
In a pull-based approach, a server or client
requests another server to send it any updates it
has at that moment. This approach, also called
client-based protocols, is often used by client
caches, for example by Web caches.
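
The difference between the two approaches can be
sketched with a toy server and two replicas. The class
and method names are assumptions for this example, and
a version number is used as a simple way for a pulling
replica to detect that it is behind.

```python
class Server:
    def __init__(self):
        self.value = None
        self.version = 0
        self.subscribers = []  # push: the server tracks its replicas

    def register(self, replica):
        self.subscribers.append(replica)

    def update(self, value):
        self.version += 1
        self.value = value
        # Push: propagate the update without being asked.
        for replica in self.subscribers:
            replica.apply(self.version, value)


class Replica:
    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        self.version = version
        self.value = value

    def pull(self, server):
        # Pull: ask the server whether it has anything newer.
        if server.version > self.version:
            self.apply(server.version, server.value)


server = Server()
pushed, pulled = Replica(), Replica()
server.register(pushed)  # only this replica receives pushes

server.update("a")
before_pull = (pushed.value, pulled.value)  # ('a', None)
pulled.pull(server)
after_pull = (pushed.value, pulled.value)   # ('a', 'a')
print(before_pull, after_pull)
```

The pushed replica is never stale, at the cost of the
server keeping a subscriber list; the pulled replica is
stale until it asks.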