Distributed Systems - PowerPoint PPT Presentation

About This Presentation

Title:

Distributed Systems

Slides: 42

Provided by: VinayTr6

Learn more at: http://vega.cs.kent.edu

Transcript and Presenter's Notes

1
Distributed Systems

Distributed System models
Physical Networks
Logical Models
Different Failure Models
Communication constructs
(semantics of distributed programs)
Ordering of events and execution semantics

2
System Model

Two ways of viewing a DS:
As defined by the physical components of the
system: the physical model.
As defined from the point of view of processing
or computation: the logical model.

3
The goal of fault tolerance in DS

Is to ensure that some property/service, in
the logical model is preserved despite the
failure of some component(s) in the physical
system.

4
Physical Network of a DS

Consists of many computers, called nodes, that
are typically:
Autonomous
Geographically separated
Connected through communication networks

5
Distributed vs. Parallel Systems

Distributed systems:
Nodes loosely coupled
Essentially no shared memory
Private clocks for nodes

Parallel systems:
Nodes closely coupled
May have shared memory between nodes
May have a single global clock for many/all nodes

6
Point to Point Physical Network
Fully Connected
Star
Tree
Communication protocols used: TCP/IP, OSI, etc.
7
Bus Topology
[Figure: nodes attached to a common bus]
Communication protocol used: CSMA/CD
8
Logical Model

A Distributed Application consists of
a set of concurrently executing processes that
cooperate with each other to perform some task.
A process is the execution of a sequential
program, which is a list of instructions.

9
Concurrent Processes

Can be classified into three categories:
Independent processes: the sets of objects they
access are disjoint.
Competing processes: share resources, but there is
no information exchange between them.
Cooperating processes: exchange information either
by using shared data or by message passing.

10
A few logical level assumptions

Finite progress assumption: since no assumptions
about the relative speeds of processes can be
made, it is assumed that they all have positive
rates of execution.
The underlying network is treated as fully
connected; its topology is not considered.
At the logical level, the system is made of
processes and channels between them.
Channels are assumed to have infinite buffers and
to be error-free.
Channels deliver messages in the order in which
they are sent (NB: on a particular channel).

Assumptions about Time Bounds on the performance
of the system are also made.
A system is said to be synchronous if, whenever
the system is working correctly, it performs its
intended function within a finite and known time
bound, otherwise it is said to be asynchronous.
A synchronous channel is one in which the max.
message delay is known and bounded.
A synchronous processor is one in which the time
to execute a sequence of instructions is finite
and bounded.
Advantage of synchronous systems: the failure of a
component can be deduced from the lack of a
response within the known time bound.
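
The last point can be illustrated with a minimal sketch of timeout-based failure detection. It assumes a known reply bound (`max_delay_s`, a made-up value) and models a component as something that may or may not put a reply into a queue; the names `is_alive` and `demo` are hypothetical and not from the slides.

```python
import queue
import threading


def is_alive(probe, reply_box, max_delay_s):
    """Send a probe; report whether a reply arrives within the bound."""
    probe()
    try:
        reply_box.get(timeout=max_delay_s)
        return True
    except queue.Empty:
        # In a synchronous system, silence past the bound implies failure.
        return False


def demo(max_delay_s=0.5):
    # A correct component replies well within the bound.
    healthy_box = queue.Queue()

    def probe_healthy():
        threading.Thread(target=healthy_box.put, args=("pong",)).start()

    # A crashed component never replies.
    crashed_box = queue.Queue()

    def probe_crashed():
        pass

    return (is_alive(probe_healthy, healthy_box, max_delay_s),
            is_alive(probe_crashed, crashed_box, max_delay_s))
```

In an asynchronous system the same timeout would only yield a suspicion, because a slow component cannot be told apart from a failed one.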

12
Failures and Fault Classification

Crash fault: causes a component to halt or to
lose its internal state; the component never
undergoes any incorrect state transition when it
fails.
Omission fault: causes a component to not respond
to some inputs.
Timing/performance fault: causes a component to
respond either too early or too late.
Byzantine fault: causes the component to behave
in a totally arbitrary manner during failure.
Incorrect computation fault: the component
produces incorrect outputs.

13
Fault Hierarchy

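[Figure: nested fault classes, from outermost to
innermost: byzantine, timing, omission, crash]
Each class contains the ones listed after it:
crash faults are a subset of omission faults,
which are a subset of timing faults, which are a
subset of byzantine faults.
Incorrect computation fault is a subset of
byzantine but different from the other faults.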
14
Assumptions about fault types

For a processor: crash fault or byzantine fault.
For a communication network: all the different
types of faults.
For a clock: timing fault, byzantine fault and
sometimes omission fault.
For storage media: crash, timing, omission and
incorrect computation faults.
For software components: most of the faults
defined above, but the most important is the
incorrect computation fault.

15
Interprocess Communication

Synchronization and communication are both
achieved by message passing primitives.
In shared memory systems, primitives like
semaphores, conditional critical regions and
monitors are used.

16
Process Creation

Processes are created in a system by the use of
some operating-system-provided system call.
At the language level, this is done by using some
language primitives, e.g. fork and join, or the
cobegin-coend statement.

17
Fork and Join Primitive

Program P1
..
fork P2
.
join P2
.

Program P2
.
.
.
end
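
As a rough analogue of the fork/join pattern above, the following sketch uses Python threads, where `Thread.start()` plays the role of fork and `Thread.join()` the role of join. Threads share memory, so this is only an approximation of process creation; the function names and log messages are illustrative.

```python
import threading


def run_p1():
    log = []

    def p2():
        log.append("P2 ran")

    log.append("P1 before fork")
    child = threading.Thread(target=p2)
    child.start()   # fork P2: P2 now runs concurrently with P1
    child.join()    # join P2: P1 waits here until P2 has ended
    log.append("P1 after join")
    return log


log = run_p1()
```

Any work P1 does between `start()` and `join()` overlaps with P2; the sketch leaves that region empty so the log order is deterministic.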

18
Cobegin-Coend Primitive

cobegin S1 S2 S3 ... Sn coend
The above statement causes n different processes
to be created, each executing a different
statement Si.
It ends when all the Si have terminated.
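
A minimal sketch of the same idea with Python threads: the hypothetical helper `cobegin` starts one thread per statement and returns only when all of them have terminated. The statements here are placeholder functions that just record their names.

```python
import threading


def cobegin(*statements):
    """Run every statement concurrently; return when all have ended."""
    threads = [threading.Thread(target=s) for s in statements]
    for t in threads:
        t.start()
    for t in threads:
        t.join()    # "coend": wait for every Si


done = set()
lock = threading.Lock()


def make_statement(name):
    def statement():
        with lock:          # protect the shared set
            done.add(name)
    return statement


cobegin(make_statement("S1"), make_statement("S2"), make_statement("S3"))
```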

19
Asynchronous Message Passing

In a DS without shared memory, message passing is
used both for communication and synchronization.
A process sends a message by executing a send
command:
send(data, destination)
Data is received with a receive command:
receive(data, source)
or, in client-server interaction,
receive(message)
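
The send and receive commands can be sketched with one unbounded queue per process acting as its mailbox. This is an assumption-laden illustration: the `mailboxes` table and the process names are invented, and `receive` here returns the data rather than taking it as an argument as in the slide's notation.

```python
import queue
import threading

# One unbounded mailbox per process (models the "infinite buffer").
mailboxes = {"client": queue.Queue(), "server": queue.Queue()}


def send(data, destination):
    mailboxes[destination].put(data)    # never blocks: queue is unbounded


def receive(me):
    return mailboxes[me].get()          # blocks until a message arrives


def server():
    request = receive("server")
    send(request.upper(), "client")


worker = threading.Thread(target=server)
worker.start()
send("hello", "server")
reply = receive("client")
worker.join()
```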

20
Assumptions

Message passing requires some buffer between the
sender and the receiver.
In asynchronous message passing, an infinite
buffer to store messages is assumed, so the sender
can go on sending messages without blocking; the
receiver, however, blocks until a message is
available.
In reality buffers are of finite size, so the
sender may also have to block. This is called
buffered message passing.
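
A small sketch of buffered message passing, assuming a buffer of two messages (an arbitrary size). `put_nowait` is used instead of a blocking `put` so the example can observe that the sender would have to block without actually hanging.

```python
import queue

buffer = queue.Queue(maxsize=2)     # finite buffer between sender and receiver
buffer.put("m1")
buffer.put("m2")

try:
    buffer.put_nowait("m3")         # buffer is full
    sender_would_block = False
except queue.Full:
    sender_would_block = True       # a blocking put() would wait here

first = buffer.get()                # receiver frees one slot
buffer.put_nowait("m3")             # now the send succeeds
```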

NB
Asynchronous and synchronous message passing are
different from asynchronous and synchronous DS.
The former refers to communication primitives and
the size of the buffer between sender and
receiver, while the latter deals with bounds on
message delays.
In a synchronous DS, both asynchronous and
synchronous message passing can be supported.

22
Synchronous msg passing CSP

Has no buffering.
Has the advantage that at each communication
command it is easier to make assertions about
processes.
Has been employed in Communicating Sequential
Processes (CSP), a notation proposed for
specifying distributed programs.
CSP uses Guarded Command Language.
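
The blocking behaviour of synchronous message passing can be approximated as follows. `SyncChannel` is a hypothetical class, not CSP itself: it uses a queue purely as a hand-off slot and makes `send` wait until the receiver has taken the message, which is adequate for one sender and one receiver.

```python
import queue
import threading


class SyncChannel:
    def __init__(self):
        self._slot = queue.Queue()

    def send(self, msg):
        self._slot.put(msg)
        self._slot.join()           # block until the receiver has taken msg

    def receive(self):
        msg = self._slot.get()
        self._slot.task_done()      # releases the blocked sender
        return msg


def run():
    channel = SyncChannel()
    log = []

    def sender():
        channel.send("msg")
        log.append("send returned")

    t = threading.Thread(target=sender)
    t.start()
    t.join(timeout=0.2)
    blocked = t.is_alive()          # no receive yet, so send has not returned
    got = channel.receive()
    t.join()
    return blocked, got, log
```

Because `send` returns only after the matching `receive`, the sender can assert at that point that the receiver has the message, which is the reasoning advantage the slide mentions.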

23
Guarded Commands
A guarded command (GC) is a statement list that is
prefixed by a Boolean expression called a guard:
guard → statement list
The statement list is eligible for execution only
if the guard evaluates to true, i.e. it succeeds.
Evaluation of a guard is assumed to have no
side effects, i.e. it does not alter the state of
the program in any manner.
The alternative construct is formed by using a set
of guarded commands as follows:
24
[G1 → S1 □ G2 → S2 □ ... □ Gn → Sn]
The execution of this alternative construct aborts
if all the guards evaluate to false.
If any guard is true, the corresponding statement
list is eligible for execution.
If multiple guards evaluate to true, the statement
list to be executed is selected
non-deterministically.
The repetitive construct is similar but carries a
* prefix.
GC notation allows non-determinism within a
program.
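
The two constructs can be sketched in Python, with each guarded command modeled as a (guard, statement) pair of callables. The helper names `alternative` and `repetitive` are invented; the sketch also assumes the usual convention that a repetitive construct terminates once every guard is false, which the slide does not spell out.

```python
import random


def alternative(guarded_commands):
    """Run one statement whose guard is true; abort if none is."""
    eligible = [stmt for guard, stmt in guarded_commands if guard()]
    if not eligible:
        raise RuntimeError("alternative construct aborted: all guards false")
    return random.choice(eligible)()    # non-deterministic selection


def repetitive(guarded_commands):
    """Repeat the alternative until every guard evaluates to false."""
    while True:
        eligible = [stmt for guard, stmt in guarded_commands if guard()]
        if not eligible:
            return
        random.choice(eligible)()


# Example: greatest common divisor by repeated subtraction.
state = {"x": 12, "y": 18}


def shrink_x():
    state["x"] -= state["y"]


def shrink_y():
    state["y"] -= state["x"]


repetitive([
    (lambda: state["x"] > state["y"], shrink_x),
    (lambda: state["y"] > state["x"], shrink_y),
])
```

In this example at most one guard is true at a time, so the result is deterministic even though the selection rule is not.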
25
Communicating Sequential Processes

Is a programming notation for expressing
concurrent programs.
Employs synchronous message passing.
Uses guarded commands to allow selective
communication.
A CSP program may consist of many concurrent
processes.
A process Pi sends a message, msg, to a process
Pj by an output command of the form Pj!msg.
Pj receives a message from Pi by an input command
of the form Pi?m.
For a process Pj the overall code is of the form
Pj :: Initialize; *[G1 → C1 □ G2 → C2 □ ... □ Gn → Cn]

26
Remote Procedure Call

A higher-level primitive to support client-server
interaction.
An extension of the procedure call mechanism
available in most programming languages.
The service to be provided by the server is
treated as a procedure that resides on the
machine on which the server runs. The client
process that needs the service makes calls to this
procedure, and RPC takes care of the underlying
communication.
A call statement is of the form
call service(value_args, result_args)

The states of both the server and the client may
change as a result of a call.
However, idempotent remote procedures can be
executed repeatedly with the same effect as a
single execution, so a repeated call from the
client does not change the state of the server
any further.
Idempotent servers simplify the task of fault
tolerance.
Two basic approaches to specifying the server
side in RPC
Remote procedure is just like a sequential
procedure i.e. single process executes the
procedure as calls are made.
A new process is created every time a call is
made. These processes can be concurrent.
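
The point about idempotent procedures can be made concrete with a small in-process sketch (no real RPC library is involved). The `Server` class, its two procedures and `call_with_retry` are hypothetical; the retry loop stands in for a client that re-issues a call because replies were lost.

```python
class Server:
    def __init__(self):
        self.balance = 100

    def set_balance(self, value):
        """Idempotent: repeating the call leaves the same state."""
        self.balance = value
        return self.balance

    def deposit(self, amount):
        """Not idempotent: every repetition changes the state again."""
        self.balance += amount
        return self.balance


def call_with_retry(procedure, arg, attempts):
    """Simulate a client that re-issues the same call several times."""
    result = None
    for _ in range(attempts):
        result = procedure(arg)
    return result


idempotent_result = call_with_retry(Server().set_balance, 150, attempts=3)
repeated_result = call_with_retry(Server().deposit, 50, attempts=3)
```

With the idempotent procedure the three executions are indistinguishable from one; with `deposit` the duplicate executions corrupt the balance.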

28
Semantics of the RPC in failure conditions

The semantics of remote calls are classified as
follows:
At least once: the remote procedure has been
executed one or more times if the invocation
terminates normally. If it terminates abnormally,
nothing can be said about the number of times the
remote procedure executed.
Exactly once: the remote procedure has executed
exactly once if the invocation terminates
normally; if not, it can be asserted that the
remote procedure did not execute more than once.
At most once: same as exactly once if the
invocation terminates normally; otherwise it is
guaranteed that the remote procedure has either
been executed completely once or not been
executed at all.
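
One common way to keep a retried call from executing more than once is to tag each request with an identifier and cache the reply on the server. The sketch below illustrates only that duplicate-suppression idea; it is an assumption, not the slides' mechanism, and it ignores server crashes, which would lose the cache. `DedupServer` and the request IDs are invented names.

```python
class DedupServer:
    def __init__(self):
        self.counter = 0
        self._replies = {}      # request id -> cached reply

    def increment(self, request_id):
        # A retransmitted request is answered from the cache
        # instead of executing the procedure a second time.
        if request_id in self._replies:
            return self._replies[request_id]
        self.counter += 1
        self._replies[request_id] = self.counter
        return self.counter


server = DedupServer()
first_reply = server.increment("r1")
retry_reply = server.increment("r1")    # duplicate of the same call
second_reply = server.increment("r2")   # a genuinely new call
```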

29
Orphans: unwanted executions of remote procedures
caused by communication or processor failures.
E.g. a client that crashes after issuing a call
may restart on recovery and reissue the call even
though the previous call is still being executed
by the server.
The presence of orphans can violate the semantics
of RPC and lead to inconsistency.
Call ordering: this property requires that a
sequence of invocations generated by a given
client result in computations performed by the
server in the same order.
It is automatically satisfied if there are no
failures.
It is not a strict requirement in the case of
idempotent servers.
30
Object-Action Model

Another high-level communication paradigm.
In this paradigm a system consists of many
objects, each consisting of some data and
well-defined methods (operations) on that data.
The encapsulated data can only be accessed
through the methods defined for it.
The objects may reside on different nodes.
A process sends a message to the object
concerned, which performs an action by executing
a method and returns the result to the process.

Nested remote procedure calls may be created.
Methods on objects may execute in parallel.
Concurrent calls may be made to the same method
or to the same object.
Becoming popular since it supports fault
tolerance through possible replication of
objects.

32
Ordering of Events

There is no single global clock for defining the
happened-before relationship between events of
different processors.
Partial ordering: the relation → on a set of
events in a distributed system is the smallest
relation satisfying the following three
conditions:
If a and b are events performed by the same
process and a is performed before b, then a → b.
If a is the sending of a message by one process
and b is the receiving of the same message by
another process, then a → b.
If a → b and b → c, then a → c.
Two events a and b are said to be concurrent if
neither a → b nor b → a.
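
The three conditions can be turned into a small computation: start from local order and message pairs, then close the relation under transitivity. The event names and the helper functions below are illustrative assumptions.

```python
from itertools import product


def happened_before(process_orders, messages):
    """Return the set of pairs (a, b) such that a happened before b.

    process_orders: one list of events per process, in local order.
    messages: (send_event, receive_event) pairs.
    """
    hb = set(messages)                          # condition 2
    for events in process_orders:               # condition 1
        for i, a in enumerate(events):
            for b in events[i + 1:]:
                hb.add((a, b))
    changed = True
    while changed:                              # condition 3: transitivity
        changed = False
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb


def concurrent(a, b, hb):
    return a != b and (a, b) not in hb and (b, a) not in hb


# P1 performs p1, p2, p3; P2 performs q1, q2; p2 sends a message received at q2.
hb = happened_before([["p1", "p2", "p3"], ["q1", "q2"]], [("p2", "q2")])
```

Here p1 precedes q2 through the message, while q1 and p3 are concurrent because no chain of local steps and messages connects them.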

33
Logical Clocks

The logical clock Ci for a process Pi is a
function that assigns a value Ci(a) to an event
a of the process Pi.
The system of logical clocks is considered to be
correct if it is consistent with the relation →,
i.e. for any events a, b, if a → b then
C(a) < C(b).
When a message m is sent from process Pi, the
timestamp of the sending event is included in m
and can be retrieved by the receiver.
Let Tm be the timestamp of the message m. There
are two conditions that a system of logical
clocks should satisfy in order to be correct:
Each Pi increments Ci between any two successive
events.
Upon receiving a message m, Pj sets Cj to a value
greater than or equal to its present value and
greater than Tm.
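
One simple way to satisfy both conditions is sketched below. The class name and method names are assumptions; the receive rule `max(current, Tm) + 1` is one choice that meets the stated requirement, not the only one.

```python
class LogicalClock:
    def __init__(self, start=0):
        self.time = start

    def tick(self):
        """Condition 1: advance the clock for each local event."""
        self.time += 1
        return self.time

    def send(self):
        """A send is an event; its timestamp Tm travels with the message."""
        return self.tick()

    def receive(self, tm):
        """Condition 2: move past both the present value and Tm."""
        self.time = max(self.time, tm) + 1
        return self.time


pi, pj = LogicalClock(), LogicalClock()
pi.tick()                       # local event at Pi, clock becomes 1
tm = pi.send()                  # send event at Pi, Tm = 2
recv_time = pj.receive(tm)      # receive event at Pj

ahead = LogicalClock(start=10)  # a receiver whose clock is already ahead
ahead_time = ahead.receive(tm)
```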

34
Total Ordering of Events

Order the events by the timestamps assigned to
them by the logical clock system. Processes can
be ordered in the lexicographic order of their
names.
A relation => on the set of events is defined as
follows:
for events a and b of processes Pi and Pj
respectively,
a => b iff either Ci(a) < Cj(b), or
Ci(a) = Cj(b) and Pi comes before Pj in the
ordering.
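
In code, this total order is just a sort on the pair (timestamp, process name). The event tuples below are made-up sample data.

```python
def total_order(events):
    """Sort (event, timestamp, process) tuples by timestamp, then process name."""
    return sorted(events, key=lambda e: (e[1], e[2]))


events = [
    ("b", 2, "P1"),
    ("a", 1, "P2"),
    ("c", 2, "P0"),
    ("d", 1, "P1"),
]
ordered_names = [name for name, _, _ in total_order(events)]
```

Events with equal timestamps are concurrent as far as the clocks can tell, so the process-name tie-break is arbitrary but makes every process compute the same order.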

35
Execution Model and System State

At a logical level, a distributed system can be
modeled as a directed graph with nodes
representing processes and edges representing
channels between processes.
The state of a channel in this model is the
sequence of msgs that are still in the channel.
A process can be considered as consisting of a
set of states, an initial state, and a sequence
of events (or actions).
The state of a process is an assignment of a
value to each of its variables, along with the
specification of its control point which
specifies the event executed last.
Each event or action of a process is assumed to
be atomic.
An event e of a process p can change the state of
p and at most one channel c that is incident on p.

Each event has an enabling condition, which is a
condition on the state of the process and the
channel attached to it.
An event e can occur only if this enabling
condition is true, e.g. when the program counter
has a specific value.
The global state or the system state of a DS
consists of states of each of the processes in
the system and the states of the channels in the
system.
The initial global state is one in which each
process is in its initial state and all channels
are empty.
An event e can change the system state S by
changing the state of process p, iff the enabling
condition for e is true in S.

A function ready(S) is defined on a global state
S as a set of events for which enabling condition
is satisfied in S.
The events in ready(S) can belong to different
processes; however, only one of these events will
take place.
Which of the events in ready(S) will occur cannot
be predicted deterministically.
We define another function next, where next(S, e)
is the global state immediately following the
occurrence of the event e in the global state S.
The computation of a DS can be defined as a
sequence of events.
Let the initial state of the system be S0 and let
seq = (ei, 0 <= i <= n) be a sequence of events.
Suppose that the system state when ei occurs is
Si. The sequence of events seq is a computation
of the system if the following conditions are
satisfied:

The event ei belongs to ready(Si), 0 <= i <= n.
S(i+1) = next(Si, ei), 0 <= i <= n.
Example: a concurrent shared-memory program.
a: x := 0
b: cobegin
c:   y := 0
d:   cobegin
e:     y := 2*y
     ||
f:     y := 3
     coend
   ||
g:   while y = 0 do
h:     x := x + 1
   coend
j: x := 2*y

39
An execution sequence for the program, with each
state written as (x, y) followed by its ready set
and the event executed next:
S0 (2,7) {a}      (a) →
S1 (0,7) {c,g}    (c) →
S2 (0,0) {e,f,g}  (g) →
S3 (0,0) {e,f,h}  (h) →
S4 (1,0) {e,f,g}  (f) →
S5 (1,3) {e,g}    (e) →
S6 (1,6) {g}      (g) →
S7 (1,6) {j}      (j) →
S8 (12,6)
The possible states of the system can also be
represented as a tree with the initial state as
its root, where each event in the ready set of a
node produces a child of that node.
Such a tree is called a reachability tree: each
node represents a state, and the number of
children of a node equals the cardinality of the
ready set at that state.
Each path from the initial node to a leaf node
shows one possible execution sequence of the
system. The states in the path are called valid
or consistent states.
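
The execution sequence above can be replayed with a small sketch of the next function for this particular program. The encoding is an assumption made for illustration: a global state is the triple (x, y, ready set), there are no channels, and the cobegin labels b and d are not treated as events, matching the sequence on the slide.

```python
def next_state(state, e):
    """Return the global state that follows event e in the given state."""
    x, y, ready = state
    if e not in ready:
        raise ValueError(f"event {e!r} is not in ready(S)")
    ready = set(ready) - {e}
    if e == "a":
        x = 0
        ready |= {"c", "g"}     # both branches of the outer cobegin start
    elif e == "c":
        y = 0
        ready |= {"e", "f"}     # inner cobegin starts
    elif e == "e":
        y = 2 * y
    elif e == "f":
        y = 3
    elif e == "g":
        if y == 0:
            ready.add("h")      # loop test succeeded, body is enabled
    elif e == "h":
        x = x + 1
        ready.add("g")          # back to the loop test
    elif e == "j":
        x = 2 * y
    if not ready and e != "j":
        ready = {"j"}           # outer coend reached, j becomes enabled
    return (x, y, frozenset(ready))


def run(initial, seq):
    """Apply the events of seq in order and return every visited state."""
    states = [initial]
    for e in seq:
        states.append(next_state(states[-1], e))
    return states


s0 = (2, 7, frozenset({"a"}))
states = run(s0, ["a", "c", "g", "h", "f", "e", "g", "j"])
other = run(s0, ["a", "c", "e", "f", "g", "j"])
```

The second run follows a different path of the reachability tree and ends with different values of x and y, which is the non-determinism the model is meant to capture.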
40
Reachability Tree
[Figure: reachability tree rooted at S0; edges are
labeled with events and lead to the states S1,
S2, S3, S4, ...]
This model is also called the interleaving model.
41
For details, please refer to Fault Tolerance in
Distributed Systems, by Pankaj Jalote.
THANK YOU
