Title: Distributed Systems
1 Distributed Systems
- Distributed System models
  - Physical Networks
  - Logical Models
  - Different Failure Models
- Communication constructs (semantics of distributed programs)
- Ordering of events and Execution Semantics
2 System Model
- Two ways of viewing a DS
  - As defined by the physical components of the system: the physical model
  - As defined from the point of view of processing or computation: the logical model
3 The goal of fault tolerance in DS
- Is to ensure that some property or service in the logical model is preserved despite the failure of some component(s) in the physical system.
4 Physical Network of a DS
- Consists of many computers, called nodes, that are typically
  - Autonomous
  - Geographically separated
  - Communicating through communication networks
5 Distributed vs. Parallel Systems
- Distributed systems
  - Nodes loosely coupled
  - Essentially no shared memory
  - Private clocks for nodes
- Parallel systems
  - Nodes closely coupled
  - May have shared memory between nodes
  - May have a single global clock for many/all nodes
6 Point to Point Physical Network
- Topologies shown: fully connected, star, tree
- Communication protocols used: TCP/IP, OSI etc.
7 Bus Topology
[Figure: nodes attached on both sides of a common bus]
- Communication protocol used: CSMA/CD
8 Logical Model
- A distributed application consists of a set of concurrently executing processes that cooperate with each other to perform some task.
- A process is the execution of a sequential program, which is a list of instructions.
9 Concurrent Processes
- Can be classified in three categories
  - Independent processes: the sets of objects they access are disjoint.
  - Competing processes: share resources, but there is no information exchange between them.
  - Cooperating processes: exchange information either by using shared data or by message passing.
10 A few logical level assumptions
- Finite progress assumption: since no assumptions about the relative speeds of processes can be made, it is assumed that they all have positive rates of execution.
- The underlying network is treated as fully connected; topology is not considered.
- At the logical level the system is made of processes and channels between them.
- Channels are assumed to have infinite buffers and to be error-free.
- Channels deliver messages in the order in which they are sent (NB: on a particular channel).
11
- Assumptions about time bounds on the performance of the system are also made.
- A system is said to be synchronous if, whenever the system is working correctly, it performs its intended function within a finite and known time bound; otherwise it is said to be asynchronous.
- A synchronous channel is one in which the maximum message delay is known and bounded.
- A synchronous processor is one in which the time to execute a sequence of instructions is finite and bounded.
- Advantage of synchronous systems: failure of components can be deduced from the lack of response.
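The advantage noted above can be sketched in Python: when a bound on the response time is known, the absence of a reply within that bound is taken as evidence of failure. The function name `responds_within`, the two stand-in components and the bound values are illustrative assumptions, not part of the original slides.

```python
import queue
import threading

def responds_within(component, bound_s):
    """Ask `component` for a reply and wait at most `bound_s` seconds.

    Returns False when no reply arrives within the known bound, which in a
    synchronous system lets the caller deduce that the component has failed.
    """
    replies = queue.Queue()
    threading.Thread(target=component, args=(replies,), daemon=True).start()
    try:
        replies.get(timeout=bound_s)
        return True
    except queue.Empty:
        return False

def healthy(replies):
    replies.put("pong")        # replies immediately

def crashed(replies):
    pass                       # hypothetical crashed component: never replies

healthy_ok = responds_within(healthy, 0.5)
crashed_ok = responds_within(crashed, 0.2)
print(healthy_ok, crashed_ok)
```

In an asynchronous system the same timeout could not distinguish a crashed component from a merely slow one, which is why the deduction relies on the known bound.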
12 Failures and Fault Classification
- Crash fault: causes a component to halt or to lose its internal state; the component never undergoes any incorrect state transition when it fails.
- Omission fault: causes a component to not respond to some inputs.
- Timing/performance fault: causes a component to respond either too early or too late.
- Byzantine fault: causes the component to behave in a totally arbitrary manner during failure.
- Incorrect computation fault: the component produces incorrect outputs.
13 Fault Hierarchy
[Figure: nested fault classes, from outermost to innermost: byzantine, timing, omission, crash]
- Incorrect computation fault is a subset of byzantine but different from the other faults.
14 Assumptions about fault types
- For a processor: crash fault or byzantine fault
- For a communication network: all the different types of faults
- For a clock: timing fault, byzantine fault and sometimes omission fault
- For storage media: crash, timing, omission and incorrect computation faults
- For software components: most of the above-defined faults, but the most important is the incorrect computation fault
15 Interprocess Communication
- Synchronization and communication are both achieved by message passing primitives.
- In shared memory systems, primitives like semaphores, conditional critical regions and monitors are used.
16 Process Creation
- Processes are created in a system by the use of some operating system-provided system call.
- At the language level, this is done by using some language primitives, e.g. fork and join, and the cobegin-coend statement.
17 Fork and Join Primitive
- Program P1
- ..
- fork P2
- .
- join P2
- .
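The fork and join pattern above can be approximated with Python threads: starting a thread plays the role of fork, and `join` makes the parent wait for the child to terminate. This is only a sketch of the idea; the function names `p1` and `p2` and the `log` list are illustrative.

```python
import threading

log = []

def p2():
    log.append("P2 runs")

def p1():
    log.append("P1 before fork")
    child = threading.Thread(target=p2)
    child.start()                  # fork P2: P2 now runs concurrently with P1
    log.append("P1 continues")     # may be logged before or after "P2 runs"
    child.join()                   # join P2: P1 waits until P2 has terminated
    log.append("P1 after join")

p1()
print(log)
```

Only the first and last entries of `log` have a fixed position; the two middle entries can appear in either order, which is exactly the concurrency that fork introduces.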
18 Cobegin-Coend Primitive
- cobegin S1 S2 S3 .. Sn coend
- The above statement causes n different processes to be created, each executing a different statement Si; it ends with the termination of all the Si's.
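A minimal sketch of the cobegin-coend behaviour, using Python threads in place of processes: every statement gets its own thread, and the construct returns only after all of them have terminated. The helper name `cobegin` and the three sample statements are assumptions made for illustration.

```python
import threading

def cobegin(*statements):
    """Run each statement (a zero-argument callable) concurrently and
    return only when all of them have terminated (the coend)."""
    threads = [threading.Thread(target=s) for s in statements]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

finished = []
cobegin(
    lambda: finished.append("S1"),
    lambda: finished.append("S2"),
    lambda: finished.append("S3"),
)
print(sorted(finished))
```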
19 Asynchronous Message Passing
- In a DS without shared memory, message passing is used both for communication and synchronization.
- A message is sent by a process by executing a send command: send(data, destination)
- Receiving of data is done with a receive command: receive(data, source), or receive(message) in client-server interaction.
20 Assumptions
- Message passing requires some buffer between sender and receiver.
- In asynchronous message passing an infinite buffer to store messages is assumed, so the sender can go on sending messages without blocking; the receiver, however, is blocking.
- In reality, though, buffers are of finite size, so the sender may also have to block; this is called buffered message passing.
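The buffer assumptions above can be illustrated with Python's standard `queue.Queue`, used here as a stand-in for a channel (an assumption of this sketch; the slides do not prescribe any implementation). An unbounded queue never blocks the sender, a receive on an empty queue blocks, and a bounded queue blocks the sender once it is full. Short timeouts are used so that the blocking cases can be observed without hanging.

```python
import queue

# Asynchronous message passing: unbounded buffer, the sender never blocks.
unbounded = queue.Queue()              # maxsize=0 means no size limit
for i in range(1000):
    unbounded.put(i)
sent_without_blocking = unbounded.qsize()

# The receiver blocks when no message is available.
empty = queue.Queue()
try:
    empty.get(timeout=0.05)
    receiver_blocked = False
except queue.Empty:
    receiver_blocked = True

# Buffered message passing: finite buffer, the sender blocks when it is full.
bounded = queue.Queue(maxsize=2)
bounded.put("m1")
bounded.put("m2")
try:
    bounded.put("m3", timeout=0.05)
    sender_blocked = False
except queue.Full:
    sender_blocked = True

print(sent_without_blocking, receiver_blocked, sender_blocked)
```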
21
- NB: Asynchronous and synchronous message passing is different from asynchronous and synchronous DS. The former refers to communication primitives and the size of the buffer between sender and receiver, while the latter deals with bounds on message delays.
- In a synchronous DS, both asynchronous and synchronous message passing can be supported.
22 Synchronous msg passing: CSP
- Has no buffering.
- Has the advantage that at each communication command it is easier to make assertions about processes.
- Has been employed in Communicating Sequential Processes (CSP), a notation proposed for specifying distributed programs.
- CSP uses Guarded Command Language.
23 Guarded Commands
- A GC is a statement list that is prefixed by a Boolean expression called a guard: guard → statement list.
- The statement list is eligible for execution only if the guard evaluates to true, i.e. it succeeds.
- Evaluation of a guard is assumed to have no side-effects, i.e. it does not alter the state of the program in any manner.
- The alternative construct is formed by using a set of guarded commands as follows:
24
[ G1 → S1 □ G2 → S2 □ ... □ Gn → Sn ]
- The execution of this alternative construct aborts if all the guards evaluate to false.
- If any guard is true, the corresponding statement list is eligible for execution.
- In case multiple guards evaluate to true, the statement list to be executed is selected non-deterministically.
- The repetitive construct is similar but with a * prefix.
- GC notation allows non-determinism within a program.
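The alternative construct can be sketched in Python as a function over (guard, statement) pairs: it aborts when no guard holds and otherwise picks one eligible statement at random. The helper names `alternative` and `maximum` are illustrative, and the random choice is only one way to model non-deterministic selection.

```python
import random

def alternative(guarded_commands, rng=random):
    """Execute one statement whose guard is true, chosen non-deterministically.

    `guarded_commands` is a list of (guard, statement) pairs of zero-argument
    callables. Guards must be side-effect free, as the notation assumes.
    """
    eligible = [stmt for guard, stmt in guarded_commands if guard()]
    if not eligible:
        raise RuntimeError("alternative construct aborts: every guard is false")
    return rng.choice(eligible)()

def maximum(x, y):
    # [ x >= y -> x  []  y >= x -> y ]: both guards hold when x == y,
    # and either choice then yields the same result.
    return alternative([
        (lambda: x >= y, lambda: x),
        (lambda: y >= x, lambda: y),
    ])

print(maximum(3, 5))
print(maximum(4, 4))
```

The `maximum` example shows why non-determinism is harmless when overlapping guards lead to equivalent results.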
25 Communicating Sequential Processes
- Is a programming notation for expressing concurrent programs.
- Employs synchronous message passing.
- Uses guarded commands to allow selective communication.
- A CSP program may consist of many concurrent processes, e.g. a process Pi sends a message, msg, to a process Pj by an output command of the form Pj!msg.
- Pj receives a message from Pi by an input command of the form Pi?m.
- For a process Pj the overall code is of the form: Pj :: Initialize; *[ G1 → C1 □ G2 → C2 □ ... □ Gn → Cn ]
26 Remote Procedure Call
- A higher level primitive to support client-server interaction.
- An extension of the procedure call mechanism available in most programming languages.
- The service to be provided by the server is treated as a procedure that resides on the machine on which the server runs. The client process that needs that service makes calls to this procedure, and RPC takes care of the underlying communication.
- A call statement is of the form: call service(value_args, result_args)
27
- The states of the server and the client may both change as a result of a call.
- However, idempotent remote procedures leave the server in the same state whether a call from the client is executed once or repeated several times.
- Idempotent servers simplify the task of fault tolerance.
- Two basic approaches to specifying the server side in RPC:
  - The remote procedure is just like a sequential procedure, i.e. a single process executes the procedure as calls are made.
  - A new process is created every time a call is made. These processes can be concurrent.
28 Semantics of the RPC in failure conditions
- The classification for semantics of remote calls:
  - At least once: the remote procedure has been executed one or more times if the invocation terminates normally. If it terminates abnormally, nothing can be said about the number of times the remote procedure executed.
  - Exactly once: the remote procedure has executed exactly once if the invocation terminates normally; if not, then it can be asserted that the remote procedure did not execute more than once.
  - At most once: same as exactly once if the invocation terminates normally; otherwise it is guaranteed that the remote procedure has been executed completely once or has not been executed at all.
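As a rough illustration of at-least-once semantics, the sketch below simulates a client that retries a call until a reply gets through. The `Server` class, its two procedures and the `lost_replies` parameter are hypothetical stand-ins; no real RPC library is involved, and lost replies are simulated by simply discarding results.

```python
class Server:
    def __init__(self):
        self.balance = 0
        self.executions = 0

    def set_balance(self, value):      # idempotent: repeating it changes nothing
        self.executions += 1
        self.balance = value
        return self.balance

    def deposit(self, amount):         # not idempotent: every execution adds again
        self.executions += 1
        self.balance += amount
        return self.balance

def call_at_least_once(procedure, arg, lost_replies):
    """Re-issue the call until a reply arrives.

    The first `lost_replies` replies are treated as lost in transit, so the
    procedure runs lost_replies + 1 times before the invocation terminates.
    """
    attempts = 0
    while True:
        result = procedure(arg)
        attempts += 1
        if attempts > lost_replies:
            return result

s1 = Server()
call_at_least_once(s1.set_balance, 100, lost_replies=2)
s2 = Server()
call_at_least_once(s2.deposit, 100, lost_replies=2)
print(s1.executions, s1.balance)   # 3 executions, balance 100
print(s2.executions, s2.balance)   # 3 executions, balance 300
```

Both procedures run three times, but only the idempotent one leaves the server in the state a single execution would have produced, which is why idempotent servers make retries safe.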
29
- Orphans: unwanted executions of remote procedures caused by communication or processor failure, e.g. a client that crashes after issuing a call may restart on recovery and reissue the call even though the previous call is still being executed by the server.
- The presence of orphans can violate the semantics of RPC and lead to inconsistency.
- Call ordering: this property requires that a sequence of invocations generated by a given client result in computations performed by the server in the same order. It is automatically satisfied if there are no failures. It is not a strict requirement in the case of idempotent servers.
30 Object-Action Model
- Another high-level communication paradigm.
- In this paradigm a system consists of many objects, each consisting of some data and well-defined methods (operations) on that data.
- The encapsulated data can only be accessed through the methods defined for it.
- The objects may reside on different nodes.
- A process sends a message to the object concerned, which performs an action by executing a method and returns the result to the process.
31
- Nested remote procedure calls may be created.
- Methods on objects may execute in parallel.
- Concurrent calls may be made to the same method or to the same object.
- Becoming popular since it supports fault tolerance by possible replication of objects.
32 Ordering of Events
- There is no single global clock for defining the happened-before relationship between events of different processors.
- Partial ordering: the relation → on a set of events in a distributed system is the smallest relation satisfying the following three conditions:
  - If a and b are events performed by the same process and a is performed before b, then a → b.
  - If a is the sending of a message by one process and b is the receiving of the same message by another process, then a → b.
  - If a → b and b → c, then a → c.
- Two events a and b are said to be concurrent if neither a → b nor b → a.
33 Logical Clocks
- The logical clock Ci, for a process Pi, is a function which assigns a value Ci(a) to an event a of the process Pi.
- The system of logical clocks is considered to be correct if it is consistent with the relation →, i.e. for any events a, b: if a → b then C(a) < C(b).
- When a msg m is sent from process Pi, the timestamp of the sending event is included in m and can be retrieved by the receiver.
- Let Tm be the timestamp of the message m. There are two conditions that a system of logical clocks should satisfy in order to be correct:
  - Each Pi increments Ci between any two successive events.
  - Upon receiving a msg m, Pj sets Cj greater than or equal to its present value and greater than Tm.
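The two conditions above can be sketched as a small Python class. The class name `LogicalClock` and its method names are illustrative; setting the clock to `max(current, Tm) + 1` on receipt is one common way to satisfy the second condition, not the only one.

```python
class LogicalClock:
    """Logical clock for one process."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Condition 1: increment the clock between any two successive events.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; its timestamp Tm travels with the message.
        return self.local_event()

    def receive(self, tm):
        # Condition 2: the new value is >= the present value and > Tm.
        self.time = max(self.time, tm) + 1
        return self.time

pi, pj = LogicalClock(), LogicalClock()
a = pi.local_event()      # event a on Pi
tm = pi.send()            # Pi sends m, timestamped Tm
b = pj.receive(tm)        # Pj receives m
print(a, tm, b)
```

Since a → send(m) → receive(m), the clock values must increase along this chain, which the printed values confirm.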
34 Total Ordering of Events
- Order the events by the timestamps assigned to them by the logical clock system. Processes can be ordered in the lexicographic order of their names.
- A relation => on the set of events is defined as follows: for events a and b of processes Pi and Pj respectively, a => b iff either Ci(a) < Cj(b), or Ci(a) = Cj(b) and Pi comes before Pj in the ordering.
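A minimal sketch of this total order in Python, assuming each event is represented as a (timestamp, process name) pair; the function name `precedes` and the sample events are illustrative.

```python
def precedes(a, b):
    """True iff event a comes before event b in the total order.

    Events are (timestamp, process_name) pairs; equal timestamps are
    resolved by the lexicographic order of the process names.
    """
    (ta, pa), (tb, pb) = a, b
    return ta < tb or (ta == tb and pa < pb)

events = [(2, "P2"), (1, "P3"), (2, "P1")]
# Python compares tuples element by element, which matches `precedes`.
total_order = sorted(events)
print(total_order)
```

Because process names are distinct, no two events on different processes compare as equal, so the order is total.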
35 Execution Model and System State
- At a logical level, a distributed system can be modeled as a directed graph with nodes representing processes and edges representing channels between processes.
- The state of a channel in this model is the sequence of msgs that are still in the channel.
- A process can be considered as consisting of a set of states, an initial state, and a sequence of events (or actions).
- The state of a process is an assignment of a value to each of its variables, along with the specification of its control point, which specifies the event executed last.
- Each event or action of a process is assumed to be atomic.
- An event e of a process p can change the state of p and of at most one channel c that is incident on p.
36
- Each event has an enabling condition, which is a condition on the state of the process and the channel attached to it.
- An event e can occur only if this enabling condition is true, e.g. when the program counter has a specific value.
- The global state or the system state of a DS consists of the states of each of the processes in the system and the states of the channels in the system.
- The initial global state is one in which each process is in its initial state and all channels are empty.
- An event e can change the system state S by changing the state of process p, iff the enabling condition for e is true in S.
37
- A function ready(S) is defined on a global state S as the set of events whose enabling condition is satisfied in S.
- The events in ready(S) can belong to different processes; however, only one of these events will take place.
- Which of the events in ready(S) will occur cannot be predicted deterministically.
- We define another function next, where next(S, e) is the global state immediately following the occurrence of the event e in the global state S.
- The computation of a DS can be defined as a sequence of events.
- Let the initial state of the system be S0 and let seq = (ei, 0 <= i <= n) be a sequence of events.
- Suppose that the system state when ei occurs is Si. The sequence of events seq is a computation of the system if the following conditions are satisfied:
38
- The event ei belongs to ready(Si), 0 <= i <= n.
- S(i+1) = next(Si, ei), 0 <= i <= n.
- Example: a concurrent shared memory program.
    a: x := 0
    b: cobegin
    c:     y := 0
    d:     cobegin
    e:         y := 2*y
             //
    f:         y := y + 3
           coend
         //
    g:     while y = 0 do
    h:         x := x + 1
       coend
    j: x := 2*y
39 An execution sequence for the program
Each state is written as (x, y), followed by its ready set and the event chosen.
- S0 = (2,7), ready = {a}, event a
- S1 = (0,7), ready = {c,g}, event c
- S2 = (0,0), ready = {e,f,g}, event g
- S3 = (0,0), ready = {e,f,h}, event h
- S4 = (1,0), ready = {e,f,g}, event f
- S5 = (1,3), ready = {e,g}, event e
- S6 = (1,6), ready = {g}, event g
- S7 = (1,6), ready = {j}, event j
- S8 = (12,6)
- The possible states of the system can also be represented as a tree with the initial state as its root, where each event in the ready set of a state produces a child of that node.
- Such a tree is called a reachability tree: each node represents a state, and the number of children of a node equals the cardinality of the ready set at that state.
- Each path from the initial node to a leaf node shows one possible execution sequence of the system. The states on the path are called valid or consistent states.
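The ready and next functions can be made concrete for the example program with a short Python sketch that replays the execution sequence above. It rests on several assumptions: states are (x, y) pairs plus control information, the inner and outer cobegin each have two branches, and the names `State`, `ready`, `next_state` and `run` are chosen here for illustration.

```python
from typing import NamedTuple

class State(NamedTuple):
    x: int
    y: int
    done: frozenset    # labels of one-shot statements already executed
    loop: str          # control point of the while loop: "g", "h" or "exit"

def ready(s):
    """Set of events whose enabling condition holds in state s."""
    if "a" not in s.done:
        return {"a"}
    events = set()
    if "c" not in s.done:
        events.add("c")
    else:
        events |= {"e", "f"} - s.done      # inner cobegin: e // f
    if s.loop in ("g", "h"):
        events.add(s.loop)
    if not events and "j" not in s.done:
        events.add("j")                    # outer coend reached
    return events

def next_state(s, e):
    """Global state immediately following event e in state s."""
    assert e in ready(s)
    x, y, done, loop = s
    if e == "a":   x = 0
    elif e == "c": y = 0
    elif e == "e": y = 2 * y
    elif e == "f": y = y + 3
    elif e == "g": loop = "h" if y == 0 else "exit"
    elif e == "h": x, loop = x + 1, "g"
    elif e == "j": x = 2 * y
    if e not in ("g", "h"):
        done = done | {e}
    return State(x, y, done, loop)

def run(s, events):
    trace = []
    for e in events:
        trace.append(((s.x, s.y), sorted(ready(s)), e))
        s = next_state(s, e)
    return s, trace

final, trace = run(State(2, 7, frozenset(), "g"), "acghfegj")
for step in trace:
    print(step)
print((final.x, final.y))
```

Replaying the events a, c, g, h, f, e, g, j from S0 = (2,7) reproduces the ready sets listed above and ends in (12,6). Choosing a different event from a ready set at any step would follow a different path of the reachability tree.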
40 Reachability Tree
[Figure: reachability tree for the example program, rooted at S0, with states S1 to S4 as nodes and edges labeled by events]
- This model is also called the interleaving model.
41
- For details, please refer to "Fault Tolerance in Distributed Systems" by Pankaj Jalote.
THANK YOU