1
Leslie Lamport
  • A distributed system is one in which the failure
    of a machine you have never heard of can cause
    your own machine to become unusable
  • The issue is dependency on critical components
  • The notion is that the state and health of the
    system at site A are linked to the state and
    health at site B

2
Component Architectures Make it Worse
  • Modern systems are structured using
    object-oriented component interfaces
  • CORBA, COM (or DCOM), Jini
  • XML
  • In these systems, we create a web of dependencies
    between components
  • Any faulty component could cripple the system!

3
Reminder: Networks versus Distributed Systems
  • Network focus is on connectivity, but components
    are logically independent: a program fetches a
    file and operates on it, but the server is
    stateless and forgets the interaction
  • Less sophisticated but more robust?
  • Distributed systems focus is on joint behavior of
    a set of logically related components. Can talk
    about the system as an entity.
  • But needs fancier failure handling!

4
Component Systems?
  • Includes CORBA and Web Services
  • These are distributed in the sense of our
    definition
  • Often, they share state between components
  • If a component fails, replacing it with a new
    version may be hard
  • Replicating the state of a component is an
    appealing option
  • Deceptively appealing, as we'll see

5
Example
  • The Web: components are individually reliable
  • But the Web can fail by returning inconsistent or
    stale data, it can freeze up or claim that a
    server is not responding (even if both browser
    and server are operational), and it can be so
    slow that we consider it faulty even though it is
    working
  • For stateful systems (the Web is stateless), this
    issue extends to the joint behavior of sets of
    programs

6
Example
  • The Ariane rocket is designed in a modular
    fashion
  • Guidance system
  • Flight telemetry
  • Rocket engine control
  • Etc.
  • When some rocket components were upgraded in a
    new model, working modules failed because hidden
    assumptions were invalidated.

7
Ariane Rocket
[Diagram: the modules Telemetry, Attitude Control,
Guidance, Altitude, Accelerometer, and Thrust Control]
8
Ariane Rocket
[Diagram: the same modules; an overflow occurs in the
Altitude component]
9
Ariane Rocket
[Diagram: the same modules after the overflow]
10
Insights?
  • Correctness depends very much on the environment
  • A component that is correct in setting A may be
    incorrect in setting B
  • Components make hidden assumptions
  • Perceived reliability is in part a matter of
    experience and comfort with a technology base and
    its limitations!

11
Detecting failure
  • Not always necessary: there are ways to overcome
    failures without explicitly detecting them
  • But the situation is much easier with detectable
    faults
  • Usual approach: a process does something to say
    "I am still alive"
  • Absence of proof of liveness is taken as evidence
    of a failure

12
Example: pinging with timeouts
  • Programs P and B are the primary and backup of a
    service
  • Programs X, Y, Z are clients of the service
  • All ping each other for liveness
  • If a process doesn't respond to a few pings,
    consider it faulty.
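
A minimal sketch of this style of failure detector,
assuming hypothetical peer addresses and a simple UDP
ping/ack exchange (a real detector would also match
each ack to the peer that sent it):

    import socket
    import time

    PEERS = {"P": ("10.0.0.1", 9000),    # hypothetical addresses
             "B": ("10.0.0.2", 9000)}
    MAX_MISSES = 3        # unanswered pings before declaring a fault
    PING_INTERVAL = 1.0   # seconds between rounds of pings
    ACK_TIMEOUT = 0.5     # how long to wait for each ack

    def monitor(peers):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(ACK_TIMEOUT)
        misses = {name: 0 for name in peers}
        while True:
            for name, addr in peers.items():
                sock.sendto(b"ping", addr)
                try:
                    sock.recvfrom(64)       # any reply counts as liveness
                    misses[name] = 0
                except socket.timeout:
                    misses[name] += 1
                    if misses[name] >= MAX_MISSES:
                        print(name, "considered faulty")  # may be wrong!
            time.sleep(PING_INTERVAL)

Note that such a detector can only suspect, never know:
a slow network produces the same evidence as a crash.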

13
Component failure detection
  • An even harder problem!
  • Now we need to worry:
  • About programs that fail
  • But also about modules that fail
  • Unclear how to do this, or even how to tell
  • Recall that RPC makes component use rather
    transparent

14
Vogels, the Failure Investigator
  • Argues that we would not consider someone to have
    died just because they don't answer the phone
  • Approach is to consult other data sources
  • The operating system where the process runs
  • Information about the status of network routing
    nodes
  • Can augment with application-specific solutions
  • Won't detect a program that looks healthy but is
    actually not operating correctly

15
Further options: Hot button
  • Usually implemented using shared memory
  • Monitored program must periodically update a
    counter in a shared memory region. It is designed
    to do this at some frequency, e.g. 10 times per
    second.
  • Monitoring program polls the counter, perhaps 5
    times per second. If the counter stops changing,
    it kills the faulty process and notifies others.
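
A minimal sketch of the hot-button scheme using
Python's multiprocessing shared memory; the update and
polling rates follow the slide, everything else
(including the POSIX-style kill and the simulated hang)
is an illustrative assumption:

    import os
    import signal
    import struct
    import time
    from multiprocessing import Process, shared_memory

    def monitored(shm_name):
        shm = shared_memory.SharedMemory(name=shm_name)
        for count in range(1, 31):                 # ~10 updates per second
            shm.buf[:8] = struct.pack("Q", count)
            time.sleep(0.1)
        time.sleep(60)   # simulate a hang: alive, but no longer updating

    def watchdog(shm_name, pid):
        shm = shared_memory.SharedMemory(name=shm_name)
        last = None
        while True:
            time.sleep(0.2)                        # poll ~5 times per second
            now = struct.unpack("Q", bytes(shm.buf[:8]))[0]
            if now == last:                        # counter stopped changing
                os.kill(pid, signal.SIGKILL)       # kill the faulty process
                print("process killed; would notify others here")
                return
            last = now

    if __name__ == "__main__":
        shm = shared_memory.SharedMemory(create=True, size=8)
        p = Process(target=monitored, args=(shm.name,))
        p.start()
        watchdog(shm.name, p.pid)
        p.join()
        shm.close()
        shm.unlink()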

16
Friedman's approach
  • Used in a telecommunications co-processor mockup
  • Can't wait for failures to be sensed, so his
    protocol reissues requests as soon as the reply
    seems late
  • The issue of detecting failure becomes a
    background task; it must be done soon enough that
    overhead isn't excessive and realtime response
    isn't impacted
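
A minimal sketch of reissuing a request once the reply
seems late; the timings and the simulated RPC are
illustrative assumptions, not Friedman's actual
protocol:

    import concurrent.futures
    import time

    def issue_request(replica, payload):
        # stand-in for a real RPC; simulated with a fixed delay
        time.sleep(replica["delay"])
        return "reply from " + replica["name"]

    def hedged_call(replicas, payload, hedge_after=0.010):
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futures = [pool.submit(issue_request, replicas[0], payload)]
            done, _ = concurrent.futures.wait(futures, timeout=hedge_after)
            if not done:   # reply is late: reissue to the backup at once
                futures.append(
                    pool.submit(issue_request, replicas[1], payload))
            done, _ = concurrent.futures.wait(
                futures, return_when=concurrent.futures.FIRST_COMPLETED)
            return done.pop().result()   # first reply wins

    replicas = [{"name": "primary", "delay": 0.050},
                {"name": "backup", "delay": 0.001}]
    print(hedged_call(replicas, b"request"))   # the backup answers first

This is only safe when duplicate execution of a request
is harmless, the idempotence concern raised on the
client-server slides below.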

17
Broad picture?
  • Distributed systems have many components, linked
    by chains of dependencies
  • Failures are inevitable, but hardware failures
    are less and less central to availability
  • Inconsistency of failure detection will introduce
    inconsistency of behavior and could freeze the
    application

18
Suggested solution?
  • Replace critical components with a group of
    components that can each act on behalf of the
    original one
  • Develop a technology by which states can be kept
    consistent and processes in the system can agree
    on the status (operational/failed) of components
  • Separate handling of partitioning from handling
    of isolated component failures if possible

19
Suggested Solution
[Diagram: a program and the module it uses]
20
Suggested Solution
[Diagram: the module is transparently replicated; the
program reaches the replica group via multicast]
21
Replication: the key technology
  • Replicate critical components for availability
  • Replicate critical data, as in coherent caching
  • Replicate critical system state, e.g. control
    information such as "I'll do X while you do Y"
  • In the limit, replication and coordination are
    really the same problem

22
Basic issues with the approach
  • We need to understand client-side software
    architectures better to appreciate the practical
    limitations on replacing a server with a group
  • Sometimes, this simply isn't practical

23
Client-Server issues
  • Suppose that a client observes a failure during a
    request
  • What should it do?

24
Client-server issues
[Diagram: the client's request to the server times out]
25
Client-server issues
  • What should the client do?
  • There is no way to know if the request finished
  • We don't even know if the server really crashed
  • But suppose it genuinely crashed

26
Client-server issues
[Diagram: the request to the primary times out; a
backup is available]
27
Client-server issues
  • What should the client say to the backup?
  • "Please check on the status of my last request"?
  • But perhaps the backup has not yet finished the
    fault-handling protocol
  • Reissue the request?
  • Not all requests are idempotent
  • And what about any cached server state? Will
    it need to be refreshed?
  • Worse still, what if the RPC throws an exception,
    e.g. a demarshalling error?
  • A risk if the failure breaks a stream connection
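
One common way to make reissue safe is to tag each
request with a unique ID and have the server remember
completed requests. A minimal sketch, with the
deduplication table and all names as illustrative
assumptions (the table itself would have to be
replicated to the backup to survive failover):

    import uuid

    class Server:
        def __init__(self):
            self.completed = {}   # request_id -> cached reply

        def handle(self, request_id, op):
            if request_id in self.completed:
                return self.completed[request_id]  # duplicate: replay reply
            reply = op()          # run the (possibly non-idempotent) op
            self.completed[request_id] = reply
            return reply

    server = Server()
    rid = str(uuid.uuid4())       # the same ID is reused on every retry
    r1 = server.handle(rid, lambda: print("side effect!") or "done")
    r2 = server.handle(rid, lambda: print("side effect!") or "done")
    assert r1 == r2               # the op ran once; the retry hit the cache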

28
Client-server issues
  • Client is doing a request that might be disrupted
    by failure
  • Must catch this request
  • Client needs to reconnect
  • Figure out who will take over
  • Wait until it knows about the crash
  • Cached data may no longer be valid
  • Track down outcome of pending requests
  • Meanwhile must synchronize wrt any new requests
    that application issues

29
Client-server issues
  • This argues that we need to make server failure
    transparent to client
  • But in practice, doing so is hard
  • Normally, this requires deterministic servers
  • But not many servers are deterministic
  • Techniques are also very slow

30
Client-server issues
  • Transparency:
  • On the client side, nothing happens
  • On the server side:
  • There may be a connection that the backup needs
    to take over
  • What if the server was in the middle of sending a
    request?
  • How can the backup exactly mimic the actions of
    the primary?

31
Other approaches to consider
  • N-version programming: use more than one
    implementation to overcome software bugs
  • Explicitly uses some form of group architecture
  • We run multiple copies of the component
  • Compare their outputs and pick the majority
  • Could be identical copies, or separate versions
  • In the limit, each is coded by a different team!
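
A minimal sketch of majority voting over n versions;
the three toy "versions" stand in for independently
built implementations:

    from collections import Counter

    def version_a(x):
        return x * x

    def version_b(x):
        return x ** 2

    def version_c(x):
        return -1 if x == 3 else x * x   # a buggy version

    def vote(x, versions=(version_a, version_b, version_c)):
        outputs = [v(x) for v in versions]
        winner, count = Counter(outputs).most_common(1)[0]
        if count <= len(versions) // 2:
            raise RuntimeError("no majority among versions")
        return winner

    print(vote(3))   # 9: the buggy version is outvoted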

32
Other approaches to consider
  • Even with n-version programming, we get limited
    defense against bugs
  • Studies show that Bohrbugs will occur in all
    versions! For Heisenbugs we won't need multiple
    versions: running one version multiple times
    suffices, if the runs see different inputs or a
    different order of inputs

33
Logging and checkpoints
  • Processes make periodic checkpoints, and log the
    messages sent in between
  • Roll back to a consistent set of checkpoints
    after a failure. The technique is simple and
    costs are low.
  • But the method must be used throughout the system
    and is limited to deterministic programs
    (everything in the system must satisfy this
    assumption)
  • Consequence: useful in limited settings.
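
A minimal sketch of checkpointing plus message logging
for a single deterministic process; all names are
illustrative:

    import pickle

    class LoggedProcess:
        def __init__(self):
            self.state = 0
            self.log = []        # messages delivered since last checkpoint
            self.checkpoint = pickle.dumps(self.state)

        def apply(self, msg):
            self.state += msg    # must be deterministic for replay to work

        def deliver(self, msg):
            self.log.append(msg) # log first, then apply
            self.apply(msg)

        def take_checkpoint(self):
            self.checkpoint = pickle.dumps(self.state)
            self.log.clear()     # earlier log entries are now obsolete

        def recover(self):
            self.state = pickle.loads(self.checkpoint)
            for msg in self.log: # replay rebuilds the pre-crash state
                self.apply(msg)

    p = LoggedProcess()
    p.deliver(5)
    p.take_checkpoint()
    p.deliver(7)
    p.recover()
    assert p.state == 12   # checkpoint (5) plus the replayed message (7)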

34
Byzantine approach
  • Assumes that failures are arbitrary and may be
    malicious
  • Uses groups of components that take actions by
    majority consensus only
  • Protocols prove to be costly
  • 3t+1 components needed to overcome t failures
  • Takes a long time to agree on each action
  • Currently employed mostly in security settings

35
Tougher failure models
  • We've focused on crash failures
  • In the synchronous model these look like a
    "farewell, cruel world" message
  • Some call it the failstop model: a faulty
    process is viewed as first saying goodbye, then
    crashing
  • What about tougher kinds of failures?
  • Corrupted messages
  • Processes that don't follow the algorithm
  • Malicious processes out to cause havoc?

36
Here the situation is much harder
  • Generally we need at least 3f+1 processes in a
    system to tolerate f Byzantine failures
  • For example, to tolerate 1 failure we need 4 or
    more processes
  • We also need f+1 rounds
  • Let's see why this happens

37
Byzantine scenario
  • Generals (N of them) surround a city
  • They communicate by courier
  • Each has an opinion: attack or wait
  • In fact, an attack would succeed: the city will
    fall.
  • Waiting would succeed too: the city will
    surrender.
  • But if some attack and some wait, disaster ensues
  • Some Generals (f of them) are traitors; it
    doesn't matter if they attack or wait, but we
    must prevent them from disrupting the battle
  • A traitor can't forge messages from other
    Generals

38
Byzantine scenario
[Diagram: generals surround the city, shouting
conflicting orders: "Attack!", "No, wait!",
"Surrender!", "Wait", "Attack!"]
39
A timeline perspective
  • Suppose that p and q favor attack, r is a
    traitor, and s and t favor waiting; assume that
    in a tie vote, we attack
[Timeline diagram: processes p, q, r, s, t]
40
A timeline perspective
  • After the first round, the collected votes are
  • attack, attack, wait, wait, traitor's-vote
[Timeline diagram: processes p, q, r, s, t exchange
round-1 votes]
41
What can the traitor do?
  • Add a legitimate vote of attack
  • Anyone with 3 votes to attack knows the outcome
  • Add a legitimate vote of wait
  • The vote now favors wait
  • Or send different votes to different folks
  • Or don't send a vote at all to some

42
Outcomes?
  • Traitor simply votes:
  • Either all see a,a,a,w,w
  • Or all see a,a,w,w,w
  • Traitor double-votes:
  • Some see a,a,a,w,w and some a,a,w,w,w
  • Traitor withholds some vote(s):
  • Some see a,a,w,w, perhaps others see
    a,a,a,w,w, and still others see a,a,w,w,w
  • Notice that the traitor can't manipulate the
    votes of loyal Generals!

43
What can we do?
  • Clearly we can't decide yet: some loyal Generals
    might have contradictory data
  • In fact, if anyone has 3 votes to attack, they
    can already decide.
  • Similarly, anyone with 4 votes to wait can decide
  • But with 3 votes to wait a General isn't sure
    (one could be from a traitor)
  • So in round 2, each sends out witness
    messages: "here's what I saw in round 1"
  • "General Smith sent me attack (signed, Smith)"
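
A minimal sketch of how one loyal general might combine
round-1 votes with round-2 witness reports; signed
votes are modeled simply as values the witnesses cannot
forge, and this illustrates the scenario above rather
than a complete agreement protocol:

    from collections import Counter

    def decide(my_view, witness_reports):
        # my_view: {general: vote} as seen directly in round 1
        # witness_reports: {witness: {general: vote}} from round 2
        confirmed = {}
        for g, vote in my_view.items():
            claims = {rep[g] for rep in witness_reports.values() if g in rep}
            claims.add(vote)
            if len(claims) == 1:
                confirmed[g] = vote   # everyone saw the same signed vote
            # else: g provably double-voted, so discard g's vote
        tally = Counter(confirmed.values())
        # ties go to "attack", as assumed on the timeline slides
        return "attack" if tally["attack"] >= tally["wait"] else "wait"

    # p's view: the traitor r told p "attack" but told s and t "wait"
    my_view = {"p": "attack", "q": "attack", "r": "attack",
               "s": "wait", "t": "wait"}
    witness = {"q": {"r": "attack"}, "s": {"r": "wait"}, "t": {"r": "wait"}}
    print(decide(my_view, witness))   # attack: r's conflicting vote dropped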

44
Digital signatures
  • These require a cryptographic system
  • For example, RSA
  • Each player has a secret (private) key K^-1 and a
    public key K.
  • She can publish her public key
  • RSA gives us a single encrypt function:
  • Encrypt(Encrypt(M,K),K^-1) =
    Encrypt(Encrypt(M,K^-1),K) = M
  • Encrypt a hash of the message to sign it
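
A minimal sketch of sign-by-encrypting-a-hash using the
Python "cryptography" package (the package choice is an
assumption; any RSA library would do):

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    private_key = rsa.generate_private_key(public_exponent=65537,
                                           key_size=2048)
    public_key = private_key.public_key()    # safe to publish

    message = b"attack at dawn"
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)

    # sign: hash the message, then encrypt the hash with the private key
    signature = private_key.sign(message, pss, hashes.SHA256())

    # verify: anyone with the public key can check; this raises
    # InvalidSignature if the message or signature was modified
    public_key.verify(signature, message, pss, hashes.SHA256())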

45
With such a system
  • A can send a message to B that only A could have
    sent
  • A just encrypts the body with her private key
  • or one that only B can read
  • A encrypts it with Bs public key
  • Or can sign it as proof she sent it
  • B can recompute the signature and decrypt As
    hashed signature to see if they match
  • These capabilities limit what our traitor can do
    he cant forge or modify a message

46
A timeline perspective
  • In the second round, if the traitor didn't behave
    identically for all Generals, we can weed out his
    faulty votes
[Timeline diagram: processes p, q, r, s, t exchange
round-2 witness messages]
47
A timeline perspective
  • We attack!
[Timeline diagram: p, q, s, and t each conclude
"Attack!!"; the traitor r: "Damn! They're on to me"]
48
Traitor is stymied
  • Our loyal generals can deduce that the decision
    was to attack
  • The traitor can't disrupt this
  • He is either forced to vote legitimately, or is
    caught
  • But the costs were steep!
  • (f+1)n^2 messages!
  • Rounds can also be slow.
  • Early-stopping protocols: min(t+2, f+1) rounds,
    where t is the true number of faults

49
Recent work with Byzantine model
  • Focus is typically on using it to secure
    particularly sensitive, ultra-critical services
  • For example, the certification authority that
    hands out keys in a domain
  • Or a database maintaining top-secret data
  • Researchers have suggested that for such
    purposes, a Byzantine Quorum approach can work
    well
  • They are implementing this in real systems by
    simulating rounds using various tricks

50
Byzantine Quorums
  • Arrange servers into a √n x √n array
  • Idea is that any row or column is a quorum
  • Then use Byzantine Agreement to access that
    quorum, doing a read or a write
  • Separately, Castro and Liskov have tackled a
    related problem, using BA to secure a file server
  • By keeping BA out of the critical path, can avoid
    most of the delay BA normally imposes
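
A minimal sketch of the grid construction, with n
servers arranged so that any full row or column is a
quorum; a row-quorum and a column-quorum always
intersect (real Byzantine quorum systems need larger
intersections to mask faulty servers):

    import math

    def grid_quorums(servers):
        k = math.isqrt(len(servers))
        assert k * k == len(servers), "needs a square number of servers"
        grid = [servers[i * k:(i + 1) * k] for i in range(k)]
        rows = [set(row) for row in grid]
        cols = [{grid[i][j] for i in range(k)} for j in range(k)]
        return rows, cols

    rows, cols = grid_quorums(list(range(16)))    # a 4 x 4 array
    # every row meets every column in exactly one server
    assert all(len(r & c) == 1 for r in rows for c in cols)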

51
Split secrets
  • In fact, BA algorithms are just the tip of a
    broader coding theory iceberg
  • One exciting idea is called a split secret
  • Idea is to spread a secret among n servers so
    that any k can reconstruct the secret, but no
    individual actually has all the bits
  • The protocol lets the client obtain the shares
    without the servers seeing one another's messages
  • The servers keep, but can't read, the secret!
  • Question: in what ways is this better than just
    encrypting a secret?

52
How split secrets work
  • They build on a famous result
  • With k+1 distinct points you can uniquely
    identify a degree-k polynomial
  • i.e., 2 points determine a line
  • 3 points determine a unique quadratic
  • The polynomial is the secret
  • And the servers themselves have the points: the
    shares
  • With coding theory, the shares are made just
    redundant enough to overcome n-k faults
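
A minimal sketch of the polynomial scheme (Shamir-style
secret sharing over a prime field; the particular prime
and parameters are illustrative):

    import random

    P = 2**127 - 1   # a prime larger than any secret we store

    def split(secret, n, k):
        # random polynomial of degree k-1 with the secret as its
        # constant term: any k points determine it, k-1 reveal nothing
        coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
        def f(x):
            return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
        return [(x, f(x)) for x in range(1, n + 1)]

    def reconstruct(shares):
        # Lagrange interpolation at x = 0 recovers the constant term
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            num, den = 1, 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * -xj % P
                    den = den * (xi - xj) % P
            secret = (secret + yi * num * pow(den, -1, P)) % P
        return secret

    shares = split(secret=42, n=5, k=3)     # 5 servers, any 3 suffice
    assert reconstruct(shares[:3]) == 42
    assert reconstruct(shares[2:]) == 42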

53
Byzantine Broadcast (BB)
  • Many classical research results use Byzantine
    Agreement to implement a form of fault-tolerant
    multicast
  • To send a message, I initiate agreement on that
    message
  • We end up agreeing on content and ordering w.r.t.
    other messages
  • Used as a primitive in many published papers

54
Pros and cons of BB
  • On the positive side, the primitive is very
    powerful
  • For example, this is the core of the Castro and
    Liskov technique
  • But on the negative side, BB is slow
  • We'll see ways of doing fault-tolerant multicast
    that run at 150,000 small messages per second
  • BB runs more like 5 or 10 per second
  • The right choice for infrequent, very sensitive
    actions, but wrong if performance matters

55
Take-aways?
  • Fault-tolerance matters in many systems
  • But we need to agree on what a fault is
  • Extreme models lead to high costs!
  • Common to reduce fault-tolerance to some form of
    data or state replication
  • In this case fault-tolerance is often provided by
    some form of broadcast
  • Mechanism for detecting faults is also important
    in many systems.
  • Timeout is common but can behave inconsistently
  • View change notification is used in some
    systems; these typically implement a fault
    agreement protocol.