Consistency and Replication - PowerPoint PPT Presentation

Preliminary version, not final.

1
Consistency and Replication
  • Introduction to Distributed Systems, CS 457/557,
    Fall 2008, Kenneth Chiu

2
  • Topics
  • Consistency models
  • Implementation
  • Replica location and content distribution
  • Maintaining consistency

3
Why Replicate?
  • Reliability
  • If one goes down, the others can stay up.
  • How can it address corrupted data?
  • Compare multiple versions
  • Performance
  • Divide the work
  • Place data closer to where it is used.
  • What is the challenge?
  • Consistency
  • Consider a web cache in your browser.

4
Costs
  • As a scaling technique, may not always be
    applicable.

  • A process P accesses a replica N times per second,
    while the replica is updated M times per second.
  • What if N is much smaller than M?

5
  • What do we do?

[Figure: two "Withdraw 50" requests issued across a WAN.]
  • A dilemma
  • Scalability problems can be alleviated by
    replication and caching.
  • But consistency requires global synchronization!
  • Only real solution is to relax consistency
    requirements.

6
Consistency Models Review
  • Enforcing absolute ordering is too expensive,
    especially with replication and caching.
  • So we need to allow for mis-ordering.
  • We could just do it casually. Tell programmers,
    "Well, you might see things out of order a little
    bit, but only in ways that won't matter."
  • They would say, "What do you mean?"
  • So we need an exact, very precise way of
    specifying the kinds of inconsistencies that the
    application might see.
  • That is the purpose and point of having
    consistency models.

7
Data-Centric Consistency Models
8
Data Stores
  • Consistency is viewed as read/write ops on shared
    data.
  • A consistency model is a contract between the
    processes and the data store.

9
Continuous Consistency
  • Three axes for continuous consistency ranges
  • Deviation in numerical values
  • Deviation in staleness (age) between replicas
  • Deviation with respect to ordering
  • Numerical deviation
  • Can be specified in terms of deviation in values.
  • Can also be specified in terms of the number of
    updates that have been applied, but not yet seen
    by others. Deviation in value is then known as
    the weight.
  • Staleness deviation
  • A replica can be out-of-date, as long as it is
    not too out-of-date
  • For example, a weather report.
  • Ordering deviation
  • Can be specified as the number of ops that may
    need to be rolled back.

10
Consistency Unit
  • Conit: the unit of data over which consistency is
    to be measured. Examples?
  • A single stock
  • A single weather report

11
  • Each replica maintains a vector clock. So it can
    do causally ordered multicast.
  • The notation <t, i> means time t at replica i.
  • The conit consists of data items x and y, both
    initialized to 0. Replica A has committed one
    operation.

12
  • Replica A
  • Ordering deviation is 3, since it has three
    uncommitted operations.
  • Numerical deviation by operations is 1. Weight is
    5.
  • Replica B
  • Ordering deviation is 2.
  • Numerical deviation is 3, with weight of 6.

13
Conit Granularity
  • Why do some hotels have a sink outside?
  • Should conits be coarse-grained (a whole
    database) or fine-grained (just one record in
    it)?
  • In other words, should we try to keep large
    pieces of data consistent or small pieces?

14
  • Assume that two replicas may differ by at most one
    outstanding update.
  • In the top figure, the conit has two data items. In
    the bottom figure, it has only one.
  • Two updates in the top case force propagation; in
    the bottom case they do not.

[Figure: Replica 1 and Replica 2 exchanging updates. Top: the conit covers two data items, and updates are propagated. Bottom: the conit covers one data item, and updates are postponed.]

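
The granularity effect can be sketched in a few lines of Python. This is only an illustration: the rule "propagate as soon as a conit has more than one outstanding update" mirrors the assumption on the slide, and the function and variable names are hypothetical.

```python
from collections import Counter

def count_propagations(updates, conit_of, max_outstanding=1):
    """Count forced propagations for a sequence of updated data items.

    conit_of maps each data item to the conit that contains it.
    A propagation is forced when a conit would exceed max_outstanding
    unpropagated updates (assumed rule, mirroring the slide's setup).
    """
    outstanding = Counter()
    propagations = 0
    for item in updates:
        conit = conit_of[item]
        outstanding[conit] += 1
        if outstanding[conit] > max_outstanding:
            propagations += 1          # push everything in this conit
            outstanding[conit] = 0
    return propagations

updates = ["x", "y"]                   # one update to each data item
coarse = {"x": "c1", "y": "c1"}        # one conit holds both items
fine = {"x": "cx", "y": "cy"}          # one conit per item

print(count_propagations(updates, coarse))  # 1: the second update forces propagation
print(count_propagations(updates, fine))    # 0: both updates can be postponed
```

With the coarse conit the two replicas must synchronize even though the updates touch different items; with fine conits they need not, at the cost of tracking more conits.
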
15
  • So should conits always be as small as possible?
  • Not necessarily: smaller conits mean higher
    overhead.
  • Similar trade-offs occur in real life. For
    example, hotel rooms with the sink outside.

16
Consistent Ordering
  • A more traditional way to model consistency.
  • From architecture and concurrent programming.

17
Notation
  • Processes execute to the right as time
    progresses.
  • The notation W1(x)a means that the process P1
    wrote the value a to the variable x.
  • The notation R2(x)a means that the process P2
    read the value a from the variable x.
  • The subscript is often dropped.

18
Sequential Consistency
  • The result of any execution is the same as if the
    (read and write) operations by all processes on
    the data store were executed in some sequential
    order and the operations of each individual
    process appear in this sequence in the order
    specified by its program.
  • There is some global order.
  • Operations within each process must appear in the
    order given by its program.

Program A: A-OP1, A-OP2, A-OP3
Program B: B-OP1, B-OP2, B-OP3
Which of these are valid?
Global Order 1: A-OP1, A-OP2, A-OP3, B-OP1, B-OP2, B-OP3
Global Order 2: A-OP1, B-OP1, A-OP2, B-OP2, B-OP3, A-OP3
Global Order 3: A-OP1, B-OP1, A-OP2, B-OP3, B-OP2, A-OP3
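
One way to answer the question for the three global orders is to check mechanically whether each program's operations keep their relative order. The Python sketch below does only that; the helper name is hypothetical, and it checks program order alone, not the values that reads return.

```python
def respects_program_order(global_order, programs):
    """Return True if each program's operations appear in global_order
    in the same relative order as in the program itself."""
    position = {op: i for i, op in enumerate(global_order)}
    for ops in programs:
        indices = [position[op] for op in ops]
        if indices != sorted(indices):
            return False
    return True

program_a = ["A-OP1", "A-OP2", "A-OP3"]
program_b = ["B-OP1", "B-OP2", "B-OP3"]
orders = {
    1: ["A-OP1", "A-OP2", "A-OP3", "B-OP1", "B-OP2", "B-OP3"],
    2: ["A-OP1", "B-OP1", "A-OP2", "B-OP2", "B-OP3", "A-OP3"],
    3: ["A-OP1", "B-OP1", "A-OP2", "B-OP3", "B-OP2", "A-OP3"],
}
results = {n: respects_program_order(order, [program_a, program_b])
           for n, order in orders.items()}
print(results)  # {1: True, 2: True, 3: False}
```

Global Order 3 fails because B-OP3 appears before B-OP2, which contradicts Program B.
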
19
  • Which of these is sequentially consistent?

20
  • Consider three concurrently executing processes
    P1, P2, and P3.
  • The data items are x, y, and z.
  • Assume all initialized to 0.
  • Assignment is a write operation.
  • Print is a simultaneous read operation.
  • All operations are indivisible.
  • What are some possible execution interleavings?
  • Which ones are valid?

21
  • The signature is the value of the output of P1,
    P2, and P3, concatenated in that order.
  • Not all signatures are valid.
  • Which of these are valid?

[Figure: the programs of processes P1, P2, and P3.]
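
The set of valid signatures can be found by brute force. The figure with the three programs is missing from this transcript, so the sketch below assumes the usual form of this example: P1 runs `x = 1; print(y, z)`, P2 runs `y = 1; print(x, z)`, and P3 runs `z = 1; print(x, y)`. Those programs and all names in the code are assumptions.

```python
from itertools import permutations

# Assumed programs (the figure is missing from the transcript):
#   P1: x = 1; print(y, z)
#   P2: y = 1; print(x, z)
#   P3: z = 1; print(x, y)
PROGRAMS = {
    1: ("x", ("y", "z")),
    2: ("y", ("x", "z")),
    3: ("z", ("x", "y")),
}

def valid_signatures():
    """Enumerate every interleaving that keeps each process's
    assignment before its print, and collect the signatures."""
    steps = [(p, kind) for p in PROGRAMS for kind in ("write", "print")]
    signatures = set()
    interleavings = 0
    for order in permutations(steps):
        # Program order: a process's write must precede its print.
        if any(order.index((p, "write")) > order.index((p, "print"))
               for p in PROGRAMS):
            continue
        interleavings += 1
        store = {"x": 0, "y": 0, "z": 0}
        output = {}
        for p, kind in order:
            var, reads = PROGRAMS[p]
            if kind == "write":
                store[var] = 1
            else:
                output[p] = "".join(str(store[v]) for v in reads)
        signatures.add(output[1] + output[2] + output[3])
    return interleavings, signatures

count, sigs = valid_signatures()
print(count)                 # 90 interleavings respect program order
print("000000" in sigs)      # False: every process assigns before it prints
print("001001" in sigs)      # False
print("001011" in sigs)      # True: P1, then P2, then P3, each run to completion
```
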
23
Sequential Consistency (3) (From 2006)
24
Sequential Consistency (4) (From 2006)
  • Figure 7-6. Three concurrently executing
    processes.

25
Sequential Consistency (5) (From 2006)
  • Figure 7-7. Four valid execution sequences for
    the processes of Fig. 7-6. The vertical axis is
    time.

26
Causal Consistency
  • For a data store to be considered causally
    consistent, it must obey the following condition:
  • Writes that are potentially causally related must
    be seen by all processes in the same order.
    Concurrent writes may be seen in a different
    order on different machines.
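
One common way to decide whether two writes are "potentially causally related" is to stamp them with vector timestamps and compare the stamps. The Python sketch below assumes that approach; the timestamps and function names are illustrative and not taken from the slides.

```python
def happened_before(a, b):
    """True if the write stamped a causally precedes the write stamped b."""
    return a != b and all(x <= y for x, y in zip(a, b))

def concurrent(a, b):
    """True if neither write causally precedes the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)

w1 = (1, 0)   # P1 writes x
w2 = (1, 1)   # P2 reads P1's value, then writes x: causally after w1
w3 = (0, 1)   # P2 writes without having seen w1: concurrent with w1

print(happened_before(w1, w2))   # True: all processes must see w1 before w2
print(concurrent(w1, w3))        # True: w1 and w3 may be seen in either order
```
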

27
  • Allowed?
  • This sequence is allowed with a
    causally-consistent store, but not with a
    sequentially consistent store.

28
  • Causally consistent?

29
Grouping Operations
  • Do SMP machines also need consistency models?
  • Yes, there are many kinds.
  • Why do we not care about these when writing
    multithreaded (MT) programs?
  • We do, if we are platform dependent and don't use
    locks.
  • How do we handle consistency in MT programs?
  • Use locks.
  • As viewed by an external, data-centric process,
    what do locks do?
  • They turn non-atomic operations into atomic ones
    (functionally).
  • In other words, they group them.
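
As a small illustration of grouping, the Python sketch below uses a lock to turn a read-modify-write into one unit. The account balance and the `withdraw` function are hypothetical, chosen to echo the "Withdraw 50" example.

```python
import threading

balance = 100
lock = threading.Lock()
results = []

def withdraw(amount):
    """Read-modify-write grouped into one atomic unit by the lock."""
    global balance
    with lock:
        current = balance                  # read
        ok = current >= amount
        if ok:
            balance = current - amount     # write
        results.append(ok)

threads = [threading.Thread(target=withdraw, args=(50,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance)               # 0
print(results.count(True))   # 2: only two of the three withdrawals succeed
```

To an outside observer the check and the update inside the `with lock:` block appear as a single operation, so the balance can never go negative.
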

30
Synchronization Variables
  • Operations are grouped via synchronization
    variables (locks).
  • Each synchronization variable protects an
    associated data set.
  • Each kind of synchronization variable has some
    associated properties.

31
Release Consistency
  • Two operations:
  • Acquire: a critical section is about to be
    entered.
  • Release: a critical section is about to be exited.

32
Entry Consistency
  • Entry consistency: necessary criteria for correct
    synchronization.
  • An acquire access of a synchronization variable
    is not allowed to perform until all updates to
    the guarded shared data have been performed with
    respect to that process.
  • Before an exclusive-mode access to a
    synchronization variable by a process is allowed
    to perform with respect to that process, no other
    process may hold the synchronization variable,
    not even in nonexclusive mode.
  • After an exclusive-mode access to a
    synchronization variable has been performed, any
    other process's next nonexclusive-mode access to
    that synchronization variable may not be
    performed until it has performed with respect to
    that variable's owner.

33
  • An acquire access of a synchronization variable
    is not allowed to perform until all updates to
    guarded shared data have been performed with
    respect to that process.
  • When a process does an acquire, the acquire may
    not complete until all remote changes to the
    guarded data have been made visible.
  • Before an exclusive-mode access to a
    synchronization variable by a process is allowed
    to perform with respect to that process, no other
    process may hold the synchronization variable,
    not even in nonexclusive mode.
  • Before updating a shared item, a process must
    enter the critical section in exclusive mode.
  • After an exclusive-mode access to a
    synchronization variable has been performed, any
    other process's next nonexclusive-mode access to
    that synchronization variable may not be
    performed until it has performed with respect to
    that variable's owner.
  • If a process wants to enter a critical section in
    non-exclusive mode, it must first check with the
    owner of the synchronization variable to get the
    most recent copies of the shared data.

34
  • Is this valid for entry consistency?
  • Yes, a valid event sequence for entry consistency.

35
Consistency vs. Coherence
  • A consistency model describes what happens to a
    set of data items when a set of processes operates
    on them.
  • A coherence model pertains only to a single data
    item. So it is about a set of processes writing
    to a single data item.

36
Client-Centric Models
37
Weaker Models
  • Sometimes strong models are needed, if the
    results of race conditions are very bad.
  • Banks
  • Sometimes the results of races are just
    inefficiency or inconvenience.
  • How strong is Orbitz's model?
  • If it shows a ticket available, is it really?
  • How does it prevent two people from reserving the
    same seat?
  • One kind of weaker model is eventual consistency.
  • The store eventually becomes consistent.

38
Eventual Consistency
[Figure: a laptop client performs read/write operations on a distributed and replicated database over a WAN. The client moves to another location and (transparently) connects to another replica. Replicas need to maintain client-centric consistency.]
  • How well does EC work for mobile clients?
  • Not very well. Things can disappear (go
    backwards, etc.).
  • Client-centric is intended to address this.
    Consistent for a single client.

39
Client-Centric Consistency
  • Intended to address the issues in eventual
    consistency for mobile clients.
  • Consistent for a single client.
  • Notation
  • x_i[t] is the version of x at local copy L_i at
    time t.
  • Version x_i[t] is the result of a series of write
    operations at L_i that took place since
    initialization. This set is WS(x_i[t]).
  • If the operations in WS(x_i[t1]) have also been
    performed at local copy L_j at a later time t2, we
    write WS(x_i[t1]; x_j[t2]).

40
Monotonic Reads
  • A data store is said to provide monotonic-read
    consistency if the following condition holds:
  • If a process reads the value of a data item x, any
    successive read operation on x by that process
    will always return that same value or a more
    recent value.
  • In other words, if a process has seen a value of
    x at time t, it will never see an older version
    of x at a later time.
  • Example: suppose a user opens his mailbox in San
    Francisco, then flies to New York. Should he see
    an earlier version of his mailbox?
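
The mailbox example can be sketched as a client-side check. This is one possible way to detect a violation, not necessarily how any real system does it; replicas are modelled as plain dictionaries of `(version, value)` pairs, and all names are hypothetical.

```python
class MonotonicReadClient:
    """Client-side check: never accept a version older than one already read.
    Replicas are modelled as dicts {item: (version, value)} (an assumption)."""

    def __init__(self):
        self.last_seen = {}    # item -> highest version read so far

    def read(self, replica, item):
        version, value = replica[item]
        if version < self.last_seen.get(item, 0):
            raise RuntimeError("replica is stale for this client")
        self.last_seen[item] = version
        return value

san_francisco = {"mailbox": (3, ["m1", "m2", "m3"])}
new_york = {"mailbox": (2, ["m1", "m2"])}    # has not yet received version 3

client = MonotonicReadClient()
client.read(san_francisco, "mailbox")
try:
    client.read(new_york, "mailbox")
    stale_detected = False
except RuntimeError:
    stale_detected = True
print(stale_detected)   # True: New York would show an older mailbox
```

A real store would bring the New York replica up to date before answering rather than raising an error; the check only shows what the guarantee rules out.
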

41
  • Which one of these obeys this model?

42
Monotonic Writes
  • In a monotonic-write consistent store, the
    following condition holds
  • A write operation by a process on a data item x
    is completed before any successive write
    operation on x by the same process.
  • In other words, a write operation must wait for
    all preceding write operations.

43
  • Which one of these obeys that?

44
Read Your Writes
  • A data store is said to provide read-your-writes
    consistency, if the following condition holds
  • The effect of a write operation by a process on
    data item x will always be seen by a successive
    read operation on x by the same process.
  • In other words a write operation is always
    completed before a successive read operation by
    the same process, no matter where the read
    operation takes place.
  • Suppose your web browser has a cache.
  • You update your web page on the server.
  • You refresh your browser.
  • Do you have read-your-writes consistency?
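
The browser question can be sketched in Python. The model is deliberately simple and hypothetical: the server and the cache are dictionaries, and the only difference between the two runs is whether the client drops its own cached copy when it writes.

```python
class BrowserClient:
    """A client with a local cache in front of a server (both plain dicts)."""

    def __init__(self, server, invalidate_on_write):
        self.server = server
        self.cache = {}
        self.invalidate_on_write = invalidate_on_write

    def read(self, url):
        if url not in self.cache:              # miss: fetch from the server
            self.cache[url] = self.server[url]
        return self.cache[url]

    def write(self, url, content):
        self.server[url] = content
        if self.invalidate_on_write:
            self.cache.pop(url, None)          # drop our own stale copy

def refresh_after_update(invalidate_on_write):
    client = BrowserClient({"/index.html": "old"}, invalidate_on_write)
    client.read("/index.html")                 # page is now cached
    client.write("/index.html", "new")         # update the page on the server
    return client.read("/index.html")          # refresh

print(refresh_after_update(False))   # old: no read-your-writes
print(refresh_after_update(True))    # new: read-your-writes holds
```
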

45
  • Which of these is read-your-writes?

46
Writes Follow Reads
  • A data store is said to provide
    writes-follow-reads consistency, if the following
    holds
  • A write operation by a process on a data item x
    following a previous read operation on x by the
    same process is guaranteed to take place on the
    same or a more recent value of x that was read.
  • In other words, any successive write operation by
    a process on a data item x is guaranteed to take
    place on a copy of x that is up to date with the
    value most recently read.
  • Example Suppose we are replicating a database
    for a blog. Performing a write amounts to posting
    a response. If we do not use writes-follow-reads,
    then it would be possible for a user to read a
    response without the original.

47
  • Which of these obeys writes-follow-reads?

48
Replica Management
49
Two Subproblems
  • Your boss says to you, "Our system is too slow,
    make it faster."
  • You decide that replication of servers is the
    answer. What do you do next? What are the
    questions that need to be answered?
  • Where to place servers?
  • Where to place content?

50
Placing Servers
  • Given a set of N locations, how do you place the
    K servers?
  • What are the goals?
  • What is the metric that is being optimized?
  • One algorithm: each time you place a server,
    minimize the average remaining distance to
    clients.
  • What is distance?
  • Is average the right thing to minimize? What if
    one client accesses a lot and another not so
    much?
  • Can we ignore the client locations?
  • Yes, if they are uniformly distributed.
  • Other ideas for algorithms?
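
The greedy algorithm mentioned above can be sketched as follows. The sketch assumes locations on a line with absolute difference as the distance, which is a stand-in for whatever metric (latency, hops) is actually chosen; the function name and sample data are hypothetical.

```python
def greedy_placement(locations, clients, k):
    """Pick k candidate locations one at a time, each time choosing the
    location that minimizes the average client-to-nearest-server distance."""
    def avg_distance(servers):
        return sum(min(abs(c - s) for s in servers) for c in clients) / len(clients)

    chosen = []
    for _ in range(k):
        best = min((loc for loc in locations if loc not in chosen),
                   key=lambda loc: avg_distance(chosen + [loc]))
        chosen.append(best)
    return chosen

locations = [0, 5, 10]           # candidate server sites (1-D positions)
clients = [0, 1, 9, 10, 11]      # client positions
placement = greedy_placement(locations, clients, 2)
print(placement)   # [10, 0]
```

Weighting each client by its access rate inside `avg_distance` would address the question of whether a plain average is the right thing to minimize.
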

51
Clustering
  • One idea: identify the K largest clusters, then
    put one server in each cluster.
  • How do you find clusters?
  • One way: divide the space into cells and pick the
    K most populated ones.

52
Replica-Server Placement
  • Choosing a proper cell size for server placement.
  • It turns out that computing the cell size from
    the average distance between two nodes and the
    number of replicas works well.
  • Close to optimum results, but takes much less
    time: O(N × max{log(N), K}).
  • For example, computing the 20 best replica
    locations for 64,000 nodes is about 50,000 times
    faster.

53
Content Replication and Placement
  • The logical organization of different kinds of
    copies of a data store into three concentric
    rings.

[Figure labels: server-initiated replication, client-initiated replication.]
54
Content Replication
  • Permanent replicas
  • Can be distributed across servers at a single
    location. (What problem does this address?)
  • Can be distributed geographically. (What problem
    does this address?)

55
  • Server-initiated replicas
  • Created more dynamically, at the request of the
    server.
  • For example, imagine the traffic on a
    hypothetical Red Sox web site the night they won
    the World Series.
  • Can be done to reduce load, and also to improve
    client performance.
  • One algorithm: each server keeps track of
    requests for files, and where they come from.
  • If the number of requests for F at Q drops below
    del(Q, F), the file is removed (if it is not the
    last replica).
  • If the number of requests for F at Q goes above
    rep(Q, F), the file is replicated.
  • If the number of requests for F is between
    del(Q, F) and rep(Q, F), the file is migrated if,
    for some server P, cnt_Q(P, F) exceeds half of
    the total requests for F at Q.
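
The threshold rules above can be written as a small decision function. This is a sketch under assumptions: the counts are passed in as a dictionary keyed by the server closest to the requesting clients, and the function name, parameter names, and return strings are hypothetical.

```python
def decide(counts, del_threshold, rep_threshold, q="Q", is_last_replica=False):
    """Decide what server q does with file F.

    counts maps each server P to cnt_q(P, F): the number of requests for F
    seen at q that came from clients closest to P.
    """
    total = sum(counts.values())
    if total < del_threshold:
        return "keep" if is_last_replica else "delete"
    if total > rep_threshold:
        return "replicate"
    for server, count in counts.items():
        # Migrate only toward another server that generates over half the load.
        if server != q and count > total / 2:
            return f"migrate to {server}"
    return "keep"

print(decide({"P": 2, "Q": 1}, del_threshold=5, rep_threshold=20))   # delete
print(decide({"P": 30, "Q": 1}, del_threshold=5, rep_threshold=20))  # replicate
print(decide({"P": 8, "Q": 4}, del_threshold=5, rep_threshold=20))   # migrate to P
```
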

56
  • Counting access requests from different clients.

57
  • If migration does not succeed for some reason,
    then replication is attempted. Server Q checks
    all other servers, starting with the one farthest
    away (why?). If, for some server R, cnt_Q(R, F)
    is above a certain fraction of the requests for
    F, a replication attempt is made.

58
  • Client-Initiated Replicas (client-side caches)
  • Client can cache at will.
  • Can have different invalidation policies, etc.

59
Content Distribution
  • What to propagate? Possibilities
  • Propagate only a notification of an update.
  • Invalidation protocol.
  • Transfer data from one copy to another.
  • Propagate the update operation to other copies.
  • When is each advantageous?
  • Read/write ratio is small?
  • Read/write ratio is high?

60
Pull vs. Push
  • Push: the server sends updates without being asked.
  • Pull: the client explicitly requests updates.
  • When is each advantageous?
  • One way of looking at efficiency is whether or
    not a message is likely to be useless. For
    example, an update message that is not read
    before another one is sent.

61
Leases
  • Hybrid approach: a lease is a promise by the
    server to push updates for a specified amount of
    time. After that, the client must poll.
  • Three criteria can guide the lease length:
  • If the data is rarely modified, should we give a
    long or short lease?
  • If a client often requests an update, should we
    give a long or short lease?
  • If space is short at the server?

62
Unicasting vs. Multicasting
  • Which is better?

63
Consistency Protocols
64
Primary-Based Protocols
  • In practice, consistency models are usually not
    too hard to understand.
  • If it is too hard to understand, it is too hard
    to write correct applications.
  • Note that this situation is somewhat different
    for hardware consistency models. Why?
  • In primary-based protocols, each data item has an
    associated primary replica.
  • Can be fixed or can move around.

65
Remote-Write Protocols
  • All write operations are forwarded to a single
    fixed primary server (also known as
    primary-backup).
  • The primary performs the update and forwards it
    to all other replicas. Only when all have
    responded does the primary respond.
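
A minimal in-process Python sketch of the blocking remote-write scheme follows. It models servers as objects and messages as method calls, so it ignores failures and network delay; the class and method names are hypothetical.

```python
class Backup:
    """A replica that applies updates sent by the primary."""

    def __init__(self):
        self.data = {}

    def update(self, key, value):
        self.data[key] = value
        return True                      # acknowledge the update

    def read(self, key):
        return self.data.get(key)        # reads are served locally

class Primary(Backup):
    """The fixed primary: applies a write, then waits for every backup."""

    def __init__(self, backups):
        super().__init__()
        self.backups = backups

    def write(self, key, value):
        self.data[key] = value                                # apply locally
        acks = [b.update(key, value) for b in self.backups]   # tell backups
        return all(acks)          # acknowledge only after all backups did

backups = [Backup(), Backup()]
primary = Primary(backups)
completed = primary.write("x", 42)
print(completed)                          # True
print([b.read("x") for b in backups])     # [42, 42]
```

Because `write` returns only after every backup has acknowledged, a later read at any replica sees the new value; the price is that the writer waits for the slowest backup.
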

66
[Figure: remote-write protocol with two clients, the primary server for item x, and backup servers in the data store. Arrows are labeled as follows.]
W1. Write request
W2. Forward request to primary
W3. Tell backups to update
W4. Acknowledge update
W5. Acknowledge write completed
R1. Read request
R2. Response to read
67
  • How is the performance of this protocol?
  • Is it necessary to wait for the W5 to complete
    before allowing the client to continue?

68
Local-Write Protocols
  • Primary copy migrates.
  • Advantage is that multiple successive writes can
    be carried out locally.
  • Reading processes can continue to read.

70
  • Also corresponds well with mobile computing.
  • Before you disconnect, make your laptop the
    primary server.
  • While disconnected, everything is updated locally.
  • Also fits distributed file systems.

71
Replicated-Write Protocols
  • Active replication
  • Writes may happen to any replica
  • Need to handle ordering issues.
  • One way is with totally ordered multicast.
  • Another way is with a sequencer coordinator that
    assigns sequence numbers.

72
  • Quorum-based: use voting.
  • To do a write, a client must first get the
    approval of a majority of the servers.
  • File is then updated, and a new version number is
    assigned.
  • To do a read, a client also contacts a majority,
    and gets the current version number from them. If
    version numbers are the same, then it is the most
    recent version.
  • Generalized
  • To do a read, assemble a read quorum, NR.
  • To modify, assemble a write quorum, NW.
  • Constraints
  • NR + NW > N, to prevent read-write conflicts.
  • NW > N/2, to prevent write-write conflicts.
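
The two constraints are easy to check in code. The configurations below (N = 12 with three different quorum sizes) are illustrative values chosen for this sketch, not necessarily the ones in the figure on the next slide.

```python
def quorum_is_valid(n, nr, nw):
    """Check the two quorum constraints for n replicas."""
    read_write_ok = nr + nw > n     # every read quorum overlaps every write quorum
    write_write_ok = nw > n / 2     # any two write quorums overlap
    return read_write_ok and write_write_ok

print(quorum_is_valid(12, 3, 10))   # True
print(quorum_is_valid(12, 7, 6))    # False: NW = 6 is not more than N/2
print(quorum_is_valid(12, 1, 12))   # True: read one, write all
```
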

73
[Figure: examples of read quorums and write quorums.]
  • Which of these are valid?

74
Cache-Coherence Protocols
  • For hardware, broadcast or snooping is possible.
    Not for distributed systems.
  • Three aspects:
  • Coherence detection strategy: when are
    inconsistencies detected?
  • Static: a compiler analyzes the program and
    inserts instructions that avoid inconsistencies.
    What about for concurrency?
  • Dynamic: inconsistencies are detected at runtime.
  • When accessed, block the operation/transaction.
  • When accessed, but do not block the transaction
    (optimistic).
  • Only at commit time.
  • When is each of these good?
  • Coherence enforcement strategy: how caches are
    kept consistent.
  • Do not cache any shared data.
  • If shared data can be cached:
  • Send invalidation to all caches.
  • Send the actual update.
  • When is each of these going to be better?
  • Modifications by clients: what happens when a
    client modifies data.
  • Write-through
  • Write-back

75
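  • void foo(int *a, int *b, int n) {  // pointer types and bound n assumed
      // Does b[0] need to be reloaded on every iteration?
      for (int i = 0; i < n; i++) a[i] = b[i] + b[0];  // operator assumed
    }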

76
Client-Centric Consistency
  • Straightforward, if we ignore performance issues.