Consistency and Replication - PowerPoint PPT Presentation

About This Presentation
Title:

Consistency and Replication

Description:

... avoid system wide consistency, by concentrating on what ... A data store is said to provide read-your-writes consistency, if the following condition holds: ... – PowerPoint PPT presentation

Slides: 32
Provided by: Ken694

Transcript and Presenter's Notes

Title: Consistency and Replication


1
Consistency and Replication
  • Introduction (what's it all about)
  • Data-centric consistency
  • Client-centric consistency
  • Replica management
  • Consistency protocols

2
Outline
  • Data-centric consistency
  • Continuous Consistency
  • Sequential Consistency
  • Causal Consistency
  • Grouping Operations
  • Client-centric consistency

3
Writes a and b are causally related: not causally consistent
Writes a and b are not related (concurrent): causally consistent
4
Write b happened before write a -- sequentially consistent
Write b happened before write a -- not sequentially consistent
5
Grouping Operations (2)
  • Show entry consistency

6
Outline
  • Data-centric consistency
  • Client-centric consistency
  • Goal: avoid system-wide consistency, where possible, by
    concentrating on what specific clients want
    instead of what the servers should maintain.
  • Eventual Consistency
  • Monotonic Reads
  • Monotonic Writes
  • Read Your Writes
  • Writes Follow Reads

7
WS(x1) propagated to L2: monotonic reads hold
WS(x1) not propagated to L2: monotonic reads violated
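
A minimal sketch of how a client session might check the monotonic-reads condition shown in the two cases above. It assumes every write carries an ID and every local copy exposes the set of write IDs it has applied. The names Replica, Session, and read are hypothetical and chosen only for illustration.

```python
class Replica:
    """A local copy (e.g. L1 or L2) that records which writes it has applied."""

    def __init__(self, name):
        self.name = name
        self.applied = set()   # IDs of writes applied here (its write set)
        self.value = None

    def apply(self, write_id, value):
        self.applied.add(write_id)
        self.value = value


class Session:
    """Tracks the writes reflected in everything this client has read so far."""

    def __init__(self):
        self.read_set = set()

    def read(self, replica):
        # Monotonic reads: the replica must already hold every write
        # that an earlier read in this session depended on.
        if not self.read_set <= replica.applied:
            raise RuntimeError(f"{replica.name} is missing writes seen earlier")
        self.read_set |= replica.applied
        return replica.value


l1, l2 = Replica("L1"), Replica("L2")
l1.apply("x1", "a")
session = Session()
first = session.read(l1)       # the session now depends on write x1

l2.apply("x2", "b")            # WS(x1) was not propagated to L2
try:
    session.read(l2)
    violated = False
except RuntimeError:
    violated = True

l2.apply("x1", "a")            # propagate WS(x1) to L2 ...
l2.apply("x2", "b")            # ... and reapply the later write on top
second = session.read(l2)
```

Rejecting the read is only one possible reaction; a real store could instead wait for the missing writes or redirect the client to another copy.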
8
WS(x1) propagated to L2: monotonic writes hold
WS(x1) not propagated to L2: monotonic writes violated
11
Read Your Writes
  • A data store is said to provide read-your-writes
    consistency if the following condition holds:
  • The effect of a write operation by a process on
    data item x will always be seen by a successive
    read operation on x by the same process.
  • This holds no matter where the read takes place.
  • Suppose your web browser has a cache.
  • You update your web page on the server.
  • You refresh your browser.
  • Do you have read-your-writes consistency?

12
Read Your Writes (2)
W(x1) is part of WS(x1, x2): read your writes holds
The read doesn't include W(x1): read your writes violated
  • Example: updating your Web page and guaranteeing that
    your Web browser shows the newest version instead
    of its cached copy.
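
The browser-cache example can be sketched as follows. This is one possible client-side approach, assuming the server numbers each write with a version; the classes Server and Browser and their methods are hypothetical names, not part of any real browser or Web server API.

```python
class Server:
    """Holds the page and a version counter incremented on every write."""

    def __init__(self):
        self.version = 0
        self.page = ""

    def write(self, page):
        self.version += 1
        self.page = page
        return self.version

    def fetch(self):
        return self.version, self.page


class Browser:
    """Client with a local cache that enforces read-your-writes for itself."""

    def __init__(self, server):
        self.server = server
        self.cache = None           # (version, page), or None if empty
        self.last_written = 0       # version of this client's latest write

    def update_page(self, page):
        self.last_written = self.server.write(page)

    def read_page(self):
        # Serve from the cache only if it already reflects our own latest write.
        if self.cache is None or self.cache[0] < self.last_written:
            self.cache = self.server.fetch()
        return self.cache[1]


server = Server()
browser = Browser(server)
browser.update_page("v1")
page_a = browser.read_page()    # fetched from the server, then cached
browser.update_page("v2")
page_b = browser.read_page()    # cache is older than our write, so refetch
```

A plain cache without the last_written check would return "v1" on the second read, which is the situation the refresh question on the previous slide points at.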

13
Writes Follow Reads (1)
  • A data store is said to provide
    writes-follow-reads consistency if the following
    holds:
  • A write operation by a process on a data item x
    following a previous read operation on x by the
    same process is guaranteed to take place on the
    same or a more recent value of x that was read.
  • Example: see reactions to posted articles only if
    you have the original posting (a read pulls in
    the corresponding write operation).

14
Writes Follow Reads (2)
Writes follow reads holds
Writes follow reads violated
15
Outline
  • Introduction (what's it all about)
  • Data-centric consistency
  • Client-centric consistency
  • Replica management
  • Consistency protocols

16
Two Subproblems
  • Your boss says to you, "Our system is too slow;
    make it faster."
  • You decide that replication of servers is the
    answer. What do you do next? What are the
    questions that need to be answered?
  • Where to place servers?
  • Where to place content?

17
Placing Servers
  • Given a set of N locations, how do you place the
    K servers?
  • Locations: network locations and geographic
    locations.
  • A server may hold only part of the data store.
  • What are the goals?
  • What is the metric that is being optimized?

18
Placing Servers
  • One algorithm: each time you place a server,
    choose the location that minimizes the average
    remaining distance to clients.
  • What is distance?
  • Can we ignore the client locations?
  • Yes, if they are uniformly distributed.
  • Other ideas for algorithms?
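
The greedy algorithm described above can be sketched as follows. This is an illustration under stated assumptions: "distance" is taken to be Euclidean distance between 2-D coordinates (it could equally be latency or hop count), and the function name greedy_place and the sample coordinates are hypothetical.

```python
import math


def greedy_place(candidates, clients, k):
    """Pick up to k server locations one at a time, each time choosing the
    candidate that minimizes the average distance from every client to its
    nearest chosen server."""
    chosen = []
    for _ in range(min(k, len(candidates))):
        best, best_cost = None, math.inf
        for c in candidates:
            if c in chosen:
                continue
            trial = chosen + [c]
            # Average distance from each client to its closest server in trial.
            cost = sum(
                min(math.dist(cl, s) for s in trial) for cl in clients
            ) / len(clients)
            if cost < best_cost:
                best, best_cost = c, cost
        chosen.append(best)
    return chosen


clients = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
candidates = [(0, 0), (5, 5), (10, 10)]
servers = greedy_place(candidates, clients, 2)
```

With these sample points the first server goes to the larger group of clients and the second to the remaining group. Each placement scans all candidates against all clients, which is why cheaper approaches are worth considering.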

19
Possible approaches
20
Example: Clustering
  • One idea: identify the K largest clusters, then
    put one server in each cluster.
  • How do you find clusters?
  • One way: divide the space into cells and pick the
    K most populated ones.
  • Calculate an appropriate cell size: a simple
    function of average distance.
  • Complexity is reduced from O(N^2) to O(N x
    max(log(N), K)).
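
A minimal sketch of the cell-based idea, assuming nodes have 2-D coordinates and that a suitable cell size has already been computed. The function name densest_cells and the sample data are hypothetical; the cell size is passed in as a parameter rather than derived, since the slides only say it is a simple function of average distance.

```python
from collections import Counter


def densest_cells(nodes, cell_size, k):
    """Bucket node coordinates into square cells and return the k fullest cells."""
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in nodes
    )
    return [cell for cell, _ in counts.most_common(k)]


nodes = [(0.1, 0.2), (0.4, 0.9), (0.8, 0.3), (5.2, 5.1), (5.9, 5.5), (9.5, 0.5)]
cells = densest_cells(nodes, cell_size=1.0, k=2)
```

Each node is assigned to a cell in a single pass, so no pairwise distances between nodes are needed. One server would then be placed in each returned cell.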

21
Replica-Server Placement
  • Choosing a proper cell size for server placement.
  • It turns out that computing the cell size from the
    average distance between two nodes and the number
    of replicas works well.

22
Placing Content
  • Which server or servers to select to place an
    object (data, code)?

23
Permanent replicas
  • E.g., mirror sites; database replicas on servers
    that do not share disks, memory, or processes.
  • Initial set of replicas, static organization.

24
Server-Initiated Replicas
  • Created by the owner of the data store
  • Temporary use
  • Dynamic load
  • E.g., dynamic replicas in Web hosting
  • Specific files on a server can be migrated or
    replicated to servers placed in the proximity of
    clients that issue many requests for those files.
  • Keep track of access counts per file, aggregated
    by considering the server closest to the
    requesting clients
  • Number of accesses drops below threshold D -> drop
    file
  • Number of accesses exceeds threshold R -> replicate
    file
  • Number of accesses between D and R -> migrate file
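
The threshold rule above can be written as a small decision function. This is a sketch: the function name replica_action and the returned labels are hypothetical, and how access counts are gathered per file is left out.

```python
def replica_action(access_count, drop_threshold, replicate_threshold):
    """Decide what to do with a file from its access count.

    drop_threshold is D and replicate_threshold is R in the rule above;
    D is assumed to be lower than R.
    """
    if drop_threshold >= replicate_threshold:
        raise ValueError("D must be lower than R")
    if access_count < drop_threshold:
        return "drop"
    if access_count > replicate_threshold:
        return "replicate"
    return "migrate"


actions = [replica_action(n, drop_threshold=10, replicate_threshold=100)
           for n in (3, 50, 500)]
```

Here counts exactly equal to D or R fall into the "migrate" band; that boundary choice is an assumption, since the slides do not specify it.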

25
Server-Initiated Replicas
26
Client-Initiated Replicas
  • Client caches
  • Hold data temporarily, to improve access time
  • Effectiveness is measured by the cache hit rate.
  • Used by one client or shared by several clients.
  • A client requests a nearby server to cache data.

27
Content Replication and Placement
28
Content Distribution
  • Issue: propagation of (updated) content to the
    relevant replica servers.
  • Possibilities for what is to be propagated, in
    terms of state versus operations:
  • Propagate only a notification/invalidation of an
    update (often for caches).
  • Transfer data from one copy to another
    (distributed databases).
  • Propagate the update operation to other copies
    (also called active replication).
  • No single approach is best; the choice depends on
    available bandwidth and the read-to-write ratio at
    replicas.
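
The three propagation options listed above can be contrasted in a short sketch. The Replica class and its method names are hypothetical, and the stored value is a plain integer to keep the example small.

```python
class Replica:
    """A copy that can be brought up to date in three different ways."""

    def __init__(self):
        self.value = 0
        self.valid = True

    def on_invalidate(self):
        # Notification only: mark the copy stale; the data is fetched later.
        self.valid = False

    def on_state(self, new_value):
        # State transfer: overwrite the copy with the new data.
        self.value = new_value
        self.valid = True

    def on_operation(self, op):
        # Operation shipping: re-execute the update locally.
        self.value = op(self.value)
        self.valid = True


def update(v):
    """The update performed at the primary copy."""
    return v + 5


cache, mirror, active = Replica(), Replica(), Replica()
primary_value = update(0)

cache.on_invalidate()            # smallest message, copy unusable until refetched
mirror.on_state(primary_value)   # message carries the data itself
active.on_operation(update)      # message carries the operation, replica does the work
```

The invalidation message is the cheapest to send, while state transfer and operation shipping both leave the replica immediately usable, which is where bandwidth and the read-to-write ratio come into the choice.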

29
Pull versus Push Protocols
  • Pushing updates: a server-initiated approach in
    which an update is propagated regardless of whether
    the target asked for it.
  • Pulling updates: a client-initiated approach in
    which the client requests to be updated.
  • Best practices? Consistency needs? Other
    trade-offs?
  • Hybrid approach: a lease, a contract in which the
    server promises to push updates to the client
    until the lease expires.
  • E.g., multiple-client, single-server systems.
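
A minimal sketch of the lease idea, assuming a single server and in-process clients. The classes LeaseServer and Client are hypothetical, and time is passed in explicitly as a number so the example stays deterministic.

```python
class Client:
    def __init__(self):
        self.inbox = []            # updates pushed by the server

    def pull(self, server):
        # Client-initiated: ask the server for the current value.
        return server.value


class LeaseServer:
    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.leases = {}           # client -> lease expiry time
        self.value = None

    def grant(self, client, now):
        self.leases[client] = now + self.lease_seconds

    def write(self, value, now):
        self.value = value
        # Server-initiated: push only to clients whose lease is still valid.
        for client, expiry in self.leases.items():
            if now < expiry:
                client.inbox.append(value)


server = LeaseServer(lease_seconds=10)
client = Client()
server.grant(client, now=0)
server.write("v1", now=5)     # lease valid: update is pushed
server.write("v2", now=20)    # lease expired: nothing is pushed
latest = client.pull(server)  # the client falls back to pulling
```

After expiry the client either pulls on demand, as here, or asks for a new lease; the lease length is the knob that trades server state against update freshness.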

31
Outline
  • Introduction (what's it all about)
  • Data-centric consistency
  • Client-centric consistency
  • Replica management
  • Consistency protocols