Postgres-R(SI): Combining Replica Control with Concurrency Control based on Snapshot Isolation


1
Postgres-R(SI) Combining Replica Control with
Concurrency Control based on Snapshot Isolation
  • Shuqing Wu, Bettina Kemme
  • School of Computer Science,
  • McGill University, Montreal
  • CS632 Seminar
  • By
  • Rishiraj Gupta (05305014)
  • CSE, MTech1
  • IIT Bombay

2
Introduction
  • Oracle and PostgreSQL use multiversion concurrency control, providing the snapshot isolation level.
  • Postgres-R(SI), an extension of PostgreSQL, offers transparent replication.
  • In this paper, the replica control algorithm is combined with PostgreSQL's concurrency control component.

3
Introduction (contd.)
  • Data Replication
  • Increases throughput
  • Decreases response time
  • Provides fault tolerance
  • Challenge: Replica Control
  • Standard correctness criterion: 1-copy-serializability

4
Implementing Replica Control
  • Kernel-based approach
  • Middleware-based approach
  • Advantages of the middleware-based approach
  • Modular
  • Does not need access to DB code
  • Reusability
  • Disadvantages
  • No access to concurrency control information in the kernel

5
Replica Control (RC) methods
  • Through Middleware
  • Has to implement its own concurrency control (CC) component.
  • Only SQL statements are visible, not the individual tuples to be accessed.
  • Integrating RC into the kernel of the DB system
  • Directly interacts with the tuple-based CC of the DB system.
  • Direct access to tuples and/or logs allows efficient propagation of changed tuples.
  • The replication tool comes with the DB system package.

6
Classifying Replica Control algorithms
  • Time point of coordination
  • Lazy schemes
  • Eager schemes
  • Update handling
  • Primary copy approach
  • Update everywhere

7
Concurrency Control: Snapshot Isolation
  • Snapshot Isolation (SI)
  • Only write/write conflicts matter.
  • Reads come from a snapshot of the committed data as of the time the transaction starts.
  • First committer wins: of two concurrent conflicting write transactions, one commits, the other aborts.
  • Similarity with optimistic concurrency control
  • Very popular (Oracle, PostgreSQL, SQL Server 2005)

[Timeline diagram: T0 writes x and commits. Concurrently, T1 writes x, reads x, writes y, and aborts. T2 reads x and writes x.]
8
Overview of Snapshot Isolation (SI)
  • Allows some non-serializable executions, but provides more concurrency for read-only transactions.
  • Read operations read from a committed snapshot of the DB, while write operations of concurrent transactions are disallowed.
  • Replica control is integrated with the MVCC algorithm of PostgreSQL version 7, in order to detect write/write conflicts and determine the snapshots of read operations.
  • Borland, Oracle, and version 7 of PostgreSQL implement SI using MVCC.
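The first-committer-wins rule above can be sketched as follows. This is a toy model with an invented `SIValidator` class and logical commit timestamps, not PostgreSQL's implementation:

```python
# Minimal sketch (not PostgreSQL's code) of first-committer-wins under
# snapshot isolation: a committing transaction aborts if any item it
# wrote was also written by a transaction that committed after this
# transaction's snapshot was taken.

class SIValidator:
    def __init__(self):
        self.commit_ts = 0        # logical commit counter
        self.last_commit = {}     # item -> commit timestamp of last writer

    def begin(self):
        # The snapshot is identified by the commit counter at start time.
        return self.commit_ts

    def try_commit(self, start_ts, write_set):
        # Abort if a concurrent transaction already committed a write
        # to any item in our write set (first committer won).
        for item in write_set:
            if self.last_commit.get(item, -1) > start_ts:
                return False
        self.commit_ts += 1
        for item in write_set:
            self.last_commit[item] = self.commit_ts
        return True

v = SIValidator()
t1 = v.begin(); t2 = v.begin()        # T1 and T2 run concurrently
assert v.try_commit(t1, {"x"})        # T1 commits first
assert not v.try_commit(t2, {"x"})    # T2 conflicts on x and must abort
```

Note that a read-only transaction never enters this validation at all, which is why SI is attractive for read-heavy workloads.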

9
Group Communication System (GCS)
  • Properties of the GCS
  • Uniform reliable delivery
  • all or nothing
  • avoids lost transactions
  • Total Order
  • If two members receive m1 and m2, they both receive them in the same order
  • determines the serialization order
  • Virtual Synchrony
  • The GCS maintains a view of the currently connected members.
  • If two members both first receive view1 and then view2, they receive the same set of messages while being members of view1.
  • handles failures
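The total-order property can be illustrated with a toy model that assumes a hypothetical central sequencer; real GCSs (e.g. Spread) use distributed protocols, so this is only a sketch of the delivery guarantee, not of any actual GCS:

```python
# Toy sketch of total-order delivery: a sequencer stamps each multicast
# with a global sequence number, and every member delivers messages in
# gap-free sequence order, so all members deliver the same order even
# if the network reorders them.

import heapq

class Sequencer:
    def __init__(self):
        self.seq = 0
    def assign(self, msg):
        self.seq += 1
        return (self.seq, msg)

class Member:
    def __init__(self):
        self.pending = []      # min-heap of (seq, msg) received so far
        self.next_seq = 1      # next sequence number we may deliver
        self.delivered = []
    def receive(self, seq, msg):
        heapq.heappush(self.pending, (seq, msg))
        # Deliver only in gap-free sequence order.
        while self.pending and self.pending[0][0] == self.next_seq:
            self.delivered.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1

seq = Sequencer()
m1, m2 = Member(), Member()
a = seq.assign("ws-T1"); b = seq.assign("ws-T2")
m1.receive(*a); m1.receive(*b)
m2.receive(*b); m2.receive(*a)     # arrives out of order on the wire
assert m1.delivered == m2.delivered == ["ws-T1", "ws-T2"]
```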

10
SI-P: Concurrency Control in PostgreSQL
  • Each transaction Ti is assigned a unique identifier TIDi.
  • Committed (aborted) version: a version created by a committed (aborted) transaction.
  • w(X) acquires an exclusive lock on X, which is only released at the end of the transaction.
  • Valid version of X: the version of X created by the last committed transaction that updated X (only one version is valid).
  • Active version of X: a version created by a still-active transaction.
  • t_xmin: TID of the transaction that created tuple version V.
  • t_xmax: TID of the transaction that invalidated V by an update.
  • Visible tuple version: the version read by an r(X).
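The terminology above can be sketched with a hypothetical `Version` record. The field names mirror the slide's t_xmin/t_xmax, not PostgreSQL's actual on-disk tuple header:

```python
# Illustrative tuple-version record and a check for the "valid version":
# the version whose creator committed and which no committed transaction
# has yet invalidated. Assumes a newest-first version chain.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Version:
    t_xmin: int                    # TID that created this version
    t_xmax: Optional[int] = None   # TID that invalidated it, if any
    value: object = None

def valid_version(chain, committed):
    # Walk newest-first: the valid version is created by a committed
    # transaction and not invalidated by another committed transaction.
    for v in chain:
        if v.t_xmin in committed and (v.t_xmax is None
                                      or v.t_xmax not in committed):
            return v
    return None

# T0 created V0; T1 created V1 (setting V0's t_xmax); both committed.
# T3 (TID 3, still active) has marked V1's t_xmax but not committed.
chain = [Version(t_xmin=1, t_xmax=3, value="V1"),
         Version(t_xmin=0, t_xmax=1, value="V0")]
assert valid_version(chain, committed={0, 1}).value == "V1"
```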

11
Version creation during execution
  • Execution
  • Version Check
  • Exclusive Lock
  • t_xmin, t_xmax
  • Waiting queue
  • Termination
  • clog update
  • locks release

[Timeline diagram: T0 begins, writes X creating V0, and commits. T1 begins, writes X creating V1 (setting V0's t_xmax to T1), and commits at c1. T2 begins, writes Y, then attempts w2(X) at t2 and aborts at a2. T3 begins and writes X creating V2 at t3. T4 begins and reads X at t4.]
At t2: T2 must abort. (Why? T1, concurrent with T2, committed a new version of X.) At t3: V0 and V1 are committed, V1 is valid, and V2 is the active version. At t4: T4 reads V1.
12
Information maintained by SI-P
  • PostgreSQL keeps information about each active transaction Ti in shared memory.
  • xmax: TID of the next transaction that will start just after Ti.
  • xip: a list of TIDs of all transactions active at the moment Ti starts.
  • Ticonc: set of concurrent transactions whose updates are invisible to Ti.
  • Ticom: all other committed transactions.
  • clog: file storing commit/abort log entries.
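The snapshot data above can be turned into a visibility test. This is an illustrative reading of the slide's definitions (xmax, xip, Ticonc), not PostgreSQL's actual visibility code:

```python
# Sketch of the snapshot test implied above: a version created by
# transaction t is invisible to Ti if t started at or after Ti's
# snapshot boundary (t >= xmax), was still active when Ti started
# (t in xip), or aborted.

def in_ti_conc(t, xmax, xip, aborted):
    """True if t's updates are invisible to Ti (t is in Ticonc, or aborted)."""
    return t >= xmax or t in xip or t in aborted

def visible(t_xmin, xmax, xip, aborted):
    # A version is visible to Ti iff its creator is in Ticom.
    return not in_ti_conc(t_xmin, xmax, xip, aborted)

# Ti starts with xmax=10 and active set xip={7, 9}.
assert visible(5, xmax=10, xip={7, 9}, aborted=set())       # committed before Ti
assert not visible(9, xmax=10, xip={7, 9}, aborted=set())   # active at Ti's start
assert not visible(12, xmax=10, xip={7, 9}, aborted=set())  # started after Ti
```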

13
An example for SI-P Protocol
  • Original: at t2, T2 aborts.
  • Variant 1: reorder w2(X) so that T2 requests it while T1 still holds the lock.
  • Variant 2: suppose that instead of T1 committing, it aborts.

[Timeline diagram: as on slide 11, but T2 requests w2(X) before T1 terminates, so T2 waits in the lock queue on X.]
Case T1 aborts (at a1): T2 redoes the version check for X. The valid version has t_xmin = T0, and T0 does not belong to T2conc but to T2com. Therefore T2 gets the lock on X and can commit after doing w(X).
Case T1 commits (at c1): T2 redoes the version check for X. The valid version has t_xmin = T1, and T1 belongs to T2conc. Therefore T2 still aborts.
14
SI-PR: Replica Control based on SI-P
  • TIDs are local to each replica and will differ across replicas.
  • To compare transactions across replicas, each update transaction receives a global identifier (GID).
  • GIDs are generated using the total order in which writesets are delivered.
  • Conflicting operations of both local and remote transactions are executed in the order of writeset delivery.
  • Writesets are delivered atomically.

15
Local Transaction (Ti)
  • Execution Phase
  • Same as in SI-P: a version check for each w(X).
  • The new version of X and the GID of the valid visible version are added to the writeset WSi.
  • Send Phase
  • The writeset is multicast to all replicas using total-order multicast.
  • Commit Phase (atomic)
  • On delivery of the writeset, its GIDi is generated.
  • If Ti was not yet aborted: GIDi is added to the internal table, the clog is updated, and locks are released, waking up waiting transactions.
  • Abort Phase
  • clog updated, locks released.
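The execution and send phases above can be sketched as follows; `WriteSet`, `execute_local`, and the `multicast` callback are invented names for illustration, not the Postgres-R(SI) API:

```python
# Sketch of a local transaction building its writeset: each update
# records the new version together with the GID of the valid version
# it saw, then the whole writeset is handed to the (total-order)
# multicast primitive in the send phase.

class WriteSet:
    def __init__(self, tid):
        self.tid = tid
        self.entries = []   # (item, new_version, gid_of_valid_version_seen)

def execute_local(tid, updates, valid_gids, multicast):
    # Execution phase: per-write version check would happen here;
    # we record the GID of the valid version seen for each item.
    ws = WriteSet(tid)
    for item, new_value in updates:
        ws.entries.append((item, new_value, valid_gids[item]))
    # Send phase: multicast the writeset to all replicas.
    multicast(ws)
    return ws

sent = []
ws = execute_local(1, [("x", "V1")], valid_gids={"x": 0},
                   multicast=sent.append)
assert sent[0].entries == [("x", "V1", 0)]
```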

16
Remote Transaction (atomic)
  • On receiving WSi, Ti is started, its GIDi is generated, and (GIDi/TIDi) is added to the internal table.
  • Version Check and Early Execution
  • The valid version sent along with WSi is compared with the local valid version, to decide serializability.
  • Any local transaction whose WS has not yet been delivered or sent, and which is holding locks on tuples required by the remote transaction, must be aborted.
  • Late Execution
  • Aborting is not done immediately; it is delayed until the remote Ti is ready to commit.
  • Commit Phase
  • Updates performed, clog updated, locks released.
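The remote version check described above can be sketched as a GID comparison; the names are illustrative, not the Postgres-R(SI) implementation:

```python
# Sketch of the remote version check: the writeset carries the GID of
# the valid version the writer saw at its replica. The receiving
# replica accepts the update only if its own valid version carries the
# same GID; otherwise a concurrent committed writer got in first and
# the transaction must abort.

def remote_version_check(ws_entry, local_valid_gid):
    item, new_value, gid_seen_by_writer = ws_entry
    # Matching GIDs mean both replicas agree on the predecessor version.
    return gid_seen_by_writer == local_valid_gid

assert remote_version_check(("x", "V1", 0), local_valid_gid=0)      # apply
assert not remote_version_check(("x", "V2", 0), local_valid_gid=1)  # abort
```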

17
Example execution of SI-PR
Multicast order: T1 → T2. R1 and R2 are two replicas.

[Diagram: Both replicas start with X = V0, created by committed T0 with GID0. At R1, local T1 writes X creating V1; at R2, local T2 writes X creating V2. Both multicast their writesets, each carrying GID0 as the valid version seen (XV1, GID0 and XV2, GID0). T1's writeset is delivered first: it passes the version check at both replicas, receives GID1, and T1 commits everywhere. When T2's writeset is delivered, the local valid version's GID no longer matches GID0, so T2 aborts at both replicas and its writeset is discarded.]
18
An Example: Why is Late Execution useful?
  • A remote transaction should not immediately abort a local transaction whose writeset has been sent but not yet delivered.

R1, R2, R3 are three replicas. Assume X and Y are at version V0, GID 0.
Local transactions: T1 at R1 with w(X); T2 at R2 with w(Y), w(X); T3 at R3 with w(Y).
All three send their writesets concurrently. Assume the delivery order is T1 → T2 → T3. Ideally, T1 should commit, T2 abort, and T3 commit.
If aborting were done as in Early Execution, the replicas could reach different decisions (one replica aborting T2 in favor of T1 and T3, another aborting T3 in favor of T2), leaving the copies inconsistent.
By using Late Execution, every replica decides in delivery order: T1 commits, T2 aborts, T3 commits.
19
Recovery
  • The GCS delivers a view change message to all replicas.
  • Crashed replica rejoins
  • A peer sends the writesets from the GID at which the replica failed up to the current GID, using the log.
  • No version check is required.
  • New writesets received in the meantime are buffered.
  • New replica joining
  • All writesets that were delivered before the view change message are applied.
  • For each tuple, the visible version is transferred to the new replica together with the GID of the transaction that created it.
  • Restart after a total failure
  • A new replica group is started by the replica with the largest GID.
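The rejoin procedure above can be sketched as replaying missed writesets in GID order; `recover` and the log format are invented for illustration:

```python
# Sketch of peer-assisted recovery: the rejoining replica asks a peer
# for all writesets between its last applied GID and the peer's current
# GID, and applies them in GID (total) order. No version check is
# needed because these writesets already committed in total order.

def recover(last_applied_gid, peer_log):
    # peer_log: list of (gid, writeset) pairs in ascending GID order,
    # where a writeset is a list of (item, value) updates.
    missed = [(gid, ws) for gid, ws in peer_log if gid > last_applied_gid]
    state = {}
    for gid, ws in missed:
        for item, value in ws:          # apply blindly, in order
            state[item] = (value, gid)  # value plus GID of its creator
    return state

log = [(1, [("x", "V1")]), (2, [("y", "V1")]), (3, [("x", "V2")])]
assert recover(1, log) == {"x": ("V2", 3), "y": ("V1", 2)}
```

The same replay also serves a newly joining replica, which additionally receives, per tuple, the visible version and the GID of its creating transaction as its starting state.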

20
Architecture of Postgres-R
21
Evaluation: Experiment 1
(a) TPC-W Browsing (Read Only)
(b) TPC-W Ordering (Update)
The test uses the OSDL-DBT-1 (TPC-W) benchmark, which simulates an online bookstore with 80% browsing and 20% ordering transactions. Database size: 1,000 items; 40 clients.
22
Evaluation: Experiment 2
100% Update Transactions
Update Workload Response Time
The test uses a database consisting of 10 tables with 10,000 tuples each. At least 20,000 transactions were run to achieve stable results. Each test has a fixed set of clients.
23
Experimental Conclusions
  • Experiment 1
  • Response time increases with increasing load.
  • Response time of the centralized system is much worse than that of the replicated configurations.
  • The 10-replica system has lower response times than the 5-replica system.
  • Experiment 2
  • The central system has considerably faster response times until the maximum throughput for 5 clients is reached.
  • Replication may pay off even in update-intensive workloads, in cases where the system is able to distribute the load.

24
Conclusions
  • We looked at PostgreSQL version 7's concurrency control mechanism (SI-P).
  • The Replica Control protocol (SI-PR) is based on SI-P.
  • The paper presented an eager, update-everywhere database replication approach based on snapshot isolation.
  • Current solutions of the pgreplication project of PostgreSQL can be found at
  • http://gborg.postgresql.org/project/pgreplication

25
Thank you !!
  • Questions ??
  • rishirajgupta@cse.iitb.ac.in

26
Extra Slides
27
Experiment 3 (Replication Overhead)
To evaluate the delay incurred by replication. Setup: a 10-table database, single client. Transactions update 10 random tuples.
28
Transaction Execution
29
Version creation during execution
  • Execution
  • Version Check
  • Exclusive Lock
  • t_xmin, t_xmax
  • Waiting queue
  • Termination
  • clog update
  • locks release

[Timeline diagram, as on slide 11: T0 creates V0 and commits; T1 creates V1 and commits; T2 writes Y, attempts w2(X) at t2, and aborts; T3 creates V2 at t3; T4 reads X at t4.]
At t3: V0 and V1 are committed, V1 is valid, and V2 is the active version.
30
Example execution of SI-PR
R1 and R2 are two replicas.

[Diagram, as on slide 17: T0's version of X (GID0) is committed at both replicas. T1's writeset (XV1, GID0) is delivered first, passes the version check, and T1 commits with GID1 at both replicas. T2's writeset (XV2, GID0) fails the version check, T2 aborts, and its writeset is discarded.]