Transactions and Databases - PowerPoint PPT Presentation

1 / 60
About This Presentation
Title:

Transactions and Databases

Description:

1. Advanced Distributed Software Architectures and Technology ... Left overs. http://research.microsoft.com/~gray/wics_99_TP ... How do we recover ... – PowerPoint PPT presentation

Number of Views:262
Avg rating:3.0/5.0
Slides: 61
Provided by: paulg181
Category:

less

Transcript and Presenter's Notes

Title: Transactions and Databases


1
Transactions and Databases
  • Paul Greenfield
  • CSIRO

2
This Week
  • More on transactions
  • Left overs
  • http//research.microsoft.com/gray/wics_99_TP
  • Isolation and locking
  • How do we achieve isolation?
  • Recovery
  • How do we recover after failure?

3
Why bother with TP?
  • Use two-tier apps with database transactions?
  • Business logic in client and stored procedures
  • Fast!
  • Scalable?
  • Maintainable?
  • Cheaper?
  • Flexible??

Stored procedures
Database Server
4
Two-tier Applications
  • The most recent legacy
  • Stored procedures
  • Different and proprietary languages
  • Integrated debugging?
  • Re-use in different applications?
  • DB connection per client
  • Even when not active

5
Three-tier Applications
  • Business logic written in common or standard
    languages (VB, C, Java)
  • Clean separation of business logic
  • Easier re-use and maintainability?
  • Use server resources only for active transactions
  • Process and connection pooling

6
TP Implementation
  • What are the TP programs?
  • Small one-shot executable programs?
  • Application programs fed from queue?
  • Libraries called from a process?
  • Libraries called from threads?
  • Answer have an effect on performance, integrity
    and management

7
One-shot Programs
  • Old-style solution (CICS, TIP, )
  • Schedule application to run when transaction
    request arrives
  • Start app, process request, terminate
  • Single function per application
  • OS/TP monitor support for
  • Fast application startup
  • Application recycling (reduce overheads)

8
Queued Applications
  • TP application always running
  • Instances balanced against load
  • Queue of waiting requests
  • Application supports multiple functions
  • Group functions into applications
  • Clients not bound to server applications
  • Tune response times
  • Faster response time for some transactions
  • Multiple copies of critical applications

9
TP Processes
  • Client bound to server process
  • Typical CORBA approach
  • Queue of requests for each server
  • Need to run/manage multiple servers
  • Tune response times?
  • Can allocate transactions to programs
  • Fast, critical transactions delayed?
  • Need for load balancing
  • Unequal server load possible

10
TP Process - Orbix
Server processes
Server objects
Waiting requests
11
Orbix Example
12
TP Threads
  • Thread pool inside a server process
  • No binding from client to thread
  • Objects live in process address space
  • Threads have access to all objects
  • Queue of requests shared by all threads
  • No need for load balancing
  • No idle/busy processes
  • No way to push priority of some transactions
    may not matter?

13
TP Threads - MTS
Server threads
Activeserver objects
Proxyserver objects
Waiting requests
14
MTS Example
15
Failure
  • Need to isolate faults
  • Failing application takes down what??
  • Entire application process?
  • Process holding thread pool?
  • Entire transaction system?
  • Need to run applications as separate processes or
    have careful fault traps

16
What Goes Where?
  • Routing and directories
  • Where to send a request message?
  • Where to create a remote object?
  • Routing tables
  • Table of what requests go where
  • Directories/name servers
  • Database and server that knows who is providing
    what service

17
Directory/Name Servers
  • Map name onto server locations
  • Could be part of TP system
  • CORBA Name Servers
  • Could be part of system-wide directory
  • Active Directory for COM
  • Hard-wiring also works
  • Administration costs can be high

18
Name Servers
  • Client asks name server where to find a service
    when creating object
  • Servers advertise their services to the name
    server
  • Load balancing by name server distributing
    requests over multiple server processes and
    systems

19
Name Servers
20
Request Integrity
  • What happens to requests on failure
  • Transactions ensure database integrity
  • Incoming requests can be saved to disk
  • Fetch request operation included as part of
    transaction
  • Undone and request requeued on failure
  • Need to avoid failure loops!
  • Easy recovery from transient errors

21
Response Integrity
  • Are responses part of transaction?
  • Rolled out if transaction fails
  • Recovered and sent after system recovery if
    committed
  • Is this reasonable? Sent to who??
  • Just discard?
  • Need feedback to know delivery succeeded
  • Just what does the operator see/do?
  • Wait? Retry? Check success?

22
RPC Extras
  • DCE, CORBA, COM, are language and platform
    independent
  • Interfaces specified in IDL
  • Marshalling translates between languages and
    platforms
  • Character sets, byte order,
  • Translate to and from canonical form
  • Or use receiver makes it right
  • Send in client format
  • Receiver translates only if necessary

23
IDL Example
  • COM IDL fragment
  • More detail in a later lecture!!

object, uuid(6B29FC40-CA47-1067-B31D-00DD010662DA
)interface IHop IUnknown import
unknwn.idl // bring in definition of
IUnknown HRESULT Walk(in long
How_far) HRESULT Hop(in long
How_far) HRESULT Bound(in BSTR Over_what)
24
Nested Transactions
  • Calling a transaction from anywhere
  • Directly from a client
  • From within a transaction
  • Start a sub-transaction, linked into the parent
    transaction
  • All transactions committed together
  • Sub-transaction commit does not really commit and
    make changes durable. Changes made visible to
    other sub-transactions.

25
Nested Transactions
  • Not widely supported
  • Alternative programming models
  • Top-level transactional service code calling on
    business logic
  • MTS and EJB requires transaction
  • Run in existing transaction if there is one
  • Start new transaction otherwise
  • More in MTS/COM and EJB lectures

26
Nested Transactions
Function transfer(src, dest, amt) tx_start
withdraw(src, amt) deposit(src, amt)
tx_commit Function withdraw(src, amt)
tx_start .. Tx_commit Function
deposit(dest, amt) tx_start ..
Tx_commit Nested Transactions
Function transfer(src, dest, amt) tx_start
withdraw(src, amt) deposit(src, amt)
tx_commit Function withdraw(src, amt)
.. Function deposit(dest, amt)
.. Transactional Services
27
Isolation and Locking
  • How do resource managers achieve the illusion of
    isolation
  • Application programmers can (largely) pretend no
    other programs are running concurrently
  • Done using locks and lock managers
  • Application programmers still need to be aware of
    possible problems

28
Serialisable
  • Concurrent execution of concurrent transactions
    has the same effect as running them serially.
  • One after another with no overlap
  • Highest level SQL Isolation Level
  • Implemented by locking resources before they are
    used

29
Locks
  • Lock data before using it
  • Set read lock before reading
  • Set write lock before writing
  • Wait if lock cannot be granted
  • Locks only granted if no conflicts
  • Read locks conflict with write locks
  • Write locks conflict with both read and write
    locks

30
Locks
  • Locks affect performance
  • All computers wait at the same speed
  • Can result in single-threading
  • Concurrent transactions waiting for access to the
    same resource
  • Strongly influenced by application design
  • Locks introduce new problems
  • deadlocks

31
Two-phase Rule
  • Correct locking avoids problems
  • Locks have to be held until commit to achieve
    isolation
  • Locks are held for longer
  • Performance is reduced
  • Two phases
  • Locking resources
  • Unlocking (only at commit)
  • Avoids cascading aborts

32
Lock Managers
  • Code that manages locking
  • Maintains a lock table
  • Keeps track of all locks in the database
  • Waiting requests and granted locks
  • Lock operations are atomic
  • Protected by low-level locks (mutex, spin)

Locks granted
Locks requested
x
T1(read), T2(read)
T3(write)
y
T2(write)
T4(read), T1(read)
z
T1(read)
33
Lock Managers
  • Distributed systems can have interesting locking
    problems
  • No lock analysis across databases?
  • Distributed databases have distributed lock
    managers
  • Shared lock state
  • Communication between LMs

34
Lock Types
  • More than just read and write!
  • Shared (read) locks
  • Exclusive (write) locks
  • Update (read then write)
  • Intent locks (lock also held at finer level)
  • Key locks (lock ranges within keys)

35
Lock Granularity
  • What is locked?
  • Whole database
  • Whole table?
  • Page of data?
  • Individual record?
  • All of the above at times
  • X lock on record
  • IX locks on page and table
  • S locks on database

36
Tables to Records
Table
Page
Page
Page
37
Lock Granularity
  • Level of locking a DB decision
  • Fine grain locks give less contention and better
    performance
  • Fine grain locks using lots of locks and are more
    expensive to manage
  • Choose record lock when..
  • Just locking a few records
  • Otherwise get coarser locks

38
Lock Escalation
  • DB can start with record locks and move to
    page/table locks
  • Finds that many locks are being held for the
    page/table
  • Escalate lock up a level
  • Free lock resources
  • Guess at proper locking level and adjust as
    needed (up only?)

39
SQL Isolation Levels
  • Uncommitted read (dirty read)
  • Read all changes, no locks, no waits
  • Fastest and sometimes useful
  • Statistical scans of data
  • Committed read (SQL default)
  • Only read committed data
  • Release read locks after use
  • Repeating an SQL statement can give different
    results each time

40
SQL Isolation Levels
  • Repeatable read
  • Same query always returns same data
  • Can get phantoms new records
  • Keep shared locks until Commit
  • Serializable (TP Isolation)
  • Same query returns same data
  • No phantoms!
  • Lock data that does not exist
  • Need to keep key locks as well

41
Locking Hints
  • DB decides what locks to use
  • Shared or exclusive lock?
  • Locks can be converted normally
  • Programmer can override with hints
  • Programmer knows what will happen next
  • Avoid deadlocks?

Select from accounts (updlock) where acc_no
123 Update accounts set balance where
acc_no123
42
Deadlocks
  • Normally applications just wait for locks to be
    granted
  • Sometimes dependencies between locks means they
    would wait forever

Granted
Lock B Lock A
T2
A
T1
T2
B
T2
T1
Waiting
43
Deadlocks
  • Db performs locking graph analysis
  • Deadlock if loop found!
  • Solution?
  • Pick a process/transaction and return a db error
  • Application recovers or dies
  • Transaction abort and retry?

44
Deadlocks
  • Deadlock avoidance is an application coding
    problem and a hard one
  • Use canonical locking orders
  • Define a standard locking order
  • Invoice header before invoice details
  • Nice idea in theory
  • Can still get conversion deadlocks

45
Conversion Deadlocks
  • Database uses shared locks rather than exclusive
    locks for reading
  • Can convert to exclusive later
  • Deadlocks when DB cannot do convert

Granted
Select next from keytable where type1 Update
keytable set nextnext1 where type1
K1
T1(s)
T1(x)
T2(s)
T2(x)
Waiting
46
Conversion Deadlocks
  • A use for locking hints
  • Tell DB to get exclusive lock earlier

Granted
Select next from keytable (updlock) where
type1 Update keytable set nextnext1 where
type1
K1
T1(x)
T2(x)
Waiting
47
Performance
  • Blocking on waits undesirable
  • Remove hot spots
  • next entry counters, summary information, end
    of file counter
  • Avoid altogether
  • Cache high contention records
  • Reduce path length
  • Obtain locks as late as possible

48
Performance
49
Performance
50
Recovery
  • Durability and redundancy
  • Keep critical information on disk
  • In-memory copies for performance
  • Ensure disk writes complete before continuing at
    critical times
  • Keep multiple copies of disk data
  • Protecting against
  • Memory loss when system fails
  • Disk file loss with disk failure

51
Database Model
  • Really two databases
  • Database tables on disk in-memory changed
    pages/records
  • For performance
  • Logged changes on disk/tape database dump
  • For durability
  • The log really the durable database
  • Can recreate the disk/memory form

52
Logging
  • Write to log
  • Before images
  • Changes, deletions
  • After images
  • Changes, insertions
  • Data pages, index blocks, storage allocation
  • Need to wait for log flushes
  • Can be major performance bottleneck
  • Batch flushes by adding a short delay

53
Logging
  • Write-log-ahead
  • Never flush an uncommitted change to the
    database.
  • Changes can be flushed after they have been
    committed
  • Leave in memory until cache manager needs the
    space

54
Commit
  • Changes are written to a log page
  • Page write initiated when page full
  • At commit time
  • Flush all logged changes to disk
  • Flush logged commit record to disk
  • Changes are now in stable storage
  • Database is recoverable

55
Recovery
  • Recover from abort
  • Apply before images if necessary to pages in
    cache
  • Recovery from system failure
  • Apply after images to disk pages
  • Recovery from media failure
  • Restore from backup
  • Apply after images to disk pages

56
Checkpoints
  • How can we recover more quickly?
  • How far back do we go in the log?
  • When do we know that there are no more log
    records that need to be applied?
  • Problem comes from caching and lazy database page
    writes
  • Checkpoints force database pages back out to disk
    now and then
  • Stop recovery when checkpoint found
  • Fuzzy checkpoints to improve CP cost

57
Checkpoints
Lastcheckpoint
All updates in stable database
Log
Classic checkpoint
All updates in stable database
Log
Lastcheckpoint
2nd lastcheckpoint
Fuzzy checkpoint
58
Media failure?
  • Duplicate the media (disks)
  • RAID disks
  • Mirror/shadow disks
  • Avoid sharing anything
  • Multiple disks with multiple controllers
  • Remote sites for backup?
  • Put logs on mirror/RAID at least
  • Archive logs to tape or

59
Performance
  • Disk performance is the key
  • Disks are slow to rotate (latency)
  • Disk heads are slow to move (seek)
  • One heavily used file per disk is best
  • Allocate DB files and logs across disks to
    balance out usage
  • Number of disks can be more important than
    storage capacity

60
Next week
  • Security!
  • Access control
  • Authentication
  • Data privacy
  • Public key crypto
  • SSL/TLS
Write a Comment
User Comments (0)
About PowerShow.com