CS556: Distributed Systems - PowerPoint PPT Presentation

Provided by: mar177

1
CS-556 Distributed Systems
Transactions
  • Manolis Marazakis
  • maraz_at_csd.uoc.gr

2
Terminology (I)
  • Reliability - the extent to which a system yields
    expected results on repeated trials
  • Reliability is measured by mean-time-between-failures
    (MTBF)
  • Availability - fraction of time the system yields
    expected results
  • reduced by downtime, repair, preventive
    maintenance
  • Availability = MTBF / (MTBF + MTTR), where MTTR is
    mean-time-to-repair
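
The availability ratio on this slide can be computed directly. The sketch below is only an illustration; the function name and the sample figures (999 hours between failures, 1 hour to repair) are assumptions, not values from the slides.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system yields expected results: MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical module: fails about every 999 hours, takes 1 hour to repair.
    print(f"availability = {availability(999.0, 1.0):.4f}")
```

Halving MTTR in this formula helps about as much as doubling MTBF, which is why fast fault detection and repair matter.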

3
Terminology (II)
  • 24x7 - system should always be operational
  • Failure - an event where the system gives
    unexpected results (e.g. wrong or no result)
  • Fault - an identified or hypothesized cause of
    failure
  • A bad memory board (fault) causes OS to fail
  • Masking - prevents a fault from becoming a
    failure
  • E.g. the OS reconfigures around a faulty disk
    controller
  • Transient fault - the fault does not reoccur when
    retrying the operation that caused the fault
  • Permanent fault - non-transient fault

4
Where Do Failures Come From?
Source         Tandem 89   Tandem 85   AT&T/ESS 85
Environment        7          14           -
Hardware           8          18           -
Subtotal          15          32          20
System Mgmt       21          42          30
Software          64          25          50
  • Software is most of the problem !
  • Environment: fire, flood, earthquake, vandalism,
    communications/power/air conditioning failures
  • System management: maintenance, system operation,
    system configuration
  • System managers and vendors are responsible for
    environment and administration

5
Recovery in Client/Server Systems
  • For each component: how to detect failure and
    recovery, and what to do about them
  • Client submits a request
  • Server processes the request
  • Server returns a reply
  • A failure can occur during any of these activities

6
Detecting Failures
  • Fault detection must be accurate
  • or you'll take recovery actions for a
    still-active process
  • Fault detection must be fast
  • since it contributes to MTTR
  • Techniques
  • Processes send heartbeats ("I'm alive") to a
    monitor process.
  • No heartbeat => failed process
  • Monitor polls processes for "I'm alive" messages
  • Process gets an OS lock. Monitor waits for the
    lock. If process fails, OS releases its lock, so
    monitor gets it and detects the failure
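
The heartbeat technique can be sketched as a monitor that remembers when it last heard from each process. This is a minimal illustration, not the slides' implementation: the class name, the timeout value, and passing timestamps explicitly (to keep the example deterministic) are all assumptions.

```python
class HeartbeatMonitor:
    """Suspects a process has failed when no heartbeat arrived within timeout_s."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # process id -> time of last heartbeat

    def heartbeat(self, pid: str, now: float) -> None:
        # Called whenever an "I'm alive" message arrives from pid.
        self.last_seen[pid] = now

    def suspected_failed(self, now: float) -> list:
        # No heartbeat within the timeout => treat the process as failed.
        return sorted(pid for pid, seen in self.last_seen.items()
                      if now - seen > self.timeout_s)


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=3.0)
    monitor.heartbeat("server-a", now=0.0)
    monitor.heartbeat("server-b", now=0.0)
    monitor.heartbeat("server-a", now=2.5)
    print(monitor.suspected_failed(now=4.0))
```

The timeout embodies the accuracy/speed trade-off from the slide: a short timeout lowers MTTR but risks declaring a slow, still-active process dead.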

7
Client Recovery
  • What calls were in-flight at failure time?
  • How much of each call was processed?
  • How to finish these calls before proceeding?
  • General-purpose solution
  • Execute requests as transactions.
  • Maintain a persistent queue of requests and
    replies, shared by client and server.
  • At recovery time, use the queue contents to
    decide which transactions executed. Re-run the ones
    that didn't.
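
A rough sketch of the queue-based recovery idea follows. The in-memory dictionaries stand in for a persistent queue, and `RequestQueue`, `recover`, and `run_transaction` are hypothetical names. A real system would record the reply atomically with the transaction's commit, which this sketch does not model.

```python
class RequestQueue:
    """Stand-in for a persistent queue of requests and replies."""

    def __init__(self):
        self.requests = {}  # request id -> payload
        self.replies = {}   # request id -> reply (present only if the Tx committed)

    def enqueue(self, request_id, payload):
        self.requests[request_id] = payload

    def record_reply(self, request_id, reply):
        self.replies[request_id] = reply

    def unfinished(self):
        # Requests with no recorded reply did not execute as a committed Tx.
        return [rid for rid in self.requests if rid not in self.replies]


def recover(queue, run_transaction):
    """At recovery time, re-run every request that has no reply."""
    for request_id in queue.unfinished():
        queue.record_reply(request_id, run_transaction(queue.requests[request_id]))


if __name__ == "__main__":
    q = RequestQueue()
    q.enqueue(1, "debit 10")
    q.enqueue(2, "credit 5")
    q.record_reply(1, "ok")           # request 1 completed before the crash
    recover(q, lambda payload: "ok")  # only request 2 is re-run
    print(q.unfinished())
```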

8
Server Recovery (I)
  • Assume no transactions, for now
  • When it failed, the server may have been processing
    a call. If all of the call's operations are
    redoable, then just re-execute the call.
  • What if it performed non-redoable operations
    before it failed (increment inventory, print
    ticket, ...)?
  • Recover to a state after the last non-redoable
    operation
  • Requires careful bookkeeping during normal
    operation, so there's enough state to figure out
    how to recover
  • Checkpoint - an action to avoid redo during
    recovery
  • E.g. save the server's memory state to disk

9
Server Recovery (II)
  • Checkpoint before each non-redoable operation
  • At recovery time
  • Restore last checkpoint
  • Check if last non-redoable operation actually ran
  • E.g. save incremented inventory value in memory
    before checkpointing. At recovery compare memory
    and disk copies of inventory.
  • E.g. read ticket number from ticket printer
    before checkpointing. At recovery, read ticket
    number from ticket printer and compare to number
    before checkpoint.

10
Server-side Transactions
  • If servers run each client's call within a
    transaction, then at recovery time, abort all
    active transactions and re-run them. No
    checkpointing required.
  • It's as if the server checkpoints before Start
    transaction, and Commit is the non-redoable
    operation to check at recovery to see if it
    actually ran.
  • This assumes all transaction operations are
    undoable
  • Transaction recovery techniques are much cheaper
    than memory checkpointing.
  • Client performs non-redoable operations outside a
    transaction, and uses other checkpointing
    techniques

11
Transactions
  • Dependable computing
  • All objects managed by a server remain in a
    consistent state
  • when they are accessed by multiple clients
  • in the presence of failures
  • Protect the operations of clients from
    interference
  • Atomic operations at the server
  • Synchronized access to shared resources
  • Synchronization cooperation of clients
  • Wait/notify/notify_all, locks
  • Transactions
  • Predictable faults
  • Although errors can occur, they can be detected and
    dealt with before any incorrect behavior occurs
  • Stable storage processors

12
All-or-nothing (Exactly-Once)
  • A transaction either completes successfully
  • in which case the effects of all of its
    operations are recorded in the system's objects
  • or fails (or is deliberately aborted)
  • in which case it has no effect at all
  • Failure atomicity
  • Atomic effects, even if server crashes
  • Durability
  • State data are made permanent
  • Will survive if server process crashes
  • The intermediate effects of a transaction must
    not be visible to others

13
Serializable execution
  • Transactions are allowed to execute concurrently
  • but only as long as their overall effects are
    equivalent to some serial execution

14
Masking faults



  • Architecture Hardware Faults
  • Software Masks Environmental Faults
  • Distribution Maintenance
  • Software automates / eliminates operators
  • In the limit there are only software design
    faults.
  • Software-fault tolerance is the key to
    dependability

15
Fault Tolerance Techniques
  • FAIL FAST MODULES: work or stop
  • SPARE MODULES: instant repair time
  • INDEPENDENT MODULE FAILS by design: MTTF(pair) ≈
    MTTF^2 / MTTR (must have small MTTR)
  • MESSAGE BASED OS: fault isolation; software has
    no shared memory
  • SESSION-ORIENTED COMM: reliable messages; detect
    lost/duplicate messages; coordinate messages
    with commit
  • PROCESS PAIRS: mask hardware and software faults
  • TRANSACTIONS: A.C.I.D. (simple fault model)
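
To get a feel for the pair formula, the sketch below reads the slide's expression as MTTF(pair) ≈ MTTF^2 / MTTR for two independently failing modules. The function name and the sample numbers are assumptions chosen only for illustration.

```python
def pair_mttf(module_mttf_hours: float, mttr_hours: float) -> float:
    """Approximate MTTF of a pair of independently failing modules,
    reading the slide's formula as MTTF^2 / MTTR."""
    if mttr_hours <= 0:
        raise ValueError("MTTR must be positive")
    return module_mttf_hours ** 2 / mttr_hours


if __name__ == "__main__":
    # Hypothetical module: MTTF of 1,000 hours, repaired in 10 hours.
    print(pair_mttf(1000.0, 10.0))   # pair MTTF in hours
    # A tenfold slower repair costs a factor of ten in pair MTTF.
    print(pair_mttf(1000.0, 100.0))
```

The second call shows why the slide insists on a small MTTR: the benefit of pairing shrinks in direct proportion to repair time.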

16
Fail-Fast Repair
Lifecycle of a module: fail-fast gives short
fault latency. High availability is
low unavailability: Unavailability ≈ MTTR /
MTTF
  • Improving either MTTR or MTTF gives benefit
  • Simple redundancy is not enough on its own.

17
Atomic Actions
  • Either the action is executed completely (and
    successfully), or it has not happened at all.
  • If it is not successful, it has not left any side
    effects.
  • Except for simple instructions at the processor
    level, no operation is really atomic
  • Most operations are implemented by a sequence of
    more primitive operations.
  • Precautions have to be taken
  • The lower-level operations must not make any
    visible changes before it is clear that the
    top-level operation will succeed.
  • For any temporary changes the lower-level
    operations make, it must be made sure that they
    do not become visible, and that they can be
    revoked automatically should anything go wrong.

18
A Classification of Action Types
  • Unprotected actions: lack all of the ACID
    properties except for consistency
  • Unprotected actions are not atomic, and their
    effects cannot be depended upon.
  • Almost anything can fail.
  • Protected actions: actions that do not
    externalize their results before they are
    completely done
  • Their updates can be rolled back
  • Once they have reached their normal end, there
    will be no unilateral rollback.
  • Real actions: affect the real, physical world
  • consistent, isolated, durable
  • irreversible (in the majority of cases)

19
Sample Transaction Program (I)
exec sql begin declare section;
    long Aid, Bid, Tid, delta, Abalance;
exec sql end declare section;
DCApplication()
{
    read input msg;
    exec sql begin work;
    Abalance = DoDebitCredit(Bid, Tid, Aid, delta);
    send output msg;
    exec sql commit work;
}
20
Sample Transaction Program (II)
long DoDebitCredit(long Bid, long Tid, long Aid, long delta)
{
    exec sql update accounts
             set Abalance = Abalance + :delta where Aid = :Aid;
    exec sql select Abalance into :Abalance
             from accounts where Aid = :Aid;
    exec sql update tellers
             set Tbalance = Tbalance + :delta where Tid = :Tid;
    exec sql update branches
             set Bbalance = Bbalance + :delta where Bid = :Bid;
    exec sql insert into history(Tid, Bid, Aid, delta, time)
             values (:Tid, :Bid, :Aid, :delta, CURRENT);
    return(Abalance);
}
21
ACID Properties (I)
  • Atomicity 
  • State transition occurs without any observable
    intermediate states
  • ... or the system appears as though it never left
    the initial state
  • It holds whether the transaction, the entire
    application, the operating system, or other
    components function normally, function
    abnormally, or crash.
  • For a transaction to be atomic, it must behave
    atomically to any outside observer.

22
ACID Properties (II)
  • Consistency: A transaction produces consistent
    results only; otherwise it aborts.
  • A result is consistent if the new state of the
    database fulfills all the consistency constraints
    of the application
  • The program has functioned according to
    specification.

23
ACID Properties (III)
  • Isolation
  • A program running under transaction protection
    must behave exactly as it would in single-user
    mode.
  • That does not mean transactions cannot share data
    objects.
  • The definition of isolation is based on
    observable behavior from the outside, rather than
    on what is going on inside.

24
ACID Properties (IV)
  • Durability: The results of transactions that have
    completed successfully must not be forgotten by
    the system
  • Once the system has acknowledged the execution of
    a transaction, it must be able to reestablish its
    results after any type of subsequent failure,
    whether caused by the user, the environment, or
    the hardware components.

25
Concurrency control (I)
  • Lost update
  • 3 accounts (A, B, C)
  • with balances 100, 200, 300
  • T1 transfers from A to B, for a 10% increase
  • T2 transfers from C to B, for a 10% increase
  • Both T1, T2 read balance of B (200)
  • T1 overwrites the update by T2
  • Without seeing it

Transactions should not read a stale value
and use it in computing a new value
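
The lost update can be replayed deterministically. The sketch below assumes each transfer raises B's balance by 10 percent (one reading of the slide's "10 increase") and uses integer arithmetic. The function names are hypothetical.

```python
def raise_by_10_percent(balance: int) -> int:
    # Assumption: each transfer raises B's balance by 10%.
    return balance * 11 // 10


def interleaved(balance_b: int) -> int:
    """Both transactions read B before either writes: T2's update is lost."""
    t1_read = balance_b                      # T1 reads 200
    t2_read = balance_b                      # T2 reads 200 (soon to be stale for T1)
    balance_b = raise_by_10_percent(t2_read) # T2 writes 220
    balance_b = raise_by_10_percent(t1_read) # T1 overwrites with 220, never seeing T2
    return balance_b


def serial(balance_b: int) -> int:
    """T1 runs to completion, then T2 reads T1's result."""
    return raise_by_10_percent(raise_by_10_percent(balance_b))


if __name__ == "__main__":
    print("interleaved:", interleaved(200))
    print("serial:     ", serial(200))
```

The two results differ because T1 computed its new value from a balance that T2 had already replaced.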
26
Concurrency control (II)
  • Inconsistent retrievals
  • T1 transfers 10 of account A to account B
  • T2 computes sum of account balances
  • T2 computes sum before T1 updates B

Update transactions should not interfere with
retrievals.
In general: Transactions should not violate
operation conflict rules.
27
Concurrency control (III)
Serial equivalence: criterion for correct
concurrent execution
T1 is serially equivalent with T2 iff all pairs of
conflicting operations of the two transactions
are executed in the same order at all objects
that both transactions access.
  • 3 approaches to CC
  • Locking
  • Optimistic CC
  • Timestamp ordering

Txs wait for one another OR Restart Txs after
conflicts have been detected
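
The serial equivalence criterion can be checked mechanically for a given interleaving. In this sketch a schedule is a list of `(transaction, operation, object)` tuples, and two operations conflict when they touch the same object and at least one is a write. The representation and function names are assumptions made for the example.

```python
def conflicts(op_a: str, op_b: str) -> bool:
    # Read/read pairs do not conflict; any pair involving a write does.
    return op_a == "write" or op_b == "write"


def serially_equivalent(schedule, t1, t2) -> bool:
    """True if every conflicting pair of operations of t1 and t2
    is executed in the same transaction order."""
    first_in_pair = set()
    for i, (tx_a, op_a, obj_a) in enumerate(schedule):
        for tx_b, op_b, obj_b in schedule[i + 1:]:
            if {tx_a, tx_b} == {t1, t2} and obj_a == obj_b and conflicts(op_a, op_b):
                first_in_pair.add(tx_a)  # which transaction went first
    return len(first_in_pair) <= 1


if __name__ == "__main__":
    lost_update = [("T1", "read", "B"), ("T2", "read", "B"),
                   ("T2", "write", "B"), ("T1", "write", "B")]
    serial = [("T1", "read", "B"), ("T1", "write", "B"),
              ("T2", "read", "B"), ("T2", "write", "B")]
    print(serially_equivalent(lost_update, "T1", "T2"))
    print(serially_equivalent(serial, "T1", "T2"))
```

In the lost-update schedule, T1's read precedes T2's write but T2's read precedes T1's write, so the conflicting pairs are ordered inconsistently.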
28
Common Performance Problems
  • Convoys on semaphores or high-traffic locks
  • Log semaphore is hotspot
  • Sequential insert is hotspot
  • Lock manager costs too much. A good number: 300
    instructions for lock+unlock (no wait case)
  • File or page granularity locking causes hotspot
    for small files

29
Comparisons of CC methods
  • Order of serialization
  • Timestamp-based: static
  • when Txs begin
  • 2-phase locking: dynamic
  • based on access pattern
  • 2-phase locking is better when the frequency of
    updates is high
  • Otherwise, timestamp-based and optimistic perform
    better
  • Handling conflicts
  • 2-phase locking: block (danger of deadlock)
  • Timestamp-based: immediate abort

30
Challenging applications
  • Multi-user, collaborative
  • Atomic updates, in the presence of concurrency and
    server crashes
  • Must also support notification of changes and
    access to work-in-progress (sharing)
  • Relaxed isolation guarantees
  • Long-lasting Txs
  • CAD/CAM, software development
  • Independent versions of objects
  • Check-in / check-out / merge operations

31
Recoverability from aborts (I)
  • Servers must prevent an aborting Tx from affecting
    other concurrent Txs.
  • Dirty reads
  • T2 sees the result of an update by T1 on account A
  • T2 performs its own update on A, then commits.
  • T1 aborts -> T2 has seen a transient value
  • T2 is not recoverable
  • If T2 delays its commit until T1's outcome is
    resolved
  • Abort(T1) -> Abort(T2)
  • However, if T3 has seen results of T2
  • Abort(T2) -> Abort(T3) !
  • Cascading aborts

Txs should only read values written by committed
Txs
32
Recoverability from aborts (II)
  • Premature writes
  • Assume server implements abort by maintaining the
    before image of all update operations
  • T1 and T2 both update account A
  • T1 completes its work before T2
  • If T1 commits and T2 aborts, the balance of A is
    correct
  • If T1 aborts and T2 commits, the before image
    that is restored corresponds to the balance of A
    before T2
  • If both T1 and T2 abort, the before image that is
    restored corresponds to the balance of A as set
    by T1

Txs should be delayed until earlier Txs that
update the same objects have been either
committed or aborted.
33
Recoverability from aborts (III)
  • Txs should delay both their reads and updates in
    order to avoid interference
  • Strict execution -> enforce isolation
  • Servers should maintain tentative versions of
    objects in volatile memory

Txs should be delayed until earlier Txs that
update the same objects have been either
committed or aborted.
34
Interface of transaction coordinator
openTransaction() -> trans
    starts a new transaction and delivers a unique TID trans. This
    identifier will be used in the other operations in the transaction.
closeTransaction(trans) -> (commit, abort)
    ends a transaction: a commit return value indicates that the
    transaction has committed; an abort return value indicates that
    it has aborted.
abortTransaction(trans)
    aborts the transaction.
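
The coordinator interface might be sketched as below. Only the three operation names come from the slide. The class name, the snake_case method names, and the status bookkeeping are assumptions, and the sketch omits the actual commit protocol.

```python
import itertools


class Coordinator:
    """Toy transaction coordinator exposing open/close/abort."""

    def __init__(self):
        self._next_tid = itertools.count(1)
        self._status = {}  # TID -> "active" | "committed" | "aborted"

    def open_transaction(self) -> int:
        # Starts a new transaction and delivers a unique TID.
        tid = next(self._next_tid)
        self._status[tid] = "active"
        return tid

    def close_transaction(self, tid: int) -> str:
        # Returns "commit" if the transaction committed, "abort" otherwise.
        if self._status.get(tid) == "active":
            self._status[tid] = "committed"
            return "commit"
        return "abort"

    def abort_transaction(self, tid: int) -> None:
        # An already committed transaction cannot be aborted in this sketch.
        if self._status.get(tid) == "active":
            self._status[tid] = "aborted"


if __name__ == "__main__":
    coordinator = Coordinator()
    t1 = coordinator.open_transaction()
    t2 = coordinator.open_transaction()
    coordinator.abort_transaction(t2)
    print(coordinator.close_transaction(t1), coordinator.close_transaction(t2))
```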
35
Outcomes of a Flat Transaction
36
Stateless Servers (I)
  • Assume requests run as transactions
  • Two types of servers
  • application server - application logic
  • resource manager - shared resources (e.g. disks)

  • Application server is stateless if it
  • runs in transactions
  • maintains all state in resource mgrs
  • E.g. start Tx, call resource mgrs, commit
  • It doesn't need to recover. It only needs to know
    which request it was processing (to redo it).

[Figure labels: Client, Application Server, Resource Manager, Resource]
37
Stateless Servers (II)
  • Resource mgr runs all requests in Txs
  • Needs to be capable of performing recovery work
  • abort all partially completed transactions
  • ensure resources include all the results of all
    committed transactions (atomic and durable)
  • database recovery algorithms (complex!)

38
Stateful Servers (I)
  • Sometimes a server maintains state on a client's
    behalf. E.g.,
  • Server scans a file. Each time it hits a relevant
    record it returns it. Next call picks up the scan
    where it left off.
  • cursors
  • Server maintains lots of user information, which
    is too expensive to reconstruct on every call.
  • Approach 1: Client passes state to server on each
    call, and server returns it on each reply. Server
    retains no state.
  • This is the default assumption outside TP, but
    doesn't work well for TP.
  • Note that transaction id context is handled this
    way.

39
Stateful Servers (II)
  • Approach 2: server maintains state, indexed by
    client id (e.g. transaction id). Subsequent RPCs
    from the client must go to the same server and
    pick up the retained context.
  • RPC can provide a binding handle for subsequent
    calls. This ensures later calls go to the same
    server.
  • If the client fails (e.g. it aborts the
    transaction), server must be notified to release
    the client's context
  • it's just like a resource manager that releases
    locks
  • so encapsulate context as a (volatile) resource
  • If state must be maintained across transaction
    boundaries, then treat it like any resource
    manager (e.g. DBMS)

40
Fault Tolerance (I)
  • What to do if a client doesn't receive a reply
    within its timeout period?
  • Why not just retry?
  • In TP, many RPC calls are not idempotent
  • Idempotent: any number of operation executions
    has the same effect as one execution
  • Queries (read-only) are idempotent, but not most
    updates
  • Send a ping for non-idempotent calls
  • After giving up, ignore late-arriving responses
  • Can't assume that the call didn't run, so usually
    requires aborting the transaction that made the
    call (up to the application)

41
Fault Tolerance (II)
  • Interface definition can say whether server is
    idempotent
  • could even be done per member function
  • More abstract view
  • RPC executes idempotent calls at least once
  • RPC executes non-idempotent calls at most once
  • If the goal is exactly once, execute the RPC
    within a transaction and use transaction retry
    logic to ensure the transaction actually runs
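
One common way to get at-most-once execution for a non-idempotent call is to tag each request with an id and have the server remember the replies it has already produced. The slides do not prescribe this mechanism. The sketch below is an assumption-laden illustration with hypothetical names, and it keeps the reply table in memory, so it does not survive a server crash.

```python
class AtMostOnceServer:
    """Executes each request id at most once; retries get the cached reply."""

    def __init__(self, handler):
        self.handler = handler
        self.replies = {}  # request id -> reply already sent

    def call(self, request_id, argument):
        if request_id not in self.replies:
            # First time this id is seen: actually execute the operation.
            self.replies[request_id] = self.handler(argument)
        # Duplicate (retried) requests return the saved reply without re-running.
        return self.replies[request_id]


if __name__ == "__main__":
    inventory = {"widgets": 0}

    def increment(amount):  # non-idempotent: running twice double-counts
        inventory["widgets"] += amount
        return inventory["widgets"]

    server = AtMostOnceServer(increment)
    server.call("req-1", 5)
    server.call("req-1", 5)  # client retry after a timeout
    print(inventory["widgets"])
```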

42
Fault Tolerance By Logging Device I/O
  • Consider a transaction all of whose operations
    are undoable.
  • Log all of the transaction's interaction with the
    outside world.
  • If the transaction fails, replay it.
  • During the replay,
  • get input from the log
  • validate that output is identical to what was
    logged.
  • If the output diverges from the log, then start
    asking for live input (and then ignore rest of
    the log).
  • A variation of this is used by Digital's RTR

43
Transparent Transaction IDs
  • Ideally, Start returns a transaction ID that's
    hidden from the caller
  • Procedures don't need to explicitly pass
    transaction ids.
  • Easier and avoids errors
  • Moreover, when a transaction first arrives at a
    site, the local transaction manager needs to be
    notified.
  • Application shouldn't need to deal with this
  • This is what makes RPC (or other paradigms)
    transactional.

44
Implementing transactions
45
Private Workspace
  • The file index and disk blocks for a three-block
    file
  • The situation after a transaction has modified
    block 0 and appended block 3
  • After committing

46
Writeahead Log
47
Locking schemes
  • Give each Tx the illusion that there are no
    concurrent updates
  • Hide concurrency anomalies.
  • Do it automatically
  • Apps do not know transaction semantics
  • Goal
  • Although there is concurrency in the system, execution
    is equivalent to some serial execution of the
    system
  • Not deterministic outcome, just a consistent
    transformation

48
Two-Phase Locking
49
Strict Two-Phase Locking
50
Deadlock Detection
  • Deadlock: a cycle in the wait-for graph
  • Kinds of waits: database locks, terminal/device,
    storage, session, server
  • Correct detection must get the complete graph
  • Not likely, so always fall back on timeout
  • Model of deadlock shows: waits are rare, deadlocks
    are rare^2 (very, very rare), and virtually all cycles
    are of length 2, so do depth-first search either as
    soon as you wait OR after a timeout
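
Detecting a deadlock as a cycle in the wait-for graph, using the depth-first search the slide mentions, could look roughly like this. The graph representation (a dict mapping each transaction to the set of transactions it waits for) and the function name are assumptions.

```python
def find_cycle(wait_for, start):
    """Depth-first search from `start`; returns the list of transactions on a
    cycle through `start`, or None if `start` is not part of a deadlock."""
    path = []
    visited = {start}

    def dfs(tx):
        path.append(tx)
        for waited_on in sorted(wait_for.get(tx, ())):
            if waited_on == start:
                return True          # edge back to the start closes a cycle
            if waited_on not in visited:
                visited.add(waited_on)
                if dfs(waited_on):
                    return True
        path.pop()                   # dead end: tx is not on a cycle via this path
        return False

    return list(path) if dfs(start) else None


if __name__ == "__main__":
    # T1 waits for T2 and T2 waits for T1: the typical length-2 cycle.
    print(find_cycle({"T1": {"T2"}, "T2": {"T1"}}, "T1"))
    # A chain of waits with no cycle.
    print(find_cycle({"T1": {"T2"}, "T2": {"T3"}}, "T1"))
```

Running the search from the transaction that has just started waiting (or has just timed out) matches the slide's advice and avoids scanning the whole graph.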

51
Model of Deadlock (I)
  • R = number of objects (locks)
  • r = objects locked per transaction
  • N+1 concurrent transactions
  • ASSUME
  • Transaction is LOCKr lock steps, then commit
  • Uniform distribution
  • exclusive locks only
  • Nr << R

Prob. a request waits
Prob. a transaction waits
52
Model of Deadlock (II)
  • Probability of cycle length 2 length 3 ...


Prob. transaction deadlocks (PD): assumes all
cycles are of length 2
System deadlock rate is N+1 times higher
Conclusions: control transaction size and
duration; limit multiprogramming
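
The formulas for this model did not survive in the transcript. The block below is a reconstruction under the stated assumptions (uniform access, exclusive locks only, Nr << R, one transaction competing with N others), following the usual textbook derivation rather than the original slide. It assumes each of the other N transactions holds about r/2 locks on average, and that a transaction deadlocks only through a cycle of length 2.

```latex
\begin{align*}
  PW(\text{request}) &\approx \frac{N \cdot r/2}{R} = \frac{N r}{2R}
      && \text{(locks held by others / all objects)} \\
  PW(T) &= 1 - \bigl(1 - PW(\text{request})\bigr)^{r} \approx \frac{N r^{2}}{2R}
      && \text{($r$ requests per transaction)} \\
  PD(T) &\approx PW(T) \cdot \frac{PW(T)}{N} = \frac{N r^{4}}{4R^{2}}
      && \text{(the one we wait for also waits for us)} \\
  PD(\text{system}) &\approx (N+1) \cdot PD(T) \approx \frac{N^{2} r^{4}}{4R^{2}}
\end{align*}
```

Under these approximations the deadlock probability grows with the fourth power of transaction size r and the square of the multiprogramming level, which is consistent with the slide's conclusions about controlling both.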
53
Atomicity of multi-server transactions
  • Goal is to ensure the atomicity of a Tx that
    accesses multiple resource managers
  • Data
  • Messages
  • Anything shared by Txs
  • Problems
  • What if resource manager RMi fails after a
    transaction commits at RMk?
  • What if other RMs are down when RMi recovers?
  • What if a Tx thinks RMi failed and therefore
    aborted, when it actually is still running?

54
Transactional Communications
  • Three paradigms for communications between
    application programs in transactions
  • remote procedure call - procedure calls between
    address spaces
  • peer-to-peer messages - send-message /
    receive-message
  • queues - enqueue, dequeue to a shared queue
  • These paradigms are not unique to TP, but they
    all have TP-specific aspects

55
Remote Procedure Call
  • Program calls remote procedure the same way it
    would call a local procedure
  • Variation: asynchronous call and return, for
    single-threaded client
  • Most widely-used standard is RPC in OSF/DCE
  • Hides certain underlying complexities
  • communications and message ordering errors
  • data representation differences between programs

56
Desirable RPC Features
  • A way of pipelining large parameters on call or
    return (e.g. for queries). Pipe in DCE/RPC.
  • pass a handle as parameter, with a type, so
    client and server agree on what's being passed
  • receiver can claim pieces, a chunk-at-a-time
  • Callbacks - a server calls a procedure in the
    client
  • essentially a reverse RPC
  • requires another controlled binding to the right
    client entry point
  • useful for controlled conversational access
    between server and client
  • not supported in DCE RPC

57
RPC Performance
  • RPC costs
  • marshaling and unmarshaling
  • RPC runtime and network protocol
  • physical wire transfer
  • In the remote case, these are about equal (but
    people are working to do better)
  • Typical commercial numbers are 10-15K machine
    instructions
  • Can do much better in the local case by avoiding
    a full context switch

58
IBM's LU6.2 Peer-to-Peer
  • De-facto standard
  • APPC and CPI-C APIs
  • Programs establish conversations (i.e. session)
    via Allocate
  • Close the conversation with Deallocate
  • Then send and receive messages over the
    conversation using Send_Data and Receive_Data
  • Uses the chained transaction model. Announce
    transaction done using Syncpoint or Backout.
  • One pipe model - data (send/receive) and control
    (2-phase commit) messages flow on the same
    session.

59
Conversations Two-way Alternate
  • A conversation is half-duplex.
  • Reflects the call-return style of most TP
    communications
  • One participant is in send mode and the other is
    in receive mode.
  • The sender must explicitly turn over send control
    to the receiver, in a Send_Data call.
  • The receiver can't start sending until it
    receives from the sender (using Receive_Data) a
    send-mode signal (a.k.a. polarity indicator)

60
Conversation Trees
  • When a program issues an Allocate(program-name),
    the called instance of program-name becomes a
    child of the caller
  • Thus Allocate calls from programs cause a
    conversation tree to develop
  • E.g. A calls Allocate(B); B calls Allocate(C),
    Allocate(D), and Allocate(A)

[Figure: conversation tree with root A, its child B, and B's children C, D, and a second instance of A]
61
Synchronization Levels
  • There are 3 levels of synchronization in LU6.2
  • Level 2 - programs in the conversation tree
    execute in a transaction. Each program explicitly
    says when to commit by issuing Syncpoint.
  • Level 1 - No transactions. Each program can
    acknowledge receipt of a message by issuing a
    Confirm signal, which is meant to indicate that
    the program has processed the message(s).
  • Level 0 - No transactions. No confirm. Just send
    and receive message over a conversation.
  • Many non-IBM implementations are level 0.

62
Syncpoint Rules (I)
  • A program issues Syncpoint to announce it's done
    with its part of the transaction
  • Causes Syncpoint message to propagate to its
    neighbors in the conversation tree.
  • A program can issue Syncpoint if either
  • all of its conversations are in send mode, and it
    has not received a Syncpoint request over any
    other conversation, or
  • all but one of its conversations are in send
    mode, and it received a Syncpoint over the
    receive-mode conversation
  • Syncpoint blocks the caller until the whole
    transaction is committed or aborted (return code
    tells which).
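
The two conditions under which a program may issue Syncpoint can be written as a small predicate. This is only a reading of the rules on this slide. The representation of a conversation as a `(mode, syncpoint_received)` pair and the function name are assumptions.

```python
def may_issue_syncpoint(conversations) -> bool:
    """conversations: list of (mode, syncpoint_received) pairs,
    where mode is "send" or "receive"."""
    in_receive_mode = [c for c in conversations if c[0] == "receive"]
    any_syncpoint_received = any(received for _, received in conversations)

    if not in_receive_mode:
        # Rule 1: all conversations in send mode, and no Syncpoint received.
        return not any_syncpoint_received
    if len(in_receive_mode) == 1:
        # Rule 2: all but one in send mode, and a Syncpoint arrived
        # over the single receive-mode conversation.
        return in_receive_mode[0][1]
    return False  # two or more receive-mode conversations: neither rule applies


if __name__ == "__main__":
    print(may_issue_syncpoint([("send", False), ("send", False)]))
    print(may_issue_syncpoint([("send", False), ("receive", True)]))
    print(may_issue_syncpoint([("receive", True), ("receive", True)]))
```

The last call illustrates a program that has received Syncpoint over two receive-mode conversations: it satisfies neither rule.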

63
Syncpoint Rules (II)
  • Next statement is part of a new transaction
    (chained model)
  • all programs in the conversation are part of the
    same new transaction (chaining is in the
    protocol, not just the API)
  • Eliminates some but not all protocol errors.
    E.g.,
  • A and C are in send mode to B, and no Syncpoints
    yet
  • A and C issue Syncpoint, which collides at B
  • B is stuck and will never satisfy the rules
  • LU6.2 is commit-from-anywhere. I.e. any program
    in the conversation tree can be the first to call
    Syncpoint. It needn't be the root of the
    conversation tree.

64
Stateful Programs (I)
  • Connection-oriented communications model
  • A conversation names some shared state between
    the communicating programs
  • direction of communications
  • direction of the link
  • transaction id
  • state of the transaction
  • Since programs hold conversations across message
    exchanges, they may rely on each other's retained
    state from previous message exchanges.

65
Stateful Programs (II)
  • E.g., P1 has a connection to P2. P1 scans a file
    owned by P2. P2 maintains a cursor (retained
    state), indicating P1's position in the file.
  • Since connections aren't recoverable across
    system failures, programs must be able to
    reconstruct retained state after they recover.
  • I.e. after each transaction commits or aborts
  • When a session is lost, programs must be able to
    release retained state (needed anyway to abort
    automatically when a program fails)

66
Message Passing Flexibility (I)
  • Request-reply protocols (RPC) require programs to
    properly nest their request and reply messages.
  • Example - Request-reply matching

[Figure: time diagram for programs A, B, and C showing nested Call, Call, Return, Return messages]
67
Message Passing Flexibility (II)
  • But peer-to-peer allows arbitrary message flows
    between the two parties to a conversation
  • Example - peer-to-peer message passing

[Figure: time diagram for programs A, B, and C showing Send and Rcv messages]
  • To communicate with an application that uses
    peer-to-peer, you must know the message flows
    (protocol) that the application expects
68
Termination Model (I)
  • In RPC, a program normally announces termination
    by returning to its caller
  • It must not return until all of its outbound
    calls have returned
  • In peer-to-peer, a program announces termination
    by invoking Syncpoint.
  • This also tells the program's transaction to
    start committing, but each program decides
    independently when to commit (by issuing
    Syncpoint)
  • Termination errors are the price of more message
    passing flexibility ...

69
Termination Model (II)
  • Certain programming errors are possible in
    peer-to-peer
  • P1 invokes Syncpoint, but P2 is waiting for a
    message from P1. P1 and P2 are deadlocked.
  • P2 gives up waiting for P1's message, so P2
    invokes Syncpoint. P2 must be ready for P1's
    message after Syncpoint returns.

70
Connection Models
  • To cope with stateful servers, both models need a
    way to manage shared state.
  • In peer-to-peer, the state is implicitly attached
    to conversation (session) context
  • In RPC, it is either exchanged in parameters or a
    session is created above the communications layer
    using a binding handle.
  • In both models, we need to clean up retained
    state after a failure and need to reconstruct
    shared state at appropriate times.

71
Multi-Transaction Requests
  • Some requests cannot execute as one transaction
    because
  • it executes too long (causing lock contention)
  • resources don't support a compatible 2-phase
    commit protocol.
  • Transaction may run too long because
  • It requires display I/O with user
  • People or machines are unavailable (hotel
    reservation system, manager who approves the
    request)
  • It requires long-running real-world actions (e.g.
    get 2 estimates before settling an insurance
    claim)
  • Subsystems transactions must be ACID (placing an
    order, scheduling a shipment, reporting
    commission)

72
Workflow
  • A multi-transaction request is called a workflow
  • Specialized workflow products are being offered.
  • IBM Flowmark, Action, JetForm, Wang/Kodak, ...
  • They have special features, such as
  • Flow-graph language for describing processes
    consisting of steps, with preconditions for
    moving between steps
  • representation of organizational structure and
    roles (manual step can be performed by a person
    in a role, with complex role resolution
    procedure)
  • tracing of steps, locating in-flight workflows
  • ad hoc workflow, integrated with e-mail (case
    mgmt)

73
Managing Workflow with Queues
  • Each workflow step is a request
  • Send the request to the queue of the server that
    can process the request
  • Server outputs request(s) for the next step(s) of
    the workflow

74
Workflows Can Violate Atomicity and Isolation
  • Since a workflow runs as many transactions,
  • it may not be serializable relative to other
    workflows
  • it may not be all-or-nothing
  • Consider a money transfer run as 2 Txs, T1 and T2
  • Conflicting money transfers could run between T1
    and T2
  • A failure after T1 might prevent T2 from running
  • These problems require application-specific logic
  • E.g. T2 must send an ack to T1's node. If T1's node
    times out waiting for the ack, it takes action,
    possibly compensating for T1

75
Automated Compensation
  • Saga: workflow specification, where for each step
    we identify a compensation.
  • If a workflow stops making progress, run
    compensations for all committed steps, in reverse
    order (like Tx abort).
  • Need to ensure that each compensation's input is
    available (e.g. log it) and that it definitely
    can run (enforce constraints until workflow
    completes).
  • Concept is still at the research stage.
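
The saga idea, running compensations for committed steps in reverse order once the workflow stops making progress, can be sketched as follows. Treating a raised exception as "stopped making progress", and the names `run_saga`, `action`, and `compensation`, are assumptions for this illustration. A real system would also log each compensation's input durably.

```python
def run_saga(steps) -> bool:
    """steps: list of (action, compensation) callables, one pair per workflow step.
    Returns True if every step committed, False if the saga was compensated."""
    compensations = []  # compensations of committed steps, in commit order
    for action, compensation in steps:
        try:
            action()  # each step runs as its own transaction
        except Exception:
            # Undo committed steps in reverse order, like a Tx abort.
            for undo in reversed(compensations):
                undo()
            return False
        compensations.append(compensation)
    return True


if __name__ == "__main__":
    log = []

    def ship_order():
        raise RuntimeError("shipper unavailable")

    steps = [
        (lambda: log.append("place order"), lambda: log.append("cancel order")),
        (lambda: log.append("charge card"), lambda: log.append("refund card")),
        (ship_order, lambda: log.append("recall shipment")),
    ]
    print(run_saga(steps), log)
```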

76
Pseudo-conversations
  • A conversational transaction interacts with its
    user during its execution.
  • Since this is long-running, it should run as
    multiple requests
  • Since there are exactly two participants, just
    pass the request back and forth
  • request carries all workflow context
  • request is recoverable, e.g. send/receive is
    logged or request is stored in shared disk area
  • This is a simpler mechanism than queues