Title: The Future of Distributed Systems

Provided by: ellen150
Learn more at: http://research.microsoft.com
1
The Future of Distributed Systems
Jim Gray, Researcher, Microsoft Corp.
Gray@Microsoft.com
2
Outline
  • Global forces
  • Moore's, Metcalfe's, Bell's, Bill's, Andy's laws
  • Micro dollars per transaction
  • Cyber-content is key value because distribution
    costs go to zero
  • Distributed Systems Concepts and terms
  • Key software technologies
  • objects, transactions

3
Metcalfe's Law: Network Utility = Users²
  • How many connections can it make?
  • 1 user: no utility
  • 100,000 users: a few contacts
  • 1 million users: many on Net
  • 1 billion users: everyone on Net
  • That is why the Internet is so hot
  • Exponential benefit

4
Moore's First Law
  • XXX doubles every 18 months: 60% increase per
    year
  • Micro processor speeds
  • Chip density
  • Magnetic disk density
  • Communications bandwidth: WAN bandwidth
    approaching LAN speeds
  • Exponential growth
  • The past does not matter
  • 10x here, 10x there, soon you're talking REAL
    change
  • PC costs decline faster than any other platform
  • Volume and learning curves
  • PCs will be the building bricks of all future
    systems

[Figure: one-chip memory size by year, 1970 to 2000. Chip capacity axis: 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M bits. Memory size axis: 8 KB to 1 GB. Annotation: 1 chip memory size (2 MB to 32 MB).]
5
Bumps In The Moore's Law Road
  • DRAM
  • 1988: United States anti-dumping
    rules
  • 1993-1995: ? price flat
  • Magnetic disk
  • 1965-1989: 10x/decade
  • 1989-1996: 4x/3 years! 100x/decade

6
Gordon Bell's Seven Price Tiers
  • $10: wrist watch computers
  • $100: pocket/palm computers
  • $1,000: portable computers
  • $10,000: personal computers (desktop)
  • $100,000: departmental computers
    (closet)
  • $1,000,000: site computers (glass house)
  • $10,000,000: regional computers (glass
    castle)

Super server: costs more than $100,000
Mainframe: costs more than $1 million
Must be an array of processors, disks, tapes, comm ports
7
Bell's Evolution Of Computer Classes
Technology enables two evolutionary paths:
1. constant performance, decreasing cost
2. constant price, increasing performance
[Figure: log price versus time for computer classes: Mainframes (central), Minis (dept.), WSs, PCs (personals), ??]
1.26 = 2x/3 yrs = 10x/decade; 1/1.26 = .8
1.6 = 4x/3 yrs = 100x/decade; 1/1.6 = .62
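The growth factors quoted on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration; the helper name `compound` is an assumption and does not come from the talk.

```python
def compound(annual_factor: float, years: int) -> float:
    """Total improvement after compounding `annual_factor` for `years` years."""
    return annual_factor ** years


for factor in (1.26, 1.6):
    print(
        f"{factor}x per year: "
        f"{compound(factor, 3):.2f}x per 3 years, "
        f"{compound(factor, 10):.1f}x per decade, "
        f"cost of constant performance shrinks to {1 / factor:.2f} each year"
    )
```

A yearly factor of 1.26 roughly doubles in three years, and a yearly factor of 1.6 roughly quadruples in three years, which matches the two evolutionary paths named on the slide.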
8
Software Economics
  • An engineer costs about $150,000/year
  • R&D gets 5%-15% of budget
  • Need $3 million to $1 million revenue per
    engineer

[Pie chart: Microsoft, $9 billion: Profit 24%, R&D 16%, SG&A 34%, Tax 13%, Product and Service 13%]
[Pie charts for Intel ($16 billion), IBM ($72 billion), and Oracle ($3 billion) with slices for Profit, R&D, SG&A, Tax, and P&S; slice values as transcribed: Profit 15, 6, 22; R&D 9, 8; Tax 7; SG&A 11, 12; P&S 59, 47, 26; unlabeled 43]
9
Software Economics: Bill's Law
Price = Fixed_Cost / Units + Marginal_Cost
  • Bill Joy's law (Sun): don't write software for
    less than 100,000 platforms @ $10 million
    engineering expense, $1,000 price
  • Bill Gates's law: don't write software for less
    than 1,000,000 platforms @ $10M engineering
    expense, $100 price
  • Examples
  • UNIX versus Windows NT: $3,500 versus $500
  • Oracle versus SQL-Server: $100,000 versus $6,000
  • No spreadsheet or presentation pack on
    UNIX/VMS/...
  • Commoditization of base software and hardware
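The volume argument behind both laws can be worked through numerically. This is a rough sketch under stated assumptions: it reads the slide's figure as price = fixed cost / units + marginal cost, takes the engineering expense as $10 million in both cases, and sets marginal cost to zero. The function name `unit_cost` is hypothetical.

```python
def unit_cost(fixed_cost: float, units: int, marginal_cost: float = 0.0) -> float:
    """Cost that each unit must recover: amortized fixed cost plus marginal cost."""
    return fixed_cost / units + marginal_cost


# Assumed inputs: $10 million engineering expense, zero marginal cost.
joy = unit_cost(10_000_000, 100_000)      # 100,000 platforms
gates = unit_cost(10_000_000, 1_000_000)  # 1,000,000 platforms

print(f"Engineering cost per unit at 100,000 platforms: ${joy:,.0f}")
print(f"Engineering cost per unit at 1,000,000 platforms: ${gates:,.0f}")
```

Under these assumptions, ten times the volume cuts the engineering cost carried by each unit by ten times, which is consistent with the gap between the $1,000 and $100 price points on the slide.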

10
Gordon Bell's Platform Economics
  • Traditional computers: custom or semi-custom,
    high-tech and high-touch
  • New computers: high-tech and no-touch

[Figure: price (K$), volume (K), and application price by computer type (Mainframe, WS, Browser), on a log scale from 0.01 to 100,000]
11
Grove's Law: The New Computer Industry
  • Horizontal integration is new structure
  • Each layer picks best from lower layer
  • Desktop (C/S) market
  • 1991: 50%
  • 1995: 75%

Function          Example
Operation         AT&T
Integration       EDS
Applications      SAP
Middleware        Oracle
Baseware          Microsoft
Systems           Compaq
Silicon & Oxide   Intel & Seagate
12
Outline
  • Global forces
  • Moore's, Metcalfe's, Bell's, Bill's, Andy's laws
  • Micro dollars per transaction
  • Cyber-content is key value because distribution
    costs go to zero
  • Distributed Systems Concepts and terms
  • Key software technologies
  • objects, transactions

13
1987: 256 tps Benchmark
  • $14 M computer (Tandem)
  • A dozen people
  • False floor, 2 rooms of machines

[Figure: machine room with a 32 node processor array, a 40 GB disk array (80 drives), and simulation of 25,600 clients. Staff: manager, admin expert, hardware experts, auditor, network expert, performance expert, OS expert, DB expert]
14
1988: DB2 + CICS Mainframe, 65 tps
  • IBM 4391
  • Simulated network of 800 clients
  • $2 M computer
  • Staff of 6 to do benchmark

[Figure: refrigerator-sized CPU, 2 x 3725 network controllers, 16 GB disk farm (4 x 8 x .5 GB)]
15
1997: 10 years later, 1 Person and 1 box = 1250 tps
  • 1 Breadbox ~ 5x 1987 machine room
  • 23 GB is hand-held
  • One person does all the work
  • Cost/tps is 1,000x less: 1 micro dollar per
    transaction

[Figure: one box with 4 x 200 MHz CPU, 1/2 GB DRAM, 12 x 4 GB disk, plus 3 x 7 x 4 GB disk arrays; one person acting as hardware expert, OS expert, net expert, DB expert, and app expert]
16
What Happened?
  • Moore's law: Things get 4x better every 3
    years (applies to computers, storage, and
    networks)
  • New Economics: Commodity

    class           price ($/mips)   software (k$/year)
    mainframe       10,000           100
    minicomputer    100              10
    microcomputer   10               1

  • GUI: Human - computer tradeoff; optimize for
    people, not computers

17
What Happens Next
  • Last 10 years: 1000x improvement
  • Next 10 years: ????
  • Today: text and image servers are free
    1 m/hit cost, 70,000 m/hit advertising
    revenue
  • Advertising pays for them
  • Content is only real expense
  • You ain't seen nothing yet!

18
Kinds Of Information Processing

               Point-to-point        Broadcast
Immediate      Conversation, Money   Lecture, Concert   (Network)
Time-shifted   Mail                  Book, Newspaper    (Database)

It's ALL going electronic.
Immediate is being stored for analysis (so ALL database).
Analysis and automatic processing are being added.
19
Why Put Everything In Cyberspace?
  • Low rent: min $/byte
  • Shrinks time: now or later
  • Shrinks space: here or there
  • Automate processing: knowbots

[Figure: Network (point-to-point OR broadcast; immediate OR time-delayed) and Database (Locate, Process, Analyze, Summarize)]
20
Billions Of Clients
  • Every device will be intelligent
  • Doors, rooms, cars
  • Computing will be ubiquitous

21
Billions Of Clients Need Millions Of Servers
  • All clients networked to servers
  • May be nomadic or on-demand
  • Fast clients want faster servers
  • Servers provide
  • Shared Data
  • Control
  • Coordination
  • Communication

Clients
Mobile clients
Fixed clients
Servers
Server
Super server
22
Thesis: Many little beat few big
[Figure: processor classes by price ($1 million, $100 K, $10 K): Mainframe, Mini, Micro, Nano, Pico Processor. Storage hierarchy: 10 pico-second RAM, 1 MB, 100 MB, 10 GB, 1 TB, 100 TB. Disk form factors: 14", 9", 5.25", 3.5", 2.5", 1.8". Notes: 1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk RAM; event-horizon on chip; VM reincarnated; multi-program cache; on-chip SMP]
  • Smoking, hairy golf ball
  • How to connect the many little parts?
  • How to program the many little parts?
  • Fault tolerance?

23
Future Super Server: 4T Machine
  • Array of 1,000 4B machines
  • 1 Bips processors
  • 1 BB DRAM
  • 10 BB disks
  • 1 Bbps comm lines
  • 1 TB tape robot
  • A few megabucks
  • Challenge
  • Manageability
  • Programmability
  • Security
  • Availability
  • Scaleability
  • Affordability
  • As easy as a single system

Cyber Brick: a 4B machine
Future servers are CLUSTERS of processors and
discs. Distributed database techniques make
clusters work.
24
The Hardware Is In Place... And then a miracle
occurs?
  • SNAP: scaleable network and platforms
  • Commodity-distributed OS built on
  • Commodity platforms
  • Commodity network interconnect
  • Enables parallel applications

25
Outline
  • Global forces
  • Moore's, Metcalfe's, Bell's, Bill's, Andy's laws
  • Micro dollars per transaction
  • Cyber-content is key value because distribution
    costs go to zero
  • Distributed Systems Concepts and terms
  • Key software technologies
  • objects, transactions

26
Outline Concepts and Terminology
  • Why Distributed
  • Distributed data objects
  • Distributed execution
  • Three tier architectures
  • Transaction concepts

27
What's a Distributed System?
  • Centralized
  • everything in one place
  • stand-alone PC or Mainframe
  • Distributed
  • some parts remote
  • distributed users
  • distributed execution
  • distributed data

28
Why Distribute?
  • No best organization
  • Companies constantly swing between
  • Centralized: focus, control, economy
  • Decentralized: adaptive, responsive, competitive
  • Why distribute?
  • reflect organization or application structure
  • empower users / producers
  • improve service (response / availability)
  • distributed load
  • use PC technology (economics)

29
What Should Be Distributed?
  • Users and User Interface
  • Thin client
  • Processing
  • Trim client
  • Data
  • Fat client
  • Will discuss tradeoffs later

Presentation
workflow
Business Objects
Database
30
Transparency in Distributed Systems
  • Make distributed system as easy to use and manage
    as a centralized system
  • Give a Single-System Image
  • Location transparency
  • hide fact that object is remote
  • hide fact that object has moved
  • hide fact that object is partitioned or
    replicated
  • Name doesn't change if object is replicated,
    partitioned or moved.

31
Naming: The basics
  • Objects have
  • Globally Unique Identifier (GUIDs)
  • location(s) = address(es)
  • name(s)
  • addresses can change
  • objects can have many names
  • Names are context dependent
  • (Jim @ KGB not the same as Jim @ CIA)
  • Many naming systems
  • UNC: \\node\device\dir\dir\dir\object
  • Internet: http://node.domain.root/dir/dir/dir/object
  • LDAP: ldap://ldap.domain.root/o=org,c=US,cn=dir

[Figure: the name "James" mapped to a guid]
32
Name Servers in Distributed Systems
  • Name servers translate names + context to
    address (+ GUID)
  • Name servers are partitioned (subtrees of name
    space)
  • Name servers replicate root of name tree
  • Name servers form a hierarchy
  • Distributed data from hell
  • high read traffic
  • high reliability & availability
  • autonomy

[Figure: North and South name servers, each holding a replica of the root plus its own (Northern or Southern) names]
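The hierarchy of partitioned name servers described on this slide can be sketched in a few lines. This is a toy model, not any real directory service: the class `NameServer`, the "/" separator, and the sample names and addresses are all assumptions made for illustration.

```python
from typing import Dict, Optional


class NameServer:
    """Owns one subtree of the name space and delegates to child servers."""

    def __init__(self) -> None:
        self.local: Dict[str, str] = {}               # name -> address
        self.children: Dict[str, "NameServer"] = {}   # subtree -> its server

    def resolve(self, name: str) -> Optional[str]:
        head, _, rest = name.partition("/")
        if rest and head in self.children:
            # The name belongs to a partition owned by another server.
            return self.children[head].resolve(rest)
        return self.local.get(name)


# Hypothetical name space: a root with Northern and Southern partitions.
north, south, root = NameServer(), NameServer(), NameServer()
root.children = {"north": north, "south": south}
north.local["printer"] = "10.0.1.7"
south.local["printer"] = "10.0.2.9"

print(root.resolve("north/printer"))
print(root.resolve("south/printer"))
```

The same short name resolves to different addresses depending on which subtree supplies the context, which is the point made by the context-dependent naming slide. Replicating the root, as the slide suggests, would mean giving every site its own copy of the `root.children` table.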
33
Autonomy in Distributed Systems
  • Owner of site (or node, or application, or
    database) wants to control it
  • If my part is working, must be able to access &
    manage it (reorganize, upgrade, add user, ...)
  • Autonomy is
  • Essential
  • Difficult to implement
  • Conflicts with global consistency
  • examples: naming, authentication, admin

34
Security: The Basics
  • Authentication server: subject + Authenticator =>
    (Yes + token) | No
  • Security matrix
  • who can do what to whom
  • Access control list is column of matrix
  • "who" is authenticated ID
  • In a distributed system, "who" and "what" and
    "whom" are distributed objects
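The two ideas on this slide, authentication that yields a token and a security matrix stored as per-object access control lists, can be sketched as follows. This is a toy under loud assumptions: the class `SecurityServer` and its methods are hypothetical, authenticators are kept as plain strings only for brevity, and nothing here reflects a real product.

```python
import secrets
from typing import Dict, Optional, Set


class SecurityServer:
    """Toy authentication server plus a security matrix stored by column."""

    def __init__(self) -> None:
        self._authenticators: Dict[str, str] = {}      # subject -> authenticator
        self._tokens: Dict[str, str] = {}              # token -> authenticated ID
        self._acl: Dict[str, Dict[str, Set[str]]] = {} # object -> who -> operations

    def enroll(self, subject: str, authenticator: str) -> None:
        self._authenticators[subject] = authenticator

    def authenticate(self, subject: str, authenticator: str) -> Optional[str]:
        """subject + authenticator => token on Yes, None on No."""
        if self._authenticators.get(subject) != authenticator:
            return None
        token = secrets.token_hex(16)
        self._tokens[token] = subject
        return token

    def grant(self, who: str, what: str, whom: str) -> None:
        """Record that `who` may do `what` to the object `whom`."""
        self._acl.setdefault(whom, {}).setdefault(who, set()).add(what)

    def check(self, token: str, what: str, whom: str) -> bool:
        who = self._tokens.get(token)
        return who is not None and what in self._acl.get(whom, {}).get(who, set())


server = SecurityServer()
server.enroll("jim", "s3cret")
server.grant("jim", "read", "orders.db")
token = server.authenticate("jim", "s3cret")
```

Each entry of `_acl` is one column of the matrix: the list of subjects and operations allowed on a single object.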

35
Security in Distributed Systems
  • Security domain: nodes with a shared security
    server.
  • Security domains can have trust relationships
  • A trusts B: A "believes" B when it says this is
    Jim@B
  • Security domains form a hierarchy.
  • Delegation: passing authority to a server. When A
    asks B to do something (e.g. print a file, read a
    database) B may need A's authority
  • Autonomy requires
  • each node is an authenticator
  • each node does own security checks
  • Internet Today
  • no trust among domains (fire walls, many
    passwords)
  • trust based on digital signatures

36
Clusters: The Ideal Distributed System
  • Cluster is distributed system BUT single
  • location
  • manager
  • security policy
  • relatively homogeneous
  • communications is
  • high bandwidth
  • low latency
  • low error rate
  • Clusters use distributed system techniques for
  • load distribution
  • storage
  • execution
  • growth
  • fault tolerance

37
Cluster: Shared What?
  • Shared Memory Multiprocessor
  • Multiple processors, one memory
  • all devices are local
  • DEC or SGI or Sequent 16x nodes
  • Shared Disk Cluster
  • an array of nodes
  • all share common disks
  • VAXcluster + Oracle
  • Shared Nothing Cluster
  • each device local to a node
  • ownership may change
  • Tandem, SP2, Wolfpack

38
Outline Concepts and Terminology
  • Why Distribute
  • Distributed data objects
  • Partitioned
  • Replicated
  • Distributed execution
  • Three tier architectures
  • Transaction concepts

39
Partitioned Data: Break file into disjoint groups
  • Exploit data access locality
  • Put data near consumer
  • Less network traffic
  • Better response time
  • Better availability
  • Owner controls data: autonomy
  • Spread Load
  • data or traffic may exceed single store

[Figure: Orders file partitioned across N.A., S.A., Europe, Asia]
40
How to Partition Data?
  • How to Partition
  • by attribute or
  • random or
  • by source or
  • by use
  • Problem: to find it, must have
  • Directory (replicated) or
  • Algorithm
  • Encourages attribute-based partitioning

[Figure: partitions N.A., S.A., Europe, Asia]
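The two ways of finding a partitioned record, a directory or an algorithm, can be contrasted in a short sketch. Everything named here is an assumption for illustration: the country-to-region directory, the use of a CRC hash as the "algorithm", and the function names.

```python
import zlib

PARTITIONS = ["N.A.", "S.A.", "Europe", "Asia"]

# Directory approach: a (replicated) table maps an attribute value to a partition.
DIRECTORY = {"US": "N.A.", "Brazil": "S.A.", "France": "Europe", "Japan": "Asia"}


def partition_by_attribute(order: dict) -> str:
    """Attribute-based partitioning: look the order's country up in the directory."""
    return DIRECTORY[order["country"]]


def partition_by_algorithm(order_id: str) -> str:
    """Algorithmic (hash) partitioning: any node can compute the location."""
    return PARTITIONS[zlib.crc32(order_id.encode()) % len(PARTITIONS)]


order = {"id": "order-1001", "country": "France"}
print(partition_by_attribute(order))
print(partition_by_algorithm(order["id"]))
```

The attribute-based version keeps an order near its consumers, which serves the locality goal, at the price of maintaining the directory. The hash version needs no directory and spreads load evenly, but it ignores locality.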
41
Replicated Data: Place fragment at many sites
  • Pros
  • Improves availability
  • Disconnected (mobile) operation
  • Distributes load
  • Reads are cheaper
  • Cons
  • N times more updates
  • N times more storage
  • Placement strategies
  • Dynamic: cache on demand
  • Static: place specific

[Figure: Catalog replicated at many sites]
42
Updating Replicated Data
  • When a replica is updated, how do changes
    propagate?
  • Master copy, many slave copies (SQL Server)
  • always know the correct value (master)
  • change propagation can be
  • transactional
  • as soon as possible
  • periodic
  • on demand
  • Symmetric, and anytime (Access)
  • allows mobile (disconnected) updates
  • updates propagated ASAP, periodic, on demand
  • non-serializable
  • colliding updates must be reconciled.
  • hard to know real value
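The difference between the two update schemes can be sketched with version numbers. This is a simplified illustration, not how SQL Server or Access implement replication; the classes, the version-number check, and the sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Replica:
    value: Optional[str] = None
    version: int = 0


@dataclass
class Master(Replica):
    slaves: List[Replica] = field(default_factory=list)

    def update(self, value: str) -> None:
        """Master copy: one place decides the value, then pushes it out ASAP."""
        self.value, self.version = value, self.version + 1
        for slave in self.slaves:
            slave.value, slave.version = self.value, self.version


def symmetric_apply(target: Replica, base_version: int, value: str) -> bool:
    """Symmetric scheme: apply an update that was made against `base_version`.

    Returns False when the target has changed in the meantime, which means
    the colliding updates must be reconciled by some other means.
    """
    if target.version != base_version:
        return False
    target.value, target.version = value, target.version + 1
    return True


master, slave = Master(), Replica()
master.slaves.append(slave)
master.update("price=10")

laptop, office = Replica("price=10", 1), Replica("price=10", 1)
symmetric_apply(laptop, 1, "price=12")             # disconnected update on the laptop
symmetric_apply(office, 1, "price=9")              # concurrent update at the office
collided = not symmetric_apply(office, 1, "price=12")  # laptop reconnects
```

With a master copy the correct value is always known. In the symmetric case the last call detects a collision, and neither replica can claim to hold the real value until it is reconciled.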

43
Replication and Partitioning Compared
  • Central Scaleup: 2x more work
  • Partition Scaleup: 2x more work
  • Replication Scaleup: 4x more work
44
Outline Concepts and Terminology
  • Why Distribute
  • Distributed data objects
  • Partitioned
  • Replicated
  • Distributed execution
  • remote procedure call
  • queues
  • Three tier architectures
  • Transaction concepts

45
Distributed Execution: Threads and Messages
  • Thread is Execution unit (software analog of
    cpu+memory)
  • Threads execute at a node
  • Threads communicate via
  • Shared memory (local)
  • Messages (local and remote)

messages
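Two threads that communicate only through messages can be sketched with the Python standard library. This is a local illustration under assumptions: the queues stand in for a message channel, and the upper-casing "service" is made up.

```python
import queue
import threading


def server(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Server thread: receive a message, do the work, send a reply."""
    while True:
        message = inbox.get()
        if message is None:          # sentinel: shut down
            break
        outbox.put(message.upper())


inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=server, args=(inbox, outbox))
worker.start()

inbox.put("hello")                   # request message
reply = outbox.get(timeout=5)        # response message
inbox.put(None)
worker.join()
print(reply)
```

Because the two threads share nothing except the queues, the same structure works when the queues are replaced by a network connection and the threads run at different nodes.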
46
Peer-to-Peer or Client-Server
  • Peer-to-Peer is symmetric
  • Either side can send
  • Client-server
  • client sends requests
  • server sends responses
  • simple subset of peer-to-peer

request
response
47
Connection-less or Connected
  • Connected (sessions)
  • open - request/reply - close
  • client authenticated once
  • Messages arrive in order
  • Can send many replies (e.g. FTP)
  • Server has client context (context sensitive)
  • e.g. Winsock and ODBC
  • HTTP adding connections
  • Connection-less
  • request contains
  • client id
  • client context
  • work request
  • client authenticated on each message
  • only a single response message
  • e.g. HTTP, NFS v1

48
Remote Procedure Call: The key to transparency
  • Object may be local or remote
  • Methods on object work wherever it is.
  • Local invocation

49
Remote Procedure Call: The key to transparency
  • Remote invocation

y = pObj->f(x)
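The transparency claim is that the caller writes the same call whether the object is local or remote. A minimal sketch of that idea follows; the proxy and skeleton classes, the JSON message format, and the `Adder` object are assumptions, and the "network" is just a function call.

```python
import json


class Adder:
    """An ordinary object with one method."""

    def f(self, x: int) -> int:
        return x + 1


class Skeleton:
    """Server side: unmarshal the request, invoke the real object, marshal the reply."""

    def __init__(self, target: object) -> None:
        self.target = target

    def handle(self, request: bytes) -> bytes:
        call = json.loads(request)
        result = getattr(self.target, call["method"])(*call["args"])
        return json.dumps({"result": result}).encode()


class Proxy:
    """Client side: looks like the object, but turns each call into a message."""

    def __init__(self, send) -> None:
        self._send = send            # stands in for the network transport

    def __getattr__(self, method: str):
        def remote_call(*args):
            request = json.dumps({"method": method, "args": args}).encode()
            return json.loads(self._send(request))["result"]
        return remote_call


local_obj = Adder()
remote_obj = Proxy(Skeleton(Adder()).handle)

y_local = local_obj.f(41)    # local invocation
y_remote = remote_obj.f(41)  # remote invocation, same syntax
```

The calling code is identical in both cases; only the construction of the object reference differs.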
50
Object Request Broker (ORB): Orchestrates RPC
  • Registers Servers
  • Manages pools of servers
  • Connects clients to servers
  • Does Naming, request-level authorization,
  • Provides transaction coordination (new feature)
  • Old names
  • Transaction Processing Monitor,
  • Web server,
  • NetWare

Object-Request Broker
51
History and Alphabet Soup
[Figure: timeline of standards efforts, 1985, 1990, 1995, including X/Open and the Open Group]
52
ActiveX and COM
  • COM is Microsoft model, engine inside OLE. ALL
    Microsoft software is based on COM (ActiveX)
  • CORBA + OpenDoc is equivalent
  • Heated debate over which is best
  • Both share same key goals
  • Encapsulation: hide implementation
  • Polymorphism: generic operations, key to GUI and
    reuse
  • Versioning: allow upgrades
  • Transparency: local/remote
  • Security: invocation can be remote
  • Shrink-wrap: minimal inheritance
  • Automation: easy
  • COM now managed by the Open Group

53
Linking And Embedding: Objects are data
modules, transactions are execution modules
  • Link: pointer to object somewhere else
  • Think URL in Internet
  • Embed: bytes are here
  • Objects may be active: can callback to subscribers

54
Bottom Line Re: ORBs
  • Microsoft promises Cairo: distributed objects,
    secure, transparent, fast invocation
  • Netscape promises the CORBA
  • Both will deliver
  • Customers can pick the best one

Object-Request Broker
55
Using RPC for Transparency: Partition Transparency
  • Send updates to correct partition

y = pfile->write(x)
56
Using RPC for Transparency: Replication
Transparency
  • Send updates to EACH node

y = pfile->write(x)
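Both kinds of transparency come down to hiding the routing decision behind the same `write` call. The sketch below illustrates that under assumptions: the `Node` class, the hash-based routing, and the keyed `write(key, value)` signature are hypothetical stand-ins for the slide's `pfile->write(x)`.

```python
import zlib
from typing import Dict, List


class Node:
    """One storage node holding part (or a copy) of the file."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    def write(self, key: str, value: str) -> bool:
        self.store[key] = value
        return True


class PartitionedFile:
    """Partition transparency: route each update to the one correct partition."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def write(self, key: str, value: str) -> bool:
        node = self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]
        return node.write(key, value)


class ReplicatedFile:
    """Replication transparency: send each update to EACH node."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes

    def write(self, key: str, value: str) -> bool:
        return all([node.write(key, value) for node in self.nodes])


partitions = [Node(), Node(), Node()]
replicas = [Node(), Node(), Node()]
PartitionedFile(partitions).write("order-7", "widgets")
ReplicatedFile(replicas).write("order-7", "widgets")
```

The caller cannot tell which variant sits behind the file handle. The partitioned write touches one node, the replicated write touches all of them.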
57
Outline Concepts and Terminology
  • Why Distributed
  • Distributed data objects
  • Distributed execution
  • remote procedure call
  • queues
  • Three tier architectures
  • what
  • why
  • Transaction concepts

58
Client/Server Interactions: All can be done with
RPC
  • Request-Response: response may be many messages
  • Conversational: server keeps client context
  • Dispatcher: three-tier, complex operation at
    server
  • Queued: de-couples client from server, allows
    disconnected operation

[Figure: client (C) and server (S) message patterns for each interaction style]
59
Queued Request/Response
  • Time-decouples client and server
  • Three Transactions
  • Almost real time, ASAP processing
  • Communicate at each other's convenience. Allows
    mobile (disconnected) operation
  • Disk queues survive client & server failures

Client
Server
60
Why Queued Processing?
  • Prioritize requests: ambulance dispatcher favors
    high-priority calls
  • Manage Workflows
  • Deferred processing in mobile apps
  • Interface heterogeneous systems: EDI, MOM
    (Message-Oriented-Middleware), DAD (Direct Access
    to Data)
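The ambulance-dispatcher case can be illustrated with a small priority queue. This is an in-memory sketch only; the class name, the convention that lower numbers mean higher priority, and the sample calls are assumptions.

```python
import heapq
import itertools
from typing import List, Tuple


class RequestQueue:
    """Queue that hands out the highest-priority request first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()   # tie-breaker keeps FIFO order

    def enqueue(self, priority: int, request: str) -> None:
        # Lower number means more urgent.
        heapq.heappush(self._heap, (priority, next(self._arrival), request))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]


calls = RequestQueue()
calls.enqueue(5, "noise complaint")
calls.enqueue(1, "cardiac arrest")
calls.enqueue(5, "lost cat")
dispatch_order = [calls.dequeue() for _ in range(3)]
print(dispatch_order)
```

Because the client only enqueues and the server only dequeues, the server is free to choose the order of service, which a direct request-response call would not allow.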

61
Work Distribution Spectrum
  • Presentation and plug-ins
  • Workflow manages session invokes objects
  • Business objects
  • Database

Presentation
workflow
Business Objects
Database
62
Transaction Processing Evolution to Three
Tier: Intelligence migrated to clients
  • Mainframe Batch processing (centralized)
  • Dumb terminals & Remote Job Entry
  • Intelligent terminals, database backends
  • Workflow Systems, Object Request
    Brokers, Application Generators

[Figure: evolution from mainframe and cards to TP Monitor to ORB]
63
Web Evolution to Three Tier: Intelligence migrated
to clients (like TP)
  • Character-mode clients, smart servers
  • GUI Browsers - Web file servers
  • GUI Plugins - Web dispatchers - CGI
  • Smart clients - Web dispatcher (ORB), pools of app
    servers (ISAPI, Viper), workflow scripts at client
    & server

[Figure: evolution from archie, gopher, green screen and WAIS to Web Server]
64
PC Evolution to Three Tier: Intelligence migrated
to server
  • Stand-alone PC (centralized)
  • PC + File & print server: message per I/O
  • PC + Database server: message per SQL statement
  • PC + App server: message per transaction
  • ActiveX Client, ORB, ActiveX server, Xscript

[Figure: message granularity: disk I/O (IO request / reply), SQL statement, transaction]
65
The Pattern: Three Tier Computing
  • Clients do presentation, gather input
  • Clients do some workflow (Xscript)
  • Clients send high-level requests to ORB (Object
    Request Broker)
  • ORB dispatches workflows and business objects --
    proxies for client, orchestrate flows & queues
  • Server-side workflow scripts call on distributed
    business objects to execute task

[Figure: tiers from client to server: Presentation, workflow, Business Objects, Database]
66
The Three Tiers
Object Data server.
67
Why Did Everyone Go To Three-Tier?
  • Manageability
  • Business rules must be with data
  • Middleware operations tools
  • Performance (scaleability)
  • Server resources are precious
  • ORB dispatches requests to server pools
  • Technology Physics
  • Put UI processing near user
  • Put shared data processing near shared data

Presentation
workflow
Business Objects
Database
68
Why Put Business Objects at Server?
69
What Middleware Does: ORB, TP Monitor, Workflow
Mgr, Web Server
  • Registers transaction programs: workflow and
    business objects (DLLs)
  • Pre-allocates server pools
  • Provides server execution environment
  • Dynamically checks authority (request-level
    security)
  • Does parameter binding
  • Dispatches requests to servers
  • parameter binding
  • load balancing
  • Provides Queues
  • Operator interface

70
Server Side Objects: Easy Server-Side Execution
  • ORB gives simple execution environment
  • Object gets
  • start
  • invoke
  • shutdown
  • Everything else is automatic
  • Drag & Drop Business Objects

[Figure: a server, with the service logic surrounded by ORB-provided parts: Network, Receiver, Queue, Management, Connections, Context, Security, Configuration, Thread Pool, Synchronization, Shared Data]
71
A new programming paradigm
  • Develop object on the desktop
  • Better yet download them from the Net
  • Script work flows as method invocations
  • All on desktop
  • Then, move work flows and objects to server(s)
  • Gives
  • desktop development
  • three-tier deployment
  • Software Cyberbricks

72
Why Server Pools?
  • Server resources are precious. Clients have 100x
    more power than server.
  • Pre-allocate everything on server
  • preallocate memory
  • pre-open files
  • pre-allocate threads
  • pre-open and authenticate clients
  • Keep high duty-cycle on objects (re-use them)
  • Pool threads, not one per client
  • Classic example: TPC-C benchmark
  • 2 processes
  • everything pre-allocated

N clients x N Servers x F files = N x N x F
file opens!!!
[Figure: 7,000 clients (IE) connect over HTTP to IIS, which uses a pool of DBC links to SQL]
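The pre-allocation idea can be sketched as a small resource pool: a fixed number of links are opened once and then shared by every request. This is an illustration only; `ServerPool`, `open_link`, and the pool size of 2 are assumptions, and the "links" are plain strings rather than real database connections.

```python
import queue
from contextlib import contextmanager


class ServerPool:
    """Pre-allocates `size` resources and lends them out one request at a time."""

    def __init__(self, factory, size: int) -> None:
        self._idle: queue.Queue = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())       # everything is opened up front

    @contextmanager
    def borrow(self, timeout=None):
        resource = self._idle.get(timeout=timeout)
        try:
            yield resource
        finally:
            self._idle.put(resource)        # re-use: keep a high duty cycle


opened = []


def open_link() -> str:
    """Hypothetical stand-in for opening and authenticating a database link."""
    link = f"link-{len(opened)}"
    opened.append(link)
    return link


pool = ServerPool(open_link, size=2)
served = []
for client in range(7000):
    with pool.borrow() as link:
        served.append((client, link))

print(f"{len(served)} requests served over {len(opened)} pre-opened links")
```

Seven thousand requests share two links, instead of each client opening its own. A real pool would be used from many threads at once; `queue.Queue` is thread-safe, so `borrow` would simply block until a link is free.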
73
Classic Three-Tier Example: TPC-C
  • Transaction Processing Performance Council (TPC)
    standard performance benchmarks
  • 5 transaction types
  • order entry, payment, status (oltp)
  • delivery (mini-batch)
  • restock (mini-DSS)
  • Metrics: Throughput, Price/Performance
  • Shows best practices
  • everyone three tier
  • 2 processes at server
  • everything pre-allocated

[Figure: 7,000 Web clients connect over HTTP to IIS (Web), which uses a pool of DBC links (ODBC) to SQL]
74
Classic Mistakes
  • Thread per terminal. Fix: DB server thread
    pools. Fix: server pools
  • Process per request (CGI). Fix: ISAPI & NSAPI DLLs.
    Fix: connection pools
  • Many messages per operation. Fix: stored
    procedures. Fix: server-side objects
  • File open per request. Fix: cache hot files

75
Outline
  • Laws micro/transaction
  • Distributed Systems
  • Why Distributed
  • Distributed data objects
  • Distributed execution
  • Three tier architectures
  • why manageability performance
  • what server side workflows objects
  • Transaction concepts
  • Why transactions?
  • Using transactions

76
Thesis
  • Transactions are key to structuring distributed
    applications
  • ACID properties ease exception handling
  • Atomic: all or nothing
  • Consistent: state transformation
  • Isolated: no concurrency anomalies
  • Durable: committed transaction effects persist

77
What Is A Transaction?
  • Programmer's view
  • Bracket a collection of actions
  • A simple failure model
  • Only two outcomes

Begin() action action action action Commit()    -> Success!
Begin() action action action Rollback()         -> Failure!
Begin() action action action Fail! Rollback()   -> Failure!
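The bracket-and-two-outcomes model can be tried out with SQLite from the Python standard library. This is a sketch under assumptions: the `account` table, the `transfer` function, and the overdraft rule are invented for the example and are not part of the talk.

```python
import sqlite3


def transfer(db: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money inside one transaction; return True on commit, False on rollback."""
    try:
        with db:  # commits if the block succeeds, rolls back if it raises
            db.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                       (amount, src))
            db.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                       (amount, dst))
            (balance,) = db.execute("SELECT balance FROM account WHERE id = ?",
                                    (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(db: sqlite3.Connection) -> dict:
    return dict(db.execute("SELECT id, balance FROM account"))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
db.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
db.commit()

first = transfer(db, "a", "b", 60)    # enough money: both updates commit
second = transfer(db, "a", "b", 60)   # overdraft: both updates are undone
```

After the failed second transfer the balances are exactly what they were before it started, so the caller only has to handle two outcomes and never a half-finished transfer.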
78
Why Bother: Atomicity?
  • RPC semantics
  • At most once: try one time
  • At least once: keep trying 'till acknowledged
  • Exactly once: keep trying 'till acknowledged
    and server discards duplicate requests
79
Why Bother: Atomicity?
  • Example: insert record in file
  • At most once: time-out means "maybe"
  • At least once: retry may get "duplicate" error
    or retry may do second insert
  • Exactly once: you do not have to worry
  • What if operation involves
  • Insert several records?
  • Send several messages?
  • Want ALL or NOTHING for group of actions
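Exactly-once behavior, as described above, combines client retries with duplicate detection at the server. The sketch below simulates one lost acknowledgement; the request-id scheme, the class and function names, and the reply text are assumptions for illustration.

```python
import uuid
from typing import Dict, List


class InsertServer:
    """Remembers the reply for every request id so duplicates are not re-executed."""

    def __init__(self) -> None:
        self.records: List[str] = []
        self._replies: Dict[str, str] = {}

    def insert(self, request_id: str, record: str) -> str:
        if request_id in self._replies:          # duplicate request: discard it
            return self._replies[request_id]
        self.records.append(record)
        reply = f"inserted #{len(self.records)}"
        self._replies[request_id] = reply
        return reply


def insert_exactly_once(server: InsertServer, record: str,
                        lose_first_reply: bool = True) -> str:
    """Keep trying until acknowledged; the request id makes retries harmless."""
    request_id = str(uuid.uuid4())
    attempt = 0
    while True:
        attempt += 1
        reply = server.insert(request_id, record)
        if lose_first_reply and attempt == 1:
            continue                             # simulate a lost acknowledgement
        return reply


server = InsertServer()
ack = insert_exactly_once(server, "order 42")
```

The client sent the request twice, but the record was inserted once. Without the duplicate table this would be at-least-once behavior and the retry would perform a second insert.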

80
Why Bother: Durability
  • Once a transaction commits, want effects to
    survive failures
  • Fault tolerance: old master-new master won't
    work
  • Can't do daily dumps: would lose recent work
  • Want continuous dumps
  • Redo lost transactions in case of failure
  • Resend unacknowledged messages

81
Why ACID For Client/Server And Distributed
  • ACID is important for centralized systems
  • Failures in centralized systems are simpler
  • In distributed systems
  • More and more-independent failures
  • ACID is harder to implement
  • That makes it even MORE IMPORTANT
  • Simple failure model
  • Simple repair model

82
ACID Generalizations
  • Taxonomy of actions
  • Unprotected: not undone or redone
  • Temp files
  • Transactional: can be undone before commit
  • Database and message operations
  • Real: cannot be undone
  • Drill a hole in a piece of metal, print a check
  • Nested transactions: subtransactions
  • Work flow: long-lived transactions

83
Programming Transactions: The Application View
  • You Start (e.g. in TransactSQL)
  • Begin Distributed Transaction <name>
  • Perform actions
  • Optional Save Transaction <name>
  • Commit or Rollback
  • You Inherit a XID
  • Caller passes you a transaction
  • You return or Rollback.
  • You can Begin / Commit sub-trans.
  • You can use save points

[Figure: control flow for the two cases, with labels Begin, Commit, RollBack, XID, Return]
84
Nested Transactions: Going Beyond Flat Transactions
  • Need transactions within transactions
  • Sub-transactions commit only if root does
  • Only root commit is durable.
  • Subtransactions may rollback; if so, all its
    subtransactions rollback
  • Parallel version of nested transactions

[Figure: transaction tree rooted at T1 with children T11, T12, T13; T11 has T111 to T114, T12 has T121 to T123, T13 has T131 to T133]
85
Workflow: A Sequence of Transactions
  • Application transactions are multi-step
  • order, build, ship & invoice, reconcile
  • Each step is an ACID unit
  • Workflow is a script describing steps
  • Workflow systems
  • Instantiate the scripts
  • Drive the scripts
  • Allow query against scripts
  • Examples: Manufacturing Work In Process
    (WIP), Queued processing, Loan application
    approval, Hospital admissions

Presentation
workflow
Business Objects
Database
86
Workflow Scripts
  • Workflow scripts are programs (could use
    VBScript or JavaScript)
  • If step fails, compensation action handles error
  • Events, messages, time, other steps cause step.
  • Workflow controller drives flows

[Figure: workflow script elements: Source, Step, Compensation Action, fork, join, branch, case, loop]
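A workflow controller that drives steps and runs compensation actions on failure can be sketched briefly. This is one possible reading, not the talk's design: undoing the already-completed steps in reverse order is an assumption borrowed from saga-style processing, and the function name and step names are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None], Callable[[], None]]  # name, action, compensation


def run_workflow(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; if one fails, compensate the completed ones in reverse."""
    log: List[str] = []
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()                      # each step would be its own ACID unit
        except Exception:
            log.append(f"failed {name}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated {done_name}")
            return False, log
        log.append(f"did {name}")
        completed.append((name, compensate))
    return True, log


events: List[str] = []


def build_fails() -> None:
    raise RuntimeError("part out of stock")


ok, log = run_workflow([
    ("order", lambda: events.append("order placed"),
              lambda: events.append("order cancelled")),
    ("build", build_fails,
              lambda: events.append("build scrapped")),
])
print(ok, log)
```

Because each step has already committed by the time a later step fails, the effects of "order" were visible to others before being compensated. That is why a workflow as a whole is not atomic or isolated even though its steps are.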
87
Workflow and ACID
  • Workflow is not Atomic or Isolated
  • Results of a step visible to all
  • Workflow is Consistent and Durable
  • Each flow may take hours, weeks, months
  • Workflow controller
  • keeps flows moving
  • maintains context (state) for each flow
  • provides a query and operator interface, e.g.
    "what is the status of Job 72149?"

88
ACID Objects Using ACID DBs: The easy way to build
transactional objects
  • Application uses transactional objects (objects
    have ACID properties)
  • If object built on top of ACID objects, then
    object is ACID.
  • Example: New, EnQueue, DeQueue on top of SQL
  • SQL provides ACID

    dim c as Customer
    dim CM as CustomerMgr
    ...
    set C = CM.get(CustID)
    ...
    C.credit_limit = 1000
    ...
    CM.update(C, CustID)
    ...

[Figure: Business Object: Customer and Business Object Mgr: CustomerMgr, both layered on SQL]
Persistent Programming languages automate this.
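The slide's example of New, EnQueue, and DeQueue on top of SQL can be sketched with SQLite. This is an illustration under assumptions: the class `SqlQueue`, the table layout, and the choice of SQLite are not from the talk.

```python
import sqlite3
from typing import Optional


class SqlQueue:
    """A queue object that inherits its ACID properties from the SQL store below it."""

    def __init__(self, db: sqlite3.Connection) -> None:   # "New"
        self.db = db
        with self.db:
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS queue ("
                "seq INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)")

    def enqueue(self, body: str) -> None:
        with self.db:                       # one transaction per operation
            self.db.execute("INSERT INTO queue (body) VALUES (?)", (body,))

    def dequeue(self) -> Optional[str]:
        with self.db:                       # read and delete commit together
            row = self.db.execute(
                "SELECT seq, body FROM queue ORDER BY seq LIMIT 1").fetchone()
            if row is None:
                return None
            self.db.execute("DELETE FROM queue WHERE seq = ?", (row[0],))
            return row[1]


work = SqlQueue(sqlite3.connect(":memory:"))
work.enqueue("ship order 1")
work.enqueue("ship order 2")
drained = [work.dequeue(), work.dequeue(), work.dequeue()]
```

The queue class contains no undo, redo, or locking code of its own. With a file-backed database instead of `:memory:`, the queued requests would also survive a restart, which is what a disk queue needs.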
89
ACID Objects From Bare Metal: The Hard Way to
Build Transactional Objects
  • Object Class is a Resource Manager (RM)
  • Provides ACID objects from persistent storage
  • Provides Undo (on rollback)
  • Provides Redo (on restart or media failure)
  • Provides Isolation for concurrent ops
  • Microsoft SQL Server, IBM DB2, Oracle, are
    Resource managers.
  • Many more coming.
  • Any masochist can build one

90
Outline
  • Why Distributed
  • Distributed data objects
  • Distributed execution
  • Three tier architectures
  • Transaction concepts
  • Why transactions?
  • Using transactions
  • programming
  • workflow

91
References
  • Essential Client/Server Survival Guide, 2nd ed.
  • Orfali, Harkey & Edwards, J. Wiley, 1996
  • Client/Server Programming with Java and CORBA
  • Orfali, Harkey, J. Wiley, 1997
  • Principles of Transaction Processing
  • Bernstein & Newcomer, Morgan Kaufmann, 1997
  • Transaction Processing: Concepts and Techniques
  • Gray & Reuter, Morgan Kaufmann, 1993
