Lecture 4: DBMS Architecture - PowerPoint PPT Presentation

1 / 33
About This Presentation
Title:

Lecture 4: DBMS Architecture

Description:

There exists someone who can write popular textbooks in both OS and DBMS! ... recovery is cheap. time travel is easy. 30. CS511 Advanced Database Management Systems ... – PowerPoint PPT presentation

Number of Views:1122
Avg rating:3.0/5.0
Slides: 34
Provided by: ChengXi4
Category:

less

Transcript and Presenter's Notes

Title: Lecture 4: DBMS Architecture


1
Lecture 4 DBMS Architecture
Sept. 6 2006 ChengXiang Zhai
Most slides are adapted from Kevin Changs
lecture slides
2
DBMS Mission Statement
  • Simply maintenance and computation of data
  • But how to do it?

Data
Operations
Results
3
DBMS Architecture
User/Web Forms/Applications/DBA
query
transaction
Query Parser
Transaction Manager
Query Rewriter
Logging Recovery
Query Optimizer
Lock Manager
Query Executor
Files Access Methods
Lock Tables
Buffers
Buffer Manager
Main Memory
Storage Manager
Storage
4
A Design Dilemma
  • To what extent should we reuse OS services?
  • Reuse as much as we can
  • Performance problem (inefficient)
  • Lack of control (incorrect crash recovery)
  • Replicating some OS functions (mini OS)
  • Have its own buffer pool
  • Directly manage record structures with files

5
OS vs. DBMS
  • Conjecture Perhaps pretty close
  • Proof
  • There exists someone who can write popular
    textbooks in both OS and DBMS!
  • Operating Database System Concepts
  • Jim Gray is from OS background!

6
OS vs. DBMS Similarities??
  • What do they manage?
  • What do they provide?

7
OS vs. DBMS Similarities
  • Purpose of an OS
  • managing hardware
  • presenting interface abstraction to applications
  • DBMS is in some sense an OS?
  • DBMS manages data
  • presenting interface abstraction to applications
  • Both as API for application development!

8
Applications built upon DBMS
  • ERP Enterprise Resource Planning
  • SAP, Baan, PeopleSoft, Oracle, IBM,...
  • CRM Customer Relationship Management
  • E.phiphany, Siebel, Vantive, Oracle, IBM, ...
  • SCM Supply Chain Management
  • Trilogy, i2, Oracle, IBM, ...
  • A lot more in the Info Tech era
  • e-business software
  • scientific data
  • multimedia
  • data analysis and decision support

9
OS vs. DBMS Related Concepts
  • Process Management ? What DB concepts?
  • process synchronization
  • deadlock handling
  • Storage management ? What DB concepts?
  • virtual memory
  • file system
  • Protection and security ? What DB concepts?
  • authentication
  • access control

10
OS vs. DBMS Differences??
11
OS vs. DBMS Differences
  • DBMS Top-down to encapsulate high-level
    semantics!
  • Data
  • data with particular logical structures
  • Queries
  • query language with well defined operations
  • Transactions
  • transactions with ACID properties
  • OS Bottom-up to present low-level hardware

12
DBMS on top of OS Relations vs. File system
  • Data object abstraction
  • file array of characters
  • relation set of tuples
  • Physical contiguity
  • large DB files want clustering of blocks
  • extent larger granularity allocation unit
  • sol1 managing raw disks by DBMS
  • sol2 simulate by managing free spaces in DBMS
  • Multiple trees (access methods)
  • file access directory hierarchy (user access
    method)
  • block access inodes
  • tuple access DBMS indexes

13
DBMS on top of OS BM vs. VM
  • Query-aware replacement needed for performance
  • not always LRU
  • Examples?
  • how about sort-merge join??
  • how about nested-loop join??

14
DBMS on top of OS BM vs. VM
  • System-controlled replacement needed for
    correctness
  • not always LRU
  • Examples?

15
Not Really OS Problems Deferred Update Semantics
  • Update emp.sal 0.8emp.sal if emp.sal gt mgr.sal
  • empname sal manager
  • Smith 10k Brown
  • Jones 9k
  • Brown 11k Jones
  • what are the possible semantics?
  • INGRES solution deferred updates
  • buffer updates in intentions list for actual
    updates (also serve as redo log)
  • an example of needing buffer knowledge in DBMS,
    so perhaps not sensible to do BM totally in OS

16
As the data model and application context change,
so does the DBMS architecture
17
Post-Relational DB Projects
  • Motivation
  • RDBMS not powerful enough for non-administrative
    data-intensive applications such as CAD/CAM,
    GIS
  • Buzz terms object-oriented, extensible
  • Sample projects
  • Postgres U.C. Berkeley
  • Starburst IBM Almaden highly extensible
  • after System R (relational), R (distributed)
  • ultimately finding its way into IBM DB2 UDB
  • Exodus U. Wisconsin
  • not a complete DB an OO-style storage manager
    toolkit
  • followed by Shore at Wisconsin, Predator at
    Cornell

18
POSTGRES Post INGRES
  • Stonebraker, U.C. Berkeley
  • 1977-1985 INGRES
  • among the first relational DB implementation
  • ? Ingres Inc. --gt .. ? acquired by Computer
    Associates
  • 1986-1994 POSTGRES
  • among the first object-relational DB
    implementation
  • ? Illustra ? acauqired by Informix
  • PostgreSQL (the SQL version)

19
RDBMS the Relational Root
  • Data model (Codd, 1970s)
  • a database is a set of relations
  • relation of n attributes a set of n-tuples
  • n-tuple (v1, , vn), where vi is in domain Si

20
Relational Model Normal Forms
  • Basic 1NF (First Normal Form)
  • implicitly required in the relation model
  • definition
  • only simple domains of atomic elements (Codd)
  • simple domains represent the base (built-in)
    types
  • ? why?
  • Stronger normal forms
  • 4NF, Boyce-Codd Normal Form, 3NF, 2NF,
  • ? why?

21
Normalizing Relations Example
  • Unnormalized relation of book objects
  • Normalized relations by decomposition
  • ?? Problems of the relational model?

Books title authors date great future smith,
jones 4/01/01 career jones 7/12/00
Books title day month year great future
4 1 01 career 7 12 00
Books title authors great future
smith great future jones career jones
22
Relational Model Problems
  • A relational DB is like a garage that forces you
    to take your car apart and store the pieces in
    little drawers. (some researcher)
  • Object notion lost by decomposition
  • non-intuitive object is decomposed into several
    relations
  • inefficient a lot of online assembling by joins
  • Base types are too restrictive
  • integers and strings are very primitive
  • data types are typically application specific
  • Relational algebra is the only allowed operation
  • simple, declarative, but also restrictive
  • application host language embedded SQL
  • ?? How to remedy these problems?

23
Quest for a Richer Model?
  • Object-oriented data model
  • Extensible ADTs
  • Programming-language constructs

24
ORDBMS vs. OODBMS
  • Question How important is the relation?
  • ORDBMS
  • RDBMS OO features
  • query-based
  • OODBMS
  • OO PL database features (persistent objects)
  • programming-based
  • Meeting in the middle

25
Stonebrakers Matrix
  • Prediction ORDBMS will dominate
  • evidence big DB players are all on this side

Simple Data Complex Data Query RDBMS ORDBMS No
Query File System OODBMS
26
Object Orientation Concepts
  • Classes
  • classes as types
  • encapsulation interface implementation
  • inheritance building class hierarchies
  • Objects
  • complex objects
  • built from constructors, e.g., set-of, array,
    nested objs
  • object identity (OID)
  • system generated as unique object reference
  • enables (efficient) object linking and navigation

27
POSTGRES Data Model
  • POSTGRES data model
  • OO constructs
  • classes as relations
  • object (class instance) tuple
  • object-id tuple-id
  • method attribute or function of attributes
  • inheritance (multiple parents)
  • ADT constructs
  • types
  • functions

28
POSTGRES Functions
  • Arbitrary C functions
  • e.g. overpaid(Employee)
  • arbitrary semantics-- not optimized
  • no fancy access methods-- typically sequential
    scan
  • Binary operators
  • hints to provide semantics
  • extensible access methods
  • extensible Btree or user-defined index
  • PostQuel procedures
  • parameterized queries as functions
  • e.g. sal-lookup(name)
  • retrieve Emp.salary where Emp.name name

29
POSTGRES Storage System
  • We were guided by a missionary zeal to do
    something different
  • No-overwrite system
  • Logging
  • old values are not overwritten-- no value logging
    necessary
  • log only needs to keep transaction state
    (commit/abort/going)
  • ?? crash recovery-- how?
  • Vacuum-cleaner daemon to archive historical data
  • Advantages
  • recovery is cheap
  • time travel is easy

30
Storage System Problems
  • Problems
  • flushing differential data (why?) by commit time
    can be costly
  • unless stable main memory
  • more costly than sequentially writing out logs
    why ??
  • reads have to stitch together current picture
  • And, yes, there are lots details unexplored or
    unexplained

31
Questing for the Right Models
  • Speaking about knowledge representation The
    simple relational model is by far the only
    successful KR paradigm.
  • When the relational model came along, the network
    guys resisted and their companies went under.
  • When the OO model came along, the relational guys
    absorb its best, and their companies prospered
    again!
  • -- Jeffery Ullman

32
What You Should Know
  • What are some major limitations of services
    provided by an OS in supporting a DBMS?
  • In response to such limitations, what does a DBMS
    do?
  • As the data model and task environment change,
    the architecture will also need to change

33
Carry Away Messages
  • One usually doesnt fit all!
  • An OS is designed to serve all kinds of
    applications, so its not optimal for supporting
    a DBMS
  • Other examples a search engine is designed to
    serve all kinds of people, so its not optimal
    for a particular person (personalized search)
  • When a problem is recognized, there are often
    opportunities for breakthroughs in multiple areas
  • DBMS could take over OS functions
  • OS could provide more opportunities for
    customization
  • From day 1, high efficiency has been the
    primary challenge/concern in designing and
    implementing a DBMS reliability may be the
    second major concern
  • In contrast, accuracy of answers is at least
    as important as efficiency for a Web search
    engine
  • In the future, accuracy of answers will likely
    become more important
Write a Comment
User Comments (0)
About PowerShow.com