O-O, What Are They Doing - PowerPoint PPT Presentation

About This Presentation
Title:

O-O, What Are They Doing

Description:

Object views (via a novel approach) Virtual table hierarchies for flexible access control ... create table emp of Emp_t under person (dept with options scope dept) ... – PowerPoint PPT presentation

Number of Views:109
Avg rating:3.0/5.0
Slides: 37
Provided by: MikeC159
Learn more at: https://dsf.berkeley.edu
Category:
Tags: create | doing | person | virtual

less

Transcript and Presenter's Notes

Title: O-O, What Are They Doing


1
  • O-O, What Are They Doing
  • to Relational Databases?
  • (The Evolution of DB2 Universal Database)

Michael J. Carey IBM Almaden January 1999
2
  • Plan for Today's Presentation
  • The relational DBMS revolution
  • The object-relational DBMS evolution
  • O-R features in DB2 Universal Database V5.2
  • Some O-R implementation tradeoffs (V5.2)
  • What lies ahead for DB2 UDB O-R databases?
  • Questions (and possibly answers)

Please ask questions throughout...!
3
  • The Relational DBMS Revolution
  • The pre-relational era (1970's)
  • Graph-based data models
  • Hierarchical (IMS), network (Codasyl)
  • Low-level, navigational interfaces
  • Labor-intensive and error-prone
  • The relational era (1980's)
  • Simple, abstract data model
  • Database set of relations ("tables")
  • 3 schema levels views, base tables, physical
    schema
  • Algebra of set-oriented operations
  • High-level, declarative interfaces
  • SQL, QBE, et al
  • Embedded languages, 4GLs

4
  • The Relational Model (in one slide)
  • Employees and departments

Department
dno
name

10
Toy

20
Shoe
Employee
eno
name
salary
dept

1
Lou
10000000
10

7
Laura
150000
20
?

22
Mike
80000
20
select E.name, E.salary, D.no from Employee E,
Department D where E.salary lt 100000 and D.name
'Shoe' and E.dept D.dno
5
  • Relational Databases A Success Story
  • The relational model has been a big success
  • Simplicity has made research tractable
  • Data independence yields productivity gains
  • Both academia and industry have benefitted
  • Relational DBMS "goodies" include
  • Efficient query optimization and execution
  • Well-defined transaction semantics and support
  • Excellent multi-user performance and robustness
  • Views for data independence, authorization
  • Constraints, triggers, and stored procedures for
    (shared) business rule capture/enforcement
  • "The" success story for parallel computing

6
  • We've Achieved Nirvana ... Right?
  • The world is becoming increasingly complex
  • New data types are appearing (e.g., multimedia)
  • Real-world data doesn't fit neatly into tables
  • Entities and relationships (vs. tables)
  • Variance among entities (vs. homogeneity)
  • Set-valued attributes (vs. normalization)
  • Advanced applications bring complex data
  • E.g., CAD/CAM data management, web data
    management, geographic information management,
    medical data management, (your favorite
    application goes here)
  • So maybe objects are the answer...?
  • Yes, if we can keep all the relational "goodies"!

7
  • The Object-Relational DBMS Evolution
  • O-R extension 1 Abstract data types (ADTs)
  • New column types and functions/methods
  • E.g., text, image, audio, video, time series,
    point, line, OLE...
  • For modeling new kinds of facts about enterprise
    entities
  • Infrastructure for extenders/datablades/cartridges
  • O-R extension 2 Row types
  • Types and functions/methods for rows of tables
  • Desirable features include references,
    inheritance, methods, late binding, and
    collection-valued attributes
  • For modeling enterprise entities with
    relationships behavior
  • Infrastructure for DBMS-native object management
  • Recent SQL3 merger Structured types
  • Can use for types of columns and/or tables

8
  • "Not Your Father's Employee Type"
  • Beyond name, rank, and serial number
  • New attribute types
  • Location (2-d point), job description (text),
    photo (image), ...
  • Associated functions
  • Distance(point, point), contains(text, string),
    ...
  • Beyond your basic employee record
  • Employees come in different flavors
  • Emp, RSM, Programmer, Manager, Temp, ...
  • Employees have many known relationships
  • Manager, department, projects, ...
  • Employees have behavior
  • Age(Emp), qualified(Emp, Job), hire(Emp), ...

An Employee is a simple "business object"
9
The OSF Project at IBM Almaden
  • OSF stands for "Object Strike Force"
  • Semi-autonomous group "outside" UDB development
  • Focus object-relational extensions for DB2 UDB
  • Both near-term and longer-term interests
  • Collaborate with our Toronto and Santa Teresa
    labs
  • Significant activities to date
  • Prototyped "row type" support for DB2 UDB
  • Delivered in DB2 UDB Version 5.2 (9/98)
  • Significantly revised SQL3 draft standard
  • Working on next step plus future technologies

10
DB2 Universal Database, Version 5
  • DB2 for Common Servers (Version 2)
  • User-defined column types (UDTs/distinct types)
  • User-defined functions (UDFs)
  • Binary/character large objects (BLOBs/CLOBs)
  • Distinct types new data types for columns
  • Ex create distinct type US_Dollar as Real with
    comparisons
  • US_Dollar is an available UDT with functions ,
    ltgt, lt, lt, gt, gt, US_Dollar(Real),
    Real(US_Dollar)
  • User-defined functions associated operations
  • create function CA_Tax (US_Dollar) returns
    US_Dollar external name 'money!US_Dollar'
    language C

11
DB2 Universal Database, Version 5 (cont.)
  • Lots of other interesting features as well, e.g.
  • Constraints and triggers
  • Recursive queries
  • OLAP support (cube and rollup)
  • Extenders (based on UDTs/UDFs)
  • Wide range of hardware/software platforms
  • PCs Windows95, NT, OS/2, SCO
  • Unix workstations AIX, Solaris, HP/UX
  • Parallel platforms SMPs, MPPs (e.g., SP2)
  • Descended from Almaden's Starburst system
  • Extensible query compiler (with rule-based query
    rewrite and query optimizer components)

12
New O-R Features in DB2 UDB V5.2
  • Structured types and references
  • Named types with attributes, O-O subtyping model
  • Ref(T) for directly modelling relationships
  • Typed tables and table hierarchies
  • Oid (user-provided) plus a column per attribute
    of T
  • Subtables for querying and managing subtype
    instances
  • Query language extensions
  • Substitutability for queries/updates (data
    independence )
  • Path expressions for querying relationships
    easily
  • Functions/predicates for runtime type inquiries
  • Object views (via a novel approach)
  • Virtual table hierarchies for flexible access
    control
  • Also facilitates O-O views of legacy tables

13
  • A Simple Example
  • Employee and department tables in the (late) 90's

mgr
dept
14
  • Structured Types and References
  • Create structured types (and references)

create type Person_t as ( name Varchar(40),
birthyear Integer ) create type Emp_t under
Person_t as ( salary Integer, dept
Ref(Dept_t) ) create type Exec_t under Emp_t as
( bonus Integer ) create type Student_t
under Person_t as ( major Varchar(20) )
15
  • Structured Types and References (cont.)
  • Create structured types (cont).

create type Dept_t as ( name Varchar(20),
budget Integer, headcount Integer,
mgr Ref(Emp_t) )
  • Okay, so I lied (a little) on the last slide...

alter type Emp_t add attribute dept Ref(Dept_t)
16
  • Typed Tables and Table Hierarchies
  • Now create typed tables (and subtables)

create table person of Person_t (ref is oid
user generated) create table emp of Emp_t under
person (dept with options scope
dept) create table exec of Exec_t under
emp create table student of Student_t under
person create table dept of Dept_t (ref is
oid user generated, mgr with options scope
emp)
17
  • SQL Query Extensions (by example)
  • Substitutability

select E. from emp E where E.birthyear gt 1970
and E.salary gt 50000
  • Data modification (insert update/delete)

insert into emp values (Emp_t('e100'), 'John
Smith', 1968, 65000, (select oid
from dept where name 'Database')) update
person set birthyear birthyear 1 where name
'John Smith'
  • Path expressions

select E.name, E.dept-gtname from emp E where
E.dept-gtmgr-gtdept-gtmgr-gtname 'Lou Gerstner'
18
  • Querying Table Hierarchies An Example

Dept
Person
name
birthyear
oid
P1
Harold
1970
P2
Carol
1958
Emp
name
birthyear
oid
dept
P3
Hamid
1956
_
P4
Lou
1940
1940
19
  • SQL Query Extensions (cont.)
  • Support for type-dependent queries

select from only (emp) E where dept-gtbudget gt
10000000 select name from person P where
deref(oid) is of type (only Emp_t,
Student_t) select type_name(deref(E.oid)),
E. from outer (emp) E where e.oid Emp_t('e13')
20
  • Other Data Definition Features
  • ref is for object id column
  • Unique, user-generated (on insert)
  • scope clause for reference columns
  • Provides critical information to the query
    optimizer
  • not null constraints
  • Definable at any level of a table hierarchy
  • Enforced for indicated table and its subtables
  • unique constraints
  • Root level (and columns) only
  • create index for physical schema
  • Unique or non-unique index on root table
  • Non-unique index on subtable

21
  • Other Data Definition Features (cont.)
  • Authorization model for table hierarchies
  • grant and revoke on table or subtables
  • Substitutability implicit subtable authorization
    on columns inherited from an authorized
    supertable
  • Ex 1 select privilege on person table
  • Ex 2 update privilege on salary column of emp
    table
  • Some operations require authorization everywhere
  • deref function
  • is of type predicate and type_xxx functions
  • SQL3 also supports granting of table/subtable
    privileges with hierarchy option

22
  • Object Views in DB2 UDB
  • Typed views and view hierarchies

mgr
dept
  • Requirements virtual table hierarchies
  • Typed rows with (derived) object ids
  • Views may be quite different from base data
  • Support for interconnected "view schemas"

23
  • Types For Object Views
  • Create types for use in views

create type VPerson_t as ( name
Varchar(40) ) create type VEmp_t under
VPerson_t as ( dept Ref(VDept_t) ) create
type VStudent_t under VPerson_t as ( kind
Varchar(8) ) create type VDept_t as ( name
Varchar(20), mgr Ref(VEmp_t) )
24
  • Typed View Hierarchies
  • Now create typed views (and subviews)

create view vperson of VPerson_t (ref is oid user
generated) as select VPerson_t(Varchar(oid)),
name from only (person) create view vemp of
VEmp_t under vperson (dept with options scope
vdept) as select VEmp_t(Varchar(oid)), name,
VDept_t(Varchar(dept)) from emp where salary
gt 0 create view vstudent of VStudent_t under
vperson as select VStudent_t(Varchar(oid)),
name, case when major like
'Engineer' then 'Geek'
else 'non-Geek' end from student create
view vdept of VDept_t ...
25
O-R Implementation Issues/Tradeoffs
  • Some guiding principles for DB2 UDB V5.2
  • Performance must equal/exceed relational
    equivalents
  • Design amenable to future plans w.r.t. type
    evolution
  • Structured types must be supported in columns
    (someday)
  • Localize initial changes to query compiler where
    possible
  • Want "free" indexing, rewrites, optimization,
    parallelization, ...
  • Influenced by discussions with a CAD/CAM vendor
  • Information on existing approach and
    installations
  • Requirements for efficiency of new products
  • Let's look briefly at two areas
  • Table hierarchy representation
  • References and path query processing

26
Implementing Table Hierarchies
  • Implementation table approach
  • One physical table per table hierarchy with
  • Type tag column (to distinguish subtable rows)
  • Object id column
  • Columns for all columns of the root table and its
    subtables
  • Vertical partitioning approach
  • One physical root table with
  • Type tag column
  • Object id column
  • Columns for each root table column
  • N physical delta tables (one per subtable) with
  • Object id column
  • Columns for each column introduced by this
    subtable

27
Implementing Table Hierarchies (cont.)
  • Horizontal partitioning approach
  • N separate physical tables with
  • Object id column
  • Columns for every subtable column (inherited or
    not)
  • So what did we do for UDB V5.2...?
  • Vertical partitioning approach rejected quickly
  • Too many joins to materialize subtable rows
  • Multi-column constraints and indices problematic
  • Horizontal partitioning approach rejected
    eventually
  • Uniqueness issue for user-generated object ids
  • Query complexity for multi-hierarchy join queries
  • Ex select p.name, q.name from Person p, Person
    q where ...
  • Implementation table approach taken for V5.2
  • Appeared to give us the most "free" functionality
  • Adopted despite row size (null columns) downside

28
References and Path Expressions
  • Reference values in tables should have a scope
  • "Other end" info for query rewrite and join
    optimization
  • Ditto for authorization checking (static vs.
    dynamic)
  • Schema makes overly wide references unnecessary
  • Uniqueness is hierarchy-relative, enforced with
    an index
  • V5.2 self-references (object ids) are
    user-generated
  • CAD/CAM vendor had "legacy references" in files
  • Different users have different id generation
    schemes
  • Loading cyclic data (e.g., emplt-gt dept) is messy
    and slow
  • Ditto for creating objects from an object cache

29
References and Path Expressions (cont.)
  • Path expressions are logically equivalent to
    subqueries

select E.name, E.dept-gtname, E.dept-gtmgr-gtname fr
om emp E where E.dept-gtheadcount gt 10
  • Actual approach shared subquery generation (QGM)
  • Compute common paths (prefixes) once to save work
  • Not every SQL context accepts an actual subquery
  • Also need to handle non-serializable locking
    levels
  • Efficiency obtained through query rewrite, e.g.
  • Subquery to outer-join transformation
  • Outer-join to join transformation where possible

30
Where We Are Today in UDB
  • V5.2 of UDB contains new O-R features
  • Structured types with inheritance
  • Object tables and table hierarchies
  • References and path expressions
  • Object views and view hierarchies
  • Moreover, so does the SQL3 standard
  • Includes object views and user-defined references
  • IBM, Oracle, Informix heading in same general
    direction
  • Work continuing on O-R extensions
  • Let's have a brief look...

31
Additional Object Table Support
  • Business rule mechanisms for typed tables
  • Check constraints on tables/subtables
    (w/inheritance)
  • Referential integrity constraints to and from
    tables/subtables
  • Triggers on tables/subtables
  • Object modeling and management extensions
  • User-defined reference types (ref using)
  • More flexible object view definitions
  • Type and instance (i.e., row) evolution
  • Structured types for attributes/columns
  • Work in progress at IBM Santa Teresa Lab
  • Functions/methods just around the corner as well

32
Other Exploratory O-R Work
  • Efficient support for collection types
  • Multivalued attributes (e.g., Project.team)
  • Flavors set, multiset, array, list, ...
  • Need to integrate into SQL, support querying well
  • Some experience from a first prototype
  • Other activities (and open problems)
  • Java mappings bindings for O-R data
  • XML data-centric web sites ("d-commerce")
  • Business object servers (caching/consistency)
  • Heterogeneous data O-R database systems
  • User-defined and/or external indexing
  • Optimizer "hooks" for new data types
  • Etc.!

33
Partial List of UDB O-R Contributors
  • Almaden Research Center
  • Mike Carey, Don Chamberlin, Srinivasa Narayanan,
    Bennet Vance C.M. Park Guido Moerkotte
  • Santa Teresa Lab
  • Nelson Mattos
  • Gene Fuh, Michelle Jou, Brian Tran
  • Toronto Lab
  • Doug Doole, Serge Rielau, Rick Swagerman
  • Leo Lao, Walid Rjaibi, Calisto Zuzarte
  • Cheryl Greene, various other consultants/hecklers
  • And as for future versions of UDB
  • Your name could appear here! (MS/PhD)

34
  • The End

35
  • What About Object-Oriented DBMSs?
  • OOPL DBMS OO-DBMS
  • Commonly based on C or Smalltalk
  • Persistence, collections, queries, versions, ...
  • Lots of interesting and useful research results
  • O-O data models and query languages
  • O-O query processing, system architecture,
    performance
  • Various products (O2, Objectstore, Versant,
    Objectivity, ...)
  • No widespread commercial acceptance
  • Many differences across systems (despite ODMG-93)
  • Never really caught up to RDBMS techology
  • Schema compilation, evolution painful
  • Missing many of the relational "goodies"
  • Single-language focus, lack of (relational) tools

36
  • Stonebraker Fellow Criteria (found on web)
  • Industrial database researcher
  • PhD from UC Berkeley
  • Must agree with the following motto
  • Databases are the answer...!
  • What was the question again...?
  • At least 6' tall
  • Had a PhD thesis advisor with first name Mike
  • Produced a PhD student with first name Mike
Write a Comment
User Comments (0)
About PowerShow.com