Applied Mathematics for Database Professionals - PowerPoint PPT Presentation

1 / 37
About This Presentation
Title:

Applied Mathematics for Database Professionals

Description:

Logic and set theory provide discourse to: ... Scope of constraint = the table. Phase 4: Table Universes. tab EMP = { E | E (tup EMP) ... – PowerPoint PPT presentation

Number of Views:108
Avg rating:3.0/5.0
Slides: 38
Provided by: Koppe
Category:

less

Transcript and Presenter's Notes

Title: Applied Mathematics for Database Professionals


1
Applied MathematicsforDatabase Professionals
  • Part two The Application

Lex de Haan and Toon Koppelaars
2
Recap Logic
  • Predicate, proposition
  • Logical connectives
  • Rewrite rules
  • Quantifiers
  • Nesting of quantifiers
  • More rewrite rules

3
Recap Set Theory
  • Set specification
  • Union, intersection, difference
  • Cardinality, empty set
  • Subset, powerset
  • Ordered pair
  • Functions, set functions
  • Generalized product

4
Goal
  • Logic and set theory provide discourse to

Specify database models including all involved
data integrity constraints in a formal way
Database modeling Integrity constraint modeling
5
Repeat notice
  • Take a few seconds to
  • Clear your mind
  • Forget everything you know about data modeling,
    ER-model, UML,
  • Be spongy there is a lot to tell

6
Database Model
  • Model of real world we want to capture
  • Every state of real world maps to state of
    model
  • A database state
  • We do not want model to be able to take on states
    that cannot (or should not) happen in real world
  • Integrity constraints

7
Database Model
lt - - - - - - - - - - application - - - - - - - -
- - - - - gt
The Real World
A model of TRW
Queries Transactions
Map
Robust database
Business Logic
Data Integrity Constraints
Business Rules
8
Database Model
  • What is a database state?
  • Database state is a set of tables
  • Table is a set of tuples
  • Tuple is a set of attribute-value pairs
  • The set of allowed database states is called
    database universe
  • Database fairly complex variable
  • Database universe its data type

9
Defining this Data Type
Object characterizations
10
Example Database Model
HIST
TERM
SREP
EMP
DEPT
CRS
OFFR
REG
GRD
11
Phase 1 Database Skeleton
  • Bottom-up construction of a DB-Universe
  • Database SkeletonPer table, what are the
    attributes
  • Defines our vocabulary

12
Phase 1 Database Skeleton
DB-S (EMP -- Employees EMPNO
/ Employee number / , ENAME
/ Employee name / , JOB
/ Employee job / , BORN
/ Date of birth / , HIRED
/ Date hired / , MSAL
/ Monthly salary / , DEPTNO )
/ Department number / , (DEPT --
Departments DEPTNO / Department
number / , DNAME / Department
name / , LOC / Location
/ , MGR ) / Manager
emp-number /
13
Phase 1 External Predicates
Bridge to the real world
  • Employee ENAME has unique employee number EMPNO,
    job JOB, was born at BORN, is hired at HIRED, has
    a monthly salary of MSAL dollars and works for
    the department with department number DEPTNO.
  • Department DNAME has the unique department number
    DEPTNO, is located at LOC, and is managed by the
    employee with employee number MGR.

14
Phase 2 Object Characterizations
  • Per attribute of tables introduced in
    skeletonWhat is its value set?
  • The flesh on the skeleton

15
Phase 2 Object Characterizations
ochEMP ( EMPNO n n?num(4,0) ? n gt 999
) , ( ENAME vch(9)
) , ( JOB s s?vch(9) ?
s?'PRESIDENT','MANAGER'
,'SALESREP','TRAINER','ADMIN') , ( BORN
date ) , ( HIRED
date ) , ( MSAL
n n?num(7,2) ? n gt 0 ) , (
DEPTNO n n?num(2,0) ? n gt 0 )
Use available data types as driving sets
16
Phase 2 Object Characterizations
ochDEPT ( DEPTNO n n?num(2,0) ? n gt 0
) , ( DNAME vch(10) ) ,
( LOC vch(8) ) , ( MGR
n n?num(4,0) ? n gt 999 )
17
Phase 3 Tuple Universes
  • Possible tuples
  • Generalized product of object characterization
  • Add to that tuple constraints ? Tuple Universes
  • Predicates over different attributes in same
    tuple
  • Scope of constraint the tuple

18
Phase 3 Tuple Universes
tup-EMP e e??(ochEMP) ? / We hire
adult employees only / e(BORN)
18 ? e(HIRED) ? / The president earns more
than 10K monthly / e(JOB) ? 'PRESIDENT' ?
e(MSAL) ? 10000 ? / Administrators earn
less than 5K monthly / e(JOB) ? 'ADMIN' ?
e(MSAL) ? 5000
tupDEPT d d??(ochDEPT)
19
Phase 4 Table Universes
  • Possible tables
  • Powerset of tuple universe
  • Set of all subsets
  • Add to that table constraints ? Table Universes
  • Predicates over different tuples in same table
  • Scope of constraint the table

20
Phase 4 Table Universes
tabEMP E E??(tupEMP) ? / EMPNO uniquely
identifies an employee / ( ?e1,e2?E
e1(EMPNO)?e2(EMPNO) ? e1?e2 ) ? / At most one
president allowed / e e?E ?
e(JOB)?'PRESIDENT' ? 1 ? / A department that
employs the president or / / a manager should
also employ at least one / / administrator
/ ( ?d? e1(DEPTNO)
e1?E ( ?e2?E e2(DEPTNO)?d ?
e2(JOB)?'PRES','MGR' ) ? ( ?e3?E
e3(DEPTNO)?d ? e3(JOB)?'ADMIN' ) )
21
Phase 4 Table Universes
tabDEPT D D??(tupDEPT) ? / Department
number uniquely identifies tuple / ( ?d1,d2?D
d1(DEPTNO)?d2(DEPTNO) ? d1?d2 ) ? /
Department name and location uniquely /
/ identify a tuple
/ ( ?d1,d2?D d1?DNAME,LOC?d2?DNAME,LOC
? d1?d2 ) ? / You cannot manage more than
two departments / ( ?m? d(MGR) d?D
d d?D ? d(MGR)?m ? 2 )
22
Phase 5 Database Universe
  • Possible DB-states
  • Generalized product of set of (Table-nametable-un
    iverse) pairs
  • Add to that database constraints ? Database
    Universe
  • Predicates over different tables in same database
    state
  • Scope of constraint more than 1 table

23
Phase 5 Database Universe
DB-U s s??((EMPtab-EMP),(DEPTtab-DEPT))
? / Employee works for a known department
/ e(DEPTNO) e?s(EMP) ? d(DEPTNO)
d?s(DEPT) ? / Dept manager is a known
employee, / excluding admins and president
/ d(MGR) d?s(DEPT) ?
e(EMPNO) e?s(EMP) ? e(JOB)?'ADMIN','PRES'
? / Department manager must work for
/ / a department he/she manages
/ ( ?d1?s(DEPT) e(DEPTNO) e?s(EMP) ?
e(EMPNO)?d1(MGR) ? d2(DEPTNO) d2?s(DEPT)
? d2(mgr)d1(mgr) )
24
Phase 6 Transaction Universe
  • Possible DB-state changes
  • Directed graph on top of DB states
  • Add to that transition-constraints ? Transaction
    Universe
  • See the book

25
Goal
  • Logic and set theory provide discourse to

Specify database models including all involved
data integrity constraints in a formal way
FORMAL Clear and precise
26
Benefits of This Approach
  • Formal specification ? no ambiguity
  • No meaning added by programmer
  • Managers always manage themselves
  • What are managers? What does manage
    themselves mean?
  • Two salesman on duty during vacation of manager
  • The same two the whole vacation?

27
Benefits of This Approach
  • 2. Warning (!) and benefit
  • Do NOT confront users with this formalism
  • DB-professional deals with users in informal way
  • Continually tries to map rules to model
  • By doing so the right counter-questions will
    arise and trust will grow

The Real World
Map
Counter
Informal
The Formal Model
28
Benefits of This Approach
  • Documentation dense
  • Instead of 100 pages printout from repository
  • 20 page document
  • Risk average programmer unable to comprehend
    formal specification
  • No knowledge of logic or set theory
  • Create separate team of data model specialists
  • Always embed informal description too

29
Benefits of This Approach
  • Forces you to think clearly
  • Set-oriented
  • But also speak clearly
  • Has spin-off to (complex) query design
  • Do the formal spec first
  • Play around with rewrite rules
  • Then do the SQL

30
Benefits of This Approach
  • Can be used to specify business logic too
  • I.e. queries and transactions
  • With same benefits
  • Separate chapters in book

lt - - - - - - - - - - - - - application - - - - -
- - - - - gt
The Real World
A model of TRW
Queries Transactions
Map
Robust database
Business Logic
31
Benefits of This Approach
  • Scope-of-data-constrained drives classification
  • Closely related to implementation issues
  • More issues with more scope
  • Similar issues within same scope!
  • Separate chapter in book on implementation
    strategies

32
Implementation Issues
  • Why do we create?
  • Primary keys
  • Unique keys
  • Foreign keys
  • Check constraints
  • Not because we have to

33
Implementation Issues
  • DBMS vendors have barely begun to offer support
    (table database constraints)
  • Tricks using MVs, complex triggers
    (serializability!)
  • Huge opportunity for DBMS vendors!
  • CDMRuleFrame (consulting NL)
  • Based on 1994 paper (EOUG Maastricht)
  • Based on Oracle7 technology
  • RuleGen (personal research project)
  • Based on many further insights
  • Based on 10G technology

34
Summary
  • Database modeling integrity constraint modeling
  • Its all applied logic and set theory
  • If you choose to document them
  • Do it non-ambiguously
  • If you choose to implement them
  • Push for support in future DBMS versions
  • More explanation in the book
  • More examples in the book

35
End Of Part Two
Applied MathematicsforDatabase Professionals
Part two The Application
Lex de Haan and Toon Koppelaars t.koppelaars_at_centr
aal.boekhuis.nl http//web.inter.nl.net/users/T.Ko
ppelaars
Thank you for your attention I appreciate your
evaluation
36
(No Transcript)
37
RuleGen
  • Takes care of
  • Mutating table issue
  • Locking
  • Deferring
  • You supply
  • the when of the rule
  • the how of the rule
  • Including serialization logic
  • All code 100 generated
Write a Comment
User Comments (0)
About PowerShow.com