Carnegie Mellon Univ. Dept. of Computer Science 15415 Database Applications - PowerPoint PPT Presentation

About This Presentation
Title:

Carnegie Mellon Univ. Dept. of Computer Science 15415 Database Applications

Description:

in short: 'we want tables where the attributes depend on the primary ... split offending table in two (or more), eg.: 15-415 - C. Faloutsos. 11. Carnegie Mellon ... – PowerPoint PPT presentation

Number of Views:72
Avg rating:3.0/5.0
Slides: 54
Provided by: christosf
Learn more at: http://www.cs.cmu.edu
Category:

less

Transcript and Presenter's Notes

Title: Carnegie Mellon Univ. Dept. of Computer Science 15415 Database Applications


1
Carnegie Mellon Univ.Dept. of Computer
Science15-415 - Database Applications
  • C. Faloutsos
  • Database design and normalization

2
Overview
  • Relational model
  • formal query languages
  • commercial query languages (SQL)
  • Integrity constraints
  • domain I.C., foreign keys
  • functional dependencies
  • DB design and normalization

3
Overview - detailed
  • DB design and normalization
  • pitfalls of bad design
  • decomposition
  • normal forms

4
Goal
  • Design good tables
  • sub-goal1 define what good means
  • sub-goal2 fix bad tables
  • in short we want tables where the attributes
    depend on the primary key, on the whole key, and
    nothing but the key
  • Lets see why, and how

5
Pitfalls
  • takes1 (ssn, c-id, grade, name, address)

6
Pitfalls
  • Bad - why? because ssn-gtaddress, name

7
Pitfalls
  • Redundancy
  • space
  • (inconsistencies)
  • insertion/deletion anomalies

8
Pitfalls
  • insertion anomaly
  • jones registers, but takes no class - no place
    to store his address!

9
Pitfalls
  • deletion anomaly
  • delete the last record of smith (we lose his
    address!)

10
Solution decomposition
  • split offending table in two (or more), eg.

?
?
11
Overview - detailed
  • DB design and normalization
  • pitfalls of bad design
  • decomposition
  • lossless join decomp.
  • dependency preserving
  • normal forms

12
Decompositions
  • there are bad decompositions we want
  • lossless and
  • dependency preserving

13
Decompositions - lossy
  • R1(ssn, grade, name, address) R2(c-id, grade)

ssn-gtname, address ssn, c-id -gt grade
14
Decompositions - lossy
  • can not recover original table with a join!

ssn-gtname, address ssn, c-id -gt grade
15
Decompositions
  • example of non-dependency preserving

S -gt address
S -gt status
S -gt address, status address -gt status
16
Decompositions
  • (drill is it lossless?)

S -gt address, status address -gt status
S -gt address
S -gt status
17
Decompositions - lossless
  • Definition
  • consider schema R, with FD F. R1, R2 is a
    lossless join decomposition of R if we always
    have
  • An easier criterion?

18
Decomposition - lossless
  • Theorem lossless join decomposition if the
    joining attribute is a superkey in at least one
    of the new tables
  • Formally

19
Decomposition - lossless
  • example

R2
R1
ssn-gtname, address
ssn, c-id -gt grade
ssn-gtname, address ssn, c-id -gt grade
20
Overview - detailed
  • DB design and normalization
  • pitfalls of bad design
  • decomposition
  • lossless join decomp.
  • dependency preserving
  • normal forms

21
Decomposition - depend. pres.
  • informally we dont want the original FDs to
    span two tables - counter-example

S -gt address
S -gt status
S -gt address, status address -gt status
22
Decomposition - depend. pres.
  • dependency preserving decomposition

S -gt address
address -gt status
S -gt address, status address -gt status
(but S-gtstatus ?)
23
Decomposition - depend. pres.
  • informally we dont want the original FDs to
    span two tables.
  • More specifically the FDs of the canonical
    cover.

24
Decomposition - depend. pres.
  • why is dependency preservation good?

S -gt address
S -gt address
address -gt status
S -gt status
(address-gtstatus lost)
25
Decomposition - depend. pres.
  • A eg., record that Philly has status A

S -gt address
S -gt address
address -gt status
S -gt status
(address-gtstatus lost)
26
Decomposition - conclusions
  • decompositions should always be lossless
  • joining attribute -gt superkey
  • whenever possible, we want them to be dependency
    preserving (occasionally, impossible - see STJ
    example later)

27
Overview - detailed
  • DB design and normalization
  • pitfalls of bad design
  • decomposition (-gt how to fix the problem)
  • normal forms (-gt how to detect the problem)
  • BCNF,
  • 3NF
  • (1NF, 2NF)

28
Normal forms - BCNF
  • We saw how to fix bad schemas -
  • but what is a good schema?
  • Answer good, if it obeys a normal form,
  • ie., a set of rules.
  • Typically Boyce-Codd Normal form

29
Normal forms - BCNF
  • Defn. Rel. R is in BCNF wrt F, if
  • informally everything depends the full key, and
    nothing but the key
  • semi-formally every determinant (of the cover)
    is a candidate key

30
Normal forms - BCNF
  • Example and counter-example

ssn-gtname, address
ssn-gtname, address ssn, c-id -gt grade
31
Normal forms - BCNF
  • Formally for every FD a-gtb in F
  • a-gtb is trivial (a superset of b) or
  • a is a superkey (or both)

32
Normal forms - BCNF
  • Theorem given a schema R and a set of FD F, we
    can always decompose it to schemas R1, Rn, so
    that
  • R1, Rn are in BCNF and
  • the decompositions are lossless.
  • (but, some decomp. might lose dependencies)

33
Normal forms - BCNF
  • How? algorithm in book - essentially, break off
    FDs of the cover
  • eg. TAKES1(ssn, c-id, grade, name, address)
  • ssn -gt name, address
  • ssn, c-id -gt grade

34
Normal forms - BCNF
  • eg. TAKES1(ssn, c-id, grade, name, address)
  • ssn -gt name, address ssn, c-id -gt grade

35
Normal forms - BCNF
ssn-gtname, address
ssn, c-id -gt grade
ssn-gtname, address ssn, c-id -gt grade
36
Normal forms - BCNF
  • pictorially we want a star shape

not in BCNF
37
Normal forms - BCNF
  • pictorially we want a star shape

or
38
Normal forms - BCNF
  • or a star-like (eg., 2 cand. keys)
  • STUDENT(ssn, st, name, address)


39
Normal forms - BCNF
  • but not

or
40
Normal forms - 3NF
  • consider the classic case
  • STJ( Student, Teacher, subJect)
  • T-gt J
  • S,J -gt T
  • is it BCNF?

41
Normal forms - 3NF
  • STJ( Student, Teacher, subJect)
  • T-gt J S,J -gt T
  • How to decompose it to BCNF?

42
Normal forms - 3NF
  • STJ( Student, Teacher, subJect)
  • T-gt J S,J -gt T
  • 1) R1(T,J) R2(S,J)
  • (BCNF? - lossless? - dep. pres.? )
  • 2) R1(T,J) R2(S,T)
  • (BCNF? - lossless? - dep. pres.? )

43
Normal forms - 3NF
  • STJ( Student, Teacher, subJect)
  • T-gt J S,J -gt T
  • 1) R1(T,J) R2(S,J)
  • (BCNF? YY - lossless? N - dep. pres.? N )
  • 2) R1(T,J) R2(S,T)
  • (BCNF? YY - lossless? Y - dep. pres.? N )

44
Normal forms - 3NF
  • STJ( Student, Teacher, subJect)
  • T-gt J S,J -gt T
  • in this case impossible to have both
  • BCNF and
  • dependency preservation
  • Welcome 3NF!

45
Normal forms - 3NF
  • STJ( Student, Teacher, subJect)
  • T-gt J S,J -gt T

informally, 3NF forgives the red arrow in the
can. cover
46
Normal forms - 3NF
  • Formally, a rel. R with FDs F is in 3NF if
    for every a-gtb in F
  • it is trivial or
  • a is a superkey or
  • each b-a attr. part of a cand. key
  • STJ( Student, Teacher, subJect)
  • T-gt J S,J -gt T

47
Normal forms - 3NF
  • how to bring a schema to 3NF?
  • algo in book
  • for each FD in the cover, put it in a table

48
Normal forms - 3NF vs BCNF
  • If R is in BCNF, it is always in 3NF (but not
    the reverse)
  • In practice, aim for
  • BCNF lossless join and dep. preservation
  • if impossible, we accept
  • 3NF but insist on lossless join and dep.
    preservation

49
Normal forms - more details
  • why 3NF? what is 2NF? 1NF?
  • 1NF attributes are atomic (ie., no set-valued
    attr., a.k.a. repeating groups)

not 1NF
50
Normal forms - more details
  • 2NF 1NF and non-key attr. fully depend on the
    key
  • counter-example TAKES1(ssn, c-id, grade, name,
    address)
  • ssn -gt name, address ssn, c-id -gt grade

51
Normal forms - more details
  • 3NF 2NF and no transitive dependencies
  • counter-example

in 2NF, but not in 3NF
52
Normal forms - more details
  • 4NF, multivalued dependencies etc IGNORE
  • in practice, E-R diagrams usually lead to tables
    in BCNF

53
Overview - conclusions
  • DB design and normalization
  • pitfalls of bad design
  • decompositions (lossless, dep. preserving)
  • normal forms (BCNF or 3NF)
  • everything should depend on the key, the whole
    key, and nothing but the key
Write a Comment
User Comments (0)
About PowerShow.com