Relational Database Design Chapter 8 [1 of 2] - PowerPoint PPT Presentation

1 / 61
About This Presentation
Title:

Relational Database Design Chapter 8 [1 of 2]

Description:

... {A B, ABCD E, EF GH ... This results in data duplication and an unnecessary dependency between the ... is a type of relationship between ... – PowerPoint PPT presentation

Number of Views:137
Avg rating:3.0/5.0
Slides: 62
Provided by: DaveH213
Learn more at: http://cs.union.edu
Category:

less

Transcript and Presenter's Notes

Title: Relational Database Design Chapter 8 [1 of 2]


1
Relational Database DesignChapter 8 1 of 2
  • Features of Good Relational DesignsAtomic
    Domains and First Normal FormDecomposition Using
    Functional DependenciesFunctional-Dependency
    Theory

2
Approaches toDesigning a Set of Relations
  • The 1st approach is (Chapter 7) a Top-Down Design
    (Relational Design by Analysis)
  • 1. Designing a conceptual schema in a high-level
    data model, such as the EER model
  • 2. Mapping the conceptual schema into a set of
    relations using mapping procedures.

3
Approaches toDesigning a Set of Relations
  • The 2nd approach is (Chapter 8) a Bottom-Up
    Design (Relational Design by Synthesis)
  • 1. First construct a minimal set of FDs
  • Assumes that all possible functional dependencies
    are known.
  • 2. a normalization algorithm is applied to
    construct a target set of 3NF or BCNF relations.
  • start by one large relation schema, called the
    universal relation, which includes all the
    database attributes.
  • then repeatedly perform decomposition until it is
    no longer feasible or no longer desirable, based
    on the functional and other dependencies
    specified by the database designer.

4
Well-Structured Relations
  • A relation that contains minimal data redundancy
  • Allows users to insert, delete, and update rows
    without causing data inconsistencies
  • Goal is to avoid "anomalies"

General rule of thumb A table should not pertain
to more than one entity type
5
Anomalies
  • An anomaly is an inconsistent, incomplete, or
    contradictory state of the database
  • Insertion anomaly user is unable to insert a
    new record when it should be possible to do so
  • Deletion anomaly when a record is deleted,
    other information that is tied to it is also
    deleted
  • Update anomaly a record is updated, but other
    appearances of the same items are not updated

6
Example
QuestionIs this a relation?
AnswerYes Unique rows
QuestionWhats the primary key?
AnswerComposite Emp_ID, Course_Title
7
Anomalies in this Table
  • Insertionwe need to specify a class when adding
    a new person, need to add person when adding a
    class
  • Deletionif we remove employee 140, we lose
    information about the existence of a Tax Acc
    class
  • Modificationgiving a salary increase to employee
    100 forces us to update multiple records

Why do these anomalies exist? Because there are
two themes (entity types) in this one relation.
This results in data duplication and an
unnecessary dependency between the entities
8
Example of Bad Design
DRINKER(name, addr, beersLiked, manf,
favBeer) name addr beersLiked manf favBeer Jane
way Voyager Bud A.B. WickedAle Janeway ??? Wicke
dAle Petes ??? Spock Enterprise Bud ??? Bud
Data is redundant, because each of the ???s can
be figured out since name implies addr favBeer
and beersLiked implies manf.
9
This Bad Design AlsoExhibits Anomalies
name addr beersLiked manf favBeer Janeway Voya
ger Bud A.B. WickedAle Janeway Voyager WickedAle
Petes WickedAle Spock Enterprise Bud A.B. Bud
  • Update anomaly if Janeway is transferred to
    Intrepid,
  • will we remember to change each of her tuples?
  • Deletion anomaly If nobody likes Bud, we lose
    track
  • of the fact that Anheuser-Busch manufactures
    Bud.

10
Objectives of Normalization
  • Develop a "good" description of the data, its
    relationships and constraints
  • Produce a stable set of relations that
  • Is a faithful model of the enterprise
  • Is highly flexible
  • Reduces redundancy
  • saves space and reduces inconsistency in data
  • Is free of update, insertion and deletion
    anomalies

11
Normal Forms
  • First Normal Form - 1NF
  • Second Normal Form - 2NF
  • Third Normal Form - 3NF
  • Boyce-Codd Normal Form - BCNF
  • Fourth Normal Form - 4NF
  • Fifth Normal Form - 5NF
  • Domain/Key Normal Form DKNF
  • Others

12
Steps in normalization
12
13
Data Normalization Background
  • Info Needed
  • all the attributes that must be incorporated into
    the database
  • a list of all the defining associations between
    the attributes (i.e., the functional
    dependencies).
  • a means of expressing that the value of one
    particular attribute is associated with a single,
    specific value of another attribute.
  • If we know that one of these attributes has a
    particular value, then the other attribute must
    have some other value.

14
First Normal Form
  • A relation is in 1NF if every attribute is
    single-valued for each tuple
  • Each cell of the table has only one value in it
  • Domains of attributes are atomic no sets, lists,
    repeating fields or groups allowed in domains

15
Table with multivalued attributes, not in 1st
normal form
15
16
Table with no multivalued attributes and unique
rows, in 1st normal form
Note this is a relation, but not a
well-structured one
16
17
Anomalies in this Table
  • Insertionif new product is ordered for order
    1007 of existing customer, customer data must be
    re-entered, causing duplication
  • Deletionif we delete the Dining Table from Order
    1006, we lose information concerning this item's
    finish and price
  • Updatechanging the price of product ID 4
    requires update in several records

Why do these anomalies exist? Because there are
multiple themes (entity types) in one relation.
This results in duplication and an unnecessary
dependency between the entities
18
Ensuring 1NF 1 of 3
  • Best solution For each multi-valued attribute,
    create a new table, in which you place the key of
    the original table and the multi-valued
    attribute. Keep the original table, with its key
  • Example
  • STUDENT(stuId, lastName, major, credits,
    status, socSecNo)
  • becomes
  • NEWSTUDENT(stuId, lastName, credits,status,
    socSecNo)
  • MAJORS(stuId, major)

19
Another method for 1NF 2 of 3
  • Flatten the original table by making the
    multi-valued attribute part of the key
  • Student(stuId, lastName, major, credits, status,
    socSecNo)

20
A Third Method for 1NF 3 of 3
  • If the number of repeats is limited, make
    additional columns for multiple values
  • Student(stuId, lastName, major1, major2, credits,
    status, socSecNo)
  • Hannay Reels
  • 3 Product Types, 2 Address Lines

21
Just Getting Started
  • First normal form is merely a starting point in
    the normalization process.
  • First normal form contains a great deal of data
    redundancy.

22
Functional Dependency
  • A functional dependency (FD) is a type of
    relationship between attributes
  • If A and B are sets of attributes of relation R,
    we say B is functionally dependent on A if each A
    value in R has associated with it exactly one
    value of B in R.
  • Recall the concept of "single-valued" function
  • Alternatively, if two tuples have the same A
    values, they must also have the same B values
  • Write A?B, read A functionally determines B, or B
    functionally dependent on A

23
Example of FDs
  • Let R be
  • NewStudent(stuId, lastName, major, credits,
    status, socSecNo)
  • FDs in R include
  • stuId ? lastName, but not the reverse
  • stuId ? lastName, major, credits, status,
    socSecNo, stuId
  • socSecNo ? stuId, lastName, major, credits,
    status, socSecNo
  • credits ? status, but not status?credits
  • A functional dependency is a generalization of
    the notion of a key.

24
"General Hardware" SALESPERSON and PRODUCT
Functional Dependencies
25
Beer Example
  • Drinkers(name, addr, beersLiked, manf, favBeer).
  • Reasonable FDs to assert
  • name ? addr
  • name ? favBeer
  • beersLiked ? manf

26
Related Attributes
name addr beersLiked manf favBeer Janeway
Voyager Bud A.B. WickedAle Janeway Voyager
WickedAle Petes WickedAle Spock Enterprise Bud
A.B. Bud
27
FDs With Multiple Attributes
  • No need for FDs with gt 1 attribute on right.
  • But sometimes convenient to combine FDs as a
    shorthand.
  • Example name ? addr and name ? favBeer become
    name ? addr, favBeer
  • gt 1 attribute on left may be essential.
  • Example bar, beer ? price

28
FDs From Laws of Physics
  • While most FDs come from E-R keyness and
    many-one relationships, some are really physical
    laws.
  • Example "no two courses can meet in the same
    room at the same time" tells us
  • hour, room ? course.

29
Trivial Functional Dependency
  • The FD X?Y is trivial if set Y is a subset of
    set X
  • Examples If A and B are attributes of R,
  • A?A
  • A,B ?A
  • A,B ?B
  • A,B ?A,B
  • are all trivial FDs

30
Full Functional Dependency
  • In relation R, set of attributes B is fully
    functionally dependent on set of attributes A of
    R, if B is functionally dependent on A but not
    functionally dependent on any proper subset of A
  • This means every attribute in A is needed to
    functionally determine B

31
Partial Functional Dependency
  • NewClass(courseNo, stuId, stuLastName, facId,
    schedule, room, grade)
  • FDs
  • courseNo,stuId ? grade
  • courseNo,stuId ? lastName partial FD
  • stuId ? lastName
  • courseNo,stuId ? facId partial FD
  • courseNo ? facId
  • courseNo,stuId ? schedule partial FD
  • courseNo ? schedule
  • courseNo,stuId ? room partial FD
  • courseNo ?room
  • plus trivial FDs that are partial

32
Functional Dependencies and Keys
  • Functional Dependency The value of one attribute
    (the determinant) determines the value of another
    attribute
  • Candidate Key
  • A unique identifier. One of the candidate keys
    will become the primary key
  • E.g. perhaps there is both credit card number and
    SS in a tablein this case both are candidate
    keys
  • Each non-key field is functionally dependent on
    every candidate key

33
Candidate Key
  • Which candidate key to pick depends on the
    application using the database.
  • Alternate key is a candidate key that was not
    chosen to be the primary key of the relation.

34
Second Normal Form
  • A relation is in second normal form (2NF) if it
    is in first normal form and all the non-key
    attributes are fully functionally dependent on
    the key.
  • No non-key attribute is FD on just part of the
    key
  • If key has only one attribute and R is 1NF, then
    R is automatically 2NF

35
Converting to 2NF
  • Identify each partial FD
  • Remove the attributes that depend on each of the
    determinants so identified
  • Place these determinants in separate relations
    along with their dependent attributes
  • In original relation keep the composite key and
    any attributes that are fully functionally
    dependent on all of it
  • Even if the composite key has no dependent
    attributes, keep that relation to connect the
    others

36
2NF Example
  • NEWCLASS(courseNo, stuId, stuLastName, facId,
    schedule, room, grade )
  • FDs grouped by determinant
  • courseNo ? courseNo, facId, schedule, room
  • stuId ? stuId, lastName
  • courseNo, stuId ? courseNo, stuId, facId,
    schedule, room, lastName, grade
  • Create tables grouped by determinants
  • COURSE(courseNo, facId, schedule, room)
  • STUD(stuId, lastName)
  • Keep relation with original composite key, with
    attributes FD on it, if any
  • NEWSTUD2( courseNo, stuId, grade)

37
General Hardware Company Second Normal Form
  • In SALESPERSON, Salesperson Number is the sole
    primary key attribute. Every nonkey attribute of
    the table is fully defined just by Salesperson
    Number.
  • Similar logic for PRODUCT and QUANTITY tables.

38
General Hardware Company Second Normal Form
39
Transitive Dependency
  • If A, B, and C are attributes of relation R, such
    that A ? B, and B ? C, then C is transitively
    dependent on A
  • Example
  • NewStudent (stuId, lastName, major, credits,
    status)
  • FD
  • credits?status
  • By transitivity
  • stuId?credits AND credits?status implies
    stuId?status
  • Transitive dependencies cause update, insertion,
    deletion anomalies.

40
Third Normal Form
  • To be 3NF, relation must be 2NF and have no
    transitive dependencies
  • No non-key attribute determines another non-key
    attribute. Here key includes candidate key

41
Making a relation 3NF
  • For example,
  • NEWSTUDENT(stuId, lastName, major, credits,
    status)
  • with FD credits?status
  • Remove the dependent attribute, status, from the
    relation
  • Create a new table with the dependent attribute
    and its determinant, credits
  • Keep the determinant, but not the dependent
    attribute, in the original table
  • NEWSTUD2 (stuId, lastName, major, credits)
  • STATS (credits, status)

42
Functional dependency diagram for INVOICE
Order_ID ? Order_Date, Customer_ID,
Customer_Name, Customer_Address
Customer_ID ? Customer_Name, Customer_Address
Product_ID ? Product_Description, Product_Finish,
Unit_Price
Order_ID, Product_ID ? Order_Quantity
Therefore, NOT in 2nd Normal Form
43
Removing partial dependencies
Getting it into Second Normal Form
Partial dependencies are removed, but there are
still transitive dependencies
44
Removing Transitive dependencies
Getting it into Third Normal Form
Transitive dependencies are removed
44
45
General Hardware Company Second Normal Form
46
General Hardware Company Third Normal Form
47
Third Normal Form
  • Important points about the third normal form
    structure are
  • It is completely free of data redundancy.
  • All foreign keys appear where needed to logically
    tie together related tables.

48
Violating 3NF?
  • Zip Codes determine City State
  • AND Company (therefore CustID) determines Zip
  • Hannay Reels CustomerMaster does not split out
    zip/city/state into another table
  • Why might this be?
  • I screwed up???
  • ease/speed of access
  • get full address with one data fetch (one table)
  • lack of repetition
  • (not many customers at same zip code in same
    city)
  • foreign customers (no zip)

49
Closure of a Set of Functional Dependencies
  • Given a set F of functional dependencies, there
    are certain other functional dependencies that
    are logically implied by F.
  • For example If A ? B and B ? C, then we can
    infer that A ? C
  • The set of all functional dependencies logically
    implied by F is the closure of F.
  • We denote the closure of F by F.
  • F is a superset of F.

50
Closure of a Set of Functional Dependencies
  • We can find F, the closure of F, by repeatedly
    applying Armstrongs Axioms
  • if ? ? ?, then ? ? ?
    (reflexivity)
  • if ? ? ?, then ? ? ? ? ?
    (augmentation)
  • if ? ? ?, and ? ? ?, then ? ? ? (transitivity)
  • These rules are
  • sound (generate only functional dependencies that
    actually hold), and
  • complete (generate all functional dependencies
    that hold).

51
Example
  • R (A, B, C, G, H, I)F A ? B A ? C CG
    ? H CG ? I B ? H
  • some members of F
  • A ? H
  • by transitivity from A ? B and B ? H
  • AG ? I
  • by augmenting A ? C with G, to get AG ? CG
    and then transitivity with CG ? I
  • CG ? HI
  • by augmenting CG ? I to infer CG ? CGI,
  • and augmenting of CG ? H to infer CGI ? HI,
  • and then transitivity

52
Procedure for Computing F
  • To compute the closure of a set of functional
    dependencies F
  • F Frepeat for each functional
    dependency f in F apply reflexivity and
    augmentation rules on f add the resulting
    functional dependencies to F for each pair of
    functional dependencies f1and f2 in F
    if f1 and f2 can be combined using
    transitivity then add the resulting functional
    dependency to F until F does not change any
    further

53
Closure of Functional Dependencies (Cont.)
  • Additional rules
  • If ? ? ? holds and ? ? ? holds, then ? ? ? ?
    holds (union)
  • If ? ? ? ? holds, then ? ? ? holds and ? ? ?
    holds (decomposition)
  • If ? ? ? holds and ? ? ? ? holds, then ? ? ? ?
    holds (pseudotransitivity)
  • The above rules can be inferred from Armstrongs
    axioms.

54
Canonical Cover
  • Sets of functional dependencies may have
    redundant dependencies that can be inferred from
    the others
  • For example A ? C is redundant in A ? B,
    B ? C, A? C
  • Parts of a functional dependency may be redundant
  • E.g. on RHS A ? B, B ? C, A ? CD can
    be simplified to A ?
    B, B ? C, A ? D
  • E.g. on LHS A ? B, B ? C, AC ? D can
    be simplified to A ?
    B, B ? C, A ? D
  • Intuitively, a canonical cover of F is a
    minimal set of functional dependencies
    equivalent to F, having no redundant dependencies
    or redundant parts of dependencies
  • Every set of FDs has an equivalent minimal set
  • There can be several equivalent minimal sets

55
Equivalent FDs
  • Two sets of FDs F and G are equivalent if
  • Every FD in F can be inferred from G, and
  • Every FD in G can be inferred from F
  • Hence, F and G are equivalent if F G
  • Definition (Covers)
  • F covers G if every FD in G can be inferred from
    F
  • (i.e., if G is subset-of F)
  • F and G are equivalent if F covers G and G covers
    F

56
Finding a Minimal Cover
  • Finding a minimal cover G for F
  • 1. Set G F.
  • 2. Replace each FD X?A1, A2, ..., Ak in G by
    the n functional dependencies X?A1, X?A2 , ,
    X?Ak.
  • 3. For each FD X?A in G
  • For each attribute B that is an element of X
  • if ((G -X?A) ? (X-B)?A) is equivalent to G,
  • then replace X?A with (X-B)?A in G.
  • 4. For each remaining FD X?A in G,
  • if (G-X?A) is equivalent to G, then remove X?A
    from G.

57
Finding a Minimal CoverExample 1
  • Example
  • A?B, ABCD?E, EF?GH, ACDF?EG
  • Make RHS a single attribute
  • A?B, ABCD?E, EF?G, EF?H, ACDF?E, ACDF?G
  • Minimize LHS ACD?E instead of ABCD?E
  • Eliminate redundant FDs
  • Can ACDF?G be removed?
  • Can ACDF?E be removed?
  • Final answer A?B, ACD?E, EF?G, EF?H

58
Canonical CoverExample 2
  • Compute the minimal cover of the following set of
    functional dependencies
  • ABC ? DE, BD ? DE, E ? CF, EG ? F
  • ABC ? D
  • ABC ? E
  • BD ? D // reflexive
  • BD ? E
  • E ? C
  • E ? F
  • EG ? F // augmentation
  • The minimal cover is
  • ABC ? D, BD ? E, E ? C, E ? F

59
Homework/Project
  • Homework due Next Class
  • 7.1, 7.3, 7.5, 7.6, 7.7, 7.8, 7.9, 7.10, 7.11,
    7.12
  • No Homework due in One Week (Test)
  • Project
  • Verify (or convert to) 1NF, 2NF, 3NF

60
Test in One Week
  • Chapters 6-8
  • Give/Interpret Domain Relational Calculus
    Expressions
  • Design an EER Diagram Convert to Relational
    Schema
  • (tuples, primary keys, foreign keys)
  • Convert a relation to 1NF using each of 3 methods
  • Test for and/or Convert to 2NF, 3NF, BCNF, 4NF
  • Find a 3NF lossless join decomposition that
    preserves dependencies
  • List all non-trivial Functional Dependencies for
    a relation
  • Give Closure Set for Functional Dependencies
  • List Candidate Keys Give Canonical Cover
  • Open Book
  • 4 Sheets of Notes

61
In-Class Exercise(You may work together. You are
free to go once you show me your work.)
  • List the Functional Dependences in
  • (EmpID, CourseTitle, Name, Phone, Salary,
    DateCompleted)
  • 8.6 Given the relation R and Functional
    Dependencies F, compute F
  • R (A, B, C, D, E)
  • F A ? BC, B ? D, CD ? E, E ? A
  • Find Candidate Keys (more than 1)
  • Find Canonical Cover
Write a Comment
User Comments (0)
About PowerShow.com