The principal problem that we encounter is redundancy, where a fact is repeated in more than one tup - PowerPoint PPT Presentation

About This Presentation
Title:

The principal problem that we encounter is redundancy, where a fact is repeated in more than one tup

Description:

Carrie Fisher. Fox. color. 124. 1977. Star Wars. starName. studioNam ... Carrie Fisher. starName. year. No true redundancy! The update anomaly disappeared. ... – PowerPoint PPT presentation

Number of Views:66
Avg rating:3.0/5.0
Slides: 16
Provided by: webhome9
Category:

less

Transcript and Presenter's Notes

Title: The principal problem that we encounter is redundancy, where a fact is repeated in more than one tup


1
Design of Relational Database Schemas
  • The principal problem that we encounter is
    redundancy, where a fact is repeated in more than
    one tuple.
  • Most common cause attempts to group into one
    relation both single valued and multi-valued
    properties of an object.
  • Now we will tackle the problem of designing good
    relational schemas.

2
Anomalies
  • Redundancy.
  • Information may be repeated unnecessarily in
    several tuples.
  • E.g. length and filmType.
  • Update anomalies.
  • We may change information in one tuple but leave
    it unchanged in other tuples.
  • E.g. we could change the length of Star Wars to
    125, in the first tuple, and forget to do the
    same in the second and third tuple.
  • Deletion anomalies.
  • If a set of values becomes empty, we may lose
    other information as a side effect.
  • E.g. if we delete Emilio Estevez we will lose all
    the information about Mighty Ducks.

3
Decomposing Relations - Example
Movie1 relation
  • No true redundancy!
  • The update anomaly disappeared. If we change the
    length of a movie, it is done only once.
  • The deletion anomaly disappeared. If we delete
    all the stars from Movie2 we still will have the
    other info for a movie.

Movie2 relation
4
Boyce-Codd Normal Form
  • The goal of decomposition is to replace a
    relation by several that do not exhibit
    anomalies.
  • There is a simple condition under which the
    anomalies can be guaranteed not to exist.
  • This condition is called Boyce-Codd Normal Form,
    or BCNF.
  • A relation is in BCNF if
  • Whenever there is a nontrivial dependency
  • A1A2An?B1B2Bm
  • for R, it must be the case that
  • A1 , A2 , , An is a superkey for R.

5
Boyce-Codd Normal Form - Example
Violating BCNF
  • Relation Movie in the previous figure is not in
    BCNF.
  • Consider the FD title year?length filmType
    studioName
  • Unfortunately, the left side of the above
    dependency is not a superkey.
  • In particular we know that the title and the year
    does not functionally determine starName.
  • On the other hand, Movie1 is in BCNF.
  • The only key is title, year and
  • title year ? length filmType studioName holds in
    the relation

6
Decomposition into BCNF
  • The decomposition strategy is
  • Find a non-trivial FD A1A2An?B1B2Bm that
    violates BCNF, i.e. A1A2An is not a superkey.
  • Decompose the relation schema into two
    overlapping relation schemas
  • One is all the attributes involved in the
    violating dependency and
  • the other is the left side and all the other
    attributes not involved in the dependency.
  • By repeatedly, choosing suitable decompositions,
    we can break any relation schema into a
    collection of smaller schemas in BCNF.
  • The data in the original relation is represented
    faithfully by the data in the relations that are
    the result of the decomposition.
  • i.e. we can reconstruct the original relation
    exactly from the decomposed relations.

7
Boyce-Codd Normal Form - Example
Consider relation schema Movies(title, year,
studioName, president, presAddr) and functional
dependencies title year ? studioName studioName
? president president ? presAddr Last two
violate BCNF. Why? Compute title, year,
studioName, president and see if you get
all the attributes of the relation. If not, you
got a BCNF violation, and need to break relation.
8
Boyce-Codd Normal Form Example
  • Lets decompose
  • starting with
  • studioName ? president
  • Lets add to the right-hand side any other
    attributes in the closure of studioName (optional
    rule of thumb).
  • XstudioName studioName?president
  • XstudioName, president president?presAddr
  • XstudioNamestudioName, president, presAddr

9
Boyce-Codd Normal Form Example
From the closure we get studioName?president
presAddr We decompose the relation schema into
the following two schemas Movies1(studioName,
president, presAddr) Movies2(title, year,
studioName) The second schema is in BCNF. What
about the first schema? The following dependency
violates BCNF. president?presAddr Why its bad
to leave Movies1 table as is? If many studios
share the same president than we would have
redundancy when repeating the presAddr in all
those studios.
10
Boyce-Codd Normal Form Example
We must decompose Movies1, using the
FD president?presAddr The resulting relation
schemas, both in BCNF, are Movies11(title, year,
studioName) Movies12(studioName, president) In
general, we must keep applying the decomposition
rule as many times as needed, until all our
relations are in BCNF. So, finally we got
Movies11, Movies12, and Movies2.
11
Finding FDs for the decomposed relations
  • When we decompose a relation, we need to check
    that the resulting schemas are in BCNF.
  • We cant tell a relation is in BCNF, unless we
    can determine the FDs that hold for that
    relation.

12
Finding FDs for the decomposed relations
  • Suppose S is one of the resulting relations in a
    decomposition of R.
  • For this
  • Consider each subset X of attributes of S.
  • Compute X using the FD on R.
  • At the end throw out the attributes of R, which
    arent in S.
  • Then, for each attribute B such that
  • B is an attribute of S,
  • B is in X
  • we have that the functional dependency X?B holds
    in S.

13
Example Consider R(A, B, C, D, E) decomposed
into S(A, B, C) and another relation. Let FDs of
R be A?D, B?E, DE?C First, consider
AA,D. Since D is not in the schema of S,
we get no dependency here. Similarly, BB,E
and CC, yielding no FDs for S. Now consider
pairs. A,BA, B, C, D, E. Thus, we deduce
AB?C for S. Neither of the other pairs give us
any FD for S. Of course the set of all three
attributes of S, A, B, C, cannot yield any
nontrivial dependencies for S. Thus, the only
dependency we need assert for S is AB?C.
14
A Few Tricks
  • Never need to compute the closure of the empty
    set or of the set of all attributes.
  • If we find X all attributes, dont bother
    computing the closure of any supersets of X.

15
Another Example
  • R(A,B,C)
  • with FDs A ?B and B ?C.
  • Project onto S(A,C).
  • A ABC yields A ?B, A ?C.
  • We do not need to compute AB or AC .
  • B BC yields B ?C.
  • C C yields nothing.
  • BC BC yields nothing.
  • Resulting FDs A ?B, A ?C, and B ?C.
  • Projection onto AC A ?C.
  • This is the only FD that involves a subset of
    A,C .
Write a Comment
User Comments (0)
About PowerShow.com