The principal problem that we encounter is redundancy, where a fact is repeated in more than one tup - PowerPoint PPT Presentation

About This Presentation

Title:

The principal problem that we encounter is redundancy, where a fact is repeated in more than one tup

Description:

Carrie Fisher. Fox. color. 124. 1977. Star Wars. starName. studioNam ... Carrie Fisher. starName. year. No true redundancy! The update anomaly disappeared. ... – PowerPoint PPT presentation

Number of Views:66

Avg rating:3.0/5.0

Slides: 16

Provided by: webhome9

Category:

more less

Transcript and Presenter's Notes

Title: The principal problem that we encounter is redundancy, where a fact is repeated in more than one tup

1
Design of Relational Database Schemas

The principal problem that we encounter is
redundancy, where a fact is repeated in more than
one tuple.
Most common cause attempts to group into one
relation both single valued and multi-valued
properties of an object.
Now we will tackle the problem of designing good
relational schemas.

2
Anomalies

Redundancy.
Information may be repeated unnecessarily in
several tuples.
E.g. length and filmType.
Update anomalies.
We may change information in one tuple but leave
it unchanged in other tuples.
E.g. we could change the length of Star Wars to
125, in the first tuple, and forget to do the
same in the second and third tuple.
Deletion anomalies.
If a set of values becomes empty, we may lose
other information as a side effect.
E.g. if we delete Emilio Estevez we will lose all
the information about Mighty Ducks.

3
Decomposing Relations - Example
Movie1 relation

No true redundancy!
The update anomaly disappeared. If we change the
length of a movie, it is done only once.
The deletion anomaly disappeared. If we delete
all the stars from Movie2 we still will have the
other info for a movie.

Movie2 relation
4
Boyce-Codd Normal Form

The goal of decomposition is to replace a
relation by several that do not exhibit
anomalies.
There is a simple condition under which the
anomalies can be guaranteed not to exist.
This condition is called Boyce-Codd Normal Form,
or BCNF.
A relation is in BCNF if
Whenever there is a nontrivial dependency
A1A2An?B1B2Bm
for R, it must be the case that
A1 , A2 , , An is a superkey for R.

5
Boyce-Codd Normal Form - Example
Violating BCNF

Relation Movie in the previous figure is not in
BCNF.
Consider the FD title year?length filmType
studioName
Unfortunately, the left side of the above
dependency is not a superkey.
In particular we know that the title and the year
does not functionally determine starName.
On the other hand, Movie1 is in BCNF.
The only key is title, year and
title year ? length filmType studioName holds in
the relation

6
Decomposition into BCNF

The decomposition strategy is
Find a non-trivial FD A1A2An?B1B2Bm that
violates BCNF, i.e. A1A2An is not a superkey.
Decompose the relation schema into two
overlapping relation schemas
One is all the attributes involved in the
violating dependency and
the other is the left side and all the other
attributes not involved in the dependency.
By repeatedly, choosing suitable decompositions,
we can break any relation schema into a
collection of smaller schemas in BCNF.
The data in the original relation is represented
faithfully by the data in the relations that are
the result of the decomposition.
i.e. we can reconstruct the original relation
exactly from the decomposed relations.

7
Boyce-Codd Normal Form - Example
Consider relation schema Movies(title, year,
studioName, president, presAddr) and functional
dependencies title year ? studioName studioName
? president president ? presAddr Last two
violate BCNF. Why? Compute title, year,
studioName, president and see if you get
all the attributes of the relation. If not, you
got a BCNF violation, and need to break relation.
8
Boyce-Codd Normal Form Example

Lets decompose
starting with
studioName ? president
Lets add to the right-hand side any other
attributes in the closure of studioName (optional
rule of thumb).
XstudioName studioName?president
XstudioName, president president?presAddr
XstudioNamestudioName, president, presAddr

9
Boyce-Codd Normal Form Example
From the closure we get studioName?president
presAddr We decompose the relation schema into
the following two schemas Movies1(studioName,
president, presAddr) Movies2(title, year,
studioName) The second schema is in BCNF. What
about the first schema? The following dependency
violates BCNF. president?presAddr Why its bad
to leave Movies1 table as is? If many studios
share the same president than we would have
redundancy when repeating the presAddr in all
those studios.
10
Boyce-Codd Normal Form Example
We must decompose Movies1, using the
FD president?presAddr The resulting relation
schemas, both in BCNF, are Movies11(title, year,
studioName) Movies12(studioName, president) In
general, we must keep applying the decomposition
rule as many times as needed, until all our
relations are in BCNF. So, finally we got
Movies11, Movies12, and Movies2.
11
Finding FDs for the decomposed relations

When we decompose a relation, we need to check
that the resulting schemas are in BCNF.
We cant tell a relation is in BCNF, unless we
can determine the FDs that hold for that
relation.

12
Finding FDs for the decomposed relations

Suppose S is one of the resulting relations in a
decomposition of R.
For this
Consider each subset X of attributes of S.
Compute X using the FD on R.
At the end throw out the attributes of R, which
arent in S.
Then, for each attribute B such that
B is an attribute of S,
B is in X
we have that the functional dependency X?B holds
in S.

13
Example Consider R(A, B, C, D, E) decomposed
into S(A, B, C) and another relation. Let FDs of
R be A?D, B?E, DE?C First, consider
AA,D. Since D is not in the schema of S,
we get no dependency here. Similarly, BB,E
and CC, yielding no FDs for S. Now consider
pairs. A,BA, B, C, D, E. Thus, we deduce
AB?C for S. Neither of the other pairs give us
any FD for S. Of course the set of all three
attributes of S, A, B, C, cannot yield any
nontrivial dependencies for S. Thus, the only
dependency we need assert for S is AB?C.
14
A Few Tricks

Never need to compute the closure of the empty
set or of the set of all attributes.
If we find X all attributes, dont bother
computing the closure of any supersets of X.

15
Another Example