Functional Dependencies FDs - PowerPoint PPT Presentation

About This Presentation

Title:

Functional Dependencies FDs

Description:

We now know the party. ... We claim that 'office, city' is clearly a key ... 'city' a party of a party of the key 'office, key', hence pollingplace- city is ... – PowerPoint PPT presentation

Number of Views:61

Avg rating:3.0/5.0

Slides: 38

Provided by: csNorth

Learn more at: https://users.cs.northwestern.edu

Category:

more less

Transcript and Presenter's Notes

Title: Functional Dependencies FDs

1
Functional Dependencies (FDs)
2

A function dependency on a relation means that
some attribute is a function of a group of other
attributes A4 f(A1, A2, A3)
Notice that functions are single-valued, so if
two tuples agree on A1, A2, A3, then they will
also agree on A4. Its this forced agreement that
is the functional dependency.

The function may also be multiple attributes
(A3, A5, A6) f(A1, A2, A3)
We write A1, A2, A3- A4, A5, A6

4
Keys of Relations

A1, A2, A3 - all other attributes
No subset of (A1, A2, A3) - all other attributes
A key must be minimal
We cant throw out an attribute from this key and
have it remains a key
This is the smallest key
A key is a minimal-sized set of attributes that
functionally determine all the other attributes
of a relation
In ER, keys need not be minimal. In the
relational data model, keys must be minimal

5
Superkey

A superkey is a set of attributes that contains a
key
ES relations key is the key attributes for the
ES
E-R-F many-many Rs key contains the key
attributes from both E and F
E-R-F many-one Rs key contains the key
attributes only from E
EF one-one Rs key contains the key from
either E or F (not unique)

6
Rules for FDs and Reasoning about them

A1, A2, A3 - B1, B2, B3 is shorthand for
Splitting Rule going from the multi-valued form
to the list of single-valued form
A1, A2, A3-B1
A1, A2, A3-B2
Combining Rule going from single-valued form to
multi-valued

7
Trivial dependencies

A FD A1, A2 An - B is said to be trivial if B
is one of the As. for example,
Title, year - title
In general A1, A2, A3 - B1, B2, B3
Trivial if the Bs are a subset of As
Nontrivial if at least one B is not an A
Completely nontrivial no B is an A
Any trivial dependency can be assumed

8
Computing the Closure of Attributes

Given a set of attributes AA1, A2, A3 and a
set of FDs S
We can think of A as a subset of the attributes
of a relation R, and the FDs S as being FDs of
that relation R
The closoure of A under S is the set of
attributes B, such that every relation that
satisfies the FDs in S also satisfies A1 A2 A3 -
B. (we want to compute the B!)

A set TA. Now apply all your FDs in S that match
the attributes you have in T.
Their right-hand-side will take you to new
attributes. Union those attributes you have back
into T. Repeat.
Keep going until you can introduce no new
attributes.
Then T is your enclosure of A under S

Given closure, we can now determine if a new FD
follows from existing FDs
FOLLOWS FROM TEST
Lets test A1, A2, A3 - B
Find A1, A2, A3
If it contains B then A1, A2, A3 - B follows
from the FDs in S

11
Closure and Keys

A1, A2, A3 contains all the attributes of R
iff A1 A2 A3 is a superset for R
Given a superkey, we can test its subsets to see
if they are also superkeys. If no subesets exits,
then the superkey must be a key
Proof of why closure works omitted

12
Transitivity

If A1, A2, A3-B1, B2 and B1, B2-C1, C2 then A1,
A2, A3-C1, C2
GIVEN versus DERIVED FDs
BASIS a set FDs from which all the FDs can be
derived.

13
PROJECTION

Suppose we have a relation R with some FDs F, and
we project R by eliminating certain attributes
from the schema and get S. What FDs hold in S?
Computing all FSs that
Follow from F, and
Involve only attributes of S
The calculation is exponential for a large number
of such FDs (many of them may be redundant since
they follow from other such FDs)

14
All Inference Rules

Reflexivity if B1, B2, B3 is a subset of A1,
A2, A3, then A1, A2, A3-B1, B2, B3
Augmentation if A1, A2, A3- B1, B2 B3 then A1,
A2, A3, C1, C2- B1, B2 B3 C1, C2 for all C1, C2
Transitivity A1, A2- B1, B2 and B1, B2- C1, C2
then A1, A2- C1, C2

15
Relational Data Model(schema design and normal
forms)

Recall that a schema is a template for a relation
and a relation is just a table, as we saw from
SQL days at the beginning of class
Except here we use the original relational model
in which a relation is a set and not a bag

16
Projection Rehash

Given a relation R and its FDs F (A-B A
determine B
We can use the FDs to find the keys and superkeys
We can talk about a minimal basis of FDs from
which we can derive all the others using our
inference rules
We can use closure to figure out all that the FDs
give us to know, given some initial set of knowns
But what does this means when it comes to the
data in the relation?

Lets say you have two tuples (a, b, c) and you
project out the third attributes.
Now we have (a, b), (a, b). But you no longer
have a set because you have duplicates.
So, projection means that you turn the input back
into a set.
Multiple tuples in the original relation can get
flattened into a single tuple in the projected
relation

18
What about FDs?

We would expect that fewer FDs (or the same
number) would hold in the projected relation than
in the original one. What are they?
They are the ones that follow from the original
Fds, but that involve only attributes of the new
relation.

19
Design

After we hav finished doing ER-relation, or we
have done relation to start with, we have a set
of relations, with each relation having a set of
associated FDs.
Now well talk about what it means to normalize
each of those relations.

20
Normalization

Take each relation and its FDs and convert it
into a set of relations, each with its own FDs
such that the relations make it easy to avoid
anomalies.
Avoiding anomalies boils down to avoiding
redundancy and duplication while still preserving
all the meaning in the originals.

21
Theres a range of normal forms for a relation
schema

5NF4NFBCNF3NF2NF1NF
The book will talk about BCNF first, then 3NF and
then 4NF
The handout talks about the other normal forms

Exmaple schema
Candidate(lastname, firstname, address, party,
partychair, partyaddress)
Last, first, address-party, chair
party-partychair
Party-partyaddress
We will pretend that (lastname, firstname,
address) is the key

23
Anomoly

Redundancy
Information repeated in more than one tuple
Updated anomalies
What is information is repeated in more than one
tuple, but when you update that tuple you forget
to update the others? (broken an FD)
Deletion anomalies
What if information is repeated in more than one
tuple and you delete all of tuples? (where is
the information now? We may loose other
information as a side effect)

24
Decomposing Relations

Take some relation and turn it into two relations
R(A1, A2An) can be decomposed into
S(B1, B2Bm) and T(C1, C2Ck)
If we union the attributes of the two relations,
we get the set of attributes f the original
relation
B1, B2Bm UC1, C2CkA1, A2A n
The tuples in S are projections onto B1, B2 This
means there could be fewer tuples in S than in R.
Similar for T

25
BCNF

A relation R is in BCNF iff whenever there is a
nontrivial FD A1, A2An-B for R, it is the case
that A1, A2An is a superkey for R
Recall that a superkey need not be minimal, so an
equivalent statement of BCNF is that the left
side of every nontrivial FD must contain a key

26
How do we convert a relation into BCNF?

We dont. We decompose it into relations that
seperately are in BCNF, then we project the
tuples into those relations
For example, the party-partychair is a violating
FD
How to decompose?

Augment the FD with other FDs RHS that are
determinded by the LHS or subset of it
We have party-partychair and party-partyaddress,
so well create party-partychair, partyaddress
Make two new schemas by PROJECTION
contain all the attributes of the FD
Parties(party, partychair, partyaddress)
Contain all the attributes in the originl
relation except those in the RHS of the FD
Candidates(last, first, address, party)
Use the projection scheme to find the FDs of each.

28
Getting Stuff Back (Joins)

Pick some tuple in Candidates. We now know the
party. Use party to go find the matching tuples
in Parties(just one) and append the Candidate
attribute to it. Now we have the original tuple
back.
Pick some tuple in Parties. Now we have a party.
Find all matching candidates in candidates,
append the party tuple to it. Now we have the
original set of tuples back.

This is sometimes called an equijoin, because
you are joining tuples in Parties to tuples in
Candidates which match(are equal) in the party.
If we decompose a relation according to the above
algorithm, we are guaranteed that we can always
recover the original relation through join.

30
3NF

Compare with BCNF, it is relaxed a bit
A relation R is in 3NF if whenever A1, A2An-B
is a nontrivial FD, either A1, A2An is a
superkey, or B is a member of some key

Example we want to convert this schema into
BCNF
Offices(office, pollingplace, city), with
pollingplace-city
office, city-pollingplace
We claim that office, city is clearly a key and
that pollingplace, office is also a key
So it looks like pollingplace-city is a
violation, since pollingplace is not a superkey,
we split into
(pollingplace, city) and (office, pollingplace)
We now break the original FD office,
city-pollingplace
Now lets look at the offending FD in the context
of 3NF.

We would say that its ok because city a party
of a party of the key office, key, hence
pollingplace-city is acceptable even though
pollingplace is not a superkey.
The key point here is that 3NF relax BCNFs
requirement to allow relation schema like above.
It can be proved that 3NF is in fact adequate for
its purpose, that is we can always decompose a
relation schema in a way that does not lose
information, into schemas that are in 3NF and
allow all FDs to be checked.

33
Other normal forms

1NF every component of every tuple must be
atomic. No structures, or complex data structure,
even arrays. (SQL forces this)
2NF 1NFFDs permitted, no LHS of an FD can be a
proper subset of a key. The key determines all
the nonkey attributes
3NF 2NF
BVNF 3NF
4NF BCNF multivalued dependencies
5NF 4NF 3NF all keys are single attributes
(join-projection normal form(JPNF))
DKNF may be impossible

34
Multivalued Dependencies

A multivalud dependency (MVD) is an assertion
that two attributes or sets of attributes are
independent of one other.
We say that MVD A1A2An-- B1B1Bm holds for a
relation R if for all tuples, which we can view
as (A1A2An C1C2..Ck B1B2Bm), when we pick
(A1A2An) value, the (B1B2..Bm) value is
independent of (C1C2Ck) value

35
Reasoning about multivalued dependencies

Trivial dependencies rule
If MVD A1A2An--B1B2Bm holds for a relation,
then does A1A2An--C1C2Ck, where the Cs are
the Bs plus one or more of the As, or the Bs
minus one or more of the As with Ds are those
Bs that are not among As

Transitive rule
If MVD A1A2An--B1B2Bm and B1B2Bm-C1C2Ck
hold for a relation, then so does A1A2An--
C1C2Ck, however any Cs that are also Bs must
be deleted from the right side.

Complementation rule
If MVD A1A2An--B1B2Bm holds for a relation,
then R also satisfies A1A2An--C1C2Ck, where
the Cs are all attributes of R not among the As
and Bs.

Write a Comment

User Comments (0)