Functional Dependencies FDs - PowerPoint PPT Presentation

About This Presentation
Title:

Functional Dependencies FDs

Description:

We now know the party. ... We claim that 'office, city' is clearly a key ... 'city' a party of a party of the key 'office, key', hence pollingplace- city is ... – PowerPoint PPT presentation

Number of Views:61
Avg rating:3.0/5.0
Slides: 38
Provided by: csNorth
Category:

less

Transcript and Presenter's Notes

Title: Functional Dependencies FDs


1
Functional Dependencies (FDs)
2
  • A function dependency on a relation means that
    some attribute is a function of a group of other
    attributes A4 f(A1, A2, A3)
  • Notice that functions are single-valued, so if
    two tuples agree on A1, A2, A3, then they will
    also agree on A4. Its this forced agreement that
    is the functional dependency.

3
  • The function may also be multiple attributes
  • (A3, A5, A6) f(A1, A2, A3)
  • We write A1, A2, A3- A4, A5, A6

4
Keys of Relations
  • A1, A2, A3 - all other attributes
  • No subset of (A1, A2, A3) - all other attributes
  • A key must be minimal
  • We cant throw out an attribute from this key and
    have it remains a key
  • This is the smallest key
  • A key is a minimal-sized set of attributes that
    functionally determine all the other attributes
    of a relation
  • In ER, keys need not be minimal. In the
    relational data model, keys must be minimal

5
Superkey
  • A superkey is a set of attributes that contains a
    key
  • ES relations key is the key attributes for the
    ES
  • E-R-F many-many Rs key contains the key
    attributes from both E and F
  • E-R-F many-one Rs key contains the key
    attributes only from E
  • EF one-one Rs key contains the key from
    either E or F (not unique)

6
Rules for FDs and Reasoning about them
  • A1, A2, A3 - B1, B2, B3 is shorthand for
  • Splitting Rule going from the multi-valued form
    to the list of single-valued form
  • A1, A2, A3-B1
  • A1, A2, A3-B2
  • Combining Rule going from single-valued form to
    multi-valued

7
Trivial dependencies
  • A FD A1, A2 An - B is said to be trivial if B
    is one of the As. for example,
  • Title, year - title
  • In general A1, A2, A3 - B1, B2, B3
  • Trivial if the Bs are a subset of As
  • Nontrivial if at least one B is not an A
  • Completely nontrivial no B is an A
  • Any trivial dependency can be assumed

8
Computing the Closure of Attributes
  • Given a set of attributes AA1, A2, A3 and a
    set of FDs S
  • We can think of A as a subset of the attributes
    of a relation R, and the FDs S as being FDs of
    that relation R
  • The closoure of A under S is the set of
    attributes B, such that every relation that
    satisfies the FDs in S also satisfies A1 A2 A3 -
    B. (we want to compute the B!)

9
  • A set TA. Now apply all your FDs in S that match
    the attributes you have in T.
  • Their right-hand-side will take you to new
    attributes. Union those attributes you have back
    into T. Repeat.
  • Keep going until you can introduce no new
    attributes.
  • Then T is your enclosure of A under S

10
  • Given closure, we can now determine if a new FD
    follows from existing FDs
  • FOLLOWS FROM TEST
  • Lets test A1, A2, A3 - B
  • Find A1, A2, A3
  • If it contains B then A1, A2, A3 - B follows
    from the FDs in S

11
Closure and Keys
  • A1, A2, A3 contains all the attributes of R
    iff A1 A2 A3 is a superset for R
  • Given a superkey, we can test its subsets to see
    if they are also superkeys. If no subesets exits,
    then the superkey must be a key
  • Proof of why closure works omitted

12
Transitivity
  • If A1, A2, A3-B1, B2 and B1, B2-C1, C2 then A1,
    A2, A3-C1, C2
  • GIVEN versus DERIVED FDs
  • BASIS a set FDs from which all the FDs can be
    derived.

13
PROJECTION
  • Suppose we have a relation R with some FDs F, and
    we project R by eliminating certain attributes
    from the schema and get S. What FDs hold in S?
  • Computing all FSs that
  • Follow from F, and
  • Involve only attributes of S
  • The calculation is exponential for a large number
    of such FDs (many of them may be redundant since
    they follow from other such FDs)

14
All Inference Rules
  • Reflexivity if B1, B2, B3 is a subset of A1,
    A2, A3, then A1, A2, A3-B1, B2, B3
  • Augmentation if A1, A2, A3- B1, B2 B3 then A1,
    A2, A3, C1, C2- B1, B2 B3 C1, C2 for all C1, C2
  • Transitivity A1, A2- B1, B2 and B1, B2- C1, C2
    then A1, A2- C1, C2

15
Relational Data Model(schema design and normal
forms)
  • Recall that a schema is a template for a relation
    and a relation is just a table, as we saw from
    SQL days at the beginning of class
  • Except here we use the original relational model
    in which a relation is a set and not a bag

16
Projection Rehash
  • Given a relation R and its FDs F (A-B A
    determine B
  • We can use the FDs to find the keys and superkeys
  • We can talk about a minimal basis of FDs from
    which we can derive all the others using our
    inference rules
  • We can use closure to figure out all that the FDs
    give us to know, given some initial set of knowns
  • But what does this means when it comes to the
    data in the relation?

17
  • Lets say you have two tuples (a, b, c) and you
    project out the third attributes.
  • Now we have (a, b), (a, b). But you no longer
    have a set because you have duplicates.
  • So, projection means that you turn the input back
    into a set.
  • Multiple tuples in the original relation can get
    flattened into a single tuple in the projected
    relation

18
What about FDs?
  • We would expect that fewer FDs (or the same
    number) would hold in the projected relation than
    in the original one. What are they?
  • They are the ones that follow from the original
    Fds, but that involve only attributes of the new
    relation.

19
Design
  • After we hav finished doing ER-relation, or we
    have done relation to start with, we have a set
    of relations, with each relation having a set of
    associated FDs.
  • Now well talk about what it means to normalize
    each of those relations.

20
Normalization
  • Take each relation and its FDs and convert it
    into a set of relations, each with its own FDs
    such that the relations make it easy to avoid
    anomalies.
  • Avoiding anomalies boils down to avoiding
    redundancy and duplication while still preserving
    all the meaning in the originals.

21
Theres a range of normal forms for a relation
schema
  • 5NF4NFBCNF3NF2NF1NF
  • The book will talk about BCNF first, then 3NF and
    then 4NF
  • The handout talks about the other normal forms

22
  • Exmaple schema
  • Candidate(lastname, firstname, address, party,
    partychair, partyaddress)
  • Last, first, address-party, chair
  • party-partychair
  • Party-partyaddress
  • We will pretend that (lastname, firstname,
    address) is the key

23
Anomoly
  • Redundancy
  • Information repeated in more than one tuple
  • Updated anomalies
  • What is information is repeated in more than one
    tuple, but when you update that tuple you forget
    to update the others? (broken an FD)
  • Deletion anomalies
  • What if information is repeated in more than one
    tuple and you delete all of tuples? (where is
    the information now? We may loose other
    information as a side effect)

24
Decomposing Relations
  • Take some relation and turn it into two relations
  • R(A1, A2An) can be decomposed into
  • S(B1, B2Bm) and T(C1, C2Ck)
  • If we union the attributes of the two relations,
    we get the set of attributes f the original
    relation
  • B1, B2Bm UC1, C2CkA1, A2A n
  • The tuples in S are projections onto B1, B2 This
    means there could be fewer tuples in S than in R.
    Similar for T

25
BCNF
  • A relation R is in BCNF iff whenever there is a
    nontrivial FD A1, A2An-B for R, it is the case
    that A1, A2An is a superkey for R
  • Recall that a superkey need not be minimal, so an
    equivalent statement of BCNF is that the left
    side of every nontrivial FD must contain a key

26
How do we convert a relation into BCNF?
  • We dont. We decompose it into relations that
    seperately are in BCNF, then we project the
    tuples into those relations
  • For example, the party-partychair is a violating
    FD
  • How to decompose?

27
  • Augment the FD with other FDs RHS that are
    determinded by the LHS or subset of it
  • We have party-partychair and party-partyaddress,
    so well create party-partychair, partyaddress
  • Make two new schemas by PROJECTION
  • contain all the attributes of the FD
  • Parties(party, partychair, partyaddress)
  • Contain all the attributes in the originl
    relation except those in the RHS of the FD
  • Candidates(last, first, address, party)
  • Use the projection scheme to find the FDs of each.

28
Getting Stuff Back (Joins)
  • Pick some tuple in Candidates. We now know the
    party. Use party to go find the matching tuples
    in Parties(just one) and append the Candidate
    attribute to it. Now we have the original tuple
    back.
  • Pick some tuple in Parties. Now we have a party.
    Find all matching candidates in candidates,
    append the party tuple to it. Now we have the
    original set of tuples back.

29
  • This is sometimes called an equijoin, because
    you are joining tuples in Parties to tuples in
    Candidates which match(are equal) in the party.
  • If we decompose a relation according to the above
    algorithm, we are guaranteed that we can always
    recover the original relation through join.

30
3NF
  • Compare with BCNF, it is relaxed a bit
  • A relation R is in 3NF if whenever A1, A2An-B
    is a nontrivial FD, either A1, A2An is a
    superkey, or B is a member of some key

31
  • Example we want to convert this schema into
    BCNF
  • Offices(office, pollingplace, city), with
  • pollingplace-city
  • office, city-pollingplace
  • We claim that office, city is clearly a key and
    that pollingplace, office is also a key
  • So it looks like pollingplace-city is a
    violation, since pollingplace is not a superkey,
    we split into
  • (pollingplace, city) and (office, pollingplace)
  • We now break the original FD office,
    city-pollingplace
  • Now lets look at the offending FD in the context
    of 3NF.

32
  • We would say that its ok because city a party
    of a party of the key office, key, hence
    pollingplace-city is acceptable even though
    pollingplace is not a superkey.
  • The key point here is that 3NF relax BCNFs
    requirement to allow relation schema like above.
  • It can be proved that 3NF is in fact adequate for
    its purpose, that is we can always decompose a
    relation schema in a way that does not lose
    information, into schemas that are in 3NF and
    allow all FDs to be checked.

33
Other normal forms
  • 1NF every component of every tuple must be
    atomic. No structures, or complex data structure,
    even arrays. (SQL forces this)
  • 2NF 1NFFDs permitted, no LHS of an FD can be a
    proper subset of a key. The key determines all
    the nonkey attributes
  • 3NF 2NF
  • BVNF 3NF
  • 4NF BCNF multivalued dependencies
  • 5NF 4NF 3NF all keys are single attributes
  • (join-projection normal form(JPNF))
  • DKNF may be impossible

34
Multivalued Dependencies
  • A multivalud dependency (MVD) is an assertion
    that two attributes or sets of attributes are
    independent of one other.
  • We say that MVD A1A2An-- B1B1Bm holds for a
    relation R if for all tuples, which we can view
    as (A1A2An C1C2..Ck B1B2Bm), when we pick
    (A1A2An) value, the (B1B2..Bm) value is
    independent of (C1C2Ck) value

35
Reasoning about multivalued dependencies
  • Trivial dependencies rule
  • If MVD A1A2An--B1B2Bm holds for a relation,
    then does A1A2An--C1C2Ck, where the Cs are
    the Bs plus one or more of the As, or the Bs
    minus one or more of the As with Ds are those
    Bs that are not among As

36
  • Transitive rule
  • If MVD A1A2An--B1B2Bm and B1B2Bm-C1C2Ck
    hold for a relation, then so does A1A2An--
    C1C2Ck, however any Cs that are also Bs must
    be deleted from the right side.

37
  • Complementation rule
  • If MVD A1A2An--B1B2Bm holds for a relation,
    then R also satisfies A1A2An--C1C2Ck, where
    the Cs are all attributes of R not among the As
    and Bs.
Write a Comment
User Comments (0)
About PowerShow.com