CS457 Functional Dependencies - PowerPoint PPT Presentation

1 / 37
About This Presentation
Title:

CS457 Functional Dependencies

Description:

Design relations so there can be an equijoin with a PK and a FK or ... What is the most importance concept in relational schema design? Functional Dependencies ... – PowerPoint PPT presentation

Number of Views:102
Avg rating:3.0/5.0
Slides: 38
Provided by: susanv5
Category:

less

Transcript and Presenter's Notes

Title: CS457 Functional Dependencies


1
  • CS457 Functional Dependencies
  • and Normalization
  • Chapter 10 

2
 Relation Schema Goodness
  • Logical level - relations and views
  • Storage level - relations as files
  •       
  •  Placing one set of attributes in a table is
    better than placing them in other tables. Why?

3
Schema design
  • Design the schema so it is easy to explain the
    semantics
  • semantics the meaning associated with the
    attributes
  • want to minimize
  • storage space
  • redundant information

4
Informal Guidelines
  • Semantics should be clear
  • Reduce the redundant values
  • Reduce null values
  • Disallow spurious tuples

5
    Semantics
  • Do not combine attributes from gt 1
    entity/relationship type Fig 10.3  
  • Reduce the redundant values
  • Design schema so no anomalies occur
  • Update anomalies insert, delete, modify
  •  

6
Updates
  • Insertion
  • if insert new employee into EMP_DEPT and no
    department yet? Fig 10.3
  • If create a new department if no employee?
  • If add employee in department?

7
Updates
  • Deletion
  • if delete last employee of a department?
  • Modification
  • if change the values of a particular
    department?
  • Design schemas so no anomalies occur but what
    about performance?
  • Must always do join between employee and
    department
  • In general it is best if specify joins as views
    so anomaly free

8
Reduce null values
  • Avoid placing attributes in a base relation
    whose values may be null for a majority of tuples
  • If use null values can mean different things
  • "fat" tuples - if many attributes and lots of
    nulls wastes space
  • aggregate functions are a problem with nulls

9
Disallow spurious tuples
  • Spurious tuples represent wrong information that
    is not valid
  • Result of joins with equality conditions on
    attributes that are not PKs or FKs
  • Design relations so there can be an equijoin with
    a PK and a FK or no spurious tuples 
  • Lossless join guarantees no spurious tuples
  • Fig 10.5, 10.6 join on plocation

10
Functional Dependencies
  • What is the most importance concept in relational
    schema design?
  •     Functional Dependencies
  • Formal concepts and theory to define goodness of
    relational schemas
  • Functional dependency FD between 2 sets of
    attributes as
  • X -gt Y
  • Constraint on the possible tuples that can form a
    relation instance

11
 Functional Dependencies
  • X -gt Y means
  • X functionally determines Y
  • Y depends on X
  • Values of Y component depend on, determined
    by values of X component

12
Functional Dependencies
  • Given t1 and t2
  • if t1X t2 X then t1Y t2 Y (1)
  • In other words if the values of X are equal, then
    Y value are equal
  • Values of X component uniquely (functionally)
    determine values of Y component iff (1)

13
Example
  • for example city, address -gt zipcode
  • ssn -gt name
  • if X is a candidate key implies X -gt Y
  • if X -gt Y, does this imply Y -gt X?
  • dont know - FD is a property of semantics
  • dependency is a constraint
  • if satisfy FD, instances are legal relation
    instances (extension)

14
FDs - set F
  • describes a relation instance
  • constraints must hold at all times
  • property of relation schema not a particular
    extension
  • therefore, it cannot be automatically deduced, it
    must be defined explicitly by designer

15
Normalization
  • Normalization of data - method for analyzing
    schemas
  • Unsatisfactory schemas decomposed into smaller
    ones with desirable properties
  • Objectives of normalization
  • good relation schemas disallowing update
    anomalies

16
Formal framework
  • database normalized to any degree (1, 2, 3, 4, 5,
    etc.)
  • normalization is not done in isolation
  • need
  • lossless join
  • dependency preservation
  • additional normal forms meet other desirable
    criteria

17
Normal Forms
  • 1st, 2nd, 3rd, BCNF consider only FD and key
    constraints
  • constraints must not be hard to understand or
    detect
  • need not normalize to highest form (e.g. for
    performance reasons)

18
1NF - 1st normal form
  • part of the formal definition of a relation
  • disallow multivalued attributes, composite
    attributes and their combination
  • In 1NF single (atomic, indivisible) values

19
Example
  • There are 2 ways to look at
  • DEPARTMENT (dnumber, dlocations) Fig.
    10.8        
  • dlocations is a set of values
  • dnumber -gt dlocations, but dlocations is not in
    1NF
  • dlocations atomic values
  • dnumber does not functionally determine
    dlocations

20
How to resolve this?
  • Nested relation - multivalued composite
    attributes Fig. 10.9
  • research attempts to allow and formalize nested
    relations
  • Oracle 9i allows it  
  • Normalize it to 1NF

21
Normalize into 1NF?
  • How to normalize nested relations into 1NF?
  • Remove nested relation (or set-valued) attributes
    into new relation
  • propagate PK
  • combine PK and partial PK
  • recursively unnest - multilevel nesting
  • useful in converting hierarchical schemes into
    1NF

22
 Difficulties with 1NF
  • insert, delete, update
  •  Determine if describe entity identified by PK?
  • If not, called non-full FDs
  • we need full FDs for good inserts, deletes,
    updates

23
 Second Normal Form - 2NF
  • Uses the concepts of FDs, PKs and this
    definition
  • An FD is a Full functional dependency if
  • given Y -gt Z
  • Removal of any attribute from Y means the FD does
    not hold any more

24
2NF
  • Examples Fig. 10.10
  • ssn, pnumber -gt hours
  • is a full FD since neither
  • ssn -gt hours nor pnumber -gt hours holds
  • ssn, pnumber -gt ename is not a full FD 
  • (it is called a partial dependency)
  • since ssn -gt ename also holds

25
2NF
  • A relation schema R is in 2NF if
  • Relation is in 1NF
  • Every non-prime attribute A in R is fully
    functionally dependent on the primary key
  • Definition Prime attribute - attribute that is
    a member of the primary key K
  • R can be decomposed into 2NF relations via the
    process of 2NF normalization
  • Remove partial dependencies
  • create new relations where partials are full

26
2NF problems
  • even if no partial dependencies problems with
    insert, delete, modify
  • why?
  • Transitive dependencies
  • Given a set of attributes Z, where Z is not a
    subset of any key and
  • X is a key
  • Both X -gt Z and Z -gt Y
  • then we have a transitive dependency Fig. 10.10b
  •  

27
 3rd Normal Form (3NF)
  • No non-prime attribute is transitively dependent
    on a primary key and the table is in 2NF
  • intuitively, this means we need independent
    entity facts steps for normalization
  • disallow partial and transitive dependency on
    primary keys

28
Examples
  • Examples Fig 10.10
  • ssn -gt dmgrssn is a transitive FD since
  • ssn -gt dnumber and dnumber -gt dmgrssn
  • ssn -gt ename is non-transitive
  • since
  • there is no set of attributes X
  • where ssn -gt x and x -gt ename
  •  

29
3NF 
  • A relation schema R is in 3NF if
  • it is in 2NF
  • no non-prime attribute A in R is transitively
    dependent on the primary key
  • R can be decomposed into 3NF relations via the
    process of 3NF normalization

30
General Normal Form Definitions
  • The above definitions consider the primary key
    only (can be gt 1 column)
  • The following more general definitions take into
    account relations with multiple candidate keys

31
2NF  
  • A relation schema R is in 2NF if every non-prime
    attribute A in R is fully functionally dependent
    one every key (including candidate keys of R)
    Fig. 10.11
  • County_name and lot are candidate keys

32
3NF
  • Definition
  • a superkey of relation schema R - a set of
    attributes S of R that contains a key of R
  • A relation schema R is in 3NF if whenever
  • X -gt A  holds in R
  • then either
  • a) X is a superkey of R or
  • b) A is a prime attribute of R
  • a) means every non-prime attribute is fully
    functionally dependent on every key
  • b) means no transitive dependencies on any
    key Fig. 10.11

33
 Example of 3NF
  • Example

34
Normal forms
  • Each normal form is strictly stronger than the
    previous one
  • every 2NF relation is in 1NF
  • every 3NF relation is in 2NF

35
Goal
  • The goal is to have each relation in 3NF
  • Additional criteria may be needed to ensure the
    set of relations in a relational database are
    satisfactory
  • lossless joins
  • dependency preservation property

36
Additional normal forms
  • 4NF - based on multi-valued dependencies
  • No table may contain 2 or more 1N or NM
    relationships that are not directly related
  • ename - gt pname, ename -gt dep_name
  • 5NF - based on join dependencies as a way to
    decompose relations
  • If cant decompose with lossless join into 2
    tables, decompose into 3 tables
  • Join dependency Supplier, part, project
  • Must join all of these to get a valid tuple

37
Decomposition
  • Relational database schema design is synthesis
    and decomposition
  • synthesis - grouping attributes together
  • decomposition - avoiding transitive and partial
    dependencies
  • strict decomposition - start with a universal
    relation
  • OR
  • ER model mapped to a set of relations
  • Maps to 3NF

38
Good design
  • A "good" design is not simple individual
    relations in a higher normal form
  • also a set of relations with characteristics such
    as
  • attribute preservation - each attribute appears
    once (at least)
  • dependency preservation - each dependency is a
    constraint to enforce a join
  • union of dependencies holds - does not guarantee
    a lossless join
Write a Comment
User Comments (0)
About PowerShow.com