DBS201: Introduction to Normalization - PowerPoint PPT Presentation

About This Presentation
Title:

DBS201: Introduction to Normalization

Description:

DBS201: Introduction to Normalization – PowerPoint PPT presentation

Number of Views:115
Avg rating:3.0/5.0
Slides: 44
Provided by: BarbaraC172
Category:

less

Transcript and Presenter's Notes

Title: DBS201: Introduction to Normalization


1
DBS201 Introduction to Normalization

2
Agenda
  • Top Down vs Bottom Up
  • What is Normalization?
  • Why Normalization?
  • Normalization Steps

3
Top Down vs Bottom Up
  • Top Down
  • Usually provided just a narrative or very high
    level data requirements
  • Need to discover entities, attributes,
    relationships
  • Result is tables

4
Top Down vs Bottom Up
  • Bottom Up
  • Provided with views of data
  • Views can be screen shots or reports (printouts)
  • Views contain fields (data)
  • Need to groups fields together find fields that
    are in common
  • Result is tables

5
Agenda
  • Top Down vs Bottom Up
  • What is Normalization?
  • Why Normalization?
  • Normalization Steps

6
What is normalization?
  • Normalization
  • Process for evaluating and correcting table
    structures to minimize data redundancies
  • helps eliminate data anomalies
  • Can be used in conjunction with ER modeling to
    produce a good database design

7
What is normalization?
  • Works through a series of stages called normal
    forms
  • Normal form (1NF)
  • Second normal form (2NF)
  • Third normal form (3NF)

8
What is normalization?
  • 2NF is better than 1NF
  • 3NF is better than 2NF
  • For most business database design purposes, 3NF
    is highest we need to go in the normalization
    process
  • Highest level of normalization is not always most
    desirable

9
Agenda
  • Top Down vs Bottom Up
  • What is Normalization?
  • Why Normalization?
  • Normalization Steps

10
Why normalization?
  • Example company that manages building projects
  • Charges its clients by billing hours spent on
    each contract
  • Hourly billing rate is dependent on employees
    position
  • Periodically, a report is generated that contains
    information displayed as in Table 5.1

11
A Sample Report Layout
12
A Table in the Report Format
13
Why normalization?
  • Structure of data set in Figure 5.1 does not
    handle data very well
  • The table structure appears to work report is
    generated with ease
  • Unfortunately, the report may yield different
    results, depending on what data anomaly has
    occurred

14
Agenda
  • Top Down vs Bottom Up
  • What is Normalization?
  • Why Normalization?
  • Normalization Steps

15
Conversion to First Normal Form
  • Relational table must not contain repeating
    groups
  • Repeating group
  • Derives its name from the fact that a group of
    multiple (related) entries can exist for any
    single key attribute occurrence
  • Normalizing the table structure will reduce these
    data redundancies
  • Normalization is three-step procedure

16
Step 1 Eliminate the Repeating Groups
  • Present data in a tabular format, where each cell
    has a single value and there are no repeating
    groups
  • Eliminate repeating groups by eliminating nulls,
    making sure that each repeating group attribute
    contains an appropriate data value

17
First Normal Form
18
Step 2 Identify the Primary Key
  • Primary key must uniquely identify attribute
    values (a row)
  • Primary key is PROJ_NUM, EMP_NUM (because the
    combination of those two uniquely identifies each
    row of the table)

19
Step 3 Identify all Dependencies
  • Dependencies can be depicted with the help of a
    diagram
  • Dependency diagram
  • Depicts all dependencies found within a given
    table structure
  • Helpful in getting birds-eye view of all
    relationships among a tables attributes
  • Use makes it much less likely that an important
    dependency will be overlooked

20
Dependency Diagram
1NF
PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS
CHG_HOUR HOURS
21
First Normal Form
  • Tabular format in which
  • All key attributes are defined
  • There are no repeating groups in the table
  • All attributes are dependent on primary key
  • All relational tables satisfy 1NF requirements
  • Some tables contain partial dependencies
  • Dependencies based on only part of the primary
    key
  • Still subject to data redundancies

EMPLOYEE_PROJECT(PROJ_NUM(PK), EMP_NUM(PK),
PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, HOURS)
22
Conversion to Second Normal Form
  • Relational database design can be improved by
    converting the database into second normal form
    (2NF)
  • Two steps

23
Step 1 Identify All Key Components
  • Determine which attributes are dependent on which
    other attributes
  • Using the dependency diagram, document the
    partial dependencies in other words take each
    part of the primary key and document which
    attributes are dependent on each part of the
    primary key

24
Dependency Diagram
2NF
1NF
PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS
CHG_HOUR HOURS
25
Step 2 Identify the Dependent Attributes
  • Write each key component on separate line, and
    then write the original (composite) key on the
    last line
  • Each component will become the key in a new table

PROJECT (PROJ_NUM (PK), PROJ_NAME) EMPLOYEE
(EMP_NUM(PK), EMP_NAME, JOB_CLASS,
CHG_HOUR) EMPLOYEE_PROJECT(PROJ_NUM(PK),
EMP_NUM(PK), HOURS)
26
Second Normal Form
  • Table is in second normal form (2NF) if
  • It is in 1NF and
  • It includes no partial dependencies
  • No attribute is dependent on only a portion of
    the primary key

27
Conversion to Third Normal Form
  • Data anomalies created are easily eliminated by
    completing these steps

28
Step 1 Identify Each New Determinant
  • For every transitive dependency, write its
    determinant as a PK for a new table
  • Determinant
  • Any attribute whose value determines other values
    within a row
  • Using the dependency diagram, document the
    transitive dependencies in other words identify
    the attributes dependent on each determinant
    identified above and identify the dependency

29
Dependency Diagram
3 NF
2NF
1NF
PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS
CHG_HOUR HOURS
JOB_CLASS is a determinant because it can
determine other values within the row. In this
case, its the CHG_HOUR
30
Step 2 Name the table
  • Name the table to reflect its contents and
    function

PROJECT (PROJ_NUM (PK), PROJ_NAME) EMPLOYEE
(EMP_NUM(PK), EMP_NAME) JOB (JOB_CLASS(PK),
CHG_HOUR) EMPLOYEE_PROJECT(PROJ_NUM(PK),
EMP_NUM(PK), HOURS)
31
Third Normal Form
  • A table is in third normal form (3NF) if
  • It is in 2NF and
  • It contains no transitive dependencies

32
Improving the Design
  • Table structures are cleaned up to eliminate the
    troublesome initial partial and transitive
    dependencies
  • Normalization cannot, by itself, be relied on to
    make good designs
  • It is valuable because its use helps eliminate
    data redundancies

33
Improving the Design (continued)
  • The following changes should be made
  • PK assignment
  • Naming conventions
  • Attribute atomicity
  • Adding attributes
  • Adding relationships (will define the FKs)
  • Refining PKs
  • Eliminate derived attributes

34
Business Rules
  • Business Rules drive the relationships
  • Look at the data in the table and
    understand/interpret what the relationship is
    between the data

35
Business Rules
  • A job class can have more than 1 employee in it
    results in a 1M relationships between JOB and
    EMPLOYEE
  • Add the FKs into the appropriate tables

36
Step 2 Name the table
  • Name the table to reflect its contents and
    function

PROJECT (PROJ_NUM (PK), PROJ_NAME) EMPLOYEE
(EMP_NUM(PK), EMP_NAME, JOB_CLASS(FK)) JOB
(JOB_CLASS(PK), CHG_HOUR) EMPLOYEE_PROJECT(PROJ_N
UM(PK), EMP_NUM(PK), HOURS)
New foreign key
?Business rule not necessary between EMPLOYEE and
PROJECT. Why?
37
First Normal Form
38
Normalization and Database Design
  • Normalization should be part of design process
  • Make sure that proposed entities meet required
    normal form before table structures are created
  • ER diagram
  • Provides the big picture, or macro view, of an
    organizations data requirements and operations
  • Created through an iterative process
  • Identifying relevant entities, their attributes
    and their relationship
  • Use results to identify additional entities and
    attributes

39
ERD
40
Normalization and Database Design (continued)
  • Normalization procedures
  • Focus on the characteristics of specific entities
  • A micro view of the entities within the ER
    diagram
  • Difficult to separate normalization process from
    ER modeling process
  • Two techniques should be used concurrently

41
Summary
  • Normalization is a table design technique aimed
    at minimizing data redundancies
  • First three normal forms (1NF, 2NF, and 3NF) are
    most commonly encountered
  • Normalization is an important partbut only a
    partof the design process
  • Continue the iterative ER process until all
    entities and their attributes are defined and all
    equivalent tables are in 3NF

42
Normalization Exercise
STU_NUM STU_LNAME STU_MAJOR DEPT_CODE DEPT_NAME DEPT_PHONE
211343 Stephanos Accounting ACCT Accounting 4356
200128 Smth Accounting ACCT Accounting 4356
199876 Jones Marketing MKTG Marketing 4378
199876 Ortiz Marketing MKTG Marketing 4378
223456 McKulski Statistics MATH Mathematics 3420
Normalize the above table. 1NF eliminate
repeating groups, identify a PK for the table
structure 2NF find partial dependencies 3NF
- find transitive dependencies Write final table
structures, including all relationships. Make
all attributes atomic.
43
Normalization Exercise
Vehicle Num Model Year Acme TireNumber Tire Manfctr Size KM This Vehicle
11 Ford 1997 1327 Goodyear 15R7 43000
      1328 Goodyear 15R7 43000
      1329 Goodyear 15R7 25000
      1330 Goodyear 15R7 24000
15 Chevrolet 1999 2013 BF Goodrich 15R7 18000
      2014 BF Goodrich 15R7 18000
      2013 BF Goodrich 15R7 29000
      2014 BF Goodrich 15R7 29000
Normalize the above table. 1NF eliminate
repeating groups, identify a PK for the table
structure 2NF find partial dependencies 3NF
- find transitive dependencies Write final table
structures, including all relationships. Make
all attributes atomic.
Write a Comment
User Comments (0)
About PowerShow.com