Week 10 October 31 - PowerPoint PPT Presentation

About This Presentation
Title:

Week 10 October 31

Description:

CDP-325 (Sony CD Changer) attributes of CD. attributes of receivers. attributes of cassette decks ... 10001 123456 John Smith CDPC725 Sony Disc Jockey CD 1 399.95 ... – PowerPoint PPT presentation

Number of Views:54
Avg rating:3.0/5.0
Slides: 43
Provided by: Kain6
Learn more at: https://www.csus.edu
Category:
Tags: october | sony | week

less

Transcript and Presenter's Notes

Title: Week 10 October 31


1
Week 10October 31
  • Extended ERD
  • Data Normalization

2
Problems with ER modeling
  • Fan traps - Pathway between two entities is
    ambiguous
  • Chasm traps - Pathway does not exist between
    certain entity occurrences
  • Inheritance - An entity receives its attributes
    from a class of attributes

Extended Entity Relationship (ERR) modeling
3
Connection Trap Fan Trap
What products belong to which product
categories? Which products have restricted use
aboard a plane?
Merchandise Lines Merchandise_line Description
Product Categories Product_category Merchandise_l
ine
Classify
Have
Products Product_code Description Merchandise_lin
e
4
Connection Trap Fan Trap
What products belong to which product
categories? Which products have restricted use
aboard a plane?
Merchandise Lines Merchandise_line Description
Product Categories Product_category Merchandise_l
ine
Classify
Have
Products Product_code Description Merchandise_lin
e
To satisfy these queries, we need to form a
relationship
5
Connection Trap Chasm Trap
What products belong to the same merchandise
line? Which products require a UL listing?
Merchandise Lines Merchandise_line Description UL
_listing
Product Categories Product_category Merchandise_l
ine
Classify
Have
Products Product_code Description Product_categor
y
To satisfy these queries, we need to form a
relationship
Known What merchandise lines are composed of
what products
6
EER modelingSuperclass and Subclass Entity Types
  • Superclass - Higher order of classification or
    categorization
  • Subclass - A member of a superclass that provides
    specification

Electronic Merchandise
Music and Videos
Superclasses
Audio
Visual
CD
Receiver
Cassette
attributes of CD
attributes of cassette decks
attributes of receivers
7
EER modelingSuperclass and Subclass Entity Types
  • Specialization - top-down
  • Maximizing differences between members by
    identifying distinguishing characteristics
  • Generalization - bottom-up

General
Electronic Merchandise
Audio
Visual
CD
Receiver
Cassette
Specific
attributes of CD
attributes of cassette decks
attributes of receivers
8
EER modelingSuperclass and Subclass Entity Types
  • Specialization - top-down
  • Generalization - bottom-up
  • Minimizing differences between entities by
    identifying common features

General
Electronic Merchandise
Audio
Visual
CD
Receiver
Cassette
Specific
attributes of CD
attributes of cassette decks
attributes of receivers
9
EER modelingAttribute Inheritance
CDP-325 (Sony CD Changer)
Attributes common to all audio merchandise are
inherited
General
Electronic Merchandise
Audio
Visual
CD
Receiver
Cassette
Specific
attributes of CD
attributes of cassette decks
attributes of receivers
10
EER Diagram
Product_code
Prod_descip
1
M
Products
Manufacturers
Sells
1
Disjoint constraint
Produces
Superclass/subclass
d
1
CD
Receiver
Cassette
Db range
Flutter
Watts
11
Constraints
  • Disjoint (d, o)
  • Entity can be a member of only one of the
    subclasses of specialization
  • Under non-disjoint, an entity can be a member of
    more than one subclass of specialization
  • Participation (partial or total)
  • Total - every entity in the superclass must be a
    member of a subclass in specialization
  • Partial - An entity need not belong to any of the
    subclasses of specialization

12
Data Normalization
  • The process of decomposing complex data
    structures into simple relations according to a
    set of dependency rules. McFadden and Hoffer

13
Data Normalization
  • The purpose of normalization is to produce a
    stable set of relations that is a faithful model
    of the operations of the enterprise.
  • Achieve a design that is highly flexible
  • Reduce redundancy
  • Ensure that the design is free of certain update,
    insertion and deletion anomalies Catherine
    Richardo, 1990

14
Normalization
1NF
2NF
Progressively putting the relation into a higher
normal form
3NF
BCNF
4NF
15
10001
Order No.
Stereos To Go Invoice
6 15 99
Date / /
Go, Hogs
0000-000-0000-0
Account No.
John Smith
Customer
2036-26 Street
Address
1/05
Sacramento CA 95819
City
State
Zip Code
6 18 99
Date Shipped / /
Item
Product
Product Description/Manufacturer
Qty
Price
Number
Code
1
SAGX730 Pioneer Remote A/V Receiver
1 1 1
56995 35995 39995
2
AT10 Cervwin Vega Loudspeakers
CDPC725 Sony Disc-Jockey CD Changer
3
4
5
132985 10000 10306 153291
Subtotal Shipping Handling Sales Tax Total
16
File-Based System
Invoice Program
Invoices
Customer Orders
Customer Account Program
Account
Customer Accounts
Report
File
Customer Mailings Program
Customer Mailing List
Mailing List
File
17
Data Redundancy
  • Customer Order File
  • PO number
  • Customer account number
  • Customer name, address, city, state, zip code
  • Order date
  • Product code, product description, price, unit
  • Customer Account File
  • Account Number
  • Customer name, mailing address, city, state, zip
    code
  • Customer Mailing List File
  • Customer name, mailing address, city, state, zip
    code

18
Unnormalized Relation
(Invoice_number, Invoice_date, Date_delivered,
Cust_account Cust_name Cust_addr Cust_city
Cust_state Zip_code, Item1 Item1_descrip
Item1_qty Item1_price, Item2 Item2_descrip
Item2_qty Item2_price, . . . , Item7
Item7_descrip Item7_qty Item7_price)
How would a program process the data to recreate
the invoice?
19
First Normal Form (1NF)
  • A relation is in first normal form if and only if
    every attribute is single-valued for each tuple.
  • Remove all repeating groups
  • Create a flat file

20
Unnormalized to 1NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account Cust_name Cust_addr Cust_city
Cust_state Zip_code, Item1, Item1_descrip,
Item1_qty, Item1_price, Item2, Item2_descrip,
Item2_qty, Item2_price, . . . , Item7,
Item7_descrip, Item7_qty, Item7_price)
Repeating groups
A flat file places all the data of a transaction
into a single record.
This is reminiscent of a COBOL or BASIC program
processing a single transaction with one read
statement.
21
Unnormalized to 1NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code, Item, Item_descrip,
Item_qty, Item_price)
Nominated group of attributes to serve as the
key (form a unique combination)
  • Eliminate the repeating groups.
  • Each row retains data for one item.
  • If a person bought 5 items, we would have five
    tuples

22
1NF
Flat File
Invoice number
Account number
Customer name
Item Quantity
Item Price
Description
Item
Account number
23
Second Normal Form (2NF)
  • A relation is in second normal form if and only
    if it is in first normal form and the nonkey
    attributes are fully functionally dependent on
    the key.
  • Full functional dependency
  • B is functionally dependent on A if each value of
    A is associated with exactly one value of B

Attribute B
Attribute A
Determinant
24
2NF
  • Second normal form applies to relations with
    composite keys (i.e., a primary key composed of
    two or more attributes)
  • A relation with a single attribute primary key is
    automatically in at least 2NF

25
From 1NF to 2NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code, Item, Item_descrip,
Item_qty, Item_price)
What attribute(s) can be used to uniquely
identify a tuple?
If the primary key consisted of invoice_number
and item (i.e., composite key), we would need to
remove the partial dependencies.
26
From 1NF to 2NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code, Item, Item_descrip,
Item_qty, Item_price)
Using Invoice number and Item as the key...
Some of the attributes are dependent upon
invoice_number for their values and others on
item. In either case, they are not functionally
dependent on the entire key.
27
From 1NF to 2NF
Which attributes are functionally dependent on
which keys?
Invoice_date, Date_delivered, Cust_account,
Cust_name, Cust_addr, Cust_city, Cust_state,
Zip_code, tem_descrip, Item_qty, Item_price
?
Invoice_number Vs. Item
28
From 1NF to 2NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code)
(Item, Item_descrip, Item_qty, Item_price)
Is this unique by itself? What happens if the
item is purchased more than once?
29
From 1NF to 2NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code)
Partial dependency
(Invoice_number, Item, Item_descrip, Item_qty,
Item_price)
Composite key (forms a unique combination)
30
From 1NF to 2NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code)
(Invoice_number, Item, Item_qty, Item_price)
(Item, Item_descrip)
31
From 1NF to 2NF
In contrast...
(Invoice_number, Invoice_date, Date_delivered, Cus
t_account Cust_name Cust_addr Cust_city
Cust_state Zip_code, Item Item_descrip
Item_qty Item_price)
If the primary key consisted of invoice_number
and Invoice_date (i.e., composite key), we would
NOT have partial dependencies. Thus, the
relation would be in 2NF.
32
Third Normal Form (3NF)
  • A relation is in third normal form if it is in
    second normal form and no nonkey attribute is
    transitively dependent on the key.
  • Remove transitive dependencies
  • Each nonkey attribute must depend upon the key,
    the whole key, and nothing but key. Kent,
    1978

33
From 2NF to 3NF
(Invoice_number, Invoice_date, Date_delivered,
Cust_account, Cust_name, Cust_addr, Cust_city,
Cust_state, Zip_code)
(Invoice_number, Item, Item_qty, Item_price)
(Item, Item_descrip)
Which attributes are dependent on others? Is
there a problem?
34
Transitive Dependencies and Anomalies
  • Insertion anomalies
  • To add a new row, all customer (name, address,
    city, state, zip code, phone) and products
    (description) must be consistent with previous
    entries
  • Deletion anomalies
  • By deleting a row, a customer or product may
    cease to exist
  • Modification anomalies
  • To modify a customers or products data in one
    row, all modifications must be carried out to all
    others

35
Insertion and Modification AnomaliesFor example
Insert a new Panasonic product
Product_code
Manufacturer_name
DVD-A110 Panasonic PV-4210 Panasonic PV-4250 Panas
onic
CT-32S35 PAN
Inconsistency
DVD-A110 Panasonic PV-4210 PanaSonic PV-4250 Pana
Sonic CT-32S35 PAN
Change all Panasonic products manufacturer name
to Panasonic USA
36
Deletion AnomalyFor Example
4377182 John Smith ??? Sacramento CA 95831 4398711
Arnold S ??? Davis CA 95691 4578461 Gray
Davis ??? Sacramento CA 95831 4873179 Lisa
Carr ??? Reno NV 89557
By deleting customer Arnold S, we would also be
deleting Davis, California.
37
Invoice_number Invoice_date Date_delivered Cust_ac
count Cust_name Cust_addr Cust_city Cust_state Zip
_code Item Item_descrip Invoice_numberItem Item_q
ty Item_price
Transitive Dependencies
  • A condition where A, B, C are attributes of a
    relation such that if A ? B and B ? C, then C
    is transitively dependent on A via B (provided
    that A is not functionally dependent on B or C).

38
Why Should City and State Be Separated from
Customer Relation?
  • City and state are dependent on zip code for
    their values and not the customers identifier
    (i.e., key). Zip_code ? City, State
  • Otherwise, Cust_account ? Cust_addr,
    Zip_code ? City, StateIn which case, you have
    transitive dependency.

39
3NF
Invoice Relation (Invoice_number, Invoice_date,
Date_delivered, Cust_account)
Customer Relation (Cust_account, Cust_name,
Cust_addr, Zip_code)
Zip_code Relation (Zip_code, City, State)
Invoice_items Relation (Invoice_number, Item,
Item_qty, Item_price)
Items Relation (Item, Item_descrip)
40
3NF
Invoice Relation (Invoice_number, Invoice_date,
Date_delivered, Cust_account)
Customer Relation (Cust_account, Cust_name,
Cust_addr, Zip_code)
Zip_code Relation (Zip_code, City, State)
Invoice_items Relation (Invoice_number, Item,
Item_qty, Item_price)
Items Relation (Item, Item_descrip)
Manufacturers Relation (Manuf_code, Manuf_name)
Since the Items relation contains the
manufacturers name in the description, a
separate Manufacturers relation can be created
41
First to Third Normal Form(1NF - 3NF)
  • 1NF A relation is in first normal form if and
    only if every attribute is single-valued for each
    tuple (remove the repeating or multi-value
    attributes and create a flat file)
  • 2NF A relation is in second normal form if and
    only if it is in first normal form and the nonkey
    attributes are fully functionally dependent on
    the key (remove partial dependencies)
  • 3NF A relation is in third normal form if it is
    in second normal form and no nonkey attribute is
    transitively dependent on the key (remove
    transitive dependencies)

42
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com