Title: Logical Data Modeling:
1Chapter 10
- Logical Data Modeling
- Normalization
- Physical Database Design Overview and
Denormalization
2Agenda
- Data Modeling in the SDLC and Outcomes
- What is Logical Data Modeling and Normalization
- Functional Dependency
- Normal Forms
- Translating ERDs to Normalized Relations
- View Integration
- Overview of Physical Database Design
- Denormalization
3Learning Objectives
- Understand key terminology
- Know where logical data modeling fits into
systems development - Be able to represent relationships through
relations - Be able to translate ERDs into normalized
relations - Be able to merge different sets of normalized
relations
4Where We Are in the SDLC
5Data Modeling within the SDLC
6Outcomes of Logical Database Design
- Normalized relations (see Examples)
- Note Account for every data element on or in
- system or process input (e.g., forms)
- system or process output (e.g., reports, query
screens) - data stores or E-R diagram
- Each data element must be kept in the systems
database or derived from data in the database
7Logical Database Design
- Based upon the conceptual data model
- Four key steps
- Develop a logical data model for each known user
interface for the application using normalization
principles. - Combine normalized data requirements from all
user interfaces into one consolidated logical
database model (view integration). - Translate the conceptual E-R data model for the
application into normalized data requirements. - Compare the consolidated logical database design
with the translated E-R model and produce one
final logical database model for the application.
8Physical Database Design
- Based upon results of logical database design
- Key decisions
- Choosing storage format for each attribute from
the logical database model - Grouping attributes from the logical database
model into physical records - Arranging related records in secondary memory
(hard disks and magnetic tapes) so that records
can be stored, retrieved and updated rapidly - Selecting media and structures for storing data
to make access more efficient
9Query Screen Example
10Report Example
11Integrated Set of Relations (Fig. 10-3c)
Notice that this attribute was in Query view
(Fig. 10-3a) but not Report view (Fig. 10-3b)
(derived from two previous examples)
12Conceptual Data Model and Transformed Relations
(Fig. 10-3d)
Customer Order Processing Application
13Final Set of Normalized Relations (Fig. 10-3e)
Notice that these attributes were in ER diagram
(Fig. 10-3d) but not in integrated view from
Query and Report (Fig. 10-3c)
(derived from two previous examples)
Notice that this attribute was in integrated view
from Query and Report (Fig. 10-3c) but not in ER
diagram (Fig. 10-3d)
14What Is Logical Data Modeling
- Translating conceptual data models into a format
consistent with the architecture used by the data
management software to be used with the
application - Normalization
- analysis of functional dependencies between data
items to result in a structure of data that is
simple, stable, and fundamental
15Well-Structured Relations
- Avoid Anomalies (Errors, Inconsistencies,
Problems) - Insertion
- Deletion
- Modification
16Anomalies Example of Non-Simple Relation
Primary Key
- Although Emp_ID and Course make each row unique,
the table is not simple - Data in each instance not just related to primary
key - It should be possible to add a new employee
without supplying course data, but not here
(thus, an insertion anomaly) - It should be possible to delete a course without
losing employee data, but not here (thus, a
deletion anomaly) - Notice redundant data errors could occur on
modifying data, e.g., salary (thus, a
modification anomaly) - Table contains data on more than one entity use
separate, simpler tables
17Example of Simpler Relations No Redundancy
Primary Key
Primary Key
- Each instance contains only data related to the
primary key - Anomalies less likely
- Insertion of employee does not require course
data - Deleting course would not affect data in employee
table - Less redundancy makes modification errors less
likely
18Data Normalization
- Primarily a tool to validate and improve a
logical design so that it satisfies certain
constraints that avoid unnecessary duplication of
data. - The process of decomposing relations with
anomalies to produce smaller, well-structured
relations.
19Functional Dependency
- For a relation (table), attribute A depends on
attribute B if for every valid row the value of B
determines the value of A B A - E.g.
- Student ID Student name
- Order No Product No Quantity ordered
20Normal Forms
- First normal form
- No multi-valued attributes.
- Every attribute value is atomic.
- Second normal form
- 1NF and every non-key attribute is fully
functionally dependent on the primary key. - Every non-key attribute must be defined by the
entire key, not by only part of the key. - No partial functional dependencies.
- Third normal form
- 2NF and no transitive dependencies (functional
dependency between non-key attributes.)
21Relation with transitive dependency
(a) SALES relation with simple data
22Removing a transitive dependency
(a) Decomposing the SALES relation
23Transforming E-R Diagrams into Relations
- 1. Map Regular Entities to Relations.
- Composite attributes Use only their simple,
component attributes. - Multi-valued Attribute - Becomes a separate
relation with a foreign key taken from the
superior entity.
24Mapping a composite attribute
(a) CUSTOMER entity type with composite attribute
25(b) CUSTOMER relation with address detail
26Transforming E-R Diagrams Into Relations
- 2. Map Weak Entities
- Becomes a separate relation with a foreign key
taken from the superior entity.
27Example of mapping a weak entity
(a) Weak entity DEPENDENT
28(b) Relations resulting from weak entity
29Transforming E-R Diagrams Into Relations
- 3. Map Binary Relationships
- One-to-Many - Primary key on the one side becomes
a foreign key on the many side - Many-to-Many - Create a new relation with the
primary keys of the two entities as its primary
key - One-to-One - Primary key on the mandatory side
becomes a foreign key on the optional side
30Example of mapping a 1M relationship
(a) Relationship between customers and orders
31(b) Mapping the relationship
32Example of mapping an MN relationship
(a) Requests relationship (MN)
33(b) Three resulting relations
34Mapping a binary 11 relationship
(a) Binary 11 relationship
35(b) Resulting relations
36Transforming E-R Diagrams Into Relations
- 4. Map Associative Entities
- Identifier Not Assigned
- Default primary key for the association relation
is composed of the primary keys of the two
entities - Identifier Assigned
- It is natural and familiar to end-users.
- Default identifier may not be unique.
37Mapping an associative entity with an identifier
(a) Associative entity (SHIPMENT)
38(b) Three relations
39Transforming E-R Diagrams Into Relations
- 5. Map Unary Relationships
- One-to-Many - Recursive foreign key in the same
relation - Many-to-Many - Bill-of-materials Two relations
- One for the entity type.
- One for an associative relation in which the
primary key has two attributes, both taken from
the primary key of the entity
40Transforming E-R Diagrams Into Relations
- 6. Map Ternary (and n-ary) Relationships
- One relation for each entity and one for the
associative entity
41Mapping a ternary relationship
(a) Ternary relationship with associative entity
42(b) Mapping the ternary relationship
43View Integration / Merging Relations
- Building relations from independent views or
data models - Issues that arise
- Synonyms
- Homonyms
- Transitive dependencies
- ISA relationships (supertypes/subtypes)
44Physical File and Database Design
- The following information is required
- Normalized relations, including volume estimates
- Definitions of each attribute
- Descriptions of where and when data are used,
entered, retrieved, deleted, and updated
(including frequencies) - Expectations or requirements for response time
and data integrity - Descriptions of the technologies used for
implementing the files and database - Must be collected during Analysis
45Designing Fields
- Field
- Smallest unit of named application data
recognized by system software - Attributes from relations will be represented as
fields - Data Type
- A coding scheme recognized by system software for
representing organizational data - Choosing data types
- Four objectives
- Minimize storage space
- Represent all possible values of the field
- Improve data integrity of the field
- Support all data manipulations desired on the
field - Calculated fields
- A field that can be derived from other database
fields
46Methods of Controlling Data Integrity
- Default Value
- A value a field will assume unless an explicit
value is entered for that field - Range Control
- Limits range of values that can be entered into
field - Referential Integrity
- An integrity constraint specifying that the value
(or existence) of an attribute in one relation
depends on the value (or existence) of the same
attribute in another relation - Null Value
- A special field value, distinct from 0, blank, or
any other value, that indicates that the value
for the field is missing or otherwise unknown
47Denormalization
- Implementing relations in non-normalized form
- Why?
- Optimize processing efficiency at cost of
managing anomalies - combine into one table data needed together
- separate a table which has several uses
- Partition over several servers
48Indexed File Organization
- A file organization in which rows are stored
either sequentially or nonsequentially and an
index is created that allows software to locate
individual rows - Index A table used to determine the location of
rows in a file that satisfy some condition
49Guidelines for Choosing Indexes
- Specify a unique index for the primary key of
each table. - Specify an index for foreign keys.
- Specify an index for nonkey fields that are
referenced in qualification, sorting and grouping
commands for the purpose of retrieving data.
50Summary Four Key Steps in Logical Database
Modeling
- Develop a logical data model for each known user
interface (form and report) using normalization
principles - Combine normalized data requirements from all
interfaces (view integration) - Translate E-R data model (developed without
explicit concern for interfaces) into normalized
data requirements - Compare results of 2 and 3 and produce, through
view integration, one final model