Introduction to Database Design - PowerPoint PPT Presentation

About This Presentation
Title:

Introduction to Database Design

Description:

... hair color Accomplish normalization by analyzing the interdependencies ... for relational database systems. ... data retrieval Best for use in ... – PowerPoint PPT presentation

Number of Views:152
Avg rating:3.0/5.0
Slides: 60
Provided by: ITSu48
Category:

less

Transcript and Presenter's Notes

Title: Introduction to Database Design


1
Introduction to Database Design
July 2006 Ken Nunes knunes _at_ sdsc.edu

2
Database Design Agenda
  • Introductions
  • General Design Considerations
  • Entity-Relationship Model
  • Normalization
  • Overview of SQL
  • Star Schemas
  • Additional Information
  • QA

3
General Design Considerations
  • Users
  • Application Requirements
  • Legacy Systems/Data

4
Users
  • Who are they?
  • Administrative
  • Scientific
  • Technical
  • Impact
  • Access Controls
  • Interfaces
  • Service levels

5
Application Requirements
  • What kind of database?
  • OnLine Analytical Processing (OLAP)
  • OnLine Transactional Processing (OLTP)
  • Budget
  • Platform / Vendor
  • Workflow?
  • order of operations
  • error handling
  • reporting

6
Legacy Systems/Data
  • What systems are currently in place?
  • Where does the data come from?
  • How is it generated?
  • What format is it in?
  • What is the data used for?
  • Which parts of the system must remain static?

7
Entity - Relationship Model
  • A logical design method which emphasizes
    simplicity and readability.
  • Basic objects of the model are
  • Entities
  • Relationships
  • Attributes

8
Entities
  • Data objects detailed by the information in the
    database.
  • Denoted by rectangles in the model.

Employee
Department
9
Attributes
  • Characteristics of entities or relationships.
  • Denoted by ellipses in the model.

Employee
Department
Name
SSN
Name
Budget
10
Relationships
  • Represent associations between entities.
  • Denoted by diamonds in the model.

Employee
Department
works in
Name
SSN
Name
Budget
Start date
11
Relationship Connectivity
  • Constraints on the mapping of the associated
    entities in the relationship.
  • Denoted by variables between the related
    entities.
  • Generally, values for connectivity are expressed
    as one or many

Employee
Department
1
N
work
Name
SSN
Name
Budget
Start date
12
Connectivity
one-to-one
Department
Manager
1
1
has
one-to-many
Department
Project
N
1
has
many-to-many
Employee
Project
N
M
works on
13
ER example
  • Retailer wants to create an online webstore.
  • The retailer requires information on
  • Customers
  • Items
  • Orders

14
Webstore Entities Attributes
  • Customers - name, credit card, address
  • Items - name, price, inventory
  • Orders - item, quantity, cost, date, status

cost
date
status
price
Name
credit card
Orders
Items
Customers
name
address
item
quantity
inventory
15
Webstore Relationships
  • Identify the relationships.
  • The orders are recorded each time a customer
    purchases items, so the customer and order
    entities are related.
  • Each customer may make several purchases so the
    relationship is one-to-many

1
N
Customer
Order
purchase
16
Webstore Relationships
  • Identify the relationships.
  • The order consists of the items a customer
    purchases but each item can be found in multiple
    orders.
  • Since a customer can purchase multiple items and
    make multiple orders the relationship is many to
    many.

N
M
Order
Item
consists
17
Webstore ER Diagram
credit card
name
address
Customers
1
purchase
date
status
N
Orders
Items
consists
M
N
item
quantity
cost
name
price
inventory
18
Logical Design to Physical Design
  • Creating relational SQL schemas from
    entity-relationship models.
  • Transform each entity into a table with the key
    and its attributes.
  • Transform each relationship as either a
    relationship table (many-to-many) or a foreign
    key (one-to-many and many-to-many).

19
Entity tables
Transform each entity into a table with a key and
its attributes.
Employee
create table employee (emp_no number, name
varchar2(256), ssn number, primary key
(emp_no))
Name
SSN
20
Foreign Keys
  • Transform each one-to-one or one-to-many
    relationship as a foreign key.
  • Foreign key is a reference in the child (many)
    table to the primary key of the parent (one)
    table.

create table department (dept_no number, name
varchar2(50), primary key (dept_no))
Department
1
has
create table employee (emp_no number, dept_no
number, name varchar2(256), ssn
number, primary key (emp_no), foreign key
(dept_no) references department)
N
Employee
21
Foreign Key
Department
Accounting has 1 employee Brian Burnett Human
Resources has 2 employees Nora Edwards Ben
Smith IT has 3 employees Ajay Patel John
OLeary Julia Lenin
Employee
22
Many-to-Many tables
  • Transform each many-to-many relationship as a
    table.
  • The relationship table will contain the foreign
    keys to the related entities as well as any
    relationship attributes.

Project
create table project_employee_details (proj_no
number, emp_no number, start_date
date, primary key (proj_no, emp_no), foreign
key (proj_no) references project foreign key
(emp_no) references employee)
N
Start date
has
M
Employee
23
Many-to-Many tables
Project
Project_employee_details
Employee
Employee Audit has 1 employee Brian
Burnett Budget has 2 employees Julia
Lenin Nora Edwards Intranet has 3
employees Julia Lenin John OLeary Ajay Patel
24
Normalization
  • A logical design method which minimizes data
    redundancy and reduces design flaws.
  • Consists of applying various normal forms to
    the database design.
  • The normal forms break down large tables into
    smaller subsets.

25
First Normal Form (1NF)
  • Each attribute must be atomic
  • No repeating columns within a row.
  • No multi-valued columns.
  • 1NF simplifies attributes
  • Queries become easier.

26
1NF
Employee (unnormalized)
Employee (1NF)
27
Second Normal Form (2NF)
  • Each attribute must be functionally dependent on
    the primary key.
  • Functional dependence - the property of one or
    more attributes that uniquely determines the
    value of other attributes.
  • Any non-dependent attributes are moved into a
    smaller (subset) table.
  • 2NF improves data integrity.
  • Prevents update, insert, and delete anomalies.

28
Functional Dependence
Employee (1NF)
Name, dept_no, and dept_name are functionally
dependent on emp_no. (emp_no -gt name, dept_no,
dept_name) Skills is not functionally dependent
on emp_no since it is not unique to each emp_no.
29
2NF
Employee (1NF)
Employee (2NF)
Skills (2NF)
30
Data Integrity
Employee (1NF)
  • Insert Anomaly - adding null values. eg,
    inserting a new department does not require the
    primary key of emp_no to be added.
  • Update Anomaly - multiple updates for a single
    name change, causes performance degradation. eg,
    changing IT dept_name to IS
  • Delete Anomaly - deleting wanted information.
    eg, deleting the IT department removes employee
    Barbara Jones from the database

31
Third Normal Form (3NF)
  • Remove transitive dependencies.
  • Transitive dependence - two separate entities
    exist within one table.
  • Any transitive dependencies are moved into a
    smaller (subset) table.
  • 3NF further improves data integrity.
  • Prevents update, insert, and delete anomalies.

32
Transitive Dependence
Employee (2NF)
Dept_no and dept_name are functionally dependent
on emp_no however, department can be considered a
separate entity.
33
3NF
Employee (2NF)
Employee (3NF)
Department (3NF)
34
Other Normal Forms
  • Boyce-Codd Normal Form (BCNF)
  • Strengthens 3NF by requiring the keys in the
    functional dependencies to be superkeys (a column
    or columns that uniquely identify a row)
  • Fourth Normal Form (4NF)
  • Eliminate trivial multivalued dependencies.
  • Fifth Normal Form (5NF)
  • Eliminate dependencies not determined by keys.

35
Normalizing our webstore (1NF)
orders
items
order_id cust_id item_id quantity cost date status
405 45 34 2 100 2/306 shipped
405 45 35 1 50 2/306 shipped
405 45 56 3 75 2/306 shipped
408 78 56 2 50 3/5/06 refunded
410 102 72 2 150 3/10/06 shipped
410 102 81 1 175 3/10/06 shipped
item_id name price inventory
34 sweater red 50 21
35 sweater blue 50 10
56 t-shirt 25 76
72 jeans 75 5
81 jacket 175 9
customers
cust_id name address credit_card_num credit_card_type
45 Mike Speedy 123 A St. 45154 visa
45 Mike Speedy 123 A St. 32499 mastercard
45 Mike Speedy 123 A St. 12834 discover
78 Frank Newmon 2 Main St. 45698 visa
102 Joe Powers 343 Blue Blvd. 94065 mastercard
102 Joe Powers 343 Blue Blvd. 10532 discover
36
Normalizing our webstore (2NF 3NF)
customers
credit_cards
cust_id name address
45 Mike Speedy 123 A St.
78 Frank Newmon 2 Main St.
102 Joe Powers 343 Blue Blvd.
cust_id num type
45 45154 visa
45 32499 mastercard
45 12834 discover
78 45698 visa
102 94065 mastercard
102 10532 discover
37
Normalizing our webstore (2NF 3NF)
items
item_id name price inventory
34 sweater red 50 21
35 sweater blue 50 10
56 t-shirt 25 76
72 jeans 75 5
81 jacket 175 9
order details
orders
order_id item_id quantity cost
405 34 2 100
405 35 1 50
405 56 3 75
408 56 2 50
410 72 2 150
410 81 1 175
order_id cust_id date status
405 45 2/306 shipped
408 78 3/5/06 refunded
410 102 3/10/06 shipped
38
Revisit webstore ER diagram
address
Customers
Credit card
have
1
name
N
1
card number
card type
purchase
N
status
Orders
date
name
price
inventory
1
consists
quantity
N
M
N
Items
Order details
consists
cost
39
Structured Query Language
  • SQL is the standard language for data definition
    and data manipulation for relational database
    systems.
  • Nonprocedural
  • Universal

40
Data Definition Language
  • The aspect of SQL that defines and manipulates
    objects in a database.
  • create tables
  • alter tables
  • drop tables
  • create views

41
Create Table
name
address
create table customer (cust_id number, name
varchar(50) not null, address varchar(256) not
null, primary key (cust_id)) create table
credit_card (cust_id number not null,
credit_card_type char(5) not null,
credit_card_num number not null, foreign key
(cust_id) references customer)
Customer
1
have
N
Credit card
card number
card type
42
Modifying Tables
alter table customer modify name
varchar(256) alter table customer add
credit_limit number drop table customer
43
Data Manipulation Language
  • The aspect of SQL used to manipulate the data in
    a database.
  • queries
  • updates
  • inserts
  • deletes

44
Data Manipulation Language
  • The aspect of SQL used to manipulate the data in
    a database.
  • queries
  • updates
  • inserts
  • deletes

45
Select command
  • Used to query data from database tables.
  • Format
  • Select ltcolumnsgt From lttablegt
  • Where ltconditiongt

46
Query example
customers
cust_id name address
45 Mike Speedy 123 A St.
78 Frank Newmon 2 Main St.
102 Joe Powers 343 Blue Blvd.
Select name from customers result Mike
Speedy Frank Newmon Joe Powers
47
Query example
customers
cust_id name address
45 Mike Speedy 123 A St.
78 Frank Newmon 2 Main St.
102 Joe Powers 343 Blue Blvd.
select name from customers where address 123 A
St. result Mike Speedy
48
Query example
customers
credit_cards
cust_id name address
45 Mike Speedy 123 A St.
78 Frank Newmon 2 Main St.
102 Joe Powers 343 Blue Blvd.
cust_id num type
45 45154 visa
45 32499 mastercard
45 12834 discover
78 45698 visa
102 94065 mastercard
102 10532 discover
select from customers where customers.cust_id
credit_cards.cust_id and type
visa returns
Cust_id Name Address Cust_id Num type
45 Mike Speedy 123 A St. 45 45154 visa
78 Frank Newmon 2 Main St. 78 45698 visa
49
Changing Data
There are 3 commands that change data in a
table. Insert insert into lttablegt (ltcolumnsgt)
values (ltvaluesgt) insert into customer
(cust_id, name) values (3, Fred
Flintstone) Update update lttablegt set
ltcolumngt ltvaluegt where ltconditiongt update
customer set name Mark Speedy where cust_id
45 Delete delete from lttablegt where
ltconditiongt delete from customer where cust_id
45
50
Star Schemas
  • Designed for data retrieval
  • Best for use in decision support tasks such as
    Data Warehouses and Data Marts.
  • Denormalized - allows for faster querying due to
    less joins.
  • Slow performance for insert, delete, and update
    transactions.
  • Comprised of two types tables facts and
    dimensions.

51
Fact Table
  • The main table in a star schema is the Fact
    table.
  • Contains groupings of measures of an event to be
    analyzed.
  • Measure - numeric data

Invoice Facts
units sold unit amount total sale price
52
Dimension Table
  • Dimension tables are groupings of descriptors and
    measures of the fact.
  • descriptor - non-numeric data

Customer Dimension
Time Dimension
cust_dim_key name address phone
time_dim_key invoice date due date delivered date
Location Dimension
Product Dimension
loc_dim_key store number store address store
phone
prod_dim_key product price cost
53
Star Schema
The fact table forms a one to many relationship
with each dimension table.
Customer Dimension
Time Dimension
1
1
cust_dim_key name address phone
time_dim_key invoice date due date delivered date
Invoice Facts
N
N
cust_dim_key loc_dim_key time_dim_key prod_dim_ke
y units sold unit amount total sale price
Product Dimension
Location Dimension
N
prod_dim_key product price cost
N
loc_dim_key store number store address store
phone
1
1
54
Analyzing the webstore
  • The manager needs to analyze the orders obtained
    from the webstore.
  • From this we will use the order table to create
    our fact table.

Order Facts
date items customers
55
Webstore Dimension
We have 2 dimensions for the schema customers
and items.
Item Dimension
Customer Dimension
item_dim_key name price inventory
cust_dim_key name address credit_card_type
56
Webstore Star Schema
Order Facts
date items customers
N
N
1
1
Item Dimension
Customer Dimension
item_dim_key name price inventory
cust_dim_key name address credit_card_type
57
Books and Reference
  • Database Design for Mere Mortals,
  • Michael J. Hernandez
  • Information Modeling and Relational Databases,
  • Terry Halpin
  • Database Modeling and Design,
  • Toby J. Teorey

58
Continuing Education
UCSD Extension Data Management Courses DBA
Certificate Program Database Application
Developer Certificate Program
59
Data Central
  • The Data Services Group provides Data Allocations
    for the research community.
  • http//datacentral.sdsc.edu/
  • Tools and expertise for making data collections
    available to the broader scientific community.
  • Provide disk, tape, and database storage
    resources.
Write a Comment
User Comments (0)
About PowerShow.com