Database Design, a Practical Guide - PowerPoint PPT Presentation

1 / 74
About This Presentation
Title:

Database Design, a Practical Guide

Description:

Database Design: A Practical Guide. Ask questions as we go ... Database Design: A Practical Guide. If you thought this talk was going to be about indexing ... – PowerPoint PPT presentation

Number of Views:75
Avg rating:3.0/5.0
Slides: 75
Provided by: gusbjr
Category:

less

Transcript and Presenter's Notes

Title: Database Design, a Practical Guide


1
Database Design, a Practical Guide
  • Click to add subtitle

Gus Björklund (gus_at_progress.com)
Wizard, Progress Software Corporation
2
Ask questions as we goif I am not being
clear.Warning there is a mistake in these
slides.
3
Rules are made to be broken
To every rule, there is an exception!
4
If you thought this talk was going to be about
indexing
It isnt. Nor is it about performance.
5
Topics
  • Theory
  • What is Database Design
  • Basic Elements
  • Representing the Model as Tables
  • Practice
  • An Example
  • Some Other Topics

6
First, a little theory
7
What do we mean by database design?
  • A process for defining a model of a subset of the
    real1 world, then representing it as data in
    tables in a relational databaseAt least, thats
    the definition we will use for the purposes of
    this talk.

1 Well, for small values of real, anyway.
8
Basic Elements
What do we put in our model?
  • Just 3 Things
  • Entities
  • Attributes
  • Relationships

The entity-relationship model was described by
Peter Chen in 1976. See http//bit.csc.lsu.edu/c
hen/chen.html
9
Basic Elements Entities
  • Can be thought of as nouns
  • People
  • author, composer, performer, seller, buyer
  • Places
  • home, IP address, URL, destination, factory,
    store
  • Things
  • song, recording, instrument, car, invoice

Is telephone number a place or a thing?
10
Basic Elements Attributes
Entities have attributes
  • Can be thought of as adjectives (but only
    loosely)
  • Length
  • Color
  • Horsepower
  • Part number
  • Song Title
  • Publication Date
  • Size
  • Fabric
  • Owner

Is telephone number a attribute or an entity?
11
Basic Elements Relationships
Entities are connected by relationships
  • Can be thought of as verbs
  • has a
  • owns
  • contains
  • supervises
  • performs
  • called
  • sold
  • purchased
  • proved

Is telephone number a relationship?
12
Relationships have attributes too
In May, 1995, Andrew Wiles published a proof of
Fermats Last Theorem
13
Relationships have attributes too
In May, 1995, Andrew Wiles published a proof of
Fermats Last Theorem
attribute
entity
relationship
entity
14
What goes in an entity
  • Identifying attributes
  • Must be able to uniquely identify the entity
  • Can have more than one way to id
  • Id can be composite
  • Descriptive attributes
  • the values you need to keep track of
  • generally should be simple, not complex

15
What to include in your model
  • The things your application has to keep track of
  • Telephones, wires, switches
  • The actions your application or its users perform
  • Make calls, send telephone bills, collect
    payments
  • Some attributes of the things and actions
  • Originating number, date and time of call,
    duration, called number
  • Keep it simple
  • Be accurate
  • Keep it up to date

16
What to include in your model
  • Consider the goals of the system
  • Everything you include should be there for a
    reason you can state
  • in no more than two sentences
  • Everything should have a clear name
  • if you cant name it, it doesnt belong
  • Talk to the stakeholders !!!

17
What to leave out of your model
  • The real world has properties that dont matter
    (to your application)
  • The real world has relationships that dont
    matter
  • Things happen in the real world that dont matter
  • Keep it simple
  • If you cant say why you need it, leave it out

18
Logical vs Physical Data Models
  • Logical entities often require multiple tables to
    represent them
  • Tables can be thought of as logical or physical
  • It depends on your point of view
  • There is also the physical storage database
    layout
  • storage areas
  • data extents
  • disks
  • etc.
  • We arent going to talk about the physical
    database layout
  • We will talk about tables

19
Mapping Your Model to a Database
Simply put,
  • Entities become tables
  • Identifiers become indexes
  • Attributes become columns
  • Data types pick appropriate
  • Relationships become tables or foreign keys

20
In theory, there is no difference betweentheory
and practice, but in practice there is.Jan van
de Snepscheut
21
Now for some practice.
22
An example
  • Music store
  • Buys compact disc recordings from distributors
  • Has inventory
  • Allows customers to search for what they want
  • Maybe in an in-store kiosk or on the web
  • Sells compact discs to customers

23
What should we do first?
24
Activities
  • We buy discs from a distributor
  • Orders are sent to a distributor
  • Orders are delivered to the store
  • Orders may be cancelled
  • We sell discs to customers in sales transactions
  • Customers buy discs in sales transactions
  • Customers search for what they want to buy

Which of these must be remembered by the system?
25
What do we need to keep track of
  • Discs we have
  • Discs we sold
  • Discs we know about and can get
  • Discs we have ordered
  • Information needed to do our income tax
  • what we paid for stock
  • when we bought it
  • what we sold it for
  • when we sold it

26
Disc entities
  • UPC Code 8697-07416-2
  • Manufacturer Sony BMG
  • Cost to us 2.00
  • Price charged 17.95
  • Tax charged 0.80
  • Date purchased March 19, 2007
  • Date sold June 9, 2007

27
Disc table might look like this
28
Whats wrong?
  • Is upc a unique identifier?
  • Might have bought from a distributor
  • Have no information about what is on the disc
  • How do customers search?
  • Dont know when disc was made
  • Could be more than one tax jurisdiction
  • provincial tax, city tax
  • Dont know if disc is on order
  • Dont know who bought it
  • Duplicated data
  • Etc., etc.

29
Disc entities take 2
  • UPC Code 8697-07416-2
  • Manufacturer Sony BMG
  • Distributor Bobs Wholesale CDs
  • Cost to us 2.00
  • Price charged 17.95
  • Tax charged 0.80
  • Date ordered March 19, 2007
  • Date received March 20, 2007
  • Date sold June 9, 2007
  • Disc Title The Essential Joshua Bell
  • Artist Joshua Bell
  • Track 1 Danse Russe
  • Track 2 Violin Concerto in E Minor
  • Track 3 Nocturne in C-sharp Minor
  • etc.

30
Example Now Whats wrong?
  • This is getting messy
  • Activities combined with discs attributes
  • Have duplicated information
  • How many tracks can there be?
  • What if there is more than one artist?
  • Dont have all the information a customer might
    want to use to search

31
Discs revisited
  • Discs have titles
  • Discs have pictures on the cover
  • Discs contain tracks
  • Discs are made by manufacturers
  • Discs are purchased from distributors
  • Discs are ordered from distributors
  • Discs are delivered to the store
  • Discs are sold to customers

32
Discs contain tracks
  • Tracks contain songs
  • Tracks occur in order
  • Tracks have a duration
  • Songs are performed in performances
  • Songs have performers (usually)
  • Songs have composers
  • Songs have names (titles)
  • Songs have a key (but not always)
  • Performances are done by performers
  • Performers can be groups (bands, orchestras,
    etc.)
  • Performances are performed in a location or venue

33
We seem to need these entities
  • Discs
  • Manufacturers
  • Distributors
  • Orders
  • Customers
  • Inventory
  • Tracks
  • Songs
  • Performers
  • Groups ?

34
Songs have names (titles).Are names properties
of songs?Or are they entities related to
songs?Or are they something else?
35
Song data (track 1)
36
Song data (track 2)
37
Performance data
38
Performance data take 2
39
Performer data
40
Performance to Performer Relationship
41
Performance data take 3
42
Track to Performance Relationship
43
Relationships (so far)
track
performance
one to one
performer
performance
performance
disc
performer
track
performance
track
one to many
many to many
track
44
What happened to Songs?
45
Relationships (take 2)
song
performance
track
song
one to many
performance
one to one
performance
disc
performer
track
performance
track
performance
one to many
performer
track
performance
many to many
46
Relationships (take 3)
47
What aboutbusiness entities?Where are they?
48
Business entities
Here is one kind of business entity
49
Business entities
Here is a different kind of business entity
50
Business entities
Here is still another kind of business entity
51
Should you use arrays?
52
Indexes
  • Enforce uniqueness
  • Make searches faster
  • Enable fast retrieval of entities by their
    identities
  • Enable finding entities with certain attributes

53
What indexes do we needfor the music store
database?
54
Tables
0) Discs1) Tracks2) Songs3) Performers4)
Performances5) Tracks of discs6) Performances
of songs7) Performers of performances
55
What indexes do we need
0) Indexes for identifying attributes1) A unique
row identifier2) Indexes for the queries you
will do
56
What should we do next ?
57
Other Topics
  • Normalization
  • Unique keys
  • Word indexes
  • Naming
  • Customisation

58
Normalization
  • Oversimplified, it means
  • Dont duplicate data
  • Attributes should be simple
  • have only one value
  • be necessary
  • not derived data
  • dont repeat
  • Complicated attributes are often entities in
    their own right
  • For example, addresses might be

59
Unique keys
  • EVERY table must have a unique key
  • EVERY row needs a unique identifier
  • that never changes even if moved to another
    database (i.e. if you replicate)
  • Often, users dont need to see it
  • Use a UUID or sequence or maybe datetime
  • Unique key is the ONLY way to identify rows
    unambiguously
  • ROWIDs are temporary and can change
  • Use the same method throughout
  • Youll be glad you did

60
Word indexes
  • Can be used to hold multiple status or attribute
    values
  • Conflicts with normalisation
  • Flexible
  • Easy to add new ones
  • Queries are fast
  • Example
  • Category classical, violin, orchestral, concerto

61
Naming
Good names are crucial to understanding
  • What is in the column GL01262 ?

62
Naming
Good names are crucial to understanding
  • Table and column names should have clear meanings
    everyone can understand
  • GL01262 vs dateEntered
  • Names with dashes cause inconvenience with SQL
  • order-date
  • Booleans should be named for truth value
  • backOrdered
  • No double negations
  • notOutOfStock

63
Making tables customizable
We will look at 4 ways
  • Spare columns
  • Separate table with spare columns
  • Separate table with name/value pairs
  • Name/value pairs in word-indexed column

64
Table and columns
65
Spare columns in table
66
Spare columns in table
What data types should you use? How many spare
columns? Wasted columns when not used How do you
know what each spare got used for? How do you
know how many unused spares you have?
67
Separate table for spare columns
68
Separate table for spare columns
69
Separate table with name/value pairs
70
Name/value pairs in word-indexed column
71
Modeling Tools
  • PCase
  • Enterprise Architect
  • Power Designer
  • ConceptDraw
  • Erwin
  • Rational

Pencil and paper !
Blackboard !
72
Summary
  • Understand the requirements
  • Leave out what is not needed
  • Review the design with stakeholders
  • Evolve the design as changes come up
  • Test to make sure it works
  • Can it do everything that is needed?
  • Does it perform adequately?
  • Expect changes to come

73
Homework
  • Papers
  • Wiles, A. "Modular elliptic curves and Fermat's
    Last Theorem, Annals of Mathematics 141 (3)
    443-551
  • Chen, P. The Entity-Relationship Model --
    Toward a Unified View of Data, ACM TODS Vol 1,
    No 1, 1976
  • Wikipedia articles to start from
  • entity-relationship model
  • data model
  • Books
  • Teorey, Lightstone, Nadeau Database Modeling
    and Design, Morgan Kaufmann.

74
Questions
?
Write a Comment
User Comments (0)
About PowerShow.com