Database Design, Implementation, and Programming - PowerPoint PPT Presentation

1 / 53
About This Presentation
Title:

Database Design, Implementation, and Programming

Description:

INSERT INTO STUDENT VALUES (3, 'Johnson', 'STAT', 'USA', 'Evert', 'Sam', 'Philosophy' ... INSERT INTO ADVISOR VALUES (3, 'Evert', 'Sam', 'Philosophy' ... – PowerPoint PPT presentation

Number of Views:35
Avg rating:3.0/5.0
Slides: 54
Provided by: davis7
Category:

less

Transcript and Presenter's Notes

Title: Database Design, Implementation, and Programming


1
RiceCAP workshop, June 4 9, 2006 Noble
Foundation Ardmore, Oklahoma
Basic data relations for markers, traits, and
genotypes
Clare Nelson Dept. of Plant Pathology, Kansas
State University
2
Plan
  • Databases what and why?
  • What's a relational database?
  • Designing a database for Student data
  • Identifying the main data elements
  • Identifying their relations
  • Creating, populating, and querying a MySQL
    database
  • Lab Do it yourself (with guidance!)

3
What you'll know after lecture and lab
  • What are data relations?
  • What is a relational database (RDB)?
  • What's a schema?
  • and how do we design one?
  • How do we get data into and out of a RDB?
  • What's SQL and how do we use it?
  • And a free bonus tip ? ? ? ?

4
How to use your SQL knowledge in the real world
5
What is a database?
  • (Abstract) A structure for storage and retrieval
    of data
  • (Concrete) An instance of such a structure,
    containing data
  • Is an Excel spreadsheet a database?
  • Is a WWW page a database?
  • What about a telephone directory?

6
Is a telephone directory a database?
  • Yes, in a limited way.
  • It is designed for just one query
  • Find the address and telephone number of person
    X.
  • What if we want
  • the names of all residents in the 1700 block of
    Main St.
  • the address with telephone number 222-2222

7
For our purposes, what is a DB?
  • A software structure for data that explicitly
    describes all the relations between the data
    types.
  • We'll call this a relational database or RDB.
  • What's a data type? What's a relation?
  • We'll see by example.

8
What software manages a RDB?
  • Microsoft Access, Oracle, Sybase, MySQL,
    PostgreSQL, DB2, Informix...
  • are database management systems (DBMSs).
  • They are not databases
  • A DBMS is software used to create, populate,
    destroy, query, and administer DBs.

9
Creating a simple database
  • What data do I have (in Excel, say)?

---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
10
Let's build a MySQL database
  • The MySQL server is running on my PC
  • At the command line, type
  • mysql -u ltusernamegt -p
  • enter password
  • I'm now running the MySql DBMS.
  • show databases
  • create database ltXXXgt
  • use ltXXXgt
  • show tables

11
Table
  • The unit of data organization in a RDB
  • Describes a data type or entity
  • Each column describes an attribute
  • Sample table (a very bad one!)

---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
12
Creating a table
  • CREATE TABLE STUDENT
  • ( ID INT(4) NOT NULL,
  • Name VARCHAR(10),
  • Major VARCHAR(5),
  • Nationality VARCHAR(10),
  • Advisor_lname VARCHAR(10),
  • Advisor_fname VARCHAR(10),
  • Advisor_dept VARCHAR(10)
  • )

13
Viewing the table structure
  • describe STUDENT

---------------------------------------------
--------------- Field Type
Null Key Default Extra
--------------------------------------------
---------------- ID int(4)
NO Name
varchar(10) YES
Major varchar(5) YES
Nationality
varchar(10) YES
Advisor_lname varchar(10) YES
Advisor_fname
varchar(10) YES
Advisor_dept varchar(10) YES
--------------------------------
----------------------------
14
Populating the STUDENT table
  • INSERT INTO STUDENT VALUES (0, 'Guo', 'GEN',
    'China', 'Nelson', 'Clare', 'Plant Path')
  • INSERT INTO STUDENT VALUES (0, 'Guo', 'GEN',
    'China', 'Jannink', 'Jean-Luc', 'Agronomy')
  • INSERT INTO STUDENT VALUES (1, 'Mosquera',
    'PLPTH', 'Colombia', 'Valent', 'Barbara', 'Plant
    Path')
  • INSERT INTO STUDENT VALUES (2, 'Smith', 'STAT',
    'USA', 'Nelson', 'Lloyd', 'Statistics')
  • INSERT INTO STUDENT VALUES (3, 'Johnson', 'STAT',
    'USA', 'Evert', 'Sam', 'Philosophy')
  • INSERT INTO STUDENT VALUES (3, 'Johnson', 'STAT',
    'USA', 'Nelson', 'Lloyd', 'Statistics')

15
Querying the database
  • SQL (Structured Query Language)
  • is how we
  • insert
  • query
  • delete
  • alter
  • sort
  • calculate
  • and do all other data operations

16
Use SELECT for querying
  • select from STUDENT

---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
17
A query with conditions
---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
Show names and majors of students whose
nationality is Chinese or whose major is STAT.
  • SELECT
  • FROM
  • WHERE
  • OR

Name, Major STUDENT Nationality 'China' Major
'STAT'
18
Is this a "good" database?
---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
  • Efficiently maintained and queried?
  • data easily added, dropped, changed?
  • If you correct or update repeated data, how can
    you be sure you've found every instance of the
    bad data?
  • With multiple records for a student, how can you
    be sure you've found the information you want?

19
To reduce redundancy pack multiple data into
fields?
---------------------------------------------
----------------------------------------- ID
Name Major Nationality Advisor_lname
Advisor_fname Advisor_dept
-------------------------------------------
-------------------------------------------
0 Guo GEN China Nelson,
Jannink Clare, Jean-Luc Plant Path, Agronomy
3 Smith STAT USA Nelson
Lloyd Statistics
etc. etc. ---------------------------------
---------------------------------------------
  • Why is this a bad idea?
  • It's hard to get at the attributes of the packed
    data
  • Adding/removing data requires updating old data

20
Why not just add attributes?
  • Must know number of advisors in advance of
    designing the database
  • Can't easily find all advisors of student X
  • Advisor data what's it doing in the same table
    with student, anyway?

21
Uncoupling unrelated data
---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 1 Guo GEN China
Jannink Jean-Luc Agronomy
2 Mosquera PLPTH Colombia Valent
Barbara Plant Path 3 Smith
STAT USA Nelson Lloyd
Statistics 4 Johnson STAT
USA Evert Sam
Philosophy 5 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
  • It's reasonable to separate advisor from student!
  • if we want to change or add an attribute to
    advisor, why should we have to update student
    records?
  • How can we reduce unwanted dependencies?

22
Relational database idea
  • Developed by E. F. Codd in 1970
  • Records in tables are linked by relations between
    entities
  • The structure of an RDB is called the data model
    or schema

23
Thinking about relations
  • Can a student have gt 1 advisor?
  • an advisor have gt 1 student?
  • Redundancy comes from data types that don't have
    1-to-1 relations!
  • Can we use relations to develop rules for
    designing a schema?
  • Let's list the possible relations....

24
What is a data relation?
  • A rule of association. There are only three
  • One to one (11)
  • One to many ( many to one) (1M, M1)
  • Many to many (MM)
  • Of what type are the following relations?
  • Mother, child
  • Student, faculty advisor
  • House, street address
  • Food prices, weather forecast
  • genetic marker, genome map

1M MM 11 ?? MM
25
Designing a schema
  • Decide on the main data entities and their
    attributes
  • Determine their relations
  • Create a table for each entity
  • Add columns for attributes
  • Add columns that link tables according to the
    data relations
  • Add tables to reduce redundancy

26
What are the main entities in the Student data?
---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
  • A "main entity" one with attributes of its own)
  • Student
  • with ID, name, major, nationality, advisors
  • Advisor
  • with first name, last name, department

27
First table-creation decisions
---------------------------------------------
--------------------------------- ID Name
Major Nationality Advisor_lname
Advisor_fname Advisor_dept ----------------
-----------------------------------------------
--------------- 0 Guo GEN China
Nelson Clare Plant
Path 0 Guo GEN China
Jannink Jean-Luc Agronomy
1 Mosquera PLPTH Colombia Valent
Barbara Plant Path 2 Smith
STAT USA Nelson Lloyd
Statistics 3 Johnson STAT
USA Evert Sam
Philosophy 3 Johnson STAT USA
Nelson Lloyd Statistics
-------------------------------------------
-----------------------------------
  • Suppose we start by creating STUDENT and ADVISOR
    tables.

28
Create a new STUDENT table
  • drop table STUDENT
  • CREATE TABLE STUDENT
  • ( ID INT(4) NOT NULL,
  • Name VARCHAR(10),
  • Major VARCHAR(5),
  • Nationality VARCHAR(10),
  • )
  • Advisor_lname VARCHAR(10),
  • Advisor_fname VARCHAR(10),
  • Advisor_dept VARCHAR(10)
  • (leave these out for now .... why?)

29
Repopulate STUDENT table
  • INSERT INTO STUDENT VALUES (0, 'Guo', 'GEN',
    'China')
  • INSERT INTO STUDENT VALUES (1, 'Mosquera',
    'PLPTH', 'Colombia')
  • INSERT INTO STUDENT VALUES (2, 'Smith', 'STAT',
    'USA')
  • INSERT INTO STUDENT VALUES (3, 'Johnson', 'STAT',
    'USA')

select from STUDENT ---------------------
--------- ID Name Major Country
------------------------------ 0 Guo
GEN China 1 Mosquera PLPTH
Colombia 2 Smith STAT USA 3
Johnson STAT USA ---------------
---------------
30
Table ADVISOR
  • CREATE TABLE ADVISOR
  • ( ID INT(4) NOT NULL,
  • LName VARCHAR(10),
  • FName VARCHAR(5),
  • Department VARCHAR(20)
  • )
  • Why do we always put an ID field in these tables?

31
Populating the ADVISOR table
  • INSERT INTO ADVISOR VALUES (0, 'Nelson', 'James',
    'Plant Pathology')
  • INSERT INTO ADVISOR VALUES (1, 'Valent',
    'Barbara', 'Plant Pathology')
  • INSERT INTO ADVISOR VALUES (2, 'Nelson', 'Lloyd',
    'Statistics')
  • INSERT INTO ADVISOR VALUES (3, 'Evert', 'Sam',
    'Philosophy')
  • INSERT INTO ADVISOR VALUES (4, 'Jannink',
    'Jean-Luc', 'Agronomy')
  • select from ADVISOR

------------------------------------- ID
LName FName Department
------------------------------------- 0
Nelson James Plant Pathology 1
Valent Barba Plant Pathology 2 Nelson
Lloyd Statistics 3 Evert Sam
Philosophy 4 Jannink Jean-
Agronomy ----------------------------
---------
32
Making trouble with SQL
  • Also try to add duplicate data
  • INSERT INTO ADVISOR VALUES (0, 'Nelson', 'James',
    'Plant Pathology')
  • INSERT INTO ADVISOR VALUES (1, 'Valent',
    'Barbara', 'Plant Pathology')
  • What happened?
  • MySQL didn't prevent us from ruining the
    nonredundant design...
  • We'll see how to do this later.

33
How do we link students with advisors?
  • We would like to store the advisors of each
    student and the students of each advisor...
  • But with a MM relation, we can't store these data
    in either table....
  • (Why not?)
  • What do we do?

34
First, an easier problem 1M and M1 relations
  • Suppose we decide to place Nationality in its own
    table.
  • (Why might we want to do this?
  • CREATE TABLE NATIONALITY
  • ( ID INT(4) NOT NULL,
  • Country VARCHAR(25)
  • )
  • INSERT INTO NATIONALITY VALUES (0, 'China')
  • INSERT INTO NATIONALITY VALUES (1, 'Colombia')
  • INSERT INTO NATIONALITY VALUES (2, 'USA')

------------ ID Country
-------------- 0 China 1
Colombia 2 USA --------------
35
Rebuild the STUDENT table...
  • drop table STUDENT
  • ... and replace Nationality with a numerical key!
  • CREATE TABLE STUDENT
  • ( ID INT(4) NOT NULL,
  • Name VARCHAR(10),
  • Major VARCHAR(5),
  • Nat_ID INT(4)
  • )

36
Repopulate the STUDENT table
  • INSERT INTO STUDENT VALUES (0, 'Guo', 'GEN', 0)
  • INSERT INTO STUDENT VALUES (1, 'Mosquera',
    'PLPTH', 1)
  • INSERT INTO STUDENT VALUES (2, 'Smith', 'STAT',
    2)
  • INSERT INTO STUDENT VALUES (3, 'Johnson', 'STAT',
    2)
  • select from STUDENT

----------------------------- ID Name
Major Nat_ID --------------------------
--- 0 Guo GEN 0 1
Mosquera PLPTH 1 2 Smith
STAT 2 3 Johnson STAT 2
-----------------------------
37
Now how do we find the nationalities of students?
NATIONALITY table ------------ ID
Country -------------- 0 China
1 Colombia 2 USA
--------------
STUDENT table -----------------------------
ID Name Major Nat_ID
----------------------------- 0 Guo
GEN 0 1 Mosquera PLPTH 1
2 Smith STAT 2 3
Johnson STAT 2 -------------------
----------
  • SELECT
  • FROM
  • WHERE

Name, Country STUDENT, NATIONALITY (is this
enough?) STUDENT.Nat_ID NATIONALITY.ID
-------------------- Name Country
-------------------- Guo China
Mosquera Colombia Smith USA
Johnson USA --------------------
This is called a join operation.
38
The join operation and keys
  • Because storing 1M data types in one table breaks
    our One Data Instance gt One Record rule, we
  • place the two data types into two tables
  • specify their relations with primary and foreign
    keys
  • use a join query to extract attributes from both
    tables at once

39
Primary and foreign keys
  • Primary key unique identifier (generally a
    number) for each record in a table.
  • Foreign key an identifier in one table that
    matches the primary key of another table
  • It serves to join the two tables.
  • It's the address of a record, rather than a copy
    of the data in the record.
  • This decouples the attributes of two different
    data types.

40
Primary foreign keys example
A primary key
A foreign key
STUDENT table -----------------------------
ID Name Major Nat_ID
----------------------------- 0 Guo
GEN 0 1 Mosquera PLPTH 1
2 Smith STAT 2 3
Johnson STAT 2 -------------------
----------
NATIONALITY table ------------ ID
Country -------------- 0 China
1 Colombia 2 USA
--------------
41
Many-to-many relations
  • We handled the 1M relation
  • 1 nationality ltgt many students.
  • How do we handle the MM relation
  • many students ltgt many advisors?

42
Table-design rules
  • 11 relations data types may be placed in the
    same table
  • 1M or M1 create two tables, with primary and
    foreign keys
  • MM create three tables, with two primary keys
    and a composite foreign key.

43
Handling the student advisor relation
  • CREATE TABLE STUDENT
  • ( ID INT(4) NOT NULL,
  • Name VARCHAR(10),
  • Major VARCHAR(5),
  • Nat_ID INT(4),
  • Primary key (ID),
  • Foreign key (Nat_ID) references NATIONALITY(ID)
  • )
  • CREATE TABLE ADVISOR
  • ( ID INT(4) NOT NULL,
  • LName VARCHAR(10),
  • FName VARCHAR(5),
  • Department VARCHAR(20),
  • Primary key (ID)
  • )

CREATE TABLE STUDENT_ADVISOR ( Student_ID INT(4)
NOT NULL, Advisor_ID INT(4) NOT NULL, Is_major
CHAR(1), primary key (Student_ID,
Advisor_ID), foreign key (Student_ID) references
STUDENT(ID), foreign key (Advisor_ID) references
ADVISOR(ID) )
Notice that we are telling MySQL about the
keys. What will MySQL do with this?
44
Populating the keyed tables
  • INSERT INTO STUDENT VALUES (0, 'Guo', 'GEN', 0)
  • INSERT INTO STUDENT VALUES (1, 'Mosquera',
    'PLPTH', 1)
  • INSERT INTO STUDENT VALUES (2, 'Smith', 'STAT',
    2)
  • INSERT INTO STUDENT VALUES (3, 'Johnson', 'STAT',
    2)
  • INSERT INTO ADVISOR VALUES (0, 'Nelson', 'James',
    'Plant Pathology')
  • INSERT INTO ADVISOR VALUES (1, 'Valent',
    'Barbara', 'Plant Pathology')
  • INSERT INTO ADVISOR VALUES (2, 'Nelson', 'Lloyd',
    'Statistics')
  • INSERT INTO ADVISOR VALUES (3, 'Evert', 'Sam',
    'Philosophy')
  • INSERT INTO ADVISOR VALUES (4, 'Jannink',
    'Jean-Luc', 'Agronomy')
  • Try repeating the last command. What happens?

45
Populating the join table
INSERT INTO STUDENT_ADVISOR VALUES (0, 0,
1) INSERT INTO STUDENT_ADVISOR VALUES (0, 4,
0) INSERT INTO STUDENT_ADVISOR VALUES (1, 1,
1) INSERT INTO STUDENT_ADVISOR VALUES (2, 2,
1) INSERT INTO STUDENT_ADVISOR VALUES (3, 3,
0) INSERT INTO STUDENT_ADVISOR VALUES (3, 2, 1)
select from STUDENT_ADVISOR -----------------
----------------- Student_ID Advisor_ID
Is_major ----------------------------------
0 0 1 0
4 0 1 1
1 2 2 1
3 3 0 3
2 1
----------------------------------
46
Tables set up for MM relation
describe STUDENT ---------------------------
----- Field Type Null Key
-------------------------------- ID
int(4) NO PRI Name
varchar(10) YES Major varchar(5)
YES Nat_ID int(4) YES
MUL --------------------------------
  • describe STUDENT_ADVISOR
  • --------------------------------
  • Field Type Null Key
  • --------------------------------
  • Student_ID int(4) NO PRI
  • Advisor_ID int(4) NO PRI
  • Is_major char(1) YES
  • --------------------------------

describe ADVISOR ----------------------------
-------- Field Type Null
Key ------------------------------------
ID int(4) NO PRI LName
varchar(10) YES FName
varchar(5) YES Department
varchar(20) YES ---------------------
---------------
How would you extract the names of all advisors
of student named Guo?
47
Who are all Guo's advisers?
  • SELECT
  • FROM
  • WHERE
  • AND
  • AND

Name, LName, FName STUDENT, ADVISOR,
STUDENT_ADVISOR STUDENT.Name 'Guo' STUDENT.ID
STUDENT_ADVISOR.Student_ID ADVISOR.ID
STUDENT_ADVISOR.Advisor_ID
48
A joining query
How to select all the students who have James
Nelson as an advisor?
  • SELECT
  • FROM
  • WHERE
  • AND
  • AND
  • AND

Name, LName STUDENT, ADVISOR, STUDENT_ADVISOR LNam
e "Nelson" FName "Lloyd " ADVISOR_ID
ADVISOR.ID STUDENT_ID STUDENT.ID
49
Summary
  • RDBs incorporate 11, 1M/M1, and MM relations
    among data types
  • Data are stored economically
  • with each table describing one data "entity" and
    its attributes
  • and key fields encoding the relations between
    entities
  • Data are extracted with SQL SELECT statements
  • and data from different tables are extracted with
    joins

50
Categories of SQL statement
  • Data definition (DDL)
  • CREATE
  • ALTER
  • DROP
  • RENAME
  • TRUNCATE
  • Data manipulation (DML)
  • INSERT
  • UPDATE
  • DELETE
  • Data control (DCL)
  • GRANT
  • REVOKE
  • Data retrieval (DQL)
  • SELECT
  • Transaction control (TC)
  • COMMIT
  • ROLLBACK

51
Elements of SQL statements
  • A verb, such as SELECT (Do what?)
  • A predicate object that specifies field names in
    tables ( means all fields) (To what?)
  • A prepositional clause describing the table on
    which the verb acts (FROM tablename) (Where?)
  • A conditional or adverbial clause (WHERE Advisor
    "Nelson") (How?)
  • Various useful helping verbs such as ORDER BY

52
How MySQL stores numbers
  • INTEGER A whole number
  • VARCHAR(10) Up to 10 characters
  • CHAR(10) Fixed number of characters
  • DATE A date
  • DATETIME Date and time
  • FLOAT Floating-point number
  • TEXT allows up to 65535 characters
  • DECIMAL(10, 2) Up to 10 digits before the point,
    2 after
  • Note Oracle uses different number types. ALL
    DBMSs are not interchangeable!

53
Customer order systementity relationship
diagram
Write a Comment
User Comments (0)
About PowerShow.com