376a. Database Design - PowerPoint PPT Presentation

About This Presentation
Title:

376a. Database Design

Description:

Absolutely essential for learning the material. ... Queries (list the names of all students enrolled in Databases) ... Behind the scenes ... – PowerPoint PPT presentation

Number of Views:24
Avg rating:3.0/5.0
Slides: 60
Provided by: csVa
Category:
Tags: 376a | database | design

less

Transcript and Presenter's Notes

Title: 376a. Database Design


1
376a. Database Design
  • Dept. of Computer Science
  • Vassar College
  • http//www.cs.vassar.edu/cs376

2
Introduction
  • What are databases?
  • What are the different types of databases.
  • Data modeling
  • Access
  • Security

3
Who am I
  • Bill Yoshimi
  • Office 106 Old Laundry Building
  • Phone 437-5986
  • Email yoshimi_at_cs.vassar.edu
  • Office Hours 400pm-500pm MW

4
Required Texts
  • Fundamentals of Database Systems,
    Elmasri/Navathe, Benjamin Cummings.

5
How course will be graded
  • 40 - Homework/programming assignments
  • 20 - Midterm
  • 30 - Final Exam

6
Programming assignments
  • One homework/programming assignment per week
    (consisting of 3 or 4 exercises).
  • Absolutely essential for learning the material.
  • Try to hand assignments in on time, future
    assignments depend on knowledge learned from
    previous assignments.

7
Late policy
  • All assignments are due at the start of class on
    the date specified.
  • Assignments arriving after the start of class but
    before the start of the next class will be
    accepted with a 10 penalty. Late assignments
    will not be accepted after this point.

8
Tests
  • Closed book and closed notes.
  • Each test is cumulative.
  • Tests will be announced later in semester.

9
Original authoring of materials
  • Talking to classmates about general ideas is OK.
  • Each student must do his or her programming
    assignment entirely by himself or herself.
  • Do not discuss or share programs with other
    students.
  • Vassar regulations require the professor to
    report suspected violations of academic integrity
    to the Dean of Studies.
  • Read the Originality and Attribution pamphlet.

10
  • Now, were finished with the administrative
    tasks, on with the course

11
Define database
  • A collection of related data, logically connected
    and organized.
  • A database management System (DBMS) enables users
    to create and use a database

12
Simplified Database System
Application Programs/Queries
Software to process queries/programs
Software to access stored data
13
Example
  • STUDENT file stores data on each student.
  • COURSE file stores data on eahc course
  • SECTION file stores data on each section of a
    course
  • GRADE_REPORT file stores grades that students
    receive in the various sections they complete
  • PREREQUISITE file stores prerequisites for each
    course.

14
STUDENT file
  • Each student has
  • Name
  • StudentID
  • Class
  • Major
  • Define the types of each of these

15
Look at relationships between these files
PREREQUISITE
COURSE
STUDENT
SECTION
GRADE_REPORT
16
Basic operations
  • Queries (list the names of all students enrolled
    in Databases)
  • Updates (change Smiths section from 1 to 2)
  • These are informal queries (must be fully
    specified before real query can be made against
    database)

17
Old way to access data
  • Flat file system.
  • Easiest to implement quickly.
  • Each system has its own copy of the data.
  • Every programmer must keep track of their own
    interfaces, must maintain consistency themselves.

18
Fundamental characteristic of a database
  • Should be self describing.
  • System catalog contains info about all data,
    fields, constraints and relationships between
    data. (Also called meta-data)
  • Database program should be application agnostic.
    (as long as relationships can be described by
    meta-data.)
  • Information is looked up according to meta-data
    definitions. In ordinary files, this information
    is encoded in accessor routines.

19
Databases provide
  • Program-data independence. Change in physical
    file structure should not change access routines.
  • Data abstraction all details about how data is
    stored is hidden (access is via data model.)
  • Multiple views subsets of data and virutal
    data.

20
Who accesses the database?
  • Database administrators (DBA) authorizing
    access, coordinating and monitoring use.
  • Database designers identify data and how it
    will be stored in DB, understand requirements,
    devise views.
  • End users
  • Casual end users use query language to access
    database.
  • Naïve or parametric end users use canned
    queries (little variation.)
  • Sophisticated end users thoroughly familiar
    with intricacy of database, make sophisticated
    queries.
  • System Analyst determine needs of Naïve and
    parameteric users
  • Application programs implement SA reqs.

21
Behind the scenes
  • DBMS designers and implementers maintain
    modules inside the database (query processors,
    data access, security, etc.)
  • Tool developers Facilitate design and improve
    performance. (performance monitoring tools,
    natural language interfaces, etc.)
  • Operators and maintenance personnel

22
Why use a DBMS?
  • Control redundancy
  • Shared data (concurrent access)
  • Restrict unauthorized access
  • Provide multiple interfaces to data
  • Represent complex relationships between data.
  • Enforce integrity constraints
  • Backup and recovery

23
Lesser advantages
  • Ability to enforce standards across organization.
  • Flexibility
  • Reduced development time for new applications.
  • Work with up to date data
  • Economies of scale

24
When not to use a DBMS
  • Database and application are simple, well
    defined, not expected to change often.
  • System has stringent real-time constraints that
    DBMS my not meet.
  • Multiple access to data is not required.
  • DBMS are high overhead systems (hardware,
    planning, training and maintenance).

25
DBMS Concepts and Architecture
26
Data model
  • Describes structure of the database (types,
    relationships and constraints.)
  • High level (conceptual) vs. low level (physical)
    vs. implementation data models
  • High level (object-based models)
  • Entity an object
  • Attribute some quality or quantity associated
    with an object
  • Relationship how objects are related
  • Implementation level
  • Relational, network, hierarchical.

27
Database Schema
  • Description of the database. Specified during
    design time and not expected to change.
  • Schema Diagram data model specific convention
    for displaying schemas.
  • Schema Construct single object description in a
    schema.

28
Schema Diagram example
STUDENT
Name StudentNumber Class Major
COURSE
CourseName Number CreditHour Dept
SECTION
SectionID CourseNumber Semester Year
29
Understand
  • Data changes frequently, data scheme, if well
    designed, should not change.
  • The data in the database at any time is called
    the database instance, occurance or state. Any
    insert, delete or change converts DB from one
    instance to another.
  • DBMS is responsible for verifying that each
    instance of the DB adheres to schema.
  • Scheme also called intension, instance
    extension of the schema.

30
Three Schema Architecture
  • Supports program data independence, multi-user
    views, and catalog to store DB schema.
  • Internal level has internal scheme physical
    structure of DB and access paths.
  • Conceptual level has conceptual schema
    describes entities, datatype, relationships, and
    constraints.
  • External or view level has multiple external
    schemas or user views. views for a particular
    audience.

31
Schemas are only descriptions of data
  • Data only exists at the physical level.
  • Mappings are used to convert from schema to
    schema (walk through an example).

32
How schema preserve data independence.
  • Logical data independence changes to conceptual
    scheme do not affect external schemas.
  • Physical data independence changes to internal
    schema (physical layer) do not affect conceptual
    schema.
  • In reality, difficult to implement (also multiple
    level mappings add overhead to queries.)

33
Interfaces to Databases
  • Languages
  • Data Definition Language (DDL) used to define
    internal and conceptual schemas
  • Storage Definition Language (SDL) mapping
    between conceptual and internal schema.
  • View Definition Language (VDL) defines user
    views and mapping to conceptual
  • Data Manipulation Language (DML) methods for
    retrieving, inserting, deleting and modifying
    data.
  • High-level (non procedural) set-at-a-time
  • Low-level (procedural) must be embedded.
    Record-at-a-time.

34
How languages are used
  • Data Manipulation calls are made either directly
    (query language) or embedded in a host language
    (data sublanguage)
  • Naïve and parametric users have user-friendly
    interfaces

35
User Friendly DBMS Interfaces
  • Menus, graphical interfaces, forms, natural
    language, command languages for parametric users,
    DBA interfaces
  • Are these really user friendly?

36
Classification of DBMS
  • Relational, network, hierarchical, other.
  • Single vs. multi-user
  • Number of sites (centralized or distributed).
  • Homo vs. heterogeneous (federated)
  • Cost
  • Types of access path
  • General vs. special purpose

37
Main criteria the data model
  • Relational data organized in tables, high level
    query language, limited form of user views.
    Conceptual and internal views are not
    distinguishable.
  • Network set of records and implements limited
    1N relationships. Must be embedded in host
    programming language.
  • Hierarchical include parent-child relationships.

38
Database design process
  • Collect requirements and analyze results.
  • Create conceptual schema (conceptual database
    design) (datatypes, relationships, and
    constraints.)
  • Implement database (data model mapping)
  • Design physical database (specify internal
    storage structures and file organizations)

39
Example a company database
  • Company is organized in departments. Each
    department has a name, a number and an employee
    who manages the department. Track when the
    employee started managing the department.
    Department may have several locations.
  • Departments have projects, each has a name, a
    number and a location.
  • Employees have SSN, address, salary, sex and
    birthdate. Employee is assigned to one
    department but may work on multiple projects.
    Track of hours worked per week and on which
    projects. Track employees direct supervisor.

40
Entity
  • A thing in the real world. Can be real or
    conceptual.
  • Entity has particular properties attributes
  • E.g.

E1
Name John Addr 1 J St, Apt 2, NY, NY
10002 Age 55 Phone 222 222 2222
41
Attributes
  • Atomic vs. composite.
  • Atomic or simple are not decomposable.
  • Composite attributes can form a hierarchy (street
    address number, street, apartment)
  • Composite attributes useful when parts are
    referenced as well as whole.

42
Attributes cont.
  • Single valued vs. multivalued.
  • Derivable attributes (age from BD, of employees
    by sum)
  • NULL- not applicable or unknown

43
Entity Types
  • Similar entities (having similar attributes but
    with different values)
  • Described using Entity Type Schema (intension)
    common structure shared by entities.
  • E.g. EMPLOYEE has Name, Age, Salary
  • Sets of instances valid at a given point in time
    is called an extension of the entity type.
  • E.g. (John Smith, 45, 80k) (Fred Brown, 32, 20K)
  • Entity type schema should not change extensions
    change often.

44
Key Attribute
  • Uniqueness constraint on attributes.
  • An entity type usually an attribute whose values
    are distinct for each individual entry.
  • Key attribute is used to uniquely identify an
    entity.
  • E.g. PERSON has SocialSecurityNumber, COMPANY
    has Name.
  • No two instances can have the same Key Attribute.
  • Some entities can have more than one Key
    Attribute.

45
Value Sets of Attributes
  • A Attribute
  • E Entity Type
  • V Value Set
  • AE -gt P(V)
  • P(V) is the Power Set of V the set of all subsets
    of V.

46
Value Sets of Attributes cont.
  • A(e) value of attribute A for entity e
  • A(e) is a singleton for SV attributes (has only
    one element)
  • A(e) may be empty set, single element or multiple
    element for multivalued attribute.
  • A(e) for composite attribute is cartesian product
    (all pairs mapping) of P(V1)xP(V2)xP(V3)
  • Use () and comma separted list to denote
    composite attribute.
  • Use and comma separated list to denote
    multi-valued attribute

47
Value Set of Entity cont.
  • E.g. if a person can have more than one residence
    and each residence can have more than one phone
    then the attribute AddressPhone
  • AddressPhone(Phone(AreaCode,PhoneNumber),
    Address(StreetAddress(Number, Street,
    ApartmentNumber),City,State,Zip))

48
Relationships
  • Relationship type R among N entities E1EN is a
    set of associations among these types.
  • R is a set of ri where each ri is an n tuple of
    (e1, e2, en) and each ej in ri is a member of
    entity type Ej 1ltjltN

49
Example
EMPLOYEE
DEPARTMENT
WORKS_FOR
e1 e2 e3 e4 e5 e6 e7
r1 r2 r3 r4 r5 r6 r7
d1 d2 d3 d4 d5
50
Degree of a Relationship Model
  • Degree is number of participating Entity Types.
  • In previous example, degree is 2 or binary. (most
    used form)
  • Ternary relationship type has three Entity Types.
    (holds more information than 3 binary
    relationships).
  • Connection trap can occur when 3 binary relations
    used instead of ternary relation.

51
Relations as Attributes
52
Example
53

54
Entity-Relationship Model
55
(No Transcript)
56
Example
Student Name StudentID Class Major
Smith 17 1 COSC
Brown 8 2 COSC
57
Quoting
  • Quoting by using (quote ) or protects the
    symbol that follows like (quote helloworld) or
    helloworld

58
Summary
  • State what has been learned
  • Define ways to apply training
  • Request feedback of training session

59
Where to Get More Information
  • Other training sessions
  • List books, articles, electronic sources
  • Consulting services, other sources
Write a Comment
User Comments (0)
About PowerShow.com