The Relational Data Model (Based on Chapter 5) - PowerPoint PPT Presentation

About This Presentation
Title:

The Relational Data Model (Based on Chapter 5)

Description:

If a relation has several candidate keys, one is chosen arbitrarily to be the primary key. ... 2. Insert Alicia', J', Zelaya', 999887777, 1960-04-05', 6357 ... – PowerPoint PPT presentation

Number of Views:168
Avg rating:3.0/5.0
Slides: 68
Provided by: UMR
Learn more at: http://web.mst.edu
Category:

less

Transcript and Presenter's Notes

Title: The Relational Data Model (Based on Chapter 5)


1
The Relational Data Model(Based on Chapter 5)
2
1. Relational Model Concepts
  • BASIS OF THE MODEL
  •  
  •  
  • The relational Model of Data is based on the
    concept of a Relation.
  •  
  • A Relation is a mathematical concept based on
    the ideas of sets.
  •  
  • The strength of the relational approach to
    data management comes from the formal foundation
    provided by the theory of relations.
  •  
  •  

3
INFORMAL DEFINITIONS
  • RELATION A table of values
  •  
  • A relation may be thought of as a set of rows.
  •  
  • A relation may alternately be though of as a
    set of columns.
  •  
  • Each row of the relation may be given an
    identifier.
  •  
  • Each column typically is called by its column
    name or column header or attribute name.

4
FORMAL DEFINITIONS
  • A Relation may be defined in multiple ways.
  •  
  • The Schema of a Relation R (A1, A2, .....An)
  • Relation R is defined over attributes A1, A2,
    .....An
  •  
  • For Example -
  • CUSTOMER (Cust-id, Cust-name, Address, Phone)
  •  
  • Here, CUSTOMER is a relation defined over the
    four attributes Cust-id, Cust-name, Address,
    Phone, each of which has a domain or a set of
    valid values.

5
  • For example, the domain of Cust-id is 6 digit
    numbers.
  • A tuple is an ordered set of values
  •  
  • Each value is derived from an appropriate
    domain.
  •  
  • Each row in the CUSTOMER table may be called
    as a tuple in the table and would consist of four
    values.
  •  
  • lt632895, "John Smith", "101 Main St. Atlanta, GA
    30332", "(404) 894-2000"gt is a triple belonging
    to the CUSTOMER relation.
  •  
  • A relation may be regarded as a set of tuples
    (rows).
  • Columns in a table are also called as
    attributes of the relation.

6
FORMAL DEFINITIONS (contd..)
  • The relation is formed over the cartesian
    product of the sets each set has values from a
    domain that domain is used in a specific role
    which is conveyed by the attribute name.
  • For example, attribute Cust-name is defined
    over the domain of strings of 25 characters. The
    role these strings play in the CUSTOMER relation
    is that of the name of customers.
  • Formally, Given R(A1, A2, .........., An)
  • r(R) subset-of dom (A1) X dom (A2) X ....X
    dom(An)
  •  
  • R schema of the relation
  • r of R a specific "value" or population of R.
  •  

7
  • R is also called the intension of a relation
  • r is also called the extension of a relation
  • Let S1 0,1
  • Let S2 a,b,c
  •   Let R be a subset-of S1 X S2
  •  
  • for example r(R) lt0.agt , lt0,bgt , lt1,cgt

8
DEFINITION SUMMARY
  • Informal Terms Formal Terms
  • Table Relation
  • Column Attribute/Domain
  • Row Tuple
  • Values in a column Domain
  • Table Definition Schema of Relation
  • Populated Table Extension

9
Figure 7.1 The attributes and tuples of a
relation STUDENT.
10
2 Characteristics of Relations
  • Ordering of tuples in a relation r(R) The tuples
    are not considered to be ordered, even though
    they appear to be in the tabular form.
  •  
  • Ordering of attributes in a relation schema R
    (and of values within each tuple) We will
    consider the attributes in R(A1, A2, ..., An) and
    the values in tltv1, v2, ..., vngt to be ordered .
  •  
  •  

11
  • Values in a tuple All values are considered
    atomic (indivisible). A special null value is
    used to represent values that are unknown or
    inapplicable to certain tuples.
  • Notation
  • - We refer to component values of a tuple t by
  • tAi vi (the value of attribute Ai for tuple
    t).
  • - Similarly, tAu, Av, ..., Aw refers to the
    subtuple of t containing the values of attributes
    Au, Av, ..., Aw, respectively.

12
Figure 7.2 The relation STUDENT from Figure 7.1,
with a different order of tuples
13
3 Relational Integrity Constraints
  • Constraints are conditions that must hold on all
    valid relation instances. There are three main
    types of constraints
  • Key constraints
  • Entity integrity constraints,
  • Referential integrity constraints

14
3.1 Key Constraints
  • Superkey of R A set of attributes SK of R such
    that no two tuples in any valid relation instance
    r(R) will have the same value for SK. That is,
    for any distinct tuples t1 and t2 in r(R), t1SK
    ltgt t2SK.
  • Key of R A "minimal" superkey that is, a
    superkey K such that removal of any attribute
    from K results in a set of attributes that is not
    a superkey.

15
  • Example The CAR relation schema
  • CAR(State, Reg, SerialNo, Make, Model, Year)
  • has two keys Key1 State, Reg, Key2
    SerialNo,
  • which are also superkeys. SerialNo, Make is a
    superkey but not a key.
  • If a relation has several candidate keys, one is
    chosen arbitrarily to be the primary key.
  • The primary key attributes are underlined.

16
Figure 7.4 The CAR relation with two candidate
keys LicenseNumber and EngineSerialNumber.
17
Figure 7.5 Schema diagram for the COMPANY
relational database schema the primary keys are
underlined.
18
Figure 7.5 Schema diagram for the COMPANY
relational database schema the primary keys are
underlined.
19
Figure 7.6 (continued)
20
3.2 Entity Integrity
  • Relational Database Schema A set S of relation
    schemas that belong to the same database. S is
    the name of the database.
  • S R1, R2, ..., Rn
  •  
  • Entity Integrity The primary key attributes PK
    of each relation schema R in S cannot have null
    values in any tuple of r(R). This is because
    primary key values are used to identify the
    individual tuples.
  • tPK ltgt null for any tuple t in r(R)
  •  
  • Note Other attributes of R may be similarly
    constrained to disallow null values, even though
    they are not members of the primary key.

21
3.3 Referential Integrity
  • A constraint involving two relations (the
    previous constraints involve a single relation).
  •  
  • Used to specify a relationship among tuples in
    two relations the referencing relation and the
    referenced relation.
  •  
  • Tuples in the referencing relation R1 have
    attributes FK (called foreign key attributes)
    that reference the primary key attributes PK of
    the referenced relation
  • R2. A tuple t1 in R1 is said to reference a tuple
    t2 in R2 if t1FK t2PK.
  •  
  • A referential integrity constraint can be
    displayed in a relational database schema as a
    directed arc from R1.FK to R2.

22
Figure 7.7 Referential integrity constraints
displayed on the COMPANY relational database
schema diagram.
23
Figure 7.6 One possible relational database
state corresponding to the company schema.
24
4 Update Operations on Relations
  • - INSERT a tuple.
  • - DELETE a tuple.
  • - MODIFY a tuple.
  •  
  • - Integrity constraints should not be
    violated by the update operations.
  •  
  • - Several update operations may have to be
    grouped together.
  •  
  • - Updates may propagate to cause other
    updates automatically. This may be necessary to
    maintain integrity constraints.
  •  
  • -

25
  • In case of integrity violation, several actions
    can be taken
  • - cancel the operation that causes the violation
    (REJECT optiom)
  • - perform the operation but inform the user of
    the violation
  • - trigger additional updates so the violation is
    corrected (CASCADE option, SET NULL option)
  • - execute a user-specified error-correction
    routine

26
5 The Relational Algebra
  • - Operations to manipulate relations.
  • - Used to specify retrieval requests
    (queries).
  • - Query result is in the form of a relation.
  •  
  • Relational Operations
  • 5.1 SELECT s and PROJECT P
    operations.
  • 5.2 Set operations These include
    UNION U, INTERSECTION ,
  • DIFFERENCE -, CARTESIAN
    PRODUCT X.
  • 5.3 JOIN operations X.
  • 5.4 Other relational operations DIVISION,
    OUTER JOIN, AGGREGATE FUNCTIONS.

27
5.1 SELECT s and PROJECT P
  • SELECT operation (denoted bys )
  •  
  • - Selects the tuples (rows) from a relation
    R that satisfy a certain selection condition c
  • - Form of the operation s c(R)
  •  
  • - The condition c is an arbitrary Boolean
    expression on the attributes of R
  •  
  • - Resulting relation has the same attributes
    as R
  •  
  • - Resulting relation includes each tuple in
    r(R) whose attribute values satisfy the condition
    c
  •  

28
  • Examples
  • s DNO4 (EMPLOYEE)
  • s SALARYgt30000 (EMPLOYEE)
  • s(DNO4 AND SALARYgt25000) OR DNO5(EMPLOYEE)

29
  • PROJECT operation (denoted byP )
  •  
  • - Keeps only certain attributes (columns)
    from a relation R specified in an attribute list
    L
  •  
  • - Form of operation P L(R)
  •  
  • - Resulting relation has only those
    attributes of R specified in L
  •  
  • Example P FNAME,LNAME,SALARY(EMPLOYEE)
  •  
  • - The PROJECT operation eliminates duplicate
    tuples in the resulting relation so that it
    remains a mathematical set (no duplicate
    elements)

30
  • Example P SEX,SALARY(EMPLOYEE)
  • If several male employees have salary 30000, only
    a single tuple ltM, 30000gt is
  • kept in the resulting relation.
  • Duplicate tuples are eliminated by the P
    operation.

31
  • Sequences of operations
  •  
  • - Several operations can be combined to form
    a relational algebra expression (query)
  • Example Retrieve the names and salaries of
    employees who work in department 4
  • P FNAME,LNAME,SALARY (s DNO4(EMPLOYEE) )
  •  
  • - Alternatively, we specify explicit
    intermediate relations for each step
  • DEPT4_EMPS lt-s DNO4(EMPLOYEE)
  • R lt-P FNAME,LNAME,SALARY(DEPT4_EMPS)
  •  
  • -

32
  • Attributes can optionally be renamed in the
    resulting left-hand-side relation (this may be
    required for some operations that will be
    presented later)
  • DEPT4_EMPS lt-s DNO4(EMPLOYEE)
  • R(FIRSTNAME,LASTNAME,SALARY) lt-
  • P FNAME,LNAME,SALARY(DEPT4_EMPS)

33
Figure 7.8 Results of SELECT and PROJECT
operations.(a) ?(DNO4 AND SALARYgt25000) OR
(DNO5 AND SALARYgt30000)(EMPLOYEE).(b) ?LNAME,
FNAME, SALARY(EMPLOYEE). (c) ?SEX,
SALARY(EMPLOYEE).
34
Class Number CS 304
Class Name - DBMS
Instructor Sanjay Madria
Lesson Title Relational Algebra 3rd July
35
5.2 Set Operations
  • - Binary operations from mathematical set
    theory
  • UNION R1 U R2,
  • INTERSECTION R1 R2,
  • SET DIFFERENCE R1 - R2,
  • CARTESIAN PRODUCT R1 X R2.
  •  
  • - For U, , -, the operand relations R1(A1,
    A2, ..., An) and R2(B1, B2, ..., Bn) must have
    the same number of attributes, and the domains of
    corresponding attributes must be compatible that
    is, dom(Ai)dom(Bi) for i1, 2, ..., n. This
    condition is called union compatibility.
  •  
  • The resulting relation for U, , or -
    has the same attribute names as the first
    operand relation R1 (by convention).

36
Figure 7.10 Query result after the UNION
operationRESULT ? RESULT1 ? RESULT2
37
Figure 7.11 Illustrating the set operations
union, intersection and difference.(a) Two union
compatible relations.(b) STUDENT ? INSTRUCTOR.
(c) STUDENT ? INSTRUCTOR(d) STUDENT - INSTRUCTOR
(e) INSTRUCTOR - STUDENT
38
CARTESIAN PRODUCT
  • R(A1, A2, ..., Am, B1, B2, ..., Bn) lt-
  • R1(A1, A2, ..., Am) X R2 (B1, B2, ...,
    Bn)
  •  
  • - A tuple t exists in R for each combination
    of tuples t1 from R1 and t2 from R2 such that
  • tA1, A2, ..., Amt1 and tB1, B2, ...,
    Bnt2
  •  
  • - If R1 has n1 tuples and R2 has n2 tuples,
    then R will have n1n2 tuples.
  •  
  • - CARTESIAN PRODUCT can combine related
    tuples from two relations if followed by the
    appropriate SELECT operation .
  •  

39
  • Example Combine each DEPARTMENT tuple with the
    EMPLOYEE tuple of the manager.
  • DEP_EMP lt-DEPARTMENT X EMPLOYEE
  • DEPT_MANAGER lt-s MGRSSNSSN(DEP_EMP)

40
5.3 JOIN Operations
  • THETA JOIN Similar to a CARTESIAN PRODUCT
    followed by a SELECT.
  • The condition c is called a join condition.
  • R(A1, A2, ..., Am, B1, B2, ..., Bn) lt-R1(A1, A2,
    ..., Am) X c R2 (B1, B2, ..., Bn)
  • Here c can be lt, gt, , lt, gt
  •  
  • EQUIJOIN The join condition c includes one or
    more equality comparisons involving attributes
    from R1 and R2. That is, c is of the form
  • (AiBj) AND ... AND (AhBk) 1lti,hltm, 1ltj,kltn
  •  
  • In the above EQUIJOIN operation
  • Ai, ..., Ah are called the join attributes of R1
  • Bj, ..., Bk are called the join attributes of R2
  •  

41
  • Example of using EQUIJOIN
  • Retrieve each DEPARTMENT's name and its manager's
    name
  • T lt-DEPARTMENT X MGRSSNSSN EMPLOYEE
  • RESULT lt-P DNAME,FNAME,LNAME(T)

42
  • NATURAL JOIN ()
  • In an EQUIJOIN R lt- R1 X c R2, the join
    attribute of R2 appear redundantly in the result
    relation R. In a NATURAL JOIN, the redundant join
    attributes of R2 are eliminated from R. The
    equality condition is implied and need not be
    specified.
  • R lt- R1 (join attributes of R1),(join attributes
    of R2) R2
  • Example Retrieve each EMPLOYEE's name and the
    name of the DEPARTMENT he/she
  • works for
  • Tlt- EMPLOYEE (DNO),(DNUMBER) DEPARTMENT
  • RESULT lt-P FNAME,LNAME,DNAME(T)
  •  

43
  • If the join attributes have the same names in
    both relations, they need not be specified and
    we can write R lt- R1 R2.
  • Example Retrieve each EMPLOYEE's name and the
    name of his/her SUPERVISOR
  • SUPERVISOR(SUPERSSN,SFN,SLN)lt-P
    SSN,FNAME,LNAME(EMPLOYEE)
  • Tlt-EMPLOYEE SUPERVISOR
  • RESULT lt-P FNAME,LNAME,SFN,SLN(T)

44
Figure 7.13 Illustrating the JOIN operation.
45
Figure 7.14 An illustration of the NATURAL JOIN
operation. (a) PROJ_DEPT ? PROJECT DEPT.(b)
DEPT_LOCS ? DEPARTMENT DEPT_LOCATIONS.
46
  • Note In the original definition of NATURAL
    JOIN, the join attributes were required to have
    the same names in both relations.
  •  
  • There can be a more than one set of join
    attributes with a different meaning between the
    same two relations. For example
  •  
  • JOIN ATTRIBUTES RELATIONSHIP
  • EMPLOYEE.SSN EMPLOYEE manages
  • DEPARTMENT.MGRSSN the DEPARTMENT
  • EMPLOYEE.DNO EMPLOYEE works for
  • DEPARTMENT.DNUMBER the DEPARTMENT
  •  

47
  • A relation can have a set of join attributes to
    join it with itself
  •  
  • JOIN ATTRIBUTES RELATIONSHIP
  • EMPLOYEE(1).SUPERSSN EMPLOYEE(2) supervises
  • EMPLOYEE(2).SSN EMPLOYEE(1)
  •  
  • - One can think of this as joining two
    distinct copies of the relation, although only
    one relation actually exists
  •  
  • - In this case, renaming can be useful
  •  

48
Figure 7.15 Illustrating the division
operation.(a) Dividing SSN_PNOS by SMITH_PNOS.
(b) T ? R ? S.
49
  • Complete Set of Relational Algebra Operations
  •  
  • - All the operations discussed so far can
    be described as a sequence of only the
    operations SELECT, PROJECT, UNION, SET
    DIFFERENCE, and CARTESIAN PRODUCT.
  •  
  • - Hence, the set s ,P , U, - , X is
    called a complete set of relational algebra
    operations. Any query language equivalent to
    these operations is called relationally complete.
  •  
  • - For database applications, additional
    operations are needed that were not part of the
    original relational algebra. These include
  • 1. Aggregate functions and grouping.
  • 2. OUTER JOIN.

50
5.4 Additional Relational Operations
  • AGGREGATE FUNCTIONS
  • - Functions such as SUM, COUNT, AVERAGE,
    MIN, MAX are often applied to sets of values or
    sets of tuples in database applications
  • ltgrouping attributesgt Fltfunction listgt (R)
  •  
  • - The grouping attributes are optional
  •  
  • Example 1 Retrieve the average salary of all
    employees (no grouping needed)
  • R(AVGSAL) lt- F AVERAGE SALARY (EMPLOYEE)
  •  

51
  • Example 2 For each department, retrieve the
    department number, the number of employees, and
    the average salary (in the department)
  • R(DNO,NUMEMPS,AVGSAL) lt-
  • DNO F COUNT SSN, AVERAGE SALARY (EMPLOYEE)
  •  
  • DNO is called the grouping attribute in the
    above example

52
Figure 7.16 An illustration of the AGGREGATE
FUNCTION operation.(a) R(DNO, NO_OF_EMPLOYEES,
AVERAGE_SAL) ? DNO ? COUNT SSN, AVERAGE
SALARY(EMPLOYEE). (b) DNO ? COUNT SSN, AVERAGE
SALARY(EMPLOYEE).(C) ? COUNT SSN, AVERAGE
SALARY(EMPLOYEE).
53
  • OUTER JOIN
  •  
  • - In a regular EQUIJOIN or NATURAL JOIN
    operation, tuples in R1 or R2 that do not have
    matching tuples in the other relation do not
    appear in the result
  •  
  • - Some queries require all tuples in R1 (or
    R2 or both) to appear in the result
  •  
  • - When no matching tuples are found, nulls
    are placed for the missing attributes
  •  
  • -

54
  • LEFT OUTER JOIN R1 X R2 lets every tuple in
    R1 appear in the result
  •  
  • - RIGHT OUTER JOIN R1 X R2 lets every
    tuple in R2 appear in the result
  •  
  • - FULL OUTER JOIN R1 X R2 lets every
    tuple in R1 or R2 appear in the result

55
Figure 7.18 The LEFT OUTER JOIN operation.
56
RENAME Operator
  • ?S(B1, B2,Bn) (R) Renaming relation R as S
    and renaming attributes of R as Bis.
  •  
  • ?S(R) Renaming R as S
  •  
  • ?(B1, B2,Bn) (R) Renaming attributes of R as
    Bis.
  •  

57
  • Some Queries
  •  
  • Q. Retrieve the SSNs of all the employees who
    either work in dept. 5 or supervise an employee
    who works in dept 5.
  •  
  • Dept-Emps ??DNO 5 (Employee)
  • Result1 ? ?SSN (Dept-Emps)
  • Result 2(SSN) ? ?SuperSSN (Dept-Emps)
  •  
  • Result Result 1 ? Result 2

58
  • Q. Find for each female employee, a list of names
    of her dependents.
  •  
  • Female-Emp ??SEX F (Employee)
  • Empname ? ?FNAME, LNAME, SSN (Female-Emp)
  • Emp-dep ? Empname ? Dependent
  • Actual-dep ??SSN ESSN (Emp-dep)
  • Result ? ?FNAME, LNAME, Dependent-name
    (Actual-dep)
  •  
  • Q. Find the name of the manager of each
    department.
  •  
  • Dept-mgr ? Department ?MGRSSN SSN (Employee)
  • Result ? ?DNAME, LNAME, FNAME (Dept-mgr)

59
  • List names of managers who have atleast one
    dependent
  • Find names of employees who have no dependents
  • List names of all employees with two or more
    dependents

60
Figure 7.12 An illustration of the CARTESIAN
PRODUCT operation.
61
Figure 7.12 (continued)
62
The Insert Operation
  • 1. Insert ltCecilia, F, Kolonsky, null,
    1960-04-05, 6357 Windy Lane, Katy, TX, F,
    28000, null, 4gt into EMPLOYEE
  • This insertion violates the entity integrity
    constraint (null for the primary key SSN), so it
    is rejected.
  • 2. Insert ltAlicia, J, Zelaya, 999887777,
    1960-04-05, 6357 Windy Lane, Katy, TX, F,
    28000, 987654321, 4gt into EMPLOYEE
  • This insertion violates the key constraint
    because another tuple with the same SSN Value
    already exists in the EMPLOYEE relation, and so
    it is rejected.

63
The Insert Operation (contd.)
  • 3. Insert ltCecilia, F, Kolonsky,
    67768989, 1960-04-05, 6357 Windswept, katy,
    TX, F, 28000, 987654321, 7gt into EMPLOYEE
  • This insertion violates the referential
    integrity constraint specified on DNO because no
    DEPARTMENT tuple exists with DNUMBER 7
  • 4. Insert ltCecilia, F, Kolonsky,
    67768989, 1960-04-05, 6357 Windy Lane, Katy,
    TX, F, 28000, null, 4gt into EMPLOYEE.
  • This insertion satisfies all constraints, so it
    is acceptable.

64
The Delete Operation
  • 1. Delete the WORKS_ON tuple with ESSN
    999887777 and PNO 10.
  • This deletion is acceptable
  • 2. Delete the EMPLOYEE tuple with SSN
    999887777
  • This deletion is not acceptable, because tuples
    in WORKS_ON refer to this tuple. Hence, if the
    tuple is deleted, referential integrity
    violations will result.

65
The Delete Operation (contd.)
  • 3. Delete the EMPLOYEE tuple with SSN
    333445555
  • This deletion will result in even worse
    referential integrity violations, because the
    tuple involved is referenced by tuples from the
    EMPLOYEE, DEPARTMENT, WORKS_ON and DEPENDENT
    relations.
  • Options
  • Reject deletion
  • Propagate deletion by deleting other tuples
  • Modify the referencing attributes that cause the
    violations
  • Or combination of three above

66
The Update Operation
  • 1. Update the SALARY of the EMPLOYEE tuple with
    SSN 999887777 to 28000
  • Acceptable
  • 2. Update the DNO of the EMPLOYEE tuple with SSN
    999887777 to 1
  • Acceptable
  • 3. Update the DNO of the EMPLOYEE tuple with SSN
    999887777 to 7
  • Unacceptable, because it violates referential
    integrity.

67
The Update Operation (contd.)
  • 4. Update the SSN of the EMPLOYEE tuple with SSN
    999887777 to 987654321
  • Unacceptable, because it violates primary key
    and referential integrity constraints.
  • Updating an attribute which is neither primary
    key or foreign key usually causes no problems,
    new value should be of correct data type and
    domain.
Write a Comment
User Comments (0)
About PowerShow.com