Title: The Relational Data Model (Based on Chapter 5)
1The Relational Data Model(Based on Chapter 5)
21. Relational Model Concepts
- BASIS OF THE MODEL
-
-
- The relational Model of Data is based on the
concept of a Relation. -
- A Relation is a mathematical concept based on
the ideas of sets. -
- The strength of the relational approach to
data management comes from the formal foundation
provided by the theory of relations. -
-
3INFORMAL DEFINITIONS
- RELATION A table of values
-
- A relation may be thought of as a set of rows.
-
- A relation may alternately be though of as a
set of columns. -
- Each row of the relation may be given an
identifier. -
- Each column typically is called by its column
name or column header or attribute name.
4FORMAL DEFINITIONS
- A Relation may be defined in multiple ways.
-
- The Schema of a Relation R (A1, A2, .....An)
- Relation R is defined over attributes A1, A2,
.....An -
- For Example -
- CUSTOMER (Cust-id, Cust-name, Address, Phone)
-
- Here, CUSTOMER is a relation defined over the
four attributes Cust-id, Cust-name, Address,
Phone, each of which has a domain or a set of
valid values.
5- For example, the domain of Cust-id is 6 digit
numbers. - A tuple is an ordered set of values
-
- Each value is derived from an appropriate
domain. -
- Each row in the CUSTOMER table may be called
as a tuple in the table and would consist of four
values. -
- lt632895, "John Smith", "101 Main St. Atlanta, GA
30332", "(404) 894-2000"gt is a triple belonging
to the CUSTOMER relation. -
- A relation may be regarded as a set of tuples
(rows). - Columns in a table are also called as
attributes of the relation.
6FORMAL DEFINITIONS (contd..)
- The relation is formed over the cartesian
product of the sets each set has values from a
domain that domain is used in a specific role
which is conveyed by the attribute name. - For example, attribute Cust-name is defined
over the domain of strings of 25 characters. The
role these strings play in the CUSTOMER relation
is that of the name of customers. - Formally, Given R(A1, A2, .........., An)
- r(R) subset-of dom (A1) X dom (A2) X ....X
dom(An) -
- R schema of the relation
- r of R a specific "value" or population of R.
-
7- R is also called the intension of a relation
- r is also called the extension of a relation
-
- Let S1 0,1
- Let S2 a,b,c
- Let R be a subset-of S1 X S2
-
- for example r(R) lt0.agt , lt0,bgt , lt1,cgt
8DEFINITION SUMMARY
- Informal Terms Formal Terms
- Table Relation
- Column Attribute/Domain
- Row Tuple
- Values in a column Domain
- Table Definition Schema of Relation
- Populated Table Extension
9Figure 7.1 The attributes and tuples of a
relation STUDENT.
102 Characteristics of Relations
- Ordering of tuples in a relation r(R) The tuples
are not considered to be ordered, even though
they appear to be in the tabular form. -
- Ordering of attributes in a relation schema R
(and of values within each tuple) We will
consider the attributes in R(A1, A2, ..., An) and
the values in tltv1, v2, ..., vngt to be ordered . -
-
-
11- Values in a tuple All values are considered
atomic (indivisible). A special null value is
used to represent values that are unknown or
inapplicable to certain tuples. - Notation
- - We refer to component values of a tuple t by
- tAi vi (the value of attribute Ai for tuple
t). - - Similarly, tAu, Av, ..., Aw refers to the
subtuple of t containing the values of attributes
Au, Av, ..., Aw, respectively.
12Figure 7.2 The relation STUDENT from Figure 7.1,
with a different order of tuples
133 Relational Integrity Constraints
- Constraints are conditions that must hold on all
valid relation instances. There are three main
types of constraints - Key constraints
- Entity integrity constraints,
- Referential integrity constraints
143.1 Key Constraints
- Superkey of R A set of attributes SK of R such
that no two tuples in any valid relation instance
r(R) will have the same value for SK. That is,
for any distinct tuples t1 and t2 in r(R), t1SK
ltgt t2SK. - Key of R A "minimal" superkey that is, a
superkey K such that removal of any attribute
from K results in a set of attributes that is not
a superkey.
15- Example The CAR relation schema
- CAR(State, Reg, SerialNo, Make, Model, Year)
- has two keys Key1 State, Reg, Key2
SerialNo, - which are also superkeys. SerialNo, Make is a
superkey but not a key. - If a relation has several candidate keys, one is
chosen arbitrarily to be the primary key. - The primary key attributes are underlined.
16Figure 7.4 The CAR relation with two candidate
keys LicenseNumber and EngineSerialNumber.
17Figure 7.5 Schema diagram for the COMPANY
relational database schema the primary keys are
underlined.
18Figure 7.5 Schema diagram for the COMPANY
relational database schema the primary keys are
underlined.
19Figure 7.6 (continued)
203.2 Entity Integrity
- Relational Database Schema A set S of relation
schemas that belong to the same database. S is
the name of the database. - S R1, R2, ..., Rn
-
- Entity Integrity The primary key attributes PK
of each relation schema R in S cannot have null
values in any tuple of r(R). This is because
primary key values are used to identify the
individual tuples. - tPK ltgt null for any tuple t in r(R)
-
- Note Other attributes of R may be similarly
constrained to disallow null values, even though
they are not members of the primary key.
213.3 Referential Integrity
- A constraint involving two relations (the
previous constraints involve a single relation). -
- Used to specify a relationship among tuples in
two relations the referencing relation and the
referenced relation. -
- Tuples in the referencing relation R1 have
attributes FK (called foreign key attributes)
that reference the primary key attributes PK of
the referenced relation - R2. A tuple t1 in R1 is said to reference a tuple
t2 in R2 if t1FK t2PK. -
- A referential integrity constraint can be
displayed in a relational database schema as a
directed arc from R1.FK to R2.
22Figure 7.7 Referential integrity constraints
displayed on the COMPANY relational database
schema diagram.
23Figure 7.6 One possible relational database
state corresponding to the company schema.
244 Update Operations on Relations
- - INSERT a tuple.
- - DELETE a tuple.
- - MODIFY a tuple.
-
- - Integrity constraints should not be
violated by the update operations. -
- - Several update operations may have to be
grouped together. -
- - Updates may propagate to cause other
updates automatically. This may be necessary to
maintain integrity constraints. -
- -
25- In case of integrity violation, several actions
can be taken - - cancel the operation that causes the violation
(REJECT optiom) - - perform the operation but inform the user of
the violation - - trigger additional updates so the violation is
corrected (CASCADE option, SET NULL option) - - execute a user-specified error-correction
routine
265 The Relational Algebra
- - Operations to manipulate relations.
- - Used to specify retrieval requests
(queries). - - Query result is in the form of a relation.
-
- Relational Operations
- 5.1 SELECT s and PROJECT P
operations. - 5.2 Set operations These include
UNION U, INTERSECTION , - DIFFERENCE -, CARTESIAN
PRODUCT X. - 5.3 JOIN operations X.
- 5.4 Other relational operations DIVISION,
OUTER JOIN, AGGREGATE FUNCTIONS.
275.1 SELECT s and PROJECT P
- SELECT operation (denoted bys )
-
- - Selects the tuples (rows) from a relation
R that satisfy a certain selection condition c - - Form of the operation s c(R)
-
- - The condition c is an arbitrary Boolean
expression on the attributes of R -
- - Resulting relation has the same attributes
as R -
- - Resulting relation includes each tuple in
r(R) whose attribute values satisfy the condition
c -
-
28- Examples
- s DNO4 (EMPLOYEE)
- s SALARYgt30000 (EMPLOYEE)
- s(DNO4 AND SALARYgt25000) OR DNO5(EMPLOYEE)
29- PROJECT operation (denoted byP )
-
- - Keeps only certain attributes (columns)
from a relation R specified in an attribute list
L -
- - Form of operation P L(R)
-
- - Resulting relation has only those
attributes of R specified in L -
- Example P FNAME,LNAME,SALARY(EMPLOYEE)
-
- - The PROJECT operation eliminates duplicate
tuples in the resulting relation so that it
remains a mathematical set (no duplicate
elements) -
30- Example P SEX,SALARY(EMPLOYEE)
- If several male employees have salary 30000, only
a single tuple ltM, 30000gt is - kept in the resulting relation.
- Duplicate tuples are eliminated by the P
operation.
31- Sequences of operations
-
- - Several operations can be combined to form
a relational algebra expression (query) - Example Retrieve the names and salaries of
employees who work in department 4 - P FNAME,LNAME,SALARY (s DNO4(EMPLOYEE) )
-
- - Alternatively, we specify explicit
intermediate relations for each step - DEPT4_EMPS lt-s DNO4(EMPLOYEE)
- R lt-P FNAME,LNAME,SALARY(DEPT4_EMPS)
-
- -
32- Attributes can optionally be renamed in the
resulting left-hand-side relation (this may be
required for some operations that will be
presented later) - DEPT4_EMPS lt-s DNO4(EMPLOYEE)
- R(FIRSTNAME,LASTNAME,SALARY) lt-
- P FNAME,LNAME,SALARY(DEPT4_EMPS)
33Figure 7.8 Results of SELECT and PROJECT
operations.(a) ?(DNO4 AND SALARYgt25000) OR
(DNO5 AND SALARYgt30000)(EMPLOYEE).(b) ?LNAME,
FNAME, SALARY(EMPLOYEE). (c) ?SEX,
SALARY(EMPLOYEE).
34Class Number CS 304
Class Name - DBMS
Instructor Sanjay Madria
Lesson Title Relational Algebra 3rd July
355.2 Set Operations
- - Binary operations from mathematical set
theory - UNION R1 U R2,
- INTERSECTION R1 R2,
- SET DIFFERENCE R1 - R2,
- CARTESIAN PRODUCT R1 X R2.
-
- - For U, , -, the operand relations R1(A1,
A2, ..., An) and R2(B1, B2, ..., Bn) must have
the same number of attributes, and the domains of
corresponding attributes must be compatible that
is, dom(Ai)dom(Bi) for i1, 2, ..., n. This
condition is called union compatibility. -
- The resulting relation for U, , or -
has the same attribute names as the first
operand relation R1 (by convention).
36Figure 7.10 Query result after the UNION
operationRESULT ? RESULT1 ? RESULT2
37Figure 7.11 Illustrating the set operations
union, intersection and difference.(a) Two union
compatible relations.(b) STUDENT ? INSTRUCTOR.
(c) STUDENT ? INSTRUCTOR(d) STUDENT - INSTRUCTOR
(e) INSTRUCTOR - STUDENT
38CARTESIAN PRODUCT
- R(A1, A2, ..., Am, B1, B2, ..., Bn) lt-
- R1(A1, A2, ..., Am) X R2 (B1, B2, ...,
Bn) -
- - A tuple t exists in R for each combination
of tuples t1 from R1 and t2 from R2 such that - tA1, A2, ..., Amt1 and tB1, B2, ...,
Bnt2 -
- - If R1 has n1 tuples and R2 has n2 tuples,
then R will have n1n2 tuples. -
- - CARTESIAN PRODUCT can combine related
tuples from two relations if followed by the
appropriate SELECT operation . -
39- Example Combine each DEPARTMENT tuple with the
EMPLOYEE tuple of the manager. - DEP_EMP lt-DEPARTMENT X EMPLOYEE
- DEPT_MANAGER lt-s MGRSSNSSN(DEP_EMP)
405.3 JOIN Operations
- THETA JOIN Similar to a CARTESIAN PRODUCT
followed by a SELECT. - The condition c is called a join condition.
- R(A1, A2, ..., Am, B1, B2, ..., Bn) lt-R1(A1, A2,
..., Am) X c R2 (B1, B2, ..., Bn) - Here c can be lt, gt, , lt, gt
-
- EQUIJOIN The join condition c includes one or
more equality comparisons involving attributes
from R1 and R2. That is, c is of the form - (AiBj) AND ... AND (AhBk) 1lti,hltm, 1ltj,kltn
-
- In the above EQUIJOIN operation
- Ai, ..., Ah are called the join attributes of R1
- Bj, ..., Bk are called the join attributes of R2
-
41- Example of using EQUIJOIN
- Retrieve each DEPARTMENT's name and its manager's
name - T lt-DEPARTMENT X MGRSSNSSN EMPLOYEE
- RESULT lt-P DNAME,FNAME,LNAME(T)
42- NATURAL JOIN ()
- In an EQUIJOIN R lt- R1 X c R2, the join
attribute of R2 appear redundantly in the result
relation R. In a NATURAL JOIN, the redundant join
attributes of R2 are eliminated from R. The
equality condition is implied and need not be
specified. - R lt- R1 (join attributes of R1),(join attributes
of R2) R2 - Example Retrieve each EMPLOYEE's name and the
name of the DEPARTMENT he/she - works for
- Tlt- EMPLOYEE (DNO),(DNUMBER) DEPARTMENT
- RESULT lt-P FNAME,LNAME,DNAME(T)
-
43- If the join attributes have the same names in
both relations, they need not be specified and
we can write R lt- R1 R2. - Example Retrieve each EMPLOYEE's name and the
name of his/her SUPERVISOR - SUPERVISOR(SUPERSSN,SFN,SLN)lt-P
SSN,FNAME,LNAME(EMPLOYEE) - Tlt-EMPLOYEE SUPERVISOR
- RESULT lt-P FNAME,LNAME,SFN,SLN(T)
44Figure 7.13 Illustrating the JOIN operation.
45Figure 7.14 An illustration of the NATURAL JOIN
operation. (a) PROJ_DEPT ? PROJECT DEPT.(b)
DEPT_LOCS ? DEPARTMENT DEPT_LOCATIONS.
46- Note In the original definition of NATURAL
JOIN, the join attributes were required to have
the same names in both relations. -
- There can be a more than one set of join
attributes with a different meaning between the
same two relations. For example -
- JOIN ATTRIBUTES RELATIONSHIP
- EMPLOYEE.SSN EMPLOYEE manages
- DEPARTMENT.MGRSSN the DEPARTMENT
- EMPLOYEE.DNO EMPLOYEE works for
- DEPARTMENT.DNUMBER the DEPARTMENT
-
47- A relation can have a set of join attributes to
join it with itself -
- JOIN ATTRIBUTES RELATIONSHIP
- EMPLOYEE(1).SUPERSSN EMPLOYEE(2) supervises
- EMPLOYEE(2).SSN EMPLOYEE(1)
-
- - One can think of this as joining two
distinct copies of the relation, although only
one relation actually exists -
- - In this case, renaming can be useful
-
48Figure 7.15 Illustrating the division
operation.(a) Dividing SSN_PNOS by SMITH_PNOS.
(b) T ? R ? S.
49- Complete Set of Relational Algebra Operations
-
- - All the operations discussed so far can
be described as a sequence of only the
operations SELECT, PROJECT, UNION, SET
DIFFERENCE, and CARTESIAN PRODUCT. -
- - Hence, the set s ,P , U, - , X is
called a complete set of relational algebra
operations. Any query language equivalent to
these operations is called relationally complete. -
- - For database applications, additional
operations are needed that were not part of the
original relational algebra. These include - 1. Aggregate functions and grouping.
- 2. OUTER JOIN.
505.4 Additional Relational Operations
- AGGREGATE FUNCTIONS
- - Functions such as SUM, COUNT, AVERAGE,
MIN, MAX are often applied to sets of values or
sets of tuples in database applications - ltgrouping attributesgt Fltfunction listgt (R)
-
- - The grouping attributes are optional
-
- Example 1 Retrieve the average salary of all
employees (no grouping needed) - R(AVGSAL) lt- F AVERAGE SALARY (EMPLOYEE)
-
51- Example 2 For each department, retrieve the
department number, the number of employees, and
the average salary (in the department) - R(DNO,NUMEMPS,AVGSAL) lt-
- DNO F COUNT SSN, AVERAGE SALARY (EMPLOYEE)
-
- DNO is called the grouping attribute in the
above example
52Figure 7.16 An illustration of the AGGREGATE
FUNCTION operation.(a) R(DNO, NO_OF_EMPLOYEES,
AVERAGE_SAL) ? DNO ? COUNT SSN, AVERAGE
SALARY(EMPLOYEE). (b) DNO ? COUNT SSN, AVERAGE
SALARY(EMPLOYEE).(C) ? COUNT SSN, AVERAGE
SALARY(EMPLOYEE).
53- OUTER JOIN
-
- - In a regular EQUIJOIN or NATURAL JOIN
operation, tuples in R1 or R2 that do not have
matching tuples in the other relation do not
appear in the result -
- - Some queries require all tuples in R1 (or
R2 or both) to appear in the result -
- - When no matching tuples are found, nulls
are placed for the missing attributes -
- -
54- LEFT OUTER JOIN R1 X R2 lets every tuple in
R1 appear in the result -
- - RIGHT OUTER JOIN R1 X R2 lets every
tuple in R2 appear in the result -
- - FULL OUTER JOIN R1 X R2 lets every
tuple in R1 or R2 appear in the result
55Figure 7.18 The LEFT OUTER JOIN operation.
56RENAME Operator
- ?S(B1, B2,Bn) (R) Renaming relation R as S
and renaming attributes of R as Bis. -
- ?S(R) Renaming R as S
-
- ?(B1, B2,Bn) (R) Renaming attributes of R as
Bis. -
57- Some Queries
-
- Q. Retrieve the SSNs of all the employees who
either work in dept. 5 or supervise an employee
who works in dept 5. -
- Dept-Emps ??DNO 5 (Employee)
- Result1 ? ?SSN (Dept-Emps)
- Result 2(SSN) ? ?SuperSSN (Dept-Emps)
-
- Result Result 1 ? Result 2
58- Q. Find for each female employee, a list of names
of her dependents. -
- Female-Emp ??SEX F (Employee)
- Empname ? ?FNAME, LNAME, SSN (Female-Emp)
- Emp-dep ? Empname ? Dependent
- Actual-dep ??SSN ESSN (Emp-dep)
- Result ? ?FNAME, LNAME, Dependent-name
(Actual-dep) -
- Q. Find the name of the manager of each
department. -
- Dept-mgr ? Department ?MGRSSN SSN (Employee)
- Result ? ?DNAME, LNAME, FNAME (Dept-mgr)
59- List names of managers who have atleast one
dependent - Find names of employees who have no dependents
- List names of all employees with two or more
dependents
60Figure 7.12 An illustration of the CARTESIAN
PRODUCT operation.
61Figure 7.12 (continued)
62The Insert Operation
- 1. Insert ltCecilia, F, Kolonsky, null,
1960-04-05, 6357 Windy Lane, Katy, TX, F,
28000, null, 4gt into EMPLOYEE - This insertion violates the entity integrity
constraint (null for the primary key SSN), so it
is rejected. - 2. Insert ltAlicia, J, Zelaya, 999887777,
1960-04-05, 6357 Windy Lane, Katy, TX, F,
28000, 987654321, 4gt into EMPLOYEE - This insertion violates the key constraint
because another tuple with the same SSN Value
already exists in the EMPLOYEE relation, and so
it is rejected.
63The Insert Operation (contd.)
- 3. Insert ltCecilia, F, Kolonsky,
67768989, 1960-04-05, 6357 Windswept, katy,
TX, F, 28000, 987654321, 7gt into EMPLOYEE - This insertion violates the referential
integrity constraint specified on DNO because no
DEPARTMENT tuple exists with DNUMBER 7 - 4. Insert ltCecilia, F, Kolonsky,
67768989, 1960-04-05, 6357 Windy Lane, Katy,
TX, F, 28000, null, 4gt into EMPLOYEE. - This insertion satisfies all constraints, so it
is acceptable.
64The Delete Operation
- 1. Delete the WORKS_ON tuple with ESSN
999887777 and PNO 10. - This deletion is acceptable
- 2. Delete the EMPLOYEE tuple with SSN
999887777 - This deletion is not acceptable, because tuples
in WORKS_ON refer to this tuple. Hence, if the
tuple is deleted, referential integrity
violations will result.
65The Delete Operation (contd.)
- 3. Delete the EMPLOYEE tuple with SSN
333445555 - This deletion will result in even worse
referential integrity violations, because the
tuple involved is referenced by tuples from the
EMPLOYEE, DEPARTMENT, WORKS_ON and DEPENDENT
relations. - Options
- Reject deletion
- Propagate deletion by deleting other tuples
- Modify the referencing attributes that cause the
violations - Or combination of three above
66The Update Operation
- 1. Update the SALARY of the EMPLOYEE tuple with
SSN 999887777 to 28000 - Acceptable
- 2. Update the DNO of the EMPLOYEE tuple with SSN
999887777 to 1 - Acceptable
- 3. Update the DNO of the EMPLOYEE tuple with SSN
999887777 to 7 - Unacceptable, because it violates referential
integrity.
67The Update Operation (contd.)
- 4. Update the SSN of the EMPLOYEE tuple with SSN
999887777 to 987654321 - Unacceptable, because it violates primary key
and referential integrity constraints. - Updating an attribute which is neither primary
key or foreign key usually causes no problems,
new value should be of correct data type and
domain.