The Relational Model and Relational Algebra PowerPoint PPT Presentation

presentation player overlay
1 / 43
About This Presentation
Transcript and Presenter's Notes

Title: The Relational Model and Relational Algebra


1
The Relational Model and Relational Algebra
  • Srinath Srinivasa,
  • Indian Institute of Information Technology,26/C
    Electronic City,Hosur Road, Bangalore 560100
  • sri_at_iiitb.ac.in

2
Background
  • Introduced by Ted Codd of IBM Research in 1970.
  • Concept of mathematical relation as the
    underlying basis.
  • The standard database model for most
    transactional database today.

3
Relational Model Concepts
  • A relational database is a collection of
    relation.
  • A relation resembles a Table or a Flatfile of
    records.
  • Each row in such a table represents a collection
    of such related data values ( an instance of
    the relation depicted by the table)

Table 1 A example relation called STUDENT
  • Each row or a relation instance is called a
    tuple and
  • each column is called attribute of the relation

4
Some Definitions
  • A data value is said to be atomic if it cannot be
    subdivided into smaller data values. For example,
    the age of a person in years is an atomic value.
  • A Domain is a set of atomic values. Examples are
    the set of all Integers, the set of all valid
    student roll number, the set of all Indian cities
    with population above 6 million etc.
  • A Relation Schema denoted by R(A1,A2,A3,An) is
    made up of relation name R and a list of
    attributes A1,A2,A3,An . Each Ai is the name of
    role played some domain D in the schema R. The
    domain Ai is denoted by dom(Ai).

5
Some Definitions
  • The Degree of the relation is the number of
    attributes of its relation schema.
  • A relation r of a relation schema
    R(A1,A2,A3,An), is a set of n-tuples of the
    form t1,t2,.,tn where each ti ? dom(Ai).
  • The Relation can be defined as
  • r(R) ? (dom(A1) x dom(A2) x dom(An)),
    where x is the cartesian product over sets.

6
Characteristics of Relation
  • Ordering of Tuples Mathematically tuples in
    relation dont have ordering (although in
    reality they do have order when stored in files
    or disk)
  • Ordering of attributes Attributes within a tuple
    have to maintain order if they are stored as
    tuples. However a relation need not necessarily
    be maintained as a set of ordered tuples as long
    as a mapping between attribute and value is
    maintained in each relation instance.
  • Values of Tuples Each value that makes up a
    tuple is atomic and cannot be further subdivided.
    This is also called the first normal form
    assumption.

7
Relational Constraints
  • Domain constraints domain constraints specify
    that value of each attribute A must be atomic and
    a member of dom(A).
  • Key constraints Given a relational schema R
    (Ai), i 1n , a subset of attributes K is
    called a superkey if the values of attributes
    in K is distinct for every relation instance.
    This is, for any two tuples ti , tj , (ti ? tj)
    ? (ti K ? tjK).
  • A key k of a relation is a subset of the
    superkey, k ? K,such that removal of any
    attribute in k removes the superkey property for
    k. They are also called Minimal superkeys.

8
Relational Constraints
  • In STUDENT table (Roll no, Name) is a superkey,
    while (Roll no) is a minimal key.
  • A relation schema may have more than one key.
    Each key is called a candidate key.
  • Usually one of the candidate key is chosen to
    uniquely identify tuples in a relation. such a
    key is called primary key for a relation.
  • In STUDENT relation, Roll no is a good primary
    key.

9
Relational Constraints
  • Entity Integrity Constraint The primary key of a
    tuple may never be null.
  • Referential Integrity Constraint Informally, a
    tuple that refers to another tuple from another
    relation should refer to an existing tuple.
  • In a given relation R1 a set of attributes
    FK is said to be a foreign key of R1 referencing
    another relation R2, if the following rules hold
  • The attribute in FK have the same domain as the
    primary key of R2.
  • For every tuple in R1, the attributes in its
    Foreign Key FK either reference a tuple in R2 or
    is null.

10
Relational Constraints
Employee
Department
Foreign Keys are diagrammatically depicted as
arrows. Note that foreign key can be from a
relation to itself.
11
Relational Constraints
  • Semantic Integrity Constraint Constraints on the
    values of attributes.
  • Example The salary of the employee cannot
    be more than of his/her supervisor.
  • The minimum age of an employee is 18 and
    maximum is 65.
  • Semantic Integrity constraints are application
    specific and not part of the relational model,
    strictly speaking. Such constraints are specified
    and enforced using general purpose constraint
    specification languages.

12
Basic Relational Algebra Operations
  • Operations of relational model can be categorized
    into retrievals and updates.
  • Updates can be either insert of new tuple ,
    update( or modify) an existing tuple or delete an
    existing tuple.
  • Retrieval is handled mostly by select (denoted by
    ?) and project (denoted by ?) operations.

13
The Select Operation
  • The SELECT operation is used to select (retrieve)
    a subset of the tuples from a relation that
    matches the selection condition .
  • ?salary gt 3000 (Employee)
  • selects all tuples of relation Employee which
    matches the criteria
  • Employee.Salary gt 3000

14
The Select Operation
  • The general syntax of SELECT is as follows
  • ? ltConditiongt (Relation)
  • where condition is of the form
  • Condition? (Expression) logicaloperator condition
    Expression
  • Expression ? attributename operator constant
    attributename operator attributename
  • Logicaloperator ? AND OR
  • Operator ? lt gt ? ? ?

15
Properties of SELECT operation
  • The SELECT operation is unary. It operates on
    only one relation.
  • Selection conditions are applied to each tuple
    individually in the relation. Hence the condition
    cannot be span more than one tuple.
  • The degree of the relation resulting from ?c(R)
    is the same as that of R.
  • ?c(R) ? R
  • The SELECT operation is commutative
    ?c1 (?c2(R)) ? ?c2 (?c1(R))
  • Hence, SELECT operation can be cascaded in any
    fashion.

16
The Project Operation
  • The project operation selects specific
    columns from the specified relation.
  • ?Name, Salary ( Employee)
  • returns a relation comprising of only the
    Employee.Name and Employee.Salary attributes. The
    operation has said to have projected Employee
    relation onto the required relation.

17
The Project Operation
  • The general form of PROJECT is as follows
  • ? ltattributeslistgt (Relation)
  • where attributelist is a comma-separated
    list of attributes belonging to the specified
    relation.
  • The result of the PROJECT is in the same order as
    the specified list of attributes.

18
Properties of Project operation
  • The PROJECT operation removes duplicate in its
    results.
  • The number of tuples returned by PROJECT is less
    than or equal to number of tuples in the
    specified relation.
  • When attrubute list of PROJECT includes the
    superkey then the number of tuples returned is
    equal to the number of tuples in the specified
    relation.
  • PROJECT is not commutative.
  • ?l1?l2(R) ?l1 (R) if l1 is a substring of l2.
    Else the operation is an incorrect expression.

19
Composition and Assignment
  • Since the input and output of the relational
    operator is a single relation, they can be
    composed in any fashion with the output of one
    operation being the input of the other.
  • ? Name, Salary (?Salary gt 3000 (Employee ) )
  • returns Name and Salary field of all records of
    Employee, where salary is greater than 3000.
  • Results of a query can also be assigned to a name
    to form a relation by that name.
  • SalaryStatement ? ? Name, Salary (? Salary gt
    3000( Employee ))
  • create a new relation SalaryStatement out of the
    specified query.

20
Cartesian Join ( x)
  • PROJECT and SELECT operators expect a single
    relation.
  • When PROJECT and SELECT need to be run on a
    multiple relation, a single relation can be
    generated using the cartesian join ( x )
    operator.

TABLE Student
TABLE Lab
Consider the Relational Query ? Student.Lab
Lab.Name (Student x Lab)
21
Cartesian Join (x)
Columns matching the query are shown in PINK.
22
Query Result
23
Properties of Cartesian Join ( x )
  • The Cartesian Join represents a Canonical Join of
    two or more relations.
  • If relation R having R tuples and relation S
    having S tuples are joined using x, then R
    x S R ? S
  • Cartesian join is too inefficient for most
    queries involving multiple relations.

24
Relational Algebra (contd)
  • Join operators and additional relational operators

25
Relational Algebra Summary
  • Based on the concept of mathematical relation
  • Building block a relation comprising of
    attributes within domains
  • Tuples Schema Relation
  • Properties Ordering, Duplicates
  • Constraints Key constraints, entity constraints,
    referential integrity constraints
  • Basic retrieval operators Select, Project
  • Composability of relational operators and
    Cartesian join

26
Theta Join Operator (??? )
  • Theta join is used to combine related tuples
    from two or more relations ( specified by the
    condition theta ), to form a single tuple.
  • Consider the earlier relation Student and Lab.
  • The relation
  • Student ?? Student.Lab Lab.Name Lab
  • returns the same result as
  • ? Student.Lab Lab.Name (Student x Lab)
  • However, the number of tuples generated by Join
    is 4 rather than 12 as in the Cartesian join.
  • Note that Student.lab is a foreign key into Lab
    (and Lab.Name is a primary key of Lab) and
    referential Integrity has to be maintained to
    perform this join.

27
Theta Join Operator
  • The general form of theta join is as follows
  • R ?? join condition S
  • where R and S are relation and join condition is
    a logical expression over attributes of R and S .
  • Tuples whose join attributes are null do not
    appear in the final result.

28
Theta Join Operator
  • Theta join where the only comparison operator
    used is the equals () sign, are called equijoins
    .
  • A natural join denoted by is an equijoin,where
    the attribute names at both ends of the equality
    sign are the same. In this case, the attribute is
    not repeated in the final result.
  • Consider the Student and the Lab example. Let
    the Student relation schema be modified as
    Student(RollNo,Name,Labname) and Lab relation
    schema be modified as Lab(LabName,Faculty,Dept).
  • The natural join Student Lab has a scheme
  • (Roll No, Name, LabName, Faculty, Dept)

29
Renaming
  • TA(id,LabName) ? ?Roll No,Lab(Student)
  • The above relation projects RollNo and Lab
    attributes from Student and renames them as id
    and LabName respectively in the output relation.
    It also names the output relation as TA.
  • The above can also be achieved by rename (?)
    operator as
  • ? TA(id,LabName) (? Roll No,Lab( Student ))

30
Renaming
  • The general form of the rename operator is as
    follows
  • ? S(B1, B2, .Bn) (R)
  • Or
  • ? S (R)
  • Or
  • ? (B1, B2, .Bn) ( R)
  • The first form renames both relation name and
    attribute names, the second form renames only the
    relation name and the third form renames only
    attribute names.

31
Set Theoretic Operations
  • Set theoretic operation like UNION ( U ),
    INTERSECTION ( ) and SET DIFFERENCE can be
    applied to compatible relations.
  • Compatibility Two relations R ( A1,A2An) and
    S ( B1,B2,.Bn) are compatible if they have the
    same number of attributes and for all
  • The Union operator R U S returns the set of all
    tuples that are present in either R or S or both.
    Duplicate tuples will be eliminated.
  • The Intersection operator R S returns the
    set of all tuples are present in both R and S.
  • The Set Difference operator R S returns the
    set of all tuples in R but not in S.
  • Note that
    , but

32
The Division Operator
  • The Division operator ( ) is used to denote
    the conditions where a given relation R is to be
    split based on whether its association with every
    tuple in an other relation S.
  • Let the set of attributes of R be Z and set of
    attributes of S be X. Let
  • Y Z X, that is the set of all
    attributes of R that are not in S.
  • The result of the operation is as
    follows. For every tuple
    ,

R
S
33
Other Relational Operators
  • Outer Join A normal join operation ?? requires
    referential integrity.
  • Every attribute in R that refers S should refer
    to an existing tuple in S. If either the
    referencing attribute of S is null, the tuple is
    not returned.
  • Outer join is used when all tuples are required
    even when referential integrity is not met. Left
    outer join includes tuples even when the
    referencing attribute is null and Right outer
    join includes tuples even when the referenced
    attribute is null or referenced tuple does not
    exist.
  • An outer join which is both left and right is
    called a Full outer join.
  • Outer Union Outer Union computes the union of
    partially compatible relations. Two relations
    R(X) and S(Z) are said to be partially compatible
    if there exists W ? X and Y ? Z such that
    for every Ai ? W and Bi ? Y, dom(Ai)
    dom(Bi).
  • Outer Union of R and S creates a new relation
    that includes all attributes from
    both relation. Null values fill up all undefined
    attributes.

34
The Complete Set of the Operators
  • The following set of relational operators
    , , U , -- , x is called the complete
    set of the relational operators because all
    other operators can be mathematically expressed
    as a sequence of the above operators.
  • For Example
  • Similarly

35
Relational Operators on Bags
  • A bag or a multi-set is a set that may have
    multiple occurrences of a given element
  • Bags are necessary when aggregation (sum,
    average,) has to be performed over query results
  • Tolerating bags while answering queries reduces
    unnecessary overheads

36
Set-theoretic operations on bags
Consider two bags R where tuple t occurs n times
and S, where
tuple t occurs m times. The union of bags or
disjoint union of sets denoted by R S is a
bag that contains R S tuples or mn
occurrences of tuple t. In the intersection of
bags R ? S, tuple t occurs min(n,m) times. In
the set difference of bags R S, tuple t
occurs max(0, n-m) times.
37
Other operators on Bags
  • The select operator on bags is applied to each
    tuple in the bag independently and duplicates are
    not removed from the result
  • The project operator on bags is applied to each
    tuple independently and duplicates are not
    removed from the result. Faster than project
    operator on sets.
  • Cartesian product on bags If tuple t occurs m
    times in relation R and tuple s occurs n times in
    relation S, then tuple ts occurs mn times in RxS.

38
Algebraic expressions on bags
The following algebraic expressions are valid
when R and S are sets, but invalid when they are
bags (R ? S) T (R T) ? (S T)
1-1-1 principle R ? (S ? T) (R ? S) ?
(R ? T) 2-1-2 principle ?C OR D (R) ?C
(R) ? ?D (R)
39
Relational Operators Example
Consider the following schema Employee
(FNAME,MINIT,LNAME,PAN,DOB,ADDR
GENDER,SALARY,SUPERVISOR,DNO) Department
(DNAME, DNUMBER, MGRPAN,
MGRSTARTDATE) Dept_Locations
(DNUMBER, DLOCATION) Project (PNAME, PNUMBER,
PLOCATION, DNUM) Works_on (EPAN, PNO, HOURS)
40
Relational Operators Example
Query 1 Retrieve the name and address of all
employees who work for Research department
Research_dept ? ?DNAMEResearch
(Department) Research_emps ?
(Research_dept ??DNUMBERDNO EMPLOYEE) Result ?
?FNAME,LNAME,ADDRESS (Research_emps)
41
Relational Operators Example
Query 2 Find the names of employees who work on
all projects controlled by department no. 5
Dpt5_proj ? ?PNUMBER (?DNUM5
(Project)) Emp_Prj(PAN,PNUMBER) ? ?EPAN,PNO
(Works_on) Result_Pans ? Emp_Prj ? Dpt5_proj
Result ? ?FNAME,LNAME (Result_Pans Employee)
42
Relational Operators Example
Query 3 List the names of all employees who are
not working on any projects All_Pans ? ?PAN
(Employee) Prj_Pans ? ?EPAN (Works_on) NoPrj_Pan
s ? All_Pans Prj_Pans Result ? ?FNAME,LNAME
(NoPrj_Pans Employee)
43
Summary
  • Definition of relation schema, relation, domain,
    attributes.
  • Difference between the mathematical concept of
    relation and their implementation as tables or
    files.
  • Constraints in relational model Domain
    Constraints, Key constraints (Super key,
    Candidate key, Primary key definition), Entity
    Integrity constraint, Referential integrity
    constraint .
  • Basic relational algebra operation ,
  • Relational Algebra operators
    outer join, outer union.
  • The complete set of relational operators.
Write a Comment
User Comments (0)
About PowerShow.com