Relational Data Model - PowerPoint PPT Presentation

1 / 23
About This Presentation
Title:

Relational Data Model

Description:

( see also Attachment 'A' for Codd's 12 Rules for a Relational DBMS) The relational Model of Data is based on the concept of a Relation. ... – PowerPoint PPT presentation

Number of Views:50
Avg rating:3.0/5.0
Slides: 24
Provided by: jayrcfer
Category:
Tags: aret | data | model | relational

less

Transcript and Presenter's Notes

Title: Relational Data Model


1
Relational Data Model
  • ICS 22 Database Systems

2
1. Relational Model Concepts
3
BASIS OF THE MODEL
  • The model was first proposed by Dr. E.F. Codd of
    IBM in 1970 in the following paper - "A
    Relational Model for Large Shared Data Banks,"
    Communications of the ACM, June 1970. (see also
    Attachment A for Codds 12 Rules for a
    Relational DBMS)
  • The relational Model of Data is based on the
    concept of a Relation. The model represents the
    database as a collection of relations or tables
    (Note Relations have nothing to do with
    relationships in the ER Data Model)
  • A Relation is a mathematical concept based on the
    ideas of sets.
  • Consists of three components data definition,
    data integrity and data manipulation.
  • The strength of the relational approach to data
    management comes from the formal foundation
    provided by the theory of relations.

4
INFORMAL DEFINITIONS
  • RELATION A table of values
  • A relation may be thought of as a set of rows.
  • A relation may alternately be though of as a set
    of columns.
  • Each row of the relation may be given an
    identifier.
  •  Each column typically is called by its column
    name or column header or attribute name.
  •  Degree of a relation is the number of
    attributes.
  •  Domain is a set of atomic values that are
    possible values for an attribute.

5
FORMAL DEFINITIONS
  • A Relation may be defined in multiple ways.
  • The Schema of a Relation
  • R (A1, A2, .....An)
  • Relation R is defined over attributes A1, A2,
    .....An
  • For Example -
  • CUSTOMER (Cust-id, Cust-name, Address, Phone)
  • Here, CUSTOMER is a relation defined over the
    four attributes Cust-id, Cust-name, Address,
    Phone, each of which has a domain or a set of
    valid values. For example, the domain of Cust-id
    is 6 digit numbers.

6
FORMAL DEFINITIONS
  • A tuple is an ordered set of values
  • Each value is derived from an appropriate
    domain.
  • Each row in the CUSTOMER table may be called
    as a tuple in the table and would consist of four
    values.
  • lt632895, "John Smith", "101 Main St. Atlanta, GA
    30332", "(404) 894-2000"gt is a tuple belonging to
    the CUSTOMER relation.
  • A relation may be regarded as a set of tuples
    (rows).
  • Columns in a table are also called as
    attributes of the relation.
  • The relation is formed over the cartesian
    product of the sets each set has values from a
    domain that domain is used in a specific role
    which is conveyed by the attribute name.
  •  
  • For example, attribute Cust-name is defined
    over the domain of strings of 25 characters. The
    role these strings play in the CUSTOMER relation
    is that of the name of customers.

7
FORMAL DEFINITIONS
  • Formally,
  • Given R(A1, A2, .........., An)
  • r(R) subset-of dom (A1) X dom (A2) X ....X
    dom(An)
  •  
  • R schema of the relation
  • r of R a specific "value" or population of R.
  •  
  • R is also called the intension of a relation
  • r is also called the extension of a relation
  •  
  • Let S1 0,1
  • Let S2 a,b,c
  •  
  • Let R subset-of S1 X S2
  •  
  • for example r(R) lt0.agt , lt0,bgt , lt1,cgt

8
DEFINITION SUMMARY
  • Informal Terms Formal Terms
  • Table Relation
  • Column Attribute/Domain
  • Row Tuple
  • Values in a column Domain
  • Table Definition Schema of Relation
  • Populated Table Extension

9
Notes
  • Whereas languages like SQL use the informal terms
    of TABLE (e.g. CREATE TABLE), COLUMN (e.g.
    SYSCOLUMN variable), the relational database
    textbooks present the model and operations on it
    using the formal terms.

10
2 Characteristics of Relations
  • Ordering of tuples in a relation r(R) The tuples
    are not considered to be ordered, even though
    they appear to be in the tabular form.
  • Ordering of attributes in a relation schema R
    (and of values within each tuple) We will
    consider the attributes in R(A1, A2, ..., An) and
    the values in tltv1, v2, ..., vngt to be ordered .
  • (However, a more general alternative definition
    of relation does not require this ordering).
    Sequence of columns (left to right) is
    insignificant.
  • Values in a tuple All values are considered
    atomic (indivisible) no repeating groups or
    links to other relations are allowed. A special
    null value is used to represent values that are
    unknown or inapplicable to certain tuples.
  • Uniqueness of tuples Each tuple or row in a
    relation is unique.
  • Uniqueness of attribute names. Each
    attribute/column has a unique name within a
    relation.
  • Domain Constraint on attributes. Entries in each
    column are of the same domain.

11
Notation
  • We refer to component values of a tuple t by
    tAi vi (the value of attribute Ai for tuple
    t).
  • Similarly, tAu, Av, ..., Aw refers to the
    subtuple of t containing the values of attributes
    Au, Av, ..., Aw, respectively.

12
3 Relational Integrity Constraints
  • Constraints are conditions that must hold on all
    valid relation instances. There are three main
    types of constraints
  • Key constraints, entity integrity constraints,
    and referential integrity constraints

13
3.1 Key Constraints
  • Superkey of R A set of attributes SK of R such
    that no two tuples in any valid relation instance
    r(R) will have the same value for SK. That is,
    for any distinct tuples t1 and t2 in r(R), t1SK
    ltgt t2SK.
  • Key of R A "minimal" superkey that is, a
    superkey K such that removal of any attribute
    from K results in a set of attributes that is not
    a superkey.
  • Example The CAR relation schema
  • CAR(State, Reg, SerialNo, Make, Model, Year)
  • has two keys Key1 State, Reg, Key2
    SerialNo, which are also superkeys. SerialNo,
    Make is a superkey but not a key.
  • If a relation has several candidate keys, one is
    chosen arbitrarily to be the primary key. The
    primary key attributes are underlined.

14
3.2 Entity Integrity
  • Relational Database Schema A set S of relation
    schemas that belong to the same database. S is
    the name of the database.
  • S R1, R2, ..., Rn 
  • Entity Integrity The primary key attributes PK
    of each relation schema R in S cannot have null
    values in any tuple of r(R). This is because
    primary key values are used to identify the
    individual tuples.
  • tPK ltgt null for any tuple t in r(R)
  • Note Other attributes of R may be similarly
    constrained to disallow null values, even though
    they are not members of the primary key.

15
3.3 Referential Integrity
  • A constraint involving two relations (the
    previous constraints involve a single relation).
  •  Used to specify a relationship among tuples in
    two relations the referencing relation and the
    referenced relation.
  • Tuples in the referencing relation R1 have
    attributes FK (called foreign key attributes)
    that reference the primary key attributes PK of
    the referenced relation R2. A tuple t1 in R1 is
    said to reference a tuple t2 in R2 if t1FK
    t2PK.
  • A referential integrity constraint can be
    displayed in a relational database schema as a
    directed arc from R1.FK to R2.

16
4 Update Operations on Relations
  • INSERT a tuple.
  • DELETE a tuple.
  • MODIFY a tuple.
  • Integrity constraints should not be violated by
    the update operations.
  • - Several update operations may have to be
    grouped together.
  • - Updates may propagate to cause other updates
    automatically. This may be necessary to maintain
    integrity constraints.
  • - In case of integrity violation, several actions
    can be taken
  • - cancel the operation that causes the violation
    (REJECT option)
  • - perform the operation but inform the user of
    the violation
  • - trigger additional updates so the violation is
    corrected (CASCADE option, SET NULL option)
  • - execute a user-specified error-correction
    routine

17
5 The Relational Algebra
  • - Operations to manipulate relations.
  • - Used to specify retrieval requests (queries).
  • - Query result is in the form of a relation.
  • Relational Operations
  • 5.1 SELECT s and PROJECT P operations.
  • 5.2 Set operations These include UNION U,
    INTERSECTION , DIFFERENCE -, CARTESIAN
    PRODUCT X.
  • 5.3 JOIN operations X.
  • 5.4 Other relational operations DIVISION, OUTER
    JOIN, AGGREGATE FUNCTIONS.

18
5.1 SELECT s and PROJECT P
  • SELECT operation (denoted bys )
  •  
  • - Selects the tuples (rows) from a relation R
    that satisfy a certain selection condition c
  • - Form of the operation s c(R)
  •  
  • - The condition c is an arbitrary Boolean
    expression on the attributes of R
  •  
  • - Resulting relation has the same attributes as
    R
  •  
  • - Resulting relation includes each tuple in r(R)
    whose attribute values satisfy the condition c
  •  
  • Examples
  • s DNO4(EMPLOYEE)
  • s SALARYgt30000(EMPLOYEE)
  • s(DNO4 AND SALARYgt25000) OR DNO5(EMPLOYEE)
  • PROJECT operation (denoted byP )
  •  
  • - Keeps only certain attributes (columns) from a
    relation R specified in an attribute list L

19
5.2 Set Operations
  • - Binary operations from mathematical set theory
  • UNION R1 U R2,
  • INTERSECTION R1 R2,
  • SET DIFFERENCE R1 - R2,
  • CARTESIAN PRODUCT R1 X R2.
  •  
  • - For U, , -, the operand relations R1(A1, A2,
    ..., An) and R2(B1, B2, ..., Bn) must have the
    same number of attributes, and the domains of
    corresponding attributes must be compatible that
    is, dom(Ai)dom(Bi) for i1, 2, ..., n. This
    condition is called union compatibility.
  •  
  • - The resulting relation for U, , or - has
    the same attribute names as the first operand
    relation R1 (by convention).
  •  
  • CARTESIAN PRODUCT
  • R(A1, A2, ..., Am, B1, B2, ..., Bn) lt-
  • R1(A1, A2, ..., Am) X R2 (B1, B2, ...,
    Bn)
  •  
  • - A tuple t exists in R for each combination of
    tuples t1 from R1 and t2 from R2 such that
  • tA1, A2, ..., Amt1 and tB1, B2, ..., Bnt2
  •  
  • - If R1 has n1 tuples and R2 has n2 tuples, then
    R will have n1n2 tuples.
  •  

20
5.3 JOIN Operations
  • THETA JOIN Similar to a CARTESIAN PRODUCT
    followed by a SELECT. The condition c is called a
    join condition.
  • R(A1, A2, ..., Am, B1, B2, ..., Bn) lt-
  • R1(A1, A2, ..., Am) X c R2 (B1, B2, ...,
    Bn)
  •  
  • EQUIJOIN The join condition c includes one or
    more equality comparisons involving attributes
    from R1 and R2. That is, c is of the form
  • (AiBj) AND ... AND (AhBk) 1lti,hltm, 1ltj,kltn
  •  
  • In the above EQUIJOIN operation
  • Ai, ..., Ah are called the join attributes of R1
  • Bj, ..., Bk are called the join attributes of R2
  • Example of using EQUIJOIN
  • Retrieve each DEPARTMENT's name and its manager's
    name
  • T lt-DEPARTMENT XMGRSSNSSN EMPLOYEE
  • RESULT lt-P DNAME,FNAME,LNAME(T)
  •  
  • NATURAL JOIN ()
  • In an EQUIJOIN R lt- R1 X c R2, the join
    attribute of R2 appear redundantly in the result
    relation R. In a NATURAL JOIN, the redundant join
    attributes of R2 are eliminated from R. The
    equality condition is implied and need not be
    specified.
  • R lt- R1 (join attributes of R1),(join attributes
    of R2) R2

21
Complete Set of Relational Algebra Operations
  • - All the operations discussed so far can be
    described as a sequence of only the operations
    SELECT, PROJECT, UNION, SET DIFFERENCE, and
    CARTESIAN PRODUCT.
  •  
  • - Hence, the set s ,P , U, - , X is called a
    complete set of relational algebra operations.
    Any query language equivalent to these
    operations is called relationally complete.
  •  
  • - For database applications, additional
    operations are needed that were not part of the
    original relational algebra. These include
  • 1. Aggregate functions and grouping.
  • 2. OUTER JOIN and OUTER UNION.

22
5.4 Additional Relational Operations
  • AGGREGATE FUNCTIONS
  • - Functions such as SUM, COUNT, AVERAGE, MIN, MAX
    are often applied to sets of values or sets of
    tuples in database applications
  • ltgrouping attributesgt f ltfunction listgt (R)
  •  
  • - The grouping attributes are optional
  •  
  • Example 1 Retrieve the average salary of all
    employees (no grouping needed)
  • R(AVGSAL)lt- f AVERAGE SALARY (EMPLOYEE)
  •  
  • Example 2 For each department, retrieve the
    department number, the number of employees, and
    the average salary (in the department)
  • R(DNO,NUMEMPS,AVGSAL)lt-
  • DNO f COUNT SSN, AVERAGE SALARY (EMPLOYEE)
  •  
  • DNO is called the grouping attribute in the
    above example
  • OUTER JOIN
  •  
  • - In a regular EQUIJOIN or NATURAL JOIN
    operation, tuples in R1 or R2 that do not have
    matching tuples in the other relation do not
    appear in the result
  •  

23
Codds 12 rules for an RDBMS
  • Information Rule All data is represented in the
    form of relations / tables (with rows columns).
  • Guaranteed Access Rule The DBMS can refer to
    every single value in the database (by using a
    combination of the table name the primary key).
  • Systematic Treatment of Null Values The concept
    of missing / non-available values should be
    distinct from blank fields / fields with 0s in
    it.
  • Active On-Line Catalog based on the relational
    model i.e. the database schema / definitions
    can be represented in the form of tables (and
    retrieved as such).
  • Comprehensive Data Sub-Language Rule The system
    must support at least one relational language
    (based on either Relational Algebra / Relational
    Calculus) and should allow the definition
    manipulation of database objects using this.
  • View Updating Rule Any views that are
    theoretically updateable, should be allowed to be
    so.
  • High Level Update The user should be allowed to
    Delete / Update a set of tuples (rather than row
    by row).
  • Physical Data Independence.
  • Logical/ Conceptual Data Independence.
  • Integrity Independence The DBMS should allow the
    definition of constraints (rules / triggers /
    etc.) independent of application programs.
  • Distribution Independence If the database were
    to be taken and split into parts residing in
    different places, this should be invisible to the
    users.
  • Non-Subversion Rule Even if the system allowed
    access to the data through a low level language
    (e.g. access a tuple at a time through 'C') it
    should go through the DBMS and hence ensure that
    the constraints are not violated.
  •  
Write a Comment
User Comments (0)
About PowerShow.com