An Introduction to DBMS Technology - PowerPoint PPT Presentation

About This Presentation
Title:

An Introduction to DBMS Technology

Description:

An Introduction to DBMS Technology Yelena Yesha Olga Streltchenko – PowerPoint PPT presentation

Number of Views:293
Avg rating:3.0/5.0
Slides: 49
Provided by: umb78
Category:

less

Transcript and Presenter's Notes

Title: An Introduction to DBMS Technology


1
An Introduction to DBMS Technology
Yelena Yesha Olga Streltchenko
2
Presentation Overview
  • Database functionality
  • Relational Data Model
  • SQL
  • Web-Database Connectivity

3
Transaction
  • A transaction is an exchange of
  • Information
  • Goods
  • Services
  • Currency/currencies
  • Transaction properties
  • Atomicity a transaction must be all or nothing.
  • Consistency a transaction takes the system from
    one consistent state to another.
  • Isolation.
  • Durability.

4
Databases and Information Economy
  • Shift from computation to information
  • corporate computing
  • personal computing and the Internet
  • scientific computing.
  • Growing importance of transaction-orientation and
    information retrieval.
  • The database field concentrates on the efficient
    management of large amounts of persistent,
    reliable shared data.

5
Database versus File System
  • Database management systems (DBMS) is a software
    that provides transaction support by implementing
  • Data independence
  • Data access efficiency
  • Concurrency control
  • Data integrity
  • Reliability
  • Security
  • Data distribution and heterogeneity.

6
Data Independence and Access Efficiency
  • DBMS allows to avoid rewriting all access
    routines every time the data format changes or
    data is added/modified/deleted.
  • insulate applications from data storage details.
  • Logical independence protection from changes in
    logical structure of data.
  • Physical independence protection from changes
    in physical structure of data.
  • DBMS maintains data structures and implements
    algorithms allowing to avoid linear search
  • indexing search in O(log n)
  • fast access even on complex data queries.

7
Concurrency Control and Data Integrity
  • Interleaving actions of different applications
    boosts performance.
  • DBMS insures semantically correct access to the
    same data by concurrent applications
  • two programs accessing the same data at the same
    time can result in an inconsistent update
  • implement sharing in a controlled manner.
  • Data semantics may require certain constraints to
    be satisfied.
  • DBMS guarantees that application programs comply
    with the constraints when adding/modifying the
    data.

8
Reliability and Security
  • DBMS provides techniques for recovery from
    software and hardware failures
  • guarantee survival of the data across
    catastrophes.
  • DBMS prevents unauthorized users from
    accessing/modifying data or denying service to
    other users.

9
Data Distribution and Heterogeneity
  • Centralization is the enemy of scalability
  • a vast number of modern applications are
    distributed.
  • Data sharing in a distributed environment is a
    challenge.
  • Heterogeneity applies to networks, hardware,
    operating systems, programming languages, data
    formats, etc.
  • Distributed applications must mask the
    differences.
  • Need distributed data management.

10
Categories of Data Models
  • High-level or conceptual
  • entities, attributes, relationships.
  • Representational or implementation or logical
  • relational, network hierarchical,
    object-oriented, object-relational.
  • Physical or low-level
  • data storage.

11
Levels of Abstraction in a Database
  • Schema versus Instance
  • schema description of the data that
  • captures data types, relationships, constraints
    on the data
  • meta-data (data about data), knowledge, e.g.,
    Employees(EmpName, EmpNo, Dept, Sal)
  • is independent of any application program
  • changes infrequently
  • instance set of records/tuples/rows for that
    schema, the actual data in the database at a
    given time
  • time-varying
  • e.g., ltJane, 201, Shoe, 1Mgt,ltSusan, 302, Toy, 1Mgt

12
3-schema Architecture
  • Physical level description of a database
  • how things are stored on disk
  • files, record structures,
  • indices,
  • data structures for disk blocks,
  • methodology for dealing with too long records,
    etc.
  • Conceptual level description of a database
  • The description of application data (its schema)
    using one of the traditional data models.

13
3-Schema Architecture (cont'd)
  • View-level description of a database
  • What users of a particular application see
  • their own customized schema, e.g., for payroll,
    for the ticket agent, for a simulation program.
  • Multiple levels
  • helps with data independence
  • helps with maintenance.
  • Many views, single logical and physical schema.
  • Levels of abstraction give data independence.

14
The Entity-Relational Model
  • Entity a distinguishable object.
  • Entity set a set of entities all of the same
    type.
  • Attribute a single property of an entity
  • simple vs composite
  • single-valued vs multi-valued
  • stored vs derived
  • null values.
  • Domain set of values permitted for that
    attribute.

15
The E-R Model (2)
  • Relationship an association between two or more
    entities.
  • Relationship set a set of relationships all of
    the same type
  • There is no correct schema for a batch of data.
    Which schema is best depends on the application.
  • Many basic data modelling choices depend on an
    understanding of the application.

16
Data Model
  • Data model notation for describing data, plus a
    set of operations used to manipulate that data.
  • a set of primitives for defining the structure of
    a DB
  • a set of operations for specifying the retrievals
    and updates on a DB
  • relational, hierarchical, network,
    object-oriented.

17
The Relational Model (Codd 1970)
  • The relational data model is the most important
    data model currently existing.
  • Value-oriented, i.e., allows operations on
    relations whose results are relations, thus
    enables to combine operations.
  • As opposed to object-oriented models, in which
  • Operations cannot be applied to the result of
    other operations
  • The result of an operation may be a new data
    type, and operations may not be available for
    this type.

18
Definitions Domain Relation
  • A domain is a set of atomic values.
  • A relation is a finite subset of the cartesian
    product of a finite list of domains
  • relation is a set of tuples
  • order of tuples is irrelevant and
  • no relation has 2 identical tuples
  • each tuple value is atomic
  • no composite attributes
  • no multi-valued attributes.

19
Relational Model (contd)
  • Everything is represented by relations
  • Formally Given sets D1, D2, ....Dn (not
    necessarily distinct), a relation R ? D1 X D2 X
    ...X Dn
  • Di 's are the domains and n is the arity
    (degree) of R
  • elements of R are called tuples
  • number of tuples in R is the cardinality of R
  • relational data model helps to view a relation as
    a table
  • Observe the following properties

20
Relational Model (contd)
  • Everything is represented by relations
  • Given sets D1, D2, ....Dn (not necessarily
    distinct), a relation R ? D1 X D2 X ...X Dn
  • Di 's are the domains and n is the arity (degree)
    of R
  • elements of R are called tuples
  • number of tuples in R is the cardinality of R.
  • Relational data model helps to view a relation as
    a table
  • each row represents a tuple (record)
  • each column represents an attribute (field).
  • Properties
  • no two rows are identical
  • the ordering of tuples is unimportant
  • the ordering of columns is important.

21
Keys
  • Let R be a relation schema and K ? R.
  • K is a superkey of R if it can uniquely identify
    any tuple in any r(R). There are no tuples t and
    t' such that tK t'K.
  • K is a candidate key if K is a minimal superkey.
    There is no K' ? K such that K' is also a
    superkey of r(R)
  • A primary key is one of the candidate keys,
    remaining candidate keys are alternate keys
  • Every relation has a key.
  • A key is a property of a relation schema, not a
    relation.

22
Integrity Constraints
  • Relational database schema is a set of relation
    schemas and a set of integrity constraints
  • Integrity constraint condition that must be true
    for any instance of a database.
  • Integrity constraints are expected to hold on
    every database instance of the schema
  • Integrity constraints
  • Structural
  • key constraint uniqueness of keys
  • entity integrity constraint no primary key value
    can be null
  • referential integrity constraint reference from
    relation R to relation S must refer to an
    existing tuple of S
  • Semantic.

23
Foreign Keys
  • A set of attributes (FK) of R is a foreign key if
  • the attributes in FK have the same domain as the
    primary key (PK) attributes of another relation
    S, and
  • for each instance of R, the values of FK occur as
    a value of PK for some instance in S, or is null.
  • In the relational model, the only way an entity
    can reference another entity is through the value
    of the primary key of the second entity.
  • Foreign keys don't have to be unique or non-null,
    but if one component is null, then all components
    must be null.

24
E-R to Relations (I.e., Defining Relations)
  • Done using DDL (Data Definition Language)
  • Name whole database schema
  • Declare domains for attributes
  • Define relations
  • name
  • attribute names and domains
  • primary and other keys
  • foreign keys

25
Translating from E-R
  • Represent entity set E by a relation whose
    attributes are all the E-R attributes of E. Then
    each tuple represents one entity of E.
  • To represent relation R between entity sets E1,
    , Ek, create relation R with key attributes of
    E1, , key attributes of Ek, as attributes
    (rename duplicates). Each tuple of the relation
    represents one combination of entities that are
    related to one another.
  • You might have some redundant relations, which
    you can delete.

26
Schema Normalization
  • Formal theory of database design
  • based on grouping attributes in a particular way
    using attribute dependencies to achieve good
    schemas
  • 1NF, 2NF, 3NF, BCNF, 4NF,
  • Goal
  • dont store redundant information
  • can represent everything (otherwise, the schema
    is useless!)

27
Query and Update Languages
  • DDL data definition language
  • used by DBA
  • to define schemas, create views, create indices
  • DML data manipulation language
  • used by sophisticated casual user
  • to query data or
  • update data

28
Relational Query Languages
  • Query languages allow manipulation and retrieval
    of data from a database.
  • Relational model supports simple, powerful query
    languages
  • strong formal foundation based on logic
  • allows for optimization.
  • Two mathematical languages form the basis for
    relational languages (e.g., SQL) and for
    implementation
  • Relational Algebra More operational, useful for
    representing execution plans
  • Relational Calculus Lets users describe what
    they want, rather than how to compute it
    (non-operational, declarative).
  • Basic operations
  • selection, projection, cross-product,
    set-difference, union, intersection, join,
    division

29
SQL
  • SQL (Structured Query Language) is the query
    language for the System R developed at IBM San
    Jose Astraham, Gray, Lindsay, Selinger ..
  • SQL is now the query language for IBM's DB2 and
    the de-facto standard on most commercial RDBMS.
  • SQL is a comprehensive language providing
    statements for data definition, query and update.
    Hence it is both DDL and DML.

30
SQL (contd)
  • SQL allows to create views, it can be embedded in
    a general-purpose programming language (C or
    PASCAL).
  • SQL has one basic statement for retrieving data
    from the database
  • the SELECT statement
    SELECT ltattribute listgt

    FROM lttable listgt
    WHERE ltconditiongt
  • Standards
  • SQL or SQL1 (ANSI 1986)
  • SQL2 or SQL-92 (ANSI 1992)
  • SQL3 underway extends SQL with OO and other
    concepts.

31
SQL Data Types
  • Numeric
  • Integers of various ranges INTEGER (or INT),
    SMALLINT
  • Real numbers of various precision FLOAT, REAL,
    DOUBLE PRECISION
  • Formatted numbers DECIMAL(i,j) or DEC(i,j) or
    NUMERIC(i,j).
  • Character Strings
  • Fixed length n CHAR(n) or CHARACTER(n)
  • Variable length of maximum n VARCHAR(n) or CHAR
    VARYING(n) (default n 1).
  • Bit strings
  • Fixed length n BIT(n)
  • Varying length of maximum n VARBIT(n) or BIT
    VARYING(n).

32
SQL Data Types (contd)
  • Date Time SQL2
  • DATE (10 positions) YYYY-MM-DD
  • TIME (8 positions) HHMMSS
  • TIMESTAMPdate, time with 6 fractions of seconds
    and optional time zone TIME(i) defines i decimal
    fractions of seconds
  • (81i positions) HHMMSSddd...d
  • TIME WITH TIME ZONE includes the displacement
    from standard universal time zone 1300 to
    -1259 (6 additional positions)
    HHMMSS/-HHMM
  • INTERVAL Year/Month or Day/TIME

33
Data Definition Language
  • DDL is used to define the (schema of) database
  • to create a database schema
  • to create a domain
  • to create, drop. alter a table
  • to create, remove an index defunct in SQL2
  • to create or drop a view
  • to define integrity constraints
  • to define access privileges to users (Oracle
    CONNECT, RESOURCE, DBA)
  • to GRANT or REVOKE privileges ON/TO object/user
  • SQL2 supports multiple schemas
  • CREATE SCHEMA name AUTHORIZATION user
  • CREATE SCHEMA EMPLOYEE AUTHORIZATION yesha

34
SQL Schema
  • EMP(Name,SSN,DNO,BirthPlace)
  • DEPT(DName,DNO,MGRSSN)
  • PROJECT(PName,PNO,PLocation,DNum)
  • WORKSON(ESSN,PNO,Hours)
  • CREATE SCHEMA 'COMPANY'
  • CREATE TABLE EMP
  • (
  • EName name_dom NOT NULL,
  • SSN CHAR(9) NOT NULL,
  • DNO INTEGER NOT NULL,
  • BirthPlace city_dom,
  • PRIMARY KEY(SSN),
  • FOREIGN KEY (DNO) REFERENCES DEPT (DNO)
  • )

35
Drop
  • DROP command can be used to remove
  • a schema
  • DROP SCHEMA Company CASCADE
  • DROP SCHEMA Company RESTRICT
  • CASCADE option removes everything tuples,
    tables, domains, ...
  • RESTRICT option removes the schema if it has no
    elements in it
  • a table
  • DROP TABLE EMP CASCADE
  • DROP SCHEMA EMP RESTRICT
  • CASCADE option removes the table and all
    references to it
  • RESTRICT option removes the table if it is not
    referenced

36
Alter
  • The ALTER allows to
  • alter the domain of an attribute
  • ALTER TABLE Student
  • ALTER GPA NUMBER(4,2)
  • set or drop default value of an attribute
  • ALTER TABLE Student
  • ALTER GPA DROP DEFAULT
  • ALTER TABLE Student
  • ALTER GPA SET DEFAULT 0.00
  • add a new attribute to a relation
  • ALTER TABLE Student
  • ALTER Admission DATE
  • drop an attribute (not in SQL1)
  • ALTER TABLE Student
  • DROP GPA CASCADE/RESTRICT

37
Data Manipulation Language
  • SELECT
  • tuple queries
  • aggregate queries
  • INSERT
  • DELETE
  • UPDATE
  • CREATE VIEW

38
Data Query SELECT
  • Used for retrieval.
  • Used to specify subqueries for retrieval and for
    the other operations.
  • Not the sigma or select operator of the
    relational algebra.
  • The general form of a SELECT statement
  • SELECT ltattribute listgt
  • FROM lttable listgt
  • WHERE ltconditiongt
  • GROUP BY ltattribute listgt
  • HAVING ltconditiongt
  • ORDER BY ltattribute,ASC/DESC pairgt

39
Relational Operators in SQL
  • Projection SELECT A,B FROM R
  • Selection SELECT FROM R WHERE F
  • The WHERE condition can be a Boolean combination
    of conditions.
  • Product of two tables, A X B
  • SELECT R.?, S.?
  • FROM R, S

40
Data Update
  • Examples
  • Create a new instance of an entity explicitly
    with all details
  • INSERT table VALUES (v1, v2, ..., vn) inserts a
    tuple (v1, v2, ..., vn) into table
  • Unspecified columns get their default value if
    available,NULL if allowed (and no default is
    defined), (fails otherwise)
  • INSERT table(c1, c2, cm) VALUES (v1, v2, ..., vm)
    inserts a tuple with m columns set to (v1, v2,
    ..., vm).

41
Data Update (contd)
  • Examples
  • UPDATE table SET c1 v1, ..., cm vm WHERE
    expr
  • Update all instances matching some local
    criterion UPDATE table SET ? WHERE ?
  • Remove instances matching some local criterion
    DELETE table WHERE ?
  • Remove instances matching a non-local criterion
  • DELETE table FROM table, table2 WHERE AND ?

42
SQL Views
  • An SQL view is a table derived from other tables
  • base (physical) tablesultimately depend on these
    defining tables
  • other views
  • Views are
  • usually virtual
  • sometimes materialized

43
View Specification
  • Views are specified with the paradigm
  • CREATE VIEW view
  • AS SELECT
  • The SELECT part is the defining query
  • In a view definition, we can
  • define column names of view
  • use aggregation (GROUP BY)

44
View Resolution
  • Views are
  • computed by a sort of macro expansion when
    neededview resolution
  • query modification compute fresh
  • view materialization store
  • always up to date modifications to the defining
    tables are automatically reflected in the view

45
Updating Views
  • A view tuple may have as its source a combination
    of base tuples, because of
  • joins
  • aggregations
  • Therefore, updating a view
  • is nontrivial to determine
  • may potentially lead to more than one update on
    the defining relations
  • is often not allowed

46
Updating Views (2)
  • Restricted kinds of views may be updated. The
    subselect must have
  • only one defining table
  • columns including a candidate key of the table
  • only column names, not expressions
  • no joins, e.g., no correlative subqueries
  • no DISTINCT keyword
  • no aggregates

47
CHECK OPTION
  • Specified for updatable views
  • Checks whether an INSERT or UPDATE to a view
    would immediately cause the tuple to disappear
    from the view
  • the new tuple should satisfy the WHERE clause of
    the view definition

48
SQL Constraints
  • SQL allows declarative constraints such as
  • CREATE ASSERTION assertion-name
  • CHECK (enhanced subselect query)
  • The DBMS checks if any ASSERTION is ever violated
Write a Comment
User Comments (0)
About PowerShow.com