CIS 5163 APPLIED DATABASE MANAGEMENT Section 1 Lecture 1 Database Principles within The Enterprise F - PowerPoint PPT Presentation

1 / 58
About This Presentation
Title:

CIS 5163 APPLIED DATABASE MANAGEMENT Section 1 Lecture 1 Database Principles within The Enterprise F

Description:

Federal Express, Wal-Mart, Knight-Ridder (Dialog) ... People. Roles in the Database Environment. Data Administrator (DA) Database Administrator (DBA) ... – PowerPoint PPT presentation

Number of Views:61
Avg rating:3.0/5.0
Slides: 59
Provided by: drr108
Category:

less

Transcript and Presenter's Notes

Title: CIS 5163 APPLIED DATABASE MANAGEMENT Section 1 Lecture 1 Database Principles within The Enterprise F


1
CIS 5163 APPLIED DATABASE MANAGEMENT Section 1
Lecture 1Database Principles withinThe
Enterprise Framework
  • Dr. Raja K. Iyer
  • iyer_at_tarleton.edu
  • (254) 968-9341

2
Course Presentation Sections Outlines
  • DATABASE PRINCIPLES WITHIN THE ENTERPRISE
    FRAMEWORK
  • Business Process Engineering for Database
  • Transactions Processing versus Analytical
    Processing
  • Workflow Models Versus Data Models
  • Data Dictionaries and Applications CRUD
    Linkages
  • DATABASE ANALYSIS AND DESIGN PRINCIPLES
  • Logical versus Physical Schema
  • DBMS and Relational DBMS
  • Object-oriented DBMS
  • Choosing the Appropriate DBMS Environment for
    enterprises
  • 4th Generation Programming / Web-based DB Designs
  • DATABASE DESIGN AND IMPLEMENTATION CONSIDERATIONS
  • Database Implementation in Legacy / Mainframe
    Environments
  • Database Implementation in Client-Server
    Environments
  • Database Implementation in Networked / Web
    Environments
  • Database Implementation Open Source Future of
    DBMS?
  • Database Integrity, Security, and Recovery
    Administration
  • DATABASE PROJECT
  • Semester project based on a Case Study (to be
    selected)

3
Section 1Database Principles withinThe
Enterprise Framework
  • Business Process Engineering for Database
  • Transactions Processing versus Analytical
    Processing
  • Workflow Models Versus Data Models
  • Data Dictionaries and Applications CRUD Linkages

4
Business Process Re-Engineering for Database
Management Systems
  • Introduction to Workflow and BPR
  • Drivers for BPM 11 Money-Relevant Reasons to
    Start
  • Always View Database in terms of Workflow!
  • (For additional information, visit the Workflow
    Management Coalition Website at www.wfmc.com)

5
Multiple Layers of Workflow Automation
6
Workflow Automation Model from WfMC
Online Auction Process Analysis, Modeling
Definition Tools
Business Process Analysis, Modeling Definition
Tools
Build Time
Auction Process Definitions
Process Definition
Run Time
Auction Workflow Enhancement Service
Workflow Enhancement Service
Auction Applications IT Tools
Applications IT Tools
Source Online Auction Workflow Model Mapped on
to Workflow Reference Model
Source Workflow Reference Model from Workflow
Management (WfM) Coalition)
7
An Example of Workflow AutomationSales Workflow
Process Automation
Customer Quotes Management (CQM)
Customer Opportunities Management (COM)
Customer Contacts Management (CCM)
Sales Order Management (SOM)
Front Office
Back Office
Sales Commission Management (SCM)
Sales Shipping Management (SSM)
Accounts Receivables Management (ARM)
Sales Invoice Management (SIM)
8
An Example of Workflow AutomationHospital
Workflow Process Automation
  • Role of Database in Hospital Workflow
  • Process Automation
  • Transactions
  • Records Accessibility
  • Security

Reference Yahoo! News Hospitals Move Toward
Paperless Age
9
Workflow Automation and BPR The Role of the
Database and DBMS
  • Database should support workflow through process
    automation with transactions against the database
  • DBMS technologies, such as normalization
    ensures integrity and consistency of data used in
    business process-oriented transactions
  • DBMS guarantees data independence from business
    functions and business processes and workflow
    transactions
  • Security and recoverability

10
What is a Database?
  • Collection of related data items that are being
    stored for record-keeping analysis
  • Could be stored on cards in Rolodex, file
    cabinet, computer,
  • Computerized databases are managed by a Database
    Management System (DBMS)
  • Efficient, safe storage of large amounts of data
  • High-level language for specifying operations
    user wishes to perform on data
  • Shields users from knowing details about how data
    is stored
  • Users View of Data versus how data is stored
  • Conceptual Schema versus Physical Schema

11
Importance of DBMS
  • Amount of electronically available data is
    exploding
  • Cost of storage is continuously dropping
  • Moores law every 18 months, speed of processor
    and/or capacity of disk doubles or price goes
    down by half
  • Value of data as an organizational asset is
    widely accepted
  • High demand in industry for powerful, flexible
    data management systems to store data efficiently
    and get the most out of their large, complex data
    sets
  • e.g., data warehousing, data mining
  • Largest databases
  • Federal Express, Wal-Mart, Knight-Ridder
    (Dialog),
  • Tables with 1 billion or more rows
  • Approaching 10s of TB of data
  • Think of the consequences of storing this much
    data ?

12
Brief History of Data Management
  • Early DBMSs (late 1960s) evolved from file-based
    processing systems
  • Need for supporting concurrent access to the data
    by many users, recovery, back-up,
  • Roots in airline reservation systems (SABRE),
    banking systems, corporate record-storage systems
  • Visualize the data much as it was stored
  • Tree-based (hierarchical model)
  • Graph-based (network model)
  • Cumbersome to use, require programming to access
    data

DEPTS
Course
EMPS
MGR
ITEMS
Student
Faculty
Pre-Req
NAME
SS
NAME
ID
13
Tracing the Roots of Modern DBMS
  • Early 1970s Ted Codd invented new data model
    (relational data model) and the concept of data
    abstraction
  • Soon thereafter, team of IBMers invented SQL
    (Structured Query Language)
  • Became de-facto standard for query languages
    based on the relational data model
  • Commercial DBMS based on relational model are now
    widely accepted in industry
  • e.g., Microsoft Access, Oracle 9i, Sybase
    Adaptive Server,
  • gt10 billion dollar industry!

14
Examples of Database Applications
  • Purchases from the supermarket
  • Purchases using your credit card
  • Booking a holiday at the travel agents
  • Using the local library
  • Taking out insurance
  • Using the Internet
  • Studying at university

15
File-Based Systems
  • Collection of application programs that perform
    services for the end users (e.g. reports).
  • Each program defines and manages its own data.

16
File-Based Processing
17
Limitations of File-Based Approach
  • Separation and isolation of data
  • Each program maintains its own set of data.
  • Users of one program may be unaware of
    potentially useful data held by other programs.
  • Duplication of data
  • Same data is held by different programs.
  • Wasted space and potentially different values
    and/or different formats for the same item.

18
Limitations of File-Based Approach
  • Data dependence
  • File structure is defined in the program code.
  • Incompatible file formats
  • Programs are written in different languages, and
    so cannot easily access each others files.
  • Fixed Queries/Proliferation of application
    programs
  • Programs are written to satisfy particular
    functions.
  • Any new requirement needs a new program.

19
Database Approach
  • Database Approach essential because
  • Definition of data was embedded in application
    programs, rather than being stored separately and
    independently.
  • No control over access and manipulation of data
    beyond that imposed by application programs.
  • Result
  • Database and Database Management System (DBMS).

20
Database
  • Shared collection of logically related data (and
    a description of this data), designed to meet the
    information needs of an organization.
  • System catalog (metadata) provides description of
    data to enable programdata independence.
  • Logically related data comprises entities,
    attributes, and relationships of an
    organizations information.

21
Database Management System (DBMS)
  • A software system that enables users to define,
    create, and maintain the database and that
    provides controlled access to this database.

22
Database Approach
  • Data definition language (DDL).
  • Permits specification of data types, structures
    and any data constraints.
  • All specifications are stored in the database.
  • Data manipulation language (DML).
  • General enquiry facility (query language) of the
    data.
  • Controlled access to database may include
  • A security system.
  • An integrity system.
  • A concurrency control system.
  • A recovery control system.
  • A user-accessible catalogue.
  • A view mechanism.
  • Provides users with only the data they want or
    need to use.

23
Concept of Views in Database
  • Allows each user to have his or her own view of
    the database.
  • A view is essentially some subset of the
    database.
  • Benefits of Views include
  • Reduce complexity
  • Provide a level of security
  • Provide a mechanism to customize the appearance
    of the database
  • Present a consistent, unchanging picture of the
    structure of the database, even if the underlying
    database is changed.

24
Components of DBMS Environment
  • Hardware
  • Can range from a PC to a network of computers.
  • Software
  • DBMS, operating system, network software (if
    necessary) and also the application programs.
  • Data
  • Used by the organization and a description of
    this data called the schema.
  • Procedures
  • Instructions and rules that should be applied to
    the design and use of the database and DBMS.
  • People

25
Roles in the Database Environment
  • Data Administrator (DA)
  • Database Administrator (DBA)
  • Database Designers (Logical and Physical)
  • Application Programmers
  • End Users (naive and sophisticated)

26
History of Database Systems
  • First-generation
  • Hierarchical and Network
  • Second generation
  • Relational
  • Third generation
  • Object Relational
  • Object-Oriented

27
Advantages of DBMSs
  • Control of data redundancy
  • Data consistency
  • More information from the same amount of data
  • Sharing of data
  • Improved data integrity
  • Improved security
  • Enforcement of standards
  • Economy of scale

28
Advantages of DBMSs
  • Balanced conflicting requirements
  • Improved data accessibility and responsiveness
  • Increased productivity
  • Improved maintenance through data independence
  • Increased concurrency
  • Improved backup and recovery services

29
Disadvantages of DBMSs
  • Complexity
  • Size
  • Cost of DBMS
  • Additional hardware costs
  • Cost of conversion
  • Performance
  • Higher impact of a failure

30
Transactions Processing Vs. Analytical Processing
31
Transactions Processing Systems (TPS)
  • What are Transactions and How do they differ from
    data/information?
  • Informally, a transaction is a change to a
    database in real-time
  • Program that maintains correspondence between the
    state of the database and the real-world
    application it is modeling
  • e.g., deposit transaction at an ATM, hotel
    check-in,
  • Transactions are executed by Transaction
    Processing System (TPS) consisting of
  • Set of application programs containing
    transactions
  • Transaction middleware (TP monitor) which
    monitors transactions and controls access to
    DBMS(s)
  • One or more DBMSs for maintaining persistence
  • Understanding and building (simple) TPS is focus
    of this course

32
Transactions vs. Programs
  • Recall, transaction is a program that interacts
    with database
  • Goal Maintain correspondence between database
    and real-world application that is modeled by
    database
  • Example deposit transaction at a bank
  • Customer gives teller cash and deposit slip
  • Transaction updates customers account
    information in database to reflect deposit
  • Transaction must satisfy certain requirements
    which go beyond what is expected of a
    non-transactional program

33
Transactional Requirements
  • Maintain database consistency
  • Database restrictions stored as integrity
    constraints
  • Burden of the user/programmer to assure that
    transaction preserves all such constraints
  • Guarantee that transaction is executed as a whole
    or not at all (atomicity guarantee)
  • e.g., either deposit whole amount or no money at
    all
  • Guarantee that no information is lost
    (durability)
  • For multiple transactions running concurrently,
    guarantee that transactions do not interfere with
    each other (isolation guarantee)
  • The effect of multiple, concurrent transactions
    on database should be the same as that of a
    serial execution of the transactions why?
  • Atomicity, durability, and isolation are
    guaranteed by transaction subsystem
  • More on transactions later in semester

34
Structure of a Transaction Processing System
35
Challenges of Transaction Processing Systems
  • Applications that involve long transactions with
    multiple database accesses Use information in
    database to guide decision making
  • Applications that involve complex transactions
    that involve multiple distributed databases

FLAT TRANSACTION MODEL Begin transaction(
) S Commit( )
  • Structure for TPS
  • Savepoints Rollbacks
  • Begin_transaction()
  • S1
  • Sp1 create_savepoint()
  • .
  • If (condition)
  • rollback sp1
  • Commit( )
  • Multiple or Federated DBs

Note S indicates a set of SQL statements.
36
Decision Support Systems
  • Second application domain (besides transaction
    processing) in which databases play major role is
    decision-support
  • Use information in database to guide decision
    making
  • e.g., analyze the purchasing behavior of
    supermarket customers to make decisions about
    when and how to re-stock inventory
  • a.k.a. online analytical processing (OLAP) ?
    online transaction processing (OLTP)
  • Decision-support queries are more complex than
    transaction-oriented queries
  • OLTP In a particular store Reduce the number
    of blue shirts in the inventory database by 10.
  • OLAP Compute the percent change in sales of
    blue shirts for all of our stores in the
    northeastern US between 1990 and 2000.

37
Decision-Support Contd
  • Databases for OLAP are different from OLTP
    systems
  • Usually larger (TBs vs. GBs)
  • Separated from OLTP systems to prevent
    interference (since OLTP systems have stringent
    real-time constraints)
  • Different organization of data to support complex
    queries and facilitate maintenance
  • Different users, hence different interfaces
  • A.k.a. data warehouses
  • Data mining as an additional application
  • We will not focus on data warehousing and OLAP

38
Characteristics of Modern Database Systems
  • Support for concurrent access to data
  • Safeguard data against accidentally loss
  • Maintain integrity of database in light of
    changes
  • Support for distributed data
  • Control access to data
  • More recently
  • Support for non-standard data
  • Support for heterogeneous data
  • Support for decision-support and analysis

39
Additional Requirements
  • Increase usage and new applications for databases
    have resulted in additional requirements (since
    early days)
  • High availability
  • High reliability
  • High throughput
  • Low response time
  • Extensible

40
Some Recent Trends
  • DBMS are getting smaller and smaller
  • DBMS that can store GB of data can run on PC
  • Databases are getting bigger and bigger
  • Multiple TBs (terabyte 1012 bytes) not uncommon
  • Databases also able to store images, video, audio
  • Database stored on secondary storage devices
  • Use of Tertiary Storage in TPS
  • Larger capacity disks but much slower response
    time (10-20 msec vs. several sec.)
  • Tape, CD, etc. usually involves robotic
    conveyance
  • DBMS Supporting Parallel Computing
  • Speed-up query processing through parallelism
    (e.g., read data from many disks)
  • However, need special algorithms to partition
    data correctly

41
Types of DBMS
  • General-purpose DBMS
  • Multimedia DBMS
  • Geographic information systems (GIS)
  • Data warehouse DBMS
  • Real-time DBMS
  • Active DBMS

42
Stakeholders/Participants in the Transaction
Processing System Arena
  • System Analyst
  • Database Designer
  • Application Programmer
  • Project Manager
  • Database Administrator
  • System Administrator
  • End Users
  • Naïve end users
  • Sophisticated end users

43
When NOT to Use a DBMS
  • Initial investment too high
  • Too much overhead
  • Application is simple, well-defined, not expected
    to change
  • Stringent real-time requirements (use specialized
    real-time DBMS)
  • Multi-user access to data is not required
  • Alternative collection of files managed by
    access programs

44
Basic Database Terminology
  • Database (DB)
  • Collection of related data that exists over a
    long period of time
  • Database Management System (DBMS)
  • Collection of programs that
  • allows users to create a new database and specify
    its structure
  • gives users the ability to query and modify the
    data efficiently
  • keeps the data secure from accidents or
    unauthorized use
  • controls the access to the data for many users at
    once
  • Database System (DBS)
  • The database and DBMS software together make up
    what is known as the Database System

45
Data Abstraction
  • Set of seemingly incompatible goals for DBMS
  • (1) Retrieve large amounts of data efficiently
    lead to the design of complex data structures
  • (2) Provide simple interactions with data for
    non-trained users
  • For various reasons, early data-intensive
    applications worked directly with physical data
    representation ? What are the drawbacks?
  • Modern DBMS hide the physical complexity from
    users through several levels of abstraction while
    providing simple and efficient access methods
  • 3-tier architecture

46
Schema
  • Description of how data is stored or represented
  • Physical schema describes the structure of
    database (i.e., data files and indexes) as it is
    stored on disk (e.g., sequence of byte addresses
    and offsets)
  • Conceptual schema describes data in terms of
    higher-level concepts (e.g., a Student is a
    record containing ID, Name, and Address fields,
    which are of type INTEGER, STRING and STRING
    respectively)

47
Three-Tier Architecture
48
Data Independence
  • Ability to modify a schema definition at one
    level without affecting a schema definition in
    the next higher level
  • Physical data independence
  • Ability to modify physical schema without causing
    the conceptual (or external) schema or
    application programs to be re-written
  • Logical data independence
  • Ability to modify the conceptual schema without
    having to change the external schemas or
    application programs
  • Why is data independence so important?

49
Data Model
  • A Data Model (DM) is a collection of conceptual
    tools and languages for describing data,
    relationships, semantics, and consistency
    constraints (schema)
  • Distinguish between high-level (conceptual) data
    models and low-level (physical) data models
  • Conceptual Data Models (external and conceptual
    level)
  • Object-Based Logical Models Entity-Relationship
    Model, Object-Oriented Model
  • Record-Based Logical Models Relational Model,
    Network Model (CODASYL), Hierarchical Model (IMS)
  • Physical Data Models (internal)

50
Schema vs. Data
  • Recall, database schema describes the structure
    of the database
  • Changed infrequently
  • a.k.a. database intension
  • a.k.a. metadata ( data about data)
  • Database State refers to the data in the database
    at any given moment (snapshot)
  • Changes frequently
  • a.k.a. database extension
  • BTW, DBMS assures that all database states are
    valid states
  • Database Instance refers to a specific data item
    in the database

51
DBMS Languages
  • Data Definition Language (DDL)
  • Used to define the conceptual and internal
    schemas
  • Includes constraint definition language (CDL) for
    describing conditions that database instances
    must satisfy
  • Includes storage definition language (SDL) to
    influence layout of physical schema (some DBMSs)
  • Data Manipulation Language (DML)
  • Used to describe operations on the instances of a
    database
  • Procedural DML (how) vs. declarative DML (what)
  • e.g., Relational Algebra e.g., SQL
  • Note, SQL includes a DML and a DDL in one!
  • Host Language
  • General-purpose programming language which lets
    users embed DML commands (data sublanguage) into
    their code

52
Architecture of a DBMS
Schema Modifications
Modifications
Queries
Database System
Query Processor
Transaction Subsystem
DBMS Software
Storage Manager
Data
Data Definition (Metadata)
53
Component Overview
  • Data storage (incl. metadata)
  • e.g., names of relations, attributes, data types,
    etc.
  • Often, DBMS maintains an index
  • Helps us find data items quickly given part of
    their value how?
  • Storage manager
  • Handles requests from levels above, retrieves
    data from store and returns it in format
    requested by queries
  • Query processor
  • Processes not only queries but also requests for
    modifications, etc.
  • Figures out best way to retrieve data
  • Transaction subsystem
  • Handles concurrent transactions against database
    (see slides 9 through 12 for more details on
    transactions)
  • Three types of input at top

54
How Are Indexes Implemented?
  • Hash tables used extensively by early systems
  • Today B-tree, B for balanced tree
  • Generalization of a balanced binary search tree
  • Optimized for organizing data on disk, each node
    of B-Tree occupies 1 full disk block
  • Each node in B-Tree has large number of children
  • Hundreds of pointers to children in one block
  • Search of B-Tree rarely requires traversal of
    more than three levels!!
  • Cost of disk op is proportional to number of disk
    blocks accessed
  • 3 block accesses much more efficient than search
    of binary tree with many different levels
  • Lesson most appropriate data structure for data
    stored on disk is different from the data
    structures used for algorithms that run in memory

55
Next Lectures
  • Start with database design process, followed by
    study of the relational data model and its query
    languages
  • Goal of course project is to develop an
    application that supports transactions and uses
    database for persistence (transaction processing
    system)
  • Brief review of methodology used for project
    development next

56
CIS 5163 Course Project
  • Goal of course project is to develop an
    application that supports transactions and uses
    database for persistence (transaction processing
    system)
  • Implementation of transaction processing system
    is significant undertaking
  • Numerous software engineering practices to guide
    developer
  • Steps for designing and implementing course
    project
  • For the selected application domain, state
    objectives clearly
  • Expand objectives into formal requirements
    document
  • Design database schema
  • Implement and test database
  • Develop application and integrate with database
  • Test integrated system, fine-tune database if
    necessary
  • Produce documentation and deliver project
  • Demonstrate finished system to course staff

57
CIS 5163 Course ProjectRequirements Document (RD)
  • Describes in some detail what the system is going
    to do, but NOT how it will do it
  • Specifically, RD describes
  • Functional requirements for the application
  • Set of interactions with users
  • Information/data that must be contained in the
    system
  • Constraints the at must be satisfied by the
    underlying database
  • Final set of deliverables

58
ADDITIONAL READINGS OF INTEREST
  • Business Event Methodology
  • Introduction to Workflow
  • Workflow - An Introduction
  • Modern Business Strategies and Process Support
  • Process Knowledge
  • Tech for Virtual Enterprise
  • Workflow and Process Management
  • Yahoo! News - Hospitals Move Toward Paperless Age
Write a Comment
User Comments (0)
About PowerShow.com