Distributed Database Management Systems - PowerPoint PPT Presentation

Slides: 44
Provided by: roger266
Learn more at: http://web.sau.edu
Transcript and Presenter's Notes

Title: Distributed Database Management Systems


1
Chapter 10
  • Distributed Database Management Systems
  • Database Systems Design, Implementation, and
    Management, Fifth Edition, Rob and Coronel

2
In this chapter, you will learn
  • What a distributed database management system
    (DDBMS) is and what its components are
  • How database implementation is affected by
    different levels of data and process distribution
  • How transactions are managed in a distributed
    database environment
  • How database design is affected by the
    distributed database environment

3
Evolution of DDBMS
  • Distributed database management system (DDBMS)
      • Interconnected computer systems
      • Data and processing functions reside on
        multiple sites
  • 1970s: Centralized DBMS
      • Structured information requirements
      • Slow progression from needs to implementation
  • 1980s: Social and technical changes
      • Relational model ideal but a resource hog
      • Business and competitive pressures
      • Ad hoc capability required
      • Decentralized management structure common
  • 1990s: New forces
      • Internet and the World Wide Web used for data
        access and distribution
      • Data analysis through data mining and data
        warehousing

4
Centralized Databases
  • Performance degradation
  • High Costs of mainframe maintenance
  • Reliability
  • All your eggs in one basket

5
DDBMS Advantages
  • Data located near site with greatest demand
  • Faster data access
  • Faster data processing
  • Growth facilitation
  • Improved communications
  • Reduced operating costs
  • User-friendly interface
  • Less danger of single-point failure
  • Processor independence

6
DDBMS Disadvantages
  • Complexity of management and control
  • Applications must recognize data location
  • Synchronization issues
  • Redundancy/anomalies
  • Security
  • Lack of standards
  • No application level standards
  • Increased storage requirements
  • Greater difficulty in managing data environment
  • Increased training costs

7
Distributed Processing
  • Shares the database's logical processing among
    two or more physically independent sites
    connected through a network (for example, a
    file server)

Figure 10.1
8
Distributed Database
  • Stores a logically related database over two or
    more physically independent sites

Figure 10.2
9
Distributed Database vs. Distributed Processing
  • Distributed processing
      • Does not require a distributed database
      • May be based on a single database on a single
        computer
      • Copies or parts of database processing
        functions must be distributed to all data
        storage sites
  • Distributed database
      • Database fragments
      • Requires distributed processing
  • Both
      • Require a network to connect components

10
Functions of DDBMS
  • Application/end user interface
  • Validation to analyze data requests
  • Transformation to determine request components
  • Query optimization to find the best access
    strategy
  • Mapping to determine the data location
  • I/O interface to read or write data
  • Formatting to prepare the data for presentation
  • Security to provide data privacy
  • Backup and recovery
  • DB Administration
  • Concurrency Control
  • Transaction Management

11
Centralized Database
Figure 10.3
12
Fully Distributed Database Management System
Must perform all the functions of a centralized
DBMS as well as stitch together the data from
many sources transparently
Figure 10.4
13
DDBMS Components
  • Computer workstations
  • Network hardware and software components
  • Communications media
  • Transaction processor (TP)
  • Receives and processes application requests for
    data
  • Forwards requests to the Data Processor(s)
  • Also called application processor (AP) or
    transaction manager (TM)
  • Data processor (DP)
  • Stores and retrieves data located at a site
  • Also called data manager (DM)

14
Distributed Database Components
Figure 10.5
15
DDBMS Protocols
  • Interface with network to transport data and
    commands between DPs and TPs
  • Synchronize data received from DPs and route to
    appropriate TPs
  • Ensure common database functions
  • Security
  • Concurrency control
  • Backup and recovery

16
Levels of Data and Process Distribution
  • Database systems can be classified based on
    process distribution and data distribution

Table 10.1
17
Single-Site Processing, Single-Site Data (SPSD)
  • All processing on single CPU or host computer
  • All data are stored on host computer disk
  • DBMS located on the host computer
  • DBMS accessed by dumb terminals
  • Typical of mainframe and minicomputer DBMSs
  • Typical of the first generation of single-user
    microcomputer databases

18
Single-Site Processing, Single-Site Data (cont.)
Figure 10.6
19
Multiple-Site Processing, Single-Site Data (MPSD)
  • TP acts as a redirector
  • Requires network file server
  • Applications accessed through LAN
  • Variation known as client/server architecture

Figure 10.7
20
Multiple-Site Processing, Multiple-Site Data
(MPMD)
  • Fully distributed DDBMS with support for multiple
    DPs and TPs at multiple sites
  • Homogeneous
  • Integrate one type of centralized DBMS over the
    network
  • Heterogeneous
  • Same DBMS
  • Fully Heterogeneous
  • Integrate different types of centralized DBMSs
    over a network
  • Different OSs
  • Different hardware platforms

21
Heterogeneous Distributed Database Scenario
Figure 10.8
22
Fully Heterogeneous
  • Not there yet
  • Those that claim to be impose restrictions
      • Read-only remote access
      • Limit on the number of remote tables per
        transaction
      • Limit on the number of databases accessed
      • Restrictions on the database models supported
          • Hierarchical
          • Network
          • Relational

23
Distributed DB Transparency
  • Allows end users to feel like the database's only
    user
  • Hides complexities of distributed database
  • Transparency features
  • Distribution
  • Transaction
  • Failure
  • Performance
  • Heterogeneity

24
Distribution Transparency
  • Allows management of a physically dispersed
    database as though it were centralized
  • Three Levels
  • Fragmentation transparency
  • Location transparency
  • Local mapping transparency

Table 10.2
25
Distribution Transparency
  • Example: Employee data (EMPLOYEE) are distributed
    over three locations: New York, Atlanta, and
    Miami. Depending on the level of distribution
    transparency support, three different cases of
    queries are possible.

Distributed DBMS: EMPLOYEE table fragments

Fragment    Location
E1          New York
E2          Atlanta
E3          Miami

26
Distribution Transparency
  • When a DBMS supports fragmentation transparency,
    the user views a single logical database
  • SELECT * FROM EMPLOYEE WHERE SALARY > 50000;

27
Distribution Transparency
  • When the DBMS supports location transparency, the
    user needs to know the fragment names but need
    not know the actual location of the fragments
  • SELECT * FROM E1 WHERE SALARY > 50000
    UNION
    SELECT * FROM E2 WHERE SALARY > 50000
    UNION
    SELECT * FROM E3 WHERE SALARY > 50000;

28
Distribution Transparency
  • When the DBMS supports local mapping
    transparency, the user needs to know the fragment
    names as well as the actual location of the
    fragments
  • SELECT * FROM E1 NODE NY WHERE SALARY > 50000
    UNION
    SELECT * FROM E2 NODE ATL WHERE SALARY > 50000
    UNION
    SELECT * FROM E3 NODE MIA WHERE SALARY > 50000;

29
Distribution Transparency
  • Distribution transparency is supported by a
    distributed data dictionary which captures the
    distributed global schema.
  • A local transaction processor uses this global
    schema to translate user requests into subqueries
    (remote requests) that will be processed by
    different data processors.
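
The translation step described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular DDBMS implements it: the sites are in-memory lists, the employee rows are made-up sample data, and the names `SITES`, `DATA_DICTIONARY`, `run_subquery`, and `query_global` are hypothetical. Only the fragment-to-site mapping (E1, E2, E3 at New York, Atlanta, Miami) comes from the slides.

```python
# Made-up sample rows standing in for the data held by each site's DP.
SITES = {
    "NY": [{"name": "Ann", "salary": 62000}, {"name": "Bob", "salary": 41000}],
    "ATL": [{"name": "Cy", "salary": 55000}],
    "MIA": [{"name": "Di", "salary": 48000}, {"name": "Ed", "salary": 70000}],
}

# Distributed data dictionary: global table name -> (fragment, site) pairs.
DATA_DICTIONARY = {
    "EMPLOYEE": [("E1", "NY"), ("E2", "ATL"), ("E3", "MIA")],
}


def run_subquery(site, predicate):
    """Simulate a DP evaluating one subquery against its local fragment."""
    return [row for row in SITES[site] if predicate(row)]


def query_global(table, predicate):
    """Simulate a TP: look up the fragments of the global table, send one
    subquery per fragment, and union the partial results."""
    result = []
    for _fragment, site in DATA_DICTIONARY[table]:
        result.extend(run_subquery(site, predicate))
    return result


# The caller names only EMPLOYEE, as under fragmentation transparency.
high_paid = query_global("EMPLOYEE", lambda row: row["salary"] > 50000)
print([row["name"] for row in high_paid])
```

The caller never mentions E1, E2, E3 or a node name; hiding those details in the dictionary lookup is what separates fragmentation transparency from the two weaker levels.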

30
Transaction Transparency
  • Ensures transactions maintain integrity and
    consistency
  • Completed only if all involved database sites
    complete their part of the transaction
  • Management mechanisms
  • Remote request
  • Remote transaction
  • Distributed transaction
  • Distributed request

31
Remote Request
Figure 10.10
Single request for data from a single DP
32
Remote Transaction
Figure 10.11
Multiple requests for data from a single DP
33
Distributed Transaction
Figure 10.12
A transaction can reference multiple DPs, but each
request is processed by a single DP
34
Distributed Requests
Each request can be processed by multiple DPs
Figure 10.13
35
Distributed Requests (cont.)
Figure 10.14
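
The four mechanisms differ only in how many requests a transaction issues and how many DPs each request touches. The sketch below encodes that distinction; the function `classify_transaction` and its input format (one set of DP names per request) are hypothetical choices made for illustration.

```python
def classify_transaction(requests):
    """Classify a transaction from the DPs each of its requests touches.

    `requests` is a list with one set of DP names per request
    (an assumed, illustrative representation).
    """
    if not requests or any(len(dps) == 0 for dps in requests):
        raise ValueError("need at least one request, each touching a DP")
    if any(len(dps) > 1 for dps in requests):
        # At least one single request spans several DPs.
        return "distributed request"
    sites = set().union(*requests)
    if len(sites) > 1:
        # Several DPs overall, but each request stays on one DP.
        return "distributed transaction"
    return "remote request" if len(requests) == 1 else "remote transaction"


print(classify_transaction([{"B"}]))
print(classify_transaction([{"B"}, {"B"}]))
print(classify_transaction([{"B"}, {"C"}]))
print(classify_transaction([{"B", "C"}]))
```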
36
Distributed Concurrency Control
  • Multisite, multiple-process operations more
    likely to create data inconsistencies and
    deadlocked transactions
  • Problems
  • Transaction is committed by a local DP
  • Another DP cannot commit the transaction's result
  • Yields an inconsistent database

37
Two-Phase Commit Protocol
  • DO-UNDO-REDO protocol
  • Write-ahead protocol
  • Two kinds of nodes
      • Coordinator
      • Subordinates
  • Phases
      • Preparation
          • Coordinator sends message to all
            subordinates
          • Confirms all are ready to commit or abort
      • Final commit
          • Ensures all subordinates have committed or
            aborted
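
The two phases can be sketched as a small in-process simulation. This is a simplified illustration, not a production protocol: it assumes reliable messaging and no crashes, so it omits timeouts, recovery, and the coordinator's own log. The class `Subordinate`, its `can_commit` flag, and the function `two_phase_commit` are hypothetical names.

```python
class Subordinate:
    """A participating site. `can_commit` is an illustrative switch that
    decides how this site votes in the preparation phase."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.log = []  # stands in for the write-ahead log
        self.state = "active"

    def prepare(self):
        # Write-ahead: record the vote before reporting it.
        self.log.append("ready" if self.can_commit else "refuse")
        return self.can_commit

    def commit(self):
        self.log.append("commit")
        self.state = "committed"

    def abort(self):
        self.log.append("abort")
        self.state = "aborted"


def two_phase_commit(subordinates):
    """Coordinator logic: collect votes, then apply one global decision."""
    # Phase 1 (preparation): ask every subordinate whether it can commit.
    votes = [sub.prepare() for sub in subordinates]
    decision = all(votes)
    # Phase 2 (final commit): every subordinate commits, or every one aborts.
    for sub in subordinates:
        if decision:
            sub.commit()
        else:
            sub.abort()
    return "committed" if decision else "aborted"


print(two_phase_commit([Subordinate("NY"), Subordinate("ATL")]))
print(two_phase_commit([Subordinate("NY"), Subordinate("MIA", can_commit=False)]))
```

A single refusing subordinate forces every site to abort, which is how the protocol avoids the inconsistent state in which one DP commits and another cannot.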

38
Performance Transparency and Query Optimization
  • Objective: minimize the total cost associated
    with the execution of a request
  • Main costs
      • Access time (I/O)
      • Communication
      • CPU time
  • Basis for query optimization algorithms
      • Optimum execution order
      • Sites accessed to minimize communication costs
  • Dynamic or static optimization
  • Statistically based vs. rule-based query
    optimization algorithms
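
A minimal sketch of the cost objective: each candidate plan gets an estimated access, communication, and CPU cost, and the optimizer picks the plan with the lowest total. The `Plan` class, the `cheapest_plan` function, and the cost figures are illustrative assumptions in arbitrary units; a real optimizer derives its estimates from database statistics or rules and weighs far more alternatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One candidate way to execute a request (hypothetical structure)."""
    site: str
    access_cost: int         # I/O cost of reading the data
    communication_cost: int  # cost of shipping data between sites
    cpu_cost: int            # processing cost

    @property
    def total_cost(self):
        return self.access_cost + self.communication_cost + self.cpu_cost


def cheapest_plan(plans):
    """Pick the plan with the lowest total estimated cost."""
    return min(plans, key=lambda plan: plan.total_cost)


# Made-up estimates for a fragment replicated at two sites.
candidates = [
    Plan(site="NY", access_cost=40, communication_cost=5, cpu_cost=10),
    Plan(site="MIA", access_cost=25, communication_cost=60, cpu_cost=10),
]
print(cheapest_plan(candidates).site)
```

With these figures the local replica wins even though its access cost is higher, because communication dominates the remote plan's total.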

39
Distributed Database Design
  • Data fragmentation: partition database into
    fragments
      • Horizontal
      • Vertical
      • Mixed
  • Data replication: which fragments to replicate
      • Storage of data copies at multiple sites
      • Fully, partially, unreplicated databases
  • Data allocation
      • Where to locate data
      • Centralized, partitioned, replicated
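
Horizontal and vertical fragmentation can be illustrated on a tiny in-memory table. In this sketch the rows are made-up sample data and the helper names (`horizontal_fragment`, `vertical_fragment`, `join_on_key`) are hypothetical. Horizontal fragments are subsets of rows, vertical fragments are subsets of columns that each keep the key, and a mixed strategy applies both.

```python
# Made-up sample rows for illustration.
EMPLOYEE = [
    {"emp_id": 1, "name": "Ann", "city": "New York", "salary": 62000},
    {"emp_id": 2, "name": "Bob", "city": "Atlanta", "salary": 41000},
    {"emp_id": 3, "name": "Cy", "city": "Miami", "salary": 55000},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate."""
    return [dict(row) for row in rows if predicate(row)]


def vertical_fragment(rows, key, columns):
    """Keep the key plus the chosen columns of every row."""
    return [{col: row[col] for col in (key, *columns)} for row in rows]


def join_on_key(left, right, key):
    """Rebuild full rows from two vertical fragments sharing a key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal: one fragment per city, as in the E1/E2/E3 example.
e1 = horizontal_fragment(EMPLOYEE, lambda r: r["city"] == "New York")
e2 = horizontal_fragment(EMPLOYEE, lambda r: r["city"] == "Atlanta")
e3 = horizontal_fragment(EMPLOYEE, lambda r: r["city"] == "Miami")

# Vertical: personal data and payroll data, both keeping emp_id.
personal = vertical_fragment(EMPLOYEE, "emp_id", ["name", "city"])
payroll = vertical_fragment(EMPLOYEE, "emp_id", ["salary"])

print(e1 + e2 + e3 == EMPLOYEE)
print(join_on_key(personal, payroll, "emp_id") == EMPLOYEE)
```

The two checks at the end show the reconstruction rules: a union rebuilds the table from horizontal fragments, and a join on the key rebuilds it from vertical fragments.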

40
Client/Server Advantages Over DDBMS
  • Client/server less expensive
  • Client/server solutions allow use of the
    microcomputer's GUI
  • More people with PC skills than mainframe skills
  • PC is well established in workplace
  • Numerous data analysis and query tools exist
  • Considerable cost advantages to off-loading
    application development

41
Client/Server Disadvantages
  • Creates more complex environment with different
    platforms
  • Increased number of users and sites creates
    security problems
  • Training issues become more complex and expensive

42
Date's 12 Commandments for Distributed Databases
  • 1. Local Site Independence
  • 2. Central Site Independence
  • 3. Failure Independence
  • 4. Location Transparency
  • 5. Fragmentation Transparency
  • 6. Replication Transparency

43
Date's 12 Commandments for Distributed Databases (cont.)
  • 7. Distributed Query Processing
  • 8. Distributed Transaction Processing
  • 9. Hardware Independence
  • 10. Operating System Independence
  • 11. Network Independence
  • 12. Database Independence