L02: DDBMS Architecture - PowerPoint PPT Presentation

1 / 54
About This Presentation
Title:

L02: DDBMS Architecture

Description:

Architecture defines a system's structure with. Componets. Functions of ... full fledged local ... Full-fledged DBMS at each site. A layer runs on top of ... – PowerPoint PPT presentation

Number of Views:311
Avg rating:3.0/5.0
Slides: 55
Provided by: Lu47
Category:

less

Transcript and Presenter's Notes

Title: L02: DDBMS Architecture


1
L02 DDBMS Architecture
  • DDBMS Architecture
  • Architectural Models of DDBMS
  • More on Transparency

2
DDBMS Architecture
  • Architecture defines a systems structure with
  • Componets
  • Functions of components, and
  • Their interactions
  • Purpose of reference architecture
  • A framework for discussion
  • Standardiztion

3
Approaches of Reference Models
  • Based on components
  • modularity - unclear functionality
  • Based on functions (e.g. ISO/OSI model)
  • clear interfaces - unclear complexity
  • Based on data (e.g. ANSI/SPARC model)
  • data-centric - unclear functionality
  • Need an integration of the three approaches

4
ANSI/SPARC Architecture
Users
External Schema
External View
External View
External View
Conceptual Schema
Conceptual View
Internal Schema
Internal View
5
Conceptual Schema Definition (1)
  • RELATION PAY
  • KEY TITLE
  • ATTIRBUTES
  • TITLE CHAR (10)
  • SAL NUMERIC(6)
  • RELATION EMP
  • KEY ENO
  • ATTIBUTES
  • ENO CHAR(9)
  • ENAME CHAR(15)
  • TITLE CHAR(10)

6
Conceptual Schema Definition (2)
  • RELATION ASG
  • KEY ENO, PNO
  • ATTIRBUTES
  • ENO CHAR(9)
  • PNO CHAR(7)
  • RESP CHAR (10)
  • DUR NUMERIC(3)
  • RELATION PROJ
  • KEY PNO
  • ATTIBUTES
  • PNO CHAR(7)
  • PNAME CHAR(20)
  • BUDGET NUMERIC(7)
  • LOC CHAR(15)

7
Internal Schema Definition
  • RELATION EMP
  • KEY ENO
  • ATTIBUTES
  • ENO CHAR(9)
  • ENAME CHAR(15)
  • TITLE CHAR(10)
  • INTERNA_REL E
  • INDEX ON E
  • FIELD
  • E BYTE(9)
  • ENAME BYTE(15)
  • TIT BYTE(10)

8
External View Definition
  • Create a BUDGET view from PROJ
  • CREATE VIEW BUDGET (PNAME,BUD)
  • AS SELECT PNAME, BUDGET
  • FROM PROJ
  • Create a PAYROLL view from EMP and PAY
  • CREATE VIEW PAYROLL ( EMP_NO, EMP_NAME, SAL)
  • AS SELECT EMP.ENO, EMP.ENAME, PAY.SAL
  • FROM EMP, PAY
  • WHERE EMP.TITLE PAY.TITLE

9
The Classical DDBMS Architecture
Global Schema
Site Independent Schemas
Fragmentation Schema
Allocation Schema
Local Mapping Schema
Local Mapping Schema
Other sites
DBMS 1
DBMS 2
LOCAL DB 2
Site 1
Site 2
10
DDBMS Schemas
  • Global Schema a set of global relations as if
    database were not distributed at all
  • Fragmentation Schema global relation is split
    into non-overlapping (logical) fragments. 1n
    mapping from relation R to fragments Ri.
  • Allocation Schema 11 or 1n (redundant) mapping
    from fragments to sites. All fragments
    corresponding to the same relation R at a site j
    constitute the physical image Rj. A copy of a
    fragment is denoted by Rji.
  • Local Mapping Schema a mapping from physical
    images to physical objects, which are manipulated
    by local DBMSs.

11
Global Relations, Fragments and Physical Images
12
Motivation for this Architecture
  • Separating the concept of data fragmentation from
    the concept of data allocation
  • fragmentation transparency
  • location transparency
  • Explicit control of redundancy
  • Independence from local databases allows local
    mapping transparency

13
Some Aspects of the Classical Architecture
  • Distributed database technology is an add-on
    technology, most users already have populated
    centralized DBMSs. Whereas top down design
    assumes implementation of new DDBMS from scratch.
  • In many application environments, such as
    semi-structured databases, continuous multimedia
    data, the notion of fragment is difficult to
    define.
  • Current relational DBMS products provide for some
    form of location transparency (such as, by using
    nicknames).

14
L02 DDBMS Architecture
  • DDBMS Architecture
  • Architectural Models of DDBMS
  • More on Transparency

15
Bottom Up Architecture - Present Future
  • Possible ways in which multiple databases may be
    put together for sharing by multiple DBMSs, which
    are characterized according to
  • Autonomy (A) degree to which individual DBMSs
    can operate independently.
  • 0 - Tightly coupled - integrated (A0),
  • 1 - Semiautonomous - federated (A1),
  • 2 - Total Isolation - multidatabase systems(A2)
  • Distribution (D)
  • 0 - no distribution - single site (D0),
  • 1 - client-server - distribution of DBMS
    functionality (D1),
  • 2 - full distribution P2P distributed
    architecture(D2)
  • Heterogeneity (H)
  • 0 - homogeneous (H0)
  • 1 - heterogeneous (H1)

16
Architectural Alternatives
Distribution
(A1, D2, H0)
(A0, D2, H0)
(A2, D2, H0)
(A2, D2, H1)
(A0, D2, H1)
(A2, D1, H0)
(A2, D1, H1)
(A0, D1, H1)
(A2, D0, H0)
(A0, D0, H0)
Autonomy
(A0, D0, H1)
(A2, D0, H1)
(A1, D0, H1)
Heterogeneity
17
Autonomy
  • Autonomy refers to the distribution of control,
    not of data. It indicates the degree to which
    individual DBMSs can operate independently
  • Exchanging information among sites?
  • Independently executing transactions?
  • Modifying the component systems?
  • Requirements of an autonomous system
  • The local operations of the individual DBMSs are
    not affected by their participation in the DDBS.
  • The individual DBMSs query processing and
    optimization should not be affected by the
    execution of global queries that access multiple
    databases
  • System consistency or operation should not be
    compromised when individual DBMSs join or leave
    the distributed database confederation.

18
Dimensions of Autonomy
  • Design Autonomy
  • Freedom for individual DBMSs to use data models
    and transaction management techniques they prefer
  • Communication Autonomy
  • Freedom for individual DBMSs to decide what
    information (data control) is to be exported
  • Execution Autonomy
  • Freedom for individual DBMSs to execute
    transactions submitted in any way that it wants
    to

19
Alternatives of Autonomy
  • A0 Tight integration
  • A single image of the entire database to any user
  • A1 Semiautonomous
  • Can operate independently
  • Have made some parts of their local data
    shareable
  • Need to be modified to enable exchanging
    information
  • A2 Total isolation
  • Stand alone DBMS at each site, no global control
  • Know nothing about other databases

20
Distribution
  • Deals with data, not of control
  • Physical distribution of data over sites
  • Two classes of distribution alternatives
  • Client/Server (different from network C/S)
  • Clients Apps, Server Data management
  • Peer-to-Peer (or full)
  • Each machine has full DBMS functionality
  • Main focus of the textbook

21
Heterogeneity
  • Various forms of heterogeneity
  • Hardware, networking protocol, data manager
  • Focus of this course
  • Data models, query languages, xact mgmt

22
DDBMSs Implementation Alternatives
P2P homogeneous DBMS
P2P homogeneous federated system
P2P homogeneous multidatabase system
P2P Heterogeneous Federated DBMS
D
P2P heterogeneous DBMS
P2P heterogeneous Multidatabase system
Logically integrated and homogeneous multiple
DBMSs
A
Heterogeneous Integrated DBMS
Multidatabase System
H
Single site homogeneous federated DBMS
Heterogeneous multidatabase system
Single Site heterogeneous federated
23
Taxonomy of Distributed Databases
  • Composite DBMSs -tight integration
  • single image of entire database is available to
    any user
  • can be single or multiple sites
  • can be homogeneous or heterogeneous
  • Federated DBMSs - semiautonomous
  • DBMSs that can operate independently, but have
    decided to make some parts of their local data
    shareable
  • can be single or multiple sites.
  • they need to be modified to enable them to
    exchange information
  • Multidatabase Systems - total isolation
  • individual systems are stand alone DBMSs, which
    know neither the existence of other databases or
    how to communicate with them
  • no global control over the execution of
    individual DBMSs.
  • can be single or multiple sites
  • homogeneous or heterogeneous

24
Client Server Architectures
  • Client/Server (Ax, D1, Hy)
  • Multiple clients, single server
  • Similar data management as in centralized db
  • Multiple clients, multiple servers
  • Clients connect to servers as needed, or
  • Clients connect to their home server only

25
Components of Client/Server System
26
DDBS Reference Architecture
ES1
ES2
ESn
External Schema
GCS
Global Conceptual Schema
LCS1
LCS2
LCSn
Local Conceptual Schema
LIS1
LIS2
LISn
Local Internal Schema
It is logically integrated. Provides for the
levels of transparency
27
Components of a DDBS Site
DATA PROCESSOR
USER PROCESSOR
Local Internal Schema
Local Conceptual Schema
log
External Schema
Global Schema
Runtime Support
Local Query Processor
Local Recovery Manager
28
DDBMS Components USER PROCESSOR
  • User interface handler interprets user commands
    and formats the result data as it is sent to the
    user
  • Semantic data controller checks the integrity
    constraints and authorization requirements
  • Global query optimizer and decomposer determines
    execution strategy, translates global queries to
    local queries, and generates strategy for
    distributed join operations
  • Global execution monitor (distributed transaction
    manager) coordinates the distributed execution of
    the user request

29
DDBMS Components DATA PROCESSOR
  • Local query processor selects the access path and
    is involved in local query optimization and join
    operations
  • Local recovery manager maintains local database
    consistency
  • Run-time support processor physically accesses
    the database. It is the interface to the OS and
    contains database buffer manager

30
MDBS Architecture
  • Heterogeneous MDBS-Unilingual
  • Requires users to utilize different data models
    and languages when both a local and global
    database need to be accessed
  • Applications accessing multiple databases should
    use external views defined on GCS
  • A single application can have a LES on LCS and
    GES on GCS
  • Heterogeneous MDBS-Multilingual
  • Users access global database by means of external
    schema defined using the language of user's local
    DBMS
  • Queries against global databases are made using
    language of local DBMS
  • Deal with translation of queries at runtime

31
MDBS Architecture (with Global Schema)
Global External Schema
Local External Schema
(A2, Dx, Hy) GCS generated bottom-up
  • The GCS is generated by integrating LES's or
    LCS's
  • The users of a local DBMS can maintain their
    autonomy
  • Design of GCS is bottom-up

32
MDBS with GCS
  • Uni-lingual Heterogeneous MDBS
  • Accessing multiple databases through GES on GCS
  • LES on LCS and GES on GCS co-exist
  • Multilingual Heterogeneous MDBS
  • GES defined using the language of local DBMS
  • Queries over GES in the language of local DBMS
  • Easier for users to query multiple databases
  • The system deals with translation of queries at
    runtime

33
MDBS without Global Conceptual Schema
Multidatabase Layer
Local Database System Layer
  • Local system layer consists of several DBMSs
    which present to multidatabase layer part of
    their databases
  • The shared database is either local conceptual
    schema or external schema (Not shown in the
    figure above)
  • External views on one or more LCSs.
  • Access to multiple databases through application
    programs

34
MDBS without Global Schema
  • Federated Database Systems
  • Do not use global conceptual schema
  • Each local DBMS defines export schema
  • Global database is a union of export schemas
  • Each application accesses global database through
    import schema (external view)
  • MDBSs components architecture
  • Existence of full fledged local DBMSs
  • MDBS is a layer on top of individual DBMSs that
    support access to different databases
  • The complexity of the layer depends on existence
    of GCS and heterogeneity

35
MDBS Components Model
  • Full-fledged DBMS at each site
  • A layer runs on top of individual DBMSs
  • The complexity of the layer depends on
  • Existence of GCS and heterogeneity

36
Components of an MDBMS
User
User
User
User Requests
System Responses
Multi-DBMS Layer
Transaction Manager
Transaction Manager
User Interface
User Interface
Query Processor
Query Processor
Scheduler
Scheduler
Recovery Manager
Recovery Manager
Query Optimizer
Query Optimizer
Runtime Sup. Processor
Runtime Sup. Processor
Database
Database
37
Global Directory
  • Directory is itself a database that contains
    meat-data about the actual data stored in the
    database. It includes the support for
    fragmentation transparency for the classical
    DDBMS architecture.
  • Directory can be local or distributed.
  • Directory can be replicated and/or partitioned.
  • Directory issues are very important for large
    multi-database applications, such as digital
    libraries.

38
L02 DDBMS Architecture
  • DDBMS Architecture based on data
  • Architectural Models
  • More on Transparency

39
More on Transparency
  • X transparency means the existence of X is not
    known to users.
  • Closely related to architecture issues.
  • The level of transparency is inevitably a
    compromise between ease of use and the difficulty
    and overhead cost of providing high levels of
    transparency
  • DDBMS provides location transparency and
    fragmentation transparency, OS provides network
    transparency, and DBMS provides data independence

40
On Distribution Transparency
  • Higher levels of distribution transparency
    require appropriate DDBMS support, but makes
    end-application developers work easy.
  • Less distribution transparency the more the
    end-application developer needs to know about
    fragmentation and allocation schemes, and how to
    maintain database consistency.
  • There are tough problems in query optimization
    and transaction management that need to be
    tackled (in terms of system support and
    implementation) before fragmentation transparency
    can be supported.

41
Levels of Distribution Transparency
  • Fragmentation Transparency
  • Just like using global relations.
  • Location Transparency
  • Need to know fragmentation schema but no need
    not know where fragments are located
  • Applications access fragments (no need to
    specify sites where fragments are located).
  • Local Mapping Transparency
  • Need to know both fragmentation and allocation
    schema no need to know what the underlying local
    DBMSs are.
  • Applications access fragments explicitly
    specifying where the fragments are located.
  • No Transparency
  • Need to know local DBMS query languages, and
    write applications using functionality provided
    by the Local DBMS

42
Distribution Transparency
  • We analyze the different levels of distribution
    transparency which can be provided by DDBMS for
    applications.
  • A Simple Application
  • Supplier(SNum, Name, City)
  • Horizontally fragmented into
  • Supplier 1 ? City HK'' Supplier at Site1
  • Supplier2 ? City ! HK Supplier at Site2,
    Site3
  • Application
  • Read the supplier number from the terminal and
    return the name of the supplier with that number

43
Types of Data Fragmentation
  • Vertical Fragmentation
  • Projection on relation (subset of attributes)
  • Reconstruction by join
  • Updates require no tuple migration
  • Horizontal Fragmentation
  • Selection on relation (subset of tuples)
  • Reconstruction by union
  • Updates may requires tuple migration
  • Mixed Fragmentation
  • A fragment is a Select-Project query on relation.

44
Horizontal Fragmentation
  • Partitioning the tuples of a global relation into
    subsets
  • Example
  • Supplier(SNum, Name, City)
  • Horizontal Fragmentation can be
  • Supplier 1 ? City ! HK Supplier
  • Supplier2 ? City ! HK Supplier
  • Reconstruction is possible
  • Supplier Supplier1 ? Supplier2
  • The set of predicates defining all the fragments
    must be complete, and mutually exclusive

45
Derived Horizontal Fragmentation
  • The horizontal fragmentation is derived from the
    horizontal fragmentation of another relation
  • Example
  • Supply (SNum, PNum, DeptNum, Quan)
  • SNum is a supplier number
  • Supply1 Supply SNumSNum Supplier1
  • Supply2 Supply SNumSNum Supplier2
  • The predicates defining derived horizontal
    fragments are
  • Supply.SNum Supplier.SNum and Supplier. City
    HK''
  • Supply.SNum Supplier.SNum and Supplier. City !
    HK''

46
Vertical Fragmentation
  • The vertical fragmentation of a global relation
    is the subdivision of its attributes into groups
    fragments are obtained by projecting the global
    relation over each group
  • Example
  • EMP (ENum,Name,Sal,Tax,MNum,DNum)
  • A vertical fragmentation can be
  • EMP1 ? ENum, Name, MNum, DNum EMP
  • EMP2 ? ENum, Sal, Tax EMP
  • Reconstruction

47
Rules for Data Fragmentation
  • Completeness
  • All the data of the global relation must be
    mapped into the fragments
  • Reconstruction
  • It must always be possible to reconstruct each
    global relation from its fragments
  • Disjointedness
  • it is convenient that fragments be disjoint, so
    that the replication of data can be controlled
    explicitly at the allocation level

48
Level 4 of Distribution Transparency
  • No Transparency

read(terminal, SNum) SupIMS(Snum,Name,Found
) at S1 If not FOUND then SupCODASYL(Snum,Nam
e,Found) at S3 write(terminal, Name).
S1
S3
49
Level 3 of Distribution Transparency
  • Local Mapping Transparency

The applications have to specify both the
fragment names and the sites where they are
located. The mapping of database operations
specified in applications to those in DBMSs at
sites is transparent
50
Level 2 of Distribution Transparency
  • Location Transparency

The application is independent from changes in
allocation schema, but not from changes to
fragmentation schema
51
Level 1 of Distribution Transparency
  • Fragmentation transparency
  • The DDBMS interprets the database operation by
    accessing the databases at different sites in a
    way which is completely determined by the system

52
Distribution Transparency for Updates
EMP1 ?ENum,Name,Sal,Tax?DNum?10 (EMP) EMP2
?ENum,MNum,DNum?DNum?10 (EMP) EMP3
?ENum,Name,DNum?Dnumgt10 (EMP) EMP4
?ENum,MNum,Sal,Tax?Dnumgt10 (EMP)
  • Difficult
  • broadcasting updates to all copies
  • migration of tuples because of change of
    fragment defining attributes

EMP1
EMP2
Update Dnum15 for Employee with Enum100
EMP4
EMP3
53
An Update Application
With Level 2 Location Transparency only
UPDATE Emp SET DNum 15 WHERE ENum 100
Select Name, Tax, Sal into Name, Sal, Tax
From EMP 1 Where ENum 100 Select MNum into
MNum From Emp 2 Where ENum 100 Insert into
EMP 3 (ENum, Name, DNum) (100, Name, 15)
Insert into EMP 4 (ENum, Sal, Tax, MNum) (100,
Sal, Tax, MNum) Delete EMP 1 where ENum
100 Delete EMP 2 where ENum 100
With Level 1 Fragmentation Transparency
54
Summary
  • Approaches of Reference Architectures
  • Data-based Architecture of DBMS
  • External, Conceptual, and Internal Schema
  • DDBMS Models
  • Autonomy, Distribution, and Heterogeneity
  • Architectural Alternatives of DDBMS
  • Client/Server, Peer-to-Peer, and Multi-database
Write a Comment
User Comments (0)
About PowerShow.com