Title: Distributed Database Systems
1Distributed Database Systems
2A Distributed Database on a Geographically
Dispersed Network
3A Distributed Database on a Local Network
4A Multi-Processor System
5Types of Accesses to a Distributed Database
6Distributed Access Plan
- 1) At site 1
- Send sites 2 and 3 the supplier number SN
- 2) At sites 2 and 3
- Execute in parallel, upon receipt of the supplier number, the following program:
- Find all PARTS records having SUP = SN
- Send result to site 1
- 3) At site 1
- Merge results from sites 2 and 3
- Output the result.
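The three-step plan above can be sketched in Python. This is only an illustration under stated assumptions: the `SITE_PARTS` data, the function names, and the use of threads to stand in for remote sites are all hypothetical and do not come from the slides.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical PARTS fragments stored at sites 2 and 3 (illustrative data only).
SITE_PARTS = {
    2: [{"part": "bolt", "SUP": "S1"}, {"part": "nut", "SUP": "S2"}],
    3: [{"part": "gear", "SUP": "S1"}],
}


def find_parts_at_site(site, sn):
    """Step 2: runs at a remote site and selects PARTS records with SUP = SN."""
    return [rec for rec in SITE_PARTS[site] if rec["SUP"] == sn]


def run_access_plan(sn):
    """Steps 1 and 3: run at site 1."""
    sites = sorted(SITE_PARTS)
    # Step 1: send SN to sites 2 and 3; the threads model parallel execution.
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partials = list(pool.map(lambda s: find_parts_at_site(s, sn), sites))
    # Step 3: merge the partial results returned by the remote sites.
    return [rec for partial in partials for rec in partial]


print(run_access_plan("S1"))
```

A real DDBMS would ship the request over the network; the thread pool only mimics the "execute in parallel" step.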
8Components of a Commercial DDBMS
9Data Distribution
- Problem
- Choose a unit of the logical database to use for assignment to data modules.
- Possibilities
- Relations: Distribution issues will influence logical database design.
- Columns: Distribution issues will influence logical database design.
- Rows: Too many; directories become too large.
- Data Items: Too many; directories become too large.
10Data Distribution
11Data Distribution
12Data Distribution
[Figure: Assignment of Fragments to Datamodules. Labels shown: datamodules DM1, DM2, DM3; fragments F1, F2, F3 and F1, F2; relations Personnel and Inventory.]
13Data Distribution
- Advantages of fragments as units of distribution:
- Very flexible in size and definition.
- Distribution choices are largely independent of logical design.
14System Considerations
- Reliable Network
- Pipelining
- Logical Data Items
- Database Operations: Read, Write
- Transactions: Read Set, Write Set
- Atomic: All-or-Nothing Effect
15System Considerations (contd)
- Each site in the DDBMS has one or both of the following software modules:
- Transaction Manager (TM)
- Data Manager (DM)
- TMs
- Read, Parse, and Optimize user queries
- Handle all interface with the user
- DMs
- Maintain physical database
- Perform actual reads and writes
16System Considerations (contd)
17Transaction Execution
- Transaction step: TM's action
- Begin: Set up temporary workspace.
- Read (X): Select a DM which stores X, send a message to this DM requesting X, and place X in workspace.
- Read (X): No action necessary; X is already in workspace.
- Write (X): Change the value of X.
- Read (X): No action necessary.
- End: Send a pre-commit to each DM that stores a copy of X, await acknowledgements, then send commit message.
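The TM actions listed above can be sketched as a small Python model. It assumes that Write (X) changes X only in the TM's workspace until End, which the slide does not state explicitly. The class and method names (`DataManager`, `TransactionManager`, `pre_commit`, and so on) are hypothetical, and failure handling is left out.

```python
class DataManager:
    """Hypothetical DM: holds the physical copies of some data items."""

    def __init__(self, data):
        self.data = dict(data)
        self.pending = {}

    def read(self, item):
        return self.data[item]

    def pre_commit(self, updates):
        # Stage the new values and acknowledge.
        self.pending = dict(updates)
        return True

    def commit(self):
        # Make the staged values permanent.
        self.data.update(self.pending)
        self.pending = {}


class TransactionManager:
    """Hypothetical TM following the Begin/Read/Write/End steps."""

    def __init__(self, dms):
        self.dms = dms

    def begin(self):
        self.workspace = {}      # temporary workspace
        self.written = set()     # items changed by this transaction

    def read(self, item):
        if item not in self.workspace:
            # First read: pick a DM that stores the item and fetch it.
            dm = next(d for d in self.dms if item in d.data)
            self.workspace[item] = dm.read(item)
        return self.workspace[item]   # later reads need no remote action

    def write(self, item, value):
        self.workspace[item] = value  # assumption: only the workspace changes
        self.written.add(item)

    def end(self):
        targets = []
        for dm in self.dms:
            updates = {x: self.workspace[x] for x in self.written if x in dm.data}
            if updates:
                targets.append((dm, updates))
        # Pre-commit to every DM holding a copy, then await acknowledgements.
        acks = [dm.pre_commit(updates) for dm, updates in targets]
        if all(acks):
            for dm, _ in targets:
                dm.commit()
        return all(acks)
```

For example, with two DMs that both store X, a transaction that reads X, writes a new value, and ends leaves both copies updated only after `end()` returns.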
18Optimal File Allocation In A Distributed Database
System
- Given a number of computers that process common information files, how can we
- allocate the files optimally so that the allocation yields minimum overall operating costs (storage and communication)?
- meet access time requirements for each file?
- not exceed the storage capacity of each computer?
- Note: A file may be viewed as a segment.
19System Parameters
- n Computers
- m Files
- Size of each file
- Usage distribution for each file at each computer
- Frequency of modification of each file at each computer during usage
- Access time requirement for each file at each computer
- Storage capacity of each computer
- Cost of storage per unit file length per computer
- Cost of transmission per unit file length per second per pair of computers
20Model
- COSTS
- Total Cost = Storage Costs + Transmission Costs
- TC = CS + CT
- Transmission Costs = Costs for Retrievals + Costs for Updates
- CT = CTR + CTU
- CONSTRAINTS
- Each file must be stored in at least one computer.
- The storage capacity of each computer must not be exceeded.
- The probability of exceeding the required access time for each file must be less than a specified bound.
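A rough Python sketch of this cost structure follows. It is a simplification under explicit assumptions, not the model on the later slides: each retrieval is assumed to be served by the cheapest copy, each update is assumed to be sent to every copy, and the access-time constraint is omitted. All function and parameter names are hypothetical.

```python
from itertools import combinations, product


def total_cost(alloc, size, storage_cost, trans_cost, queries, updates):
    """TC = CS + CT with CT = CTR + CTU, for one allocation.

    alloc[j] is the non-empty set of computers holding a copy of file j.
    """
    n = len(storage_cost)
    cs = ctr = ctu = 0.0
    for j, holders in enumerate(alloc):
        cs += sum(storage_cost[k] * size[j] for k in holders)
        for i in range(n):
            # Assumption: a retrieval goes to the cheapest copy.
            ctr += queries[i][j] * size[j] * min(trans_cost[i][k] for k in holders)
            # Assumption: an update is sent to every copy.
            ctu += updates[i][j] * size[j] * sum(trans_cost[i][k] for k in holders)
    return cs + ctr + ctu


def best_allocation(size, storage_cost, capacity, trans_cost, queries, updates):
    """Exhaustive search; only practical for very small n and m."""
    n = len(storage_cost)
    # Constraint: each file is stored in at least one computer.
    subsets = [frozenset(c) for r in range(1, n + 1)
               for c in combinations(range(n), r)]
    best = None
    for alloc in product(subsets, repeat=len(size)):
        # Constraint: the storage capacity of each computer is not exceeded.
        used = [sum(size[j] for j, h in enumerate(alloc) if k in h)
                for k in range(n)]
        if any(u > c for u, c in zip(used, capacity)):
            continue
        cost = total_cost(alloc, size, storage_cost, trans_cost, queries, updates)
        if best is None or cost < best[0]:
            best = (cost, alloc)
    return best
```

With two computers and one file, this sketch replicates the file when updates are rare and keeps a single copy when updates dominate, which is the trade-off the cost terms are meant to capture.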
21Mathematical Representation Model
27Transmission Paths Between Each Pair of Computers
29Reliability Constraint
- Assuming processors and channels each have identical reliability:
- $a_p$ = availability of the processor
- $a_c$ = availability of the channel
- $r_j$ = number of redundant copies of the jth file
- $A_j$ = availability of the jth file
- $A_j = a_p \left[ 1 - (1 - a_c a_p)^{r_j} \right]$
- For example, with $a_p = 0.98$ and $a_c = 0.99$:
- $A_j = 0.951$ for $r_j = 1$
- $A_j = 0.979$ for $r_j = 2$
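The worked example can be checked with a few lines of Python. The sketch reads the availability formula on this slide as A_j = a_p [1 - (1 - a_c a_p)^r_j], which reproduces both quoted values. The function name is hypothetical.

```python
def file_availability(ap, ac, r):
    """Availability of a file with r redundant copies.

    ap: availability of a processor, ac: availability of a channel.
    """
    return ap * (1 - (1 - ac * ap) ** r)


print(round(file_availability(0.98, 0.99, 1), 3))  # 0.951
print(round(file_availability(0.98, 0.99, 2), 3))  # 0.979
```

Adding a second copy raises availability from about 0.951 to about 0.979; under this formula, further copies can only approach the processor availability of 0.98.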
31File Directory for Distributed Databases
32Overview of the Directory Manager
[Figure: block diagram with components labeled User Transaction, DDBMS, Transaction Manager, Directory Manager, To Other Nodes, Database Manager, Directory Fragment, and Database.]
33Content of Directory
- Global description
- Fragmentation description
- Allocation description
- Mappings to local names
- Access method description
- Statistics on the database
- Consistency information
34Content of a Directory System
- Security: (File, User, C), where C = Read/Write, Read Only, or Write Only
- Operation: Compression ratio (Logical Operation, Query, Data Value), Query Access Optimizer, Statistical Data Gathering, Protocols
- Logical (Dynamic): File Status (R, W), Number of Backlog Jobs, Site Availability, Resource Requirement, Processing Cost, Communication Cost, Translation Cost
- Physical (Static): Location (Site, Copy, Disk, Page), Creator, Creation Date, Version of the File, Size, Code Format, Date of Last Update
35The Functional Objectives of Integrated Dictionary/Directory
- To support the control of data resources
- Maintaining data independence, security, and integrity
- To support applications development
- Offering standardized data definitions and usage characteristics
- Established program entities, DDL
- To provide independence of directory data elements
- Different hardware and software environments
- Changes in these environments
36Possible Data Types In IDD
- Data names, definitions, formats and sizes.
- Integrity constraints, authorization tables, and usage statistics for transaction management.
- Schemas and sub-schemas.
- Description of standardized transactions and reports.
- Characteristics of hardware, such as processors, lines, and terminals.
- Description of users.
- The IDD must support the maintenance of relationships between various entities, such as associations between
- Authorization tables and data,
- Users and transactions,
- Reports.
- The IDD supplies version control.
38[Figure 2: diagram with labels Payroll Record (Maximum Length 400 Characters), relationship Contains (Relationship Created 820708), and an element with Length 9 Characters.]
39Table 1
Schema Model Level (Typical Meta-Entity-Types) | Schema Level (Typical Entity-Types, Relationship-Types, and Attribute-Types) | Dictionary Level (Typical Entities, Relationships, and Attributes)
Entity-Type | Element | Social-Security-Number, Agency-Name
Entity-Type | Record | Employee Record, Payroll Record
Entity-Type | Document | Form 1040, FIPS Guideline
Relationship-Type | Record-Contains-Element | Payroll-Record-Contains-Employee-Name
Attribute-Type | Length | 9 Characters
Attribute-Type | Creator | ADP Division
40Classes of Directory
- Centralized Directory
- Single Master Directory
- Extended Centralized Directory
- Multiple Master Directory
- Local Directory
- Distributed Directory
46Causes For Directory Update
- Changing the description or structure of the user database.
- Moving user database entities from one node to another.
- Changing the description of a user or node.
- Changing a user view.
- Changing a network node's status.
47Specific Drawbacks with Globally Replicated
Directories
- Additional remote activity to maintain directory coherence.
- Difficulty of posting directory changes to a down site.
- Difficulty of integrating a new site.
- Storage of directory entries where they are not referenced.
- Blurred responsibility for maintaining the directory.
48Performance Measure
- Operating Cost/Unit Time = Communication Cost (Query + Update) + Storage Cost + Code Translation Cost (Query + Update)
- Response Time
49Operating Cost for the Centralized Directory
System
51Cost Trade-offs of Directory Systems
- Assume
- Communication cost much greater than storage cost
- No Translation cost
- All computers have same directory update rate
- Then the cost trade-off point is at directory update rate:
- P(C,EC) = 2/(N 1)b
- P(C,D) = 2/(N 1)
- P(L,D) = 1
53Directory Design Alternatives
- Centralized. Description: Single master directory. Advantages: Simplicity; ease of update. Disadvantages: Transmission costs and delays.
- Extended Centralized. Description: Variation of the centralized case in which the directory information is permanently appended in the local node once it is obtained from the master directory. Advantages: Reduces transmission costs and delays. Disadvantages: Coordinating updates of local directories; knowledge of appended directories.
- Multiple Master. Description: Variation of the centralized case in which redundant copies of the master directory exist. Advantages: Reduces transmission costs and delays; fail-soft characteristics. Disadvantages: Storage requirements; coordinating update of redundant copies.
- Distributed Master. Description: Master at every node. Advantages: Fast response. Disadvantages: Storage costs; transmission costs for updates to the directory.
- Localized. Description: Local directory at each node without replication. Advantages: Simple update procedure. Disadvantages: Transmission costs for non-local queries.
54Distributed Ingres Dictionary/Directory Contains Four Types of Data
- Relation name and location
- Information for parsing queries (domain names, formats, etc.)
- Performance information (number of tuples, storage structures, etc.)
- Consistency information (protection, integrity constraints, etc.; does not include control data for concurrency control and synchronization)
55SDD-1 Dictionary/Directory
- The directory itself is defined and maintained like any other user data. It can be logically fragmented, distributed, and replicated across the distributed DBMSs.
- A directory locator (a small, highly static file of directory fragment locations) is kept at every site and is used by the TMs and DMs to plan and control transactions and to help ensure DB integrity and consistency across concurrent accesses of data elements.
- The transaction modules are capable of caching remotely accessed directory data for subsequent usage. This facility is provided on the presumption that DB operations will exhibit the locality-of-reference characteristic.
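The locator-plus-cache idea can be illustrated with a short Python sketch. It is not SDD-1 code: the `DirectoryClient` class, its fields, and the sample fragment names are hypothetical, and cache invalidation when the directory changes is not modeled.

```python
class DirectoryClient:
    """Hypothetical transaction-module view of a fragmented directory."""

    def __init__(self, locator, sites, local_site):
        self.locator = locator        # directory locator: fragment -> site id
        self.sites = sites            # site id -> {fragment -> directory entry}
        self.local_site = local_site
        self.cache = {}               # remotely fetched directory data
        self.remote_fetches = 0

    def lookup(self, fragment):
        site = self.locator[fragment]
        if site == self.local_site:
            return self.sites[site][fragment]      # local, no caching needed
        if fragment not in self.cache:
            self.remote_fetches += 1               # one remote access
            self.cache[fragment] = self.sites[site][fragment]
        return self.cache[fragment]                # reused on later lookups


sites = {1: {"EMP": "EMP stored at site 1"}, 2: {"PARTS": "PARTS stored at site 2"}}
client = DirectoryClient({"EMP": 1, "PARTS": 2}, sites, local_site=1)
client.lookup("PARTS")
client.lookup("PARTS")
print(client.remote_fetches)  # 1
```

If operations show locality of reference, repeated lookups of the same remote fragment hit the cache and only the first one costs a remote access.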
56[Figure 17: Pictorial diagram showing usefulness of keys. Collections shown: PatientDB1 (name, SSN, age), PatientDB2 (name, SSN, patID), PatReportDB2 (patID, report). A shaded box represents a real collection and an unshaded box represents a virtual entity.]
57[Figure 15: Pictorial diagram showing correspondence between virtual and real attributes. Boxes shown: personDB1 (name, sex, age, ssn), Vperson: PersonClass (name, sex, age, ssn, job), personDB2 (name, gender, ssn, job). Conversion labels: Character_to_String, Character_to_String, LargePositiveInteger_to_String. Virtual collection: People, Vperson. A shaded box represents a real collection and an unshaded box represents a virtual entity.]
58[Figure 18: Pictorial diagram for aggregation. A shaded box represents a real collection and an unshaded box represents a virtual entity.]
59[Figure 19: Pictorial diagram of computed attribute. Boxes shown: Vname: nameClass (first, middle, last), personDB1 (name). Function labels: getfirst, getmiddle, getlast. A shaded box represents a real collection and an unshaded box represents a virtual entity.]
60[Figure 20: Pictorial diagram of computed attribute. Boxes shown: financeDB1 (name, stockAmount), labeled 1; Vretiree: retireClass (name, income); financeDB2 (name, pension), labeled 2. A shaded box represents a real collection and an unshaded box represents a virtual entity.]
61[Figure 21: Pictorial diagram showing grouping. Boxes shown: carInsuranceDB1 (carOwner, amount), Vinsurance: insuranceClass (name, insuranceAmounts), houseInsuranceDB2 (houseOwner, amount). A shaded box represents a real collection and an unshaded box represents a virtual entity.]
62[Figure 22: Pictorial diagram showing relationship. Labels shown, in order: patientDB1 (name, docID), marked (key); patientDB2 (name, physician), marked (pointer); relationship; patientDB1; Vdoctors: doctorClass (name, docID); (name, docID, salary); patientDB1 (name, salary). A shaded box represents a real collection and an unshaded box represents a virtual entity.]
63[Figure 23: Pictorial diagram showing a named relationship. Boxes shown: VtreatedBy: treatedByClass (patient, doctor, amountOwed); patientDB1 (name, docID, amountOwed); Vpatient: PatientClass; Vdoctor: DoctorClass. Two attributes are marked (key). A shaded box represents a real collection and an unshaded box represents a virtual entity.]
64[Figure 24: Pictorial diagram showing relationship. Boxes shown: VpersonPatient: personClass (name); patientDB1 (name, SSN, payment); Vpatient: patientClass (patID, amount); VpersonDoctor: personClass (name); doctorDB2 (name, docID, salary); Vdoctor: DoctorClass (docID, salary). Virtual collections: patient, doctor, person, with members Vpatient, Vdoctor, VpersonPatient, VpersonDoctor. A shaded box represents a real collection and an unshaded box represents a virtual entity.]
65[Figure 30: Derivation of Virtual Entity Vconcept. Boxes shown: ConceptSemType (conceptID, semTypeID); Vconcept (conceptID, semType, termSet), with an attribute marked (key); Vterm (termID, stringSet); Concept (conceptID, termID, stringType, stringID, stringVal); Vstring (stringName, stringID, stringType). A shaded box represents a real collection and an unshaded box represents a virtual entity.]