Multidatabase manipulations Part 2 - PowerPoint PPT Presentation

1 / 82
About This Presentation
Title:

Multidatabase manipulations Part 2

Description:

Contains by definition every SQL-x. Allows for non ... city burgh. FD. FD. banks. 34. Semantic Variables in MSQL. Can be applied to MSQL DD statements ... – PowerPoint PPT presentation

Number of Views:47
Avg rating:3.0/5.0
Slides: 83
Provided by: lit49
Category:

less

Transcript and Presenter's Notes

Title: Multidatabase manipulations Part 2


1
Multidatabase manipulations Part 2
  • Witold Litwin

http//ceria.dauphine.fr/witold.html
2
Multidatabase manipulations(Kandinsky Ligne
avec Accompagnement, 1937 )
3
Multidatabase manipulations
4
MSQL(Litwin, Abdellatif, Nicolas, Zeroual,
1989L. Suardi, M. Rusinkiewicz, 1992)
  • An extension to SQL
  • Contains by definition every SQL-x
  • Allows for non-procedural multidatabase base
    manipulations
  • ? MSQL queries impossible to formulate in SQL
  • An MSQL query may replace several SQL queries
  • Developed in 1986-89
  • INRIA, projet B A BA,
  • initialement sous projet du projet-pilote SIRIUS
    (J. Le Bihan, puis W. Litwin)
  • Dr. Thesis of MM. Abdellatif, Nicolas, Zeroual
  • Compiler implemented at Houston University
  • Team od Prof. M. Rusinkiewicz, 1990-1993

5
MSQL(Litwin, Abdellatif, Nicolas, Zeroual,
1989L. Suardi, M. Rusinkiewicz, 1992)
  • Research vehicle for functions for the MBD
    environment
  • to address relations in different databases
  • to manipulate semantically heterogeneous data
  • to create MDB views
  • to transfer data (and schemas) between DBs
  • to define MDB dependencies
  • Present to limited extent in most of commercial
    DBMSs DAMSs

6
MSQL(Basic new properties)
  • SQL Query
  • Uses 1st order predicate calculus
  • Is compiled for optimization into the relational
    algebra
  • Result is a table
  • MSQL Query
  • May use higher-order predicate calculus
  • Is compiled for optimization into the
    multirelational algebra
  • Result is a multitable
  • A set of relations (tables)
  • May be constituted from one or no tables

7
MSQL(More on functions specific to MDB env.)
  • Addressing of tables in different DBs
  • Implicitly or by qualification by (multi)database
    names
  • Introduced around 1985 by relational multibase
    system prototype MRDSM
  • B A BA project at INRIA
  • Unknown at that time of any relational language
  • See the overview of relational DBMSs existing
    in 1987 (M. Brodie)

8
MSQL(More on functions specific to MDB env.)
  • Manipulation of semantically heterogeneous
  • data
  • Multiple Queries
  • With multiples identifiers
  • With semantic variables
  • Ranging over data names
  • Scale and Precision
  • Units of measure
  • Implicit joins
  • Capabilities still unknown of SQL
  • Capabilities known at present to some dialects
  • Limited with respect to MSQL

9
MSQL example
View
View
SIL Internal Logical Schema
10
Conceptual Schemas (the multischema)
  • DB bnp
  • br (br, brname, street, street, city,
    zipcode, tel)account (acc, cl, balance,
    br)client (cl, clname, cltel, cltype, street,
    street, city, zipcode)spe-acc (acc, br, cl,
    balance, curr)
  • DB sg
  • branch (bra, braname, street, s, town, zip,
    t, class)acc (acc, bra, c, balance)client
    (c, cname, ct, ctype, street, s, town, zip)
  • DB cic
  • br (br, brname, street, street, city,
    zipcode, tel)account (ac, br, cl, balance,
    open_date)client(cl, clname, cltel, cltype,
    street, street, city, zipcode)

11
Semantic Heterogeneity In Banks
  • Same names can designate different data
  • Different names can designate same data
  • same client, same town..
  • The value of a primary key is valid only in one
    DB
  • how to identify same client in diff. banks ?

12
MSQL Commands
  • CREATE TABLE CREATE DATABASE
  • CREATE MULTIDATABASE CREATE VIEW
  • ALTER TABLE ALTER VIEW
  • ALTER MULTIDATABASE
  • DROP TABLE DROP DATABASE
  • DROP MULTIDATABASE DROP VIEW

13
MSQL CREATE DATABASE
gt MSQL CREATE DATABASE boulogne CREATE DB
.com.org.user.boulogne CREATE MULTIDATABASE
Banks (bnp cic sg ) USE Banks CREATE
DATABASE boulogne FROM bnp
Query scope
14
MSQL CREATE MULTIDATABASE
  • MSQL
  • CREATE MDB EC-Banks (f-banks-i-banks, s-banks,
    g-banks, e-banks )
  • CREATE MULTIDATABASE can create
  • flat MDBs (only contain DBs)
  • nested MDBs (DBs or MDBs)
  • can be potentially any network of DBs or MDBs
  • like through the links on the WEB
  • what about cycles ?

15
MSQLCREATE TABLE
Import
  • use banks
  • CREATE TABLE boulogne.loan FROM bnp.loan
  • CREATE TABLE fake_checks (Chq INT,
    Montant_Euro CURRENCY EURO .... )
  • One has created four (empty) tables bnp.
    fake_checks , cic. fake_checks ...
    boulogne.fake_checks
  • CREATE TABLE boulogne.client (c, cn, ct) FROM
    bnp.client (cl, clname, cltel)
  • PRIMARY KEY (c)
  • (cn, ct) OUTER REFERENCES (clname, cltel)

Unit of mesure
16
MSQLCREATE TABLE with References
  • USE AuPrintemps / MDB AuPrintemps
  • CREATE TABLE MusicDep.Inventory
  • .
  • FOREIGN KEY (Item) REFERENCES
    Central.Stock(I)
  • No unauthorized Item in the inventory of the
    Music Department
  • Other options
  • PRIMARY KEY () REFERENCES T()
  • T1(A) LEFTRIGHT REFERENCES T2(B)
  • Generates implicit equijoin, or left or right
    implicit outerjoins when a query selects
    attributes A and B.

17
MSQLALTER MULTIDATABASE
  • use banks
  • alter banks include vernesremove cic
  • Alter MDB can create
  • flat MDBs (only contain DBs)
  • nested MDBs

18
MSQL Elementary queries
Prefixing with DB names was unknown to SQL -
and is in DB2 SQL since last year only
19
MSQL Default DB
Tables of the default database are not prefixed
20
MSQL Elementary queries without prefixed names
Table names are unique within the query scope
21
Updates
  • USE (bnp b) sg
  • UPDATE account
  • SET account.balance account.balance 500
  • WHERE account.balance gt acc.balance
  • AND b.client.clname sg.client.cname AND
    b.client.street sg.client.street
  • What does it mean ?

22
Multiples Queries
23
Multiple Queries
24
Results (a multitable)
25
Multiple Updates
  • Begin
  • Use BanksUpdate clset street 'Charles de
    Gaulle"where street 'Etoile'
  • If SQLCODE ltgt 0 then Rollback
  • Commit
  • Use Banks vital cicUpdate clset street
    'Charles de Gaulle"where street 'Etoile'
  • MSQL transaction semantics is more general than
    ACID
  • may include COMP (compensation) statement, list
    of accept. states....

26
Semantic Variables in MSQL
  • use bnp sglet x be town cityselect from
    bwhere x 'Paris' and street 'r. de Rivoli'

27
Semantic Variables in MSQL
  • use bnp sglet x be town cityselect from
    bwhere x 'Paris' and street 'r. de Rivoli'
  • use bnp
  • select
  • from br
  • where town 'Paris' and street 'r. de Rivoli'
  • use sg
  • select
  • from branch
  • where city 'Paris' and street 'r. de Rivoli'

28
Semantic Variables in MSQL
  • use bnp sglet x be town cityselect from
    bwhere x 'Paris' and street 'r. de Rivoli'
  • Alternatively

use bnp sglet x be to cityselect from
bwhere x 'Paris' and street 'r. de
Rivoli'
29
Semantic Variables in MSQL
  • use banks
  • let X be banks.
  • select a, balance, cnamefrom X.a a, X.c c
  • where a. a c. c
  • The query illustrates the multitable pair-wise
    join
  • Semantic variable a over relation name account is
    not necessary, but simplifies the typing of the
    query

30
Semantic Variables in MSQL
  • use banks
  • let x be town city
  • let y be sg bnpselect Z. from y.b Z, cic.b
    Vwhere V.x 'Paris' and
  • V.street Z.street and Z.x 'Paris'
  • What does it means ?
  • Is the  natural  decomposition into SQL
    queries optimal ?
  • Otherwise is there any better ?

31
Multirelational Algebra
  • No, the natural decomposition is not optimal
  • the selection in cic is repeated three times
    uselessly
  • It should be done once first, then one should
    proceed with the join
  • One needs an algebra for multiple queries
  • Grant, Litwin, Selis, Roussopoulos. An Algebra
    and Calculus for Relational Multidatabases. The
    VLDB Journal, Vol. 2, No. 2, April 1993, 153-171.

32
Multirelational Algebra
  • Multirelational operators
  • Select From M where (boolean condition)
  • Project M (A, B)
  • Pair-wise Theta join On (M1.A ? M2.B AND )
  • Theta Join On (M1.A ? M2.B AND )
  • These operators are typically commutative and
    associative as their relational counterparts
  • Select can be moved through a join down the
    execution tree
  • Project (C (Project M (A, B, C))) Project M (C)
  • Etc

33
Semantic Variables in MSQL
  • Semantic variables can be compound and with
    values selected by queries from some dictionaries
  • use bankslet (x, y) be select X.attr Y.attr
    from FD X, FD Y where X.mean tel and Y.mean
    city
  • select from clientwhere x '123' and y
    'Paris'

FD
mean attr tel t tel tel city city city town city
burgh
34
Semantic Variables in MSQL
  • Can be applied to MSQL DD statements
  • use banks
  • create database cic2
  • let x be a b c
  • create table cic2.x from cic.x
  • Copies cic schema except for one table

35
Name homogenizationThe labels
  • USE Banks LET t BE tel tSELECT name
    branch_name, t tel, s streetFROM br
    brWHERE street Champs Elysées
  • The result multitable
  • ( bnp.br.branch_name, bnp.br.tel,
    bnp.br.street ), ( sg.br.branch_name,
    sg.br.tel, sg.br.street )( cic.br.
    branch_name, cic.br.tel, cic.br.street )

36
Multidatabase Views
bnp
my_bank
sg
A partial view of DBs bnp and sg in DB
my_bank
The views in my_bank can be considered Import
Schemes
37
Multidatabase Union Views
  • Use Banks
  • Create View bnp.all-banks as
  • Use banks
  • let x be town city
  • let y be banks.
  • Select y.br ( y, br, brname branch, street,
    street, x city, zip zip, t tel)
  • Union
  • Union unions all the tables of the selected
    multitable
  • It scales to all the tables named br of Banks,
    if new banks enters the MDB Banks in the future
  • Current DBMS, e.g., SQL Server, require to alter
    the union view definition in such a case

38
Key words and Aggregate Functions in MSQL
  • Key words and Aggregate Functions of SQL
  • par definition
  • DISTINCT, GROUP BY, ORDER BY
  • COUNT, AVG, SUM
  • operate at each table of a multitable
  • Their extensions to multitables
  • MDISTINCT, MCOUNT, MGROUP BY, MORDER BYMAVG,
    MSUM...
  • operate at whole multitable
  • important for warehousing

39
Example
  • USE BanksSELECT COUNT ()FROM br brWHERE
    street 'champs elysées'

40
Example
  • USE BanksSELECT COUNT ()FROM br brWHERE
    street 'champs elysées'
  • bnp.br2
  • cic.br2
  • sg.br2

41
Example
  • USE BanksSELECT MCOUNT ()FROM br brWHERE
    street 'champs elysées'

42
Example
  • USE BanksSELECT MCOUNT ()FROM br brWHERE
    street 'champs elysées'
  • br6
  • Exercises in warehousing
  • -Average balance per client in each bank
  • Average balance per client in BANKS
  • Sum of client assets per bank
  • Sum of client assets in BANKS

43
Aggregate Functions IMPLEMENTATION ISSUES
  • All-in-one (traditional computation)
  • Possibly in parallel
  • The calculus can take long time.
  • Successive approximations
  • Some kind of sampling
  • result1, from any 1st DB to come
  • (result1 result2) / 2
  • sampling within each database
  • several ACM-Sigmod VLDB papers dealt with query
    evaluation using sampling
  • Precomputing
  • Incremental evaluation using interdatabase
    dependencies
  • Common to warehousing

44
Aggregate Functions MERGE ON
  • form a single tuple from all the tuples of the
    same objet in the multitable
  • Uses outer jointures
  • Find millionaires in Banks and form the tuple
    for each millionaire
  • USE Banks LET x.y BE clname.cltel
    cname.ctLET z BE Banks.SELECT FROM z.a
    WHERE z.a.c z.client.cAND z.a.balance gt
    1 000 000MERGE ON x y

45
Aggregate Functions MERGE ON
nulls
nulls
nulls
USE Banks LET x.y BE clname.cltel
cname.ctLET z BE Banks.SELECT FROM z.a
WHERE z.a.c z.client.cAND z.a.balance gt
1 000 000MERGE ON x y
46
Aggregate Functions NAME
  • Transform a name (table, attribute..) into
    attribute value
  • USE Banks LET x.y BE br.city branch.townSELECT
    name branch_name, NAME (.x) bankFROM xWHERE
    y 'Nice' UNION
  • Note Union unions all the tables of the
    selected multitable
  • the result is the table branch_name bank

Jaures CIC
DeGaulle BNP
47
Aggregate Functions CHOOSE
  • Chooses at most n tuples among the selected
    ones
  • the 1st found as does the function TOP
    (default) in any or some order, specified by
    ORDER BY (default)
  • strictly random (RND)
  • these that were not chosen by the previous
    execution of the query in the same transaction
    (NEW)
  • preferably in the DBs listed, and in the
    listed order
  • at most j per DB
  • selecting at most m tuples sharing the values
    of the attr. in the list A, supposed global
    key of some objet.
  • CHOOSE (n, (m, ltAgt), ltBgt j, ltBgt, RND
    NEW
  • ltAgt ltlist of attr.gt ltBgt ltlist of
    DBsgt

48
Aggregate Functions CHOOSE
  • Choose a millionaire randomly
  • USE Banks SELECT c.FROM c c, a aWHERE
    c.c a.c AND a.ba gt 1.000.000CHOOSE (1)
    RND
  • Function very important in MBD environment
  • information overload

49
Aggregate Functions TIMEOUT
  • Fix time limit of a query
  • the system should possibly deliver all the
    relevant tuples
  • however, any query arriving to timeout is
    considered executed successfully
  • TIMEOUT (t unit) ltunitgt ms s m
    h d s - seconds (default)
  • USE BanksSELECT FROM brWHERE street
    'champs elysées' TIMEOUT (10)

50
Aggregate Functions POST
  • Make a query continuous
  • One manipulates each tuple found during the
    life time of the query
  • Even those created after the query start
  • TIMEOUT may be used to limit the life time
  • USE Immo LaCentrale Orpi SELECT FROM
    logemWHERE prix lt 1,000,000 AND Ville 'Paris'
    POST

51
Aggregate Functions ESTIMATE
  • Compute the cost of a query before the
    execution and can start the execution after an
    authorizationESTIMATE (type, price, time,
    count, size, report) WITH EXEC_PROMPT
  • type of estimate
  • exact (can be long to compute)
  • approximate
  • price of the query (in , FF...).
  • completion time
  • number of tuples
  • size of the resultant, in bytes
  • report on the estimate itself
  • precision...

52
Privileges in MSQL
  • USE bnp sg cic
  • GRANT SELECT ON client TO Nicolas Abdellatif
  • client is a multitable
  • client (bnp.client, sg.client, cic.client)
  • GRANT ALL ON etoile.account TO Nicolas Abdellatif
    FROM bnp.account
  • GRANT ALL ON etoile.account TO Nicolas FROM
    Zeroual ON bnp.account

53
Interdatabase Queries
  • Transfer data between DBs
  • Source and target are multitables

INSERT...
54
Interdatabase Queries
  • INSERT
  • insert selected tuples
  • except these with the key already in the target
  • STORE
  • insert selected tuples
  • replacing these with the key already in the
    target
  • REPLACE
  • insert selected tuples and delete the rest of
    the target
  • UPDATE
  • update the tuples selected in the target with
    the values in the source tables
  • COPY
  • copies tuples and the source schema

55
MSQL
  • There are more interesting capabilities
  • e.g. Multidatabase Dependencies
  • referential integrity (outer) join links
  • multidatabase triggers
  • local autonomy
  • dynamic attributes for retrieval and updates
  • The language design will never be finished
  • MSQL 1, 2, 3...
  • MSQL A multidatabase Language. Information
    Science Journal Special Issue on Database
    Systems, 48, 2, (July 1989).
  • Execution of Extended Multidatabase SQL. Intl.
    IEEE Conf. on Data Eng. Vienna, 1992

56
OSQL
  • For OO or RO common model, consider in addition
  • MDB inheritance
  • MDB type/subtype integration
  • derived types
  • OID heterogeneity UUIDs
  • Type / function value semantic heterogeneity
  • dynamic type hierarchies
  • higher-order OO languages
  • Relations with inherited attributes

57
Elements of MSQL in commercial DBMSs
  • Main DBMSs evolved to MDBSs
  • yet primitive but it's better than
    nothingSybase, Oracle, Informix, MsAccess, SQL
    Server,....
  • There are also MDBSs which are only access
    systems to DBMSs
  • EDA-SQL, DEC DB Integrator, DBJoiner (IBM),
    Ingres, UniSQL/M, Uniface, QE, OAdaptor (HP),
    Telebase...
  • "Data Warehouses"

58
MSQL in commercial DBMSs(Department Store Data
Warehouse, using MsAccess, SQL Server...)
Au Printemps
Bd. Haussman
Music Dep
Parly 2
Books Dep
Music Dep
Payroll
Home Appl. Dep
Home Appl. Dep
Food Dep
Jeans Dep
Jeans Dep
Car
Food Dep
Orders
Central Warehouse
59
MBD Manipulations in MsAccess
  • One can perform limited MBD operations between
  • MsAccess DBs
  • An DB of MsAccess and
  • any other DB under a DBMS ODBC compatible
  • Paradox, Btrieve, Dbase
  • Any OLE compatible program
  • Excel...

60
MBD Manipulations in MsAccess
Paradox Gateway
MsAccess
Sybase ODBC driver
B1
B2
Attach
Insert INTO
Export
Distr. Connect.
ODBC
Import
ODBC
B3
Paradox
Excel
Oracle
Sybase
61
MsAccess MSQL
  • Open B ltgt USE B
  • ATTACH table
  • Open B1 attach B2.T' as T ?create view B1.T as
    select from B2.T'
  • DROP VIEW corresponds to Delete in MsAccess menu
  • Clause IN ltexternalDBgt
  • Open B1Select a, b, c From D IN B2 ?? select
    a, b c from B2.D

62
Examples MsAccess
  • Source DB MsAccess SELECT Customer IDFROM
    Customers IN MYDATA.MDBWHERE Customer ID Like
    "A"
  • Source DB Paradox SELECT CustomerIDFROM
    CustomersIN "C\PARADOX\DATA\SALES" "Paradox
    4.x"WHERE CustomerID Like "A"
  • Every data transfer from/to DB non-MsAccess or
    OLE compatible software has data repr.
    conversions
  • Semantic Heterogeneity oblige

63
Elementary Queries in MS-Access
  • Open a DB and query other DBs
  • one has to define aliases in FROM
  • DB open here is called s-p1.mdb
  • but this name has no importance here
  • Joins of tables in other databases
  • SELECT TOP 10 C.Contact Name, C.City
  • FROM c\access\nwind2.mdb.Customers AS C,
    c\access\ordentr2.mdb.customers AS O
  • WHERE (o.Id C.customer Id)

64
Result
Contact Name City Pat Parkes London Gladys
Lindsay Seattle Elizabeth Lincoln Tsawassen Olivia
LaMont San Francisco Terry Hargreaves London Eliz
abeth Brown London Sylvia Dunn London Ann
Devon London Ronald Merrick London Bill
Lee Pocatello
65
Elementary Queries in MS-Access
  • Join of a local and external table
  • SELECT TOP 10 S.SName, C.Contact Name, C.City
  • FROM S, nwind2.mdb.Customers AS C
  • WHERE ((S.City C.City))
  • Order by contact name

66
Result
SName Contact Name City Clark Ann
Devon London Clark Archibald Langford London Clar
k Cornelia Giles London Clark David
Bird London Clark Elizabeth Brown London Clark G
.K.Chattergee London Clark Gerald
Pipps London Clark Hari Kumar London Clark Jane
Austen London Clark Jeffrey Jefferies London
67
MsAccess MSQL
  • Clause INTO ltexternalDBgt dans Select INTO ou
    INSERT INTO
  • Open B1Select a, b, c INTO T IN B2 From D ??
    Use B1 copy into B2.T select a, b c from D
  • D can be a view or a subquery
  • One cannot combine clauses IN et INTO
  • INSERT de MsAccess has (sub)semantics of
    INSERT in MSQL

68
MsAccess MSQL
  • IMPORT EXPORT
  • menu commands
  • equivalent to MSQL query
  • Use B1 copy into T1 from B2.T2

69
MsAccess MSQLComparison
  • Formulation of MBD elementary queries and views
  • first one has to define ATTACH's
  • then one formulates SQL monodatabase query
  • then, perhaps one needs to delete the ATTACH's
  • Much more procedurality than under MSQL
  • in Banks, one would need in practice that each
    DB attaches all the tables of any other DB
  • Good luck DBA !
  • Multiple queries and other capabilities of MSQL
  • yet unknown of MsAccess

70
SQL Server, Sybase, Interbase
  • MBD Architecture similar to that of MsAccess,
    but more powerful
  • gateways to Oracle, IMS, DB2
  • ODBC
  • Transac-SQL support the following MSQL
    functions and is the MBD dialect least
    procedural in the industry
  • elementary queries
  • to Sybase DBs at the same siteUSE B select
    from T where B1.T1.a T.a
  • Only one DB per USE
  • some restrictions at the level of interdatabase
    queries
  • multidatabase CREATE VIEW, and MDB triggers

71
Oracle, RDB, Informix
  • Have an operation similar to ATTACH
    calledCreate link
  • Create public database link bnp connect to
    bnp_unix
  • Create public database link cic connect to
    cic_vms
  • SELECT br.brname, b.braname, br.street
  • FROM br _at_bnp, br_at_ cic b
  • WHERE br.street b.street

72
Oracle, RDB, Informix
  • MBD queries are possible only once the links are
    defined
  • Hence these DBMSs are procedural than Sybase
    for MBD operations
  • Starting from V7, Oracle supports however MBDs
    queries without links
  • postfixing par the DB name
  • as in the last ex.

73
EDA-SQL, DB Integrator, DBJoiner, Ingres al
  • SQL MDBSs for access to DBMSs
  • in theory, without their own DBs
  • but there is always one for the MDB catalogs
  • auxiliary DB
  • One has to create links and logical DBs
  • almost virtual DBs
  • only DB Integrator supports elem. MDB queries
  • called multischema queries
  • No other MSQL functions

ODBC
logical DB
logical DB
Gateway
BD Ingres
BD lMS
BD RDB
74
UniSQL O-Adaptor
  • Similar to previous ones except that for the
    logical DB
  • UniSQL uses RO model
  • O-Adaptor an OO model
  • No MDB queries (other than link creations)

ODBC
logical DB
logical DB
Gateway
BD Ingres
BD lMS
BD RDB
75
Telebase (USA)
  • MDBMS for access to inf. retr. DBs
  • 1000 DBs at many sites
  • with different local languages
  • STAIRS, INSPEC, DIALOG...
  • Extended Common Command Set Language (CCS)
  • No joins only Boolean clauses
  • Supports the MSQL functions
  • multidatabases names
  • Called Categories
  • multiple queries
  • Called Scans

CCS
Drivers
DBs DIALOG
DBs INSPEC
DBs STAIRS
76
Messidor 1st Heterogeneous Multidatabase
Information Retrieval Access System
Démonstration par C. Moulinoux (STERIA), INRIA,
1987
77
Meta-search engines
  • Metacrawler, BigHub.com, AskJeewes.com, Copernic
  • Query simultaneously several search engines
  • Altavista, Yahoo, Excite, Hotbot
  • Boolean Manipulation Langages
  • Multiples Queries
  • Apply the mdb aggregate functions Mdistinct Name,
    Mdistinct, Choose, Timeout

78
Data Warehouses
  • Popular new concept for MDBSs
  • data warehouse ltgt an MDB or federation in an
    enterprise
  • With elaborated management of interdatabase
    dependencies
  • new ideas
  • elaborated DS implementation elaborated decision
    support functions
  • incremental propagation
  • an MDB view or a DB redundant with respect to
    existent ones is created

ODBC
Data warehouse
Data mart
Gateway
DB Ingres
DB lMS
DB RDB
79
An Instructive Call for Papers
International Journal of Cooperative Information
Systems Special
Issue on Design and Management
of Data Warehouses Guest editors
Manfred A. Jeusfeld and Martin Staudt Data
Warehousing embraces technology and industrial
practice to systematically collect data from the
enterprise and to use that data in a highly
aggregated form for managing the enterprise thru
decisions. Little attention is currently paid to
design and manage a data warehouse (DW) in such a
way that it accomplishes its purpose, i.e. to
support the management of the enterprise.
Existing solutions are focusing on technical
aspects like efficient source data extraction.
Their parameters are however incomprehensible to
the stakeholders who decide on the introduction
of a data warehouse. Data warehouses are
important in managing large enterprises and
in communicating highly aggregated information
between the various departments. Interoperable
tools and integrated methods to manage
data warehouses in order to fulfill the
enterprise goals are desperately needed. Such
tools should cover all aspects of data
warehousing - selection of data sources - data
cleaning - conceptual/logical/physical data
warehouse design - enterprise modeling - data
warehouse quality monitoring - data warehouse
refreshment methods - architecture design - data
mart customization, etc.
80
Exemple Architecture de DB2 Data Warehouse
81
Conclusion
  • MDB Manipulations - among most important R D
    directions
  • Other key-words
  • Interoperability
  • Integration
  • Distributed Heterogeneous DBs
  • Data Warehouses
  • MSQL is a research vehicle advanced for
    relational MBDs
  • The root for further resarch proposals
  • MSQL with Integrity Constraints, IDL, SchemaSQL
  • For the latter, see especially ACM-TODS journal,
    Dec. 2001
  • Basic MSQL capabilities are in commercials DBMSs,
    Information Retrieval Ssytems, MDB Access
    Systems, Data Warehouses, XML standard proposals
  • Others will follow
  • But there is still a lot to do
  • in the industry and in research

82
Exercises and Research Problems
  • All these in the text mdb queries especially.
  • Express various elementary mdb queries using
    Amos, MsAccess (SQL and QBE), SQL Server, DB2,
    Oracle, Interbase
  • Invent your own instructive queries to BANKS
  • Under MsAccess, design an MDB Form for an
    elementary query and for a multiple query.
    Explain how you did i in a short report.
  • Consider 3 attributes B1.T1.a, B2.T2.b, B3.T3.c.
    The attribute types are INT and unit of
    measures are KG, G, mG. Consider that ''
    operator has the usual mathematical meaning, with
    the usual rounding up of values with a different
    precision. Prove or disprove that the usual
    associativity of equijoins (a JOIN b) JOIN c
    a JOIN (b JOIN c)does not hold anymore. Comment
    on the consequences for the current relational
    query optimizers.
  • If you had to evaluate a JOIN b using manual
    unit conversion, would you rather convert a to b
    or vice versa ?
  • Propose and justify a reasonable algorithm for
    the multiple join evaluation.
  • A unit conversion algorithm A may be a long
    calculus. Would you rather
  • apply A to every value V of a the manipulated
    table
  • project or order the table first, then apply A
    once for every different value

83
Exercises and Research Problems
  • Propose an execution tree expressed in mdb
    algebra for the query of slide 30
  • Add unit conversion to your favorite query
    optimizer (Ph. D Thesis)
  • Try to express the example queries to Banks using
    SchemaSQL language. For each query, present also
    the result. Can it be a multirelation ?
  • Try to express the example queries to Banks using
    IDL language. For each query, present also the
    result. Can it be a multirelation ?
  • Consider that bnp.balance is in US and
    cic.balance is in FF. Consider that the exchange
    rate is in some table ExRa in DB called Currency.
    Is it possible to find accounts with the same
    balance in both DBs using a single MSQL query ?
  • Consider that to perform a multidatabase join A
    JON B one has to bring both tables into a
    database. What are your options for an
    elementary MDB query processing, if there are
    selections, joins, and projections ?
  • Idem, if you consider distributed join processing
    ?
  • Consider a multitable R (R1, R2, R3), a table T
    and the query Q
  • select from R where R.a T.a and T.b '123'
  • What are your option for Q's optimization ?
  • Propose an implementation for your favorite
    MSQL aggregate function (Ph. D. Thesis or a part
    of it at least)

84
  • END

85
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com