On Views and XML

Slides: 59
Provided by: ber91
Learn more at: https://www.cs.uic.edu


1
On Views and XML
  • Serge Abiteboul
  • INRIA
  • PODS 1999

2
Organization
  • Introduction
  • XML View and Query
  • Change Control
  • Objects
  • Structured and Semistructured Data
  • Active Features
  • Incomplete Information
  • more...

Many Facets!
3
Warning
  • This is not a survey on database views
  • This is not a tutorial on XML
  • This is about the use of XML and e-commerce as
    excuses to survey some works on views cast in a
    fashionable context: O2Views, views of OEM,
    ActiveViews, Lorel/Ozone...
  • (and also motivate future works)

4
Executive Summary: Database folks should be
interested in XML Views, and more and more are.
Footnote: this is a great way to recycle your old
results on views, incomplete information,
deductive databases, universal instance
assumption, dependency theory, etc.
5
Introduction: XML in short
  • Document mark-up language, descendant of SGML
  • Standard for data exchange on the Web
  • We are interested here in data exchange and not
    in document editing and retrieval

6
EXAMPLE: EDI (Electronic Data Interchange)
  • Standard for business data exchange
  • 2 standards:
  • ANSI X12 in US -- all B2G by end 1999
  • EDIFACT in world -- UN committee
  • translate → EDI → transmit

7
  • <!DOCTYPE Book-Order PUBLIC "-//Editor//DTD Book
    Order Message//EN">
  • <Book-Order Supplier="4012345000094"
    Send-to="http://www.bic.org/order.in">
  • <title>Editor Lite-EDI Book Ordering</title>
  • <Order-No>967634</Order-No>
  • <Message-Date>19961002</Message-Date>
  • <Buyer-EAN>5412345000176</Buyer-EAN>
  • <Order-Line Reference-No="0528837">
  • <ISBN>0316907235</ISBN>
  • <Author-Title>Labaln, Brian/Chrome</Author-Title>
  • <Quantity>2</Quantity>
  • </Order-Line>
  • <Order-Line Reference-No="0528838">
  • <ISBN>0856674427</ISBN>
  • <Author-Title>Parry, Linda (ed)/William
    Morris</Author-Title>
  • <Quantity>1</Quantity>
  • </Order-Line>
  • <input type="checkbox" name="partial"
    value="allowed"/>
  • <text>Tick here if a delayed/partial supply of
    order is acceptable</text>
  • <input type="checkbox" name="confirmation"
    value="requested"/>
  • <text>Tick here if Confirmation of Acceptance of
    Order is to be returned by e-mail</text>
  • <input type="checkbox" name="DeliveryNote"
    value="required"/>

data in XML/EDI
8
I personally prefer
9
XML
  • Some noise and confusion
  • Is the syntax important? No
  • What is XML?
  • the means to exchange tree/graph data on the Web
  • an object-oriented API for it
  • more

10
A (simplified) model for XML
  • XML-tree -> list(node)
  • node -> string | element | ref node
  • element -> label, list(att, string), list(node)
  • label -> string
  • att -> string
  • an attribute occurs at most once
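
The simplified model above can be written down as a small set of types. The following is only a sketch: the class names `Element` and `Ref` and the helper `text_of` are illustrative choices, not part of the talk. A dictionary is used for attributes because it enforces "an attribute occurs at most once" by construction.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass
class Ref:
    """The 'ref node' alternative: a pointer to another element."""
    target: Element


@dataclass
class Element:
    label: str
    # A dict keeps each attribute name at most once.
    attributes: dict[str, str] = field(default_factory=dict)
    children: list[Node] = field(default_factory=list)


# node -> string | element | ref node
Node = Union[str, Element, Ref]


def text_of(node: Node) -> str:
    """Concatenate the string leaves below a node, without following references."""
    if isinstance(node, str):
        return node
    if isinstance(node, Ref):
        return ""
    return "".join(text_of(child) for child in node.children)


# An XML-tree is a list of nodes; this one holds a single element.
person = Element("person", children=[
    Element("name", children=["Serge Abiteboul"]),
    "PODS invited speaker",
    Element("address", children=[
        Element("city", children=["Le Chesnay"]),
        Element("zip", children=["92310"]),
    ]),
])
xml_tree = [person]
```

Mixed content (a string sitting next to elements) falls out of the model naturally, since a child is just any node.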

11
XML in short
  • <person>
  • <name>Serge Abiteboul</name>PODS invited speaker
  • <a xml:link="simple" href="gif/serge.gif"> old
    picture</a>
  • <address> <city>Le Chesnay</city><zip>92310</zip>
    </address>
  • <a xml:link="simple"
    href="www-rocq.inria.fr/abitebou">Web</a>
  • </person>
  • DTD: grammar; DCD: some typing
  • DOM: object API; RDF: meta data
  • XPOINTER/XLINK ...

12
XML Views

[Figure: view server between information sources and Web browsers. Labels: query, publish and subscribe, crawler and filter engine, security manager, request broker, business intelligence, output/report/delivery, data warehouse, OLAP, image and video, reports, information repository, view server, Web browsers.]
13
What databases can bring to XML is query
optimization and query rewriting
View and Query
14
View and Query
  • like for relational model
  • use of query optimization techniques
  • use of query rewriting techniques
  • processing queries using views
  • main issue: virtual vs. materialized

15
B2C Comparative Shopping
  • http://www.addall.com
  • 24 bookstores searched in about 10 seconds
  • between 42 and 78
  • that's why people will use them!

16
What DB can bring to XML is the control of
changes
View and Change Control
17
Some of the most studied problems for relational
views
  • update propagation
  • incremental updates
  • view update problem

18
D2V: Incremental Updates
  • a customer has loaded portions of the catalog
  • some prices change
  • no need to reload the entire catalog
  • many such examples on the Web
  • Δ updates
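
A minimal sketch of the idea, assuming a hypothetical delta format of `(operation, product code, fields)` triples: the customer keeps a local copy of part of the catalog and applies only the changes instead of reloading everything. The product `X1` is made up for illustration.

```python
def apply_delta(cached, delta):
    """Apply a list of changes to a locally cached portion of the catalog.

    `cached` maps product codes to field dictionaries. Each delta entry is
    (operation, code, fields); this format is an assumption of the sketch.
    """
    for operation, code, fields in delta:
        if operation == "delete":
            cached.pop(code, None)
        elif operation == "upsert":
            # Only the fields that changed are shipped and merged.
            cached.setdefault(code, {}).update(fields or {})
        else:
            raise ValueError(f"unknown operation: {operation}")
    return cached


catalog_copy = {
    "F2GHYYRF": {"name": "Gismo223", "price": 1200},
    "X1": {"name": "hypothetical product", "price": 50},
}
delta = [
    ("upsert", "F2GHYYRF", {"price": 1100}),  # some prices change
    ("delete", "X1", None),
]
apply_delta(catalog_copy, delta)
```

The cost of a refresh is proportional to the size of the delta, not to the size of the catalog.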

19
V2D: View Update
  • Sometimes considered less of an issue: the Web is
    read only!
  • Many Web applications involve updates
  • We may be able to annotate the products of the
    catalog
  • some of the data is in read mode
  • some data is not visible (this is only a view!)
  • some data may be updated

20
Example: Change Detection
  • A customer (self) is in a department
    (self.department) and may want to see only the
    current promotions of products in this department
    (MyPromotions)
  • let MyPromotions be
  • select I.
  • from I in Catalog.promotions.item
  • where I.department = self.department
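
As a rough illustration, the MyPromotions view can be read as a function evaluated on demand over the catalog. The classes `Item` and `Customer` and the second catalog entry are assumptions of this sketch; the talk only gives the query.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    price: float
    department: str


@dataclass
class Customer:
    department: str


def my_promotions(promotion_items, customer):
    """Virtual view: recomputed from the catalog on each call, nothing is stored."""
    return [item for item in promotion_items
            if item.department == customer.department]


promotion_items = [
    Item("Gismos78", 234, "electronic"),
    Item("hypothetical novel", 12, "book"),
]
me = Customer(department="electronic")
visible = my_promotions(promotion_items, me)
```

Because the view is virtual, any change to the catalog is seen at the next evaluation; detecting that the result changed, without re-evaluating, is the harder problem the slide points at.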

21
Query Subscription: Changes (from Chawathe's
thesis)
  • Changes in label graphs as in DOEM

[Figure: labeled graph with labels Catalog, name, Gismos78, item, promotion, department, electronic, price, 234, department, self, 278.]
22
Query Subscription: Changes
  • Change of the value of an atomic vertex
  • Creation of a new vertex
  • Addition/removal of an edge
  • Change of the label on an edge: add/remove
  • Move of a vertex: add/remove
  • annotations on edges and vertices

23
Query Subscription: Queries
  • select P.code, P.description
  • from P in Catalog.product
  • where P.price <changed>Q (vertex annotation)
  • where P.<added>description (edge annotation)
  • where P.price (data in annotation)
  • <changed <old:Q', date:T>>Q
  • and Q - Q' > 100 and T > 99/04/03
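
The last query can be sketched with change annotations kept next to the value they describe. This is an illustration only: the classes `Changed` and `Product`, the second product, and the reading of 99/04/03 as 3 April 1999 are assumptions, and the actual DOEM representation may differ.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Changed:
    """Annotation recording one change of an atomic value: old value and date."""
    old: float
    when: date


@dataclass
class Product:
    code: str
    description: str
    price: float
    price_changes: list = field(default_factory=list)

    def set_price(self, new_price, when):
        # Every update leaves an annotation behind instead of erasing history.
        self.price_changes.append(Changed(old=self.price, when=when))
        self.price = new_price


def increased_since(products, min_increase, since):
    """Products with a change annotation after `since` whose old value Q'
    satisfies Q - Q' > min_increase, where Q is the current price."""
    return [
        (p.code, p.description)
        for p in products
        if any(c.when > since and p.price - c.old > min_increase
               for c in p.price_changes)
    ]


gismo = Product("F2GHYYRF", "Gismo223", 1200)
gismo.set_price(1350, date(1999, 5, 1))
other = Product("X1", "hypothetical product", 100)
other.set_price(120, date(1999, 5, 1))

result = increased_since([gismo, other], 100, date(1999, 4, 3))
```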

24
Query Subscription: Examples
  • On the first of each month, send me the list of
    all products in my interest list such that their
    price increased by more than 10
  • Each time there are ten new employees, send me
    their names and departments
  • Notify me if the price of this house decreases
  • similarity with active rules: on event, when
    condition, do action
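
The event/condition/action reading of a subscription can be sketched as follows. The names `Rule` and `Subscriptions`, the event name, and the payload fields are all hypothetical; the example encodes "notify me if the price of this house decreases".

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    event: str                          # on event
    condition: Callable[[dict], bool]   # when condition
    action: Callable[[dict], Any]       # do action


class Subscriptions:
    def __init__(self):
        self.rules = []

    def subscribe(self, rule):
        self.rules.append(rule)

    def publish(self, event, payload):
        """Run the action of every rule whose event matches and whose condition holds."""
        for rule in self.rules:
            if rule.event == event and rule.condition(payload):
                rule.action(payload)


notifications = []
subscriptions = Subscriptions()
subscriptions.subscribe(Rule(
    event="price_changed",
    condition=lambda e: e["new"] < e["old"],
    action=lambda e: notifications.append(
        f"price of {e['item']} dropped to {e['new']}"),
))

subscriptions.publish("price_changed", {"item": "house", "old": 300000, "new": 280000})
subscriptions.publish("price_changed", {"item": "house", "old": 280000, "new": 290000})
```

Periodic or counting subscriptions (first of each month, every ten new employees) would need timer events or state in the condition, which this sketch leaves out.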

25
XML and the World of Objects
The underlying model for XML is object-based
and XML views should be based on OO(DB)
technology
26
Views and the World of Objects
  • API for XML: Document Object Model (DOM)
  • Views: XML as object-oriented
  • Allows designing C++ or Java applications
  • E.g.
  • use subclass Promotion of XMLNode
  • Catalog.promotions is only a set of virtual
    elements
  • the list of promotions is generated on demand
    based on the nature of customers
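
A sketch of a virtual element, under the assumption of a hypothetical minimal `XMLNode` class (a stand-in for a DOM node, not the real DOM API): the children of the promotions element are computed each time they are read, based on the kind of customer. The customer kinds are made up for illustration.

```python
class XMLNode:
    """Hypothetical minimal node type standing in for a DOM node."""

    def __init__(self, label, text="", children=None):
        self.label = label
        self.text = text
        self._children = list(children or [])

    @property
    def children(self):
        return self._children


class VirtualPromotions(XMLNode):
    """Virtual element: its children are generated on demand, never stored."""

    def __init__(self, catalog, customer_kind):
        super().__init__("promotions")
        self._catalog = catalog
        self._customer_kind = customer_kind

    @property
    def children(self):
        return [XMLNode("item", text=name)
                for name, audience in self._catalog
                if audience == self._customer_kind]


catalog = [("Gismo223", "student"), ("Gismo78", "professional")]
promotions = VirtualPromotions(catalog, customer_kind="student")
names = [child.text for child in promotions.children]
```

An application written against `XMLNode` cannot tell whether it is reading stored or virtual elements, which is the point of presenting the view through the object API.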

27
Views in OODB: O2Views
  • Virtual values
  • like for relational views
  • entirely virtual XML document, e.g., view of
    relational data
  • virtual attributes
  • e.g., product: code, name, price, alternatives
  • alternatives: the set of products that are
    similar and are on promotion

28
Views in OODB: O2Views
  • Virtual class: a set of database objects that are
    grouped together and as such acquire a new
    interface
  • catalog1/DTD1, ..., catalog17/DTD17
  • products are represented differently in each
    catalog
  • a unique DTD that allows viewing all products
  • each product can be viewed with that DTD

29
Views in OODB: O2Views
  • Imaginary class: groups objects that are all
    virtual, e.g., join of two relations
  • For more, see Souza's thesis

30
XML data/views: semistructured and structured
data
XML should also allow the exchange of
structured data as in relational/ODMG models
31
Semistructured and Structured Data
  • If we know about the structure of data, not using
    it may damage performance
  • The use of structure facilitates the programming
    of applications, e.g., in Java
  • Structure may be useful to explain data to users
  • For more, see Lahiri's thesis and Ozone (OQL and
    Lorel)

32
Web catalog - continued
  • Product-basic, all products (regular data)
  • category=electronic, subcategory=sound,
  • name=Gismo223, code=F2GHYYRF,
  • selling-price=1200FF
  • Product-specific, for Gismos only (semistructured
    data)
  • voltage=list(110,220), Gismo-norm=GHTF333
  • External resources (external data)
  • description=http://m.ec.fr/cat/Gismo
  • reviews=http://reviews.com/Gismo
  • Private data (other regular data)
  • buying-price=100$, quantity-in-stock=20000,
    supplier=Sears, authorized-discount=30
33
This data in XML
  • <product>
  • <basic>
  • <cat> electronic <subcat>sound</subcat></cat>
  • <n>Gismo223</n><c>F2GHYYRF</c>
  • <sp currency="French-franc">1200</sp> </basic>
  • <specific>
  • <v>110</v><v>220</v>
  • <Gismo-norm>GHTF333</Gismo-norm> </specific>
  • <external> </external>
  • <private>
  • <bp currency="dollar">100</bp> <qis>20000</qis>
    <s>Sears</s> <ad>30</ad></private></product>

34
What is such data exactly?
  • A mix of structured and semistructured data with
    pointers between two worlds
  • Purely XML. Then
  • use a relation as a materialized view
  • Product(name, code, category, subcategory, price,
    rest)
  • Index on name and subcategory
  • select P.name, P.price from P in Product
  • where P.subcategory = "sound"
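
One way to picture "a relation as a materialized view" is to extract the regular part of each XML product into a relational table and keep the remainder as text. The sketch below uses Python's `sqlite3` and `xml.etree.ElementTree`; the choice of SQLite, the index names, and storing `rest` as serialized XML are assumptions, not something the talk prescribes.

```python
import sqlite3
import xml.etree.ElementTree as ET

PRODUCT_XML = """
<product>
  <basic>
    <cat>electronic<subcat>sound</subcat></cat>
    <n>Gismo223</n><c>F2GHYYRF</c>
    <sp currency="French-franc">1200</sp>
  </basic>
  <specific><v>110</v><v>220</v><Gismo-norm>GHTF333</Gismo-norm></specific>
</product>
"""


def materialize(conn, xml_documents):
    """Load the regular part of each product into Product(name, ..., rest)."""
    conn.execute(
        "CREATE TABLE Product (name TEXT, code TEXT, category TEXT, "
        "subcategory TEXT, price REAL, rest TEXT)")
    conn.execute("CREATE INDEX product_by_name ON Product (name)")
    conn.execute("CREATE INDEX product_by_subcategory ON Product (subcategory)")
    for document in xml_documents:
        root = ET.fromstring(document)
        basic = root.find("basic")
        cat = basic.find("cat")
        # Everything outside <basic> stays semistructured, kept as raw XML.
        rest = "".join(ET.tostring(child, encoding="unicode")
                       for child in root if child.tag != "basic")
        conn.execute(
            "INSERT INTO Product VALUES (?, ?, ?, ?, ?, ?)",
            (basic.findtext("n"), basic.findtext("c"),
             (cat.text or "").strip(), cat.findtext("subcat"),
             float(basic.findtext("sp")), rest))


conn = sqlite3.connect(":memory:")
materialize(conn, [PRODUCT_XML])
rows = conn.execute(
    "SELECT name, price FROM Product WHERE subcategory = ?",
    ("sound",)).fetchall()
```

The query on the slide then runs against the table and its indexes, and the XML is consulted only for the `rest` column.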

35
Digression: storage of XML
  • as blobs
  • generic mapping: ignore the structure
  • specific mapping
  • relational
  • object
  • hybrid

36
As blobs
  • <product> <basic> <cat> electronic
    <subcat>sound</subcat></cat>
    <n>Gismo223</n><c>F2GHYYRF</c>
    <sp currency="French-franc">1200</sp> </basic>
    <specific> <v>110</v><v>220</v>
    <Gismo-norm>GHTF333</Gismo-norm> </specific>
    <external> </external> <private>
    <bp currency="dollar">100</bp> <qis>20000</qis>
    <s>Sears</s> <ad>30</ad></private></product>
  • full-text index

37
Generic mapping
  • Element graph (parent, label, child)
    root product o1
    o1 basic o2
    o2 cat o3
    o2 subcat o4
    o2 n o5
    o2 c o6
    o2 sp o7 ...
  • Atomic objects (object, value)
    o3 electronic
    o4 sound
    o5 Gismo223
    o6 F2GHYYRF
    o7 1200 ...
  • Attributes (object, attribute, value)
    o7 currency French-franc
    o12 currency dollar ...
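
A generic mapping can be sketched as a function that shreds any XML document into three tables, whatever its structure. This is an illustration, not the scheme of the slide: the function name `shred` is made up, object identifiers are assigned in document order, an element's own text is stored under the element's identifier, and text that follows a child element is ignored.

```python
import itertools
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return (edges, values, attributes) tables for an XML document.

    edges:      (parent oid, label, child oid)
    values:     (oid, atomic value)
    attributes: (oid, attribute name, attribute value)
    """
    counter = itertools.count(1)
    edges, values, attributes = [], [], []

    def visit(parent_oid, element):
        oid = f"o{next(counter)}"
        edges.append((parent_oid, element.tag, oid))
        for name, value in element.attrib.items():
            attributes.append((oid, name, value))
        text = (element.text or "").strip()
        if text:
            values.append((oid, text))
        for child in element:
            visit(oid, child)

    visit("root", ET.fromstring(xml_text))
    return edges, values, attributes


edges, values, attributes = shred(
    '<product><basic><cat>electronic<subcat>sound</subcat></cat>'
    '<sp currency="French-franc">1200</sp></basic></product>'
)
```

The same three tables hold any document, which is what makes the mapping generic, at the price of many joins to reassemble one product.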
38
Specific
  • Class Product
  • type tuple( cat: string, subcat: set(string),
  • n: string, c: string, price: Price,
    specific: OEM,
  • external: list(tuple(label: string, val: URL)),
  • private pr: tuple(
  • bp: Price, qis: integer,
  • supplier: Company ) )
  • type Price: tuple(sum: int, currency: Currency)

39
What is better? Hybrid?
  • Need for comparative studies
  • My feeling/common sense?
  • Use structure for very structured portions of
    data
  • Use semistructured for less so or portions with
    very evolving structures
  • Use blobs for components accessed mostly via
    full-text indexing, e.g., paragraphs in a document

40
Views and Active Features
41
Active Views
  • System developed at INRIA
  • Long term goals
  • Declarative specification of data intensive
    applications with cooperation between partners
  • Ease of use and fast deployment
  • (Automatic) verification

42
Architecture
[Figure: architecture. Labels: Java, AVApi, DOM, O2, Java application, O2 Notification, Java RMI, XML repository, ActiveViews manager, Web browser, Java client.]
43
Motivations
  • Database Applications
  • passive behavior
  • closed systems
  • persistence, concurrency, access control
  • New needs
  • interactions between clients, e.g., notification
  • change control
  • reactive behavior
  • E.g., e-commerce, cooperative work

44
Illustration of Interactions: Notification
  • In the vendor view
  • when Customer.entersDept(dept)
  • if dept = self.dept
  • then notify me
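
The vendor rule can be sketched with a server that forwards customer events to the registered vendor views. The class names `VendorView` and `Server` and the method names are hypothetical; the real ActiveViews system is specified declaratively rather than hand-coded like this.

```python
class VendorView:
    def __init__(self, name, dept):
        self.name = name
        self.dept = dept
        self.inbox = []

    def on_enters_dept(self, customer, dept):
        # when Customer.entersDept(dept), if dept = self.dept, then notify me
        if dept == self.dept:
            self.inbox.append(f"{customer} entered {dept}")


class Server:
    """Stand-in for the server side: dispatches events to all vendor views."""

    def __init__(self):
        self.views = []

    def register(self, view):
        self.views.append(view)

    def enters_dept(self, customer, dept):
        for view in self.views:
            view.on_enters_dept(customer, dept)


server = Server()
book_vendor = VendorView("vendor in book dept", "book")
music_vendor = VendorView("vendor in music dept", "music")
server.register(book_vendor)
server.register(music_vendor)
server.enters_dept("customer", "book")
```

Here the condition is checked inside each view; a server managing many views would more likely index rules by department so that only matching views are contacted.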

45
Notification
[Figure: notification flow. Labels: AVClient (customer), entersDept book, AVServer, notify, notify, AVServer, AVClient (vendor in book dept).]
46
Illustration of Interaction: Change Control
  • In the customer view
  • let monitored MyPromotions be
  • select I.name, I.price
  • from I in Catalog.promotions.item
  • where I.department = self.department
  • modes: read, write, append, monitored, refresh,
    deferred
  • simpler case: monitoring of the catalog

47
Change control
[Figure: change control between two AVClients and two AVServers, with numbered steps: 1 Read, 2 Read, 3 Modification, 4 Write, 5 Notification, 6 Notification, 7 Read.]
48
Choices
  • All XML
  • XML repository
  • XML query language
  • XML views
  • Declarative specification
  • almost no code to write
  • compilation to an executable application
  • active rules

49
Important Aspects
  • workflow
  • e.g., customization: to search for a biblio ref,
    look first in my own files, otherwise look in
    dblp, otherwise look
  • activities (search, buy, accounting, chat)
  • active rules
  • logical traces
  • notifications

50
View and Incomplete Information
Use something like Imielinski-Lipski tables
51
Example portal
Q1 (comp): v1, v2, v3, v4, v5

Q2
comp  price
v1    109
v2    X
v3    99
v4    89
v5    Y

  • Q1: gismo vendors
  • {V | ∃P sell(V, gismo, P)}
  • Q1 = {v1, v2, v3, v4, v5}
  • Q2: price for each vendor
  • {(V, P) | sell(V, gismo, P)}
  • Q3: cheap gismo vendors
  • {V | ∃P (sell(V, gismo, P) and P < 80)}

Q3
comp  price  cond
v2    X      X < 80
v5    Y      Y < 80
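
The step from Q2 to Q3 can be sketched as a selection over a table with named unknowns, in the spirit of conditional tables. The class `Null` and the function `select_cheaper_than` are illustrative names; conditions are kept as plain strings here, whereas a real implementation would represent them symbolically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Null:
    """A named unknown value, such as X or Y in the table for Q2."""
    name: str


def select_cheaper_than(rows, limit):
    """Selection price < limit over (vendor, price) rows that may contain nulls.

    Known prices are filtered as usual. A row with an unknown price is kept,
    guarded by the condition under which it belongs to the answer.
    """
    result = []
    for vendor, price in rows:
        if isinstance(price, Null):
            result.append((vendor, price.name, f"{price.name} < {limit}"))
        elif price < limit:
            result.append((vendor, price, "true"))
    return result


q2 = [("v1", 109), ("v2", Null("X")), ("v3", 99), ("v4", 89), ("v5", Null("Y"))]
q3 = select_cheaper_than(q2, 80)
```

Vendors with a known price of 80 or more drop out, while v2 and v5 stay as possible answers until their prices are learned.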
52
Example: more portal
  • Load all electronic products
  • expiration, e.g., to recover storage space
  • for all products loaded before May 1st, discard
    images and text of annotations
  • give me the gismos that have been annotated by
    Jeff Ullman and the annotations

53
View workspace, distribution, cache...
Just to say, there is much more to it...
54
Conclusion
55
Some Challenges: Semistructured Data Processing
  • XML storage under non-generic form
  • XML query language optimization
  • XML bulk loading
  • data conversion, integration
  • incomplete information

56
Some Challenges: Change Control and View
Interaction
  • update detection
  • incremental propagation
  • temporal XML: versions, DOEM...
  • rule and trigger management
  • management of a large number of user active
    views (personalized)

57
Some Challenges: Workflow
  • workflow management: task sequencing
  • declarative specification of applications
  • program verification

58
Conclusion
Database folks should be interested in XML Views
and more and more are...