Views and Storing XML in Relational Databases - PowerPoint PPT Presentation

1 / 24
About This Presentation
Title:

Views and Storing XML in Relational Databases

Description:

Generally, these can all be thought of as different views over the data. A view is a named query ... If tuples change in the view, that should reflect in the ... – PowerPoint PPT presentation

Number of Views:130
Avg rating:3.0/5.0
Slides: 25
Provided by: zack4
Category:

less

Transcript and Presenter's Notes

Title: Views and Storing XML in Relational Databases


1
Views and Storing XML in Relational Databases
  • Susan B. Davidson
  • University of Pennsylvania
  • CIS330 Database Management Systems
  • November 11, 2008

Most slide content courtesy of Zack Ives.
2
What are views?
  • We frequently want to reference data in a way
    that differs from the way its stored
  • XML data ? HTML, text, etc.
  • Relational data ? XML data
  • Relational data ? Different relational
    representation
  • XML data ? Different XML representation
  • Generally, these can all be thought of as
    different views over the data
  • A view is a named query
  • It is the DB analog of methods with arguments in
    PL
  • We will start with views in the same model
    (relational, XML), and then address moving XML
    data to a relational format.

3
Views in SQL
  • A view is first created (CREATE VIEW).
  • We then use the name of the view to invoke the
    query (treating it as if it were the relation it
    returns)

Using the view SELECT FROM V, R WHERE V.B5
AND V.CR.C
Creating a view CREATE VIEW V(A,B,C) AS SELECT
A,B,C FROM R WHERE R.A 123
This expands in the query processor to SELECT
FROM (SELECT A,B,C FROM R WHERE
R.A 123) AS V, R WHERE V.B 5 AND V.C
R.C
4
Views in XQuery
  • In XQuery, views are defined as functions.

Creating a view declare function V() as
element(content) for r in
doc(R)/root/tree, a in r/a, b in r/b,
c in r/c where a 123 return
a, b, c
Using a view for v in V()/content, r in
doc(R)/root/treewhere v/b r/breturn v
5
Why are views useful?
  • In addition to describing transformations from
    one schema to another, views have several common
    uses
  • Providing security/access control
  • We can assign users permissions on different
    views
  • Can select rows or project out columns so we only
    reveal what we want!
  • Can be used as relations in other queries
  • Allows the user to query things that make more
    sense

6
Virtual vs. Materialized Views
  • A virtual view is a named query that is actually
    re-computed every time as shown on the previous
    slide
  • CREATE VIEW V(A,B,C) AS
  • SELECT A,B,C FROM R WHERE R.A 123
  • A materialized view is one that is computed once
    and its results are stored as a table
  • Think of this as a cached answer
  • These are incredibly useful!
  • Techniques exist for using materialized views to
    answer other queries

SELECT FROM V, RWHERE V.B 5 AND V.C R.C
7
Example
  • Suppose V is defined as select from R where
    R.A 123. If R is initially
  • Then the result of evaluating select from V
    is
  • Suppose we delete (123, 8, World) from R. Then
    the result of select from V should be

A B C 123 5
Atlas 456 10 Guide 123 8
World
A B C 123 5
Atlas 123 8 World
A B C 123 5
Atlas
8
Views Should Stay Fresh
  • Views behave, from the perspective of a query
    language, exactly like base relations
  • But theres an association that should be
    maintained
  • If tuples change in the base relation, they
    should change in the view (whether its
    materialized or not)
  • If tuples change in the view, that should reflect
    in the base relation(s)

9
View Maintenance and the View Update Problem
  • There exist algorithms to incrementally recompute
    a materialized view when the base relations
    change
  • We can try to propagate view changes to the base
    relations
  • This is not hard for the previous example.
  • However, there are lots of views that arent
    easily updatable
  • We can ensure views are updatable by enforcing
    certain constraints (e.g., no aggregation),but
    this limits the kinds of views we can have

R
S
R?S
10
Publishing XML Views of Relational Data
  • It can be done with SQL/XML, an extension of the
    SQL standard (see
    http//sqlxml.org)
  • (Dont confuse with old and lame SQLXML for SQL
    Server.)
  • select xmlelement(name Customer,
  • xmlelement(name CustID, c.CustId),
  • xmlelement(name CustName, c.CustName),
  • xmlelement(name City, c.City) )
  • from customers c
  • where c.Status preferred
  • This is a very valuable tool for B2B
    (business-to-business) data exchange. Available
    in Oracle, DB2, SQL Server.

11
Embedding XML in a Relational Database
  • Straightforward solution add attributes of type
    XML. Promoted by the same SQL/XML standard.
  • create table clients(
  • id int primary key not null,
  • name varchar(50),
  • status varchar(10),
  • contactinfo xml )
  • Available in Oracle, DB2, SQL Server. Syntax may
    vary. Above syntax is from DB2.

12
Querying Relationally Embedded XMLwith SQL/XML
  • SQL/XML standard specifies xmlexists and xmlquery
    for embedding XPath and XQuery into SQL. DB2
    syntax.
  • select name from clients
  • where xmlexists('c/Client/Addresszip"95116"'
  • passing clients.contactinfo as
    "c")
  • select name,
  • xmlquery('for e in c/Client/email return
    e'
  • passing contactinfo as "c")
  • from clients
  • where status 'Gold

13
SQL Extensions for XQuery (DB2)
  • for y in
  • db2-fnxmlcolumn('CLIENTS.CONTACTINFO')/Client/A
    ddress
  • return y
  • for y in db2-fnsqlquery('select contactinfo
  • from clients
  • where status''Gold''
  • )/Client
  • where y/Address/city"San Jose"
  • return (
  • if (y/email) then y/email
  • else y/Address )

14
More Uniform Ways for Storing XML in
an RDBMS Mapping Relational ? XML
  • We know the following
  • XML data is tree-like
  • XML is SEMI-structured
  • Theres some structured stuff, especially if it
    follows a DTD
  • There is some unstructured stuff, eg. text
  • Issues relate to describing XML structure,
    particularly parent/child in a relational
    encoding
  • Relations are flat
  • Tuples can be connected via foreign-key/primary-
    key links

15
The Simplest Way to Encode a Tree
  • Suppose we had
    XYZ
    14
  • If we have no IDs, we CREATE values
  • BinaryLikeEdge(key, label, type, value, parent)

What are shortcomings here?
16
Improved Edge Approach (Florescu/Kossmann)
  • Consider order, typing separate the values
  • Vint(vid, value)
  • Vstring(vid, value)
  • Edge(parent, ordinal, label, flag, target)

17
Inlining Techniques
  • But can we do better if we know the schema?
  • Folks at Wisconsin noted we can exploit the
    structured aspects of semi-structured XML
  • Often the DTD has a lot of required (and often
    singleton) child elements
  • Book(title, author, publisher)
  • Recall how normalization worked
  • Decompose until we have everything in a relation
    determined by the keys
  • But dont decompose any further than that
  • So we should try not to decompose XML beyond the
    point of singleton children

18
Details (Shanmugasundaram et al. )
  • Start with DTD, build a graph representing
    structure

tree
?
_at_id

content
_at_id


i-content
sub-content
  • The edges are annotated with ?, indicating
    optional, repetition of children (respectively)
  • They simplify the DTD to figure this out

19
Building Schemas
  • Now, they tried several alternatives that differ
    in how they handle elements w/multiple ancestors
  • Can create a separate relation for each path
  • Can create a single relation for each element
  • Can try to inline these
  • For tree examples, these are basically the same
  • Combine non-set-valued things with parent
  • Add separate relation for set-valued child
    elements
  • Create new keys as needed

author
book
name
20
Schemas for Our Example
  • TheRoot(rootID)
  • Content(parentID, id, _at_id)
  • Sub-content(parentID, varchar)
  • I-content(parentID, int)
  • If we suddenly changed DTD to content(sub-content, i-content?) what would
    happen?

21
XQuery to SQL
  • Inlining method needs external knowledge about
    the schema
  • Needs to supply the tags and info not stored in
    the tables
  • We can actually directly translate simple XQuery
    into SQL over the relations instead of
    reconstructing the XML then querying it.

22
An Example
  • for X in document(mydoc)/tree/contentwhere
    X/sub-content XYZreturn X
  • The steps of the path expression are generally
    joins
  • Except that some steps are eliminated by the
    fact weve inlined subelements
  • Lets try it over the schema
  • TheRoot(rootID)
  • Content(parentID, id, _at_id)
  • Sub-content(parentID, varchar)
  • I-content(parentID, int)

23
Native XML Storage
  • Storing XML in relations incurs inefficiencies
  • Storage techniques optimized for path retrieval
    may be better appropriate indexes may play a
    role
  • Techniques are already making it into commercial
    systems
  • Oracle 11g XML DB provides a high-performance,
    native XML storage and retrieval technology.
  • DB2 9 pureXML is a native XML database for your
    most demanding XML data server needs.

24
Summary
  • Weve seen that views are useful things, and are
    heavily used within database applications as well
    as for data exchange.
  • Views allow us to store and refer to the results
    of a query (materialized vs. virtual views)
  • Weve seen an example of a view that changes from
    XML to relations and weve even seen how such a
    view can be posed in XQuery and unfolded into
    SQL
  • We have also seen how to define XML views of
    relational data
Write a Comment
User Comments (0)
About PowerShow.com