Provided by: webC6
Learn more at: http://web.cs.wpi.edu
1
XML, XML Schema, XPath and XQuery Query Languages
  • CS561

Slides collated from several sources, including
D. Suciu at Univ. of Washington
2
XML Data
3
XML
  • W3C standard to complement HTML
  • origins: structured text (SGML)
  • motivation:
  • HTML describes presentation
  • XML describes content
  • HTML is an application of SGML; XML is a subset of SGML

4
From HTML to XML
HTML describes the presentation
5
HTML
  • <h1> Bibliography </h1>
  • <p> <i> Foundations of Databases </i>
  • Abiteboul, Hull, Vianu
  • <br> Addison Wesley, 1995
  • <p> <i> Data on the Web </i>
  • Abiteboul, Buneman, Suciu
  • <br> Morgan Kaufmann, 1999

6
XML
  • <bibliography>
  • <book> <title> Foundations </title>
  • <author> Abiteboul </author>
  • <author> Hull </author>
  • <author> Vianu </author>
  • <publisher> Addison Wesley </publisher>
  • <year> 1995 </year>
  • </book>
  • </bibliography>

XML describes the content
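Because the tags name the content rather than its layout, a program can read fields by name. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the variable names are illustrative and not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The bibliography from the slide, written out as a well-formed XML string.
BIB_XML = """
<bibliography>
  <book>
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)

# Each field is addressed by its element name, not by its position on a page.
books = [
    {
        "title": book.findtext("title"),
        "authors": [a.text for a in book.findall("author")],
        "publisher": book.findtext("publisher"),
        "year": int(book.findtext("year")),
    }
    for book in root.findall("book")
]
print(books)
```

The same extraction from the HTML version would have to guess which italic run is a title and which line holds the publisher.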
7
XML Terminology
  • tags: book, title, author, ...
  • start tag: <book>, end tag: </book>
  • elements: <book>...</book>, <author>...</author>
  • elements are nested
  • empty element: <red></red>, abbreviated <red/>
  • an XML document has a single root element

A well-formed XML document has properly nested, matching tags.
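Well-formedness can be checked mechanically by any XML parser. A small sketch, assuming Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


matching = is_well_formed("<book><title>Foundations</title></book>")
crossed = is_well_formed("<book><title>Foundations</book></title>")  # tags overlap
two_roots = is_well_formed("<book/><book/>")                         # no single root
empty_forms = (is_well_formed("<red></red>"), is_well_formed("<red/>"))
print(matching, crossed, two_roots, empty_forms)
```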
8
XML Attributes
  • <book price="55" currency="USD">
  • <title> Foundations of Databases </title>
  • <author> Abiteboul </author>
  • <year> 1995 </year>
  • </book>

attributes are alternative ways to represent data
9
More XML Oids and References
  • <person id="o555"> <name> Jane </name> </person>
  • <person id="o456"> <name> Mary </name>
  • <children idref="o123 o555"/>
  • </person>
  • <person id="o123" mother="o456"> <name> John </name>
  • </person>

oids and references in XML are just syntax
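Since the references are only syntax, the application has to resolve them itself. The sketch below builds an id index and follows the `idref` and `mother` attributes by hand. The `<people>` wrapper element is an assumption added so the fragment has a single root.

```python
import xml.etree.ElementTree as ET

# The three persons from the slide, wrapped in a hypothetical <people> root.
PEOPLE_XML = """
<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o123 o555"/></person>
  <person id="o123" mother="o456"><name>John</name></person>
</people>
"""

root = ET.fromstring(PEOPLE_XML)

# Index every person by its id attribute; XML itself does not do this for us.
by_id = {person.get("id"): person for person in root.iter("person")}


def name_of(oid: str) -> str:
    """Dereference an oid and return the person's name."""
    return by_id[oid].findtext("name")


mary = by_id["o456"]
children_of_mary = [name_of(ref) for ref in mary.find("children").get("idref").split()]
mother_of_john = name_of(by_id["o123"].get("mother"))
print(children_of_mary, mother_of_john)
```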
10
So Far
  • Differences between XML data and relational
    data?
  • Data model?
  • Typed?
  • Homogeneity?
  • Correctness?
  • Usage/Purpose ?

11
XML Data Model
  • Numerous competing models
  • Document Object Model (DOM)
  • class hierarchy (node, element, attribute, ...)
  • defines API to inspect/modify the document
  • XML query data model (formal)
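As an illustration of the DOM bullet, the sketch below uses Python's standard `xml.dom.minidom`, one implementation of the DOM API, to inspect and then modify a small document. The document content is a made-up example.

```python
from xml.dom.minidom import parseString

doc = parseString("<book><title>Foundations</title><year>1995</year></book>")
book = doc.documentElement

# Inspect: walk the child nodes and keep only element nodes.
child_names = [n.nodeName for n in book.childNodes if n.nodeType == n.ELEMENT_NODE]

# Modify: add an attribute and append a new element with a text child.
book.setAttribute("price", "55")
author = doc.createElement("author")
author.appendChild(doc.createTextNode("Abiteboul"))
book.appendChild(author)

serialized = book.toxml()
print(child_names)
print(serialized)
```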

12
XML Namespaces
  • http://www.w3.org/TR/REC-xml-names
  • name ::= prefix:localpart

<book xmlns:isbn="www.isbn-org.org/def">
<title> ... </title> <number> 15 </number>
<isbn:number> ... </isbn:number> </book>
13
XML Namespaces
  • syntactic: <number>, <isbn:number>
  • semantic: provide URL for shared schema

<tag xmlns:mystyle="http://...">
<mystyle:title> ... </mystyle:title>
<mystyle:number> ...
</tag>
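A short sketch of how a namespace-aware parser keeps the two `number` elements apart. It uses Python's `xml.etree.ElementTree`; the title and ISBN values are placeholders, since the slide elides them.

```python
import xml.etree.ElementTree as ET

# Element contents marked "placeholder" are assumptions, not from the slide.
BOOK_XML = """
<book xmlns:isbn="www.isbn-org.org/def">
  <title>placeholder title</title>
  <number>15</number>
  <isbn:number>placeholder-isbn</isbn:number>
</book>
"""

root = ET.fromstring(BOOK_XML)
NS = {"isbn": "www.isbn-org.org/def"}  # prefix-to-URI map used for lookups

plain_number = root.find("number").text
isbn_number = root.find("isbn:number", NS).text

# Internally the parser expands the prefix into the namespace URI.
tags = [child.tag for child in root]
print(plain_number, isbn_number, tags)
```

The prefix is only shorthand: what identifies the element is the pair (namespace URI, local part).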
14
So Far
  • What are namespaces good for ?
  • Are they typically available for relational
    databases?

15
Schemas for XML
16
DTD - Element Type Definitions
<!ELEMENT paper (title, author*, year,
(journal | conference))>
17
XML Schemas
  • generalizes DTDs (SGML derivative)
  • now, instead uses XML syntax
  • two main documents: structure and data types
  • XML Schema more powerful but more complex

18
XML Schema
  • <xsd:element name="paper" type="papertype"/>
  • <xsd:complexType name="papertype">
  • <xsd:sequence>
  • <xsd:element name="title" type="xsd:string"/>
  • <xsd:element name="author" minOccurs="0"/>
  • <xsd:element name="year"/>
  • <xsd:choice> <xsd:element name="journal"/>
  • <xsd:element name="conference"/>
  • </xsd:choice>
  • </xsd:sequence>
  • </xsd:complexType>

DTD: <!ELEMENT paper (title, author*, year,
(journal | conference))>
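A content model is a regular expression over child element names, so it can be checked with an ordinary regex engine. The sketch below assumes the DTD reading (title, author*, year, (journal | conference)) and is not a real DTD or XML Schema validator; the function name is illustrative.

```python
import re
import xml.etree.ElementTree as ET

# Each child name is followed by one space, so the model becomes a plain regex.
PAPER_MODEL = re.compile(r"title (author )*year (journal|conference) ")


def matches_paper_model(xml_text: str) -> bool:
    """Check only the sequence of child element names of <paper>."""
    paper = ET.fromstring(xml_text)
    if paper.tag != "paper":
        return False
    children = "".join(child.tag + " " for child in paper)
    return PAPER_MODEL.fullmatch(children) is not None


ok_two_authors = matches_paper_model(
    "<paper><title/><author/><author/><year/><journal/></paper>")
ok_no_author = matches_paper_model("<paper><title/><year/><conference/></paper>")
missing_year = matches_paper_model("<paper><title/><author/><journal/></paper>")
both_venues = matches_paper_model(
    "<paper><title/><year/><journal/><conference/></paper>")
print(ok_two_authors, ok_no_author, missing_year, both_venues)
```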
19
So Far
  • Differences between an XML Schema and a
    relational schema?
  • Purpose ? Do we need it ?
  • Definition time?
  • Strictness of typing ?
  • Underlying model ?

20
Elements versus Types in XML Schema
DTD: <!ELEMENT person (name, address)>
<xsd:element name="person"> <xsd:complexType>
<xsd:sequence> <xsd:element name="name" type="xsd:string"/>
<xsd:element name="address" type="xsd:string"/> </xsd:sequence>
</xsd:complexType> </xsd:element>
<xsd:element name="person" type="ttt"/>
<xsd:complexType name="ttt">
<xsd:sequence> <xsd:element name="name" type="xsd:string"/>
<xsd:element name="address" type="xsd:string"/> </xsd:sequence>
</xsd:complexType>
21
Elements versus Types in XML Schema
  • Types
  • Simple types (integers, strings, ...)
  • Complex types (regular expressions, like in DTDs)
  • Element-type-element alternation
  • Root element has a complex type
  • Complex type is a regular expression of elements
  • Those elements have their complex types ...
  • ...
  • Leaves have simple types

22
Local and Global Types in XML Schema
  • Local type
  • <xsd:element name="person">
    [define locally the person's type]
    </xsd:element>
  • Global type
    <xsd:element name="person" type="ttt"/>
    <xsd:complexType name="ttt">
    [define here the type ttt]
    </xsd:complexType>

Global types can be reused in other elements
23
Local vs. Global Elements in XML Schema
  • Local element
  • <xsd:complexType name="ttt">
    <xsd:sequence> <xsd:element name="address" type="..."/> ...
    </xsd:sequence> </xsd:complexType>
  • Global element
    <xsd:element name="address" type="..."/>
    <xsd:complexType name="ttt">
    <xsd:sequence> <xsd:element ref="address"/> ... </xsd:sequence>
    </xsd:complexType>

Global elements like in DTDs
24
Regular Expressions in XML Schema
  • Recall the element-type-element alternation
  • <xsd:complexType name="....">
    [regular expression on elements]
    </xsd:complexType>
  • Regular expressions
  • <xsd:sequence> A B C </...>
  • <xsd:choice> A B C </...>
  • <xsd:group> A B C </...>
  • <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>
  • <xsd:... minOccurs="0" maxOccurs="1"> .. </...>

25
Regular Expressions in XML Schema
  • Regular expressions
  • <xsd:sequence> A B C </...>
    = A B C
  • <xsd:choice> A B C </...>
    = A | B | C
  • <xsd:group> A B C </...>
    = (A B C)
  • <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>
    = (...)*
  • <xsd:... minOccurs="0" maxOccurs="1"> .. </...>
    = (...)?

26
Regular Expressions in XML Schema
  • Recall the element-type-element alternation
  • <xsd:complexType name="....">
    [regular expression on elements]
    </xsd:complexType>
  • Regular expressions
  • <xsd:sequence> A B C </...>
    = A B C
  • <xsd:choice> A B C </...>
    = A | B | C
  • <xsd:group> A B C </...>
    = (A B C)
  • <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>
    = (...)*
  • <xsd:... minOccurs="0" maxOccurs="1"> .. </...>
    = (...)?

27
Attributes in XML Schema
<xsd:element name="paper" type="papertype"/>
<xsd:complexType name="papertype">
<xsd:sequence> <xsd:element name="title" type="xsd:string"/>
. . . . . . </xsd:sequence>
<xsd:attribute name="language" type="xsd:NMTOKEN" fixed="English"/>
</xsd:complexType>
Attributes are associated with the type, not the
element. They can be attached only to complex types, so it takes
more work to add attributes to simple types.
28
Derived Types by Extensions
<complexType name="Address"> <sequence>
<element name="street" type="string"/>
<element name="city" type="string"/> </sequence> </complexType>
<complexType name="USAddress">
<complexContent> <extension base="ipo:Address"> <sequence>
<element name="state" type="ipo:USState"/>
<element name="zip" type="positiveInteger"/> </sequence>
</extension> </complexContent> </complexType>
Corresponds to inheritance
29
Key Constraints in XML
30
Keys in XML Schema
XML
<purchaseReport> <regions> <zip code="95819">
<part number="872-AA" quantity="1"/>
<part number="926-AA" quantity="1"/>
<part number="833-AA" quantity="1"/>
<part number="455-BX" quantity="1"/> </zip>
<zip code="63143"> <part number="455-BX" quantity="4"/> </zip>
</regions> <parts>
<part number="872-AA">Lawnmower</part>
<part number="926-AA">Baby Monitor</part>
<part number="833-AA">Lapis Necklace</part>
<part number="455-BX">Sturdy Shelves</part>
</parts> </purchaseReport>
XML Schema for Key
<key name="NumKey"> <selector xpath="parts/part"/>
<field xpath="@number"/> </key>
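What the NumKey constraint demands can be sketched in a few lines: every node chosen by the selector must have the field, and no two may share its value. This is an illustrative check in Python, not a schema validator; `check_key` is a hypothetical helper, and the report is shortened to two part numbers.

```python
import xml.etree.ElementTree as ET

REPORT_XML = """<purchaseReport>
  <regions>
    <zip code="95819">
      <part number="872-AA" quantity="1"/>
      <part number="455-BX" quantity="1"/>
    </zip>
    <zip code="63143"><part number="455-BX" quantity="4"/></zip>
  </regions>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="455-BX">Sturdy Shelves</part>
  </parts>
</purchaseReport>"""


def check_key(scope, selector, attribute):
    """Return the violations of a single-field key defined on an attribute."""
    seen, problems = set(), []
    for node in scope.findall(selector):
        value = node.get(attribute)
        if value is None:
            problems.append("missing @" + attribute)  # a key requires existence
        elif value in seen:
            problems.append("duplicate " + value)     # a key requires uniqueness
        else:
            seen.add(value)
    return problems


report = ET.fromstring(REPORT_XML)
parts_violations = check_key(report, "parts/part", "number")
regions_violations = check_key(report, "regions/zip/part", "number")
print(parts_violations, regions_violations)
```

The selector matters: part numbers are a key within `parts/part`, but the same attribute under `regions` repeats across zip codes.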
31
Keys in XML Schema
  • In general, syntax is

<key name="someDummyNameHere"> <selector xpath="p"/>
<field xpath="p1"/> <field xpath="p2"/> . . .
<field xpath="pk"/> </key>
Notes: All XPath expressions start at the
element currently being defined. Each field must
identify a single node.
32
Keys in XML Schema
  • Unique guarantees uniqueness
  • Key guarantees uniqueness and existence
  • All XPath expressions are restricted
  • /a/b | /a/c -- OK for selector
  • //a/b//c -- OK for field
  • Note: better than the DTD ID mechanism

33
Examples of Keys in XML Schema
  • Examples

<key name="fullName"> <selector xpath=".//person"/>
<field xpath="firstname"/> <field xpath="surname"/> </key>
<unique name="nearlyID"> <selector xpath=".//*"/>
<field xpath="@id"/> </unique>
Note: each person must have a single firstname and a single surname.
34
Foreign Keys in XML Schema
  • Example

<keyref name="personRef" refer="fullName">
<selector xpath=".//personPointer"/>
<field xpath="@first"/> <field xpath="@last"/> </keyref>
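A keyref says that every (first, last) pair on a personPointer must match some (firstname, surname) key value. The sketch below checks that by hand in Python. The sample document, including the `<db>` root and all names in it, is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data; the slides give only the constraint itself.
DOC_XML = """<db>
  <person><firstname>Jane</firstname><surname>Doe</surname></person>
  <person><firstname>John</firstname><surname>Roe</surname></person>
  <personPointer first="Jane" last="Doe"/>
  <personPointer first="Ann" last="Nobody"/>
</db>"""

root = ET.fromstring(DOC_XML)

# Key "fullName": the set of (firstname, surname) pairs over .//person.
full_names = {
    (p.findtext("firstname"), p.findtext("surname"))
    for p in root.findall(".//person")
}

# Keyref "personRef": each pointer's (@first, @last) must appear in the key.
dangling = [
    (ref.get("first"), ref.get("last"))
    for ref in root.findall(".//personPointer")
    if (ref.get("first"), ref.get("last")) not in full_names
]
print(dangling)
```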
35
So Far
  • Differences between keys/foreign keys in XML
    versus the relational model?
  • Purpose ?
  • Underlying model ?

36
XPath
  • The Basic Building Block

37
XPath
  • Goal: permit access to some nodes of the document
  • XPath main construct: axis navigation
  • Navigation step: axis::node-test[predicates]
  • Examples
  • descendant::node()
  • child::author
  • attribute::booktitle = "XML"
  • An XPath path consists of one or more navigation
    steps, separated by /
  • Navigation step: axis::node-test[predicates]
  • Examples
  • /descendant::node()/child::author
  • /descendant::node()/child::author[parent::*/attribute::booktitle = "XML"][2]

38
XPath
  • Goal: permit access to some nodes of the document
  • XPath main construct: axis navigation
  • Navigation step: axis::node-test[predicates]
  • Examples
  • descendant::node()
  • child::author
  • attribute::booktitle = "XML"

39
XPath
  • An XPath path consists of one or more navigation
    steps, separated by /
  • Navigation step: axis::node-test[predicates]
  • Examples
  • /descendant::node()/child::author
  • /descendant::node()/child::author[parent::*/attribute::booktitle = "XML"][2]
  • XPath offers shortcuts
  • no axis means child
  • // is equivalent to /descendant-or-self::node()/

40
XPath- Child Axis Navigation
  • author is shorthand for child::author.
  • Examples
  • aaa -- all the children nodes labeled aaa
  • aaa/bbb -- all the bbb grandchildren of aaa
    children
  • */bbb -- all the bbb grandchildren of any child
  • Notes
  • . -- the context node
  • / -- the root node

41
XPath- Child Axis Navigation
  • author is shorthand for child::author.
  • Examples
  • aaa -- all the children nodes labeled aaa (1,3)
  • aaa/bbb -- all the bbb grandchildren of aaa
    children (4)
  • */bbb -- all the bbb grandchildren of any child
    (4,6)
  • Notes
  • . -- the context node
  • / -- the root node

42
XPath- Child Axis Navigation
  • /doc -- all doc children of the root
  • ./aaa -- all aaa children of the context node
    (equivalent to aaa)
  • text() -- all text children of context node
  • node() -- all children of the context node
    (includes text nodes; attribute nodes are not
    children and are reached with @ instead)
  • .. -- parent of the context node
  • .// -- the context node and all its descendants
  • // -- the root node and all its descendants
  • //text() -- all the text nodes in the document
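The child-axis shorthands can be tried out with Python's `xml.etree.ElementTree`, which implements a limited subset of XPath. It does not support text() or node(), so `itertext()` stands in for //text(). The document below is a made-up example, with the `doc` element as the context node.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring(
    "<doc>"
    "<aaa><bbb>b1</bbb></aaa>"
    "<ccc><bbb>b2</bbb></ccc>"
    "<aaa/>"
    "</doc>"
)

aaa_children = DOC.findall("aaa")                          # aaa
same_with_dot = DOC.findall("./aaa")                       # ./aaa
aaa_bbb = [e.text for e in DOC.findall("aaa/bbb")]         # aaa/bbb
star_bbb = [e.text for e in DOC.findall("*/bbb")]          # */bbb
descendant_bbb = [e.text for e in DOC.findall(".//bbb")]   # .//bbb
text_nodes = list(DOC.itertext())                          # stand-in for //text()

print(len(aaa_children), aaa_bbb, star_bbb, descendant_bbb, text_nodes)
```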

43
Predicates
  • *[2] -- the second child element of the context node
  • chapter[5] -- the fifth chapter child of the context
    node
  • *[last()] -- the last child element of the context
    node
  • chapter[title = "introduction"] -- the chapter
    children of the context node that have one or
    more title children whose string-value is
    introduction (string-value is the concatenation of
    all text on descendant text nodes)
  • person[.//firstname = "Joe"] -- the person
    children of the context node that have in their
    descendants a firstname element with string-value
    Joe
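Some of these predicates can be run directly with Python's `xml.etree.ElementTree`; others fall outside its XPath subset and need a line of ordinary Python. The sketch below shows both cases on a made-up document. Position predicates are applied to a named tag, since ElementTree's documentation requires that.

```python
import xml.etree.ElementTree as ET

BOOK = ET.fromstring(
    "<book>"
    "<chapter><title>introduction</title></chapter>"
    "<chapter><title>axes</title></chapter>"
    "<chapter><title>predicates</title></chapter>"
    "<person><name><firstname>Joe</firstname></name></person>"
    "<person><name><firstname>Ann</firstname></name></person>"
    "</book>"
)

# Positional predicates on a named child.
second_chapter = BOOK.find("chapter[2]/title").text
last_chapter = BOOK.find("chapter[last()]/title").text

# chapter[title = "introduction"]: child whose text equals a value.
intro_chapters = BOOK.findall("chapter[title='introduction']")

# person[.//firstname = "Joe"] is not in ElementTree's subset,
# so the descendant test is written out in Python instead.
joes = [
    p for p in BOOK.findall("person")
    if any(f.text == "Joe" for f in p.iter("firstname"))
]
print(second_chapter, last_chapter, len(intro_chapters), len(joes))
```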

44
Axis navigation
  • So far, our expressions have moved us down by
    moving to children nodes.
  • Exceptions are
  • . -- stay where you are
  • / -- go to the root
  • // -- all descendants of the root
  • .// -- all descendants of the context node

45
Axis navigation
  • XPath has several axes: ancestor,
    ancestor-or-self, attribute, child, descendant,
    descendant-or-self, following, following-sibling,
    namespace, parent, preceding, preceding-sibling,
    self
  • Some of these describe single nodes:
  • self, parent
  • Some describe sequences of nodes:
  • All others
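To make the non-child axes concrete, the sketch below computes ancestor, following-sibling, and preceding-sibling by hand over a small tree. Python's `xml.etree.ElementTree` has no parent pointers, so a parent map is built first. The tree and the helper names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

ROOT = ET.fromstring("<a><b><d/><e/><f/></b><c/></a>")

# ElementTree nodes do not know their parent, so record it explicitly.
PARENT = {child: parent for parent in ROOT.iter() for child in parent}


def ancestors(node):
    """ancestor axis: parent, grandparent, ... up to the root element."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result


def following_siblings(node):
    """following-sibling axis: later children of the same parent."""
    if node not in PARENT:
        return []
    siblings = list(PARENT[node])
    return siblings[siblings.index(node) + 1:]


def preceding_siblings(node):
    """preceding-sibling axis, returned here in document order."""
    if node not in PARENT:
        return []
    siblings = list(PARENT[node])
    return siblings[:siblings.index(node)]


e = ROOT.find("b/e")
print([n.tag for n in ancestors(e)],
      [n.tag for n in following_siblings(e)],
      [n.tag for n in preceding_siblings(e)])
```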

46
XPath Navigation Axes
[Figure: the XPath navigation axes relative to a context node: ancestor, following-sibling, preceding-sibling, self, child, attribute, following, preceding, namespace, descendant]
47
XPath Abbreviated Syntax
(nothing)    child::
@            attribute::
//           /descendant-or-self::node()/
.            self::node()
.//          descendant-or-self::node()/
..           parent::node()
/            (document root)
48
XPath
  • Widely adopted -- in XML-Schema and in many query
    languages.
  • About as expressive as regular path expressions

49
So Far
  • Differences between SQL and XPATH?
  • What are similar query capabilities?
  • What features does SQL have, but not XPATH?
  • What features does XPATH support, but not SQL?
  • Is XPath a full-fledged query language?

50
Query Languages - XQuery
51
Summary of XQuery
  • FLWR expressions
  • FOR and LET expressions
  • Collections and sorting
  • Resources
  • XQuery: A Query Language for XML, Chamberlin,
    Florescu, et al.
  • W3C recommendation: www.w3.org/TR/xquery/

52
XQuery
  • Designed based on Quilt (which is based on
    XML-QL)
  • http://www.w3.org/TR/xquery/ (2/2001)
  • XML Query data model (ordered)

53
FLWR (Flower) Expressions
  • FOR ... LET... FOR... LET...
  • WHERE...
  • RETURN...

54
XQuery
  • Find the titles of all books published after 1995

FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title
What does the result look like?
55
XQuery
  • Find all book titles published after 1995

FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title
Result: <title> abc </title> <title> def </title>
<title> ghi </title>
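For readers more used to general-purpose languages, the FOR/WHERE/RETURN shape maps closely onto a list comprehension. The Python sketch below is an analogy rather than an XQuery implementation. The inline document stands in for bib.xml and its contents are made up.

```python
import xml.etree.ElementTree as ET

# Stand-in for document("bib.xml"); the data is hypothetical.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>abc</title><year>1999</year></book>"
    "<book><title>old</title><year>1990</year></book>"
    "<book><title>def</title><year>2001</year></book>"
    "</bib>"
)

titles = [
    ET.tostring(x.find("title"), encoding="unicode")   # RETURN $x/title
    for x in BIB.findall("book")                       # FOR $x IN /bib/book
    if int(x.findtext("year")) > 1995                  # WHERE $x/year > 1995
]
print(titles)
```

As in the query, the result is a sequence of title elements, not a new document with a single root.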
56
XQuery Example
FOR $a IN (document("bib.xml")
/bib/book[publisher = "Morgan Kaufmann"]/author)
RETURN <result>
$a,
FOR $t IN /bib/book[author = $a]/title
RETURN $t
</result>
57
XQuery Example
For each author of a book by Morgan Kaufmann,
list all books she published
FOR $a IN (document("bib.xml")
/bib/book[publisher = "Morgan Kaufmann"]/author)
RETURN <result>
$a,
FOR $t IN /bib/book[author = $a]/title
RETURN $t
</result>
What is the query result?
58
XQuery
  • Result
  • <result>
  • <author>Jones</author>
  • <title> abc </title>
  • <title> def </title>
  • </result>
  • <result>
  • <author>Jones</author>
  • <title> abc </title>
  • <title> def </title>
  • </result>
  • <result>
  • <author> Smith </author>
  • <title> ghi </title>
  • </result>

59
XQuery Example Duplicates
  • For each author of a book by Morgan Kaufmann,
    list all books she published

FOR $a IN distinct(document("bib.xml")
/bib/book[publisher = "Morgan Kaufmann"]/author)
RETURN <result>
$a,
FOR $t IN /bib/book[author = $a]/title
RETURN $t
</result>
distinct: a function that eliminates duplicates
60
Example XQuery Result
  • Result
  • <result>
  • <author>Jones</author>
  • <title> abc </title>
  • <title> def </title>
  • </result>
  • <result>
  • <author> Smith </author>
  • <title> ghi </title>
  • </result>
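The effect of distinct on the nested query can be reproduced in a few lines of Python. This is an analogy under two assumptions: the sample data is invented to match the results shown, and authors are compared by their text value.

```python
import xml.etree.ElementTree as ET

# Hypothetical bib.xml content chosen to reproduce the results on the slides.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>abc</title><author>Jones</author>"
    "<publisher>Morgan Kaufmann</publisher></book>"
    "<book><title>def</title><author>Jones</author>"
    "<publisher>Morgan Kaufmann</publisher></book>"
    "<book><title>ghi</title><author>Smith</author>"
    "<publisher>Morgan Kaufmann</publisher></book>"
    "</bib>"
)


def titles_by(author):
    """Inner FOR: titles of all books that list the given author."""
    return [
        b.findtext("title")
        for b in BIB.findall("book")
        if author in [a.text for a in b.findall("author")]
    ]


# Outer FOR: one binding per author element, so Jones appears twice.
mk_authors = [a.text for a in BIB.findall("book[publisher='Morgan Kaufmann']/author")]
without_distinct = [(a, titles_by(a)) for a in mk_authors]

# With duplicate elimination, keeping first-seen order.
with_distinct = [(a, titles_by(a)) for a in dict.fromkeys(mk_authors)]
print(without_distinct)
print(with_distinct)
```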

61
XQuery
  • FOR $x IN expr
  • binds $x to each element in the list expr
  • Useful for iteration over some input list
  • LET $x := expr
  • binds $x to the entire list expr
  • Useful for common subexpressions and for grouping
    and aggregations

62
XQuery with LET Clause
<big_publishers>
FOR $p IN distinct(document("bib.xml")//publisher)
LET $b := document("bib.xml")/book[publisher = $p]
WHERE count($b) > 100
RETURN $p
</big_publishers>
count: an (aggregate) function that returns the
number of elements
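The FOR/LET/WHERE combination above is a group-and-filter pattern. A rough Python analogy follows, with invented data and a threshold of 1 instead of 100 so that the tiny sample produces a result.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>t1</title><publisher>A Press</publisher></book>"
    "<book><title>t2</title><publisher>A Press</publisher></book>"
    "<book><title>t3</title><publisher>B Press</publisher></book>"
    "</bib>"
)
THRESHOLD = 1  # the slide uses 100; lowered for the small sample

big_publishers = []
# FOR $p IN distinct(//publisher): one iteration per distinct publisher name.
for p in dict.fromkeys(e.text for e in BIB.iter("publisher")):
    # LET $b := the whole collection of that publisher's books.
    b = [book for book in BIB.iter("book") if book.findtext("publisher") == p]
    # WHERE count($b) > threshold ... RETURN $p
    if len(b) > THRESHOLD:
        big_publishers.append(p)
print(big_publishers)
```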
63
XQuery
  • Find books whose price is larger than average

LET $a := avg(document("bib.xml")/bib/book/@price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/@price > $a
RETURN $b
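Here LET binds one aggregate value that the FOR loop then compares against. A Python analogy, with made-up prices of 30, 50, and 70:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = ET.fromstring(
    "<bib>"
    '<book price="30"><title>cheap</title></book>'
    '<book price="50"><title>middle</title></book>'
    '<book price="70"><title>costly</title></book>'
    "</bib>"
)

books = BIB.findall("book")

# LET $a := avg(/bib/book/@price): computed once, over the whole collection.
average = sum(float(b.get("price")) for b in books) / len(books)

# FOR $b IN /bib/book WHERE $b/@price > $a RETURN $b
above_average = [b.findtext("title") for b in books if float(b.get("price")) > average]
print(average, above_average)
```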
64
FOR versus LET
  • FOR
  • Binds node variables → iteration
  • LET
  • Binds collection variables → one value

65
FOR v.s. LET
FOR $x IN document("bib.xml")/bib/book
RETURN <result> $x </result>
Returns: <result> <book>...</book> </result>
<result> <book>...</book> </result>
<result> <book>...</book> </result> ...
LET $x := document("bib.xml")/bib/book
RETURN <result> $x </result>
Returns: <result> <book>...</book>
<book>...</book>
<book>...</book> ... </result>
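The difference is simply whether the RETURN clause runs once per item or once for the whole collection. A Python sketch with a made-up three-book document; `wrap` is a hypothetical helper that plays the role of the `<result>` constructor.

```python
import xml.etree.ElementTree as ET

BIB = ET.fromstring(
    "<bib>"
    "<book><title>a</title></book>"
    "<book><title>b</title></book>"
    "<book><title>c</title></book>"
    "</bib>"
)
books = BIB.findall("book")


def wrap(children):
    """Build a <result> element around the given list of elements."""
    result = ET.Element("result")
    result.extend(children)
    return result


# FOR: $x is bound to each book in turn, so RETURN runs three times.
for_results = [wrap([x]) for x in books]

# LET: $x is bound to the whole list, so RETURN runs once.
let_result = wrap(books)

print(len(for_results), [len(r) for r in for_results], len(let_result))
```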
66
Collections in XQuery
  • Ordered and unordered collections
  • /bib/book/author: an ordered collection
  • distinct(/bib/book/author): an unordered
    collection
  • LET $a := /bib/book → $a is a collection
  • $b/author → a collection (several authors...)

RETURN <result> $b/author </result>
Returns: <result> <author>...</author>
<author>...</author>
<author>...</author>
... </result>
67
XQuery Summary
  • FOR-LET-WHERE-RETURN = FLWR

FOR/LET Clauses
List of tuples
WHERE Clause
List of tuples
RETURN Clause
Instances of XQuery data model
68
XQuery
  • Some more query features

69
Sorting in XQuery
<publisher_list>
FOR $p IN distinct(document("bib.xml")//publisher)
RETURN <publisher>
<name> $p/text() </name> ,
FOR $b IN document("bib.xml")//book[publisher = $p]
RETURN <book>
$b/title ,
$b/@price
</book> SORTBY (price DESCENDING)
</publisher> SORTBY (name)
</publisher_list>
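The two SORTBY clauses amount to a nested sort: publishers by name, and within each publisher, books by descending price. A Python analogy with invented data:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for bib.xml.
BIB = ET.fromstring(
    "<bib>"
    '<book price="40"><title>t1</title><publisher>B Press</publisher></book>'
    '<book price="90"><title>t2</title><publisher>A Press</publisher></book>'
    '<book price="60"><title>t3</title><publisher>A Press</publisher></book>'
    "</bib>"
)


def books_of(name):
    """Books of one publisher, most expensive first (SORTBY price DESCENDING)."""
    own = [b for b in BIB.iter("book") if b.findtext("publisher") == name]
    return sorted(own, key=lambda b: float(b.get("price")), reverse=True)


# Outer SORTBY (name): distinct publisher names in ascending order.
publisher_names = sorted({p.text for p in BIB.iter("publisher")})

publisher_list = [
    (name, [(b.findtext("title"), b.get("price")) for b in books_of(name)])
    for name in publisher_names
]
print(publisher_list)
```

Unlike SORTBY in the slides, Python's `sorted` takes its key from the input items, so sorting on a value that is not displayed needs no extra step here.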
70
Sorting in XQuery
  • Sorting arguments refer to name space of RETURN
    clause, not of FOR clause
  • TIP: To sort on an element you don't want to
    display, first return it, then remove it with an
    additional query.

71
If-Then-Else
FOR $h IN //holding
RETURN <holding>
$h/title,
IF $h/@type = "Journal"
THEN $h/editor
ELSE $h/author
</holding> SORTBY (title)
72
Existential Quantifiers
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
contains($p, "sailing") AND contains($p, "windsurfing")
RETURN $b/title
73
Universal Quantifiers
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
contains($p, "sailing")
RETURN $b/title
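SOME and EVERY correspond to Python's `any` and `all`. The sketch below runs both quantifiers over a made-up library. One detail worth noticing is that a book with no paragraphs satisfies EVERY vacuously, in Python as in XQuery.

```python
import xml.etree.ElementTree as ET

# Hypothetical data for the two quantifier queries.
LIB = ET.fromstring(
    "<lib>"
    "<book><title>Water Sports</title>"
    "<para>sailing and windsurfing</para><para>sailing knots</para></book>"
    "<book><title>Boats</title>"
    "<para>sailing</para><para>rowing</para></book>"
    "<book><title>Empty</title></book>"
    "</lib>"
)


def text_of(para):
    """String value of a paragraph: all text in it and its descendants."""
    return "".join(para.itertext())


# WHERE SOME $p IN $b//para SATISFIES contains(...) AND contains(...)
some_titles = [
    b.findtext("title")
    for b in LIB.iter("book")
    if any("sailing" in text_of(p) and "windsurfing" in text_of(p)
           for p in b.iter("para"))
]

# WHERE EVERY $p IN $b//para SATISFIES contains($p, "sailing")
every_titles = [
    b.findtext("title")
    for b in LIB.iter("book")
    if all("sailing" in text_of(p) for p in b.iter("para"))
]
print(some_titles, every_titles)
```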
74
So Far
  • Similarities between SQL and XQuery?
  • Differences between SQL and XQuery?

75
XML, XML Data Model, XML Schema, XPath, XQuery