XML, XML Schema, XPath and XQuery

Slides: 51
Provided by: webCsWpi73
Learn more at: http://web.cs.wpi.edu

1
XML, XML Schema, XPath and XQuery
Slides collated from various sources, many from
Dan Suciu at Univ. of Washington
2
XML
  • W3C standard to complement HTML
  • origins: structured text, SGML
  • motivation:
  • HTML describes presentation
  • XML describes content
  • http://www.w3.org/TR/2000/REC-xml-20001006
    (XML 1.0, second edition, 10/2000)

3
From HTML to XML
HTML describes the presentation
4
HTML
  • <h1> Bibliography </h1>
  • <p> <i> Foundations of Databases </i>
  • Abiteboul, Hull, Vianu
  • <br> Addison Wesley, 1995
  • <p> <i> Data on the Web </i>
  • Abiteboul, Buneman, Suciu
  • <br> Morgan Kaufmann, 1999

5
XML
  • <bibliography>
  • <book> <title> Foundations </title>
  • <author> Abiteboul </author>
  • <author> Hull </author>
  • <author> Vianu </author>
  • <publisher> Addison Wesley </publisher>
  • <year> 1995 </year>
  • </book>
  • </bibliography>

XML describes the content
6
XML Terminology
  • tags: book, title, author, ...
  • start tag: <book>, end tag: </book>
  • elements: <book>...</book>, <author>...</author>
  • elements are nested
  • empty element: <red></red>, abbreviated <red/>
  • an XML document has a single root element

An XML document is well formed if its tags match and nest properly
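
As a rough illustration of well-formedness, the sketch below asks Python's standard `xml.etree.ElementTree` parser whether a string parses. The helper name `is_well_formed` and the three sample strings are made up for this example; they are not part of the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<book><title>Foundations</title><red/></book>"
bad_nesting = "<book><title>Foundations</book></title>"  # tags overlap
two_roots = "<book/><book/>"                              # no single root

print(is_well_formed(good), is_well_formed(bad_nesting), is_well_formed(two_roots))
```

A parser only checks syntax here; whether the document follows a DTD or schema is a separate question.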
7
More XML: Attributes
  • <book price="55" currency="USD">
  • <title> Foundations of Databases </title>
  • <author> Abiteboul </author>
  • <year> 1995 </year>
  • </book>

attributes are alternative ways to represent data
8
More XML: Oids and References
  • <person id="o555"> <name> Jane </name> </person>
  • <person id="o456"> <name> Mary </name>
  • <children idref="o123 o555"/>
  • </person>
  • <person id="o123" mother="o456"> <name> John </name>
  • </person>

oids and references in XML are just syntax
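
One way to read "just syntax": a plain parser treats `id`, `idref` and `mother` as ordinary string attributes, so any resolution has to be done by the application. The sketch below does that with a dictionary. The wrapping `<people>` root and the helper `name_of` are assumptions added so the fragment parses and runs.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name>
    <children idref="o123 o555"/>
  </person>
  <person id="o123" mother="o456"><name>John</name></person>
</people>
""")

# Index every person by its id attribute; the parser gives ids no meaning.
by_id = {p.get("id"): p for p in doc.iter("person")}

def name_of(oid: str) -> str:
    """Follow an oid to the person it names and return that person's name."""
    return by_id[oid].findtext("name").strip()

mary = by_id["o456"]
children = [name_of(ref) for ref in mary.find("children").get("idref").split()]
john_mother = name_of(by_id["o123"].get("mother"))
print(children, john_mother)
```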
9
XML Namespaces
  • http://www.w3.org/TR/REC-xml-names (1/99)
  • name ::= prefix:localpart

<book xmlns:isbn="www.isbn-org.org/def">
  <title> ... </title> <number> 15 </number>
  <isbn:number> ... </isbn:number>
</book>
10
XML Namespaces
  • syntactic: <number>, <isbn:number>
  • semantic: provide URL for schema

<tag xmlns:mystyle="http://...">
  <mystyle:title> ... </mystyle:title>
  <mystyle:number> ...
</tag>
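
A small sketch of the syntactic side, using Python's `xml.etree.ElementTree`: the parser expands a prefixed name into `{namespace}localpart`, so `number` and `isbn:number` are different tags. The namespace string comes from the slide; the element contents (`T`, `0-000`) are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

ISBN_NS = "www.isbn-org.org/def"  # namespace name as written on the slide

book = ET.fromstring(
    f'<book xmlns:isbn="{ISBN_NS}">'
    "<title>T</title><number>15</number>"
    "<isbn:number>0-000</isbn:number>"
    "</book>"
)

plain = book.find("number")                               # unqualified name
qualified = book.find("isbn:number", {"isbn": ISBN_NS})   # prefix resolved via a map
tags = [child.tag for child in book]
print(tags)
```

The prefix itself is not significant; only the namespace name it is bound to is.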
11
XML Data Model
  • Several competing models
  • Document Object Model (DOM)
  • http://www.w3.org/TR/2001/WD-DOM-Level-3-CMLS-20010209/
    (2/2001)
  • class hierarchy (node, element, attribute, ...)
  • objects have behavior
  • defines API to inspect/modify the document
  • Infoset - PSV (post schema validation)
  • XML Query data model
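
To make "defines API to inspect/modify the document" concrete, here is a minimal sketch with `xml.dom.minidom`, the DOM implementation shipped with Python. The sample document is made up; other DOM implementations expose the same kind of calls, but this is only one of them.

```python
from xml.dom import minidom

dom = minidom.parseString("<book><title>Foundations</title><year>1995</year></book>")
root = dom.documentElement

# Inspect: every node is an object with a node type and a name.
child_names = [n.nodeName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

# Modify: create a new element and attach it through the API.
author = dom.createElement("author")
author.appendChild(dom.createTextNode("Abiteboul"))
root.appendChild(author)
root.setAttribute("price", "55")

serialized = root.toxml()
print(serialized)
```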

12
XML Schemas
  • http://www.w3.org/TR/xmlschema-1/ (10/2000)
  • generalizes DTDs
  • uses XML syntax
  • two documents: structure and datatypes
  • http://www.w3.org/TR/xmlschema-1
  • http://www.w3.org/TR/xmlschema-2
  • XML-Schema is complex

13
XML Schemas
  • <xsd:element name="paper" type="papertype"/>
  • <xsd:complexType name="papertype">
  • <xsd:sequence>
  • <xsd:element name="title" type="xsd:string"/>
  • <xsd:element name="author" minOccurs="0"/>
  • <xsd:element name="year"/>
  • <xsd:choice> <xsd:element name="journal"/>
    <xsd:element name="conference"/>
  • </xsd:choice>
  • </xsd:sequence>
  • </xsd:complexType>

DTD: <!ELEMENT paper (title, author?, year, (journal|conference))>
14
Elements vs. Types in XML Schema
<xsd:element name="person">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="address" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

<xsd:element name="person" type="ttt"/>
<xsd:complexType name="ttt">
  <xsd:sequence>
    <xsd:element name="name" type="xsd:string"/>
    <xsd:element name="address" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

DTD: <!ELEMENT person (name,address)>
15
Elements vs. Types in XML Schema
  • Types
  • Simple types (integers, strings, ...)
  • Complex types (regular expressions, like in DTDs)
  • Element-type-element alternation
  • Root element has a complex type
  • That type is a regular expression of elements
  • Those elements have their complex types...
  • ...
  • On the leaves we have simple types

16
Local and Global Types in XML Schema
  • Local type:
  • <xsd:element name="person">
      [define locally the person's type]
    </xsd:element>
  • Global type:
  • <xsd:element name="person" type="ttt"/>
    <xsd:complexType name="ttt">
      [define here the type ttt]
    </xsd:complexType>

Global types can be reused in other elements
17
Local vs. Global Elements in XML Schema
  • Local element:
  • <xsd:complexType name="ttt">
      <xsd:sequence>
        <xsd:element name="address" type="..."/> ...
      </xsd:sequence>
    </xsd:complexType>
  • Global element:
  • <xsd:element name="address" type="..."/>
    <xsd:complexType name="ttt">
      <xsd:sequence>
        <xsd:element ref="address"/> ...
      </xsd:sequence>
    </xsd:complexType>

Global elements: like in DTDs
18
Regular Expressions in XML Schema
  • Recall the element-type-element alternation:
  • <xsd:complexType name="....">
      [regular expression on elements]
    </xsd:complexType>
  • Regular expressions:
  • <xsd:sequence> A B C </...>
    = A B C
  • <xsd:choice> A B C </...>
    = A | B | C
  • <xsd:group> A B C </...>
    = (A B C)
  • <xsd:... minOccurs="0" maxOccurs="unbounded"> .. </...>
    = (...)*
  • <xsd:... minOccurs="0" maxOccurs="1"> .. </...>
    = (...)?
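
The "complex type = regular expression over child elements" idea can be sketched with an ordinary regex over the sequence of child tag names. This is not a schema validator. The content model below (title, optional author, year, then journal or conference) is assumed for illustration, and `matches_model` is a hypothetical helper.

```python
import re
import xml.etree.ElementTree as ET

# Content model for <paper>, written over child tag names.
# Each name is followed by a space so that tags cannot run together.
PAPER_MODEL = re.compile(r"title (author )?year (journal |conference )")

def matches_model(paper: ET.Element) -> bool:
    """Check the order and presence of children; types of leaves are ignored."""
    child_tags = "".join(child.tag + " " for child in paper)
    return PAPER_MODEL.fullmatch(child_tags) is not None

ok = ET.fromstring("<paper><title/><year/><journal/></paper>")
both = ET.fromstring("<paper><title/><year/><journal/><conference/></paper>")
no_year = ET.fromstring("<paper><title/><author/><conference/></paper>")
print(matches_model(ok), matches_model(both), matches_model(no_year))
```

Here the sequence maps to concatenation, the choice to `|`, and `minOccurs="0"` with the default `maxOccurs` to `?`.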

19
Attributes in XML Schema
<xsd:element name="paper" type="papertype"/>
<xsd:complexType name="papertype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    . . . . . .
  </xsd:sequence>
  <xsd:attribute name="language" type="xsd:NMTOKEN" fixed="English"/>
</xsd:complexType>

Attributes are associated with the type, not with the element.
Only complex types can carry them; it takes more work to add
attributes to simple types.
20
Derived Types by Extensions
<complexType name="Address">
  <sequence>
    <element name="street" type="string"/>
    <element name="city" type="string"/>
  </sequence>
</complexType>

<complexType name="USAddress">
  <complexContent>
    <extension base="ipo:Address">
      <sequence>
        <element name="state" type="ipo:USState"/>
        <element name="zip" type="positiveInteger"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

Corresponds to inheritance
21
Keys in XML Schema
XML:
<purchaseReport>
  <regions>
    <zip code="95819">
      <part number="872-AA" quantity="1"/>
      <part number="926-AA" quantity="1"/>
      <part number="833-AA" quantity="1"/>
      <part number="455-BX" quantity="1"/>
    </zip>
    <zip code="63143">
      <part number="455-BX" quantity="4"/>
    </zip>
  </regions>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="926-AA">Baby Monitor</part>
    <part number="833-AA">Lapis Necklace</part>
    <part number="455-BX">Sturdy Shelves</part>
  </parts>
</purchaseReport>

XML Schema:
<key name="NumKey">
  <selector xpath="parts/part"/>
  <field xpath="@number"/>
</key>
22
Keys in XML Schema
  • In general, two flavors

<key name="someDummyNameHere">
  <selector xpath="p"/>
  <field xpath="p1"/>
  <field xpath="p2"/>
  . . .
  <field xpath="pk"/>
</key>

<unique name="someDummyNameHere">
  <selector xpath="p"/>
  <field xpath="p1"/>
  <field xpath="p2"/>
  . . .
  <field xpath="pk"/>
</unique>

Note: all XPath expressions start at the element currently being
defined. Each field must identify a single node.
23
Keys in XML Schema
  • Unique guarantees uniqueness
  • Key guarantees uniqueness and existence
  • All XPath expressions are restricted:
  • /a/b | /a/c OK for selector
  • //a/b/*/c OK for field
  • Note: better than the DTD ID mechanism
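
The difference between the two flavors can be sketched in a few lines of Python: both reject duplicate values, but only a key also rejects a selected node whose field is missing. The function `check` and the deliberately broken report below are illustrative assumptions, not a schema processor.

```python
import xml.etree.ElementTree as ET

def check(root, selector, attr, require_present):
    """Return a list of violations for a one-field constraint on an attribute.

    require_present=True  behaves like <key>    (unique and must exist)
    require_present=False behaves like <unique> (unique when present)
    """
    seen, problems = set(), []
    for node in root.findall(selector):
        value = node.get(attr)
        if value is None:
            if require_present:
                problems.append("missing " + attr)
            continue
        if value in seen:
            problems.append("duplicate " + value)
        seen.add(value)
    return problems

# A broken variant of the purchase report: one repeated and one missing number.
report = ET.fromstring("""
<purchaseReport>
  <parts>
    <part number="872-AA">Lawnmower</part>
    <part number="872-AA">Baby Monitor</part>
    <part>Lapis Necklace</part>
  </parts>
</purchaseReport>
""")

as_key = check(report, "parts/part", "number", require_present=True)
as_unique = check(report, "parts/part", "number", require_present=False)
print(as_key, as_unique)
```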

24
Keys in XML Schema
  • Examples

<key name="fullName">
  <selector xpath=".//person"/>
  <field xpath="forename"/>
  <field xpath="surname"/>
</key>

<unique name="nearlyID">
  <selector xpath=".//*"/>
  <field xpath="@id"/>
</unique>

Recall: each person must have a single forename and a single surname
25
Foreign Keys in XML Schema
  • Example

<keyref name="personRef" refer="fullName">
  <selector xpath=".//personPointer"/>
  <field xpath="@first"/>
  <field xpath="@last"/>
</keyref>
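
What a keyref asks for can be sketched as a set-membership test: every (first, last) pair on a personPointer must equal the (forename, surname) pair of some person. The sample document, including the `<staff>` root and all names in it, is invented for this example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<staff>
  <person><forename>Jane</forename><surname>Doe</surname></person>
  <person><forename>John</forename><surname>Roe</surname></person>
  <personPointer first="Jane" last="Doe"/>
  <personPointer first="Mary" last="Major"/>
</staff>
""")

# The key: (forename, surname) of every person below the root.
full_names = {(p.findtext("forename"), p.findtext("surname"))
              for p in doc.findall(".//person")}

# The keyref: every (@first, @last) pair must appear among the key values.
dangling = [(ptr.get("first"), ptr.get("last"))
            for ptr in doc.findall(".//personPointer")
            if (ptr.get("first"), ptr.get("last")) not in full_names]
print(dangling)
```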
26
XPATH
27
XPath
  • Goal: give access to selected nodes of a document
  • XPath main construct: axis navigation
  • An XPath path consists of one or more navigation
    steps, separated by /
  • Navigation step: axis::node-test[predicates]
  • Examples:
  • /descendant::node()/child::author
  • /descendant::node()/child::author[parent::*/attribute::booktitle = "XML2"]
  • XPath also offers shortcuts:
  • no axis means child
  • // ≡ /descendant-or-self::node()/

28
XPath: Child axis navigation
  • author is shorthand for child::author. Examples:
  • aaa -- all the child nodes labeled aaa (1,3)
  • aaa/bbb -- all the bbb grandchildren of aaa
    children (4)
  • */bbb -- all the bbb grandchildren of any child
    (4,6)
  • . -- the context node
  • / -- the root node

29
XPath: Child axis navigation
  • /doc -- all the doc children of the root
  • ./aaa -- all the aaa children of the context node
    (equivalent to aaa)
  • text() -- all the text children of the context
    node
  • node() -- all the children of the context node
    (includes text nodes; attribute nodes are not
    children and are reached with @)
  • .. -- parent of the context node
  • .// -- the context node and all its descendants
  • // -- the root node and all its descendants
  • //text() -- all the text nodes in the document
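
Several of these abbreviated paths can be tried with Python's `xml.etree.ElementTree`, which implements a limited subset of XPath (element steps, `*`, `.`, `..` and `//`, but not `text()` or `node()`). The small `<doc>` tree below is an assumption made up for the sketch; it is not the numbered tree the slides refer to.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<doc>
  <aaa><bbb>x</bbb></aaa>
  <ccc><bbb>y</bbb></ccc>
  <aaa/>
</doc>
""")

# The context node is <doc>; all paths are relative to it.
aaa_children = doc.findall("aaa")      # child::aaa
aaa_bbb = doc.findall("aaa/bbb")       # bbb grandchildren under aaa children
any_bbb = doc.findall("*/bbb")         # bbb grandchildren under any child
all_bbb = doc.findall(".//bbb")        # bbb descendants at any depth
parents = doc.findall(".//bbb/..")     # parents of bbb elements

print(len(aaa_children), [e.text for e in any_bbb], [p.tag for p in parents])
```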

30
Predicates
  • *[2] -- the second child element of the context node
  • chapter[5] -- the fifth chapter child of the
    context node
  • *[last()] -- the last child element of the context
    node
  • chapter[title="introduction"] -- the chapter
    children of the context node that have one or
    more title children whose string-value is
    introduction (the string-value is the
    concatenation of all the descendant text
    nodes)
  • person[.//firstname = "joe"] -- the person
    children of the context node that have in their
    descendants a firstname element with string-value
    joe
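
A few of these predicates can be tried with `xml.etree.ElementTree`, whose XPath subset covers positions after a tag name, `last()`, and `[tag='text']`. It has no `[.//firstname = 'joe']` form, so that case is filtered in Python instead. The sample `<book>` document is invented for the sketch.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring("""
<book>
  <chapter><title>introduction</title></chapter>
  <chapter><title>models</title></chapter>
  <chapter><title>queries</title></chapter>
  <person><name><firstname>joe</firstname></name></person>
  <person><name><firstname>ann</firstname></name></person>
</book>
""")

second_chapter = book.find("chapter[2]")        # second chapter child
last_chapter = book.find("chapter[last()]")     # last chapter child
intro = book.findall("chapter[title='introduction']")

# Stand-in for person[.//firstname = "joe"]: test the descendants by hand.
joes = [p for p in book.findall("person")
        if any(f.text == "joe" for f in p.iter("firstname"))]

print(second_chapter.findtext("title"), last_chapter.findtext("title"), len(joes))
```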

31
Axis navigation
  • So far, nearly all our expressions have moved us
    down by moving to child nodes. Exceptions were
  • . -- stay where you are
  • / -- go to the root
  • // -- all descendants of the root
  • .// -- all descendants of the context node
  • XPath has several axes: ancestor,
    ancestor-or-self, attribute, child, descendant,
    descendant-or-self, following, following-sibling,
    namespace, parent, preceding, preceding-sibling,
    self
  • Some of these (self, parent) describe single
    nodes, others describe sequences of nodes.

32
XPath Navigation Axes
[Figure: axes drawn around the context node (self): ancestor, preceding-sibling, following-sibling, child, attribute, namespace, descendant, preceding, following]
33
XPath abbreviated syntax
  (nothing)   child::
  @           attribute::
  //          /descendant-or-self::node()/
  .           self::node()
  .//         descendant-or-self::node()/
  ..          parent::node()
  /           (document root)
34
XPath
  • Reasonably widely adopted -- in XML-Schema and
    query languages.
  • Neither more expressive nor less expressive than
    regular path expressions

35
Query Languages - XQuery
36
Summary of XQuery
  • FLWR expressions
  • FOR and LET expressions
  • Collections and sorting
  • Resources
  • XQuery: A Query Language for XML, Chamberlin,
    Florescu, et al.
  • W3C recommendation: www.w3.org/TR/xquery/

37
XQuery
  • Based on Quilt (which is based on XML-QL)
  • http://www.w3.org/TR/xquery/ (2/2001)
  • XML Query data model (ordered)

38
FLWR (Flower) Expressions
  • FOR ... LET... FOR... LET...
  • WHERE...
  • RETURN...

39
XQuery
  • Find all book titles published after 1995

FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title

Result: <title> abc </title>
        <title> def </title>
        <title> ghi </title>
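
For readers more at home in a general-purpose language, the FOR / WHERE / RETURN shape of this query lines up with a list comprehension. The sketch below uses Python's `xml.etree.ElementTree` on an inline document instead of `bib.xml`; the three sample books are made up.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("""
<bib>
  <book><title>abc</title><year>1999</year></book>
  <book><title>old</title><year>1990</year></book>
  <book><title>def</title><year>2001</year></book>
</bib>
""")

titles = [
    x.find("title")                      # RETURN $x/title
    for x in bib.findall("book")         # FOR $x IN .../bib/book
    if int(x.findtext("year")) > 1995    # WHERE $x/year > 1995
]
result = [ET.tostring(t, encoding="unicode").strip() for t in titles]
print(result)
```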
40
XQuery
  • For each author of a book by Morgan Kaufmann,
    list all books she published

FOR $a IN distinct(document("bib.xml")
              /bib/book[publisher = "Morgan Kaufmann"]/author)
RETURN <result>
         $a,
         FOR $t IN /bib/book[author = $a]/title
         RETURN $t
       </result>

distinct = a function that eliminates duplicates
41
XQuery
  • Result
  • <result>
  • <author> Jones </author>
  • <title> abc </title>
  • <title> def </title>
  • </result>
  • <result>
  • <author> Smith </author>
  • <title> ghi </title>
  • </result>

42
XQuery
  • FOR $x IN expr -- binds $x to each element in
    the list expr
  • LET $x := expr -- binds $x to the entire list
    expr
  • Useful for common subexpressions and for
    aggregations

43
XQuery
<big_publishers>
  FOR $p IN distinct(document("bib.xml")//publisher)
  LET $b := document("bib.xml")/book[publisher = $p]
  WHERE count($b) > 100
  RETURN $p
</big_publishers>

count = an (aggregate) function that returns the
number of elements
44
XQuery
  • Find books whose price is larger than average

LET $a := avg(document("bib.xml")/bib/book/@price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/@price > $a
RETURN $b
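
The same two-phase pattern (one binding for an aggregate over the whole collection, then an iteration that compares each item against it) can be sketched in Python. The prices and titles are invented sample data.

```python
import xml.etree.ElementTree as ET
from statistics import mean

bib = ET.fromstring("""
<bib>
  <book price="30"><title>abc</title></book>
  <book price="50"><title>def</title></book>
  <book price="100"><title>ghi</title></book>
</bib>
""")

books = bib.findall("book")
# LET: one binding to the whole collection, then aggregate over it.
a = mean(float(b.get("price")) for b in books)
# FOR + WHERE: iterate book by book and keep those above the average.
expensive = [b.findtext("title") for b in books if float(b.get("price")) > a]
print(a, expensive)
```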
45
XQuery
  • Summary
  • FOR-LET-WHERE-RETURN = FLWR

FOR/LET clauses
   -> list of tuples
WHERE clause
   -> list of tuples
RETURN clause
   -> instance of XQuery data model
46
FOR vs. LET
  • FOR
  • Binds node variables → iteration
  • LET
  • Binds collection variables → one value

47
FOR vs. LET
FOR $x IN document("bib.xml")/bib/book
RETURN <result> $x </result>

Returns: <result> <book>...</book> </result>
         <result> <book>...</book> </result>
         <result> <book>...</book> </result>
         ...

LET $x := document("bib.xml")/bib/book
RETURN <result> $x </result>

Returns: <result> <book>...</book>
                  <book>...</book>
                  <book>...</book>
                  ...
         </result>
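
The contrast can be mimicked in Python: iterating produces one `<result>` per book, while binding the whole list produces a single `<result>` holding all of them. The helper `wrap` and the three-book document are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring("<bib><book>a</book><book>b</book><book>c</book></bib>")
books = bib.findall("book")

def wrap(items):
    """Build one <result> element containing copies of the given elements."""
    result = ET.Element("result")
    for item in items:
        result.append(ET.fromstring(ET.tostring(item)))
    return result

# FOR: $x is bound to each book in turn, so RETURN runs once per book.
for_results = [wrap([x]) for x in books]

# LET: $x is bound to the whole collection, so RETURN runs once.
let_result = wrap(books)

print(len(for_results), len(let_result))
```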
48
Collections in XQuery
  • Ordered and unordered collections
  • /bib/book/author = an ordered collection
  • distinct(/bib/book/author) = an unordered
    collection
  • LET $a := /bib/book → $a is a collection
  • $b/author → a collection (several authors...)

RETURN <result> $b/author </result>

Returns: <result> <author>...</author>
                  <author>...</author>
                  <author>...</author>
                  ...
         </result>
49
Sorting in XQuery
<publisher_list>
  FOR $p IN distinct(document("bib.xml")//publisher)
  RETURN <publisher>
           <name> $p/text() </name> ,
           FOR $b IN document("bib.xml")//book[publisher = $p]
           RETURN <book>
                    $b/title ,
                    $b/@price
                  </book> SORTBY(price DESCENDING)
         </publisher> SORTBY(name)
</publisher_list>
50
Sorting in XQuery
  • Sorting arguments refer to name space of RETURN
    clause, not FOR clause
  • To sort on an element you don't want to display,
    first return it, then remove it with an
    additional query.
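
The "return it, sort, then strip it" workaround can be sketched with plain Python records standing in for the returned elements. The field names and values are invented; the point is only the two passes.

```python
books = [
    {"title": "ghi", "year": 2001, "price": 40},
    {"title": "abc", "year": 1995, "price": 55},
    {"title": "def", "year": 1999, "price": 30},
]

# First pass: "return" the sort field (year) alongside what we want to show,
# because sorting can only see what the RETURN clause produced.
returned = [{"title": b["title"], "year": b["year"]} for b in books]
returned.sort(key=lambda r: r["year"])

# Second pass: a follow-up query drops the field we did not want to display.
displayed = [r["title"] for r in returned]
print(displayed)
```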

51
If-Then-Else
FOR $h IN //holding
RETURN <holding>
         $h/title,
         IF $h/@type = "Journal"
         THEN $h/editor
         ELSE $h/author
       </holding> SORTBY (title)
52
Existential Quantifiers
FOR $b IN //book
WHERE SOME $p IN $b//para SATISFIES
      contains($p, "sailing") AND contains($p, "windsurfing")
RETURN $b/title
53
Universal Quantifiers
FOR $b IN //book
WHERE EVERY $p IN $b//para SATISFIES
      contains($p, "sailing")
RETURN $b/title
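
SOME and EVERY correspond closely to Python's `any` and `all`. The sketch below runs both quantified queries over a small made-up library, with `paras` as a hypothetical helper. Note that a book with no paragraphs at all would satisfy the EVERY condition vacuously.

```python
import xml.etree.ElementTree as ET

lib = ET.fromstring("""
<library>
  <book><title>A</title><para>sailing and windsurfing</para><para>cooking</para></book>
  <book><title>B</title><para>sailing</para><para>more sailing</para></book>
  <book><title>C</title><para>windsurfing</para></book>
</library>
""")

def paras(book):
    """Text of every para descendant of a book."""
    return [p.text or "" for p in book.iter("para")]

# SOME $p ... SATISFIES: both words occur in the same paragraph.
some_titles = [b.findtext("title") for b in lib.iter("book")
               if any("sailing" in p and "windsurfing" in p for p in paras(b))]

# EVERY $p ... SATISFIES: each paragraph mentions sailing.
every_titles = [b.findtext("title") for b in lib.iter("book")
                if all("sailing" in p for p in paras(b))]

print(some_titles, every_titles)
```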