Title: XML Language Family Detailed Examples
1.
XML Language Family: Detailed Examples
- Most information contained in these slides comes from http://www.w3.org and http://www.zvon.org/
- These slides are intended to be used as a tutorial on XML and related technologies
- Slide author: Jürgen Mangler (juergen.mangler@univie.ac.at)
- This section contains examples on
  - XPath
  - XPointer
2.
XPath is the result of an effort to provide a common syntax and semantics for functionality shared between XSL Transformations (XSLT) and XPointer. The primary purpose of XPath is to address parts of an XML document.
- XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values.
- XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax.
- XPath gets its name from its use of a path notation, as in URLs, for navigating through the hierarchical structure of an XML document.
3.
- In addition to its use for addressing, XPath is also designed to feature a natural subset that can be used for matching (testing whether or not a node matches a pattern); this use of XPath is described in XSLT (next chapter).
- XPath models an XML document as a tree of nodes. There are different types of nodes, including element nodes, attribute nodes and text nodes.
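As a rough illustration of this tree model, the following Python sketch walks a small document with the standard library's xml.etree.ElementTree and lists its element, attribute and text nodes. The sample document and the helper name describe_nodes are assumptions made for this example. ElementTree stores attributes and text as properties of elements rather than as separate node objects, so this only approximates the XPath data model.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the slides.
SAMPLE = '<papers><paper author="mangler">XPath notes</paper></papers>'


def describe_nodes(elem):
    """Return (node type, name, value) tuples in document order.

    Simplification: text that follows a child element (its "tail") is ignored.
    """
    nodes = [("element", elem.tag, None)]
    for name, value in elem.attrib.items():
        nodes.append(("attribute", name, value))
    if elem.text and elem.text.strip():
        nodes.append(("text", None, elem.text.strip()))
    for child in elem:
        nodes.extend(describe_nodes(child))
    return nodes


nodes = describe_nodes(ET.fromstring(SAMPLE))
for node in nodes:
    print(node)
```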
4.
- The basic XPath syntax is similar to filesystem addressing. If the path starts with the slash /, then it represents an absolute path to the required element.

/AAA/CCC
Select all elements CCC which are children of the root element AAA

    <AAA>
      <BBB/>
      <CCC/>
      <BBB/>
      <BBB/>
      <DDD>
        <BBB/>
      </DDD>
      <CCC/>
    </AAA>

/AAA
Select the root element AAA

    <AAA>
      <BBB/>
      <CCC/>
      <BBB/>
      <BBB/>
      <DDD>
        <BBB/>
      </DDD>
      <CCC/>
    </AAA>
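The two absolute paths can be tried out with a short Python sketch. It uses the standard library's xml.etree.ElementTree, which only evaluates paths relative to an element, so the helper select_absolute (a hypothetical name introduced here) checks the first step against the root element and hands the remaining steps to findall. It is a minimal sketch for plain paths of element names, not a full XPath evaluator.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB/>
  <CCC/>
  <BBB/>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC/>
</AAA>
"""


def select_absolute(root, path):
    """Evaluate a simple absolute path such as /AAA/CCC (hypothetical helper)."""
    steps = path.strip("/").split("/")
    if steps[0] != root.tag:
        return []          # the first step must name the root element
    if len(steps) == 1:
        return [root]      # /AAA selects the root element itself
    # ElementTree paths are relative, so the remaining steps start at the root.
    return root.findall("./" + "/".join(steps[1:]))


root = ET.fromstring(DOC)
print([e.tag for e in select_absolute(root, "/AAA/CCC")])
print([e.tag for e in select_absolute(root, "/AAA")])
```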
5.
- If the path starts with //, then all elements in the document that fulfill the criteria following // are selected.

//DDD/BBB
Select all elements BBB which are children of DDD

    <AAA>
      <BBB/>
      <DDD>
        <BBB/>
      </DDD>
      <CCC>
        <DDD>
          <BBB/>
          <BBB/>
        </DDD>
      </CCC>
    </AAA>

//BBB
Select all elements BBB

    <AAA>
      <BBB/>
      <DDD>
        <BBB/>
      </DDD>
      <CCC>
        <DDD>
          <BBB/>
          <BBB/>
        </DDD>
      </CCC>
    </AAA>
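A minimal Python sketch of the // examples, assuming the standard library's xml.etree.ElementTree is an acceptable stand-in for a full XPath engine. ElementTree paths are relative to the element they are called on, so // is written as .// here; called on the root, it searches the root's descendants and would not match the root element itself.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB/>
  <DDD>
    <BBB/>
  </DDD>
  <CCC>
    <DDD>
      <BBB/>
      <BBB/>
    </DDD>
  </CCC>
</AAA>
"""

root = ET.fromstring(DOC)
all_bbb = root.findall(".//BBB")         # counterpart of //BBB
bbb_in_ddd = root.findall(".//DDD/BBB")  # counterpart of //DDD/BBB
print(len(all_bbb), len(bbb_in_ddd))
```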
6.
- The star * selects all elements located by the preceding path.

/*/*/*/BBB
Select all elements BBB which have 3 ancestors

    <AAA>
      <CCC>
        <DDD>
          <BBB/>
        </DDD>
      </CCC>
      <CCC>
        <DDD>
          <BBB/>
        </DDD>
      </CCC>
    </AAA>

/AAA/CCC/DDD/*
Select all elements enclosed by elements /AAA/CCC/DDD

    <AAA>
      <BBB/>
      <DDD>
        <BBB/>
      </DDD>
      <CCC>
        <DDD>
          <BBB/>
          <BBB/>
        </DDD>
      </CCC>
    </AAA>
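The wildcard can be sketched in Python with xml.etree.ElementTree, which also understands * as a step. Because its paths are relative to the root element, the first * of /*/*/*/BBB (the root itself) is dropped in the translation; that mapping is an assumption of this sketch rather than part of XPath.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <CCC>
    <DDD>
      <BBB/>
    </DDD>
  </CCC>
  <CCC>
    <DDD>
      <BBB/>
    </DDD>
  </CCC>
</AAA>
"""

root = ET.fromstring(DOC)
# /*/*/*/BBB: the root element plays the role of the first *.
three_ancestors = root.findall("./*/*/BBB")
# /AAA/CCC/DDD/*: every element directly inside a DDD that sits in a CCC.
inside_ddd = root.findall("./CCC/DDD/*")
print([e.tag for e in three_ancestors], [e.tag for e in inside_ddd])
```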
7.
- The expression in square brackets can further specify an element. A number in the brackets gives the position of the element in the selected set. The function last() selects the last element in the selection.

/papers/paper[last()]
Select the last paper child of element papers

    <papers>
      <paper author="motschnig"/>
      <paper author="derntl"/>
      <paper author="motschnig"/>
      <paper author="mangler"/>
    </papers>

/papers/paper[1]
Select the first paper child of element papers

    <papers>
      <paper author="motschnig"/>
      <paper author="derntl"/>
      <paper author="motschnig"/>
      <paper author="mangler"/>
    </papers>
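Position predicates can be checked with a few lines of Python. This sketch relies on xml.etree.ElementTree, whose limited path language supports tag[1] and tag[last()]; the paths are written relative to the root element papers.

```python
import xml.etree.ElementTree as ET

DOC = """
<papers>
  <paper author="motschnig"/>
  <paper author="derntl"/>
  <paper author="motschnig"/>
  <paper author="mangler"/>
</papers>
"""

root = ET.fromstring(DOC)
first = root.findall("./paper[1]")       # counterpart of /papers/paper[1]
last = root.findall("./paper[last()]")   # counterpart of /papers/paper[last()]
print(first[0].get("author"), last[0].get("author"))
```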
8.
- Attributes are specified by the @ prefix.

//student[@matnr]
Select student elements which have attribute matnr

    <students>
      <student matnr="9506264"/>
      <student matnr="0002843"/>
      <student name="Hauer"/>
      <student/>
    </students>

//@matnr
Select all attributes matnr

    <students>
      <student matnr="9506264"/>
      <student matnr="0002843"/>
      <student name="Hauer"/>
      <student/>
    </students>

//student[not(@*)]
Select student elements without an attribute

    <students>
      <student id="9506264"/>
      <student id="0002843"/>
      <student name="Koegler"/>
      <student/>
    </students>

//student[@*]
Select student elements which have any attribute

    <students>
      <student id="9506264"/>
      <student id="0002843"/>
      <student name="Hauer"/>
      <student/>
    </students>
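The attribute tests can be approximated in Python as follows. xml.etree.ElementTree supports the [@name] predicate, but not @* or not(), so those two cases are written as list comprehensions over the attribute dictionary; that substitution is an assumption of this sketch, not XPath itself.

```python
import xml.etree.ElementTree as ET

DOC = """
<students>
  <student matnr="9506264"/>
  <student matnr="0002843"/>
  <student name="Hauer"/>
  <student/>
</students>
"""

root = ET.fromstring(DOC)

# //student[@matnr]: students that carry a matnr attribute.
with_matnr = root.findall(".//student[@matnr]")
# //@matnr: the attribute values themselves (ElementTree has no attribute nodes).
matnr_values = [s.get("matnr") for s in with_matnr]
# //student[not(@*)]: students without any attribute.
no_attributes = [s for s in root.iter("student") if not s.attrib]
# //student[@*]: students with at least one attribute.
any_attribute = [s for s in root.iter("student") if s.attrib]

print(matnr_values, len(no_attributes), len(any_attribute))
```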
9.
- Values of attributes can be used as selection criteria. The function normalize-space removes leading and trailing spaces and replaces sequences of whitespace characters by a single space.

//student[normalize-space(@name)='hauer']
Select student elements which have an attribute name with value hauer; leading and trailing spaces are removed before comparison

    <students>
      <student matnr="9506264"/>
      <student name=" hauer "/>
      <student name="hauer"/>
    </students>

//student[@matnr='9506264']
Select student elements which have attribute matnr with value 9506264

    <students>
      <student matnr="9506264"/>
      <student name=" hauer "/>
      <student name="hauer"/>
    </students>
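A small Python sketch of selection by attribute value. Exact comparison is supported directly by xml.etree.ElementTree; normalize-space is not, so it is imitated here by the hypothetical helper normalize_space. Python's str.split() treats a few more characters as whitespace than XPath does, which makes this an approximation.

```python
import xml.etree.ElementTree as ET

DOC = """
<students>
  <student matnr="9506264"/>
  <student name=" hauer "/>
  <student name="hauer"/>
</students>
"""


def normalize_space(value):
    """Strip leading/trailing whitespace and collapse inner runs to one space."""
    return " ".join(value.split())


root = ET.fromstring(DOC)

# Exact comparison: only the attribute value without padding matches.
exact = root.findall(".//student[@name='hauer']")
# Comparison after normalization: the padded value matches as well.
normalized = [
    s for s in root.iter("student")
    if normalize_space(s.get("name", "")) == "hauer"
]
print(len(exact), len(normalized))
```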
10.
- The function count() counts the number of selected elements.

//*[count(*)=3]
Select elements which have 3 children

    <AAA>
      <CCC>
        <BBB/>
        <BBB/>
        <BBB/>
      </CCC>
      <DDD>
        <BBB/>
      </DDD>
      <EEE>
        <CCC/>
      </EEE>
    </AAA>

//*[count(BBB)=2]
Select elements which have two children BBB

    <AAA>
      <CCC>
        <BBB/>
      </CCC>
      <DDD>
        <BBB/>
        <BBB/>
      </DDD>
      <EEE>
        <CCC/>
        <DDD/>
      </EEE>
    </AAA>
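xml.etree.ElementTree has no count() function, so the following Python sketch imitates the two expressions by counting child elements directly. The sample document merges features of both slide examples and is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Sample adapted for this sketch: AAA and CCC have three children, DDD has two BBB.
DOC = """
<AAA>
  <CCC><BBB/><BBB/><BBB/></CCC>
  <DDD><BBB/><BBB/></DDD>
  <EEE><CCC/></EEE>
</AAA>
"""

root = ET.fromstring(DOC)

# //*[count(*)=3]: len(elem) is the number of child elements.
three_children = [e for e in root.iter() if len(e) == 3]
# //*[count(BBB)=2]: count only the children named BBB.
two_bbb = [e for e in root.iter() if len(e.findall("BBB")) == 2]

print([e.tag for e in three_children], [e.tag for e in two_bbb])
```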
11.
- Several paths can be combined with the | separator ("|" stands for "or", similar to the or operator in C).

/AAA/EEE | //DDD/CCC | /AAA | //BBB
The number of combinations is not restricted

    <AAA>
      <BBB/>
      <CCC/>
      <DDD>
        <CCC/>
      </DDD>
      <EEE/>
    </AAA>

/AAA/EEE | //BBB
Select all elements BBB and elements EEE which are children of root element AAA

    <AAA>
      <BBB/>
      <CCC/>
      <DDD>
        <CCC/>
      </DDD>
      <EEE/>
    </AAA>
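The | operator is not available in xml.etree.ElementTree, but its effect can be sketched by merging the results of several findall calls. The helper union is a hypothetical name; it removes duplicates and returns the elements in document order, which is one reasonable reading of how a combined node set behaves.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB/>
  <CCC/>
  <DDD>
    <CCC/>
  </DDD>
  <EEE/>
</AAA>
"""


def union(root, *paths):
    """Combine several relative paths; each element appears once, in document order."""
    selected = {id(elem) for path in paths for elem in root.findall(path)}
    return [elem for elem in root.iter() if id(elem) in selected]


root = ET.fromstring(DOC)
two_paths = union(root, "./EEE", ".//BBB")                   # /AAA/EEE | //BBB
three_paths = union(root, "./EEE", ".//DDD/CCC", ".//BBB")   # one more alternative
print([e.tag for e in two_paths], [e.tag for e in three_paths])
```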
12.
Axes are a sophisticated concept in XPath for finding out which nodes relate to each other and how.

    <parent>
      <preceding-sibling/>
      <preceding-sibling/>
      <node>
        <descendant/>
        <descendant/>
      </node>
      <following-sibling/>
      <following-sibling/>
    </parent>

The above example illustrates how axes work. Starting from the element node, each axis selects the elements named after it. This example is also the basis for the next two pages.
13.
- The following main axes are available:
  - the child axis contains the children of the context node
  - the descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes
  - the parent axis contains the parent of the context node, if there is one
  - the following-sibling axis contains all the following siblings of the context node; if the context node is an attribute node or namespace node, the following-sibling axis is empty
  - the preceding-sibling axis contains all the preceding siblings of the context node; if the context node is an attribute node or namespace node, the preceding-sibling axis is empty
- (http://www.w3.org/TR/xpath#axes)
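The five axes listed above can be sketched in Python on the parent/node document from the previous slide. xml.etree.ElementTree keeps no parent links, so the sketch builds a child-to-parent map first; the function names are hypothetical, and the results are returned in document order rather than in each axis's own direction.

```python
import xml.etree.ElementTree as ET

DOC = """
<parent>
  <preceding-sibling/>
  <preceding-sibling/>
  <node>
    <descendant/>
    <descendant/>
  </node>
  <following-sibling/>
  <following-sibling/>
</parent>
"""

root = ET.fromstring(DOC)
# Map every element to its parent; the root element has no entry.
parent_of = {c: p for p in root.iter() for c in p}


def child(elem):
    return list(elem)


def descendant(elem):
    return [d for d in elem.iter() if d is not elem]


def parent(elem):
    found = parent_of.get(elem)
    return [] if found is None else [found]


def following_sibling(elem):
    siblings = [s for p in parent(elem) for s in p]
    return siblings[siblings.index(elem) + 1:] if siblings else []


def preceding_sibling(elem):
    siblings = [s for p in parent(elem) for s in p]
    return siblings[:siblings.index(elem)] if siblings else []


context = root.find("node")
for axis in (child, descendant, parent, following_sibling, preceding_sibling):
    print(axis.__name__, [e.tag for e in axis(context)])
```

Run on this document, each axis returns exactly the elements that carry its name, which is what the diagram on the previous slide is meant to show.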
14.
- The child axis contains the children of the context node. The child axis is the default axis and it can be omitted.
- The descendant axis contains the descendants of the context node; a descendant is a child or a child of a child and so on; thus the descendant axis never contains attribute or namespace nodes.

//CCC/descendant::DDD
Select elements DDD which have CCC among their ancestors

    <CCC>
      <DDD>
        <EEE>
          <DDD/>
        </EEE>
      </DDD>
    </CCC>

/AAA
Equivalent of /child::AAA

    <AAA>
      <BBB/>
      <CCC/>
    </AAA>
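A brief Python sketch of the child and descendant axes, assuming xml.etree.ElementTree as the evaluator. It does not accept the axis:: notation, so descendant::DDD is written as .//DDD relative to each CCC element, and the omitted child axis corresponds to a bare tag name. The nested sample document is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Assumed sample: a DDD directly inside CCC and another one two levels deeper.
DOC = "<CCC><DDD><EEE><DDD/></EEE></DDD></CCC>"

root = ET.fromstring(DOC)

# //CCC/descendant::DDD: every DDD below a CCC, at any depth.
descendants = [d for c in root.iter("CCC") for d in c.findall(".//DDD")]
# Child axis (the default): "DDD" and "./DDD" both mean child::DDD.
children = root.findall("DDD")

print(len(descendants), len(children))
```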
15.
- XPointer is intended to be the basis of fragment identifiers only for the text/xml and application/xml media types (they can point only to documents of these types).
- Pointing to fragments of remote documents is analogous to the use of anchors in HTML. Roughly: document#xpointer(...)

    <link xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:type="simple"
          xlink:href="mydocument.xml#xpointer(//AAA/BBB[1])">
    </link>
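To show how such a reference splits into a document part and a pointer part, here is a minimal Python sketch built on urllib.parse.urldefrag. The helper name split_xpointer is hypothetical, and the sketch only recognizes a fragment that consists of a single xpointer(...) part.

```python
from urllib.parse import urldefrag


def split_xpointer(href):
    """Return (document, expression) for an href of the form doc#xpointer(expr)."""
    document, fragment = urldefrag(href)
    prefix = "xpointer("
    if fragment.startswith(prefix) and fragment.endswith(")"):
        return document, fragment[len(prefix):-1]
    return document, None  # no xpointer(...) fragment present


print(split_xpointer("mydocument.xml#xpointer(//AAA/BBB[1])"))
```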
16.
- If there are forbidden characters in your expression, you must deal with them somehow.
- When XPointer appears in an XML document, special characters must be escaped according to directions in XML.
  - The characters < or & must be escaped using &lt; and &amp;.
- Any unbalanced parenthesis must be escaped using a circumflex (^).

    <link xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:type="simple"
          xlink:href="test.xml#xpointer(//AAA[position() &lt; 2])">
    </link>

or, respectively:

          xlink:href="test.xml#xpointer(string-range('^(text in'))"
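Both escaping steps can be sketched in Python. XML escaping uses xml.sax.saxutils.escape from the standard library (it also escapes >, which is harmless here). The helper escape_parentheses is a hypothetical simplification: it escapes every parenthesis in a string literal, not only the unbalanced ones, and doubles the circumflex itself.

```python
from xml.sax.saxutils import escape


def escape_parentheses(literal):
    """Circumflex-escape ^, ( and ) inside a string literal (simplified)."""
    return literal.replace("^", "^^").replace("(", "^(").replace(")", "^)")


# Step 1: an expression containing "<" is XML-escaped for use in an attribute.
xpath = "//AAA[position() < 2]"
attribute_value = escape("test.xml#xpointer(" + xpath + ")")
print(attribute_value)

# Step 2: a literal with an unbalanced "(" gets a circumflex in front of it.
print(escape_parentheses("(text in"))
```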
17.
- If your elements have an ID-type attribute, you can address them directly using the value of the ID-type attribute. (Don't forget you must have an attribute defined as an ID type in your DTD!)
- Using ID-type attributes, you can easily include or jump to parts of documents.
- The example below selects the node with id("b1").

xpointer(id("b1"))

    <book>
      <book id="b1" name="XML">Bad book.</book>
      <book id="b2" name="JAVA">
        Good book.
        <additional>Makes me sleep like a baby.</additional>
      </book>
      <book id="123" name="42">All answers on only one page.</book>
    </book>
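A rough Python counterpart of id("b1"), with one important assumption: xml.etree.ElementTree does not read DTD attribute types, so the hypothetical helper by_id simply treats an attribute that is literally named id as the identifier.

```python
import xml.etree.ElementTree as ET

DOC = """
<book>
  <book id="b1" name="XML">Bad book.</book>
  <book id="b2" name="JAVA">Good book.<additional>Makes me sleep like a baby.</additional></book>
  <book id="123" name="42">All answers on only one page.</book>
</book>
"""


def by_id(root, value):
    """Return the first element whose 'id' attribute equals value, else None."""
    for elem in root.iter():
        if elem.get("id") == value:
            return elem
    return None


root = ET.fromstring(DOC)
match = by_id(root, "b1")
print(match.get("name"), match.text)
```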
18.
- The specification defines one full form and one shorthand form (which is an abbreviation of the full one).
  - Short form: /1/2/3
  - Full form: xpointer(/*[1]/*[2]/*[3])

    <AAA>
      <BBB myid="b1" bbb="111">Text in the first element BBB.</BBB>
      <BBB myid="b2" bbb="222">
        Text in another element BBB.
        <DDD ddd="999">Text in more nested element.</DDD>
        <DDD ddd="888">Text in more nested element.</DDD>
        <DDD ddd="777">Text in more nested element.</DDD>
      </BBB>
      <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
    </AAA>
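The short form can be resolved by hand, as the following Python sketch shows: each number picks the n-th child element of the element found so far, and the first number must be 1 because a document has exactly one root element. The helper name resolve_child_sequence is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB myid="b1" bbb="111">Text in the first element BBB.</BBB>
  <BBB myid="b2" bbb="222">
    Text in another element BBB.
    <DDD ddd="999">Text in more nested element.</DDD>
    <DDD ddd="888">Text in more nested element.</DDD>
    <DDD ddd="777">Text in more nested element.</DDD>
  </BBB>
  <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
</AAA>
"""


def resolve_child_sequence(root, sequence):
    """Resolve a child sequence such as /1/2/3; return None if it does not exist."""
    steps = [int(step) for step in sequence.strip("/").split("/")]
    if steps[0] != 1:
        return None  # there is only one root element
    node = root
    for position in steps[1:]:
        children = list(node)
        if not 1 <= position <= len(children):
            return None
        node = children[position - 1]  # positions are 1-based
    return node


root = ET.fromstring(DOC)
target = resolve_child_sequence(root, "/1/2/3")
print(target.tag, target.attrib)
```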
19.
- A location of type point is defined by a node, called the container node (the node that contains the point), and a non-negative integer, called the index.
- (//AAA and //AAA/BBB are the container nodes; [1] and [2] are used if more than one container node of the same name exists. The ? in the examples marks the resulting point.)

xpointer(start-point(//AAA))
xpointer(start-point(range(//AAA/BBB[1])))

    <AAA>?
      <BBB bbb="111"></BBB>
      <BBB bbb="222">
        <DDD ddd="999"></DDD>
      </BBB>
      <CCC ccc="123" xxx="321"/>
    </AAA>

xpointer(end-point(range(//AAA/BBB[2])))
xpointer(start-point(range(//AAA/CCC)))

    <AAA>
      <BBB bbb="111"></BBB>
      <BBB bbb="222">
        <DDD ddd="999"></DDD>
      </BBB>?
      <CCC ccc="123" xxx="321"/>
    </AAA>
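The container/index pairs behind these examples can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions: a point is a plain (container, index) tuple, the function names are hypothetical, and the document is written without whitespace between tags so that child indexes are not shifted by whitespace text nodes.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<AAA><BBB bbb="111"></BBB><BBB bbb="222"><DDD ddd="999"></DDD></BBB>'
    '<CCC ccc="123" xxx="321"/></AAA>'
)

root = ET.fromstring(DOC)
parent_of = {c: p for p in root.iter() for c in p}


def start_point(elem):
    """Start point of an element location: the element itself, index 0."""
    return (elem, 0)


def covering_range(elem):
    """Range around an element: both points live in its parent."""
    parent = parent_of[elem]
    index = list(parent).index(elem)
    return ((parent, index), (parent, index + 1))


bbb1, bbb2 = root.findall("BBB")
ccc = root.find("CCC")

first_start = covering_range(bbb1)[0]   # start-point(range(//AAA/BBB[1]))
second_end = covering_range(bbb2)[1]    # end-point(range(//AAA/BBB[2]))
ccc_start = covering_range(ccc)[0]      # start-point(range(//AAA/CCC))
print(first_start[1], second_end[1], ccc_start[1])
```

In this model start-point(//AAA) and start-point(range(//AAA/BBB[1])) coincide, as do the two expressions of the second example, which matches the single ? marker shown for each pair.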
20.
- When the container node of a point is of a node type that cannot have child nodes (such as text nodes, comments, and processing instructions), then the index is an index into the characters of the string-value of the node; such a point is called a character-point.
- You can use this to write a link that behaves like a search function. It always jumps to the first appearance of a string, e.g. the word "another". (The ? in the example marks the resulting point.)

xpointer(start-point(string-range(//*,'another', 2, 0)))

    <AAA>
      <BBB bbb="111">Text in the first element BBB.</BBB>
      <BBB bbb="222">
        Text in a?nother element BBB.
        <DDD ddd="999">Text in more nested element.</DDD>
      </BBB>
      <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
    </AAA>
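The search-like behaviour can be imitated in Python. The sketch below is a simplification with a hypothetical helper name: it looks only at the text directly inside each element (not the full string-value including descendants), stops at the first match, and returns the element together with a character offset.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB bbb="111">Text in the first element BBB.</BBB>
  <BBB bbb="222">
    Text in another element BBB.
    <DDD ddd="999">Text in more nested element.</DDD>
  </BBB>
  <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
</AAA>
"""


def string_range_start(root, needle, position=1):
    """Return (element, offset) for the first match of needle.

    position is 1-based, as in string-range: 2 means the point lies
    after the first character of the match.
    """
    for elem in root.iter():
        text = elem.text or ""
        found = text.find(needle)
        if found != -1:
            return elem, found + position - 1
    return None


root = ET.fromstring(DOC)
elem, offset = string_range_start(root, "another", position=2)
print(elem.get("bbb"), repr(elem.text[offset - 1:offset + 6]))
```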
21.
- The range function returns ranges covering the locations in the argument location-set. For each location x in the argument location-set, a range location representing the covering range of x is added to the result location-set.
- The range-inside function returns ranges covering the contents of the locations in the argument location-set.

xpointer(range(//AAA/BBB[2]))
Covers the whole second BBB element, including its start and end tags

    <AAA>
      <BBB bbb="111"/>
      <BBB bbb="222">
        Text in another element BBB.
      </BBB>
      <CCC ccc="123" xxx="321"/>
    </AAA>

xpointer(range-inside(//AAA/BBB[2]))
Covers only the contents of the second BBB element

    <AAA>
      <BBB bbb="111"/>
      <BBB bbb="222">
        Text in another element BBB.
      </BBB>
      <CCC ccc="123" xxx="321"/>
    </AAA>
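The difference between range and range-inside can be made visible by serializing what each one covers. This Python sketch is an illustration under assumptions, not an XPointer implementation: the two helper names are hypothetical, and the covered region is shown as markup text produced by xml.etree.ElementTree.

```python
import copy
import xml.etree.ElementTree as ET

DOC = (
    '<AAA><BBB bbb="111"/>'
    '<BBB bbb="222"> Text in another element BBB. </BBB>'
    '<CCC ccc="123" xxx="321"/></AAA>'
)


def covered_by_range(elem):
    """Markup covered by range(): the element including its own tags."""
    clone = copy.copy(elem)
    clone.tail = None  # text after the end tag is not part of the element
    return ET.tostring(clone, encoding="unicode")


def covered_by_range_inside(elem):
    """Markup covered by range-inside(): only the element's contents."""
    children = "".join(ET.tostring(c, encoding="unicode") for c in elem)
    return (elem.text or "") + children


root = ET.fromstring(DOC)
second_bbb = root.findall("BBB")[1]
print(covered_by_range(second_bbb))
print(covered_by_range_inside(second_bbb))
```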
22.
- For each location x in the argument location-set, end-point adds a location of type point to the result location-set. That point represents the end point of location x. (The ? in the example marks the resulting point.)

xpointer(end-point(string-range(//AAA/BBB,'another')))

    <AAA>
      <BBB bbb="111">Text in the first element BBB.</BBB>
      <BBB bbb="222">
        Text in another? element BBB.
        <DDD ddd="999">Text in more nested element.</DDD>
      </BBB>
      <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
    </AAA>
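To make the "one point per location" rule concrete, the following Python sketch models a string range as a pair of (element, offset) points and lets end_point pick the second one. All names are hypothetical, and the matching is simplified: only the text directly inside each selected element is searched, and only the first match per element is kept.

```python
import xml.etree.ElementTree as ET

DOC = """
<AAA>
  <BBB bbb="111">Text in the first element BBB.</BBB>
  <BBB bbb="222">
    Text in another element BBB.
    <DDD ddd="999">Text in more nested element.</DDD>
  </BBB>
  <CCC ccc="123" xxx="321">Again some text in some element.</CCC>
</AAA>
"""


def string_range(root, path, needle):
    """Return one (start point, end point) pair per element that contains needle."""
    ranges = []
    for elem in root.findall(path):
        text = elem.text or ""
        start = text.find(needle)
        if start != -1:
            ranges.append(((elem, start), (elem, start + len(needle))))
    return ranges


def end_point(location_range):
    """The end point of a range is its second point."""
    return location_range[1]


root = ET.fromstring(DOC)
# Counterpart of end-point(string-range(//AAA/BBB, 'another')).
points = [end_point(r) for r in string_range(root, "./BBB", "another")]
for elem, offset in points:
    print(elem.get("bbb"), repr(elem.text[:offset].strip()))
```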