Title: Jon Allen
1Jon Allen instructional media magic, inc. As
presented at the JA-SIG Winter Conference Bal
Harbor, Florida, December 6, 2003
2The Abstract
- Looking for a methodology to quickly and
effectively create transformations? Interested in
the basics of XSLT and XPath, and a good way to
get started? If so, this workshop is for you! We
will be discussing the fundamental concepts of
XSLT and XPath, and the methodologies that have
emerged from months of developing stylesheet
transformations for the uPortal 2.0 project.
- We will discuss the design aspects related to
converting structured information in XML into
device-dependent markup languages such as HTML
and WML, and the guidelines and best practices
evolving from this experience. No prior XSLT
experience is necessary.
3Overview
- Introduction
- uPortal
- Basic XPath
- Basic XSLT
- Markup XHTML
- Cascading Style Sheets
- Tools
- The Creation Process
- Hands-on
4Introduction
5Background
- Who: W3C
- What: XPath and XSLT specifications
- When: 11/16/1999
- Why: A need arose for a specification to
define the syntax and semantics for
transforming XML documents.
6References
- The definitive reference
- Michael Kay
- Wrox Press Inc
- ISBN 1861005067
7References
- Great practical reference
- Jeni Tennison
- Hungry Minds
- ISBN 0764547763
8References
- Practical use of transformations in Java code
- Eric Burke
- O'Reilly Assoc.
- ISBN 0596001436
9JA-SIG's uPortal
10Use of XSLT in uPortal
- XSLT is used to render the "Structure", "Theme",
and "Channels".
- Structure is an XML-to-XML transformation of the
Abstract Layout to a Structured Layout (such as Tab
and Column).
- Theme is an XML-to-markup-language transformation
(XHTML, WAP, etc.) which renders the container for
headers, navigation, channels, footers, etc.
- Channels have XML-to-markup-language
transformations that render content.
11Structure XSLT
[Diagram: the User Layout XML and the Structure XSLT are fed to the XSLT processor, which outputs XML for a Tab/Col/Row, Tab/Column, or Tree/Column structure.]
12Conceptual Structure
The Structure is XML at this point; the
illustrations are representational.
13Theme XSLT
[Diagram: the Structure XML and the Theme XSLT are fed to the XSLT processor, which outputs HTML 4.0 for a browser, HTML 3.2 for a PDA, or WML for a mobile phone.]
14Theme
15Channel XSLT
[Diagram: channel XML and stylesheets produce the final output, which is sent as an output stream to the device.]
16Final Output
17Basic Architecture
[Diagram: the framework combines several Channel XSLTs with the Structure XSLT, the Theme XSLT, and the skins (CSS).]
18Channel
- Elementary unit of presentation, defined by the
IChannel interface
[Diagram: an IChannel takes user interaction and external information and produces channel content (presentation).]
19IChannel content must
- Be well-formed XML such as XHTML, RSS, SVG, SMIL,
or a SOAP message (HTML is not well-formed XML)
- Be rendered by an XSL transformation using an XSL
stylesheet
20Framework Organization
[Diagram: the uPortal framework sits between user interaction and presentation and hosts multiple channels.]
21User Layout
- User Layout is an abstract structure defining the
overall content available to the user.
- userLayout is a tree structure consisting of
folders and channels, the latter always being
the leaf nodes.
22User Layout
23Structure Transformation
24Theme Transformation
[Diagram: the User Layout shown as three tabs (Jim Smith, Financial Aid, Library), with columns containing channels such as Dictionary.com, Bookmarks, and Cartoon.]
25Compiling the Presentation
[Diagram: the userLayout passes through the Structure transformation (XSLT) to become the structuredLayout; the channels are driven through setRuntimeData() and renderXML(); the Theme transformation (XSLT) produces HTML, WML, VoiceML...]
26Content Transformation
[Diagram: XML and a stylesheet are fed to the XSLT processor, which outputs XHTML for a web browser, HTML for a PDA, or WML for a cell phone.]
27Multiple Target Devices
28Skins Cascading Stylesheets
29Skin imm
30Skin VSAC
31Skin matrix
32User Preferences
- Swappable layout and preference management
modules
- Profile management module
- Tab-column specific preferences module
- Skin selection
33User Preferences
34Publish/Subscribe
- Channel publishing document
- Channel parameters
- Default values
- Modification permissions
- Descriptions
- Publish/Subscribe steps
- Step sequence
- Instructions, help
- A complex channel with multiple XSL views
35uPortal Base Channel Types
36Applet Channel
37Image Channel
38Inline Frame
39Remote Channel
40RSS Channel
41Web Proxy Channel
42XSLT Channel
43Channel Controls
44Channel Review
45XHTML
46Why XHTML?
- XSLT stylesheets are XML documents.
- If you plan to output HTML, it will reside in the
template bodies of the XSLT stylesheet, and the
markup will be output as literal output.
- When the XSLT stylesheet contains markup, it has
to be well-formed.
47Differences with HTML 4
- Because XHTML is an XML application,
certain practices that were perfectly legal in
HTML 4 must be changed.
- Documents must be well-formed
- Well-formedness is a new concept introduced by
XML. Essentially this means that all elements
must either have closing tags or be written in a
special form (as described below), and that all
the elements must nest.
- CORRECT: nested elements
- <p>here is an emphasized <em>paragraph</em>.</p>
- INCORRECT: overlapping elements
- <p>here is an emphasized <em>paragraph.</p></em>
48Differences with HTML 4
- Element and attribute names must be in lower case
- XHTML documents must use lower case for all HTML
element and attribute names. This difference is
necessary because XML is case-sensitive, e.g. <li>
and <LI> are different tags.
49Differences with HTML 4
- In HTML 4 certain elements were permitted to omit
the end tag, with the elements that followed
implying closure. This omission is not permitted
in XML-based XHTML. All elements must have an end
tag.
- CORRECT: terminated elements
- <p>here is a paragraph.</p><p>here is another
paragraph.</p>
- INCORRECT: unterminated elements
- <p>here is a paragraph.<p>here is another
paragraph.
50Differences with HTML 4
- All attribute values must be quoted, even those
which appear to be numeric.
- CORRECT: quoted attribute values
- <table rows="3">
- INCORRECT: unquoted attribute values
- <table rows=3>
51Differences with HTML 4
- XML does not support attribute minimization.
Attribute-value pairs must be written in full.
Attribute names such as nowrap and checked cannot
occur in elements without their value being
specified.
- CORRECT: unminimized attributes
- <td nowrap="nowrap">
- INCORRECT: minimized attributes
- <td nowrap>
52Differences with HTML 4
- Empty elements must either have an end tag or the
start tag must end with />. For instance, <br/>
or <hr></hr>.
- CORRECT: terminated empty tags
- <br/><hr/>
- INCORRECT: unterminated empty tags
- <br><hr>
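The well-formedness rules above can be checked mechanically. The following is a minimal sketch, assuming the XML parser bundled with the JDK; the class name WellFormedCheck and the wrapping div element are illustrative choices, not part of uPortal. It only tests well-formedness, not validity against an XHTML DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a markup fragment is well-formed XML.
class WellFormedCheck {

    /** Wraps the fragment in a single root element and tries to parse it. */
    static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(
                    new StringReader("<div>" + fragment + "</div>")));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the markup
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<p>here is an emphasized <em>paragraph</em>.</p>",
            "<p>here is an emphasized <em>paragraph.</p></em>",
            "<td nowrap=\"nowrap\"></td>",
            "<td nowrap></td>",
            "<br/><hr/>",
            "<br><hr>"
        };
        for (String sample : samples) {
            System.out.println(isWellFormed(sample) + "  " + sample);
        }
    }
}
```

Markup that fails this check would also be rejected when pasted into a template body, because the stylesheet itself must parse as XML.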
53Cascading Style Sheets
54CSS uPortal and Skins
- uPortal uses Cascading Style Sheets to skin the
portal and content.
- These CSS files are optimized for a particular
structure.
- For consistency, channel developers should become
familiar with the CSS classes that are defined
for a particular structure and apply them to the
markup language in their XSLT transformations.
55The CSS Classes
- There is a sample developer channel in uPortal
that describes and gives examples of the CSS
classes for the Tab/Column Theme.
56The CSS Channel
57Basic XPath
58Nodes and Node Trees
- When an application wants to operate on an XML
document, it builds an internal model of what the
document looks like.
- This model is known as a document object model or
DOM.
- In XPath and XSLT, it's called a node tree.
59Types of Nodes
- Root nodes
- The top of the node tree
- Element nodes
- XML elements
- Attribute nodes
- XML attributes
- Text nodes
- Textual content in XML elements
- Comment nodes
- XML comments
- Processing instruction nodes
- XML processing instructions
- Namespace nodes
- The in-scope namespaces on an element
60Node Tree Example
- <document>
- <para type="note"> content </para>
- <para type="warning"> content </para>
- <para type="warning"> content </para>
- </document>
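To make the node tree concrete, here is a minimal sketch that parses a small document with the JDK's DOM parser and prints one line per node. The class name NodeTreeOutline and the output wording are assumptions for illustration. The DOM is close to, but not identical with, the XPath node tree (for example, the DOM has no namespace nodes).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: prints an indented outline of a document's nodes.
class NodeTreeOutline {

    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static void line(int depth, String text, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(text).append('\n');
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: line(depth, "root", out); break;
            case Node.ELEMENT_NODE: line(depth, "element " + node.getNodeName(), out); break;
            case Node.TEXT_NODE: line(depth, "text '" + node.getNodeValue() + "'", out); break;
            case Node.COMMENT_NODE: line(depth, "comment", out); break;
            default: line(depth, "other", out);
        }
        // Attributes hang off their element but are not its children.
        NamedNodeMap attrs = node.getAttributes(); // null unless node is an element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                line(depth + 1, "attribute " + a.getNodeName() + "=" + a.getNodeValue(), out);
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline(
                "<document><para type=\"note\">content</para>"
                + "<para type=\"warning\">content</para></document>"));
    }
}
```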
61XPath Definition
- XPath is a language for addressing parts of an
XML document, designed to be used by XSLT.
- Example XPath:
- child::para[attribute::type='warning'][position() = 2]
- In English:
- select the second para child of the context node
that has a type attribute with a value of warning.
62Dissecting the Example
- child::para[attribute::type='warning'][position() = 2]
- Axis: child::para
- Filter: [attribute::type='warning']
- Filter: [position() = 2]
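The dissected expression can be tried out directly. This is a minimal sketch using the javax.xml.xpath API bundled with the JDK; the class name XPathExample and the para text values are made up so the selected node is easy to identify.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath against a small sample document.
class XPathExample {

    // Sample data modelled on the node tree example; the text is illustrative.
    static final String XML =
            "<document>"
            + "<para type=\"note\">a note</para>"
            + "<para type=\"warning\">first warning</para>"
            + "<para type=\"warning\">second warning</para>"
            + "</document>";

    /** Uses <document> as the context node; returns the first match's text or null. */
    static String select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Node context = doc.getDocumentElement();
            Node hit = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, context, XPathConstants.NODE);
            return hit == null ? null : hit.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Longhand form from the slide, then the equivalent shorthand.
        System.out.println(select("child::para[attribute::type='warning'][position() = 2]"));
        System.out.println(select("para[@type='warning'][2]"));
        // Swapping the predicates changes the meaning: second para, if it is a warning.
        System.out.println(select("para[2][@type='warning']"));
    }
}
```

The last call illustrates that predicates are applied left to right: position() is counted within whatever the preceding predicates have left in the node set.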
[Diagrams for slides 63 to 65: starting from the context node, the child axis collects the three para children; the first predicate filters out the note para, the second predicate filters out the first warning para, and the second warning para is selected.]
66Types of XPaths
- Expressions
- Return a value, which might be a node set that
is processed or a string that is output.
- <xsl:when test="@type='warning'">
- Patterns
- Either match a particular node or don't match
that node.
- <xsl:template match="para">
67XPath Expressions
- Select Nodes
- <xsl:for-each select="child::Z">
- Conditional
- <xsl:if test="position() = 2">
- Calculation
- <xsl:value-of select="position() * 4"/>
68Node Set Expressions
- The most common way that XSLT uses XPaths is to
select node sets. These XPaths usually occur
within a select attribute, for example on
xsl:for-each or xsl:apply-templates, and are
known as location paths.
- <xsl:for-each select="child::para">
- <xsl:apply-templates select="paragraph"/>
69Location Paths
- The purpose of location paths is to select node
sets from a node tree.
- Location paths can be absolute or relative
- Absolute location paths start from a known
location such as the root node.
- <xsl:for-each select="/R/N">
- Relative location paths start from the context
node.
- <xsl:for-each select="N"> (note: same as
child::N)
70Steps
- A location path is made up of a number of steps.
- Each step takes you from a node to a node set.
- Each step is separated from the one before it
with a /.
71Steps
- Every step is made up of an axis and a node test.
- The axis specifies the direction that the step is
taken in.
- The node test specifies the kinds of nodes
that should be collected in that direction.
- Within a step, the axis and the node test
are separated by a double colon (::).
- child::para
72Some Shorthand Notation
- If no axis is specified in a step, the default
axis is the child axis.
- Longhand: select="child::para/child::sentence"
- Shorthand: select="para/sentence"
- Another important axis is the attribute axis,
which takes you from the context node to the
attributes of that node.
- Longhand: select="attribute::type"
- Shorthand: select="@type"
- Another essential axis is the parent axis. This
takes you from the context node to the parent of
that node.
- Longhand: select="parent::node()/child::xyz"
- Shorthand: select="../xyz"
73Axes
- ancestor::node()
- Takes you up the tree to the ancestors of the
context node in reverse document order.
74Axes
- ancestor-or-self::node()
- Takes you up the tree to the ancestors of the
context node in reverse document order, starting
with the context node.
75Axes
- child::node()
- Takes you to the children of the context node in
document order. This is the default axis.
76Axes
- descendant::node()
- Takes you to the descendants of the context
node. The resulting nodes are in document order.
77Axes
- descendant-or-self::node()
- Takes you to the descendants of the context
node, starting with the context node, in document
order.
78Axes
- following::node()
- Takes you to the nodes that occur after the
context node in document order, but that are not
the context node's descendants.
79Axes
- following-sibling::node()
- Takes you to the siblings of the context node
that occur after the context node in document
order.
80Axes
- parent::node()
- Selects a single node: the parent of the
context node.
81Axes
- preceding::node()
- Takes you to the nodes that occur before the
context node in reverse document order, excluding
the context node's ancestors.
82Axes
- preceding-sibling::node()
- Takes you to the siblings (children of the same
parent) of the context node that occur before it,
in reverse document order.
83Axes
- self::node()
- Takes you to the context node itself.
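The axes above can be compared side by side on one small tree. The sketch below assumes the JDK's javax.xml.xpath API; the class name AxesDemo and the single-letter element names are hypothetical. The context node is the d element, and each call returns the name of the node a step selects. On reverse axes such as ancestor, position 1 is the node nearest to the context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: shows which node each axis reaches from a fixed context node.
class AxesDemo {

    // Illustrative tree: a > (b > (c, d > (x, y), e), f); the context node is d.
    static final String XML = "<a><b><c/><d><x/><y/></d><e/></b><f/></a>";

    /** Returns the element name of the first node selected by the step, or "" if none. */
    static String nameOf(String step) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node context = (Node) xpath.evaluate("//d", doc, XPathConstants.NODE);
            // name() of a node set is the name of its first node in document order.
            return xpath.evaluate("name(" + step + ")", context);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] steps = {
            "self::node()", "parent::node()", "ancestor::*[1]", "ancestor::*[2]",
            "child::*[1]", "descendant::*[2]", "following-sibling::*[1]",
            "following::*[2]", "preceding-sibling::*[1]", "preceding::*[1]",
            "../e", "parent::node()/child::e"
        };
        for (String step : steps) {
            System.out.println(step + " -> " + nameOf(step));
        }
    }
}
```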
84Predicates (filters)
- Predicates are placed in square brackets either
- at the end of a step
- select="para[position() = 3]/sentence"
- or at the end of a location path
- select="para/sentence[position() = 3]"
- Predicates act as filters on node sets.
- When predicate expressions test false, the node
is filtered out.
85Predicates (filters)
- You can have any number of predicates following
each other.
- select="para[position() = 3][@indent = .5]/sentence"
- The context node list for each predicate contains
the nodes that are still in the node set after
it's been filtered by the previous predicates.
- Predicates can be used at any point in a location
path, but they only apply to the immediately
preceding step.
86Expressions - Union
- You can combine node sets by creating a union
using the | operator.
- If there are any nodes that occur in both node
sets, the union only holds one copy of them.
- You can use predicates on the result of a union,
just as you can on any node set.
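The effect of predicate placement and of the union operator can be seen on a small document. This is a minimal sketch assuming the JDK's javax.xml.xpath API; the class name PredicateUnionDemo and the sentence labels (p1s1 means paragraph 1, sentence 1) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the text of every node an XPath selects.
class PredicateUnionDemo {

    // Three paragraphs with two sentences each.
    static final String XML =
            "<doc>"
            + "<para><sentence>p1s1</sentence><sentence>p1s2</sentence></para>"
            + "<para><sentence>p2s1</sentence><sentence>p2s2</sentence></para>"
            + "<para><sentence>p3s1</sentence><sentence>p3s2</sentence></para>"
            + "</doc>";

    /** Evaluates the expression with <doc> as context; joins the matches with spaces. */
    static String select(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc.getDocumentElement(), XPathConstants.NODESET);
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < hits.getLength(); i++) {
                if (i > 0) {
                    out.append(' ');
                }
                out.append(hits.item(i).getTextContent());
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Predicate on the first step: all sentences of the second para.
        System.out.println(select("para[position() = 2]/sentence"));
        // Predicate on the last step: the second sentence of every para.
        System.out.println(select("para/sentence[position() = 2]"));
        // Union: p1s1 is matched by both operands but is held only once.
        System.out.println(select("para[1]/sentence | para/sentence[1]"));
        // A predicate applied to the result of a union.
        System.out.println(select("(para[1]/sentence | para/sentence[1])[last()]"));
    }
}
```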
87Expressions - Operators
- Logical Operators
- and, or and not()
- Comparative Operators
- =, !=, <, <=, >, >=
- Remember to escape < to &lt;
- Mathematical Operators
- +, -, *, div, mod
88Example Locating Nodes
- This XPath expression selects all the price
elements of all the cd elements of the catalog
element
- /catalog/cd/price
- This XPath expression selects all the cd
elements in the document
- //cd
- Note: most XPath expressions are written in
shorthand
89Selecting Unknown Elements
- Wildcards (*) can be used to select unknown XML
elements.
- This XPath expression selects all the child
elements of all the cd elements of the catalog
element
- /catalog/cd/*
- The following XPath expression selects all the
price elements that are grandchild elements of
the catalog element
- /catalog/*/price
90Selecting Unknown Elements
- The following XPath expression selects all price
elements which have 2 ancestors
- /*/*/price
- The following XPath expression selects all
elements in the document
- //*
91Selecting Branches
- The following XPath expression selects the first
cd child element of the catalog element
- /catalog/cd[1]
- The following XPath expression selects the last
cd child element of the catalog element (Note:
There is no function named first())
- /catalog/cd[last()]
92Selecting Branches
- The following XPath expression selects all the cd
elements of the catalog element that have a price
element
- /catalog/cd[price]
- The following XPath expression selects all the cd
elements of the catalog element that have a price
element with a value of 10.90
- /catalog/cd[price=10.90]
93Selecting Branches
- The following XPath expression selects all the
price elements of all the cd elements of the
catalog element that have a price element with a
value of 10.90
- /catalog/cd[price=10.90]/price
94Selecting Several Paths
- The following XPath expression selects all the
title and artist elements of the cd element of
the catalog element
- /catalog/cd/title | /catalog/cd/artist
- The following XPath expression selects all the
title and artist elements in the document
- //title | //artist
- The following XPath expression selects all the
title, artist and price elements in the document
- //title | //artist | //price
95Selecting Attributes
- This XPath expression selects all attributes
named country
- //@country
- This XPath expression selects all cd elements
which have an attribute named country
- //cd[@country]
- This XPath expression selects all cd elements
which have any attribute
- //cd[@*]
- This XPath expression selects all cd elements
which have an attribute named country with a
value of 'UK'
- //cd[@country='UK']
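The catalog expressions above can be exercised against a small sample. The slides do not include the catalog document itself, so the three cd entries below are assumed data for illustration only, and CatalogXPath is a hypothetical class name. The sketch uses the JDK's javax.xml.xpath API and reports how many nodes each expression selects.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: counts the nodes selected from an assumed catalog document.
class CatalogXPath {

    // Assumed sample data: three cd elements, two of which carry a country attribute.
    static final String XML =
            "<catalog>"
            + "<cd country=\"USA\"><title>Album One</title><artist>Artist A</artist><price>10.90</price></cd>"
            + "<cd country=\"UK\"><title>Album Two</title><artist>Artist B</artist><price>9.90</price></cd>"
            + "<cd><title>Album Three</title><artist>Artist C</artist><price>9.90</price></cd>"
            + "</catalog>";

    private static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    /** Number of nodes selected by the expression, evaluated from the root node. */
    static int count(String expression) {
        try {
            Double n = (Double) XPathFactory.newInstance().newXPath()
                    .evaluate("count(" + expression + ")", parse(), XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** String value of the first node selected by the expression. */
    static String text(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, parse());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/catalog/cd/price", "//cd", "/catalog/cd/*", "/catalog/*/price",
            "/*/*/price", "//*", "/catalog/cd[price=10.90]",
            "//title | //artist", "//@country", "//cd[@*]", "//cd[@country='UK']"
        };
        for (String expression : expressions) {
            System.out.println(count(expression) + "  " + expression);
        }
        System.out.println(text("/catalog/cd[last()]/title"));
    }
}
```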
96Basic XSLT
97Background
- XSLT is part of a larger initiative within the
World Wide Web Consortium (W3C) to define a way
of presenting XML documents. This initiative is
known as XSL (Extensible Stylesheet Language).
- XSLT is an XML vocabulary that's used to define a
transformation between an XML document and a
result document, which might be in XML, in HTML,
or a text document.
98What is an XSLT Stylesheet?
- The XSLT processor uses a stylesheet to transform
an XML document.
- The stylesheet contains instructions for
generating a new document based on information in
the source document.
- This can involve adding, removing, or rearranging
nodes, as well as presenting the nodes in a new
way.
99The XSL Processing sequence
[Diagram: the XML parser turns the source document into a source tree; the XSL stylesheet supplies the rules base; applying templates builds the result tree, which is written to the output as a result file or stream.]
100Differentiating Stylesheets
- XSLT stylesheets do not perform the same function
as Cascading Style Sheets (CSS).
- CSS is used to apply style elements to markup
languages to affect formatting in a single-pass,
top-down fashion.
- XSLT produces a separate result tree and can
iterate and perform conditional logic.
- XSLT and CSS are most powerful when used
together. More on this later.
101Example of a Stylesheet
- When you work with a stylesheet, three documents
are involved:
- Source document in XML
- Desired output: the result document, which can be
HTML, XML, or text
- XSLT stylesheet, which is also an XML document
102Hello World!
- XML Document
- <?xml version="1.0"?>
- <greeting>Hello, World!</greeting>
- Desired Output
- <html>
- <head><title>Greeting</title></head>
- <body><p>Hello, World!</p></body>
- </html>
- XSLT Stylesheet
- <?xml version="1.0"?>
- <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
- <xsl:template match="/">
- <html>
- <head><title>Greeting</title></head>
- <body><p><xsl:value-of select="greeting"/></p></body>
- </html>
- </xsl:template>
- </xsl:stylesheet>
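To run the Hello World stylesheet outside a portal, one option is the XSLT processor bundled with the JDK. The following is a minimal sketch under that assumption; HelloWorldTransform is a hypothetical class name, and the serialized output may differ slightly from the desired output on the slide (for example, the HTML output method may add a META element and line breaks).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies a stylesheet to a source document, both given as strings.
class HelloWorldTransform {

    static final String XML =
            "<?xml version=\"1.0\"?>\n<greeting>Hello, World!</greeting>";

    static final String XSLT =
            "<?xml version=\"1.0\"?>\n"
            + "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:template match=\"/\">"
            + "<html>"
            + "<head><title>Greeting</title></head>"
            + "<body><p><xsl:value-of select=\"greeting\"/></p></body>"
            + "</html>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xml, String xslt) {
        try {
            // Compile the stylesheet, then run it over the source document.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSLT));
    }
}
```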
103Dissecting the XSLT
- XSLT stylesheets are XML documents
- <?xml version="1.0"?>
- Standard XSLT heading/namespace
- <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
- The template rule to be triggered when a
particular part of the source document is being
processed. / is XPath for the root node
- <xsl:template match="/">
104Dissecting the XSLT
- Once the rule is triggered, the body of the
template defines what output to generate.
- <html>
- <head><title>Greeting</title></head>
- <body><p><xsl:value-of select="greeting"/></p></body>
- </html>
- Most of the template is HTML except the value-of
element, which is an XSL instruction.
- The XPath in the select attribute is asking for
the contents of the child of the context node
with an element name of greeting.
105Dissecting the XSLT
- All that's left to do is close (finish) the
template and the stylesheet.
- </xsl:template>
- </xsl:stylesheet>
106Some XSL Top Level Elements
- <xsl:decimal-format>
- Declares a decimal format.
- <xsl:import>
- Imports another stylesheet into this stylesheet.
- <xsl:key>
- Provides a way to work with documents that
contain an implicit cross-reference structure.
- <xsl:output>
- Specifies how you want the
result tree to be output.
107Some XSL Top Level Elements
- <xsl:param>
- Declares a parameter for a stylesheet or
template, and specifies a default value for the
parameter.
- <xsl:template>
- Specifies a template rule.
- <xsl:variable>
- Declares a variable and binds a value to that
variable.
- The difference between the xsl:param and
xsl:variable instructions is that xsl:param
defines a default value while xsl:variable
defines a fixed value.
- If used as a top-level element, the scope is
global; otherwise the scope is local to a specific
template.
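The difference between xsl:param and xsl:variable shows up when the caller supplies a value. In this minimal sketch (JDK XSLT processor assumed; the class ParamDemo, the parameter name salutation, and the user document are all hypothetical), the parameter's default can be overridden through Transformer.setParameter, while the variable keeps its fixed value.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: runs one stylesheet with and without a caller-supplied parameter.
class ParamDemo {

    static final String XML = "<user><name>Jim</name></user>";

    static final String XSLT =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            // Global parameter: 'Hello' is only a default.
            + "<xsl:param name='salutation' select=\"'Hello'\"/>"
            // Global variable: its value is fixed by the stylesheet.
            + "<xsl:variable name='mark' select=\"'!'\"/>"
            + "<xsl:template match='/'>"
            + "<xsl:value-of select=\"concat($salutation, ', ', user/name, $mark)\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Pass null to keep the stylesheet's default salutation. */
    static String greet(String salutation) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            if (salutation != null) {
                transformer.setParameter("salutation", salutation);
            }
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet(null));
        System.out.println(greet("Welcome"));
    }
}
```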
108What Is a Template?
- A template defines what the XSLT processor should
do when it processes a particular node in the XML
source document.
- The XSLT processor populates the result document
by instantiating a sequence of templates.
- Instantiation of a template means that the XSLT
processor
- Copies any literal data from the template to the
result document
- Executes the XSLT instructions in the template
109Contents of a Template
- To define a template, you specify the
xsl:template instruction.
- In the xsl:template tag, the value of the match
attribute is an XPath pattern.
- This pattern matches (identifies) nodes in the
source XML document.
- The value of the match attribute is the template
rule.
110The Template Body
- The template body defines actions you want the
XSLT processor to perform each time it
instantiates the template. It contains
- XSLT instructions you want the XSLT processor to
follow
- Elements that specify literal output you want the
XSLT processor to insert in the result document.
For example
- <table align="center" cellpadding="5">
111Determining Templates to Instantiate
- When the XSLT processor applies a stylesheet to
an XML document, it begins processing with the
root node of the XML source document.
- Every stylesheet includes default templates.
- Whether or not you explicitly define a template
rule that matches the root node, the XSLT
processor always instantiates a template that
matches the root node.
112The Example Handout
113Dissected XSLT Stylesheet
114Result
115Dissecting a sample
- In the sample stylesheet, the template rule in
the first template matches the root node
- <xsl:template match="/">
- The XSLT processor instantiates this template to
start generating the result document.
- It copies the first few lines from the template
to the result document.
116Dissecting a sample
- Then the XSLT processor reaches the following
XSLT instruction
- <xsl:apply-templates select="/bookstore/book"/>
- When the XSLT processor reaches the select
attribute, it creates a list of all source nodes
that match the specified pattern.
- In this example, the list contains book elements.
- The processor then processes each node in the
list in turn by instantiating its matching
template.
117Dissecting a sample
- First, the XSLT processor searches for a template
that matches the first book element. The template
rule in the second template matches the book
element
- <xsl:template match="book">
- After instantiating this template for the first
book element, the XSLT processor searches for a
template that matches the second book element.
118Dissecting a sample
- The XSLT processor instantiates the book template
again, and then repeats the process for the third
book element.
- After three instantiations of the book template,
the XSLT processor returns to the first template
(the template that matches the root node) and
continues with the line after the
xsl:apply-templates instruction.
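The processing order described in this walkthrough can be reproduced with a cut-down stylesheet. The example handout is not part of this text, so the bookstore document, the book titles, and the class name ApplyTemplatesDemo below are assumptions. The sketch uses the JDK XSLT processor and text output so the order of instantiation is easy to read.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: shows the root template handing off to the book template.
class ApplyTemplatesDemo {

    // Assumed sample data with three book elements.
    static final String XML =
            "<bookstore>"
            + "<book><title>XSLT Basics</title></book>"
            + "<book><title>XPath Basics</title></book>"
            + "<book><title>CSS Basics</title></book>"
            + "</bookstore>";

    static final String XSLT =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            // First template: matches the root node.
            + "<xsl:template match='/'>"
            + "<xsl:text>Books: </xsl:text>"
            + "<xsl:apply-templates select='/bookstore/book'/>"
            // Processing resumes here after all book elements are handled.
            + "<xsl:text>done</xsl:text>"
            + "</xsl:template>"
            // Second template: instantiated once per selected book element.
            + "<xsl:template match='book'>"
            + "<xsl:text>[</xsl:text><xsl:value-of select='title'/><xsl:text>] </xsl:text>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String run() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```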
119Built-in Templates
- When the XSLT processor cannot find a template
that matches a selected node, it uses built-in
templates. Every stylesheet includes built-in
templates whether or not you explicitly define
them.
120Built-in Templates
- The following template matches the root node and
element nodes and selects all child nodes for
further processing
- <xsl:template match="*|/">
- <xsl:apply-templates/>
- </xsl:template>
- The following template matches text and attribute
nodes. This template copies the value of the text
or attribute node to the result document
- <xsl:template match="@*|text()">
- <xsl:value-of select="."/>
- </xsl:template>
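The built-in templates can be observed by running a stylesheet that defines no templates at all. This is a minimal sketch assuming the JDK XSLT processor; the class name BuiltInTemplatesDemo and the sample document are hypothetical. Only text nodes reach the output, because xsl:apply-templates without a select attribute processes child nodes, and attributes are not children.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: transforms a document using only the built-in template rules.
class BuiltInTemplatesDemo {

    // The note attribute is never selected, so its value does not appear in the output.
    static final String XML =
            "<doc><a>Hello, </a><b note=\"ignored\">World!</b></doc>";

    // No xsl:template elements: every node falls through to the built-in rules.
    static final String XSLT =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "</xsl:stylesheet>";

    static String run() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(XML)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```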
121Major XSL Instructions
- xsl:apply-imports
- xsl:apply-templates
- xsl:attribute
- xsl:attribute-set
- xsl:call-template
- xsl:choose
- xsl:comment
- xsl:copy
- xsl:copy-of
- xsl:decimal-format
- xsl:element
- xsl:fallback
- xsl:for-each
- xsl:if
- xsl:import
- xsl:include
- xsl:key
- xsl:message
- xsl:namespace-alias
- xsl:number
- xsl:otherwise
- xsl:output
- xsl:param
- xsl:preserve-space
- xsl:processing-instruction
- xsl:sort
- xsl:strip-space
- xsl:stylesheet
- xsl:template
- xsl:text
- xsl:transform
- xsl:value-of
- xsl:variable
- xsl:when
- xsl:with-param
122The Tools
123The Applications
- These are the applications I am familiar with;
this is not an endorsement, and no testing
was done on animals.
- XML/XSLT Document Development IDE
- Sonic Stylus Studio
- XML Spy
- Cooktop (open source)
- Komodo (commercial; runs on Linux)
- Emacs (minor mode XSLT-process)
- HTML Markup IDE
- Macromedia Dreamweaver MX (XHTML), Adobe GoLive
- HTML to XHTML Conversion/Cleanup
- HTML Tidy (open source)
124The Creation Process
125Creating an XSLT Stylesheet
- Step one: get or develop one of the following
- A representative XML document
- A DTD (a whole different seminar)
- A Schema (a whole different seminar)
- Step two: analyze the data
- Think about the best way to present and interact
with it. The data is your friend.
- Step three: develop sample markup
- Design mocks for each markup language device
that you plan to support. Don't try to paint the
canvas without a sketch.
- Step four: convert the markup to XML
- The stylesheet will only eat tasty markup. Poorly
formed markup will be regurgitated.
- Step five: copy the markup into an XSLT editor
- Match on a root element and start cooking XSLT.
126Hands - on
127Hands on
- Convert RSS Syndicated Content XML Documents
into HTML using an XSLT Stylesheet.
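One possible starting point for this exercise is sketched below. It is not the workshop's own solution: the RSS sample, the example.org links, the chosen HTML structure, and the class name RssToHtml are all assumptions. It uses the JDK XSLT processor and the xml output method so the result stays well-formed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: renders an RSS channel as a heading and a list of links.
class RssToHtml {

    // Assumed sample feed with two items.
    static final String RSS =
            "<rss version='0.91'><channel><title>Campus News</title>"
            + "<item><title>First item</title><link>http://example.org/1</link></item>"
            + "<item><title>Second item</title><link>http://example.org/2</link></item>"
            + "</channel></rss>";

    static final String XSLT =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            // The channel becomes a container with a heading and a list.
            + "<xsl:template match='/rss/channel'>"
            + "<div><h2><xsl:value-of select='title'/></h2>"
            + "<ul><xsl:apply-templates select='item'/></ul></div>"
            + "</xsl:template>"
            // Each item becomes a list entry; {link} is an attribute value template.
            + "<xsl:template match='item'>"
            + "<li><a href='{link}'><xsl:value-of select='title'/></a></li>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String render() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(RSS)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

The root node and the rss element have no templates of their own, so the built-in templates walk down to the channel element, where the first explicit template takes over.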