Introduction to XML - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Introduction to XML


1
Introduction to XML
  • Valérie Bellynck
  • EFPG-INPG
  • France

mailto:Valerie.Bellynck_at_efpg.inpg.fr
2
What is XML?
From "XML in Micro-Application", e-Poche collection
  • stands for eXtensible Markup Language
  • (in French, "langage à balises extensible" or
    "langage à balises extensibles"; in Spanish?)
  • 1996: development by the XML Working Group,
    under World Wide Web Consortium (W3C)
    supervision
  • XML: a generalisation of HTML in which fixed,
    predefined tags with fixed semantics are replaced
    by tags "invented" by the author
  • 1998: official evolution to the XML 1.0 standard
    (specifications → W3C Recommendation)

→ http://www.w3c.org/XML/
3
HTML, XML, SGML
  • XML comes from SGML, not from HTML

From "XML in Micro-Application", e-Poche collection
4
SGML
  • Standard Generalized Markup Language
  • defined in 1986 by ISO 8879 standard
  • completely separates, within a document,
    content / presentation / structure description
  • used in: industry, for technical documents;
    electronic document management (EDM, "GED" in French)
  • problems: not designed for Internet use;
    complex and heavy specification to follow

→ http://www.sgmlsource.com/Goldfarb/history/index/htm
5
HTML
  • HyperText Markup Language
  • is an application of SGML
  • is a document description language: section
    titles, bookmarks, anchors, language elements
    to format text, to describe tables...
  • is interpreted by a browser (a client
    application for Internet requests)
  • the display is browser-independent
  • problems: content and presentation are mixed

→ http://www.w3c.org/HTML/
6
Goals of XML: it must be...
  • usable without difficulty on the Internet
  • quick to define
  • described in a formal and concise way
  • self-describing
  • able to extend itself
  • able to handle tree-structured data descriptions
  • processable by any application equipped with a
    text parser
  • able to support Unicode and any other character
    encoding, for linguistic universality
  • able to support a wide range of applications
  • compatible with SGML
  • a way to make it easier to write software for
    document processing
  • a way of representing data as human-readable
    documents
  • easy to use for creating documents

7
Markup Languages?
  • Markups are pairs of expressions (tags) which
    surround a block of text, to indicate some
    characteristics
  • e.g. in HTML, the tag <B> starts bold display
    and </B> ends it

<B> Text in Bold </B>
→ Text in Bold (displayed in bold)
  • Tags can be parametrised by attributes
  • e.g. in HTML: the tag <a> defines a hypertext
    link; the URL of the link is defined by the
    attribute href; the clickable text is surrounded
    by the tags <a> and </a>

<a href="http://www.3ie.org/xml"> click here </a>
→ click here (displayed as a clickable link)
8
HTML code
<HTML>
<HEAD>
<TITLE>Lime Jello Marshmallow Cottage Cheese Surprise</TITLE>
</HEAD>
<BODY>
<H3>Lime Jello Marshmallow Cottage Cheese Surprise</H3>
My grandma's favorite (may she rest in peace).
<H4>Ingredients</H4>
<TABLE BORDER="1">
<TR BGCOLOR="#308030"><TH>Qty</TH><TH>Units</TH><TH>Item</TH></TR>
<TR><TD>1</TD><TD>box</TD><TD>lime gelatin</TD></TR>
<TR><TD>500</TD><TD>g</TD><TD>multicolored tiny marshmallows</TD></TR>
<TR><TD>500</TD><TD>ml</TD><TD>cottage cheese</TD></TR>
<TR><TD></TD><TD>dash</TD><TD>Tabasco sauce (optional)</TD></TR>
</TABLE>
<P>
<H4>Instructions</H4>
<OL>
<LI>Prepare lime gelatin according to package instructions...</LI>
<!-- and so on -->
</BODY>
</HTML>
9
HTML example code in browser
10
XML example code
<?xml version="1.0"?>
<Recipe>
  <Name>Lime Jello Marshmallow Cottage Cheese Surprise</Name>
  <Description>My grandma's favorite (may she rest in peace).</Description>
  <Ingredients>
    <Ingredient>
      <Qty unit="box">1</Qty>
      <Item>lime gelatin</Item>
    </Ingredient>
    <Ingredient>
      <Qty unit="g">500</Qty>
      <Item>multicolored tiny marshmallows</Item>
    </Ingredient>
    <Ingredient>
      <Qty unit="ml">500</Qty>
      <Item>Cottage cheese</Item>
    </Ingredient>
    <Ingredient>
      <Qty unit="dash"/>
      <Item optional="1">Tabasco sauce</Item>
    </Ingredient>
  </Ingredients>
  <Instructions>
    <Step>Prepare lime gelatin according to package instructions</Step>
    <!-- And so on... -->
  </Instructions>
</Recipe>
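Because the XML version describes the data rather than its layout, a program can read the recipe directly. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the shortened recipe text and the helper name list_ingredients are assumptions made for illustration, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the recipe document (illustrative data only).
RECIPE_XML = """<?xml version="1.0"?>
<Recipe>
  <Name>Lime Jello Marshmallow Cottage Cheese Surprise</Name>
  <Ingredients>
    <Ingredient>
      <Qty unit="box">1</Qty>
      <Item>lime gelatin</Item>
    </Ingredient>
    <Ingredient>
      <Qty unit="dash"/>
      <Item optional="1">Tabasco sauce</Item>
    </Ingredient>
  </Ingredients>
</Recipe>"""


def list_ingredients(xml_text):
    """Return (quantity, unit, item, is_optional) for each <Ingredient>."""
    root = ET.fromstring(xml_text)
    rows = []
    for ingredient in root.iter("Ingredient"):
        qty = ingredient.find("Qty")
        item = ingredient.find("Item")
        # An empty <Qty/> has no text, so qty.text is None in that case.
        is_optional = item.get("optional", "0") == "1"
        rows.append((qty.text, qty.get("unit"), item.text, is_optional))
    return rows


for row in list_ingredients(RECIPE_XML):
    print(row)
```

The same extraction from the HTML version would have to rely on table positions, which is the practical difference the two slides illustrate.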
11
XML header information
Every XML file should begin with a header (the XML
declaration) stating which version of XML is used in
the document:
<?xml version="1.0"?>
This is done through the version attribute. Other
attributes can define global properties, such as the
encoding attribute, which defines the character
encoding:
<?xml version="1.0" encoding="ISO-8859-1"?>
ISO-8859-1 is an encoding that covers French
characters. UTF-8 is the universal international
encoding for all characters.
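The effect of the encoding attribute can be checked with a short sketch: the same text is stored as different bytes in ISO-8859-1 and UTF-8, and the parser relies on the declaration to decode them. The sample element and the helper name parse_bytes are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def parse_bytes(data: bytes) -> str:
    """Parse raw bytes; the parser reads the encoding from the declaration."""
    return ET.fromstring(data).text


# The same document text, serialized with two different encodings.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><prenom>Val\u00e9rie</prenom>'
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?><prenom>Val\u00e9rie</prenom>'
).encode("utf-8")

print(latin1_doc != utf8_doc)      # the byte sequences differ
print(parse_bytes(latin1_doc) == parse_bytes(utf8_doc))  # the decoded text is the same
```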
12
Well-formed XML means "parsable"
  • A well-formed XML document is a document that
    follows all the notational and structural rules
    for XML; otherwise it is meaningless
  • By analogy, the expression 2 ( 5 () 9 > 7
    is meaningless even if it looks (sort of)
    like math
  • The most important rules are:
  • No unclosed tags: a block can't be "opened" with
    a tag <TAG> without being "closed" afterwards
    with </TAG>
  • Closed empty elements: they must have
    either a closing tag, <EMPTY type="example"></EMPTY>,
    or a single tag with a slash "/" before
    the closing ">", <EMPTY type="example" />
  • No overlapping tags: a tag that opens inside
    another tag must close before the containing tag
    closes: <INCLUDING-TAG> <CONTAINING-TAG>
    </CONTAINING-TAG> </INCLUDING-TAG>
  • Quotes around attribute values:
    <TAG type="example">
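Since "well-formed" means "parsable", the rules above can be tried out by handing small fragments to a parser. This is a minimal sketch using Python's standard library; the function name is_well_formed and the sample fragments are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# One sample per rule listed on the slide.
samples = {
    "closed tag": "<TAG>text</TAG>",
    "unclosed tag": "<TAG>text",
    "closed empty element": '<EMPTY type="example" />',
    "overlapping tags": "<A><B></A></B>",
    "unquoted attribute value": "<TAG type=example></TAG>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Note that this only checks well-formedness; it says nothing about validity against a DTD.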

13
Valid XML
  • A document is valid if it conforms to its
    Document Type Definition (DTD)
  • A DTD is a grammar for some class of documents
    using a markup language, that is, a set of rules
    describing the authorized sequences and
    nestings of tags
  • The language used to write DTDs is a special
    language, not XML
  • but there is a richer syntax for defining
    document types in XML itself (schemas)
  • A DTD specifies
  • what elements may exist,
  • which attributes the elements may have,
  • what structural organisation of elements is
    expected: which elements may or must be found
    inside other elements, and in what order.

→ thanks to DTDs, XML is eXtensible
14
Power of DTD
  • Writing a DTD is how you actually define a new
    markup language -- often called a dialect of XML.
  • At present, DTDs are being written for an
    enormous number of different problem domains,
    and each DTD defines a new markup language.
  • New markup languages now exist, or are being
    designed,
  • to mark up specific domains such as the plays of
    Shakespeare or business data in the footwear
    industry (FDX) ...
  • to define general data resources (RDF)
  • to model information in the health care industry
    (HL7 SGML/XML)
  • to typeset, display, and actively use
    mathematical equations (MathML)
  • and to perform electronic data interchange
    (XML/EDI).

15
Modelling information structure in XML
16
DTD for the example
<!-- This is the example DTD for the example XML -->
<!ELEMENT Recipe (Name, Description?, Ingredients?, Instructions?)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT Description (#PCDATA)>
<!ELEMENT Ingredients (Ingredient)*>
<!ELEMENT Ingredient (Qty, Item)>
<!ELEMENT Qty (#PCDATA)>
<!ATTLIST Qty unit CDATA #REQUIRED>
<!ELEMENT Item (#PCDATA)>
<!ATTLIST Item optional CDATA "0"
               isVegetarian CDATA "true">
<!ELEMENT Instructions (Step)+>
17
DTD defining tags
  • <!ELEMENT Recipe (Name, Description?,
    Ingredients?, Instructions?)> The <!ELEMENT...>
    statement defines a tag in the document. This
    statement defines a <Recipe> tag, stating that it
    can contain
  • - a <Name>,
  • - an optional <Description> (the question mark ?
    denotes optionality),
  • - an optional <Ingredients> tag,
  • - and an optional <Instructions> tag.
  • <!ELEMENT Name (#PCDATA)> This simply states
    that a <Name> tag can contain character data and
    nothing else.
  • <!ATTLIST Item optional CDATA "0" isVegetarian
    CDATA "true"> This section states that the
    <Item> tag has two possible attributes:
  • - optional, whose default value is 0, and
  • - isVegetarian, whose default value is true.
  • <!-- This is a comment --> The text "This is a
    comment" won't be interpreted.
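To make the meaning of a content model such as (Name, Description?, Ingredients?, Instructions?) concrete, the sketch below translates it into a regular expression over the sequence of child tag names. This is not a DTD validator: Python's standard library does not validate against DTDs, and the helper names content_model_to_regex and children_match are hypothetical. It only handles a flat, comma-separated sequence with optional ?, * or + suffixes.

```python
import re
import xml.etree.ElementTree as ET

RECIPE_MODEL = "(Name, Description?, Ingredients?, Instructions?)"


def content_model_to_regex(model: str) -> re.Pattern:
    """Translate a flat DTD sequence like '(A, B?, C*)' into a regex."""
    outer = re.fullmatch(r"\(\s*(.*?)\s*\)", model.strip())
    if outer is None:
        raise ValueError(f"unsupported content model: {model!r}")
    parts = []
    for item in outer.group(1).split(","):
        match = re.fullmatch(r"(\w+)([?*+]?)", item.strip())
        if match is None:
            raise ValueError(f"unsupported item: {item!r}")
        name, suffix = match.groups()
        # Each child tag name is followed by one space in the sequence string.
        parts.append(f"(?:{re.escape(name)} ){suffix}")
    return re.compile("".join(parts))


def children_match(element: ET.Element, model: str) -> bool:
    """Check the order and presence of an element's direct children."""
    sequence = "".join(child.tag + " " for child in element)
    return content_model_to_regex(model).fullmatch(sequence) is not None


good = ET.fromstring("<Recipe><Name>x</Name><Ingredients/></Recipe>")
bad = ET.fromstring("<Recipe><Description>y</Description><Name>x</Name></Recipe>")
print(children_match(good, RECIPE_MODEL))
print(children_match(bad, RECIPE_MODEL))
```

The first document satisfies the model (the optional elements may be absent); the second does not, because <Name> must come first.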

18
DTD other definitions
<!ENTITY Utterance "example of sentence or value">
This defines an internal entity. It associates a
value with a name, which can be more explicit than
a tag in the document. The browser will replace the
entity reference &Utterance; with the text
"example of sentence or value".
There are external entities too, which may or may
not be XML content; all of them are declared in the
DTD.
<!ENTITY TextPresentation SYSTEM "http://foo.com/presentation/text.xml">
This allows the document to reference the content of
the file stored at that URL. The browser will
replace the entity reference &TextPresentation; with
the content of the file located at
http://foo.com/presentation/text.xml
<!NOTATION gif SYSTEM "usr/local/bin/display">
<!ENTITY ImagePresentation SYSTEM "http://foo.com/img/lion.gif" NDATA gif>
For non-XML content, such as GIF files, the
notation declaration specifies the application
authorized to handle it.
<imagePres src="ImagePresentation"/>
will include the image in the document through the
browser.
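Internal entity replacement can be observed with any XML parser. In this minimal sketch the entity is declared in an internal DTD subset and Python's standard parser substitutes its value; the <note> wrapper element is an assumption added for illustration.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset and used in the content.
DOCUMENT = """<!DOCTYPE note [
  <!ENTITY Utterance "example of sentence or value">
]>
<note>Here is an &Utterance;.</note>"""

note = ET.fromstring(DOCUMENT)
print(note.text)
```

External entities (SYSTEM "...") are a different matter: this parser does not fetch them, so the sketch covers the internal case only.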
19
DTD file call in XML file
  • in the XML file,
  • a document type declaration tells the parser
  • to look for a <Recipe> tag as the
    top-level tag (root) of the document,
  • that the DTD is in the system file example.dtd
  • <!DOCTYPE Recipe SYSTEM "example.dtd">

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE personne SYSTEM "personne.dtd">
<personne>
  <prenom>Alain</prenom>
  <nom>Connu</nom>
</personne>
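What the document type declaration tells the parser can be inspected from a program. The sketch below uses Python's standard xml.dom.minidom to read the declared root name and the system identifier; the parser does not load personne.dtd here, and the helper name doctype_info is an assumption.

```python
from xml.dom import minidom

DOCUMENT = (
    '<!DOCTYPE personne SYSTEM "personne.dtd">'
    "<personne><prenom>Alain</prenom><nom>Connu</nom></personne>"
)


def doctype_info(xml_text):
    """Return (declared root name, DTD system file, root matches declaration)."""
    document = minidom.parseString(xml_text)
    doctype = document.doctype
    root = document.documentElement
    return doctype.name, doctype.systemId, root.tagName == doctype.name


print(doctype_info(DOCUMENT))
```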
20
DTD directly included in file
<!DOCTYPE personne [directDTDcontent]>
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- DTD declaration and definition -->
<!DOCTYPE personne [
  <!ELEMENT personne (prenom, nom)>
  <!ELEMENT prenom (#PCDATA)>
  <!ELEMENT nom (#PCDATA)>
]>
<!-- end of DTD declaration and definition -->
<personne>
  <prenom>Alain</prenom>
  <nom>Connu</nom>
</personne>
21
What is a "Namespace"?
  • It allows tags to be shared between authors of
    XML documents
  • It allows a choice between one's own tags and
    tags defined by someone else
  • It concerns the DTD used for elements and for
    attributes
  • Some namespaces have become W3C standards:
    XML Schema, XLink (XML Linking Language),
    XSL (eXtensible Stylesheet Language), XHTML,
    versions of HTML (3.0, 4.0...)

22
Example of HTML Namespace
<?xml version="1.0"?>
<!-- All elements are in the HTML namespace -->
<html:html xmlns:html="http://www.w3.org/TR/REC-html40">
  <html:head>
    <html:title>Namespace Example use</html:title>
  </html:head>
  <html:body>
    <html:p> Text and Links <html:a href="http://foo.com">here</html:a> </html:p>
  </html:body>
</html:html>
This example uses the XML namespace of HTML
defined in the W3C recommendation REC-html40 for
HTML version 4.0
23
Example of using 2 Namespaces
<?xml version="1.0"?>
<lv:livre xmlns:lv="urn:loc.gov:livres" xmlns:isbn="urn:ISBN:0-395-36341-6">
  <lv:titre>Harry Potter et la coupe de feu</lv:titre>
  <isbn:number>0747554420</isbn:number>
</lv:livre>
This example tells the parser to use 2
namespaces, with lv and isbn respectively as
prefixes
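A parser resolves each prefix to its namespace URI, and it is the URI, not the prefix, that identifies an element. The sketch below reads the two-namespace example with Python's standard library and deliberately uses different prefixes (book and id, chosen arbitrarily for illustration) for the lookup.

```python
import xml.etree.ElementTree as ET

BOOK_XML = (
    '<lv:livre xmlns:lv="urn:loc.gov:livres" xmlns:isbn="urn:ISBN:0-395-36341-6">'
    "<lv:titre>Harry Potter et la coupe de feu</lv:titre>"
    "<isbn:number>0747554420</isbn:number>"
    "</lv:livre>"
)

# Prefixes used for searching; only the URIs have to match the document.
NAMESPACES = {"book": "urn:loc.gov:livres", "id": "urn:ISBN:0-395-36341-6"}

root = ET.fromstring(BOOK_XML)
title = root.find("book:titre", NAMESPACES).text
number = root.find("id:number", NAMESPACES).text

print(root.tag)   # ElementTree stores names as {namespace-uri}local-name
print(title, number)
```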
24
Representing schema structure in XML
  • XML Schema
  • is an XML-based alternative to DTDs
  • supports data types (more than only
    PCDATA)
  • uses XML syntax (→ editable with an XML editor,
    parseable by any XML parser, manipulable with the
    XML DOM, transformable with XSLT)
  • is extensible, just like XML (→ reusability,
    derivation of one's own data types from standard
    types, multiple schemas referenced from the
    same document)
  • secures data communication (sender and receiver
    can both have the same "expectations" about the
    content by sharing its structural representation;
    this ties in with interoperability)

→ http://www.w3schools.com/default.asp/
25
Schema example
<?xml version="1.0" encoding="iso-8859-1"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xsd:element name="film" type="typeFilm"/>
  <xsd:complexType name="typeFilm">
    <xsd:sequence>
      <xsd:element name="titre" type="xsd:string"/>
      <xsd:element name="acteurs" type="typeActeur"/>
      <xsd:element name="realisateur" type="xsd:string"/>
      <xsd:element name="annee" type="xsd:decimal"/>
      <xsd:element name="texte" type="xsd:string"/>
      <xsd:element name="note" type="xsd:string" minOccurs="0" maxOccurs="1"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="typeActeur">
    <xsd:sequence>
      <xsd:element name="personne" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
26
Presentation: CSS and XSL
  • for general control over formatting, use
  • Cascading Style Sheets (CSS)
  • eXtensible Stylesheet Language (XSL)
  • Both are declarative languages
  • XSL is more recent than CSS
  • XSL is described in XML, using namespace power

27
CSS for HTML and XML
  • exists as a current recommendation from the W3C,
    usable with HTML or XML
  • is simpler to use and less powerful than XSL
  • is supported by most current-generation browsers
    (to varying degrees)

→ http://www.W3.org/TR/html401/present/styles
28
Cascading Style Sheets
In the small example below, <HTML> contains <BODY>,
which contains <H1>, which contains text
<HTML>
<HEAD> </HEAD>
<BODY>
<H1>A Theory About the Brontosaurus</H1>
My theory about the brontosaurus is...
</BODY>
</HTML>
The whole idea of a style sheet is to use these
structural relationships to indicate where
changes in text style, spacing, and so on should
occur.
<STYLE TYPE="text/css">
<!--
H1 { color: red; font-size: 16pt; text-decoration: underline }
-->
</STYLE>
29
Example of CSS file
html\:body { background-color: rgb(255, 230, 230) }
article { display: block; font-family: helvetica, sans-serif; background-color: rgb(230, 230, 255) }
titre { display: block; font-size: 200%; text-align: center; border-width: medium; border-style: groove }
auteur { display: block; font-size: 80%; font-weight: bold }
date { display: inline; font-size: 80%; font-style: italic }
lieu { display: inline; font-size: 80%; font-weight: bold }
texte { display: block }
grand { display: inline; font-variant: small-caps; font-size: 120%; font-weight: bold }
image { display: block; border-width: thin; text-align: center; border-style: solid; content: url(attr(site)) }
legende { display: block; text-align: center; padding-right: 2mm; padding-top: 2mm; padding-bottom: 2mm; padding-left: 2mm }
30
External CSS
  • The CSS to use can be defined
  • using a <LINK> element (in the <HEAD> for default
    use)

<HTML>
<HEAD>
<LINK href="special.css" rel="stylesheet" type="text/css">
</HEAD>
<BODY>
<H1>A Theory About the Brontosaurus</H1>
My theory about the brontosaurus is...
</BODY>
</HTML>
  • in the <META> declaration (only for default use)

...
<HEAD>
<META http-equiv="Content-Style-Type" content="text/css">
</HEAD>
...
31
How do browsers apply CSS ?
  • The browser will determine which style to use as
    follows
  • select the last CSS <META> declaration
  • otherwise, select the last other CSS
    declaration (for example, by <LINK>)
  • otherwise, the default stylesheet language is
    "text/css"

32
Why is CSS named CSS?
  • These style sheets are called cascading style
    sheets because styles (like fonts, colors, and
    so on) for one markup element "cascade" down and
    apply to all of the element's contents.
  • For example, if a paragraph tag (<P>) is set to
    show its text in red, all text and any other
    element inside that paragraph will be displayed
    in red, unless a sub-element of the paragraph
    specifies a color for its own contents.
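The "cascade down" behaviour described above can be sketched as a walk over the element tree in which each element either sets its own color or takes the one handed down by its parent. This is a simplified model, not how a browser implements CSS: the function resolve_colors, the style table, and the default color "black" are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def resolve_colors(element, styles, inherited="black"):
    """Return (tag, color) pairs; a color applies to descendants unless overridden."""
    color = styles.get(element.tag, inherited)
    resolved = [(element.tag, color)]
    for child in element:
        resolved.extend(resolve_colors(child, styles, color))
    return resolved


page = ET.fromstring("<BODY><P>red text <B>still red</B> <EM>blue</EM></P></BODY>")
styles = {"P": "red", "EM": "blue"}  # hypothetical style sheet: tag -> color

for tag, color in resolve_colors(page, styles):
    print(tag, color)
```

Here <B> has no rule of its own and shows up in red, while <EM> overrides the paragraph's color.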

33
XSL for XML and SGML
  • used exclusively to format XML or SGML
  • more complex and powerful than CSS

→ http://nwalsh.com/docs/tutorials/webtek2000/xsl/ie/frames.html
34
XSL: Why Stylesheets for XML?
From Norman Walsh, http://nwalsh.com/docs/tutorials/webtek2000/xsl/ie/frames.html
  • because
  • XML is not a fixed tag set (like HTML) and has
    no (application) semantics
  • XML markup does not (usually) include formatting
    information
  • Reuse: the same content can look different in
    different contexts
  • Multiple output formats: different media (paper,
    online), different sizes (manuals, reports),
    different classes of output devices
    (workstations, hand-held devices)
  • Styles tailored to the reader's preference (e.g.,
    accessibility): print size, color, simplified
    layout for audio readers

35
Options for displaying XML
36
What does a StyleSheet do ?
  • It specifies the presentation of XML information
    using two basic categories of techniques
  • An optional transformation of the input document
    into another structure
  • generation of constant text
  • suppression of content
  • moving text (e.g., exchanging the order of the
    first and last name)
  • duplicating text (e.g., copying titles to make a
    table of contents)
  • executing more complex transformations that
    "compute" new information in terms of the
    existing information
  • A description of how to present the transformed
    information
  • i.e., a specification of what properties to
    associate to each of the various parts of the
    transformed information

37
Needs to present information
  • Description of how to present the (possibly
    transformed) data includes three levels of
    formatting information
  • Specification of the general screen or page (or
    even audio) layout
  • Assignment of the transformed content into basic
    "content container types" (e.g., lists,
    paragraphs, inline text)
  • Specification of formatting properties (spacing,
    margins, alignment, fonts, etc.) for each
    resulting "container"

38
Components of XSL
  • The full XSL language logically consists of three
    component languages which are described in three
    W3C (World Wide Web Consortium) recommendations
  • XPath (XML Path Language): a language for
    referencing specific parts of an XML document
  • XSLT (XSL Transformations): a language for
    describing how to transform one XML document
    (represented as a tree) into another
  • XSL (Extensible Stylesheet Language): XSLT plus a
    description of a set of Formatting Objects and
    Formatting Properties

39
XML to Result Tree
  • An XSLT "stylesheet" transforms the input
    (source) document tree into a structure called a
    result tree consisting of result objects

→ Transform to Another Vocabulary
40
What is an XSL Stylesheet?
  • XSLT stylesheets are XML documents; namespaces
    are used to identify semantically significant
    elements.
  • Most stylesheets are stand-alone documents rooted
    at <xsl:stylesheet> (or <xsl:transform>).
  • It is possible to have "single template"
    stylesheet/documents.
  • Note that it is the mapping from namespace
    abbreviation to URI that is important, not the
    literal namespace abbreviation "xsl:" that is
    used most commonly

41
Understanding a template
  • Most templates have the following form:

<xsl:template match="para">
  <p> <xsl:apply-templates/> </p>
</xsl:template>
  • The whole <xsl:template> element is a template
  • The match pattern determines where this template
    applies
  • Literal result elements come from non-XSL
    namespace(s)
  • XSLT elements come from the XSL namespace

42
Style sheet example
A small, complete style sheet
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>
  <xsl:template match="doc">
    <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>
</xsl:stylesheet>
43
Transformation is application of templates
  • Templates transform portions of the source tree
    into portions of the result tree.
  • The ordered accumulation of all the transformed
    portions forms the complete result tree.
  • Individual templates are free to process
    elements from anywhere in the source tree.

44
Match Patterns (locating elements)
  • A critical capability of a stylesheet language
    is locating the source elements to be styled
  • For example,
  • - CSS does this with "selectors".
  • - FOSIs do it with "e-i-c's", elements in
    context.
  • - XSLT does it with "match patterns" defined in
    XPath.

45
XPath
  • XPath has an extensible string-based syntax
    inspired, in part, by the common "path/file" file
    system syntax
  • para
  • matches all <para> children in the current
    context
  • para/emphasis
  • matches all <emphasis> elements that have a
    parent of <para>
  • ancestor-or-self::*/@sepchar
  • matches the sepchar attribute on the current
    element or any ancestor of the current element
  • numberedlist/listitem[position() mod 2 = 0]
  • matches even-numbered list items in a numbered
    list.
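The simpler path expressions above can be tried with Python's standard xml.etree.ElementTree, which supports only a limited subset of XPath. Axes such as ancestor-or-self and predicates using "position() mod 2" are outside that subset, so the sketch below filters list items by position in plain Python instead; the sample document is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<section>"
    "<para>One <emphasis>two</emphasis></para>"
    "<note><emphasis>not under para</emphasis></note>"
    "<para>Three</para>"
    "<numberedlist>"
    "<listitem>a</listitem><listitem>b</listitem>"
    "<listitem>c</listitem><listitem>d</listitem>"
    "</numberedlist>"
    "</section>"
)

# "para": all <para> children of the current context (here, <section>).
paras = doc.findall("para")

# "para/emphasis": <emphasis> elements whose parent is a <para>.
emphases = [e.text for e in doc.findall("para/emphasis")]

# Equivalent of listitem[position() mod 2 = 0], computed in Python.
items = doc.findall("numberedlist/listitem")
even_items = [item.text for pos, item in enumerate(items, start=1) if pos % 2 == 0]

print(len(paras), emphases, even_items)
```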

46
Applying style recursively
  • The process is allowed to run recursively, driven
    primarily by the document.
  • A series of templates is created, such that if
    there is a template to match each context, then
    these templates are recursively applied starting
    at the root of the document.

<xsl:template match="section/title">
  <h2><xsl:apply-templates/></h2>
</xsl:template>
  • <xsl:template match="...">
  • <xsl:apply-templates>

<xsl:apply-templates select="th|td"/>
  • 2 obstacles appear when using the recursive
    model,
  • how to arbitrate between multiple patterns that
    match and
  • how to process the same nodes in different
    contexts.
  • These are solved by conflict resolution and
    modes, respectively.
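The recursive model can be imitated in a few lines of Python. This is only an analogy, since Python's standard library has no XSLT processor: the TEMPLATES table maps a tag name to a function (standing in for match patterns), and apply_templates plays the role of <xsl:apply-templates/>. All names here are assumptions, and conflict resolution and modes are not modelled.

```python
import xml.etree.ElementTree as ET
from html import escape

# One "template" per element name; each one wraps the processed children.
TEMPLATES = {
    "doc": lambda node: "<html><body>" + apply_templates(node) + "</body></html>",
    "title": lambda node: "<h1>" + apply_templates(node) + "</h1>",
    "para": lambda node: "<p>" + apply_templates(node) + "</p>",
}


def apply_templates(node):
    """Process the text and children of a node in document order."""
    parts = [escape(node.text or "")]
    for child in node:
        # Without a matching template, just recurse into the child's content.
        template = TEMPLATES.get(child.tag, apply_templates)
        parts.append(template(child))
        parts.append(escape(child.tail or ""))
    return "".join(parts)


def transform(xml_text):
    """Start the recursion at the root of the document."""
    root = ET.fromstring(xml_text)
    template = TEMPLATES.get(root.tag, apply_templates)
    return template(root)


print(transform("<doc><title>Hi</title><para>A <b>bold</b> word</para></doc>"))
```

The traversal is driven by the document: no template names the elements it expects to find below it.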

47
Applying style procedurally
  • In this approach to applying style, each action
    is selected procedurally.
  • A series of templates is created, such that each
    template explicitly selects and processes the
    necessary elements.

<xsl:for-each>:
<xsl:for-each select="row">
  <tr>
    <xsl:for-each select="entry">
      <td><xsl:value-of select="."/></td>
    </xsl:for-each>
  </tr>
</xsl:for-each>
<xsl:template name="...">:
<xsl:template name="admonition">
  <xsl:param name="type">warning</xsl:param>
  ...
</xsl:template>
<xsl:call-template>:
<xsl:call-template name="admonition">
  <xsl:with-param name="type">caution</xsl:with-param>
</xsl:call-template>
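For comparison, the nested <xsl:for-each> example reads much like two nested loops in a general-purpose language. The sketch below is a Python analogy, not XSLT; the function name rows_to_html and the sample table are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape


def rows_to_html(table):
    """Explicitly select each <row>, then each <entry> inside it."""
    rows = []
    for row in table.findall("row"):            # like for-each select="row"
        cells = []
        for entry in row.findall("entry"):      # like for-each select="entry"
            cells.append("<td>" + escape(entry.text or "") + "</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return "".join(rows)


table = ET.fromstring(
    "<table><row><entry>1</entry><entry>2</entry></row>"
    "<row><entry>3</entry></row></table>"
)
print(rows_to_html(table))
```

Unlike the recursive model, the code decides which elements to visit; anything that is not a row or an entry is simply ignored.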
48
Conditional processing
Simple conditional (no "else")
<xsl:if>:
<xsl:if test="$somecondition">
  <xsl:text>this text only gets used if $somecondition is true()</xsl:text>
</xsl:if>
Select among alternatives with <xsl:when> and
<xsl:otherwise>
<xsl:choose>:
<xsl:choose>
  <xsl:when test="$count &gt; 2">
    <xsl:text>, and </xsl:text>
  </xsl:when>
  <xsl:when test="$count &gt; 1">
    <xsl:text> and </xsl:text>
  </xsl:when>
  <xsl:otherwise>
    <xsl:text> </xsl:text>
  </xsl:otherwise>
</xsl:choose>
49
Variables
Variables can be used to save computed values.
  • Variables are created with <xsl:variable>.
  • Variables are "single assignment" (no side
    effects)
  • Variables are lexically scoped

Once created, variables can be used to generate
content
<a href="{$file}">...</a>
And control conditional processing
<xsl:if test="$count = 3">...</xsl:if>
50
Creating the result tree
Literal Result Elements: any element in a
template rule that is not in the XSL (or
other extension) namespace is copied
literally to the result tree
<p>...</p>
XSL Elements: elements in the XSL namespace
<xsl:text>
<xsl:value-of>
<xsl:element>
<xsl:attribute>
...
51
Numbering and sorting
You can
  • Count source tree elements (chapters, list-items,
    stock quotes, etc.)
  • Convert between number formats (1, B, iii, ...)
  • Sort elements for presentation

52
Overall XSL formatting capabilities
XSL FO formatting capabilities in XSL 1.0 are
approximately the union of
  • HTML + CSS capabilities
  • most high quality print output capabilities
    including internationalization features

Not included are complex page layouts (e.g.,
magazine and newspaper layout), complex
layout-driven formatting (e.g., copy fitting and
complex floats), and loose leaf pagination
(change page production)
53
Formatting objects and properties
  • XSL = XSLT + a vocabulary of FOs and properties
  • XSL defines a powerful set of formatting objects
  • XSL uses (and extends) a set of Common Formatting
    Properties developed jointly with the CSSFP
    (Cascading Style Sheet and Formatting Property)
    Working Group
  • When a result tree uses this standardized set of
    formatting objects and properties, then an
    XSL-compliant formatter can process that result
    tree to produce the specified output

54
Formatting object basics
  • Inline versus block objects
  • Common formatting properties, harmonized with CSS
55
Common formatting objects
  • page-sequence--a major part (such as front or
    body) in which the basic page layout may differ
    from other parts
  • flow--a chapter- or section-like division within
    a page-sequence
  • block--a paragraph (or title or block quote,
    etc.)
  • inline--e.g., a font change within a paragraph
  • wrapper--a "transparent" object usable as either
    a block or an inline object that has no effect
    other than to provide a place to hang inheritable
    properties
  • list FOs--list-block, list-item, list-item-label,
    list-item-body
  • graphic--references an external graphic object
  • table FOs--mostly analogous to the standard
    (CALS, OASIS, HTML) table models

56
Basic properties
  • font properties
  • margin and spacing properties
  • border and padding properties
  • keeps/breaks
  • horizontal alignment/justification
  • indentation
  • more formatting object specific properties

57
Some application domains (1)
  • HR-XML (Human Resources XML) is a standard suite
    of XML specifications to enable e-business and
    the automation of human resources-related data
    exchanges
  • XHTML (eXtensible HTML) is a standard designed to
    help the transition from HTML to XML. It makes it
    possible to use XML processing tools, in
    particular to modify presentation depending on
    the target device (PDA, mobile phone...)
  • SVG (Scalable Vector Graphics) makes it possible
    to describe 2-dimensional graphics in XML. Its
    standardization is supported by Adobe, Microsoft,
    and others
  • SMIL (Synchronized Multimedia Integration
    Language) is an XML-based language for describing
    synchronized multimedia presentations

58
Some application domains (2)
  • MathML (Mathematical Markup Language) is a
    language for normalized scientific content. It
    makes it possible to represent complex
    mathematical expressions for display on the
    Internet
  • DHTML (Dynamic HTML) is a set of techniques for
    creating HTML that can change even after a page
    has been loaded into a browser
  • PPML (Personalized Print Markup Language) is an
    XML-based language for variable-data printing. It
    was developed by the Digital Printing Initiative
    (PODi)
  • 3DML, HumanML, Artificial Intelligence ML ...

59
In short, XML is...
  • a powerful tool for
  • data representation,
  • storage,
  • modelling,
  • and interoperation

61
Small XML example code
<?xml version="1.0" encoding="ISO-8859-1"?>
<article>
  <titre> Un journaliste accuse, un policier dément </titre>
  <auteur> Alain Connu </auteur>
  <date> 14 juin 1972 </date>
  <lieu> banquise </lieu>
  <texte> Un journaliste de la place accuse les autorités ... </texte>
</article>
62
Short introduction to XML
  • An XML document is well-formed if it satisfies
    certain constraints
  • all tags with non-empty content must be closed
  • tags with no content must end with "/>"
  • attribute values must be enclosed in quotes
  • An XML document is valid with respect to a DTD
    if it follows the rules expressed by the DTD
  • DTD: a set of rules stating which sequences and
    nestings of tags are allowed

<!ELEMENT UL (LI)+>
<!ELEMENT LI (#PCDATA | u | it | b)*>
63
Modelling information structure in XML

64
Small introduction to Markup Languages (XML,
HTML)
From the course of Bertrand Ibrahim, Geneva
University
  • XML makes it possible to structure information
  • XML makes it possible to automate the processing
    of structured documents and formatted data
  • XML is a generalization of HTML where, instead of
    using a set of predefined tags with predefined
    meanings, authors can "invent" their own tags