Appendix A: XML and XML Schema

1
  • Appendix A: XML and XML Schema

Service-Oriented Computing: Semantics, Processes,
Agents. Munindar P. Singh and Michael N. Huhns,
Wiley, 2005
2
Highlights of this Chapter
  • XML and Vocabularies
  • Well-Formedness
  • Namespaces and Qualified Names
  • XML Extensions
  • XML Schema
  • XML Query Languages
  • XPath
  • XSLT
  • Limitations

3
Brief Introduction to XML
  • Basics
  • Parsing
  • Storage
  • Transformations

4
Markup History
  • None
  • Ad hoc tags
  • SGML (Standard Generalized Markup Language): complex,
    few reliable tools
  • HTML (HyperText Markup Language): simple, unprincipled,
    mixes structure and display
  • XML (eXtensible Markup Language): simple yet extensible
    subset of SGML to capture new vocabularies
  • Machine processible
  • Comprehensible to people: easier debugging

5
XML Basics and Namespaces
  • <?xml version="1.0"?> <!-- not part of the document per se -->
  • <arbitrary:toptag xmlns="http://one.default.namespace/if-needed"
  •     xmlns:arbitrary="http://wherever.it.might.be/arbit-ns"
  •     xmlns:random="http://another.one/random-ns">
  •   <arbitrary:atag attr1="v1" attr2="v2">
  •     Optional text also known as PCDATA
  •     <arbitrary:btag attr1="v1" attr2="v2" />
  •   </arbitrary:atag>
  •   <random:simple_tag/>
  •   <random:atag attr3="v3"/> <!-- compare with arbitrary:atag above -->
  • </arbitrary:toptag>
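
To see how prefixes resolve to namespaces, the following minimal Java sketch parses a shortened version of the sample document with the JDK's namespace-aware DOM parser. The class and method names (QualifiedNameDemo, expandedName) are hypothetical; the namespace URIs are the slide's placeholders.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class QualifiedNameDemo {
    // Shortened version of the slide's document; URIs are placeholders.
    static final String XML =
        "<arbitrary:toptag xmlns='http://one.default.namespace/if-needed'"
      + " xmlns:arbitrary='http://wherever.it.might.be/arbit-ns'"
      + " xmlns:random='http://another.one/random-ns'>"
      + "<arbitrary:atag attr1='v1' attr2='v2'>Optional text</arbitrary:atag>"
      + "<random:atag attr3='v3'/>"
      + "</arbitrary:toptag>";

    /** Returns "{namespace-URI}localName" for the i-th element with that local name. */
    static String expandedName(String localName, int i) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // JAXP parsers ignore namespaces unless asked
            Document d = f.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
            // "*" matches any namespace, so both atag elements are found.
            Element e = (Element) d.getElementsByTagNameNS("*", localName).item(i);
            return "{" + e.getNamespaceURI() + "}" + e.getLocalName();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(expandedName("atag", 0));
        System.out.println(expandedName("atag", 1));
    }
}
```

The two atag elements share a local name but live in different namespaces, which is the comparison the slide's comment points at.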

6
Parsing and Validating
  • An XML document maps to a parse tree.
  • Each tag ends exactly once, giving a nesting structure (one root)
  • Each attribute occurs at most once, with a quoted string as its value
  • Well-formed XML documents can be parsed
  • Applications have an explicit or implicit syntax
    for their particular XML-based tags
  • If explicit, may be expressed in DTDs and XML
    Schemas
  • Best referred to definitions elsewhere
  • XML Schemas, expressed in XML, are superior to
    DTDs
  • When docs are produced by external components,
    they should be validated
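
The well-formedness rules above can be checked by simply attempting a parse. This is a minimal sketch using the JDK's DOM parser; WellFormedCheck and isWellFormed are hypothetical names, and the check covers well-formedness only, not schema validity.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedCheck {
    /** True if the text parses as XML, i.e. it is well formed. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            b.setErrorHandler(new DefaultHandler());
            b.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException notWellFormed) {
            return false;
        } catch (Exception other) {
            throw new IllegalStateException(other);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));      // properly nested
        System.out.println(isWellFormed("<a><b></a>"));       // b is never closed
        System.out.println(isWellFormed("<a x='1' x='2'/>")); // attribute repeated
    }
}
```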

7
XML Schema
  • A data definition language for XML: defines a
    notion of schema validity
  • Same syntax as regular XML documents
  • Local scoping of subelement names
  • Incorporates namespaces
  • Types
  • Primitive (built-in): string, integer, float,
    date, ...
  • Primitive (built-in): ID (key), IDREF (foreign
    key)
  • simpleType constructors: list, union
  • Restrictions: intervals, lengths, enumerations,
    regex patterns, ...
  • Flexible ordering of elements
  • Key and referential integrity constraints
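
As an illustration of a simpleType restriction, the sketch below validates instances against a tiny schema that limits an integer to an interval. The element name rating and the bounds 1 to 5 are assumptions made up for this example, as are the class and method names; the validation calls are the JDK's javax.xml.validation API.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class RestrictionDemo {
    // Hypothetical schema: <rating> must be an integer in the interval [1, 5].
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='rating'>"
      + "<xsd:simpleType>"
      + "<xsd:restriction base='xsd:integer'>"
      + "<xsd:minInclusive value='1'/><xsd:maxInclusive value='5'/>"
      + "</xsd:restriction>"
      + "</xsd:simpleType>"
      + "</xsd:element>"
      + "</xsd:schema>";

    /** True if the instance document is schema-valid against XSD. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            try {
                schema.newValidator().validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalidInstance) {
                return false;
            }
        } catch (Exception e) { // a broken schema or an I/O problem
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<rating>3</rating>"));
        System.out.println(isValid("<rating>9</rating>"));
    }
}
```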

8
XML Schema complexType
  • Specifies types of elements with structure
  • Must use a compositor if there are ≥ 1 subelements
  • Subelements with types
  • Min and max occurrences (default 1) of
    subelements
  • Elements with text content: not easy; ignored here
  • EMPTY elements: easy. Example?

9
XML Schema Compositors
  • Sequence: ordered
  • Can occur within other compositors
  • Allows varying min and max occurrence
  • All: unordered
  • Must occur directly below the element's complexType, not within other compositors
  • Max occurrence of each element is 1
  • Choice: exclusive or
  • Can occur within other compositors
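
A small sketch of nesting compositors: a sequence that contains a choice. The Song and group names echo the XPath examples in this appendix, while title and singer are hypothetical, as are the class and method names.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class CompositorDemo {
    // A Song is a title followed by exactly one of group or singer.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='Song'><xsd:complexType>"
      + "<xsd:sequence>"
      + "<xsd:element name='title' type='xsd:string'/>"
      + "<xsd:choice>"
      + "<xsd:element name='group' type='xsd:string'/>"
      + "<xsd:element name='singer' type='xsd:string'/>"
      + "</xsd:choice>"
      + "</xsd:sequence>"
      + "</xsd:complexType></xsd:element>"
      + "</xsd:schema>";

    /** True if the instance document is schema-valid against XSD. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            try {
                schema.newValidator().validate(new StreamSource(new StringReader(xml)));
                return true;
            } catch (SAXException invalidInstance) {
                return false;
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Order matters in a sequence; a choice admits only one alternative.
        System.out.println(isValid("<Song><title>t</title><group>g</group></Song>"));
        System.out.println(isValid("<Song><group>g</group><title>t</title></Song>"));
    }
}
```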

10
XML Schema Key Namespaces
  • http://www.w3.org/2001/XMLSchema
  • Conventional prefix: xsd
  • Terms for defining schemas: schema, element,
    attribute, ...
  • The tag schema has an attribute targetNamespace
  • http://www.w3.org/2001/XMLSchema-instance
  • Conventional prefix: xsi
  • Terms for use in instances: schemaLocation, nil
  • targetNamespace: user-defined

11
XML Schema Instance Doc
  • <Music xmlns="http://a.b.c/Muse"
  •     xmlns:xsi="the standard-xsi"
  •     xsi:schemaLocation="a-schema-as-a-URI
    a-schema-location-as-a-URL">
  • </Music>
  • Define null values as <aTag xsi:nil="true"/>
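
A consumer of an instance document can detect such null values by looking up the nil attribute in the xsi namespace. The sketch below does this with a namespace-aware DOM parse; NilDemo, isNil, and the bTag element are hypothetical, and the sketch reads the attribute directly without validating against a schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class NilDemo {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    // Instance in the slide's Music namespace; bTag is added for contrast.
    static final String XML =
        "<Music xmlns='http://a.b.c/Muse' xmlns:xsi='" + XSI + "'>"
      + "<aTag xsi:nil='true'/><bTag/>"
      + "</Music>";

    /** True if the first element with this local name carries xsi:nil set to true. */
    static boolean isNil(String localName) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // required for getAttributeNS to work
            Document d = f.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(XML)));
            Element e = (Element) d.getElementsByTagNameNS("*", localName).item(0);
            String v = e.getAttributeNS(XSI, "nil"); // "" when the attribute is absent
            return v.equals("true") || v.equals("1"); // both spell boolean true
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(isNil("aTag"));
        System.out.println(isNil("bTag"));
    }
}
```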

12
Creating Schema Docs 1
  • <schema xmlns="the-standard-xsd"
  •     targetNamespace="the-target">
  •   <include schemaLocation="part-one.xsd"/>
  •   <include schemaLocation="part-two.xsd"/>
  •   <!-- schemaLocation as in xsd, not xsi -->
  • </schema>
  • Included schemas go into the same namespace as the including
    schema.

13
Creating Schema Docs 2
  • Use imports instead of include
  • Specify namespaces from which schemas are to be
    imported
  • Location of schemas not required and may be
    ignored if provided

14
Document Object Model (DOM)
  • Basis for parsing XML, which provides a
    node-labeled tree in its API
  • Conceptually simple: traverse by requesting a tag,
    its attribute values, and its children
  • Processing program reflects document structure
  • Can edit documents
  • Inefficient for large documents: parses them
    entirely first to build the tree, even if only a
    tiny part is needed
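
Editing through the DOM can be sketched as follows: parse, add a node to the in-memory tree, then read the tree back. The member element and its code attribute follow the DOM example in this appendix; the team root element and the names DomEditDemo and codesAfterAdding are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomEditDemo {
    /** Appends <member code="newCode"/> to the root, then lists all member codes. */
    static String codesAfterAdding(String xml, String newCode) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Edit the tree: create a node and attach it under the root element.
            Element added = d.createElement("member");
            added.setAttribute("code", newCode);
            d.getDocumentElement().appendChild(added);
            // Traverse the edited tree in document order.
            NodeList members = d.getElementsByTagName("member");
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < members.getLength(); i++) {
                if (i > 0) sb.append(',');
                sb.append(((Element) members.item(i)).getAttribute("code"));
            }
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(codesAfterAdding("<team><member code='7'/></team>", "8"));
    }
}
```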

15
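DOM Example (Simeoni 2003)
  • // d is an org.w3c.dom.Document; needs import org.w3c.dom.*
  • Element s = d.getDocumentElement();
  • NodeList l = s.getElementsByTagName("member");
  • Element m = (Element) l.item(0);
  • // getAttribute returns a String, so convert it explicitly
  • int code = Integer.parseInt(m.getAttribute("code"));
  • NodeList kids = m.getChildNodes();
  • Node kid = kids.item(0);
  • // the cast assumes the first child is an element, not whitespace text
  • String tagName = ((Element) kid).getTagName();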

16
Simple API for XML (SAX)
  • Parser generates a sequence of events
  • startElement, endElement, ...
  • Programmer implements these as callbacks
  • More control for the programmer
  • Processing program does not reflect document
    structure
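
The event stream can be made visible with a handler that only logs its callbacks. This is a minimal sketch using the JDK's SAX parser; SaxEventLog and events are hypothetical names, and whitespace-only text is dropped to keep the log short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxEventLog {
    /** Parses the text and returns the callbacks in the order the parser fired them. */
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes a) {
                log.add("start " + qName);
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                log.add("end " + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver one text run in several chunks.
                String s = new String(ch, start, length).trim();
                if (!s.isEmpty()) log.add("text " + s);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(events("<project><member>Ann</member></project>"));
    }
}
```

The handler never sees a tree, so any structure the application needs (such as "am I inside a project?") has to be tracked in its own fields.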

17
SAX Example (Simeoni 2003)
  • // needs org.xml.sax.Attributes and org.xml.sax.helpers.DefaultHandler;
  • // the fields code, name, inProject, and buffer are not shown here
  • class MemberProcess extends DefaultHandler {
  •   public void startElement(String uri, String n,
        String qName, Attributes attrs) {
  •     if (n.equals("member")) code = attrs.getValue("code");
  •     if (n.equals("project")) inProject = true;
  •     buffer.reset();
  •   }
  •   public void endElement(String uri, String n,
        String qName) {
  •     if (n.equals("project")) inProject = false;
  •     if (n.equals("member") && !inProject)
  •       name = buffer.toString().trim();
  •   }
  • }

18
Programming with XML
  • Current approaches concentrate on structure but
    ignore meaning
  • Difficult to construct and maintain
  • Treat everything as a string
  • Inadequate type checking can hide errors
  • Emerging approaches (e.g., JAXB) provide superior
    binding from XML to programming languages
  • Primitives such as unmarshal to materialize an
    object from XML

19
Uses of XML
  • Exchanging information across software components
  • Storing information in nonproprietary format
  • XML documents represent structured descriptions
  • Products, services, catalogs
  • Contracts
  • Queries, requests, invocations (as in SOAP)
  • Data-centric versus document-centric (irregular,
    heterogeneous data, depend on entire doc for
    app-specific meaning) views

20
Data-Centric View
  • <relation>
  •   <tuple><attr1>V11</attr1> ... <attrn>V1n</attrn></tuple>
  •   <tuple><attr1>Vm1</attr1> ... <attrn>Vmn</attrn></tuple>
  • </relation>
  • Extract and store into DB via mapping to DB model
  • Regular, homogeneous tags
  • May be expensive if repeatedly parsed and
    instantiated
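
The mapping from regular, homogeneous tags to a database-style model can be sketched by turning each tuple into a row keyed by attribute name. RelationLoader and rows are hypothetical names, and the in-memory list of maps merely stands in for whatever DB model is actually used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RelationLoader {
    /** Maps every <tuple> to one row: child tag name to its text content. */
    static List<Map<String, String>> rows(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList tuples = d.getElementsByTagName("tuple");
            for (int i = 0; i < tuples.getLength(); i++) {
                Map<String, String> row = new LinkedHashMap<>(); // keeps column order
                NodeList cells = tuples.item(i).getChildNodes();
                for (int j = 0; j < cells.getLength(); j++) {
                    Node c = cells.item(j);
                    if (c instanceof Element) { // skip whitespace text between tags
                        row.put(c.getNodeName(), c.getTextContent());
                    }
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rows(
            "<relation>"
          + "<tuple><attr1>a</attr1><attr2>b</attr2></tuple>"
          + "<tuple><attr1>c</attr1><attr2>d</attr2></tuple>"
          + "</relation>"));
    }
}
```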

21
Document-Centric View
  • Storing docs in DBs
  • Use character large objects (clobs) within DB
  • Store paths to external files containing docs
  • Combine with some structured elements with search
    conditions for both structured elements and
    unstructured clobs or files
  • Heterogeneity also complicates mappings to
    traditional typed OO programming languages

22
Directions
  • Limitations of XML
  • Doesn't represent meaning
  • Enables multiple representations for the same
    information: transform if models are known
  • Trends: sophisticated approaches for
  • Querying and manipulating XML, e.g., XSLT
  • Binding to PLs and DBs
  • Semantics, e.g., RDF, DAML, OWL, ...

23
XML Query Languages
  • XPath
  • XPointer
  • XSLT
  • XQuery

24
XPath
  • Model XML documents as trees with nodes
  • Elements
  • Attributes
  • Text (PCDATA)
  • Comments
  • Root node above root of document

25
Achtung!
  • Parent in XPath is like parent as traditionally
    in computer science
  • Child in XPath is confusing
  • An attribute is not the child of its parent
  • Makes a difference for certain kinds of recursion
    (e.g., apply-templates discussed in XSLT)
  • Our terminology is based on the traditional
    terminology
  • e-children (elements), a-children (attributes), t-children (text)
  • Sets via combined prefixes: et-, ta-, etc.

26
XPath Paths
  • Leading / means the root
  • / indicates walking down a tree
  • . is the current node
  • .. is the parent node
  • @attr accesses the value of the given attribute
  • text()
  • comment()
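
To illustrate relative steps, this sketch evaluates path expressions with a group element as the current node, using the JDK's javax.xml.xpath API. The document content (Trio, the live comment) and the names XPathContextDemo and fromGroup are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XPathContextDemo {
    // Hypothetical sample data.
    static final String XML =
        "<Music><Song genre='jazz'><!--live--><group>Trio</group></Song></Music>";

    /** Evaluates expr with the group element as the current node. */
    static String fromGroup(String expr) {
        try {
            XPath xp = XPathFactory.newInstance().newXPath();
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            // An absolute path (leading /) picks the context node.
            Node group = (Node) xp.evaluate("/Music/Song/group", d, XPathConstants.NODE);
            // The result is converted to a string by XPath's string-value rules.
            return xp.evaluate(expr, group);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromGroup("."));            // current node
        System.out.println(fromGroup("../@genre"));    // attribute of the parent
        System.out.println(fromGroup("../comment()")); // comment sibling, via the parent
    }
}
```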

27
XPath Navigation
  • Select children according to position, e.g., [j],
    where j could be 1 ... last()
  • Descendant-or-self operator, //
  • .//elem finds all elems under the current node
  • //elem finds all elems in the document
  • Ancestors: not needed in this course
  • Wildcard, *
  • * collects e-children of the node where it is
    applied, but omits the t-children
  • @* finds all attribute values
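
The navigation operators can be tried out against a small document. In this sketch the three Song elements and their titles are hypothetical data, and XPathNavigationDemo and eval are hypothetical names; results are returned as strings.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathNavigationDemo {
    // Hypothetical sample data: three songs, two with a genre attribute.
    static final String XML =
        "<Music>"
      + "<Song genre='jazz'><title>A</title><group>Quartet</group></Song>"
      + "<Song genre='rock'><title>B</title></Song>"
      + "<Song><title>C</title></Song>"
      + "</Music>";

    /** Evaluates expr from the document root and returns the result as a string. */
    static String eval(String expr) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("/Music/Song[1]/title"));      // first Song child
        System.out.println(eval("/Music/Song[last()]/title")); // last Song child
        System.out.println(eval("count(//Song)"));             // all Songs in the document
        System.out.println(eval("count(/Music/Song[1]/*)"));   // e-children of the first Song
        System.out.println(eval("count(//@*)"));               // all attributes
    }
}
```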

28
XPath Queries
  • Incorporate selection conditions in XPath
  • Attributes: //Song[@genre="jazz"]
  • Elements: //Song[starts-with(.//group, "Led")]
  • Existence of attribute: //Song[@genre]
  • Existence of subelement: //Song[group]
  • Boolean operators: and, not, or
  • Set operator: union (|); no others
  • Arithmetic and comparison operators: >, <, ...
  • String functions: contains(), concat(), string-length(), ...
  • Aggregates: sum(), count()
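
The selection conditions listed above can be run as written. In this sketch the sample songs are hypothetical data chosen so that each predicate selects something different; XPathQueryDemo and eval are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathQueryDemo {
    // Hypothetical sample data.
    static final String XML =
        "<Music>"
      + "<Song genre='jazz'><title>A</title><group>Quartet</group></Song>"
      + "<Song genre='rock'><title>B</title><group>Led Zeppelin</group></Song>"
      + "<Song><title>C</title></Song>"
      + "</Music>";

    /** Evaluates expr from the document root and returns the result as a string. */
    static String eval(String expr) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("//Song[@genre='jazz']/title"));
        System.out.println(eval("//Song[starts-with(.//group, 'Led')]/title"));
        System.out.println(eval("count(//Song[@genre])")); // attribute exists
        System.out.println(eval("count(//Song[group])"));  // subelement exists
        // Union of the jazz songs and the songs without a group.
        System.out.println(eval("count(//Song[@genre='jazz'] | //Song[not(group)])"));
    }
}
```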

29
XPointer
  • Combines XPath with URLs
  • URL to get to a document; XPath to walk down the
    document
  • Can be used to formulate queries, e.g.,
  • Song-URL#xpointer(//Song[@genre="jazz"])

30
XSLT
  • A functional programming language
  • A stylesheet specifies transformations on a
    document
  • <?xml version="1.0"?>
  • <?xml-stylesheet type="text/xsl"
  •     href="URL-to-dot-xsl"?> <!-- the sheet to use -->
  • <main-tag>
  • </main-tag>

31
XSLT Stylesheets
  • Use the XSLT namespace, conventionally
    abbreviated as xsl. Includes primitives:
  • copy-of
  • <for-each select="...">
  • <if test="...">
  • <choose>
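
A stylesheet using for-each and if can be run with the JDK's built-in XSLT processor. In this sketch the stylesheet, the sample input, and the names ForEachDemo and transform are assumptions for illustration; value-of is a standard XSLT instruction beyond the primitives listed above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ForEachDemo {
    // Emits "genre:title;" per Song, omitting "genre:" when there is no genre.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='//Song'>"
      + "<xsl:if test='@genre'><xsl:value-of select='@genre'/>:</xsl:if>"
      + "<xsl:value-of select='title'/>;"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies XSL to the given document and returns the text output. */
    static String transform(String xml) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)))
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
            "<Music><Song genre='jazz'><title>A</title></Song>"
          + "<Song><title>B</title></Song></Music>"));
    }
}
```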

32
XSLT Templates 1
  • A pattern to specify where a given transform
    should apply
  • This match only works on the root
  • <xsl:template match="/">
  • </xsl:template>
  • Only anonymous templates in this course

33
XSLT Templates 2
  • Can be applied recursively on the et-children via
  • <xsl:apply-templates/>
  • By default, if no other template matches,
    recursively apply to et-children of current node
    (ignores attributes) and to root
  • <xsl:template match="*|/">
  •   <xsl:apply-templates/>
  • </xsl:template>
  • Can over-apply; to override the default, may need
    an empty template
  • <xsl:template match="..."/> <!-- e.g., match all text() -->
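
The effect of the default rules, and of overriding them with an empty template, can be observed by running the same input through two stylesheets. The stylesheets, the note element, and the names TemplateDemo, DEFAULTS, OVERRIDE, and transform are hypothetical; the processor is the one bundled with the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TemplateDemo {
    static final String HEAD =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='Song'>[<xsl:value-of select='title'/>]</xsl:template>";

    // Only Song has a template; everything else falls to the default rules.
    static final String DEFAULTS = HEAD + "</xsl:stylesheet>";

    // Same, plus an empty template that swallows text the defaults would copy.
    static final String OVERRIDE =
        HEAD + "<xsl:template match='text()'/>" + "</xsl:stylesheet>";

    static final String XML =
        "<Music><note>hi</note><Song><title>A</title></Song></Music>";

    /** Applies the given stylesheet to XML and returns the text output. */
    static String transform(String xsl) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)))
                .transform(new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(DEFAULTS)); // default rules copy the note's text
        System.out.println(transform(OVERRIDE)); // empty template suppresses it
    }
}
```

With DEFAULTS, the recursion reaches Song through Music without any explicit apply-templates, and the text of note leaks into the output; OVERRIDE removes that leak.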

34
XSLT Templates 3
  • Subtleties of XSLT matching are beyond our scope
  • Discuss some examples

35
Appendix A Summary
  • XML enables information sharing
  • XML is well established
  • Several aspects are worked out
  • Lots of tools
  • Works with databases and programming languages
  • XML provides a useful substrate for
    service-oriented computing