W3C XML Schema: what you might not know (and might or might not like!) - PowerPoint PPT Presentation

Slides: 40
Provided by: noahmen

Transcript and Presenter's Notes


1
W3C XML Schema: what you might not know (and might or might not like!)
  • Noah Mendelsohn
  • Distinguished Engineer
  • IBM Corp.
  • October 10, 2002

2
Topics
  • Quick review of XML concepts
  • Why XML Schema?
  • What is XML Schema?
  • Where do schemas come from?
  • A few validation tricks
  • Wrapup

3
Warning! To save screen space, some examples are simplified. Namespace
declarations are omitted, only the key parts of schema declarations are
shown, etc.
4
Quick review of XML concepts
5
This is an XML document
<?xml version="1.0"?>
<e1>
  <e2>
    <e3 a1="123" />
  </e2>
</e1>

6
Infoset: the XML data model
<?xml version="1.0"?>
<e1>
  <e2>
    <e3 a1="123" />
  </e2>
</e1>

7
More on XML infosets
  • XML 1.0 describes only documents with angle
    bracket syntax (<>)
  • Infosets also describe DOM, SAX, and other
    representations
  • XML Schema validates infosets, so it applies to
    all of the representations
  • XML Schema can validate from any element
    information item (e.g. e1 or e2)
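
The idea that an infoset is independent of the angle-bracket syntax can be sketched in Python. This is only an illustration: it assumes the standard library's xml.etree.ElementTree as a stand-in for an infoset API (it exposes just part of the infoset), and the helper name summarize is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: ElementTree stands in for an infoset API; it covers only
# element and attribute information items, which is enough for this sketch.

def summarize(elem):
    """Reduce an element to (name, attributes, children): a rough infoset view."""
    return (elem.tag, dict(elem.attrib), [summarize(child) for child in elem])

# Representation 1: parse the angle-bracket syntax of the review document.
parsed = ET.fromstring('<e1><e2><e3 a1="123" /></e2></e1>')

# Representation 2: build the same items directly in memory, no text involved.
built = ET.Element("e1")
e2 = ET.SubElement(built, "e2")
ET.SubElement(e2, "e3", {"a1": "123"})

# Both routes yield the same element and attribute information items.
same = summarize(parsed) == summarize(built)

# Processing can also start from any element information item, e.g. e2.
from_e2 = summarize(parsed.find("e2"))
```

A schema processor that works on infosets could accept either tree, which is why validation is not tied to one serialization.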

8
Why XML Schema?
9
What are schemas for?
  • Contracts: agreeing on formats
  • Tool building: know what the data will be before
    the first instance shows up
  • Database integration
  • User interface tools
  • Programming language bindings
  • Validation: make sure we got what we expected

10
What is XML Schema?
11
This is an XML document
<?xml version="1.0"?>
<myns:e1 xmlns:myns="http://example.org/myns"
         xmlns:yourns="http://example.org/yourns">
  <myns:e2>
    <yourns:e1 a1="xyz"/>
    <myns:e3 a1="123" myns:a1="456"/>
    <yourns:e1 myns:a1="456"/>
  </myns:e2>
  <yourns:e4/>
</myns:e1>
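
As a rough illustration of what the prefixes in this document mean, the following Python sketch parses it with the standard library and lists the expanded (namespace, local name) pairs. The use of xml.etree.ElementTree and its "{uri}local" notation is an assumption of this sketch, not part of the slides.

```python
import xml.etree.ElementTree as ET

doc = '''<myns:e1 xmlns:myns="http://example.org/myns"
                  xmlns:yourns="http://example.org/yourns">
  <myns:e2>
    <yourns:e1 a1="xyz"/>
    <myns:e3 a1="123" myns:a1="456"/>
    <yourns:e1 myns:a1="456"/>
  </myns:e2>
  <yourns:e4/>
</myns:e1>'''

root = ET.fromstring(doc)

# ElementTree reports each name as "{namespace-uri}local-name", in document order.
names = [elem.tag for elem in root.iter()]

# Unprefixed a1 and myns:a1 are two distinct attributes on the same element.
MYNS = "{http://example.org/myns}"
e3 = root.find(f"{MYNS}e2/{MYNS}e3")
e3_attrs = dict(e3.attrib)
```

Note that myns:e1 and yourns:e1 share a local name but are different elements, which is why validating this document involves declarations from two namespaces.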

12
This is an XML schema
<xsd:schema targetNamespace="http://example.org/myns"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            ..namespaces omitted to protect innocent..>
  <!-- declare element e1 -->
  <xsd:element name="e1">
    <xsd:sequence>
      <xsd:element name="e2"/>
      <xsd:element ref="yourns:e4"/>
    </xsd:sequence>
  </xsd:element>
</xsd:schema>

13
This is an XML schema document
<xsd:schema targetNamespace="http://example.org/myns"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            ..namespaces omitted to protect innocent..>
  <!-- declare element e1 -->
  <xsd:element name="e1">
    <xsd:sequence>
      <xsd:element name="myns:e2"/>
      <xsd:element ref="yourns:e4"/>
    </xsd:sequence>
  </xsd:element>
</xsd:schema>

14
This is an XML document
<?xml version="1.0"?>
<myns:e1 xmlns:myns="http://example.org/myns"
         xmlns:yourns="http://example.org/yourns">
  <myns:e2>
    <yourns:e1 a1="xyz"/>
    <myns:e3 a1="123" myns:a1="456"/>
    <yourns:e1 myns:a1="456"/>
  </myns:e2>
  <yourns:e4/>
</myns:e1>

To validate this, we need more than one schema document
15
Import brings in declarations for other namespaces
<xsd:schema targetNamespace="http://example.org/myns.xsd"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            ..namespace omitted to protect innocent..>
  <xsd:import namespace="http://example.org/yourns"
              schemaLocation="http://example.org/yourns.xsd"/>
  <xsd:element name="e1">
    <xsd:sequence>
      <xsd:element name="myns:e2"/>
      <xsd:element ref="yourns:e4"/>
    </xsd:sequence>
  </xsd:element>
  <!-- declare element e2 -->
  <xsd:element name="e2" type="..."/>
</xsd:schema>

16
Terminology
17
Cool tricks with components
  • In memory schemas
  • Handy tools for working with schemas
  • Build the components for you
  • Resolve subtyping across namespaces, etc.
  • Examples
  • http://www.eclipse.org/xsd
  • Henry Thompson's XSV
  • Conformance testing

18
How to read the spec.
  • 3.3 Element Declarations
  • 3.3.1 The Element Declaration Schema Component
  • 3.3.2 XML Representation of Element Declaration
    Schema Components
  • 3.3.3 Constraints on XML Representations of
    Element Declarations
  • 3.3.4 Element Declaration Validation Rules
  • 3.3.5 Element Declaration Information Set
    Contributions
  • 3.3.6 Constraints on Element Declaration Schema
    Components
  • Warning: the spec. never gives any rule twice!

19
Post-schema validation infoset (PSVI)
  • Fearsome title, simple concept
  • Infoset: the data model for an XML
    document; tells you what you can know (that
    matters) after a parse.
  • PSVI: tells you what you can know after a
    validation
  • What parts of doc are valid?
  • Per which types?
  • Default values
  • Etc.

20
Self-describing vs. schema-described docs
  • You can use xsi:type in your documents: <e
    xsi:type="xsd:integer">123</e>
  • Use xsi:type with built-ins (and no attributes)
  • Your document is nearly self-describing
  • SOAP encoding supports this
  • xsi:type with your own types
  • Partially self-describing
  • You know the type names; need schema to know
    what the types are
  • SOAP 1.2 Encoding supports this too!
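
A minimal Python sketch of reading xsi:type from an instance follows. It assumes the standard library's xml.etree.ElementTree; the function name declared_types is hypothetical, and the sketch only resolves the type name to a (namespace, local name) pair without looking up any schema.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

def declared_types(xml_text):
    """Return (element name, (type namespace, type local name)) per xsi:type."""
    bindings = []   # stack of (prefix, uri) pairs currently in scope
    results = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "start", "end-ns")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            bindings.append(payload)      # payload is a (prefix, uri) pair
        elif event == "end-ns":
            bindings.pop()                # a binding went out of scope
        else:                             # "start": payload is the element
            qname = payload.get(XSI_TYPE)
            if qname is None:
                continue
            prefix, _, local = qname.rpartition(":")
            # The attribute value is a QName, so its prefix must be resolved
            # against the bindings in scope, innermost first.
            uri = next((u for p, u in reversed(bindings) if p == prefix), None)
            results.append((payload.tag, (uri, local)))
    return results

doc = ('<e xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
       'xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
       'xsi:type="xsd:integer">123</e>')
found = declared_types(doc)
```

With a built-in type the resolved name is enough to interpret the content; with a user-defined type the same code yields only a name, and a schema is still needed to learn what that type means.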

21
Where do schemas come from?
22
How are schema components found?
  • In short, wherever you want!
  • Hint from schema
  • <xsd:import namespace="..." schemaLocation="yyy.xsd"/>
  • Hint from instance
  • <myns:e1 xsi:schemaLocation="myNSUri yyy.xsd"/>
  • Processor command line or config
  • Compiled into application (validating HTML editor)
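
The sources listed above can be sketched as a small lookup in Python. This is a hedged illustration only: instance_hints and choose_schema are hypothetical names, the file paths are made up, and the "configuration wins over instance hint" policy is just one choice an application might make.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def instance_hints(root):
    """Map namespace URI -> schema document location from xsi:schemaLocation."""
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    # The value is a whitespace-separated list of (namespace, location) pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))

def choose_schema(namespace, configured, hints):
    """Hypothetical policy: application config wins, the instance hint is a fallback."""
    return configured.get(namespace) or hints.get(namespace)

root = ET.fromstring(
    '<myns:e1 xmlns:myns="http://example.org/myns" '
    f'xmlns:xsi="{XSI}" '
    'xsi:schemaLocation="http://example.org/myns yyy.xsd"/>')

hints = instance_hints(root)
trusted = {"http://example.org/myns": "local/myns.xsd"}   # made-up path

picked = choose_schema("http://example.org/myns", trusted, hints)
fallback = choose_schema("http://example.org/myns", {}, hints)
```

Because schemaLocation is only a hint, the application remains free to ignore it, which is the design choice this sketch makes whenever a configured schema exists.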

23
Why all this flexibility?
  • More than 1 schema per namespace (versions, bug
    fixes, experiments, etc.)
  • Who gets control?
  • Docheads want to name schema in instance
  • eCommerce: do you trust the schema named in a
    purchaseOrder?
  • Ultimately the application chooses
  • Synthetic: DB builds it dynamically

24
Streaming
  • Most validation can be done in 1 pass
  • Id/idref, key/keyref require limited lookaside
  • Problem
    <myinstance>
      10Mbytes of data here
      <!-- oops..need a new schema! -->
      <newns:a schemaLoc="newnsUri xxx">
      lots more data
    </myinstance>
  • Answer
  • Assemble schema incrementally or in advance
  • Result must be the same: can't tell which from
    the outside!
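
The streaming problem can be made concrete with a short Python sketch. It does not validate anything; it only shows, in one pass, how late in the stream a schemaLocation hint can turn up. It assumes the standard library's iterparse, spells the attribute out as xsi:schemaLocation, and uses the hypothetical function name hints_in_stream.

```python
import io
import xml.etree.ElementTree as ET

XSI_LOC = "{http://www.w3.org/2001/XMLSchema-instance}schemaLocation"

def hints_in_stream(xml_bytes):
    """Yield (elements seen so far, namespace, location) as hints appear."""
    seen = 0
    for _, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start",)):
        seen += 1
        tokens = elem.get(XSI_LOC, "").split()
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            yield seen, namespace, location

# Stand-in for "10Mbytes of data", then a hint for a namespace not seen before.
doc = (b'<myinstance xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
       b'<data/><data/>'
       b'<newns:a xmlns:newns="newnsUri" xsi:schemaLocation="newnsUri xxx"/>'
       b'<data/></myinstance>')

late_hints = list(hints_in_stream(doc))
```

Here the hint arrives at the fourth element, after earlier content has already been consumed, so a one-pass processor either adds components at that point or has them all before it starts.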

25
Our language vs. your language: why <import>?
<xsd:schema targetNamespace="http://example.org/ns1"
            xmlns:ns1="http://example.org/ns1"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="X"/>
  <xsd:element ref="ns1:X"/>
</xsd:schema>

A fragment of a schema document
26
Our language vs. your language: why <import>?
<xsd:schema targetNamespace="http://example.org/ns1"
            xmlns:ns1="http://example.org/ns1"
            xmlns:ns2="http://example.org/ns2"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:import namespace="http://example.org/ns2"/>
  <xsd:element name="X"/>
  <xsd:element ref="ns1:X"/>
  <xsd:element ref="ns2:Y"/>
</xsd:schema>

Add a reference to an external element
27
Our language vs. your language: why <import>?
Unimported namespaces enhance the schema language.
Imported namespaces enhance your language.
<xsd:schema targetNamespace="http://example.org/ns1"
            xmlns:ns1="http://example.org/ns1"
            xmlns:ns2="http://example.org/ns2"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:xsd2="http://www.w3.org/2004/XMLSchema">
  <xsd:import namespace="http://example.org/ns2"/>
  <xsd:element name="X"/>
  <xsd:element ref="ns1:X"/>
  <xsd:element ref="ns2:Y"/>
  <xsd2:betterElement name="newone" />
</xsd:schema>

Enhance the schema language!
28
A few validation tricks: Modeling content
29
How to validate this?
<soap:Envelope>
  <soap:Body>
    your message here
  </soap:Body>
</soap:Envelope>
  • What is the content model for <soap:Body>?
  • Can you validate the contents?

30
Inside/out vocabularies (some specific SOAP examples)
<!-- SOAP PURCHASE ORDER -->
<soap:Envelope xmlns:soap="http://www.w3.org/2002/06/soap-envelope">
  <soap:Body>
    <po:purchaseOrder xmlns:po="http://example.org/po">
    </po:purchaseOrder>
  </soap:Body>
</soap:Envelope>
<!-- SOAP INVOICE -->
<soap:Envelope xmlns:soap="http://www.w3.org/2002/06/soap-envelope">
  <soap:Body>
    <inv:invoice xmlns:inv="http://example.org/inv">
    </inv:invoice>
  </soap:Body>
</soap:Envelope>
31
Schemas for envelopes
Putting "skip" here says: don't validate the content of the body
<xsd:complexType name="bodyType">
  ...
  <xsd:sequence>
    <xsd:any processContents="skip"/>
  </xsd:sequence>
  ...
</xsd:complexType>
32
Schemas for envelopes
Putting "strict" here says: you must have declarations and must
successfully validate the contents of body.
<xsd:complexType name="bodyType">
  ...
  <xsd:sequence>
    <xsd:any processContents="strict"/>
  </xsd:sequence>
  ...
</xsd:complexType>
33
Schemas for envelopes
Putting "lax" here says: validate only if your schema has declarations
for the contents
<xsd:complexType name="bodyType">
  ...
  <xsd:sequence>
    <xsd:any processContents="lax"/>
  </xsd:sequence>
  ...
</xsd:complexType>
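
The three processContents settings on the envelope slides can be summarized as a small decision function in Python. This is a simplified sketch, not a validator: wildcard_action is a hypothetical name, the returned strings are informal labels, and the role that xsi:type can play in place of a declaration is ignored.

```python
def wildcard_action(process_contents, has_declaration):
    """Say what happens to an element matched by xsd:any (simplified)."""
    if process_contents not in ("skip", "lax", "strict"):
        raise ValueError(f"unknown processContents value: {process_contents!r}")
    if process_contents == "skip":
        return "not validated"                 # contents are never checked
    if has_declaration:
        return "validated against its declaration"
    # No declaration is available for the matched element.
    return "not validated" if process_contents == "lax" else "invalid"

# Tabulate every combination of setting and declaration availability.
table = {
    (mode, declared): wildcard_action(mode, declared)
    for mode in ("skip", "lax", "strict")
    for declared in (True, False)
}
```

The only rows where lax and strict differ are those without a declaration, which is exactly the case of a SOAP body carrying a vocabulary the receiver has no schema for.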
34
Versioning vocabularies & schemas
  • It's hard!
  • Use namespaces?
  • Do 50 bug fixes give you 50 namespaces?
  • How much interop? Does old schema accept new
    version?
  • What about XPath?
  • For better or worse, Schemas has no organized
    model for versioning

35
Inheritance: why have it?
  • Allow reuse of definitions
  • Model real-world inheritance and polymorphism
  • Substitutability
  • Mappings to programming systems w/inheritance
  • Schemas provides mechanisms offering partial
    solutions to these problems

36
Refinement vs. Extension
  • Data inheritance is different from method
    inheritance
  • No active code: receiver sees everything;
    order matters, e.g. for multiple inheritance
  • Innovation(?) in schema
  • Restriction: subtype is a subset (supports
    substitutability)
  • Extension: subtype builds on base (supports
    modular development, some mappings to real world
    and programming languages.)
  • No multiple inheritance (for now)

37
Wrapup
38
Some things I learned
  • No such thing as a simple feature
  • Big committee -> big language
  • Documents & data together are cool
  • But neither community gets a simple schema
    language
  • Make realistic schedules: we didn't make time to
    pull features

39

Thank you!