Fundamental XML for Developers

1
Fundamental XML for Developers
  • Dr. Timothy M. Chester
  • Texas A&M University

2
Timothy M. Chester is. . .
  • Senior IT Manager, Texas A&M University
  • Application Development, Systems Integration,
    Developer Tools Training
  • Lecturer, Texas A&M College of Business
  • Courses on Business Programming Fundamentals
    (VB.NET, C#), XML & Advanced Web Development.
  • Author
  • Visual Studio Magazine, Dr. Dobb's Journal, IT
    Professional
  • Consultant
  • President & Principal, eInternet Studios
  • Contact Information
  • E-mail: tim-chester_at_tamu.edu
  • Web: http://tim-chester.tamu.edu

3
Texas A&M University
4
You Are. . .
  • Software Developers
  • New to XML, Object Oriented Development
  • Require basics of XML course
  • IT Managers
  • Need familiarity with XML basics and terminology
  • Interested in how XML can affect both software
    development and legacy system integration

5
This session . . .
  • Assumes you know nothing about XML or XML based
    technologies
  • Provides a basic introduction to XML based
    technologies
  • Demonstrates some of the basics of working with
    the DOM, XSLT, Schema, WSDL, and SOAP.

6
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

7
Underlying Technologies: XML Is the Glue
  • TCP/IP: Connectivity. Innovation: FTP, E-mail,
    Gopher. Connect the Web.
  • HTML: Presentation. Innovation: Web Pages.
    Browse the Web.
  • XML: Connecting Applications. Innovation: Web
    Services. Program the Web.
8
Evolution of Web
9
Web Services Overview: Application Model
  • [Diagram: YourCompany.com, with an Application
    Business Logic Tier and a Data Access and Storage
    Tier, connected over the Internet and XML to End
    Users, Partner Web Services, Other Web Services,
    and Other Applications]
10
Introducing XML
  • XML stands for Extensible Markup Language. A
    markup language specifies the structure and
    content of a document.
  • Because it is extensible, XML can be used to
    create a wide variety of document types.

11
Introducing XML
  • XML is a subset of the Standard Generalized
    Markup Language (SGML), which was introduced in
    the 1980s. SGML is very complex and can be
    costly to implement.
  • These reasons led to the creation of Hypertext
    Markup Language (HTML), a more easily used markup
    language. XML can be seen as sitting between SGML
    and HTML: easier to learn than SGML, but more
    robust than HTML.

12
The Limits of HTML
  • HTML was designed for formatting text on a Web
    page. It was not designed for dealing with the
    content of a Web page. Additional features have
    been added to HTML, but they do not solve data
    description or cataloging issues in an HTML
    document.
  • Because HTML is not extensible, it cannot be
    modified to meet specific needs. Browser
    developers have added features making HTML more
    robust, but this has resulted in a confusing mix
    of different HTML standards.

13
Introducing XML
  • HTML cannot be applied consistently. Different
    browsers follow different standards, so the
    same document can appear differently in one
    browser compared with another.

14
Introduction to XML Markup
  • XML document (intro.xml)
  • Marks up message as XML
  • Commonly stored in text files
  • Extension .xml

16
Introduction to XML Markup (cont.)
  • XML documents
  • Must contain exactly one root element
  • Attempting to create more than one root element
    is erroneous
  • Elements must be nested properly
  • Incorrect: <x><y>hello</x></y>
  • Correct: <x><y>hello</y></x>
  • Must be well-formed
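
The rules on this slide can be checked with any conforming parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is an illustrative assumption, not something from the presentation.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser rejects bad nesting, multiple roots, etc.
        return False


print(is_well_formed("<x><y>hello</y></x>"))  # properly nested
print(is_well_formed("<x><y>hello</x></y>"))  # improperly nested
print(is_well_formed("<x/><y/>"))             # more than one root element
```

Only the first document is accepted; the other two violate the nesting rule and the single-root rule respectively.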

17
XML Parsers
  • An XML processor (also called XML parser)
    evaluates the document to make sure it conforms
    to all XML specifications for structure and
    syntax.
  • XML parsers are strict. It is this rigidity built
    into XML that ensures XML code accepted by the
    parser will work the same everywhere.

18
XML Parsers
  • Microsoft's parser is called MSXML and is built
    directly into IE versions 5.0 and above.
  • Netscape's XML parser comes from the Mozilla
    project and is built into Netscape versions 6.0
    and above.

19
Parsers and Well-formed XML Documents (cont.)
  • XML parsers support
  • Document Object Model (DOM)
  • Builds tree structure containing document data in
    memory
  • Simple API for XML (SAX)
  • Generates events when tags, comments, etc. are
    encountered
  • (Events are notifications to the application)
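
To make the difference between the two models concrete, here is a small sketch using the DOM and SAX implementations in Python's standard library (xml.dom.minidom and xml.sax). The sample document and the EventLogger class are assumptions chosen for illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = '<message from="Paul" to="Tim"><body>Hi, Tim!</body></message>'


class EventLogger(xml.sax.ContentHandler):
    """Records the SAX events the parser reports while reading the document."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        if content.strip():
            self.events.append(("text", content))


# SAX: the application is notified as each piece of markup is encountered.
handler = EventLogger()
xml.sax.parseString(DOC.encode("utf-8"), handler)
for event in handler.events:
    print(event)

# DOM: the whole document is built as a tree in memory, then queried.
dom = minidom.parseString(DOC)
body = dom.documentElement.getElementsByTagName("body")[0]
body_text = body.firstChild.data
print(body_text)
```

SAX keeps no tree, so it suits large documents read once; DOM costs more memory but allows random access and modification.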

20
Parsing an XML Document with MSXML
  • XML document
  • Contains data
  • Does not contain formatting information
  • Load XML document into Internet Explorer 5.0
  • Document is parsed by MSXML
  • Places plus (+) or minus (-) signs next to
    container elements
  • Plus sign indicates that all child elements are
    hidden
  • Clicking plus sign expands container element
  • Displays children
  • Minus sign indicates that all child elements are
    visible
  • Clicking minus sign collapses container element
  • Hides children
  • Error generated if document is not well-formed

21
XML document shown in IE6.
22
Character Set
  • XML documents may contain
  • Carriage returns
  • Line feeds
  • Unicode characters
  • Enables computers to process characters for
    several languages

23
Characters vs. Markup
  • XML must differentiate between
  • Markup text
  • Enclosed in angle brackets (< and >)
  • e.g., child elements
  • Character data
  • Text between start tag and end tag
  • Welcome to XML!
  • Elements versus Attributes

24
White Space, Entity References and Built-in
Entities
  • Whitespace characters
  • Spaces, tabs, line feeds and carriage returns
  • Significant (preserved by application)
  • Insignificant (not preserved by application)
  • Normalization
  • Whitespace collapsed into single whitespace
    character
  • Sometimes whitespace removed entirely
  • <markup>This   is   character   data</markup>
  • after normalization, becomes
  • <markup>This is character data</markup>
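
Whether whitespace is normalized is up to the application. As a rough sketch, Python's xml.etree.ElementTree hands character data to the application with its whitespace intact, and the application can collapse it itself; normalize_space is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def normalize_space(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends."""
    return " ".join(text.split())


elem = ET.fromstring("<markup>This   is \n\t character   data</markup>")

print(repr(elem.text))                   # whitespace as delivered by the parser
print(repr(normalize_space(elem.text)))  # whitespace after normalization
```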

25
White Space, Entity References and Built-in
Entities (cont.)
  • XML-reserved characters
  • Ampersand (&)
  • Left-angle bracket (<)
  • Right-angle bracket (>)
  • Apostrophe (')
  • Double quote (")
  • Entity references
  • Allow the use of XML-reserved characters as
    character data
  • Begin with ampersand (&) and end with semicolon
    (;)
  • Prevent the parser from misinterpreting
    character data as markup

26
White Space, Entity References and Built-in
Entities (cont.)
  • Built-in entities
  • Ampersand (&amp;)
  • Left-angle bracket (&lt;)
  • Right-angle bracket (&gt;)
  • Apostrophe (&apos;)
  • Quotation mark (&quot;)
  • Mark up characters <>& in element message
  • <message>&lt;&gt;&amp;</message>
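
In practice the built-in entities are usually produced by a library rather than typed by hand. The sketch below uses escape from Python's xml.sax.saxutils; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "<>&"

# escape() replaces &, < and > with the built-in entity references.
escaped = escape(raw)
doc = f"<message>{escaped}</message>"
print(doc)

# A parser turns the entity references back into the original characters.
parsed = ET.fromstring(doc).text
print(parsed)

# Quotes are only escaped when asked for, e.g. for attribute values.
quoted = escape("'\"", {"'": "&apos;", '"': "&quot;"})
print(quoted)
```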

27
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

28
Introduction
  • XML Document Object Model (DOM)
  • Build tree structure in memory for XML documents
  • DOM-based parsers parse these structures
  • Exist in several languages (Java, C, C++, Python,
    Perl, C#, VB.NET, VB, etc.)

29
Introduction
  • DOM tree
  • Each node represents an element, attribute, etc.
  • <?xml version = "1.0"?><message from = "Paul"
    to = "Tim"><body>Hi, Tim!</body></message>
  • Node created for element message
  • Element message has child node for body element
  • Element body has child node for text "Hi, Tim!"
  • Attributes from and to also have nodes in tree
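
The tree described above can be inspected with any DOM implementation. This sketch uses Python's xml.dom.minidom as a stand-in for the parsers named in the presentation.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<message from="Paul" to="Tim"><body>Hi, Tim!</body></message>'
)

message = doc.documentElement   # node for element message
body = message.firstChild       # child node for element body
text = body.firstChild          # child node for the text "Hi, Tim!"

print(message.tagName, body.tagName, repr(text.data))

# Attributes are nodes too, but they are reached through the element,
# not through its list of children.
print(message.getAttributeNode("from").value)
print(message.getAttributeNode("to").value)
print(len(message.childNodes))
```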

30
DOM Implementations
  • DOM-based parsers
  • Microsoft's MSXML
  • Microsoft .NET System.Xml namespace
  • Sun Microsystems' JAXP

31
Creating Nodes
  • Create XML document at run time
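
As a sketch of creating a document at run time, the following builds the message document from the earlier slide node by node. It uses Python's xml.dom.minidom; the equivalent MSXML or System.Xml calls have different names.

```python
from xml.dom import minidom

doc = minidom.Document()

# Create the root element and its attributes.
message = doc.createElement("message")
message.setAttribute("from", "Paul")
message.setAttribute("to", "Tim")
doc.appendChild(message)

# Create the child element and the text node it contains.
body = doc.createElement("body")
body.appendChild(doc.createTextNode("Hi, Tim!"))
message.appendChild(body)

xml_text = message.toxml()
print(xml_text)
```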

32
Traversing the DOM
  • Use DOM to traverse XML document
  • Output element nodes
  • Output attribute nodes
  • Output text nodes
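
A traversal of this kind is typically a short recursive function. The sketch below uses Python's xml.dom.minidom; the function name walk and the output format are assumptions for illustration.

```python
from xml.dom import Node, minidom


def walk(node, depth=0, out=None):
    """Collect one line per element, attribute and non-blank text node."""
    if out is None:
        out = []
    indent = "  " * depth
    if node.nodeType == Node.ELEMENT_NODE:
        out.append(f"{indent}element: {node.tagName}")
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            out.append(f"{indent}  attribute: {attr.name} = {attr.value}")
    elif node.nodeType == Node.TEXT_NODE and node.data.strip():
        out.append(f"{indent}text: {node.data.strip()}")
    # Recurse into children; text nodes simply have none.
    for child in node.childNodes:
        walk(child, depth + 1, out)
    return out


doc = minidom.parseString(
    '<message from="Paul" to="Tim"><body>Hi, Tim!</body></message>'
)
lines = walk(doc.documentElement)
print("\n".join(lines))
```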

33
DOM Components
  • Manipulate XML document

34
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

35
Introduction
  • XML Path Language (XPath)
  • Syntax for locating information in XML document
  • e.g., attribute values
  • String-based language of expressions
  • Not structural language like XML
  • Used by other XML technologies
  • XSLT

36
Nodes
  • XML document
  • Tree structure with nodes
  • Each node represents part of XML document
  • Seven types
  • Root
  • Element
  • Attribute
  • Text
  • Comment
  • Processing instruction
  • Namespace
  • Attributes and namespaces are not children of
    their parent node
  • They describe their parent node

37
XPath node types
38
XPath node types. (Part 2)
39
Location Paths
  • Location path
  • Expression specifying how to navigate XPath tree
  • Composed of location steps
  • Each location step composed of
  • Axis
  • Node test
  • Predicate

40
Axes
  • XPath searches are made relative to context node
  • Axis
  • Indicates which nodes are included in search
  • Relative to context node
  • Dictates node ordering in set
  • Forward axes select nodes that follow context
    node
  • Reverse axes select nodes that precede context
    node

41
Node Tests
  • Node tests
  • Refine set of nodes selected by axis
  • Rely upon axis principal node type
  • Corresponds to type of node axis can select

42
Node-set Operators and Functions (cont.)
  • Location-path expressions
  • Combine node-set operators and functions
  • Select all head and body children element nodes
  • head | body
  • Select last title element node in head element
    node
  • head/title[last()]
  • Select third book element
  • book[position() = 3]
  • Or alternatively
  • book[3]
  • Return total number of element-node children
  • count(*)
  • Select all book element nodes in document
  • //book
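
Some of these expressions can be tried with Python's xml.etree.ElementTree, which implements only a subset of XPath: positional predicates and descendant searches work, while the union operator and count() are not supported, so the count is taken in Python instead. The library document is a made-up sample.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<library>
  <book><title>A</title></book>
  <book><title>B</title></book>
  <book><title>C</title></book>
  <shelf><book><title>E</title></book></shelf>
  <book><title>D</title></book>
</library>
""")

# book[3]: third book child of the context node
third = root.find("book[3]").findtext("title")

# book[last()]: last book child of the context node
last = root.find("book[last()]").findtext("title")

# //book: every book element in the document (written .//book here)
all_books = root.findall(".//book")

# count(*): number of element children, computed in Python
child_count = len(list(root))

print(third, last, len(all_books), child_count)
```

A full XPath 1.0 engine would accept the expressions exactly as written on the slide.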

43
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

44
Introduction
  • Extensible Stylesheet Language (XSL)
  • Used to format XML documents
  • Consists of two parts
  • XSL Transformation Language (XSLT)
  • Transform XML document from one form to another
  • Use XPath to match nodes
  • XSL formatting objects
  • Alternative to CSS

45
Setup
  • XSLT processor
  • Microsoft Internet Explorer 6
  • Java 2 Standard Edition
  • Microsoft .NET System.Xml namespace

46
Templates
  • XSLT document
  • XML document with root element stylesheet
  • template element
  • Matches specific XML document nodes
  • Uses XPath expression in attribute match

47
Templates (cont.)
  • XSLT
  • Two trees of nodes
  • Source tree corresponds to original XML document
  • Result tree contains nodes produced by
    transformation
  • Transforms intro.xml into HTML document

48
Iteration and Sorting
  • XSLT allows
  • Iteration through node set
  • Element for-each
  • Sorting node set
  • Element sort
  • Attribute order set to ascending (i.e., A-Z)
  • Attribute order set to descending (i.e., Z-A)

49
Conditional Processing
  • Perform conditional processing
  • Such as if statement
  • Use element choose
  • Allows alternate conditional statements
  • Similar to switch statement
  • Has child elements when and otherwise
  • when element content used if condition is met
  • otherwise element content used if none of the
    when conditions is met

50
XSLT and XPath
  • XPath Expression
  • locates elements, attributes and text in XML
    document

51
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

52
Working with Namespaces
  • Name collision occurs when elements from two or
    more documents share the same name.
  • Name collision isn't a problem if you are not
    concerned with validation. The document content
    only needs to be well-formed.
  • However, name collision will keep a document from
    being validated.

53
Name Collision
  • This figure shows two documents each with a Name
    element

54
Using Namespaces to Avoid Name Collision
This figure shows how to use a namespace to avoid
collision
55
Declaring a Namespace
  • A namespace is a defined collection of element
    and attribute names.
  • Names that belong to the same namespace must be
    unique. Elements can share the same name if they
    reside in different namespaces.
  • Namespaces must be declared before they can be
    used.

56
Declaring a Namespace
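  • A namespace can be declared in the prolog or as
    an element attribute. The syntax to declare a
    namespace in the prolog is
  • <?xml:namespace ns="URI" prefix="prefix"?>
  • Where URI is a Uniform Resource Identifier that
    assigns a unique name to the namespace, and
    prefix is a string of letters that associates
    each element or attribute in the document with
    the declared namespace.
  • Note that the W3C Namespaces in XML
    recommendation declares namespaces only through
    xmlns attributes (xmlns:prefix="URI"); the prolog
    form predates that recommendation and is not
    part of it.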

57
Declaring a Namespace
  • For example,
  • <?xml:namespace ns="http://uhosp/patients/ns"
    prefix="pat"?>
  • Declares a namespace with the prefix pat and
    the URI http://uhosp/patients/ns.
  • The URI need not be a Web address. A URI
    identifies a physical or an abstract resource.
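
For comparison, the attribute form of the declaration (xmlns:prefix="URI") is the one most parsers accept. The sketch below parses a document with two Name elements in different namespaces using Python's xml.etree.ElementTree. The pat prefix and URI come from the slide; the doc prefix, its URI, the record element and the names are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<record xmlns:pat="http://uhosp/patients/ns"
        xmlns:doc="http://uhosp/doctors/ns">
  <pat:Name>Pat Example</pat:Name>
  <doc:Name>Dr. Example</doc:Name>
</record>
"""

root = ET.fromstring(DOC)

# Map prefixes to namespace URIs for lookups; the prefixes used here
# do not have to match the ones in the document, only the URIs do.
ns = {
    "pat": "http://uhosp/patients/ns",
    "doc": "http://uhosp/doctors/ns",  # hypothetical second namespace
}

patient = root.find("pat:Name", ns).text
doctor = root.find("doc:Name", ns).text
print(patient, "/", doctor)

# Internally each element is identified by URI plus local name,
# so the two Name elements no longer collide.
print(root[0].tag)
print(root[1].tag)
```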

60
Schemas
  • A schema is an XML document that defines the
    content and structure of one or more XML
    documents.
  • To avoid confusion, the XML document containing
    the content is called the instance document.
  • It represents a specific instance of the
    structure defined in the schema.

61
Comparing Schemas and DTDs
  • This figure compares schemas and DTDs

62
Schema Dialects
  • There is no single schema form.
  • Several schema dialects have been developed in
    the XML language.
  • Support for a particular schema depends on the
    XML parser being used for validation.

63
Starting a Schema File
  • A schema is always placed in a separate XML
    document that is referenced by the instance
    document.

64
Schema Types
  • XML Schema recognizes two categories of element
    types: complex and simple.
  • A complex type element has one or more
    attributes, or is the parent to one or more child
    elements.
  • A simple type element contains only character
    data and has no attributes.

65
Schema Types
  • This figure shows types of elements

66
Understanding Data Types
  • XML Schema supports two kinds of data types:
    built-in and user-derived.
  • A built-in data type is part of the XML Schema
    specifications and is available to all XML Schema
    authors.
  • A user-derived data type is created by the XML
    Schema author for specific data values in the
    instance document.

67
Understanding Data Types
  • A primitive data type, also called a base type,
    is one of 19 fundamental data types not defined
    in terms of other types.
  • A derived data type is one of 25 data types
    that the XML Schema developers created based on
    the 19 primitive types.

68
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

69
WSDL
  • Think "TypeLib for SOAP"
  • WSDL: Web Services Description Language
  • Uniform representation for services
  • Transport Protocol neutral
  • Access Protocol neutral (not only SOAP)
  • Describes
  • Schema for Data Types
  • Call Signatures (Message)
  • Interfaces (Port Types)
  • Endpoint Mappings (Bindings)
  • Endpoints (Services)

70
UDDI
  • Think "Yahoo!" for WebServices
  • Universal Description, Discovery and Integration
  • WebService-Programmable "Yellow Pages"
  • Advertise Sites and Services
  • May point to DISCO resources
  • Initiative driven by Microsoft, IBM, Ariba

71
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

72
SOAP Overview
  • A lightweight protocol for exchanging information
    in a distributed, heterogeneous environment
  • It enables cross-platform interoperability
  • Interoperable
  • OS, object model, programming language neutral
  • Hardware independent
  • Protocol independent
  • Works over existing Internet infrastructure

73
SOAP Overview
  • Guiding principle: Invent no new technology
  • Builds on key Internet standards
  • SOAP = HTTP + XML
  • Submitted to W3C
  • The SOAP specification defines
  • The SOAP message format
  • How to send messages
  • How to receive responses
  • Data encoding

74
SOAP: SOAP Is Not
  • Objects-by-reference
  • Distributed garbage collection
  • Bi-directional HTTP
  • Activation
  • Complicated
  • Doesn't try to solve every problem in distributed
    computing
  • Can be easily implemented

75
SOAP: The HTTP Aspect
  • SOAP requests are HTTP POST requests

POST /WebCalculator/Calculator.asmx HTTP/1.1
Content-Type: text/xml
SOAPAction: "http://tempuri.org/Add"
Content-Length: 386

<?xml version="1.0"?>
<soap:Envelope ...>
  ...
</soap:Envelope>
76
SOAP: Message Structure
  • SOAP Message: the complete SOAP message
  • Headers: protocol binding headers
  • SOAP Envelope: <Envelope> encloses payload
  • SOAP Header: <Header> encloses headers
  • Headers: individual headers
  • SOAP Body: <Body> contains SOAP message name
  • Message Name & Data: XML-encoded SOAP message
    name & data
77
SOAP: SOAP Message Format
  • An XML document using the SOAP schema

<?xml version="1.0"?>
<soap:Envelope ...>
  <soap:Header ...>
    ...
  </soap:Header>
  <soap:Body>
    <Add xmlns="http://tempuri.org/">
      <n1>12</n1>
      <n2>10</n2>
    </Add>
  </soap:Body>
</soap:Envelope>
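
A request like this can be assembled with ordinary XML and HTTP libraries. The sketch below uses Python's standard library and only prepares the request without sending it. The slide elides the Envelope attributes, so the SOAP 1.1 envelope namespace is assumed here; the host name and the helper build_add_request are hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # assumed: SOAP 1.1
APP_NS = "http://tempuri.org/"


def build_add_request(n1, n2) -> bytes:
    """Serialize a SOAP envelope whose body carries an Add message."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    add = ET.SubElement(body, f"{{{APP_NS}}}Add")
    ET.SubElement(add, f"{{{APP_NS}}}n1").text = str(n1)
    ET.SubElement(add, f"{{{APP_NS}}}n2").text = str(n2)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


payload = build_add_request(12, 10)

# Prepare (but do not send) the HTTP POST that would carry the envelope.
request = urllib.request.Request(
    "http://localhost/WebCalculator/Calculator.asmx",  # hypothetical host
    data=payload,
    headers={
        "Content-Type": "text/xml",
        "SOAPAction": '"http://tempuri.org/Add"',
    },
    method="POST",
)

print(request.get_method(), request.full_url)
print(payload.decode("utf-8"))
```

The serializer picks its own prefix for the http://tempuri.org/ namespace, which is equivalent to the default-namespace form shown on the slide.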
78
SOAP: Server Responses
  • Server replies with a result message

HTTP/1.1 200 OK
...
Content-Type: text/xml
Content-Length: 391

<?xml version="1.0"?>
<soap:Envelope ...>
  <soap:Body>
    <AddResult xmlns="http://tempuri.org/">
      <result>28.6</result>
    </AddResult>
  </soap:Body>
</soap:Envelope>
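
On the client side, the result is read by parsing the response body. A minimal sketch with Python's xml.etree.ElementTree follows; the response text is the one from this slide with the elided Envelope attributes filled in by the assumed SOAP 1.1 namespace, and extract_result is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RESPONSE = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <AddResult xmlns="http://tempuri.org/">
      <result>28.6</result>
    </AddResult>
  </soap:Body>
</soap:Envelope>"""

NAMESPACES = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",  # assumed: SOAP 1.1
    "app": "http://tempuri.org/",
}


def extract_result(xml_text: str) -> float:
    """Return the numeric value of the result element in the SOAP body."""
    root = ET.fromstring(xml_text)
    node = root.find("soap:Body/app:AddResult/app:result", NAMESPACES)
    if node is None or node.text is None:
        raise ValueError("response does not contain a result element")
    return float(node.text)


value = extract_result(RESPONSE)
print(value)
```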
79
SOAP: Industry Support
  • Microsoft
  • Rogue Wave Software Inc.
  • Scriptics Corp.
  • Secret Labs AB
  • UserLand Software Inc.
  • Zveno Pty. Ltd.
  • IBM
  • Hewlett Packard
  • Intel
  • DevelopMentor Inc.
  • Digital Creations
  • IONA Technologies PLC
  • Jetform
  • ObjectSpace Inc.
  • Rockwell Software Inc.
  • SAP
  • Compaq

80
Agenda
  • XML
  • Document Object Model (DOM)
  • XPATH
  • XSLT
  • Schema
  • WSDL
  • SOAP
  • Questions

81
Questions
82
Bibliography
  • Harvey Deitel's XML How to Program
  • Prentice Hall XML Reference
  • Microsoft Academic Resource Kit