P1252109256vBdOt - PowerPoint PPT Presentation

Provided by: katrinb
1
5
Processing XML
2
Overview
  • Parsing XML documents
  • Document Object Model (DOM)
  • Simple API for XML (SAX)
  • Class generation

3
What's the Problem?
<?xml version="1.0"?>
<books>
  <book>
    <title>The XML Handbook</title>
    <author>Goldfarb</author>
    <author>Prescod</author>
    <publisher>Prentice Hall</publisher>
    <pages>655</pages>
    <isbn>0130811521</isbn>
    <price currency="USD">44.95</price>
  </book>
  <book>
    <title>XML Design</title>
    <author>Spencer</author>
    <publisher>Wrox Press</publisher>
    ...
  </book>
</books>
[Diagram: XML text -> ? -> Book]
4
Parsing XML Documents
[Diagram labels: Document, DTD/Schema, DOM, SAX]
5
Parser
  • Project X (Sun Microsystems)
  • Ælfred (Microstar Software)
  • XML4J (IBM)
  • Lark (Tim Bray)
  • MSXML (Microsoft)
  • XJ (Data Channel)
  • Xerces (Apache)
  • ...

6
The Document Object Model
XML Document
<?xml version="1.0"?>
<books>
  <book>
    <title>The XML Handbook</title>
    <author>Goldfarb</author>
    <author>Prescod</author>
    <publisher>Prentice Hall</publisher>
    <pages>655</pages>
    <isbn>0130811521</isbn>
    <price currency="USD">44.95</price>
  </book>
  <book>
    <title>XML Design</title>
    <author>Spencer</author>
    <publisher>Wrox Press</publisher>
    ...
  </book>
</books>
Structure
books
  book
    title: The XML Handbook
    author: Goldfarb
    author: Prescod
    publisher: Prentice Hall
    pages: 655
    isbn: ...
  book
7
The Document Object Model
  • Provides a standard interface for access to and
    manipulation of XML structures.
  • Represents documents in the form of a hierarchy
    of nodes.
  • Is platform- and programming-language-neutral
  • Is a recommendation of the W3C (October 1, 1998)
  • Is implemented by many parsers
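The "hierarchy of nodes" can be made visible by parsing a small document and printing one line per node. This is a minimal sketch that assumes Python's standard xml.dom.minidom as the DOM implementation (the slides themselves use MSXML); the helper name outline is hypothetical.

```python
from xml.dom.minidom import parseString

# Sample data from the slides, written without whitespace between tags
# so that the tree contains no whitespace-only text nodes.
XML = (
    '<?xml version="1.0"?>'
    '<books><book>'
    '<title>The XML Handbook</title>'
    '<author>Goldfarb</author><author>Prescod</author>'
    '</book></books>'
)

def outline(node, depth=0):
    """Return one indented line per node in the subtree rooted at node."""
    # Text nodes are shown by their value, all other nodes by their name.
    label = node.nodeValue if node.nodeType == node.TEXT_NODE else node.nodeName
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines

doc = parseString(XML)
tree_lines = outline(doc.documentElement)
print("\n".join(tree_lines))
```

Note that minidom keeps whitespace-only text nodes, so pretty-printed input would produce additional text nodes between the elements.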

8
DOM - Structure Model
Document
books
book
book
Node
publisher
pages
isbn
author
title
Element
PrenticeHall
The XMLHandbook
Goldfarb
655
...
Prescod
NodeList
9
The Document Interface
Method                        Result
doctype                       DocumentType
implementation                DOMImplementation
documentElement               Element
getElementsByTagName(String)  NodeList
createTextNode(String)        Text
createComment(String)         Comment
createElement(String)         Element
createCDATASection(String)    CDATASection
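The factory methods of the Document interface can be tried out as follows. This sketch assumes Python's xml.dom.minidom; getDOMImplementation and createDocument are the minidom way of obtaining an empty document and are not part of the slide.

```python
from xml.dom.minidom import getDOMImplementation

# Create an empty document whose root element is <books>.
impl = getDOMImplementation()
doc = impl.createDocument(None, "books", None)
root = doc.documentElement

# Nodes are always created by the owning document, then attached to the tree.
book = doc.createElement("book")
title = doc.createElement("title")
title.appendChild(doc.createTextNode("The XML Handbook"))
book.appendChild(doc.createComment("first book"))
book.appendChild(title)
root.appendChild(book)

xml_text = root.toxml()
title_count = doc.getElementsByTagName("title").length
print(xml_text)
```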
10
The Node Interface
Method                            Result
nodeName                          String
nodeValue                         String
nodeType                          short
parentNode                        Node
childNodes                        NodeList
firstChild                        Node
lastChild                         Node
previousSibling                   Node
nextSibling                       Node
attributes                        NamedNodeMap
insertBefore(Node new, Node ref)  Node
replaceChild(Node new, Node old)  Node
removeChild(Node)                 Node
hasChildNodes()                   Boolean
11
Node Types / Node Names
Field                        NodeType  NodeName
ELEMENT_NODE                 1         tagName
ATTRIBUTE_NODE               2         name of attribute
TEXT_NODE                    3         "#text"
CDATA_SECTION_NODE           4         "#cdata-section"
ENTITY_REFERENCE_NODE        5         name of entity referenced
ENTITY_NODE                  6         entity name
PROCESSING_INSTRUCTION_NODE  7         target
COMMENT_NODE                 8         "#comment"
DOCUMENT_NODE                9         "#document"
DOCUMENT_TYPE_NODE           10        document type name
DOCUMENT_FRAGMENT_NODE       11        "#document-fragment"
NOTATION_NODE                12        notation name
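A few of these type codes and names can be checked against a real tree. The following sketch assumes Python's xml.dom.minidom and a made-up one-element document; it collects (nodeType, nodeName) pairs for a document, an element, an attribute, a text node and a comment.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical sample: one element with an attribute, text and a comment.
doc = parseString('<book id="b1">text<!--note--></book>')
book = doc.documentElement

nodes = [doc, book, book.getAttributeNode("id"), *book.childNodes]
kinds = [(n.nodeType, n.nodeName) for n in nodes]

for node_type, node_name in kinds:
    print(node_type, node_name)

# The numeric codes are also available as named constants.
is_element = book.nodeType == Node.ELEMENT_NODE
```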

12
The NodeList Interface
Method     Result
length     int
item(int)  Node
13
The Element Interface
Method                                     Result
tagName                                    String
getAttribute(String)                       String
setAttribute(String name, String value)
removeAttribute(String)
getAttributeNode(String)                   Attr
setAttributeNode(Attr)                     Attr
removeAttributeNode(Attr)                  Attr
getElementsByTagName(String)               NodeList
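The attribute methods can be illustrated with the price element from the sample document. This is a sketch assuming Python's xml.dom.minidom; the replacement value "EUR" is made up for the example.

```python
from xml.dom.minidom import parseString

doc = parseString('<price currency="USD">44.95</price>')
price = doc.documentElement

tag = price.tagName                        # name of the element
before = price.getAttribute("currency")    # attribute value as a string

price.setAttribute("currency", "EUR")      # overwrite the existing value
attr = price.getAttributeNode("currency")  # the Attr node itself
after = (attr.name, attr.value)

price.removeAttribute("currency")
# minidom returns an empty string for an attribute that does not exist.
missing = price.getAttribute("currency")
```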
14
DOM Methods for Navigation
parentNode
nextSibling
previousSibling
firstChild
lastChild
childNodes(length, item())
getElementsByTagName
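The navigation members listed above can be combined as in the following sketch, which assumes Python's xml.dom.minidom and a compact version of the books sample (no whitespace between tags, so every child is an element or its text).

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<books>'
    '<book><title>The XML Handbook</title>'
    '<author>Goldfarb</author><author>Prescod</author></book>'
    '<book><title>XML Design</title><author>Spencer</author></book>'
    '</books>'
)
root = doc.documentElement

first_book = root.firstChild              # first <book>
second_book = first_book.nextSibling      # second <book>
last_author = first_book.lastChild        # <author>Prescod</author>
prev_author = last_author.previousSibling # <author>Goldfarb</author>

# childNodes is a NodeList with length and item().
book_count = root.childNodes.length
same_book = root.childNodes.item(1) is second_book

# getElementsByTagName searches the whole subtree in document order.
all_authors = [a.firstChild.data for a in doc.getElementsByTagName("author")]
```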
15
DOM Methods for Manipulation
appendChild
insertBefore
replaceChild
removeChild
createElement
createAttribute
createTextNode
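A short sketch of these manipulation methods, assuming Python's xml.dom.minidom. The starting document, the note element and the lang attribute are invented for the example.

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<book><author>Goldfarb</author><pages>655</pages><note>draft</note></book>'
)
book = doc.documentElement

# createElement + createTextNode + appendChild build a new subtree.
title = doc.createElement("title")
title.appendChild(doc.createTextNode("The XML Handbook"))
# insertBefore(new, ref) places the new node in front of an existing child.
book.insertBefore(title, book.firstChild)

# replaceChild(new, old) swaps <pages> for a new <publisher>.
publisher = doc.createElement("publisher")
publisher.appendChild(doc.createTextNode("Prentice Hall"))
book.replaceChild(publisher, book.getElementsByTagName("pages").item(0))

# removeChild detaches a node from its parent.
book.removeChild(book.getElementsByTagName("note").item(0))

# createAttribute returns an Attr that is attached with setAttributeNode.
lang = doc.createAttribute("lang")
lang.value = "en"
book.setAttributeNode(lang)

result = book.toxml()
print(result)
```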
16
Example
books
  book
    author: Goldfarb
    author: Prescod
  book
    author: Spencer
doc.documentElement.childNodes.item(0).getElementsByTagName("author").item(1).childNodes.item(0).data
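The same navigation expression happens to be valid in Python's xml.dom.minidom, which is assumed here in place of MSXML. The sketch relies on input without whitespace between tags; minidom keeps whitespace-only text nodes, so with pretty-printed input childNodes.item(0) would be a text node rather than the first book.

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<books>'
    '<book><author>Goldfarb</author><author>Prescod</author></book>'
    '<book><author>Spencer</author></book>'
    '</books>'
)

# First book -> its second author -> that element's text node -> its data.
second_author = (
    doc.documentElement.childNodes.item(0)
    .getElementsByTagName("author")
    .item(1).childNodes.item(0).data
)
print(second_author)  # prints "Prescod"
```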
17
Script
<HTML>
<HEAD><TITLE>DOM Example</TITLE></HEAD>
<BODY>
<H1>DOM Example</H1>
<SCRIPT LANGUAGE="JavaScript">
  var doc, root, book1, authors, author2;
  // Load books.xml synchronously with the MSXML parser
  doc = new ActiveXObject("Microsoft.XMLDOM");
  doc.async = false;
  doc.load("books.xml");
  if (doc.parseError != 0) {
    alert(doc.parseError.reason);
  } else {
    root = doc.documentElement;
    document.write("Name of Root node: " + root.nodeName + "<BR>");
    document.write("Type of Root node: " + root.nodeType + "<BR>");
    book1 = root.childNodes.item(0);
    authors = book1.getElementsByTagName("author");
    document.write("Number of authors: " + authors.length + "<BR>");
    author2 = authors.item(1);
    document.write("Name of second author: " + author2.childNodes.item(0).data);
  }
</SCRIPT>
</BODY></HTML>
18
SAX - Simple API for XML
[Diagram labels: Document, DTD, Application]
19
SAX - Simple API for XML
  • Event-driven parsing model
  • "Don't call the DOM, the parser calls you."
  • Developed by the members of the XML-DEV Mailing
    List
  • Released on May 11, 1998
  • Supported by many parsers ...
  • ... but Ælfred is the saxon king.
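The event-driven model can be seen by recording the callbacks the parser makes. This is a minimal sketch assuming Python's standard xml.sax module; the class name TraceHandler is hypothetical.

```python
import xml.sax

class TraceHandler(xml.sax.ContentHandler):
    """Records every callback the parser makes, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # Text may arrive in several chunks, so each chunk is recorded as is.
        self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))

handler = TraceHandler()
# The application never walks a tree; the parser drives the handler.
xml.sax.parseString(b"<book><title>XML Design</title></book>", handler)

for event in handler.events:
    print(event)
```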

20
Procedure
  • DOM
    • Creating a parser instance
    • Parsing the whole document
    • Processing the DOM tree
  • SAX
    • Creating a parser instance
    • Registering event handlers with the parser
    • Parser calls the event handlers during parsing
21
Namespace Support
<?xml version="1.0"?>
<order xmlns="http://www.net-standard.com/namespaces/order"
       xmlns:bk="http://www.net-standard.com/namespaces/books"
       xmlns:cust="http://www.net-standard.com/namespaces/customer">
  ...
  <bk:book>
    <bk:title>XML Handbook</bk:title>
    <bk:isbn>0130811521</bk:isbn>
  </bk:book>
  ....
</order>
22
Access to Qualified Elements
Node "book"
bk:book
http://www.net-standard.com/namespaces/books
bk
book
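The four values shown for the node appear to be its qualified name, namespace URI, prefix and local name. As a sketch, they can be read back with Python's xml.dom.minidom, which is an assumption here; the property names nodeName, namespaceURI, prefix and localName are those of minidom and may differ from the parser used in the slides.

```python
from xml.dom.minidom import parseString

BOOKS_NS = "http://www.net-standard.com/namespaces/books"
XML = (
    '<order xmlns="http://www.net-standard.com/namespaces/order" '
    'xmlns:bk="' + BOOKS_NS + '">'
    '<bk:book><bk:title>XML Handbook</bk:title></bk:book>'
    '</order>'
)

doc = parseString(XML)

# Look the element up by namespace URI and local name, not by prefix.
book = doc.getElementsByTagNameNS(BOOKS_NS, "book").item(0)
parts = (book.nodeName, book.namespaceURI, book.prefix, book.localName)
print(parts)
```

Matching on the namespace URI rather than on "bk:book" keeps the lookup working when a document binds the same namespace to a different prefix.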
23
Generation of Data Structures
24
Summary
  • To avoid expensive text processing, applications
    use an XML parser that creates a DOM tree of a
    document.
  • The DOM provides a standardized API to access the
    content of documents and to manipulate them.
  • Alternatively or additionally, applications can
    work event-based using the SAX interface, which
    is provided by many parsers.