TECH854 XML TECHNOLOGIES Programming with XML - PowerPoint PPT Presentation

1
TECH854 XML TECHNOLOGIES: Programming with XML
  • Gillian Miller
  • gillian_at_ics.mq.edu.au
  • Room 378

2
PROGRAMMING WITH XML
  • Week 5 Parsing XML - SAX
  • Week 6 XML to objects - DOM
  • Week 7 XML Web Applications
  • Week 8 XML Integration, Data Exchange and
    Databases
  • Making XML work in real applications
  • This part of the course requires programming
    (i.e. Java) and assumes programming expertise
  • Plenty of examples and code templates

3
[Diagram: an XML Mediator (Java) at the centre, exchanging XML with
Microsoft, Java and IBM systems, a Web Online Inventory (Oracle
warehouse), Web Online Shopping, XML Transformers, Web Content
Management (XML content) and Apache/HTML web servers]
XML is the format for data exchange and content management
4
Why JAVA ?
  • Portability - Java Virtual machine
  • Web and Application Services - J2EE standards
  • Object-Oriented Language - logical and easy to
    program
  • Java playing increasing role in e-commerce,
    application integration and large information
    systems arena
  • Industrial strength capabilities
  • Scalability
  • Security model
  • Performance (good middle ground)
  • Multithreading
  • Java includes extensive library of frameworks and
    interfaces (the API) - JDBC, JAXP, Servlets, JSP,
    JAXB

5
XML and Java Alphabet soup
JavaBean
Java
RMI
DTD
JDK
XML
JDBC
EJB
XML-Schema
Servlets
DOM
Java WSDK
JavaDoc
JAXP
JSP
JDOM
SAX
JAX-RPC
XSLT
JAXM
WSDL
SOAP
JAXR
6
Java Prerequisites
What is wrong with these programs?

    public class HelloWorld {
        public String getHello() {
            return "Helloo";
        }
        public static void main(String[] args) {
            int j = 2;
            System.out.println("Hello World");
            System.out.println(getHello() + j);
        }
    }

    public class MyCar {
        Color carColor;
        public void setColor(Color color) {
            this.carColor = color;
        }
        public static void main(String[] args) {
            MyCar car;
            car.setColor(Color.green);
        }
    }
7
Java Prerequisites
  • Classes, Objects, methods, nulls, base types
  • C syntax
  • Packages, imports, Javadoc, common libraries
  • Inheritance, constructors
  • Abstract classes, interfaces
  • Event programming, callbacks, listeners
  • Exception handling
  • Class paths - java compilation
  • Competent use of an IDE
  • use of java in some context - eg database,
    ticketing application, game, client/server

8
Parsing XML
[Diagram: XML typed into a TextArea is sent by HTTP Post to a Servlet
with a DB Connection; the objects involved are Orders, OrderItem,
Product, Person and Address]

    <?xml version="1.0"?>
    <orders>
      <orderitem personid="235">
        <product>choc</product>
        <qty>535</qty>
      </orderitem>
    </orders>

We program with objects ("objects and arrows"), but XML is simply an
ASCII file
9
Class Exercise: How to read in an XML file?

    <?xml version="1.0"?>
    <orders>
      <orderitem personid="235">
        <product>choc</product>
        <qty>535</qty>
      </orderitem>
    </orders>

Write some pseudocode to read the XML file and turn it into an order
object. What are some issues?
10
[Diagram: XML File -> Lexical Tokeniser -> Parser / Grammar Checker ->
Internal Form; the tokeniser and parser stages are what SAX or DOM
provide]
11
Lexical scanning/parsing
  • At base level - file is a sequence of ASCII
    characters
  • Scanner (lexical) returns a set of tokens and deals
    with whitespace
  • (can it throw all the whitespace away?)
  • Parser - checks the XML grammar
  • (how many grammars are there?)
  • startDocument
  • <?xml version="1.0"?> - XML declaration (written
    like a Processing Instruction)
  • <orderitem> - Start Tag
  • </orderitem> - End Tag
  • 535 - characters

12
Processors
  • XML has well developed processors for this task
  • Take advantage of this work
  • Months of development time plus wide use in
    market place
  • Issues - new standards (eg what about schema)
  • Complexities - CData, external entities,
    parameter entities, namespaces
  • Recall why XML has taken off as data interchange
    format
  • Because we already have processors and agreed
    syntax, we do not have to reinvent wheel of
    syntax, grammars and parsing of one-off formats !!

13
XML Parsers
  • XML Processors
  • Parsers determine well-formedness of document
  • Optionally determine validity
  • Two common models
  • SAX Simple API for XML
  • DOM the Document Object Model

14
DOM versus SAX
xml.fujitsu.com/en/tech/dom/
15
SAX Parser
  • SAX provides an event based interface to the
    parser.
  • User callbacks are associated with events for
  • startElement
  • endElement
  • characters (text)
  • startDocument etc
  • The callbacks are passed the data associated with
    the tag or the text of the character data.
  • Good for large XML files - especially where
    processing required is linear.
  • For us this is lower level - good place to start
    - however SAX requires much more programming than
    other XML processors
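
The callback model above can be sketched in a few lines. This is a minimal illustration rather than course code: the class name SaxEventTrace and the method trace are made up for this example, and it uses the JAXP parser that ships with the JDK instead of a separately downloaded Xerces. Each callback simply records the event it received.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records one line per SAX event.
class SaxEventTrace {

    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();

        // The parser calls these methods back as it reads the document.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("startElement " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters " + text);
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><orders><orderitem personid=\"235\">"
                + "<product>choc</product><qty>535</qty></orderitem></orders>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory except what the handler chooses to store, which is why this style suits large files that are processed from start to end.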

16
Step 1 - Obtain Processor
  • Apache Xerces - http://xml.apache.org
  • Others
  • Sun Microsystems Crimson
  • Microsoft's MSXML Parser
  • IBM XML4J
  • Oracle XML Parser
  • Check support of SAX 2.0, DOM Level 2
  • To use Xerces with Java
  • Download and unzip - e.g. C:\xerces-1_4_3
  • Include in CLASSPATH
  • set CLASSPATH=.;c:\xerces-1_4_3\xerces.jar

17
http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/sax/2a_echo.html
18
Step 2 - Understand Framework
DefaultHandler
http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html
19
SAX API Framework
  • SAXParserFactory - factory that creates the
    SAXParser instance
  • SAXParser - does the work, you pass it your
    DefaultHandler class
  • DefaultHandler - implements the 4 interfaces below
    with empty methods
  • you extend this class and override the methods you
    require
  • ContentHandler - interface - most of the work -
    methods such as startElement, endElement
  • ErrorHandler - handles errors - 3 methods: error,
    warning, fatalError
  • DTDHandler - DTD entities, you probably won't need
  • EntityResolver - resolve external entities, you
    probably won't need
  • SAXReader (XMLReader) - wrapped by the SAXParser;
    only needed for low-level work, obtained with
    getXMLReader()

20
Work Through Example
  • We will work through an example from Sun
  • This is a good tutorial for you to work through
    later, and is available online
  • The code is detailed and intricate, you will need
    to study again at your leisure. However the
    tutorial covers many detailed aspects of the SAX
    API and is a valuable resource.
  • Files - echo2.java, slideSample01.xml

http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/sax/2a_echo.html
21
Imports
    import javax.xml.parsers.*;
    import org.xml.sax.helpers.DefaultHandler;
    import org.xml.sax.*;
22
Main Body
1. You extend DefaultHandler!
2. Get an instance of your DefaultHandler subclass
3. Get an instance of SAXParserFactory
4. Then get the SAXParser
5. parse() does the work: the parser calls back into the handler
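
The steps on this slide might look roughly like the sketch below. It is not the tutorial's Echo program: EchoSketch, its echo method and the output labels are stand-ins chosen for illustration, and the handler only reports element start and end tags.

```java
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for an echo program; extends DefaultHandler.
class EchoSketch extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        out.append("ELEMENT: <").append(qName).append(">\n");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("END_ELM: </").append(qName).append(">\n");
    }

    static String echo(File xmlFile) throws Exception {
        // Instance of our own DefaultHandler subclass
        EchoSketch handler = new EchoSketch();
        // Factory first, then the parser itself
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();
        // parse() drives everything: it calls the handler's methods
        saxParser.parse(xmlFile, handler);
        return handler.out.toString();
    }

    public static void main(String[] argv) throws Exception {
        if (argv.length != 1) {
            System.err.println("Usage: java EchoSketch filename");
            System.exit(1);
        }
        System.out.print(echo(new File(argv[0])));
    }
}
```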
23
Event Programming
  • "An event driven program is just a bunch of
    objects laying around waiting for an event to
    happen."
  • Do initial setup (register handlers)
  • Start program
  • Program waits for event to happen
  • When event happens, event handler springs into
    action
  • Program then waits for next event
  • Some Java examples - WindowListener,
    ButtonHandler, SAX

24
SAX Event Handlers
  • startDocument()
  • endDocument()
  • startElement(String uri, String localName,
    String qName, Attributes atts)
  • endElement(String uri, String localName,
    String qName)
  • characters(char[] ch, int start, int length)
  • ignorableWhitespace(char[] ch, int start, int
    length)
  • setDocumentLocator(Locator locator)
  • uri - namespace URI
  • localName - unprefixed name
  • qName - with prefix, e.g. ora:element
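
To see what the three name arguments contain, the sketch below parses a small namespaced document with namespace awareness switched on. The class ElementNames, the prefix binding and the namespace URI http://example.com/ora are invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists uri, localName and qName for each element.
class ElementNames {

    static List<String> names(String xml) throws Exception {
        final List<String> seen = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // Without this, uri and localName may be reported as empty strings.
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        seen.add("uri=" + uri
                                + " localName=" + localName
                                + " qName=" + qName);
                    }
                });
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ora:element xmlns:ora=\"http://example.com/ora\">"
                + "<plain/></ora:element>";
        for (String line : names(xml)) {
            System.out.println(line);
        }
    }
}
```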

25
Back to Echo Program
These methods are part of the Echo2 class defined earlier.
Recall the line: saxParser.parse(new File(argv[0]), handler);
It registers this object as the handler. Then, when events occur,
the parser calls back into it ("call thyself").
26
Echo Program
startElement
attrs.getLength()
attrs.getValue(i)
27
Echo Program
endElement
characters
why StringBuffer ??
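One reason for a buffer, offered here as a sketch and not as the tutorial's own explanation: SAX allows the parser to deliver the text of a single element in several characters() calls, so the text has to be accumulated until endElement. The example uses StringBuilder, the unsynchronised sibling of StringBuffer; the class name TextCollector and the sample document are invented for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each leaf element.
class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        text.setLength(0);              // start afresh for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be only part of the text
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.put(qName, value);   // complete text is known only here
        }
        text.setLength(0);
    }

    static Map<String, String> collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orderitem><product>choc &amp; nuts</product>"
                + "<qty>535</qty></orderitem>";
        System.out.println(collect(xml));
    }
}
```

An entity reference such as &amp; in the middle of the text is a common point at which a parser splits the text into separate chunks.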
28
Utilities
29
Results of running ECHO2
30
Document Locator
    public void setDocumentLocator(Locator loc) {
        try {
            out.write("LOCATOR");
            out.write("SYS ID: " + loc.getSystemId());
            out.flush();
        } catch (IOException e) {
            // Ignore errors
        }
    }

  • >> LOCATOR SYS ID: file:<path>/../samples/slideSample01.xml
  • Store locator so when callback occurs, you can
    have access to file information
  • Must only be used within scope of the document
    parse
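
Storing the locator could look like the sketch below, which keeps the Locator in a field and reads the current line number inside startElement. The class name LineTracker and the sample document are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: reports the line on which each element starts.
class LineTracker extends DefaultHandler {
    private Locator locator;   // valid only while the parse is running
    private final List<String> positions = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // called once, before startDocument
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        int line = (locator != null) ? locator.getLineNumber() : -1;
        positions.add(qName + " at line " + line);
    }

    static List<String> positions(String xml) throws Exception {
        LineTracker handler = new LineTracker();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.positions;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders>\n  <orderitem>\n    <qty>535</qty>\n"
                + "  </orderitem>\n</orders>";
        for (String p : positions(xml)) {
            System.out.println(p);
        }
    }
}
```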

31
try, catch, throws, exception
  • Java's built-in mechanism for when things go
    wrong
  • can not convert a string to a number, sql errors,
    file IO
  • an Exception object is generated - thrown
  • your job is to catch exception
  • statements that cause a problem are encased in a
    try statement
  • SAX Errors - fatalError, error, warning

32
Exceptions
    try {
        // statements that could cause a problem
    } catch (SAXException e) {
        // error statements
    }

Exception methods: getMessage(), plus getLineNumber() and
getSystemId() on SAXParseException
33
SAX Exceptions
  • warnings
  • usually related to DTD, informative, eg element
    defined twice
  • fatalError
  • necessitates stopping parser - eg not well-formed
  • error (Non Fatal error)
  • related to DTD, validating error
  • Default is to keep going - you may wish to
    override
    public void error(SAXParseException exc) throws SAXException {
        System.out.println("Parsing Error\n"
            + "Line: " + exc.getLineNumber() + "\n"
            + "URI: " + exc.getSystemId() + "\n"
            + "Message: " + exc.getMessage());
        throw new SAXException("Error encountered");
    }
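
The three severities can be told apart with a handler along the lines below. This is a sketch under assumptions: the class StrictErrors, the method check and the small inline DTD are invented here, and the policy of stopping on non-fatal errors is a choice made for this example, not something the parser requires.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: warnings are logged, errors stop the parse.
class StrictErrors extends DefaultHandler {

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    @Override
    public void warning(SAXParseException e) {
        System.err.println("Warning: " + describe(e)); // keep going
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        // Validity error: the default is to continue, we choose to stop.
        throw new SAXException("Error: " + describe(e), e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        // Not well-formed: the parser cannot continue.
        throw new SAXException("Fatal: " + describe(e), e);
    }

    static String check(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // validity errors need a DTD
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)), new StrictErrors());
            return "ok";
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException | ParserConfigurationException e) {
            return "setup or IO problem: " + e;
        }
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE orders [<!ELEMENT orders (qty)>"
                + "<!ELEMENT qty (#PCDATA)>]>";
        System.out.println(check(dtd + "<orders><qty>535</qty></orders>"));
        System.out.println(check(dtd + "<orders><price>1</price></orders>"));
        System.out.println(check(dtd + "<orders><qty>535</orders>"));
    }
}
```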

34
Introducing DTDs to XML
  • If you introduce a DTD your parser will behave
    slightly differently

extra CHARS
35
DTD Processing
  • Without DTD whitespace is returned
  • With DTD, whitespace can be ignored when element
    structure is known
  • But you may wish to preserve or track white space
  • e.g. to preserve document indenting
  • Use method ignorableWhitespace

    public void ignorableWhitespace(char[] buf, int offset, int len)
            throws SAXException {
        nl();
        emit("IGNORABLE");
    }

36
Validating Documents
  • Schema or DTD must be present
  • the ignorableWhitespace method is invoked
    whenever possible
  • Note that a DTD is processed even if the document
    is not validated against it
  • To turn validation on, you must use the
    validation feature

37
Using a Validating Parser
    public static void main(String[] argv) {
        if (argv.length != 1) {
            ...
        }
        // Do not use the default (non-validating) parser
        // Use the validating parser
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        try {
            ...

see echo10.java
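
A small experiment, sketched here with invented names (WhitespaceDemo, WITH_DTD, NO_DTD), shows the effect of a DTD plus validation on whitespace: with the DTD the indentation between elements arrives through ignorableWhitespace, while without one it arrives through characters. This is not echo10.java.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts ignorable whitespace, collects other text.
class WhitespaceDemo extends DefaultHandler {
    static final String NO_DTD =
            "<orders>\n  <qty>535</qty>\n</orders>";
    static final String WITH_DTD =
            "<!DOCTYPE orders [\n"
            + "  <!ELEMENT orders (qty)>\n"
            + "  <!ELEMENT qty (#PCDATA)>\n"
            + "]>\n" + NO_DTD;

    int ignorableChars = 0;
    final StringBuilder text = new StringBuilder();

    @Override
    public void ignorableWhitespace(char[] buf, int offset, int len) {
        ignorableChars += len;
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        text.append(buf, offset, len);
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        throw e; // treat validity errors as failures in this sketch
    }

    static WhitespaceDemo parse(String xml, boolean validating) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating); // use the validating parser
        WhitespaceDemo handler = new WhitespaceDemo();
        factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        WhitespaceDemo withDtd = parse(WITH_DTD, true);
        WhitespaceDemo noDtd = parse(NO_DTD, false);
        System.out.println("with DTD: ignorable chars = " + withDtd.ignorableChars);
        System.out.println("no DTD:   ignorable chars = " + noDtd.ignorableChars);
    }
}
```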
38
Features
  • A feature is a flag used by processor to indicate
    whether a certain type of processing should
    occur. eg in JAXP use
  • factory.setNamespaceAware(true)
  • factory.setValidating(true)
  • Note that this is not related to the DTDHandler
    interface

39
LexicalHandler
  • Sometimes you may wish to preserve the original
    XML document as it is
  • e.g. entities - &lt; rather than <
  • comments
  • CDATA
  • There is an interface called LexicalHandler
  • comment(char[] ch, int start, int length) - Passes
    comments to the application.
  • startCDATA(), endCDATA() - Tells when a CDATA
    section is starting and ending, which tells your
    application what kind of characters to expect the
    next time characters() is called.
  • startEntity(String name), endEntity(String
    name) - Gives the name of a parsed entity.
  • startDTD(String name, String publicId, String
    systemId), endDTD() - Tells when a DTD is being
    processed, and identifies it.
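
A LexicalHandler is registered as a property on the underlying XMLReader, not through the usual handler setters. The sketch below assumes the JDK's built-in parser, which supports this property; the class LexicalEcho is invented for illustration and extends DefaultHandler2, the standard helper that already implements LexicalHandler with empty methods.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical handler: records comments and CDATA boundaries.
class LexicalEcho extends DefaultHandler2 {
    final List<String> events = new ArrayList<>();

    @Override
    public void comment(char[] ch, int start, int length) {
        events.add("comment:" + new String(ch, start, length).trim());
    }

    @Override
    public void startCDATA() {
        events.add("startCDATA");
    }

    @Override
    public void endCDATA() {
        events.add("endCDATA");
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        String t = new String(ch, start, length).trim();
        if (!t.isEmpty()) {
            events.add("characters:" + t);
        }
    }

    static List<String> events(String xml) throws Exception {
        LexicalEcho handler = new LexicalEcho();
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Lexical events are delivered only if this property is set.
        reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<note><!-- hello --><![CDATA[a < b]]></note>";
        for (String e : events(xml)) {
            System.out.println(e);
        }
    }
}
```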

40
JAXP
  • JAXP - Java API For XML Processing
  • makes it easier to process XML data
  • Abstract Layer between program and SAX, DOM, XSL
  • Provides namespace support
  • The examples so far have used JAXP, which means
    imports and factory calls are easier
  • Underneath, it still uses the DOM and SAX API

41
Using SAX Without JAXP
    import org.xml.sax.helpers.XMLReaderFactory;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;
    import org.w3c.dom.Document;

    DefaultHandler handler;
    String filename;
    XMLReader parser = XMLReaderFactory.createXMLReader();
    parser.setContentHandler(handler);
    // parser.setDTDHandler(handler);
    // parser.setErrorHandler(handler);

    // Features must be set before parse() is called
    // parser.setFeature("http://xml.org/sax/features/namespaces", true);
    parser.setFeature("http://xml.org/sax/features/validation", true);
    parser.parse(filename);

p57, Maruyama et al
42
SAX 1 - Now Deprecated
In SAX 1 you had to implement every method of the DocumentHandler
interface. DocumentHandler, HandlerBase, AttributeList and Parser
have been replaced by equivalents in SAX 2 (ContentHandler,
DefaultHandler, Attributes and XMLReader).
http://developer.java.sun.com/developer/technicalArticles/xml/JavaTechandXML/
43
SAX 1 - DocumentHandler
    public class myXMLHandler implements DocumentHandler {
        public void setDocumentLocator(Locator loc) { }
        public void startDocument() { }
        public void endDocument() { }
        public void startElement(String tag, AttributeList atts) { }
        public void endElement(String tag) { }
        public void characters(char[] ch, int start, int len) { }
        public void ignorableWhitespace(char[] ch, int start, int length) { }
        public void processingInstruction(String target, String data) { }
    }

and to call (SAX 1 style):

    Parser parser = ParserFactory.makeParser(PARSER_NAME);
    parser.setDocumentHandler(new myXMLHandler());
    parser.parse("filename.xml");

In SAX 2, this has been replaced by ContentHandler or, better still,
DefaultHandler (an ADAPTER class)
44
Notes
  • Often startElement and endElement will end up
    doing a lot of the processing work
  • You will end up with a set of nested case
    statements for each element type
  • You may need something extra to keep track of
    where you are
  • Consider using a Stack or a set of state
    constants
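
The stack option could be sketched as follows: every startElement pushes the element name, every endElement pops it, so the handler always knows the path from the root to the current element. The class PathTracker, the Bundle element and the sample catalog are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects ids of Products directly under Catalog.
class PathTracker extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();
    private final List<String> productIds = new ArrayList<>();

    // Builds a path such as /Catalog/Product from the open elements.
    private String currentPath() {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = open.descendingIterator(); it.hasNext();) {
            sb.append('/').append(it.next());
        }
        return sb.toString();
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        open.push(qName);
        if (currentPath().equals("/Catalog/Product")) {
            productIds.add(atts.getValue("id"));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    static List<String> topLevelProductIds(String xml) throws Exception {
        PathTracker handler = new PathTracker();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.productIds;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Catalog><Product id=\"p1\"/>"
                + "<Bundle><Product id=\"nested\"/></Bundle>"
                + "<Product id=\"p2\"/></Catalog>";
        System.out.println(topLevelProductIds(xml));
    }
}
```

Compared with state constants, the stack needs no new constant for each nesting level, at the cost of a string comparison per element.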

45
Use of States
    protected static final int ROOT = 0, CATALOG = 1, PRODUCT = 2;
    protected int state = ROOT;

    public void startElement(String uri, String localName,
                             String tag, Attributes atts) {
        if (tag.equals("Catalog") && state == ROOT) {
            state = CATALOG;
        } else if (tag.equals("Product") && state == CATALOG) {
            state = PRODUCT;
            id = atts.getValue("id");
            text = null;
        }
        ...
    }

    public void endElement(String uri, String lname, String tag) {
        if (tag.equals("Catalog") && state == CATALOG)
            ...

Marchal, p23 - 25
46
Some Java References
Bradley, Millspaugh, Programming with Java
Bruce Eckel Thinking in Java
Horstmann Core Java
High Level Introduction
Online as PDF
Course notes:
http://www.comp.mq.edu.au/courses/comp833/ (password is comp833a)
http://www.comp.mq.edu.au/courses/comp824/ (password is comp824dis)
47
Java and XML References
McLaughlin, Java and XML, 2nd ed.
Benoit Marchal, Applied XML Solutions
Maruyama et al, XML and Java, Second Edition
48
Java Documentation Online start using it !
49
Tools
  • Computer, Internet, monitor (windows/unix)
  • Text Editor (Notepad, vi)
  • Command/ms-dos window
  • XML Tools
  • Xerces
  • Java Development Kit
  • can be downloaded from Sun - also check out the
    Java Web Services Development kit
  • IDE -
  • BlueJ is in labs - it is small with a nice editor -
    can download from www.bluej.org for home use
  • NetBeans - free and more sophisticated - includes
    servlets - www.netbeans.org
  • JBuilder - in labs - need to register to get key
  • JDeveloper - in labs - can use for compilation -
    we can't change the library paths

50
Example - Primitive Types
    public class MyBasicVars {
        public static void main(String[] args) {
            double pi = 3.14;
            int j = 1;
            boolean more = true;
            while (more) {
                int jsquare = j * j;
                System.out.println("Square is " + jsquare);
                j = j + 1;
                if (j > 10) more = false;
            }
        }
    }

Constructs shown: double varname; int varname; boolean varname;
while (cond); if (cond)