Xerces2: The Sequel With No Equal

Presented by Andy Clark at ApacheCon US, Las Vegas, Nevada, on 20 November 2002.

Slides: 28
Provided by: AndyC69
Learn more at: http://people.apache.org

Transcript and Presenter's Notes


1
Xerces2: The Sequel With No Equal
  • Andy Clark

2
Introduction
  • Speaker
      • Worked for IBM
      • Currently unemployed
  • Parser
      • First developed in IBM's Tokyo research lab
      • Maintained and expanded in California
      • Donated to Apache
      • Work continues in Toronto

3
Agenda
  • Xerces1 Overview
      • Design and problems
  • Xerces2 Overview
      • Challenges and design
  • Q&A

4
Xerces1 Overview: Design and Problems
  • Andy Clark

5
Design
  • XML4J/Xerces1 designed for performance
  • Parser implementation
      • Parsing pipeline
      • Custom reader implementations
  • StringPool
      • Defers transcoding of byte buffers until needed
      • Symbol table for common document strings
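
The symbol table idea on this slide can be sketched in a few lines of Java. This is a minimal illustration, not the Xerces1 StringPool: the class name `SymbolTableSketch` and its methods are hypothetical, and a real parser would hash the character buffer directly instead of allocating a candidate String on every lookup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical symbol table: maps equal character runs to one shared String. */
class SymbolTableSketch {
    private final Map<String, String> symbols = new HashMap<>();

    /** Returns the canonical String for the given characters, adding it on first use. */
    String addSymbol(char[] buffer, int offset, int length) {
        String candidate = new String(buffer, offset, length);
        String existing = symbols.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    /** Number of distinct symbols stored so far. */
    int size() {
        return symbols.size();
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch();
        char[] doc = "<item><item>".toCharArray();
        String first = table.addSymbol(doc, 1, 4);   // first "item"
        String second = table.addSymbol(doc, 7, 4);  // second "item"
        // Both occurrences resolve to the same object, so later name
        // comparisons can use reference equality instead of equals().
        System.out.println(first == second);
        System.out.println(table.size());
    }
}
```

Common element and attribute names repeat many times in a document, so sharing one String per name saves both memory and comparison time.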

6
Pipeline Configuration
  • Intended to be generic

[Figure: pipeline diagram with XML input, Scanner, Validator, and Parser components, and API output]
7
Pipeline Configuration Problems
  • Hard-coded dependencies on implementation
  • Inconsistent Interfaces

[Figure: pipeline diagram from XML input to API output, annotated with "Dependency" and "Different Interfaces"]
8
Custom Readers
9
Custom Readers Problems
  • Duplicated code
      • Allows more bugs to appear
      • Bugs differ by encoding because the code is not shared
  • More complicated

10
Deferred Transcoding
11
Deferred Transcoding Problems
  • All components need a reference to the StringPool
  • Strings not immediately available to methods
      • Must make a call to the StringPool to query a String
  • Memory management is complicated
      • Responsibility of callee to free resources
  • Uses more memory
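
The problems listed above are easier to see in code. The following is a hypothetical handle-based pool, assuming a design where components pass around integer handles and decode bytes only on request; the names `DeferredPoolSketch`, `add`, `lookup`, and `release` are illustrative and are not the Xerces1 StringPool API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical pool that hands out handles and decodes bytes lazily. */
class DeferredPoolSketch {
    /** A byte range that may not have been decoded yet. */
    private static final class Entry {
        final byte[] bytes;
        final int offset;
        final int length;
        String decoded; // filled in on first lookup

        Entry(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    private final Map<Integer, Entry> entries = new HashMap<>();
    private final Charset charset;
    private int nextHandle = 0;

    DeferredPoolSketch(Charset charset) {
        this.charset = charset;
    }

    /** Registers a byte range and returns a handle; no decoding happens here. */
    int add(byte[] bytes, int offset, int length) {
        int handle = nextHandle++;
        entries.put(handle, new Entry(bytes, offset, length));
        return handle;
    }

    /** Every consumer must come back to the pool to obtain the actual String. */
    String lookup(int handle) {
        Entry e = entries.get(handle);
        if (e == null) {
            throw new IllegalArgumentException("unknown or released handle: " + handle);
        }
        if (e.decoded == null) {
            e.decoded = new String(e.bytes, e.offset, e.length, charset);
        }
        return e.decoded;
    }

    /** Someone has to remember to free each handle, or the pool keeps growing. */
    void release(int handle) {
        entries.remove(handle);
    }

    int liveEntries() {
        return entries.size();
    }

    public static void main(String[] args) {
        DeferredPoolSketch pool = new DeferredPoolSketch(StandardCharsets.UTF_8);
        byte[] raw = "<title>Xerces</title>".getBytes(StandardCharsets.UTF_8);
        int handle = pool.add(raw, 1, 5);        // the element name "title"
        System.out.println(pool.lookup(handle)); // decoded only now
        pool.release(handle);
        System.out.println(pool.liveEntries());
    }
}
```

Every method that wants the text needs both the handle and the pool, and the raw buffer stays alive until the handle is released, which matches the coupling and memory concerns on this slide.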

12
Xerces2 Overview: Challenges and Design
  • Andy Clark

13
Challenges
  • Requirements
      • Simple design and implementation
      • Easy to maintain
      • More modularity and configurability
      • Support current and future features
  • Design decisions
      • Always transcode bytes into Unicode characters
      • Removes StringPool and dependencies
      • Clean architecture
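
The decision to always transcode bytes into Unicode characters can be illustrated with the standard `java.io` classes. This is a sketch under the assumption that a single decoding step sits at the front of the parser; the class `EagerTranscodingSketch` and its `readAll` helper are hypothetical and do not show how Xerces2 implements its readers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical front end: decode once, then work only with characters. */
class EagerTranscodingSketch {
    /** Decodes the whole input up front using one shared code path for every encoding. */
    static String readAll(byte[] input, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "<name>caf\u00e9</name>";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        // The byte arrays differ, but everything after the reader sees the same characters.
        System.out.println(utf8.length != latin1.length);
        System.out.println(readAll(utf8, StandardCharsets.UTF_8)
                .equals(readAll(latin1, StandardCharsets.ISO_8859_1)));
    }
}
```

Because later components only ever see characters, they need no per-encoding code and no pool reference to turn handles back into strings.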

14
Xerces Native Interface (XNI)
  • Streaming Information Set
      • Similar to SAX
      • No loss of document information
  • Parser configuration and layering
  • Future extensions
      • Native pull-parser, tree model, etc.
  • Note: XNI does not preserve all document information, but it
    communicates more information to the application than DOM or SAX.

15
(No Transcript)
16
Parsing Pipeline
  • Handlers communicate information between parser
    components
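
A handler-based pipeline of this kind can be sketched as follows. The `DocumentHandler` interface here is a hypothetical, much reduced stand-in with three callbacks; the real XNI handler interfaces carry far more information. The `scan` method simply emits a fixed event sequence in place of a real scanner.

```java
import java.util.ArrayList;
import java.util.List;

class PipelineSketch {
    /** Hypothetical, reduced handler interface shared by all components. */
    interface DocumentHandler {
        void startElement(String name);
        void characters(String text);
        void endElement(String name);
    }

    /** A middle component: receives events and forwards them to the next handler. */
    static class WhitespaceFilter implements DocumentHandler {
        private final DocumentHandler next;

        WhitespaceFilter(DocumentHandler next) {
            this.next = next;
        }

        public void startElement(String name) { next.startElement(name); }

        public void characters(String text) {
            // Drop whitespace-only text; pass everything else through.
            if (!text.trim().isEmpty()) {
                next.characters(text);
            }
        }

        public void endElement(String name) { next.endElement(name); }
    }

    /** End of the pipeline: records the events it receives. */
    static class Recorder implements DocumentHandler {
        final List<String> events = new ArrayList<>();

        public void startElement(String name) { events.add("start:" + name); }
        public void characters(String text) { events.add("text:" + text); }
        public void endElement(String name) { events.add("end:" + name); }
    }

    /** Stands in for a scanner: emits a fixed sequence of events. */
    static void scan(DocumentHandler handler) {
        handler.startElement("doc");
        handler.characters("  \n");
        handler.startElement("p");
        handler.characters("hello");
        handler.endElement("p");
        handler.endElement("doc");
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        scan(new WhitespaceFilter(recorder));
        System.out.println(recorder.events);
    }
}
```

Since every component both implements and calls the same interface, components can be inserted, removed, or reordered without changing their neighbours.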

17
Handler Overview
[Figure: handler interfaces between XML input and API output: XMLDocumentHandler, XMLDTDHandler, XMLDTDContentModelHandler]
18
Parser Layout
  • Components and Manager

[Figure: parser layout showing a Component Manager and Regular Components]
19
Reader Management
20
Parser Configuration
  • Before

Parser pipeline is part of the document parser base class.
Reconfiguring the parser required duplicating code in order to still take advantage of the API generator code.
21
Parser Configuration
  • After

Parser pipeline and settings are specified in a separate parser configuration object.
Allows re-use of the framework without rewriting existing code.
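
The separation described here can be sketched with a toy example. All names below (`ParserConfiguration`, `PlainConfiguration`, `LowerCasingConfiguration`, `NameListParser`) are hypothetical and greatly simplified; the "pipeline" just treats whitespace-separated tokens as element names so that the structure stays visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ConfigurationSketch {
    /** Hypothetical, reduced handler: one callback per element name. */
    interface ElementHandler {
        void element(String name);
    }

    /** Hypothetical configuration: owns the pipeline and its settings. */
    interface ParserConfiguration {
        void setHandler(ElementHandler handler);
        void parse(String input);
    }

    /** Toy pipeline: each whitespace-separated token is reported as an element name. */
    static class PlainConfiguration implements ParserConfiguration {
        protected ElementHandler handler;

        public void setHandler(ElementHandler handler) {
            this.handler = handler;
        }

        public void parse(String input) {
            for (String token : input.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    emit(token);
                }
            }
        }

        protected void emit(String name) {
            handler.element(name);
        }
    }

    /** A second pipeline with one extra step; the API layer below is unchanged. */
    static class LowerCasingConfiguration extends PlainConfiguration {
        @Override
        protected void emit(String name) {
            handler.element(name.toLowerCase(Locale.ROOT));
        }
    }

    /** API layer: written once against the configuration interface. */
    static class NameListParser {
        private final ParserConfiguration config;

        NameListParser(ParserConfiguration config) {
            this.config = config;
        }

        List<String> parse(String input) {
            List<String> names = new ArrayList<>();
            config.setHandler(names::add);
            config.parse(input);
            return names;
        }
    }

    public static void main(String[] args) {
        System.out.println(new NameListParser(new PlainConfiguration()).parse("HTML Body P"));
        System.out.println(new NameListParser(new LowerCasingConfiguration()).parse("HTML Body P"));
    }
}
```

Swapping the configuration changes what the pipeline does, while `NameListParser` is reused as is, which is the re-use the slide describes.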
22
API Generators
  • Different APIs can be generated from same
    document parser

[Figure: JavaBean Parser, SAX Parser, and DOM Parser layered over XNI and the Document Parser]
23
Sample Parser Configuration 1
  • HTML parser
  • Available as NekoHTML download

[Figure: HTML Parser Configuration: HTML input, HTML Scanner, Tag Balancer]
24
Sample Parser Configuration 2
  • Non-validating parser (for performance)
  • Available with Xerces download

[Figure: Non-Validating Parser Configuration: XML input, Scanner / Namespace Binder]
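
For comparison, a non-validating, namespace-aware parse can be requested through the standard JAXP API that ships with the JDK. This sketch does not use the Xerces non-validating parser configuration the slide refers to; it only shows the same idea (skip validation, keep namespace binding) through `SAXParserFactory`. The class `NonValidatingSketch` and its `countElements` helper are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NonValidatingSketch {
    /** Counts elements in the document using a non-validating SAX parser. */
    static int countElements(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);    // no DTD validation
            factory.setNamespaceAware(true); // namespaces are still bound
            SAXParser parser = factory.newSAXParser();
            final int[] count = {0};
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attributes) {
                    count[0]++;
                }
            });
            return count[0];
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```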
25
Sample Parser Configuration 3
  • XInclude processing
  • Not yet implemented

[Figure: XInclude Parser Configuration: XML input, Scanner, XInclude, Validator]
26
Sample Parser Configuration 4
  • Database result set converted to XML
  • Not yet implemented

[Figure: Database Parser Configuration: DB input, Database Query, Validator]
27
That's All, Folks!
  • Questions and Answers
      • Any questions?
  • Links
      • http://www.apache.org/andyc/xml/present/
      • http://xml.apache.org/xerces2-j/
      • http://www.apache.org/andyc/neko/