Title: ebXML Day (Barcelona 23.5.2002) Implementing ebXML Registry Information Model
1. ebXML Day (Barcelona 23.5.2002): Implementing ebXML Registry Information Model
2. Some background...
- Technical Architect (Media and Telecom section, TietoEnator in Brussels, Belgium).
- Specialising in system architecture development of Java/XML solutions.
- Open-source evangelist.
- New committer on ebxmlrr open-source project.
- Previously worked for Nokia (Finland), IBM Global Services (Belgium).
3. Some background...
- TietoEnator
- Staff of 10,000 and annual net sales of 1.1 billion euros.
- IT services organization with a strong base in Scandinavia, esp. Finland and Sweden.
- Consulting, systems development and integration, operation and support, product development services, and software services.
- In Belgium, working with both commercial and public sector clients.
- http://www.tietoenator.com/
4. Our Project
- Implementation of MIReG metadata model and framework
- MIReG: Managing Information Resources for e-Government
- Sponsored by the European Commission's IDA initiative
- IDA: Interchange of Data between Administrations
- IDA's mission: using advances in information and communications technology to support rapid electronic exchange of information between Member State administrations
- http://europa.eu.int/ISPO/ida/
5. Project Goal
- To implement a system for managing metadata about information resources, documents and services.
- To implement a system that facilitates:
  - content interoperability
  - simplification of administrative processes
  - improved information flows
- To allow users to:
  - locate and track documents, metadata and versions
  - search and manage content
  - search and manage administrative metadata
6. What is Metadata?
- Data about data.
- Metadata describes a resource, e.g.
  - Name
  - Title
  - Subject
  - Date issued
  - Version
  - Date modified
  - Identifier
- Dublin Core is a simple standard for describing a wide range of networked resources. See http://dublincore.org
7. Dublin Core Elements
- Creator, Contributor, Coverage, Date, Description, Format, Identifier, Language
- Publisher, Relation, Rights, Subject, Source, Title, Type
8. MIReG Metadata Model and Framework
- Metadata management system manages metadata about information resources, documents and services.
- Describes citizens, enterprises, public servants, long-lived information (e.g. archived documents).
- Dublin Core plus MIReG extensions
- Administrative metadata, to describe how the resource should be managed and processed:
  - access rights
  - security classification
  - disposal
  - long-term preservation
  - etc.
9. Functional requirements
- Metadata management system should support:
  - Exporting documents and their metadata
  - Converting existing metadata to Dublin Core/RDF
  - Adding or updating administrative metadata
  - Storing metadata
  - Providing metadata search capability
  - Importing documents and their metadata
10. Why ebXML?
- Metadata Management System should:
  - be flexible and evolutionary
  - facilitate content interoperability, i.e. information exchange between organisations
  - be standards compliant and open
  - provide well-defined interfaces, allowing Creation, Update, Retrieval and Deletion of metadata and content
11. Non-functional requirements
- All open-source solution, adopting best-of-breed solutions (e.g. Apache Web Server, Apache Tomcat, Apache Xindice, Castor Java-XML binding).
- Co-operate with the open-source community wherever possible.
- Delivered system based on standards such as ebXML, W3C Schema, RDF, Dublin Core.
- Total XML solution from database to user interface.
12. Tools and APIs: Requirements
- Open-source! Ability to see the code and make changes if necessary...
- Support: open-source community responds quickly to bug reports and questions.
- Don't reimplement ebXML Registry from scratch: co-operate with existing open-source team(s).
- Don't reinvent the wheel: reuse best existing solutions.
- Use stable and well-adopted tools (e.g. Apache Web Server 1.3, Apache Tomcat 4.0).
13. Tools and APIs: Problems
- Steep learning curve... many new tools and APIs to master.
- Support for full W3C Schema standard not available in Castor Java-XML binding.
- No concrete JAXB implementation available that supports W3C Schema (only DTD).
- Xindice 1.0 only really supports US-ASCII (UTF-8 patch now available in Xindice 1.1 development).
- Xindice XPath contains() search is slow. Must use equality tests to gain benefits of indexing.
- Xindice's transaction support not yet available...
14. Standards and Technologies (1)
- Standards
  - JAXP: Java API for XML Processing
  - JAXB: Java Architecture for XML Binding
  - JAXR: Java API for XML Registries
  - JAXM: Java API for XML Messaging
  - SOAP Version 1.1
  - W3C XML Schema
  - RDF and RDF Schema
  - XSLT (Version 1.0)
  - XPath (Version 1.0)
  - XML:DB
  - Java Servlet specification (Sun Microsystems) version 2.3
  - JSP specification (Sun Microsystems) version 1.2
15. Standards and Technologies (2)
- Open-source technologies and tools
  - Apache Web Server 1.3
  - Apache Tomcat 4.0.3 servlet engine
  - Apache SOAP (XML messaging API)
  - Apache Xerces XML parser
  - Apache Xalan XSLT processor
  - Apache Xindice (dbXML) native XML database
  - Castor open-source framework for Java-XML binding
- All software written using the Java programming language (Java versions 1.3 and 1.4)
16. Architecture Overview (High Level)
17. Architecture Overview (ebXML Service Layer)
18. Architecture Overview (XML database layer)
19. Xindice v Relational
- Relational database model
  - Tables
  - Views
  - Data is structured, based on pre-defined schema
  - Standardised queries via SQL (SELECT, INSERT, UPDATE, DELETE etc.)
  - Most RDBMS support JOIN operations
  - Possible to make XML to relational mapping (e.g. IBM DB2 XML Extender)
20. Xindice v Relational
- Xindice database model
  - Hierarchical organisation of data
  - The root of the hierarchy is a database instance
  - Data managed as XML documents
  - Insert the data as XML and retrieve it as XML
  - Sets of documents form a Collection (similar idea to a file system folder)
  - Queries using standard XPath (query engine built around Apache Xalan)
  - Indexing system speeds XPath query performance
21. Mapping ebXML RIM to Xindice
- Main Concepts
  - All ebXML RIM components stored as separate XML documents
  - In Xindice, the ebXML RegistryObject id is used as the document id
  - Use Association to link two RegistryObjects, e.g.

    <rim:ObjectRef id="urn:uuid:b2345678-1234-1234-123456789077"/>
    <rim:ObjectRef id="urn:uuid:c2345678-1234-1234-123456789012"/>
    <!-- Association describes relationship between these two objects -->
    <rim:Association associationType="Packages"
        sourceObject="urn:uuid:b2345678-1234-1234-123456789077"
        targetObject="urn:uuid:c2345678-1234-1234-123456789012"/>
22. Mapping ebXML RIM to Xindice
- All XML data is wrapped in a custom <RegistryData> wrapper.
- <RegistryData> wrapper contains the namespace declarations:

    <RegistryData xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0"
        xmlns:rim="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0">

- XPath queries include the namespace prefix, e.g. //rim:ExtrinsicObject
- All ebXML RIM components are stored in the same collection
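As a rough illustration of the wrapper idea above, the sketch below wraps a RIM fragment in a RegistryData element that declares the RIM namespace both as the default and under the rim prefix, then checks with the JDK's DOM parser that prefixed and unprefixed elements land in the same namespace. The class and method names (RegistryDataWrapper, wrap, countInRimNamespace) are hypothetical, and the code uses only the JDK, not the Xindice API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of ebxmlrr or Xindice.
class RegistryDataWrapper {
    // ebXML RIM 2.0 namespace, as declared on the wrapper in the slide.
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";

    /** Wraps a RIM fragment so that unprefixed and rim:-prefixed names both resolve. */
    static String wrap(String rimFragment) {
        return "<RegistryData xmlns=\"" + RIM_NS + "\" xmlns:rim=\"" + RIM_NS + "\">"
                + rimFragment + "</RegistryData>";
    }

    /** Counts elements with the given local name that belong to the RIM namespace. */
    static int countInRimNamespace(String wrappedXml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for namespace-based lookups
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wrappedXml)));
            return doc.getElementsByTagNameNS(RIM_NS, localName).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse wrapped document", e);
        }
    }

    public static void main(String[] args) {
        // One element relies on the default namespace, the other on the rim prefix.
        String wrapped = wrap("<ExtrinsicObject id=\"a\"/><rim:ExtrinsicObject id=\"b\"/>");
        System.out.println(countInRimNamespace(wrapped, "ExtrinsicObject"));
    }
}
```

Declaring the namespace twice means stored documents can be written with or without the prefix, while queries can always use the rim prefix.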
23. ebXML RIM and Dublin Core
- Metadata mapped to ExtrinsicObject slots
- e.g. creator = Arthur C. Clarke maps to

    <Slot name="creator" slotType="dc-metadata">
      <ValueList>
        <Value>Arthur C. Clarke</Value>
      </ValueList>
    </Slot>

- Sub-set of ebXML RIM implemented in short-term: User, Slot, ExtrinsicObject, AuditableEvent, Association, ExternalLink
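The slot mapping above could be sketched in Java roughly as follows, using the JDK's DOM API to build one Slot per Dublin Core element. DublinCoreSlotMapper and its methods are hypothetical names for illustration; the project itself used Castor-generated classes, which are not shown here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: maps one Dublin Core element to a RIM-style Slot.
class DublinCoreSlotMapper {
    /** Creates an empty DOM document to own the generated elements. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds Slot/ValueList/Value for a Dublin Core element name and its values. */
    static Element toSlot(Document doc, String dcElement, String... values) {
        Element slot = doc.createElement("Slot");
        slot.setAttribute("name", dcElement);
        slot.setAttribute("slotType", "dc-metadata"); // slot type used in the slide
        Element valueList = doc.createElement("ValueList");
        for (String v : values) {
            Element value = doc.createElement("Value");
            value.setTextContent(v); // DOM escapes markup characters on output
            valueList.appendChild(value);
        }
        slot.appendChild(valueList);
        return slot;
    }

    public static void main(String[] args) {
        Element slot = toSlot(newDocument(), "creator", "Arthur C. Clarke");
        System.out.println(slot.getAttribute("name") + " = " + slot.getTextContent());
    }
}
```

Because a Slot carries a ValueList, a repeated Dublin Core element (for example several creators) fits in a single Slot with several Value children.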
24. Querying Xindice with XPath (1)
- W3C standard XPath
- Advanced path-like expressions, allowing node selection and filtering
- Example

    <rim:ExtrinsicObject id="urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339">
      <rim:Name>
        <rim:LocalizedString value="ebXML RIM Schema metadata"/>
      </rim:Name>
      <rim:Description>
        <rim:LocalizedString value="metadata about ebXML RIM schema"/>
      </rim:Description>
      <!-- metadata here as slots -->
      <rim:Slot name="title" slotType="schema-metadata">
        <rim:ValueList>
          <rim:Value>ebXML RIM W3C Schema</rim:Value>
        </rim:ValueList>
      </rim:Slot>
      etc . . .
25. Querying Xindice with XPath (2)
- Select ExtrinsicObject with identifier urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339

    //rim:ExtrinsicObject[@id='urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339']

- Case sensitive: select ExtrinsicObject with Slot whose name is title and whose value list entry contains the word ebXML

    //rim:ExtrinsicObject[rim:Slot[@name='title']/rim:ValueList/rim:Value[contains(.,'ebXML')]]
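The two queries above can be tried outside Xindice with the JDK's javax.xml.xpath API. The sketch below is an approximation under stated assumptions: it evaluates the expressions against a trimmed copy of the sample ExtrinsicObject, it assumes the object is addressed through its id attribute as in that sample, and the names RimXPathQueries, RimNamespaces and countMatches are made up for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RimXPathQueries {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";

    // Trimmed copy of the sample ExtrinsicObject (Name and Description omitted).
    static final String SAMPLE =
        "<rim:ExtrinsicObject xmlns:rim=\"" + RIM_NS + "\""
        + " id=\"urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339\">"
        + "<rim:Slot name=\"title\" slotType=\"schema-metadata\">"
        + "<rim:ValueList><rim:Value>ebXML RIM W3C Schema</rim:Value></rim:ValueList>"
        + "</rim:Slot></rim:ExtrinsicObject>";

    /** Binds the rim prefix used in the XPath expressions to the RIM namespace. */
    static class RimNamespaces implements NamespaceContext {
        public String getNamespaceURI(String prefix) {
            return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String uri) {
            return RIM_NS.equals(uri) ? "rim" : null;
        }
        public Iterator<String> getPrefixes(String uri) {
            return RIM_NS.equals(uri)
                    ? Collections.singletonList("rim").iterator()
                    : Collections.<String>emptyList().iterator();
        }
    }

    /** Returns how many nodes the expression selects in the given document. */
    static int countMatches(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new RimNamespaces());
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String byId = "//rim:ExtrinsicObject"
                + "[@id='urn:uuid:b089d653-bad1-41d6-93ad-9dc93c055339']";
        String byTitle = "//rim:ExtrinsicObject[rim:Slot[@name='title']"
                + "/rim:ValueList/rim:Value[contains(.,'ebXML')]]";
        System.out.println(countMatches(SAMPLE, byId));
        System.out.println(countMatches(SAMPLE, byTitle));
    }
}
```

Since contains() is case sensitive, searching for 'EBXML' instead of 'ebXML' would select nothing in this sample.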
26. Querying Xindice with XPath (3): XPath JOIN
- One Xindice collection can be queried as one large document
- Example

    <rim:ExternalLink id="acmeLink2">
      <rim:Name>
        <rim:LocalizedString value="Link 2"/>
      </rim:Name>
      <rim:Description>
        <rim:LocalizedString value="ACME's Link 2"/>
      </rim:Description>
    </rim:ExternalLink>
    <rim:Association id="acmeLink2-alreadySubmittedCPP-Assoc"
        associationType="ExternallyLinks"
        sourceObject="acmeLink2"
        targetObject="urn:uuid:a2345678-1234-1234-123456789012"/>
27. Querying Xindice with XPath (4): XPath JOIN
- XPath
- Get the RegistryObject whose id is the same as the targetObject id of the Association whose sourceObject id is acmeLink2

    //*[@id=//rim:Association[@sourceObject='acmeLink2']/@targetObject]
28. Querying Xindice with XPath (5): XPath JOIN
- Example

    <ExtrinsicObject id="urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484"
        status="Submitted"
        xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0">
    </ExtrinsicObject>
    <AuditableEvent
        id="urn:uuid:724719b2-6b4f-41ca-b910-af5219ebcdd9"
        objectType="AuditableEvent"
        eventType="Created"
        registryObject="urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484"
        timestamp="2002-05-15T11:38:56.980"
        user="urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7"
        xmlns="urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0"/>
29. Querying Xindice with XPath (6): XPath JOIN
- XPath
- Get all RegistryObjects created by user with id 'urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7'

    //*[@id=//rim:AuditableEvent[@eventType='Created' and @user='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']/@registryObject]
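Both JOIN-style queries rely on XPath 1.0 node-set comparison: @id = (node-set) is true when the id equals the string value of any attribute in the set. The sketch below runs them with the JDK's javax.xml.xpath API against a single in-memory document that stands in for a Xindice collection. The names RimXPathJoin and matchingIds are hypothetical, and the ExtrinsicObject used as the Association target is an assumed placeholder, since the slides do not show that object.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RimXPathJoin {
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";
    static final String USER = "urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7";
    static final String DOC_ID = "urn:uuid:548b6bf0-cf77-4450-9efe-ee465b504484";
    static final String TARGET_ID = "urn:uuid:a2345678-1234-1234-123456789012";

    // Stand-in for one collection viewed as a single document.
    // The ExtrinsicObject carrying TARGET_ID is an assumed placeholder.
    static final String COLLECTION =
        "<rim:RegistryData xmlns:rim=\"" + RIM_NS + "\">"
        + "<rim:ExternalLink id=\"acmeLink2\"/>"
        + "<rim:ExtrinsicObject id=\"" + TARGET_ID + "\"/>"
        + "<rim:Association id=\"acmeLink2-alreadySubmittedCPP-Assoc\""
        + " associationType=\"ExternallyLinks\" sourceObject=\"acmeLink2\""
        + " targetObject=\"" + TARGET_ID + "\"/>"
        + "<rim:ExtrinsicObject id=\"" + DOC_ID + "\" status=\"Submitted\"/>"
        + "<rim:AuditableEvent id=\"urn:uuid:724719b2-6b4f-41ca-b910-af5219ebcdd9\""
        + " eventType=\"Created\" registryObject=\"" + DOC_ID + "\""
        + " user=\"" + USER + "\"/>"
        + "</rim:RegistryData>";

    // Join via Association: objects targeted by associations from acmeLink2.
    static final String BY_ASSOCIATION =
        "//*[@id=//rim:Association[@sourceObject='acmeLink2']/@targetObject]";
    // Join via AuditableEvent: objects created by the given user.
    static final String BY_CREATOR =
        "//*[@id=//rim:AuditableEvent[@eventType='Created' and @user='" + USER
        + "']/@registryObject]";

    /** Evaluates the expression and returns the id of every selected element. */
    static List<String> matchingIds(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "rim".equals(prefix) ? RIM_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return RIM_NS.equals(uri) ? "rim" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return RIM_NS.equals(uri)
                            ? Collections.singletonList("rim").iterator()
                            : Collections.<String>emptyList().iterator();
                }
            });
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            List<String> ids = new ArrayList<String>();
            for (int i = 0; i < hits.getLength(); i++) {
                ids.add(((Element) hits.item(i)).getAttribute("id")); // //* selects elements only
            }
            return ids;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(matchingIds(COLLECTION, BY_ASSOCIATION));
        System.out.println(matchingIds(COLLECTION, BY_CREATOR));
    }
}
```

The inner path plays the role of the joined table and the outer predicate plays the role of the join condition, which is why the whole collection has to be addressable as one document.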
30. XUpdate
- XML:DB initiative specification: http://www.xmldb.org/xupdate
- Batch modifications against XML document set.
- Example

    <xupdate:update select="//rim:User[@id='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']/rim:EmailAddress/@address">peter.burgess@tietoenator.com</xupdate:update>
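An xupdate:update instruction is normally sent inside an xupdate:modifications element. The sketch below assembles such a request as a string for the e-mail update shown above. XUpdateBuilder and its methods are hypothetical names, the rim namespace declaration on the root element is an assumption about how the processor resolves the prefix in the select expression, and submitting the request to Xindice is left out.

```java
// Hypothetical helper that only builds the XUpdate request text.
class XUpdateBuilder {
    static final String XUPDATE_NS = "http://www.xmldb.org/xupdate";
    static final String RIM_NS = "urn:oasis:names:tc:ebxml-regrep:rim:xsd:2.0";

    /** Escapes characters that are unsafe in XML text and double-quoted attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Builds a modifications document holding a single update instruction. */
    static String updateCommand(String selectXPath, String newValue) {
        return "<xupdate:modifications version=\"1.0\""
                + " xmlns:xupdate=\"" + XUPDATE_NS + "\""
                + " xmlns:rim=\"" + RIM_NS + "\">" // assumption: prefix resolved from here
                + "<xupdate:update select=\"" + escape(selectXPath) + "\">"
                + escape(newValue)
                + "</xupdate:update>"
                + "</xupdate:modifications>";
    }

    public static void main(String[] args) {
        String select = "//rim:User[@id='urn:uuid:921284f0-bbed-4a4c-9342-ecaf0625f9d7']"
                + "/rim:EmailAddress/@address";
        System.out.println(updateCommand(select, "peter.burgess@tietoenator.com"));
    }
}
```

Several update instructions can share one modifications element, which is what makes batch changes against a document set possible.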
31. Castor Java-XML Binding (1)
- Implements majority of W3C Schema recommendation (e.g. no Union)
- Unmarshal a java.io.Reader into a Java object

    StringReader stringReader = new StringReader(extObjXML);
    ExtrinsicObject extObj = ExtrinsicObject.unmarshal(stringReader);

- Marshal a Java object to a java.io.Writer

    StringWriter stringWriter = new StringWriter();
    extObj.marshal(stringWriter);
    String extObjXML = stringWriter.toString();
32. Castor Java-XML Binding (2)
- Fast, reliable, performant
- Uses SAX
- High-level interface.
- Manipulate XML document as Java object
- No need to walk the DOM tree, or build custom SAX handlers
33. Lessons Learned
- Reuse of existing solutions saves much time in the long term.
- Access to all software sources was invaluable: make own bug fixes on the spot.
- Open-source is a two-way street. Use others' solutions and also contribute your own.
- Solid architecture because we took time to carefully design the system (plus prototyping, learning new APIs).
- XML databases offer a very realistic solution for projects with XML data storage needs.
- XPath is very powerful: even possible to implement JOIN in Xindice.
34. References
- ebxmlrr project: http://sourceforge.net/projects/ebxmlrr
- Apache Xindice: http://xml.apache.org/xindice
- XML:DB initiative: http://www.xmldb.org
- Castor Java-XML Binding: http://castor.exolab.org/
- IDA (European Commission): http://europa.eu.int/ISPO/ida
- TietoEnator: http://www.tietoenator.com
- Peter Burgess: peter.burgess@tietoenator.com