An Extension to XML Schema for Structured Data Processing

1
An Extension to XML Schema for Structured Data
Processing
  • Presented by Jacky Ma
  • Date: 10 April 2002

2
Presentation Outline
  • The Problems
  • Research Objectives
  • The Schema Extension MMX
  • MMX Query System
  • Discussion
  • Conclusion

3
The Problems
  • Mapping XML data into relational tables
  • Not natural to XML structure
  • Efficient, but may not be an effective method
  • Legacy application-specific structured data
  • Similar modeling but proprietary implementations
  • Not interoperable, and difficult to maintain
  • Lack of modular design, and thus difficult to
    combine into more complex data structures
  • Meta-data can serve a wide range of needs,
    while XML Schema is currently used solely for
    physical data validation

4
Research Objectives
  • To facilitate more effective searching and
    storing of XML content by making use of
    meta-data (XML Schema)
  • To propose a data-oriented model that allows
    different storage mechanisms, processing models,
    and query models on XML content

5
Our Approach: MMX
  • Use meta-data to map XML data into structured
    data objects
  • Define the structured data models conceptually
    and link the models to XML document structure
    syntactically
  • Meta-data is defined as an extension of XML
    Schema
  • The extension is called MMX (Multi Model XML)

6
Program Driven vs. Data Driven
  • Program Driven: information for processing is
    hard-coded in the program
  • MMX!
  • Data Driven: processing instructions are
    hard-coded in the data?!

7
A Glance of XML Data

8
A Glance of The Linked Schema

9
Schema Extension
  • The extended schema is associated with a
    namespace
  • The extended schema goes within a schema element,
    like <tree:element> in the example
  • <tree:element> specifies a single structure object
    instance
  • Name association for elements and attributes
  • Class hierarchies
  • <tree:element> -> <tree:internal> ->
    <tree:leafNode>
  • finally to the structure specified in
    <tree:leafNodeValue>
  • Additional properties in <rootNodeAttr>,
    <internalNodeAttr> and <leafNodeAttr>
  • Schema writer has to know the structure model
    specification, while the XML writer only needs to
    know the given schema
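As a minimal sketch of how a processor might locate such an extension, the Java fragment below parses a small schema with a namespace-aware DOM parser and finds the element declaration that carries a tree element. The namespace URI, the `cities` declaration, the placement inside `xs:appinfo`, and the class and method names are all assumptions made for illustration; the slides do not show the actual schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class MmxSchemaSketch {
    static final String XS_NS = "http://www.w3.org/2001/XMLSchema";
    // Hypothetical namespace URI for the tree structure model (not from the slides).
    static final String TREE_NS = "urn:example:mmx:tree";

    // Hypothetical schema: the extension is assumed to sit in xs:appinfo
    // of the element declaration it describes.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='" + XS_NS + "' xmlns:tree='" + TREE_NS + "'>"
      + "<xs:element name='cities'><xs:annotation><xs:appinfo>"
      + "<tree:element><tree:internal><tree:leafNode/></tree:internal></tree:element>"
      + "</xs:appinfo></xs:annotation></xs:element>"
      + "</xs:schema>";

    // Returns the name of the xs:element declaration that encloses the first
    // tree:element found, or null if the schema carries no such extension.
    static String findMappedElement(String schemaXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to match elements by namespace
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList hits = doc.getElementsByTagNameNS(TREE_NS, "element");
            if (hits.getLength() == 0) {
                return null;
            }
            // Walk up from the extension element to the enclosing declaration.
            for (Node n = hits.item(0).getParentNode(); n != null; n = n.getParentNode()) {
                if (n instanceof Element
                        && XS_NS.equals(n.getNamespaceURI())
                        && "element".equals(n.getLocalName())) {
                    return ((Element) n).getAttribute("name");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse schema", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findMappedElement(SCHEMA));
    }
}
```

Because the extension lives in its own namespace, an ordinary XML Schema validator can ignore it, while an MMX-aware processor can pick it out by namespace URI.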

10
Modeling
  • For an instance of an MMX data object
  • As an encapsulated information object only
    accessible from the root, thus as a single tree
    node
  • As a mapping from root node, query method and
    query parameters to the values at leaf nodes
  • Leaf nodes may contain any valid XML content, as
    long as it is defined in the Schema
  • I.e. may contain another MMX data object
  • A query is modeled as a 3-tuple
  • (accessing-node, query-method, query-parameters)
  • Accessing-node is specified by XPath
  • Query-method is specified as a string value
  • Query-parameters are multi-dimensional, depending
    on the current model
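The query tuple described above could be represented roughly as follows. This is only a sketch: the class name `MmxQuery`, the use of a `double` array for the parameters, and the example XPath are assumptions, since the slides only say that the parameters' dimensionality depends on the model.

```java
import java.util.Arrays;

final class MmxQuery {
    final String accessingNode;  // XPath of the root node of the MMX data object
    final String queryMethod;    // method name, given as a string value
    final double[] parameters;   // assumed numeric; dimensionality depends on the model

    MmxQuery(String accessingNode, String queryMethod, double... parameters) {
        this.accessingNode = accessingNode;
        this.queryMethod = queryMethod;
        this.parameters = parameters.clone(); // defensive copy keeps the tuple immutable
    }

    @Override
    public String toString() {
        return "(" + accessingNode + ", " + queryMethod + ", "
            + Arrays.toString(parameters) + ")";
    }

    public static void main(String[] args) {
        // Hypothetical XPath; a spatial search with two coordinates.
        System.out.println(new MmxQuery("/map/cities", "spatial-search", 3, 5));
    }
}
```

Keeping the method as a plain string mirrors the slide's description and lets each structure model define its own operations without changing the tuple type.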

11
Modeling (2)
Tree(1) is accessible from point A. Occasionally,
a query, e.g. (A, spatial-search, (3, 5)),
assuming Tree(1) accepts spatial-search with two
coordinates, may return point B as the answer,
either as the XPath of B or as the XML subtree of
B. From point B, the user may drill down the tree
by issuing another query on Tree(2).
[Diagram labels: A, Tree(1), B, Tree(2), XML Elements]

12
Query with and without MMX
  • From the original XML data alone, we cannot
    assume the semantics of the data
  • We can ONLY do XML-based queries such as XPath
  • We can do the spatial query ONLY IF we can map
    the data into an R-Tree
  • After mapping the data into an R-Tree
  • Spatial Queries
  • Give me the point at (2,7)
  • Give me the point nearest to (4,4)
  • Nearest Neighbor Search
  • Give me the point nearest to Franklin
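To make the three example queries concrete, here is a small Java sketch. It deliberately uses a linear scan over a list of points rather than an R-Tree, so it illustrates only the query semantics, not the index structure. The class and method names, the coordinates, and the point names other than Franklin are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SpatialQuerySketch {
    static final class NamedPoint {
        final String name;
        final double x, y;
        NamedPoint(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }
    }

    private final List<NamedPoint> points = new ArrayList<>();

    void add(String name, double x, double y) {
        points.add(new NamedPoint(name, x, y));
    }

    // "Give me the point at (x, y)": exact match, or null if none.
    NamedPoint pointAt(double x, double y) {
        for (NamedPoint p : points) {
            if (p.x == x && p.y == y) return p;
        }
        return null;
    }

    // "Give me the point nearest to (x, y)".
    NamedPoint nearestTo(double x, double y) {
        return nearest(x, y, null);
    }

    // "Give me the point nearest to <name>": nearest neighbor, excluding the point itself.
    NamedPoint nearestTo(String name) {
        for (NamedPoint p : points) {
            if (p.name.equals(name)) return nearest(p.x, p.y, p);
        }
        return null;
    }

    // Linear scan by Euclidean distance; an R-Tree would prune this search.
    private NamedPoint nearest(double x, double y, NamedPoint exclude) {
        NamedPoint best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (NamedPoint p : points) {
            if (p == exclude) continue;
            double d = Math.hypot(p.x - x, p.y - y);
            if (d < bestDist) {
                bestDist = d;
                best = p;
            }
        }
        return best;
    }

    // Hypothetical sample data; only the name Franklin appears in the slides.
    static SpatialQuerySketch sample() {
        SpatialQuerySketch s = new SpatialQuerySketch();
        s.add("Franklin", 2, 7);
        s.add("Ashland", 4, 5);
        s.add("Milton", 9, 1);
        return s;
    }

    public static void main(String[] args) {
        SpatialQuerySketch s = sample();
        System.out.println(s.pointAt(2, 7).name);
        System.out.println(s.nearestTo(4, 4).name);
        System.out.println(s.nearestTo("Franklin").name);
    }
}
```

None of these queries can be expressed against the raw XML without first knowing that the elements encode coordinates, which is the point the slide makes.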

13
Processing
  • Users might not know the type of the node (and
    need not know it). They are interested in
    what they can do
  • Users retrieve the list of possible operations by
    issuing a LIST-OPERATION method to the root
    element of an MMX object
  • Possible operations may include queries, updates,
    and other model-specific operations
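One way to picture this discovery step is an abstract base class that every structure model extends. The sketch below is an assumption about how such an interface might look, not the actual MMX code: the class names, the `invoke` dispatcher, the operation names, and the string results are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

abstract class AbstractMmxElementSketch {
    // Each model reports the operations it supports.
    abstract List<String> listOperations();

    // Each model implements its own query logic.
    abstract String query(String method, double... params);

    // Single entry point: answers LIST-OPERATION itself and rejects
    // operations the concrete model did not advertise.
    final String invoke(String method, double... params) {
        if ("LIST-OPERATION".equals(method)) {
            return String.join(",", listOperations());
        }
        if (!listOperations().contains(method)) {
            throw new IllegalArgumentException("Unsupported operation: " + method);
        }
        return query(method, params);
    }
}

// Hypothetical R-Tree model; the operation names are illustrative only.
class RTreeElementSketch extends AbstractMmxElementSketch {
    @Override
    List<String> listOperations() {
        return Arrays.asList("point-search", "nearest-neighbor");
    }

    @Override
    String query(String method, double... params) {
        // A real model would search its index; this sketch just echoes the request.
        return method + Arrays.toString(params);
    }
}

class ListOperationDemo {
    public static void main(String[] args) {
        AbstractMmxElementSketch element = new RTreeElementSketch();
        System.out.println(element.invoke("LIST-OPERATION"));
        System.out.println(element.invoke("point-search", 2, 7));
    }
}
```

A client written this way never needs to know the concrete node type: it asks for the operation list first and then issues only operations that appear in it.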

14
MMX Query System
  • To show that the schema, modeling, and processing
    of the MMX extension are workable
  • To illustrate how it assists in querying XML data
  • To serve as a platform for testing the
    implementation of arbitrary structured models
  • Implemented with JDK 1.4

15
System Design
  • Diagram labels: Clients, XML, DOM, MMXDocument,
    Node Data, Schema, MMX Element, ParseSchema,
    FetchClasses, AbstractMMX Element
  • Relations shown: Extends class, (Partly) Defines,
    Maps
  • Structure models shown: VP-Tree, X-Tree, R-Tree,
    R-TreeSchema
  • The abstract class defines the common interface
    that has to be implemented in each MMX Element,
    such as LIST-OPERATION, QUERY, BUILD, etc.

16
Discussion - Pros
  • Compatible with the relational approach, and
    supersedes it
  • Modular design promotes reusability and
    maintainability
  • XML flattens legacy structured data, making it
    text-editable and easy to transport and process
    by different systems

17
Discussion - Cons
  • There is no generic syntax to precisely describe
    all kinds of structure models
  • The XML file is often larger than the legacy
    data file
  • Each structure model needs additional
    implementation effort
  • The schema specification quickly grows longer
    as the number of supported models increases

18
Conclusion
  • Propose a representation to encapsulate data
    structures
  • Describe XML data with the Schema conceptually as
    well as syntactically
  • Map legacy structure models into Schema, and map
    XML data to the structure models by the Schema
  • Structured data repository with increased
    interoperability, reusability, and
    transportability

19
Q&A