Use of Native XML Database In TAPoR Project

Description: Introduces XMLType functions integrated in the common SQL statement: ... more than enough documentation, tutorials, discussions, all around the Internet; ...

1
Use of Native XML Database In TAPoR Project
  • Eric Zhang
  • Dec 02, 2003
  • TAPoR Project - Alberta

2
Data Vs. Document
  • Data-centric XML documents focus only on the data
    contained inside the document, such as an address
    book. The order of the elements is not important.
  • Document-centric XML documents focus not only on
    the data contained inside, but also on the
    structure of the document, such as the element
    ordering information.

3
Storing XML using Relational DB
  • Mainly for data-centric documents where the data
    is inserted into different tables according to
    some mapping strategy
    <student>
      <name> Eric </name>
      <age> 27 </age>
    </student>

Table Student
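
As a rough illustration of the mapping idea on this slide, the sketch below shreds the `<student>` document into one row of a `Student` table. It uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules as stand-ins; the column types and the helper name `shred_student` are assumptions made for the example, not something the slides specify.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The sample document from the slide.
STUDENT_XML = "<student><name> Eric </name><age> 27 </age></student>"


def shred_student(conn, xml_text):
    """Map one <student> document onto one row of the Student table.

    Hypothetical helper: the mapping (one child element per column)
    is the simplest possible strategy, chosen only for illustration.
    """
    root = ET.fromstring(xml_text)
    name = root.findtext("name").strip()
    age = int(root.findtext("age"))
    conn.execute("INSERT INTO Student (name, age) VALUES (?, ?)", (name, age))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, age INTEGER)")
shred_student(conn, STUDENT_XML)

rows = conn.execute("SELECT name, age FROM Student").fetchall()
print(rows)
```

Once the document is shredded this way, only the values survive; the original markup has to be regenerated from the table if the XML is needed again.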
4
Problem with Relational DB
  • The mapping strategy is hard to define when the
    document has a very complex structure
  • When the document is semi-structured, the result
    is either a lot of tables or one table with lots
    of null columns
  • The ordering information about the elements,
    processing instructions, comments, etc. is lost
  • When retrieving documents, many multi-table
    join queries need to be performed, which makes
    the queries very slow
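
The null-column problem for semi-structured documents can be seen in a small sketch. The three sample records and the extra `email` and `phone` fields are invented for illustration; the single wide table is one of the two outcomes the slide describes.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical semi-structured records: each has a different set of fields.
DOCS = [
    "<student><name>Eric</name><age>27</age></student>",
    "<student><name>Ann</name><email>ann@example.org</email></student>",
    "<student><name>Bo</name><phone>555-0100</phone></student>",
]
COLUMNS = ["name", "age", "email", "phone"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, age TEXT, email TEXT, phone TEXT)")

for doc in DOCS:
    root = ET.fromstring(doc)
    # findtext returns None for a missing element, which is stored as NULL.
    values = [root.findtext(col) for col in COLUMNS]
    conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", values)

# Count how many cells of the wide table ended up NULL.
null_cells = sum(
    value is None
    for row in conn.execute("SELECT * FROM Student")
    for value in row
)
print(null_cells, "of", len(DOCS) * len(COLUMNS), "cells are NULL")
```

Every optional element adds a column that most rows leave empty; splitting the optional fields into their own tables avoids the nulls but brings back the multi-table joins.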

5
Native XML Database
  • Stores the XML document as it is

An XML file:
    <Play> <Scene> This is a sweet story <Pb grade="ab"/> happened long
    long <Td name="terrific"/> </Scene> </Play>
In the native database:
    <Play> <Scene> This is a sweet story <Pb grade="ab"/> happened long
    long <Td name="terrific"/> </Scene> </Play>
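
The "store it as it is" idea can be imitated in a few lines. The sketch below is only a stand-in: it uses Python's `xml.dom.minidom` rather than any of the database products discussed later, and the comment node in the sample is an addition made here to show that non-element content also survives.

```python
from xml.dom import minidom

# The slide's sample, plus an added comment node (not in the original slide).
SOURCE = (
    '<Play><Scene>This is a sweet story <Pb grade="ab"/> happened long long '
    '<Td name="terrific"/><!--editor note--></Scene></Play>'
)

# "Store" the document as a tree, then serialize it back out.
stored = minidom.parseString(SOURCE)
restored = stored.documentElement.toxml()

# Mixed content, element order, attributes and the comment are all intact.
print(restored == SOURCE)
```

A real native XML database adds persistence, indexing and querying on top, but the property being illustrated is the same: what comes out is the document that went in.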
6
Advantage of using Native XML DB
  • A document completes the round trip as an XML
    document
  • Keeps all information, such as the data contained
    inside, the ordering of elements, the processing
    instructions, comments, etc.
  • Indexes can be created on elements, attributes,
    etc. to speed up the query process
  • Most products support standard XPath queries
  • Since most native XML DBs create indexes for
    elements and attributes, the performance is
    better than just querying the files in a file
    system
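
To make the indexing point concrete, here is a minimal sketch of an attribute index kept as an in-memory dictionary. The three sample documents and the function name `build_attribute_index` are hypothetical, and real products use their own on-disk index structures; the point is only that a lookup replaces a scan of every file.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical collection of small documents, keyed by file name.
DOCS = {
    "a.xml": '<doc><chapter><content start="Basic" end="Appendix"/></chapter></doc>',
    "b.xml": '<doc><chapter><content start="Advanced"/></chapter></doc>',
    "c.xml": '<doc><chapter><content start="Basic"/></chapter></doc>',
}


def build_attribute_index(docs):
    """Map (element, attribute, value) to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for elem in ET.fromstring(text).iter():
            for attr, value in elem.attrib.items():
                index[(elem.tag, attr, value)].add(name)
    return index


index = build_attribute_index(DOCS)

# One dictionary lookup instead of parsing every file again.
hits = sorted(index[("content", "start", "Basic")])
print(hits)
```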

7
General Query Process in Native XML DB
  • Run an XPath query against an XML document or a
    set of documents.

Request: "Give me the document that has Basic as the
value of attribute start in element <content>,
which is the child element of <chapter>, which is
the child element of <doc>."

Query the db with:
    /doc/chapter/content[@start="Basic"]

Then, with the returned document or document fragment,
you can perform more processing using the API
provided, or convert the result to a string or a DOM
tree.

Sample document:
    <doc name="a.xml">
      <chapter>
        <title> Oracle </title>
        <abstract count="50"> Sth about oracle </abstract>
        <content start="Basic" end="Appendix"> Table, API, PL/SQL, .. </content>
      </chapter>
      ..
    </doc>
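
The query on this slide can be tried outside any database. The sketch below uses Python's `xml.etree.ElementTree`, which supports only a subset of XPath and expects a path relative to the root element, so the slide's absolute path is written here as `./chapter/content[@start='Basic']`. The trailing `..` placeholder from the sample document is left out so the text parses cleanly.

```python
import xml.etree.ElementTree as ET

# The sample document from the slide (without the ".." placeholder).
DOC = """<doc name="a.xml">
  <chapter>
    <title> Oracle </title>
    <abstract count="50"> Sth about oracle </abstract>
    <content start="Basic" end="Appendix"> Table, API, PL/SQL, .. </content>
  </chapter>
</doc>"""

root = ET.fromstring(DOC)  # root is the <doc> element

# Equivalent of /doc/chapter/content[@start="Basic"], relative to <doc>.
hits = root.findall("./chapter/content[@start='Basic']")

# Post-process the result: read an attribute, or convert the fragment to a string.
for content in hits:
    print(root.get("name"), content.get("end"))
fragment = ET.tostring(hits[0], encoding="unicode")
print(fragment)
```

The returned objects are element nodes, so further processing can continue through the same API, which mirrors the "returned document or document fragment" step described above.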
8
Discussion of 4 Systems
  • Oracle 9i XML DB (Commercial)
  • dbXML (Open Source)
  • eXist (Open Source)
  • Xindice

9
Storage
10
Schema/DTD support
11
Query
12
Update
13
Index
14
Basic DB Function
15
Programming API
16
Other Tools
17
Support
18
Comparison Chart
Scale: Very Good = 3, Not Bad = 2, Uncompetitive = 1
19
Oracle Vs. eXist
eXist's advantages
  • Supports DTDs, similar to Cocoon, using a catalog
    file
  • Returns results separately even when they occur
    in a single file
  • Indexes automatically, so we don't need to worry
    about indexes
  • Supports the XMLDB API, which supports SAX
  • Integrates with Cocoon
  • Free to use, easy to install and manage

eXist's disadvantages
  • Can only store XML files, not other formats
  • Only very basic db functions are provided
  • Developed by a single person, with little
    support and documentation
  • Its expandability and stability cannot be
    foreseen

Oracle's advantages
  • Can store files in other formats, as well as
    relational data
  • Can save storage if the document is schema-based
  • Specific indexes can be created as needed
  • Provides all of Oracle's db functions
  • Provides a big set of XML tools
  • Huge collection of documentation, support, and
    articles, some directly from Oracle
  • Its expandability and stability are expected to
    be good

Oracle's disadvantages
  • Doesn't support DTDs
  • Returns only a single hit even when there are
    multiple occurrences in one file
  • Queries need to be foreseen so that indexes can
    be created; a query without an index is very
    slow(???)
  • The JDBC API doesn't support as many functions
    as the XMLDB API
  • Doesn't support SAX
  • Expensive, hard to install and manage
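
The difference between "one hit per file" and "one hit per occurrence" noted above can be pictured with a small sketch. It does not reproduce how either product computes results; the two sample documents and the helper `matches` are invented purely to show the two result shapes.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection: a.xml contains the search target twice.
DOCS = {
    "a.xml": (
        "<doc><chapter><title>Oracle</title></chapter>"
        "<chapter><title>Oracle</title></chapter></doc>"
    ),
    "b.xml": "<doc><chapter><title>eXist</title></chapter></doc>",
}


def matches(text):
    """Return every <title> under /doc/chapter whose text is 'Oracle'."""
    titles = ET.fromstring(text).findall("./chapter/title")
    return [t for t in titles if t.text == "Oracle"]


# Document-level result: a file is reported once, however often it matches.
per_document = [name for name, text in DOCS.items() if matches(text)]

# Occurrence-level result: each matching node is reported separately.
per_occurrence = [
    (name, position)
    for name, text in DOCS.items()
    for position, _ in enumerate(matches(text))
]

print(per_document)
print(per_occurrence)
```

For text-analysis work the occurrence-level shape matters, because each match usually needs its own surrounding context.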
20
Conclusion
  • All 4 database systems can basically do what we
    want: store XML documents and query against
    XML data.
  • dbXML and Xindice compare worst against the
    other two, while Oracle 9i and eXist each have
    different advantages. However, not all
    circumstances can be foreseen right now, so it
    is hard to say that either Oracle or eXist is
    definitely better.
  • Oracle 9i and eXist are worth further research
    into their features, performance, etc. We can
    use them in parallel for different data
    collections.

21
Performance Comparison