Title: A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database
1. A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database
- By
- Daniela Florescu
- Donald Kossmann
2. Table of Contents
- Introduction
- Approaches to Store Semi-Structured Data
- Data Model for Semi-Structured Data
- Query Language and XML-QL
- Storing XML Data in Relational Database
- Mapping Attributes
- Mapping Values
- Evaluating the Mapping Schemes
- Conclusion
3. Introduction
- August 3, 1999
- How XML data can be stored and queried
- Presents alternative mapping schemes for storing XML data
- Performance experiments that analyze the tradeoffs of the schemes
4. Approaches to Store Semi-Structured Data
- Special Purpose Database System
- Examples are Lore, Rufus and Strudel
- Store and retrieve XML data using specially designed structures and indices
- Object Oriented Database
- Examples are O2 and ObjectStore
- Rich data modeling capabilities of OODBMS are exploited
- Standard Relational Database System
- Data is mapped into tables of a relational schema
5. Data Model for Semi-Structured Data
- Characteristics of Semi-Structured Data
- Schema is not given in advance, may be implicit
- Schema is relatively large and may be changing frequently
- Schema is descriptive rather than prescriptive
- Data is not strongly typed
- Simple graph data model similar to the OEM model
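As a rough illustration of the graph model above, an XML fragment can be viewed as a set of labeled edges between object identifiers (oids) and values. The sketch below is only one possible reading: the sample document, the `to_edges` helper, and the integer oid numbering are assumptions made for illustration, and XML attributes and mixed content are ignored.

```python
import xml.etree.ElementTree as ET
from itertools import count

def to_edges(xml_text):
    """Turn an XML document into labeled edges (source, ordinal, name, target).

    Elements with children become objects with fresh integer oids;
    leaf elements contribute their text as a value target.
    """
    oids = count(1)
    edges = []

    def visit(elem, oid):
        for ordinal, child in enumerate(elem, start=1):
            if len(child) == 0:   # leaf: the edge points to a value
                edges.append((oid, ordinal, child.tag, (child.text or "").strip()))
            else:                 # inner node: the edge points to a new object
                child_oid = next(oids)
                edges.append((oid, ordinal, child.tag, child_oid))
                visit(child, child_oid)

    root = ET.fromstring(xml_text)
    visit(root, next(oids))
    return edges

# Hypothetical sample document.
doc = "<person><name>Mary</name><address><city>Paris</city></address></person>"
edges = to_edges(doc)
```

Here the root object gets oid 1 and the nested `address` object gets oid 2, so each edge records where it starts, its position, its label, and what it points to.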
6. Data Model for Semi-Structured Data
7. Query Language and XML-QL
- All query languages for semi-structured data are based on a labeled graph model
- Features of semi-structured query languages
- Regular path expressions
- Ability to query the schema
- In addition, XML-QL provides a restructuring mechanism
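To make the idea of a regular path expression over a labeled graph more concrete, the following sketch evaluates a sequence of edge labels, where the step `"*"` stands for any chain of zero or more edges. This is not XML-QL syntax; the `eval_path` function, the `"*"` convention, and the sample edges are all assumptions chosen for illustration.

```python
def eval_path(edges, start, steps):
    """Return the targets reachable from `start` along `steps`.

    Each step is an edge label; the special step "*" matches any
    sequence of zero or more edges (a simple Kleene-star wildcard).
    """
    children = {}
    for source, name, target in edges:
        children.setdefault(source, []).append((name, target))

    def descendants(node):
        # All nodes reachable from `node`, including `node` itself.
        seen, stack = {node}, [node]
        while stack:
            cur = stack.pop()
            for _, tgt in children.get(cur, []):
                if tgt not in seen:
                    seen.add(tgt)
                    stack.append(tgt)
        return seen

    frontier = {start}
    for step in steps:
        if step == "*":
            frontier = set().union(*(descendants(n) for n in frontier))
        else:
            frontier = {t for n in frontier
                        for name, t in children.get(n, []) if name == step}
    return frontier

# Hypothetical graph: integer oids are objects, strings are values.
edges = [
    (1, "person", 2), (2, "name", "Mary"), (2, "address", 3),
    (3, "city", "Paris"), (1, "person", 4), (4, "name", "John"),
]
names = eval_path(edges, 1, ["person", "name"])
cities = eval_path(edges, 1, ["*", "city"])
```

The second call finds every `city` edge at any depth below the root, which is the kind of query that is awkward to express when the schema is not known in advance.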
8. Storing XML Data in Relational Database: Mapping Attributes
- Edge Approach
- Store all attributes in a single table
- Edge(source, ordinal, name, flag, target)
- Indexing supports forward and backward traversals
- A variant of the Edge approach stores attribute names in a separate table
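A minimal sketch of the Edge approach, using Python's built-in `sqlite3` module rather than the commercial system evaluated in the talk. The column types, the flag values (`'ref'`, `'string'`), the index names, and the sample rows are assumptions; in particular, the two indices shown are one plausible way to support the forward and backward traversals mentioned above, and values are kept directly in `target` as text for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Edge (
        source  INTEGER,   -- oid of the object the edge starts from
        ordinal INTEGER,   -- position of the edge among the source's edges
        name    TEXT,      -- attribute (edge label) name
        flag    TEXT,      -- 'ref' for an inter-object reference, else a value type
        target  TEXT       -- oid of the target object, or the value itself
    );
    CREATE INDEX edge_source ON Edge (source);            -- forward traversal
    CREATE INDEX edge_name_target ON Edge (name, target); -- backward traversal
""")
conn.executemany("INSERT INTO Edge VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "name", "string", "Mary"),
    (1, 2, "address", "ref", "2"),
    (2, 1, "city", "string", "Paris"),
])

# Forward traversal: all outgoing edges of object 1, in document order.
forward = conn.execute(
    "SELECT name, target FROM Edge WHERE source = ? ORDER BY ordinal", (1,)
).fetchall()

# Backward traversal: which objects have a city edge pointing to 'Paris'?
backward = conn.execute(
    "SELECT source FROM Edge WHERE name = ? AND target = ?", ("city", "Paris")
).fetchall()
```

Because every attribute lives in the same table, any query touches `Edge`, which keeps the schema trivial but makes the table large.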
9. Storing XML Data in Relational Database: Mapping Attributes
- Attribute Approach
- All the attributes with the same name in one table
- Resembles the binary storage scheme proposed to store semi-structured data
- Aname(source, ordinal, flag, target)
- Indexing
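The Attribute approach can be pictured as partitioning the edges by name, one table per attribute. The sketch below uses `sqlite3`; the `A_<name>` naming convention, the index on `source`, and the sample edges are assumptions, and the code presumes attribute names are safe to use as SQL identifiers.

```python
import sqlite3

edges = [  # (source, ordinal, name, flag, target), hypothetical sample data
    (1, 1, "name", "string", "Mary"),
    (1, 2, "address", "ref", "2"),
    (2, 1, "city", "string", "Paris"),
    (3, 1, "name", "string", "John"),
]

conn = sqlite3.connect(":memory:")
for source, ordinal, name, flag, target in edges:
    table = f"A_{name}"  # hypothetical naming convention
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" '
        "(source INTEGER, ordinal INTEGER, flag TEXT, target TEXT)"
    )
    conn.execute(
        f'CREATE INDEX IF NOT EXISTS "{table}_source" ON "{table}" (source)'
    )
    conn.execute(
        f'INSERT INTO "{table}" VALUES (?, ?, ?, ?)',
        (source, ordinal, flag, target),
    )

# One table now exists per distinct attribute name.
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))

# A query on one attribute only reads that attribute's table.
names = conn.execute(
    'SELECT source, target FROM "A_name" ORDER BY source'
).fetchall()
```

Compared with a single Edge table, the `name` column disappears and a query on one attribute scans only the rows for that attribute.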
10. Storing XML Data in Relational Database: Mapping Attributes
- Universal Table
- Single Universal table to store all attributes of an XML document
- Universal(source, ordinal_n1, flag_n1, target_n1, ...)
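One way to think about the Universal table is as a wide row per source object with one (ordinal, flag, target) column group per attribute name. The pure-Python sketch below builds such rows from a list of edges; the `universal_rows` helper and the sample data are assumptions, and the row construction (one row per combination of values) is an illustrative reading of how multi-valued attributes might appear in a single denormalized table.

```python
from itertools import product

edges = [  # (source, ordinal, name, flag, target), hypothetical sample data
    (1, 1, "name", "string", "Mary"),
    (1, 2, "phone", "string", "555-1234"),
    (1, 3, "phone", "string", "555-9876"),
    (2, 1, "name", "string", "John"),
]

def universal_rows(edges):
    """Build wide rows: source, then (ordinal, flag, target) per attribute name."""
    names = sorted({e[2] for e in edges})
    by_source = {}
    for source, ordinal, name, flag, target in edges:
        by_source.setdefault(source, {}).setdefault(name, []).append(
            (ordinal, flag, target))
    rows = []
    for source in sorted(by_source):
        # Absent attributes contribute one all-NULL group; multi-valued
        # attributes multiply the number of rows for this source.
        groups = [by_source[source].get(n, [(None, None, None)]) for n in names]
        for combo in product(*groups):
            rows.append((source,) + tuple(v for group in combo for v in group))
    return names, rows

names, rows = universal_rows(edges)
```

In this sample, object 1 has two `phone` values and therefore two rows that repeat its `name`, while object 2 has NULLs in the `phone` columns; both effects hint at why the wide table can become large.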
11. Storing XML Data in Relational Database: Mapping Attributes
- Normalized Universal Table
- Multi-valued attributes are stored in separate Overflow tables
- UnivNorm(source, ordinal_n1, flag_n1, target_n1, ...)
- Overflow(source, ordinal, flag, target), ...
12. Storing XML Data in Relational Database: Mapping Values
- Storing values in separate tables
- One Value table per data type, e.g. for all integers, all dates, and all strings
- Vtype(vid, value)
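The separate-value-table layout can be sketched with `sqlite3` as follows. The table names `Vint` and `Vstring`, the use of the flag to name the value table, the vid numbers, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Edge (source INTEGER, ordinal INTEGER, name TEXT,
                       flag TEXT, target INTEGER);
    CREATE TABLE Vint    (vid INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE Vstring (vid INTEGER PRIMARY KEY, value TEXT);
""")
conn.executemany("INSERT INTO Vstring VALUES (?, ?)", [(100, "Mary")])
conn.executemany("INSERT INTO Vint VALUES (?, ?)", [(200, 34)])
conn.executemany("INSERT INTO Edge VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "name", "string", 100),  # flag names the value table to look in
    (1, 2, "age", "int", 200),
])

# Fetching a value costs an extra join against the matching value table.
age = conn.execute("""
    SELECT v.value
    FROM Edge e JOIN Vint v ON e.target = v.vid
    WHERE e.source = ? AND e.name = 'age' AND e.flag = 'int'
""", (1,)).fetchone()[0]
```

The point of the sketch is the join: every access to a value goes through the `vid` indirection.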
13. Storing XML Data in Relational Database: Mapping Values
- Storing values together with attributes
- Column for each data type (inlining)
- No flag is needed
- For indexing, an index on every value column separately, in addition to source and target
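For comparison, inlining can be sketched as an attribute table with one value column per data type. The `sqlite3` example below assumes a table `A_age` with columns `val_int` and `val_string`; these names, the index names, and the sample rows are illustrative assumptions. A NULL in a value column takes over the role of the flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A_age (
        source     INTEGER,
        ordinal    INTEGER,
        val_int    INTEGER,  -- set when the value is an integer
        val_string TEXT,     -- set when the value is a string
        target     INTEGER   -- set when the edge references another object
    );
    CREATE INDEX a_age_source     ON A_age (source);
    CREATE INDEX a_age_target     ON A_age (target);
    CREATE INDEX a_age_val_int    ON A_age (val_int);
    CREATE INDEX a_age_val_string ON A_age (val_string);
""")
conn.executemany("INSERT INTO A_age VALUES (?, ?, ?, ?, ?)", [
    (1, 2, 34, None, None),
    (2, 1, None, "unknown", None),
    (3, 1, 51, None, None),
])

# A value predicate is answered from one table, with no join to a value table.
older = conn.execute(
    "SELECT source FROM A_age WHERE val_int > ? ORDER BY source", (40,)
).fetchall()
```

The trade-off visible here is extra NULL columns and more indices in exchange for avoiding the join to a separate value table.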
14. Evaluating the Mapping Schemes
- Plan of Attack
- Size of the relational database for each mapping scheme
- The time to bulkload the relational database given an XML document
- The time to reconstruct the XML document from the relational data
- The time to execute different classes of XML queries
- The time to execute different kinds of update functions
15. Evaluating the Mapping Schemes
- Experimental Platform
- Commercial relational database system, installed on a Sun SPARCstation 20 with
- Two 75 MHz processors
- 128 MB of main memory
- A disk that stores the database and intermediate results of query processing
- Machine runs Solaris 2.6, with the main memory buffer limited to 6.4 MB
- Calls to the relational database from the Java programs are implemented with JDBC
16. Evaluating the Mapping Schemes
- Benchmark Specification
- Benchmark Database
17. Evaluating the Mapping Schemes
- Benchmark Specification
- Benchmark Queries
18. Evaluating the Mapping Schemes
- Benchmark Specification
- Update Functions
19. Evaluating the Mapping Schemes
- Benchmark Specification
- Database Size
20. Evaluating the Mapping Schemes
- Benchmark Specification
- Bulkloading Times
21. Evaluating the Mapping Schemes
- Benchmark Specification
- Reconstructing the XML Document
22. Evaluating the Mapping Schemes
- Benchmark Specification
- Running Times of the Queries
23. Evaluating the Mapping Schemes
- Benchmark Specification
- Running Times of the Update Functions
24. Conclusion
- Relational databases have the following advantages
- Mature and scale very well
- Traditional and semi-structured data can co-exist in a relational database
- RDBMSs are capable of performing complex XML queries on large databases
- Disadvantages
- Very expensive to reconstruct the original XML data from the relational database
- Components such as authorization and concurrency control need to be implemented outside the RDBMS
25. Conclusion (Contd)
- Results for the alternative mapping schemes show
- Separate Attribute tables for every attribute name that occurs in an XML document, with values inlined into these Attribute tables, is the best approach