Normalization Theory for XML - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Normalization Theory for XML
  • Murali Mani
  • Oct 17, 2003

2
Why Normalization?
3
How to study normalization?
  • How do we specify FDs, MVDs?
  • What are normalization steps?
  • Different Normal Forms.

4
XML Schema Structural Spec
  • G = (N, T, P, S)
  • N = {Book, Author, Publisher, PCDATA}
  • T = {book, author, publisher, pcdata}
  • S = {Book}
  • Book → book (Author, Publisher)
  • Author → author (PCDATA)
  • Publisher → publisher (@name::String)
  • PCDATA → pcdata (ε)

Regular Tree Grammar: every production rule is of the form A → a X,
where A ∈ N, a ∈ T, and X is a regular expression over N.
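
A minimal sketch of how a document could be checked against such a grammar. It assumes a local tree grammar (each tag belongs to exactly one nonterminal) and assumes `Author+` as the occurrence marker for Book, which the transcript does not preserve. The names `PRODUCTIONS` and `conforms` and the sample documents are illustrative, not from the slides.

```python
import re
import xml.etree.ElementTree as ET

# Production rules A -> a X, stored as nonterminal -> (tag, regex over N).
# Each child nonterminal is written as "Name " so a plain regex can match
# the sequence of children. Text-only content (PCDATA) is modelled as "".
PRODUCTIONS = {
    "Book": ("book", r"(Author )+Publisher "),  # 'Author+' is an assumption
    "Author": ("author", r""),
    "Publisher": ("publisher", r""),
}
TAG_TO_NONTERMINAL = {tag: nt for nt, (tag, _) in PRODUCTIONS.items()}


def conforms(elem, nonterminal):
    """Return True if elem can be derived from the given nonterminal."""
    tag, content_model = PRODUCTIONS[nonterminal]
    if elem.tag != tag:
        return False
    child_types = []
    for child in elem:
        child_nt = TAG_TO_NONTERMINAL.get(child.tag)
        if child_nt is None:
            return False
        child_types.append(child_nt)
    child_sequence = "".join(nt + " " for nt in child_types)
    if re.fullmatch(content_model, child_sequence) is None:
        return False
    return all(conforms(c, nt) for c, nt in zip(elem, child_types))


good = ET.fromstring(
    '<book><author>A</author><author>B</author><publisher name="P"/></book>'
)
missing_author = ET.fromstring('<book><publisher name="P"/></book>')
print(conforms(good, "Book"), conforms(missing_author, "Book"))
```

A general regular tree grammar may assign several nonterminals to one tag, which needs a tree automaton rather than this one-pass lookup.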
5
XML Schema Constraint Spec
  • (Library, Person, <@name>)
  • (Library, Book, <@ISBN>)
  • (Library, Paper, <@title>)
  • (Person, Review, <@article>)
  • @article::IDREF references (Book | Paper)
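
A minimal sketch of checking one such constraint on a document. It reads (X, Y, <@k>) as "within each X element, no two Y descendants share the same @k value", which is an assumption about the intended semantics. The function name `key_holds`, the lowercase tag names, and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def key_holds(root, context_tag, target_tag, key_attrs):
    """Check (context, target, <key_attrs>) on the tree under root."""
    for context in root.iter(context_tag):
        seen = set()
        for target in context.iter(target_tag):
            if target is context:
                continue  # iter() also yields the context element itself
            key = tuple(target.get(a) for a in key_attrs)
            if key in seen:
                return False
            seen.add(key)
    return True


ok_doc = ET.fromstring(
    '<library>'
    '<person name="Ann"><review article="b1"/></person>'
    '<person name="Bob"/>'
    '<book ISBN="b1"/>'
    '</library>'
)
dup_doc = ET.fromstring(
    '<library><person name="Ann"/><person name="Ann"/></library>'
)
print(key_holds(ok_doc, "library", "person", ["name"]))
print(key_holds(dup_doc, "library", "person", ["name"]))
```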
6
Unnesting for XML Proposal 1
Person
The following FD holds for this instance:
article, rating → name
7
Unnesting for XML Proposal 2
for $p in //person
for $name in $p//@name
for $paper in $p//@PID
for $article in $p//@article
for $rating in $p//@rating
return
  <person>
    <name>{data($name)}</name>
    <PID>{data($paper)}</PID>
    <article>{data($article)}</article>
    <rating>{data($rating)}</rating>
  </person>
Person
The following FD does not hold for this instance:
article, rating → name
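
A minimal sketch of why the cross-product unnesting of Proposal 2 can break this FD. The sample document, its tag names, and the helper names are hypothetical, since the instance shown on the slides is not in the transcript. `unnest_per_review` is only a contrast that keeps each article with its own rating; it is not claimed to be Proposal 1.

```python
from itertools import product
import xml.etree.ElementTree as ET

DOC = ET.fromstring("""
<people>
  <person name="Ann" PID="p1"><review article="a1" rating="5"/></person>
  <person name="Bob" PID="p2">
    <review article="a1" rating="3"/>
    <review article="a2" rating="5"/>
  </person>
</people>
""")
COLUMNS = ("name", "PID", "article", "rating")


def attr_values(elem, attr):
    """All values of @attr on elem or its descendants (like elem//@attr)."""
    return [e.get(attr) for e in elem.iter() if e.get(attr) is not None]


def unnest_cross_product(root):
    """One row per combination of values found under each person."""
    rows = []
    for person in root.iter("person"):
        value_lists = [attr_values(person, c) for c in COLUMNS]
        rows.extend(dict(zip(COLUMNS, combo)) for combo in product(*value_lists))
    return rows


def unnest_per_review(root):
    """Contrast: one row per review, keeping article and rating paired."""
    return [
        {"name": p.get("name"), "PID": p.get("PID"),
         "article": r.get("article"), "rating": r.get("rating")}
        for p in root.iter("person") for r in p.iter("review")
    ]


def fd_holds(rows, lhs, rhs):
    """True if rows agreeing on all lhs columns also agree on rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


cross = unnest_cross_product(DOC)
paired = unnest_per_review(DOC)
print(fd_holds(cross, ["article", "rating"], ["name"]))
print(fd_holds(paired, ["article", "rating"], ["name"]))
```

In this instance the cross product pairs Bob's article a1 with his rating 5, which collides with Ann's genuine (a1, 5) review, so the FD fails only in the cross-product relation.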
8
Example
N = {Root, Library, Book, Author}
T = {root, library, book, author}
S = {Root}
Root → root (Library)
Library → library (@lname, @address, Book)
Book → book (@title, @loc, Author)
Author → author (@aname, @age)
  • Key constraints:
  • (Root, Library, <@lname>)
  • (Root, Book, <@title>)
  • (Book, Author, <@aname>)
  • FDs:
  • (Root, Book, parent::library/@lname → @loc)
  • (Root, Author, @aname → @age)
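
A minimal sketch of testing these two FDs on a document instance. It reads (X, Y, S → a) as "among all Y elements under an X element, equal values for the paths in S imply an equal value for a", and it supports only two path shapes (`@attr` and `parent::*/@attr`). The helper names, the sample data, and the assumption that a library holds several books are illustrative.

```python
import xml.etree.ElementTree as ET


def own_attr(name):
    """Path evaluator for @name on the element itself."""
    return lambda elem, parent_of: elem.get(name)


def parent_attr(name):
    """Path evaluator for parent::*/@name."""
    return lambda elem, parent_of: parent_of[elem].get(name)


def xml_fd_holds(root, target_tag, lhs_paths, rhs_path):
    """Check the FD (root, target_tag, lhs_paths -> rhs_path)."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    seen = {}
    for elem in root.iter(target_tag):
        key = tuple(path(elem, parent_of) for path in lhs_paths)
        value = rhs_path(elem, parent_of)
        if seen.setdefault(key, value) != value:
            return False
    return True


doc = ET.fromstring("""
<root>
  <library lname="Main" address="1 Elm St">
    <book title="T1" loc="L1"><author aname="A" age="30"/></book>
    <book title="T2" loc="L1"><author aname="A" age="30"/></book>
  </library>
  <library lname="East" address="2 Oak St">
    <book title="T3" loc="L2"><author aname="B" age="40"/></book>
  </library>
</root>
""")
print(xml_fd_holds(doc, "book", [parent_attr("lname")], own_attr("loc")))
print(xml_fd_holds(doc, "author", [own_attr("aname")], own_attr("age")))
```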

9
Properties of FDs
  • For any two types X, Y, the FD (X, Y, p → y) is
    true, where
  • p is a path expression producing one or more
    descendant elements of Y
  • E.g. (X, Book, author → book)
  • For any two types X, Y, the FD (X, Y, y → a) is
    true, where
  • a is an attribute or element that can occur only
    once for a given y
  • E.g. (X, Book, book → @title)
  • E.g. (X, Book, book → parent::library/@address)
  • E.g. (X, Book, book → parent::library)

10
Properties of FDs (contd)
  • To prove: for any FD (X, Y, S → a),
  • all path expressions in S end only in attributes.

11
Normalization Step 1
  • For an FD of the form (X, Y, S → a):
  • If ∃ Z such that (X, Z, S) is a key constraint,
  • move a to be a child of Z.

Example: (Root, Book, parent::library/@lname → @loc)
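
A minimal sketch of Step 1 applied to a document instance for this FD: since (Root, Library, <@lname>) is a key, Z is Library and @loc moves from each book to its parent library. The function covers only the case where Z is the parent element. Its name and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def move_attribute_to_parent(root, child_tag, attr):
    """Move @attr from every child_tag element up to its parent element."""
    for parent in root.iter():
        for child in parent.findall(child_tag):
            value = child.attrib.pop(attr, None)
            if value is None:
                continue
            # If the FD holds, all children of one parent agree on the value.
            if parent.get(attr, value) != value:
                raise ValueError(f"FD violated: conflicting @{attr} values")
            parent.set(attr, value)


doc = ET.fromstring(
    '<root>'
    '<library lname="Main">'
    '<book title="T1" loc="L1"/><book title="T2" loc="L1"/>'
    '</library>'
    '<library lname="East"><book title="T3" loc="L2"/></library>'
    '</root>'
)
move_attribute_to_parent(doc, "book", "loc")
print(ET.tostring(doc, encoding="unicode"))
```

After the move, each location is stored once per library instead of once per book.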
12
(No Transcript)
13
Normalization Step 2
  • For an FD of the form (X, Y, S → a):
  • If ∄ Z such that (X, Z, S) is a key constraint,
  • create a new type Z with (X, Z, S) as key, and
  • move a to be a child of Z.

Example: (Root, Author, @aname → @age)
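
A minimal sketch of Step 2 applied to a document instance for this FD: no existing type has @aname as a key under Root, so a new type is created, keyed on @aname, and @age moves there. The tag name `author1` and the placement of the new elements directly under the root are assumptions. The function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_new_type(root, tag, key_attr, moved_attr, new_tag):
    """Create one new_tag element per key value and move moved_attr to it."""
    values = {}
    for elem in root.iter(tag):
        value = elem.attrib.pop(moved_attr, None)
        if value is None:
            continue
        # If the FD holds, each key value maps to exactly one moved value.
        if values.setdefault(elem.get(key_attr), value) != value:
            raise ValueError(f"FD violated for @{key_attr} -> @{moved_attr}")
    for key, value in values.items():
        ET.SubElement(root, new_tag, {key_attr: key, moved_attr: value})


doc = ET.fromstring(
    '<root><library lname="Main">'
    '<book title="T1"><author aname="A" age="30"/></book>'
    '<book title="T2"><author aname="A" age="30"/>'
    '<author aname="B" age="40"/></book>'
    '</library></root>'
)
extract_new_type(doc, "author", "aname", "age", "author1")
print(ET.tostring(doc, encoding="unicode"))
```

Each author's age is now stored once, and the remaining @aname on author elements acts as a reference to the new elements.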
14
  • Key constraints:
  • (Root, Library, <@lname>)
  • (Root, Book, <@title>)
  • (Book, Author, <@aname>)
  • (Root, Author1, <@aname>)
  • Foreign key constraints:
  • (Book, Author, <@aname>) references
    (Root, Author1, <@aname>)
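
A minimal sketch of checking the foreign key constraint on a document instance, assuming it means that every @aname on an author element must also appear on some author1 element. The function name, the lowercase tag names, and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def foreign_key_holds(root, source_tag, target_tag, attr):
    """True if every source @attr value occurs on some target element."""
    targets = {e.get(attr) for e in root.iter(target_tag)}
    return all(e.get(attr) in targets for e in root.iter(source_tag))


ok_doc = ET.fromstring(
    '<root>'
    '<library lname="Main"><book title="T1"><author aname="A"/></book></library>'
    '<author1 aname="A" age="30"/>'
    '</root>'
)
dangling_doc = ET.fromstring(
    '<root>'
    '<library lname="Main"><book title="T1"><author aname="B"/></book></library>'
    '<author1 aname="A" age="30"/>'
    '</root>'
)
print(foreign_key_holds(ok_doc, "author", "author1", "aname"))
print(foreign_key_holds(dangling_doc, "author", "author1", "aname"))
```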

15
Conclusions and Future Work
  • Studied 2 ways of unnesting XML documents
  • Defined functional dependencies
  • Studied normalization Steps
  • Improving the goodness of normalized XML schemas.
  • Current restriction: path expressions cannot
    navigate IDREF/IDREFS attributes
  • Inference of FDs and MVDs