XML Data - PowerPoint PPT Presentation

Provided by: sanjay70
Learn more at: https://web.mst.edu

1
XML Data
  • <book>
  • <title> database systems</title>
  • <author> John <lastname> Korth</lastname></author>
  • <price currency="USD"> 5.87</price>
  • </book>
  • DTD
  • <!ELEMENT book (title, author, price)>
  • <!ELEMENT title (#PCDATA)>
  • <!ELEMENT author (#PCDATA | lastname)*>

2
  • <tr> <td width="20" valign="top"> Firma
    Karl-Heinz Rosowski </td>
  • <td width="20" valign="top"> Maikstraße 14 </td>
  • <td width="20" valign="top"> 22041 Hamburg </td>
  • <td width="20" valign="top"> 721 99 64 </td>
  • <td width="20" valign="top"> 21110111 </td>
    </tr>

HTML Version
  • <?xml version="1.0"?>
  • <Addresses>
  • <Address id="12359">
  • <Name>Firma Karl-Heinz Rosowski</Name>
  • <Street>Maikstraße 14</Street>
  • <ZIP>22041</ZIP>
  • <City>Hamburg</City>
  • <Tel>721 99 64</Tel>
  • <Fax>21110111</Fax> <Email/>
  • </Address>
  • </Addresses>

XML Version
3
XML - Document - Continued
  • <?xml version="1.0"?> is the XML declaration.
  • Elements: the most common form of markup, written
    <element> </element>. For example, <name>Jack Lemon</name>.
  • Attributes are name-value pairs that occur
    inside start-tags after the element name. For
    example, <Address id="12359"> attaches the value
    12359 to the attribute id of the Address element.
  • Entity references represent characters that are
    special in XML, such as <, which is written &lt;
    in an XML document.
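
As a rough illustration of these three kinds of markup, the sketch below parses a small document with Python's standard xml.etree.ElementTree module. The Note element and its text are hypothetical, added only so that the document contains an entity reference.

```python
import xml.etree.ElementTree as ET

# Sample document modeled on the Address example.
# The <Note> element is hypothetical and only demonstrates &lt;.
DOC = """<?xml version="1.0"?>
<Addresses>
  <Address id="12359">
    <Name>Firma Karl-Heinz Rosowski</Name>
    <City>Hamburg</City>
    <Note>ZIP &lt; 30000</Note>
  </Address>
</Addresses>"""

root = ET.fromstring(DOC)
address = root.find("Address")

element_text = address.find("Name").text   # content between start-tag and end-tag
attribute_value = address.get("id")        # value attached to the id attribute
entity_text = address.find("Note").text    # the parser delivers &lt; as "<"

print(element_text)
print(attribute_value)
print(entity_text)
```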

4
  • Comments: <!-- this is a comment -->
  • CDATA sections: a CDATA (character data) section
    instructs the parser to ignore most markup
    characters. For example, in the source code
    <![CDATA[ *p = &q; b = (i <= 3); ]]>, everything
    between <![CDATA[ and ]]> is passed to the
    application as character data, without
    interpretation.
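
A small sketch of the CDATA behavior described above, using Python's standard xml.etree.ElementTree module. The code element name and the snippet inside it are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The same source-code snippet, once wrapped in a CDATA section and once bare.
WITH_CDATA = "<code><![CDATA[ if (i < 3 && j > 0) { run(); } ]]></code>"
WITHOUT_CDATA = "<code> if (i < 3 && j > 0) { run(); } </code>"

def parses(document):
    """Return True if the document is well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

# Inside CDATA, "<" and "&" reach the application as plain characters.
cdata_text = ET.fromstring(WITH_CDATA).text

print(cdata_text)
print(parses(WITHOUT_CDATA))  # the bare "<" and "&" are treated as markup
```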

5
XML - DTD - Element Type Declarations
  • Element type declarations identify the names of
    elements and the nature of their content. A
    typical element type declaration looks like this:
  • <!ELEMENT Address (Name, Street, ZIP?, City,
    Tel, Fax*, Email?)>
  • Address is the element name, and (Name, Street,
    ZIP?, City, Tel, Fax*, Email?) is the content
    model. Every address must contain Name, Street,
    City, and Tel. ZIP and Email are optional (?), whereas
    there can be zero or more Fax numbers (*).
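
One way to see what the content model means is to treat it as a regular expression over the sequence of child element names. The sketch below hand-translates this single content model. It is not a DTD validator, and the helper name matches_address_model is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# (Name, Street, ZIP?, City, Tel, Fax*, Email?) rewritten as a regular
# expression over "Tag," tokens, one token per child element.
ADDRESS_MODEL = re.compile(r"Name,Street,(ZIP,)?City,Tel,(Fax,)*(Email,)?")

def matches_address_model(address):
    """Check the order and count of an Address element's children."""
    child_tags = "".join(child.tag + "," for child in address)
    return ADDRESS_MODEL.fullmatch(child_tags) is not None

# No ZIP or Email (both optional) and two Fax numbers: allowed.
valid = ET.fromstring(
    "<Address><Name>A</Name><Street>B</Street><City>C</City>"
    "<Tel>1</Tel><Fax>2</Fax><Fax>3</Fax></Address>"
)
# Tel is missing: not allowed.
invalid = ET.fromstring(
    "<Address><Name>A</Name><Street>B</Street><ZIP>9</ZIP>"
    "<City>C</City></Address>"
)

print(matches_address_model(valid))
print(matches_address_model(invalid))
```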

6
  • The declarations for Name, Street, ZIP, and the
    other child elements must also be given. For example:
  • <!ELEMENT Name (#PCDATA)>
  • Attribute list declarations identify which
    elements may have attributes, what values the
    attributes may hold, and what the default value is.
    Attribute values appear only within start-tags
    and empty-element tags.
  • <Address id="12359">

7
XML - Summary
  • HTML describes presentation
  • XML describes content
  • XML vs. HTML
  • users define new tags
  • arbitrary nesting
  • validation is possible

8
XML and the Semistructured Data Model
  • XML data is fundamentally different from
    relational and object-oriented data.
  • XML is not rigidly structured.
  • In the relational and OO data models, every data
    instance has a schema which is separate from and
    independent of the data.
  • XML data is self-describing and can naturally
    model irregularities that cannot be modeled by
    the relational or OO data models.

9
  • For example, data items may have missing elements
    or multiple occurrences of the same element;
    elements may have atomic values in some data
    items and structured values in others; and
    collections of elements can have heterogeneous
    structure.
  • Even XML data that has an associated DTD is
    self-describing (the schema is always stored
    with the data) and, except for very restricted
    forms of DTDs, may have all the irregularities
    described above.
  • XML is an instance of semistructured data.
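
The irregularities listed above can be made concrete with a short sketch using Python's standard xml.etree.ElementTree module. The sample data and the helper author_names are hypothetical: one book has an atomic author value, one has two structured authors, and one has no author at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical data showing missing, repeated, atomic, and structured elements.
DOC = """<bib>
  <book><title>A</title><author>Date</author></book>
  <book><title>B</title>
    <author><lastname>Date</lastname></author>
    <author><lastname>Darwen</lastname></author>
  </book>
  <book><title>C</title></book>
</bib>"""

def author_names(book):
    """Collect author names whether the value is atomic or structured."""
    names = []
    for author in book.findall("author"):
        lastname = author.find("lastname")
        # Structured value if a <lastname> child exists, atomic text otherwise.
        value = lastname.text if lastname is not None else author.text
        names.append((value or "").strip())
    return names

authors_by_title = {
    book.findtext("title"): author_names(book) for book in ET.fromstring(DOC)
}
print(authors_by_title)
```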

10
XML-QL
  • regular path expressions
  • pattern matching
  • uses edge-labeled graphs
  • extract data from existing XML documents and
    construct new XML documents
  • support for ordered and unordered views on XML
    document
  • simple and declarative

11
XML-QL
  • The simplest XML-QL queries extract data from an
    XML document. Consider the following DTD:
  • <!ELEMENT book (author+, title, publisher)>
  • <!ATTLIST book year CDATA>
  • <!ELEMENT article (author+, title, year?,
    (shortversion | longversion))>
  • <!ATTLIST article type CDATA>
  • <!ELEMENT publisher (name, address)>
  • <!ELEMENT author (firstname?, lastname)>

12
XML-QL Example Data
<bib>
  <book year="1995">
    <title> An Introduction to DB Systems </title>
    <author> <lastname> Date </lastname></author>
    <publisher> <name> Addison-Wesley</name> </publisher>
  </book>
  <book year="1995">
    <title> Foundations for OR Databases </title>
    <author> <lastname> Date </lastname></author>
    <author> <lastname> Darwen </lastname></author>
    <publisher><name> Addison-Wesley</name> </publisher>
  </book>
</bib>
13
Matching Data Using Patterns
  • XML-QL uses element patterns to match data in an XML
    document.
  • Find all authors of books whose publisher is
    Addison-Wesley in the XML document www.a.b.c/bib.xml:
  • WHERE <book>
  • <publisher><name>Addison-Wesley</name></publisher>
  • <title> $t </title>
  • <author> $a </author>
  • </book> IN "www.a.b.c/bib.xml"
  • CONSTRUCT $a
  • The query matches every <book> element in the XML document
    that has at least one <title> element, one
    <author> element, and one <publisher> element
    whose <name> is Addison-Wesley. For each such
    match it binds $t and $a to every title and
    author pair.
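
For readers without an XML-QL engine, the pattern above can be approximated in Python with the standard xml.etree.ElementTree module. This is a sketch of the matching semantics under stated assumptions, not an XML-QL implementation. The data is inlined rather than fetched from www.a.b.c/bib.xml, and the function name authors_by_publisher is hypothetical.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book year="1995">
    <title>An Introduction to DB Systems</title>
    <author><lastname>Date</lastname></author>
    <publisher><name>Addison-Wesley</name></publisher>
  </book>
  <book year="1995">
    <title>Foundations for OR Databases</title>
    <author><lastname>Date</lastname></author>
    <author><lastname>Darwen</lastname></author>
    <publisher><name>Addison-Wesley</name></publisher>
  </book>
</bib>"""

def authors_by_publisher(bib, publisher_name):
    """Return one <author> element per (title, author) binding."""
    matches = []
    for book in bib.iter("book"):
        names = [(n.text or "").strip() for n in book.findall("publisher/name")]
        if publisher_name not in names:
            continue
        # Every title/author pair in a matching book is one binding.
        for _title in book.findall("title"):
            for author in book.findall("author"):
                matches.append(author)
    return matches

authors = authors_by_publisher(ET.fromstring(BIB), "Addison-Wesley")
lastnames = [a.findtext("lastname") for a in authors]
print(lastnames)
```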

14
XML-QL Constructing XML Data
  • Often we would like to format the result.
  • Find all authors and titles of books whose
    publisher is Addison-Wesley in the XML document
    www.a.b.c/bib.xml:
  • WHERE <book>
  • <publisher><name>Addison-Wesley</></>
  • <title> $t </title>
  • <author> $a </author>
  • </book> IN "www.a.b.c/bib.xml"
  • CONSTRUCT <result>
  • <author> $a </>
  • <title> $t </>
  • </>

15
Constructing XML Data -cont.
Result of the query:
<result>
  <author><lastname> Date </lastname></author>
  <title> An Introduction to DB Systems </title>
</result>
<result>
  <author><lastname> Date </lastname></author>
  <title> Foundations for OR Databases </title>
</result>
<result>
  <author><lastname> Darwen </lastname></author>
  <title> Foundations for OR Databases </title>
</result>
One result for each author, duplicating title information.
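
The CONSTRUCT step can be mimicked in Python by building new elements for each binding. The sketch below uses the standard xml.etree.ElementTree module. The wrapper element results and the function name construct_results are assumptions added so that the output has a single root.

```python
import copy
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book year="1995">
    <title>An Introduction to DB Systems</title>
    <author><lastname>Date</lastname></author>
    <publisher><name>Addison-Wesley</name></publisher>
  </book>
  <book year="1995">
    <title>Foundations for OR Databases</title>
    <author><lastname>Date</lastname></author>
    <author><lastname>Darwen</lastname></author>
    <publisher><name>Addison-Wesley</name></publisher>
  </book>
</bib>"""

def construct_results(bib, publisher_name):
    """Build one <result> with an author and a title per binding."""
    results = ET.Element("results")  # hypothetical wrapper element
    for book in bib.iter("book"):
        names = [(n.text or "").strip() for n in book.findall("publisher/name")]
        if publisher_name not in names:
            continue
        for title in book.findall("title"):
            for author in book.findall("author"):
                result = ET.SubElement(results, "result")
                # Copy the matched elements so the source tree stays untouched.
                result.append(copy.deepcopy(author))
                result.append(copy.deepcopy(title))
    return results

results = construct_results(ET.fromstring(BIB), "Addison-Wesley")
pairs = [(r.findtext("author/lastname"), r.findtext("title")) for r in results]
print(pairs)
```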
16
XML-QL Nested Queries
WHERE <book>
        <title> $t </>
        <publisher><name>Addison-Wesley</></>
      </> CONTENT_AS $p IN "www.a.b.c/bib.xml"
CONSTRUCT <result>
            <title> $t </>
            WHERE <author> $a </> IN $p
            CONSTRUCT <author> $a </>
          </>
Result:
<result>
  <title> An Introduction to DB Systems </title>
  <author><lastname> Date </lastname></author>
</result>
<result>
  <title> Foundations for OR Databases </title>
  <author><lastname> Date </lastname></author>
  <author><lastname> Darwen </lastname></author>
</result>
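
The nested query groups all authors under their title instead of repeating the title per author. A Python approximation using the standard xml.etree.ElementTree module might look like the following. The results wrapper and the function name group_authors_by_title are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book year="1995">
    <title>An Introduction to DB Systems</title>
    <author><lastname>Date</lastname></author>
    <publisher><name>Addison-Wesley</name></publisher>
  </book>
  <book year="1995">
    <title>Foundations for OR Databases</title>
    <author><lastname>Date</lastname></author>
    <author><lastname>Darwen</lastname></author>
    <publisher><name>Addison-Wesley</name></publisher>
  </book>
</bib>"""

def group_authors_by_title(bib, publisher_name):
    """Build one <result> per title, holding the title and all its authors."""
    results = ET.Element("results")  # hypothetical wrapper element
    for book in bib.iter("book"):
        names = [(n.text or "").strip() for n in book.findall("publisher/name")]
        if publisher_name not in names:
            continue
        for title in book.findall("title"):
            result = ET.SubElement(results, "result")
            result.append(copy.deepcopy(title))
            # Inner query: one <author> for each author in this book's content.
            for author in book.findall("author"):
                result.append(copy.deepcopy(author))
    return results

grouped = group_authors_by_title(ET.fromstring(BIB), "Addison-Wesley")
summary = [
    (r.findtext("title"), [a.findtext("lastname") for a in r.findall("author")])
    for r in grouped
]
print(summary)
```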
17
XML-QL Join Queries
XML-QL queries can express joins by matching two
or more elements that contain the same value. Find
all articles that have at least one author who
has written a book since 1995.
WHERE <article>
        <author>
          <firstname> $f </>   // firstname $f
          <lastname> $l </>    // lastname $l
        </>
      </> CONTENT_AS $a IN "www.a.b.c/bib.xml",
      <book year=$y>
        <author>
          <firstname> $f </>   // join on same firstname $f
          <lastname> $l </>    // join on same lastname $l
        </>
      </> IN "www.a.b.c/bib.xml",
      $y > 1995
CONSTRUCT <article> $a </>
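
The join can be sketched in Python as a lookup on (firstname, lastname) pairs, using the standard xml.etree.ElementTree module. All sample data, the helper name_key, and the function articles_by_book_authors are hypothetical. Unlike XML-QL, which produces one output per matching binding, this sketch returns each matching article once.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: only Ann Lee wrote a book with year > 1995.
BIB = """<bib>
  <book year="1998"><title>Book One</title>
    <author><firstname>Ann</firstname><lastname>Lee</lastname></author>
  </book>
  <book year="1990"><title>Book Two</title>
    <author><firstname>Bob</firstname><lastname>Ray</lastname></author>
  </book>
  <article><title>Article A</title>
    <author><firstname>Ann</firstname><lastname>Lee</lastname></author>
  </article>
  <article><title>Article B</title>
    <author><firstname>Bob</firstname><lastname>Ray</lastname></author>
  </article>
</bib>"""

def name_key(author):
    """Return (firstname, lastname), or None if either element is missing."""
    first = author.findtext("firstname")
    last = author.findtext("lastname")
    if first is None or last is None:
        return None
    return (first.strip(), last.strip())

def articles_by_book_authors(bib, min_year):
    """Articles with at least one author who wrote a book with year > min_year."""
    book_authors = set()
    for book in bib.iter("book"):
        year = book.get("year")
        if year is None or int(year) <= min_year:
            continue
        for author in book.findall("author"):
            key = name_key(author)
            if key is not None:
                book_authors.add(key)
    matches = []
    for article in bib.iter("article"):
        keys = {name_key(a) for a in article.findall("author")}
        if keys & book_authors:  # join on same firstname and lastname
            matches.append(article)
    return matches

joined = articles_by_book_authors(ET.fromstring(BIB), 1995)
joined_titles = [a.findtext("title") for a in joined]
print(joined_titles)
```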
18
XML-QL Data Model for XML
  • An XML graph G is a graph in which each node is
    represented by a unique string called an object
    identifier (OID); G's edges are labeled with
    element tags; G's nodes are labeled with sets of
    attribute-value pairs; G's leaves are labeled with
    one string value; and G has a distinguished node
    called the root.

19
XML-QL Data Model for XML
  • The model allows several edges between the same
    two nodes, with the following restrictions:
  • between any two nodes there can be at most one
    edge with a given label
  • a node cannot have two leaf children with the
    same label and the same string value
  • XML graphs are not only derived from XML
    documents, but are also generated by queries.
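
A minimal sketch of this graph model in Python, built from a parsed document with the standard xml.etree.ElementTree module. It is a simplification: the document element itself serves as the root node, OIDs are generated strings such as o0 and o1, and the function name build_xml_graph is hypothetical.

```python
import itertools
import xml.etree.ElementTree as ET

def build_xml_graph(root):
    """Return (root_oid, nodes, edges, leaf_values) for a parsed document."""
    counter = itertools.count()
    nodes = {}        # OID -> attribute-value pairs labeling the node
    edges = []        # (parent OID, element tag, child OID)
    leaf_values = {}  # OID -> string value, for elements without children

    def visit(element):
        oid = f"o{next(counter)}"
        nodes[oid] = dict(element.attrib)
        children = list(element)
        if not children:
            leaf_values[oid] = (element.text or "").strip()
        for child in children:
            child_oid = visit(child)
            edges.append((oid, child.tag, child_oid))  # edge labeled with the tag
        return oid

    root_oid = visit(root)
    return root_oid, nodes, edges, leaf_values

doc = ET.fromstring('<bib><book year="1995"><title>T</title></book></bib>')
root_oid, nodes, edges, leaf_values = build_xml_graph(doc)
print(root_oid, nodes, edges, leaf_values)
```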

20
XML- Element Identity, Ids, and IDREFS
  • To support element sharing, XML reserves the
    attribute type ID, which allows a unique key to be
    associated with an element.
  • An attribute of type IDREF allows an element to
    refer to another element with the designated key,
    and one of type IDREFS may refer to multiple
    elements.

21
  • <!ATTLIST person ID ID #REQUIRED>
  • <!ATTLIST article author IDREFS #IMPLIED>
  • <person ID="o123">
  • <firstname>John</firstname>
  • <lastname>Smith</lastname>
  • </person>
  • <person ID="o234">
  • . . .
  • </person>
  • <article author="o123 o234">
  • <title> ... </title>
  • <year> 1995 </year>
  • </article>

22
XML- Element Identity, Ids, and IDREFS
23
The following query produces all lastname, title
pairs by joining the article element's author
attribute (of type IDREFS) with the person
element's ID attribute value.
WHERE <article author=$i>
        <title> </> ELEMENT_AS $t
      </>,
      <person ID=$i>
        <lastname> </> ELEMENT_AS $l
      </>
CONSTRUCT <result> $t $l </>
The idiom <title></> ELEMENT_AS $t binds $t to a <title>
element with arbitrary contents. The element
expression <title/> matches a <title> element
with empty contents.
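
The ID/IDREFS join can be approximated in Python by indexing person elements by their ID and splitting the article's author attribute on whitespace. This sketch uses the standard xml.etree.ElementTree module. The db wrapper, the second person's name, the article title, and the function name lastname_title_pairs are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: person o234 and the article title are made up.
DOC = """<db>
  <person ID="o123"><firstname>John</firstname><lastname>Smith</lastname></person>
  <person ID="o234"><firstname>Jane</firstname><lastname>Doe</lastname></person>
  <article author="o123 o234"><title>Sample Title</title><year>1995</year></article>
</db>"""

def lastname_title_pairs(db):
    """Join each article's IDREFS attribute with the persons' ID attributes."""
    persons = {p.get("ID"): p for p in db.iter("person")}
    pairs = []
    for article in db.iter("article"):
        # An IDREFS value is a whitespace-separated list of IDs.
        for ref in article.get("author", "").split():
            person = persons.get(ref)
            if person is None:
                continue
            for title in article.findall("title"):
                for lastname in person.findall("lastname"):
                    pairs.append((lastname.text, title.text))
    return pairs

id_pairs = lastname_title_pairs(ET.fromstring(DOC))
print(id_pairs)
```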
24
XML-QL- Advanced Examples
  • Tag variables
  • Regular path expressions
  • Transforming XML data (from one DTD to another)
  • Integrating data from different XML sources
  • Embedding queries in data
  • For XML-QL, see http://www.w3.org/TR/NOTE-xml-ql