Querying XML Data

1
Querying XML Data
  • By Willie Milnor
  • LSDIS Lab
  • Computer Science Department
  • University of Georgia

2
Data Storage
  • Relational databases
  • relations/tables of records
  • attributes
  • records are connected via foreign keys
  • XML
  • hierarchical structure of elements
  • nested attributes or sub elements
  • elements connected via references to element Ids
    (IDREFs)

3
Simple Example
  • Relational database
  • author
  • au1 | Matthew Gambardella | Atlanta, GA | 770-350-9955
  • book
  • bk15 | XML Developer's Guide | Computer | 44.95 | 2000-10-01 | au1
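
The two relations above can be sketched with Python's built-in sqlite3 module. The slide gives no column names, so the ones used here (id, name, address, phone, title, genre, price, publish_date, author_id) are assumptions; the row values are the slide's.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE author (id TEXT PRIMARY KEY, name TEXT, address TEXT, phone TEXT)"
)
conn.execute(
    """CREATE TABLE book (
           id TEXT PRIMARY KEY, title TEXT, genre TEXT, price REAL,
           publish_date TEXT, author_id TEXT REFERENCES author(id))"""
)
conn.execute(
    "INSERT INTO author VALUES (?, ?, ?, ?)",
    ("au1", "Matthew Gambardella", "Atlanta, GA", "770-350-9955"),
)
conn.execute(
    "INSERT INTO book VALUES (?, ?, ?, ?, ?, ?)",
    ("bk15", "XML Developer's Guide", "Computer", 44.95, "2000-10-01", "au1"),
)

# The foreign key book.author_id connects each book record to an author record.
rows = conn.execute(
    "SELECT book.title, author.name "
    "FROM book JOIN author ON book.author_id = author.id"
).fetchall()
print(rows)
```

The join is what a relational query uses in place of the element references that XML relies on.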

4
Simple Example
  • XML
    <author id="au1">
      <name>Matthew Gambardella</name>
      <address>Atlanta, GA</address>
      <phone>770-350-9655</phone>
    </author>
    <book id="bk15" author="au1">
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
    </book>
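
A minimal sketch of how the book's author reference can be followed, using Python's standard xml.etree.ElementTree. The `<catalog>` wrapper element is an assumption added so that the two elements form one well-formed document. ElementTree does not know which attributes are IDs or IDREFs, so the sketch builds its own id index.

```python
import xml.etree.ElementTree as ET

# The slide's two elements, wrapped in a hypothetical <catalog> root.
DOC = """
<catalog>
  <author id="au1">
    <name>Matthew Gambardella</name>
    <address>Atlanta, GA</address>
    <phone>770-350-9655</phone>
  </author>
  <book id="bk15" author="au1">
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
</catalog>
"""

root = ET.fromstring(DOC)

# Index every element that carries an id attribute.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

# Follow each book's author reference, much like following a foreign key.
pairs = [
    (book.findtext("title"), by_id[book.get("author")].findtext("name"))
    for book in root.findall("book")
]
print(pairs)
```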

5
Schemas
  • Structure of data represented in schemas
  • Relational databases use a DDL
  • XML
  • DTD
  • XML Schema
  • Specific schema languages for XML-like languages
  • e.g. RDF/S for RDF
  • no schema
  • self-describing data

6
Why Use Semi-Structured Data
  • Raw data is often semi-structured
  • script generated
  • HTML has inherent semi-structure
  • Less rigid structure than traditional databases
  • may not have predefined schema
  • convenient

7
Why Use Semi-Structured Data
  • Integration of heterogeneous data sets
  • schemas do not enforce typing
  • single-/multi-valued attributes
  • Standardize information exchange
  • XML is the de facto format
  • RDF in Semantic Web
  • Ability to browse/query without full knowledge of
    the schema

8
Why Use XML
  • XML (Extensible Markup Language) has emerged as a
    standard for exchanging data on the Web
  • XML can be used on a wide variety of platforms
    and interpreted with a wide variety of tools
  • many available parsers
  • Enables separation of content (XML) and
    presentation
  • XSL (Extensible Stylesheet Language)

9
Challenges Posed by XML
  • extract data from large XML documents
  • exchange (export) XML data
  • exchange XML among different, related ontologies
  • integrate XML from multiple sources
  • answer:
  • query languages for XML

10
Querying XML Data
  • Traditional databases
  • store in a native database
  • use a traditional database query language
  • e.g. DB4XML
  • requires a schema representation (rigid)
  • Develop a language for XML
  • some inspired by traditional DQLs
  • incorporate relational algebra
  • others inspired by XML

11
Examples of XML Query Languages
  • Lorel (Stanford)
  • originally for semi-structured data (Lore)
  • extended for XML
  • XML-QL (AT&T Labs)
  • extends SQL
  • XQL (Texcel, webMethods, Microsoft)
  • for selecting/filtering elements and text of XML
    documents
  • natural extension of XSL pattern syntax

12
Features of DQLs in XQLs
  • Specific data model
  • Lorel: an XML element is a pair <eid, value>
  • <Person id="P1" name="John Smith" colleague="P2">
  • XML-QL: document modeled by an XML graph
  • node refers to an object identifier
  • edges labeled with element tag identifiers
  • each graph has a root
  • XQL: assumes the implied XML data model
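
The graph view described for XML-QL (nodes as object identifiers, edges labeled with element tags, one root) can be illustrated with a short Python sketch. The integer object identifiers, the synthetic root node 0, and the function name `to_graph` are assumptions made for this illustration; attributes and text content are left out to keep it short.

```python
import itertools
import xml.etree.ElementTree as ET

DOC = (
    "<author id='au1'><name>Matthew Gambardella</name>"
    "<phone>770-350-9655</phone></author>"
)

def to_graph(root_element):
    """Return (root_oid, edges); each edge is (parent_oid, tag, child_oid)."""
    counter = itertools.count()
    root_oid = next(counter)  # synthetic root node of the graph
    edges = []

    def visit(parent_oid, element):
        oid = next(counter)  # fresh object identifier for this element
        edges.append((parent_oid, element.tag, oid))
        for child in element:
            visit(oid, child)

    visit(root_oid, root_element)
    return root_oid, edges

root_oid, edges = to_graph(ET.fromstring(DOC))
print(root_oid, edges)
```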

13
Features of DQLs in XQLs
  • Basic query abstractions
  • select supported by all
  • joins
  • Lorel and XML-QL support joins over the same
    document or several documents
  • XQL allows joins only among data in the same
    document
  • semantics
  • Lorel returns pointers to objects in original
    document
  • XML-QL and XQL return new documents that can be
    queried independently
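
The difference between the two result semantics can be sketched in Python with xml.etree.ElementTree. This is an analogy and not code in any of the three languages: a list of Element objects stands in for pointers into the original document, and a deep copy stands in for a newly constructed result document. The sample document is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<catalog><book id='bk15'><price>44.95</price></book></catalog>"
)

# Style 1: the result is a list of references into the original tree.
pointers = root.findall("book")

# Style 2: the result is a new, independent document built from copies.
result_doc = ET.Element("result")
result_doc.extend([copy.deepcopy(book) for book in root.findall("book")])

# A later change to the original shows through the references only.
root.find("book/price").text = "39.95"
via_pointer = pointers[0].findtext("price")
via_copy = result_doc.findtext("book/price")
print(via_pointer, via_copy)
```

Because the second result is a separate tree, it can be queried again without touching the source document.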

14
Features of DQLs in XQLs
  • Path expressions
  • a form of navigational queries
  • partially specified expressions
  • supported by all
  • XQL: path like a file path
  • partially specified expressions matched against cyclic data
  • supported by Lorel and XML-QL
  • undefined in XQL
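
Fully and partially specified path expressions can be tried out with the XPath subset in Python's xml.etree.ElementTree. That subset is not the syntax of XQL, Lorel, or XML-QL; it only shows the idea of naming some steps of a path and leaving others open. The document, including the `<publisher>` element and its name, is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<catalog>"
    "<author id='au1'><name>Matthew Gambardella</name></author>"
    "<book id='bk15'><title>XML Developer's Guide</title>"
    "<publisher><name>Example Press</name></publisher></book>"
    "</catalog>"
)

# Fully specified: every step from the root is named, like a file path.
full = [e.text for e in root.findall("./author/name")]

# Partially specified: any <name> element at any depth below the root.
partial = [e.text for e in root.findall(".//name")]

# Wildcard step: a <title> under any child of the root.
wildcard = [e.text for e in root.findall("./*/title")]

print(full, partial, wildcard)
```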

15
Features of DQLs in XQLs
  • Quantification, negation and reduction
  • existential quantification
  • supported by all
  • universal quantification
  • not supported by XML-QL
  • negation
  • not supported by XML-QL
  • reduction
  • supported by none
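
Existential quantification, universal quantification, and negation can be illustrated over a small tree with Python's `any`, `all`, and `not`. The data and the "cheaper than 10" condition are hypothetical and chosen only to make the three results differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: three authors with zero or more nested books.
root = ET.fromstring(
    "<catalog>"
    "<author id='au1'><book><price>44.95</price></book>"
    "<book><price>5.95</price></book></author>"
    "<author id='au2'><book><price>4.95</price></book></author>"
    "<author id='au3'/>"
    "</catalog>"
)

def cheap_flags(author):
    """One boolean per book of this author: is the price under 10?"""
    return [float(p.text) < 10 for p in author.findall("book/price")]

# Existential: authors with at least one cheap book.
some_cheap = [a.get("id") for a in root if any(cheap_flags(a))]

# Universal: authors all of whose books are cheap (true for an author
# with no books at all).
all_cheap = [a.get("id") for a in root if all(cheap_flags(a))]

# Negation: authors with no cheap book.
none_cheap = [a.get("id") for a in root if not any(cheap_flags(a))]

print(some_cheap, all_cheap, none_cheap)
```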

16
Features of DQLs in XQLs
  • Restructuring abstractions
  • new elements
  • Lorel and XML-QL have functions for construction
  • XQL provides no construct mechanism
  • grouping
  • not in XQL
  • Skolem function for OIDs
  • could be used for grouping applied to attributes
  • not supported by XQL
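
A rough Python sketch of restructuring with grouping. The function `skolem` mimics a Skolem function in the sense that the same argument always produces the same object identifier, so books sharing a genre land in the same constructed element. The element names `result` and `group`, the identifier format, and the sample books are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: three books in two genres.
root = ET.fromstring(
    "<catalog>"
    "<book><title>Book A</title><genre>Computer</genre></book>"
    "<book><title>Book B</title><genre>Fantasy</genre></book>"
    "<book><title>Book C</title><genre>Computer</genre></book>"
    "</catalog>"
)

def skolem(genre):
    """Same argument, same identifier: one group element per genre."""
    return f"genre({genre})"

result = ET.Element("result")
groups = {}  # identifier -> constructed <group> element
for book in root.findall("book"):
    oid = skolem(book.findtext("genre"))
    if oid not in groups:
        groups[oid] = ET.SubElement(result, "group", id=oid)
    ET.SubElement(groups[oid], "title").text = book.findtext("title")

print(ET.tostring(result, encoding="unicode"))
```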

17
Features of DQLs in XQLs
  • Aggregation, nesting and binary queries
  • aggregates
  • supported by all
  • nested queries
  • XML-QL and Lorel allow nested queries at any
    level
  • unsupported in XQL
  • binary queries
  • XML-QL does not support difference
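
The three kinds of query on this slide can be sketched in Python over a hypothetical catalog: an aggregate (count, sum, average), a nested query whose condition depends on an inner aggregate, and a binary query (set difference).

```python
import xml.etree.ElementTree as ET

# Hypothetical data.
root = ET.fromstring(
    "<catalog>"
    "<book id='b1'><genre>Computer</genre><price>44.95</price></book>"
    "<book id='b2'><genre>Fantasy</genre><price>5.95</price></book>"
    "<book id='b3'><genre>Computer</genre><price>9.95</price></book>"
    "</catalog>"
)
books = root.findall("book")

def price(book):
    return float(book.findtext("price"))

# Aggregates over all books.
count = len(books)
total = sum(price(b) for b in books)
average = total / count

# Nested query: the outer selection uses the inner aggregate.
above_average = [b.get("id") for b in books if price(b) > average]

# Binary query: Computer books minus the above-average books.
computer = {b.get("id") for b in books if b.findtext("genre") == "Computer"}
difference = sorted(computer - set(above_average))

print(count, above_average, difference)
```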

18
Features of DQLs in XQLs
  • Order management
  • ordered results
  • XML-QL and Lorel order by given attribute
  • preserving order
  • (naturally) supported by all
  • instance order querying
  • only in Lorel and XML-QL
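
The three order-related features can be illustrated in Python: sorting a result by a value, keeping document order, and selecting by position. The data is hypothetical, and the `book[2]` predicate is ElementTree's XPath subset, not the syntax of any of the three languages.

```python
import xml.etree.ElementTree as ET

# Hypothetical data.
root = ET.fromstring(
    "<catalog>"
    "<book id='b1'><price>44.95</price></book>"
    "<book id='b2'><price>5.95</price></book>"
    "<book id='b3'><price>9.95</price></book>"
    "</catalog>"
)
books = root.findall("book")

# Preserving order: findall returns elements in document order.
document_order = [b.get("id") for b in books]

# Ordered results: sort by the numeric value of <price>.
by_price = [
    b.get("id") for b in sorted(books, key=lambda b: float(b.findtext("price")))
]

# Querying by position: the second <book> child (positions start at 1).
second = root.find("book[2]").get("id")

print(document_order, by_price, second)
```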

19
Features of DQLs in XQLs
  • Typing
  • type coercion
  • partially supported in XQL
  • fully supported in Lorel
  • Updates to data
  • insert, delete, update
  • only in Lorel
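
A small Python sketch of why coercion matters and what the three update operations look like on an in-memory tree. Element text is always a string in ElementTree, so the comparison has to be coerced by hand; a language with type coercion would do this implicitly. The document is hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<catalog><book id='b1'><price>44.95</price></book></catalog>"
)
book = root.find("book")

# Without coercion the comparison is lexicographic: "44.95" sorts before "5".
as_text = book.findtext("price") < "5"
# With coercion to a number the comparison means what the query intends.
as_number = float(book.findtext("price")) < 5

# Insert, update, and delete on the tree.
genre = ET.SubElement(book, "genre")       # insert a new sub-element
genre.text = "Computer"
book.find("price").text = "39.95"          # update an existing value
book.remove(genre)                         # delete the inserted element

print(as_text, as_number, ET.tostring(root, encoding="unicode"))
```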

20
Features of XQLs
  • XML integration
  • XML/RDF schema support
  • not supported
  • cross-document linking
  • not supported
  • support of tag variables
  • in XML-QL variables can be associated with tags
  • in Lorel
  • obtain all paths reachable from a path expression
  • used for building the names of the query result
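
The idea of a tag variable can be sketched in Python: instead of fixing the element name in the query, a variable ranges over the names, and a bound name can then be reused to name an element of the result. This is an illustration with ElementTree, not XML-QL or Lorel syntax; the `result` element name is an assumption.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book id='bk15'><title>XML Developer's Guide</title>"
    "<genre>Computer</genre><price>44.95</price></book>"
)

# child.tag plays the role of the tag variable: which sub-elements of
# <book> have the content "Computer"?
matching_tags = [child.tag for child in book if child.text == "Computer"]

# A bound tag is then used to build the name of a result element.
result = ET.Element("result")
for tag in matching_tags:
    ET.SubElement(result, tag).text = "Computer"

print(matching_tags, ET.tostring(result, encoding="unicode"))
```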

21
Desired Qualities
  • Declarative: content of the result defined by the
    query
  • Lorel resembles a calculus-based language
  • XML-QL represents a logic language
  • XQL uses URL-like patterns
  • Expressive Power
  • Lorel
  • XML-QL
  • XQL

22
Direction of XML
  • Semantic Web
  • common framework
  • allows data to be shared/reused over application,
    enterprise, and community boundaries
  • provides ability to give meaning to data
  • based on RDF
  • using XML syntax and URIs for names
  • more expressive than XML

http://www.w3.org/2001/sw/
23
Data in the Semantic Web
  • RDF: Resource Description Framework
  • standard representation language based on labeled
    graph
  • nodes are resources (or literals)
  • edges are properties
  • schema definition language (RDF/S)
  • create vocabularies of labels
  • XML syntax

http://www.w3.org/RDF/
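
The labeled-graph model can be sketched in plain Python as a list of (subject, property, object) triples. All names with the `ex:` prefix are hypothetical, and no RDF library is used; the helper `objects` is an assumption made for the illustration.

```python
# Each triple is one labeled edge: subject --property--> object.
triples = [
    ("ex:bk15", "ex:title", "XML Developer's Guide"),
    ("ex:bk15", "ex:author", "ex:au1"),
    ("ex:au1", "ex:name", "Matthew Gambardella"),
]

def objects(subject, prop):
    """All nodes reached from `subject` by an edge labeled `prop`."""
    return [o for s, p, o in triples if s == subject and p == prop]

# Follow two edges: from the book to its author, then to the author's name.
author_names = [
    name
    for author in objects("ex:bk15", "ex:author")
    for name in objects(author, "ex:name")
]
print(author_names)
```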
24
RDF Graph
25
Querying the Semantic Web
  • RQL: RDF Query Language
  • adapts functionalities of semi-structured and
    XML query languages
  • for real-scale SW applications
  • knowledge portals
  • E-market places
  • manage voluminous RDF bases and schemas
  • currently utilized in many projects

http://www2002.org/CDROM/refereed/329/
26
Querying the Semantic Web
  • RQL: RDF Query Language
  • uniformly navigate/filter RDF graphs
  • schema-level
  • Instance-level
  • highly expressive
  • typed functional query language
  • relies on a formal graph model
  • Interpretation of multiple RDF schemas

http://www2002.org/CDROM/refereed/329/
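
The difference between schema-level and instance-level navigation can be illustrated over the same kind of triple list. This is plain Python and not RQL syntax. `rdf:type` and `rdfs:subClassOf` are the standard RDF and RDF Schema property names; the `ex:` names are hypothetical, and the sketch only looks at direct subclasses.

```python
triples = [
    ("ex:Book", "rdfs:subClassOf", "ex:Publication"),   # schema statement
    ("ex:bk15", "rdf:type", "ex:Book"),                 # instance statement
    ("ex:bk15", "ex:title", "XML Developer's Guide"),
]

def subjects(prop, obj):
    """All nodes that have an edge labeled `prop` pointing at `obj`."""
    return [s for s, p, o in triples if p == prop and o == obj]

# Schema level: which classes are direct subclasses of ex:Publication?
subclasses = subjects("rdfs:subClassOf", "ex:Publication")

# Instance level: which resources are typed with the class or a subclass?
instances = [
    resource
    for cls in ["ex:Publication"] + subclasses
    for resource in subjects("rdf:type", cls)
]
print(subclasses, instances)
```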
27
RQL Example
28
Beyond Query Languages
  • Query languages may not be enough
  • View Languages
  • Relational databases
  • SQL is query and view language
  • XML
  • Active Views (INRIA)
  • RDF
  • RVL: RDF View Definition Language
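
For the relational case, the point that SQL serves as both query and view language can be shown with Python's sqlite3 module. The table layout, the second book row, and the view name `computer_books` are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT, genre TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?, ?)",
    [
        ("bk15", "XML Developer's Guide", "Computer", 44.95),
        ("bk16", "Hypothetical Novel", "Fantasy", 5.95),  # made-up row
    ],
)

# The view is defined by a query written in the same language ...
conn.execute(
    "CREATE VIEW computer_books AS "
    "SELECT id, title FROM book WHERE genre = 'Computer'"
)

# ... and is then queried like any other table.
rows = conn.execute("SELECT title FROM computer_books").fetchall()
print(rows)
```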

29
References
  • Comparative Analysis of Five XML Query Languages
  • RQL: A Declarative Query Language for RDF
  • Viewing the Semantic Web Through RVL Lenses
  • World Wide Web Consortium

30
Comments? Questions?