OEM and LORE Query Language - PowerPoint PPT Presentation

About This Presentation
Title:

OEM and LORE Query Language

Description:

OEM and LORE Query Language Sanjay Madria Department of Computer Science University of Missouri-Rolla madrias_at_umr.edu – PowerPoint PPT presentation

Number of Views:118
Avg rating:3.0/5.0
Slides: 34
Provided by: sanj107
Learn more at: https://web.mst.edu
Category:

less

Transcript and Presenter's Notes

Title: OEM and LORE Query Language


1
OEM and LORE Query Language
  • Sanjay Madria
  • Department of Computer Science
  • University of Missouri-Rolla
  • madrias_at_umr.edu

2
  • Source http//WWW-DB.Stanford.EDU/lore/

3
Semistructured Data (SSD)
  • No explicit schema Irregular and incomplete data
  • Schema may be hidden or mixed with data
  • Examples
  • Semi-structured data arises mainly from the
    integration of heterogeneous data sources both
    structured and non-rigid structured
  • Information sources change, or new sources added.
  • semantic discrepancies among heterogeneous data
    sources
  • Data from the web
  • Overall site structure may change often.
  • Biological data

4
Characteristics of SSD
  • Missing or additional attributes
  • Multiple attributes
  • Different types in different objects
  • Heterogeneous collections
  • Self-describing
  • Irregular, no priori structure

5
Object Exchange Model (OEM)
  • Motivation
  • Self-describing data model
  • information exchange and extraction
  • Handle incomplete and irregular data
  • Why a new data model? it not a new model.

6
LORE
  • Lore Lightweight Object Repository
  • Lightweight because
  • Object Model supported is lightweight
  • No multiuser or heavyweight DBMS features

7
Lore - motivation
  • Relational data model has null values, and OO
    models have inheritance and complex objects. Both
    have difficulties in designing schemas to
    incorporate irregular data.
  • To manage semi-structured data, as in such
    environment
  • Difficult to decide in advance on a single,
    correct schema as
  • structure of the data may evolve rapidly, or
  • data elements may change types, or
  • data not conforming to previous structure may be
    added

8
  • Thus
  • Need for management of semi-structured data!
  • Data managed by Lore is not confined to a schema
    and it may be irregular or incomplete.
  • OEM is the Lores data model.
  • Lorel is Lores query language.

9
Object Exchange Model (OEM)
  • Data in this model can be thought of as a labeled
    directed graph.
  • Schema-less and self-describing.
  • nodes are objects and
  • edges are labeled with attribute names and,
  • leaf nodes have atomic values
  • Object nesting.
  • Vertices in graph are objects.
  • Each object has a unique object identifier (oid),
    such as 5.
  • Atomic objects have no outgoing edges and are of
    types such as int, real, string, gif, java, etc.
  • All other objects that have outgoing edges are
    called complex objects.

10
OEM (Cont.)
  • Examples
  • Object 3 is complex, and its subobjects are 8,
    9, 10, and 11.
  • Object 7 is atomic and has value Clark.
  • DBGroup is a name that denotes object 1.(Names
    are entry points into the database).
  • Type and structure heterogeneity Observe that
    members may have 0, 1 or more offices, office is
    a string and sometimes, also complex object, a
    room may be string and integer
  • DbGroup.Member denotes all member-labeled
    subobjects

11
An OEM Database
DBGroup
1
Member
Project
Member
Member
Project
Member
2
3
4
5
6
Name
Project
Name
Office
Project
Age
Name
Age
Office
Office
9
11
8
10
12
13
14
7
15
16
Clark
Smith
46
Gates 252
Lore
Tsimmis
Jones
28
Room
Building
Room
Building
17
18
19
20
CIS
411
CIS
252
12
Object Exchange Model - OEM
  • Each value exchanged is given an explicit label.
  • Object ?temp-in-Fahrenheit, integer, 80?
  • temp-in-Fahrenheit is the label.
  • Each object is self-describing, with a label,
    type and value.
  • ?set-of-temps, set, cmpnt1, cmpnt2 ?
  • cmpnt1 is ?temp-in-Fahrenheit, integer, 80?
  • cmpnt2 is ?temp-in-Celsius, integer, 20?

13
Labels
  • Lables Play two roles
  • identifying an object (component)
  • identifying the meaning of an object (component)

?person-record, set, cmpnt1, cmpnt2, cmpnt3 ?
cmpnt1 is ?person-name, string, Fred?
cmpnt2 is ?office-num-in-bldg-5, integer, 333?
cmpnt3 is ?department, string, toy?
  • Person-name both identifies cmpnt1 and coveys its
    meaning.
  • In relational data this corresponds to .

14
Labels - Issues
  • Labels are relative (more specific) to the source
    of the data object.
  • Similar labels from different sources need to be
    resolved.
  • Labels provide the flexibility in representing
  • object structure

15
OEM - Specification
  • Each object in OEM has the following structure
  • Label A variable character string describing
    what the object represents.
  • Type The data type of the objects value. Each
    is either an atom type, or type set.
  • Value A variable-length value of the object.
  • Object-ID A unique variable-length identifier
    for the object or null.

16
OEM - Summary
  • OEM is an information exchange model. It does not
    specify how objects are stored at source.
  • OEM does specify how objects are received at a
    client, but after objects are received they can
    be stored in any way the client likes.
  • Each source has a distinguished object with
    lexical identifier root.

17
  • ltbiblio,set,doc1,doc2,,docngt
  • doc1 is ltdoc, set, auths1, topic1, call-no1gt
  • auths1 is ltauth-set,set auth11gt
  • auth11 is ltauth-ln, string, Ullmangt
  • topic1 is lttopic, string,Databasesgt
  • call-no1 is ltinternal-call-no, integer, 25gt
  • doc2 is ltdoc, set, auths2, topic2, call-no2gt
  • auths2 is ltauth-set,set auth21, auth22, auth23gt
  • auth21 is ltauth-ln, string, Ahogt
  • auth22 is ltauth-ln, string, Hopcroftgt
  • auth23 is ltauth-ln, string, Ullmangt

Example
  • topic2 is lttopic, string,Algorithmsgt
  • call-no1 is ltdewey-decimal, string, BR273gt
  • docn is ltdoc, set, authsn, topicn, call-nongt
  • authsn is ltauth,string,
  • Crichtongt
  • topic1 is lttopic, string,Dinosaursgt
  • call-no1 is ltfictional-call-no, integer, 95gt
  • biblio is the root object.

18
OEM - QL
  • SELECT Fetch-expression
  • FROM Object
  • WHERE Condition
  • The result of this query is itself an object,
    with special label answer
  • ?answer, set, obj1, obj2, , objn ?
  • Each returned obji is a component of object
    specified in the From clause of the query, where
    the component is located by the Fetch-expression
    and satisfies the Condition.

19
Path
  • The notion of path is used in both
    Fetch-Expression in the Select clause and the
    condition in the Where clause.
  • Path describes traversals through an object using
    subobject structure and labels.
  • Example biblio.doc.auth
  • Paths are used in Fetch-Expression to specify
    which components are are returned in the answer
    object.
  • Paths are used in the condition to qualify the
    fetched objects or other (related) components in
    the same object structure.

20
Queries - Simple
  • Retrieve the topic of each document for which
    Ullman is one of the authors
  • SELECT biblio.doc.topic
  • FROM root
  • WHERE biblio.doc.auth-set.auth-ln Ullman
  • Intuitively, the querys where clause finds all
    paths through subobject structure with the
    sequence of labels biblio,doc,auth-set,auth-ln
    such that the object at the end of the path has
    value Ullman.
  • ltanswer, set, obj1, obj2gt
  • obj1 is lttopic, string, Databasesgt
  • obj2 is lttopic, string, Algorithmsgt

21
Queries - wild-cards
  • Retrieve all documents with internal call number
  • SELECT biblio.?.topic
  • FROM root
  • WHERE biblio.?.internal-call-no
  • ? label matches any label. For this query,
    the doc labels can be replaced by any other
    strings and query would produce the same result.
    By convention, two occurrences of ? In the same
    query must match the same label unless variables
    are used.
  • ltanswer, set, obj1gt
  • obj1 is lttopic, string, Databasesgt

22
Queries - wild-paths
  • Retrieve all documents with internal call number
  • SELECT .topic
  • FROM root
  • WHERE .internal-call-no
  • Symbol matches any path of length one or
    more. The use of followed by a single label is
    a convenient and common way to locate objects
    with a certain label in complex structure.
    Similar to ?, two occurrences of in the same
    query must match the same sequence of labels,
    unless variables are used.
  • ltanswer, set, obj1gt
  • obj1 is lttopic, string, Databasesgt

23
Queries - variables
  • Retrieve each document for which both
    Hopcroft and Aho are co-authors
  • SELECT biblio.doc
  • FROM root
  • WHERE biblio.doc.auth-set.auth-lnAho and
  • biblio.doc.auth-set.auth-lnHopcr
    oft
  • Here, the query finds all the paths with
    structure biblio, doc, auth-set, and with two
    distinct path completions with label auth with
    values Aho and Hopcroft
  • ltanswer, set, obj1gt
  • obj1 is the complete doc2

24
Lorel Query Language
  • Need query language that supports path
    expressions for traversing graph data and
    handling of typeless data.
  • A simple path expression is a name followed by a
    sequence of labels.
  • DBGroup.Member.Office.
  • Set of objects that can be reached starting with
    the DBGroup object, following edges labeled as
    member and then office.

25
Lorel (cont.)
  • Example
  • select DBGroup.Member.Officewhere
    DBGroup.Member.Age lt 30
  • Result
  • Office Gates 252
  • Office
  • Building CIS
  • Room 411

26
Lorel Query Rewrite
  • Previous query rewritten to (OQL style)
  • select Ofrom DBGroup.Member M, M.Office Owhere
    exists y in M.Age y lt 30

27
Lorel Query Features
  • Explicitly handle coercion.
  • Automatic type coercion
  • 0.5 lt 0.9 should return true
  • Comparison on age transformed to existential
    condition.
  • Since all properties are set-valued in OEM.
  • A user can ask DBGroup.Member.Age lt 30 regardless
    of whether Age is single valued, set valued, or
    unknown.

28
  • Path expression queries -specification for a set
    of possible paths through the graph
  • Example - is a path expression that matches any
    number of labels
  • Use of Data guides Structural summary of the
    database

29
Lorel (cont.)
  • General path expressions are loosely specified
    patterns for labels in the database.(
    disjunction, ? label pattern optional)
  • Example
  • select DBGroup.Member.Namewhere
    DBGroup.Member.Office(.Room.Cubicle)? like
    252
  • Result
  • Name JonesName Smith

30
Lorel Queries - Simple Path Expression
  • Retrieve the offices of members with age greater
    than 30 years
  • Query SELECT DBGroup.Member.Office
  • WHERE DBGroup.Member.Age gt 30
  • Result Office Gates 252
  • Office
  • Building CIS
  • Room 411

31
Queries - General Path Expression
  • Query SELECT DBGroup.Member.Name
  • WHERE DBGroup.Member.Office(.Room.Cubicle)?
  • Like 252
  • Result Name Jones
  • Name Smith
  • Room matches all labels starting from Room, like
    Room68. stands for disjunction. ? indicates
    that the label pattern is optional. like 252
    specifies that the data value should end with
    string 252.

32
Queries - SubQueries
Retrieve Lore project members who work on other
projects Query SELECT M.Name, ( SELECT
M.Project.Title WHERE M.Project.Title !
Lore) FROM DBGroup.Member M WHERE
M.Project.Title Lore Result Member Name
Jones Title Tsimmis
33
Lore - Summary
  • Lore does facilitate query and updates on
    semi-structural databases
  • There has been more work done on optimization
    using data guides (vldb97).
  • How is this related to WWW?
  • XML-QL and related work provides the answer.
Write a Comment
User Comments (0)
About PowerShow.com