Intelligent Querying of Web Documents Using a Deductive XML Repository


1
Intelligent Querying of Web Documents Using a
Deductive XML Repository
  • Nick Bassiliades, Ioannis Vlahavas
  • Dept. of Informatics
  • Aristotle University of Thessaloniki

2
Abstract
  • X-DEVICE is a deductive OODB system
  • It is used for storing XML documents as objects
  • X-DEVICE has a powerful rule-based query language
    for intelligently querying stored XML documents
    and publishing the results
  • The rule language features second-order syntax
    and generalized path and ordering expressions
  • Metadata are used to translate the extended
    features into first-order rules

3
Object Model of XML Data
  • DTD definitions are automatically translated into
    a class schema
  • XML documents are automatically translated into
    objects
  • Generated classes and objects are stored within
    the underlying OODB ADAM
  • ADAM is an OODB built on Prolog (Norman Paton,
    Peter M.D. Gray, Univ. of Aberdeen)

4
Object Model of XML Data: W3C XQuery TEXT Use Case
  • <!ELEMENT company (name, ticker_symbol?,
    description?, business_code, partners?,
    competitors?)>
  • <!ELEMENT name (#PCDATA)>
  • <!ELEMENT ticker_symbol (#PCDATA)>
  • <!ELEMENT description (#PCDATA)>
  • <!ELEMENT business_code (#PCDATA)>
  • <!ELEMENT partners (partner+)>
  • <!ELEMENT partner (#PCDATA)>
  • <!ELEMENT competitors (competitor+)>
  • <!ELEMENT competitor (#PCDATA)>
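The translation of element declarations into a class schema can be illustrated with a small Python sketch. It is only an approximation under simple assumptions (sequence content models, no alternation); the helper `dtd_to_schema` and the dictionary layout are hypothetical and are not part of X-DEVICE or ADAM.

```python
import re

# Hypothetical helper, not the actual X-DEVICE translator: it reads
# <!ELEMENT ...> declarations and produces one schema entry per element.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\w+)\s+\(([^)]*)\)\s*>")

def dtd_to_schema(dtd_text):
    schema = {}
    for name, content in ELEMENT_DECL.findall(dtd_text):
        if content.strip() == "#PCDATA":
            schema[name] = "string"      # leaf element: plain string value
            continue
        attrs = []
        for part in content.split(","):
            part = part.strip()
            # Keep the occurrence indicator (?, + or *) next to the name.
            occurrence = part[-1] if part[-1] in "?+*" else ""
            attrs.append((part.rstrip("?+*"), occurrence))
        schema[name] = attrs             # sequence element: ordered attributes
    return schema

dtd = """
<!ELEMENT company (name, ticker_symbol?, description?, business_code, partners?, competitors?)>
<!ELEMENT name (#PCDATA)>
"""
schema = dtd_to_schema(dtd)
```

Keeping the attributes in a list preserves the order of sub-elements, which matters later for ordering expressions.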

5
Object Model of XML Data: Alternation
  • <!ELEMENT content (par | figure)+ >

6
Deductive XML Query Language
  • The X-DEVICE language is an extension of DEVICE,
    the basic deductive rule language
  • N. Bassiliades, I. Vlahavas, A.K. Elmagarmid,
    "E-DEVICE: An extensible active knowledge base
    system with multiple rule type support", IEEE
    TKDE, 12(5), 824-844, 2000.
  • X-DEVICE rules are pre-compiled into DEVICE
    deductive rules
  • Deductive rules are compiled into production
    rules
  • ECA rules with one complex event
  • Matching through RETE network

7
X-DEVICE Language: Basic first-order deductive rules
  • if C@company(name:'XYZ Ltd',
                 partner.partners ∋ P)
    then partner_of_xyz(partner:P)
  • Selects company C with name 'XYZ Ltd'
  • Iterates over partners P through navigation
  • Paths use inverse notation: partner.partners,
    NOT partners.partner
  • Defines a new derived class of partners of
    company XYZ
  • Derived objects are materialized
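Read procedurally, the rule above selects one company and produces one derived object per partner. A minimal Python sketch of that reading follows; the sample data and the dictionary representation of objects are invented for illustration and say nothing about how DEVICE actually matches rules.

```python
# Invented sample objects; in X-DEVICE these would be instances of class company.
companies = [
    {"name": "XYZ Ltd", "partners": ["A Corp", "B Inc"]},
    {"name": "Other Co", "partners": ["Z Corp"]},
]

def partner_of_xyz(companies):
    derived = []
    for c in companies:                          # every object of class company
        if c["name"] == "XYZ Ltd":               # single-valued attribute test
            for p in c["partners"]:              # iterate the multi-valued attribute
                derived.append({"partner": p})   # one derived object per partner
    return derived

derived = partner_of_xyz(companies)
```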

8
X-DEVICE Language: Recursion
  • if P@partner_of_xyz(partner:P1) and
       C@company(name:P1,
                 partner.partners ∋ P2)
    then partner_of_xyz(partner:P2)
  • Rule processing uses semi-naïve evaluation
  • Negation is allowed (safety, stratification)
  • Single-valued attributes use : for instantiation
  • Multi-valued attributes use ∋ for instantiation
  • Prolog lists guarantee correct ordering
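Semi-naïve evaluation of the recursive rule can be sketched in Python as follows. This is a generic illustration of the technique, not the DEVICE implementation; the `partners` mapping is invented toy data.

```python
# Toy data (invented for illustration): company name -> list of partner names.
partners = {
    "XYZ Ltd": ["A Corp", "B Inc"],
    "A Corp": ["C GmbH"],
    "C GmbH": ["A Corp"],   # a cycle, to show that evaluation still terminates
}

def partners_of(root):
    # Base rule: direct partners of the root company.
    derived = set(partners.get(root, []))
    delta = set(derived)
    # Recursive rule, semi-naive style: each round joins only the facts that
    # were new in the previous round (delta), not the whole derived set.
    while delta:
        new = set()
        for p1 in delta:
            for p2 in partners.get(p1, []):
                if p2 not in derived:
                    new.add(p2)
        derived |= new
        delta = new
    return derived

result = partners_of("XYZ Ltd")
```

The loop stops when a round derives nothing new, which is the fixpoint of the two rules.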

9
X-DEVICE Language: Variable-Attribute Expressions
  • if C@company(A $ 'XYZ')
    then a_xyz_comp(company:list(C))
  • We don't know which attribute of company contains
    the string 'XYZ'
  • A is a second-order variable (meta-variable)
  • list is an aggregation function (collects company
    OIDs in a multi-valued attribute)
  • The $ operator performs string search

10
X-DEVICE Language: Translation of Variable-Attributes
  • if company@xml_seq(elem_order ∋ A)
    then new_rule('if C@company(A $ ''XYZ'')
                   then a_xyz_comp(company:list(C))')
         => deductive_rule
  • Iterate over meta-class xml_seq to find all
    attributes (sub-elements) of class company
  • A production rule creates one deductive rule for
    each instantiation of A
  • A is now a first-order variable in the condition
    and a constant in the action
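The effect of this production rule can be sketched in Python: the meta-variable A is bound to each sub-element name in turn, and one concrete rule text is generated per binding. The `elem_order` dictionary is an assumed stand-in for the xml_seq metadata, and the generated strings are only illustrative.

```python
# Assumed stand-in for the xml_seq meta-class: class name -> ordered sub-elements.
elem_order = {
    "company": ["name", "ticker_symbol", "description",
                "business_code", "partners", "competitors"],
}

RULE_TEMPLATE = "if C@company({attr} $ 'XYZ') then a_xyz_comp(company:list(C))"

def expand_variable_attribute(cls):
    # One first-order rule per instantiation of the meta-variable A:
    # A is a variable here and becomes a constant inside each generated rule.
    return [RULE_TEMPLATE.format(attr=a) for a in elem_order[cls]]

rules = expand_variable_attribute("company")
```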

11
X-DEVICE Language: Generalized Path Expressions
  • if C@company(* $ 'XYZ')
    then a_xyz_comp(company:list(C))
  • The search for the string 'XYZ' must be performed
    not only on attributes of company, but also on
    attributes of objects contained within company,
    at all levels of nesting

12
X-DEVICE Language: Translation of Generalized Paths
  • Iterate over all immediate elements of class
    company
  • Store them into an auxiliary derived class
  • if company@xml_seq(elem_order ∋ X1)
    then tmp_elem1(cnd_elem:X1,
                   path:[X1])

13
X-DEVICE Language: Translation of Generalized Paths
  • Recursively iterate over all elements and
    sub-elements stored in the auxiliary class
  • The path-so-far from the root company element is
    accumulated
  • if X1@tmp_elem1(cnd_elem:X2, path:X3)
       and X2@xml_seq(elem_order ∋ X4)
    then tmp_elem1(cnd_elem:X4,
                   path:[X4|X3])

14
X-DEVICE Language: Translation of Generalized Paths
  • Terminate the recursion if no more nested
    elements can be found
  • Create one deductive rule for each discovered
    concrete path
  • if X1@tmp_elem1(cnd_elem:X2, path:X3) and
       not X2@xml_seq and
       prolog{create_path(X3,PATH)}
    then new_rule('if C@company(PATH $ ''XYZ'')
                   then a_xyz_comp(company:list(C))')
         => deductive_rule

15
X-DEVICE Language: Translation of Generalized Paths
  • The following deductive rules are created
  • C@company(name $ 'XYZ')
  • C@company(ticker_symbol $ 'XYZ')
  • C@company(description $ 'XYZ')
  • C@company(business_code $ 'XYZ')
  • C@company(partner.partners $ 'XYZ')
  • C@company(competitor.competitors $ 'XYZ')
  • Optimization of multiple rules is achieved
    through common parts of the RETE network
  • The DEVICE system takes care of that
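The three translation steps (seed with immediate elements, extend paths recursively, stop at elements with no nested content) can be sketched as a worklist in Python. The `xml_seq` dictionary is an assumed stand-in for the meta-class of the same name, and `concrete_paths` is a hypothetical helper. The sketch does not guard against recursive DTDs, where an element can contain itself.

```python
# Assumed stand-in for the xml_seq meta-class: element class -> ordered
# sub-elements. Classes missing from the mapping are treated as leaves.
xml_seq = {
    "company": ["name", "ticker_symbol", "description",
                "business_code", "partners", "competitors"],
    "partners": ["partner"],
    "competitors": ["competitor"],
}

def concrete_paths(root):
    # Worklist of (element, path-so-far). The path is kept innermost-first,
    # mirroring the inverse notation used by the rules (partner.partners).
    pending = [(e, [e]) for e in xml_seq[root]]
    paths = []
    while pending:
        elem, path = pending.pop(0)
        if elem in xml_seq:
            # Nested element: extend the path with each of its sub-elements.
            pending.extend((sub, [sub] + path) for sub in xml_seq[elem])
        else:
            # No nested elements: the recursion ends with a concrete path.
            paths.append(".".join(path))
    return paths

paths = concrete_paths("company")
```

Each returned path corresponds to one of the generated rule conditions listed above.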

16
X-DEVICE Language: Ordering Expressions
  • W3C TEXT Use Case, Query 5
  • For each news item that is relevant to the
    Gorilla Corp, create an item summary element.
  • The content of the item summary is the content of
    the title, date, and first paragraph of the news
    item
  • if N@news_item(*.content $ 'Gorilla Corp',
                   par.content ∋=1 PAR,
                   title:T, date:D)
    then item_summary(title:T, date:D,
                      par:PAR)

17
X-DEVICE Language: Translation of Ordering
  • Collect all the paragraphs that satisfy the
    condition
  • Store them in a list of an auxiliary derived
    class
  • if N@news_item(*.content $ 'Gorilla Corp',
                   par.content ∋ X1,
                   title:T, date:D)
    then tmp_elem1(tmp_var1:T, tmp_var2:D,
                   tmp_obj:list(X1))

18
X-DEVICE Language: Translation of Ordering
  • Isolate a sub-list of all the paragraphs that
    satisfy the ordering expression ∋=1
  • There is one Prolog goal for each ordering
    expression
  • if X3@tmp_elem1(tmp_var1:T, tmp_var2:D,
                    tmp_obj:X1) and
       prolog{length(X2,1), append(X2,_,X1)}
    then tmp_elem2(tmp_var1:T, tmp_var2:D,
                   tmp_obj:X2)

19
X-DEVICE Language: Translation of Ordering
  • Iterate over all qualifying results and return
    them into the target element
  • if X1@tmp_elem2(tmp_var1:T, tmp_var2:D,
                    tmp_obj ∋ PAR)
    then item_summary(title:T, date:D,
                      par:PAR)
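The three translation steps for the ordering expression (collect, isolate a prefix, iterate) can be mirrored in a short Python sketch. The sample news items are invented, the function names are hypothetical, and relevance is simplified to a substring test on paragraph text only.

```python
def first_n(items, n):
    """Prefix of exactly n elements, or None if the list is too short.

    Plays the role of the Prolog goal length(X2, n), append(X2, _, X1).
    """
    return items[:n] if len(items) >= n else None

def item_summaries(news_items, keyword, n=1):
    summaries = []
    for item in news_items:
        # Step 1: collect the paragraphs of a relevant item, in document order.
        if not any(keyword in par for par in item["pars"]):
            continue
        collected = list(item["pars"])
        # Step 2: isolate the sub-list selected by the ordering expression.
        prefix = first_n(collected, n)
        if prefix is None:
            continue
        # Step 3: emit one result object per paragraph in the sub-list.
        for par in prefix:
            summaries.append(
                {"title": item["title"], "date": item["date"], "par": par}
            )
    return summaries

# Invented sample data.
news = [
    {"title": "Gorilla Corp expands", "date": "2000-01-20",
     "pars": ["Opening paragraph.", "Gorilla Corp opened a plant."]},
    {"title": "Unrelated story", "date": "2000-01-21",
     "pars": ["Nothing relevant here."]},
]
summaries = item_summaries(news, "Gorilla Corp")
```

Storing the collected paragraphs in an ordered list is what makes the prefix in step 2 meaningful, in the same way that Prolog lists preserve element order in the original rules.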

20
X-DEVICE Language: Building Result Documents
  • The top-level element of the XML result document
    is identified with the keyword xml_result
  • The DTD of the result document is identified
    through object references
  • W3C TEXT Use Case, Query 2
  • Find news items where the Foo Corp company and
    one or more of its partners are mentioned in the
    same paragraph and/or title
  • List each news item by its title and date

21
X-DEVICE Language: Building Result Documents
  • Find the Foo company and iterate over its
    partners
  • For each partner, iterate over news items and
    search for Foo and its partner inside the title
    of the same news item
  • if C@company(name:'Foo Corp',
                 partner.partners ∋ P) and
       N@news_item(title:T $ 'Foo Corp' & P,
                   date:D)
    then xml_result(news_item1(title:T,
                               date:D))

22
X-DEVICE Language: Building Result Documents
  • Find the Foo company and iterate over its
    partners
  • For each partner, iterate over news items and
    search for Foo and its partner inside the
    nested paragraphs of the same item
  • if C@company(name:'Foo Corp',
                 partner.partners ∋ P) and
       N@news_item(*.par.content $ 'Foo Corp' & P,
                   title:T, date:D)
    then news_item1(title:T, date:D)

23
X-DEVICE Language: Building Result Documents
  • <!DOCTYPE news_item1 [
    <!ELEMENT news_item1 (title, date)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT date (#PCDATA)>
    ]>
  • The structure of the title and date elements is
    automatically determined by the type of the
    corresponding rule variables

24
Advantages of X-DEVICE
  • Logic-based query languages have
  • well-understood mathematical properties
  • declarative nature
  • advanced optimization techniques (magic-sets)
  • X-DEVICE compared to XQuery (functional)
  • higher-level, declarative syntax
  • more compact and comprehensible
  • generalized path expressions, due to fixpoint
    semantics and second-order variables

25
Advantages of X-DEVICE
  • Users can express complex XML document views
  • Information customization for e-commerce,
    e-learning, etc.
  • X-DEVICE offers multiple knowledge representation
    formalisms
  • Deductive, Production, and Active rules
  • Structured objects
  • Production and Active rules can be used to update
    XML documents
  • All the above can play an important role as an
    infrastructure for the Semantic Web

26
Intelligent Querying of Web Documents Using a
Deductive XML Repository
  • Nick Bassiliades, Ioannis Vlahavas
  • Dept. of Informatics
  • Aristotle University of Thessaloniki
  • X-DEVICE site:
    www.csd.auth.gr/lpis/systems/x-device.html