Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query? - PowerPoint PPT Presentation

About This Presentation
Title:

Querying for Information Integration: How to go from an Imprecise Intent to a Precise Query?

Description:

'Retrieve castles near London that are reachable by train in less than 2 hours' ... List of documents retrieved from Web containing text 'castle near London' ... – PowerPoint PPT presentation

Slides: 21
Provided by: aditya5

Transcript and Presenter's Notes

1
Querying for Information Integration: How to go
from an Imprecise Intent to a Precise Query?
  • Aditya Telang
  • Sharma Chakravarthy, Chengkai Li

2
Motivation
  • Retrieve castles near London that are reachable
    by train in less than 2 hours
  • Find 3-bedroom houses in Houston within 2 miles
    of a school and within 5 miles of a highway and
    priced under 250,000
  • Retrieve French restaurants within 1 mile of
    IMAX Theater in Dallas, Texas

3
Current Scenario
  • Retrieve castles near London that are
    reachable by train in less than 2 hours
  • Separate lookups: London train schedules,
    trains from London, castles near London
  • Decision-making process: manually combine
    results to arrive at a decision

4
Ideal Scenario
  • Intent: Retrieve castles near London that are
    reachable by train in less than 2 hours
  • Information Integration System
  • Actual results for the intent

5
The InfoMosaic Approach
6
Query Specification
  • Query: Castle within 2 hours by train from London
  • Query: Bank within 1 mile of University of
    Texas, Arlington

7
How to specify a query?
  • Search Method (e.g., Google)
  • Just needs to search for the keyword in a set
    of documents
  • Get list of documents and post-process (rank,
    cluster, classify, etc.)
  • In an integration scenario, this doesn't work
  • "bank" matches document set D1
  • "University of Texas, Arlington" matches document set D2
  • "1" (from "1 mile") is ignored
  • Intersecting the returned document sets will not
    generate the desired results
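
The failure described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-document corpus, the document ids, and the helper names `matching_docs` and `keyword_search` are all hypothetical and do not come from the presentation.

```python
# Hypothetical toy corpus; ids and text are made up for illustration only.
DOCS = {
    "d1": "First National Bank opens a new branch in downtown Dallas",
    "d2": "University of Texas, Arlington announces fall enrollment",
    "d3": "River bank erosion study by University of Texas, Arlington",
}


def matching_docs(phrase: str) -> set[str]:
    """Return ids of documents that contain the phrase (case-insensitive)."""
    needle = phrase.lower()
    return {doc_id for doc_id, text in DOCS.items() if needle in text.lower()}


def keyword_search(phrases: list[str]) -> set[str]:
    """Intersect the per-phrase document sets, as a plain keyword engine might."""
    result = set(DOCS)
    for phrase in phrases:
        result &= matching_docs(phrase)
    return result


d1 = matching_docs("bank")                            # plays the role of D1
d2 = matching_docs("University of Texas, Arlington")  # plays the role of D2
hits = keyword_search(["bank", "University of Texas, Arlington"])
```

In this toy corpus the only document in the intersection is about a river bank, and the "within 1 mile" constraint never enters the computation at all, which is the gap the slide points at.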

8
How to specify a query?
  • Database Method (e.g., SQL)
  • Too rigid
  • Need to know database (or source) and its
    corresponding attributes
  • SELECT T1.a1, T2.a2 FROM T1, T2 WHERE
  • The Web is not organized as a database, so an exact
    mapping between sources and attributes is neither
    feasible nor available
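
The rigidity of the database method can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch, assuming a hypothetical in-memory table `castle_db` with columns `name` and `location`; the table, its rows, and the helper `run` are illustrative and not part of the presentation.

```python
import sqlite3

# Hypothetical in-memory source; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE castle_db (name TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO castle_db VALUES (?, ?)",
    [("Windsor Castle", "Windsor"), ("Tower of London", "London")],
)


def run(sql: str, params: tuple = ()) -> list[tuple]:
    """Execute a parameterized query and return all rows."""
    return conn.execute(sql, params).fetchall()


# Works only because the caller knows the exact table and attribute names.
rows = run("SELECT name FROM castle_db WHERE location = ?", ("London",))

# A user who guesses "place" instead of "location" gets an error, not an answer.
try:
    run("SELECT name FROM castle_db WHERE place = ?", ("London",))
    schema_error = None
except sqlite3.OperationalError as exc:
    schema_error = str(exc)
```

The second query fails purely because of an attribute-name mismatch, even though the user's intent is the same in both cases.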

9
How to specify a query?
  • Natural Language
  • Ideal mechanism
  • Inherently hard considering ambiguities of
    natural language.
  • "school": institution for education, or group of fish?
  • Mechanisms such as Question-Answering frameworks
    focus on sophisticated language models built for
    specific domains independently.
  • Incorrect assumption in an integration scenario

10
Query Specification
  • Query: castle near London
  • Database method: SELECT castle.name, ... FROM castle_DB
    WHERE castle.location = 'London'
  • Result: relation containing tuples
  • Search method: list of documents retrieved from the Web
    containing the text "castle near London"

11
Query Specification
  • Query: castle near London

SELECT castle. WHERE castle.place London
  • No idea about user intent: castle as a building,
    a move in chess, ...?
  • Information integration: no idea about source, schema,
    attributes, etc., and no idea about how to pose a query

12
Proposed Approach
  • Approach 1: refine-as-you-input
  • Approach 2: verify-after-input

13
Approach 1: Refine-as-you-input
  • Based on the most popular querying paradigm in use
    today: keyword search
  • Input: set of keywords/concepts (e.g., castle,
    train, ...)
  • Output: set of one or more precise structured queries
  • Challenges
  • Keyword resolution: entity, attribute, or value?
  • Generating a query from minimal information
  • Problem
  • Could result in too many non-relevant queries
  • Positive
  • Paradigm accepted by the Web, IR, and even DB
    communities!
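
The keyword-resolution challenge can be sketched as follows. This is an illustration only, not the InfoMosaic implementation: the `CATALOG` mapping, its entries, and the function names are hypothetical. Each keyword may be read as an entity, an attribute, or a value, and every combination of readings is a candidate interpretation that would have to be turned into a structured query.

```python
from itertools import product

# Hypothetical catalog of the roles each known term can play.
CATALOG = {
    "castle": [("entity", "castle")],
    "train": [("entity", "train"), ("value", "transport.mode")],
    "london": [("value", "castle.location"), ("value", "train.origin")],
}


def resolve(keyword: str) -> list[tuple[str, str]]:
    """Return the possible (role, target) readings of one keyword."""
    return CATALOG.get(keyword.lower(), [("unknown", keyword)])


def candidate_interpretations(
    keywords: list[str],
) -> list[dict[str, tuple[str, str]]]:
    """Enumerate every combination of readings, one reading per keyword."""
    readings = [resolve(k) for k in keywords]
    return [dict(zip(keywords, combo)) for combo in product(*readings)]


candidates = candidate_interpretations(["castle", "train", "London"])
```

The number of candidates is the product of the number of readings per keyword, so even this small input already yields several interpretations. That growth is one way to see why the approach could produce too many non-relevant queries without user refinement.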

14
Approach 1: Refine-as-you-input
User Interaction
15
Approach 2: Verify-after-input
  • Based on a rigorous method of formulating queries,
    similar to SQL
  • Input: a template filled in by the user
  • Output: a single precise query
  • Problems
  • Users don't like filling in too many details
  • Coming up with a single template that works across domains
  • Positives
  • Less ambiguous
  • Reduced number of user interactions
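
A template-based input can be sketched like this. It is a minimal illustration under stated assumptions: the `QueryTemplate` and `Condition` classes, the operator whitelist, and the attribute names (`origin`, `mode`, `travel_time_hours`) are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass, field

ALLOWED_OPERATORS = {"=", "<", "<=", ">", ">="}


@dataclass
class Condition:
    attribute: str  # hypothetical attribute name, e.g. "travel_time_hours"
    operator: str   # must be one of ALLOWED_OPERATORS
    value: object


@dataclass
class QueryTemplate:
    entity: str
    conditions: list[Condition] = field(default_factory=list)


def to_query(template: QueryTemplate) -> tuple[str, list[object]]:
    """Render a filled template as one parameterized SQL-like query."""
    clauses, params = [], []
    for cond in template.conditions:
        if cond.operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {cond.operator}")
        clauses.append(f"{cond.attribute} {cond.operator} ?")
        params.append(cond.value)
    sql = f"SELECT * FROM {template.entity}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# The user fills in every slot, so exactly one query comes out.
filled = QueryTemplate(
    entity="castle",
    conditions=[
        Condition("origin", "=", "London"),
        Condition("mode", "=", "train"),
        Condition("travel_time_hours", "<", 2),
    ],
)
sql, params = to_query(filled)
```

Because every slot is explicit, there is nothing left to disambiguate, but the user has to supply each entity, attribute, operator, and value, which is the cost this slide lists.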

16
Approach 2: Verify-after-input
17
Evaluation Plan
  • Testing the approaches on an RDBMS where the schema
    and output are known
  • Actual user studies

18
Related Work
19
Future Work
  • Perform extensive experiments to prove the
    validity of the proposed approaches
  • Address other issues in information integration
  • Current focus: Ranking [TelangDBRank07]

20
Thank You !