Relaxing Join and Selection Queries

Slides: 33
Provided by: raresv
Learn more at: https://ics.uci.edu

1
Relaxing Join and Selection Queries
  • Rares Vernica
  • UC Irvine, USA
  • Joint work with Nick Koudas, Chen Li, and Anthony
    K. H. Tung

2
Query Example
  • SELECT * FROM Jobs J, Candidates C
  • WHERE J.Salary
  • AND J.Zipcode = C.Zipcode
  • AND C.WorkExp 5

3
What if the query answer is empty?
  • SELECT * FROM Jobs J, Candidates C
  • WHERE J.Salary
  • AND J.Zipcode = C.Zipcode
  • AND C.WorkExp 5
  • Adjust the conditions
  • What conditions to adjust?
  • How to adjust them?
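
A minimal sketch of the empty-answer situation, using Python's built-in sqlite3 module. The table columns, the sample rows, the salary bound of 95, and the direction of both comparisons (Salary at most a bound, WorkExp at least 5) are assumptions made for illustration; the transcript does not preserve them.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Jobs (ID TEXT, Zipcode TEXT, Salary INTEGER);
CREATE TABLE Candidates (ID TEXT, Zipcode TEXT, WorkExp INTEGER);
INSERT INTO Jobs VALUES ('J1', '92617', 100), ('J2', '92612', 90);
INSERT INTO Candidates VALUES ('C1', '92617', 4), ('C2', '92697', 6);
""")

# Assumed comparison directions: salary upper bound, experience lower bound.
QUERY = """
SELECT J.ID, C.ID
FROM Jobs J, Candidates C
WHERE J.Salary <= ?
  AND J.Zipcode = C.Zipcode
  AND C.WorkExp >= ?
"""

# Original (strict) conditions: no pair qualifies.
strict = conn.execute(QUERY, (95, 5)).fetchall()

# Manually relaxed selection conditions: one pair now qualifies.
relaxed = conn.execute(QUERY, (100, 4)).fetchall()

print(strict)
print(relaxed)
```

Here both selections were loosened by hand. Deciding which conditions to loosen, and by how much, is the question the rest of the talk addresses.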

4
Example Percentages of Empty Result Queries
  • In a Customer Relationship Management (CRM)
    application developed by IBM
    • 18.07% (3,396 empty-result queries in 18,793
      queries)
  • In a real estate application developed by IBM
    • 5.75%
  • In a digital library application [JCM00]
    • 10.53%
  • In a bioinformatics application [RCP98]
    • 38%
  • Source: Gang Luo (IBM T.J. Watson Research
    Center, USA), Efficient Detection of Empty-Result
    Queries, VLDB 2006, p. 1015

5
Observations
  • Different ways to adjust the conditions: Select
    vs. Join
  • How much to adjust each condition? Salary vs. Salary
  • Adjust join vs. Adjust both selections

[Figure: Salary and WorkExp conditions of the example query]
6
Contributions
  • Query relaxation framework for selections and
    joins
  • Lattice-based approach for query relaxation
  • Efficient relaxation algorithms

7
Overview
  • Motivation
  • Query Relaxation
  • Lattice-based Relaxation
  • Relaxation Algorithms
  • Variations
  • Experiments

8
Query Relaxation
  • Top-k / Nearest neighbor
    • Weight for each condition
  • Skyline
    • No weights are needed
    • Conditions are not considered equal
    • Return non-dominated points
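
The two options on this slide can be contrasted with a small sketch. It assumes each candidate answer is described by a vector of relaxation amounts, one per condition, where smaller is better; the sample vectors and weights are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse in every dimension and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points: List[Point]) -> List[Point]:
    """Return the non-dominated points (quadratic scan, kept simple on purpose)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def top_k(points: List[Point], weights: Sequence[float], k: int) -> List[Point]:
    """Rank by weighted sum of relaxations; needs one weight per condition."""
    return sorted(points, key=lambda p: sum(w * x for w, x in zip(weights, p)))[:k]

# Hypothetical relaxation vectors: (salary relaxation, work-experience relaxation).
points = [(0, 4), (1, 2), (3, 0), (2, 3), (4, 4)]

print(skyline(points))           # no weights required
print(top_k(points, (1, 1), 1))  # answer depends on the chosen weights
print(top_k(points, (1, 3), 1))
```

Changing the weights changes the top-k answer, while the skyline stays the same for any choice of weights.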

9
Query Relaxation
  • Skyline
    • Stephan Börzsönyi, Donald Kossmann, Konrad
      Stocker: The Skyline Operator. ICDE 2001

10
Overview
  • Motivation
  • Query Relaxation
  • Lattice-based Relaxation
  • Relaxation Algorithms
  • Variations
  • Experiments

11
Lattice-based Relaxation
  • R: selection on Jobs
  • J: join condition
  • S: selection on Candidates
[Figure: lattice over R, J, and S for the Salary and WorkExp conditions]
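
One way to picture such a lattice, as a sketch only: assume each node is the set of conditions that are allowed to be relaxed, so the bottom node relaxes nothing and the top node relaxes R, J, and S together. The node encoding and the helper names `lattice_levels` and `children` are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence

CONDITIONS = ("R", "J", "S")  # labels taken from the slide

def lattice_levels(conditions: Sequence[str]) -> List[List[str]]:
    """Group the nodes by how many conditions they relax (level 0 relaxes none)."""
    return [
        ["".join(combo) for combo in combinations(conditions, size)]
        for size in range(len(conditions) + 1)
    ]

def children(node: str, conditions: Sequence[str]) -> List[str]:
    """Nodes reachable by relaxing exactly one more condition."""
    return [
        "".join(c for c in conditions if c in node or c == extra)
        for extra in conditions
        if extra not in node
    ]

for level, nodes in enumerate(lattice_levels(CONDITIONS)):
    print(level, nodes)

print(children("R", CONDITIONS))
```

With three conditions the lattice has eight nodes across four levels.
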
12
Overview
  • Motivation
  • Query Relaxation
  • Lattice-based Relaxation
  • Relaxation Algorithms
  • Variations
  • Experiments

13
Relaxing Selection Conditions
  • Algorithm (INCORRECT)
    • Compute Skyline on Jobs
    • Compute Skyline on Candidates
    • Join the Skylines

[Figure: Skyline on Jobs (Salary), Skyline on Candidates (WorkExp), and the empty join of the two Skylines]
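
A tiny counterexample shows why the algorithm above is marked incorrect. The rows are hypothetical, and each row already carries its relaxation amount (0 means the selection is satisfied as written).

```python
from typing import Callable, List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(rows: List, key: Callable) -> List:
    """Rows whose relaxation vector key(row) is not dominated by another row's."""
    return [r for r in rows if not any(dominates(key(o), key(r)) for o in rows)]

def join(jobs: List[Tuple], candidates: List[Tuple]) -> List[Tuple]:
    """Equi-join on zipcode (first field of each row)."""
    return [(j, c) for j in jobs for c in candidates if j[0] == c[0]]

# (zipcode, relaxation needed on the selection condition)
jobs = [("92617", 0.0), ("92612", 10.0)]
candidates = [("92612", 0.0), ("92612", 2.0)]

# Incorrect: the per-relation skylines keep only the best rows, which do not join.
wrong = join(skyline(jobs, key=lambda r: (r[1],)),
             skyline(candidates, key=lambda r: (r[1],)))

# Skyline over the joined pairs instead.
right = skyline(join(jobs, candidates), key=lambda p: (p[0][1], p[1][1]))

print(wrong)
print(right)
```

The best job and the best candidate sit in different zipcodes, so joining the two skylines returns nothing even though a relaxed answer exists.
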
14
Relaxing Selection Conditions
  • Join First Algorithm
    • Compute the join (disregarding the selections)
    • Compute Skyline on join results

[Figure: join of Jobs and Candidates, then Skyline over Salary and WorkExp]
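
The two steps can be sketched as follows. This is an illustration, not the authors' implementation: the salary bound of 95, the direction of the comparisons, the record types, and the function names are all assumptions.

```python
from typing import List, NamedTuple, Sequence, Tuple

class Job(NamedTuple):
    id: str
    zipcode: str
    salary: int

class Candidate(NamedTuple):
    id: str
    zipcode: str
    work_exp: int

MAX_SALARY = 95   # hypothetical bound; the transcript lost the real value
MIN_WORK_EXP = 5  # assumed to be a lower bound on WorkExp

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def relaxation(job: Job, cand: Candidate) -> Tuple[int, int]:
    """How far each selection must be loosened for this pair (0 = already satisfied)."""
    return (max(0, job.salary - MAX_SALARY), max(0, MIN_WORK_EXP - cand.work_exp))

def join_first(jobs: List[Job], cands: List[Candidate]):
    # Step 1: compute the join, disregarding the selections.
    joined = [(j, c) for j in jobs for c in cands if j.zipcode == c.zipcode]
    # Step 2: compute the skyline of the joined pairs on their relaxation vectors.
    scored = [(relaxation(j, c), j, c) for j, c in joined]
    return [s for s in scored if not any(dominates(o[0], s[0]) for o in scored)]

jobs = [Job("J1", "92617", 90), Job("J2", "92612", 105), Job("J3", "92612", 130)]
cands = [Candidate("C1", "92617", 3), Candidate("C2", "92612", 7)]

for vec, job, cand in join_first(jobs, cands):
    print(vec, job.id, cand.id)
```

The full join is materialized before any pruning happens, which is the cost the variations on the next slide try to reduce.
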
15
Relaxing Selection Conditions
  • Variations
    • Pruning Join
      • Build the Skyline during the join
    • Pruning Join+
      • Build the local Skyline before the join
    • Sorted Access Join
      • Fagin's Top-k: sort the columns on relaxation
      • Compute the join Skyline
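
The idea of building the Skyline during the join can be sketched as an incremental skyline maintained while pairs stream out of a hash join. This is a simplified illustration under assumed data layouts, not the authors' pruning rules; rows are hypothetical and already carry their relaxation amounts.

```python
from collections import defaultdict
from typing import List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pruning_join(jobs: List[Tuple], cands: List[Tuple]) -> List[Tuple]:
    """Rows are (id, zipcode, relaxation). Returns (vector, job id, candidate id)."""
    by_zip = defaultdict(list)
    for cand in cands:
        by_zip[cand[1]].append(cand)

    sky: List[Tuple] = []
    for job in jobs:
        for cand in by_zip.get(job[1], []):
            vec = (job[2], cand[2])
            # Discard the pair at once if a current skyline point dominates it.
            if any(dominates(s[0], vec) for s in sky):
                continue
            # Otherwise it enters the skyline and evicts the points it dominates.
            sky = [s for s in sky if not dominates(vec, s[0])]
            sky.append((vec, job[0], cand[0]))
    return sky

jobs = [("J1", "92617", 0), ("J2", "92612", 10), ("J3", "92612", 35)]
cands = [("C1", "92617", 2), ("C2", "92612", 0)]

print(pruning_join(jobs, cands))
```

Unlike the join-first approach, dominated pairs are dropped as soon as they are produced, so the full join result is never stored.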

16
Relaxing all conditions
  • Multi-Dim.-Index-based-Relaxation Algorithm
    • Traverse the index structure top-down
    • Form pairs of nodes or records
    • Build the Skyline
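
A rough sketch of the three steps, using a hand-built bounding-box tree in place of a real R-tree. Several things here are assumptions for illustration: zipcodes are treated as numbers with the join relaxation being their absolute difference, the salary bound is 95, WorkExp has a lower bound of 5, and the names `Node`, `record`, `group`, and `midir` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

@dataclass
class Node:
    lo: Tuple[float, float]  # lower corner of the box: (zipcode, attribute)
    hi: Tuple[float, float]  # upper corner of the box
    children: List["Node"] = field(default_factory=list)  # empty means a record

def record(zipcode: float, value: float) -> Node:
    return Node((zipcode, value), (zipcode, value))

def group(children: List[Node]) -> Node:
    lo = tuple(min(c.lo[i] for c in children) for i in range(2))
    hi = tuple(max(c.hi[i] for c in children) for i in range(2))
    return Node(lo, hi, children)

def lower_bound(j: Node, c: Node, max_salary: float, min_exp: float):
    """Smallest relaxation any record pair inside the two boxes could need."""
    salary = max(0, j.lo[1] - max_salary)
    zip_gap = max(0, j.lo[0] - c.hi[0], c.lo[0] - j.hi[0])
    exp = max(0, min_exp - c.hi[1])
    return (salary, zip_gap, exp)

def midir(jobs_root: Node, cands_root: Node, max_salary: float, min_exp: float):
    sky: List[Tuple] = []
    stack = [(jobs_root, cands_root)]
    while stack:
        j, c = stack.pop()
        vec = lower_bound(j, c, max_salary, min_exp)
        if any(dominates(s, vec) for s in sky):
            continue  # every pair below this node pair is dominated too
        if not j.children and not c.children:
            # Two records: the bound is exact, so update the skyline.
            sky = [s for s in sky if not dominates(vec, s)] + [vec]
            continue
        # Traverse top-down: pair up the children of whichever side is a node.
        stack.extend((a, b) for a in (j.children or [j]) for b in (c.children or [c]))
    return sky

jobs_root = group([group([record(10, 100), record(12, 90)]), record(20, 120)])
cands_root = group([record(10, 3), record(13, 6), record(25, 5)])

print(sorted(midir(jobs_root, cands_root, max_salary=95, min_exp=5)))
```

Pruning on the lower bound is safe because any real pair inside a pruned node pair needs at least that much relaxation in every dimension.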

17
Overview
  • Motivation
  • Query Relaxation
  • Lattice-based Relaxation
  • Relaxation Algorithms
  • Variations
  • Experiments

18
Variations
  • Computing Top-k over Skyline
    • Weight to each condition
  • Queries with multiple joins
  • Conditions on nonnumeric attributes
    • Dominance checking function
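
Two of these variations can be illustrated together: a nonnumeric condition is handled by mapping it to a relaxation amount that the dominance check can compare, and a weighted sum then ranks only the skyline points. The degree attribute, its ordering, the required degree, and the weights are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical ordered, nonnumeric attribute.
DEGREE_RANK = {"BS": 0, "MS": 1, "PhD": 2}

def degree_relaxation(required: str, actual: str) -> int:
    """Number of levels the degree condition must be lowered (0 = satisfied)."""
    return max(0, DEGREE_RANK[required] - DEGREE_RANK[actual])

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def top_k_over_skyline(vectors: List[Tuple], weights: Sequence[float], k: int):
    """Keep the non-dominated vectors, then rank them by weighted relaxation."""
    sky = [v for v in vectors if not any(dominates(o, v) for o in vectors)]
    return sorted(sky, key=lambda v: sum(w * x for w, x in zip(weights, v)))[:k]

# (salary relaxation, degree held); the query is assumed to require an MS.
answers = [(0, "BS"), (10, "MS"), (5, "BS"), (20, "PhD")]
vectors = [(salary, degree_relaxation("MS", degree)) for salary, degree in answers]

print(vectors)
print(top_k_over_skyline(vectors, weights=(1, 20), k=1))
print(top_k_over_skyline(vectors, weights=(1, 5), k=1))
```

The weights only reorder points that are already on the skyline, so a dominated answer can never be returned.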

19
Overview
  • Motivation
  • Query Relaxation
  • Lattice-based Relaxation
  • Relaxation Algorithms
  • Variations
  • Experiments

20
Experimental Setting
  • Datasets
    • Real
      • Internet Movie Database (IMDB)
        • Movies (120k), ActorInMovies (1.2m)
      • Census-Income (UCI KDD Repository)
        • Census (200k)
    • Synthetic
      • Independent, Correlated, and Anticorrelated
  • Implementation
    • GNU C++
    • Spatial Index Library (R-tree)
    • Linux, AMD Opteron 240, 1GB RAM

21
Different algorithms, different behaviors
[Figure: IMDB Dataset]

22
Different datasets, different behaviors
[Figure panels: Anticorrelated Dataset, Correlated Dataset, Independent Dataset]

23
How big is the Skyline?

24
Relaxing join takes time
[Figure: Self-join on Census Dataset]

25
Top-k over Skyline
[Figure: IMDB Dataset]
26
Related Work
  • Muslea et al.
    • Alternate forms of conjunctive expressions
  • Efficient Skyline algorithms
    • Selection queries
  • Efficient Top-k algorithms
    • Require weights for conditions

27
Conclusions
  • Query relaxation framework for selections and
    joins
  • Lattice-based approach for query relaxation
  • Efficient relaxation algorithms

28
Future Work
  • Optimum use of the lattice structure
  • Relax conditions on string attributes
  • Algorithms applicable outside the databases

29
Questions?
31
Skyline vs. Top-k
32
Skyline vs. Top-k over Skyline