Efficient Skyline Querying with Variable User Preferences on Nominal Attributes


Title: Efficient Skyline Querying with Variable User Preferences on Nominal Attributes


1
Efficient Skyline Querying with Variable User
Preferences on Nominal Attributes
  • Raymond Chi-Wing Wong (1), Ada Wai-Chee Fu (2), Jian Pei (3),
    Yip Sing Ho (2), Tai Wong (2) and Yubao Liu (4)
  • (1) The Hong Kong University of Science and Technology
  • (2) The Chinese University of Hong Kong
  • (3) Simon Fraser University
  • (4) Sun Yat-Sen University

Prepared by Raymond Chi-Wing Wong
Presented by Raymond Chi-Wing Wong
2
Outline
  • Introduction
  • Skyline
  • Contributions
  • Problem Definition
  • Adaptive SFS
  • IPO-Tree
  • Conclusion

3
1. Introduction
Suppose we want to look for a vacation package
3 packages
We want to have a cheaper package.
We want to have a higher hotel-class.
Package ID Price Hotel-class
a 1600 4
b 2400 1
c 3000 5
We know that
  • Package a has a cheaper price than package b
  • Package a has a higher hotel-class than package b

So package a dominates package b.
Skyline: we want to find the set of packages which are NOT
dominated by any other packages, i.e., all of the best
possible choices. Here the skyline is {a, c}.
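The dominance test and the skyline of the three packages above can be sketched in a few lines of Python. This is only an illustrative brute-force check, not the method proposed in the talk; the names `packages`, `dominates` and `skyline` are made up for this example.

```python
# Packages from the slide: name -> (price, hotel_class).
# A lower price is better; a higher hotel-class is better.
packages = {"a": (1600, 4), "b": (2400, 1), "c": (3000, 5)}


def dominates(p, q):
    """p dominates q if p is no worse on both attributes and strictly better on one."""
    no_worse = p[0] <= q[0] and p[1] >= q[1]
    strictly_better = p[0] < q[0] or p[1] > q[1]
    return no_worse and strictly_better


def skyline(points):
    """Return the names of all points that no other point dominates."""
    return sorted(
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    )


print(dominates(packages["a"], packages["b"]))  # True
print(skyline(packages))                        # ['a', 'c']
```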
4
1. Introduction
Suppose we want to look for a vacation package among 6 packages.
We want to have a cheaper package.
We want to have a higher hotel-class.
How about Hotel-group? Different customers may have different
preferences on Hotel-group.
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Suppose a customer has the following
preference: H < T < M (H is preferred most).
The skyline points are packages a and c.
Suppose another customer has the following
preference: H < M < T.
The skyline points are packages a, c and e.
In other words, different preferences give
different skyline points.
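To see how the preference on Hotel-group changes the skyline, the following brute-force Python sketch ranks the hotel groups according to a customer's total order and recomputes the skyline of the six packages. It is a minimal illustration; `PACKAGES`, `dominates` and `skyline` are hypothetical names, and the order is passed as a string with the most preferred group first.

```python
# name -> (price, hotel_class, hotel_group)
PACKAGES = {
    "a": (1600, 4, "T"),
    "b": (2400, 1, "T"),
    "c": (3000, 5, "H"),
    "d": (3600, 4, "H"),
    "e": (2400, 2, "M"),
    "f": (3000, 3, "M"),
}


def dominates(p, q, rank):
    """p dominates q under a total order on hotel groups (smaller rank = preferred)."""
    no_worse = p[0] <= q[0] and p[1] >= q[1] and rank[p[2]] <= rank[q[2]]
    strictly_better = p[0] < q[0] or p[1] > q[1] or rank[p[2]] < rank[q[2]]
    return no_worse and strictly_better


def skyline(points, order):
    """order lists the hotel groups from most to least preferred, e.g. 'HTM'."""
    rank = {group: i for i, group in enumerate(order)}
    return sorted(
        name
        for name, p in points.items()
        if not any(dominates(q, p, rank) for other, q in points.items() if other != name)
    )


print(skyline(PACKAGES, "HTM"))  # ['a', 'c']
print(skyline(PACKAGES, "HMT"))  # ['a', 'c', 'e']
```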
5
1. Introduction
Suppose we want to look for a vacation package among 6 packages.
A customer with the preference H < T < M gets
the skyline points a and c.
Another customer with the preference H < M < T gets
the skyline points a, c and e.
In other words, different preferences give
different skyline points.
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Problem: Given a preference on Hotel-group, we
want to find the skyline with respect to this
preference efficiently.
8
1. Introduction
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Problem: Given a preference on Hotel-group, we
want to find the skyline with respect to this
preference efficiently.
Straightforward solution
Adopt an existing skyline technique such as
SFS (Sort-Filter-Skyline) to compute the skyline
on the fly whenever a skyline query arrives.
It works. However, this solution is not scalable
and the results cannot be returned efficiently.
9
1. Introduction
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Problem: Given a preference on Hotel-group, we
want to find the skyline with respect to this
preference efficiently.
Straightforward solution
Adopt an existing skyline technique such as
SFS (Sort-Filter-Skyline) to compute the skyline
on the fly whenever a skyline query arrives.
Full Materialization solution
Pre-computation: for each possible preference,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly.
It works when there is a limited number of
preferences. However, this solution is not
scalable when there are a lot of possible
preferences,
e.g., with three nominal attributes (like Hotel-group),
each of which contains 40 possible values, there
are 4.1 × 10^9 possible preferences (in our
problem setting).
10
1. Introduction
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Problem: Given a preference on Hotel-group, we
want to find the skyline with respect to this
preference efficiently.
Straightforward solution
Adopt an existing skyline technique such as
SFS (Sort-Filter-Skyline) to compute the skyline
on the fly whenever a skyline query arrives.
Full Materialization solution
Pre-computation: for each possible preference,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly.
Semi-Materialization solution
Good tradeoff between storage consumption and
efficiency.
Pre-computation: for SOME possible preferences,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly OR
derive it with simple operations.
11
1. Introduction
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Problem: Given a preference on Hotel-group, we
want to find the skyline with respect to this
preference efficiently.
  • Questions
  • What preferences should be stored?
  • With these preferences, how can we perform a
    skyline query efficiently?

Straightforward solution: Adaptive SFS
Adopt an existing skyline technique such as
SFS (Sort-Filter-Skyline) to compute the skyline
on the fly whenever a skyline query arrives.
Full Materialization solution
Pre-computation: for each possible preference,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly.
Semi-Materialization solution: IPO-Tree (Implicit Preference Order Tree)
Pre-computation: for SOME possible preferences,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly OR
derive it with simple operations.
12
1. Contributions
  • Most Existing Work
  • Assume that each attribute has a certain ordering
    (either totally ordered or partially ordered) on
    the attribute values
  • Our Work
  • Different users can have different preferences
    (i.e., the ordering on attribute values differs
    from user to user)
  • Propose a semi-materialization method, IPO-tree, to
    answer the skyline query efficiently.

13
2. Problem Definition
  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices

Implicit preference
e.g. M < H < *
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
14
2. Problem Definition
  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices

Implicit preference
e.g. M < H < *
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
A user prefers M to H.
15
2. Problem Definition
Problem: Given an implicit preference on
Hotel-group, we want to find the skyline with
respect to this preference efficiently.
  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices

Implicit preference
e.g. M < H < *
Here * denotes all possible values in attribute Hotel-group
other than M and H (in this case, T).
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
A user prefers H to * (every value that is not listed).
This is the reason why we call it an implicit
preference.
16
2. Problem Definition
Problem: Given an implicit preference on
Hotel-group, we want to find the skyline with
respect to this preference efficiently.
  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices

Implicit preference
e.g. M < H < *
Here * denotes all possible values in attribute Hotel-group
other than M and H (in this case, T).
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Binary orders: M < H
17
2. Problem Definition
Problem: Given an implicit preference on
Hotel-group, we want to find the skyline with
respect to this preference efficiently.
  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices

Implicit preference
e.g. M < H < *
Here * denotes all possible values in attribute Hotel-group
other than M and H (in this case, T).
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Binary orders: M < H, M < T
18
2. Problem Definition
Problem: Given an implicit preference on
Hotel-group, we want to find the skyline with
respect to this preference efficiently.
  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices

Implicit preference
e.g. M < H < *
Here * denotes all possible values in attribute Hotel-group
other than M and H (in this case, T).
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Binary orders: M < H, M < T, H < T
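The expansion of an implicit preference into binary orders can be written down directly. The sketch below is one plausible reading of the slides: every listed value is preferred to each later listed value and to every unlisted value, while unlisted values stay incomparable with each other. The function name `binary_orders` and its argument layout are assumptions made for this example.

```python
def binary_orders(listed, domain):
    """Expand an implicit preference v1 < v2 < ... < * into binary orders.

    listed: the values the user names, most preferred first.
    domain: all possible values of the nominal attribute.
    Returns pairs (x, y) meaning "x is preferred to y".
    """
    others = [v for v in domain if v not in listed]
    pairs = []
    for i, v in enumerate(listed):
        # v beats every value listed after it and every unlisted value.
        for w in listed[i + 1:] + others:
            pairs.append((v, w))
    return pairs


hotel_groups = ["T", "H", "M"]
print(binary_orders(["M", "H"], hotel_groups))  # [('M', 'H'), ('M', 'T'), ('H', 'T')]
print(binary_orders(["M"], hotel_groups))       # [('M', 'T'), ('M', 'H')]
```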
19
2. Problem Definition
Problem: Given an implicit preference on
Hotel-group, we want to find the skyline with
respect to this preference efficiently.
  • Questions
  • What preferences should be stored?
  • With these preferences, how can we perform a
    skyline query efficiently?

  • Usually, a user does NOT specify an ordering on
    all possible values of attribute Hotel-group
  • Only list a few of the most favorite choices
  • Idea of our proposed semi-materialization:
    IPO-tree
  • Store the skylines wrt the first-order implicit
    preferences ONLY
  • Find the skyline wrt an implicit preference of
    any order from the skylines wrt the first-order
    implicit preferences

Implicit preference
e.g. M < H < *
Here * denotes all possible values in attribute Hotel-group
other than M and H (in this case, T).
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Since the user lists only TWO choices, we define
the order of this preference to be TWO. We also
call this preference a second-order implicit
preference.
20
3. Adaptive SFS
Straightforward solution: Adaptive SFS
Adopt an existing skyline technique such as
SFS (Sort-Filter-Skyline) to compute the skyline
on the fly whenever a skyline query arrives.
Full Materialization solution
Pre-computation: for each possible preference,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly.
Semi-Materialization solution: IPO-Tree (Implicit Preference Order Tree)
Pre-computation: for SOME possible preferences,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly OR
derive it with simple operations.
21
3. Adaptive SFS
  • Original SFS
  • Idea
  • Suppose we have a score function f
  • Each tuple is assigned a score obtained by f
  • Sort the tuples in ascending order of the scores
  • Process the tuples in this order
  • Adaptive SFS
  • Similar idea
  • However, the original score function is based on
  • Numeric attributes
  • NOT nominal attributes
  • What we change is the score function
  • 1. Pre-computation: first pre-sort the tuples
    according to this new score function
  • 2. Skyline Query: re-sort the tuples for a skyline query
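The original SFS idea described above can be sketched as follows for numeric attributes only. This is a minimal illustration, assuming that smaller is better in every dimension and that the score function f is the plain sum of the attribute values (any function that gives a dominating tuple a strictly smaller score would do). The data is price and reverse hotel-class of the six packages; `NUMERIC` and `sfs_skyline` are names invented for this example.

```python
# name -> (price, reverse hotel-class); smaller is better in both dimensions.
NUMERIC = {
    "a": (1600, 1),
    "b": (2400, 4),
    "c": (3000, 0),
    "d": (3600, 1),
    "e": (2400, 3),
    "f": (3000, 2),
}


def dominates(p, q):
    """p dominates q: no worse everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(p, q)) and any(x < y for x, y in zip(p, q))


def sfs_skyline(tuples):
    """Sort by a monotone score, then scan once, keeping undominated tuples."""
    # Assumed score function: the sum of all attribute values.
    ordered = sorted(tuples, key=lambda name: sum(tuples[name]))
    window = []  # skyline points found so far
    for name in ordered:
        # A tuple can only be dominated by one with a smaller score,
        # so comparing against the current window is sufficient.
        if not any(dominates(tuples[s], tuples[name]) for s in window):
            window.append(name)
    return window


print(sfs_skyline(NUMERIC))  # ['a', 'c']
```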

22
4. IPO-Tree
Straightforward solution: Adaptive SFS
Adopt an existing skyline technique such as
SFS (Sort-Filter-Skyline) to compute the skyline
on the fly whenever a skyline query arrives.
Full Materialization solution
Pre-computation: for each possible preference,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly.
Semi-Materialization solution: IPO-Tree (Implicit Preference Order Tree)
Pre-computation: for SOME possible preferences,
(1) pre-compute the skyline and (2) store it.
Skyline Query: return the stored skyline directly OR
derive it with simple operations.
23
4. IPO-Tree
  • Questions
  • What preferences should be stored?
  • With these preferences, how can we perform a
    skyline query efficiently?
  • Idea of our proposed semi-materialization:
    IPO-tree
  • Store the skylines with respect to the first-order
    implicit preferences ONLY
  • Find the skyline with respect to an implicit
    preference of any order from the skylines with
    respect to the first-order implicit preferences

24
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
First-order implicit preference: M < *
Here * denotes the values other than M (i.e., H and T).
Binary orders: M < T, M < H
SKY1 = {a, c, e, f}
25
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
Here * denotes the values other than H (i.e., T and M).
f is NOT a skyline point in SKY2.
Why?
26
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
f is NOT a skyline point in SKY2.
Why?
With the binary order H < M, c dominates f.
We say that H < M disqualifies f as a skyline
point.
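The disqualification of f can be checked with a small brute-force sketch in which a preference is a set of binary orders (x, y), read as "x is preferred to y"; two different hotel groups with no binary order between them are treated as incomparable. The names below are hypothetical and the code only illustrates the dominance test, not the IPO-tree itself.

```python
# name -> (price, hotel_class, hotel_group)
PACKAGES = {
    "a": (1600, 4, "T"),
    "b": (2400, 1, "T"),
    "c": (3000, 5, "H"),
    "d": (3600, 4, "H"),
    "e": (2400, 2, "M"),
    "f": (3000, 3, "M"),
}


def dominates(p, q, orders):
    """p dominates q under a set of binary orders on Hotel-group."""
    group_better = (p[2], q[2]) in orders
    group_no_worse = p[2] == q[2] or group_better
    no_worse = p[0] <= q[0] and p[1] >= q[1] and group_no_worse
    strictly_better = p[0] < q[0] or p[1] > q[1] or group_better
    return no_worse and strictly_better


def skyline(orders):
    return {
        name
        for name, p in PACKAGES.items()
        if not any(dominates(q, p, orders) for other, q in PACKAGES.items() if other != name)
    }


sky1 = skyline({("M", "T"), ("M", "H")})  # first-order preference M < *
sky2 = skyline({("H", "T"), ("H", "M")})  # first-order preference H < *
print(sorted(sky1))  # ['a', 'c', 'e', 'f']
print(sorted(sky2))  # ['a', 'c', 'e']

# Without H < M, c and f are incomparable on Hotel-group; with it, c dominates f.
print(dominates(PACKAGES["c"], PACKAGES["f"], set()))         # False
print(dominates(PACKAGES["c"], PACKAGES["f"], {("H", "M")}))  # True
```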
27
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
M < H < *: binary orders M < H
28
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
M < H < *: binary orders M < H, M < T
Here * denotes the values other than M and H (i.e., T).
29
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
M < H < *: binary orders M < H, M < T, H < T
Here * denotes the values other than M and H (i.e., T).
30
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
M < H < *: binary orders M < H, M < T, H < T
31
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
PSKY1 = {e, f} (the points of SKY1 whose Hotel-group is M)
M < H < *: binary orders M < H, M < T, H < T
33
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
PSKY1 = {e, f}
M < H < *: binary orders M < H, M < T, H < T
Additional binary order! H < M is among the binary orders
of H < * but not among those of M < H < *.
This binary order may disqualify some data points
in SKY3, like f.
Observation: These points must be in PSKY1.
SKY3 = {a, c, e} U {e, f} = {a, c, e, f}
34
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Skyline wrt the first-order preference
M < *: binary orders M < T, M < H; SKY1 = {a, c, e, f}
Skyline wrt the first-order preference
H < *: binary orders H < T, H < M; SKY2 = {a, c, e}
PSKY1 = {e, f}
Skyline wrt the second-order preference
M < H < *: binary orders M < H, M < T, H < T
SKY3 = {a, c, e} U {e, f} = {a, c, e, f}
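The merge on this slide can be replayed in Python and compared against a brute-force skyline for M < H < *. The sketch rests on two assumptions that the transcript does not spell out: PSKY1 is taken to be the points of SKY1 whose Hotel-group is M, and the first operand of the union, {a, c, e}, is read as the intersection of SKY1 and SKY2. All names are made up for this example.

```python
# name -> (price, hotel_class, hotel_group)
PACKAGES = {
    "a": (1600, 4, "T"),
    "b": (2400, 1, "T"),
    "c": (3000, 5, "H"),
    "d": (3600, 4, "H"),
    "e": (2400, 2, "M"),
    "f": (3000, 3, "M"),
}
GROUPS = ["T", "H", "M"]


def dominates(p, q, orders):
    group_better = (p[2], q[2]) in orders
    no_worse = p[0] <= q[0] and p[1] >= q[1] and (p[2] == q[2] or group_better)
    return no_worse and (p[0] < q[0] or p[1] > q[1] or group_better)


def skyline(orders):
    return {
        n for n, p in PACKAGES.items()
        if not any(dominates(q, p, orders) for m, q in PACKAGES.items() if m != n)
    }


def first_order(v):
    """Binary orders of the first-order implicit preference v < *."""
    return {(v, w) for w in GROUPS if w != v}


# Stored skylines for the first-order preferences M < * and H < *.
sky1 = skyline(first_order("M"))
sky2 = skyline(first_order("H"))

# Assumption: PSKY1 = points of SKY1 with Hotel-group M.
psky1 = {n for n in sky1 if PACKAGES[n][2] == "M"}

# Assumed reading of the slide's merge: (SKY1 intersect SKY2) union PSKY1.
merged = (sky1 & sky2) | psky1

# Brute-force skyline for M < H < *, for comparison.
sky3 = skyline({("M", "H"), ("M", "T"), ("H", "T")})

print(sorted(merged))  # ['a', 'c', 'e', 'f']
print(sorted(sky3))    # ['a', 'c', 'e', 'f']
```

On this data the merged set equals the directly computed SKY3, using only set operations on the stored first-order skylines.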
35
4. IPO-Tree
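Merging Property
The skylines wrt the first-order preferences v1 < * and v2 < *
are merged into the skyline wrt the second-order
preference v1 < v2 < *.
Skyline wrt the first-order preference
M < *: SKY1 = {a, c, e, f}
Skyline wrt the first-order preference
H < *: SKY2 = {a, c, e}
Skyline wrt the second-order preference
M < H < *: SKY3 = {a, c, e, f}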
36
4. IPO-Tree
v1 < * and v2 < * merge into v1 < v2 < *
v1 < v2 < * and v3 < * merge into v1 < v2 < v3 < *
v1 < v2 < v3 < * and v4 < * merge into v1 < v2 < v3 < v4 < *
37
5. Empirical Study
  • Datasets
  • Synthetic Dataset
  • Anti-correlated dataset
  • Real Dataset (from UCI)
  • Nursery Dataset
  • Default Values (Synthetic)
  • No. of tuples 500K
  • No. of numeric dimensions 3
  • No. of nominal dimensions 2
  • No. of values in a nominal dimension 20
  • Order of implicit preference 3

38
5. Empirical Study
  • Variation
  • No. of data points
  • No. of numeric dimensions
  • No. of nominal dimensions
  • Cardinality of nominal dimensions
  • Order of implicit preference
  • Comparison
  • SFS-D (Original SFS)
  • SFS-A (Adaptive SFS)
  • IPO Tree
  • IPO Tree-10 (IPO Tree which stores the 10 most frequent
    values for each nominal attribute, for comparison)
39
5. Empirical Study
Synthetic Data Set
40
5. Empirical Study
Real Data Set
41
6. Conclusion
  • Different customers have different preferences, hence
    different skylines
  • Skyline Query on Nominal Attributes
  • Adaptive SFS algorithm
  • IPO-Tree algorithm
  • Experiments

42
Q&A
43
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
44
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Using some existing algorithms, we can first
remove the data points which cannot be in the
skyline with respect to any implicit preference
(here, b and d).
45
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Step 1 (Pre-computation): pre-sort the tuples
according to the new score function.
Each value in attribute Hotel-group is assigned
a SPECIAL value. This special value is set
to the total number of possible values in
Hotel-group (i.e., 3).
Package ID Score
a
c
e
f
46
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Score of point a is 1600 + 1 + 3 = 1604
(price + reverse hotel-class + special value).
Step 1 (Pre-computation): pre-sort the tuples
according to the new score function.
Each value in attribute Hotel-group is assigned
a SPECIAL value. This special value is set
to the total number of possible values in
Hotel-group (i.e., 3).
Package ID Score
a 1604
c
e
f
47
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Score of point c is 3000 + 0 + 3 = 3003
(price + reverse hotel-class + special value).
Step 1 (Pre-computation): pre-sort the tuples
according to the new score function.
Each value in attribute Hotel-group is assigned
a SPECIAL value. This special value is set
to the total number of possible values in
Hotel-group (i.e., 3).
Package ID Score
a 1604
c 3003
e
f
48
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Step 1 (Pre-computation): pre-sort the tuples
according to the new score function.
Each value in attribute Hotel-group is assigned
a SPECIAL value. This special value is set
to the total number of possible values in
Hotel-group (i.e., 3).
Scores before sorting
Package ID Score
a 1604
c 3003
e 2406
f 3005
Scores after sorting
Package ID Score
a 1604
e 2406
c 3003
f 3005
49
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Package ID Score
a 1604
e 2406
c 3003
f 3005
50
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Score of point a is 1600 + 1 + 2 = 1603.
Step 2 (Skyline Query): re-sort the tuples
for a skyline query (e.g., H < T < *).
Value H is assigned value 1. Value T is
assigned value 2. All values other than H
and T (i.e., M) still have value 3.
Pre-computation
Package ID Score
a 1604
e 2406
c 3003
f 3005
Skyline Query
Package ID Score
a 1603
e 2406
c
f 3005
51
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Score of point c is 3000 + 0 + 1 = 3001.
Since the scores of a and c are updated, we need
to re-sort a and c. Note that the relative ordering of
all OTHER points, which contain neither H nor
T, remains unchanged.
Step 2 (Skyline Query): re-sort the tuples
for a skyline query (e.g., H < T < *).
Value H is assigned value 1. Value T is
assigned value 2. All values other than H
and T (i.e., M) still have value 3.
Pre-computation
Package ID Score
a 1604
e 2406
c 3003
f 3005
Skyline Query
Package ID Score
a 1603
e 2406
c 3001
f 3005
52
3. Adaptive SFS
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Step 2 (Skyline Query): re-sort the tuples
for a skyline query (e.g., H < T < *).
Pre-computation
Package ID Score
a 1604
e 2406
c 3003
f 3005
Skyline Query
Package ID Score
a 1603
e 2406
c 3001
f 3005
We then just use the original SFS. With this sorted
list, we find the skyline {a, c}.
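The two steps of Adaptive SFS on this example can be sketched as below. The score follows the slides (price + reverse hotel-class + value assigned to the hotel group, with unlisted groups keeping the special value 3). For brevity the sketch re-sorts the whole pre-sorted list with `sorted`, whereas the slides only reposition the tuples whose scores changed; `CANDIDATES`, `score` and `adaptive_sfs` are hypothetical names.

```python
# Candidates left after removing b and d: name -> (price, reverse class, group)
CANDIDATES = {
    "a": (1600, 1, "T"),
    "c": (3000, 0, "H"),
    "e": (2400, 3, "M"),
    "f": (3000, 2, "M"),
}
SPECIAL = 3  # total number of possible Hotel-group values


def score(t, rank):
    """price + reverse hotel-class + value of the hotel group (SPECIAL if unlisted)."""
    return t[0] + t[1] + rank.get(t[2], SPECIAL)


# Step 1 (pre-computation): every group has the special value.
presorted = sorted(CANDIDATES, key=lambda n: score(CANDIDATES[n], {}))


def adaptive_sfs(listed):
    """Step 2 (skyline query) for the implicit preference listed[0] < listed[1] < ... < *."""
    rank = {v: i + 1 for i, v in enumerate(listed)}
    ordered = sorted(presorted, key=lambda n: score(CANDIDATES[n], rank))

    def dominates(p, q):
        # Unlisted groups share the special value and stay incomparable.
        group_better = rank.get(p[2], SPECIAL) < rank.get(q[2], SPECIAL)
        no_worse = p[0] <= q[0] and p[1] <= q[1] and (p[2] == q[2] or group_better)
        return no_worse and (p[0] < q[0] or p[1] < q[1] or group_better)

    window = []
    for n in ordered:  # plain SFS scan over the re-sorted list
        if not any(dominates(CANDIDATES[s], CANDIDATES[n]) for s in window):
            window.append(n)
    return window


print(presorted)                 # ['a', 'e', 'c', 'f']
print(adaptive_sfs(["H", "T"]))  # ['a', 'c']
```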
53
4. IPO-Tree
e.g.1 Hotel-Group: M < *, Airline: G < *
e.g.2 Hotel-Group: M < *, Airline: (no preference)
e.g.3 Hotel-Group: (no preference), Airline: G < *
  • Idea
  • Pre-computation
  • Store the skyline wrt the first-order preference
  • Skyline Query
  • Find the skyline wrt the preference of any order
    according to the stored skylines wrt the
    first-order preference

How can we do it efficiently?
We propose an indexing structure called IPO-tree
54
4. IPO-Tree
e.g. three nominal attributes (like Hotel-group),
each of which contains 40 possible values
Package ID Price Reverse Hotel-class Hotel-group Airline
a 1600 1 T (Tulips) G (Gonna)
b 2400 4 T (Tulips) G (Gonna)
c 3000 0 H (Horizon) G (Gonna)
d 3600 1 H (Horizon) R (Redish)
e 2400 3 M (Mozilla) R (Redish)
f 3000 2 M (Mozilla) W (Wings)
Full Materialization
there are 4.1 × 10^9 possible preferences (in our
problem setting).
Semi-Materialization: IPO-tree
there are 70,644 nodes (which is significantly
smaller than 4.1 × 10^9).
root
Hotel-Group
Airline
Hotel-group: T < *, Airline: G < *
Hotel-group: T < *, Airline: (no preference)
Hotel-group: (no preference), Airline: G < *
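The slides do not say how the two counts are obtained. The arithmetic below reproduces both figures under assumptions that are mine, not the authors': each attribute's implicit preference has order at most two (no value, one value, or an ordered pair of values), and the IPO-tree has one level per nominal attribute with 41 children per node (the 40 values plus a "no preference" branch).

```python
from math import perm

VALUES = 40  # possible values per nominal attribute
ATTRS = 3    # number of nominal attributes

# Assumption: implicit preferences of order 0, 1 or 2 per attribute.
prefs_per_attr = sum(perm(VALUES, k) for k in range(3))  # 1 + 40 + 40*39
full_materialization = prefs_per_attr ** ATTRS

# Assumption: one tree level per attribute, 40 values plus "no preference".
ipo_nodes = sum((VALUES + 1) ** level for level in range(ATTRS + 1))

print(prefs_per_attr)        # 1601
print(full_materialization)  # 4103684801, about 4.1 x 10^9
print(ipo_nodes)             # 70644
```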
55
4. IPO-Tree
  • One nominal attribute
  • Merging Property
  • Multiple nominal attributes
  • Consider ONE nominal attribute at a time with
    Merging Property
  • Fix the ordering of OTHER nominal attributes
  • Then, consider each of other nominal attributes
    with Merging Property

56
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
57
4. IPO-Tree
Package ID Price Hotel-class Hotel-group Airline
a 1600 4 T (Tulips) G (Gonna)
b 2400 1 T (Tulips) G (Gonna)
c 3000 5 H (Horizon) G (Gonna)
d 3600 4 H (Horizon) R (Redish)
e 2400 2 M (Mozilla) R (Redish)
f 3000 3 M (Mozilla) W (Wings)
Hotel-Group: M < H < *, Airline: G < R < *
58
4. IPO-Tree
M < *: SKY1 = {a, c, e, f}
H < *: SKY2 = {a, c, e}
PSKY1 = {e, f}
M < H < *: SKY3 = {a, c, e} U {e, f} = {a, c, e, f}
59
4. IPO-Tree
  • Theorem: Given a user query with an x-th order
    implicit preference on m nominal attributes,
    the number of set operations required is O(x^m).

E.g., m = 2, x = 2: the number of set operations is O(2^2).