Title: Efficient Skyline Querying with Variable User Preferences on Nominal Attributes
- Raymond Chi-Wing Wong (1), Ada Wai-Chee Fu (2), Jian Pei (3), Yip Sing Ho (2), Tai Wong (2) and Yubao Liu (4)
- (1) The Hong Kong University of Science and Technology
- (2) The Chinese University of Hong Kong
- (3) Simon Fraser University
- (4) Sun Yat-Sen University
Prepared by Raymond Chi-Wing Wong
Presented by Raymond Chi-Wing Wong
Outline
- Introduction
- Skyline
- Contributions
- Problem Definition
- Adaptive SFS
- IPO-Tree
- Conclusion
1. Introduction
Suppose we want to look for a vacation package among 3 packages.
- We want a cheaper package.
- We want a higher hotel-class.
Package ID Price Hotel-class
a 1600 4
b 2400 1
c 3000 5
Package a dominates package b, because we know that
- Package a has a lower price
- Package a has a higher hotel-class
Skyline: the set of packages which are NOT dominated by any other package, i.e., all of the best possible choices. Here the skyline is {a, c}.
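The dominance test and the skyline of the three packages above can be checked with a small sketch. The function names `dominates` and `skyline` are illustrative choices, and the quadratic scan is only meant to mirror the definition, not any algorithm from the talk.

```python
# Packages from the slide: (price, hotel_class).
# A lower price is better; a higher hotel-class is better.
packages = {"a": (1600, 4), "b": (2400, 1), "c": (3000, 5)}

def dominates(p, q):
    """p dominates q if p is no worse on both attributes and strictly better on one."""
    no_worse = p[0] <= q[0] and p[1] >= q[1]
    strictly_better = p[0] < q[0] or p[1] > q[1]
    return no_worse and strictly_better

def skyline(points):
    """Return the IDs of all points that no other point dominates."""
    return sorted(
        pid for pid, p in points.items()
        if not any(dominates(q, p) for qid, q in points.items() if qid != pid)
    )

print(skyline(packages))  # ['a', 'c']
```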
1. Introduction
Suppose we want to look for a vacation package among 6 packages.
- We want a cheaper package.
- We want a higher hotel-class.
- How about Hotel-group? Different customers may have different preferences on Hotel-group.
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
- Suppose a customer has the preference H < T < M. The skyline points are packages a and c.
- Suppose another customer has the preference H < M < T. The skyline points are packages a, c and e.
In other words, different preferences give different skyline points.
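A minimal sketch of how the two customers' skylines can be reproduced, assuming each customer gives a total order on Hotel-group written as a string with the most preferred value first (e.g. "HTM" for H, then T, then M). The helper names are hypothetical.

```python
# (price, hotel_class, hotel_group) for the six packages on the slide.
PACKAGES = {
    "a": (1600, 4, "T"), "b": (2400, 1, "T"),
    "c": (3000, 5, "H"), "d": (3600, 4, "H"),
    "e": (2400, 2, "M"), "f": (3000, 3, "M"),
}

def dominates(p, q, rank):
    """rank maps a hotel-group to its position in the customer's order (0 = best)."""
    no_worse = p[0] <= q[0] and p[1] >= q[1] and rank[p[2]] <= rank[q[2]]
    better = p[0] < q[0] or p[1] > q[1] or rank[p[2]] < rank[q[2]]
    return no_worse and better

def skyline(points, order):
    """Skyline under a total order on hotel-group, most preferred value first."""
    rank = {group: i for i, group in enumerate(order)}
    return sorted(
        pid for pid, p in points.items()
        if not any(dominates(q, p, rank) for qid, q in points.items() if qid != pid)
    )

print(skyline(PACKAGES, "HTM"))  # ['a', 'c']
print(skyline(PACKAGES, "HMT"))  # ['a', 'c', 'e']
```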
Problem: Given a preference on Hotel-group, we want to find the skyline with respect to this preference efficiently.
Straightforward solution
- Adopt an existing skyline technique such as SFS (Sort-First Skyline) to compute the skyline on-the-fly whenever we need to perform a skyline query.
- It works. However, this solution is not scalable and the results cannot be returned efficiently.
Full Materialization solution
- Pre-computation: for each possible preference, (1) pre-compute the skyline and (2) store it.
- Skyline query: return the stored skyline directly.
- It works when there is a limited number of preferences. However, this solution is not scalable when there are many possible preferences.
- E.g., with three nominal attributes (like Hotel-group), each of which contains 40 possible values, there are 4.1 x 10^9 possible preferences (in our problem setting).
Semi-Materialization solution
- Pre-computation: for SOME possible preferences, (1) pre-compute the skyline and (2) store it.
- Skyline query: return the stored skyline directly OR obtain it with simple operations.
- Good tradeoff between storage consumption and efficiency.
- Questions
  - What preferences should be stored?
  - With these preferences, how can we perform a skyline query efficiently?
- Proposed methods
  - Adaptive SFS: follows the straightforward solution (compute the skyline on-the-fly for each skyline query)
  - IPO-Tree (Implicit Preference Order Tree): follows the semi-materialization solution (store the skylines of SOME preferences and answer a query directly or with simple operations)
1. Contributions
- Most Existing Work
  - Assumes that each attribute has a fixed ordering (either total or partial) on the attribute values
- Our Work
  - Different users can have different preferences (i.e., the ordering on attribute values differs from user to user)
  - Proposes a semi-materialization method, IPO-tree, to answer the skyline query efficiently
2. Problem Definition
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
- Usually, a user does NOT need to specify an ordering on all possible values of attribute Hotel-group.
- The user only lists a few of the most favorite choices.
Implicit preference, e.g., M < H < *
- A user prefers M to H.
- A user prefers H to *, where * stands for all possible values in attribute Hotel-group other than M and H (in this case, T).
- The order involving the unlisted values is not stated by the user. This is the reason why we call it an implicit preference.
Binary orders implied by M < H < *: M < H, M < T, H < T
Problem: Given an implicit preference on Hotel-group, we want to find the skyline with respect to this preference efficiently.
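As an illustration of how an implicit preference expands into binary orders, here is a small sketch. The function name `binary_orders` and the pair representation `(better, worse)` are assumptions made for this example; unlisted values are left incomparable with each other.

```python
def binary_orders(preferred, domain):
    """Expand 'v1 < v2 < ... < *' into the set of binary orders it implies.

    preferred: the values the user listed, most preferred first.
    domain:    all possible values of the nominal attribute.
    Each pair (v, w) means "v is preferred to w".
    """
    others = [v for v in domain if v not in preferred]
    orders = set()
    for i, v in enumerate(preferred):
        for w in preferred[i + 1:]:   # listed values that come later
            orders.add((v, w))
        for w in others:              # every value covered by '*'
            orders.add((v, w))
    return orders

# The preference M < H < * over Hotel-group = {T, H, M}:
print(sorted(binary_orders(["M", "H"], ["T", "H", "M"])))
# [('H', 'T'), ('M', 'H'), ('M', 'T')]
```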
2. Problem Definition
Order of an implicit preference
- Since the user gives only TWO choices in M < H < *, we define the order of this preference to be TWO. We also call this preference a second-order implicit preference.
- Questions
  - What preferences should be stored?
  - With these preferences, how can we perform a skyline query efficiently?
- Idea of our proposed semi-materialization method, IPO-tree
  - Store the skylines wrt the first-order implicit preferences ONLY
  - Find the skyline wrt an implicit preference of any order from the skylines wrt the first-order implicit preferences
3. Adaptive SFS
- Original SFS
  - Idea
    - Suppose we have a score function f
    - Each tuple is assigned a score obtained by f
    - Sort the tuples in ascending order of the scores
    - Process the tuples in this order
- Adaptive SFS
  - Similar idea
  - However, the original score function is based on
    - Numeric attributes
    - NOT nominal attributes
  - What we change is the score function
  - Idea
    1. Pre-computation: first pre-sort the tuples according to this new score function
    2. Skyline query: re-sort the tuples for a skyline query
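The original SFS idea can be sketched as follows on the numeric attributes only. The score function f is assumed here to be the sum of the attribute values (any function that is monotone in every attribute would serve), and hotel-class is stored reversed (5 minus hotel-class) so that smaller is better everywhere. This is an illustrative sketch, not the authors' implementation.

```python
# (price, reverse hotel-class): smaller is better on both attributes.
POINTS = {
    "a": (1600, 1), "b": (2400, 4), "c": (3000, 0),
    "d": (3600, 1), "e": (2400, 3), "f": (3000, 2),
}

def f(p):
    """Assumed score function: the sum of the (smaller-is-better) values."""
    return sum(p)

def dominates(p, q):
    """p is no worse than q everywhere and differs from q somewhere."""
    return all(x <= y for x, y in zip(p, q)) and p != q

def sfs(points):
    """Sort by score, then keep each tuple that no accepted tuple dominates.

    Because f is monotone, a tuple can only be dominated by tuples with a
    smaller score, so an accepted tuple never has to be removed later.
    """
    skyline = []
    for pid in sorted(points, key=lambda i: f(points[i])):
        if not any(dominates(points[s], points[pid]) for s in skyline):
            skyline.append(pid)
    return skyline

print(sfs(POINTS))  # ['a', 'c']
```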
4. IPO-Tree
- Questions
  - What preferences should be stored?
  - With these preferences, how can we perform a skyline query efficiently?
- Idea of our proposed semi-materialization method, IPO-tree
  - Store the skylines with respect to the first-order implicit preferences ONLY
  - Find the skyline with respect to an implicit preference of any order from the skylines with respect to the first-order implicit preferences
4. IPO-Tree
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Skyline wrt the first-order preference M < *
- * stands for the values other than M (i.e., H and T)
- Binary orders: M < T, M < H
- SKY1 = {a, c, e, f}
Skyline wrt the first-order preference H < *
- * stands for the values other than H (i.e., T and M)
- Binary orders: H < T, H < M
- SKY2 = {a, c, e}
- f is NOT a skyline point. Why? With the binary order H < M, c dominates f. We say that H < M disqualifies f as a skyline point.
Skyline wrt the second-order preference M < H < *
- * stands for the values other than M and H (i.e., T)
- Binary orders: M < H, M < T, H < T
- PSKY1 = {e, f}, the points of SKY1 whose Hotel-group value is M
- Additional binary order! H < M is used for SKY2 but is not implied by M < H < *. This binary order may disqualify some data points in SKY3, like f.
- Observation: these points must be in PSKY1
- SKY3 = (SKY1 ∩ SKY2) ∪ PSKY1 = {a, c, e} ∪ {e, f} = {a, c, e, f}
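The set computation on this slide can be replayed in a few lines. The sketch assumes the merge takes the form (SKY1 ∩ SKY2) ∪ PSKY1, where PSKY1 holds the points of SKY1 whose Hotel-group is one of the values already listed in the first preference. The function name `merge` is hypothetical.

```python
# Hotel-group of each package, taken from the table above.
GROUP = {"a": "T", "b": "T", "c": "H", "d": "H", "e": "M", "f": "M"}

SKY1 = {"a", "c", "e", "f"}  # skyline wrt M < *
SKY2 = {"a", "c", "e"}       # skyline wrt H < *

def merge(sky_prefix, sky_next, prefix_values, group):
    """Combine the skyline of 'v1 < ... < *' with that of 'v_next < *'.

    prefix_values: the values already listed in the first preference.
    Points of sky_prefix with a listed value (PSKY) are always kept, because
    the extra binary orders of 'v_next < *' may wrongly disqualify them.
    """
    psky = {p for p in sky_prefix if group[p] in prefix_values}
    return (sky_prefix & sky_next) | psky

SKY3 = merge(SKY1, SKY2, {"M"}, GROUP)  # skyline wrt M < H < *
print(sorted(SKY3))  # ['a', 'c', 'e', 'f']
```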
4. IPO-Tree
Merging Property
- Skyline wrt the first-order preference v1 < * (e.g., M < *): SKY1 = {a, c, e, f}
- Skyline wrt the first-order preference v2 < * (e.g., H < *): SKY2 = {a, c, e}
- Skyline wrt the second-order preference v1 < v2 < * (e.g., M < H < *): SKY3 = {a, c, e, f}
Applying the merging property repeatedly
- v1 < * and v2 < * give v1 < v2 < *
- v1 < v2 < * and v3 < * give v1 < v2 < v3 < *
- v1 < v2 < v3 < * and v4 < * give v1 < v2 < v3 < v4 < *
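The repeated merging can be sketched as a loop over the listed values. Here the first-order skylines are computed by brute force in place of a stored structure, and the merge step is assumed to be (current ∩ first-order skyline of the next value) ∪ (points of current whose value is already listed). All names are illustrative.

```python
PACKAGES = {
    "a": (1600, 4, "T"), "b": (2400, 1, "T"),
    "c": (3000, 5, "H"), "d": (3600, 4, "H"),
    "e": (2400, 2, "M"), "f": (3000, 3, "M"),
}
DOMAIN = ["T", "H", "M"]

def binary_orders(preferred):
    """Pairs (v, w), meaning v is preferred to w, implied by 'v1 < v2 < ... < *'."""
    orders = set()
    for i, v in enumerate(preferred):
        orders |= {(v, w) for w in preferred[i + 1:]}
        orders |= {(v, w) for w in DOMAIN if w not in preferred}
    return orders

def skyline(preferred):
    """Brute-force skyline under the implicit preference (reference answer)."""
    orders = binary_orders(preferred)

    def dominates(p, q):
        group_better = (p[2], q[2]) in orders
        no_worse = p[0] <= q[0] and p[1] >= q[1] and (p[2] == q[2] or group_better)
        return no_worse and (p[0] < q[0] or p[1] > q[1] or group_better)

    return {i for i, p in PACKAGES.items()
            if not any(dominates(q, p) for j, q in PACKAGES.items() if j != i)}

def skyline_by_merging(preferred):
    """Build the skyline of 'v1 < ... < vx < *' from first-order skylines only."""
    first_order = {v: skyline([v]) for v in preferred}  # would be pre-computed
    result = first_order[preferred[0]]
    for k in range(1, len(preferred)):
        listed = set(preferred[:k])
        psky = {i for i in result if PACKAGES[i][2] in listed}
        result = (result & first_order[preferred[k]]) | psky
    return result

print(sorted(skyline_by_merging(["M", "H"])))  # ['a', 'c', 'e', 'f']
print(sorted(skyline_by_merging(["H", "T"])))  # ['a', 'c']
```

On this small table, the merged result agrees with the brute-force skyline for every ordering of one, two or three values.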
5. Empirical Study
- Datasets
  - Synthetic dataset: anti-correlated dataset
  - Real dataset (from UCI): Nursery dataset
- Default values (synthetic)
  - No. of tuples: 500K
  - No. of numeric dimensions: 3
  - No. of nominal dimensions: 2
  - No. of values in a nominal dimension: 20
  - Order of implicit preference: 3
5. Empirical Study
- Variation
  - No. of data points
  - No. of numeric dimensions
  - No. of nominal dimensions
  - Cardinality of nominal dimensions
  - Order of implicit preference
- Comparison
  - SFS-D: original SFS
  - SFS-A: Adaptive SFS
  - IPO Tree
  - IPO Tree-10: IPO Tree which stores the 10 most frequent values for each nominal attribute (for comparison)
5. Empirical Study
[Figure: results on the synthetic data set]
5. Empirical Study
[Figure: results on the real data set]
6. Conclusion
- Different customers have different preferences → different skylines
- Skyline Query on Nominal Attributes
  - Adaptive SFS algorithm
  - IPO-Tree algorithm
- Experiments
Q&A
3. Adaptive SFS
Original table
Package ID Price Hotel-class Hotel-group
a 1600 4 T (Tulips)
b 2400 1 T (Tulips)
c 3000 5 H (Horizon)
d 3600 4 H (Horizon)
e 2400 2 M (Mozilla)
f 3000 3 M (Mozilla)
Table with the hotel-class reversed (here, 5 minus Hotel-class), so that a smaller value is better on every numeric attribute
Package ID Price Reverse Hotel-class Hotel-group
a 1600 1 T (Tulips)
b 2400 4 T (Tulips)
c 3000 0 H (Horizon)
d 3600 1 H (Horizon)
e 2400 3 M (Mozilla)
f 3000 2 M (Mozilla)
Using some existing algorithms, we can first remove the data points which cannot be in the skyline with respect to any implicit preference (here, b and d).
Step 1 (Pre-computation): pre-sort the tuples according to the new score function
- Each value in attribute Hotel-group is assigned a SPECIAL value. This special value is set to the total number of possible values in Hotel-group (i.e., 3).
- Score of a point = Price + Reverse Hotel-class + value assigned to its Hotel-group
- Score of point a is 1600 + 1 + 3 = 1604
- Score of point c is 3000 + 0 + 3 = 3003
Package ID Score
a 1604
e 2406
c 3003
f 3005
Step 2 (Skyline Query): re-sort the tuples for a skyline query (e.g., H < T < *)
- Value H is assigned value 1. Value T is assigned value 2. All values other than H and T (i.e., M) still have value 3.
- Score of point a is 1600 + 1 + 2 = 1603
- Score of point c is 3000 + 0 + 1 = 3001
- Since the scores of a and c are updated, we need to re-sort a and c. Note that the ordering of all OTHER points, which contain neither H nor T, remains unchanged.
Package ID Score (pre-computation) Score (skyline query)
a 1604 1603
e 2406 2406
c 3003 3001
f 3005 3005
We then just use the original SFS. With this sorted list, we find the skyline {a, c}.
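The two steps can be sketched as below, using the four points that remain after b and d are removed. The score is assumed to be the plain sum Price + Reverse Hotel-class + assigned Hotel-group value, which matches the numbers on the slides. The helper names and the use of `heapq.merge` to re-insert only the re-scored points are illustrative choices.

```python
import heapq

# (price, reverse hotel-class, hotel-group) after removing b and d.
POINTS = {"a": (1600, 1, "T"), "c": (3000, 0, "H"),
          "e": (2400, 3, "M"), "f": (3000, 2, "M")}
DOMAIN_SIZE = 3  # special value given to every group not listed in the query

def score(point, value_of):
    price, reverse_class, group = point
    return price + reverse_class + value_of.get(group, DOMAIN_SIZE)

def presort(points):
    """Step 1: sort once, with every group at the special value."""
    return sorted((score(p, {}), pid) for pid, p in points.items())

def resort(presorted, points, preferred):
    """Step 2: re-score only points whose group is listed, then merge them back."""
    value_of = {v: rank for rank, v in enumerate(preferred, start=1)}
    unchanged = [(s, pid) for s, pid in presorted if points[pid][2] not in value_of]
    changed = sorted((score(points[pid], value_of), pid)
                     for _, pid in presorted if points[pid][2] in value_of)
    return list(heapq.merge(unchanged, changed)), value_of

def dominates(p, q, value_of):
    vp, vq = value_of.get(p[2], DOMAIN_SIZE), value_of.get(q[2], DOMAIN_SIZE)
    no_worse = p[0] <= q[0] and p[1] <= q[1] and (p[2] == q[2] or vp < vq)
    return no_worse and (p[0] < q[0] or p[1] < q[1] or vp < vq)

def adaptive_sfs(points, presorted, preferred):
    order, value_of = resort(presorted, points, preferred)
    skyline = []
    for _, pid in order:  # plain SFS scan over the re-sorted list
        if not any(dominates(points[s], points[pid], value_of) for s in skyline):
            skyline.append(pid)
    return skyline

PRESORTED = presort(POINTS)
print(PRESORTED)                                   # scores 1604, 2406, 3003, 3005
print(adaptive_sfs(POINTS, PRESORTED, ["H", "T"]))  # ['a', 'c']
```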
534. IPO-Tree
e.g.1 Hotel-Group Mlt Airline
Glt e.g.2 Hotel-Group Mlt Airline
? e.g.3 Hotel-Group ? Airline
Glt
- Idea
- Pre-computation
- Store the skyline wrt the first-order preference
- Skyline Query
- Find the skyline wrt the preference of any order
according to the stored skylines wrt the
first-order preference
How can we do it efficiently?
We propose an indexing structure called IPO-tree
4. IPO-Tree
Package ID Price Reverse Hotel-class Hotel-group Airline
a 1600 1 T (Tulips) G (Gonna)
b 2400 4 T (Tulips) G (Gonna)
c 3000 0 H (Horizon) G (Gonna)
d 3600 1 H (Horizon) R (Redish)
e 2400 3 M (Mozilla) R (Redish)
f 3000 2 M (Mozilla) W (Wings)
E.g., three nominal attributes (like Hotel-group), each of which contains 40 possible values
- Full Materialization: there are 4.1 x 10^9 possible preferences (in our problem setting).
- Semi-Materialization (IPO-tree): there are 70,644 nodes (which is significantly smaller than 4.1 x 10^9).
IPO-tree structure
- root
- Level for Hotel-group
- Level for Airline
- Example nodes
  - Hotel-group: T < *, Airline: G < *
  - Hotel-group: T < *, Airline: φ (no preference)
  - Hotel-group: φ (no preference), Airline: G < *
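The two figures above can be reproduced by a short counting sketch under two assumptions that are not stated on the slide: (1) each attribute carries an implicit preference of order at most 2, and (2) the tree has one level per nominal attribute, with one child per value plus one "no preference" child at each node. Both counting models are guesses that happen to match the quoted numbers.

```python
from math import perm

def preferences_per_attribute(cardinality, max_order):
    """Number of implicit preferences of order 0..max_order on one attribute."""
    return sum(perm(cardinality, k) for k in range(max_order + 1))

def full_materialization_count(cardinality, attributes, max_order):
    """One stored skyline per combination of per-attribute preferences."""
    return preferences_per_attribute(cardinality, max_order) ** attributes

def ipo_tree_nodes(cardinality, attributes):
    """Root plus one level per attribute; each node has cardinality + 1 children."""
    return sum((cardinality + 1) ** level for level in range(attributes + 1))

print(full_materialization_count(40, 3, 2))  # 4103684801, about 4.1 x 10^9
print(ipo_tree_nodes(40, 3))                 # 70644
```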
4. IPO-Tree
- One nominal attribute
  - Merging Property
- Multiple nominal attributes
  - Consider ONE nominal attribute at a time with the Merging Property
  - Fix the ordering of the OTHER nominal attributes
  - Then, consider each of the other nominal attributes with the Merging Property
4. IPO-Tree
Package ID Price Hotel-class Hotel-group Airline
a 1600 4 T (Tulips) G (Gonna)
b 2400 1 T (Tulips) G (Gonna)
c 3000 5 H (Horizon) G (Gonna)
d 3600 4 H (Horizon) R (Redish)
e 2400 2 M (Mozilla) R (Redish)
f 3000 3 M (Mozilla) W (Wings)
Query: Hotel-group: M < H < *, Airline: G < R < *
4. IPO-Tree
- M < *: SKY1 = {a, c, e, f}
- H < *: SKY2 = {a, c, e}
- PSKY1 = {e, f}
- M < H < *: SKY3 = (SKY1 ∩ SKY2) ∪ PSKY1 = {a, c, e} ∪ {e, f} = {a, c, e, f}
4. IPO-Tree
- Theorem: Given a user query with an x-th order implicit preference on m nominal attributes, the number of set operations required is O(x^m).
- E.g., m = 2, x = 2: No. of set operations = O(2^2)