Approximate querying about the Past, the Present, and the Future in SpatioTemporal Databases - PowerPoint PPT Presentation

Provided by: Jim459
1
Approximate querying about the Past, the Present,
and the Future in Spatio-Temporal Databases
Jimeng Sun, Carnegie Mellon Univ.
Dimitris Papadias, HKUST
Joint work with
Yufei Tao, City Univ. of Hong Kong
Bin Liu, HKUST
2
Motivation
  • Spatio-temporal databases and Data streams
  • The monitoring applications
  • Traffic supervision
  • Mobile users monitoring
  • Weather forecasting
  • Example
  • find the number of vehicles
  • in the city center now
  • The challenge is to provide fast query response
    in update-intensive environments

3
Problem and method
  • Problems
  • How to efficiently store/summarize
    spatio-temporal information?
  • How to approximately answer queries about the
    past, the present, and the future?
  • Given a large number of updates <id, loc, time>,
    answer count query <R, t> where R is a query
    region and t a timestamp.
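
To make the problem statement concrete, here is a minimal sketch of the exact (non-approximate) baseline: replay every update <id, loc, time> up to t and count the objects inside R. The tuple layouts and the name `exact_count` are assumptions made for this illustration, not part of the presentation.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi), hypothetical layout


def exact_count(updates: Iterable[Tuple[str, Point, int]], region: Rect, t: int) -> int:
    """Replay updates <id, loc, time> with time <= t and count objects inside region."""
    last_loc: Dict[str, Point] = {}
    for obj_id, loc, time in sorted(updates, key=lambda u: u[2]):
        if time > t:
            break
        last_loc[obj_id] = loc  # later updates overwrite earlier locations
    x_lo, y_lo, x_hi, y_hi = region
    return sum(1 for x, y in last_loc.values()
               if x_lo <= x <= x_hi and y_lo <= y <= y_hi)


updates = [("obj1", (1.0, 1.0), 1), ("obj2", (5.0, 5.0), 1), ("obj1", (6.0, 6.0), 2)]
in_center_at_t2 = exact_count(updates, (4.0, 4.0, 7.0, 7.0), 2)
```

Replaying all updates for every query is what becomes too slow in update-intensive settings, which is the motivation for summarizing the data instead.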

4
Method specification
  • Present time query - Adaptive Multi-dimensional
    histogram (AMH)
  • Historical time query - Historical Synopsis
  • Future time query - Prediction Model

5
Related work
  • Histograms
  • Static multi-dimensional histograms
  • Mhist [Poosala et al. 97], Minskew [Acharya et al.
    99], Genhist [Gunopulos et al. 00], SQ [Aboulnaga
    et al. 00]
  • Query-adaptive multi-dimensional histograms
  • STGrid [Aboulnaga et al. 00], STHoles [Bruno et al.
    01], SASH [Lim 03]
  • Spatio-temporal databases
  • focus on exact query processing
  • [Agarwal et al. 00], [Pfoser et al. 00], [Kollios et al.
    99]

6
Outline
  • Introduction
  • Problem and proposed method
  • Present time query: Adaptive multi-dimensional
    histogram
  • Historical time query: Historical synopsis
  • Future time query: Prediction model
  • Experiment
  • Conclusion

7
Query types
SELECT COUNT(*) FROM Objects WHERE location IN
R AND time AT t
8
System Overview
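[Diagram: the components are the Historical Synopsis (AMH and Past Index) and the Prediction Model. Spatio-temporal updates enter the system; queries are labeled PT (present time), HT (historical time), and FT (future time).]
Spatio-temporal updates, e.g.:
  Id    old_loc  new_loc  time
  obj1  loc1     loc2     t1
  obj2  loc3     loc4     t1
  obj1  loc2     loc3     t2
  obj3  loc4     loc5     t2
  ...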
  • Goals
  • Accurate query result
  • Fast query response under intensive updates.

9
Outline
  • Introduction
  • Problem and proposed method
  • Present time query: Adaptive multi-dimensional
    histogram
  • Historical time query: Historical synopsis
  • Future time query: Prediction model
  • Experiment
  • Conclusion

10
Adaptive Multi-dimensional Histogram (AMH)
  • Objective: minimize $\mathrm{WVS} = \sum_i (\mathrm{area}_i \cdot \mathrm{var}_i)$
    (Minskew [Acharya, Poosala, Ramaswamy 99])



Resolution H = 6
Max number of buckets B = 6
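
As a rough illustration of the objective, the sketch below computes the weighted variance sum for a set of rectangular buckets laid over a grid of per-cell counts, taking a bucket's area to be its number of cells. The grid, the bucket tuple format, and the function name `wvs` are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical bucket format: inclusive cell indices (row_lo, col_lo, row_hi, col_hi).
Bucket = Tuple[int, int, int, int]


def wvs(grid: Sequence[Sequence[float]], buckets: List[Bucket]) -> float:
    """Sum over buckets of (area in cells) * (variance of the cell counts inside)."""
    total = 0.0
    for r0, c0, r1, c1 in buckets:
        cells = [grid[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
        mean = sum(cells) / len(cells)
        variance = sum((v - mean) ** 2 for v in cells) / len(cells)
        total += len(cells) * variance
    return total


grid = [[1, 1, 9, 9],
        [1, 1, 9, 9]]
one_bucket = wvs(grid, [(0, 0, 1, 3)])                  # mixes sparse and dense cells
two_buckets = wvs(grid, [(0, 0, 1, 1), (0, 2, 1, 3)])   # each bucket is uniform
```

A partitioning whose buckets are internally uniform yields a lower WVS, which is why the second partitioning is preferable here.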
11
Dynamic Maintenance of AMH
  • Our scheme records information during
    construction and modifies the structure as needed.
  • 1. information update
  • Update the bucket count
  • 2. bucket reorganization
  • Merge to claim buckets
  • Split to reduce WVS

Reorganization algorithm:
while (no pending query)
    while (b < B)
        split()
    merge()
12
Information update
[Figure: mapping between the buckets (b1 to b6) and the BPT (nodes n1 to n5, buckets b1 to b6)]
13
Bucket reorganization - Merge
  • Merge the subtree that leads to minimal WVS
    increase

[Figure: BPT and buckets before and after a merge; buckets b1 and b2 are merged into a single bucket b]
Bucket info:
1. region [x-, x+] × [y-, y+]
2. frequency (count/area)
3. 2nd moment (for variance calculation)
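
One way to read the bucket information is that a running count and a running sum of squared per-cell counts are enough to recompute the variance after an update or a merge, without rescanning the cells. The following is a sketch under that assumption; the class `BucketInfo`, its field names, and the helper `merge` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BucketInfo:
    # Region as inclusive cell extents (hypothetical representation).
    x_lo: int
    x_hi: int
    y_lo: int
    y_hi: int
    count: float = 0.0   # total number of objects in the bucket
    sum_sq: float = 0.0  # sum of squared per-cell counts (2nd moment * area)

    @property
    def area(self) -> int:
        return (self.x_hi - self.x_lo + 1) * (self.y_hi - self.y_lo + 1)

    @property
    def frequency(self) -> float:
        return self.count / self.area

    @property
    def variance(self) -> float:
        # Var = E[f^2] - (E[f])^2 over the cells of the bucket.
        return self.sum_sq / self.area - self.frequency ** 2

    def apply_update(self, old_cell_count: float, delta: float) -> None:
        """Account for one cell whose count changes from old_cell_count by delta."""
        new_cell_count = old_cell_count + delta
        self.count += delta
        self.sum_sq += new_cell_count ** 2 - old_cell_count ** 2


def merge(a: BucketInfo, b: BucketInfo) -> BucketInfo:
    """Merge two adjacent buckets whose union is a rectangle (not checked here)."""
    return BucketInfo(min(a.x_lo, b.x_lo), max(a.x_hi, b.x_hi),
                      min(a.y_lo, b.y_lo), max(a.y_hi, b.y_hi),
                      a.count + b.count, a.sum_sq + b.sum_sq)


left = BucketInfo(0, 0, 0, 1, count=2, sum_sq=2)      # two cells holding 1 and 1
right = BucketInfo(1, 1, 0, 1, count=18, sum_sq=162)  # two cells holding 9 and 9
merged = merge(left, right)
```

Because both statistics are additive, the WVS increase of a candidate merge can be evaluated from the two summaries alone.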
14
Bucket reorganization - Split
  • Split the bucket that leads to maximal WVS
    decrease

[Figure: BPT before and after a split; bucket b is split into b1 and b2]
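
A split step of this kind could be sketched as an exhaustive search over the axis-aligned cuts of one bucket, keeping the cut with the largest WVS decrease. This is an illustrative assumption about how the choice might be made, not the authors' implementation; the grid, the bucket tuple format, and the function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Bucket = Tuple[int, int, int, int]  # inclusive cell indices (row_lo, col_lo, row_hi, col_hi)


def weighted_variance(grid: Sequence[Sequence[float]], b: Bucket) -> float:
    """area * variance of a bucket, i.e. the sum of squared deviations of its cells."""
    r0, c0, r1, c1 = b
    cells = [grid[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
    mean = sum(cells) / len(cells)
    return sum((v - mean) ** 2 for v in cells)


def best_split(grid: Sequence[Sequence[float]],
               b: Bucket) -> Optional[Tuple[float, Bucket, Bucket]]:
    """Return (WVS decrease, part1, part2) for the best cut, or None for a single cell."""
    r0, c0, r1, c1 = b
    before = weighted_variance(grid, b)
    candidates = [((r0, c0, r, c1), (r + 1, c0, r1, c1)) for r in range(r0, r1)]
    candidates += [((r0, c0, r1, c), (r0, c + 1, r1, c1)) for c in range(c0, c1)]
    best = None
    for first, second in candidates:
        decrease = before - weighted_variance(grid, first) - weighted_variance(grid, second)
        if best is None or decrease > best[0]:
            best = (decrease, first, second)
    return best


grid = [[1, 1, 9, 9],
        [1, 1, 9, 9]]
result = best_split(grid, (0, 0, 1, 3))
```

On this grid the best cut separates the sparse left half from the dense right half, removing all within-bucket variance.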
15
Features of AMH
  • Bucket information is updated as new data arrive
  • Bucket extents continuously adapt to the data
    distribution changes
  • The maintenance does not affect the normal query
    processing
  • It is interruptible at any moment of time
  • It is performed during the idle CPU cycles
  • Goals
  • Accuracy of the query
  • Fast response

16
Outline
  • Introduction
  • Problem and proposed method
  • Present time query: Adaptive multi-dimensional
    histogram
  • Historical time query: Historical synopsis
  • Future time query: Prediction model
  • Experiment
  • Conclusion

17
Historical Synopsis
  • AMH maintains the current buckets.
  • Past index stores the obsolete buckets.

update(b, t):
    insert_past_index(old_b, t)
    insert_AMH(b, t)
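
The update rule can be sketched with an in-memory stand-in: a dictionary plays the role of the AMH's current buckets, and an append-only list ordered by ending time plays the role of the past index. The class name, the validity interval [start, end), and the use of a plain count as the bucket summary are all assumptions made for this example, and updates are assumed to arrive in nondecreasing time order.

```python
import bisect
from typing import Any, Dict, List, Tuple


class HistoricalSynopsis:
    """Toy stand-in: current buckets plus an append-only list of obsolete buckets."""

    def __init__(self) -> None:
        self.current: Dict[str, Tuple[int, Any]] = {}      # id -> (start_time, summary)
        self.past_ends: List[int] = []                      # ending times, nondecreasing
        self.past: List[Tuple[int, int, str, Any]] = []     # (start, end, id, summary)

    def update(self, bucket_id: str, summary: Any, t: int) -> None:
        old = self.current.get(bucket_id)
        if old is not None:
            start, old_summary = old
            # The obsolete version was valid during [start, t).
            self.past_ends.append(t)
            self.past.append((start, t, bucket_id, old_summary))
        self.current[bucket_id] = (t, summary)

    def buckets_at(self, t: int) -> Dict[str, Any]:
        """Return the bucket summaries that were valid at timestamp t."""
        alive: Dict[str, Any] = {}
        first = bisect.bisect_right(self.past_ends, t)  # first entry with end > t
        for start, _end, bucket_id, summary in self.past[first:]:
            if start <= t:
                alive[bucket_id] = summary
        for bucket_id, (start, summary) in self.current.items():
            if start <= t:
                alive[bucket_id] = summary
        return alive


hs = HistoricalSynopsis()
hs.update("b1", 10, 1)
hs.update("b2", 5, 2)
hs.update("b1", 12, 3)
```

Appending in ending-time order keeps insertion cheap, while a historical query has to scan every entry that ended after t.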
18
Historical Synopsis (cont.)
  • Past index
  • Packed B-tree (fast insertion)
  • Index on ending time
  • 3D R-tree (fast query)
  • Index on both time and location

19
Prediction Model
  • Prediction based on velocity doesn't work!
  • Velocity is highly dynamic
  • We only use the past and present location
    information to do prediction.

20
Prediction Model (cont.)
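[Diagram: the Prediction Model parses a future-time (FT) query <R, t> into a present-time (PT) query <R, 0> and historical-time (HT) queries <R, -1>, <R, -2>, and forecasts from their results.]
Forecast the future using any time-series
prediction method; we use AR.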
Outline
  • Introduction
  • Problem and proposed method
  • Present time query: Adaptive multi-dimensional
    histogram
  • Historical time query: Historical synopsis
  • Future time query: Prediction model
  • Experiments
  • Conclusion

22
Experiment settings
  • Datasets
  • 2.5M updates for each dataset
  • spatial: 50K mobile objects from 2 spatial
    datasets
  • road: from a spatio-temporal generator
    (described in [Brinkhoff 2002])
  • 100 × 100 cells, 500 buckets

[Figure: road network; initial and final data distribution]
23
Experiment Design
  • Q1: Robustness with time/updates
  • Q2: Comparison with conventional histogram
  • Q3: Comparison with velocity-based prediction
  • Q4: Implementation decision on B-tree and R-tree

24
Robustness with time
Query: qlength = 6% of the data space; 25K
queries uniformly distributed along space and time.
$\text{error rate} = |\text{actual} - \text{approximate}| / \text{actual}$
[Plot: error rate, road dataset]
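
The error metric can be computed as in the short sketch below. Skipping queries whose actual count is zero when averaging is an assumption made here to avoid division by zero; the slides do not say how such queries are handled, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def error_rate(actual: float, approximate: float) -> float:
    """Relative error |actual - approximate| / actual for one query."""
    if actual == 0:
        raise ValueError("error rate is undefined when the actual count is zero")
    return abs(actual - approximate) / actual


def mean_error_rate(results: Iterable[Tuple[float, float]]) -> float:
    """Average error rate over (actual, approximate) pairs with a nonzero actual count."""
    rates = [error_rate(a, e) for a, e in results if a != 0]
    return sum(rates) / len(rates)


workload = [(100, 90), (50, 60), (0, 3)]
average = mean_error_rate(workload)
```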
25
Comparison with conventional histogram
  • Minskew (a static spatial histogram) is rebuilt
    every 50k location updates
  • tp is the ratio of the cost of AMH to
    that of Minskew
  • The re-organization operations of AMH are
    uniformly distributed among the 50k location
    updates.
  • Better accuracy and order-of-magnitude speed-up

[Plot: Minskew vs. AMH, road dataset]
26
Comparison with velocity-based prediction
  • Compare with the velocity-based prediction method
    in [Tao, Sun, Papadias 03]
  • Velocity-based prediction assumes no update
    between current time and query time
  • Error of location-based method grows slowly
  • Error of velocity-based grows much faster

[Plot: velocity-based vs. location-based prediction]
27
B-tree vs. R-tree
[Plot: R-tree vs. B-tree by query type]
  • B-tree performs better at high update rates.
  • R-tree provides much faster query response.
  • In general, when the query/update ratio is large
    (> 30), R-tree performs better.

28
Conclusion
  • A comprehensive approach for processing count
    queries about any time
  • The approach handles extremely high load without
    affecting query processing
  • Fast query response
  • Accuracy under high load

29
Q&A
30
Histogram
  • Partition the space into buckets
  • Data within a bucket are summarized by the mean
  • The properties of a good histogram
  • Uniformity within each bucket
  • Incrementally update-able

[Figure: examples of a bad and a good histogram]
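
Summarizing a bucket by its mean amounts to assuming that objects are spread uniformly inside it, so a count query can be estimated from the overlap between the query region and each bucket. The sketch below illustrates that estimate; the rectangle layout and the function names are assumptions made for this example.

```python
from typing import Iterable, Tuple

Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi), hypothetical layout


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two axis-aligned rectangles (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def estimate_count(buckets: Iterable[Tuple[Rect, float]], region: Rect) -> float:
    """Estimate the number of objects in region, assuming uniformity inside each bucket."""
    total = 0.0
    for rect, count in buckets:
        area = (rect[2] - rect[0]) * (rect[3] - rect[1])
        total += count * overlap_area(rect, region) / area
    return total


buckets = [((0.0, 0.0, 2.0, 2.0), 8.0), ((2.0, 0.0, 4.0, 2.0), 40.0)]
estimate = estimate_count(buckets, (1.0, 0.0, 3.0, 2.0))
```

The estimate is only as good as the uniformity inside each bucket, which is why uniformity is listed as a property of a good histogram.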
31
Spatial dataset
Query: qlength = 6% of the data space; 25K
queries uniformly distributed in space and time.
[Plots, spatial dataset: velocity-based vs. location-based prediction; Minskew vs. AMH]
32
Evaluation over different query types
[Plots: spatial and road datasets; present, historical, and future queries]
33
Summary
  • Historical Synopsis
  • AMH
  • 0. Goal: min(WVS)
  • 1. Info update
  • 2. Reorganization happens when the CPU is idle
  • Past Index (receives old buckets)
  • 1. Recent buckets in memory
  • 2. Old buckets dumped to disk
  • Prediction Model
  • Forecast based on the present and past.