Approximate Spatial Query Processing Using Raster Signatures

1
Approximate Spatial Query Processing Using
Raster Signatures
Federal University of Rio de Janeiro
  • Leonardo Guerreiro Azevedo, Rodrigo Salvador
    Monteiro, Geraldo Zimbrão, Jano Moreira de Souza
  • COPPE, Graduate School of Engineering
  • Institute of Mathematics, Computer Science
    Department

2
Common Spatial Queries
  • Area of polygon
  • Area of polygon within window
  • Spatial joins: polygon x polygon, polygon x
    polyline, polyline x polyline
  • Distance
  • Buffer
  • Perimeter
  • Topological queries

3
Common Spatial Queries
  • Approximate Area of polygon
  • Approximate Area of polygon within window
  • Approximate spatial joins: polygon x polygon,
    polygon x polyline, polyline x polyline
  • Approximate Distance
  • Approximate Buffer
  • Approximate Perimeter
  • Approximate Topological queries

4
Approximate Answers to Spatial Queries
  • What is an approximate answer?
  • If the exact result is a number, the
    approximate result will be a number and a
    confidence interval
  • If not, the graphical display of approximate
    answers is something like a fuzzy map

5
Motivation
  • The increase of storage capacity
  • The decrease of hardware costs
  • Disk access time is still high
  • Complex queries
  • Data stored in devices that are not on-line.

6
Motivation
  • An approximate answer may be enough
  • Exact answers are themselves approximations
  • Approximate answers can be computed quickly
  • Spatial query processing
  • Scale
  • Quality
  • Round-off errors

7
Scenarios and Applications
  • Decision Support System
  • Increasing business competitiveness
  • More use of accumulated data
  • Data mining
  • During a drill-down query sequence in ad hoc
    data mining
  • Earlier queries in a sequence can be used to find
    out the interesting queries.
  • Data warehouse
  • Performance and scalability when accessing very
    large volumes of data during the analysis
    process.

8
Scenarios and Applications
  • Query optimization
  • To define the most efficient access plan for a
    given query
  • Distributed data recording and warehousing
    environments
  • Data may be remote, or even unavailable
  • Old data may be discarded to make room for new
    data, which makes it impossible to answer
    queries on the deleted information.

9
Scenarios and Applications
  • Mobile computing
  • An approximate answer may be an alternative
  • When the data is not available
  • To save storage space

10
A framework for approximate query processing
[Figure: data environment set-up for providing approximate answers. Diagram labels: New data, Database, Approx. Query Engine, Queries, Responses]

11
Four Color Raster Signature (4CRS)
  • Raster approximation (VLDB98)
  • Object representation upon a grid of cells.
  • Each cell stores relevant information using only
    a few bits.
  • Grid resolution can be changed
  • Precision vs. storage requirements
  • 4 types of cells: empty, weak, strong, and full
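
With four cell types, two bits per cell are enough. The sketch below is one possible way to classify a cell by the fraction of its area covered by the polygon and to pack the resulting codes four to a byte. The numeric codes, the function names, and the choice to treat exactly 50% coverage as weak are assumptions made for illustration; they are not taken from the slides.

```python
from enum import IntEnum


class Cell(IntEnum):
    # Hypothetical 2-bit codes; the deck does not state the actual encoding.
    EMPTY = 0
    WEAK = 1
    STRONG = 2
    FULL = 3


def classify(fraction: float) -> Cell:
    """Map the covered fraction of a cell (0.0 to 1.0) to a cell type."""
    if fraction <= 0.0:
        return Cell.EMPTY
    if fraction <= 0.5:   # assumption: exactly 50% counts as weak
        return Cell.WEAK
    if fraction < 1.0:
        return Cell.STRONG
    return Cell.FULL


def pack(cells) -> bytes:
    """Pack a sequence of cell types into bytes, 2 bits per cell."""
    out = bytearray((len(cells) + 3) // 4)
    for i, cell in enumerate(cells):
        out[i // 4] |= int(cell) << (2 * (i % 4))
    return bytes(out)


def unpack(data: bytes, n: int):
    """Recover the first n cell types from a packed signature."""
    return [Cell((data[i // 4] >> (2 * (i % 4))) & 0b11) for i in range(n)]
```

Halving the cell size quadruples the number of cells, which is the precision versus storage trade-off noted above.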

12
4CRS Approximation: Construction of Signatures
[Figure: a polygon and its 4CRS signature]

13
Polygon approximate area
  • The algorithm is based on the sum of the expected
    area of each grid cell
  • Empty cells: 0%
  • Full cells: 100%
  • Weak and strong cells: assuming a uniform
    distribution
  • Weak cells: interval (0, 0.5], mean 0.25
  • Strong cells: interval (0.5, 1), mean 0.75
  • Count the number of cells of each type in the
    polygon's 4CRS, and multiply each count by the
    expected area of a cell of that type.
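
The counting step above can be sketched in a few lines of Python. The dictionary keys and the function name are hypothetical; the expected fractions (0, 0.25, 0.75, 1) are the ones listed on this slide.

```python
# Expected fraction of a cell covered by the polygon, per cell type.
EXPECTED_FRACTION = {"empty": 0.0, "weak": 0.25, "strong": 0.75, "full": 1.0}


def approximate_area(counts, cell_area):
    """Estimate the polygon area from per-type cell counts.

    counts    -- mapping from cell type name to number of cells
    cell_area -- area of a single grid cell, in the data set's units
    """
    return cell_area * sum(EXPECTED_FRACTION[kind] * n for kind, n in counts.items())
```

For instance, with 55 empty, 27 weak, 26 strong and 79 full cells and a unit cell area, `approximate_area` returns 105.25, which matches the worked example later in the deck.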

14
Confidence interval
  • A measure of answer accuracy
  • The polygon area inside a weak or strong cell is
    assumed to be uniformly distributed.
  • Weak cells
  • Strong cells
  • Using the Central Limit Theorem gives a
    confidence interval
  • 95%
  • 99%
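
The formulas for the weak and strong cells are not present in this transcript. The derivation below is one reading consistent with the uniform assumption and with the uniform variance quoted elsewhere in the deck (0.020833 = 1/48). The symbols n_w, n_s (number of weak and strong cells), a (cell area) and z (normal quantile) are introduced here for illustration only.

```latex
% Covered fraction of a single cell, by type (uniform assumption)
X_w \sim U(0,\,0.5): \quad E[X_w] = 0.25, \quad \mathrm{Var}[X_w] = \frac{0.5^2}{12} = \frac{1}{48}
X_s \sim U(0.5,\,1): \quad E[X_s] = 0.75, \quad \mathrm{Var}[X_s] = \frac{0.5^2}{12} = \frac{1}{48}

% Central Limit Theorem: the total area over n cells of one type is
% approximately normal, which gives the intervals
A_w \approx a \left( 0.25\, n_w \pm z \sqrt{\frac{n_w}{48}} \right), \qquad
A_s \approx a \left( 0.75\, n_s \pm z \sqrt{\frac{n_s}{48}} \right)

% Two-sided normal quantiles
z_{95\%} \approx 1.96, \qquad z_{99\%} \approx 2.576
```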

15
Confidence interval (example)
  • Query results
  • weak cells: 100
  • strong cells: 120
  • full cells: 400
  • Confidence interval: 95%
  • Weak cells
  • Strong cells
  • Full cells: 400 (full cells have the exact area!)
  • Total
  • Error between -1.15% and +1.15%
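
The per-type values for this example are missing from the transcript, but the final figure can be reproduced under stated assumptions: each partially covered cell has variance 1/48 (uniform over an interval of width 0.5), the 95% normal quantile is 1.96, the cell area is 1, and the half-widths of the weak and strong intervals are simply added. The function names below are hypothetical, and adding half-widths is an inference from the 1.15% figure rather than something the slides state.

```python
import math

# Two-sided normal quantiles for the confidence levels named in the deck.
Z = {0.95: 1.96, 0.99: 2.576}

# Variance of a uniform distribution over an interval of width 0.5.
VAR_PARTIAL = 0.5 ** 2 / 12.0


def half_width(n_cells, cell_area=1.0, level=0.95):
    """Half-width of the CLT interval for n partially covered cells."""
    return Z[level] * math.sqrt(n_cells * VAR_PARTIAL) * cell_area


def area_with_interval(n_weak, n_strong, n_full, cell_area=1.0, level=0.95):
    """Return (estimate, margin) for the approximate polygon area."""
    estimate = (0.25 * n_weak + 0.75 * n_strong + n_full) * cell_area
    # Assumption: the weak and strong half-widths are added, not combined
    # through their variances; full cells contribute no uncertainty.
    margin = half_width(n_weak, cell_area, level) + half_width(n_strong, cell_area, level)
    return estimate, margin


estimate, margin = area_with_interval(100, 120, 400)
relative_error_percent = 100.0 * margin / estimate
```

Under these assumptions the estimate is 515 with a margin of about 5.93, a relative error of about 1.15%. Combining the variances instead, as z·sqrt((n_w + n_s)/48), would give a narrower interval, so adding half-widths is the more conservative choice.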

16
Cell Area Distribution
[Figure: cell area distribution for weak and strong cells]
  • Comparable to a uniform distribution
  • Variance: 0.021369 (uniform: 0.020833)
  • Mean: 0.246453 (uniform: 0.25)

17
Example
  • empty cells: 55
  • weak cells: 27
  • strong cells: 26
  • full cells: 79
  • Approximate area = (Σ weak × 0.25 + Σ strong ×
    0.75 + Σ full) × cellArea
  • Exact area: 106.40
  • Approx. area: 105.25
  • Error: 1.07%

18
Approximate area of polygon x window intersection
  • This algorithm is similar to the approximate
    polygon area algorithm
  • There are two kinds of cell overlap
  • The cell may be completely contained in the
    window: it contributes its full expected area
  • The cell may be partially contained in the
    window: its contribution is proportional to its
    overlapping area
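
A minimal sketch of the two overlap cases, assuming a signature stored as a list of strings with one character per cell (E, W, S, F), row 0 at the bottom of the grid, square cells, and an axis-aligned window given as (xmin, ymin, xmax, ymax). This layout and the function name are hypothetical choices for illustration, not the representation used by the authors.

```python
# Expected covered fraction per cell type (empty, weak, strong, full).
EXPECTED_FRACTION = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def approximate_window_area(grid, origin, cell_size, window):
    """Estimate the area of the polygon that falls inside the window.

    grid      -- list of strings, one character per cell, row 0 at y = origin[1]
    origin    -- (x, y) of the lower-left corner of the grid
    cell_size -- side length of a square cell
    window    -- (xmin, ymin, xmax, ymax) of the query window
    """
    x0, y0 = origin
    wx0, wy0, wx1, wy1 = window
    total = 0.0
    for row, line in enumerate(grid):
        for col, kind in enumerate(line):
            cx0 = x0 + col * cell_size
            cy0 = y0 + row * cell_size
            # Width and height of the cell/window overlap rectangle.
            w = min(cx0 + cell_size, wx1) - max(cx0, wx0)
            h = min(cy0 + cell_size, wy1) - max(cy0, wy0)
            if w <= 0 or h <= 0:
                continue  # the cell lies outside the window
            # A fully contained cell has w * h equal to the cell area;
            # a partially contained cell is weighted by its overlap area.
            total += EXPECTED_FRACTION[kind] * w * h
    return total
```

In practice only the cells that can intersect the window need to be visited; the double loop over the whole grid is kept here for brevity.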

19
Experimental tests
  • Computer: PC Pentium IV 1.8 GHz, 512 MB RAM
  • Page size: 2,048 bytes
  • Goal: to evaluate the use of 4CRS for
    approximate query processing against exact query
    processing with respect to the following aspects
  • Response time
  • Storage requirements
  • Accuracy
  • The algorithms tested were
  • Polygon approximate area
  • Approximate area of polygon x window intersection
  • 100 random windows for each data set (different
    sizes and positions)

20
Experimental tests
  • Use of R-trees in order to reduce the search
    space.

21
Experimental tests
  • The real polygon data sets used in the
    experiments consist of township boundaries,
    census block groups, topography, geologic and
    hydrographic maps from Iowa (USA), and Brazilian
    municipalities.

22
Approximate polygon area

23
Approximate polygon area

24
Approximate polygon x window area

25
Approximate polygon x window area

26
Conclusion
  • The experimental results demonstrated the
    efficiency of using 4CRS for approximate query
    processing.
  • Storage requirements
  • A 4CRS takes on average 3.75% of the real data
    set size
  • Accuracy
  • Approximate area average error of 2.62%
  • Window query approximate area average error of
    1%
  • Response time
  • Approximate area average 28.41
  • Window query approximate area average 7.22
  • Disk access
  • Approximate area average 1.90
  • Window query approximate area average 7.04

27
Future work
  • Algorithms for the other operations
  • An algorithm for the approximate area of polygon
    x polygon intersection is being evaluated
  • Use of approximations for mobile computing