CSIS 7101: Spatial Data (Part 2) Efficient Processing of Spatial Joins Using R-trees



1
CSIS 7101: Spatial Data (Part 2)
Efficient Processing of Spatial Joins Using R-trees
  • Rollo Chan
  • Chu Chung Man
  • Mak Wai Yip
  • Vivian Lee
  • Eric Lo
  • Sindy Shou
  • Hugh Wang

2
Efficient Processing of Spatial Join Using R-trees
  • What is Spatial Data?
  • Consists of points, lines, rectangles, polygons,
    surfaces
  • Two types of queries in DBS
  • Single-scan and multiple-scan queries
  • How to retrieve spatial objects in GIS
    efficiently?
  • Spatial Access Method (SAM), e.g. R-tree

3
What is Spatial Access Method?
  • Designed to support single-scan queries
  • e.g. window query
  • Find all objects which intersect a given window
  • Attempts to store objects which are close
    together in the data space on a common page
  • Reduces number of disk accesses

4
How is window query processed by SAM?
  • 1) Filter step
  • Find all objects whose minimum bounding
    rectangles intersect the query rectangle
  • 2) Refinement step
  • Check whether the objects fulfill the query
    condition
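
The two steps above can be sketched as follows. This is a minimal illustration, not the method of any particular SAM: the `Rect` type, the object layout (id, MBR, vertex list) and the exact test `any_vertex_in_window` are all assumptions made up for the example.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xl: float  # lower x
    yl: float  # lower y
    xu: float  # upper x
    yu: float  # upper y


def intersects(a: Rect, b: Rect) -> bool:
    """True if two axis-parallel rectangles share at least one point."""
    return a.xl <= b.xu and b.xl <= a.xu and a.yl <= b.yu and b.yl <= a.yu


def window_query(objects, window, exact_test):
    # 1) Filter step: cheap test on the minimum bounding rectangles (MBRs).
    candidates = [(oid, geom) for oid, mbr, geom in objects
                  if intersects(mbr, window)]
    # 2) Refinement step: exact test on the geometry of each candidate.
    return [oid for oid, geom in candidates if exact_test(geom, window)]


def any_vertex_in_window(points, w):
    # Hypothetical exact test: at least one vertex lies inside the window.
    return any(w.xl <= x <= w.xu and w.yl <= y <= w.yu for x, y in points)


objects = [
    ("a", Rect(0, 0, 4, 4), [(0, 0), (4, 4)]),  # MBR hits, no vertex inside
    ("b", Rect(2, 2, 3, 3), [(2, 2), (3, 3)]),
    ("c", Rect(8, 8, 9, 9), [(8, 8), (9, 9)]),  # rejected by the filter step
]
result = window_query(objects, Rect(1.5, 1.5, 3.5, 2.5), any_vertex_in_window)
print(result)
```

Object "a" passes the filter step but is dropped in the refinement step, which is why the second step is needed.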

5
What is Spatial Join?
  • To combine two sets of spatial objects according
    to some spatial properties
  • It is an important type of multiple-scan query
    in spatial DBS

6
Example of Spatial Join
  • Two relations: forests, cities
  • (Assume an attribute in each relation
    represents the borders of the forests and cities)
  • An example query would be:
  • Find all forests which are in a city
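
As a baseline, the example query can be written as a naive nested loop. This sketch assumes, purely for illustration, that every border is simplified to an axis-parallel rectangle `(xl, yl, xu, yu)`; the relation contents and the helper names are made up.

```python
def contains(outer, inner):
    """True if rectangle `inner` lies completely inside rectangle `outer`."""
    oxl, oyl, oxu, oyu = outer
    ixl, iyl, ixu, iyu = inner
    return oxl <= ixl and oyl <= iyl and ixu <= oxu and iyu <= oyu


def forests_in_cities(forests, cities):
    # Naive nested loop: every forest is compared with every city.
    return [(f, c)
            for f, f_rect in forests.items()
            for c, c_rect in cities.items()
            if contains(c_rect, f_rect)]


forests = {"F1": (1, 1, 2, 2), "F2": (5, 5, 9, 9), "F3": (11, 1, 12, 2)}
cities = {"C1": (0, 0, 4, 4), "C2": (10, 0, 14, 4)}
pairs = forests_in_cities(forests, cities)
print(pairs)
```

With 3 forests and 2 cities this performs 6 comparisons; the cost grows with the product of the two relation sizes.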

7
Problems when performing Spatial Join
  • It is too expensive in terms of CPU time and I/O
    time
  • Traditional index structures are not efficient
    for spatial joins
  • How to make it more efficient?
  • R-tree

8
Why use an R-tree for Spatial Join?
  • To optimize CPU time and I/O time
  • Fewer comparisons than a simple nested loop
  • Other algorithms cannot be efficiently applied to
    spatial join

9
R-tree Approach for Spatial Join
  • Suppose there are two R-trees
  • R, S
  • Idea
  • Use the property that directory rectangles
    form the minimum bounding boxes of the data
    rectangles in the corresponding subtrees.
  • Only if the rectangles of two directory entries
    ER and ES have a common intersection can there be
    a pair (rectR, rectS) of intersecting data
    rectangles in the corresponding subtrees;
    otherwise both subtrees can be skipped
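
The idea can be sketched as a synchronized traversal of both trees. The `Node` class and the helpers `leaf` and `directory` below are hypothetical stand-ins for a real R-tree, and the sketch ignores pages and buffering entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    rect: tuple                                   # (xl, yl, xu, yu)
    children: list = field(default_factory=list)  # empty for data entries
    oid: str = ""                                 # set on data entries only


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects):
    """Minimum bounding box of a list of rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def leaf(oid, rect):
    return Node(rect, oid=oid)


def directory(children):
    return Node(mbr([c.rect for c in children]), children)


def spatial_join(r, s, out):
    """Collect pairs of data entries whose rectangles intersect."""
    if not intersects(r.rect, s.rect):
        return  # disjoint bounding boxes: nothing below can intersect
    if not r.children and not s.children:
        out.append((r.oid, s.oid))
    elif not r.children:
        for cs in s.children:
            spatial_join(r, cs, out)
    elif not s.children:
        for cr in r.children:
            spatial_join(cr, s, out)
    else:
        for cr in r.children:
            for cs in s.children:
                spatial_join(cr, cs, out)


R = directory([directory([leaf("r1", (0, 0, 2, 2)), leaf("r2", (3, 3, 5, 5))]),
               directory([leaf("r3", (10, 10, 12, 12))])])
S = directory([directory([leaf("s1", (1, 1, 4, 4))]),
               directory([leaf("s2", (11, 11, 13, 13)),
                          leaf("s3", (20, 20, 21, 21))])])
result = []
spatial_join(R, S, result)
print(result)
```

In this example two of the four subtree combinations are pruned at the directory level without looking at any data rectangle.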

10
Minimum Bounding Box
  • [Figure: minimum bounding box]

11
Is there any way to be more efficient?
  • There are two areas we need to take into account
    in order to be more efficient
  • CPU Time Tuning
  • I/O Time Tuning

12
CPU Time Tuning
  • Two ways to improve CPU time
  • Restricting the search space
  • Spatial sorting and plane sweep

13
Restricting the search space
  • Idea
  • Scan through each of the two nodes and mark all
    entries which are required for performing the
    join (i.e. those which intersect the intersection
    rectangle of the two nodes).
  • Then, each marked entry of one node is tested
    against all marked entries of the other node.
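
A minimal sketch of this restriction for one pair of nodes, assuming entries are given as `(name, rectangle)` tuples. The function name, the returned test counter and the sample rectangles are made up for the example.

```python
def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def mbr(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def join_restricted(r_entries, s_entries):
    """Join two nodes, testing only entries that touch the common area."""
    r_box = mbr([rect for _, rect in r_entries])
    s_box = mbr([rect for _, rect in s_entries])
    if not intersects(r_box, s_box):
        return [], 0
    common = (max(r_box[0], s_box[0]), max(r_box[1], s_box[1]),
              min(r_box[2], s_box[2]), min(r_box[3], s_box[3]))
    # One linear scan per node marks the entries that can take part.
    marked_r = [e for e in r_entries if intersects(e[1], common)]
    marked_s = [e for e in s_entries if intersects(e[1], common)]
    tests = len(marked_r) * len(marked_s)
    pairs = [(rn, sn) for rn, rr in marked_r for sn, sr in marked_s
             if intersects(rr, sr)]
    return pairs, tests


r_entries = [("r1", (0, 0, 2, 2)), ("r2", (3, 3, 5, 5)),
             ("r3", (7, 1, 9, 3)), ("r4", (9, 6, 10, 8))]
s_entries = [("s1", (8, 2, 11, 4)), ("s2", (12, 5, 14, 7)),
             ("s3", (16, 8, 18, 10))]
pairs, tests = join_restricted(r_entries, s_entries)
print(pairs, tests)
```

Here 2 pairwise tests are made instead of 4 × 3 = 12, at the price of scanning the 4 + 3 entries once.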

14
Restricting the search space (contd)
  • Original: 7 of R × 7 of S = 49 joins
  • Now: 3 of R × 2 of S = 6 joins
  • Plus scanning: 7 of R + 7 of S = 14 times
  • [Figure: the entries of the two nodes, numbered 1 to 7]

15
Spatial sorting and plane sweep
  • Idea
  • Sort the entries in a node of the R-tree
    according to the spatial location of the
    corresponding rectangles.
  • Then move the Sweep-Line perpendicular to one of
    the axes from left to right to compute the
    intersections.
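
The sorted intersection test can be sketched as below. Entries are assumed to be `(name, (xl, yl, xu, yu))` tuples, and the sample rectangles are made up so that the sweep makes the tests r1/s1, s1/r2, r2/s2, r2/s3, r3/s3 in that order.

```python
def sorted_intersection_test(r_entries, s_entries):
    """Plane sweep along the x axis over the entries of two nodes."""
    rs = sorted(r_entries, key=lambda e: e[1][0])  # sort by lower x
    ss = sorted(s_entries, key=lambda e: e[1][0])
    out = []

    def internal_loop(t, others, start, t_is_r):
        name, (_, yl, xu, yu) = t
        k = start
        # Every rectangle that starts before t ends overlaps t on the x axis.
        while k < len(others) and others[k][1][0] <= xu:
            other_name, (_, oyl, _, oyu) = others[k]
            if yl <= oyu and oyl <= yu:  # overlap on the y axis as well
                out.append((name, other_name) if t_is_r
                           else (other_name, name))
            k += 1

    i = j = 0
    while i < len(rs) and j < len(ss):
        # The sweep line stops at the unprocessed rectangle with smallest xl.
        if rs[i][1][0] < ss[j][1][0]:
            internal_loop(rs[i], ss, j, True)
            i += 1
        else:
            internal_loop(ss[j], rs, i, False)
            j += 1
    return out


r_entries = [("r1", (0, 0, 2, 2)), ("r2", (3, 2, 9, 8)), ("r3", (7, 0, 10, 2))]
s_entries = [("s1", (1, 1, 4, 3)), ("s2", (5, 6, 6, 7)), ("s3", (8, 1, 11, 5))]
pairs = sorted_intersection_test(r_entries, s_entries)
print(pairs)
```

Each reported pair is written as (entry of R, entry of S) regardless of which rectangle the sweep line stopped at.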

16
Example of Sorted Intersection Test
  • t = r1: r1 <--> s1
  • t = s1: s1 <--> r2
  • t = r2: r2 <--> s2, r2 <--> s3
  • t = s2: -
  • t = r3: r3 <--> s3
  • s1.xl < r1.xu
  • [Figure: Sweep-Line moving across the rectangles, with r1.xu and s1.xl marked]

17
I/O Time Tuning
  • To achieve good I/O-performance with a buffer
    size as small as possible
  • R-tree might occupy only a small portion of the
    LRU buffer
  • Compute a read schedule of the pages to minimize
    the number of disk accesses
  • Local optimization policy based on spatial
    locality
  • Idea of read schedule: if a frequently used page
    always resides in the buffer, the number of disk
    accesses can be reduced considerably
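
To see why the order of page requests matters, a small LRU buffer can be simulated. This is only a sketch: page names stand in for disk pages, and both schedules below are invented; they join the same four page pairs in a different order.

```python
from collections import OrderedDict


def count_disk_accesses(schedule, buffer_size):
    """Count the page requests that miss an LRU buffer of the given size."""
    buffer = OrderedDict()  # least recently used page comes first
    misses = 0
    for page in schedule:
        if page in buffer:
            buffer.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1                     # miss: page is read from disk
            buffer[page] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)  # evict least recently used page
    return misses


# Pairs (r1,s1), (r2,s2), (r1,s2), (r2,s1) in an arbitrary order ...
scattered = ["r1", "s1", "r2", "s2", "r1", "s2", "r2", "s1"]
# ... and the same pairs ordered so that consecutive pairs share a page.
local = ["r1", "s1", "r1", "s2", "r2", "s2", "r2", "s1"]

print(count_disk_accesses(scattered, 2), count_disk_accesses(local, 2))
```

With a buffer of two pages the scattered order needs 7 disk accesses and the locality-preserving order needs 5.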

18
Three such techniques
  • Local plane sweep
  • Local plane sweep with pinning
  • Local z-order

19
Local Plane-Sweep Order
  • Idea
  • Based on spatial ordering, the plane-sweep
    algorithm creates a sequence of pairs of
    intersecting rectangles.
  • This sequence can be used to determine the read
    schedule of the spatial join.
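
One simple reading of this idea, offered as an assumption rather than as the authors' exact procedure: pages are scheduled in the order in which they first appear in the plane-sweep sequence of intersecting pairs. The pair list below is made up.

```python
def read_schedule(pairs):
    """Order pages by their first appearance in the sequence of pairs."""
    seen = set()
    schedule = []
    for r_page, s_page in pairs:
        for page in (r_page, s_page):
            if page not in seen:
                seen.add(page)
                schedule.append(page)
    return schedule


# Hypothetical output of a plane sweep over the entries of two nodes.
sweep_pairs = [("r1", "s1"), ("r2", "s1"), ("r2", "s2"),
               ("r2", "s3"), ("r3", "s3")]
schedule = read_schedule(sweep_pairs)
print(schedule)
```

Because consecutive pairs in the sweep sequence tend to share a rectangle, a page that was just read is likely to be needed again while it is still in the buffer.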

20
Local Plane-Sweep Order (contd)
  • Read schedule
  • [Figure: rectangles r1, r2, r3, r4 and s1, s2, numbered 1 to 6 in read-schedule order]

21
Local Plane-Sweep Order w/ Pinning
  • Idea
  • Determine a pair (Er, Es) of entries w.r.t. the
    local plane-sweep order. Compute the degree of
    the rectangles of both entries
  • Deg(E.rect) = number of intersections between
    E.rect and the rectangles which belong to entries
    of the other tree that are not yet processed
  • Pin the page in the buffer whose corresponding
    rectangle has maximal degree
  • Perform the spatial join of the pinned page with
    all other pages
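
The steps above might be sketched as a reordering of the plane-sweep pair sequence. This is an interpretation under stated assumptions: the degree is counted over the pairs not yet processed, ties favour the R entry, and the function name and sample pairs are made up.

```python
def schedule_with_pinning(pairs):
    """Reorder (r, s) pairs so a pinned page is finished before moving on."""
    remaining = list(pairs)  # pairs in local plane-sweep order
    order = []
    while remaining:
        er, es = remaining.pop(0)
        order.append((er, es))
        # Degree: unprocessed intersections with entries of the other tree.
        deg_r = sum(1 for r, _ in remaining if r == er)
        deg_s = sum(1 for _, s in remaining if s == es)
        if max(deg_r, deg_s) == 0:
            continue
        # Pin the page with the larger degree and join it with all partners.
        if deg_r >= deg_s:
            pinned = [p for p in remaining if p[0] == er]
        else:
            pinned = [p for p in remaining if p[1] == es]
        order.extend(pinned)
        remaining = [p for p in remaining if p not in pinned]
    return order


sweep_pairs = [("r1", "s1"), ("r2", "s2"), ("r3", "s1"), ("r4", "s1")]
order = schedule_with_pinning(sweep_pairs)
print(order)
```

In the example s1 has degree 2 after the first pair, so it stays pinned for three consecutive joins before r2 and s2 are read.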

22
Local Plane-Sweep Order w/ Pinning (contd)
  • Er.rect = r1, Es.rect = s2
  • [Figure: rectangles r1, r2, r3, r4 and s1, s2 with Deg(r1) and Deg(s2) annotated]

23
Local Z-Order
  • Idea
  • Compute the intersections between each rectangle
    of the one node and all rectangles of the other
    node
  • Sort the rectangles according to the spatial
    location of their centers
  • Decompose the underlying space into cells of
    equal size and provide an ordering on this set of
    cells
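
A minimal sketch of a local z-order, under several assumptions: the rectangles being sorted are taken to be the intersection rectangles, coordinates are non-negative, the cell size is a free parameter, and the cell ordering is the usual bit-interleaved (Morton) order. All names and sample data are made up.

```python
def interleave_bits(x, y, bits=16):
    """Z-order value of cell (x, y): x in the even bits, y in the odd bits."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def local_z_order(r_entries, s_entries, cell=1.0):
    keyed = []
    for rn, r in r_entries:
        for sn, s in s_entries:
            # Intersection rectangle of the two entries, if any.
            xl, yl = max(r[0], s[0]), max(r[1], s[1])
            xu, yu = min(r[2], s[2]), min(r[3], s[3])
            if xl <= xu and yl <= yu:
                cx, cy = (xl + xu) / 2, (yl + yu) / 2  # center
                key = interleave_bits(int(cx // cell), int(cy // cell))
                keyed.append((key, rn, sn))
    keyed.sort()  # order the pairs by the z-value of their cell
    return [(rn, sn) for _, rn, sn in keyed]


r_entries = [("r4", (5, 5, 7, 7)), ("r1", (0, 0, 3, 3)),
             ("r3", (0, 5, 3, 7)), ("r2", (5, 0, 7, 3))]
s_entries = [("s1", (1, 1, 6, 2)), ("s2", (1, 6, 6, 7))]
pairs = local_z_order(r_entries, s_entries, cell=4)
print(pairs)
```

With a cell size of 4 the space falls into four cells, and the pairs come out in the order lower left, lower right, upper left, upper right.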

24
Local Z-Order (contd)
  • [Figure: space decomposed into cells I, II, III, IV containing rectangles r1, r2, r3, r4 and s1, s2]
  • Read schedule: <s1, r2, r1, s2, r4, r3>

25
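Number of Disk Accesses
  • [Chart: number of disk accesses versus size of LRU buffer; labeled values 5384, 5290, 2392, 2373]

26
Number of Disk Accesses (contd)
  • [Chart: number of disk accesses versus size of LRU buffer]

27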
Q & A
  • That's it for the presentation
  • Any Questions?

28
Reference
  1. Brinkhoff T., Kriegel H.P., Seeger B. (1993).
    Efficient Processing of Spatial Joins Using
    R-trees. Institute of Computer Science,
    University of Munich. ACM SIGMOD, Washington,
    DC, USA.