Spatial Indexing

Provided by: Valued1301
Learn more at: https://www.cs.bu.edu

1
Spatial Indexing
  • SAMs

2
Spatial Indexing
  • Point Access Methods can index only points. What
    about regions?
  • Z-ordering and quadtrees
  • Use the transformation technique and a PAM
  • New methods: Spatial Access Methods (SAMs)
  • R-tree and variations

3
Problem
  • Given a collection of geometric objects (points,
    lines, polygons, ...)
  • organize them on disk, to answer spatial queries
    (range, nearest neighbor, etc.)

4
Transformation Technique
  • Map a d-dimensional MBR into a point in
    2d-dimensional space, e.g., for d = 2:
  • (xmin, xmax) x (ymin, ymax) =>
  • (xmin, xmax, ymin, ymax)
  • Use a PAM to index the 2d-dimensional points
  • Given a range query, map the query into the
    2d-dimensional space and use the PAM to answer it
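
A minimal sketch of the transformation for d = 2, assuming rectangles are stored as (xmin, xmax, ymin, ymax) and using a plain dictionary scan as a stand-in for a real PAM. The names mbr_to_point and query_matches are illustrative, not from the slides. The point is that "rectangle intersects the query" becomes four independent one-sided conditions on the 4-d point, i.e. an axis-parallel region that a PAM can answer as a range query.

```python
def mbr_to_point(xmin, xmax, ymin, ymax):
    """Map a 2-d MBR to a single point in 4-d space."""
    return (xmin, xmax, ymin, ymax)


def query_matches(point, query):
    """Intersection test expressed on the transformed 4-d point.

    A rectangle intersects the query window iff each coordinate of its
    4-d point satisfies a one-sided bound, so the answer set is an
    axis-parallel region of the 4-d space.
    """
    xmin, xmax, ymin, ymax = point
    qxmin, qxmax, qymin, qymax = query
    return xmin <= qxmax and xmax >= qxmin and ymin <= qymax and ymax >= qymin


# Hypothetical data; a dict scan stands in for the PAM (assumption).
rects = {"A": (0, 2, 0, 2), "B": (5, 7, 5, 6), "C": (1, 6, 1, 3)}
pam = {name: mbr_to_point(*r) for name, r in rects.items()}

query = (1.5, 4, 1.5, 4)
hits = sorted(name for name, p in pam.items() if query_matches(p, query))
print(hits)  # ['A', 'C']
```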

5
R-tree
  • [Guttman 84] Main idea: allow parents to overlap!
  • => guaranteed 50% utilization
  • => easier insertion/split algorithms
  • (only deal with Minimum Bounding Rectangles -
    MBRs)

6
R-tree
  • A multi-way external memory tree
  • Index nodes and data (leaf) nodes
  • All leaf nodes appear on the same level
  • Every node contains between m and M entries
  • The root node has at least 2 entries (children)

7
Example
  • e.g., with fanout 4: group nearby rectangles into
    parent MBRs; each group -> disk page

[Figure: data rectangles A through J]

8
Example
  • F=4

[Figure: rectangles A through J grouped into parent MBRs P1 through P4]

9
Example
  • F=4; m=2, M=4

[Figure: rectangles A through J under parent MBRs P1 through P4, which are in turn grouped under root entries P5 and P6]

10
R-trees - format of nodes
  • (MBR, obj_ptr) for leaf nodes

[ x-low, x-high, y-low, y-high, ... | obj ptr ] ...

11
R-trees - format of nodes
  • (MBR, node_ptr) for non-leaf nodes

[ x-low, x-high, y-low, y-high, ... | node ptr ] ...

12
R-trees: Search

[Figure: search example on the tree with root entries P5 and P6, parent MBRs P1 through P4, and rectangles A through J]

13
R-trees: Search

[Figure: search example, continued, on the same tree (P5, P6; P1 through P4; A through J)]

14
R-trees: Search
  • Main points
  • every parent node completely covers its
    children
  • a child MBR may be covered by more than one
    parent - it is stored under ONLY ONE of them
    (i.e., no need for duplicate elimination)
  • a point query may follow multiple branches.
  • everything works for any(?) dimensionality
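
The main points above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a disk-based implementation: the Node class, the (xlow, xhigh, ylow, yhigh) tuple layout, and the sample rectangles are assumptions made for the example. Note how the point query descends into both children of the root, because their MBRs overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    leaf: bool
    # Each entry is (mbr, child); child is a Node, or an object id in a leaf.
    entries: list = field(default_factory=list)


def intersects(a, b):
    """MBRs are (xlow, xhigh, ylow, yhigh); touching counts as intersecting."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def search(node, query, out):
    """Collect object ids whose MBR intersects the query rectangle."""
    for mbr, child in node.entries:
        if intersects(mbr, query):
            if node.leaf:
                out.append(child)
            else:
                search(child, query, out)  # may descend into several branches
    return out


# Hypothetical two-level tree whose two parent MBRs overlap.
leaf1 = Node(True, [((0, 2, 0, 2), "A"), ((1, 3, 1, 3), "B")])
leaf2 = Node(True, [((2, 5, 2, 5), "C"), ((6, 8, 6, 8), "D")])
root = Node(False, [((0, 3, 0, 3), leaf1), ((2, 8, 2, 8), leaf2)])

# A point query is a degenerate rectangle.
found = search(root, (2.5, 2.5, 2.5, 2.5), [])
print(found)  # ['B', 'C']
```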

15
R-trees: Insertion
Insert X

[Figure: new rectangle X among rectangles A through J and parent MBRs P1 through P4]

16
R-trees: Insertion
Insert Y

[Figure: new rectangle Y among rectangles A through J and parent MBRs P1 through P4]

17
R-trees: Insertion
  • Extend the parent MBR

[Figure: a parent MBR extended to include Y; rectangles A through J and parent MBRs P1 through P4]

18
R-trees: Insertion
  • How to find the next node to insert the new
    object?
  • Using ChooseLeaf: find the entry that needs the
    least enlargement to include Y. Resolve ties
    using the area (smallest)
  • Other methods (later)
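
A minimal sketch of the ChooseLeaf criterion at one node, assuming MBRs are (xlow, xhigh, ylow, yhigh) tuples; the helper names are illustrative. Sorting on the pair (enlargement, area) applies the least-enlargement rule and breaks ties by the smallest area.

```python
def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def union(a, b):
    """Smallest rectangle covering both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))


def enlargement(mbr, new):
    """Extra area the entry's MBR needs in order to include the new rectangle."""
    return area(union(mbr, new)) - area(mbr)


def choose_entry(mbrs, new):
    """Index of the entry with least enlargement; ties go to the smaller area."""
    return min(range(len(mbrs)),
               key=lambda i: (enlargement(mbrs[i], new), area(mbrs[i])))


# Hypothetical node: entries 1 and 2 both contain Y already (enlargement 0),
# so the tie is resolved in favour of the smaller one.
entries = [(0, 4, 0, 4), (5, 9, 5, 9), (0, 10, 0, 10)]
y = (6, 7, 6, 7)
print(choose_entry(entries, y))  # 1
```

In a full insertion this choice is repeated at each level until a leaf is reached.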

19
R-trees: Insertion
  • If node is full, then split. Ex.: insert W

[Figure: rectangles A through K under parent MBRs P1 through P4; new rectangle W falls in a full node]

20
R-trees: Insertion
  • If node is full, then split. Ex.: insert W

[Figure: the tree after inserting W, with parent MBRs P1 through P5; the labels Q1 and Q2 also appear]

21
R-trees: Split
  • Split node P1: partition the MBRs into two groups.
  • (A1: plane sweep,
  • until 50% of rectangles)
  • A2: linear split
  • A3: quadratic split
  • A4: exponential split:
  • 2^(M-1) choices

[Figure: node P1 containing K, C, A, W, B]

22
R-trees: Split
  • pick two rectangles as seeds
  • assign each rectangle R to the closest seed

[Figure: split example with seed1 labeled]

23
R-trees: Split
  • pick two rectangles as seeds
  • assign each rectangle R to the closest
    seed
  • closest = the smallest increase in area

[Figure: split example with seed1 labeled]

24
R-trees: Split
  • How to pick seeds:
  • Linear: in each dimension, find the rectangle with
    the highest low side and the one with the lowest
    high side; normalize the separations by the extent
    of the whole set along that dimension; choose the
    pair with the greatest normalized separation
  • Quadratic: for each pair E1 and E2, calculate the
    rectangle J = MBR(E1, E2) and
    d = area(J) - area(E1) - area(E2). Choose
    the pair with the largest d
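
A minimal sketch of the quadratic seed choice followed by the "assign to the closest seed" step from the earlier slides. It assumes MBRs are (xlow, xhigh, ylow, yhigh) tuples, and it is a simplification: rectangles are assigned in input order and the minimum fill m is not enforced. Function names are illustrative.

```python
from itertools import combinations


def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def union(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))


def pick_seeds_quadratic(rects):
    """Pair (i, j) maximizing d = area(J) - area(E1) - area(E2)."""
    def wasted(pair):
        i, j = pair
        return area(union(rects[i], rects[j])) - area(rects[i]) - area(rects[j])
    return max(combinations(range(len(rects)), 2), key=wasted)


def split(rects):
    """Assign every other rectangle to the group whose MBR grows least.

    Simplified: processes rectangles in input order and ignores the
    minimum-fill constraint m (assumption for brevity).
    """
    s1, s2 = pick_seeds_quadratic(rects)
    groups, mbrs = [[s1], [s2]], [rects[s1], rects[s2]]
    for i, r in enumerate(rects):
        if i in (s1, s2):
            continue
        g = min((0, 1), key=lambda k: area(union(mbrs[k], r)) - area(mbrs[k]))
        groups[g].append(i)
        mbrs[g] = union(mbrs[g], r)
    return groups, mbrs


# Hypothetical overflowing node: three rectangles near the origin, one far away.
rects = [(0, 1, 0, 1), (1, 2, 0, 1), (9, 10, 9, 10), (0, 1, 1, 2)]
seeds = pick_seeds_quadratic(rects)
groups, mbrs = split(rects)
print(seeds)   # (0, 2)
print(groups)  # [[0, 1, 3], [2]]
```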

25
R-trees: Insertion
  • Use the ChooseLeaf to find the leaf node to
    insert an entry E
  • If leaf node is full, then Split, otherwise
    insert there
  • Propagate the split upwards, if necessary
  • Adjust parent nodes

26
R-trees: Deletion
  • Find the leaf node that contains the entry E
  • Remove E from this node
  • If underflow:
  • Eliminate the node by removing the node entries
    and the parent entry
  • Reinsert the orphaned (other) entries into the
    tree using Insert
  • Other method (later)

27
R-trees: Variations
  • R+-tree: do not allow overlapping, so split the
    objects (similar to z-values)
  • R*-tree: change the insertion, deletion
    algorithms (minimize not only area but also
    perimeter; forced re-insertion)
  • Hilbert R-tree: use the Hilbert values to insert
    objects into the tree

28
Spatial Access Methods
  • PAMs
  • Grid File
  • kd-tree based (LSD-, hB- trees)
  • Z-ordering + B-tree
  • R-tree
  • Variations: R*-tree, Hilbert R-tree

29
R-tree
Multi-way external memory structure, indexes MBRs
Dynamic structure

[Figure: rectangles A through J grouped into parent MBRs P1 through P4]

30
R*-tree
  • The original R-tree tries to minimize the area of
    each enclosing rectangle in the index nodes.
  • Is there any other property that can be
    optimized?

R*-tree: Yes!
31
R*-tree
  • Optimization Criteria
  • (O1) Area covered by an index MBR
  • (O2) Overlap between directory MBRs
  • (O3) Margin of a directory rectangle
  • (O4) Storage utilization
  • Sometimes it is impossible to optimize all the
    above criteria at the same time!

32
R*-tree
  • ChooseSubtree:
  • If the entries of the current node point to leaf
    nodes, choose the entry using the following
    criteria, in order:
  • Least overlap enlargement
  • Least area enlargement
  • Smaller area
  • Else:
  • Least area enlargement
  • Smaller area
33
R*-tree
  • SplitNode:
  • Choose the axis to split
  • Choose the two groups along the chosen axis
  • ChooseSplitAxis:
  • Along each axis, sort rectangles and break them
    into two groups (M-2m+2 possible ways where each
    group contains at least m rectangles). Compute
    the sum S of the margin-values (perimeters) of
    all pairs of groups. Choose the axis that
    minimizes S
  • ChooseSplitIndex:
  • Along the chosen axis, choose the grouping that
    gives the minimum overlap-value
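
A minimal sketch of the two steps, assuming MBRs are (xlow, xhigh, ylow, yhigh) tuples and a node that overflowed with M+1 entries. Two details are assumptions taken from the usual description of the R*-tree rather than from the slides: each axis is sorted once by lower and once by upper side, and ties on overlap are broken by total area. Function names are illustrative.

```python
def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def margin(r):
    return 2 * ((r[1] - r[0]) + (r[3] - r[2]))


def mbr_of(rs):
    return (min(r[0] for r in rs), max(r[1] for r in rs),
            min(r[2] for r in rs), max(r[3] for r in rs))


def overlap(a, b):
    w = min(a[1], b[1]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[2], b[2])
    return w * h if w > 0 and h > 0 else 0


def distributions(rects, axis, m):
    """Yield the M-2m+2 groupings for each of the two sort orders of an axis."""
    lo, hi = 2 * axis, 2 * axis + 1
    for sort_key in (lambda r: (r[lo], r[hi]), lambda r: (r[hi], r[lo])):
        s = sorted(rects, key=sort_key)
        for k in range(m, len(s) - m + 1):
            yield s[:k], s[k:]


def rstar_split(rects, m):
    # ChooseSplitAxis: the axis with the smallest sum S of margin-values.
    def margin_sum(axis):
        return sum(margin(mbr_of(g1)) + margin(mbr_of(g2))
                   for g1, g2 in distributions(rects, axis, m))
    axis = min((0, 1), key=margin_sum)

    # ChooseSplitIndex: minimum overlap, ties broken by total area.
    def quality(d):
        a, b = mbr_of(d[0]), mbr_of(d[1])
        return (overlap(a, b), area(a) + area(b))
    return axis, min(distributions(rects, axis, m), key=quality)


# Hypothetical overflowing node with M = 4, m = 2 (five entries).
rects = [(0, 1, 0, 1), (1, 2, 5, 6), (2, 3, 1, 2), (7, 8, 6, 7), (8, 9, 2, 3)]
axis, (g1, g2) = rstar_split(rects, m=2)
print(axis)  # 0, i.e. split along x
print(g1)    # [(0, 1, 0, 1), (1, 2, 5, 6), (2, 3, 1, 2)]
print(g2)    # [(7, 8, 6, 7), (8, 9, 2, 3)]
```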

34
R*-tree
  • Forced Reinsert:
  • defer splits by forced reinsert, i.e., instead
    of splitting, temporarily delete some entries,
    shrink the overflowing MBR, and re-insert those
    entries
  • Which ones to re-insert?
  • How many? A: 30%
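
A minimal sketch of selecting the entries to re-insert. The slides leave "which ones?" open; the sketch assumes the rule usually given for the R*-tree, namely the entries whose centers lie farthest from the center of the node's MBR. MBRs are (xlow, xhigh, ylow, yhigh) tuples, and the names and the rounding of the 30% are illustrative choices.

```python
import math


def mbr_of(rs):
    return (min(r[0] for r in rs), max(r[1] for r in rs),
            min(r[2] for r in rs), max(r[3] for r in rs))


def center(r):
    return ((r[0] + r[1]) / 2, (r[2] + r[3]) / 2)


def pick_reinsert(rects, fraction=0.3):
    """Split entries into (to_reinsert, to_keep).

    Assumption: remove the entries farthest from the node center;
    at least one entry is always removed.
    """
    c = center(mbr_of(rects))
    by_distance = sorted(rects, key=lambda r: math.dist(center(r), c),
                         reverse=True)
    p = max(1, int(fraction * len(rects)))
    return by_distance[:p], by_distance[p:]


# Hypothetical overflowing node with five entries; one is an outlier.
rects = [(0, 1, 3, 4), (4, 5, 4, 5), (5, 6, 5, 6), (4, 6, 4, 6), (6, 7, 0, 1)]
reinsert, keep = pick_reinsert(rects)
print(reinsert)       # [(6, 7, 0, 1)]
print(mbr_of(rects))  # (0, 7, 0, 6) before
print(mbr_of(keep))   # (0, 6, 3, 6) after removing the outlier
```

The removed entries would then be inserted again from the root, which gives them a chance to land in a better-fitting node.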

35
R-tree variations
  • What about static datasets
    (no ins/del)? Hilbert
  • What about other bounding shapes?

36
R-trees - variations
  • what about static datasets (no ins/del/upd)?
  • Q: Best way to pack points?

37
R-trees - variations
  • what about static datasets (no ins/del/upd)?
  • Q: Best way to pack points?
  • A1: plane-sweep
  • great for queries on x
  • terrible for y

38
R-trees - variations
  • what about static datasets (no ins/del/upd)?
  • Q: Best way to pack points?
  • A1: plane-sweep
  • great for queries on x
  • bad for y

39
R-trees - variations
  • what about static datasets (no ins/del/upd)?
  • Q: Best way to pack points?
  • A1: plane-sweep
  • great for queries on x
  • terrible for y
  • Q: how to improve?

40
R-trees - variations
  • A: plane-sweep on HILBERT curve!

41
R-trees - variations
  • A: plane-sweep on HILBERT curve!
  • In fact, it can be made dynamic (how?), as well
    as handle regions (how?)
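
A minimal sketch of Hilbert packing for points on an n x n integer grid (n a power of two). The function hilbert_value follows the standard iterative grid-to-curve conversion; the pack helper, the leaf capacity, and the sample points are assumptions for illustration. Points are sorted by Hilbert value and cut into consecutive leaves, which plays the role of the plane sweep but along the curve instead of along x.

```python
def hilbert_value(n, x, y):
    """Position of cell (x, y) along the Hilbert curve filling an n x n grid."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def pack(points, n, capacity):
    """Sort by Hilbert value, then fill leaves of the given capacity in order."""
    ordered = sorted(points, key=lambda p: hilbert_value(n, p[0], p[1]))
    leaves = [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]
    mbrs = [(min(x for x, _ in leaf), max(x for x, _ in leaf),
             min(y for _, y in leaf), max(y for _, y in leaf))
            for leaf in leaves]
    return leaves, mbrs


# Hypothetical points on a 4 x 4 grid, two points per leaf.
points = [(0, 0), (3, 0), (1, 1), (0, 3), (3, 3), (1, 0), (2, 2), (0, 1)]
leaves, mbrs = pack(points, n=4, capacity=2)
print(leaves[0])  # [(0, 0), (1, 0)]
print(leaves[3])  # [(3, 3), (3, 0)]
```

One common way to handle regions instead of points is to use the Hilbert value of each rectangle's center as its sort key.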

42
R-trees - variations
  • Dynamic (Hilbert R-tree):
  • each point has an h-value (Hilbert value)
  • insertions like a B-tree on the h-value
  • but also store MBR, for searches

43
Hilbert R-tree
  • Data structure of a node?

[ (x-low, y-low), (x-high, y-high) | LHV | ptr ]
LHV: largest Hilbert value among the entries under ptr
(each h-value <= LHV); MBRs inside parent MBR

44
R-trees - variations
  • Data structure of a node?

[ (x-low, y-low), (x-high, y-high) | LHV | ptr ]
LHV part: as in a B-tree
LHV: largest Hilbert value among the entries under ptr
(each h-value <= LHV); MBRs inside parent MBR

45
R-trees - variations
  • Data structure of a node?

[ (x-low, y-low), (x-high, y-high) | LHV | ptr ]
MBR part: as in an R-tree
LHV: largest Hilbert value among the entries under ptr
(each h-value <= LHV); MBRs inside parent MBR