I/O-Efficient Batched Union-Find and Its Applications to Terrain Analysis

Slides: 42
Provided by: Gue285
Learn more at: http://www.cse.ust.hk

1
I/O-Efficient Batched Union-Find and Its
Applications to Terrain Analysis
  • Pankaj K. Agarwal, Lars Arge, and Ke Yi
  • Duke University
  • University of Aarhus

2
The Union-Find Problem
  • A universe of N elements x1, x2, ..., xN
  • Initially N singleton sets {x1}, {x2}, ..., {xN}
  • Each set has a representative
  • Maintain the partition under:
  • Union(xi, xj): joins the sets containing xi and
    xj
  • Find(xi): returns the representative of the set
    containing xi

3
The Solution
[Figure: the sets stored as a forest with the representatives at the roots; panels show Union(d, h) and Find(n), with the labels link-by-rank and path compression]
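The two techniques named on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the I/O-efficient algorithm of the talk; the class name `UnionFind` and its dictionary layout are assumptions made for the example.

```python
class UnionFind:
    """In-memory disjoint sets with link-by-rank and path compression."""

    def __init__(self, elements):
        # Every element starts as the representative of its own singleton set.
        self.parent = {x: x for x in elements}
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        # First pass: walk up to the root (the representative).
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every node on the path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # redundant union: both already in the same set
        # Link-by-rank: hang the root of smaller rank below the other root.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


uf = UnionFind("abdhn")
uf.union("d", "h")
print(uf.find("d") == uf.find("h"))  # d and h now share a representative
print(uf.find("n"))                  # n is still alone
```

Every `find` chases pointers to arbitrary locations, which is why the structure works well only while it fits in main memory.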
4
Complexity
  • O(N α(N)) for a sequence of N union and find
    operations [Tarjan 75]
  • α(): inverse Ackermann function (grows very slowly!)
  • Optimal in the worst case [Tarjan 79; Fredman and
    Saks 89]
  • Batched (off-line) version:
  • Entire sequence known in advance
  • Can be improved to linear on a RAM [Gabow and
    Tarjan 85]
  • Not possible on a pointer machine [Tarjan 79]

5
Simple and Good, as long as
  • The entire data structure fits in memory

6
The I/O Model
Main memory of size M
One I/O transfers B items between memory and disk
Disk of infinite size
7
Our Results
  • An I/O-efficient algorithm for the batched
    union-find problem using O(sort(N)) = O((N/B)
    log_{M/B}(N/B)) I/Os (expected)
  • Same as sorting
  • Optimal in the worst case
  • A practical algorithm using O(sort(N) log(N/M))
    I/Os
  • Applications to terrain analysis:
  • Topological persistence: O(sort(N)) I/Os
  • Contour trees: O(sort(N)) I/Os
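To get a feel for these bounds, the following sketch evaluates them numerically with all constant factors dropped. The function names, the base-2 logarithm in the practical bound, and the example values of N, M, and B are assumptions chosen for illustration; they are not taken from the talk.

```python
import math


def scan_ios(n, b):
    """I/Os needed to read n items once, b items per block."""
    return math.ceil(n / b)


def sort_ios(n, m, b):
    """(N/B) * log_{M/B}(N/B), the sorting bound without constants."""
    blocks = n / b
    fanout = m / b
    # At least one pass over the data is always needed.
    return blocks * max(1.0, math.log(blocks, fanout))


def practical_ios(n, m, b):
    """sort(N) * log(N/M); the base of the logarithm (2) is an assumption."""
    return sort_ios(n, m, b) * max(1.0, math.log2(n / m))


# Hypothetical machine: N = 2^30 items, memory M = 2^20 items, block B = 2^10 items.
N, M, B = 2**30, 2**20, 2**10
print(scan_ios(N, B), round(sort_ios(N, M, B)), round(practical_ios(N, M, B)))
```

With these numbers the logarithm in sort(N) is about 2, so sorting costs only a small constant number of scans, whereas one I/O per operation would cost on the order of N.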

8
I/O-Efficient Batched Union-Find
  • Assumption: no redundant unions
  • Each union must join two different sets
  • The assumption will be removed later
  • Two-stage algorithm:
  • Convert to interval union-find
  • Compute an order on the elements s.t. each union
    joins two adjacent sets
  • Solve batched interval union-find

9
Union Graph
(Tree if no redundant unions)
  1. Union(d, g)
  2. Union(a, c)
  3. Union(r, b)
  4. Union(a, e)
  5. Union(e, i)
  6. Union(r, a)
  7. Union(a, d)
  8. Union(d, h)
  9. Union(b, f)
[Figure: two equivalent union trees rooted at r on the elements a to i; each edge is labeled with the time stamp (1 to 9) of its union]
Equivalent union trees
10
Transforming the Union Tree
[Figure: step-by-step transformation of the union tree rooted at r]
Weights along every root-to-leaf path decrease
11
Formulating as a Batched Problem
[Figure: the union tree rooted at r before and after the transformation]
For each edge, find the lowest ancestor edge with
a higher weight
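The query stated above can be illustrated with a simple in-memory sketch on the union tree of the running example (unions 1 to 9, rooted at r). The reattachment rule used here, hanging each node below the lower endpoint of the edge that is found, or below the root if there is none, is an assumption inferred from the figures; the function names are hypothetical, and the pointer-chasing loop is not I/O-efficient.

```python
def transform_union_tree(parent):
    """parent maps node -> (parent_node, weight); the root has no entry.

    For every edge (u, p) of weight w, climb past all ancestor edges whose
    weight is lower than w and reattach u where the climb stops.
    """
    new_parent = {}
    for u, (p, w) in parent.items():
        v = p
        # The edge above v is older (smaller time stamp) than w: keep climbing.
        while v in parent and parent[v][1] < w:
            v = parent[v][0]
        new_parent[u] = (v, w)
    return new_parent


def weights_decrease(parent):
    """True if every edge is lighter than the edge directly above it."""
    return all(p not in parent or parent[p][1] > w
               for p, w in parent.values())


# Union tree of the example: node -> (parent, time stamp of the union).
tree = {"a": ("r", 6), "b": ("r", 3), "c": ("a", 2), "d": ("a", 7),
        "e": ("a", 4), "f": ("b", 9), "g": ("d", 1), "h": ("d", 8),
        "i": ("e", 5)}

new_tree = transform_union_tree(tree)
print(weights_decrease(tree), weights_decrease(new_tree))
```

The reattached edge is still a valid union: by time w, every ancestor edge that was skipped has already been processed, so the new endpoint lies in the same set as the old one.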
12
Cast in a Geometry Setting
[Figure: the union tree rooted at r and its edges drawn in the plane]
Euler tour
x: positions in the tour; y: weight
In O(sort(N)) I/Os [Chiang et al. 95]
13
Cast in a Geometry Setting
[Figure: the union tree rooted at r and its edges drawn in the plane]
For each edge: find the lowest ancestor edge with
a higher weight
For each segment: find the shortest segment above
it that contains it
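The correspondence between the tree query and the segment query can be checked with a small sketch. It assigns Euler-tour positions with an in-memory depth-first search and answers the segment query by brute force; the talk uses distribution sweeping instead. The function names and the convention of naming each edge by its child node are assumptions made for the example.

```python
def euler_segments(children, root):
    """children maps node -> list of (child, weight).

    Returns child -> (x1, x2, y): the tour positions at which the edge above
    the child is traversed downward and upward, and the edge weight.
    """
    segments = {}
    clock = 0

    def visit(u):
        nonlocal clock
        for c, w in children.get(u, []):
            clock += 1
            x1 = clock
            visit(c)
            clock += 1
            segments[c] = (x1, clock, w)

    visit(root)
    return segments


def shortest_containing_above(segments):
    """For each segment, the shortest higher segment that contains it (or None)."""
    answer = {}
    for s, (a1, a2, ay) in segments.items():
        best = None
        for t, (b1, b2, by) in segments.items():
            if by > ay and b1 < a1 and a2 < b2:
                if best is None or b2 - b1 < best[0]:
                    best = (b2 - b1, t)
        answer[s] = best[1] if best else None
    return answer


# Union tree of the running example (edges named by their child endpoint).
children = {"r": [("a", 6), ("b", 3)],
            "a": [("c", 2), ("d", 7), ("e", 4)],
            "b": [("f", 9)],
            "d": [("g", 1), ("h", 8)],
            "e": [("i", 5)]}

print(shortest_containing_above(euler_segments(children, "r")))
```

Segments of ancestor edges are exactly the segments that contain a given segment, and they are nested, so the shortest one that lies higher is the lowest ancestor edge with a higher weight.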
14
Distribution Sweeping
[Figure: the plane divided into M/B vertical slabs, with annotations marking what is checked here and what is checked recursively]
Total cost: O(sort(N)) I/Os
15
In-Order Traversal
[Figure: the transformed union tree rooted at r]
Weights along every root-to-leaf path decrease
  • At u, with children u1, ..., uk (in increasing order
    of weight):
  • Recursively visit the subtree at u1
  • Return u
  • For i = 2, ..., k: recursively visit the subtree at ui

[Figure: the elements laid out in the order produced by the traversal]
Claim: this traversal produces the right order
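The claim can be checked on the running example with a short sketch. It performs the traversal described above on a tree whose weights decrease along root-to-leaf paths, then replays the nine unions and verifies that each one joins two adjacent intervals of the resulting order. The tree literal is an assumed encoding of the transformed example tree, and the function names are hypothetical.

```python
def interval_order(children, root):
    """children maps node -> list of (child, weight)."""
    order = []

    def visit(u):
        kids = sorted(children.get(u, []), key=lambda cw: cw[1])
        if kids:
            visit(kids[0][0])       # subtree of the lightest child first
        order.append(u)             # then u itself
        for c, _ in kids[1:]:       # then the remaining subtrees by weight
            visit(c)

    visit(root)
    return order


def unions_join_adjacent(order, unions):
    """Replay the unions in time order; each must merge two adjacent intervals."""
    pos = {x: i for i, x in enumerate(order)}
    interval = {i: (i, i) for i in range(len(order))}
    for x, y in unions:
        (lo1, hi1), (lo2, hi2) = sorted([interval[pos[x]], interval[pos[y]]])
        if hi1 + 1 != lo2:
            return False
        for i in range(lo1, hi2 + 1):
            interval[i] = (lo1, hi2)
    return True


# Transformed tree of the example (assumed encoding) and the original unions.
children = {"r": [("a", 6), ("b", 3), ("d", 7), ("f", 9), ("h", 8)],
            "a": [("c", 2), ("e", 4), ("i", 5)],
            "d": [("g", 1)]}
unions = [("d", "g"), ("a", "c"), ("r", "b"), ("a", "e"), ("e", "i"),
          ("r", "a"), ("a", "d"), ("d", "h"), ("b", "f")]

order = interval_order(children, "r")
print(order, unions_join_adjacent(order, unions))
```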
16
Solving Interval Union-Find
Union: x = two operands, y = time stamp
Find: x = operand, y = time stamp
[Figure: the operations plotted in the plane, with a representative marked]
17
Solving Interval Union-Find
Union: x = two operands, y = time stamp
Find: x = operand, y = time stamp
Four instances of batched ray shooting: O(sort(N)) I/Os
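A brute-force stand-in for the ray-shooting view is sketched below. Each pair of neighboring positions is separated by a wall that disappears at the time stamp of the union that crosses it, and a find at time t walks left until it hits a wall that is still standing at time t. Taking the leftmost position as the representative, using the list index as the time stamp, and the function name are assumptions made for the example; the scans take linear time per operation and only illustrate the geometry.

```python
import math


def solve_interval_union_find(n, ops):
    """ops[t] is ("union", i, j) or ("find", i); positions are 0..n-1.

    Every union is assumed to join two adjacent intervals.
    Returns one representative (the leftmost position) per find, in order.
    """
    # removed_at[p] is the time at which the wall between p and p+1 disappears.
    removed_at = [math.inf] * (n - 1)
    for t, op in enumerate(ops):
        if op[0] == "union":
            lo, hi = min(op[1], op[2]), max(op[1], op[2])
            p = lo
            # The first wall still standing to the right of lo separates the two sets.
            while p < hi and removed_at[p] != math.inf:
                p += 1
            assert p < hi, "redundant union: operands already in one interval"
            removed_at[p] = t

    answers = []
    for t, op in enumerate(ops):
        if op[0] == "find":
            left = op[1]
            # Shoot a ray to the left, passing walls removed before time t.
            while left > 0 and removed_at[left - 1] < t:
                left -= 1
            answers.append(left)
    return answers


ops = [("union", 0, 1), ("find", 1), ("union", 2, 3),
       ("union", 1, 2), ("find", 3), ("find", 4)]
print(solve_interval_union_find(5, ops))  # [0, 0, 4]
```

Once the removal times are known, every find can be answered independently of the others, which is what makes the problem suitable for batched processing.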
19
Handling Redundant Unions
  • Union tree becomes a general graph
  • Compute the minimum spanning tree:
  • O(sort(N)) I/Os (randomized) [Chiang et al. 95];
    O(sort(N) log log B) I/Os (deterministic) [Arge et
    al. 04]
  • Deterministic O(sort(N)) I/Os if the graph is planar
  • Only MST edges are non-redundant
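Why only the MST edges matter can be seen from a small in-memory sketch: with the time stamps as edge weights, Kruskal's algorithm keeps exactly the unions that join two different sets. This is plain Kruskal on a sorted edge list, used here as an illustration and not one of the I/O-efficient MST algorithms cited above; the function name is hypothetical.

```python
def non_redundant_unions(unions):
    """unions: list of (x, y) in time order. Returns the kept (time, x, y) triples."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for t, (x, y) in enumerate(unions, start=1):
        rx, ry = find(x), find(y)
        if rx != ry:
            # The edge connects two components, so Kruskal adds it to the forest.
            parent[rx] = ry
            kept.append((t, x, y))
        # Otherwise x and y are already connected by earlier edges: redundant.
    return kept


print(non_redundant_unions([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]))
# [(1, 'a', 'b'), (2, 'b', 'c'), (4, 'c', 'd')]
```

The third union is dropped because a and c were already joined by the two earlier unions; the kept edges form the union tree that the two-stage algorithm expects.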

20
Applications
  1. Topological Persistence
  2. Contour Trees

21
Application: Topological Persistence
  • Introduced by Edelsbrunner et al. 2000
  • Measure importance on a surface
  • Feature extraction
  • Topological de-noising
  • Many applications
  • Surface modeling
  • Shape analysis
  • Terrain analysis
  • Computational Biology

22
Topological Persistence Illustrated
23
Formulated as Batched Union-Find
  • Represented as a triangulated mesh
  • Consider minimum-saddle pairs
  • When the sweep reaches:
  • A minimum or maximum: do nothing
  • A regular point u: issue union(u, v) for a lower
    neighbor v
  • A saddle u: let v and w be nodes from u's two
    connected pieces in its lower link; issue
    find(v), find(w), union(u, v), union(u, w)

[Figure: the lower link of a vertex]
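The sweep above can be sketched in memory on a graph instead of a triangulated mesh. Vertices are processed from low to high; a vertex whose lower neighbors lie in different components plays the role of a saddle, and the component with the higher minimum is paired with it. Distinct heights, the lowest vertex serving as representative, the pairing rule, and the function name are all assumptions made for this illustration.

```python
def minimum_saddle_pairs(height, neighbors):
    """height: vertex -> value; neighbors: vertex -> iterable of vertices.

    Returns a list of (minimum, saddle, persistence) triples.
    """
    parent = {}
    lowest = {}  # root of a component -> its lowest vertex (a minimum)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for u in sorted(height, key=height.get):
        # Components of the lower neighbours (all processed earlier in the sweep).
        roots = {find(v) for v in neighbors[u] if height[v] < height[u]}
        if not roots:
            parent[u] = u       # a minimum starts a new component
            lowest[u] = u
            continue
        # The component with the lowest minimum survives the merge.
        survivor = min(roots, key=lambda r: height[lowest[r]])
        for r in roots:
            if r != survivor:
                m = lowest[r]
                pairs.append((m, u, height[u] - height[m]))
                parent[r] = survivor
        parent[u] = survivor    # regular points simply join the component
    return pairs


# A path of six vertices with minima at vertices 0, 2, and 4.
height = {0: 2, 1: 6, 2: 1, 3: 4, 4: 0, 5: 3}
neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
print(minimum_saddle_pairs(height, neighbors))  # [(2, 3, 3), (0, 1, 4)]
```

The global minimum (vertex 4 here) is never paired, because its component survives every merge.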
24
Experiment 1: Random Union-Find
128MB memory
25
Experiment 2: Topological Persistence on Terrain Data
Neuse River Basin of North Carolina: 0.5 billion points
26
Experiment 2: Topological Persistence on Terrain Data
128MB memory
Entire data set (0.5 billion points): IM fails and EM takes 10 hours
27
Contour Trees
28
Summary
  • An I/O-efficient algorithm for the batched
    union-find problem using O(sort(N)) = O((N/B)
    log_{M/B}(N/B)) I/Os
  • Optimal in the worst case
  • A practical algorithm using O(sort(N) log(N/M))
    I/Os
  • Applications to terrain analysis:
  • Topological persistence: O(sort(N)) I/Os
  • Contour trees: O(sort(N)) I/Os
  • Open question:
  • On-line case: can we get below O(N α(N)) I/Os?

29
Thank you!
30
Previous Results
  • Directly maintain contours
  • O(N log N) time [van Kreveld et al. 97]
  • Needs union-split-find for circular lists
  • Does not extend to higher dimensions
  • Two sweeps by maintaining components, then merge
  • O(N log N) time [Carr et al. 03]
  • Extends to arbitrary dimensions

31
Join Tree and Split Tree
[Figure: join tree and split tree of an example with nodes 1 to 9; qualified nodes are marked]
32
Final Contour Tree
Hard to BATCH!
[Figure: join tree, split tree, and the resulting contour tree on nodes 1 to 9]
33
Another Characterization
Let w be the highest node that is a descendant of
v in the join tree and an ancestor of u in the split tree;
then (u, w) is a contour tree edge
Now can BATCH!
[Figure: join tree, split tree, and contour tree on nodes 1 to 9, with u, v, and w marked]
34
Map to Rectangles
[Figure: join tree and split tree on nodes 1 to 9, with u, v, and w marked]
Can be solved in O(sort(N)) I/Os (practical, too)
35
Topological Persistence
36
Label Nodes with Intervals
[Figure: a tree on nodes 1 to 9 with each node labeled by an interval]
Using Euler tour (O(sort(N)) I/Os)