External-Memory MST - PowerPoint PPT Presentation


Provided by: dlaz8


1
External-Memory MST

(Arge, Brodal, Toma)

2
Minimum-Spanning Tree

Given a weighted, undirected graph G = (V, E), the
minimum spanning tree (MST) problem is the
problem of finding a spanning tree of G of
minimum total weight.
Assumptions:
  G is connected.
  No two edges in G have the same weight.

3
External-Memory Graph Algorithms

Standard two-level I/O model with a single disk:
  N = V + E
  M = number of vertices/edges that can fit into
  internal memory.
  B = number of vertices/edges per disk block.
The graph is given as a list of edges sorted by
vertex.
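
The bounds on the following slides are stated in terms of scan(N) and sort(N). As a rough aid, the sketch below evaluates the standard I/O-model estimates scan(N) = N/B and sort(N) = (N/B) log_{M/B}(N/B) with constants dropped. The function names and the sample values of V, E, M and B are assumptions made for illustration, not values from the slides.

```python
import math

def scan_ios(n, B):
    """I/Os to read n contiguous items: ceil(n / B)."""
    return math.ceil(n / B)

def sort_ios(n, M, B):
    """Rough external merge sort cost: blocks * number of (M/B)-way merge levels."""
    blocks = math.ceil(n / B)
    fan_in = max(2, M // B)
    passes, remaining = 0, blocks
    while remaining > 1:
        remaining = -(-remaining // fan_in)  # integer ceiling division
        passes += 1
    return blocks * max(1, passes)

# Hypothetical instance: 10^6 vertices, 10^7 edges, M = 10^6 items, B = 10^3 items.
V, E, M, B = 10**6, 10**7, 10**6, 10**3
sort_e = sort_ios(E, M, B)
one_io_per_vertex = V + sort_e  # a V + sort(E) style bound
```

With these sample numbers the V term is far larger than sort(E), which is why bounds with an additive V are considered expensive in this model.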

4
External-Memory Graph Algorithms (2)

For MST and connected components (CC), randomized
O(sort(E))-I/O algorithms are known.

5
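Prim's Algorithm
[Figure: example graph on vertices a, b, c, d, e, f with edge weights 1 to 9; the edges (b,a), (a,c), (c,d), (d,e), (a,f) are labeled. The priority queue holds vertices.]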
6
Prim's Algorithm (2)

Prim's algorithm cannot be implemented
efficiently in external memory:
  It is not guaranteed that even the priority queue
  alone fits in memory.
  Thus, we cannot in general get the priority of the
  current vertex without using an I/O.
  A direct implementation leads to an Ω(E) I/O
  algorithm.

7
Prim's Algorithm (3)
Modification: store edges in the priority queue
instead of vertices.
[Figure: the example graph with edge weights (b,a) 1, (c,d) 2, (a,c) 3, (d,e) 4, (b,c) 5, (b,d) 6, (a,f) 7, (c,e) 8, (e,f) 9. The edges (b,a), (a,c), (c,d), (d,e), (a,f) are labeled.]
Priority queue contents after each step, starting from b:
  b,a (1) b,c (5) b,d (6)
  a,c (3) b,c (5) b,d (6) a,f (7)
  c,d (2) b,d (6) c,b (5) a,f (7) b,c (5) c,e (8)
  d,e (4) b,d (6) c,b (5) a,f (7) b,c (5) c,e (8) d,b (6)
  c,b (5) a,f (7) b,c (5) e,c (8) b,d (6) c,e (8) d,b (6) e,f (9)
  b,d (6) e,c (8) d,b (6) c,e (8) a,f (7) e,f (9)
  a,f (7) e,c (8) c,e (8) e,f (9)
  e,c (8) c,e (8) e,f (9) f,e (9)
Any two edges have distinct weights.
8
Modified Prim Algorithm

The correctness follows directly from the
correctness of the original algorithm (the blue
rule still applies).
Efficiency:
  At least one I/O per vertex is needed to read its
  adjacency list, which gives O(V + E/B) I/Os.
  O(E) operations on an external priority queue can be
  performed in O(sort(E)) I/Os.
  Thus, in total we have O(V + sort(E)) I/Os.
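
A minimal in-memory sketch of the edge-based variant, assuming an ordinary binary heap (`heapq`) stands in for the external priority queue. When a vertex joins the tree, all its incident edges except the one it entered through are inserted. A non-tree edge whose endpoints are both in the tree is therefore in the queue twice, and because weights are distinct, two equal minimum weights identify such an edge without any per-vertex lookup. The edge list is the example graph as read from the slide figures; `adjacency` and `prim_edge_pq` are illustrative names.

```python
import heapq

EDGES = [("b", "a", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4), ("b", "c", 5),
         ("b", "d", 6), ("a", "f", 7), ("c", "e", 8), ("e", "f", 9)]

def adjacency(edges):
    """Build vertex -> [(neighbor, weight)] from an undirected edge list."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj

def prim_edge_pq(adj, start):
    """Prim with edges in the queue; assumes a connected graph with distinct weights."""
    pq = [(w, start, v) for v, w in adj[start]]
    heapq.heapify(pq)
    mst = []
    while pq:
        w, u, v = heapq.heappop(pq)
        if pq and pq[0][0] == w:
            # Same weight again: the edge was inserted from both endpoints,
            # so both are already in the tree. Discard both copies.
            heapq.heappop(pq)
            continue
        mst.append((u, v, w))
        for x, w2 in adj[v]:
            if x != u:  # do not re-insert the edge v entered through
                heapq.heappush(pq, (w2, v, x))
    return mst

mst = prim_edge_pq(adjacency(EDGES), "b")
```

Starting from b, the sketch adds the edges in the order (b,a), (a,c), (c,d), (d,e), (a,f), for a total weight of 17.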

9
Borůvka's Algorithm
(1) For each vertex, select the minimum-weight
edge incident to it. (2) Contract the graph and
return to (1).
[Figure: the example graph with edge weights 1 to 9; the selected edges (b,a), (c,d), (d,e), (a,f) are labeled.]
10
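Borůvka's Algorithm
(1) For each vertex, select the minimum-weight
edge incident to it. (2) Contract the graph and
return to (1).
[Figure: after one contraction, two supervertices abf and cde remain, connected by the edges of weights 3, 5, 6, 9. The edges (b,a), (a,c), (c,d), (d,e), (a,f) are labeled.]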
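
One step of this procedure can be sketched in memory as follows. This is only an illustration on the example graph: it labels components with a small union-find, which is an in-memory shortcut and not the external-memory contraction described on the later slides. The name `boruvka_step` and the edge list (read from the figures) are assumptions.

```python
EDGES = [("b", "a", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4), ("b", "c", 5),
         ("b", "d", 6), ("a", "f", 7), ("c", "e", 8), ("e", "f", 9)]
VERTICES = "abcdef"

def boruvka_step(vertices, edges):
    """Select the lightest edge at every vertex, then contract the selected edges."""
    best = {}
    for e in edges:
        u, v, w = e
        for x in (u, v):
            if x not in best or w < best[x][2]:
                best[x] = e
    selected = sorted(set(best.values()), key=lambda e: e[2])

    parent = {v: v for v in vertices}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v, _ in selected:
        parent[find(u)] = find(v)
    comp = {v: find(v) for v in vertices}

    # Edges between different components survive; parallel edges are kept here.
    remaining = [(comp[u], comp[v], w) for u, v, w in edges if comp[u] != comp[v]]
    return selected, comp, remaining

selected, comp, remaining = boruvka_step(VERTICES, EDGES)
```

On the example, the selected edges have weights 1, 2, 4, 7, the components are {a, b, f} and {c, d, e}, and the surviving edges have weights 3, 5, 6, 9, matching the contracted figure.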
11
External-Memory Borůvka Step

For each vertex v, let C(v) be the other endpoint
of the lightest edge incident to v.
Let G' be the graph obtained by taking only the edges
of the form (v, C(v)), one for each v.
Let G'_d be the graph obtained by directing each
edge (v, C(v)) in G' from C(v) to v.
The goal is to contract each connected component
of G' into a single vertex.

12
Unique Representatives

In each connected component of G'_d:
  Each vertex has indegree 1.
  The weights of the edges along any root-leaf path
  are increasing.
  There is exactly one cycle; it has length 2 and
  consists of the two orientations of the
  component's minimum-weight edge.

13
External-Memory Borůvka Step (2)

The roots can be easily identified, and we can
choose them to be the unique representatives of
the components of G'.
We would like to replace each edge (u, v) with an
edge (u_r, v_r), where u_r and v_r are the unique
representatives of the components containing u
and v respectively.
Then we can remove parallel edges and self-loops, and
obtain the contracted graph.

14
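External-Memory Borůvka Step (3)
[Figure: the example graph G, the selected-edge graph G', and its directed version G'_d, together with the list L, the priority queue, and the output.]
L: (b,a) (1) (a,f) (7) (c,d) (2) (d,e) (4) (d,e) (4) (a,f) (7)
Priority queue contents, step by step:
  a (1) b, d (2) c
  d (2) c, f (7) b
  e (4) c, f (7) b
Output: b → b, c → c, a → b, d → c, f → b, e → c
The priority queue is initialized with each vertex that is an immediate
successor of a root vertex.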
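
The propagation of representatives can be sketched in memory as below, assuming the cycles have already been broken so that every component is a tree directed away from its root. A `heapq` heap stands in for the external priority queue, and a dictionary lookup of each vertex's children stands in for the scan of L, so this shows only the order of operations, not the I/O behaviour. The roots b and c and the tree edges are taken from the example; the function name is illustrative.

```python
import heapq
from collections import defaultdict

def propagate_representatives(roots, tree_edges):
    """tree_edges: (parent, child, weight), directed away from the root."""
    children = defaultdict(list)
    for p, c, w in tree_edges:
        children[p].append((w, c))
    rep = {r: r for r in roots}
    # Start with the immediate successors of the roots, keyed by edge weight.
    pq = [(w, c, r) for r in roots for w, c in children[r]]
    heapq.heapify(pq)
    order = []
    while pq:
        w, v, r = heapq.heappop(pq)
        rep[v] = r
        order.append((v, r))
        for w2, c in children[v]:
            heapq.heappush(pq, (w2, c, r))  # pass the representative down
    return rep, order

TREE = [("b", "a", 1), ("a", "f", 7), ("c", "d", 2), ("d", "e", 4)]
rep, order = propagate_representatives(["b", "c"], TREE)
```

Because edge weights increase along every root-leaf path, vertices leave the queue in increasing weight order: a, d, e, f in the example, with a and f mapped to b, and d and e mapped to c.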
15
External-Memory Borůvka Step (4)

To finish the contraction:
  Sort the output of the previous phase and E by
  the first component. Then scan the two lists
  simultaneously, replacing each edge (v, u) in E
  with (v_r, u).
  Sort E by the second component (the output stays
  sorted by vertex), and then scan the two lists,
  replacing each edge (v_r, u) in E with (v_r, u_r).
  Sort E by both components and by weight, and with
  a single scan remove parallel edges and self-loops.
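
The three sort-and-scan passes can be imitated in memory as follows. Python's `sorted` stands in for an external sort and a moving index for the simultaneous scan. Keeping only the lightest of several parallel edges is an assumption of this sketch; the edge list and the representative pairs are those of the example.

```python
EDGES = [("b", "a", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4), ("b", "c", 5),
         ("b", "d", 6), ("a", "f", 7), ("c", "e", 8), ("e", "f", 9)]
REPS = [("a", "b"), ("b", "b"), ("c", "c"), ("d", "c"), ("e", "c"), ("f", "b")]

def replace_endpoint(edges, reps, pos):
    """Sort edges by component `pos`, then merge-scan against the (vertex, rep) list."""
    edges = sorted(edges, key=lambda e: e[pos])
    reps = sorted(reps)
    out, j = [], 0
    for e in edges:
        while reps[j][0] < e[pos]:  # advance the rep list to the current vertex
            j += 1
        relabeled = list(e)
        relabeled[pos] = reps[j][1]
        out.append(tuple(relabeled))
    return out

def contract(edges, reps):
    edges = replace_endpoint(edges, reps, 0)
    edges = replace_endpoint(edges, reps, 1)
    # Sort by both components and weight; one scan drops self-loops and
    # all but the lightest edge of each parallel bundle.
    edges = sorted((min(u, v), max(u, v), w) for u, v, w in edges)
    out = []
    for u, v, w in edges:
        if u == v or (out and out[-1][:2] == (u, v)):
            continue
        out.append((u, v, w))
    return out

contracted = contract(EDGES, REPS)
```

For the example this leaves the single edge between the supervertices represented by b and c, with weight 3.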

16
Borůvka Step - I/O efficiency

1. The lightest incident edges can be collected in
O(E/B) I/Os by a simple scan of the edge-list
representation of G (we assume it is sorted by vertex).
2. Detection of cycles in G'_d can be done in
O(sort(V)) I/Os:
  Sort the collected edges by weight and find
  duplicates in a single scan.
  Remove edges to break cycles and identify the unique
  representatives.
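
A small sketch of these two steps, assuming the edge list stores every edge once per endpoint as (v, u, w) and is sorted by v. After sorting the selected edges by weight, the two copies of a cycle edge are adjacent, since weights are distinct. Choosing the smaller label as root is an arbitrary tie-break made for this sketch; any fixed rule works, and the slide figures use a different choice.

```python
from itertools import groupby

EDGES = [("b", "a", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4), ("b", "c", 5),
         ("b", "d", 6), ("a", "f", 7), ("c", "e", 8), ("e", "f", 9)]
# Each edge once per endpoint, sorted by first vertex.
EDGE_LIST = sorted([(u, v, w) for u, v, w in EDGES] + [(v, u, w) for u, v, w in EDGES])

def lightest_incident(edge_list):
    """One scan: for each vertex v return (v, C(v), w) for its lightest edge."""
    return [min(group, key=lambda e: e[2])
            for _, group in groupby(edge_list, key=lambda e: e[0])]

def break_cycles(selected):
    """Sort by weight; adjacent equal weights mark a 2-cycle. Returns roots and tree edges."""
    by_weight = sorted(selected, key=lambda e: e[2])
    roots, tree_edges, i = [], [], 0
    while i < len(by_weight):
        v, cv, w = by_weight[i]
        if i + 1 < len(by_weight) and by_weight[i + 1][2] == w:
            root, other = min(v, cv), max(v, cv)  # arbitrary tie-break (assumption)
            roots.append(root)
            tree_edges.append((root, other, w))   # keep root -> other, drop the reverse
            i += 2
        else:
            tree_edges.append((cv, v, w))         # directed from C(v) to v
            i += 1
    return roots, tree_edges

roots, tree_edges = break_cycles(lightest_incident(EDGE_LIST))
```

On the example the duplicates are the edges of weight 1 and 2, so the sketch picks a and c as roots and keeps four tree edges.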

17
Borůvka Step - I/O efficiency (2)

3. The list L contains each edge in G'_d at most
twice, and can be constructed in O(sort(V)) I/Os:
  Sort one copy of the list of edges by the
  second component.
  Sort another copy by the first component.
  Create the structure of L in a single scan and
  sort it by weight.
4. The PQ can be initialized in a similar way in
O(sort(V)) I/Os.

18
Borůvka Step - I/O efficiency (3)

5. We perform a total of V insertions into the PQ and
V extract-min operations. These can be performed
in O(sort(V)) I/Os.
6. Replacing the endpoints of the edges of G with
their unique representatives is done using a few
sorting and scanning operations, as described before.
Here the entire edge list is sorted, and thus
O(sort(E)) I/Os are needed.
Total:
O(E/B + sort(V) + sort(E)) = O(sort(E)) I/Os.

19
Results So Far
Modified Prim: O(V + sort(E)) I/Os
Modified Borůvka: O(sort(E) lg V) I/Os
Borůvka steps followed by Prim: O(sort(E) lg(VB/E)) I/Os
  Contract G until V ≤ E/B using Borůvka steps.
  Run Prim on the result.

It is possible to perform lg(VB/E) Borůvka
steps using lg lg(VB/E) superphases requiring
O(sort(E)) I/Os each.
20
Yet a better MST algorithm

Superphase Algorithm
At superphase i:
  Let $N_i = 2^{(3/2)^i}$ (so $N_{i+1} = N_i \cdot N_i^{1/2}$).
  Let $G_i = (V_i, E_i)$ be the graph prior to
  superphase i.
  Let $E_i' \subseteq E_i$ be the set that contains, for each
  vertex, the $\lceil \sqrt{N_i} \rceil$ lightest edges incident to it.
  Let the blocking value of a vertex be the weight
  of the $(\lceil \sqrt{N_i} \rceil + 1)$-th lightest edge incident to it
  (or infinity if no such edge exists).
  $E_i'$ and the blocking values can be found with
  $O(\mathrm{sort}(E_i))$ I/Os as described earlier.
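
To make the definitions concrete, the sketch below computes the light-edge set and the blocking values for superphase 0 on the example graph, where N_0 = 2 and each vertex keeps its 2 lightest edges. It assumes the edge list stores each edge once per endpoint, sorted by vertex and then by weight, so one scan suffices. The function names are illustrative.

```python
import math
from itertools import groupby

EDGES = [("b", "a", 1), ("c", "d", 2), ("a", "c", 3), ("d", "e", 4), ("b", "c", 5),
         ("b", "d", 6), ("a", "f", 7), ("c", "e", 8), ("e", "f", 9)]
EDGE_LIST = sorted([(u, v, w) for u, v, w in EDGES] + [(v, u, w) for u, v, w in EDGES],
                   key=lambda e: (e[0], e[2]))

def N(i):
    """N_i = 2^((3/2)^i)."""
    return 2 ** (1.5 ** i)

def light_edges_and_blocking(edge_list, i):
    """One scan over an edge list sorted by (vertex, weight)."""
    k = math.ceil(math.sqrt(N(i)))
    light, blocking = [], {}
    for v, group in groupby(edge_list, key=lambda e: e[0]):
        group = list(group)
        light.extend(group[:k])  # the k lightest edges at v
        # Weight of the (k+1)-th lightest edge, or infinity if there is none.
        blocking[v] = group[k][2] if len(group) > k else math.inf
    return light, blocking

light, blocking = light_edges_and_blocking(EDGE_LIST, 0)
```

Here vertex a keeps its edges of weight 1 and 3 and gets blocking value 7, while f has only two incident edges, so its blocking value is infinity.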

21
Superphase Algorithm

At superphase i, perform $\lceil \log \sqrt{N_i} \rceil$
contraction phases on the graph $(V_i, E_i')$ as described
before, but now select the lightest edge incident to a
vertex only if its weight is smaller than the vertex's
blocking value.
After a single contraction, the blocking value of
a supervertex is set to the minimum of the
blocking values of the contracted vertices.
After that, the remaining edges of $E_i'$ contain
all edges of $E_i$ incident to a supervertex v with
weight smaller than the blocking value of v.
Thus only edges that actually belong to the MST
are contracted.

22
Superphase Algorithm (2)

But how many vertices remain after each
superphase?
The blocking value might prevent us from
selecting an edge for a supervertex v. But if so, then:
  The blocking value of v equals the
  blocking value of some vertex u in $V_i$, and v must
  contain the $\lceil \sqrt{N_i} \rceil$ edges incident to u in $E_i'$.
  Thus v must be the contraction of at least $\sqrt{N_i}$
  vertices from $V_i$.
If no blocking value prevents us from selecting
an edge for v, then after $\lceil \log \sqrt{N_i} \rceil$ phases, v must
be the contraction of at least $2^{\log \sqrt{N_i}} = \sqrt{N_i}$
vertices.

23
Superphase Algorithm (3)

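It can be proved by induction on i that $|V_i| \le 2V / N_i$:
  For i = 0, $N_0 = 2$ and $|V_0| = V$.
  $|V_{i+1}| \le |V_i| / \sqrt{N_i} \le (2V / N_i) / \sqrt{N_i} = 2V / N_{i+1}$.
Conclusion: $|E_i'| \le |V_i| \lceil \sqrt{N_i} \rceil = O(V / \sqrt{N_i})$.
Thus, in order to reduce the number of vertices
by a factor of $\sqrt{N_i}$, we have used so far
$O(\mathrm{sort}(E_i) + \mathrm{sort}(E_i') \log \sqrt{N_i})$
$= O(\mathrm{sort}(E) + \mathrm{sort}(V / \sqrt{N_i}) \log \sqrt{N_i})$
$= O(\mathrm{sort}(E))$ I/Os.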

24
Superphase Algorithm (4)

In order to finish a superphase, we need to
reincorporate the edges of $E_i$ that were not selected into $E_i'$.
During the contraction phases, maintain a list C
of pairs $(v, v_s)$ for $v \in V_i$, where $v_s$ is the
supervertex containing v.
Use the output of the Borůvka step, as
described earlier, in order to update C:
  Sort C by second component and the output by
  first component, and scan them simultaneously.
  This is done using $O(\mathrm{sort}(V_i))$ I/Os.
In total, in order to maintain C, we use
$O(\mathrm{sort}(V_i) \log \sqrt{N_i}) = O(\mathrm{sort}(V / N_i) \log \sqrt{N_i})$
$= O(\mathrm{sort}(V))$ I/Os.

25
Superphase Algorithm I/O Efficiency

$E_i'$ and the blocking values are computed in
$O(\mathrm{sort}(E_i))$ I/Os.
Each superphase takes O(sort(E)) I/Os.
Maintaining the list C during the superphase is
done with O(sort(V)) I/Os.
Given C, the edges in $E_i \setminus E_i'$ can be
reincorporated in O(sort(E)) I/Os as we did in the
single contraction algorithm.
Finally, in order to reduce V to E/B,
$\log_{3/2} \lg(VB / E)$ superphases are needed.
Total: $O(\mathrm{sort}(E) \lg \lg(VB / E))$ I/Os.
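
As a rough numeric check of the gap between lg and lg lg, the sketch below counts plain Borůvka phases (each at least halves the vertex count) against superphases, using the bound of at most 2V / N_i vertices before superphase i with N_i = 2^((3/2)^i). The stopping rule 2V / N_i ≤ E/B and the sample values of V, E and B are assumptions chosen for illustration.

```python
import math

def boruvka_phases(V, E, B):
    """Halvings needed until V / 2^k <= E / B."""
    return max(0, math.ceil(math.log2(V * B / E)))

def superphases(V, E, B):
    """Smallest i with 2V / N_i <= E / B, working with lg N_i = (3/2)^i."""
    target = math.log2(2 * V * B / E)
    i, exponent = 0, 1.0  # exponent = lg N_i
    while exponent < target:
        exponent *= 1.5
        i += 1
    return i

# Hypothetical instance with VB/E = 2^20.
V, E, B = 2**10, 2**10, 2**20
phases = boruvka_phases(V, E, B)
supers = superphases(V, E, B)
```

For this instance the sketch reports 20 plain phases against 8 superphases; each of either kind costs O(sort(E)) I/Os.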