Title: Network Coding: A New Direction in Combinatorial Optimization
1. Network Coding: A New Direction in Combinatorial Optimization
2. Collaborators
- David Karger
- Robert Kleinberg
- April Rasala Lehman
- Kazuo Murota
UMass
3. Transportation Problems
Max Flow
4. Transportation Problems
5. Communication Problems
"A problem of inherent interest in the planning of large-scale communication, distribution and transportation networks also arises with the current rate structure for Bell System leased-line services."
- Robert Prim, 1957
Motivation for network design came largely from communication networks:
Spanning Tree
Steiner Forest
Steiner Tree
Multicommodity Buy-at-Bulk
Facility Location
Steiner Network
6. What is the capacity of a network?
(figure: network with sources s1, s2 and sinks t1, t2)
- Send items from s1→t1 and s2→t2
- Problem: no disjoint paths
7. An Information Network
(figure: the same network with sources s1, s2 and sinks t1, t2)
- If sending information, we can do better
- Send the XOR b1⊕b2 on the bottleneck edge
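The XOR trick on this slide can be checked in a few lines. A minimal sketch (the names b1, b2 follow the slide; each sink also hears the other source's bit over a side path, as in the butterfly):

```python
# Each sink receives the other source's bit on a side path, plus the
# coded message b1 ^ b2 sent on the bottleneck edge; XOR-ing the two
# cancels the unwanted bit. Check every input combination.
for b1 in (0, 1):
    for b2 in (0, 1):
        bottleneck = b1 ^ b2          # the coded bottleneck message
        decoded_t1 = bottleneck ^ b2  # t1 knows b2, recovers b1
        decoded_t2 = bottleneck ^ b1  # t2 knows b1, recovers b2
        assert decoded_t1 == b1 and decoded_t2 == b2
print("both sinks decode correctly")
```

With plain routing the bottleneck edge could carry only one of the two bits, which is exactly the "no disjoint paths" problem of the previous slide.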
8. Moral of the Butterfly
Transportation network capacity ≠ information network capacity
9. Understanding Network Capacity
- Information Theory
  - Deep analysis of simple channels (noise, interference, etc.)
  - Little understanding of network structures
- Combinatorial Optimization
  - Deep understanding of transportation problems on complex structures
  - Does not address information flow
- Network Coding
  - Combines ideas from both fields
10. Definition: Instance
- Graph G (directed or undirected)
- Capacity c_e on edge e
- k commodities, each with
  - a source s_i
  - a set of sinks T_i
  - a demand d_i
- Typically
  - all capacities c_e = 1
  - all demands d_i = 1
- Technicality
  - Always assume G is directed; replace each undirected edge with the directed gadget shown (figure omitted).
11. Definition: Solution
- An alphabet Σ(e) for messages on edge e
- A function f_e for each edge s.t.
  - Causality: Edge (u,v) sends information previously received at u.
  - Correctness: Each sink t_i can decode the data from source s_i.
12. Multicast
13. Multicast
- Graph is a DAG
- 1 source, k sinks
- Source has r messages in alphabet Σ
- Each sink wants all messages
Thm [ACLY00]: A network coding solution exists iff the connectivity from the source to each sink is ≥ r.
14. Multicast Example
(figure: source s sends messages m1, m2; sinks t1, t2 each receive both)
15. Linear Network Codes
- Treat the alphabet Σ as a finite field
- Each node outputs linear combinations of its inputs
- Thm [LYC03]: Linear codes are sufficient for multicast
16. Multicast Code Construction
- Thm [HKMK03]: Random linear codes work (over a large enough field)
- Thm [JS03]: Deterministic algorithm to construct codes
- Thm [HKM05]: Deterministic algorithm to construct codes (general algebraic approach)
17. Random Coding Solution
- Randomly choose the coding coefficients
- Each sink receives linear combinations of the source messages
- If connectivity ≥ r, the linear combinations have full rank
  - ⇒ the sink can decode!
- Without coding, the problem is Steiner Tree Packing (hard!)
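The random-coding argument can be simulated directly. In the sketch below, the field size (101) and the rank routine are illustrative choices, not from the talk: a sink with connectivity r sees r random linear combinations of the r source messages, and decoding succeeds exactly when the coefficient matrix is invertible.

```python
import random

random.seed(0)
P = 101  # a small prime; "large enough field" in the theorem's sense is an assumption here

def rank_mod_p(mat, p=P):
    """Rank of an integer matrix over GF(p), by Gaussian elimination."""
    mat = [[x % p for x in row] for row in mat]
    rank = 0
    for c in range(len(mat[0])):
        piv = next((r for r in range(rank, len(mat)) if mat[r][c]), None)
        if piv is None:
            continue
        mat[rank], mat[piv] = mat[piv], mat[rank]
        inv = pow(mat[rank][c], p - 2, p)  # inverse via Fermat's little theorem
        mat[rank] = [x * inv % p for x in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][c]:
                f = mat[r][c]
                mat[r] = [(x - f * y) % p for x, y in zip(mat[r], mat[rank])]
        rank += 1
    return rank

# Count how often a random r x r coefficient matrix over GF(P) is invertible,
# i.e. how often a sink with connectivity r could decode.
r, trials = 3, 1000
decodable = sum(
    rank_mod_p([[random.randrange(P) for _ in range(r)] for _ in range(r)]) == r
    for _ in range(trials)
)
print(f"{decodable} of {trials} random coefficient matrices were invertible")
```

The success fraction approaches 1 as the field grows, matching the "large enough field" caveat of [HKMK03].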
18. Our Algorithm
- Derandomization of the HKMK algorithm
- Technique: max-rank completion of mixed matrices
  - A mixed matrix contains both numbers and variables
  - A completion is a choice of values for the variables that maximizes the rank.
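As a toy illustration of this terminology (not the deterministic algorithm itself, which is far more efficient), the sketch below finds a max-rank completion of a small mixed matrix by exhaustive search over a small prime field; the matrices and field size are invented for the example.

```python
from itertools import product

P = 7  # small prime field, chosen only so the exhaustive search stays tiny

def rank_mod_p(mat, p=P):
    """Rank of an integer matrix over GF(p), by Gaussian elimination."""
    mat = [[x % p for x in row] for row in mat]
    rank = 0
    for c in range(len(mat[0])):
        piv = next((r for r in range(rank, len(mat)) if mat[r][c]), None)
        if piv is None:
            continue
        mat[rank], mat[piv] = mat[piv], mat[rank]
        inv = pow(mat[rank][c], p - 2, p)
        mat[rank] = [x * inv % p for x in mat[rank]]
        for r in range(len(mat)):
            if r != rank and mat[r][c]:
                f = mat[r][c]
                mat[r] = [(x - f * y) % p for x, y in zip(mat[r], mat[rank])]
        rank += 1
    return rank

def max_rank_completion(mixed, nvars):
    """Entries are ints or ('x', i) placeholders for variable i.
    Try every assignment of field values to the variables and keep
    one that maximizes the rank of the resulting numeric matrix."""
    best = (-1, None)
    for vals in product(range(P), repeat=nvars):
        filled = [[e if isinstance(e, int) else vals[e[1]] for e in row]
                  for row in mixed]
        best = max(best, (rank_mod_p(filled), vals))
    return best

# [[1, x0], [x1, 1]] is completable to full rank (any x0*x1 != 1 works),
# while [[x0, x0], [1, 1]] is singular for every choice of x0.
full = max_rank_completion([[1, ('x', 0)], [('x', 1), 1]], 2)
sing = max_rank_completion([[('x', 0), ('x', 0)], [1, 1]], 1)
print(full[0], sing[0])
```

The second example shows why completion is subtle: repeated variables create algebraic dependencies that no choice of values can break.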
19. k-Pairs Problem (aka Multiple Unicast Sessions)
20. k-pairs problem
- Network coding when each commodity has one sink
- Analogous to multicommodity flow
- Goal: compute the max concurrent rate
- This is an open question
21. Rate
- Each edge has its own alphabet Σ(e) of messages
- Rate = min over commodities i of log |Σ(S(i))|, where S(i) is the source of commodity i
- NCR = sup over coding solutions of the rate
- Observation: If there is a fractional flow with rational coefficients achieving rate r, there is a network coding solution achieving rate r.
22. Directed k-pairs
(figure: butterfly graph with sources s1, s2 and sinks t1, t2)
- Network coding rate can be much larger than flow rate!
- Butterfly graph: network coding rate (NCR) = 1, flow rate = ½
- Thm [HKL04, LL04]: ∃ graphs G(V,E) where NCR = Ω(flow rate · |V|)
- Thm [HKL05]: ∃ graphs G(V,E) where NCR = Ω(flow rate · |E|)
23. NCR / Flow Gap
G(1): NCR = 1, flow rate = ½
(figures: G(1) with sources s1, s2 and sinks t1, t2; the network coding solution uses edge capacity 1, the flow solution only edge capacity ½)
24. NCR / Flow Gap
(figure: G(2) with sources s1..s4 and sinks t1..t4)
- Start with two copies of G(1)
25. NCR / Flow Gap
(figure: G(2) with sources s1..s4 and sinks t1..t4)
- Replace the middle edges with a copy of G(1)
26. NCR / Flow Gap
(figure: the finished G(2), with G(1) nested inside)
27. NCR / Flow Gap
(figure: G(n), built from G(n-1), with sources s1..s2n and sinks t1..t2n)
- commodities = 2n, |V| = O(2^n), |E| = O(2^n)
- NCR = 1, flow rate = 2^(-n)
28. Optimality
- The graph G(n) proves Thm [HKL05]: ∃ graphs G(V,E) where NCR = Ω(flow rate · |E|)
- G(n) is optimal: Thm [HKL05]: ∀ graphs G(V,E), NCR/flow rate = O(min{|V|, |E|, k})
29. Network flow vs. information flow
- Multicommodity Flow
  - Efficient algorithms for computing maximum concurrent (fractional) flow.
  - Connected with metric embeddings via LP duality.
  - Approximate max-flow min-cut theorems.
- Network Coding
  - Computing the max concurrent network coding rate may be undecidable; it is not known to be decidable, let alone decidable in poly-time.
  - No adequate duality theory.
  - No cut-based parameter is known to give a sublinear approximation in digraphs.
No known undirected instance where network coding rate ≠ max flow! (The undirected k-pairs conjecture.)
30. Why not obviously decidable?
- How large should the alphabet be?
- Thm [LL05]: There exist networks where any max-rate solution requires a large alphabet.
- Moreover, rate does not increase monotonically with alphabet size!
- There is no such thing as a "large enough" alphabet.
31. Approximate max-flow / min-cut?
- The value of the sparsest cut is
  - an O(log n)-approximation to max-flow in undirected graphs [AR98, LLR95, LR99]
  - an O(√n)-approximation to max-flow in directed graphs [CKR01, G03, HR05]
  - not even a valid upper bound on the network coding rate in directed graphs!
(figure: edge e has capacity 1 and separates 2 commodities, i.e. sparsity is ½; yet the network coding rate is 1)
32. Approximate max-flow / min-cut?
- The value of the sparsest cut induced by a vertex partition is a valid upper bound, but can exceed the network coding rate by a factor of Ω(n).
- We next present a cut parameter which may be a better approximation.
(figure: sources s_i, s_j and sinks t_i, t_j)
33. Informational Dominance
- Definition: A dominates e if, for every network coding solution, the messages sent on the edges of A uniquely determine the message sent on e.
- Given A and e, how hard is it to determine whether A dominates e? Is it even decidable?
- Theorem [HKL05]: There is a combinatorial characterization of informational dominance. Also, there is an algorithm to compute whether A dominates e in time O(k²m).
34. Informational Dominance
- Def: A dominates B if the information in A determines the information in B in every network coding solution.
(figure: sources s1, s2 and sinks t1, t2; here A does not dominate B)
35. Informational Dominance
- Def: A dominates B if the information in A determines the information in B in every network coding solution.
(figure: sources s1, s2 and sinks t1, t2; here A dominates B)
Sufficient condition: if no path from any source reaches B once A is removed, then A dominates B (not a necessary condition).
36. Informational Dominance Example
- Obviously flow rate = NCR = 1
- How to prove it? Markovicity?
- No two edges disconnect t1 and t2 from both sources!
37. Informational Dominance Example
(figure: cut A separating sources s1, s2 from sinks t1, t2)
- Our characterization implies that A dominates {t1, t2} ⇒ H(A) ≥ H(t1, t2)
38. Informational Meagerness
- Def: Edge set A informationally isolates commodity set P if A dominates the sources and sinks of P.
- iM(G) = min over pairs (A, P), with P informationally isolated by A, of c(A)/d(P)
- Claim: network coding rate ≤ iM(G).
39-40. Approximate max-flow / min-cut?
- Informational meagerness is no better than an O(log n)-approximation to the network coding rate, due to a family of instances called the iterated split butterfly.
- On the other hand, we don't even know if it is an o(n)-approximation in general.
- And we don't know if there is a polynomial-time algorithm to compute an o(n)-approximation to the network coding rate in directed graphs.
41. Sparsity Summary
- Directed graphs: Flow Rate ≤ Sparsity < NCR ≤ iM(G) in some graphs
- Undirected graphs: Flow Rate ≤ NCR ≤ Sparsity; the flow-sparsity gap can be Ω(log n) when G is an expander
42. Undirected k-Pairs Conjecture
Flow Rate ≤ NCR ≤ Sparsity
- NCR ≤ Sparsity: unknown until this work
- Flow Rate = NCR: the undirected k-pairs conjecture
43. The Okamura-Seymour Graph
(figure: a cut in the graph)
Every edge cut has enough capacity to carry the combined demand of all commodities separated by the cut.
44. Okamura-Seymour Max-Flow
Flow rate = 3/4
Each s_i is 2 hops from t_i, so at flow rate r each of the 4 commodities consumes ≥ 2r units of bandwidth in a graph with only 6 units of capacity; hence 8r ≤ 6, i.e. r ≤ 3/4.
45. The trouble with information flow
- If an edge combines messages from multiple sources, which commodities get charged for consuming bandwidth?
- We present a way around this obstacle and bound NCR by 3/4.
(figure: bipartite graph with pairs s1-t3, s2-t1, s3-t2, s4-t4)
At flow rate r, each commodity consumes at least 2r units of bandwidth in a graph with only 6 units of capacity.
46. Okamura-Seymour Proof
Thm [AHJKL05]: flow rate = NCR = 3/4.
- We will prove Thm [HKL05]: NCR ≤ 6/7 < Sparsity.
- The proof uses properties of entropy:
  - Monotonicity: A ⊆ B ⇒ H(A) ≤ H(B)
  - Submodularity: H(A) + H(B) ≥ H(A∪B) + H(A∩B)
- Lemma (Cut Bound): For a cut A ⊆ E, H(A) = H(A, sources separated by A).
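The two entropy facts used here can be sanity-checked numerically. This sketch uses an invented toy joint distribution (three correlated bits) and verifies monotonicity and submodularity of Shannon entropy on it:

```python
from math import log2

# Toy joint distribution: X, Y uniform independent bits, Z = X xor Y.
outcomes = {(x, y, x ^ y): 0.25 for x in (0, 1) for y in (0, 1)}

def H(coords):
    """Shannon entropy (bits) of the marginal on the given coordinate set."""
    marg = {}
    for o, pr in outcomes.items():
        key = tuple(o[i] for i in sorted(coords))
        marg[key] = marg.get(key, 0.0) + pr
    return -sum(pr * log2(pr) for pr in marg.values() if pr > 0)

# Monotonicity: {X} is a subset of {X, Y}, so H(X) <= H(X, Y).
assert H({0}) <= H({0, 1}) + 1e-9

# Submodularity with A = {X, Z}, B = {Y, Z}:
# H(A) + H(B) >= H(A union B) + H(A intersect B).
lhs = H({0, 2}) + H({1, 2})
rhs = H({0, 1, 2}) + H({2})
assert lhs >= rhs - 1e-9
print(f"H(X,Z) + H(Y,Z) = {lhs:.1f} >= H(X,Y,Z) + H(Z) = {rhs:.1f}")
```

The XOR distribution is the interesting case: any two of the three variables determine the third, which is exactly the kind of dependence the cut-bound argument exploits.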
47.
- H(A) = H(A, s1, s2, s4) (Cut Bound)
(figure: cut A)
48.
- H(B) = H(B, s1, s2, s4) (Cut Bound)
(figure: cut B)
49.
- Add the inequalities:
  - H(A) + H(B) = H(A, s1, s2, s4) + H(B, s1, s2, s4)
- Apply submodularity:
  - H(A) + H(B) ≥ H(A∪B, s1, s2, s4) + H(s1, s2, s4)
- Note A∪B separates s3 (Cut Bound):
  - ⇒ H(A∪B, s1, s2, s4) ≥ H(s1, s2, s3, s4)
- Conclude:
  - H(A) + H(B) ≥ H(s1, s2, s3, s4) + H(s1, s2, s4)
- 6 edges carry the rate of 7 sources ⇒ rate ≤ 6/7.
50-55. Rate ¾ for Okamura-Seymour
(figures: the cut bound and submodularity applied step by step around the bipartite graph with pairs s1-t3, s2-t1, s3-t2, s4-t4)
Combining the inequalities: 3·H(source) + 6·H(undirected edge) ≥ 11·H(source), so 6·H(undirected edge) ≥ 8·H(source), i.e. rate ≤ 6/8 = ¾.
56. Special Bipartite Graphs
- This proof generalizes to show that max-flow = NCR for every instance which
  - is bipartite,
  - has every source 2 hops away from its sink, and
  - has a flow LP whose dual is optimized by assigning length 1 to all edges.
(figure: the Okamura-Seymour pairs s1-t3, s2-t1, s3-t2, s4-t4)
57. The k-pairs conjecture and I/O complexity
- In the I/O complexity model [AV88], one has
  - a large, slow external memory consisting of pages, each containing p records;
  - a fast internal memory that holds O(1) pages (for concreteness, say 2);
  - a basic I/O operation: read in two pages from external memory, write out one page.
58. I/O Complexity of Matrix Transposition
- Matrix transposition: Given a p×p matrix of records in row-major order, write it out in column-major order.
- The obvious algorithm requires O(p²) ops.
- A better algorithm uses O(p log p) ops.
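One way to realize the O(p log p) bound is a shuffle-style schedule: in round k, swap bit k of the page index with bit k of the in-page offset, so after log p rounds page j holds column j. This is a sketch under the model's assumptions (p a power of two; each basic op reads two pages and writes one), not necessarily the exact algorithm the talk has in mind:

```python
def transpose_io(pages):
    """Transpose a p x p matrix, stored as p row-pages of p records each,
    in the I/O model: every basic op reads two pages into fast memory and
    writes one page out. Round k swaps bit k of the page index with bit k
    of the in-page offset. Returns (column-pages, number of basic ops)."""
    p = len(pages)                            # assumed a power of two
    ops = 0
    for k in range(p.bit_length() - 1):       # log2(p) rounds
        bit = 1 << k
        new_pages = [None] * p
        for a in range(p):
            if a & bit:
                continue                      # handle each pair (a, a|bit) once
            b = a | bit
            pa, pb = pages[a], pages[b]       # the two pages in fast memory
            out_a, out_b = [None] * p, [None] * p
            for off in range(p):
                if off & bit:                 # offset-bit k = 1 -> new page-bit k = 1
                    out_b[off & ~bit] = pa[off]
                    out_b[off] = pb[off]
                else:                         # offset-bit k = 0 -> new page-bit k = 0
                    out_a[off] = pa[off]
                    out_a[off | bit] = pb[off]
            ops += 2                          # one basic op per output page
            new_pages[a], new_pages[b] = out_a, out_b
        pages = new_pages
    return pages, ops

p = 8
mat = [[i * p + j for j in range(p)] for i in range(p)]
cols, ops = transpose_io([row[:] for row in mat])
assert all(cols[j][i] == mat[i][j] for i in range(p) for j in range(p))
print(f"transposed a {p}x{p} matrix with {ops} ops (p log p = {p * 3})")
```

Each round performs p basic ops (p/2 page pairs, two writes each), giving p·log p in total, versus the O(p²) of writing each output page by scanning all input pages.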
59-62. I/O Complexity of Matrix Transposition
(figures: the same statement as slide 58, with the transposition instance drawn step by step as a network with sources s1..s4 and sinks t1..t4)
63. I/O Complexity of Matrix Transposition
- Theorem (Floyd '72, AV88): If a matrix transposition algorithm performs only read and write operations (no bitwise operations on records), then it must perform Ω(p log p) I/O operations.
(figure: sources s1..s4 and sinks t1..t4)
64. I/O Complexity of Matrix Transposition
- Proof: Let N_ij denote the number of ops in which record (i,j) is written. For all j,
  - Σ_i N_ij ≥ p log p.
- Hence
  - Σ_ij N_ij ≥ p² log p.
- Each I/O writes only p records. QED.
(figure: sources s1..s4 and sinks t1..t4)
65. The k-pairs conjecture and I/O complexity
- Definition: An oblivious algorithm is one whose pattern of read/write operations does not depend on the input.
- Theorem: If there is an oblivious algorithm for matrix transposition using o(p log p) I/O ops, then the undirected k-pairs conjecture is false.
(figure: sources s1..s4 and sinks t1..t4)
66. The k-pairs conjecture and I/O complexity
- Proof:
  - Represent the algorithm with a diagram as before.
  - Assume WLOG that each node has only two outgoing edges.
(figure: pages p1, p2 feeding an op q; sources s1..s4 and sinks t1..t4)
67. The k-pairs conjecture and I/O complexity
- Proof:
  - Represent the algorithm with a diagram as before.
  - Assume WLOG that each node has only two outgoing edges.
  - Make all edges undirected, with capacity p.
  - Create a commodity for each matrix entry.
(figure: pages p1, p2 feeding an op q; sources s1..s4 and sinks t1..t4)
68. The k-pairs conjecture and I/O complexity
- Proof:
  - The algorithm itself is a network code of rate 1.
  - Assuming the k-pairs conjecture, there is a flow of rate 1.
  - Σ_{i,j} d(s_i, t_j) ≤ p · |E(G)|.
  - Arguing as before, the LHS is Ω(p² log p).
  - Hence |E(G)| = Ω(p log p).
(figure: pages p1, p2 feeding an op q; sources s1..s4 and sinks t1..t4)
69. Other consequences for complexity
- The undirected k-pairs conjecture implies:
  - An Ω(p log p) lower bound for matrix transposition in the cell-probe model.
    - Same proof.
  - An Ω(p² log p) lower bound for the running time of oblivious matrix transposition algorithms on a multi-tape Turing machine.
    - The I/O model can emulate multi-tape Turing machines with a factor-p speedup.
70. Open Problems
- Computing the network coding rate in DAGs
  - Recursively decidable?
  - How do you compute an o(n)-factor approximation?
- Undirected k-pairs conjecture: Does flow rate = NCR?
  - At least prove an Ω(log n) gap between sparsest cut and network coding rate for some graphs.
71. Summary
- Information ≠ Transportation
- For multicast, NCR = min cut
  - Algorithms to find solutions
- k-pairs
  - Directed: NCR ≫ flow rate
  - Undirected: flow rate = NCR in the O-S graph
  - Informational dominance