Title: Algorithms for Concave Cost Network Flow Problems
1 Algorithms for Concave Cost Network Flow Problems
- Kamesh Munagala
- Stanford University
2 Talk Outline
- Motivation via simple example
- Concave cost flow problem
- Formal problem statement
- Simple Randomized Algorithm
- Special Cases
- Motivation from networking problems
- Our results
- The buy-at-bulk algorithm
3 Cost Structures in Network Design
4 Warehouse Location
5 Decision Problem
- Costs
- Opening and operating warehouse
- Shipping demand
- Tradeoff: lots of warehouses implies low shipping cost
- Optimize: linear combination of costs
- Decisions
- How many warehouses to open
- Where to open warehouses
- How to ship to outlets
6 Warehouse Cost
- Minimum fixed cost for operating warehouse
- Additional cost depending on storage capacity needed
- Per-unit cost typically reduces as capacity increases
- Example: staff does not double with doubling capacity
7 Transportation Cost
- Linear in distance to outlet
- Linear in load transported to outlet
- Minimum fixed cost for one truck
8 Features of Cost Structure
- Economies of Scale
- More capacity is cheaper per unit demand
- Applies to warehouse costs
- Discreteness in quantity
- Cannot purchase arbitrarily small capacity
- Applies to warehouse and transportation costs
- General phenomenon in network design
- Costs of caches, routers and cables obey these properties
9 Modeling Allocation Costs
- Cost is a non-decreasing, concave function of demand serviced
[Figure: concave cost curve with origin at 0; x-axis: Demand, y-axis: Cost]
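One consequence of this cost shape, used whenever demands are merged later in the talk, is that serving two demands together never costs more than serving them separately. A short derivation, under the added assumption that f(0) >= 0 (the slide only states non-decreasing and concave) and x, y > 0:

```latex
\begin{align*}
f(x) &\ge \tfrac{x}{x+y}\, f(x+y) + \tfrac{y}{x+y}\, f(0) \ge \tfrac{x}{x+y}\, f(x+y), \\
f(y) &\ge \tfrac{y}{x+y}\, f(x+y), \\
f(x) + f(y) &\ge f(x+y).
\end{align*}
```

The first line is concavity applied to the points x+y and 0 with weight x/(x+y); adding the first two lines gives the third.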
10 Concave Cost Flow Problem
11 Concave Cost Flow Problem
- Given
- Undirected network
- Cost on edges
- Concave function of demand
- Many demand nodes
- Distinguished sink node
- Compute
- Minimum cost flow
[Figure: network with source nodes and a sink node]
12 Facility Location
Warehouse cost f(i)
Warehouse i
Transportation cost c(i,j)
Outlet j, demand d(j)
Optimize: Σ c(i,j) d(j) + Σ f(i)
13 Modeling Facility Location
[Figure: flow network with labels Sink, f(i), warehouse i, c(i,j), demand d(j)]
Optimize: Σ c(i,j) d(j) + Σ f(i)
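The modeling step can be checked numerically. The sketch below is a minimal illustration, not the author's code. It assumes the flow network has one fixed-charge edge from each warehouse i to the sink (cost f(i) as soon as any flow uses it) and linear edges from outlets to warehouses (cost c(i,j) per unit of demand), with all demands positive. The function names and the toy data are hypothetical.

```python
def facility_cost(open_cost, ship_cost, demand, assign):
    """Facility location objective: sum of c(i,j) d(j) plus f(i) for each used warehouse."""
    shipping = sum(ship_cost[assign[j]][j] * demand[j] for j in demand)
    opening = sum(open_cost[i] for i in set(assign.values()))
    return shipping + opening


def flow_cost(open_cost, ship_cost, demand, assign):
    """Cost of the same solution viewed as a flow from outlets to the sink."""
    load = {}  # total flow on each warehouse-to-sink edge
    total = 0.0
    for j, i in assign.items():
        total += ship_cost[i][j] * demand[j]  # linear outlet-to-warehouse edge
        load[i] = load.get(i, 0.0) + demand[j]
    # Concave fixed-charge edge: pay f(i) once if any flow passes through i.
    total += sum(open_cost[i] for i, x in load.items() if x > 0)
    return total


# Hypothetical toy instance: two warehouses, two outlets.
f = {"w1": 10.0, "w2": 4.0}
c = {"w1": {"a": 1.0, "b": 2.0}, "w2": {"a": 3.0, "b": 1.0}}
d = {"a": 2.0, "b": 5.0}
one_warehouse = {"a": "w2", "b": "w2"}
two_warehouses = {"a": "w1", "b": "w2"}
```

On this instance, opening only `w2` costs 15 while opening both costs 21, and the two functions agree on either assignment.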
14 Solution
[Figure: solution network; label: Sink]
15 Other Special Cases
- Steiner Trees
- Probabilistic Steiner Trees [KM00]
- Multilevel Facility Location
- Buy-at-Bulk Network Design [SCRS97]
- Applications in network design
- Multicast tree design
- Hierarchical placement of caches and routers
- Placement of web content in caches
- Buying cables to provision bandwidth
16 Hardness of the Flow Problem
- Facility location is NP-Hard
- Steiner Tree Problem
- Fixed cost for using edge
- NP-Hard [Karp 1972]
- Approximation algorithms
- Provably close to optimal solution on all instances
- Example: Cost ≤ 5 × OPT
- Polynomial running time
[Figure: fixed-charge edge cost with levels 0 and 1; x-axis: Flow, y-axis: Cost]
[Figure: example network with a sink; edge labels 3, 1, 2, 1; Flow 1; Cost 5]
17 Previous Results
- Operations Research
- Uncapacitated Fixed Charge Problem
- Magnanti, Mireault, Wong. 1986
- Hochbaum, Segev. 1989
- Ortega, Wolsey. 2000
- No approximation algorithms known for this problem
18 Our Result
- Logarithmic approximation
- Meyerson, Munagala, Plotkin. 2000
- Properties of our algorithm
- Simple to implement
- Uses shortest path and greedy matching computations
- Efficient in practice
- Approximation ratio much better on real data
- Subsequent Results
- Best approximation result to date
- De-randomization: Chekuri, Khanna, Naor. 2001
- Best hardness: 1.47. Guha, Khuller. 1998
19 Basic Algorithm
- Merging demand reduces cost
- For every pair (u,v) compute min cost path in graph to send demand from u to v or vice versa
- Let this be cost of (u,v) edge
- Compute min cost matching in this complete graph
- Pair demands using this matching
- Choose one node in pair as center and send demand to it
- Number of demand nodes halves
- Repeat logarithmically many times
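The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithm as analyzed in the talk. Each edge cost is modeled as `fixed + per_unit * flow` (one simple concave shape). A greedy matching stands in for an exact min cost matching. The center of each pair is picked at random with probability proportional to demand, which is an assumption; the slide only says to choose one node. The returned value charges each path separately, so it is an upper bound on the cost of the resulting flow. The names `shortest_path_cost` and `aggregate` are hypothetical, and the sink is assumed not to be a demand node.

```python
import heapq
import random


def shortest_path_cost(adj, src, demand):
    """Dijkstra from src, where an edge (fixed, per_unit) costs fixed + per_unit * demand."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, fixed, per_unit in adj[u]:
            nd = d + fixed + per_unit * demand
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def aggregate(adj, demands, sink, rng=None):
    """Repeatedly pair up demand nodes and merge each pair; return total charged cost."""
    rng = rng or random.Random(0)
    demands = dict(demands)
    total = 0.0
    while len(demands) > 1:
        nodes = sorted(demands)
        dist = {u: shortest_path_cost(adj, u, demands[u]) for u in nodes}
        # Cost of pair (u, v): cheaper of sending u's demand to v or v's demand to u.
        pairs = sorted(
            (min(dist[u][v], dist[v][u]), u, v)
            for i, u in enumerate(nodes)
            for v in nodes[i + 1:]
        )
        matched = set()
        for _, u, v in pairs:  # greedy matching: cheapest available pair first
            if u in matched or v in matched:
                continue
            matched.update((u, v))
            p_u = demands[u] / (demands[u] + demands[v])
            center, other = (u, v) if rng.random() < p_u else (v, u)
            total += dist[other][center]  # send other's demand to the center
            demands[center] += demands.pop(other)
    (last, d), = demands.items()
    return total + shortest_path_cost(adj, last, d)[sink]
```

For example, on a path a - b - s with unit fixed cost per edge and unit demand at a and b, the call returns 2.0 if b is chosen as center and 3.0 if a is chosen.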
20 Proof Idea
- The optimal solution encodes a matching of nodes
- Implies cost of matching at most cost of optimal solution
- Marathe et al. 1998
[Figure: matching in OPT's solution]
21 Problem too hard?
- Which node is cheaper to route to depends on demand being routed
- Hard to make decisions about merging a whole group of nodes
- Not enough structure in solution
- Except for the fact that it encodes a matching
- Best hardness result known is only 1.47
- Guha, Khuller. 1998
22 Special Cases of Concave Cost Flow
23 Facility Location
Warehouse cost f(i)
Warehouse i
Transportation cost c(i,j)
Outlet j, demand d(j)
Optimize: Σ c(i,j) d(j) + Σ f(i)
24 Previous Results
- Operations Research
- Kuehn, Hamburger. 1963
- Cornuejols, Fisher, Nemhauser. 1977
- Approximation Algorithms
- Guha, Khuller. 1998 (lower bound 1.47)
- Mahdian, Ye, Zhang. 2002 (1.52 approx)
- Fast combinatorial algorithms known [CG99, JV99, AGKMMP01]
- Applications
- Centroid based clustering
- Placement of caches and replicated data objects
- Minimize latency of user access
25 Our Result
- Novel variant of facility location
- Each facility needs to satisfy minimum amount of demand
- Load Balanced facility location
- Constant factor approximation algorithm [KM00, GMM00]
- Reduction to classical facility location
- Applications
- Subroutine in concave cost flow algorithms
- Solving clustering variants [GM02]
- Favor either large or small cluster sizes
26 Multilevel Facility Location
[Figure: 2-level warehouse location. Production units k with cost g(k); warehouses i with cost f(i); outlets j with demand d(j); transportation costs c(k,i) and c(i,j)]
27 Previous Results
- Problem formulation
- Kaufman, vanden Eede, Hansen. 1977
- Factor 3 approximation
- Aardal, Chudak, Shmoys. 1999
- Exponential size linear program
- Can be solved using Ellipsoid algorithm
- Very inefficient in practice
- Application in networks
- Hierarchical placement of caches, switches and routers
28 Modeling as a Flow Problem
[Figure: two copies of the network, with labels Outlets, c(i,j), i, f(i), k, g(k), Sink]
Route flow from outlets to the sink node
29 Our Results
- Simple combinatorial algorithm
- 9 approximation [GMM00]
- Reduce to classical facility location
- Can now use very efficient algorithms
- Subsequent results
- 3.27 approximation
- Ageev, Ye, Zhang. 2002
- Combinatorial algorithm
30 Buy-at-bulk Network Design
- Provisioning cables to route data to core network
- Bandwidth costs obey economies of scale
- Cable types
- T1: 1.5 Mbps, 30/mile (20/Mbps/mile)
- T3: 44 Mbps, 440/mile (10/Mbps/mile)
- Cost of cables is a concave function
- Metrical special case
- Cost of bandwidth same per unit length everywhere
- Concave function same per unit length on all edges
- Salman, Cheriyan, Ravi, Subramanian. 1997
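To make the economies of scale concrete, the sketch below computes the cheapest per-mile cost of provisioning a given bandwidth from the two cable types on the slide. It assumes that cables can only be bought in whole units and freely combined, which is an assumption, not something the slide states. The function name `cost_per_mile` is hypothetical. The resulting cost is a staircase rather than a smooth curve, but it is subadditive, which is the property that makes merging demands worthwhile.

```python
import math

# (capacity in Mbps, cost per mile), taken from the slide.
T1 = (1.5, 30.0)
T3 = (44.0, 440.0)


def cost_per_mile(bandwidth):
    """Cheapest combination of whole T1 and T3 cables carrying `bandwidth` Mbps."""
    if bandwidth <= 0:
        return 0.0
    best = float("inf")
    max_t3 = math.ceil(bandwidth / T3[0])
    for n3 in range(max_t3 + 1):
        rest = max(0.0, bandwidth - n3 * T3[0])  # bandwidth left for T1 cables
        n1 = math.ceil(rest / T1[0])
        best = min(best, n3 * T3[1] + n1 * T1[1])
    return best
```

With these numbers, 21 Mbps is cheapest on fourteen T1 cables (420 per mile), while 22 Mbps is already cheaper on a single T3 cable (440 per mile, against 450 for fifteen T1 cables).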
31 Why is this problem simpler?
- Notion of close-by
- If dist(a,b) < dist(a,c)
- Cheaper to transport demand from a to b than to c
- Independent of demand transported
- Natural algorithm
- Merge close-by demands together
- Cheaper to transport this merged demand to a far away place
- General concave cost flow
- Closeness is a function of demand transported
32 Recursive Metric Partitioning
- Just focus on the metric space
- Ignore the cost function completely
- Recursively partition graph based on closeness (randomized)
- Partitions have smaller diameter than original graph
- [Bartal96, Bartal98, CCGG98, CCGGP98]
- Nodes in different partitions far away from each other w.h.p.
- For each partition, have a center node
- Collect all demand within a partition at center node
- Send this demand to the center of the parent of this partition
- Awerbuch, Azar. 1997
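A rough sketch of this top-down scheme follows. It is a simplified ball-carving partition, not the cited constructions, and it makes no claim about their probabilistic separation guarantees. It only ensures that each child cluster has diameter at most half of its parent's bound. It assumes distinct points in a metric given by a `dist` function and a caller-supplied upper bound on the diameter. All names are hypothetical. Placing the sink first in `points` makes it the center of the root.

```python
import random


def carve(points, dist, diameter, rng):
    """Split points into balls of radius at most diameter/4 around random centers."""
    remaining = list(points)
    rng.shuffle(remaining)
    clusters = []
    while remaining:
        center = remaining[0]
        radius = rng.uniform(diameter / 8, diameter / 4)
        ball = [p for p in remaining if dist(center, p) <= radius]
        clusters.append(ball)  # ball[0] is the center
        remaining = [p for p in remaining if dist(center, p) > radius]
    return clusters


def build_tree(points, dist, diameter, rng):
    """Recursive partition; each node stores its center, diameter bound and children."""
    node = {"center": points[0], "points": list(points),
            "diameter": diameter, "children": []}
    if len(points) > 1:
        for ball in carve(points, dist, diameter, rng):
            node["children"].append(build_tree(ball, dist, diameter / 2, rng))
    return node


def routing_cost(node, dist, demand, cost_fn):
    """Send each child's collected demand from the child center to the parent center."""
    total = 0.0
    for child in node["children"]:
        load = sum(demand[p] for p in child["points"])
        total += dist(child["center"], node["center"]) * cost_fn(load)
        total += routing_cost(child, dist, demand, cost_fn)
    return total
```

Here `cost_fn` plays the role of the concave per-unit-length cost, for example `lambda x: x ** 0.5`. The partition itself never looks at it, which is exactly the point made on the slide.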
33 Partitioning
[Figure: graph of diameter D; partitions of diameter < D/2; w.h.p. distances > D/log n]
34 Routing
Route from centers of children to center of parent
35 Discussion
- Paradigm of aggregation
- Group together close-by demand nodes
- Reduce cost of transportation
- Problems with approach
- Same partition for all cost functions
- Some close-by nodes bound to end up in different partitions
- Problem even if graph is just a cycle
- Worst case logarithmic performance expected in practice
36 Other Approaches
- Linear Programming
- Andrews and Zhang. 1998
- Improve the logarithmic ratio for special cases
- Usually produces optimal integer solutions in practice
- The size of the program is huge
- N^3 variables
- Inefficient in practice
- Simple algorithms known for very special cases
- Salman, Cheriyan, Ravi, Subramanian. 1997
37 Our Solution Idea
- Use cost function to construct the partitioning
- Say we have T1 and T3 lines
- Say cheaper to use T3 line if bandwidth > 10 Mbps
- Then, we should find
- Min cost way of aggregating demands using T1 lines
- Each aggregated node receives 10 Mbps bandwidth
- Min cost way of connecting aggregated nodes to sink node
- Construct partitioning bottom-up instead of top-down
- Properties of partition
- Close-by demands still grouped together
- The cost function decides group boundaries
38 First Aggregation Step
Partition assuming T3 line becomes cheaper at 10 Mbps bandwidth
[Figure: groups with 10 Mbps total bandwidth, each joined to an aggregation point by T1 lines]
39 Complete Solution
[Figure: complete solution; label: T3 lines]
40 Constructing the Partitions
- Given
- A set of demand nodes
- Length metric on edges
- Select: set of aggregation points
- Send at least U demand per point
- Route along shortest paths
- Minimize total routing cost
- Load Balanced Facility Location
- O(1) approximation [KM00, GMM00]
- Iteratively construct larger partitions
[Figure: groups with demand > U]
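The selection problem on this slide can be stated precisely with a tiny exhaustive solver. This is only a brute-force reference for very small instances, written to pin down the objective. It is not the O(1) approximation algorithm cited above. It assumes every demand node is also a candidate aggregation point. The name `load_balanced_brute_force` is hypothetical.

```python
import itertools


def load_balanced_brute_force(nodes, demand, dist, U):
    """Try every assignment; keep the cheapest one where each used point gets >= U demand."""
    best_cost, best_assign = float("inf"), None
    for choice in itertools.product(nodes, repeat=len(nodes)):
        load = {}
        for j, p in zip(nodes, choice):
            load[p] = load.get(p, 0.0) + demand[j]
        if any(x < U for x in load.values()):
            continue  # some aggregation point receives too little demand
        cost = sum(demand[j] * dist(j, p) for j, p in zip(nodes, choice))
        if cost < best_cost:
            best_cost, best_assign = cost, dict(zip(nodes, choice))
    return best_cost, best_assign
```

For four unit-demand nodes at positions 0, 1, 10 and 11 on a line, U = 2 yields two groups at total cost 2, while U = 4 forces a single group at cost 20. This shows how the threshold, and hence the cost function, decides the group boundaries.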
41 One Issue
- Routing with a cable type need not be along shortest paths
[Figure: cable with capacity 1 and cost/length 1; edge labels 1, 1 and 0.5. Case 1, demand 0.5: cost 1.5 vs. cost 2. Case 2, demand 1.0: cost 2.5 vs. cost 2.]
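The numbers on this slide are consistent with the following layout, which is an assumed reconstruction since the figure itself is not available. Two demand nodes A and B each sit at distance 1 from the sink and at distance 0.5 from each other, and the only cable type has capacity 1 and cost 1 per unit length. The sketch compares routing both demands directly along shortest paths against routing A's demand through B. The function names are hypothetical.

```python
import math


def cable_cost(load, length, capacity=1.0, cost_per_length=1.0):
    """Cost of carrying `load` over `length` using whole cables of the given capacity."""
    return math.ceil(load / capacity) * cost_per_length * length


def direct(d):
    """A and B each send demand d to the sink on their own length-1 edge."""
    return cable_cost(d, 1.0) + cable_cost(d, 1.0)


def via_b(d):
    """A sends d to B over the length-0.5 edge; B forwards 2d to the sink."""
    return cable_cost(d, 0.5) + cable_cost(2 * d, 1.0)
```

With demand 0.5 at each node, sharing one cable through B costs 1.5 against 2 for the shortest paths. With demand 1.0, the shared edge needs two cables and the detour costs 2.5 against 2. Which route is cheaper therefore depends on the demand.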
42 Another Issue
- We are constructing partition bottom-up
- Optimal partition could look different
- If we make error in first grouping, error propagates upward
- How do we bound cost against optimal cost?
- Scaling technique
- Observation: error propagates only if similar cable types exist
- Eliminate all cable types that look similar except one
- Partitioning at every stage close to optimal partitioning
- Constant factor approximation [GMM00, GMM01]
43 Properties of Algorithm
- Simple to implement
- Uses facility location and Steiner trees as subroutines
- Very efficient in practice
- Preliminary experimental results
- Real ISP and geographic data
- Real cable types and costs
- At most 10% away from optimal solution
- Subsequent work
- Talwar. 2002 (213 approx)
- Gupta, Kumar, Roughgarden. 2003 (72 approx)
- Based on the ideas in our algorithm
44 Open Problems
- Better approximation ratios
- Buy-at-bulk: 72 [GKR03]
- Concave cost flow: logarithmic approximation [MMP00]
- Multiple sink concave cost flow
- Aggregation paradigm fails!
- Buy-at-bulk problem
- Logarithmic approximation [AA97]
- Aggregation paradigm applicable to other problems?
45 Acknowledgements
- Research collaborators
- Serge Plotkin, Stanford University
- Abhiram Ranade, IIT Bombay
- Sudipto Guha and Adam Meyerson
- Matthew Andrews, Bell Laboratories
- Pat Brown, Stanford University School of Medicine
- Ramesh Hariharan, Strand Genomics Pvt. Ltd.
- Zoe Abrams, Ashish Goel, Baruch Schieber, Debasis Mitra, Devavrat Shah, Jochen Konemann, Maxim Sviridenko, Rina Panigrahy, Rob Tibshirani, Shankar Krishnan, Suresh Venkat and Tracy Kimbrel
46 Acknowledgements
- Theory wing
- Mayur Datar, Aris Gionis, Gagan Aggarwal, Keyvan Mohajer, Liadan O'Callaghan, Majid Emami, Moses Charikar and Piotr Indyk
- Friends
- Dhananjay Gore, Rohit Nabar, Aditi Nabar, Kumar Muthuraman, Mohan Lakhamraju, Nandan Das, Prashanth Hande and Sameer Siruguri
- Parents and Roopa