Title: A'B' Kahng, Ion I' Mandoiu
1Highly Scalable Algorithms for Rectilinear and
Octilinear Steiner Trees
- A.B. Kahng, Ion I. Mandoiu
- University of California at San Diego, USA
- A.Z. Zelikovsky
- Georgia State University, USA
- Supported in part by MARCO GSRC and Cadence
Design Systems, Inc.
2Outline
- Single net routing problem
- Problem definition
- Previous work
- Motivation for highly scalable heuristics
- The batched greedy algorithm
- High-level algorithm
- Efficient generation of triples
- Efficient bottleneck-edge computation
- Experimental results and conclusions
- Single net routing problem
- Problem definition
- Previous work
- Motivation for highly scalable heuristics
- The batched greedy algorithm
- High-level algorithm
- Efficient generation of triples
- Efficient bottleneck-edge computation
- Experimental results and conclusions
3Single Net Routing Problem
Given set of terminals in the plane Find
minimum length interconnection
? Rectilinear/Octilinear Minimum Steiner Trees
(RMST/OMST)
4Why Minimum Steiner Tree Routing?
- Advantages
- Minimum routing area
- Minimum total capacitance
- Reduced power consumption
- Steiner tree routing appropriate for
- Non-critical nets
- Physically small instances
5Previous Work on Steiner Tree Problem
- Long history
- Euclidean version Gauss 1836
- Rectilinear version Hanan 1966
- Octilinear version SarrafzadehWong 1992
- Steiner tree problem in graphs Hakimi/Levin
1971 - Fundamental results
- All versions are NP-hard Karp 1972, GJ77
- Minimum Spanning Tree (MST) gives good
approximation - Always within factor 2 of optimum 3 papers,
1979-1981 - Within factor 1.5 in rectilinear plane Hwang
1976
6Previous RSMT/OSMT Algorithms
- Exact algorithms
- GeoSteiner Warme,WinterZachariasen
- Approximation algorithms
- Zelikovsky 1993, BermanRamaier 1994,
HougardyPromel 1999, RajagopalanVazirani 1999,
RobinsZelikovsky 2000, - Best-performing RSMT heuristics (within 0.5 of
optimum) - Iterated 1-Steiner heuristic KahngRobins 1992
- Edge-based heuristic Borah,OwensIrwin 1999
- Iterated Rajagopalan-Vazirani Mandoiu,VaziraniGa
nley 2000 - Not practical for tens of thousands of terminals!
7Motivation for Highly Scalable Heuristics
- Most nets are small (lt20 terminals)
- But nets with gt104 terminals become increasingly
common - Scan-enable nets in large designs using full-scan
test - All flip-flops need to receive the scan-enable
signal - Nets with pre-routes and non-zero terminal
dimensions - Each terminal represented by set of electrically
equivalent pins - ?RSMT/OSMT instances with up to 105 pins
8Requirements for Highly Scalable RSMT/OSMT
Heuristics
- Linear memory
- Sub-quadratic runtime
- Solutions within 0.5 of optimum
- Previous Steiner tree heuristics do not meet
first two requirements
9Outline
- Single net routing problem
- Problem definition
- Previous work
- Motivation for highly scalable heuristics
- The batched greedy algorithm
- High-level algorithm
- Efficient generation of triples
- Efficient bottleneck-edge computation
- Experimental results and conclusions
10Triple Contraction
- Connect 3 terminals (triple) using shortest
connection
- Remove longest edge on each of the 2 formed
cycles
11High-level Algorithm
- Greedy Triple-Contraction Algorithm Zelikovsky
1993 - Compute MST of terminals
- While there exist triples with positive gain, do
- Find triple with maximum gain
- Contract triple remove longest edges, replace
triple with 2 zero-cost edges - Output MST of terminals and centers of contracted
triples
- Expensive to compute max-gain triple in Step 2
- Best implementation uses complex dynamic MST
datastructures - We use a batched implementation
- Find positive-gain triples
- Contract triples in descending order of gain
without recomputing gains - A triple is selected if 2 longest edges not used
by previous triples
12Efficient Generation of Triples
- O(n3) triples overall
- Use geometry to avoid generating all of them!
- Fossmeier,KaufmannZelikovsky 1997 sufficient
to consider a set of O(n) triples with - Empty interior (? no other terminal in bounding
box) - Positive gain
- We use a practical O(nlogn) divide-and-conquer
algorithm to compute a superset of size O(n logn) - Some triples may have negative gain
- Eliminated after gain computation
- Some triples may be non-empty
- Can be removed, but too few to justify the extra
testing time
13Divide-and-conquer for Empty Triples
14Divide-and-conquer for Empty Triples
15Divide-and-conquer for Empty Triples
16Divide-and-conquer for Empty Triples
17Efficient Bottleneck-edge Computation
Bottleneck edges needed for computing triple gains
Given tree T, vertices u,v Find most expensive
edge on path between u and v
- We give a new data structure
- O(logn) time per bottleneck-edge query
- O(n logn) pre-processing (O(n) if edges already
sorted) - Much more practical than methods based on
least-common-ancestors - ? Gain evaluation in O(logn) time per triple
- ? O(n log2n) total time for the batched greedy
algorithm
18Outline
- Single net routing problem
- Problem definition
- Previous work
- Motivation for highly scalable heuristics
- The batched greedy algorithm
- High-level algorithm
- Efficient generation of triples
- Efficient bottleneck-edge computation
- Experimental results and conclusions
19Experimental Setup
- Testcases
- Random nets with 100 to 500,000 terminals
- 100 samples for each size
- Nets extracted from recent designs (330 to 34,000
terminals) - Compared algorithms
- Batched greedy
O(n log2n) - MST GuibasStolfi 1983
O(n logn) - Prim-based heuristic Rohe 2001
O(nlog2n) - Edge-based heuristic of Borah,OwensIrwin 1999
O(n2) - GeoSteiner 4.0, beta version Nielsen,WinterZacha
riasen 2002
20Quality Random Rectilinear Tests
- BatchGreedy quality slightly better than
Edge-based, 1 better than Prim-based - Within 0.7 of optimum computed by GeoSteiner
21CPU Time Random Rectilinear Tests
- BatchGreedy highly scalable, practical runtime
up to 105 terminals - Edge-Based impractical for gt104 terminals
22CPU Time Rectilinear Industry Tests
- 34k terminals 24 seconds BatchGreedy vs. 86
minutes Edge-based
23Quality Rectilinear Industry Tests
- BatchGreedy up to 1 better than Prim-Based,
within 0.5 of GeoSteiner - Slightly better than Edge-Based in half of the
cases
24CPU Time Octilinear Industry Tests
- 34k terminals 25 seconds BatchGreedy vs. 17.5
hours Edge-based - Octilinear Prim-Based not available
25Quality Octilinear Industry Testcases
- BatchGreedy slightly better than Edge-Based,
within 0.5 of GeoSteiner
26Conclusions
- Presented new rectilinear/octilinear Steiner tree
heuristic - Highly-scalable
- O(n) memory, O(nlog2n) runtime, with small hidden
constants - High-quality
- Better quality than O(n2) edge-based heuristic
- Within 0.7 of optimum computed by GeoSteiner
- Ongoing extensions
- Via costs
- Preferred directions
- Routing obstacles
27Thank You for Your Attention!
- Further details on our work are available on the
groups website, http//vlsicad.ucsd.edu