Enforcing Long-Path Timing Closure for FPGA Routing with Path Searches on Clamped Lexicographic Spirals

1
Enforcing Long-Path Timing Closure for FPGA
Routing with Path Searches on Clamped
Lexicographic Spirals
  • Keith So
  • University of New South Wales,
  • Sydney, Australia
  • Feb 25 @ FPGA08

2
Outline
  • Problem Statement
  • Related Work
  • SpiralRoute Overview
  • Budget Generation
  • Clamped Lexicographic Search
  • Some Performance Optimizations
  • Experiments
  • Conclusions and Future Work

3
Problem Statement & Assumptions
  • Long-Path Timing-Driven Detailed Routing
  • Given: placed circuit mapped into an RR graph;
    timing requirement $D$
  • Find: mutually RR-vertex-disjoint routing trees
    s.t. max. long-path combinational delay $< D$
  • Assumptions
    • $D$ is achievable under the given placement
    • Buffered switching (delays are summable)

4
Related Work
  • [F92] Iterative slack allocation
  • [AR95] Criticality bin Steiner/Arbor.
  • [ME95] Negotiated Congestion
  • [BR97] VPR
  • [LW03] Lagrangian Rel. Weighting
  • [ANC04] Auto. Constraint Gen.
  • [FBC04] RCV

5
SpiralRoute Overview
  • Negotiated Congestion Routing over A*
  • Paths are lexicographic-costed [S07, ISPD07]
  • Major Deltas
    • Optimal delay upper bound generation for the
      FPGA routing domain
    • Minimum-congestion bounded-delay searching (vs.
      a tradeoff using weights)
    • Provable timing closure at completion

6
Connection Budget Generation: Optimization
Component
  • Weighted Budget Distribution Problem [Ghiasi
    et al., ICCAD04]
  • Given: DAG $G = (V, A)$, min. delays $d_{ij}$,
    weights $w_{ij}$, long-path constraint $T$
  • Find: delay budgets $b_{ij}$ such that
    • $(d_{ij} + b_{ij})$ summed along every path satisfies $T$
    • the sum of $w_{ij} b_{ij}$ over all edges is maximised
  • Transforms into a min-cost flow problem
    • Budgets are recovered from the dual of the flow solution
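
The budget distribution problem on this slide can be written out as an optimization program. This is only a sketch in the slide's notation: reading the path term as $d_{ij} + b_{ij}$, using $\le T$ for "satisfies $T$", and requiring non-negative budgets are all assumptions, not statements taken from the slides.

```latex
\begin{aligned}
\max_{b}\quad & \sum_{(i,j)\in A} w_{ij}\, b_{ij} \\
\text{s.t.}\quad & \sum_{(i,j)\in P} \left( d_{ij} + b_{ij} \right) \le T
    \quad \text{for every source-to-sink path } P \text{ in } G, \\
& b_{ij} \ge 0 \quad \text{for every } (i,j) \in A .
\end{aligned}
```

Introducing one arrival-time variable per vertex replaces the exponentially many path constraints with one difference constraint per edge, which is the usual route to a network-flow dual.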

7
Connection Budget Generation: Mapping to FPGA
Routing
  • Represent LEs and pads as edges (split clocked
    LEs)
  • Form super-DAG
  • $d_{ij}$ = min. connection delay (from
    congestion-oblivious routing)
  • Set $T = D$
  • Set $w_{ij} = 1$ for real edges, 0 for virtual edges
  • Solved $(d_{ij} + b_{ij})$ is the maximum delay for each
    edge in our routing
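
A budget assignment can be sanity-checked by a longest-path pass over the DAG. The sketch below is illustrative only: the function names, the dictionary-based graph encoding, and the toy delays are all hypothetical, and it assumes the per-edge upper bound is $d_{ij} + b_{ij}$ with the constraint read as $\le T$.

```python
from collections import defaultdict, deque

def longest_path_delay(edges):
    """Maximum summed delay over all paths of a DAG.

    edges maps (i, j) -> delay on that edge (here d_ij + b_ij).
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for (i, j), delay in edges.items():
        succ[i].append((j, delay))
        indeg[j] += 1
        nodes.update((i, j))

    arrival = {v: 0.0 for v in nodes}
    ready = deque(v for v in nodes if indeg[v] == 0)
    visited = 0
    while ready:  # Kahn-style topological sweep
        i = ready.popleft()
        visited += 1
        for j, delay in succ[i]:
            arrival[j] = max(arrival[j], arrival[i] + delay)
            indeg[j] -= 1
            if indeg[j] == 0:
                ready.append(j)
    if visited != len(nodes):
        raise ValueError("graph has a cycle")
    return max(arrival.values(), default=0.0)

def budgets_meet_constraint(d, b, T):
    """True if every path satisfies sum(d_ij + b_ij) <= T (assumed reading)."""
    bounded = {e: d[e] + b[e] for e in d}
    return longest_path_delay(bounded) <= T

# Hypothetical toy instance: two paths from s to t.
d = {("s", "a"): 2.0, ("a", "t"): 3.0, ("s", "t"): 1.0}
b = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "t"): 6.0}
print(budgets_meet_constraint(d, b, T=7.0))
```

In this toy instance both paths are stretched to exactly the constraint, which is the kind of slack-free assignment a maximising budgeter tends toward.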

8
Comparison with It. Minimax PERT (clma runtime
20 mins)

9
Search Design: n-Lex. Search
  • 1-line A* search: $f(v) = g(v) + h(v)$; expand $v$ with
    minimum $f(v)$ until $t$ is reached
  • 2-component lexicographic search used for the
    routability router (conceptually $a \cdot \infty + b$)
  • Need n components and custom comparison functions
    (proofs needed to avoid $\infty^k$ values!)
  • Theorem: A* on n-lexicographic search is
    admissible if all components are totally ordered
    monoids with order-preserving addition
  • Monoids are helpful to avoid clutter from max()
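
The search described on this slide can be sketched with cost vectors compared lexicographically. This is a minimal illustration, not the SpiralRoute implementation: `lex_astar`, `vec_add`, the graph encoding, and the two-component (congestion, delay) costs are hypothetical, each component uses ordinary addition as its monoid operation, and the heuristic is assumed consistent.

```python
import heapq
from itertools import count

def vec_add(a, b):
    """Component-wise addition of two cost vectors."""
    return tuple(x + y for x, y in zip(a, b))

def lex_astar(graph, source, target, h, zero):
    """A* over cost tuples ordered lexicographically.

    graph: dict node -> list of (neighbor, cost_tuple)
    h:     heuristic returning a cost tuple (assumed consistent)
    zero:  identity cost tuple, e.g. (0, 0)
    """
    tie = count()  # tie-breaker so nodes themselves are never compared
    best = {source: zero}
    parent = {source: None}
    heap = [(vec_add(zero, h(source)), next(tie), source)]
    closed = set()
    while heap:
        _, _, v = heapq.heappop(heap)
        if v in closed:
            continue
        closed.add(v)
        if v == target:
            path = []
            while v is not None:
                path.append(v)
                v = parent[v]
            return best[target], path[::-1]
        for u, cost in graph.get(v, []):
            if u in closed:
                continue
            g_u = vec_add(best[v], cost)
            if u not in best or g_u < best[u]:  # tuple '<' is lexicographic
                best[u] = g_u
                parent[u] = v
                heapq.heappush(heap, (vec_add(g_u, h(u)), next(tie), u))
    return None

# Hypothetical costs are (congestion, delay): the direct edge is faster
# but congested, so the lexicographic search prefers the detour.
graph = {
    "s": [("a", (0, 5)), ("t", (1, 1))],
    "a": [("t", (0, 5))],
}
result = lex_astar(graph, "s", "t", h=lambda v: (0, 0), zero=(0, 0))
print(result)
```

A weighted sum of the two components could pick the direct edge here; the lexicographic order never trades the first component against the second, which is the "conceptually $a \cdot \infty + b$" behaviour without any actual infinite values.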

10
Search Design: Clamping Component
  • 3-component vector
    • Delay, with pivot ($x < y$ iff $x < T$ and $y > T$)
    • Congestion, regular $<$
    • Delay, regular $<$
  • Ex.: $f(w_2) = (0, 2, 2)$, $f(x_1) = (1, 0, 4)$, $f(w_3) = (0, 1, 3)$
  • Assumption: $h(v)$ is at least close to $h^*(v)$ for
    the clamping component
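
The three-component ordering can be illustrated with a small key function. This is a sketch under assumptions: the first component is modelled as a 0/1 over-budget flag, which matches the example vectors on the slide, and the budget value `T = 3` and the name `clamped_key` are hypothetical.

```python
def clamped_key(congestion, delay, budget):
    """Build a (pivoted delay, congestion, delay) comparison key.

    The pivoted component only distinguishes delays on opposite sides
    of the budget; all within-budget delays compare equal on it.
    """
    over_budget = 1 if delay > budget else 0
    return (over_budget, congestion, delay)

T = 3  # assumed budget, chosen to reproduce the slide's example vectors

# name -> (congestion, delay), taken from the last two entries of each f(.)
candidates = {"w2": (2, 2), "x1": (0, 4), "w3": (1, 3)}
keys = {name: clamped_key(c, d, T) for name, (c, d) in candidates.items()}

# Python compares tuples lexicographically, so sorting by key gives
# the expansion order of the clamped search.
order = sorted(keys, key=keys.get)
print(keys)
print(order)
```

Note that `x1` has the lowest congestion yet sorts last because it exceeds the budget, while `w3` beats `w2` on congestion even though `w2` has less delay.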

11
Search Design: Timing Closure
  • The delay pivot element splits congestion-identical
    paths by budget
  • Will always choose a budget-compliant path (a sum
    of finite congestion costs is finite)
  • Over all connections => successful routing always
    yields timing closure!

12
Performance: Low-Hanging Optimizations
  • Original implementation is around 2-2.5x slower
    than the current runtime
  • Introduced some low-hanging speed & quality
    optimizations
    • Index structure for lexicographic costs
    • Greedy tree mgmt. to ameliorate pin-ordering
  • A high-hanging optimization for future work is
    congestion schedule handling (but many promising
    leads from global routers in ICCAD07)

13
Trie-of-Stacks Index Structure
  • Replaces f(v) index structure
  • Exploits FPGA routing symmetry
  • Index operations independent of size
  • Reduces runtimes by 15%

14
Tree Topology Maintenance

15
Experiments - Setup
  • Run against VPR 4.30 on an architecture similar to
    the single-segment challenge arch.
  • (Researcher timing constraints)
  • Routability comparison with unclamped lex-search
  • Route at the placement-allowed Fmax
  • VPR pres_fac = 1.5/1.1

16
Routed Solution Timing Quality

17
Runtime Comparison

18
Effects of Budget Quality

19
Future Work
  • Runtime improvements
  • Schedule improvement
  • Performance tuning
  • Multi-CLB segments (see backup slide)
  • Multi-objective routing
  • Other domains (e.g. standard cell global)

20
Conclusions
  • Extended lexicographic search to timing-driven
    routing
  • New budgeting component
  • Clamped search design
  • Supporting techniques for runtime
  • Timing closure is guaranteed on routing success
  • Solution quality is good but needs more runtime
    improvement to be viable

21
Acknowledgements
  • J. Rose, V. Betz, A. Marquardt (Toronto):
    VPR 4.30 source & benchmarks
  • Australian Centre for Advanced Computing and
    Communications (ac3): High Performance Computing
    Support
  • Advisor: Dr. Aleks Ignjatovic

22
Question Time
  • To Backup Slides

23
Issues with $h(v)$ / $h^*(v)$
  • Node locking occurs when $g(v) + h(v) < D$ but
    really $g(v) + h^*(v) > D$
  • Expansion downstream will be truncated
  • But a subpath with less delay but more congestion
    cannot expand into it
  • But if we re-expand on shorter delay, then the backtrace
    will ignore congestion: not locally decidable!
  • Quick fix: precompute $h^*(v)$ (only needed for sink
    pins $t$). Only bounding components need the
    accuracy
  • Fancy on-the-fly handling?
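
One way to picture the "quick fix" is a backward shortest-path pass from each sink, which yields the exact remaining delay for every vertex that can reach it. The sketch below uses Dijkstra's algorithm on the reversed graph; the slides do not say how the precomputation is actually done, so the function name, the edge-list encoding, and the toy delays are all assumptions.

```python
import heapq

def exact_delay_to_sink(edges, sink):
    """Exact minimum delay from each vertex to `sink`.

    edges: iterable of (u, v, delay) directed edges with delay >= 0.
    Returns a dict with an entry for every vertex that can reach `sink`.
    """
    reverse = {}
    for u, v, delay in edges:
        reverse.setdefault(v, []).append((u, delay))

    dist = {sink: 0.0}
    heap = [(0.0, sink)]
    while heap:
        d_v, v = heapq.heappop(heap)
        if d_v > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, delay in reverse.get(v, []):
            cand = d_v + delay
            if cand < dist.get(u, float("inf")):
                dist[u] = cand
                heapq.heappush(heap, (cand, u))
    return dist

# Hypothetical routing-resource fragment with sink pin "t".
edges = [("s", "a", 2.0), ("a", "t", 3.0), ("s", "t", 7.0), ("b", "a", 1.0)]
h_star = exact_delay_to_sink(edges, "t")
print(h_star)
```

With these values in hand, the budget test $g(v) + h^*(v)$ is exact for the delay component, so a node is never accepted as within budget when no compliant completion exists.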