Efficient and Accurate Gate Sizing with Piecewise Convex Delay Models

1
Efficient and Accurate Gate Sizing with Piecewise
Convex Delay Models
  • Hiran Tennakoon
  • Carl Sechen

University of Washington, Department of
Electrical Engineering
2
Overview
  • Introduction to gate sizing
  • Delay modeling
  • Modeling gate delays
  • Delay propagation
  • Optimization issues with piecewise models
  • Sizing Example
  • Algorithmic Issues
  • Results
  • Delay Model
  • Comparisons with a commercial tool
  • Conclusions

3
Introduction to gate sizing
  • Considerable impact on delay, power, and area
  • Generation of delay vs. area or power tradeoff
    curves
  • Exploit readily available fluid standard cell
    libraries
  • Scope
  • Generation of delay vs. area tradeoff curve
  • Area represented as the sum of transistor sizes
  • Within the Static Timing Analysis framework
  • Continuous sizing (fluid library)

4
Delay Modeling
  • Elmore model
  • Delay expressed as the sum of first order time
    constants
  • Convex via variable transformation
  • Highly amenable to mathematical programming
    techniques
  • Low accuracy
  • Logical Effort
  • Reformulation of the Elmore model
  • Fitted models
  • Simulation data fitted to predetermined function
    forms
  • e.g. K. Kasamsetty, M. Ketkar, and S. S.
    Sapatnekar, "New Class of Convex Functions for
    Delay Modeling," IEEE Transactions on CAD, July 2000

5
Convex Delay Models
  • Higher accuracy
  • Captures input slew-rate effects
  • Accounts for min and max beta ratio limits
  • Globally optimum results
  • Smaller range
  • Model fit may be good for certain ranges in the
    design space
  • e.g. input slew-rates 20ps to 300ps, output
    loads up to 600fF, sizes 0.25µm to 7µm
  • Increasing the range
  • Piecewise model generation

6
Piecewise Convex Delay Model
  • Features
  • Gate level delay model
  • Rise and fall delays and output slew-rates
  • Includes all input to output combinations
  • Functions of input rise and fall slew-rates,
    output loading, nMOS and pMOS device sizes
  • Parameterized gates, one variable each for the
    nMOS and pMOS devices for a gate
  • Min and max Beta ratio limits
  • Increase accuracy by subdividing the data into
    smaller regions
  • Four variables to account for input slew-rate,
    nMOS size, pMOS size, output load
  • Each region or piece is fitted to a convex
    function

7
Dividing the Data Set
  • Data is organized in terms of input slew-rate and
    output load ratio
  • Load ratio is analogous to electrical effort
  • Electrical effort: ratio of the output
    capacitance to the input gate capacitance
  • Load ratio: ratio of the driven gate size to the
    driving gate size. Size is the sum of the
    transistor widths.
  • Change in characterization paradigm
  • Capacitive load vs. active load

8
New Characterization Paradigm
  • Accounts for nonlinear effects on the driving
    gate due to the Miller effect kick-back from
    the driven gate.

9
Data Set Organization
  • Input slew-rate range: 20ps to 1.2ns
  • Load ratio range: 1 to 100
  • Prune out bad region (may or may not)
  • Non-monotonic delay behaviour, negative delay

10
Data Set Granularity
  • For the given input slew-rate range and load
    ratio range, each region contains all possible
    sizes and allowed beta ratios
  • Uniform 80ps step in input slew-rate

11
Delay and Slew-rate functions
  • Generalized posynomial
  • Convex under variable transformation
  • With ai ≥ 0, ei ≥ 0, and bi, ci, and di any
    real number
  • K. Kasamsetty, M. Ketkar, and S. S.
    Sapatnekar, "New Class of Convex Functions for
    Delay Modeling," IEEE Transactions on CAD, July 2000
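
The statement that such functions are convex under a variable transformation can be checked numerically on a small case. The sketch below uses an ordinary two-variable posynomial with made-up coefficients and exponents, not the fitted function form used in the presentation, and tests midpoint convexity after the substitution x = exp(z).

```python
import math
import random

# Hypothetical posynomial terms: (positive coefficient, exponents for x1 and x2).
TERMS = [(2.0, (1.0, -1.0)), (0.5, (-2.0, 0.5)), (1.3, (0.0, 1.5))]


def posynomial(x):
    """Evaluate the sum of coefficient * x1**e1 * x2**e2 for positive x."""
    return sum(a * math.prod(xj ** e for xj, e in zip(x, exps))
               for a, exps in TERMS)


def transformed(z):
    """The same function after substituting x_j = exp(z_j)."""
    return posynomial([math.exp(zj) for zj in z])


def midpoint_convex(f, trials=1000, seed=0):
    """Return True if f(midpoint) <= average of f at random point pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.uniform(-2, 2) for _ in range(2)]
        v = [rng.uniform(-2, 2) for _ in range(2)]
        mid = [(a + b) / 2 for a, b in zip(u, v)]
        if f(mid) > (f(u) + f(v)) / 2 + 1e-9:
            return False
    return True


is_convex_after_transform = midpoint_convex(transformed)
```

After the substitution each term becomes a positive multiple of the exponential of a linear function of z, and a sum of such terms is convex, so the check passes for any choice of real exponents.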

12
Delay Propagation in Static Timing Analysis
  • Latest arrival propagation
  • Signal causing the worst output delay propagated
  • With that signal's slew-rate
  • Optimistic delay estimation
  • Max slew-rate propagation
  • Signal causing the worst output delay propagated
  • With the worst output slew-rate
  • Not necessarily corresponding to the same input
    signal used to propagate the delay
  • Pessimistic delay estimation

13
Signal Bounding in STA
  • Create a composite output wave form to account
    for signals with different slew-rates
  • Jin-Fuw Lee, D. L. Ostapko, J. Soreff, C. K. Wong,
    "On the signal bounding problem in timing
    analysis," Proceedings International Conference
    on CAD, Nov 2001

14
Delay Propagation Scheme
  • Propagation based on both arrival times and
    slew-rate
  • Find the signal whose arrival time and slew-rate
    maximizes
  • ksr = 0: latest arrival propagation
  • ksr = 1: approaches the max slew-rate propagation
  • ksr = 0.5: half-envelope method
  • Jin-Fuw Lee, D. L. Ostapko, J. Soreff, C. K. Wong,
    "On the signal bounding problem in timing
    analysis," Proceedings International Conference
    on CAD, Nov 2001
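
A selection rule of this kind might look like the sketch below. The maximized expression is not given in the text of the slide, so the score used here, arrival time plus ksr times slew-rate, is an assumption, as are the `Signal` type and the example numbers.

```python
from typing import NamedTuple


class Signal(NamedTuple):
    name: str
    arrival_ps: float
    slew_ps: float


def select_signal(signals, k_sr):
    """Pick the signal to propagate.

    Assumed score: arrival time + k_sr * slew-rate. With k_sr = 0 this
    reduces to latest arrival propagation.
    """
    return max(signals, key=lambda s: s.arrival_ps + k_sr * s.slew_ps)


# Hypothetical gate inputs: "a" arrives later, "b" has the slower edge.
inputs = [Signal("a", 100.0, 20.0), Signal("b", 90.0, 60.0)]
latest = select_signal(inputs, 0.0)
weighted = select_signal(inputs, 1.0)
```

With these numbers, ksr = 0 selects the later-arriving signal, while ksr = 1 selects the slower-edged one because its slew-rate outweighs its 10ps earlier arrival.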

15
Optimization with Piecewise models
  • Gradients undefined at boundaries of regions
  • Successive iterates may get trapped by the
    boundary

16
Overlapping Regions
  • Adjacent regions overlap by half
  • Diagonally situated regions overlap by quarter
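
The half-overlap layout can be illustrated in one dimension. This is a sketch under assumptions: the region width, the range, and the rule of picking the containing region whose centre is nearest are all hypothetical choices, used here only to show how an overlap keeps an iterate away from a region boundary.

```python
def build_regions(lo, hi, width):
    """Intervals of the given width whose start advances by width / 2,
    so that adjacent intervals overlap by half."""
    stride = width / 2
    regions = []
    start = lo
    while start + width <= hi + 1e-9:
        regions.append((start, start + width))
        start += stride
    return regions


def best_region(x, regions):
    """Among the regions containing x, return the one whose centre is
    nearest to x. Assumes x lies inside at least one region."""
    containing = [r for r in regions if r[0] <= x <= r[1]]
    return min(containing, key=lambda r: abs(x - (r[0] + r[1]) / 2))


# Hypothetical slew-rate axis from 0 to 400 with regions of width 200.
regions = build_regions(0, 400, 200)
chosen = best_region(190, regions)
```

A point at 190 sits next to the upper boundary of the first region but near the centre of the second, so the second region's function is used and its gradient is well defined there.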

17
Sizing Example: minimize worst-case delay

18
New Delay Propagation Scheme
19
Delay Propagation Scheme
20
Problem Simplification
  • Non-smooth problem
  • Assume that the correct delay and slew-rate are
    propagated

21
Problem Simplification (contd)
  • Kuhn-Tucker optimality conditions
  • The sum of the Lagrange multipliers assigned to
    the primary output arrival times must equal one
  • Sum of the multipliers at the input of a gate
    must equal the sum of the multipliers at the
    output
  • C. P. Chen, C. C. N. Chu, and D. F. Wong, "Fast
    and Exact Simultaneous Gate and Wire Sizing by
    Lagrangian Relaxation," Proceedings
    International Conference on CAD, Nov 1998.
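
The two conditions on the multipliers can be expressed as a small consistency check. This is a minimal sketch: the edge-based data layout, the virtual `sink` node that collects the primary outputs, and the example circuit are assumptions made for illustration.

```python
def satisfies_kt(edge_mult, gates, sink="sink", tol=1e-9):
    """Check the two multiplier conditions on a timing graph.

    edge_mult maps (source, destination) to a multiplier. Edges into the
    virtual sink carry the primary output multipliers.
    """
    into, out_of = {}, {}
    for (src, dst), lam in edge_mult.items():
        out_of[src] = out_of.get(src, 0.0) + lam
        into[dst] = into.get(dst, 0.0) + lam
    # Primary output multipliers must sum to one.
    outputs_ok = abs(into.get(sink, 0.0) - 1.0) <= tol
    # At every gate, multipliers at the inputs must equal those at the output.
    gates_ok = all(abs(into.get(g, 0.0) - out_of.get(g, 0.0)) <= tol
                   for g in gates)
    return outputs_ok and gates_ok


# Hypothetical circuit: inputs a, b drive g1; g1 and input c drive g2;
# g2 is the only primary output.
EXAMPLE = {
    ("a", "g1"): 0.4, ("b", "g1"): 0.3,
    ("g1", "g2"): 0.7, ("c", "g2"): 0.3,
    ("g2", "sink"): 1.0,
}
ok = satisfies_kt(EXAMPLE, ["g1", "g2"])
```

The second condition behaves like flow conservation, so a set of multipliers that satisfies both conditions distributes a total weight of one from the primary outputs back toward the primary inputs.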

22
Lagrangian of the Problem
  • Introduce one Lagrange multiplier per constraint
  • Function of gate sizes x and Lagrange multipliers
    λ

23
Minimizing Area given a Delay Target
  • Sum of the multipliers at the input of a gate
    must equal the sum of the multipliers at the
    output

24
Algorithmic Issues
  • The gate sizing problem is solved with a
    primal-dual algorithm
  • For a fixed set of multipliers satisfying the KT
    conditions, find the minimum with respect to the
    sizes xi
  • Update the multipliers using a sub-gradient
    technique
  • Repeat until convergence
  • A known problem with the sub-gradient scheme is
    how to choose a good step size control mechanism
  • Theoretical optimum update per iteration k

25
Multiplier Update
  • Practical multiplier update
  • A: required arrival time at primary output
  • ai: arrival time at any node
  • k: iteration
  • H. Tennakoon and C. Sechen, "Gate sizing using
    Lagrangian relaxation combined with a fast
    gradient-based pre-processing step," Proc. Intl.
    Conf. on Computer-Aided Design, pp. 395-402, Nov
    2002.

26
Scaling Issues
  • Example of a poorly scaled problem
  • The function is very sensitive to changes in x1
  • Delay constrained area minimization can
    potentially have a scaling problem
  • Dynamic scaling between the objective and the
    constraint functions

27
Duality
  • The primal problem
  • Delay constrained area minimization
  • Minimization of the worst-case delay
  • Lagrangian dual
  • Maximizing L(x,λ) with respect to λ
  • Delay constrained area minimization
  • Minimization of the worst-case delay

28
Optimality
  • Relationship between the primal and the dual
  • Global optimum point if
  • In practice for all solutions
  • Primal-dual tolerance less than 1%
  • Active delay constraints satisfied within 10ps
    tolerance

29
Delay Model Accuracy
  • Library composition
  • 11 inverting gates
  • 0.18µm TSMC technology
  • Min size 0.5µm, max size 12µm

30
Comparison with a Leading Commercial Tool
  • 31 benchmarks from ISCAS85 and ITC99
  • Generated 11 points on the area vs. delay curve
  • For each solution the transistor sizes are
    rounded to the nearest 1/10th of a micron
  • Hspice simulation is run on the rise and fall
    critical paths for each solution point to compare
    the accuracy of the delay estimation
  • The leading commercial transistor sizing tool
    (CTST) was given the same constraints

31
  • 4757 cells, execution time 502.42s, speedup 7.6X
  • Forge finds a 1.14X faster design, and has 32.74%
    less transistor area.

32
  • 5489 cells, execution time 1712s, speedup 6.5X
  • Forge finds a 1.22X faster design, and has 63.45%
    less transistor area.

33
  • 21,920 cells, execution time 1hr
  • CTST failed to complete within 3 days

34
  • 31,635 cells, execution time 1.73hrs
  • CTST failed to complete within 3 days

35
  • 44,615 cells, execution time 2hrs
  • CTST failed to complete within 3 days

36
Summary of Results
  • Average area reduction over CTST: 29%
  • Average improvement in runtime: 6.4X
  • Average absolute error in delay estimation
    compared to Hspice simulation on rise and fall
    critical paths: 4.23%
  • Three of the largest designs from ITC99 failed
    to complete with CTST after running for 3 days

37
Conclusions
  • Forge
  • Combines a piecewise convex delay model with a
    new delay propagation scheme
  • Fast generation of area vs. delay tradeoff curves
  • Critical path delay estimation is on average
    within 4.23% of Hspice
  • Compared to a leading commercial transistor
    sizing tool
  • Forge produces solutions that on average
    consume 29% less area
  • Forge is on average 6.4X faster