CSE241 VLSI Digital Circuits Winter 2003 Lecture 03: ASIC Flow and Design Convergence

Provided by: andrewk82
Transcript and Presenter's Notes

1
CSE241: VLSI Digital Circuits, Winter 2003
Lecture 03: ASIC Flow and Design Convergence
2
This Class + Logistics
  • Overview of flow (preparation for Smith Chapters
    12-17)
  • Read Smith Chapter 12 (Synthesis), 13.7 (Static
    timing)
  • Lab 1 revised due date Monday January 20
  • Near-term schedule
  • Ben has reserved the lab (EBU I, Room 3329) for
    this Friday, January 17, noon-1:20pm, for a running
    start into synthesis
  • Recitation 2 tomorrow (noon-12:50pm): not on
    RTL design, but on datapaths and memories
  • Lab tomorrow (3:30-5pm): really Lab 1

Slide courtesy of S. P. Levitan, U. Pittsburgh
3
Review
  • Scaling of gates vs. Scaling of wires
  • What happens when you make a gate bigger?
  • What happens when you make a wire taller? Wider?
  • Coupling
  • Inductance
  • How does power/ground distribution affect
    inductance?
  • RC delay
  • Dynamic (useful) power vs. Static (useless) power
  • How do these issues impact estimates and design
    approaches?

Slide courtesy of S. P. Levitan, U. Pittsburgh
4
Outline
  • Design types and cost / complexity drivers
  • Basic flow
  • On convergence and hierarchy

5
IC Design Methodologies
  • Full-Custom (high effort, leading-edge
    performance, high-volume)
  • Semi-Custom (strong infrastructure, economical in
    lower volumes)
  • ASIC (Application-Specific Integrated Circuit)
  • COT (Customer-Owned Tooling)
  • ASIC vs. COT: Who pays for the scrap?
  • FPGA
  • System-on-a-Chip
  • Larger components, often from outside of design
    team
  • Special
  • Analog (custom layout, I/Os and sense amps)
  • Mixed-Signal / RF (unique to each process, no
    scaling)

Slide courtesy of S. P. Levitan, U. Pittsburgh
6
Acceleration of Gate Length Scaling
  • What are some implications?
  • Slide courtesy of Numerical Technologies, Inc.

7
Mask NRE Cost (1999)
8
Design Technology Crises, ITRS-2001
[Figure labels: Incremental Cost Per Transistor; Test; Manufacturing; Manufacturing Turnaround Time; SW Design; NRE Cost; Verification; HW Design]
  • 2-3X more verification engineers than designers
    on microprocessor teams
  • Software: 80% of system development cost (and
    analog design hasn't scaled)
  • Design NRE > 10s of $M vs. manufacturing NRE $1M
  • Design TAT: months or years vs. manufacturing TAT:
    weeks
  • Without DFT, test cost per transistor grows
    exponentially relative to mfg cost

9
Silicon Complexity Challenges
  • Silicon Complexity: impact of process scaling,
    new materials, new device/interconnect
    architectures
  • Non-ideal scaling (leakage, power management,
    circuit/device innovation, current delivery)
  • Coupled high-frequency devices and interconnects
    (signal integrity analysis and management)
  • Manufacturing variability (library
    characterization, analog and digital circuit
    performance, error-tolerant design, layout
    reusability, static performance verification
    methodology/tools)
  • Scaling of global interconnect performance
    (communication, synchronization)
  • Decreased reliability (SEU, gate insulator
    tunneling and breakdown, joule heating and
    electromigration)
  • Complexity of manufacturing handoff (reticle
    enhancement and mask writing/inspection flow,
    manufacturing NRE cost)

10
System Complexity Challenges
  • System Complexity: exponentially increasing
    transistor counts, with increased diversity
    (mixed-signal SOC, ...)
  • Reuse (hierarchical design support, heterogeneous
    SOC integration, reuse of verification/test/IP)
  • Verification and test (specification capture,
    design for verifiability, verification reuse,
    system-level and software verification, AMS
    self-test, noise-delay fault tests, test reuse)
  • Cost-driven design optimization (manufacturing
    cost modeling and analysis, quality metrics,
    die-package co-optimization, ...)
  • Embedded software design (platform-based system
    design methodologies, software verification/analysis,
    codesign w/HW)
  • Reliable implementation platforms (predictable
    chip implementation onto multiple fabrics,
    higher-level handoff)
  • Design process management (team size / geog
    distribution, data mgmt, collaborative design,
    process improvement)

11
Outline
  • Design types and cost / complexity drivers
  • Basic flow
  • On convergence and hierarchy

12
Sylvester-Keutzer Classic Picture
Sylvester-Keutzer, Computer Nov. 99
13
Traditional Flow
Front End
Back End
14
Block-Level Design Methodology
[Flow diagram, source: A. Khan, Simplex/Altius. Boxes: Design Specs; Fnl. Design Constraints; Reqmts.; Synthesis; Lib. CWLM; Floor-plan PG; Lib. CWLM; Placement; Physical re-synth; Clock distribution; Route, scan re-order; Timing analysis, IPO; Fnl., pwr., SI ECO; ERC, DRC, LVS; Tape-out]

  • Architectural optimization (timing)
  • Inter-group buses, bandwidth
  • Clock, SI, test validation

  • Floorplanning and custom WLM
  • Power distribution (Internal, I/O)
  • I/O driver, padring design
  • Board-level timing, SI

  • Row definitions
  • Placement of cells
  • Congestion analysis

  • Placement-based re-synthesis
  • Noise minimization, isolation
  • Clock distribution

  • Full routing
  • Scan stitching, re-ordering

  • Full RC back-annotation
  • Hierarchical timing, electrical and SI analysis
    and IPO/ECO
15
Generic Flow Steps
  • Preparation
      • Library data preparation
      • Design data preparation
  • Logic design
      • Specification to RTL
      • RTL simulation
      • Hierarchical floorplanning
      • Synthesis
      • Formal verification
      • Gate level simulation
      • Static timing analysis
  • Physical design
      • Physical floorplanning
      • Place and route
      • RC extraction
      • Formal verification
      • Physical verification
  • Release to manufacturing
  • Design for test
  • Engineering change order

16
Library and Design Data
  • Models and technology data required to execute
    the design flow
  • Power, timing: ALF, DCL, OLA, .lib, STAMP
  • Layout: LEF, DEF, GDSII
  • Delays and path timing, parasitics: SDF, GCF,
    SDC, DSPF, RSPF, SPEF, SPICE
  • Layout rules: Dracula, Calibre deck

17
Specification to RTL
  • Defines the logic and fundamental structure of
    the chip at the RTL level in either Verilog
    or VHDL
  • Requires considerable interaction with the
    customer, plus specs such as the architecture,
    system, design, test and block specs
  • May include RTL from the customer or third party
    IP providers
  • Coding guidelines should be established and
    adhered to, and the code must be compatible with
    the chosen synthesis tool
  • Special design considerations such as multiple
    clock frequencies, asynchronous logic, high speed
    logic, race conditions, gated clocks, etc. must
    be addressed

18
RTL Simulation
  • RTL code, written in Verilog, VHDL or a
    combination of both, is simulated to verify
    functional correctness
  • Testbenches apply input stimulus to the design
  • Several methods are used to verify the outputs
  • Self-checking testbenches automatically verify
    output correctness and report mismatches
  • Results can be stored in a file and compared to
    previous results
  • Waveform displays can be used to interactively
    verify the outputs
  • Verification-specific tools: Verisity Specman,
    Synopsys Vera
  • Functional verification
  • Mostly Modelsim
  • Cadence's Verilog-XL or NC-Verilog also used
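
The self-checking testbench idea above can be sketched in Python rather than an HDL. This is only an illustration: `dut_adder`, `reference_adder` and `run_testbench` are hypothetical names, and the 8-bit adder is an assumed stand-in for a real design under test.

```python
import random

def dut_adder(a, b, width=8):
    """Stand-in for the design under test: a width-bit adder with carry-out."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask)
    return total & mask, total >> width

def reference_adder(a, b, width=8):
    """Independent golden model, deliberately written differently from the DUT."""
    total = a + b
    return total % (1 << width), total // (1 << width)

def run_testbench(dut, reference, num_vectors=200, width=8, seed=1):
    """Apply random stimulus, compare DUT and reference, collect mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(num_vectors):
        a = rng.randrange(1 << width)
        b = rng.randrange(1 << width)
        got = dut(a, b, width)
        expected = reference(a, b, width)
        if got != expected:
            # Report the failing stimulus together with both answers.
            mismatches.append((a, b, got, expected))
    return mismatches
```

An empty list from `run_testbench` means every applied vector matched. As with any simulation, that says nothing about inputs that were never applied.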

19
Hierarchical Floorplanning
  • Decide on the physical layout strategy: flat or
    hierarchical?
  • Advantages of a flat implementation are generally
    a smaller die size, and a more straightforward
    approach to clock and power distribution and RC
    generation
  • Advantages of a hierarchical design
  • better runtimes,
  • better ability to control timing within localized
    areas of the design, and concurrent design
  • For hierarchical design, issues
  • physical partitioning of the logic into blocks
  • assignment of the physical locations for the
    block pins
  • timing budgeting,
  • distribution of clocks, power
  • signal bus routing
  • RC generation
  • Tool example: Cadence's Design Planner

20
Floorplanning
  • Give placement initial clues
  • Cells that are interconnected want to be close
    together
  • Take advantage of RTL hierarchy
  • Generate a physical hierarchy
  • Is the RTL hierarchy the best physical hierarchy?
  • Place big blocks on chip (memories)
  • Allow space for power/clk/busses
  • Reduce complexity of placement
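
One common way to quantify "cells that are interconnected want to be close together" is half-perimeter wirelength (HPWL). The sketch below is an assumed illustration, not a method named in the lecture; `hpwl` and `total_wirelength` are hypothetical helper names.

```python
def hpwl(pins):
    """Half-perimeter of the bounding box around one net's pin locations."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(placement, nets):
    """placement: {cell: (x, y)}; nets: iterable of cell-name tuples."""
    return sum(hpwl([placement[cell] for cell in net]) for net in nets)
```

With nets `("a", "b")` and `("b", "c")`, placing the three cells in a row at x = 0, 1, 2 scores lower than scattering them, which is the kind of clue a floorplan gives to placement.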

21
Synthesis
  • Conversion of RTL to gate level netlist
  • Target foundry specific library
  • Timing driven methodology
  • clock information
  • input arrival times, output required times
  • Input driving cells, output loading
  • False paths, multi-cycle paths
  • Interconnect delay is calculated based on a
    wireload model which uses fanout to calculate
    delay
  • Clock parameters (insertion delay, skew, jitter,
    etc.) are assumed to be attainable later in place
    and route
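
A fanout-based wireload estimate can be sketched as follows. The table values, units and the lumped-RC delay formula are assumptions chosen for illustration and do not come from any real library.

```python
# Hypothetical wireload table: every number here is an assumption.
WIRELOAD = {
    "cap_per_unit": 0.0002,   # pF per unit of estimated wire length
    "res_per_unit": 0.00008,  # kOhm per unit of estimated wire length
    "slope": 50.0,            # extra length per fanout beyond the table
    "fanout_length": {1: 40.0, 2: 90.0, 3: 150.0, 4: 220.0},
}

def wire_length(fanout, model=WIRELOAD):
    """Estimated net length from fanout alone (assumes fanout >= 1)."""
    table = model["fanout_length"]
    if fanout in table:
        return table[fanout]
    max_fanout = max(table)
    # Beyond the table, extrapolate linearly with the slope.
    return table[max_fanout] + model["slope"] * (fanout - max_fanout)

def net_delay(fanout, pin_cap, model=WIRELOAD):
    """Lumped RC estimate in ns (kOhm * pF): wire R drives wire C plus sink pins."""
    length = wire_length(fanout, model)
    wire_c = length * model["cap_per_unit"]
    wire_r = length * model["res_per_unit"]
    return wire_r * (wire_c + fanout * pin_cap)
```

The estimate knows nothing about where cells actually sit, which is why it is replaced by extracted RC data after place and route.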

22
Synthesis (cont'd)
  • Hierarchical synthesis
  • Block-by-block basis
  • Minimizes runtimes
  • Functional blocks
  • Tools
  • Cadence Buildgates
  • Synopsys Design Compiler (used for this course)

23
Formal Verification
  • RTL description and gate level netlist are
    compared to verify functional equivalence,
    thereby verifying the synthesis results
  • An emerging technology that supplements the more
    traditional approach of gate level simulation
  • Tools
  • Verplex Tuxedo-lec
  • Design Verifier (Chrysalis), Mentor FormalPro
  • Synopsys Formality (will be used in-class)
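
For intuition only, functional equivalence of two tiny combinational descriptions can be checked by exhaustive enumeration. Production equivalence checkers rely on far more scalable techniques, so treat this as a toy; `rtl_mux`, `gate_mux` and `equivalent` are hypothetical names.

```python
from itertools import product

def rtl_mux(sel, a, b):
    """'RTL' view: behavioral 2-to-1 multiplexer."""
    return a if sel else b

def gate_mux(sel, a, b):
    """'Gate-level' view: the same mux built from AND, OR and NOT."""
    return (sel & a) | ((1 - sel) & b)

def equivalent(f, g, num_inputs):
    """Return (True, None), or (False, counterexample) on the first mismatch."""
    for bits in product((0, 1), repeat=num_inputs):
        if f(*bits) != g(*bits):
            return False, bits
    return True, None
```

Unlike simulation with chosen test vectors, this covers every input combination, but the cost doubles with each added input.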

24
Gate Level Simulation
  • Another method to verify the synthesis process,
    which covers both the functionality and timing
  • Correctness is only as good as the test vectors
    that are used
  • Especially critical for non-synchronous designs,
    verification of false path and multi-cycle path
    constraints
  • Cell timing is included in the simulation models
    and interconnect delay is passed from the
    synthesis run
  • Worst case PVT conditions are used to analyze for
    setup violations, and best case PVT conditions
    are used to analyze for hold violations
  • PVT: Process, Voltage, Temperature
  • Popular tools are Cadence's Verilog-XL or
    NC-Verilog
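
The pairing of worst-case conditions with setup checks and best-case conditions with hold checks can be sketched like this. The derating factors `slow` and `fast` are assumed stand-ins for real PVT corner libraries, and setup and hold times are left underated for simplicity.

```python
def setup_slack(period, clk_to_q, max_path_delay, setup_time, skew=0.0):
    """Positive when data arrives early enough before the next capture edge."""
    return period + skew - (clk_to_q + max_path_delay + setup_time)

def hold_slack(clk_to_q, min_path_delay, hold_time, skew=0.0):
    """Positive when new data does not race through before the hold window ends."""
    return (clk_to_q + min_path_delay) - (hold_time + skew)

def check_corners(period, clk_to_q, max_path, min_path, setup, hold,
                  slow=1.25, fast=0.8):
    """Setup at the slow (worst-case) corner, hold at the fast (best-case) corner."""
    return {
        "setup_slack_worst_case": setup_slack(
            period, clk_to_q * slow, max_path * slow, setup),
        "hold_slack_best_case": hold_slack(
            clk_to_q * fast, min_path * fast, hold),
    }
```

Slow delays are the threat to setup and fast delays are the threat to hold, so each check is run at the corner where it is most likely to fail.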

25
Static Timing Analysis
  • Verifies that design operates at desired
    frequency
  • Implicitly assumes correct timing constraints
    (!), e.g., boundary conditions
  • Timing constraints are similar to those used in
    synthesis
  • Verifies setup and hold times at FF inputs; can
    also check timing from and to PIs and POs; can
    also check point-to-point delay values (with
    blocking of pins, etc.)
  • As with gate-level simulation, both best- and
    worst-case analysis is performed
  • Typically performed on full-chip (not block)
    basis
  • May require modified constraints for inter-block
    issues: multiple clock domains, multi-cycle
    paths, etc.
  • For compatibility with timing-driven layout flow,
    helps to have simple / single set of constraints
  • Other issues: incremental analysis, ...
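
At its core, static timing analysis propagates latest arrival times through an acyclic timing graph and compares them with required times. The sketch below assumes a combinational graph with one fixed delay per edge; `arrival_times` and `slack` are hypothetical names.

```python
def arrival_times(edges, input_arrivals):
    """edges: {(src, dst): delay}. Returns the latest arrival time at each node.

    Assumes the graph is acyclic (combinational logic between registers).
    """
    preds = {}
    for (src, dst), delay in edges.items():
        preds.setdefault(dst, []).append((src, delay))
        preds.setdefault(src, [])
    arrival = {}

    def visit(node):
        if node not in arrival:
            if preds[node]:
                # Latest arrival over all fan-in paths.
                arrival[node] = max(visit(p) + d for p, d in preds[node])
            else:
                # Primary input: use the boundary condition, default 0.
                arrival[node] = input_arrivals.get(node, 0.0)
        return arrival[node]

    for node in preds:
        visit(node)
    return arrival

def slack(arrival, required):
    """Required time minus arrival time for every constrained endpoint."""
    return {node: t - arrival[node] for node, t in required.items()}
```

The `input_arrivals` argument is where the boundary conditions enter, so a wrong constraint yields a confident but wrong answer.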

26
Physical Floorplanning
  • Defines the basic chip layout architecture
  • Define the standard cell rows and I/O placement
    locations
  • Place RAMs and other macro cells
  • Define power bus structures such as power rings
    and stripes
  • Often performed using the standard place and
    route tool
  • Rules of thumb for cell density are used to
    initially calculate design size
  • Popular standalone tools are Cadence's Design
    Planner and Avant!'s Planet
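
The cell-density rule of thumb for an initial size estimate can be written out as a small calculation. The 0.7 default utilization and the treatment of macro area are assumptions for illustration; `estimate_core` is a hypothetical helper.

```python
import math

def estimate_core(cell_area_um2, macro_area_um2=0.0,
                  utilization=0.7, aspect_ratio=1.0):
    """Rough core size from total standard-cell area and a target density.

    utilization: fraction of row area occupied by cells (assumed value).
    aspect_ratio: width divided by height.
    Returns (core_area, width, height) in um^2 and um.
    """
    core_area = cell_area_um2 / utilization + macro_area_um2
    height = math.sqrt(core_area / aspect_ratio)
    width = core_area / height
    return core_area, width, height
```

For example, 500,000 um^2 of cells at 50% utilization needs a 1,000,000 um^2 core, or 1000 um on a side for a square die.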

27
Place and Route
  • Automatically place the standard cells
  • Generate clock trees
  • Add any remaining power bus connections
  • Route clock lines
  • Route signal interconnects
  • Design rule checks on the routes and cell
    placements
  • Timing driven tools
  • Require timing constraints and analysis
    algorithms similar to those used during the
    static timing analysis step
  • Tools
  • Cadence Silicon Ensemble, Synopsys Apollo, Magma
    Blast Fusion

28
RC Extraction
  • Calculates the resistance and capacitance of
    interconnects
  • Based on placement of cells
  • Routing segments
  • Calculates capacitive effects of adjacent
    segments
  • Extracts capacitance between metal segments
  • RC data is transferred to
  • Static timing analysis (back annotation)
  • Gate level simulation
  • Replaces wire load model used in synthesis
  • Tools used
  • Cadence HyperExtract, Magma's Blast Fusion
  • Sequence Columbus, Synopsys Star-RC, Mentor
    X-Calibre
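
A first-order view of what extraction computes: resistance from wire cross-section, capacitance to the layer below and to a neighboring wire from parallel-plate formulas, then an Elmore delay over the resulting RC segments. Fringing fields are ignored, and the copper resistivity and SiO2 dielectric constant are assumed defaults, so this is a sketch rather than what a real extractor does.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m

def wire_rc(length, width, thickness, height, spacing, rho=1.7e-8, k=3.9):
    """Parallel-plate estimate; all dimensions in meters.

    Returns (R in ohms, C to the plane below in F, C to one neighbor in F).
    """
    r = rho * length / (width * thickness)
    c_ground = k * EPS0 * width * length / height
    c_couple = k * EPS0 * thickness * length / spacing
    return r, c_ground, c_couple

def elmore_delay(segments, load_cap=0.0):
    """segments: [(R, C), ...] from driver to sink, each C at the far end of its R."""
    downstream = load_cap + sum(c for _, c in segments)
    delay = 0.0
    for r, c in segments:
        delay += r * downstream  # this R charges everything downstream of it
        downstream -= c
    return delay
```

In this model a wider wire lowers R but raises capacitance to the plane below, while a taller wire lowers R but raises coupling to its neighbors.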

29
Signal Integrity
  • SI
  • Crosstalk issues
  • Inductance
  • Interference
  • Need new tools
  • Calculate and estimate SI
  • New delay models with SI estimates
  • SI aware routing

30
Formal Verification
  • Compares golden netlist to current netlist
  • Logic equivalence
  • Comparison of pre- and post-layout netlist
  • Similar to the formal verification step after
    synthesis, except that clock tree insertions, drive
    strength changes, etc. have been made
  • Buffer insertion or logic optimization may have
    been performed

31
Physical Verification
  • DRC: Design Rule Check
  • Polygon/Layer spacing rules
  • Verifies the design rules (DRC)
  • LVS: Layout Versus Schematic
  • Verifies that layout and netlist are equivalent
    at the transistor level
  • Antenna
  • Manufacturing check for long nets
  • Net can accumulate charge during plasma etch and
    damage gate oxide
  • GDSII
  • Final merge of layout, routing and placement data
    for mask production
  • Example tools
  • Mentor Graphics Calibre (DRC, LVS)
  • Cadence Dracula, Diva
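
A minimum-spacing rule, the simplest kind of DRC check, can be sketched for axis-aligned rectangles on one layer. Real rule decks are far richer; the rectangle encoding `(x1, y1, x2, y2)` and the choice to treat touching or overlapping shapes as one merged shape are assumptions made here.

```python
import math
from itertools import combinations

def spacing(r1, r2):
    """Euclidean gap between two rectangles (x1, y1, x2, y2); 0 if they touch."""
    dx = max(r1[0] - r2[2], r2[0] - r1[2], 0)
    dy = max(r1[1] - r2[3], r2[1] - r1[3], 0)
    return math.hypot(dx, dy)

def spacing_violations(rects, min_spacing):
    """Index pairs of distinct shapes that sit closer than min_spacing."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(rects), 2)
        if 0 < spacing(a, b) < min_spacing
    ]
```

This compares every pair of shapes, which is fine for a handful of rectangles but would not scale to a full layout.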

32
Release to Manufacturing
  • Final edits to the layout are made
  • Metal fill and metal stress relief rules are
    checked
  • Manufacturing information such as scribe lanes,
    seal rings, mask shop data, part numbers, logos
    and pin 1 identification information for assembly
    are also added
  • DRC and LVS are run to verify the correctness of
    the modified database
  • Tapeout documentation is prepared prior to
    release of the GDSII to the foundry
  • Pad location information is prepared, typically
    in a spreadsheet
  • Cadence's Virtuoso is used for custom-manual
    edits of the mask layers
  • Manufacturing steps
  • generation of masks
  • silicon processing
  • wafer testing
  • assembly and packaging
  • manufacturing test

33
Outline
  • Design types and cost / complexity drivers
  • Basic flow
  • On convergence and hierarchy

34
[Flow diagram labels: System Design; Software Design; Logic Design; RTL; Synthesis; File; Timing Analysis; Functional Verification; File; Place/Wire; File; Timing Analysis; Performance Verification; File; Testability Verification; MASKS]

35
Aristo, DAC-2000
TYPICAL DESIGN FLOW
[Flow diagram labels: Design Constraints; IP Blocks; Library; Design Netlist; Gate-Level Verilog; Concurrent Block Partitioning, Clustering, Placement; Early Planning; Gate-Level Optimization; Design Refinement; Gate-Level Place & Route; Top-Level Routing; Chip Assembly; RC Extraction; Timing Analysis]
PREDICTABLE HIERARCHICAL DESIGN CONVERGENCE
36
Monterey, DAC-2000
Design Signoff
Physical Prototyping
GDSII
37
Design Closure
  • Input
  • RT-level HDL technology constraints
  • Output
  • go: recipe for invocation and composition of
    SPR results
  • no go: diagnosis of RTL code problems
  • Logical and physical hierarchies co-evolve
  • spatial top-down coarse placement → physical
    hierarchy
  • logic/timing implementable RTL → logical
    hierarchy
  • limits of human fanout, organizations → always
    have hierarchy
  • Have seen a natural sequence of no-floorplanning,
    physical-floorplanning, RTL-floorplanning... as
    chip complexities increase
  • Details (must construct, predict, ignore,
    eliminate, ...)
  • pin optimizations, interconnect planning,
    hierarchy reconciliations, budgeting mechanisms,
    compatibility with downstream SPR, ...

38
Logical and Physical Hierarchies
  • Two hierarchies: logical/functional, and
    physical
  • (schematic hierarchy also typical in
    structured-custom)
  • RTL design: logical/functional hierarchy
  • provides valuable clues for physical embedding
    datapath structure, timing structure, etc.
  • can be incredibly misleading (e.g., all clock
    buffers in a single hierarchy block)
  • Main issues
  • how to leverage logical/functional hierarchy
    during embedding
  • when to deviate from designer's hierarchy
  • methodology for hierarchy reconciliation
    (buffers, repartitioning / reclustering, etc.)

39
Functional Partitioning
  • Subblocks in A connected with subblocks in B
    result in 600 top level nets.

Source: ReShape
40
Physical Partitioning
Physical partitioning reduced the number of top
level nets from 600 to 0
Source: ReShape
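
The effect described on these two slides can be reproduced on a toy netlist by counting nets whose cells fall in more than one block. The four-cell netlist and block names below are invented for illustration; `top_level_nets` is a hypothetical helper.

```python
def top_level_nets(nets, block_of):
    """Count nets that connect cells assigned to more than one block."""
    return sum(1 for net in nets if len({block_of[cell] for cell in net}) > 1)

# Toy netlist: each subblock of A talks only to its partner in B.
NETS = [("a1", "b1"), ("a2", "b2")]

# Functional partition: group by logical parent (A vs. B).
FUNCTIONAL = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

# Physical partition: group cells that are actually connected.
PHYSICAL = {"a1": "X", "b1": "X", "a2": "Y", "b2": "Y"}
```

Here the functional grouping leaves both nets at the top level, while regrouping by connectivity leaves none.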
41
Unconstrained Placement
42
Floorplanned Placement
43
Thermal Map of Routing Congestion
44
Natural Block Shapes
  • Are not disjoint rectangles, e.g., intersecting
    timing paths all want to be embedded as straight
    paths
  • Traditional chip floorplan dissection into
    rectangles may not be optimum for wirelength and
    timing, but has compensating advantages
    (convenience)

[Figure labels: Blk A; Blk B; 1.0; 0.5, 0.5; 1.0]
45
Physical Hierarchy
  • Physical hierarchy: hierarchical, very
    structured organization of the core layout region
  • Potentially, little relation to high-quality
    (e.g., w.r.t. timing, routability) embedding of
    logic
  • Some obvious exceptions
  • regular structures (memories, PLAs, datapaths)
  • hard IP blocks
  • And, physical hierarchy helps to define and plan
    global interconnects
  • Recent trend: try to avoid artifactual physical
    hierarchy created by top-down recursive
    bipartitioning-based placement approach

46
Convergence and Predictability
  • We seek a predictable, estimatable back end
    (physical implementation after some handoff level
    of design)
  • Predictability: regression models? (e.g.,
    wireload models)
  • Predictability: an enforceable assumption?
    (correct by construction)
  • constant-delay paradigm (logical effort, DEC,
    IBM, Magma, ...)
  • Predictability: fast constructive prediction?
    (also correct by construction)
  • RT-level (Tera Systems), gate-level flat
    full-chip (Silicon Perspective Corp.
    FirstEncounter)
  • Predictability: remove the need for
    predictability?
  • GALS, LIS (global-asynchronous/local-synchronous,
    latency-independent synchronization)
  • protocol- / communication-based system-level
    design
  • Or, just make the loops tighter and easier
    (construct by correction)

47
Planning Technology
  • RTL partitioning
  • understand interaction b/w block definition and
    placement quality
  • recognize and cure a physically challenged logic
    hierarchy
  • Global interconnect planning and optimization
  • symbolic route representations to support block
    plan ECOs
  • Controllable SPR back end (including
    power/clock/scan)
  • Incremental / ECO optimizations, and
    optimizations that are robust under partial or
    imperfect design knowledge
  • Estimators (initial wireload models)
  • to account for resource, topological
    heterogeneity
  • to account for optimizations (placement,
    ripup/reroute, timing)
  • → earliest RTL signoff with detailed PR
    knowledge

48
Extra Slides
49
Sequence, DAC-2000
[Flow diagram labels: 3D Extraction; Prepare Database; Timing Sign-off; Delay Calculation; True-3D Parasitics; RTL Synthesis; Place; Route; Timing Analysis (twice); Interconnect Driven Optimization (twice); Sequence; Driver sizing, topology-based optimization]
50
Cadence, DAC-2000

RTL, chip constraints
Partitioning Log/Phys Mapping
Block Area/Performance Estimation
Block Placement
Inter-block Routing and Buffering
Communication Logic Synthesis
Concurrent Placement, Synthesis and Route of Cells in Blocks
Finalize Route/Extract/Back Ann.
51
Magma, DAC-2000: fixed timing
[Figure labels: 0.6ns (four times); FF]
  • Actively managing wire delay
  • Through automatic sizing (sizing-driven
    placement)
  • Through buffer insertion
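
One way to picture "fixed timing" is the constant-delay idea from logical effort: hold each stage's delay at a target and let the gate sizes follow from the load. The sketch uses the standard stage-delay model d = g*h + p with h = C_out / C_in, in normalized units. It ignores wire capacitance and branching, and it is not a description of any vendor's actual algorithm; `size_for_constant_delay` is a hypothetical name.

```python
def size_for_constant_delay(stages, load_cap, target_delay):
    """Size a gate chain so every stage has the same delay.

    stages: [(logical_effort g, parasitic_delay p), ...] from driver to load.
    Returns the input capacitance of each stage, in the units of load_cap.
    """
    caps = []
    c_out = load_cap
    for g, p in reversed(stages):
        effort = target_delay - p  # the g*h part of d = g*h + p
        if effort <= 0:
            raise ValueError("target delay is below the parasitic delay")
        c_in = g * c_out / effort  # from h = C_out / C_in = effort / g
        caps.append(c_in)
        c_out = c_in  # this gate is the load seen by the previous stage
    return list(reversed(caps))
```

Sizing runs backward from the load: once delay per stage is fixed, a larger load simply demands larger gates upstream.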

52
Interconnect Complexities
  • Interconnect effects play a major role in the
    increasing costs for large hard-block or
    rectilinear-outline based design styles
  • Probabilistic wireload models fail
  • Without new capabilities for soft IP design and
    assembly, interconnect problems will
    significantly impact performance and cost for
    emerging IC technologies

[Figure: Occurrence Rate (Normalized) plot; labels: Local wires, blocks, Global wires. Courtesy Pileggi, MARCO GSRC]
53
Technology Scaling
  • Block sizes cannot grow as rapidly as chip sizes
    since block design becomes increasingly more
    difficult --- each block is a chip design over
    multiple configurations
  • If the blocks are inflexible, the global wiring
    problems begin to dominate all aspects of
    performance quality and system cost

[Figure: Occurrence Rate (Normalized) plot; label: Larger chip with finer feature sizes. Courtesy Pileggi, MARCO GSRC]
54
Soft Blocks
  • With soft, flexible blocks, the system assembly
    can more thoroughly exploit the available
    technology
  • Interconnect problem is controlled via soft
    boundaries for area re-shaping; re-synthesis and
    re-mapping for timing; smart wires; and top-down
    specified block synthesis
  • Cf. Amoeba placement, coloring analysis of
    good placements with respect to original logic
    hierarchy, etc.

[Figure: Occurrence Rate (Normalized) plot; label: Superior timing, power and cost. Courtesy Pileggi, MARCO GSRC]
55
Taxonomy of Planning / Implementation
  • Centered on logic design (logic synthesis
    drives)
  • wire-planning methodology with block/cell global
    placement
  • global routing directives passed forward to chip
    finishing
  • constant-delay methodology may be used to guide
    sizing
  • Synopsys, (Magma)
  • Centered on physical design (layout synthesis
    drives)
  • placement-driven or placement-knowledgeable logic
    synthesis
  • Cadence, Avant!
  • Buffer between logic and layout synthesis (thin
    layer)
  • placement, timing, sizing optimization tools
  • Sequence
  • Centered on SOC, chip-level planning
  • interface synthesis between blocks
  • communications protocol, protocol implementation
    decisions guide logic and physical implementation