Integration of Retiming with Architectural Floorplanning: A New Design Methodology for DSM

1
Integration of Retiming with Architectural
Floorplanning: A New Design Methodology for DSM
  • Abdallah and Bassam Tabbara
  • Profs. R. K. Brayton, A. R. Newton, and K. Keutzer
  • The NexSIS Project

2
Issues in DSM
  • timing at the module level is not an issue
  • timing at the chip level is an issue
  • bigger raw capacity that can be used by
    • replication
    • reuse
  • NTRS projections
    • 1997: 0.25 µm, 4M transistors/cm², 600 pins, 6 layers, 3 cm²
    • 2006: 0.1 µm, 40M transistors/cm², 1500 pins, 8 layers, 10 cm²
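
As a rough illustration of the "bigger raw capacity" point, the projections above can be turned into per-chip transistor budgets. This is only a back-of-the-envelope sketch: it assumes the cm² figure is the die area, so that density times area gives transistors per chip. The variable and function names are illustrative and not part of the original slides.

```python
# Back-of-the-envelope transistor budgets from the NTRS figures on this slide.
# Assumption: the "cm2" figure is the die area, so density * area = transistors per chip.
projections = {
    1997: {"density_m_per_cm2": 4, "area_cm2": 3},
    2006: {"density_m_per_cm2": 40, "area_cm2": 10},
}

def transistors_millions(p):
    """Transistors per chip, in millions."""
    return p["density_m_per_cm2"] * p["area_cm2"]

budget = {year: transistors_millions(p) for year, p in projections.items()}
growth = budget[2006] / budget[1997]

print(budget)   # transistor budget per chip, in millions
print(growth)   # ratio of the 2006 budget to the 1997 budget
```

Under that assumption the budget grows from 12M to 400M transistors per chip, which is the capacity the slide suggests spending on replication and reuse.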

3
Problem Description
  • one-level hierarchy of design
    • minimum number of levels to support reuse
    • midway between flat and two-level
  • placement and wireplanning of 200-2000 modules,
    average size 50k gates
  • dynamic range of module sizes: 1-500k gates
  • types of modules: hard, firm, soft
    • hard: layout
    • firm: gates, aspect ratio
    • soft: RTL
  • large number of nets: 40k-100k
  • pins per module: 10-100

4
Problem Statement
  • We address chip level assembly of predesigned IP
    blocks, each under 100k gates in size, either as
    hard or soft macros, optimizing for performance,
    power and area (emphasis in that order).

5
Goals
  • develop a tool that has an impact in DSM by
    supporting IP reuse
  • handle IP blocks that have constraints and should
    be combined to result in a certain functionality.
    User design constraints include
    • delay
    • power dissipation
    • area
  • generate a final layout within 12-24 hours
    (overnight)
  • complexity of algorithms within O(m^2) to O(m^3),
    where m is the design complexity
  • final result should
    • be within 5-10% of human design (may not be able
      to compare)
    • meet the user constraints if possible, or make
      design suggestions

6
Challenges
  • size issues
    • bigger block sizes, aspect ratios and relative
      sizes
    • number of pins and nets much larger than the
      number of blocks
  • placement issues
    • special design for memories?
    • partitioning is hard, clustering is easy
  • routing issues
    • no channels, point to point
    • buses
    • many metal layers to be assigned
  • timing at the chip level

7
Conventional Flows
  • integration of various steps and tools
    • Logic Synthesis - Physical Design
    • Global - Detailed
  • separation of concerns
    • front end - back end
    • no contract
  • separation entails hundreds of iterations
    • number of iterations can be proportional to
      complexity of design

8
Conventional Flow Architecture

9
New Design Flow
  • minimize design iteration
    • planning at the early stages of the flow
    • support incremental changes
    • need for a proof of convergence
  • introduce retiming into the architectural
    floorplanning stage
    • better handle on timing issues
    • path-based vs. net-based

10
Design Flow Architecture

11
Functional Decomposition
  • provides an entry point for reused IPs
  • RTL may already be well characterized
  • area-delay trade-off as an important performance
    characteristic
  • result is
    • a set of blocks
    • some area-delay trade-off estimates

12
Retiming
  • takes in lower bound constraints
  • creates upper bound constraints
  • reduces area of modules whenever possible
  • can be made refinable and incremental
  • depends on granularity of the representation
  • path-based
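
The slides do not spell out the retiming formulation, so the following is a minimal sketch using the classic Leiserson-Saxe model as a stand-in, not necessarily the project's own algorithm. Modules are graph nodes with a delay, edges carry a register count, and a retiming r moves registers so that each edge u -> v ends up with w(u, v) + r[v] - r[u] registers. The three-module ring, its delays, and the chosen r are made-up illustration values; the code also assumes every cycle keeps at least one register.

```python
def retime(edges, r):
    """Return edge register counts after retiming: w_r(u, v) = w(u, v) + r[v] - r[u]."""
    retimed = {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}
    if any(w < 0 for w in retimed.values()):
        raise ValueError("illegal retiming: an edge would need a negative register count")
    return retimed

def clock_period(delay, edges):
    """Longest register-free path delay: follow only edges that carry zero registers."""
    succ = {u: [] for u in delay}
    for (u, v), w in edges.items():
        if w == 0:
            succ[u].append(v)
    memo = {}

    def longest_from(u):
        # Assumes no zero-register cycle, so this recursion terminates.
        if u not in memo:
            memo[u] = delay[u] + max((longest_from(v) for v in succ[u]), default=0)
        return memo[u]

    return max(longest_from(u) for u in delay)

# Hypothetical ring of three modules: A -> B -> C -> A, two registers on C -> A.
delay = {"A": 1, "B": 4, "C": 4}
edges = {("A", "B"): 0, ("B", "C"): 0, ("C", "A"): 2}

before = clock_period(delay, edges)
after_edges = retime(edges, {"A": 0, "B": 0, "C": 1})  # move one register to B -> C
after = clock_period(delay, after_edges)

print(before, after)
```

In this toy case the register-free path A -> B -> C (delay 9) is cut by the moved register, leaving A -> B (delay 5) as the critical path, while the number of registers around the ring stays at two.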

13
Placement / Routing
  • initial placement/routing step
    • can be a min-cut or any constructive approach
    • has to be fast
    • gives lower bounds on delays between modules
  • placement/routing
    • takes in upper bounds from retiming as
      flexibility on placement
    • re-places modules, resulting in better lower bound
      constraints
    • objective is to reduce total chip area
    • delay is reduced indirectly
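
One simple way to see how a placement yields lower bounds on inter-module delay is sketched here. It assumes rectilinear routing, so no wire between two modules can be shorter than the Manhattan distance between them, and a delay that grows linearly with wire length. The linear model, the `delay_per_mm` parameter, the module names, and the coordinates are all assumptions for illustration; the slides do not specify a delay model.

```python
def manhattan(p, q):
    """Rectilinear distance between two points (x, y)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def delay_lower_bounds(placement, nets, delay_per_mm):
    """Lower bound on delay for each two-module connection.

    Assumes delay is linear in wire length; the shortest possible wire is the
    Manhattan distance between the module positions.
    """
    return {
        (u, v): manhattan(placement[u], placement[v]) * delay_per_mm
        for (u, v) in nets
    }

# Hypothetical placement (module centers in mm) and connections.
placement = {"cpu": (0, 0), "dsp": (4, 3), "mem": (1, 5)}
nets = [("cpu", "dsp"), ("dsp", "mem")]

bounds = delay_lower_bounds(placement, nets, delay_per_mm=0.5)
print(bounds)
```

Bounds of this kind are what the retiming step would take as input; the upper bounds it returns then tell the placer how far each module may move.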

14
Logic Synthesis
  • assumption
    • problems can be solved at the module level
    • predictable for modules of a given size
  • can be run in parallel for the different modules
  • provides better estimates of area-delay
    trade-offs for subsequent iterations

15
Iterations
  • loop between placement and retiming
    • until no further improvements are possible
    • may iterate many times
    • very similar to initial min-cut partitioning and
      low-temperature simulated annealing
    • have to prove some convergence criteria
  • loop between floorplanning/wireplanning and
    layout
    • only a few iterations
    • in each iteration, information is retained through
      area-delay trade-offs
    • also needs a proof of convergence
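
The placement/retiming loop can be sketched as an alternation that stops as soon as a full round no longer improves the cost. This is a minimal sketch, not the project's tool: `placement_step`, `retiming_step`, and `cost` are hypothetical callables, and the toy stand-ins below simply shrink an integer "chip area" toward a floor of 10. Accepting only strict improvements is one simple way to obtain termination when the cost is bounded below and takes finitely many values.

```python
def alternate(placement_step, retiming_step, cost, state, max_iters=1000):
    """Alternate placement and retiming until a round gives no strict improvement.

    Returns the final state and the number of accepted rounds.
    """
    best = cost(state)
    for iteration in range(1, max_iters + 1):
        candidate = retiming_step(placement_step(state))
        if cost(candidate) >= best:      # no further improvement: stop
            return state, iteration - 1
        state, best = candidate, cost(candidate)
    return state, max_iters

# Toy stand-ins (assumptions for illustration): each step reduces the area,
# but never below a floor of 10.
def toy_placement(area):
    return max(area - 3, 10)

def toy_retiming(area):
    return max(area - 1, 10)

final_area, rounds = alternate(toy_placement, toy_retiming, cost=lambda a: a, state=20)
print(final_area, rounds)
```

With these stand-ins the area goes 20, 16, 12, 10 and the loop stops after three accepted rounds, once a fourth round brings no gain.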