EVE: A CAD Tool Providing Placement and Pipelining Assistance for High-Speed FPGA Circuit Designs

1
EVE: A CAD Tool Providing Placement and
Pipelining Assistance for High-Speed FPGA Circuit
Designs
  • William Chow
  • Supervisor: Prof. Jonathan Rose
  • M.A.Sc. Thesis
  • Edward S. Rogers Sr. Department of
    Electrical and Computer Engineering,
    University of Toronto
  • September 28, 2001

2
Motivation
  • Context: High-speed circuit designs, how?
  • Push-button design flow
  • Automatic: design -> circuit
  • 0.18 µm, struggling to achieve 150 MHz
  • Von Herzen's paper [VonH97]
  • 250 MHz FPGA, 0.6 µm in 1997!
  • Useful Event Horizon concept (later)
  • EVE: EVent horizon Editor

3
Xilinx Virtex-E CLB architecture
4
Event Horizon
src CLB
250 MHz → budget 4 ns
Max clock skew 0.1 ns, clock-to-output delay 1.3 ns,
LUT delay + FF setup time 1.5 ns
Max routing delay = 4.0 - 0.1 - 1.3 - 1.5 = 1.1 ns
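
The budget arithmetic on this slide can be reproduced with a short sketch. The function and parameter names are illustrative only and are not taken from EVE.

```python
def max_routing_delay_ns(clock_mhz, clock_skew_ns, clk_to_out_ns, lut_plus_setup_ns):
    """Routing delay left once the fixed delays are subtracted from the clock period."""
    period_ns = 1000.0 / clock_mhz  # 250 MHz corresponds to a 4.0 ns period
    return period_ns - clock_skew_ns - clk_to_out_ns - lut_plus_setup_ns


# Numbers from the slide: 250 MHz target, 0.1 ns skew, 1.3 ns clock-to-output,
# 1.5 ns for LUT delay plus flip-flop setup.
budget = max_routing_delay_ns(250.0, 0.1, 1.3, 1.5)
print(f"{budget:.1f} ns")
```

Any destination CLB reachable within this routing budget lies inside the Event Horizon of the source CLB.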
5
Context
  • Von Herzen's approach
  • Set speed goal
  • Build by construction using Event Horizon concept
  • EVE
  • Start with placed and routed design
  • Increase speed by manually editing small designs

6
Push-button vs Event Horizon Methodology
7
Goals
  • Construct a manual editor focusing on the
    packing/placement/pipelining level of the Event
    Horizon design methodology, so that a designer can
    increase speed more easily
  • Gain insight into better placement and routing
    techniques through extensive manual circuit
    editing experience

8
Design Objectives of EVE
  • Target real FPGA architecture: Xilinx Virtex-E
  • Give full low-level control
  • Give instant performance feedback
  • Assist pipelining
  • (3 and 4) not supported by Xilinx tools

9
EVE: two operating modes
  • Timing Exact Microscopic Placement (TEMP) Mode
  • Change placement and packing of circuit
    components
  • Instant timing feedback
  • Invoke horizon to suggest good placement positions
  • Pipelining Mode
  • Maintain correct functionality during flip-flop
    insertion and flip-flop motion
  • Instant feedback of new circuit speed estimation
  • Flip-flop placement optimizations

10
Horizon
(mode 1)
Definition: Display the effect on critical path delay
should a circuit element be moved to the indicated
positions
  • From Event Horizon
  • Gradient of colours
  • Horizon Radius
  • Where to evaluate
  • Limit computation
  • Display timing
  • -ve: speed improves
  • +ve: speed degrades

Radius 1
11
Timing Exact Microscopic Placement (TEMP) Mode
  • Placement
  • Packing
  • Timing Feedback
  • Horizon
  • More info
  • Better answer

Radius 3
12
Implementation of TEMP mode
  • Instant feedback
  • Internal Timing Analysis
  • Accurate timing
  • Database of real delays
  • Compression by 100x (100 MB -> 1 MB)
  • High Interactivity
  • Integrate tightly with Xilinx backend (FPGA
    Editor) for quick incremental P&R, timing

13
Partial Incremental Timing Analysis
  • Full Timing Analysis (TA)
  • O(n) forward/backward sweep as in [HSC83]
  • Faster: only rebuild the modified portion of the circuit
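
A minimal sketch of a full forward/backward timing sweep is given here for reference. It is a generic textbook formulation, not EVE's code and not the incremental variant; the graph representation (`edges`, `node_delay`) and all names are assumptions made for illustration.

```python
from collections import defaultdict, deque


def timing_analysis(edges, node_delay):
    """edges: list of (src, dst, wire_delay); node_delay: dict node -> delay.
    Returns (arrival, required, critical_delay) for an acyclic graph."""
    succ, pred = defaultdict(list), defaultdict(list)
    indeg = {n: 0 for n in node_delay}
    for u, v, d in edges:
        succ[u].append((v, d))
        pred[v].append((u, d))
        indeg[v] += 1

    # Topological order (Kahn's algorithm); each node and edge is visited once.
    order, queue = [], deque(n for n in node_delay if indeg[n] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # Forward sweep: latest arrival time at each node's output.
    arrival = {}
    for u in order:
        arrival[u] = node_delay[u] + max(
            (arrival[p] + d for p, d in pred[u]), default=0.0)
    critical = max(arrival.values())

    # Backward sweep: latest time each output may settle without
    # lengthening the critical path.
    required = {}
    for u in reversed(order):
        required[u] = min(
            (required[s] - d - node_delay[s] for s, d in succ[u]),
            default=critical)
    return arrival, required, critical


# Toy four-node circuit (made-up delays in ns).
node_delay = {"a": 1.0, "b": 0.5, "c": 0.5, "d": 1.0}
edges = [("a", "b", 0.2), ("a", "c", 0.6), ("b", "d", 0.3), ("c", "d", 0.1)]
arrival, required, critical = timing_analysis(edges, node_delay)
slack = {n: required[n] - arrival[n] for n in node_delay}
```

Nodes with zero slack lie on the critical path. An incremental analysis would rerun the two sweeps only over the fan-out and fan-in cones of the modified elements.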

14
Delay Database
  • Delay Extraction
  • RC models: Elmore, Penfield & Rubinstein
  • Not possible in EVE
  • Extracting Logic Delays
  • Extracting Routing Delays
  • Delay Database Compression

15
Routing Delay Compression
[Figure: pin-to-pin delay (ns) plotted against row and column of source pin. Labels: Symmetric!, Dc(c), Dr(r), Intersect, BRAMs]
16
Backend Integration
  • Existing tools are insufficient
  • Lack ease for incremental flow
  • Full CAD flow is slow
  • Solution: interface with the Xilinx manual editor,
    FPGA Editor
  • Full set of commands for circuit editing
  • Use named pipes on the Windows NT platform

17
Event Horizon Pipelining
(Mode 2)
[Figure: original Event Horizon between src CLB and dst CLB]
  • Pipeline to extend Event Horizon

18
Features of Pipelining Mode
  • Represent circuit for easy pipelining
  • Maintain correct functionality during flip-flop
    insertion and flip-flop motion
  • Instant feedback of new circuit speed estimation
  • Flip-flop placement optimizations

19
Pipelining Mode
(Leave for demo)
20
Baseline Circuits Generation
  • (Push-button flow baseline)
  • (1) Input is VHDL or Verilog
  • (2) Synthesize using Synplify Pro 6.2, freq = s
  • (3) Place and route using Xilinx backend tools
  • (4) Obtain frequency from reports
  • (5) Repeat steps (2) to (4), increasing s by 10% until
    done
  • (6) Using the frequency from (5), do Multi-Pass Place & Route
    (MPPR) for 10 runs, pick the best design 10

(skip!)
21
Results: Using TEMP mode only
(Note: Area is unchanged!)
12.7%!
22
Example: Vision
23
Vision: Before
203.3 MHz
24
Vision: After
224.8 MHz
25
Results: Using both TEMP and pipelining modes
(Note: FF inserted once only)
26
Observations (1)
  • Packing and unpacking slices during placement and
    routing is beneficial

27
Observations (2)
Focusing on improving the k most critical paths is
effective
28
Observations (3)
Partial re-routing of timing-critical regions is
effective
Reroute!
29
Observations (4)
CAD tools should show high-speed routing
resources on the chip to help the user make better
decisions
Fast Routing!
30
Live Demo
31
Conclusion
  • Proposed a high-speed manual circuit design
    methodology
  • Created a manual editor
  • Targets real designs: Xilinx Virtex-E
  • Focus on pipelining, placement, packing
  • Full low-level control
  • Instant exact timing feedback
  • Results: speed increased by up to 19%, avg 12.7%
    for 8 circuits

32
Future Work
  • Synthesis in Event Horizon framework
  • Extend EVE to support Virtex-II, etc.
  • Automate manual optimizations in EVE
  • Make pipelining mode more useful

33
Xilinx Virtex-E Routing Architecture
34
Xilinx HDL-based hardware design flow
35
Flip-Flop Insertability
  • Non-flip-flop-insertable edges
  • Routing edges COUT -> CIN
  • Routing edges F5 -> F5IN
  • Non-routing edges
  • Edges in the transitive fanin of async reset pins

36
Loop Elimination
37
Flip-flop Insertion
38
Flip-Flop Motion
39
Flip-Flop Tracing
40
Flip-Flop Synthesis and Placement
  • Determine placement of flip-flop on most critical
    edge first
  • An area of valid locations is explored, and the
    best location is picked for each FF
  • Real routing delay values (previously stored in
    delay database) are used to evaluate positions

41
Limitations
  • Only synchronous, single clock.
  • No tri-state buffers
  • No Block RAM and LUT-RAM
  • Synthesized without I/O pads
  • XCV100E or below.

42
Software Architecture
43
Extracting Logic Delays
Logic Slice
  • Enumerate all delay paths in a slice, for each
    path
  • Construct a circuit with the path using XDL
  • Query Xilinx Timing Analyzer (TRACE)
  • Record delay in delay-matching table

44
Extracting Routing Delays (1)
45
Extracting Routing Delays (2)
  • Query each pin-to-pin routing delay from delay
    reporter
  • Delay search space too big: can take 1 month to
    produce a database with Manhattan distance < 5,
    which takes up 100 MB of space
  • Solution: a routing delay database
    compression scheme

46
Routing Delay Compression (2)
  • Delay values are grouped into delay groups, each
    with a notation G(S1,P1,S2,P2,X,Y)
  • S1,P1: source slice pin
  • S2,P2: target slice pin
  • X,Y: relative location of target pin to source
    pin
  • An intersect point is identified for capturing
    column and row vectors of delay values to
    describe delays in the whole group

47
Routing Delay Compression (3)
  • Convert delays into integers by a scaling factor
    of 0.02ns
  • Normalize delays using delay at the intersect pt
  • Two 1-D vectors of delay values are collected at the
    intersect
  • Zeroes in the delay vectors are eliminated
  • Duplicates in the delay vectors are eliminated
  • Make use of symmetry of pins: P1 = X and P1 = Y,
    P1 = XQ and P1 = YQ
  • Record data points explicitly when the scheme
    fails
  • Compression Ratio achieved 100x
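
The slides do not give the exact reconstruction rule, so the following is only a sketch under an assumed model: within one delay group, the delay at (x, y) is taken to be the intersect delay plus one offset from the row vector and one from the column vector, with mismatching points stored explicitly. The names `compress` and `lookup` are hypothetical, and the elimination of zeroes and duplicates is not modelled.

```python
STEP_NS = 0.02  # scaling factor from the slide


def compress(grid, ix, iy):
    """grid[y][x]: delay in ns for one delay group; (ix, iy): intersect point.
    Returns (base, row_vec, col_vec, exceptions) in integer units of STEP_NS."""
    q = [[round(d / STEP_NS) for d in row] for row in grid]
    base = q[iy][ix]
    row_vec = [v - base for v in q[iy]]   # offsets along the intersect row
    col_vec = [r[ix] - base for r in q]   # offsets along the intersect column
    exceptions = {}
    for y, row in enumerate(q):
        for x, v in enumerate(row):
            if base + row_vec[x] + col_vec[y] != v:
                exceptions[(x, y)] = v    # scheme fails here: store explicitly
    return base, row_vec, col_vec, exceptions


def lookup(model, x, y):
    """Reconstruct the delay in ns at (x, y) from the compressed model."""
    base, row_vec, col_vec, exceptions = model
    units = exceptions.get((x, y), base + row_vec[x] + col_vec[y])
    return units * STEP_NS


# Made-up 3x3 group; only the corner (2, 2) breaks the additive model.
grid = [[0.50, 0.60, 0.70],
        [0.60, 0.70, 0.80],
        [0.70, 0.80, 1.00]]
model = compress(grid, 0, 0)
```

Under this assumption a group of W x H delays shrinks to roughly W + H integers plus any exceptions, which is where most of the saving would come from.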

48
Pipelining Rule 1
  • Forward Retiming
  • Backward Retiming

49
Pipelining Rule 2
  • Flip-Flop with CE
  • Flip-Flop with SR

50
Loop Detection
51
Flip-Flop Insertion
  • Based on a continuous forward and backward
    sweeping algorithm

FFs at 4->6, 4->7, 5->7, 8->9
52
Flip-Flop Motion
  • Make use of transitive fanin/fanout calculations

FFs at 4->6, 7->9, 8->9
53
Flip-Flop Placement
  • Edges sorted in increasing order of edge slack
  • For each edge, E
  • CLB locations in the neighbourhood of the two end
    points of E are explored inside out until N CLBs
    from the center (Delay database stores delays
    with Manhattan distance < N)
  • If not found, continue to explore for distance >
    N
  • If still not found, report resource error
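
The search order above can be sketched as follows. This is an illustration, not EVE's implementation: edges are represented as pairs of (x, y) CLB coordinates, `cost` stands in for a delay-database lookup, and `max_dist` is an assumed cut-off for the extended search.

```python
def ring(center, dist):
    """All grid points at exactly Manhattan distance `dist` from center."""
    cx, cy = center
    if dist == 0:
        return [(cx, cy)]
    pts = set()
    for dx in range(-dist, dist + 1):
        dy = dist - abs(dx)
        pts.add((cx + dx, cy + dy))
        pts.add((cx + dx, cy - dy))
    return sorted(pts)


def find_ff_location(src, dst, is_free, cost, n, max_dist):
    """Explore inside out around both end points; return best free spot or None."""
    for lo, hi in ((0, n), (n, max_dist + 1)):  # within N first, then beyond
        best = None
        for dist in range(lo, hi):
            for center in (src, dst):
                for loc in ring(center, dist):
                    if is_free(loc) and (best is None or cost(loc) < cost(best)):
                        best = loc
        if best is not None:
            return best
    return None


def place_flip_flops(edges, slack, occupied, cost, n, max_dist):
    """Place one flip-flop per edge, most critical (lowest slack) edge first."""
    occupied = set(occupied)
    placement = {}
    for edge in sorted(edges, key=lambda e: slack[e]):
        src, dst = edge
        loc = find_ff_location(src, dst, lambda p: p not in occupied,
                               lambda p: cost(edge, p), n, max_dist)
        if loc is None:
            raise RuntimeError(f"resource error: no location for edge {edge}")
        occupied.add(loc)
        placement[edge] = loc
    return placement


def manhattan_cost(edge, p):
    """Toy cost: total Manhattan distance from p to both end points."""
    (sx, sy), (dx, dy) = edge
    return abs(p[0] - sx) + abs(p[1] - sy) + abs(p[0] - dx) + abs(p[1] - dy)


edges = [((0, 0), (4, 0)), ((0, 0), (0, 2))]
slack = {edges[0]: -0.5, edges[1]: 0.3}
placement = place_flip_flops(edges, slack, {(0, 0), (4, 0), (0, 2)},
                             manhattan_cost, n=3, max_dist=6)
```

Processing edges in order of increasing slack means the most critical edge gets first choice of the free locations.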

54
Period Estimation Case 1
55
Period Estimation Case 2
56
Horizon Calculation
  • Evaluate multiple placement alternatives
  • Governed by Horizon Radius
  • First check if move is valid
  • Build temporary circuit
  • Full timing analysis to obtain cct speed
  • Display the horizon
  • Speed: 2 s to display a horizon of radius 3 on a 1 GHz
    Pentium III
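
The steps above can be sketched as a loop over candidate positions. This is a hedged illustration: the square neighbourhood used for the radius, and the callbacks `is_valid_move` and `critical_delay_if_moved` (standing in for building the temporary circuit and running full timing analysis), are assumptions rather than details from the slides.

```python
def horizon(element, position, radius, is_valid_move,
            critical_delay_if_moved, current_delay):
    """Map each valid candidate position within `radius` to the change in
    critical path delay (ns). Negative means faster, positive means slower."""
    x0, y0 = position
    result = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dx == 0 and dy == 0:
                continue  # the element's current position
            target = (x0 + dx, y0 + dy)
            if not is_valid_move(element, target):  # first check if move is valid
                continue
            # Stand-in for "build temporary circuit" + "full timing analysis".
            delay = critical_delay_if_moved(element, target)
            result[target] = delay - current_delay
    return result


# Toy model: a 5x5 grid where delay grows with distance from (2, 2).
def toy_delay(element, pos):
    return 5.0 + 0.1 * (abs(pos[0] - 2) + abs(pos[1] - 2))


def on_grid(element, pos):
    return 0 <= pos[0] <= 4 and 0 <= pos[1] <= 4


deltas = horizon("lut_a", (4, 4), 1, on_grid, toy_delay, toy_delay("lut_a", (4, 4)))
best_move = min(deltas, key=deltas.get)
```

With a square neighbourhood, radius 1 gives at most 8 candidates and radius 3 at most 48, each needing one timing analysis, which is why the radius limits computation.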

57
Backend Integration Summary