Title: Design, Synthesis and Evaluation of Heterogeneous FPGA with Mixed LUTs and Macro-Gates
1. Design, Synthesis and Evaluation of Heterogeneous FPGA with Mixed LUTs and Macro-Gates
- Yu Hu (1), Satyaki Das (2), Steve Trimberger (2), and Lei He (1)
- (1) Electrical Engineering Dept., UCLA
- (2) Research Lab, Xilinx Inc.
- Presented by Yu Hu
- Address comments to lhe_at_ee.ucla.edu
2. Heterogeneous FPGA with Macro-Gates
- There is a trade-off between programmability and cost (performance, area, power, etc.).
- Xilinx Virtex-4 (V4) benefits from small gates (MUX2, XOR2) built into its SLICEs.
- Seek a small set of wider logic functions (macro-gates) to replace a large portion of LUTs.
- Goal: reduce logic area and delay.
- What is missing?
- Design: What should be inside these macro-gates?
- CAD: Flexible synthesis tools are needed to evaluate the architecture!
3. Selection of Logic Functions for Macro-Gates
[Figure: selection flow chart; example truth tables: 0000001000000000, 0000010000000000, 0000100000000000, 0001000000000000, 0010000000000000, 0100000000000000]
- Map with LUT-N
- Extract logic functions
- Generate utilization NPN diagram
- Calculate score for logic functions
- Rank logic functions
- Best function abcabc
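The selection flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `npn_canon` and `rank_functions` are hypothetical, the canonical form is simply the numerically smallest truth table in the NPN class (found by brute force, practical only for small N), and the score is a plain occurrence count, since the slide does not give the actual scoring formula.

```python
from collections import Counter
from itertools import permutations, product


def npn_canon(tt: int, n: int) -> int:
    """Smallest truth table (as an int) in the NPN class of `tt` with n inputs.

    Brute force over all input permutations, input negations and output
    negation. Hypothetical helper, not the tool used by the authors.
    """
    size = 1 << n
    mask = (1 << size) - 1
    best = None
    for perm in permutations(range(n)):
        for neg in product((0, 1), repeat=n):
            out = 0
            for m in range(size):
                # New input i drives old input perm[i], optionally inverted.
                src = 0
                for i in range(n):
                    src |= (((m >> i) & 1) ^ neg[i]) << perm[i]
                if (tt >> src) & 1:
                    out |= 1 << m
            for cand in (out, out ^ mask):  # with and without output negation
                if best is None or cand < best:
                    best = cand
    return best


def rank_functions(truth_tables, n):
    """Rank NPN classes by how often they occur (assumed score: raw count)."""
    counts = Counter(npn_canon(tt, n) for tt in truth_tables)
    return counts.most_common()


# Two of the one-hot 16-bit truth tables shown on the slide (4-input functions).
slide_tts = [int(s, 2) for s in ("0000001000000000", "0000010000000000")]
print(rank_functions(slide_tts, 4))  # both fall into the same class: [(1, 2)]
```

Every one-hot truth table is a 4-input AND with some inputs inverted, so all of them collapse into a single NPN class. That is the property that lets a small set of fixed gates cover many extracted functions.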
4. Proposed Macro-Gates and FPGA Architecture
- For IWLS05 benchmarks, the following four 6-input functions have the highest ranks:
- GI1: a b c d e f (AND-6)
- GI2: a b c b c f b c d b c e (MUX-4)
- GI3: a b' c d' e b c e f d e f
- GI4: a b' a' c d' b' c' e' f
- [Figure: architecture of the proposed macro-gate and FPGA slice]
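As a rough illustration of what a fixed 6-input macro-gate computes, the sketch below builds 64-bit truth tables for AND-6 and for a MUX-4. The MUX-4 wiring is an assumption: the slide's product terms suggest selects b, c and data inputs a, f, d, e, but the select polarities did not survive, so the mapping used here (a for b'c', f for b'c, d for bc', e for bc) is hypothetical. The helper names are also illustrative.

```python
def truth_table(fn, n=6):
    """Pack fn's outputs over all 2**n input combinations into one integer."""
    tt = 0
    for m in range(1 << n):
        bits = [(m >> i) & 1 for i in range(n)]  # bits[0] = a, ..., bits[5] = f
        if fn(*bits):
            tt |= 1 << m
    return tt


def and6(a, b, c, d, e, f):
    return a & b & c & d & e & f


def mux4(a, b, c, d, e, f):
    # Assumed wiring: (b, c) select one of the data inputs a, f, d, e.
    return (a, f, d, e)[(b << 1) | c]


def support_size(tt, n=6):
    """Number of inputs the function actually depends on."""
    return sum(
        any(((tt >> m) & 1) != ((tt >> (m ^ (1 << i))) & 1) for m in range(1 << n))
        for i in range(n)
    )


for name, fn in (("AND-6", and6), ("MUX-4", mux4)):
    tt = truth_table(fn)
    print(name, f"{tt:016x}", "support =", support_size(tt))
```

Both functions depend on all six inputs, so a LUT-based implementation would need a full LUT6 with 64 configuration bits, whereas a macro-gate implements the function with fixed logic.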
5. Mapping: Resource Utilization Balancer
- The available resources of each logic type in an FPGA are fixed.
- The technology mapper should optimize the logic resource utilization rate to minimize the packing area.
- A binary integer linear program is used to balance logic resource utilization while preserving timing.
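A toy version of the balancing problem is sketched below. It is not the formulation from the paper: the binary variable per node (LUT vs. macro-gate), the per-node timing flags, the slice capacities `luts_per_slice` and `mgs_per_slice`, and the function name `balance` are all assumptions made for illustration, and exhaustive enumeration stands in for a real ILP solver.

```python
from itertools import product
from math import ceil


def balance(nodes, luts_per_slice, mgs_per_slice):
    """Pick LUT (0) or macro-gate (1) per node to minimize the slice count.

    nodes: list of (lut_ok, mg_ok) flags saying whether each implementation
    of that node still meets timing (assumed to be precomputed).
    Returns (slices, assignment) or None if no assignment meets timing.
    """
    best = None
    for x in product((0, 1), repeat=len(nodes)):
        # Timing constraint: only timing-feasible implementations are allowed.
        if any((xi == 1 and not mg_ok) or (xi == 0 and not lut_ok)
               for xi, (lut_ok, mg_ok) in zip(x, nodes)):
            continue
        n_mg = sum(x)
        n_lut = len(x) - n_mg
        # Slices needed is set by whichever resource type runs out first.
        slices = max(ceil(n_lut / luts_per_slice), ceil(n_mg / mgs_per_slice))
        if best is None or slices < best[0]:
            best = (slices, x)
    return best


# Six nodes that meet timing either way; slices with 2 LUTs and 1 macro-gate.
print(balance([(True, True)] * 6, luts_per_slice=2, mgs_per_slice=1)[0])  # 2
```

With all six nodes mapped to LUTs the design needs three slices. Moving two nodes to macro-gates matches the 2:1 resource ratio of the assumed slice and brings the count down to two.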
6. Mapping: SAT-Based Slice Packing
- Formulate the slice packing problem as a localized place-and-route validation problem, which is solved by SAT.
- Exclusivity constraint: (X@A) ? (X@B)
- Presence constraint: (X@A) ? (X@B)
- Input/output constraint: X@A ? U5@N10
- Routing constraint: G0 ?out ? U5@N10) ? U5@N12
- More constraints are given in the paper.
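The flavor of the SAT encoding can be shown with a small, self-contained sketch. The reading of the constraints is an assumption, because the logical operators did not survive on the slide: presence is taken as "block X sits at A or at B", exclusivity as "not at both". A site-capacity clause is added here so that the toy problem is non-trivial. The function names and the brute-force solver are illustrative only; a real flow would hand the clauses to a SAT solver.

```python
from itertools import product


def packing_cnf(blocks, sites):
    """Build CNF clauses (DIMACS-style signed ints) for a toy packing problem."""
    pairs = [(b, s) for b in blocks for s in sites]
    var = {p: i + 1 for i, p in enumerate(pairs)}  # var[b, s]: block b at site s
    clauses = []
    for b in blocks:
        clauses.append([var[b, s] for s in sites])        # presence
        for i, s in enumerate(sites):
            for t in sites[i + 1:]:
                clauses.append([-var[b, s], -var[b, t]])  # exclusivity
    for s in sites:
        for i, b in enumerate(blocks):
            for c in blocks[i + 1:]:
                clauses.append([-var[b, s], -var[c, s]])  # capacity (assumed)
    return var, clauses


def solve_cnf(num_vars, clauses):
    """Exhaustive SAT check; returns {var: bool} or None if unsatisfiable."""
    for bits in product((False, True), repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses):
            return {v + 1: bits[v] for v in range(num_vars)}
    return None


var, clauses = packing_cnf(["X", "Y"], ["A", "B"])
model = solve_cnf(len(var), clauses)
print(sorted(p for p, v in var.items() if model[v]))  # one legal placement
```

If three blocks are requested for the same two sites, `solve_cnf` returns `None`, which corresponds to the packer rejecting that candidate slice.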
7. Overall Flow for Technology Mapping
[Figure: flow chart with the following steps]
- Area weight setting
- Cut-based mapping
- Area-balance trade-off? (decision with Y / N branches)
- LUT-MG ratio balancer
- Packing
8. Architecture Evaluation
- Four architectures are compared: LUT4, LUT4 with macro-gates, LUT6, and LUT6 with macro-gates.
- Power and delay model: based on transistor count.
- For IWLS05 benchmarks, mixing LUTs and macro-gates reduces delay and device area.