Mesh of Tree: Unifying Mesh and MFPGA for Better Device Performances

1
Mesh of Tree: Unifying Mesh and MFPGA for Better Device Performances
Zied Marrakchi, Hayder Mrabet, Christian Masson and Habib Mehrez
Zied.marrakchi_at_lip6.fr
LIP6, University of Paris VI
2
Today's FPGA challenge
Performance
  • Interconnect flexibility
  • Granularity
  • Buffer/pass transistor
  • Transistor Sizing
  • Full Custom
  • Channel Width
  • Wire Length
  • Fanout
  • Threshold Voltage (ULL, LL, HS)

Cost
  • Process Steps
  • -> Layout generator
  • Manufacturability

Power
  • Dynamic vs Static consumption
  • Power Management Techniques
  • - Sleep Transistor / Clock Gating
  • - Multiple VDD
  • Low Leakage Technologies (SOI)

Area
  • Switches area / Logic area

3
Outline
  • Introduction
  • Clustered Mesh architectures
  • Mesh of Tree architecture
  • - Tree-based interconnect
  • - Configuration flow
  • Experimental results
  • Architecture improvement
  • Conclusion

4
FPGA Area distribution
Area Distribution
80% of the area is occupied by the programmable interconnect: wires, switches and configuration bits
5
Modern FPGA (cluster-based FPGA)
Improve area, speed, power and layout
  • LAB / CLB / Cluster size?
  • LUT size?
  • Local interconnect topology?
[Figure: cluster-based FPGA tile. Labels: H Channel, V Channel, Cluster, LB, Local Interconnect]
Ref: E. Ahmed, J. Rose, IEEE Trans. VLSI, 2004
6
Cluster-based Mesh FPGA
  • VPR-Style (University of Toronto)
  • Local interconnect: full crossbar
  • Advantages
  • - Full routability
  • - Separation of internal and external levels
  • - Routing respects hierarchy
  • Disadvantages
  • - Area overhead (cluster size 3 to 10)
  • STRATIX-Style (ALTERA)
  • Local interconnect: depopulated
  • Advantages
  • - 50% area reduction
  • Disadvantages
  • - Routing destroys locality
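
To make the area trade-off between the two local interconnect styles concrete, here is a rough sketch that counts the crosspoints of one cluster's local interconnect. It assumes a VPR-style cluster in which every LUT input can select any cluster input or any LB output (full crossbar), and it models depopulation simply as keeping a fraction of those crosspoints. The function names, the 4-input LUT, the cluster-input rule I = K(N+1)/2 and the 0.5 population are illustrative assumptions, not figures taken from the slides.

```python
def cluster_inputs(lut_size: int, cluster_size: int) -> int:
    """Assumed cluster input count: I = K * (N + 1) / 2, rounded down."""
    return lut_size * (cluster_size + 1) // 2


def local_crossbar_switches(lut_size: int, cluster_size: int,
                            population: float = 1.0) -> int:
    """Crosspoints feeding the LUT inputs of one cluster.

    Sources: cluster inputs plus one feedback per LB output.
    Sinks:   every LUT input pin in the cluster.
    population = 1.0 models a full crossbar; lower values model a
    depopulated one (hypothetical uniform depopulation).
    """
    sources = cluster_inputs(lut_size, cluster_size) + cluster_size
    sinks = lut_size * cluster_size
    return round(sources * sinks * population)


if __name__ == "__main__":
    for n in (3, 4, 8, 10):
        full = local_crossbar_switches(4, n)
        half = local_crossbar_switches(4, n, population=0.5)
        print(f"cluster size {n:2d}: full={full:4d}  half-populated={half:4d}")
```

Under these assumptions the full crossbar grows roughly quadratically with cluster size, which is one way to read the "area overhead" drawback listed for the VPR-style cluster.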

7
Future Interconnection Structure for FPGA
8
Mesh of Tree Architecture
Mesh cluster size: 256 LBs
Reference: Mrabet, Marrakchi and Mehrez, "Performances improvement of FPGA using Novel Multilevel hierarchical interconnection structure", ICCAD 2006
9
Tree: The Downward Network
Cluster in level 0: depopulated interconnect
[Figure: level-0 cluster. Labels: MSB (Mini Switch Box), LB (Logic Block)]
10
Tree: The Downward Network
Butterfly Fat Tree
[Figure: Butterfly Fat Tree with 32 MSB labels and 16 LB labels]
11
Tree: The Upward Network
Connects LB outputs to MSBs
  • Each LB is connected to one MSB in each level
  • -> From one level, an LB has only one path to reach a destination
  • -> The number of MSBs in each level is equal to the number of LBs
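
The counting rule on this slide (as many MSBs per level as there are LBs) can be turned into a small sizing sketch. It assumes a complete tree in which the number of LBs is an exact power of the tree arity; the arity of 4 used in the example and all function names are assumptions made for illustration.

```python
def tree_levels(num_lbs: int, arity: int) -> int:
    """Number of Tree levels such that arity ** levels == num_lbs."""
    levels, span = 0, 1
    while span < num_lbs:
        span *= arity
        levels += 1
    if span != num_lbs:
        raise ValueError("num_lbs must be an exact power of arity")
    return levels


def msbs_per_level(num_lbs: int) -> int:
    """Per the slide: each level holds as many MSBs as there are LBs."""
    return num_lbs


def total_msbs(num_lbs: int, arity: int) -> int:
    """MSBs summed over all levels of the tree."""
    return tree_levels(num_lbs, arity) * msbs_per_level(num_lbs)


if __name__ == "__main__":
    print(tree_levels(16, 4), total_msbs(16, 4))
    print(tree_levels(256, 4), total_msbs(256, 4))
```

With the assumed arity of 4, a 16-LB tree has 2 levels and 32 MSBs, which is consistent with the 32 MSB and 16 LB labels of the Butterfly Fat Tree figure.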
12
Tree: The Upward Network
Ascending circular shift scheme
  • From two levels, an LB has two different paths to reach its destination
Interconnect predictability
  • Upward network: source position and level -> an MSB
  • Downward network: an MSB and destination -> routing path
13
Tree: Connection with outside
[Figure: cluster of four LBs with its MSB and local interconnect, connected to input pads and output pads]
14
Mesh and Tree levels boundary
Routing separation; boundary: Mesh cluster pins correspond to Tree pads
  • Bottom-up routing: first Tree routing, second Mesh routing
  • Top-down routing: first Mesh routing, second Tree routing

15
Mesh of Tree flow: Bottom-up approach
[Flow diagram; boxes repeated across parallel branches are listed once]
  • Initial Netlist
  • Mesh Partitioning -> Sub-netlists
  • Partitioning
  • Detailed Placement
  • Routing
  • Pins re-ordering
  • VPR Place and Route
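
The first step of this flow, splitting the initial netlist into per-cluster sub-netlists, can be sketched as follows. Nets whose blocks all land in one Mesh cluster go to that cluster's sub-netlist (handled at the Tree level); nets that span clusters are external and are left for the Mesh level. The data layout (plain dictionaries) and the function name are hypothetical, and the partitioner that produces `cluster_of` is not shown.

```python
from collections import defaultdict


def split_netlist(nets, cluster_of):
    """Split nets into per-cluster sub-netlists and external nets.

    nets:       dict mapping net name -> list of block names on that net
    cluster_of: dict mapping block name -> Mesh cluster id
                (assumed to be the output of the Mesh partitioning step)
    """
    sub_netlists = defaultdict(dict)
    external = {}
    for name, blocks in nets.items():
        clusters = {cluster_of[b] for b in blocks}
        if len(clusters) == 1:
            # Every block of the net sits in the same cluster: internal net.
            sub_netlists[clusters.pop()][name] = blocks
        else:
            # The net crosses a cluster boundary: routed at the Mesh level.
            external[name] = blocks
    return dict(sub_netlists), external


if __name__ == "__main__":
    nets = {"n1": ["a", "b"], "n2": ["b", "c"], "n3": ["c", "d"]}
    cluster_of = {"a": 0, "b": 0, "c": 1, "d": 1}
    subs, ext = split_netlist(nets, cluster_of)
    print(subs)
    print(ext)
```

Because the sub-netlists share no nets, each one can be partitioned, placed and routed independently of the others.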
16
VPR-Style Mesh vs Mesh of Tree
  • Gain of 15% in number of switches
  • Run time reduced by 2 times: external nets reduction, parallel execution
17
Mesh of Tree interconnect distribution
The required inter-cluster interconnect is about 50% of the total interconnect.
The VPR inter-cluster placer/router cannot properly handle the pin assignment constraint.
  • Inter-cluster interconnect reduction
  • - Tree-based interconnect improvement: Upward Network
  • - Top-down routing approach: more flexibility in the Mesh interconnect level

18
Tree: Upward network improvement
  • Adding Upward MSBs (UMSBs)
  • An LB output can reach all Downward MSBs (DMSBs)
  • LB and pad positions are equivalent inside their owner cluster
  • No detailed placement: fewer placement constraints

19
Tree-based interconnect population based on Rent's parameter
Interconnect population levels follow Rent's parameter (cluster signals bandwidth)
Level 0, Rent's parameter 0.79:
  • Inputs: 16 -> 12; DMSBs: 16 -> 12 (level 1)
  • Outputs: 4 -> 3; UMSBs: 4 -> 3 (level 1)
  • 19% switches reduction
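
The 16 -> 12 and 4 -> 3 figures can be reproduced with Rent's rule, pins = t · B^p, where B is the number of blocks in the cluster, t the pins per block and p the Rent parameter. The sketch below assumes a level-0 cluster of 4 LBs, each with 4 inputs and 1 output; that cluster shape is inferred from the 16 and 4 totals and is an assumption, as is the function name.

```python
def rent_pins(pins_per_lb: int, lbs: int, rent: float) -> int:
    """Rent's rule estimate of cluster pins: pins_per_lb * lbs ** rent, rounded."""
    return round(pins_per_lb * lbs ** rent)


if __name__ == "__main__":
    LBS = 4      # assumed LBs per level-0 cluster
    RENT = 0.79  # Rent's parameter quoted on the slide

    # Fully populated cluster (Rent parameter 1.0) vs depopulated cluster.
    print("inputs: ", rent_pins(4, LBS, 1.0), "->", rent_pins(4, LBS, RENT))
    print("outputs:", rent_pins(1, LBS, 1.0), "->", rent_pins(1, LBS, RENT))
```

Since 4^0.79 is just under 3, the input count rounds to 12 and the output count to 3, matching the slide under the stated assumptions.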

20
Mesh of Tree flow: Top-down approach
21
Inter-cluster interconnect: Top-down vs Bottom-up
  • Bottom-up connection block: Fc_in = 1, Fc_out = 1, large channel width
  • Top-down connection block: Fc_in = 0.5, Fc_out = 0.25, reduced channel width
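
To show how the Fc values affect switch count, here is a minimal sketch that counts connection-block switches for one cluster. It uses the usual reading of Fc as the fraction of channel tracks each pin can connect to. The pin counts and the channel width of 40 are made-up example values, and the function name is hypothetical; the channel width is held constant here only to isolate the effect of Fc.

```python
import math


def connection_block_switches(num_inputs: int, num_outputs: int,
                              fc_in: float, fc_out: float,
                              channel_width: int) -> int:
    """Switches between one cluster's pins and an adjacent routing channel.

    Each input pin connects to ceil(fc_in * W) tracks and each output pin
    to ceil(fc_out * W) tracks, where W is the channel width.
    """
    per_input = math.ceil(fc_in * channel_width)
    per_output = math.ceil(fc_out * channel_width)
    return num_inputs * per_input + num_outputs * per_output


if __name__ == "__main__":
    W = 40  # example channel width (assumption)
    bottom_up = connection_block_switches(16, 8, 1.0, 1.0, W)
    top_down = connection_block_switches(16, 8, 0.5, 0.25, W)
    print("bottom-up:", bottom_up)
    print("top-down: ", top_down)
```

In the slides the top-down flow also reduces the channel width, so the real saving would combine both effects.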
22
Mesh of Tree interconnect distribution: Top-down vs Bottom-up
  • Bottom-up: 48% external interconnect
  • Top-down: 22% external interconnect
  • 14% switches reduction
23
Conclusion
Mesh of Tree architecture
  • Good density: 24% switches reduction compared to VPR clustered Mesh
  • Good physical scalability
  • - Generate Tree-based layout (small size)
  • - Generate the total architecture by abutment
Current work
  • Full custom layout of the Tree-based architecture (specific cells library)
24
Q&A
Thank you