1A Design Environment for High Throughput, Low Power
Dedicated Signal Processing Systems
W. Rhett Davis, Ning Zhang, Kevin Camera, Fred Chen, Dejan Markovic,
Nathan Chan, Borivoje Nikolic, Robert W. Brodersen
University of California at Berkeley
Berkeley Wireless Research Center
2Outline of Presentation
- Benefits of Direct Mapping
- Standard DSP-ASIC Flow
- Chip-in-a-Day Flow
- Enabling Factors
- Design Examples
- Conclusion
3Direct Mapping Minimizes Power
[Figure axes: Flexibility, Efficiency]
100x-1000x Difference in Power
4Direct Mapping for Wireless Algorithms
- Low processing rates (wireless baseband 25 Msps)
- High Complexity
- Low Power
$P \propto f \cdot C \cdot V_{DD}^2$
Use low VDD / parallelism
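A minimal sketch of why lowering VDD and adding parallelism reduces power, using the switching-power relation P = f * C * VDD^2. All numbers (100 MHz, 1 nF, 2.5 V, 4 parallel units at 1.0 V) are hypothetical values chosen for illustration, not figures from this presentation.

```python
def dynamic_power(f_hz, c_farads, vdd_volts):
    """Switching power in watts: P = f * C * VDD^2."""
    return f_hz * c_farads * vdd_volts ** 2

# Hypothetical baseline: one datapath clocked at 100 MHz from a 2.5 V supply.
baseline = dynamic_power(100e6, 1e-9, 2.5)

# Hypothetical parallel version with the same throughput:
# n copies of the datapath (n times the switched capacitance),
# each clocked n times slower, which is what allows the lower supply.
n = 4
parallel = dynamic_power(100e6 / n, n * 1e-9, 1.0)

ratio = baseline / parallel
print(f"baseline={baseline:.3f} W, parallel={parallel:.3f} W, ratio={ratio:.2f}")
```

The f * C product is unchanged by the parallel arrangement, so the whole saving in this sketch comes from the quadratic VDD term.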
5Standard DSP-ASIC Design Flow
Problems
- Separation of engineering teams makes exploration hard
- Uncontrolled looping when pipeline stalls
- Feedback to system designer is an aberration, but
should be encouraged
Prohibitively Long Design Time for Direct Mapped
Architectures
6Direct Mapping Design Flow
- Encourages iterations of layout
- Controls looping
- Reduces the flow to a single phase
- Depends on fast automation
8Capturing Design Decisions
- Categories
- Function - basic input-output behavior
- Signal - physical signals and types
- Circuit - transistors
- Floorplan - physical positions
How to get layout and performance estimates in a
day?
9Automated Design Flow
- New Software
- Generation of netlists from Simulink
- Merging of floorplan from last iteration
- Automatic routing and performance analysis
- Automation of flow as a dependency graph (UNIX
MAKE program)
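The idea of running the flow as a dependency graph can be sketched as follows. The step names and their dependencies are assumptions made for illustration; the presentation only states that the flow is automated with the UNIX make program. The sketch uses Python's standard graphlib module instead of make.

```python
from graphlib import TopologicalSorter

# Hypothetical flow: each step maps to the set of things it depends on.
FLOW = {
    "netlist": {"simulink_model"},
    "floorplan_merge": {"netlist", "previous_floorplan"},
    "route": {"floorplan_merge"},
    "performance_analysis": {"route"},
}

def build_order(flow):
    """Return every node in an order where dependencies come first."""
    return list(TopologicalSorter(flow).static_order())

def stale_steps(flow, changed):
    """Return the steps to rerun, in order, after `changed` inputs are edited."""
    dirty = set(changed)
    rerun = []
    for step in build_order(flow):
        # A step is stale if any of its dependencies is dirty.
        if step in flow and flow[step] & dirty:
            dirty.add(step)
            rerun.append(step)
    return rerun

print(stale_steps(FLOW, {"previous_floorplan"}))
```

As with make, editing only the floorplan skips netlist generation and reruns just the steps downstream of the merge.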
10Why Simulink?
- Simulink is an easy sell to algorithm developers
- Closely integrated with popular system design tool Matlab
- Successfully models digital and analog circuits
11Simulink Models Datapath Logic
- Dataflow primitives (parallelism)
- Fixed-Point Types
- Completely specify function and signal decisions
- No need for RTL
Multiply / Accumulate
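A rough sketch of what a fixed-point multiply/accumulate specifies, written in Python rather than Simulink. The word lengths (7 fractional bits, 20-bit accumulator) and the saturating overflow behavior are assumptions for illustration; the slide does not give the types used.

```python
def to_fixed(x, frac_bits):
    """Quantize a real value to a signed integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))

def saturate(value, total_bits):
    """Clamp to the range of a signed two's-complement word."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, value))

def mac(acc, a, b, acc_bits):
    """One multiply/accumulate step with a saturating accumulator."""
    return saturate(acc + a * b, acc_bits)

# 0.5 * 0.25 with 7 fractional bits per operand; the product has 14.
a = to_fixed(0.5, 7)
b = to_fixed(0.25, 7)
acc = mac(0, a, b, acc_bits=20)
print(acc, acc / (1 << 14))
```

Because the word lengths and overflow rule are explicit, the function and signal decisions are fixed without a separate RTL description.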
12Stateflow Models Control Logic
- Extended finite state-machine editor
- Co-simulation with Simulink
- New Software: Stateflow-VHDL translator
- More complete capture of function decisions
Address Generator / MAC Reset
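A minimal sketch of the kind of control logic named on this slide: a state machine that steps through tap addresses and asserts a MAC reset at the start of each pass. The class name, the modulo-counter behavior, and the reset-at-address-zero convention are assumptions; the actual Stateflow chart is not reproduced in the text.

```python
class AddressGenerator:
    """Hypothetical FSM: counts 0..num_taps-1 and flags the first cycle."""

    def __init__(self, num_taps):
        self.num_taps = num_taps
        self.addr = 0  # state register

    def step(self):
        """Advance one clock cycle; return (address, mac_reset) for that cycle."""
        addr = self.addr
        mac_reset = addr == 0
        self.addr = (addr + 1) % self.num_taps
        return addr, mac_reset

gen = AddressGenerator(num_taps=4)
trace = [gen.step() for _ in range(5)]
print(trace)
```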
13Specifying Circuit Decisions
Time-Multiplexed FIR Filter
- Macro choices embedded in Simulink
- Cross-check simulations required
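The cross-check idea can be sketched as a comparison between a reference FIR model and a time-multiplexed version that reuses a single multiply/accumulate unit. Both functions and the test vectors below are illustrative assumptions, not the models from the presentation.

```python
def fir_direct(samples, coeffs):
    """Reference model: all taps evaluated at once for each input sample."""
    history = [0] * len(coeffs)
    out = []
    for x in samples:
        history = [x] + history[:-1]
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out

def fir_time_multiplexed(samples, coeffs):
    """Same filter using one MAC, visiting one tap per simulated cycle."""
    history = [0] * len(coeffs)
    out = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0  # MAC reset at the start of each output sample
        for addr in range(len(coeffs)):
            acc += coeffs[addr] * history[addr]
        out.append(acc)
    return out

samples = [1, 2, 3, 4]
coeffs = [1, 2, 3]
reference = fir_direct(samples, coeffs)
candidate = fir_time_multiplexed(samples, coeffs)
print(reference, candidate, reference == candidate)
```

A mismatch between the two outputs would indicate that the chosen macro implementation no longer matches the functional description.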
14Hierarchy Hardened Progressively
- Macro characterization saved for fast estimates
- Each level of hierarchy becomes a new hard macro
- Higher levels of hierarchy are adjusted
- When top level of hierarchy is hardened, the
design is done
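Progressive hardening amounts to a bottom-up walk of the design hierarchy. The sketch below only computes the order in which blocks would be hardened; the block names and the hierarchy are hypothetical.

```python
def hardening_order(top, children):
    """Post-order walk: sub-blocks are hardened before the block that contains them."""
    order = []

    def visit(block):
        for child in children.get(block, []):
            visit(child)
        # A macro reused in several places is hardened (and characterized) once.
        if block not in order:
            order.append(block)

    visit(top)
    return order

# Hypothetical hierarchy for illustration.
children = {
    "chip": ["fir", "ctrl"],
    "fir": ["mac", "regfile"],
}
order = hardening_order("chip", children)
print(order)
```

The top-level block always comes last, matching the rule that the design is done once the top level is hardened.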
15Capturing Floorplan Decisions
- Commercial physical design tools used
- Instance names in floorplan match Simulink
- Placements merged on each iteration
- Manhattan distance can be used for parasitic
estimates
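A sketch of two of the ideas above: carrying placements forward by instance name, and estimating wire parasitics from Manhattan distance. The data layout, the function names, and the capacitance-per-micron value are assumptions for illustration; the commercial tools' actual formats are not shown in the presentation.

```python
def merge_placements(previous, current_instances):
    """Reuse the old (x, y) for instances that still exist; None means unplaced."""
    return {name: previous.get(name) for name in current_instances}

def manhattan(p, q):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def wire_capacitance(p, q, cap_per_um):
    """Lumped capacitance estimate: distance times capacitance per unit length."""
    return manhattan(p, q) * cap_per_um

# Hypothetical placements in microns, keyed by Simulink instance name.
previous = {"mac": (0, 0), "old_block": (50, 50)}
merged = merge_placements(previous, ["mac", "fsm"])

length_um = manhattan((0, 0), (300, 400))
cap = wire_capacitance((0, 0), (300, 400), cap_per_um=0.2e-15)  # assumed 0.2 fF/um
print(merged, length_um, cap)
```

Matching on instance names is what lets a placement survive regeneration of the netlist: blocks that disappear are dropped and new ones are flagged for placement.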
17Reduced Impact of Interconnect
- 2X inverter
- 0.25 µm
- 1 mm isolated metal 1 wire
Long wires can be modeled as lumped capacitances
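One way to see why a lumped model can be adequate is to compare it with a first-order distributed-RC (Elmore-style) estimate. The resistance and capacitance values below are made-up illustrative numbers, not measurements from this slide; the point is only that when the driver resistance is much larger than the wire resistance, the wire-resistance terms contribute little.

```python
def delay_lumped(r_drv, c_wire, c_load):
    """50% delay treating the wire as a single capacitor on the driver."""
    return 0.69 * r_drv * (c_wire + c_load)

def delay_distributed(r_drv, r_wire, c_wire, c_load):
    """First-order 50% delay that also accounts for distributed wire resistance."""
    return (0.69 * r_drv * (c_wire + c_load)
            + 0.38 * r_wire * c_wire
            + 0.69 * r_wire * c_load)

# Assumed values: weak driver, modest wire resistance.
r_drv, r_wire = 10e3, 75.0       # ohms
c_wire, c_load = 100e-15, 10e-15  # farads

lumped = delay_lumped(r_drv, c_wire, c_load)
full = delay_distributed(r_drv, r_wire, c_wire, c_load)
rel_error = (full - lumped) / full
print(f"lumped={lumped:.3e} s, distributed={full:.3e} s, error={rel_error:.2%}")
```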
18Race-Immune Clock Tree Synthesis
- Race margin 580 ps
- 0.18 µm
- VDD 1 V
Currently supports 1M transistor designs
20Example 1: Macro Hardening
Most time/disk space spent on extraction and
power simulation
21Example 2: Test Chip
- 300k transistors
- 0.25 µm
- 1.0 V
- 25 MHz
- 6.8 mm²
- 14 mW
- 2 phase clock
- 3 layers of P&R hierarchy
Parallel Pipelined FIR Filter (8X decimation
filter for 12-bit 200 MHz ΣΔ)
22Example 3: CDMA Baseband Receiver
- 500k transistors
- 0.18 µm
- 1.0 V
- 25 MHz
- 1.1 mm²
- 21 mW
- single phase clock
- 5 clock domains
- 2 layers of P&R hierarchy
23Conclusion
- Design flows should encourage system designers to
explore design space by creating layout
- Design phases should be determined by hierarchy
hardening, not separation of expertise
- Low supply voltages make delay estimates easier,
simplify design flow
- Chip layout in a day is feasible
- Simulink netlist generation
- Floorplan merge
- Dependency graph automation
- Characterization
- Simulink / VHDL translation
- Clock Tree Synth.