Title: A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning
1. A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning
- Roman Lysecky, Frank Vahid
- Department of Computer Science and Engineering
- University of California, Riverside
- {rlysecky, vahid}@cs.ucr.edu
- Also with the Center for Embedded Computer Systems at UC Irvine
- This work was supported in part by the National Science Foundation and the Semiconductor Research Corporation
2. Introduction: Warp Processors - Dynamic HW/SW Partitioning
- Study the benefits of warp processing for FPGA soft processor cores
3. Introduction: Warp Processors - Dynamic HW/SW Partitioning
- Dynamic HW/SW Partitioning
- Enabler: Synthesis from Binaries [Stitt & Vahid, 2005] [Stitt & Vahid, 2002]
- Advantages
- Does not require any special compilers
- Completely transparent
- Provides separation of function and architecture
- Avoids complexities of supporting different FPGAs
- Opens additional market segments (i.e., all software developers) that otherwise would not use FPGAs and CAD
[Figure annotation: Traditional partitioning done here]
Dynamic Hardware/Software Partitioning: A First Approach, DAC'03
A Configurable Logic Fabric for Dynamic Hardware/Software Partitioning, DATE'04
Dynamic FPGA Routing for Just-in-Time FPGA Compilation, DAC'04
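As a rough illustration of how a dynamic partitioner might choose a kernel at run time, the sketch below counts taken backward branches in a branch trace and reports the most frequent one as the critical-loop candidate. This is an assumption made for illustration: the slides do not describe the profiling algorithm, and `find_critical_loop` and the trace format are hypothetical names, not part of the authors' tools.

```python
from collections import Counter


def find_critical_loop(branch_trace):
    """Return (loop_start, loop_end, count) for the most frequent backward branch.

    branch_trace is an iterable of (source_addr, target_addr) pairs, one per
    taken branch. A branch whose target is at or below its source is treated
    as a loop back-edge (an assumption of this sketch).
    """
    counts = Counter(
        (target, source)
        for source, target in branch_trace
        if target <= source
    )
    if not counts:
        return None  # no loops observed in the trace
    (loop_start, loop_end), count = counts.most_common(1)[0]
    return loop_start, loop_end, count


# Hypothetical trace: one hot loop, one cold loop, one forward branch.
trace = [(0x120, 0x100)] * 1000 + [(0x210, 0x200)] * 10 + [(0x050, 0x300)]
hot = find_critical_loop(trace)
```

Here `hot` identifies the loop spanning 0x100 to 0x120 as the candidate for hardware implementation; a real profiler would work on live branch events rather than a stored list.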
4. Introduction: Soft Processor Cores
- FPGA vendors currently provide soft processor cores
- Xilinx PicoBlaze and MicroBlaze
- Altera NIOS and NIOS II
- Advantages
- Configurability
- Add custom instructions/coprocessors
- Configurable instruction/data caches
- Quickly integrate processor within any FPGA
- Easy to build multi-processor systems
- Disadvantages
- Higher power consumption and decreased performance
- Compared to hard-core embedded processors
[Figure: Proc. << FPGA]
- How can warp processing benefit soft processor cores?
5. Introduction: MicroBlaze Soft Processor Core
- MicroBlaze Soft Processor Core
- 32-bit configurable processor core with three-stage pipeline
- Execution frequency as high as 150 MHz
- 85 MHz using Spartan-3 FPGA
- Configurable instruction and data caches
- Configurable HW datapath components
- Multiplier to support mul instruction
- Divider to support idiv instruction
- Barrel shifter to support bs and bsi instructions
6. MicroBlaze Warp Processor: Single Processor System
[Figure: single-processor MicroBlaze system on an FPGA. Block labels: MicroBlaze, Instr. BRAM, i_lmb, lmb_cntrl, d_lmb, lmb_cntrl, Data BRAM, opb, Periph1, Periph2]
7. MicroBlaze Warp Processor: Multi-Processor System
[Figure: multi-processor system on an FPGA]
8. MicroBlaze Warp Processor: Warp Configurable Logic Architecture (WCLA)
- Warp Configurable Logic Architecture (WCLA)
- Data address generators (DADG) and loop control hardware (LCH)
- Provides fast, efficient coprocessor interface
- Fast, single-cycle 32-bit multiplier-accumulator (MAC)
- Ideally, WCLA would use existing FPGA for configurable logic
- JIT FPGA compilation tools currently only support our custom CAD-oriented FPGA
[Figure: MicroBlaze warp processor on an FPGA. Block labels: Profiler, MicroBlaze, Instr. BRAM, BRAM Intrf., Data BRAM, DPM, Custom FPGA (250 MHz), opb, WCLA, P1, P2]
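To make the roles of the WCLA components concrete, here is a small behavioral model, not a description of the actual hardware. It assumes the data address generators produce strided addresses, that a simple counted loop stands in for the loop control hardware, and that the MAC wraps its accumulator to 32 bits; the function names `dadg` and `mac_kernel` and the memory layout are hypothetical.

```python
def dadg(base, stride, count):
    """Model of a data address generator: base, base + stride, ... (count values)."""
    for i in range(count):
        yield base + i * stride


def mac_kernel(memory, a_base, b_base, n, stride=1):
    """Dot-product style kernel: one multiply-accumulate per loop iteration.

    The for loop plays the part of the loop control hardware, the two dadg
    generators supply operand addresses, and the accumulator is masked to
    32 bits (an assumption of this sketch).
    """
    acc = 0
    for a_addr, b_addr in zip(dadg(a_base, stride, n), dadg(b_base, stride, n)):
        acc = (acc + memory[a_addr] * memory[b_addr]) & 0xFFFFFFFF
    return acc


# Hypothetical data memory holding two 4-element vectors back to back.
memory = [1, 2, 3, 4, 10, 20, 30, 40]
result = mac_kernel(memory, a_base=0, b_base=4, n=4)
```

In software each iteration would spend instructions on address arithmetic and loop bookkeeping; the point of dedicated address generation and loop control is to take that overhead off the datapath.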
9. MicroBlaze Warp Processor: Performance Speedup (Single Critical Kernel)
10. MicroBlaze Warp Processor: Energy Consumption (Single Critical Kernel)
11. MicroBlaze Warp Processor: Conclusions & Future Work
- Conclusions
- Studied the benefits of warp processing for FPGA soft processor cores (MicroBlaze)
- Average speedups of 5.8X (>10X possible for some applications)
- Average energy reduction of 58%
- Demonstrated MicroBlaze warp processor is competitive with hard-core embedded processors
- Speedup of 1.3X compared to 325 MHz ARM10
- Energy reduction of 26% compared to 325 MHz ARM10
- Future Work
- Prototyping our custom FPGA and warp processors
- Supporting a wider range of applications (PDA/desktop/server)
- Incorporating advanced on-chip configurable structures
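The relationship between accelerating a single critical kernel and the resulting application speedup can be sketched with Amdahl's law. The numbers below are hypothetical and are not measurements from this study; `overall_speedup` is an illustrative helper, and the slides do not state how their figures were computed.

```python
def overall_speedup(kernel_fraction, kernel_speedup):
    """Amdahl's law: application speedup when one kernel is accelerated.

    kernel_fraction: share of original execution time spent in the kernel (0..1).
    kernel_speedup:  how many times faster the kernel runs in hardware.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = (1.0 - kernel_fraction) + kernel_fraction / kernel_speedup
    return 1.0 / remaining


# Hypothetical case: the kernel takes 90% of run time and runs 20x faster in hardware.
example = overall_speedup(0.9, 20.0)
```

With these assumed inputs the application speedup is a little under 7X, which shows why the share of time spent in the critical kernel limits the benefit as much as the hardware speedup itself.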