Designing a Runtime Reconfigurable Processor for General Purpose Applications


Slides: 23
Provided by: ytai
Transcript and Presenter's Notes

1
Designing a Runtime Reconfigurable Processor for
General Purpose Applications
  • Niyonkuru and Zeidler
  • Universität der Bundeswehr Hamburg

2
  • Basic goal
  • An easy-to-use runtime reconfigurable processor
    based on dynamic reconfiguration
  • General purpose: runs any application with
    comparable performance

3
Related Work
  • Fixed conventional microarchitecture with
    reconfigurable processing units
  • Act as a coprocessor
  • Programs have to be partitioned
  • One code portion executed on conventional path
  • The other executed on reconfigurable path
  • Programmers have to
  • Invoke precompiled HW lib, or
  • Code in HW themselves
  • PRISC, DISC, CoMPARE, GARP, OneChip, etc.

4
Related Work (contd.)
  • The SCORE execution model
  • Based on three essential components
  • Compute page
  • Memory segment
  • Stream link
  • An application is partitioned into consecutive
    operators (multipliers, FFT, FIR-filter, etc.)
  • A conventional processor is required to sequence
    the compute pages

5
Related Work (contd.)
  • The MATRIX architecture
  • A flexible architecture that allows the
    definition of microarchitecture for each
    application
  • Basic functional units (BFU)
  • Contains local memory, 8-bit ALU, Control logic
  • Can be configured to instruction memory, data
    memory, datapath element or control element
  • Hierarchical BFU interconnection
  • Hand-coded mapping of algorithms on BFUs and
    connection setup

6
Related Work (contd.)
  • Flexible instruction processors (FIPs)
  • Processor templates that allow processor types to
    be dynamically configured through predefined
    parameters
  • Adapt the implementation to applications during
    execution
  • Application behaviors determined by runtime
    statistics and used to determine suitable
    microarchitecture

7
Related Work (contd.)
  • Complexity Adaptive Processor (CAP)
  • HW complexity and processor clock cycle adapted
    at runtime
  • Augment conventional HW with partition enable
    signals that turn HW partitions on/off
  • Reduced power dissipation
  • Improved performance

8
Related Work (contd.)
  • All of the above make use of HW reconfiguration
    to enhance performance
  • However, they require different HW/SW tools to
    map applications on them
  • The authors' approach: an enhanced runtime
    reconfigurable architecture compatible with
    existing general purpose processors

9
Designing A Partial Runtime Reconfigurable
Processor
  • The choice of suitable device
  • SRAM-based programmable device: Xilinx Virtex-II
    FPGA
  • Practical design flow: Xilinx module-based
    partial reconfiguration design flow
  • Appropriate processor architecture
  • 16-bit ARM Thumb ISA
  • Existing software development tool chain can be
    used directly

10
Proposed Microarchitecture

11
Proposed Microarchitecture (contd.)
  • Instruction Memory (IM)
  • Uses dual-port block SelectRAM to get 64-bit bandwidth
  • Four 16-bit instructions can be fetched per cycle
  • Fetch Unit / Predecoder (FU/P)
  • Provides a valid instruction address to the IM or trace
    cache
  • Fetches from the IM at program start or on a trace cache
    miss
  • Fetches from the trace cache on a trace cache hit
  • Notifies the configuration manager about the execution
    units needed after opcode predecode
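
The fetch path on this slide can be sketched roughly as follows. This is an illustrative sketch only: the function names, the dictionary-based memories, and the ordering of the four 16-bit slots inside the 64-bit word (lowest 16 bits first) are assumptions and do not come from the slides.

```python
# Hypothetical sketch of the fetch path; all names and the slot order
# (lowest 16 bits = first instruction) are assumptions.

def unpack_fetch_word(word64):
    """Split one 64-bit fetch word into four 16-bit instructions."""
    return [(word64 >> (16 * slot)) & 0xFFFF for slot in range(4)]


def fetch(pc, instruction_memory, trace_cache):
    """Return (source, instructions) for an 8-byte-aligned fetch address.

    trace_cache maps a start address to a list of instructions;
    instruction_memory maps an aligned address to a 64-bit word.
    """
    if pc in trace_cache:                  # trace cache hit
        return "TC", trace_cache[pc]
    word = instruction_memory[pc & ~0x7]   # program start or trace cache miss
    return "IM", unpack_fetch_word(word)


im = {0: 0x47701C082001B500}
tc = {8: [0x2001, 0x4770]}
print(fetch(0, im, tc))   # served from instruction memory
print(fetch(8, im, tc))   # served from the trace cache
```

The sketch leaves out trace construction and the predecode notification to the configuration manager.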

12
Proposed Microarchitecture (contd.)
  • Trace Cache (TC)
  • Originally intended to avoid the instruction supply
    bottleneck
  • In this paper, used to determine the HW resources
    required at runtime
  • Decoder
  • Acts as a conventional instruction decoder:
    decodes instructions, reads operands from the
    register file and sends them to the RUU

13
Proposed Microarchitecture (contd.)
  • Register Update Unit (RUU)

14
Proposed Microarchitecture (contd.)
  • Register Update Unit (RUU)
  • Collects decoded instructions and dispatches them
    to different execution units
  • Instruction queue stores instructions from
    decoder
  • Dependency buffer keeps track of dependencies
  • Allows out-of-order execution, in-order completion
  • Input vectors from CM indicate the number of
    execution units available
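
A toy model may help make the RUU behaviour concrete. The sketch below is not the authors' design: it assumes single-cycle execution units, tracks only read-after-write dependencies, and uses a hypothetical `Instr` record and unit names. The `units_available` dictionary stands in for the input vector from the CM.

```python
from collections import namedtuple

# Hypothetical instruction record: name, destination register,
# source registers, and the kind of execution unit required.
Instr = namedtuple("Instr", "name dest srcs unit")


def simulate_ruu(program, units_available):
    """Issue out of order, complete in order (single-cycle units assumed).

    Returns (issue_order, completion_order) as lists of instruction names.
    """
    queue = list(program)          # instruction queue, in program order
    executed = set()               # instructions that finished executing
    issue_order, completion_order = [], []
    head = 0                       # oldest instruction not yet completed

    while head < len(queue):
        free = dict(units_available)
        issued_now = []
        for i, ins in enumerate(queue):
            if ins.name in executed:
                continue
            # Dependency check: blocked while an older, unexecuted
            # instruction still has to write one of our sources.
            blocked = any(
                older.name not in executed and older.dest in ins.srcs
                for older in queue[:i]
            )
            if not blocked and free.get(ins.unit, 0) > 0:
                free[ins.unit] -= 1
                issued_now.append(ins.name)
        if not issued_now:
            raise ValueError("no execution unit can serve the remaining instructions")
        executed.update(issued_now)    # results visible from the next cycle
        issue_order.extend(issued_now)
        # In-order completion: retire only from the head of the queue.
        while head < len(queue) and queue[head].name in executed:
            completion_order.append(queue[head].name)
            head += 1
    return issue_order, completion_order


program = [
    Instr("i1", "r1", ("r0",), "LSU"),
    Instr("i2", "r2", ("r1", "r1"), "ALU"),   # depends on i1
    Instr("i3", "r3", ("r4", "r5"), "ALU"),   # independent
    Instr("i4", "r6", ("r0",), "LSU"),
]
issued, completed = simulate_ruu(program, {"ALU": 1, "LSU": 1})
print(issued)      # i3 is issued ahead of i2
print(completed)   # program order is preserved
```

With one ALU and one LSU, `i3` issues ahead of the dependent `i2`, yet completion still follows program order. Changing the unit counts shows how the vector from the CM limits issue.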

15
Proposed Microarchitecture (contd.)
  • Config1 / Config2 / Config3
  • Execution units: Int-ALU, Int-MDU, LSU, etc.
  • A specific configuration provides a set of EUs
    with a fixed number of each of them
  • Number of EUs changed dynamically by loading
    different configs
  • Configuration Manager
  • Stores predefined configurations
  • Performs config swapping dynamically
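
One way to picture configuration swapping is sketched below. The slides name Config1 to Config3 but give neither their contents nor the selection policy, so the unit counts and the "pick the configuration that covers most of the demand" rule here are assumptions made purely for illustration.

```python
# Hypothetical unit mixes; the slides do not list what each config holds.
CONFIGS = {
    "Config1": {"Int-ALU": 2, "Int-MDU": 1, "LSU": 1},
    "Config2": {"Int-ALU": 1, "Int-MDU": 2, "LSU": 1},
    "Config3": {"Int-ALU": 1, "Int-MDU": 1, "LSU": 2},
}


class ConfigurationManager:
    """Stores predefined configurations and swaps between them."""

    def __init__(self, configs, initial):
        self.configs = configs
        self.active = initial
        self.swaps = 0

    def unit_vector(self):
        """Execution units currently available (the vector sent to the RUU)."""
        return dict(self.configs[self.active])

    def request(self, demand):
        """Swap if another configuration covers more of the demanded units.

        demand maps a unit kind to how many the upcoming code could use.
        The coverage rule is an assumed policy, not the authors'.
        """
        def covered(name):
            cfg = self.configs[name]
            return sum(min(cfg.get(unit, 0), n) for unit, n in demand.items())

        best = max(self.configs, key=covered)
        if covered(best) > covered(self.active):
            self.active = best
            self.swaps += 1
        return self.active


cm = ConfigurationManager(CONFIGS, "Config1")
print(cm.request({"Int-ALU": 2, "LSU": 1}))   # stays on Config1
print(cm.request({"Int-ALU": 1, "LSU": 2}))   # swaps to Config3
print(cm.unit_vector())
```

A real configuration manager would also have to account for the time needed to load a partial bitstream, which this sketch ignores.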

16
Proposed Microarchitecture (contd.)
  • Data Memory
  • Harvard architecture: separate from IM
  • Bus Macros
  • The Xilinx bus macros
  • 4 bits each, so multiple entities required
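
Since each bus macro carries 4 bits, the number needed for a signal is a simple ceiling division. The helper name and the example widths below are assumptions; the slides do not say which signals cross the module boundary.

```python
BUS_MACRO_WIDTH_BITS = 4  # from the slide: each bus macro carries 4 bits


def bus_macros_needed(signal_width_bits):
    """Number of 4-bit bus macros needed for a signal of the given width."""
    return -(-signal_width_bits // BUS_MACRO_WIDTH_BITS)  # ceiling division


# Example widths are illustrative only.
for width in (10, 16, 64):
    print(width, "bits ->", bus_macros_needed(width), "bus macros")
```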

17
Conclusion
  • Design of a general purpose reconfigurable
    processor (ARM ISA) using Xilinx modular design
    flow
  • Functional units partitioned into fixed module
    and configurable module
  • Future work
  • Real hardware implementation
  • A model to analyze power consumption
  • Performance investigation

18
SysteMorph: Dynamic/Online/Adaptive System-Level
Optimization for SoC
  • Yoshimatsu et al.
  • Institute of Systems Information Technologies,
    Kyushu

19
SysteMorph Concepts
  • A feedback directed dynamic SW / ISA / HW
    co-optimization technology
  • Elemental technologies
  • Online profiling
  • Adaptive dynamic optimization
  • Smart hardware
  • VLIW execution units
  • Reconfigurable functional units

20
Dynamic Optimization
  • Dynamic software pipelining for VLIW

21
Dynamic Optimization (contd.)
  • Reconfigurable device: DAP/DNA-HP

22
Dynamic Optimization (contd.)
  • Online synthesis