SCORE - PowerPoint PPT Presentation

Description:

University of California, Berkeley BRASS group. André DeHon ... Quant / ZLE. Huffman Enc. FPL 2000 (8/30/00) 13. BRASS. SCORE Components. Graph-based Compute Model ...


Transcript and Presenter's Notes

Title: SCORE


1
SCORE
Stream Computations Organized for Reconfigurable Execution
Eylon Caspi, Michael Chu, Randy Huang, Joseph Yeh, John Wawrzynek
University of California, Berkeley, BRASS group
André DeHon
California Institute of Technology, Dept. of Computer Science
http://brass.cs.berkeley.edu/SCORE/
2
Goal: Software Survival
  • Software for microprocessors survives on new
    devices
  • Binary compatibility
  • Automatic improvement
  • Software for reconfigurable devices does not
  • Substantial effort to port/redeploy

3
Outline
  • Problem: Software Survival
  • A New Compute Model
  • SCORE Components
  • Preliminary Results
  • Future Work

4
Why Can't Reconfig. Software Survive?
  • Resource constraints/sizes are exposed
    • to the programmer
    • in a low-level representation (netlist)
  • Design revolves around device size
  • Algorithmic structure
  • Exploited parallelism

5
The SCORE Approach
  • A compute model with unbounded resources
  • Efficient hardware virtualization
  • Demand paging

6
Page-Compatible Devices
  • Family of devices with
  • Common page definition
  • Varying number of pages
  • Binary Compatibility
  • Automatic Performance Improvement

7
Virtualizing a Netlist (is bad)
  • Netlist is sensitive to timing
  • Disallow asynchronous features (e.g. busses)
  • Synchronous
  • WASMII (Ling and Amano, FCCM '93)
  • Page I/O via registers
  • Execute each cycle of every page
  • Huge reconfiguration overhead!

8
Previous Attempts at Virtualization
  • Multi-context
  • DPGA (DeHon, FPGA '94)
  • TM-FPGA (Xilinx, FCCM '97)
  • Configuration Cache
  • Striped
  • PipeRench (CMU, FPGA '98)
  • Pipelined reconfiguration
  • Restricted to feed-forward pipelines

9
Streams
  • Goal
  • Less frequent reconfiguration
  • Batch process block of inputs
  • Amortize reconfiguration cost over large data set
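
The amortization argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and the one-cycle-per-token cost are assumptions, and the 5000-cycle reconfiguration time is borrowed from the functional-simulation slide later in the talk.

```python
def reconfig_overhead(reconfig_cycles, cycles_per_token, batch_tokens):
    """Fraction of total cycles spent reconfiguring when one page load serves a whole batch."""
    compute_cycles = cycles_per_token * batch_tokens
    return reconfig_cycles / (reconfig_cycles + compute_cycles)


# Assumed values: 5000-cycle page load, 1 cycle of work per stream token.
for batch in (1, 1_000, 100_000):
    print(f"batch={batch:>7}  overhead={reconfig_overhead(5000, 1, batch):.3f}")
```

Under these assumptions a page that processes a single token per load spends almost all of its time reconfiguring, while a batch of 5000 tokens brings the overhead down to exactly one half, and larger batches shrink it further.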

10
Stream Implementation
  • Only one endpoint (page) loaded
  • Stream memory buffer
  • Desire distributed, on-chip memory
  • Both endpoints (pages) loaded
  • Stream wire
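
A minimal sketch of the two stream implementations, assuming a hypothetical `Stream` class and hypothetical page names ("dct", "quant"); none of this is the SCORE API. A plain FIFO stands in for the stream buffer held in on-chip memory.

```python
from collections import deque


class Stream:
    """A FIFO link between a producer page and a consumer page (hypothetical API)."""

    def __init__(self, producer, consumer):
        self.producer = producer
        self.consumer = consumer
        self.buffer = deque()  # stands in for a stream buffer in on-chip memory

    def implementation(self, resident_pages):
        """Choose how the stream is realized, given the set of currently loaded pages."""
        both_loaded = self.producer in resident_pages and self.consumer in resident_pages
        # Both endpoints loaded: a direct wire. Otherwise tokens wait in memory.
        return "wire" if both_loaded else "memory buffer"

    def write(self, token):
        self.buffer.append(token)

    def read(self):
        return self.buffer.popleft()


s = Stream("dct", "quant")
print(s.implementation({"dct"}))           # only the producer is loaded
print(s.implementation({"dct", "quant"}))  # both endpoints are loaded
```

The point of the sketch is that the producer and consumer see the same FIFO interface either way; only the run-time system decides whether the tokens travel over a wire or sit in a buffer until the other endpoint is loaded.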

11
Execution Example: Spatial
12
Execution Example: Time-Multiplexed
13
SCORE Components
14
SCORE Compute Model
  • Computation graph of compute nodes
  • Concretely: compute pages
  • Abstractly: operators with local state (FSM)
  • Communication: streaming data flow
  • Storage
  • Streams
  • Memory segments, accessed through streams
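
The compute model can be illustrated with a toy dataflow graph. This is a sketch under assumptions, not the SCORE programming interface: `Operator`, `connect`, and `run` are hypothetical names, and each operator is modeled as a step function over local state that consumes one token per firing.

```python
from collections import deque


class Operator:
    """A compute node: consumes one input token, updates local state, may emit a token."""

    def __init__(self, name, step, state=0):
        self.name = name
        self.step = step      # function (state, token) -> (new_state, output or None)
        self.state = state    # local state, as in an FSM
        self.inbox = deque()  # input stream
        self.outbox = None    # downstream input stream, set by connect()

    def fire(self):
        """Process one token if one is available; return True if work was done."""
        if not self.inbox:
            return False
        self.state, out = self.step(self.state, self.inbox.popleft())
        if out is not None and self.outbox is not None:
            self.outbox.append(out)
        return True


def connect(src, dst):
    """Route the output stream of src into the input stream of dst."""
    src.outbox = dst.inbox


def run(operators):
    """Fire operators repeatedly until no operator has input left."""
    progress = True
    while progress:
        progress = False
        for op in operators:
            if op.fire():
                progress = True


scale = Operator("scale", lambda s, x: (s, 2 * x))      # stateless operator
acc = Operator("acc", lambda s, x: (s + x, s + x))      # running sum kept in local state
sink = deque()
connect(scale, acc)
acc.outbox = sink
scale.inbox.extend([1, 2, 3])
run([scale, acc])
print(list(sink))  # [2, 6, 12]
```

Because operators interact only through streams, the same graph could run with both operators resident at once or with one loaded at a time, which is the property the model relies on for virtualization.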

15
SCORE Hardware Model
  • Paged FPGA
  • Compute Page (CP)
  • Fixed-size slice of RC hardware
  • Fixed number of I/O ports
  • Distributed, on-chip memory
  • Configurable Memory Block (CMB)
  • Stream access
  • High-level interconnect
  • Microprocessor
  • Run-time support and user code

16
SCORE Run-Time Support
  • Mechanics of run-time reconfiguration
  • Page swap: context save/load
  • Reconfigure interconnect
  • Page Scheduling
  • Which page to run where, when
  • Static vs. dynamic
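
To make "which page to run where, when" concrete, here is a deliberately naive static schedule, offered only as an illustration: it groups virtual pages in order into timeslices that fit the device. The function name and page names are hypothetical, and this is not the scheduling policy used by SCORE.

```python
def round_robin_schedule(virtual_pages, physical_pages):
    """Split virtual pages, in order, into groups that fit on the device at once."""
    if physical_pages < 1:
        raise ValueError("need at least one physical page")
    return [virtual_pages[i:i + physical_pages]
            for i in range(0, len(virtual_pages), physical_pages)]


pages = ["dct", "quant", "zle", "huffman", "out"]  # hypothetical page names
print(round_robin_schedule(pages, 2))  # small device: three timeslices
print(round_robin_schedule(pages, 8))  # large device: everything resident at once
```

The same list of virtual pages runs on either device. The larger device needs fewer timeslices, which is the sense in which a page-compatible family can offer automatic performance improvement.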

17
Functional Simulation
  • FPGA based on HSRA (Berkeley, FPGA '99)
  • CP: 512 4-LUTs
  • CMB: 2 Mbit DRAM
  • Area for CP-CMB pair
  • Page reconfiguration: 5000 cycles (from CMB)
  • Synchronous operation (same clock speed as
    processor)
  • x86 microprocessor
  • Page Scheduler task
  • Swap on timer interrupt (every 250,000 cycles)
  • Fully dynamic scheduling
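
A quick check of what these parameters imply. The sketch assumes that all swapped pages reconfigure in parallel, once per timeslice, and it ignores the time the scheduler itself takes on the processor; the function name is hypothetical.

```python
RECONFIG_CYCLES = 5_000      # page reconfiguration from CMB (from the slide)
TIMESLICE_CYCLES = 250_000   # timer interrupt period (from the slide)


def useful_fraction(reconfig=RECONFIG_CYCLES, timeslice=TIMESLICE_CYCLES):
    """Share of a timeslice left for computation after one reconfiguration."""
    return (timeslice - reconfig) / timeslice


print(f"{useful_fraction():.0%} of each timeslice remains for computation")
```

Under those assumptions reconfiguration costs about 2% of each timeslice, so the swap period is long enough that the raw page-load time is a small part of the total.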

18
Applications
  • Multimedia processing applications
  • Hand-partitioned into 512-LUT pages
  • Good applications
  • Primarily feed-forward (feedback loops fit in
    HW)
  • Bad applications
  • Large, tight feedback loops (e.g. ADPCM)

19
Application: JPEG Encode
20
Scaling Results: JPEG Encode
[Plot: total time (makespan, in millions of cycles) vs. number of physical compute pages]
21
Summary
  • SCORE enables software survival on reconfigurable
    systems
  • Binary compatibility
  • Automatic performance scaling
  • Virtual Hardware
  • Requirements
  • Graph-based compute model
  • Paged FPGA hardware
  • Run-time support for RTR/Scheduling

22
Future Work
  • Compilation/CAD
  • Partitioning FSM operators into pages
  • Study architectural parameters
  • Page size
  • CMB size
  • Tolerable reconfiguration time
  • Scheduling
  • Static scheduling

23
More Info on the Web
  • SCORE project
  • http://brass.cs.berkeley.edu/SCORE/
  • Tutorial
  • http://brass.cs.berkeley.edu/documents/score_tutorial.html