Jared Casper, Ronny Krashinsky, Christopher Batten, Krste Asanovic - PowerPoint PPT Presentation

1
A Parameterizable FPGA Prototype of
a Vector-Thread Processor
  • Jared Casper, Ronny Krashinsky, Christopher
    Batten, Krste Asanovic
  • MIT Computer Science and Artificial Intelligence
    Laboratory, Cambridge, MA, USA

2
SCALE Vector-Thread Processor
  • Key Features
  • 4 lanes, 4 clusters
  • Cluster for indexed accesses
  • 4 segment address generators
  • 4 VLDQs
  • VRU includes throttle logic, refill address
    generator

3
SCALE Cache
  • Key Features
  • Two-cycle hit latency
  • Four 8 KB banks
  • 32-way set-associative
  • 32 B cache lines
  • 16 B/cycle per bank
  • Four 16B segment buffers per bank
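
The figures on this slide determine the rest of the cache geometry. The sketch below derives it in Python: lines per bank, sets per bank, and peak bandwidth follow directly from the listed numbers. The address split is an assumption made only for illustration (banks interleaved at cache-line granularity); the slides do not say how addresses map to banks, and `CacheGeometry` and `split_address` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    """Cache parameters as listed on the slide."""
    banks: int = 4
    bank_bytes: int = 8 * 1024
    ways: int = 32
    line_bytes: int = 32
    bank_bytes_per_cycle: int = 16

    @property
    def total_bytes(self) -> int:
        return self.banks * self.bank_bytes

    @property
    def lines_per_bank(self) -> int:
        return self.bank_bytes // self.line_bytes

    @property
    def sets_per_bank(self) -> int:
        return self.lines_per_bank // self.ways

    @property
    def peak_bytes_per_cycle(self) -> int:
        return self.banks * self.bank_bytes_per_cycle

    def split_address(self, addr: int):
        """Split a byte address into (tag, set_index, bank, offset).

        ASSUMPTION: consecutive cache lines rotate across banks.
        The real SCALE mapping is not given in the slides.
        """
        offset = addr % self.line_bytes
        line = addr // self.line_bytes
        bank = line % self.banks
        set_index = (line // self.banks) % self.sets_per_bank
        tag = line // (self.banks * self.sets_per_bank)
        return tag, set_index, bank, offset


geom = CacheGeometry()
print(geom.total_bytes, geom.lines_per_bank, geom.sets_per_bank,
      geom.peak_bytes_per_cycle)
```

With the slide's numbers this gives a 32 KB cache with 256 lines and 8 sets per bank, and a peak of 64 B/cycle across the four banks.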

4
SCALE Prototype Chip
  • Prototype SCALE processor in development
    • Control processor: MIPS, 1 instr/cycle
    • VTU: 4 lanes, 4 clusters/lane, 32
      registers/cluster, 128 VPs max
    • Primary I/D cache: 32 KB, 4x128b per cycle,
      non-blocking
    • DRAM: 64b, 200 MHz DDR2 (64b at 400 Mb/s per
      pin = 3.2 GB/s)
    • Estimated 10 mm² in 0.18 µm, 400 MHz (25 FO4)
  • Cycle-level execution-driven C
    microarchitectural simulator
    • Detailed VTU and memory system model
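
The bandwidth and VP figures on this slide can be checked with a few lines of arithmetic. The Python sketch below does so; the function names are hypothetical. The VP calculation rests on an assumption not stated in the slides: that the number of virtual processors per lane is limited by how many private registers each VP needs in a cluster, so the maximum of 128 is reached when each VP needs only one.

```python
def dram_peak_gbytes_per_s(bus_bits=64, clock_mhz=200, transfers_per_clock=2):
    """Peak DRAM bandwidth in GB/s for a double-data-rate interface."""
    mbits_per_pin = clock_mhz * transfers_per_clock   # 400 Mb/s per pin
    return bus_bits * mbits_per_pin / 8 / 1000        # bits -> bytes, MB -> GB


def cache_peak_bytes_per_cycle(ports=4, port_bits=128):
    """Peak cache bandwidth per cycle for the 4x128b figure."""
    return ports * port_bits // 8


def max_virtual_processors(lanes=4, regs_per_cluster=32, private_regs_per_vp=1):
    """ASSUMPTION: VPs per lane = registers per cluster / registers per VP."""
    return lanes * (regs_per_cluster // private_regs_per_vp)


print(dram_peak_gbytes_per_s())        # DRAM peak, GB/s
print(cache_peak_bytes_per_cycle())    # cache peak, bytes per cycle
print(max_virtual_processors())        # upper bound on VPs
```

At the estimated 400 MHz core clock, 3.2 GB/s of DRAM bandwidth corresponds to 8 B per processor cycle, against 64 B per cycle from the cache.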

5
SCALE Prototype Board
  • Single Xilinx Virtex-II FPGA
  • Configured via direct JTAG connection or
    SystemACE
  • Multiple memory chips
    • Six Micron DDR2 SDRAMs
    • Two Micron Mobile SDRAMs
    • One Micron RLDRAM
    • One Samsung SRAM
  • Two Logic Analyzer connections
  • Multiple separate power islands
  • Attached to custom test baseboard
  • Sixteen independently measurable power supplies
  • Byte-serial connection to a Linux PC

6
Module Placement
  • Reduce the risk of the final custom chip
    implementation
  • Allow early rapid prototyping of many of the
    system interactions
  • Provide a parameterizable prototype for
    architectural experiments

7
Testing Setup

8
Testing Setup

9
Status
  • Completed work
    • Single-issue seven-stage pipeline MIPS processor
      core
      • Mapped to the board and passes our MIPS
        verification test suite
      • Will form the SCALE control processor
    • DDR2 memory controllers
      • Tested in isolation using simple memory traffic
        generators
  • Work in progress
    • Cache subsystem
    • Vector-thread unit
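
The slide says only that the DDR2 controllers were tested in isolation with simple memory traffic generators. As a rough software analogue of that idea, the Python sketch below issues pseudo-random reads and writes against a memory model and checks every read against the last value written. It is not the authors' generator: `DictMemory`, `run_traffic`, and all parameters are hypothetical, and `DictMemory` merely stands in for the controller under test.

```python
import random


class DictMemory:
    """Stand-in for the device under test (hypothetical)."""

    def __init__(self):
        self._cells = {}

    def write(self, addr, data):
        self._cells[addr] = data

    def read(self, addr):
        return self._cells.get(addr, 0)


def run_traffic(memory, num_ops=1000, addr_bits=6, word_bytes=8, seed=0):
    """Issue random reads and writes; return the number of read mismatches."""
    rng = random.Random(seed)       # fixed seed keeps runs reproducible
    expected = {}                   # reference copy of what was written
    mismatches = 0
    for _ in range(num_ops):
        addr = rng.randrange(1 << addr_bits) * word_bytes
        if addr not in expected or rng.random() < 0.5:
            data = rng.getrandbits(8 * word_bytes)
            memory.write(addr, data)
            expected[addr] = data
        elif memory.read(addr) != expected[addr]:
            mismatches += 1
    return mismatches


print(run_traffic(DictMemory()))
```

A small address space is used on purpose so that addresses are revisited often and reads actually exercise previously written locations.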

10
Advantages of Using an FPGA
  • Rapid full-system simulation of a large variety
    of designs
    • Allows extensive characterization of the design
      space
  • Parameterization allows exploration of various
    tradeoffs
    • Cache parameters and replacement policies
    • Prefetch strategies
    • DRAM access scheduling policies and power-down
      modes
    • DRAM types (e.g., DDR2 vs. Mobile DRAM)
  • Fast emulation system for SCALE software
    development
    • Allows thorough debugging before going to silicon
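
To illustrate what exploring these tradeoffs might look like in practice, the Python sketch below enumerates a small design space as the cross product of a few parameters. Everything in it is hypothetical: the parameter names, the candidate values, and the `PrototypeConfig` type are chosen for illustration and are not taken from the SCALE prototype.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PrototypeConfig:
    """One point in the design space (all fields are illustrative)."""
    cache_kb: int
    replacement: str
    prefetch: str
    dram_type: str


def enumerate_configs(space):
    """Return one PrototypeConfig per combination of parameter values."""
    keys = list(space)
    return [
        PrototypeConfig(**dict(zip(keys, values)))
        for values in product(*(space[k] for k in keys))
    ]


# Hypothetical candidate values for each parameter.
SPACE = {
    "cache_kb": [16, 32, 64],
    "replacement": ["lru", "random"],
    "prefetch": ["none", "next_line"],
    "dram_type": ["ddr2", "mobile_sdram"],
}

configs = enumerate_configs(SPACE)
print(len(configs), configs[0])
```

Each configuration would correspond to one FPGA build or run; the cross product grows quickly, which is where fast emulation matters.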