
Title: BEE3 Update


1
BEE3 Update
  • Chuck Thacker
  • John Davis
  • Microsoft Research
  • 2 March 2008

2
Outline
  • BEE3 Overview
  • BEE3 Status
  • BEE3 Gateware
  • Moving forward

3
BEE3 System

4
BEE3 Package

5
BEE3 Tidbits
  • Design uses essentially every pin on the chip.
  • Design was done to be PC-like to leverage PC
    economies
  • PWB is about half the area of BEE2.
  • PWB is 18 layers rather than 22 for BEE2.
  • Uses PC power and peripherals.
  • System is divided into main board plus a separate
    (and separately designed) Control Board.
      • Allowed designs to proceed in parallel at
        Celestica and BWRC, and reduced the risk of
        having to spin the (expensive) main board.
      • Control board has JTAG, plus Flash for
        bitstreams and boot flash for each FPGA. The
        system can operate without it.
  • The use of pros for PCB and mechanical design was
    an enormous win.
      • Celestica's design was 100% correct, and five
        systems worked with only one problem (which was
        easily corrected).
      • Took (probably) half the time to produce
        something much more manufacturable and robust
        (and therefore cheaper).

6
BEE3 Subsystems

7
BEE3 Control Board

8
Project Participants and Roles
  • Microsoft Research (Silicon Valley)
      • Funds, manages system engineering, does some
        gateware
  • Celestica (Ottawa and Shanghai)
      • Did main board engineering, prototype fabrication
      • Microsoft has a very deep relationship with
        Celestica
  • BEECube
      • Builds and delivers functioning systems
  • Function Engineering (Palo Alto)
      • Did thermal and mechanical engineering
  • Xilinx (San Jose)
      • Provides FPGAs for academic machines
      • Provides FPGA application expertise
  • RAMP Group (BWRC)
      • Control board, basic software
  • RAMP Community
      • Uses the systems for research
      • Expanding to industrial users (e.g., us)

9
BEE3 Status
  • All subsystems work!
  • Board spin is required to correct MGT placement.
      • The 10 Gbit channels currently require long
        routing.
      • Due to lack of information from Xilinx, not
        Celestica's error.
      • Respin is in progress. ETA for final board is
        1 May.

10
BEE3 Gateware
  • Today, consists primarily of test and
    characterization routines.
      • Much of this was ported from BEE2, although some
        is new:
          • DDR2 Controller
          • Control RISC
  • MS designs use a minimal subset of the Xilinx
    tool suite
      • Just ISE, ChipScope, and (soon) Data2MEM.
      • May need EDK, but not yet.

11
DDR2 Controller
  • Largest piece of new gateware.
      • 5 modules, 2000 lines of Verilog.
  • Supports two 4 GB DIMMs per channel, 2 channels per
    FPGA.
  • Transfers are DDR 400 (5 ns clock) with -2
    speed-grade parts.
  • Supports only x4 registered DIMMs.
      • Unbuffered DIMMs can't work because of
        address/control loading.
  • Handles all initialization, refresh, and
    calibration (semi) automatically.
  • Keeps track of up to 16 open banks per controller.
  • Calibration is fast (768 clocks).
      • So it can be done at frequent intervals or in
        response to single errors.
  • Primary user commands are Read and Write.
      • Both deal with 36-byte blocks. Simple FIFO
        interfaces.
  • Each channel is about 3% of the LX110T LUTs (no
    BRAMs).
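The figures on this slide imply a few derived numbers. The following is a minimal Python sketch of that arithmetic, using only values quoted above; the function and constant names are illustrative and do not come from the BEE3 gateware.

```python
# Back-of-the-envelope figures derived from the slide's numbers.
CLOCK_PERIOD_NS = 5        # "5 ns clock"
CALIBRATION_CLOCKS = 768   # "Calibration is fast (768 clocks)"
DIMMS_PER_CHANNEL = 2
CHANNELS_PER_FPGA = 2
DIMM_CAPACITY_GB = 4


def clock_mhz(period_ns):
    """Clock frequency in MHz for a given period in nanoseconds."""
    return 1000 / period_ns


def ddr_mega_transfers_per_s(period_ns):
    """Double data rate: two transfers per clock cycle."""
    return 2 * clock_mhz(period_ns)


def calibration_time_us(clocks, period_ns):
    """Duration of one calibration pass in microseconds."""
    return clocks * period_ns / 1000


def dram_per_fpga_gb(dimms_per_channel, channels, dimm_gb):
    """Maximum DRAM attached to one FPGA, in GB."""
    return dimms_per_channel * channels * dimm_gb


if __name__ == "__main__":
    print("clock (MHz):", clock_mhz(CLOCK_PERIOD_NS))
    print("transfer rate (MT/s):", ddr_mega_transfers_per_s(CLOCK_PERIOD_NS))
    print("calibration (us):",
          calibration_time_us(CALIBRATION_CLOCKS, CLOCK_PERIOD_NS))
    print("DRAM per FPGA (GB):",
          dram_per_fpga_gb(DIMMS_PER_CHANNEL, CHANNELS_PER_FPGA,
                           DIMM_CAPACITY_GB))
```

A 5 ns clock is 200 MHz, which gives the "DDR 400" rate, and a 768-clock calibration takes under 4 microseconds, which is why running it frequently is plausible.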

12
DDR Controller Organization
  • Centralized main controller
      • Main control FSM
      • Address FIFO (64 30-bit command/addresses)
      • Open bank CAMs.
      • Clock generation, timing limit enforcement.
  • Six replicated copies of the I/O pin bank logic
      • Read and Write FIFOs for 24 data bits (3 4-bit
        lanes, with one RAM chip/lane on each DIMM).
      • Calibration state machine, so that all 6 banks
        can calibrate in parallel.
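The widths quoted here can be cross-checked against the 36-byte block size from the previous slide. The sketch below is only arithmetic on those quoted numbers. Reading the 24-bit FIFO word as two DDR beats of 12 pins is an inference, not something the slides state, and all names are hypothetical.

```python
# Width bookkeeping for the DDR2 controller, from the slide's numbers.
IO_BANKS = 6              # "Six replicated I/O pin bank logic"
LANES_PER_BANK = 3        # "3 4-bit lanes"
BITS_PER_LANE = 4
FIFO_BITS_PER_BANK = 24   # "Read and Write Fifos for 24 data bits"
BLOCK_BYTES = 36          # "Both deal with 36-byte blocks"
ADDR_FIFO_ENTRIES = 64
ADDR_FIFO_WIDTH = 30


def data_pins():
    """Data pins per channel across all I/O banks."""
    return IO_BANKS * LANES_PER_BANK * BITS_PER_LANE


def beats_per_block():
    """Pin-wide transfers (beats) needed to move one user block."""
    return BLOCK_BYTES * 8 // data_pins()


def fifo_words_per_block():
    """FIFO words consumed per block across all banks.

    Assumption: each 24-bit FIFO word carries two beats (rising and
    falling edge) of a bank's 12 data pins.
    """
    return BLOCK_BYTES * 8 // (IO_BANKS * FIFO_BITS_PER_BANK)


def addr_fifo_bits():
    """Total storage in the address FIFO."""
    return ADDR_FIFO_ENTRIES * ADDR_FIFO_WIDTH


if __name__ == "__main__":
    print("data pins:", data_pins())
    print("beats per 36-byte block:", beats_per_block())
    print("FIFO words per block:", fifo_words_per_block())
    print("address FIFO bits:", addr_fifo_bits())
```

Under these assumptions the six banks cover 72 data pins, which would match the width of a standard ECC DIMM, and one 36-byte block corresponds to four beats on those pins.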

13
DDR Controller (simplified)

14
Control RISC (TC4)
  • 36 bits (memories are 36n bits wide)
  • Harvard architecture
      • 1K instruction memory (1 BRAM)
      • 1K data memory (1 BRAM)
      • 256-register, 3-port register file (2 BRAMs)
  • Very small (100 slices) Tiny Computer
  • All instructions execute in three 5 ns phases. No
    pipelining.
  • Assembler, no C compiler. Sigh.
  • Software so far: DRAM initialization, DRAM
    calibration, and a control shell with UART interface.
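A rough Python sketch of what these figures imply for instruction rate and memory footprint. It assumes the 36 Kb block RAM of the Virtex-5 family (the LX110T mentioned earlier is a Virtex-5 part); the names are illustrative and not taken from the TC4 sources.

```python
# Timing and storage arithmetic for the TC4 control processor.
PHASE_NS = 5
PHASES_PER_INSTRUCTION = 3
WORD_BITS = 36
MEMORY_WORDS = 1024           # "1K instruction memory", "1K data memory"
REGISTERS = 256
BRAM_BITS = 36 * 1024         # assumption: Virtex-5 36 Kb block RAM


def instruction_time_ns():
    """Time per instruction with no pipelining."""
    return PHASE_NS * PHASES_PER_INSTRUCTION


def instructions_per_second():
    """Peak instruction rate, one instruction per three phases."""
    return 1e9 / instruction_time_ns()


def memory_bits(words, word_bits=WORD_BITS):
    """Storage needed for a memory of the given depth."""
    return words * word_bits


def fits_in_one_bram(words):
    """True if a memory of this depth fits in a single block RAM."""
    return memory_bits(words) <= BRAM_BITS


if __name__ == "__main__":
    print("ns per instruction:", instruction_time_ns())
    print("million instructions/s:", instructions_per_second() / 1e6)
    print("1K x 36 memory bits:", memory_bits(MEMORY_WORDS))
    print("register file bits:", memory_bits(REGISTERS))
    print("1K memory fits one BRAM:", fits_in_one_bram(MEMORY_WORDS))
```

Three 5 ns phases give 15 ns per instruction, or roughly 67 million instructions per second, and a 1K x 36-bit memory fills exactly one 36 Kb block RAM under the stated assumption.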

15
TC4

16
Next Steps
  • Use Data2MEM to shorten the TC4 edit, assemble,
    load cycle
      • Currently takes 30 minutes, since we regenerate
        cores and rebuild the entire design.
      • Should be a couple of minutes.
  • Add DDR2 test system (LFSRs) to do full-speed
    testing with random addresses and data. Should be
    rock solid.
  • Use Xilinx PlanAhead to lock the design so that
    it can be used as a component in larger designs.
  • Develop an on-chip interconnect to allow multiple
    DDR2 requesters without needing huge cross-chip
    busses.
  • Use BEE3 in our own research programs
      • A couple have already started.
      • This is the fun part. Building it was just work.
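The LFSR-based memory test mentioned above can be modelled in software. The following Python sketch is an illustration of the general technique, not the planned gateware: it uses a standard 16-bit maximal-length Galois LFSR (taps 16, 14, 13, 11), and the seeds, function names, and dictionary standing in for DRAM are all assumptions made for the example.

```python
def lfsr16(seed):
    """Yield successive states of a 16-bit Galois LFSR (taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an LFSR seeded with zero never leaves zero")
    while True:
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400  # feedback mask for the chosen taps
        yield state


def lfsr_period(seed=0xACE1):
    """Count the steps needed for the LFSR to return to its seed."""
    gen = lfsr16(seed)
    count = 1
    while next(gen) != seed:
        count += 1
    return count


def run_memory_test(memory, n_ops, addr_seed=0xACE1, data_seed=0x1234):
    """Write pseudo-random data to pseudo-random addresses, then verify.

    The read pass re-seeds both LFSRs so the expected values are
    regenerated rather than stored, as a hardware tester would do.
    Returns the number of mismatching reads.
    """
    if n_ops > 0xFFFF:
        raise ValueError("n_ops must not exceed the LFSR period")
    addrs, data = lfsr16(addr_seed), lfsr16(data_seed)
    for _ in range(n_ops):
        memory[next(addrs)] = next(data)
    addrs, data = lfsr16(addr_seed), lfsr16(data_seed)
    errors = 0
    for _ in range(n_ops):
        if memory[next(addrs)] != next(data):
            errors += 1
    return errors


if __name__ == "__main__":
    print("LFSR period:", lfsr_period())
    print("errors on a fault-free model:", run_memory_test({}, 1000))
```

Because a maximal-length LFSR visits every nonzero state once per period, the address stream never repeats within a test of up to 65535 operations, so a later write cannot overwrite an earlier one before it is checked.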

17
Questions?