Digital Engineering Laboratory Course Introduction

About This Presentation
Title:

Digital Engineering Laboratory Course Introduction

Description:

Digital Engineering Laboratory Course Introduction & FPGA Concepts and Design, ECE 554, Department of Electrical and Computer Engineering, University of Wisconsin - Madison

Transcript and Presenter's Notes

Title: Digital Engineering Laboratory Course Introduction


1
Digital Engineering Laboratory: Course
Introduction & FPGA Concepts and Design
  • ECE 554
  • Department of Electrical and Computer Engineering
  • University of Wisconsin - Madison

2
Instructors and Course Website
  • Nam Sung Kim, nskim3_at_wisc.edu
  • Office: 4615 Engineering Hall
  • Office hours: Tue, Wed, Thu - 2:00 to 3:00 PM
  • Additional hours by
    appointment
  • Chunhua Yao, yao1_at_wisc.edu
  • Teaching Assistant for Labs
  • Office hours are the assigned lab hours, 3:30 to
    6:30 Tuesday and Thursday
  • The course website and wiki are at
  • http://homepages.cae.wisc.edu/ece554/new_website/
  • https://cgi.cae.wisc.edu/ece554/pmwiki/pmwiki.php

3
Course Objectives
  • Deal with problems and solutions associated with
    many aspects of a large digital design project
  • Work effectively as a member of a moderate-sized
    team
  • Use contemporary commercial design tools
  • Use programmable user-defined devices (FPGAs) for
    rapid prototyping
  • Learn to live on pizza and get by on very little
    sleep, at least during the last part of the
    course.

4
Prerequisites and Location
  • ECE 351 Digital Logic Laboratory
  • ECE/CS 552 Introduction to Computer
    Architecture
  • ECE 551 - Digital System Design and Synthesis
    (strongly recommended)
  • Laboratory: 3628 Engineering Hall
  • Lecture: 3444 EH
  • Lectures and Reviews during Lab Hours: 3444 EH

5
Access to the lab
  • Laboratory 3628 Engineering Hall
  • The lab access is password protected and you
    will have access to the lab 24/7
  • Password

6
Course Overview: Grading
  • 15% - Miniproject, due 2/5
  • Design a Special Purpose Asynchronous
    Receiver/Transmitter (team of 2)
  • 20% - Bench Exam on 2/26
  • Designed to test your understanding of Design
    Specifications, Verilog, Debugging, Lab
    Environment, etc. (individual)
  • 65% - Project: demos 5/5, report 5/14
  • Design, implement, test, and program a general or
    special purpose digital computer that emphasizes
    some particular features (team of 4 to 6)

7
Miniproject
  • For the miniproject, you will
  • Design a Special Purpose Asynchronous
    Receiver/Transmitter (SPART) and its testbench in
    Verilog/VHDL and use EDK toolset
  • Simulate the design to ensure correct performance
  • Download the design and associated files and
    demonstrate correct functionality
  • Prepare a report on your design
  • https://cgi.cae.wisc.edu/ece554/pmwiki/pmwiki.php?n=Main.MiniProject
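
The slide does not give the SPART framing details, so the following is only a minimal Python sketch of how an asynchronous serial link typically frames a byte. It assumes a conventional format of one start bit, 8 data bits sent LSB first, and one stop bit; the function names frame_byte and unframe_bits are hypothetical and are not taken from the course materials.

```python
def frame_byte(value):
    """Return the line levels for one frame (assumed format: start bit 0,
    8 data bits LSB first, stop bit 1). Not the official SPART spec."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in 8 bits")
    data_bits = [(value >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]


def unframe_bits(bits):
    """Recover the byte from a 10-bit frame, checking start and stop bits."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << i for i, bit in enumerate(bits[1:9]))
```

A testbench can use the same idea in reverse: drive a framed bit sequence into the receiver and compare the recovered byte with the one that was sent.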

8
Midterm Bench Exam
  • You will be given a set of specifications for a
    small system along with Verilog code for some
    pre-designed modules for the system.
  • You will be expected to
  • Understand the specifications
  • Understand the Verilog code provided
  • Write one or more Verilog modules
  • Debug one or more Verilog modules
  • Simulate one or more modules and the entire
    system
  • Synthesize and implement the design
  • Download, test, and demonstrate the design on the
    FPGA board

9
Project
  • Design, simulate, synthesize, test, download and
    demonstrate a non-trivial computer with an
    original instruction set architecture (ISA)
  • Four key requirements
  • It must be an original ISA (somewhat negotiable)
  • It must be non-trivial
  • It must be tractable - everything takes at least
    twice as long as you expect
  • It must interface through the serial port with
    the terminal emulator on the lab workstations
    (negotiable)
  • Often has significant software component and
    utilizes FPGA board interfaces

10
Project Milestone
  • Several major milestones
  • Project team selection: each team of 5 or 6
    (2/3)
  • Project proposal presentation (2/12)
  • Architecture review presentation (2/19)
  • ISA report due (2/24)
  • Microarchitecture review presentation (3/24)
  • Testing and demo review presentation (4/7)
  • Several progress reviews (see syllabus)
  • Project demonstrations (5/5)
  • Project report due (5/14)
  • For details see
  • https://cgi.cae.wisc.edu/ece554/pmwiki/pmwiki.php?n=Main.Milestones

11
Major Lab Enhancement
  • We have recently made major enhancements to the
    ECE 554 lab; bear with us through version updates
  • All new computers and monitors
  • All new FPGA boards and updated digital design
    software
  • Overall objectives of the lab will stay the same
  • Some additional changes may happen this semester
  • We will try to make the transition as smooth as
    possible thanks to Mitch
  • Go over the syllabus

12
FPGA Concepts and Design
  • CMOS IC design alternatives
  • RAM cell-based FPGA uses
  • The Xilinx Virtex Series FPGA technology
  • The Xilinx Integrated Software Environment (ISE)
    design process

13
CMOS IC Design Alternatives
[Diagram labels: Standard IC; ASIC; Full Custom; Semi-Custom; Field Programmable; Standard Cell; Gate Array, Sea of Gates; CPLD; FPGA]
  • Field Programmable Gate Array (FPGA): a hardware
    device with programmable logic, routing, memory,
    and I/O

14
RAM Cell-Based FPGA Uses
  • Prototyping gate array, standard cell, or full
    custom integrated circuits (ICs)
  • Prototyping complete systems
  • Implementing hardware simulation
  • Replacing ICs
  • Providing multifunction reconfigurable system ICs
  • Hardware accelerators

15
Xilinx Virtex FPGA Architecture
  • Primary Reference
  • On-Line Xilinx Data Sheet DS003 (v.2.5, April 2,
    2001) - http://www.xilinx.com/partinfo/ds003.pdf
  • Figure 1: Virtex Architecture Overview
  • IOBs - Input/Output Blocks
  • CLBs - Configurable Logic Blocks
  • Function generators, Flip-Flops, Combinational
    Logic, and Fast Carry Logic
  • GRM - General Routing Matrix
  • BRAMs - Block SelectRAM (configurable memory)
  • DLLs - Delay-Locked Loops for clock control
  • VersaRing - I/O interface routing resources

16
Figure 1: Virtex Architecture Overview
17
RAM-based FPGA
Xilinx XC4000ex
18
Virtex FPGA Architecture
  • Logic configured by values stored in SRAM cells
  • CLBs implement logic in SRAM-stored truth tables
  • CLBs also use SRAM-controlled multiplexers
  • Routing uses pass transistors for
    making/breaking connections between wire segments
  • Block RAMs allow programmable memories with
    configurable widths (1, 2, 4, 8, or 16 bits)

19
Look-up Table Based Logic Cell
20
Programmable Routing
21
Table 1 Virtex FPGA Family Members
  • We use the XCV800 device
  • 0.22 micron, five-layer metal process

22
IOB - Input/Output Block
  • See Figure 2 Virtex Input/Output Block
  • Separate signals for input (I), output (O), and
    output enable (T)
  • Three storage elements function as D flip-flops
    or latches with clock enable (CE) and set/reset
    (SR)
  • I/O pins can connect directly to internal logic
    or through the storage element
  • Programmable input delay
  • 3-state output buffer
  • I/O pad can use pull-up, pull-down, or weak
    keeper
  • Supports a wide range of voltages

23
Figure 2 Virtex Input/Output Block
24
CLB - Configurable Logic Block
  • See Figure 4 2-Slice Virtex CLB
  • Each slice contains two logic cells (LCs) and
    consists of
  • 2 4-input look-up tables (LUTs)
  • 2 D flip-flops/latches
  • Fast carry and control logic
  • Three-state drivers
  • SRAM control logic

25
Figure 4 2-Slice Virtex CLB
26
CLB - Configurable Logic Block
  • See Figure 5 Detailed View of Virtex Slice
  • Logic Function Implementation
  • 2 Function Generators - Each a 4-input LUT -
    implements any 4-input function
  • F5 multiplexer - combines two LUTs with select
    input - implements any 5-input function, 4-to-1
    mux, or selected functions of up to 9 inputs.
  • F6 multiplexer - combines outputs of two F5
    multiplexer - implements any 6-input function,
    8-to-1 mux, or selected functions of up to 19
    inputs.
  • Four direct feedthrough paths - useful to
    facilitate routing by use of through-the-cell
    paths
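
As a rough illustration of the function generators and the F5 multiplexer described above, here is a small Python model. It treats each 4-input LUT as a 16-entry truth table (the SRAM contents) and uses a fifth input to select between two LUTs, which is enough to realize any 5-input function. The helper names (make_lut4, lut4_read, f5_mux) and the choice of 5-input parity as the example function are assumptions made for illustration only.

```python
from itertools import product


def make_lut4(func):
    """Precompute the 16-entry truth table for a 4-input function."""
    table = []
    for index in range(16):
        a, b, c, d = [(index >> k) & 1 for k in range(4)]
        table.append(func(a, b, c, d) & 1)
    return table


def lut4_read(table, a, b, c, d):
    """The inputs form the address; the output is the stored bit."""
    return table[a | (b << 1) | (c << 2) | (d << 3)]


def f5_mux(lut_f, lut_g, a, b, c, d, sel):
    """F5 mux: sel picks which LUT output drives the result."""
    return lut4_read(lut_g if sel else lut_f, a, b, c, d)


def parity5(a, b, c, d, e):
    """Example 5-input function to be mapped onto two LUTs."""
    return a ^ b ^ c ^ d ^ e


# Each LUT holds the 5-input function with the fifth input fixed at 0 or 1.
lut_f = make_lut4(lambda a, b, c, d: parity5(a, b, c, d, 0))
lut_g = make_lut4(lambda a, b, c, d: parity5(a, b, c, d, 1))


def check_all_inputs():
    """Compare the two-LUT implementation with the reference function."""
    return all(
        f5_mux(lut_f, lut_g, a, b, c, d, e) == parity5(a, b, c, d, e)
        for a, b, c, d, e in product((0, 1), repeat=5)
    )
```

The same decomposition (fix the select input, store each half in one LUT) is what lets the F6 multiplexer extend the scheme to 6-input functions.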

27
Figure 5 Detailed View of Virtex Slice
28
CLB - Configurable Logic Block
  • Storage Elements
  • 2 D flip-flops/latches
  • Optionally included in cell output paths
  • Shared clock enable
  • Shared synchronous/asynchronous Set/Reset signals
  • SR - forces storage element into initialization
    state specified (0 or 1)
  • BY - forces storage element into opposite state

29
CLB - Configurable Logic Block
  • Fast Carry Logic (See Figures 4 and 5)
  • Two chains of two bits per CLB
  • AND gate (for mult), 0/1 Mux, CY Mux, EXOR
  • 3-state Drivers (BUFT) - on-chip drivers with
    independent control and input pins
  • Distributed LUT SelectRAMs - one per logic cell;
    2 LUTs can be reconfigured as one of
  • Two 16 x 1-bit synchronous RAM
  • 16 x 2-bit synchronous RAM
  • 32 x 1-bit synchronous RAM
  • 16 x 1-bit dual-port synchronous RAM
  • Two 16-bit shift registers
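
The fast carry logic items on this slide (CY Mux, EXOR) can be sketched in Python as follows. This is a simplified model, not the exact slice circuit: it assumes the LUT computes the propagate signal a XOR b, the carry multiplexer forwards the incoming carry when propagate is 1 and otherwise passes a, and the XOR gate forms the sum bit. The function names carry_bit and ripple_add are hypothetical.

```python
def carry_bit(a, b, carry_in):
    """One bit position of a carry chain (simplified model)."""
    propagate = a ^ b                          # computed in the LUT
    carry_out = carry_in if propagate else a   # carry multiplexer (CY Mux)
    sum_bit = propagate ^ carry_in             # dedicated XOR gate
    return sum_bit, carry_out


def ripple_add(x, y, width):
    """Chain `width` carry bits to add two unsigned integers."""
    carry = 0
    total = 0
    for i in range(width):
        sum_bit, carry = carry_bit((x >> i) & 1, (y >> i) & 1, carry)
        total |= sum_bit << i
    return total, carry
```

When propagate is 0 the two operand bits are equal, so passing a through the multiplexer yields the same carry as a AND b; that is why no separate generate gate is needed in this model.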

30
Block SelectRAM
  • Fully synchronous dual-ported 4096-bit RAM
  • Stores address, data and write-control signal on
    inputs at clock edge
  • Cannot change address, even for read, without
    using clock
  • Independent control signals for each port
  • Organized in vertical columns of blocks on left
    and right of CLB array
  • Block height is 4 CLBs => number of block RAMs
    per column is (height of CLB array)/4
  • See Tables 3 & 4 and Figure 6.
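
The block-count rule above can be checked with a little arithmetic. The Python sketch below assumes two block RAM columns (left and right of the CLB array, as stated on this slide), reads the XCV800 "CLB Matrix 56 x 84" entry as 56 rows, and uses the configurable widths (1, 2, 4, 8, or 16 bits) listed on the Virtex FPGA Architecture slide. The function names are hypothetical.

```python
BLOCK_RAM_BITS = 4096   # bits per Block SelectRAM
CLBS_PER_BLOCK = 4      # each block is 4 CLBs high


def blocks_per_column(clb_rows):
    """Number of block RAMs that fit in one column of the given height."""
    return clb_rows // CLBS_PER_BLOCK


def total_block_ram_bits(clb_rows, columns=2):
    """Total block RAM bits, assuming one column on each side of the array."""
    return blocks_per_column(clb_rows) * columns * BLOCK_RAM_BITS


def depth_for_width(width_bits):
    """Number of addressable words for a given configured data width."""
    if width_bits not in (1, 2, 4, 8, 16):
        raise ValueError("width must be 1, 2, 4, 8, or 16 bits")
    return BLOCK_RAM_BITS // width_bits
```

With 56 CLB rows this gives 14 blocks per column, 28 blocks in total, and 114,688 bits, which agrees with the Block RAM Bits figure on the XCV800 Characteristics slide.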

31
Tables 3 & 4 and Figure 6
32
Programmable Routing Matrix
  • Local Routing
  • See Figure 7 Virtex Local Routing
  • Interconnections among LUTs, flip-flops, and
    General Routing Matrix (GRM)
  • Internal CLB feedback paths that can chain LUTs
    together
  • Direct paths between horizontally-adjacent CLBs
  • Short connections with few pass transistors =>
    low delay => high-speed connections
  • Combination of hardware and software is used to
    try to minimize routing delay

33
Figure 7 Virtex Local Routing
34
Programmable Routing Matrix
  • General Purpose Routing
  • Majority of interconnect resources
  • In horizontal and vertical routing channels
    associated with rows and columns of CLBs
  • GRM - Switch matrix through which horizontal and
    vertical routing resources connect and means by
    which CLBs access general purpose routing
  • 24 single-length lines between adjacent GRMs in 4
    directions
  • 12 buffered hex lines route GRM signals to other
    GRMs 6 blocks away in 4 directions (can be
    accessed 3 or 6 blocks away)
  • 12 longlines are buffered bidirectional wires
    that distribute signals across the device
  • Vertical - span full device height
  • Horizontal - span full device width

35
Programmable Routing Matrix
  • I/O Routing
  • VersaRing
  • Supports pin-swapping and pin-locking
  • Facilitates pin-out flexibility
  • Dedicated Routing (not programmable)
  • Four partitionable bus lines per CLB row driven
    by BUFTs (See Figure 8 BUFT Connections)
  • Two dedicated nets per CLB for vertical carry
    signals to adjacent cells

36
Figure 8 BUFT Connections
37
Programmable Routing Matrix
  • Global Routing
  • Distribute clocks and other signals with high
    fanout
  • Primary Global Routing
  • Four dedicated global nets with dedicated input
    pins for clocks
  • Driven by global buffers
  • Can drive all CLB, IOB, and BRAM clock pins
  • Secondary Global Routing
  • 24 backbone lines, 12 across top of chip and 12
    across bottom of chip
  • From these, can distribute 12 unique
    signals/column via 12 longlines in column
  • Not restricted to routing only to clock pins

38
Clock Distribution
  • Via primary global routing resources
  • See Figure 9 Global Clock Distribution Network
  • Four global buffers
  • Two at top center
  • Two at bottom center
  • Four dedicated clock input pads
  • Input to global buffers from pads or from general
    purpose routing

39
Figure 9 Global Clock Distribution Network
40
Delay-Locked Loops (DLLs)
  • One associated with each clock buffer
  • Eliminate skew between clock input pad and
    internal clock-input pins within the device
  • Each can drive two global clock networks
  • Clock edges reach internal flip-flops 1 to 4
    clock periods after they arrive at the input.
  • Provides control of multiple clock domains
  • Has minimum clock frequency restrictions!

41
Table 1 and Figures 4 & 7
42
Boundary Scan
  • IEEE(ANSI) Standard 1149.1
  • Provides Ability to Observe and Control I/O pins
  • Accessed Through a Standard Test Access Port
    (TAP)
  • Additional Logic Includes Test Instruction
    Register, ID Register, two User Registers and a
    One Bit Bypass Register.
  • Uses
  • Test Interconnects between ICs on Boards
  • Perform Tests on Internal Logic
  • Initialize Built-In Self-Test (BIST) Logic
  • Perform Sampling During Normal Operation

43
Configuration
  • How is the FPGA configured?
  • Implemented by
  • Clearing configuration memory
  • Loading configuration data into 2-D configuration
    SRAM
  • Activating logic via a startup process
  • Configuration Modes
  • Slave-Serial - FPGA receives bit-serial data
    (e.g., from PROM) synchronized by an external
    clock
  • Master-Serial - FPGA receives bit-serial data
    (e.g., from PROM) synchronized by FPGA clock
  • SelectMAP - Byte-wide data is written into the
    FPGA with a BUSY flag from FPGA controlling the
    flow of data
  • Boundary-scan - Configuration is done through the
    Test Access Port
  • The XCV800 device requires 4,715,616
    configuration bits

44
XCV800 Characteristics
  • Maximum Gate Count: 888,439
  • CLB Matrix: 56 x 84
  • Logic Cells: 21,168
  • Maximum IOBs: 512
  • Flip-Flop Count: 43,872
  • Block RAM Bits: 114,688
  • Horizontal TBUF Long Lines: 224
  • TBUFs per Long Line: 168
  • Program Data (bits): 4,715,616
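
Some of these figures can be related to the CLB structure described earlier (2 slices per CLB, 2 logic cells per slice). The Python sketch below does that arithmetic. Note that the listed logic cell count works out to 4.5 per CLB rather than the 4 physical logic cells; the sketch only reports that ratio, and treating it as a vendor counting convention is an assumption rather than something stated on the slides.

```python
SLICES_PER_CLB = 2          # from the 2-slice CLB figure
LOGIC_CELLS_PER_SLICE = 2   # from the CLB slide
LISTED_LOGIC_CELLS = 21_168 # from this slide


def clb_count(rows, cols):
    """Total CLBs in the array."""
    return rows * cols


def physical_logic_cells(rows, cols):
    """Logic cells implied by the slice structure alone."""
    return clb_count(rows, cols) * SLICES_PER_CLB * LOGIC_CELLS_PER_SLICE


def listed_cells_per_clb(rows, cols):
    """Ratio of the listed logic cell count to the number of CLBs."""
    return LISTED_LOGIC_CELLS / clb_count(rows, cols)
```

For the 56 x 84 array this gives 4,704 CLBs and 18,816 physical logic cells, each with one 4-input LUT.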

45
THE ECE 554 XILINX DESIGN PROCESS
  • Design process overview
  • Design reference
  • Design tutorial
  • What's next

46
Design Process Steps
  • Definition of system requirements.
  • Example: ISA (instruction set architecture) for
    CPU.
  • Includes software and hardware interfaces with
    timing.
  • May also include cost, speed, power, reliability
    and maintainability specifications.
  • Definition of system architecture.
  • Example: high-level HDL (hardware description
    language) representation (this is optional in
    ECE 554, but is done in the real world).
  • Useful for system validation and verification and
    as a basis for lower level design execution and
    validation or verification.

47
Design Process Steps (continued)
  • Refinement of system architecture
  • In manual design, descent in hierarchy, designing
    increasingly lower-level components
  • In synthesized design, transformation of
    high-level HDL to synthesizable register
    transfer level (RTL) HDL
  • Logic design or synthesis
  • In manual or synthesized design, development of
    logic design in terms of library components
  • Result is logic level schematic or netlist
    representation or combinations of both.
  • Both manual design and synthesis typically
    involve optimization of cost, area, or delay.

48
Design Process Steps (Continued)
  • Implementation
  • Conversion of the logic design to physical
    implementation
  • Involves the processes of
  • Mapping of logic to physical elements,
  • Placing of resulting physical elements,
  • And routing of interconnections between the
    elements.
  • In case of SRAM-based FPGAs, represented by the
    programming bitstream which generates the
    physical implementation in the form of CLBs,
    IOBs, BRAMs, and the interconnections between them

49
Design Process Steps (continued)
  • Validation - test and debug (used at several
    steps in the process)
  • At architecture level - functional simulation of
    HDL
  • At RTL level - functional simulation of RTL HDL
  • At logic design or synthesis - functional
    simulation of gate-level circuit - not usually
    done, but recommended in ECE 554
  • At implementation - timing simulation of
    schematic, netlist or HDL with implementation-based
    timing information (functional simulation
    can also be useful here)
  • At programmed FPGA level - in-circuit test of
    function and timing

50
Xilinx HDL/Core Design Flow
DESIGN ENTRY
CORE GENERATION
RTL HDL EDITING
RTL HDL-CORE SIMULATION
SYNTHESIS
IMPLEMENTATION
TIMING SIMULATION
FPGA PROGRAMMING & IN-CIRCUIT TEST
51
Xilinx HDL/Core Design Flow - HDL Editing
Accessed within ISE Foundation
DESIGN WIZARD
LANGUAGE ASSISTANT
HDL EDITOR
52
Xilinx HDL/Core Design Flow - Core Generation
Select core and specify input parameters
CORE GENERATOR
EDIF netlist for core_name
Other core_name files
53
Xilinx HDL/core Design Flow - HDL Functional
Simulation
Set Up and Map work Library
RTL HDL Files
Testbench HDL Files
Compile HDL Files
Test Inputs or Force Files
MODELSIM
Functional Simulate
Waveforms or List Files
54
Xilinx HDL Design Flow - Synthesis
All HDL Files
Edit FPGA Express Synthesis Constraints
Select Top Level
Synthesis/Implementation Constraints
Select Target Device
Xilinx ISE
Synthesize
Gate/Primitive Netlist Files (EDIF or XNF)
Synthesis Report Files
55
Xilinx HDL/core Design Flow - Implementation
Gate/Primitive Netlist Files (XNF or EDN)
Netlist Translation
XILINX ISE
Map
Place & Route
Model Extraction
Timing Model Gen
Create Bitstream
HDL or EDIF for Implemented Design
Standard Delay Format File
BIT File
56
Xilinx HDL/core Design Flow - Timing Simulation
HDL or EDIF for Implemented Design
Standard Delay Format File
Set Up and Map work Directory
Testbench HDL Files
Compile HDL Files
MODELSIM
Test Inputs, Force Files
Compiled HDL
HDL Simulate
Waveforms or List Files
57
Xilinx HDL Design Flow - Programming and
In-circuit Verification
Bit File
Input Byte
GXSLOAD
ECE 554 FPGA Board
GXSPORT
Other Inputs
Outputs
58
Design Practices
  • Use synchronous design.
  • CLBs are actually reading functions from SRAM
  • Avoid clock gating.
  • Avoid ripple counters.
  • Avoid use of direct sets and resets except for
    initialization.
  • Synchronize asynchronous signals as needed.
  • Test and debug each component design
  • Rule of 10: it requires ten times more effort to
    debug a design that has untested components in it.
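
The slide does not say how to synchronize asynchronous signals. One common approach, shown here only as an illustrative assumption, is to pass the signal through two flip-flops clocked by the destination clock. The Python sketch below is a cycle-level behavioral model of that structure; it cannot represent metastability itself, only the two-cycle latency that the extra stage introduces. The class name TwoFlopSynchronizer is hypothetical.

```python
class TwoFlopSynchronizer:
    """Cycle-level model of two flip-flops in series on one clock."""

    def __init__(self):
        self.stage1 = 0  # first flop, samples the asynchronous input
        self.stage2 = 0  # second flop, drives the synchronous logic

    def clock(self, async_in):
        """Apply one clock edge; both flops update from their old inputs."""
        self.stage2, self.stage1 = self.stage1, async_in & 1
        return self.stage2
```

Downstream logic should use only the output of the second stage, so that it never sees the first flop while it may still be settling.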

59
What's Next
  • The HDL/core design flow tutorial will employ
    the flow described, using a Verilog HDL/core
    example
  • During lab time on Tuesday
  • https://cgi.cae.wisc.edu/ece554/pmwiki/pmwiki.php?n=Documentation.Tutorial
  • Read over the tutorial before coming to lab
  • Find a partner for the miniproject by next
    Tuesday
  • Start looking over the course website
  • If you feel rusty with Verilog, take a look at
    lecture 2

60
Tutorial Overview
  • Use the tools in the lab to design, simulate, and
    implement a simple design
  • Use of embedded tool kit to help implement the
    miniproject
  • Multiply-accumulate unit
  • Main steps include
  • Performing HDL coding for synthesis (Xilinx ISE)
  • Using cores (Xilinx Core Generator)
  • Behavioral simulation of synthesizable HDL code
    (ModelSim)
  • Design synthesis (translation) (Xilinx ISE)
  • Design implementation (map, place & route)
    (Xilinx ISE)
  • Timing (post-Implementation) simulation
    (ModelSim)
  • Generating the FPGA programming file (Xilinx ISE)