Programmable Logic: Introduction - PowerPoint PPT Presentation

Slides: 178
Provided by: bren178
Transcript and Presenter's Notes

1
Programmable Logic: Introduction
  • Digital electronics: a few reminders of basic
    ideas and concepts
  • Combinational logic
  • Sequential logic
  • Synchronous vs. Asynchronous designs
  • Programmable logic devices (PLD)
  • Hardware overview
  • Field-programmable gate arrays (FPGA)
  • Basics
  • Design flow
  • Introduction to VHDL
  • Course material at http://elearning.uni-giessen.de/studip

2
Basic Logic Gates
3
Example
4
Designing with NAND and NOR Gates
  • Implementation of NAND and NOR gates is easier
    than that of AND and OR gates (e.g., CMOS)

5
Combinational Logic
  • Has no memory => present state depends only on
    the present input

[Figure: combinational logic block with inputs X = (x1, x2, ..., xn) and outputs Z = (z1, z2, ..., zm)]
6
Combinational-Circuit Building Blocks
  • Multiplexers
  • Decoders
  • Encoders
  • Code Converters
  • Comparators
  • Adders/Subtractors
  • Multipliers
  • Shifters

7
Example: Multiplexers (2-to-1 Multiplexer)
  • Have a number of data inputs, one or more select
    inputs, and one output
  • Pass the signal value on one of the data inputs
    to the output

[Figure: 2-to-1 multiplexer with data inputs w0 and w1, select input s, and output f. (a) Graphical symbol, (b) Truth table, (c) Sum-of-products circuit]

(b) Truth table
s   f
0   w0
1   w1
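
The multiplexer truth table can be written down directly in VHDL. The following is a minimal sketch, not taken from the slides: the entity name mux2to1 and the architecture name are illustrative, and the port names follow the figure labels (w0, w1, s, f).

```vhdl
-- 2-to-1 multiplexer: f follows w0 when s = '0' and w1 when s = '1'.
-- Entity and architecture names are illustrative assumptions.
entity mux2to1 is
    port (
        w0, w1 : in  bit;  -- data inputs
        s      : in  bit;  -- select input
        f      : out bit   -- output
    );
end mux2to1;

architecture sum_of_products of mux2to1 is
begin
    -- Sum-of-products form of the truth table:
    -- f = (not s and w0) or (s and w1)
    f <= (not s and w0) or (s and w1);
end sum_of_products;
```

The same behavior could also be expressed with a conditional signal assignment (f <= w0 when s = '0' else w1); the sum-of-products form is used here because it mirrors panel (c) of the figure.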
8
Example: Full Adder
[Figure: full adder module and its truth table]
9
Sequential Circuits
  • Circuits with Feedback
  • Outputs = f(inputs, past inputs, past outputs)
  • Basis for building "memory" into logic circuits
  • Door combination lock is an example of a
    sequential circuit
  • State => memory
  • State can be "output" and "input" to
    combinational logic or to other sequential logic

10
Simplest Circuits with Feedback
  • Two inverters form a static memory cell
  • Will hold value as long as it has power
    applied
  • How to get a new value into the memory cell?
  • Selectively break feedback path
  • Load new value into cell

11
Clocks
  • Used to keep time
  • Wait long enough for inputs to settle
  • Then allow to have effect on value stored
  • Clocks are regular periodic signals
  • Period (time between ticks)
  • Duty cycle (time the clock is high between ticks,
    expressed as % of period)

[Figure: clock waveform marking the period and the duty cycle (in this case, 50%)]
12
Edge-Triggered Flip-Flops
  • Positive edge-triggered
  • Inputs sampled on rising edge; outputs change
    after rising edge
  • Negative edge-triggered flip-flops
  • Inputs sampled on falling edge; outputs change
    after falling edge

[Figure: timing diagram with waveforms D, CLK, Qpos, Qpos', Qneg, Qneg' for a positive edge-triggered FF and a negative edge-triggered FF]
13
Comparison of Latches and Flip-Flops
[Figure: timing diagram with waveforms D, CLK, Qedge, Qlatch, comparing a positive edge-triggered flip-flop with a transparent (level-sensitive) latch]
Behavior is the same unless the input changes while
the clock is high.
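
The difference between the two storage elements shows up clearly when both are described side by side. The sketch below is an illustration only: the entity name ff_vs_latch is hypothetical, and the port names are borrowed from the waveform labels (D, CLK, Qedge, Qlatch).

```vhdl
-- Edge-triggered flip-flop and transparent latch driven by the same D and CLK.
-- The entity name is a hypothetical choice for this sketch.
entity ff_vs_latch is
    port (
        D, CLK : in  bit;
        Qedge  : out bit;  -- flip-flop output
        Qlatch : out bit   -- latch output
    );
end ff_vs_latch;

architecture behavior of ff_vs_latch is
begin
    -- Positive edge-triggered flip-flop: D is sampled only on the rising edge.
    flip_flop: process (CLK)
    begin
        if CLK'event and CLK = '1' then
            Qedge <= D;
        end if;
    end process flip_flop;

    -- Transparent latch: while CLK is high, Qlatch follows every change of D.
    latch: process (CLK, D)
    begin
        if CLK = '1' then
            Qlatch <= D;
        end if;
    end process latch;
end behavior;
```

If D changes while CLK is high, Qlatch changes with it but Qedge keeps the value sampled at the edge, which is the only case where the two outputs differ.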
14
Timing Methodologies
  • Rules for interconnecting components and clocks
  • Guarantee proper operation of system when
    strictly followed
  • Approach depends on building blocks used for
    memory elements
  • Focus on systems with edge-triggered flip-flops
  • Found in programmable logic devices
  • Basic rules for correct timing
  • (1) Correct inputs, with respect to time, are
    provided to the flip-flops
  • (2) No flip-flop changes state more than once per
    clocking event

15
Timing Methodologies (contd)
  • Definition of terms
  • clock: periodic event that causes the state of a
    memory element to change; can be rising or falling
    edge, or high or low level
  • setup time: minimum time before the clocking
    event by which the input must be stable (Tsu)
  • hold time: minimum time after the clocking event
    until which the input must remain stable (Th)

[Figure: data and clock waveforms; the data is marked stable around the clocking event and changing elsewhere]
There is a timing "window" around the clocking
event during which the input must remain stable
and unchanged in order to be recognized.
16
Typical Timing Specifications
  • Positive edge-triggered D flip-flop
  • Setup and hold times
  • Minimum clock width
  • Propagation delays (low to high, high to low, max
    and typical)

All measurements are made from the clocking event,
that is, the rising edge of the clock.
17
Synchronous vs. Asynchronous Designs
  • Clocked synchronous circuits
  • Inputs, state, and outputs sampled or changed in
    relation to a common reference signal (the clock)
  • Asynchronous circuits
  • Inputs, state, and outputs sampled or changed
    independently of a common reference signal
    (glitches/hazards a major concern)
  • Stay away from asynchronous designs!
  • Asynchronous inputs to synchronous circuits
  • Inputs can change at any time and will not meet
    setup/hold times
  • Dangerous; synchronous inputs are greatly
    preferred
  • Cannot be avoided (e.g., reset signal, memory
    wait, user input)
  • Solution: synchronize with the clock as early as
    possible!
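
One common way to synchronize an asynchronous input is to pass it through two flip-flops clocked by the system clock before any other logic sees it. The slides do not name a specific circuit, so the following is a sketch under that assumption; the entity name sync2ff and the port names are hypothetical.

```vhdl
-- Two-flip-flop synchronizer (assumed technique, not taken from the slides).
-- All names are hypothetical.
entity sync2ff is
    port (
        CLK      : in  bit;
        ASYNC_IN : in  bit;  -- may change at any time
        SYNC_OUT : out bit   -- changes only in relation to CLK
    );
end sync2ff;

architecture rtl of sync2ff is
    signal stage1, stage2 : bit := '0';
begin
    process (CLK)
    begin
        if CLK'event and CLK = '1' then
            stage1 <= ASYNC_IN;  -- first stage may violate setup/hold
            stage2 <= stage1;    -- second stage gives it a clock period to settle
        end if;
    end process;

    SYNC_OUT <= stage2;
end rtl;
```

The rest of the design should read only SYNC_OUT, so that the asynchronous signal is sampled in exactly one place.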

18
Overview IC Technology
  • In the early 80s
  • Generic logic circuits (example: TTL SN7400)
  • Complex applications assembled from basic
    building blocks: chips with few (< 10) hardwired
    logic functions
  • Many PCBs, interconnects, inflexibility, cost ...
  • 90s: VLSI circuits + glue logic
  • Now: three types of IC technologies
  • Full-custom ASIC
  • Semi-custom ASIC (gate array and standard cell)
  • PLD (Programmable Logic Device)

19
NRE and unit cost metrics
  • Unit cost
  • The monetary cost of manufacturing each copy of
    the system, excluding NRE cost
  • NRE cost (Non-Recurring Engineering cost)
  • The one-time monetary cost of designing the
    system
  • total cost = NRE cost + unit cost * # of units
  • per-product cost = total cost / # of units
    = (NRE cost / # of units) + unit cost
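
A short worked example may help show how the NRE share shrinks with volume. The figures below (an NRE cost of $100,000 and a unit cost of $10) are hypothetical and chosen only for easy arithmetic.

```latex
\begin{align*}
\text{total cost} &= \text{NRE cost} + \text{unit cost} \times \text{\# of units} \\
                  &= \$100{,}000 + \$10 \times 1{,}000 = \$110{,}000 \\
\text{per-product cost (1,000 units)} &= \frac{\$110{,}000}{1{,}000} = \$110 \\
\text{per-product cost (100,000 units)} &= \frac{\$100{,}000}{100{,}000} + \$10 = \$11
\end{align*}
```

At low volume the NRE term dominates the per-product cost; at high volume the unit cost dominates.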

20
General-purpose processors
  • Programmable device used in a variety of
    applications
  • Also known as microprocessor
  • Features
  • Program memory
  • General datapath with large register file and
    general ALU
  • User benefits
  • Low time-to-market and NRE costs
  • High flexibility
  • Examples: Pentium, ARM, ...

21
Application-specific processors
  • Programmable processor optimized for a particular
    class of applications having common
    characteristics
  • Features
  • Program memory
  • Optimized datapath
  • Special functional units
  • Benefits
  • Some flexibility, good performance, size and
    power
  • Examples: DSP, media processor

[Figure: processor block diagram with a controller (control logic and state register, IR, PC), a datapath (registers, custom ALU), data memory, and program memory holding assembly code for: total = 0; for i = 1 to ...]
22
Single-purpose hardware
  • Digital circuit designed to execute exactly one
    program
  • coprocessor, accelerator
  • Features
  • Contains components needed to execute a single
    program
  • No program memory
  • Benefits
  • Fast
  • Low power
  • Small size

23
Full-custom/VLSI
  • All layers are optimized for an embedded system's
    particular digital implementation
  • Placing transistors
  • Sizing transistors
  • Routing wires
  • Benefits
  • Excellent performance, small size, low power
  • Drawbacks
  • High NRE cost (e.g., $300k), long time-to-market

24
Semi-custom
  • Lower layers are fully or partially built
  • Designers are left with routing of wires and
    maybe placing some blocks
  • Benefits
  • Good performance, good size, less NRE cost than a
    full-custom implementation (perhaps $10k to
    $100k)
  • Drawbacks
  • Still requires weeks to months to develop

25
PLD (Programmable Logic Device)
  • All layers already exist
  • Designers can purchase an IC
  • Connections on the IC are either created or
    destroyed to implement desired functionality
  • Field-Programmable Gate Array (FPGA) is very popular
  • Benefits
  • Low NRE costs, almost instant IC availability
  • Drawbacks
  • Bigger, expensive (perhaps $30 per unit), power
    hungry, slower

26
Comparison of different technologies
[Figure: comparison of the technologies in terms of flexibility and speed]
27
Roadmap for Programmable Logic
  • PROM
  • PLA
  • PAL
  • CPLD
  • FPGA

28
PLD Definition
  • Programmable Logic Device (PLD)
  • An integrated circuit chip that can be configured
    by the end user to implement different digital
    hardware
  • Also known as Field Programmable Logic Device
    (FPLD)

29
PLD Advantages
  • Short design time
  • Less expensive at low volume

[Figure: cost versus volume for PLD and ASIC, with the nonrecurring engineering cost marked]
30
PLD Categorization

PLD
  • SPLD (Simple PLD)
  • PLA (Programmable Logic Array)
  • PAL (Programmable Array Logic)
  • HCPLD (High Capacity PLD)
  • CPLD (Complex PLD)
  • FPGA (Field Programmable Gate Array)
31
Programmable ROM (PROM)
2^N x M ROM
N inputs
M outputs
  • Address: N bits; output word: M bits
  • ROM contains 2^N words of M bits each
  • The input bits decide the particular word that
    becomes available on the output lines

32
Logic Diagram of 8x3 PROM
33
Combinational Circuit Implementation using PROM
[Figure: truth table with inputs I0, I1, I2 and outputs F0, F1, F2, and the PROM programmed to produce F0, F1, F2]
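
A PROM used this way is simply a table indexed by the input bits. The sketch below models an 8x3 ROM in VHDL. The slide's actual truth table did not survive in the transcript, so the contents here are an assumption: a 3-bit binary to Gray code converter. All names (prom8x3, CONTENTS, to_index) are hypothetical.

```vhdl
-- 8x3 ROM implementing a combinational function of three inputs.
-- Contents are a hypothetical example: 3-bit binary to Gray code.
entity prom8x3 is
    port (
        I : in  bit_vector(2 downto 0);  -- address: I2 I1 I0
        F : out bit_vector(2 downto 0)   -- output word: F2 F1 F0
    );
end prom8x3;

architecture rom of prom8x3 is
    type rom_t is array (0 to 7) of bit_vector(2 downto 0);

    -- One stored word per input combination (2^3 = 8 words of 3 bits).
    constant CONTENTS : rom_t :=
        ("000", "001", "011", "010", "110", "111", "101", "100");

    -- Convert the address bits to an array index, most significant bit first.
    function to_index (A : bit_vector(2 downto 0)) return natural is
        variable n : natural := 0;
    begin
        for k in A'range loop
            n := n * 2;
            if A(k) = '1' then
                n := n + 1;
            end if;
        end loop;
        return n;
    end to_index;
begin
    F <= CONTENTS(to_index(I));
end rom;
```

Changing the implemented function means changing only the CONTENTS constant, which corresponds to reprogramming the PROM.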
34
PROM Types
  • Programmable ROM (PROM)
  • Break links through current pulses
  • Write once, Read multiple times
  • Erasable PROM (EPROM)
  • Program with ultraviolet light
  • Write multiple times, Read multiple times
  • Electrically Erasable PROM (EEPROM)/ Flash Memory
  • Program with electrical signal
  • Write multiple times, Read multiple times

35
PROM Advantages and Disadvantages
  • Widely used to implement functions with large
    number of inputs and outputs
  • Design of control units (Micro-programmed control
    units)
  • For combinational circuits with lots of don't-care
    terms, PROM is a waste of logic resources

36
CPLDs and FPGAs

CPLD: Complex Programmable Logic Device
FPGA: Field-Programmable Gate Array

                CPLD                      FPGA
Architecture    PAL/22V10-like            Gate array-like
                More combinational        More registers + RAM
Density         Low-to-medium             Medium-to-high
                0.5-10K logic gates       1K to 3.2M system gates
Performance     Predictable timing        Application dependent
                Up to 250 MHz today       Up to 200 MHz today
Interconnect    Crossbar switch           Incremental
37
CPLD
[Figure: CPLD structure with four logic blocks and I/O blocks connected through a programmable interconnect]
38
FPGA Overview
  • Basic idea: 2D array of combinational logic blocks
    (CL) and flip-flops (FF), with a means for the
    user to configure both
  • 1. the interconnection between the logic blocks,
  • 2. the function of each block.

Simplified version of FPGA internal architecture
39
Where are FPGAs in the IC Zoo?
Source: Dataquest
[Figure: IC categories shown: Programmable Logic Devices (PLDs), Gate Arrays, Cell-Based ICs, Full Custom ICs, SPLDs (PALs), FPGAs]
Acronyms
  • SPLD: Simple Prog. Logic Device
  • PAL: Prog. Array of Logic
  • CPLD: Complex PLD
  • FPGA: Field Prog. Gate Array
  • (Standard logic is SSI or MSI: buffers, gates)
Common resources
  • Configurable Logic Blocks (CLB): memory look-up
    table, AND-OR planes, simple gates
  • Input/Output Blocks (IOB): bidirectional,
    latches, inverters, pullups/pulldowns
  • Interconnect or routing: local, internal
    feedback, and global
40
FPGA Variations
  • Families of FPGAs differ in
  • physical means of implementing user
    programmability,
  • arrangement of interconnection wires, and
  • basic functionality of logic blocks
  • Most significant difference is in the method for
    providing flexible blocks and connections
  • Anti-fuse based (e.g., Actel)
  • + Non-volatile, relatively small
  • - Fixed (non-reprogrammable)
  • (Almost used in 150 Lab: only one shot at getting
    it right!)

41
User Programmability
  • Latches are used to
  • 1. make or break cross-point connections in
    interconnect
  • 2. define function of logic blocks
  • 3. set user options
  • within the logic blocks
  • in the input/output blocks
  • global reset/clock
  • Configuration bit stream loaded under user
    control
  • All latches are strung together in a shift chain
  • Programming => creating bit stream
  • Latch-based (Xilinx, Altera, ...)
  • + Reconfigurable
  • - Volatile
  • - Relatively large die size
  • Note: today 90% of the die is interconnect, 10% is gates

42
Xilinx Programmable Logic
43
Xilinx CPLD/FPGA
[Figure: Xilinx device families plotted as features versus density (system gates), with the density axis marked 10K, 600K, 10M]
  • Virtex-II: FPGAs, SRAM-based, feature rich, high performance
  • Spartan-IIE: FPGAs, SRAM-based, feature rich, low cost
  • CPLDs: low power
44
Idealized FPGA Logic Block
  • 4-input Look Up Table (4-LUT)
  • implements combinational logic functions
  • Register
  • optionally stores output of LUT
  • Latch determines whether the output comes from the
    register or the LUT

45
XILINX SPARTAN IIE CLB Structure
  • Each slice has 2 LUT-FF pairs with associated
    carry logic
  • Two 3-state buffers (BUFT) associated with each
    CLB, accessible by all CLB outputs

46
LUT Implementation
  • n-bit LUT is actually implemented as a 2^n x 1
    memory
  • Inputs choose one of 2^n memory locations.
  • Memory locations (latches) are normally loaded
    with values from the user's configuration bit stream.
  • Inputs to mux control are the CLB (Configurable
    Logic Block) inputs.
  • Result is a general purpose logic gate.
  • n-LUT can implement any function of n inputs!
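
The "LUT as a small memory" idea can be modeled in a few lines of VHDL. This is a behavioral sketch for illustration, not a vendor primitive: the entity name lut4 and the generic INIT are assumptions, and the default contents implement a 4-input AND.

```vhdl
-- Behavioral model of a 4-input LUT as a 16 x 1 memory.
-- The name lut4 and the generic INIT are hypothetical.
entity lut4 is
    generic (
        -- Stored truth table; bit k is the output for input combination k.
        -- X"8000" sets only bit 15, i.e. a 4-input AND.
        INIT : bit_vector(15 downto 0) := X"8000"
    );
    port (
        A : in  bit_vector(3 downto 0);  -- LUT inputs act as the address
        Y : out bit
    );
end lut4;

architecture memory of lut4 is
begin
    process (A)
        variable addr : natural;
    begin
        -- Convert the four input bits to a memory address (MSB first).
        addr := 0;
        for k in A'range loop
            addr := addr * 2;
            if A(k) = '1' then
                addr := addr + 1;
            end if;
        end loop;
        -- Read the selected memory location.
        Y <= INIT(addr);
    end process;
end memory;
```

Any other function of four inputs is obtained by changing INIT alone, which is what the configuration bit stream does in the real device.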

47
LUT as general logic gate
Example: 4-LUT
  • An n-LUT is a direct implementation of a function's
    truth table
  • Each latch location holds the value of the function
    corresponding to one input combination

Example: 2-LUT
Implements any function of 2 inputs.
How many functions of n inputs?
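
The count follows from the truth table: a function of n inputs has 2^n rows, and each row can independently hold 0 or 1.

```latex
\[
\#\,\text{functions of } n \text{ inputs}
  = \underbrace{2 \times 2 \times \cdots \times 2}_{2^{n}\ \text{rows}}
  = 2^{2^{n}},
\qquad
n = 2:\; 2^{4} = 16,
\qquad
n = 4:\; 2^{16} = 65536 .
\]
```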
48
More functionality for free?
  • Given basic idea
  • LUT built from RAM
  • Latches connected as shift register
  • What other functions could be provided at very
    little extra cost?
  • Using CLB latches as little RAM vs. logic
  • Using CLB latches as shift register vs. logic

49
1. Distributed RAM
  • CLB LUT configurable as Distributed RAM
  • A LUT equals 16x1 RAM
  • Implements Single and Dual-Ports
  • Cascade LUTs to increase RAM size
  • Synchronous write
  • Synchronous/Asynchronous read
  • Accompanying flip-flops used for synchronous read

50
2. Shift Register
  • Each LUT can be configured as shift register
  • Serial in, serial out
  • Saves resources: can use fewer than 16 FFs
  • Faster: no routing
  • Note: CAD tools determine which CLB is used as
    LUT, RAM, or shift register; this is not up to the
    designer
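
For reference, a serial-in, serial-out shift register can be described behaviorally as below. This is only a sketch: the name shift16 and the port names are hypothetical, and whether a given tool maps such a description onto a LUT configured as a shift register depends on the tool and the target device.

```vhdl
-- 16-stage serial-in, serial-out shift register (behavioral sketch).
-- Names are hypothetical; mapping onto a LUT is up to the CAD tools.
entity shift16 is
    port (
        CLK : in  bit;
        SI  : in  bit;  -- serial in
        SO  : out bit   -- serial out, SI delayed by 16 clock cycles
    );
end shift16;

architecture rtl of shift16 is
    signal taps : bit_vector(15 downto 0) := (others => '0');
begin
    process (CLK)
    begin
        if CLK'event and CLK = '1' then
            -- Shift by one position and take in the new bit.
            taps <= taps(14 downto 0) & SI;
        end if;
    end process;

    SO <= taps(15);
end rtl;
```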

51
System Interfaces
19 Different Standards Supported!
  • Supports multiple voltage and signal standards
    simultaneously
  • Eliminate costly bus transceivers

52
SelectI/O™ Standards
  • VCCO defines output voltage
  • VREF defines input threshold reference voltage
  • Available as user I/O when using internal
    reference

53
I/Os Separated into 8 Banks
[Figure: IOBs (I/O blocks) grouped into Bank 0 through Bank 7 around the device, with global clock inputs GCLK0 through GCLK3]
Banks 2 and 3 are used during configuration.
54
I/O Signal Types
I/O signal types
  • Single-ended: LVTTL, LVCMOS, HSTL, SSTL
  • Differential: LVDS, Bus LVDS, LVPECL
Note: only the popular I/O types are shown here.
55
Single Ended I/O
  • Traditional means of data transfer
  • Data is carried on a single line
  • Bigger voltage swing between logic Low and High

[Figure: single-ended data transfer from driver (Data Out) to receiver (Data In). LVTTL input levels on a 3.3 V scale: logic High above 2 V, logic Low below 0.8 V, 1.2 V swing]
56
SystemI/O: Single-Ended I/O Standards Summary
57
Differential I/O
  • Latest means of data transfer
  • One data bit is carried through two signal lines
  • Voltage difference determines logic High or Low
  • Smaller voltage swing between logic Low and High
  • Higher performance
  • Lower power
  • Lower noise

[Figure: differential signal data transfer (Data Out). LVDS input levels on a 3.3 V scale: 1.7 V and 1.3 V, 0.4 V swing]
58
Differential I/O Types
  • LVDS (Low Voltage Differential Signal)
  • Unidirectional data transfer
  • Bus LVDS
  • Bi-directional communication between 2 or more
    devices
  • Can transmit and receive LVDS signals through the
    same pins
  • LVPECL (Low Voltage Positive Emitter Coupled
    Logic)
  • Unidirectional data transfer
  • Popular industry standard for fast clocking

59
Programmable Logic Design Flow
  • Design entry in schematic, ABEL, VHDL, and/or
    Verilog.
  • Implementation includes placement and routing and
    bitstream generation; analyze timing, view layout,
    and more.
  • Download directly to the Xilinx hardware
    device(s) with unlimited reconfigurations.
60
How to Program an FPGA: Generic Design Flow
  • Design Entry
  • Create your design files using
  • schematic editor or
  • hardware description language (Verilog, VHDL)
  • Design implementation on FPGA
  • Partition, place, and route (PPR) to create
    bit-stream file
  • Divide into CLB-sized pieces, place into blocks,
    route to blocks
  • Design verification
  • Use Simulator to check function,
  • Other software determines max clock frequency.
  • Load onto FPGA device (cable connects PC to
    board)
  • check operation at full speed in real environment.

61
  • The Routing software automatically determines
    which switches should be on and which should be
    off to make the tens of thousands of connections
    required between logic blocks.
  • The goal in this stage is to minimize the length
    of connections and the number of switches
    required to route a signal
  • Shorter connections mean faster circuits
    (unwanted parasitic elements can add considerable
    RC interconnect delay if the number of anti-fuses
    connected in series is not kept to an absolute
    minimum). Clever routing techniques are therefore
    crucial to anti-fuse-based FPGAs.

62
The Placement software chooses which one of the
thousands of logic blocks within an FPGA will
implement the circuit. The goals in this stage
are:
  • 1. Minimize the amount of routing required
    to make all the necessary connections between
    the logic blocks
  • 2. Maximize the circuit speed
72
What is VHDL?
  • Programming Language + Hardware Modeling Language
  • It has all of the following
  • Sequential procedural language: PASCAL- and
    ADA-like
  • Concurrency: statically allocated network of
    processes
  • Timing constructs
  • Discrete-event simulation semantics
  • Object-oriented goodies: libraries, packages,
    polymorphism

74
What is Logic or Register-Transfer Level
Synthesis?
  • A register-transfer system is a sequential
    machine
  • Combinational logic connecting registers
  • Functionality described as new values of register
    in a clock cycle
  • Depends on input and current register values
  • Depends on the transfer functions associated
    with the various combinational logic blocks
  • Register-transfer design is structural - complex
    combinations of state machines may not be easily
    described solely by a large state transition
    graph.
  • Register-transfer design concentrates on
    functionality, not details of logic design.

75
Register-Transfer Simulation
  • Simulates to clock-cycle accuracy. Doesn't
    guarantee timing.
  • Important to get proper function of machine
    before jumping into detailed logic design.
  • But be sure to take into account critical delays
    when choosing register-transfer organization

76
A NAND Gate Example
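    -- black-box definition (interface)
    -- note: nand is a reserved word in VHDL, so a tool will reject
    -- NAND as an entity name; the slide's name is kept here
    entity NAND is
        port ( A, B : in bit; Y : out bit );
    end NAND;

    -- an implementation (contents)
    architecture BEHAVIOR_1 of NAND is
    begin
        Y <= A nand B;
    end BEHAVIOR_1;

Important concepts: entity, architecture, generic, port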
77
Another Implementation of NAND
    -- there can be multiple implementations
    architecture BEHAVIOR_2 of NAND is
    begin
        Y <= '1' when A = '0' or B = '0' else
             '0';
    end BEHAVIOR_2;

78
The process Statement
    label: process (sensitivity_list)
        declarations
    begin
        sequential_statement
    end process label;
  • It defines an independent sequential process
    which repeatedly executes its body
  • The following two processes are equivalent:
    process (A, B)
    begin
        C <= A or B;
    end process;

    process
    begin
        C <= A or B;
        wait on A, B;
    end process;
  • No wait statements are allowed in the body if there
    is a sensitivity_list

79
The wait Statement
    wait [on list_of_signals]
         [until boolean_expression]
         [for time_expression];
  • This is the ONLY sequential statement during
    which time advances!
  • Examples
    -- wait for a rising or falling edge on CLK
    wait on CLK;
    wait until CLK'event;         -- this is equivalent to the above
    -- wait for rising edge of CLK
    wait on CLK until CLK = '1';
    wait until CLK = '1';         -- this is equivalent to the above
    -- wait for 10 ns (only for simulation purposes,
    -- cannot be used in synthesis!)
    wait for 10 ns;
    -- wait forever (the process effectively dies!)
    wait;

80
A Simple Producer-Consumer Example
[Figure: Producer and Consumer blocks connected by the signals DATA, REQ, and ACK]
81
Producer-Consumer in VHDL
    -- produce and consume stand for user-supplied subprograms
    entity producer_consumer is
    end producer_consumer;

    architecture two_phase of producer_consumer is
        signal REQ, ACK : bit;
        signal DATA : integer;
    begin
        P: process
        begin
            DATA <= produce;
            REQ <= not REQ;
            wait on ACK;
        end process P;

        C: process
        begin
            wait on REQ;
            consume(DATA);
            ACK <= not ACK;
        end process C;
    end two_phase;

82
Producer-Consumer in VHDL: 4-Phase Case
    architecture four_phase of producer_consumer is
        signal REQ, ACK : bit := '0';
        signal DATA : integer;
    begin
        P: process
        begin
            DATA <= produce;
            REQ <= '1';
            wait until ACK = '1';
            REQ <= '0';
            wait until ACK = '0';
        end process P;

        C: process
        begin
            wait until REQ = '1';
            consume(DATA);
            ACK <= '1';
            wait until REQ = '0';
            ACK <= '0';
        end process C;
    end four_phase;

83
Behavior vs. Structure Description
  • An entity can be described by its behavior or by
    its structure, or in a mixed fashion
  • Example: a 2-input XOR gate

84
XOR in VHDL: Behavior
    -- note: xor is a reserved word in VHDL; the slide's entity name is kept
    entity XOR is
        port ( A, B : in bit; Y : out bit );
    end XOR;

    architecture BEHAVIOR of XOR is
    begin
        Y <= (A and not B) or (not A and B);
    end BEHAVIOR;

85
XOR in VHDL: Structure
    architecture STRUCTURE of XOR is
        component NAND
            port ( A, B : in bit; Y : out bit );
        end component;
        signal C, D, E : bit;
    begin
        G1: NAND port map (A, B, C);
        G2: NAND port map (A => A, B => C, Y => D);
        G3: NAND port map (C, B => B, Y => E);
        G4: NAND port map (D, E, Y);
    end STRUCTURE;

Component instantiation is just another concurrent statement!
86
XOR in VHDL: Mixed
    architecture MIXED of XOR is
        component NAND
            port ( A, B : in bit; Y : out bit );
        end component;
        signal C, D, E : bit;
    begin
        D <= A nand C;
        E <= C nand B;
        G1: NAND port map (A, B, C);
        G4: NAND port map (D, E, Y);
    end MIXED;

87
Concurrent vs. Sequential Statements
  • Concurrent Statements
  • Process: independent sequential process
  • Block: groups concurrent statements
  • Concurrent Procedure Call, Concurrent Assertion,
    Concurrent Signal Assignment: convenient syntax
    for commonly occurring forms of processes
  • Component Instantiation: structure decomposition
  • Generate Statement: regular description

88
Concurrent vs. Sequential Statements
  • Sequential Statements
  • Wait: synchronization of processes
  • Assertion
  • Signal Assignment
  • Variable Assignment
  • Procedure Call
  • If
  • Case
  • Loop (for, while)
  • Next
  • Exit
  • Return
  • Null

89
VHDL's Model of a System
  • Static network of concurrent processes
    communicating using signals
  • A process has drivers for certain signals
  • A signal may be driven by multiple processes

90
XILINX Design Tools
91
Signals versus Variables
    architecture DUMMY_1 of JUNK is
        signal Y : bit := '0';
    begin
        process
            variable X : bit := '0';
        begin
            wait for 10 ns;
            X := '1';
            Y <= X;
            wait for 10 ns;
            -- What is Y at this point? '1'
            ...
        end process;
    end DUMMY_1;

    architecture DUMMY_2 of JUNK is
        signal X, Y : bit := '0';
    begin
        process
        begin
            wait for 10 ns;
            X <= '1';
            Y <= X;
            wait for 10 ns;
            -- What is Y at this point? '0'
            ...
        end process;
    end DUMMY_2;

Signal assignments with 0 delay take effect only
after a delta delay, i.e., in the next simulation
cycle.
92
Transaction Delay Models
  • VHDL has two distinct ways of modeling delays
  • Transport delay model
  • signal <= transport waveform;
  • for ideal devices with infinite frequency
    response in which every input pulse, however
    small, produces an output pulse
  • Inertial delay model
  • signal <= reject time inertial waveform;
  • for devices with inertia so that not all pulses
    go through
  • pulses shorter than the reject time are rejected

Key terms: projected waveform, preemptive timing,
transport delay, inertial delay
93
Case 1: Transport Delay Model
  • Let
  • Tnew = time of the earliest transaction scheduled
    by a new signal assignment
  • Model
  • All pending transactions at time >= Tnew are
    deleted
  • All new transactions are added

94
Transport Delay Model (contd.)
    Y <= 0 after 0 ns, 2 after 2 ns, 4 after 4 ns, 6 after 6 ns;
    wait for 1 ns;
    Y <= transport 3 after 2 ns, 5 after 4 ns, 7 after 6 ns;

95
Case 2: Inertial Delay Model
  • Let
  • Tnew = time of the earliest transaction scheduled
    by a new signal assignment
  • Tr = pulse rejection limit
  • Model
  • All pending transactions at time >= Tnew are
    deleted
  • All new transactions are added
  • Any pending transactions in the interval Tnew-Tr
    to Tnew are examined. If there is a run of
    consecutive transactions immediately preceding
    the earliest new transaction at Tnew with the
    same value as that new transaction, they are
    kept. Other pending transactions in the interval
    are deleted.

96
Inertial Delay Model (contd.)
    Y <= 0 after 0 ns, 2 after 2 ns, 4 after 4 ns, 6 after 6 ns;
    wait for 1 ns;
    Y <= 3 after 2 ns, 5 after 4 ns, 7 after 6 ns;
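
The transport and inertial examples can be run side by side in a small self-checking testbench. The sketch below is an illustration, not part of the original slides: the names delay_demo, YT, and YI are hypothetical, and the expected values are derived by applying the two models described above. With transport delay, the pending transaction 2 at 2 ns lies before Tnew = 3 ns and survives. With inertial delay, the rejection limit defaults to the first delay (2 ns), so the same transaction falls in the rejection interval with a different value than the new one and is deleted.

```vhdl
-- Self-checking comparison of the transport and inertial examples.
-- Entity and signal names are hypothetical.
entity delay_demo is
end delay_demo;

architecture sim of delay_demo is
    signal YT : integer := 0;  -- second assignment uses transport delay
    signal YI : integer := 0;  -- second assignment uses inertial delay
begin
    stimulus: process
    begin
        YT <= 0 after 0 ns, 2 after 2 ns, 4 after 4 ns, 6 after 6 ns;
        YI <= 0 after 0 ns, 2 after 2 ns, 4 after 4 ns, 6 after 6 ns;
        wait for 1 ns;
        -- Tnew = 1 ns + 2 ns = 3 ns for both assignments.
        YT <= transport 3 after 2 ns, 5 after 4 ns, 7 after 6 ns;
        YI <= 3 after 2 ns, 5 after 4 ns, 7 after 6 ns;
        wait;
    end process stimulus;

    check: process
    begin
        wait for 2 ns;  -- t = 2 ns
        assert YT = 2 report "transport: 2 at 2 ns should be kept" severity error;
        assert YI = 0 report "inertial: 2 at 2 ns should be rejected" severity error;
        wait for 1 ns;  -- t = 3 ns
        assert YT = 3 and YI = 3 report "expected 3 at 3 ns" severity error;
        wait for 2 ns;  -- t = 5 ns
        assert YT = 5 and YI = 5 report "expected 5 at 5 ns" severity error;
        wait for 2 ns;  -- t = 7 ns
        assert YT = 7 and YI = 7 report "expected 7 at 7 ns" severity error;
        wait;
    end process check;
end sim;
```

In both cases the old transactions 4 at 4 ns and 6 at 6 ns are deleted because they lie at or after Tnew; the two models differ only in what happens to the transaction at 2 ns.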

97
Inertial Delay Model (contd.)
  • -- suppose current time is 10 ns
    S <= reject 5 ns inertial '1' after 8 ns;

[Figure: pending transactions on a time axis, two marked retained and one marked deleted; the new transaction is at 18 ns and the pulse rejection interval runs from 13 ns to 18 ns]
98
Signals with Multiple Drivers
    Y <= A;  -- in process1
    Y <= B;  -- and, in process2
  • What is the value of the signal in such a case?
  • VHDL uses the concept of a Resolution Function
    that is attached to a signal or a type, and is
    called every time the value of signal needs to be
    determined -- that is every time a driver changes
    value

99
Resolution Function Example: Wire-And (open
collector)
    package RESOLVED is
        function wired_and (V : bit_vector) return bit;
        subtype rbit is wired_and bit;
    end RESOLVED;

    package body RESOLVED is
        function wired_and (V : bit_vector) return bit is
        begin
            for I in V'RANGE loop
                if V(I) = '0' then
                    return '0';
                end if;
            end loop;
            return '1';
        end wired_and;
    end RESOLVED;

100
Guarded Signals: register and bus
  • Guarded signals are those whose drivers can be
    turned off
  • What happens when all drivers of a guarded signal
    are off?
  • Case 1: retain the last driven value
  • signal X : bit register;
  • useful for modeling charge storage nodes
  • Case 2: float to a user-defined default value
  • signal Y : bit bus;
  • useful for modeling busses

101
Guarded Signals (contd.)
  • Two ways to turn off the drivers
  • null waveform in sequential signal assignment
  • signal_name <= null after time_expression;
  • guarded concurrent signal assignment
    block (data_bus_enable = '1')
    begin
        data_bus <= guarded "0011";
    end block;

102
Using VHDL Like C!
  • Normal sequential procedural programs can be
    written in VHDL without ever utilizing the event
    scheduler or the concurrent concepts.
  • Example
    entity HelloWorld is
    end;

    architecture C_LIKE of HelloWorld is
        use std.textio.all;
    begin
        main: process
            variable buf : line;
        begin
            -- string'(...) tells the compiler which overloaded write to use
            write(buf, string'("Hello World!"));
            writeln(output, buf);
            wait;  -- needed to terminate the program
        end process main;
    end C_LIKE;

103
Language Features: Types
  • TYPE = Set of Values + Set of Operations
  • VHDL types
    SCALAR
        ENUMERATION   e.g. character, bit, boolean
        INTEGER       e.g. integer
        FLOATING      e.g. real
        PHYSICAL      e.g. time
    COMPOSITE
        ARRAY         e.g. bit_vector, string
        RECORD
    ACCESS
    FILE

104
Examples of VHDL Types
    type bit is ('0', '1');
    type thor_bit is ('U', '0', '1', 'Z');
    type memory_address is range 0 to 2**32-1;
    type small_float is range 0.0 to 1.0;
    type weight is range 0 to 1E8
        units
            Gm;               -- base unit
            Kg = 1000 Gm;     -- kilogram
            Tonne = 1000 Kg;  -- tonne
        end units;

105
Language Features: Subtypes
  • SUBTYPE = TYPE + constraints on values
  • TYPE is the base type of SUBTYPE
  • SUBTYPE inherits all the operators of TYPE
  • SUBTYPE can be used more or less interchangeably
    with TYPE
  • Examples
    subtype natural is integer range 0 to integer'HIGH;
    subtype good_thor_bit is thor_bit range '0' to '1';
    subtype small_float is real range 0.0 to 1.0;

106
Array and Record Types
    -- unconstrained array (defines an array type)
    type bit_vector is array (natural range <>) of bit;
    -- constrained array (defines an array type and subtype)
    type word is array (0 to 31) of bit;
    -- another unconstrained array
    type memory is array (natural range <>) of word;
    -- following is illegal!
    type memory is array (natural range <>) of bit_vector;
    -- an example record
    type PERSON is record
        name : string(1 to 20);
        age  : integer range 0 to 150;
    end record;

107
Language Features: Overloading
  • Pre-defined operators (e.g., +, -, and, nand
    etc.) can be overloaded to call functions
  • Example
    function "and" (L, R : thor_bit) return thor_bit is
    begin
        if L = '0' or R = '0' then
            return '0';
        elsif L = '1' and R = '1' then
            return '1';
        else
            return 'U';
        end if;
    end "and";
    -- now one can say
    C <= A and B;
    -- where A, B and C are of type thor_bit

108
Overloading (contd.)
  • Two subprograms (functions or procedures) can
    have the same name, i.e., the names can be
    overloaded. They are distinguished by parameter
    types. e.g.,
  • function MAX(A, B : integer) return integer;
  • function MAX(A, B : real) return real;

109
Language Features: Configurations
  • Component declarations really define a template
    for a design entity
  • The binding of an entity to this template is done
    through a configuration declaration
    entity data_path is
        ...
    end data_path;

    architecture INCOMPLETE of data_path is
        -- note: function is a reserved word in VHDL; the slide's
        -- port name is kept here
        component alu
            port ( function : in alu_function;
                   op1, op2 : in bit_vector_32;
                   result   : out bit_vector_32 );
        end component;
    begin
        ...
    end INCOMPLETE;

110
Configurations (contd.)
    configuration DEMO_CONFIG of data_path is
        for INCOMPLETE
            for all: alu
                use entity work.alu_cell(BEHAVIOR)
                    port map ( function_code => function,
                               operand1 => op1,
                               operand2 => op2,
                               result => result,
                               flags => open );
            end for;
        end for;
    end DEMO_CONFIG;

111
Language Features: Packages
  • A package is a collection of reusable
    declarations (constants, types, functions,
    procedures, signals etc.)
  • A package has a
  • declaration (interface), and a
  • body (contents), which is optional

112
Example of a Package
    package SIMPLE_THOR is
        type thor_bit is ('U', '0', '1', 'Z');
        function "and" (L, R : thor_bit) return thor_bit;
        function "or" (L, R : thor_bit) return thor_bit;
        ...
    end SIMPLE_THOR;

    package body SIMPLE_THOR is
        function "and" (L, R : thor_bit) return thor_bit is
        begin
            ...
        end "and";
        ...
    end SIMPLE_THOR;

    -- and then it can be used after saying
    library my_lib;
    use my_lib.SIMPLE_THOR.all;

113
Language Features: Design Units and Libraries
  • VHDL constructs are written in a design file and
    the compiler puts them into a design library.
  • Libraries are made up of design units
  • primary design units
  • entity declarations
  • package declarations
  • configuration declarations
  • secondary design units
  • architecture bodies
  • package bodies

114
Design Units and Libraries (contd.)
  • Libraries have a logical name and the OS maps the
    logical name to a physical name
  • for example, directories on UNIX
  • Two special libraries
  • work: the working design library
  • std: contains packages standard and
    textio
  • To declare libraries that are referenced in a
    design unit
  • library library_name;
  • To make certain library units directly visible
  • use library_name.unit_name;
  • use also defines dependency among design units.

115
Logic Simulation in VHDL
  • The 2-state bit data type is insufficient for
    low-level simulation
  • Multi-Valued types can easily be built in VHDL
  • several packages available
  • IEEE standard logic
  • Example: one may use a 4-value ('U', '0', '1', 'Z')
    system
  • but no notion of strength
  • only wired-X resolution
  • Multi-State/Strength systems with interval logic

116
A Typical VHDL Simulator (MCC)
117
Common VHDL Issues: Combinational Processes
  • process (A, B, SEL)
    begin
      if (SEL = '1') then
        Z <= A;
      else
        Z <= B;
      end if;
    end process;
  • Sensitivity list must consist of all signals that
    are read inside the process
  • Synthesis tools often ignore the sensitivity list,
    but simulation tools do not: a forgotten signal
    will lead to a difference in behavior between the
    simulated model and the synthesized design

118
Common VHDL Issues: Combinational Processes
  • process (A, B, SEL)
    begin
      if (SEL = '1') then
        Z <= A;
      else
        Z <= B;
      end if;
    end process;
  • process
    begin
      if (SEL = '1') then
        Z <= A;
      else
        Z <= B;
      end if;
      wait on A, B, SEL;
    end process;
  • Can use WAIT ON instead of a sensitivity list
  • But not both!

119
Common VHDL Issues: Wait-free Paths
process
begin
  if (some condition) then
    wait until CLK'event and CLK = '1';
    X <= A + B;
  end if;
end process;
  • Every possible path that the code can take
    through the body of a process without a
    sensitivity list must contain a WAIT
  • Otherwise the process can hang (feedback loop)

120
Common VHDL Issues: Mistakenly Inferred Latches
process (A, B)
begin
  if (cond1) then
    X <= A + B;
  elsif (cond2) then
    X <= X + B;
  end if;
end process;
  • Remember, incomplete assignments imply latches
  • in the above example, if neither cond1 nor cond2
    is true then X will retain its value; basically,
    X is stored in a latch
  • if you are writing combinational logic, make sure
    that every output gets assigned a value along
    each path (e.g. if statements, case statements)
    through the process body
  • in general, latches are not recommended anyway
    in synchronous designs (not testable via scan
    paths)
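
The rule above (assign every output along every path) can be sketched as follows. This is a minimal illustration, not the slide author's code: the entity name no_latch, the port widths, the default value of zero, and the a - b branch are all assumptions chosen to keep the logic purely combinational.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity, for illustration only.
entity no_latch is
  port (
    cond1, cond2 : in  std_logic;
    a, b         : in  unsigned(3 downto 0);
    x            : out unsigned(3 downto 0)
  );
end entity no_latch;

architecture rtl of no_latch is
begin
  process (cond1, cond2, a, b)
  begin
    -- Default assignment: x receives a value on every path,
    -- so no storage element is implied.
    x <= (others => '0');
    if cond1 = '1' then
      x <= a + b;
    elsif cond2 = '1' then
      x <= a - b;
    end if;
  end process;
end architecture rtl;
```

An explicit final else branch would work equally well; the default assignment is simply harder to forget when branches are added later.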

121
Common VHDL Issues: Implicit Register Inference
process
begin
  wait until CLK'event and CLK = '1';
  if (COUNT >= 9) then
    COUNT <= 0;
  else
    COUNT <= COUNT + 1;
  end if;
end process;
(Figure: synthesized counter; labels 1, COUNT, CLK)
  • Storage registers are synthesized for all signals
    that are driven within a clocked process
  • Storage registers are also synthesized for all
    variables that are read before being updated

122
Common VHDL Issues: Reset (or Set) in Synthesis
process
begin
  wait until CLK'event and CLK = '1';
  if (RST = '1') then
    -- synchronous reset
  else
    -- combinational stuff
  end if;
end process;

process (CLK, RST)
begin
  if (RST = '1') then
    -- asynchronous reset
  elsif (CLK'event and CLK = '1') then
    -- combinational stuff
  end if;
end process;
  • Must reset all registers, otherwise the synthesized
    chip won't work
  • unlike simulation, you can't set initial values
    in synthesis!
  • Asynchronous reset is possible only with a process
    that has a sensitivity list

123
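Common VHDL Issues: Coding Style Influence
process (A, B, C, SEL)
begin
  if (SEL = '1') then
    Z <= A + B;
  else
    Z <= A + C;
  end if;
end process;

process (A, B, C, SEL)
  variable TMP : bit;
begin
  if (SEL = '1') then
    TMP := B;
  else
    TMP := C;
  end if;
  Z <= A + TMP;
end process;
(Figure: the first process yields two adders, A + B and A + C, followed by a multiplexer on SEL driving Z; the second yields a multiplexer on SEL choosing B or C, followed by a single adder with A driving Z)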
  • Structure of initially generated hardware is
    determined by the VHDL code itself
  • Synthesis optimizes that initially generated
    hardware, but cannot do dramatic changes
  • Therefore, coding style matters!

124
Common VHDL Issues: IF vs CASE
  • IF-THEN-ELSIF-THEN-...-ELSE maps to a chain of
    2-to-1 multiplexors, each checking for the
    successive conditions
  • if (COND1) then Z <= X1;
    elsif (COND2) then Z <= X2;
    else Z <= Xn;
    end if;
  • CASE maps to a single N-to-1 multiplexor
  • case EXPRESSION is
      when VALUE1 =>
        Z <= X1;
      when VALUE2 =>
        Z <= X2;
      when others =>
        Z <= Xn;
    end case;
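
To make the comparison concrete, here is a minimal sketch of the same 4-to-1 multiplexer written both ways. The entity name mux4 and the port names w, s, f_if and f_case are assumptions for illustration; how a given synthesis tool actually maps each style may differ from the idealized picture above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity mux4 is
  port (
    w      : in  std_logic_vector(3 downto 0);
    s      : in  std_logic_vector(1 downto 0);
    f_if   : out std_logic;
    f_case : out std_logic
  );
end entity mux4;

architecture rtl of mux4 is
begin
  -- Priority-style description: conditions are tested in order.
  if_style : process (w, s)
  begin
    if s = "00" then
      f_if <= w(0);
    elsif s = "01" then
      f_if <= w(1);
    elsif s = "10" then
      f_if <= w(2);
    else
      f_if <= w(3);
    end if;
  end process if_style;

  -- Parallel-style description: one selector, mutually exclusive choices.
  case_style : process (w, s)
  begin
    case s is
      when "00"   => f_case <= w(0);
      when "01"   => f_case <= w(1);
      when "10"   => f_case <= w(2);
      when others => f_case <= w(3);
    end case;
  end process case_style;
end architecture rtl;
```

Both outputs compute the same function; only the structure suggested to the tool differs.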

125
Common VHDL Issues: Let the Tool Do the Synthesis
  • Don't do synthesis by hand!
  • do not come up with boolean functions for outputs
    of arithmetic operators
  • let Synopsys decide which adder, multiplier to
    use
  • you will only restrict the synthesis process
  • Best to use IEEE signed and unsigned types, and
    convert to integers if needed (IEEE NUMERIC_STD
    and NUMERIC_BIT packages)
  • example
  • A_INT <= TO_INTEGER(A_VEC);
  • B_INT <= TO_INTEGER(B_VEC);
  • C_INT <= A_INT + B_INT;
  • C_VEC <= TO_UNSIGNED(C_INT, 4);
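
As a self-contained variant of the example above, the following sketch adds two 4-bit unsigned values with the "+" operator from IEEE NUMERIC_STD and leaves the choice of adder to the tool. The entity name add4 and the 5-bit result width are assumptions; the extra bit is there so the sum never overflows.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity, for illustration only.
entity add4 is
  port (
    a_vec : in  unsigned(3 downto 0);
    b_vec : in  unsigned(3 downto 0);
    c_vec : out unsigned(4 downto 0)  -- one extra bit holds the carry
  );
end entity add4;

architecture rtl of add4 is
begin
  -- No hand-written carry logic: the synthesis tool picks the adder.
  c_vec <= resize(a_vec, 5) + resize(b_vec, 5);
end architecture rtl;
```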

126
Other VHDL Issues
  • Let synthesis tool decide the numeric encoding of
    the FSM states
  • use enumerated type for state
  • Split into multiple simpler processes
  • Keep module outputs registered
  • simplifies timing constraints
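
The three recommendations above can be sketched together in one small state machine. Everything here is hypothetical (the entity handshake_fsm, its ports, and the three states); it only illustrates an enumerated state type, a split into a register process and a next-state process, and a registered output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity handshake_fsm is
  port (
    clk, rst : in  std_logic;
    req      : in  std_logic;
    ack      : out std_logic
  );
end entity handshake_fsm;

architecture rtl of handshake_fsm is
  -- Enumerated type: the numeric state encoding is left to the tool.
  type state_t is (IDLE, BUSY, DONE);
  signal state, next_state : state_t;
begin
  -- Process 1: state register and registered output.
  regs : process (clk, rst)
  begin
    if rst = '1' then
      state <= IDLE;
      ack   <= '0';
    elsif rising_edge(clk) then
      state <= next_state;
      if next_state = DONE then
        ack <= '1';
      else
        ack <= '0';
      end if;
    end if;
  end process regs;

  -- Process 2: next-state logic; the default assignment covers every path.
  nxt : process (state, req)
  begin
    next_state <= state;
    case state is
      when IDLE =>
        if req = '1' then
          next_state <= BUSY;
        end if;
      when BUSY =>
        next_state <= DONE;
      when DONE =>
        if req = '0' then
          next_state <= IDLE;
        end if;
    end case;
  end process nxt;
end architecture rtl;
```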

127
VHDL
128
Programmable Logic Array
(Figure: n inputs feed k AND gates through two sets of n x k links; the AND gates feed m OR gates through k x m links, producing m outputs)
  • Programmable AND array + programmable OR array
  • An n x k x m PLA has 2n x k + k x m links
  • Sum of products
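
As a worked instance of the link count, take an assumed PLA with n = 4 inputs, k = 8 product terms and m = 2 outputs (the sizes are illustrative only). Each input is available in true and complemented form, which accounts for the factor of 2:

```latex
\begin{aligned}
\text{links} &= 2nk + km \\
             &= 2 \cdot 4 \cdot 8 + 8 \cdot 2 \\
             &= 64 + 16 = 80
\end{aligned}
```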

129
Programmable Array Logic (PAL)
  • Programmable AND array
  • Fixed OR array
  • Each output line permanently connected to a
    specific set of product terms
  • The set of switching functions that can be
    implemented with a PAL is more limited than with
    a PROM or PLA

130
CPLD
(Figure: CPLD with logic blocks and I/O blocks connected by a programmable interconnect)
131
CPLD Logic Block
  • Simple PLD
  • Inputs
  • Product-term array
  • Product term allocation function
  • Macro-cells (registers)
  • Logic blocks evaluate sum-of-products expressions
    and store the results in macro-cell registers
  • Programmable interconnects route signals to and
    from logic blocks

132
Major CPLD Resources
  • Number of macro-cells per logic block
  • Number of inputs from programmable interconnect
    to logic block
  • Number of product terms in logic block

133
Structure of FPGA (Xilinx)
(Figure: array of logic blocks surrounded by I/O blocks and joined by interconnect)
134
Configurable Logic Block (CLB)
135
Logic Function
  • Implemented as look-up table (LUT)
  • A K-input LUT corresponds to a 2^K x 1 bit memory
  • A K-input LUT can implement any K-input, 1-output
    logic function

136
Configuring FPGA
  • Configure CLB and IOB
  • Configure interconnect
  • Interconnect technology
  • SRAM
  • Anti-fuse (program once)
  • EPROM / EEPROM

137
Programming Technology
138
FPGA Applications
  • Glue Logic (replace SSI and MSI parts)
  • Rapid turnaround
  • Prototype design
  • Emulation
  • Custom computing
  • Dynamic reconfiguration

139
PLD Logic Capacity
  • SPLD: about 200 gates
  • CPLD
  • Altera FLEX (250K logic gates)
  • Xilinx XC9500
  • FPGA
  • Xilinx Virtex-E (3 million logic gates)
  • Xilinx Spartan (10K logic gates)
  • Altera

140
FPGA Design Flow
Design Entry
Design Implementation
Design Verification
FPGA Configuration
141
Design Entry (VHDL in our case)
(Figure: design-entry flow with the stages Schematic / HDL, Compile, Logic Equations, Minimize, Reduced Logic Equations (Netlist), Test Vectors, Simulation)
142
Design Implementation
  • Input: netlist; output: bitstream
  • Map the design onto FPGA resources
  • Break up the circuit so that each block has
    at most n inputs
  • NP-hard problem
  • However, optimal solution is not required

143
Design Implementation (Cont.)
  • Place: assigns the logic blocks created during
    the mapping process to specific locations on the FPGA
  • Goal: minimize the length of wires
  • Again NP-hard
  • Route: routes the interconnect paths between logic
    blocks
  • NP-hard

144
Design Implementation Techniques
  • Simulated annealing
  • Genetic algorithm
  • Mincut method
  • Heuristic method

145
Design Verification and FPGA Configuration
  • Functional Simulation
  • Timing Simulation
  • Download bitstream into FPGA

146
Introduction to Programmable Logic
  • FPGAs Overview
  • Why use FPGAs? (a short history lesson)
  • FPGA variations
  • Internal logic blocks.
  • Designing with FPGAs.
  • Specifics of Xilinx Virtex-E series.

147
FPGA Overview
  • Basic idea: a 2D array of combinational logic blocks
    (CL) and flip-flops (FF) with a means for the
    user to configure both
  • 1. the interconnection between the logic blocks,
  • 2. the function of each block.

Simplified version of FPGA internal architecture
148
Why FPGAs? (1 / 5)
  • By the early 1980s most of the logic circuits in
    typical systems had been absorbed by a handful of
    standard large scale integrated circuits (LSI
    ICs).
  • Microprocessors, bus/IO controllers, system
    timers, ...
  • Every system still needed random small glue
    logic ICs to help connect the large ICs
  • generating global control signals (for resets
    etc.)
  • data formatting (serial to parallel,
    multiplexing, etc.)
  • Systems had a few LSI components and lots of
    small low density SSI (small scale IC) and MSI
    (medium scale IC) components.

Printed Circuit (PC) board with many small SSI
and MSI ICs and a few LSI ICs
149
Why FPGAs? (2 / 5)
  • Custom ICs sometimes designed to replace glue
    logic
  • reduced complexity/manufacturing cost, improved
    performance
  • But custom ICs are expensive to develop, and they
    delay introduction of the product (time to market)
    because of increased design time
  • Note: need to worry about two kinds of costs
  • 1. cost of development, Non-Recurring
    Engineering (NRE), fixed
  • 2. cost of manufacture per unit, variable
  • Usually tradeoff between NRE cost and
    manufacturing costs

150
Why FPGAs? (3 / 5)
  • Therefore the custom IC approach was only viable for
    products with very high volume (where NRE could
    be amortized) that were not sensitive to time to
    market (TTM)
  • FPGAs introduced as alternative to custom ICs for
    implementing glue logic
  • improved PC board density vs. discrete SSI/MSI
    components (within around 10x of custom ICs)
  • computer aided design (CAD) tools meant circuits
    could be implemented quickly (no physical layout
    process, no mask making, no IC manufacturing),
    relative to Application Specific ICs (ASICs)
    (3-6 months for these steps for custom IC)
  • lowers NREs (Non Recurring Engineering)
  • shortens TTM (Time To Market)
  • Because of Moore's law the density (gates/area)
    of FPGAs continued to grow through the '80s and
    '90s to the point where major data processing
    functions can be implemented on a single FPGA.

151
Why FPGAs? (4 / 5)
  • FPGAs continue to compete with custom ICs for
    special processing functions (and glue logic) but
    now try to compete with microprocessors in
    dedicated and embedded applications
  • Performance advantage over microprocessors
    because circuits can be customized for the task
    at hand. Microprocessors must provide special
    functions in software (many cycles)
  • MICRO: highest NRE, SW: fastest TTM
  • ASIC: highest performance, worst TTM
  • FPGA: highest cost per chip (unit cost)

152
Why FPGAs? (5 / 5)
  • As Moore's Law continues, FPGAs work for more
    applications, as they can both fit more logic on
    1 chip and run faster
  • Can easily be patched vs. ASICs
  • Perfect for courses
  • Can change design repeatedly
  • Low TTM yet reasonable speed
  • With Moore's Law, now can do full CS 152 project
    easily inside 1 FPGA

153
Where are FPGAs in the IC Zoo?
Source: Dataquest
  • Categories: Programmable Logic Devices (PLDs), Gate
    Arrays, Cell-Based ICs, Full Custom ICs
  • PLDs include SPLDs (PALs) and FPGAs
  • Acronyms
  • SPLD: Simple Programmable Logic Device
  • PAL: Programmable Array Logic
  • CPLD: Complex PLD
  • FPGA: Field Programmable Gate Array
  • (Standard logic is SSI or MSI buffers, gates)
  • Common Resources
  • Configurable Logic Blocks (CLB): memory look-up
    table, AND-OR planes, simple gates
  • Input / Output Blocks (IOB): bidirectional,
    latches, inverters, pullups/pulldowns
  • Interconnect or routing: local, internal feedback,
    and global
154
FPGA Variations
  • Families of FPGAs differ in
  • physical means of implementing user
    programmability,
  • arrangement of interconnection wires, and
  • basic functionality of logic blocks
  • Most significant difference is in the method for
    providing flexible blocks and connections
  • Anti-fuse based (e.g., Actel)
  • + non-volatile, relatively small
  • - fixed (non-reprogrammable)
  • (Almost used in the 150 lab: only one shot at getting
    it right!)

155
User Programmability
  • Latches are used to
  • 1. make or break cross-point connections in
    interconnect
  • 2. define function of logic blocks
  • 3. set user options
  • within the logic blocks
  • in the input/output blocks
  • global reset/clock
  • Configuration bit stream loaded under user
    control
  • All latches are strung together in a shift chain
  • Programming => creating bit stream
  • Latch-based (Xilinx, Altera, ...)
  • + reconfigurable
  • - volatile
  • - relatively large die size
  • Note: today 90% of the die is interconnect, 10% is gates

156
Idealized FPGA Logic Block
  • 4-input Look Up Table (4-LUT)
  • implements combinational logic functions
  • Register
  • optionally stores output of LUT
  • A latch determines whether the register or the LUT
    output is read

157
4-LUT Implementation
  • An n-input LUT is actually implemented as a 2^n x 1
    memory
  • inputs choose one of 2^n memory locations.
  • memory locations (latches) are normally loaded
    with values from the user's configuration bit stream.
  • Inputs to mux control are the CLB (Configurable
    Logic Block) inputs.
  • Result is a general purpose logic gate.
  • n-LUT can implement any function of n inputs!
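
A LUT of this kind can be modelled as a constant table indexed by the inputs. The sketch below is an assumption-laden illustration, not vendor code: the entity lut4 and the generic name INIT are hypothetical, with bit i of INIT holding the function value for input combination i. The default value x"8000" makes the table behave as a 4-input AND.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity, for illustration only.
entity lut4 is
  generic (
    -- Stand-in for the configuration bits: one bit per truth-table row.
    INIT : std_logic_vector(15 downto 0) := x"8000"
  );
  port (
    addr : in  std_logic_vector(3 downto 0);
    f    : out std_logic
  );
end entity lut4;

architecture rtl of lut4 is
begin
  -- The four inputs act as an address into the 16 x 1 table.
  f <= INIT(to_integer(unsigned(addr)));
end architecture rtl;
```

Changing only the INIT value turns the same structure into any other function of four inputs, for example x"FFFE" for a 4-input OR.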

158
LUT as general logic gate
Example: 4-LUT
  • An n-LUT is a direct implementation of a function
    truth table
  • Each latch location holds the value of the function
    corresponding to one input combination

Example: 2-LUT
Implements any function of 2 inputs.
How many functions of n inputs?
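
The count follows from the truth-table view: a function of n inputs has 2^n rows, and each row can independently hold 0 or 1. A short worked version, with the n = 2 and n = 4 cases matching the two LUT sizes on this slide:

```latex
\begin{aligned}
N(n) &= 2^{\,2^{n}} \\
N(2) &= 2^{4} = 16 \\
N(4) &= 2^{16} = 65536
\end{aligned}
```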
159
More functionality for free?
  • Given basic idea
  • LUT built from RAM
  • Latches connected as shift register
  • What other functions could be provided at very
    little extra cost?
  • Using CLB latches as little RAM vs. logic
  • Using CLB latches as shift register vs. logic

160
1. Distributed RAM
  • CLB LUT configurable as Distributed RAM
  • A LUT equals 16x1 RAM
  • Implements Single and Dual-Ports
  • Cascade LUTs to increase RAM size
  • Synchronous write
  • Synchronous/Asynchronous read
  • Accompanying flip-flops used for synchronous read
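
A 16 x 1 memory with synchronous write and asynchronous read can be described as below. This is a sketch under assumptions: the entity ram16x1 and its port names are hypothetical, and whether a tool maps the description onto LUT-based distributed RAM depends on the tool and the target device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity, for illustration only.
entity ram16x1 is
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(3 downto 0);
    din  : in  std_logic;
    dout : out std_logic
  );
end entity ram16x1;

architecture rtl of ram16x1 is
  signal mem : std_logic_vector(15 downto 0);
begin
  -- Synchronous write: the addressed bit changes on the clock edge.
  write_port : process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        mem(to_integer(unsigned(addr))) <= din;
      end if;
    end if;
  end process write_port;

  -- Asynchronous read: the output follows the address combinationally.
  dout <= mem(to_integer(unsigned(addr)));
end architecture rtl;
```

Registering dout in a clocked process would give the synchronous-read variant that uses the accompanying flip-flop.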

161
2. Shift Register
  • Each LUT can be configured as shift register
  • Serial in, serial out
  • Saves resources: can use fewer than 16 FFs
  • Faster: no routing
  • Note: CAD tools determine whether a CLB is used as
    a LUT, RAM, or shift register; it is not up to the designer
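
A serial-in, serial-out shift register of the kind a single LUT can hold might be written as follows. The entity shift16, the 16-bit depth and the clock-enable port ce are assumptions for illustration; as noted above, the mapping onto a LUT is decided by the CAD tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity shift16 is
  port (
    clk  : in  std_logic;
    ce   : in  std_logic;
    sin  : in  std_logic;
    sout : out std_logic
  );
end entity shift16;

architecture rtl of shift16 is
  signal sr : std_logic_vector(15 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if ce = '1' then
        -- Shift left by one; the new bit enters at position 0.
        sr <= sr(14 downto 0) & sin;
      end if;
    end if;
  end process;

  -- The oldest bit leaves after 16 enabled clock edges.
  sout <= sr(15);
end architecture rtl;
```

The absence of a reset is deliberate in this sketch: a reset on every stage usually forces a tool to build the chain from individual flip-flops.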

162
How to Program an FPGA: Generic Design Flow
  • Design Entry
  • Create your design files using
  • schematic editor or
  • hardware description language (Verilog, VHDL)
  • Design implementation on FPGA
  • Partition, place, and route (PPR) to create
    bit-stream file
  • Divide into CLB-sized pieces, place into blocks,
    route to blocks
  • Design verification
  • Use Simulator to check function,
  • Other software determines max clock frequency.
  • Load onto FPGA device (cable connects PC to
    board)
  • check operation at full speed in real environment.

163
Decoders: 2-to-4 Decoder
(Figure: 2-to-4 decoder with two data inputs w, an enable input En, and four outputs y)
164
Encoders
  • Opposite of decoders
  • Encode given information into a more compact form
  • Binary encoders
  • 2^n inputs into an n-bit code
  • Exactly one of the input signals should have a
    value of 1, and the outputs present the binary number
    that identifies which input is equal to 1
  • Use: reduce the number of bits (when transmitting and
    storing information)
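
For n = 2, the binary encoder described above can be sketched as a selected signal assignment. The entity enc4to2 and its port names are assumptions; inputs that are not one-hot are mapped to "XX" here, which is one possible choice rather than a required behavior.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity enc4to2 is
  port (
    w : in  std_logic_vector(3 downto 0);  -- assumed one-hot
    y : out std_logic_vector(1 downto 0)
  );
end entity enc4to2;

architecture rtl of enc4to2 is
begin
  -- Each one-hot input pattern is replaced by the index of its set bit.
  with w select
    y <= "00" when "0001",
         "01" when "0010",
         "10" when "0100",
         "11" when "1000",
         "XX" when others;
end architecture rtl;
```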

165
Designing with NAND and NOR Gates (2)
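  • Any logic function can be realized using only
    NAND or NOR gates => NAND/NOR is complete
  • The NAND function is complete: it can be used to
    generate any logical function (below, $a \mid b$
    denotes a NAND b)
  • $1 = a \mid (a \mid a) = a \mid \bar{a} = 1$
  • $0 = (a \mid (a \mid a)) \mid (a \mid (a \mid a)) = 1 \mid 1 = 0$
  • $\bar{a} = a \mid a = \overline{a \cdot a}$
  • $ab = (a \mid b) \mid (a \mid b) = \overline{(a \mid b)} = ab$
  • $a + b = (a \mid a) \mid (b \mid b) = \bar{a} \mid \bar{b} = a + b$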
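
The identities above translate directly into a description that uses no operator other than nand. The entity nand_only and its port names are hypothetical and serve only to show NOT, AND and OR built from NAND gates.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity, for illustration only.
entity nand_only is
  port (
    a, b    : in  std_logic;
    not_a   : out std_logic;
    a_and_b : out std_logic;
    a_or_b  : out std_logic
  );
end entity nand_only;

architecture rtl of nand_only is
  signal na, nb, nab : std_logic;
begin
  na  <= a nand a;   -- complement of a
  nb  <= b nand b;   -- complement of b
  nab <= a nand b;

  not_a   <= na;
  a_and_b <= nab nand nab;  -- complementing (a nand b) gives a and b
  a_or_b  <= na nand nb;    -- De Morgan: not(not a and not b) = a or b
end architecture rtl;
```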

166
Multiplexers: 4-to-1 Multiplexer
(Figure: (a) graphic symbol, (b) truth table, (c) circuit)
Truth table
  s1 s0 | f
  0  0  | w0
  0  1  | w1
  1  0  | w2
  1  1  | w3
167
Multiplexers: Building Larger Multiplexers
(Figure: larger multiplexer built from smaller multiplexers, with select inputs s0, s1 and data inputs w0, w3, ...)