EECS 150 - Components and Design Techniques for Digital Systems Lec 27 - PowerPoint PPT Presentation

EECS 150 - Components and Design Techniques for Digital Systems Lec 27 Summary (whirlwind) 12-9-04 David Culler Electrical Engineering and Computer Sciences – PowerPoint PPT presentation

Slides: 66
1
EECS 150 - Components and Design Techniques for
Digital Systems Lec 27 Summary
(whirlwind) 12-9-04
  • David Culler
  • Electrical Engineering and Computer Sciences
  • University of California, Berkeley
  • http://www.eecs.berkeley.edu/~culler
  • http://www-inst.eecs.berkeley.edu/~cs150

2
Background
3
Course Content
  • Components and Design Techniques for Digital
    Systems
  • Synchronous Digital Hardware Systems
  • Synchronous: Clocked - all changes in the
    system are controlled by a global clock and
    happen at the same time (not asynchronous)
  • Digital: All inputs/outputs and internal values
    (signals) take on discrete values (not analog).

4
Trick you into building an extreme project
  • FPGA/SDRAM provides full game logic
  • Court, obstructions
  • Moving paddles
  • Moving, colliding ball
  • All the physics
  • Court displayed to NTSC (TV) Video Output
  • Real time Sound effects ???
  • N64 controller (and switches) for input
  • How to make it multiplayer?
  • The network

5
Levels of Digital Design
6
What makes Digital Systems tick?
[Figure: Combinational Logic block with clk waveform over time]
What determines the system's performance?
7
The 150 stuff
  • Building blocks of computer systems
  • ICs (Chips), PCBs, Chassis, Cables & Connectors
  • CMOS Transistors
  • Voltage controlled switches
  • Complementary forms (nmos, pmos)
  • Logic gates from CMOS transistors
  • Logic gates implement particular boolean
    functions
  • N inputs, 1 output
  • Serial and parallel switches
  • Dual structure
  • P-type: pull up, transmit 1
  • N-type: pull down, transmit 0
  • Complex gates: mux
  • Synchronous Sequential Elements
  • D FlipFlops

8
Combinational Logic (CL) Defined
  • $y_i = f_i(x_0, \ldots, x_{n-1})$, where $x, y \in \{0,1\}$.
  • Y is a function of only X.
  • If we change X, Y will change
    immediately (well, almost!).
  • There is an implementation-dependent delay from X
    to Y.

9
Transistor-level Logic Circuits - NAND
  • NAND gate
  • Logic Function
  • out = 0 iff both a AND b = 1, therefore out =
    (ab)'
  • pFET network and nFET network are duals of one
    another.
  • Inverter (NOT gate)

nand (out, a, b)
How about AND gate?
10
Combinational logic summary
  • Logic functions, truth tables, and switches
  • NOT, AND, OR, NAND, NOR, XOR, . . ., minimal set
  • Axioms and theorems of Boolean algebra
  • Proofs by re-writing and perfect induction
  • Gate logic
  • Networks of Boolean functions and their time
    behavior
  • Canonical forms
  • Two-level and incompletely specified functions
  • Optimization
  • Two-level simplification using K-maps
  • Automation of simplification
  • Multi-level logic
  • Later
  • Design case studies
  • Time behavior

11
Transistor-level Logic Circuits - Latch
  • Positive Level-sensitive latch
  • Transistor Level

D FlipFlop
  • Positive Edge-triggered flip-flop built from two
    level-sensitive latches

12
D Flip-Flop
  • Make S and R complements of each other in Master
    stage
  • Eliminates the 1's-catching problem
  • Input only needs to settle by clock edge
  • Can't just hold previous value (must have new
    value ready every clock period)
  • Value of D just before clock goes low is what is
    stored in flip-flop
  • Can make R-S flip-flop by adding logic to make
    D = S + R'Q

10 gates
13
Timing Methodologies
  • Rules for interconnecting components and clocks
  • Guarantee proper operation of system when
    strictly followed
  • Approach depends on building blocks used for
    memory elements
  • Focus on systems with edge-triggered flip-flops
  • Found in programmable logic devices
  • Many custom integrated circuits focus on
    level-sensitive latches
  • Basic rules for correct timing
  • (1) Correct inputs, with respect to time, are
    provided to the flip-flops
  • (2) No flip-flop changes state more than once per
    clocking event

14
Timing Methodologies (contd)
  • Definition of terms
  • clock: periodic event, causes state of memory
    element to change; can be rising or falling edge,
    or high or low level
  • setup time: minimum time before the clocking
    event by which the input must be stable (Tsu)
  • hold time: minimum time after the clocking event
    until which the input must remain stable (Th)

[Figure: data and clock waveforms, with the data marked stable or changing]
There is a timing "window" around the clocking
event during which the input must remain stable
and unchanged in order to be recognized.
15
What's an FSM?
  • Next state is a function of state and input
  • Moore Machine: output is a function of the state
  • Mealy Machine: output is a function of state and
    input

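[Figure: Moore state diagram (nodes labeled State / output, arcs labeled inputA, inputB) and Mealy state diagram (nodes labeled State, arcs labeled inputA/outputA, inputB/outputB)]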
16
Formal Design Process for FSMs
Logic equations from table: OUT = PS, NS = PS xor IN
  • Review of Design Steps
  • 1. Circuit functional specification
  • 2. State Transition Diagram
  • 3. Symbolic State Transition Table
  • 4. Encoded State Transition Table
  • 5. Derive Logic Equations
  • 6. Circuit Diagram
  • FFs for state
  • CL for NS and OUT
  • Circuit Diagram
  • XOR gate for ns calculation
  • DFF to hold present state
  • no logic needed for output
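The logic equations on this slide (output equals present state, next state is present state XOR input) describe a one-flip-flop parity checker. A minimal Python sketch of that behavior, cycle by cycle, might look like the following. The function name `parity_fsm` and the reset state of 0 are assumptions made for illustration, not part of the lecture.

```python
def parity_fsm(inputs, reset_state=0):
    """Simulate the one-flip-flop FSM: OUT = PS, NS = PS xor IN."""
    ps = reset_state            # present state, held in the D flip-flop
    outputs = []
    for bit in inputs:
        outputs.append(ps)      # Moore output: a function of state only
        ps = ps ^ bit           # next-state logic: a single XOR gate
    return outputs, ps


outs, final_state = parity_fsm([1, 0, 1, 1])
print(outs, final_state)
```

The output in each cycle reflects the inputs seen in earlier cycles only, which is the Moore behavior: no logic sits between the flip-flop and the output.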

17
Composing FSMs into larger designs
[Figure: two FSM blocks connected through combinational logic (CL) blocks]
18
Sequential Synchronous Elements
  • Basic registers
  • Common control, MUXes
  • Simple, important FSMs
  • simple internal feedback
  • Ring counters, Pattern detectors
  • Binary Counters
  • Universal Shift Register
  • Using Counters to build controllers
  • Simplify control by controlling simpler FSM

19
150 and the changing times
  • Advancing technology changes the trade-offs and
    design techniques
  • 2x transistors per chip every 18 months
  • ASIC, Programmable Logic, Microprocessor
  • Programmable logic invests chip real-estate to
    reduce design time / time to market
  • FPGA
  • programmable interconnect,
  • configurable logic blocks
  • LUT storage
  • Block RAM
  • IO Blocks
  • PLAs
  • General devices for SoP or PoS logic

20
Virtex-E Configurable Logic Block (CLB)
  • CLB: 4 logic cells (LC) in two slices
  • LC: 4-input function generator, carry logic,
    storage element
  • 80 x 120 CLB array on 2000E

FF or latch
16x1 synchronous RAM
21
HDLs
  • Basic Idea
  • Language constructs describe circuits with two
    basic forms
  • Structural descriptions similar to hierarchical
    netlist.
  • Behavioral descriptions use higher-level
    constructs (similar to conventional programming).
  • Originally designed to help in abstraction and
    simulation.
  • Now logic synthesis tools exist to
    automatically convert from behavioral
    descriptions to gate netlist.
  • Greatly improves designer productivity.
  • However, this may lead you to falsely believe
    that hardware design can be reduced to writing
    programs!
  • Structural example
  • Decoder(output x0,x1,x2,x3;
  • inputs a,b)
  • wire abar, bbar;
  • inv(bbar, b);
  • inv(abar, a);
  • nand(x0, abar, bbar);
  • nand(x1, abar, b );
  • nand(x2, a, bbar);
  • nand(x3, a, b );
  • Behavioral example
  • Decoder(output x0,x1,x2,x3;
  • inputs a,b)
  • case [a b]
  • 00: [x0 x1 x2 x3] = 0x0;
  • 01: [x0 x1 x2 x3] = 0x2;
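To see what the structural netlist on this slide computes, the gates can be modeled as small functions and wired together in the same order. This is a sketch in Python rather than an HDL, and the helper names `inv`, `nand`, and `decoder` simply mirror the slide's primitives. Because the outputs come straight from NAND gates, the selected output is the one that goes low.

```python
def inv(a):
    """Inverter on a single bit (0 or 1)."""
    return 1 - a


def nand(a, b):
    """Two-input NAND on single bits."""
    return 1 - (a & b)


def decoder(a, b):
    """Gate-for-gate model of the structural 2-to-4 decoder."""
    abar, bbar = inv(a), inv(b)
    x0 = nand(abar, bbar)
    x1 = nand(abar, b)
    x2 = nand(a, bbar)
    x3 = nand(a, b)
    return x0, x1, x2, x3


for a in (0, 1):
    for b in (0, 1):
        print(a, b, decoder(a, b))
```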

22
Finite State Machines in Verilog
[Figure: FSM block diagram. Inputs and current state feed the next-state combinational logic; a second combinational logic block produces the outputs, labeled Moore outputs and Mealy outputs]
23
Design Methodology in Detail
Postsynthesis Design Validation
Design Specification
Postsynthesis Timing Verification
Design Partition
Design Entry: Behavioral Modeling
Test Generation and Fault Simulation
Simulation/Functional Verification
Cell Placement/Scan Insertion/Routing
Verify Physical and Electrical Rules
Design Integration And Verification
Synthesize and Map Gate-level Net List
Pre-Synthesis Sign-Off
Design Sign-Off
Synthesize and Map Gate-level Net List
24
Configuring CLBs
out
25
Configuring Routes
[Figure: lookup-table entries (000 through 111), inputs A1, A2, A3, a flip-flop (FF), and the signal "in"]
nextstate = A2 xor A1
out (A1 A2 A3)
26
Timing for Synchronous Circuits
  • In general, for correct operation,
    $T \ge \text{time}(clk \to Q) + \text{time}(CL) + \text{time}(setup)$, i.e.
    $T \ge \tau_{clk \to Q} + \tau_{CL} + \tau_{setup}$,
    for all paths.
  • How do we enumerate all paths?
  • Any circuit input or register output to any
    register input or circuit output.
  • setup time for circuit outputs depends on what
    it connects to
  • clk-Q time for circuit inputs depends on where
    it comes from.
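The timing constraint on this slide (the clock period must cover clk-to-Q delay, combinational delay, and setup time on every path) means the slowest path sets the minimum period. A small Python sketch of that calculation follows. The path list and its delay values, given in picoseconds, are made up for illustration.

```python
def min_clock_period(paths):
    """Smallest T with T >= clk_to_q + cl + setup on every path."""
    return max(p["clk_to_q"] + p["cl"] + p["setup"] for p in paths)


# Hypothetical register-to-register paths, delays in picoseconds.
paths = [
    {"clk_to_q": 1000, "cl": 6500, "setup": 500},
    {"clk_to_q": 1000, "cl": 3000, "setup": 500},
]
print(min_clock_period(paths))
```

Shortening the combinational logic on any path other than the slowest one leaves the minimum period unchanged.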
27
Typical SRAM Organization 16-word x 4-bit
[Figure: address decoder on A0-A3 selecting one of Word 0 through Word 15; each word is a row of four SRAM cells; inputs Din 0-3 and WrEn, outputs Dout 0-3]
28
Classical DRAM Organization (Square)
[Figure: RAM cell array with a row decoder driving the word (row) select lines and bit (data) lines feeding the column selector and I/O circuits; inputs are row address and column address, output is data]
Each intersection represents a 1-T DRAM cell.
Square keeps the wires short: power and speed
advantages. Less RC means faster precharge and
discharge, and so faster access time!
  • Row and Column Address together select 1 bit at a
    time
29
DRAM with Column buffer
[Figure: row decoder driven by address lines A0...A10 (11 bits) selecting a word line in a (2,048 x 2,048) storage cell array, followed by sense amps, column latches, and a MUX]
Pull column into fast buffer storage. Access
sequence of bits from there.
30
Digital Arithmetic
  • Circuit design for unsigned addition
  • Full adder per bit slice
  • Delay limited by Carry Propagation
  • Ripple is algorithmically slow, but wires are
    short
  • Carry select
  • Simple, resource-intensive
  • Excellent layout
  • Carry look-ahead
  • Excellent asymptotic behavior
  • Great at the board level, but wire length effects
    are significant on chip
  • Digital number systems
  • How to represent negative numbers
  • Simple operations
  • Clean algorithmic properties
  • 2's complement is most widely used
  • Circuit for unsigned arithmetic
  • Subtract by complement and carry in
  • Overflow when cin xor cout of sign-bit is 1

31
2's Complement Adder/Subtractor
A - B = A + (-B) = A + B' + 1
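The identity on this slide (subtract by complementing B and setting the carry-in) can be checked with a bit-level model of a ripple adder. The sketch below is in Python, with a hypothetical function name `add_sub` and a default width of 4 bits chosen for illustration. It also reports overflow as carry-in XOR carry-out of the sign bit, the rule stated on the Digital Arithmetic slide.

```python
def add_sub(a, b, subtract, n=4):
    """n-bit ripple adder/subtractor: A + B, or A + B' + 1 when subtract=1."""
    carry = subtract                      # carry-in of 1 supplies the "+ 1"
    result = 0
    carry_into_sign = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ subtract    # XOR gate complements B when subtracting
        if i == n - 1:
            carry_into_sign = carry       # remember cin of the sign-bit slice
        s = ai ^ bi ^ carry               # full-adder sum
        carry = (ai & bi) | (ai & carry) | (bi & carry)   # full-adder carry
        result |= s << i
    overflow = carry_into_sign ^ carry    # cin xor cout of the sign bit
    return result, overflow


print(add_sub(0b0101, 0b0011, 1))   # 5 - 3
print(add_sub(0b0111, 0b0001, 0))   # 7 + 1 does not fit in 4 signed bits
```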
32
Digital design - as we've seen it
System specification (in words)
Datapath specification
Controller specification
FSM generation
Comb. logic operations
Verilog dataflow
STT / STD / Encoding
Logic nextstate/outputs
Gates / LUTs
Verilog behavior
Gates / LUTs / FF
33
Final Example: Ant Brain (Ward, MIT)
  • Sensors: L and R antennae, 1 if touching
    wall
  • Actuators: F - forward step, TL/TR - turn
    left/right slightly
  • Goal: find way out of maze
  • Strategy: keep the wall on the right

34
Serial Line TX/RX: dealing with I/O

35
The GAME
  • CP1: N64 interface
  • CP2: Digital video encoder
  • CP3: SDRAM controller
  • CP4: IEEE 802.15.4 (cc2420) interface
  • Project CP: game engine
  • Endgame

[Figure: FPGA block diagram with N64 controller / joystick interface (player-0 and player-1 input), Game Physics, Render Engine, SDRAM Control (32-bit data to SDRAM), and Video Encode (8-bit ITU 601/656 to the ADV7194, which outputs composite video)]
36
Computer Organization
  • Computer design as an application of digital
    logic design procedures
  • Computer = processing unit + memory system
  • Processing unit = control + datapath
  • Control = finite state machine
  • Inputs = machine instruction, datapath conditions
  • Outputs = register transfer control signals, ALU
    operation codes
  • Instruction interpretation = instruction fetch,
    decode, execute
  • Datapath = functional units + registers +
    interconnect
  • Functional units = ALU, multipliers, dividers,
    etc.
  • Registers = program counter, shifters, storage
    registers
  • Interconnect = busses and wires
  • Instruction Interpreter vs Fixed Function Device

37
Design hierarchy
system
control
data-path
code registers
state registers
combinational logic
multiplexer
comparator
register
logic
switching networks
38
Datapath vs Control
Datapath
Controller
Control Points
  • Datapath: Storage, FU, interconnect sufficient to
    perform the desired functions
  • Inputs are Control Points
  • Outputs are signals
  • Controller: State machine to orchestrate
    operation on the data path
  • Based on desired function and signals

39
Datapath Design
  • Datapath consists of state (reg, reg file),
    function units (adders, ALUs), and interconnect
    (mux, tri-state bus)
  • It can perform certain register transfers: source
    regs through function units and interconnect to
    dest reg
  • A set of register transfers occurs on each cycle
  • Each datapath element has control points
  • Reg (LD), FU (op), MUX (sel), TriState (OE)
  • Controller asserts the proper control points to
    cause the data path to carry out the requested
    register transfers
  • The RTLs associated with each step in the high
    level algorithm determine the STD of the
    controller
  • Controller inputs are datapath outputs
    (conditions)
  • Controller outputs are datapath inputs (control
    points)

40
Array Multiplier
Generates all n partial products simultaneously.
Each row: n-bit adder with AND gates
What is the critical path?
41
Shift and Add Multiplier
  • Sums each partial product, one at a time.
  • In binary, each partial product is a shifted
    version of A or 0.
  • Control Algorithm
  • 1. P ← 0, A ← multiplicand,
  • B ← multiplier
  • 2. If LSB of B == 1 then add A to P
  • else add 0
  • 3. Shift [P][B] right 1
  • 4. Repeat steps 2 and 3 n-1 times.
  • 5. [P][B] has product.
  • Cost ∝ n, time ∝ n clock cycles.
  • What is the critical path for determining the min
    clock period?
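The control algorithm on this slide can be traced in software by treating P and B as the two halves of one shift register. The following Python sketch assumes unsigned n-bit operands. The function name `shift_add_multiply` and the default width of 4 are illustrative choices, not taken from the lecture.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Unsigned n-bit shift-and-add multiply; returns the 2n-bit product."""
    a = multiplicand
    p = 0                     # step 1: P <- 0
    b = multiplier            #         B <- multiplier
    for _ in range(n):        # steps 2 and 3, performed n times in total
        if b & 1:             # step 2: LSB of B selects A or 0
            p = p + a         # P can briefly hold n+1 bits (adder carry-out)
        # step 3: shift the combined P,B register right by one;
        # the LSB of P moves into the MSB of B
        b = (b >> 1) | ((p & 1) << (n - 1))
        p = p >> 1
    return (p << n) | b       # step 5: P,B together hold the product


print(shift_add_multiply(3, 5))
print(shift_add_multiply(15, 15))
```

Each loop iteration corresponds to one clock cycle of the hardware, so the cycle count grows linearly with the operand width.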

42
DIVIDE HARDWARE Version 2
  • 32-bit Divisor register, 32-bit ALU, 64-bit
    Remainder register, 32-bit Quotient register

[Figure: 32-bit Divisor register feeding a 32-bit add/sub ALU; 64-bit Remainder register with Shift Left and Write controls; 32-bit Quotient register with Shift Left; Control block]
43
Register Transfers - interconnect
  • Point-to-point connection
  • Dedicated wires
  • Muxes on inputs of each register
  • Common input from multiplexer
  • Load enables for each register
  • Control signals for multiplexer
  • Common bus with output enables
  • Output enables and load enables for each register

44
Register Transfer Level Descriptions
  • RTL comprises a set of register transfers with
    optional operators as part of the transfer.
  • Example
  • regA ← regB
  • regC ← regA + regB
  • if (start == 1) regA ← regC
  • Personal style
  • use ; to separate transfers that occur on
    separate cycles.
  • Use , to separate transfers that occur on the
    same cycle.
  • Example (2 cycles)
  • regA ← regB, regB ← 0;
  • regC ← regA
  • A standard high-level representation for
    describing systems.
  • It follows from the fact that all synchronous
    digital systems can be described as a set of state
    elements connected by combinational logic (CL)
    blocks

45
List Processor Example
  • RTL gives us a framework for making high-level
    optimizations.
  • Fixed function unit
  • Approach extends to instruction interpreters
  • General design procedure outline
  • 1. Problem, Constraints, and Component Library
    Spec.
  • 2. Algorithm Selection
  • 3. Micro-architecture Specification
  • 4. Analysis of Cost, Performance, Power
  • 5. Optimizations, Variations
  • 6. Detailed Design

46
3. Architecture 1
Direct implementation of RTL description
Datapath
Controller
if (START == 1) NEXT ← 0, SUM ← 0;
repeat
  SUM ← SUM + Memory[NEXT+1];
  NEXT ← Memory[NEXT];
until (NEXT == 0);
R ← SUM, DONE ← 1;
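The RTL description on this slide walks a linked list stored in memory and accumulates a sum. A direct Python transcription is sketched below. It assumes each list node occupies two consecutive words (the next pointer at address NEXT, the value at NEXT+1), that the first node sits at address 0, and that a next pointer of 0 ends the list. The memory contents are invented for the example.

```python
def list_sum(memory):
    """Follow the RTL: sum the values of a linked list stored in memory."""
    next_ptr = 0                          # NEXT <- 0
    total = 0                             # SUM <- 0
    while True:                           # repeat ... until (NEXT == 0)
        total += memory[next_ptr + 1]     # SUM <- SUM + Memory[NEXT+1]
        next_ptr = memory[next_ptr]       # NEXT <- Memory[NEXT]
        if next_ptr == 0:
            break
    return total                          # R <- SUM (DONE would be set here)


# Hypothetical memory image: node@0 -> node@4 -> node@2 -> end
memory = [4, 10, 0, 30, 2, 20]
print(list_sum(memory))
```

The two statements in the loop body are separate cycles in the RTL, which is why the direct implementation needs two memory reads per list element.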
47
Approaching an ISA
  • Instruction Set Architecture
  • Defines set of operations, instruction format,
    hardware supported data types, named storage,
    addressing modes, sequencing
  • Meaning of each instruction is described by RTL
    on architected registers and memory
  • Given technology constraints assemble adequate
    datapath
  • Architected storage mapped to actual storage
  • Function units to do all the required operations
  • Possible additional storage (e.g. MAR, MBR, ...)
  • Interconnect to move information among regs and
    FUs
  • Map each instruction to sequence of RTLs
  • Collate sequences into symbolic controller STD
  • Lower symbolic STD to control points
  • Implement controller

48
Instruction Sequencing
  • Example: an instruction to add the contents of
    two registers (Rx and Ry) and place result in a
    third register (Rz)
  • Step 1: Fetch the ADD instruction from memory
    into an instruction register
  • Step 2: Decode instruction
  • Instruction in IR has the code of an ADD
    instruction
  • Register indices used to generate output enables
    for registers Rx and Ry
  • Register index used to generate load signal for
    register Rz
  • Step 3: Execute instruction
  • Enable Rx and Ry output and direct to ALU
  • Set up ALU to perform ADD operation
  • Direct result to Rz so that it can be loaded into
    register

49
Instruction Execution
  • Control State Diagram (for each diagram)
  • Reset
  • Fetch instruction
  • Decode
  • Execute
  • Instructions partitioned into three classes
  • Branch
  • Load/store
  • Register-to-register
  • Different sequence through diagram for each
    instruction type
  • Controller manipulates the data path to perform
    the instruction

[Figure: control state diagram: Reset, Init (Initialize Machine), Fetch Instr., XEQ Instr. branching to Load/Store, Branch (Branch Taken / Branch Not Taken), and Register-to-Register, then Incr. PC]
50
Networking Layers
[Figure: protocol layers from Application (send @sdata dest) down to Analog Transmitter and Analog Receiver; signal versus time]
51
What the PHY does
  • Code, transmit, receive, decode frames
  • activation and deactivation of the radio
    transceiver
  • energy detection (ED) within current channel
  • link quality indication (LQI) for received
    packets
  • channel selection
  • clear channel assessment (CCA) for CSMA-CA

52
CSMA
  • Carrier Sense Multiple Access with Collision
    Avoidance (CSMA-CA)
  • Listen for a period of time to hear if the
    channel is free (CCA)
  • If hear traffic, back off for random period of
    time
  • Typically exponentially increasing backoff
  • Try again
  • May also do random delay before first CCA
  • If channel is clear, transmit
  • Ethernet does CSMA-CD (collision detect)
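The listen, back off, and retry loop on this slide can be sketched as a small simulation. This Python example is not the IEEE 802.15.4 algorithm as specified: the retry limit, the backoff window sizes, and the function name `csma_ca_send` are assumptions chosen only to show the exponentially growing random backoff. It counts backoff slots instead of actually waiting.

```python
import random


def csma_ca_send(channel_is_clear, max_attempts=5, base_backoff=1, rng=random):
    """Try to transmit; return (sent, total_backoff_slots_waited)."""
    waited = 0
    for attempt in range(max_attempts):
        if channel_is_clear():                    # clear channel assessment (CCA)
            return True, waited                   # channel is clear: transmit
        window = base_backoff * (2 ** attempt)    # window doubles per busy CCA
        waited += rng.randint(0, window)          # back off a random number of slots
    return False, waited                          # give up after max_attempts


# Hypothetical channel: busy for the first two assessments, then clear.
busy_then_clear = iter([False, False, True])
sent, waited = csma_ca_send(lambda: next(busy_then_clear), rng=random.Random(0))
print(sent, waited)
```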

53
Error Correction Codes (ECC)
  • Memory systems generate errors (accidentally
    flipped-bits)
  • DRAMs store very little charge per bit
  • Soft errors occur occasionally when cells are
    struck by alpha particles or other environmental
    upsets.
  • Less frequently, hard errors can occur when
    chips permanently fail.
  • Problem gets worse as memories get denser and
    larger
  • Where is perfect memory required?
  • servers, spacecraft/military computers, eBay, ...
  • Memories are protected against failures with ECCs
  • Extra bits are added to each data-word
  • used to detect and/or correct faults in the
    memory system
  • in general, each possible data word value is
    mapped to a unique code word. A fault changes
    a valid code word to an invalid one - which can
    be detected.

54
Correcting Code Concept
Space of possible bit patterns (2^N)
  • Detection: bit pattern fails codeword check
  • Correction: map to nearest valid code word
  • Example: Parity bit

55
SECDED
position:  1    2    3    4    5    6    7
binary:    001  010  011  100  101  110  111
role:      P1   P2   d1   P3   d2   d3   d4
Position of error = C3C2C1, where Ci is parity of
group i
  • You receive
  • 1111110
  • 0000010
  • 1010010
  • What is the correct value?
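The three received words on this slide can be worked through mechanically. The Python sketch below assumes even parity, the position layout from the slide (P1 P2 d1 P3 d2 d3 d4, position 1 first), and at most one flipped bit per word, so it covers single-error correction only. The function name `correct_hamming74` is illustrative.

```python
def correct_hamming74(word):
    """word: 7-character bit string, position 1 first.

    Returns (corrected_word, error_position, data_bits); position 0 means
    no error was found.
    """
    bits = [0] + [int(ch) for ch in word]      # pad so bits[1] is position 1
    syndrome = 0
    for group in (1, 2, 4):                    # parity groups C1, C2, C3
        parity = 0
        for pos in range(1, 8):
            if pos & group:                    # position belongs to this group
                parity ^= bits[pos]
        if parity:                             # even parity is violated
            syndrome |= group                  # builds C3C2C1 as a binary number
    if syndrome:
        bits[syndrome] ^= 1                    # flip the bit the syndrome points at
    corrected = "".join(str(b) for b in bits[1:])
    data = "".join(str(bits[p]) for p in (3, 5, 6, 7))   # d1 d2 d3 d4
    return corrected, syndrome, data


for received in ("1111110", "0000010", "1010010"):
    print(received, correct_hamming74(received))
```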

56
Concept Redundant Check
  • Send a message M and a check word C
  • Simple function on <M,C> to determine if both
    received correctly (with high probability)
  • Example: XOR all the bytes in M and append the
    checksum byte, C, at the end
  • Receiver XORs <M,C>
  • What should result be?
  • What errors are caught?


bit i is XOR of ith bit of each byte
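The XOR checksum on this slide is short enough to try directly. In the Python sketch below, the function names and the sample message are made up for illustration. The receiver check relies on the fact that XORing the message bytes together with their own XOR gives zero.

```python
from functools import reduce


def xor_checksum(message: bytes) -> int:
    """XOR of all bytes: bit i of the result is the XOR of bit i of each byte."""
    return reduce(lambda acc, byte: acc ^ byte, message, 0)


def append_checksum(message: bytes) -> bytes:
    """Sender side: append the checksum byte C to the message M."""
    return message + bytes([xor_checksum(message)])


def check(frame: bytes) -> bool:
    """Receiver side: XOR over <M,C> is zero when nothing was corrupted."""
    return xor_checksum(frame) == 0


frame = append_checksum(b"EECS150")
one_flip = bytes([frame[0] ^ 0x04]) + frame[1:]
two_flips = bytes([frame[0] ^ 0x04, frame[1] ^ 0x04]) + frame[2:]
print(check(frame), check(one_flip), check(two_flips))
```

The last case flips the same bit position in two different bytes. The two flips cancel in the XOR, so this kind of error goes undetected.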
57
CRC concept
  • I have a msg polynomial M(x) of degree m
  • We both have a generator poly G(x) of degree n
  • Let r(x) = remainder of M(x) x^n / G(x)
  • M(x) x^n = G(x) p(x) + r(x)
  • r(x) is of degree < n
  • What is (M(x) x^n - r(x)) / G(x) ?
  • So I send you M(x) x^n - r(x)
  • m+n degree polynomial
  • You divide by G(x) to check
  • M(x) is just the m most significant coefficients,
    r(x) the lower n
  • n-bit Message is viewed as coefficients of
    n-degree polynomial over binary numbers

M(x) x^n: n bits of zero at the end
M(x) x^n - r(x): tack on n bits of remainder instead of the zeros
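The CRC construction on this slide amounts to polynomial long division with coefficients that are single bits, where both addition and subtraction are XOR. A minimal Python sketch follows, storing each polynomial as an integer whose bits are its coefficients. The function names, the sample message, and the generator x^3 + x + 1 are assumptions for the example, not values from the lecture.

```python
def poly_mod(dividend, generator):
    """Remainder of dividend / generator for polynomials over GF(2)."""
    n = generator.bit_length() - 1                 # degree of G(x)
    while dividend.bit_length() - 1 >= n:
        shift = dividend.bit_length() - 1 - n      # align G(x) under the leading term
        dividend ^= generator << shift             # subtract (XOR) the shifted G(x)
    return dividend


def crc_encode(message, generator):
    """Return M(x) x^n with the n zero bits replaced by the remainder r(x)."""
    n = generator.bit_length() - 1
    shifted = message << n                         # n bits of zero at the end
    return shifted ^ poly_mod(shifted, generator)  # tack on the remainder


generator = 0b1011                                 # x^3 + x + 1 (illustrative)
codeword = crc_encode(0b11010011101100, generator)
print(bin(codeword), poly_mod(codeword, generator))
corrupted = codeword ^ (1 << 5)                    # flip one transmitted bit
print(poly_mod(corrupted, generator))
```

A codeword built this way leaves no remainder when divided by the generator, so any nonzero remainder at the receiver signals an error.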
58
Controlling Energy Consumption
What control do you have as a designer?
  • Largest contributing component to CMOS power
    consumption is switching power
  • Factors influencing power consumption
  • n: total number of nodes in circuit
  • α: activity factor (probability of each node
    switching)
  • f: clock frequency (does this affect energy
    consumption?)
  • Vdd: power supply voltage
  • What control do you have over each factor?
  • How does each affect the total Energy?

59
Digital Design
  • Given a functional description and performance,
    cost, power constraints, come up with an
    implementation using a set of primitives.
  • How do we learn how to do this?
  • 1. Learn about the primitives and how to generate
    them.
  • 2. Learn about design representation.
  • 3. Learn formal methods to optimally manipulate
    the representations.
  • 4. Look at design examples.
  • 5. Use trial and error - CAD tools and
    prototyping.
  • Digital design is in some ways more an art than a
    science. The creative spirit is critical in
    combining primitive elements & other components
    in new ways to achieve a desired function.
  • However, unlike art, we have objective measures
    of a design: performance, cost, power

60
Traversing Digital Design
CS61C
EE 40
61
So what's on the final?
  • 5 questions (one full design problem)
  • Focused on latter third, but builds upon
    everything we've done
  • Digital arithmetic
  • Datapath / Control / Computer Organization
  • RTL
  • Error coding
  • But also
  • Combinational logic, timing and delays,
    controller design
  • Partly recalling what was presented, partly
    putting your knowledge to work to solve a new
    problem

62
Maintaining the Digital Abstraction (in an analog
world)
  • Circuit design with very sharp transitions
  • Noise margin for logical values
  • Carefully Design Storage Elements (SE)
  • Internal feedback
  • Structured System Design
  • SE + CL, cycles must cross SE
  • Timing Methodology
  • All SE advance state together
  • All inputs stable across state change
  • Channel coding, framing, encapsulation
  • Error coding, detection, correction

63
Moore's Law: 2x stuff per year or so
64
Bell's Law: new computer class per 10 years
[Figure: log (people per computer) versus year; annotation: streaming information to/from physical world]
  • Enabled by technological opportunities
  • Smaller, more numerous and more intimately
    connected
  • Ushers in a new kind of application
  • Ultimately used in many ways not previously
    imagined
65
What to take away from EECS 150
  • Hands-on understanding of digital design
    techniques and their relationship to the
    underlying technology.
  • Experience with the fundamental process of the
    design of digital systems
  • Components, DP, RTL, FSM, Controller
  • An intellectual toolbox for a changing world.