Chapter 2: Custom Single-Purpose Processors - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 2: Custom Single-Purpose Processors

Slides: 47
Provided by: vah48
Transcript and Presenter's Notes



1
Chapter 2 Custom Single-Purpose Processors
2
Outline
  • Introduction
  • Combinational Logic
  • Sequential Logic
  • Custom Single-Purpose Processor Design
  • RT-level Custom Single-Purpose Processor Design
  • Optimizing Custom Single-Purpose Processors
  • Summary

3
Introduction
  • Processor
  • Is a digital circuit that performs computation
    tasks
  • Consists of a controller and a datapath
  • General-purpose: can perform a variety of
    computation tasks
  • Single-purpose: performs one particular
    computation task
  • Custom single-purpose: performs a non-standard
    task
  • A custom single-purpose processor may be
  • Fast, small, low power
  • But with high NRE cost, longer time-to-market,
    and less flexibility

4
Combinational Logic
  • Transistor
  • The basic electrical component in digital systems
  • Acts as an on/off switch
  • Voltage at gate controls whether current flows
    from source to drain
  • Don't confuse this gate with a logic gate
  • CMOS transistor on silicon

5
CMOS Transistor Implementations
  • Complementary Metal Oxide Semiconductor
  • We refer to logic levels
  • Typically 0 is 0V, 1 is 5V
  • Two basic CMOS types
  • nMOS conducts if gate = 1
  • pMOS conducts if gate = 0
  • Hence complementary
  • Basic gates built from the two basic CMOS
    transistor types
  • Inverter, NAND, NOR
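The conduction rules above can be sketched as a switch-level model in C. This is only an illustration under simplifying assumptions: transistors are ideal switches, and the function names (nmos_conducts, cmos_nand, and so on) are made up for this sketch. In each gate, pMOS transistors form the pull-up path to 1 and nMOS transistors form the pull-down path to 0.

```c
#include <stdbool.h>

/* Ideal-switch model: nMOS conducts when gate = 1, pMOS when gate = 0. */
bool nmos_conducts(bool gate) { return gate; }
bool pmos_conducts(bool gate) { return !gate; }

/* Inverter: one pMOS pulls the output up, one nMOS pulls it down. */
bool cmos_inverter(bool x) {
    bool pull_up = pmos_conducts(x);
    return pull_up;               /* output is 1 exactly when pulled up */
}

/* NAND: two pMOS in parallel (pull-up), two nMOS in series (pull-down). */
bool cmos_nand(bool x, bool y) {
    bool pull_up = pmos_conducts(x) || pmos_conducts(y);
    return pull_up;
}

/* NOR: two pMOS in series (pull-up), two nMOS in parallel (pull-down). */
bool cmos_nor(bool x, bool y) {
    bool pull_up = pmos_conducts(x) && pmos_conducts(y);
    return pull_up;
}

/* Check that the pull-down path of the NAND is the complement of its
   pull-up path, so the output is never floating or shorted. */
bool cmos_nand_is_complementary(bool x, bool y) {
    bool pull_up = pmos_conducts(x) || pmos_conducts(y);
    bool pull_down = nmos_conducts(x) && nmos_conducts(y);
    return pull_up != pull_down;
}
```

For example, cmos_nand(true, true) evaluates to false because neither pMOS conducts while both series nMOS transistors do.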

6
Basic Logic Gates
  • Each gate is represented symbolically, with a
    Boolean equation, and with a truth table.

F = x y (AND)
F = x ⊕ y (XOR)
F = x (Driver)
F = x + y (OR)
F = (x y)' (NAND)
F = x' (Inverter)
F = (x + y)' (NOR)
7
Basic Combinational Logic Design
  • Combinational circuit
  • Is a digital circuit whose output is purely a
    function of its present inputs.
  • Has no memory of past inputs
  • Simple technique to design a combinational
    circuit from basic logic gates
  • Problem description
  • Truth table
  • Output equations
  • Minimized output equations (by Karnaugh maps)
  • Draw the circuit diagram

8
Combinational Logic Design
A) Problem description: y is 1 if a is 1, or
b and c are 1. z is 1 if b or c is 1, but not
both, or if all are 1.
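As a rough check of this example, the sketch below encodes the problem description directly and compares it, over all eight truth-table rows, against one possible pair of minimized equations, y = a + bc and z = ab + b'c + bc'. Those minimized forms and all function names are assumptions of this sketch, not taken from the slide.

```c
#include <stdbool.h>

/* Outputs written straight from the problem description. */
bool y_spec(bool a, bool b, bool c) { return a || (b && c); }
bool z_spec(bool a, bool b, bool c) { return (b != c) || (a && b && c); }

/* Candidate minimized sum-of-products forms (assumed for this sketch). */
bool y_min(bool a, bool b, bool c) { return a || (b && c); }
bool z_min(bool a, bool b, bool c) {
    return (a && b) || (!b && c) || (b && !c);
}

/* Walk all 8 rows of the truth table and compare both forms. */
bool minimized_forms_match(void) {
    for (int row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1;
        bool b = (row >> 1) & 1;
        bool c = row & 1;
        if (y_spec(a, b, c) != y_min(a, b, c)) return false;
        if (z_spec(a, b, c) != z_min(a, b, c)) return false;
    }
    return true;
}
```

The same exhaustive comparison works for any small combinational circuit, since n inputs give only 2^n rows to check.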
9
RT-Level Combinational Components
  • Combinational components often called
    register-transfer, or RT, level components
  • Multiplexor (selector)
  • Decoder
  • Adder
  • Comparator
  • Arithmetic-logic unit (ALU)
  • Shifter

10
Combinational Components
Multiplexor: O = I0 if S = 0..00, I1 if S = 0..01, ..., I(m-1) if
S = 1..11
Decoder: O0 = 1 if I = 0..00, O1 = 1 if I = 0..01, ..., O(n-1) = 1 if
I = 1..11
With enable input e: all Os are 0 if e = 0
Adder: sum = A + B (first n bits), carry = (n+1)th
bit of A + B
With carry-in input Ci: sum = A + B + Ci
Comparator: less = 1 if A < B, equal = 1 if A = B, greater = 1 if
A > B
ALU: O = A op B, where op is determined by S.
May have status outputs: carry, zero, etc.
11
Multiplexor (Selector)
  • Allows only one of its data inputs to pass
    through to the output
  • m-by-1 multiplexor: m data inputs, 1 data output
  • n-bit multiplexor
  • Each data input as well as the output consists of
    n lines
  • n is independent of the number of select lines
  • 4-bit 8x1 multiplexor
  • If I6 = 1110 and S selects I6, then the output
    would be 1110

O = I0 if S = 0..00, I1 if S = 0..01, ...,
I(m-1) if S = 1..11
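A behavioral sketch of an m-by-1 multiplexor in C, assuming data lines of at most 8 bits so that each input fits in a uint8_t. The function name mux and the choice to return 0 for an out-of-range select value are assumptions made for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* m-by-1 multiplexor: passes inputs[select] through to the output.
   An out-of-range select value returns 0 (an arbitrary choice here). */
uint8_t mux(const uint8_t *inputs, size_t m, size_t select) {
    return select < m ? inputs[select] : 0;
}
```

With eight 4-bit inputs where input 6 holds binary 1110 (0xE), calling mux(inputs, 8, 6) returns 0xE, matching the 4-bit 8x1 example above.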
12
Decoder
  • Converts its binary input I into a one-hot output
    O.
  • log2(n) x n decoder
  • An extra input called enable
  • When enable is 0, all outputs are 0

O0 = 1 if I = 0..00, O1 = 1 if I = 0..01, ...,
O(n-1) = 1 if I = 1..11
With enable input e: all Os are 0 if e = 0
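A minimal C sketch of the decoder's behavior, assuming at most 32 outputs packed into one uint32_t (bit k of the result stands for output Ok). The function name decode is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoder with n outputs (n <= 32): returns a one-hot word whose bit i
   is set. When enable is 0, or i is out of range, all outputs are 0. */
uint32_t decode(unsigned i, unsigned n, bool enable) {
    if (!enable || i >= n || n > 32) return 0;
    return (uint32_t)1 << i;
}
```

For a 2x4 decoder, decode(2, 4, true) yields binary 0100, and decode(2, 4, false) yields 0000.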
13
Adder
  • Adds two n-bit binary inputs A and B, generating
    an n-bit output sum along with an output carry

sum = A + B (first n bits), carry = (n+1)th bit
of A + B
With carry-in input Ci: sum = A + B + Ci
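The adder equations can be sketched in C as follows, assuming 1 <= n <= 31 so that the full result fits in a 32-bit word. The type adder_result and the function add_n are names invented for this sketch.

```c
#include <stdint.h>

typedef struct {
    uint32_t sum;    /* first n bits of A + B + Ci */
    unsigned carry;  /* the (n+1)th bit of A + B + Ci */
} adder_result;

/* n-bit adder with carry-in, for 1 <= n <= 31. */
adder_result add_n(uint32_t a, uint32_t b, unsigned carry_in, unsigned n) {
    uint32_t mask = ((uint32_t)1 << n) - 1;
    uint32_t full = (a & mask) + (b & mask) + (carry_in & 1u);
    adder_result r;
    r.sum = full & mask;
    r.carry = (unsigned)((full >> n) & 1u);
    return r;
}
```

For a 4-bit adder, add_n(0xF, 0x1, 0, 4) produces sum 0 with carry 1.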
14
Comparator
  • Compares two n-bit binary inputs A and B,
    generating outputs that indicate whether A is
    less than, equal to, or greater than B

less = 1 if A < B, equal = 1 if A = B, greater = 1 if
A > B
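A small C sketch of the comparator's three outputs, treating A and B as unsigned values. The names cmp_result and compare are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool less;     /* 1 if A < B */
    bool equal;    /* 1 if A = B */
    bool greater;  /* 1 if A > B */
} cmp_result;

/* Unsigned comparator: exactly one of the three outputs is 1. */
cmp_result compare(uint32_t a, uint32_t b) {
    cmp_result r;
    r.less = a < b;
    r.equal = a == b;
    r.greater = a > b;
    return r;
}
```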
15
Arithmetic-logic unit (ALU)
  • Performs a variety of arithmetic and logic
    functions on its two n-bit binary inputs A and B
  • Select lines S choose the current function
  • Common functions: addition, subtraction, AND, OR

O = A op B, where op is determined by S.
May have status outputs: carry, zero, etc.
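One way to picture the ALU is the 8-bit C sketch below. The encoding of the select lines (the alu_op enum), the 8-bit width, and the treatment of the carry output as a borrow flag during subtraction are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of the select lines S. */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR } alu_op;

typedef struct {
    uint8_t out;  /* O = A op B, truncated to 8 bits */
    bool carry;   /* carry out of bit 7 (borrow for subtraction) */
    bool zero;    /* 1 if the 8-bit output is 0 */
} alu_result;

alu_result alu(uint8_t a, uint8_t b, alu_op s) {
    unsigned wide = 0;  /* wider than 8 bits so bit 8 can be inspected */
    switch (s) {
    case ALU_ADD: wide = (unsigned)a + b; break;
    case ALU_SUB: wide = (unsigned)a - b; break;
    case ALU_AND: wide = (unsigned)(a & b); break;
    case ALU_OR:  wide = (unsigned)(a | b); break;
    }
    alu_result r;
    r.out = (uint8_t)wide;
    r.carry = ((wide >> 8) & 1u) != 0;
    r.zero = (r.out == 0);
    return r;
}
```

Because every operation goes through the same unit, a datapath can reuse one ALU for operations that happen in different states.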
16
Sequential Logic
  • Sequential circuit
  • Is a digital circuit whose outputs are a function
    of the present as well as previous input values.
  • Has memory
  • Basic sequential circuit: flip-flop
  • Stores a single bit
  • D flip-flop
  • SR flip-flop
  • JK flip-flop

17
RT-Level Sequential Components
  • Register
  • Shift register
  • Counter

18
Sequential Components
Shift register: Q = lsb; content shifted; I stored in msb
Register: Q = 0 if clear = 1, I if load = 1 and
clock = 1, Q(previous) otherwise.
Counter: Q = 0 if clear = 1, Q(prev) + 1 if count = 1 and
clock = 1.
19
Register
  • Stores n bits from its n-bit data input I, with
    those stored bits appearing at its output Q
  • Parallel-load register
  • All n bits of the register can be stored in
    parallel

Q = 0 if clear = 1, I if load = 1 and
clock = 1, Q(previous) otherwise.
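A behavioral C sketch of an 8-bit parallel-load register. Each call to reg8_clock stands for one clock edge; sampling clear at the clock edge (rather than asynchronously) and giving it priority over load are simplifying assumptions, and the names are invented for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t q;  /* stored bits, visible at output Q */
} reg8;

/* One clock edge: clear wins over load; otherwise Q holds its value. */
void reg8_clock(reg8 *r, uint8_t i, bool load, bool clear) {
    if (clear) {
        r->q = 0;
    } else if (load) {
        r->q = i;  /* all 8 bits stored in parallel */
    }
    /* neither clear nor load: Q(previous) is kept */
}
```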
20
Shift Register
  • Stores n bits, but these bits cannot be stored in
    parallel. Instead, they must be shifted into the
    register serially, meaning one bit per clock
    edge.
  • 1-bit data input I, with I stored in the MSB, the
    content shifted, and the LSB shifted out and
    appearing at its output Q

Q = lsb; content shifted; I stored in msb
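The serial behavior can be sketched in C for an 8-bit shift register. Each call to shift_reg8_clock models one clock edge; the 8-bit width and the function names are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t bits;  /* register content, MSB is bit 7 */
} shift_reg8;

/* One clock edge: content shifts toward the LSB, I enters at the MSB. */
void shift_reg8_clock(shift_reg8 *r, bool i) {
    r->bits = (uint8_t)((r->bits >> 1) | (i ? 0x80u : 0u));
}

/* Output Q is the current least significant bit. */
bool shift_reg8_q(const shift_reg8 *r) {
    return (r->bits & 1u) != 0;
}
```

A bit shifted in at the MSB reaches output Q after seven further clock edges, which is why loading takes one clock edge per bit.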
21
Counter
  • A register that can also increment
  • A common counter feature is both up and down
    counting, or incrementing and decrementing,
    requiring an additional control input

Q = 0 if clear = 1, Q(prev) + 1 if count = 1 and
clock = 1.
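A C sketch of an 8-bit up/down counter. The extra control input is called down here, which is an assumed name, as are counter8 and counter8_clock; the counter wraps around modulo 256, and clear is sampled at the clock edge as a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t q;  /* current count, visible at output Q */
} counter8;

/* One clock edge. clear has priority; count enables counting;
   down selects decrementing instead of incrementing. */
void counter8_clock(counter8 *c, bool clear, bool count, bool down) {
    if (clear) {
        c->q = 0;
    } else if (count) {
        c->q = down ? (uint8_t)(c->q - 1) : (uint8_t)(c->q + 1);
    }
    /* count = 0: the value is held */
}
```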
22
Sequential Logic Design
  • Problem description
  • Translate to a state diagram, called a finite
    state machine (FSM)
  • Implement FSM
  • Using a register to store the current state, and
    combinational logic to generate the output values
    and the next state
  • State table
  • Assign to each state a unique binary value, and
    create a truth table for the combinational logic
  • Minimized output equations (by Karnaugh maps)
  • Draw the combinational logic circuit

23
Sequential Logic Design (Cont.)
A) Problem description: You want to construct a
clock divider. Slow down your pre-existing clock
so that you output a 1 for every four clock cycles.
  • Given this implementation model
  • Sequential logic design quickly reduces to
    combinational logic design
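The implementation model can be sketched in C for the clock divider: a 2-bit state register plus combinational logic for the output and the next state. This sketch assumes a divider with no enable input and a plain binary state encoding (00, 01, 10, 11); the names and the particular gate equations are choices made here, not taken from the slide.

```c
#include <stdbool.h>

/* 2-bit state register: states 00 -> 01 -> 10 -> 11 -> 00 ... */
typedef struct {
    bool q1;
    bool q0;
} divider_state;

/* Combinational output logic: 1 only in state 11. */
bool divider_output(divider_state s) {
    return s.q1 && s.q0;
}

/* Combinational next-state logic: Q1' = Q1 xor Q0, Q0' = not Q0. */
divider_state divider_next(divider_state s) {
    divider_state n;
    n.q1 = (s.q1 != s.q0);
    n.q0 = !s.q0;
    return n;
}
```

Calling divider_next once per clock cycle and reading divider_output in each cycle yields the pattern 0, 0, 0, 1 repeating, that is, one 1 for every four clock cycles.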

24
Sequential Logic Design (cont.)
25
Custom Single-Purpose Processor Design
  • A basic processor consists of a controller and a
    datapath
  • Datapath
  • Stores and manipulates a system's data
  • Contains register units, functional units, and
    connection units like wires and multiplexors
  • Controller
  • Carries out the configuration of the datapath

26
Custom Single-Purpose Processor Basic Model


[Figure: controller and datapath. Signals: external control inputs, external control outputs, datapath control inputs, datapath control outputs, external data inputs, external data outputs.]
[Figure: a view inside the controller and datapath. Controller: next-state and control logic, state register. Datapath: registers, functional units.]
27
Example: Greatest Common Divisor
  • Building a single-purpose processor implementing
    the GCD program
  • First create algorithm
  • Convert algorithm to complex state machine
  • Divide the functionality into a datapath part and
    a controller part
  • Construct the datapath
  • Construct the controller
  • Perform optimizations to datapath and controller

28
Example: Greatest Common Divisor (Cont.)
  • First create algorithm
  • Convert algorithm to complex state machine
  • Known as FSMD: finite-state machine with datapath
  • Can use templates to perform such conversion

(b) desired functionality
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }
(c) state diagram
29
State Diagram Templates
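Assignment statement
a = b
next statement
State diagram: state "a = b", followed by "next statement".

Loop statement
while (cond) {
  loop-body-statements
}
next statement
State diagram: condition state C. If cond holds, go through the loop-body statements and back to C. Otherwise go to join state J, followed by "next statement".

Branch statement
if (c1)
  c1 stmts
else if (c2)
  c2 stmts
else
  other stmts
next statement
State diagram: condition state C. On c1 go to "c1 stmts", on !c1*c2 go to "c2 stmts", on !c1*!c2 go to "others". All three continue to join state J, followed by "next statement".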
30
Creating the Datapath
  • Create a register for any declared variable
  • Create a functional unit for each arithmetic
    operation
  • Connect the ports, registers and functional units
  • Based on reads and writes
  • Use multiplexors for multiple sources
  • Create a unique identifier for each datapath
    component control input and output

31
Creating the Controller's FSM
  • Same structure as FSMD
  • Replace complex actions/conditions with datapath
    configurations

32
Splitting into a Controller and Datapath
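[Figure: controller FSM for the GCD example, with input go_i. Each state is listed with its encoding, its datapath control outputs, and its outgoing transitions.]
State 1 (0000): on 1 go to state 2 (the !1 transition is never taken)
State 2 (0001): on !go_i stay in state 2; on !(!go_i) go to 2-J
State 2-J (0010): go to state 3
State 3 (0011): x_sel = 0, x_ld = 1; go to state 4
State 4 (0100): y_sel = 0, y_ld = 1; go to state 5
State 5 (0101): on x_neq_y = 1 go to state 6; on x_neq_y = 0 go to 5-J
State 6 (0110): on x_lt_y = 1 go to state 7; on x_lt_y = 0 go to state 8
State 7 (0111): y_sel = 1, y_ld = 1; go to 6-J
State 8 (1000): x_sel = 1, x_ld = 1; go to 6-J
State 6-J (1001): go to state 5
State 5-J (1010): go to state 9
State 9 (1011): d_ld = 1; go to 1-J
State 1-J (1100): go to state 1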
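To make the controller/datapath split concrete, here is a rough C simulation of the GCD processor. It assumes that x_sel and y_sel choose between the external input (0) and the subtractor result (1), that go_i is held at 1, and that both inputs are positive (otherwise the subtraction loop never ends). The type and function names are invented for this sketch, and the simulation stops at state 1-J instead of looping back to state 1.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { S1, S2, S2J, S3, S4, S5, S6, S7, S8, S6J, S5J, S9, S1J } state_t;
typedef struct { uint32_t x, y, d; } datapath_t;
typedef struct { bool x_sel, x_ld, y_sel, y_ld, d_ld; } control_t;

/* Datapath, one clock edge: multiplexors pick each register's source,
   and the load signals decide which registers actually update. */
static void datapath_clock(datapath_t *dp, control_t c,
                           uint32_t x_i, uint32_t y_i) {
    uint32_t x_next = c.x_sel ? dp->x - dp->y : x_i;
    uint32_t y_next = c.y_sel ? dp->y - dp->x : y_i;
    uint32_t d_next = dp->x;
    if (c.x_ld) dp->x = x_next;
    if (c.y_ld) dp->y = y_next;
    if (c.d_ld) dp->d = d_next;
}

/* Controller FSM driving the datapath until one result is produced. */
uint32_t run_gcd(uint32_t x_i, uint32_t y_i) {
    datapath_t dp = { 0, 0, 0 };
    state_t s = S1;
    const bool go_i = true;  /* assumption: go_i is held at 1 */
    for (;;) {
        control_t c = { false, false, false, false, false };
        bool x_neq_y = dp.x != dp.y;  /* status from the datapath */
        bool x_lt_y = dp.x < dp.y;
        state_t next;
        switch (s) {
        case S1:  next = S2; break;
        case S2:  next = go_i ? S2J : S2; break;
        case S2J: next = S3; break;
        case S3:  c.x_sel = false; c.x_ld = true; next = S4; break;
        case S4:  c.y_sel = false; c.y_ld = true; next = S5; break;
        case S5:  next = x_neq_y ? S6 : S5J; break;
        case S6:  next = x_lt_y ? S7 : S8; break;
        case S7:  c.y_sel = true; c.y_ld = true; next = S6J; break;
        case S8:  c.x_sel = true; c.x_ld = true; next = S6J; break;
        case S6J: next = S5; break;
        case S5J: next = S9; break;
        case S9:  c.d_ld = true; next = S1J; break;
        default:  return dp.d;  /* S1J: stop after one result */
        }
        datapath_clock(&dp, c, x_i, y_i);
        s = next;
    }
}
```

Note that the controller only sees the two status bits and only produces select and load signals; all arithmetic stays inside datapath_clock.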
33
Controller State Table for the GCD Example
34
Completing the GCD Custom Single-Purpose
Processor Design
  • We finished the datapath
  • We have a state table for the next state and
    control logic
  • All that's left is combinational logic design
  • This is not an optimized design, but we see the
    basic steps

35
RT-level Custom Single-Purpose Processor Design
  • We often start with a state machine rather than
    a program
  • The cycle-by-cycle timing of a system is often
    central to its design, but programming languages
    don't typically support cycle-by-cycle
    description
  • Example
  • Bus bridge that converts 4-bit bus to 8-bit bus
  • One device (the sender) sends an 8-bit number to
    another device (the receiver)
  • The receiver can receive all 8 bits at once
  • The sender sends 4 bits at a time: it sends the
    low-order 4 bits, then the high-order 4 bits
  • Start with FSMD
  • Known as register-transfer (RT) level

36
RT-level Custom Single-Purpose Processor Design
Example
  • Example
  • Bus bridge that converts 4-bit bus to 8-bit bus
  • Start with FSMD
  • Known as register-transfer (RT) level

37
RT-Level Custom Single-Purpose Processor Design
Example (Cont.)
[Figure: Bridge. Ports: rdy_in, rdy_out, clk, data_in(4), data_out.]
[Figure (a): Controller. Drives data_lo_ld, data_hi_ld, and data_out_ld.]
[Figure (b): Datapath. Registers data_lo, data_hi, and data_out; clk goes to all registers.]
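A possible FSMD for the bridge can be sketched in C. The slides do not spell out the handshake, so this sketch assumes the sender raises rdy_in while a valid 4-bit value is on data_in and lowers it between the two halves, and that the bridge raises rdy_out for one cycle once the 8-bit value is assembled. The state names are hypothetical; the register names follow the figure labels.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state names for this sketch. */
typedef enum {
    WAIT_FIRST4, REC_FIRST4, WAIT_SECOND4, REC_SECOND4, SEND8
} bridge_state;

typedef struct {
    bridge_state state;
    uint8_t data_lo;   /* low-order 4 bits, received first */
    uint8_t data_hi;   /* high-order 4 bits, received second */
    uint8_t data_out;  /* assembled 8-bit value */
    bool rdy_out;      /* 1 for one cycle when data_out is valid */
} bridge_t;

/* One clock cycle of the bridge FSMD. */
void bridge_clock(bridge_t *b, bool rdy_in, uint8_t data_in) {
    b->rdy_out = false;
    switch (b->state) {
    case WAIT_FIRST4:
        if (rdy_in) { b->data_lo = data_in & 0xF; b->state = REC_FIRST4; }
        break;
    case REC_FIRST4:   /* wait for the sender to lower rdy_in */
        if (!rdy_in) b->state = WAIT_SECOND4;
        break;
    case WAIT_SECOND4:
        if (rdy_in) { b->data_hi = data_in & 0xF; b->state = REC_SECOND4; }
        break;
    case REC_SECOND4:
        if (!rdy_in) b->state = SEND8;
        break;
    case SEND8:
        b->data_out = (uint8_t)((b->data_hi << 4) | b->data_lo);
        b->rdy_out = true;
        b->state = WAIT_FIRST4;
        break;
    }
}
```

Sending 0x7 and then 0xA, with rdy_in lowered after each half, leaves 0xA7 in data_out with rdy_out raised for one cycle.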
38
Optimizing Custom Single-Purpose Processors
  • Optimization is the task of making design metric
    values the best possible
  • Optimization opportunities
  • original program
  • FSMD
  • datapath
  • FSM

39
Optimizing the Original Program
  • Analyze program attributes and look for areas of
    possible improvement
  • number of computations
  • size of variables
  • time and space complexity
  • operations used
  • multiplication and division are very expensive
  • The choice of algorithm can have perhaps the
    biggest impact on the efficiency of the desired
    processor.

40
Optimizing the Original Program (cont.)
original program
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }
optimized program
0: int x, y, r;
1: while (1) {
2:   while (!go_i);
     // x must be the larger number
3:   if (x_i > y_i) {
4:     x = x_i;
5:     y = y_i;
     }
6:   else {
7:     x = y_i;
8:     y = x_i;
     }
9:   while (y != 0) {
10:    r = x % y;
11:    x = y;
12:    y = r;
     }
13:  d_o = x;
   }
Replace the subtraction operation(s) with a modulo
operation in order to speed up the program.
Original program: GCD(42, 8) takes 9 iterations to complete the loop. The x
and y values are evaluated as follows: (42, 8), (34,
8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4),
(2, 2).
Optimized program: GCD(42, 8) takes 3 iterations to complete the loop. The x
and y values are evaluated as follows: (42, 8),
(8, 2), (2, 0).
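The iteration counts can be reproduced with a short C sketch. It counts how many (x, y) pairs the loop condition evaluates, which is how the figures of 9 and 3 above are obtained (the final pair, where the loop exits, is included). The names gcd_trace, gcd_subtract, and gcd_modulo are invented here, and gcd_subtract assumes positive inputs.

```c
typedef struct {
    unsigned gcd;         /* result */
    unsigned loop_tests;  /* number of (x, y) pairs the loop evaluated */
} gcd_trace;

/* Subtraction-based version; requires x > 0 and y > 0. */
gcd_trace gcd_subtract(unsigned x, unsigned y) {
    gcd_trace t = { 0, 0 };
    for (;;) {
        t.loop_tests++;        /* loop condition evaluated on (x, y) */
        if (x == y) break;
        if (x < y) y = y - x;
        else       x = x - y;
    }
    t.gcd = x;
    return t;
}

/* Modulo-based version; x is made the larger number first. */
gcd_trace gcd_modulo(unsigned x_i, unsigned y_i) {
    unsigned x, y;
    gcd_trace t = { 0, 0 };
    if (x_i > y_i) { x = x_i; y = y_i; }
    else           { x = y_i; y = x_i; }
    for (;;) {
        t.loop_tests++;        /* loop condition evaluated on (x, y) */
        if (y == 0) break;
        unsigned r = x % y;
        x = y;
        y = r;
    }
    t.gcd = x;
    return t;
}
```

For the inputs 42 and 8, gcd_subtract reports 9 evaluated pairs and gcd_modulo reports 3, and both return 2.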
41
Optimizing the FSMD
  • Scheduling
  • The task of assigning operations from the
    original program to states in an FSMD
  • The scheduling obtained using the template-based
    method can be improved
  • Areas of possible improvements
  • merge states
  • states with constants on transitions can be
    eliminated, since the transition taken is already known
  • states with independent operations can be merged
    (e.g., x = x_i, y = y_i)
  • separate states
  • states which require complex operations (e.g.,
    a = b * c * d) can be broken into smaller states to
    reduce hardware size
  • A designer must be aware of whether output timing
    may or may not be modified

42
Optimizing the FSMD (cont.)
original FSMD
optimized FSMD
eliminate state 1: transitions have constant
values
merge state 2 and state 2-J: no loop operation in
between them
merge state 3 and state 4: assignment operations
are independent of one another
merge state 5 and state 6: transitions from
state 6 can be done in state 5
eliminate states 5-J and 6-J: transitions from each
state can be done from state 7 and state 8,
respectively
eliminate state 1-J: transition from state 1-J
can be done directly from state 9
43
Optimizing the Datapath
  • Sharing of functional units
  • one-to-one mapping, as done previously, is not
    necessary
  • e.g., a subtractor for x-y and another subtractor
    for y-x
  • if the same operation occurs in different states,
    they can share a single functional unit
  • e.g., use a single subtractor, with multiplexors to
    choose whether its inputs are x and y, or instead y
    and x
  • Multi-functional units
  • ALUs support a variety of operations, so an ALU
    can be shared among operations occurring in
    different states
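Sharing one subtractor between x - y and y - x can be sketched in C as two 2-to-1 multiplexors feeding a single subtraction. The control input is called swap here, which is an assumed name, as is shared_subtract.

```c
#include <stdbool.h>
#include <stdint.h>

/* One subtractor shared by two operations.
   swap = 0 computes x - y, swap = 1 computes y - x. */
uint32_t shared_subtract(uint32_t x, uint32_t y, bool swap) {
    uint32_t a = swap ? y : x;  /* multiplexor on the first operand */
    uint32_t b = swap ? x : y;  /* multiplexor on the second operand */
    return a - b;               /* the single subtractor */
}
```

The trade-off is two extra multiplexors and one extra control signal in exchange for removing a second subtractor, which is usually the larger component.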

44
Optimizing the FSM
  • Designing a sequential circuit to implement an
    FSM also provides some opportunities for
    optimization
  • State encoding
  • task of assigning a unique bit pattern to each
    state in an FSM
  • size of state register and combinational logic
    vary for different encodings
  • can be treated as an ordering problem
  • State minimization
  • task of merging equivalent states into a single
    state
  • two states are equivalent if, for all possible
    input combinations, they generate the same
    outputs and transition to the same next state

45
Summary
  • Designing a custom single-purpose processor
    requires an understanding of various aspects of
    digital design.
  • Design of a circuit to implement Boolean
    functions
  • Combinational design
  • Building a truth table
  • Optimizing the output functions
  • Drawing a circuit
  • Design of a circuit to implement a state diagram
  • Sequential design
  • Drawing an implementation model with a state
    register and a combinational logic block
  • Assigning a binary encoding to each state
  • Drawing a state table
  • Repeating the combinational design process for
    this table

46
Summary (Cont.)
  • Design of a single-purpose processor circuit to
    implement a program
  • Schedule the program's statements into a complex
    state diagram (FSMD)
  • Construct a datapath
  • Create a new state diagram (FSM) that replaces
    complex actions and conditions by datapath
    control operations
  • Design a controller circuit for the new state
    diagram using sequential design
  • Much optimization can be performed at each level
    of design
  • CAD tools can be of great assistance