Chapter 2 Custom Single-Purpose Processors

Outline

- Introduction
- Combinational Logic
- Sequential Logic
- Custom Single-Purpose Processor Design
- RT-level Custom Single-Purpose Processor Design
- Optimizing Custom Single-Purpose Processors
- Summary

Introduction

- Processor
- Is a digital circuit that performs computation tasks
- Consists of a controller and a datapath
- General-purpose: can perform a variety of computation tasks
- Single-purpose: performs one particular computation task
- Custom single-purpose: performs a non-standard task
- A custom single-purpose processor may be
- Fast, small, low power
- But high NRE cost, longer time-to-market, less flexible

Combinational Logic

- Transistor
- The basic electrical component in digital systems
- Acts as an on/off switch
- Voltage at the gate controls whether current flows from source to drain
- Don't confuse this gate with a logic gate
- CMOS transistor on silicon

CMOS Transistor Implementations

- Complementary Metal Oxide Semiconductor
- We refer to logic levels
- Typically 0 is 0V, 1 is 5V
- Two basic CMOS types
- nMOS conducts if gate = 1
- pMOS conducts if gate = 0
- Hence "complementary"
- Basic gates are built from the two basic CMOS types
- Inverter, NAND, NOR

Basic Logic Gates

- Each gate is represented symbolically, with a Boolean equation, and with a truth table.
- Driver: F = x
- AND: F = x y
- OR: F = x + y
- XOR: F = x ⊕ y
- Inverter: F = x'
- NAND: F = (x y)'
- NOR: F = (x + y)'

Basic Combinational Logic Design

- Combinational circuit
- Is a digital circuit whose output is purely a function of its present inputs
- Has no memory of past inputs
- Simple technique to design a combinational circuit from basic logic gates
- Problem description
- Truth table
- Output equations
- Minimized output equations (by Karnaugh maps)
- Draw the circuit diagram

Combinational Logic Design

A) Problem description: y is 1 if a is 1, or if b and c are 1. z is 1 if b or c is 1, but not both, or if all are 1.
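
The problem description can be checked against a candidate set of minimized equations by walking the whole truth table. The sketch below is illustrative only: the function names are made up here, and the minimized form z = ab + b'c + bc' is an assumed result of the Karnaugh-map step, which the loop verifies rather than derives.

```c
#include <stdbool.h>

/* Direct transcription of the problem description. */
bool y_spec(bool a, bool b, bool c) { return a || (b && c); }
bool z_spec(bool a, bool b, bool c) { return (b != c) || (a && b && c); }

/* Assumed minimized sum-of-products form for z: ab + b'c + bc'. */
bool z_min(bool a, bool b, bool c) {
    return (a && b) || (!b && c) || (b && !c);
}

/* Returns true if z_min agrees with z_spec on all 8 truth-table rows. */
bool z_forms_match(void) {
    for (int row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1, b = (row >> 1) & 1, c = row & 1;
        if (z_spec(a, b, c) != z_min(a, b, c)) return false;
    }
    return true;
}
```

An exhaustive check like this is practical only because a three-input function has just eight rows.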

RT-Level Combinational Components

- Combinational components are often called register-transfer, or RT, level components
- Multiplexor (selector)
- Decoder
- Adder
- Comparator
- Arithmetic-logic unit (ALU)
- Shifter

Combinational Components

- Multiplexor: O = I0 if S = 0..00; I1 if S = 0..01; ...; I(m-1) if S = 1..11
- Decoder: O0 = 1 if I = 0..00; O1 = 1 if I = 0..01; ...; O(n-1) = 1 if I = 1..11. With enable input e: all Os are 0 if e = 0
- Adder: sum = A + B (first n bits); carry = (n+1)th bit of A + B. With carry-in input Ci: sum = A + B + Ci
- Comparator: less = 1 if A < B; equal = 1 if A = B; greater = 1 if A > B
- ALU: O = A op B; op determined by S. May have status outputs carry, zero, etc.

Multiplexor (Selector)

- Allows only one of its data inputs to pass through to the output
- m-by-1 multiplexor: m data inputs, 1 data output
- n-bit multiplexor
- Each data input as well as the output consists of n lines
- n is independent of the number of select lines
- 4-bit 8x1 multiplexor
- If I6 = 1110 and S selects I6, then the output would be 1110

O = I0 if S = 0..00; I1 if S = 0..01; ...; I(m-1) if S = 1..11
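
As a behavioral sketch of the 4-bit 8x1 case, the multiplexor can be modeled as an array lookup. The function name and the choice to mask values to 4 bits are assumptions made for this illustration.

```c
#include <stdint.h>

/* 4-bit 8x1 multiplexor: passes the data input chosen by the 3 select lines.
 * Each data input and the output are 4 bits wide (kept in the low nibble). */
uint8_t mux8x1_4bit(const uint8_t inputs[8], unsigned s) {
    return inputs[s & 7u] & 0x0Fu;
}
```

With inputs[6] holding binary 1110 and s = 6, the function returns binary 1110, matching the example on the slide.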

Decoder

- Converts its binary input I into a one-hot output O
- log2(n) x n decoder
- An extra input called enable
- When enable is 0, all outputs are 0

O0 = 1 if I = 0..00; O1 = 1 if I = 0..01; ...; O(n-1) = 1 if I = 1..11

With enable input e: all Os are 0 if e = 0
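
A minimal behavioral sketch of a 3x8 decoder with enable follows. The name `decoder3x8` and the packing of the eight outputs into one byte (bit k is output Ok) are assumptions for illustration.

```c
#include <stdint.h>

/* 3x8 decoder: returns a one-hot byte with bit i set, or all zeros when e == 0. */
uint8_t decoder3x8(unsigned i, unsigned e) {
    if (!e) return 0;                 /* enable low: every output is 0 */
    return (uint8_t)(1u << (i & 7u)); /* exactly one output is 1 */
}
```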

Adder

- Adds two n-bit binary inputs A and B, generating an n-bit output sum along with an output carry

sum = A + B (first n bits); carry = (n+1)th bit of A + B

With carry-in input Ci: sum = A + B + Ci
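
For n = 8, the sum and carry definitions can be sketched by adding in a wider type and splitting the result. The struct and function names are invented for this example.

```c
#include <stdint.h>

typedef struct {
    uint8_t sum;   /* first n = 8 bits of A + B + Ci */
    uint8_t carry; /* the (n+1)th bit of A + B + Ci */
} AdderOut;

/* 8-bit adder with carry-in. */
AdderOut adder8(uint8_t a, uint8_t b, uint8_t ci) {
    unsigned full = (unsigned)a + (unsigned)b + (ci & 1u);
    AdderOut out = { (uint8_t)(full & 0xFFu), (uint8_t)((full >> 8) & 1u) };
    return out;
}
```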

Comparator

- Compares two n-bit binary inputs A and B, generating outputs that indicate whether A is less than, equal to, or greater than B

less = 1 if A < B; equal = 1 if A = B; greater = 1 if A > B
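
The three comparator outputs can be sketched for n = 8 as follows; the type and function names are assumptions for illustration. Exactly one output is 1 for any pair of inputs.

```c
#include <stdint.h>

typedef struct {
    unsigned less, equal, greater;
} CmpOut;

/* 8-bit unsigned comparator. */
CmpOut comparator8(uint8_t a, uint8_t b) {
    CmpOut out = { a < b, a == b, a > b };
    return out;
}
```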

Arithmetic-logic unit (ALU)

- Performs a variety of arithmetic and logic functions on its two n-bit binary inputs A and B
- Select lines S choose the current function
- Common functions: addition, subtraction, AND, OR

O = A op B; op determined by S

May have status outputs carry, zero, etc.
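
A small behavioral sketch of an 8-bit ALU with the four common functions is shown below. The operation encoding (`AluOp`), the struct layout, and the convention that `carry` reports a borrow for subtraction are all assumptions chosen for this illustration.

```c
#include <stdint.h>

typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR } AluOp; /* select lines S */

typedef struct {
    uint8_t o;     /* O = A op B */
    uint8_t carry; /* status: carry out (borrow for ALU_SUB) */
    uint8_t zero;  /* status: 1 when O is 0 */
} AluOut;

AluOut alu8(uint8_t a, uint8_t b, AluOp s) {
    unsigned wide = 0;
    switch (s) {
    case ALU_ADD: wide = (unsigned)a + (unsigned)b; break;
    case ALU_SUB: wide = (unsigned)a - (unsigned)b; break; /* wraps on borrow */
    case ALU_AND: wide = (unsigned)(a & b); break;
    case ALU_OR:  wide = (unsigned)(a | b); break;
    }
    AluOut out;
    out.o = (uint8_t)(wide & 0xFFu);
    out.carry = (uint8_t)((wide >> 8) & 1u);
    out.zero = (uint8_t)(out.o == 0);
    return out;
}
```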

Sequential Logic

- Sequential circuit
- Is a digital circuit whose outputs are a function of the present as well as previous input values
- Has memory
- Basic sequential circuit: flip-flop
- Stores a single bit
- D flip-flop
- SR flip-flop
- JK flip-flop

RT-Level Sequential Components

- Register
- Shift register
- Counter

Sequential Components

- Register: Q = 0 if clear = 1; I if load = 1 and clock = 1; Q(previous) otherwise
- Shift register: Q = lsb; content shifted; I stored in msb
- Counter: Q = 0 if clear = 1; Q(prev) + 1 if count = 1 and clock = 1

Register

- Stores n bits from its n-bit data input I, with those stored bits appearing at its output Q
- Parallel-load register
- All n bits of the register can be stored in parallel

Q = 0 if clear = 1; I if load = 1 and clock = 1; Q(previous) otherwise
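
The register's behavior can be sketched as a function called once per active clock edge. The names are invented here, and treating clear as sampled on the clock edge (rather than asynchronous) is a simplifying assumption.

```c
#include <stdint.h>

typedef struct {
    uint8_t q; /* stored bits, visible at output Q */
} Reg8;

/* One clock edge of an 8-bit parallel-load register. */
void reg8_clock(Reg8 *r, uint8_t i, unsigned load, unsigned clear) {
    if (clear) {
        r->q = 0;
    } else if (load) {
        r->q = i; /* all 8 bits stored in parallel */
    }
    /* otherwise Q keeps its previous value */
}
```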

Shift Register

- Stores n bits, but these bits cannot be stored in parallel. Instead, they must be shifted into the register serially, meaning one bit per clock edge.
- 1-bit data input I, with I stored in MSB, content shifted, LSB shifted out and appearing at its output Q

Q = lsb; content shifted; I stored in msb
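
A 4-bit version of this behavior can be sketched as below. The width and the names are assumptions for illustration; the input bit enters at the MSB and the output Q is read from the LSB.

```c
#include <stdint.h>

typedef struct {
    uint8_t bits; /* 4 stored bits, bit 3 = MSB, bit 0 = LSB */
} ShiftReg4;

/* One clock edge: contents shift toward the LSB and I is stored in the MSB. */
void shift4_clock(ShiftReg4 *s, unsigned i) {
    s->bits = (uint8_t)(((s->bits >> 1) | ((i & 1u) << 3)) & 0x0Fu);
}

/* Output Q is the current LSB. */
unsigned shift4_q(const ShiftReg4 *s) {
    return s->bits & 1u;
}
```

A bit shifted in takes four clock edges in total to appear at Q in this 4-bit sketch.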

Counter

- A register that can also increment
- A common counter feature is both up and down counting (incrementing and decrementing), requiring an additional control input

Q = 0 if clear = 1; Q(prev) + 1 if count = 1 and clock = 1
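
The counter, including the optional up/down feature, can be sketched as follows. The extra `down` input and the 8-bit wrap-around behavior are assumptions made for this illustration.

```c
#include <stdint.h>

typedef struct {
    uint8_t q;
} Counter8;

/* One clock edge of an 8-bit up/down counter.
 * `down` is the hypothetical additional control input: 0 counts up, 1 counts down. */
void counter8_clock(Counter8 *c, unsigned clear, unsigned count, unsigned down) {
    if (clear) {
        c->q = 0;
    } else if (count) {
        c->q = (uint8_t)(down ? c->q - 1 : c->q + 1); /* wraps modulo 256 */
    }
}
```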

Sequential Logic Design

- Problem description
- Translate to a state diagram, called a finite state machine (FSM)
- Implement FSM
- Using a register to store the current state, and combinational logic to generate the output values and the next state
- State table
- Assign to each state a unique binary value, and create a truth table for the combinational logic
- Minimized output equations (by Karnaugh maps)
- Draw the combinational logic circuit

Sequential Logic Design (Cont.)

A) Problem description: You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.

- Given this implementation model
- Sequential logic design quickly reduces to combinational logic design
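
The implementation model can be sketched with a 2-bit state register plus purely combinational next-state and output logic. This is a simplified illustration under stated assumptions: it has no enable input, the encoding is plain binary, and the equations n1 = q1 XOR q0, n0 = NOT q0, x = q1 AND q0 are one possible result of the design steps, not necessarily the ones on the original slide.

```c
typedef struct {
    unsigned state; /* 2-bit state register, values 0..3 */
} ClockDivider;

/* One clock edge. Returns output x, which is 1 in one of every four cycles. */
unsigned clkdiv_step(ClockDivider *d) {
    unsigned q1 = (d->state >> 1) & 1u;
    unsigned q0 = d->state & 1u;
    unsigned n1 = q1 ^ q0;   /* next-state logic, high bit */
    unsigned n0 = q0 ^ 1u;   /* next-state logic, low bit */
    d->state = (n1 << 1) | n0;
    return ((d->state >> 1) & d->state) & 1u; /* output logic: x = q1 AND q0 */
}
```

Starting from state 0, successive calls return 0, 0, 1, 0 and then repeat.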

Sequential Logic Design (cont.)

Custom Single-Purpose Processor Design

- A basic processor consists of a controller and a datapath
- Datapath
- Stores and manipulates a system's data
- Contains register units, functional units, and connection units like wires and multiplexors
- Controller
- Configures the datapath by setting its control inputs, such as register load and multiplexor select signals

Custom Single-Purpose Processor Basic Model

[Figure: controller and datapath. The controller takes external control inputs and datapath control outputs, and produces external control outputs and datapath control inputs. The datapath takes external data inputs and produces external data outputs. A second panel gives a view inside the controller (next-state and control logic, state register) and the datapath (registers, functional units).]

Example Greatest Common Divisor

- Building a single-purpose processor implementing the GCD program
- First create algorithm
- Convert algorithm to complex state machine
- Divide the functionality into a datapath part and a controller part
- Construct the datapath
- Construct the controller
- Perform optimizations to datapath and controller

Example Greatest Common Divisor (Cont.)

- First create algorithm
- Convert algorithm to complex state machine
- Known as FSMD: finite-state machine with datapath
- Can use templates to perform such conversion

(b) desired functionality

    0: int x, y;
    1: while (1) {
    2:   while (!go_i);
    3:   x = x_i;
    4:   y = y_i;
    5:   while (x != y) {
    6:     if (x < y)
    7:       y = y - x;
           else
    8:       x = x - y;
         }
    9:   d_o = x;
       }

(c) state diagram

[Figure: state diagram with one state per numbered statement above.]

State Diagram Templates

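Assignment statement

    a = b
    next statement

Loop statement

    while (cond) {
      loop-body-statements
    }
    next statement

Branch statement

    if (c1)
      c1 stmts
    else if (c2)
      c2 stmts
    else
      other stmts
    next statement

[Figure: state diagram templates. In the branch template, a condition state C goes to the "c1 stmts" states on c1, to the "c2 stmts" states on !c1*c2, and to the "others" states on !c1*!c2. All paths meet at a join state J, which leads to the next statement.]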

Creating the Datapath

- Create a register for any declared variable
- Create a functional unit for each arithmetic operation
- Connect the ports, registers and functional units
- Based on reads and writes
- Use multiplexors for multiple sources
- Create a unique identifier for each datapath component control input and output

Creating the Controller's FSM

- Same structure as FSMD
- Replace complex actions/conditions with datapath configurations

Splitting into a Controller and Datapath

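[Figure: controller FSM for the GCD example, with external input go_i. States, encodings, datapath control actions, and transition conditions recovered from the figure:]

- State 1 (0000): branches on the constant conditions 1 and !1
- State 2 (0001): branches on !go_i and !(!go_i)
- State 2-J (0010)
- State 3 (0011): x_sel = 0, x_ld = 1
- State 4 (0100): y_sel = 0, y_ld = 1
- State 5 (0101): branches on x_neq_y = 1 and x_neq_y = 0
- State 6 (0110): branches on x_lt_y = 1 (to state 7) and x_lt_y = 0 (to state 8)
- State 7 (0111): y_sel = 1, y_ld = 1
- State 8 (1000): x_sel = 1, x_ld = 1
- State 6-J (1001)
- State 5-J (1010)
- State 9 (1011): d_ld = 1
- State 1-J (1100)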

Controller State Table for the GCD Example

Completing the GCD Custom Single-Purpose Processor Design

- We finished the datapath
- We have a state table for the next state and control logic
- All that's left is combinational logic design
- This is not an optimized design, but we see the basic steps
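
To see how the unoptimized controller and datapath cooperate, the design can be simulated one clock cycle at a time. The sketch below is an illustration under assumptions: 8-bit registers, the C identifiers, the `gcd_run` helper that holds go_i at 1, and the join-state transitions, which are inferred from the loop and branch templates rather than read off the original figure. Inputs are assumed positive, since the subtraction algorithm does not terminate when one input is 0.

```c
#include <stdint.h>

typedef enum { S1, S2, S2J, S3, S4, S5, S6, S7, S8, S6J, S5J, S9, S1J } State;

typedef struct {
    State state;     /* controller state register */
    uint8_t x, y, d; /* datapath registers */
} Gcd;

/* Advance the controller and datapath by one clock cycle. */
void gcd_clock(Gcd *g, unsigned go_i, uint8_t x_i, uint8_t y_i) {
    /* Datapath status signals read by the controller. */
    unsigned x_neq_y = (g->x != g->y);
    unsigned x_lt_y = (g->x < g->y);
    /* Controller outputs (datapath control inputs), default inactive. */
    unsigned x_sel = 0, x_ld = 0, y_sel = 0, y_ld = 0, d_ld = 0;
    State next = g->state;

    switch (g->state) {
    case S1:  next = S2; break;                  /* while (1): condition is constant */
    case S2:  next = go_i ? S3 : S2J; break;     /* wait for go_i */
    case S2J: next = S2; break;
    case S3:  x_sel = 0; x_ld = 1; next = S4; break;   /* x = x_i */
    case S4:  y_sel = 0; y_ld = 1; next = S5; break;   /* y = y_i */
    case S5:  next = x_neq_y ? S6 : S5J; break;
    case S6:  next = x_lt_y ? S7 : S8; break;
    case S7:  y_sel = 1; y_ld = 1; next = S6J; break;  /* y = y - x */
    case S8:  x_sel = 1; x_ld = 1; next = S6J; break;  /* x = x - y */
    case S6J: next = S5; break;
    case S5J: next = S9; break;
    case S9:  d_ld = 1; next = S1J; break;             /* d_o = x */
    case S1J: next = S1; break;
    }

    /* Datapath: multiplexors pick each register's source, loads commit it. */
    uint8_t x_next = x_sel ? (uint8_t)(g->x - g->y) : x_i;
    uint8_t y_next = y_sel ? (uint8_t)(g->y - g->x) : y_i;
    if (d_ld) g->d = g->x;
    if (x_ld) g->x = x_next;
    if (y_ld) g->y = y_next;
    g->state = next;
}

/* Hypothetical helper: clock the processor until the result register is loaded. */
uint8_t gcd_run(uint8_t x_i, uint8_t y_i) {
    Gcd g = { S1, 0, 0, 0 };
    do {
        gcd_clock(&g, 1, x_i, y_i);
    } while (g.state != S1J);
    return g.d;
}
```

The switch statement plays the role of the next-state and control logic; in hardware it would be the combinational block derived from the state table.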

RT-level Custom Single-Purpose Processor Design

- We often start with a state machine
- rather than a program
- since the cycle-by-cycle timing of a system is central to the system, but programming languages don't typically support cycle-by-cycle description
- Example
- Bus bridge that converts 4-bit bus to 8-bit bus
- One device (the sender) sends an 8-bit number to another device (the receiver)
- The receiver can receive all 8 bits at once
- The sender sends 4 bits at a time: it sends the low-order 4 bits, then the high-order 4 bits
- Start with FSMD
- Known as register-transfer (RT) level

RT-level Custom Single-Purpose Processor Design Example

- Example
- Bus bridge that converts 4-bit bus to 8-bit bus
- Start with FSMD
- Known as register-transfer (RT) level

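RT-Level Custom Single-Purpose Processor Design Example (Cont.)

[Figure: Bridge. (a) Controller, with signals rdy_in, rdy_out, and clk, and datapath control signals data_lo_ld, data_hi_ld, and data_out_ld. (b) Datapath, with 4-bit input data_in, registers data_lo, data_hi, and data_out, output data_out, and clk routed to all registers.]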
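
One way to picture the bridge's FSMD is as a small state machine stepped once per clock. This is a sketch under assumptions: the state names, the handshake (rdy_in is raised for each 4-bit transfer and lowered in between), and the single-cycle rdy_out pulse are illustrative choices, not taken from the original figure.

```c
#include <stdint.h>

typedef enum {
    WAIT_FIRST4, REC_FIRST4_END, WAIT_SECOND4, REC_SECOND4_END, SEND8
} BridgeState;

typedef struct {
    BridgeState state;
    uint8_t data_lo, data_hi; /* 4-bit registers */
    uint8_t data_out;         /* 8-bit output register */
    unsigned rdy_out;
} Bridge;

/* One clock cycle of the bridge. */
void bridge_clock(Bridge *b, unsigned rdy_in, uint8_t data_in) {
    b->rdy_out = 0;
    switch (b->state) {
    case WAIT_FIRST4:     /* low-order 4 bits arrive first */
        if (rdy_in) { b->data_lo = data_in & 0x0Fu; b->state = REC_FIRST4_END; }
        break;
    case REC_FIRST4_END:  /* wait for the sender to drop rdy_in */
        if (!rdy_in) b->state = WAIT_SECOND4;
        break;
    case WAIT_SECOND4:    /* then the high-order 4 bits */
        if (rdy_in) { b->data_hi = data_in & 0x0Fu; b->state = REC_SECOND4_END; }
        break;
    case REC_SECOND4_END:
        if (!rdy_in) b->state = SEND8;
        break;
    case SEND8:           /* present all 8 bits to the receiver at once */
        b->data_out = (uint8_t)((b->data_hi << 4) | b->data_lo);
        b->rdy_out = 1;
        b->state = WAIT_FIRST4;
        break;
    }
}
```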

Optimizing Custom Single-Purpose Processors

- Optimization is the task of making design metric values the best possible
- Optimization opportunities
- original program
- FSMD
- datapath
- FSM

Optimizing the Original Program

- Analyze program attributes and look for areas of possible improvement
- number of computations
- size of variables
- time and space complexity
- operations used
- multiplication and division very expensive
- The choice of algorithm can have perhaps the biggest impact on the efficiency of the desired processor

Optimizing the Original Program (cont.)

original program

    0: int x, y;
    1: while (1) {
    2:   while (!go_i);
    3:   x = x_i;
    4:   y = y_i;
    5:   while (x != y) {
    6:     if (x < y)
    7:       y = y - x;
           else
    8:       x = x - y;
         }
    9:   d_o = x;
       }

GCD(42, 8): 9 iterations to complete the loop. x and y values evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).

optimized program

    0: int x, y, r;
    1: while (1) {
    2:   while (!go_i);
         // x must be the larger number
    3:   if (x_i > y_i) {
    4:     x = x_i;
    5:     y = y_i;
         }
    6:   else {
    7:     x = y_i;
    8:     y = x_i;
         }
    9:   while (y != 0) {
    10:    r = x % y;
    11:    x = y;
    12:    y = r;
         }
    13:  d_o = x;
       }

Replace the subtraction operation(s) with the modulo operation in order to speed up the program.

GCD(42, 8): 3 iterations to complete the loop. x and y values evaluated as follows: (42, 8), (8, 2), (2, 0).
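
The iteration counts quoted for GCD(42, 8) can be reproduced with a short sketch. The function names are invented here, and the counting convention is an assumption chosen to match the slide: each (x, y) pair the loop condition evaluates counts as one, including the final pair that ends the loop. Both functions assume positive inputs.

```c
/* Counts the (x, y) pairs evaluated by the subtraction-based loop. */
unsigned gcd_sub_steps(unsigned x, unsigned y) {
    unsigned steps = 1; /* the initial pair */
    while (x != y) {
        if (x < y) y = y - x;
        else x = x - y;
        steps++;
    }
    return steps;
}

/* Counts the (x, y) pairs evaluated by the modulo-based loop. */
unsigned gcd_mod_steps(unsigned x_i, unsigned y_i) {
    unsigned x, y, r, steps = 1;
    if (x_i > y_i) { x = x_i; y = y_i; } /* x must be the larger number */
    else { x = y_i; y = x_i; }
    while (y != 0) {
        r = x % y;
        x = y;
        y = r;
        steps++;
    }
    return steps;
}
```

For inputs 42 and 8, `gcd_sub_steps` returns 9 and `gcd_mod_steps` returns 3. The gap widens as the ratio between the inputs grows, although the modulo operation itself needs more expensive hardware than a subtractor.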

Optimizing the FSMD

- Scheduling
- The task of assigning operations from the original program to states in an FSMD
- The scheduling obtained using the template-based method can be improved
- Areas of possible improvements
- merge states
- states with constants on transitions can be eliminated, since the transition taken is already known
- states with independent operations can be merged (e.g., x = x_i, y = y_i)
- separate states
- states which require complex operations (e.g., a*b*c*d) can be broken into smaller states to reduce hardware size
- The designer must be aware of whether output timing may or may not be modified

Optimizing the FSMD (cont.)

[Figure: original FSMD and optimized FSMD.]

- eliminate state 1: transitions have constant values
- merge state 2 and state 2-J: no loop operation in between them
- merge state 3 and state 4: assignment operations are independent of one another
- merge state 5 and state 6: transitions from state 6 can be done in state 5
- eliminate state 5-J and state 6-J: transitions from each state can be done from state 7 and state 8, respectively
- eliminate state 1-J: transition from state 1-J can be done directly from state 9

Optimizing the Datapath

- Sharing of functional units
- one-to-one mapping, as done previously, is not necessary
- e.g., subtractor for x-y, subtractor for y-x
- if the same operation occurs in different states, they can share a single functional unit
- e.g., a single subtractor, with multiplexors to choose whether inputs are x and y, or instead y and x
- Multi-functional units
- ALUs support a variety of operations, so one ALU can be shared among operations occurring in different states
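
The shared-subtractor idea can be sketched as one subtraction fed by two multiplexors. The function name and the single `swap` select line are assumptions for this illustration.

```c
#include <stdint.h>

/* One subtractor shared by x - y and y - x.
 * swap = 0 computes x - y; swap = 1 computes y - x. */
uint8_t shared_sub(uint8_t x, uint8_t y, unsigned swap) {
    uint8_t left = swap ? y : x;    /* multiplexor on the left operand */
    uint8_t right = swap ? x : y;   /* multiplexor on the right operand */
    return (uint8_t)(left - right); /* the single subtractor */
}
```

The trade is two multiplexors in exchange for one subtractor, which is worthwhile when the functional unit is larger than the multiplexors that replace it.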

Optimizing the FSM

- Designing a sequential circuit to implement an FSM also provides some opportunities for optimization
- State encoding
- task of assigning a unique bit pattern to each state in an FSM
- size of state register and combinational logic vary for different encodings
- can be treated as an ordering problem
- State minimization
- task of merging equivalent states into a single state
- two states are equivalent if, for all possible input combinations, they generate the same outputs and transition to the same next state

Summary

- Designing a custom single-purpose processor requires understanding of various aspects of digital design
- Design of a circuit to implement Boolean functions
- Combinational design
- Building a truth table
- Optimizing the output functions
- Drawing a circuit
- Design of a circuit to implement a state diagram
- Sequential design
- Drawing an implementation model with a state register and a combinational logic block
- Assigning a binary encoding to each state
- Drawing a state table
- Repeating the combinational design process for this table

Summary (Cont.)

- Design of a single-purpose processor circuit to implement a program
- Schedule the program's statements into a complex state diagram (FSMD)
- Construct a datapath
- Create a new state diagram (FSM) that replaces complex actions and conditions by datapath control operations
- Design a controller circuit for the new state diagram using sequential design
- Much optimization can be performed at each level of design
- CAD tools can be of great assistance