Title: Chapter 5: Control Design
1 Chapter 5. Control Design
3 Two approaches for control unit design
- A hardwired control unit
  - a sequential logic circuit that generates specific, fixed sequences of control signals
  - its behavior can be changed only by redesign
4 - A microprogrammed control unit
  - built by organizing control signals into microinstructions
  - the signals are implemented by a kind of software (or firmware) rather than hardware
  - design change: change the contents of the control memory
  - emulation: a microprogrammed CPU can execute programs written in the machine language of other computers
  - Disadvantages
    - slower, due to microinstruction fetch
    - more costly, due to the presence of the control memory and its access circuits
5 5.1.2 Hardwired Control
- Design method 1: the classical method of sequential circuit design. A P-state circuit requires $\lceil \log_2 P \rceil$ flip-flops.
- Design method 2: the one-hot method, with one flip-flop per state. Expensive in terms of flip-flops, but it simplifies CU design and debugging.
- Example: GCD processor
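The flip-flop counts of the two design methods can be compared with a minimal sketch. The function names are illustrative only, and the state counts in the loop are arbitrary examples rather than values from the slides.

```python
def flip_flops_classical(num_states: int) -> int:
    """Binary state encoding: ceil(log2 P) flip-flops (at least one)."""
    # (P - 1).bit_length() equals ceil(log2 P) for P >= 1, with no float rounding.
    return max(1, (num_states - 1).bit_length())


def flip_flops_one_hot(num_states: int) -> int:
    """One-hot state encoding: one flip-flop per state."""
    return num_states


for p in (4, 6, 16):
    print(p, flip_flops_classical(p), flip_flops_one_hot(p))
```

For the four-state GCD controller this gives 2 flip-flops for the classical method against 4 for one-hot.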
9 Classical method
State assignment: S0 = 00, S1 = 01, S2 = 10, and S3 = 11
11 Equations (5.9), (5.10), (5.11)
14 One-hot method
State assignment: S0 = 0001, S1 = 0010, S2 = 0100, and S3 = 1000
- The one-hot method is limited to a small number of states.
- The next-state and output equations have a simple and systematic form.
The one-hot design method:
1. Construct a P-row state table that defines the desired input-output behavior.
2. Associate a separate D-type flip-flop $D_i$ with each state $S_i$, and assign the P-bit one-hot binary code
   $D_1 D_2 \cdots D_{i-1} D_i D_{i+1} \cdots D_P = 0\,0 \cdots 0\,1\,0 \cdots 0$ to $S_i$.
3. Design a combinational circuit C that generates the next-state signals $\{D_i\}$ and the output signals $\{z_k\}$. $D_i$ is defined by the logic equation
   $D_i^{+} = \sum_{j=1}^{P} D_j \,(I_{j,1} + I_{j,2} + \cdots + I_{j,n_j})$
   where $I_{j,1}, I_{j,2}, \ldots, I_{j,n_j}$ denote all input combinations that cause a transition from $S_j$ to $S_i$. If $z_k = 1$ (active) only in rows $k_h$ for $h = 1, 2, \ldots, m_k$, then $z_k$ is defined by
   $z_k = D_{k_1} + D_{k_2} + \cdots + D_{k_{m_k}}$
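As a rough illustration of how one-hot next-state equations read directly off a state table, the sketch below simulates a small controller. The four-state machine, its inputs `start` and `done`, and the output `busy` are hypothetical and are not the GCD processor of the slides.

```python
# Hypothetical 4-state controller (an assumption, not taken from the slides):
#   S0 -> S1 when start = 1, otherwise stays in S0
#   S1 -> S2 unconditionally
#   S2 -> S3 when done = 1, otherwise back to S1
#   S3 -> S0 unconditionally

def next_state(d, start, done):
    """One OR-of-ANDs equation per flip-flop, in the one-hot form."""
    d0, d1, d2, d3 = d
    n0 = (d0 & (1 - start)) | d3          # D0+ = D0.start' + D3
    n1 = (d0 & start) | (d2 & (1 - done))  # D1+ = D0.start + D2.done'
    n2 = d1                                # D2+ = D1
    n3 = d2 & done                         # D3+ = D2.done
    return (n0, n1, n2, n3)


def busy(d):
    """Output active only in states S1 and S2: z = D1 + D2."""
    return d[1] | d[2]


state = (1, 0, 0, 0)  # reset into S0
for start, done in [(0, 0), (1, 0), (0, 0), (0, 0), (0, 0), (0, 1), (0, 0)]:
    state = next_state(state, start, done)
    print(state, "busy =", busy(state))
```

Each next-state line is a direct transcription of one column of the state table, which is the property that simplifies design and debugging.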
15 Design of hardwired control for a twos-complement (2C) multiplier
21 5.2 Microprogrammed Control
- Instruction: implemented by a sequence of one or more sets of concurrent micro-operations.
- Microprogramming: control-signal selection and sequencing information is stored in a ROM or RAM called a control memory (CM), and each microinstruction is fetched from the CM.
- Emulation: a microprogrammed computer C1 can execute programs written in the machine language L2 of some other computer C2 by placing an emulator for L2 in the CM of C1.
22 Wilkes design: microinstruction (μI)
23 How to decide the μI word length
1. The degree of parallelism required at the micro-operation level
2. How the control information is represented or encoded
3. How the next μI address is specified
- Parallelism in the μI
  - If every useful combination of parallel micro-operations were specified by a single opcode, the number of opcodes would be enormous and the decoder would be complicated.
  - Instead, divide the micro-operation specification part into k disjoint control fields, each of which can be performed simultaneously with the others.
  - In the IBM 360/50, the μI is 90 bits wide (21 partitioned control fields).
  - Wilkes design: a 1-bit control field for each control signal.
Un-encoded form (4-bit)
c0  c1  c2  c3   Micro-operation
1   0   0   0    R ← X0
0   1   0   0    R ← X1
0   0   1   0    R ← X2
0   0   0   1    R ← X3
0   0   0   0    No op
24 Encoded form (3-bit)
- n independent control signals need $\lceil \log_2(n+1) \rceil$ bits (one extra code for no-op)
- a decoder is needed
μI formats: horizontal vs. vertical
- horizontal form
  - long format
  - able to express a high degree of parallelism
  - little encoding of the control information
- vertical form
  - short format
  - limited ability to express parallelism
  - considerable encoding of the control information
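The width trade-off between the un-encoded and encoded forms can be sketched as follows. The code assignment (0 for no-op, k+1 for signal ck) is an assumption made for illustration; the slides do not fix a particular assignment.

```python
def unencoded_width(num_signals: int) -> int:
    """One bit per control signal."""
    return num_signals


def encoded_width(num_signals: int) -> int:
    """ceil(log2(n + 1)) bits: n signals plus one no-op code."""
    # For n >= 1, n.bit_length() equals ceil(log2(n + 1)).
    return num_signals.bit_length()


def decode(code: int, num_signals: int) -> list:
    """Model of the decoder: code 0 is no-op, code k + 1 activates signal ck."""
    return [1 if code == k + 1 else 0 for k in range(num_signals)]


print(unencoded_width(4), encoded_width(4))  # 4-bit vs. 3-bit field
print(decode(2, 4))                          # activates c1 only
```

The decoder output is one-hot or all zeros, so an encoded field can never activate two of its signals in the same μI; that restriction is the price of the shorter format.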
26 - μI addressing
  - use a μPC (as the primary source)
  - conditional branching
    - condition-select subfield
    - branch address: store a complete address field, or only the lower-order bits of the address, which restricts the range of a branch instruction to a small region of the CM
- Timing
  - monophase: a single clock pulse synchronizes all the control signals; control signals are active for the duration of the μI's execution cycle
  - polyphase: divide a clock cycle into phases, with each control signal active during one of the phases. This increases the complexity of the μI format (it must specify the phase in which each control signal is active).
27 Ex) Timing of a 4-phase μI (R ← R1 op R2)
29 A microprogram sequencer generates the μI addresses for the CM; it comprises the μPC and all the logic needed for next-address generation.
30 Minimizing the width of CM
- μIs: I1, I2, ..., In. Each activates a subset of the control signals C1, C2, ..., Cm.
- We want an encoding method that groups into one field only control signals that can't be activated at the same time.
- An encoded control field can activate only one control signal at a time. Two control signals can be included in the same control field if and only if they are never simultaneously activated by a μI.
31 Algorithm
1. Find the set of maximal compatibility classes (MCCs). Two control signals can be included in the same encoded control field iff they are never simultaneously activated by a μI, i.e. they are compatible: $c_{i1}$ and $c_{i2}$ are compatible if, for every μI $I_j$, $c_{i1} \in I_j$ implies $c_{i2} \notin I_j$, and vice versa. A compatibility class is a set of control signals that are pairwise compatible. An MCC is a compatibility class to which no control signal can be added without introducing a pair of incompatible control signals.
2. Determine all minimal MCC covers. A minimal MCC cover is a smallest set of MCCs that includes every control signal. (Note that a minimal MCC cover does not always yield a minimum value of the cost function W.)
3. For each minimal MCC cover, include each control signal in exactly one subset of some Ci, compute the cost W of the resulting solutions, and select one with the minimal cost.
32 Deriving MCC
- Denote by $S_i$ the set of compatibility classes $C_i$ such that $C_i$ contains i control signals $\{c_{ij}\}$.
- $S_1$ is simply the n original control signals.
- $S_i$ forms all possible i-member compatibility classes.
- Using $S_i$, construct $S_{i+1}$ as follows:
  - For each $C_i \in S_i$, add a control signal $c_{ik} \notin C_i$ to $C_i$ to form $C$.
  - If $C$ is a compatibility class, then add $C$ to $S_{i+1}$ and delete $C_i$ and all subsets of $C$ from $S_i$.
- Stop when $S_k = \emptyset$ for some $k \le n+1$.
- The MCCs are the classes remaining in $S_1 \cup S_2 \cup \cdots \cup S_{k-1}$.
- Example: find the minimum number of bits in the control fields.
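The level-by-level construction can be sketched in a few lines. The microinstruction table for the example did not survive in this transcript, so the four μIs below are an assumed set, chosen because they reproduce the MCC list a, cd, bde, bdh, deg, dgh, efg, fgh given on the cover-table slide; the function names are illustrative.

```python
# Assumed microinstruction set (not present in the transcript): each μI is the
# set of control signals it activates.
MICROINSTRUCTIONS = [set("abcg"), set("aceh"), set("adf"), set("bcf")]
SIGNALS = "abcdefgh"


def compatible(x, y, microinstructions):
    """Two signals are compatible if no μI activates both of them."""
    return not any(x in mi and y in mi for mi in microinstructions)


def maximal_compatibility_classes(signals, microinstructions):
    """Build S1, S2, ... and keep every class that cannot be extended."""
    level = {frozenset(s) for s in signals}      # S1: the single signals
    mccs = set()
    while level:                                  # stop when Sk is empty
        next_level, extended = set(), set()
        for cls in level:
            for s in signals:
                if s not in cls and all(
                    compatible(s, t, microinstructions) for t in cls
                ):
                    next_level.add(cls | {s})     # goes into S(i+1)
                    extended.add(cls)             # cls is not maximal
        mccs |= level - extended                  # survivors of Si are MCCs
        level = next_level
    return mccs


for cls in sorted(maximal_compatibility_classes(SIGNALS, MICROINSTRUCTIONS),
                  key=lambda c: (len(c), sorted(c))):
    print("".join(sorted(cls)))
```

A class survives at level i exactly when no further signal is compatible with all of its members, which is the definition of an MCC.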
33 Minimal MCC covers (similar to the prime-implicant covering problem)
Cover table: one row for each MCC Ci, one column for each control signal cij.
C1 = {a}, C2 = {c, d}, C3 = {b, d, e}, C4 = {b, d, h}, C5 = {d, e, g}, C6 = {d, g, h}, C7 = {e, f, g}, C8 = {f, g, h}
34 - Find the minimal MCC covers
Row and column deletion from a cover table:
1. Delete all essential MCCs and all columns covered by essential rows.
2. Delete all but one of any identical columns.
3. Delete all dominating columns.
4. Delete all dominated rows.
- After finding the two essential MCCs, C1 and C2, we can get the reduced cover table.
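For a table this small, the minimal covers can also be found by exhaustive search, which is a handy cross-check on the row and column deletion rules. The sketch below uses brute force rather than the reduction procedure itself, and takes the eight MCCs from the cover table as input.

```python
from itertools import combinations

# MCCs as listed in the cover table; each string is a set of control signals.
MCCS = {
    "C1": "a", "C2": "cd", "C3": "bde", "C4": "bdh",
    "C5": "deg", "C6": "dgh", "C7": "efg", "C8": "fgh",
}


def minimal_covers(mccs):
    """Return every smallest set of MCCs that together include all signals."""
    all_signals = set().union(*(set(v) for v in mccs.values()))
    names = sorted(mccs)
    for size in range(1, len(names) + 1):
        covers = [
            combo for combo in combinations(names, size)
            if set().union(*(set(mccs[n]) for n in combo)) == all_signals
        ]
        if covers:            # first size that works is the minimum
            return covers
    return []


for cover in minimal_covers(MCCS):
    print(cover)
```

Both covers found contain C1 and C2, in agreement with their being essential: a appears only in C1 and c only in C2.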
35 If {C1, C2, C4, C7} = {a}, {c, d}, {b, h}, {e, f, g}: width W = 1 + 2 + 2 + 2 = 7 bits
If {C1, C2, C4, C7} = {a}, {c}, {b, d, h}, {e, f, g}: width W = 1 + 1 + 2 + 2 = 6 bits
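The two widths follow from giving each encoded field of n signals $\lceil \log_2(n+1) \rceil$ bits and summing over the fields. A minimal sketch, with illustrative function names:

```python
def field_width(num_signals: int) -> int:
    """Bits for one encoded field: n signals plus a no-op code."""
    # For n >= 1, n.bit_length() equals ceil(log2(n + 1)).
    return num_signals.bit_length()


def control_width(fields) -> int:
    """Cost W: total width of all encoded control fields."""
    return sum(field_width(len(f)) for f in fields)


print(control_width(["a", "cd", "bh", "efg"]))   # d kept in C2
print(control_width(["a", "c", "bdh", "efg"]))   # d moved to C4
```

Moving d from the field {c, d} to {b, d, h} shrinks the first field to one bit without widening the second, which is why the same cover yields two different costs.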
36 Encoding by function
- A drawback of minimum-width control fields: functionally unrelated control signals are combined.
37 Multiple μ-instruction formats
- Branch μ-instructions, which specify no control signals.
- Action μ-instructions, with no branching capability.
- This approach is also used at the instruction level.
38 μ-program sequencer
- places all the circuitry required to generate μI addresses in a single IC, made possible by advances in VLSI
- a general-purpose building block for μ-programmed CUs
- simplifies CPU design
39 Nanoprogrammed Computer
[Figure: μ-programmed computer; labels: instruction, μPC, CM, μIR, control signals]
[Figure: nanoprogrammed computer; labels: instruction, nIR, control signals]
Criteria:
- Size of CM
- Speed: reduced (microprogramming needs one fetch, nanoprogramming two) due to the extra memory access and the more complex controller
The advantage of nanoprogramming is greater design flexibility.
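40 (Compare the size of CM)
Size of control memory in nanoprogramming:
- μCM: $H_m$ words of $W_m$ bits
- nCM: $H_n$ words of $W_n$ bits
- Total size: $S_2 = H_m W_m + H_n W_n$
Size of a comparable single-level CM: $S_1 = H_m W_m$, where $W_m$ is now the (wider) word of the single-level CM
Usually $H_m$ is large, $W_m$ small, $H_n$ small, and $W_n$ large (many microinstructions can use the same nanoprogrammed control word).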
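41 Big advantage of nanoprogramming: design flexibility
1-level CM: $S_1 = H_m(\log_2 H_m + N)$, where N is the number of control bits per word
Nanoprogramming: $S_2 = H_m(\log_2 H_m + \log_2 H_n) + H_n N$
Let $r = H_n / H_m$, the ratio of unique nano-control states to the total number of μ-control states for all instructions. Then $H_n = r H_m$ and
$S_2 = H_m(\log_2 H_m + \log_2 r H_m) + r H_m N = H_m(2 \log_2 H_m + \log_2 r + rN)$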
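42 Example) For the 68000 processor (N = 70, Hm = 650, r = 0.4), which approach is better?
1-level CM design: $S_1 = 650 \times (\lceil \log_2 650 \rceil + 70) = 650 \times 80 = 52000$ bits
Nanoprogramming: $S_2 = 650 \times (\lceil \log_2 650 \rceil + \lceil \log_2 260 \rceil) + 260 \times 70 = 650 \times 19 + 18200 = 30550$ bits
In this case nanoprogramming is better than microprogramming.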
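The comparison is easy to repeat for other parameter values. The sketch below assumes the size formulas $S_1 = H_m(\lceil \log_2 H_m \rceil + N)$ and $S_2 = H_m(\lceil \log_2 H_m \rceil + \lceil \log_2 H_n \rceil) + H_n N$, with each address field rounded up to a whole number of bits; the function names are illustrative.

```python
def ceil_log2(n: int) -> int:
    """Number of address bits needed to select one of n words."""
    return (n - 1).bit_length()


def single_level_size(hm: int, n_control: int) -> int:
    """S1: Hm words, each holding a next-address field and N control bits."""
    return hm * (ceil_log2(hm) + n_control)


def two_level_size(hm: int, hn: int, n_control: int) -> int:
    """S2: Hm short words (next address + nano address) plus Hn wide words."""
    return hm * (ceil_log2(hm) + ceil_log2(hn)) + hn * n_control


HM, N = 650, 70
HN = 260  # r * Hm with r = 0.4
print(single_level_size(HM, N))    # 1-level CM, in bits
print(two_level_size(HM, HN, N))   # nanoprogrammed CM, in bits
```

The saving comes from storing the 70-bit control words only 260 times instead of 650 times; it shrinks as r approaches 1.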
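43 5.3 Pipeline Control
- Performance is measured by throughput in MIPS:
  $\text{MIPS} = f / \text{CPI}$
  where f is the pipeline's clock frequency (in MHz) and CPI is the average number of clock cycles per instruction.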
44 - Speedup: $S(m) = T(1) / T(m)$
  - T(m): the execution time on an m-stage pipeline
  - T(1): the execution time on a non-pipelined processor
- Efficiency (utilization): $E(m) = S(m) / m$, so $S(m) = m \times E(m)$
45 - Performance/cost ratio
  $\text{PCR} = f / K$
  - f: the pipeline's clock frequency
  - K: hardware cost
- Suppose the pipeline P has m stages for SI.
  - a: the delay of a non-pipelined processor for SI
  - each stage of P has delay a/m plus an extra delay b due to the buffer register, so $f = 1 / (a/m + b)$
  - hardware cost: $K = cm + d$
    - c: buffer-register cost per stage
    - d: cost of the pipeline's data-processing logic
  - hence $\text{PCR} = \dfrac{1}{(a/m + b)(cm + d)}$
46 - To maximize PCR with respect to m, set $d\,\text{PCR}/dm = 0$, which gives $m_{opt} = \sqrt{ad / (bc)}$.
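The optimum can be checked numerically. The sketch assumes $f = 1/(a/m + b)$ and $K = cm + d$, so that $\text{PCR} = 1/((a/m + b)(cm + d))$ and $m_{opt} = \sqrt{ad/(bc)}$; the parameter values are hypothetical and chosen only so that the optimum is a whole number.

```python
import math


def pcr(m: float, a: float, b: float, c: float, d: float) -> float:
    """Performance/cost ratio f / K of an m-stage pipeline."""
    clock_period = a / m + b      # stage delay plus buffer-register delay
    hardware_cost = c * m + d     # buffer registers plus processing logic
    return 1.0 / (clock_period * hardware_cost)


def optimal_stages(a: float, b: float, c: float, d: float) -> float:
    """Stage count that maximizes PCR, from d(PCR)/dm = 0."""
    return math.sqrt(a * d / (b * c))


# Hypothetical parameter values (not from the slides).
A, B, C, D = 64.0, 1.0, 1.0, 4.0
m_opt = optimal_stages(A, B, C, D)
print(m_opt)
for m in (m_opt - 1, m_opt, m_opt + 1):
    print(m, pcr(m, A, B, C, D))
```

Past the optimum, each extra stage adds more buffer-register cost and delay than it recovers in clock rate, so PCR falls again.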
47 5.3.3 Superscalar Processing
- Superscalar operation executes more than one instruction per cycle by fetching, decoding, and executing several instructions concurrently.
- A superscalar computer has a single CPU that uses multiple execution units to exploit the parallelism implicit in computer programs.
48 - In Fig. 5.66, the superscalar design has a potential speedup of 10.
- With k independent m-stage pipelined E-units, a superscalar CPU can in principle reach speedup factors of up to km.
  - heavy demand on the instruction-fetch logic
  - requires a large, fast instruction and data cache
- Important factors for the PCU of a superscalar computer
  - Instruction types: a floating-point add instruction has to be issued to a floating-point E-unit, not to an integer E-unit.
  - E-unit availability.
  - Data dependencies: to avoid conflicting use of registers, data-dependency constraints among the operands must be satisfied.
  - Control dependencies: reduce the impact of branch instructions on pipeline efficiency.
  - Program order: instructions must eventually produce results in the order specified by the program, even if the results may be computed out of order internally.
- Read: dynamic instruction scheduling and branch prediction.