EEE515J1 ASICs and DIGITAL DESIGN - PowerPoint PPT Presentation


EEE515J1 ASICs and DIGITAL DESIGN

Description:

EEE515J1 ASICs and DIGITAL DESIGN Lecture 6: Data Processors and Control Units Ian McCrum Room 5D03B Tel: 90 366364 voice mail on 6th ring Email: IJ.McCrum_at_Ulster.ac.uk – PowerPoint PPT presentation

1
EEE515J1 ASICs and DIGITAL DESIGN
Lecture 6: Data Processors and Control Units
Ian McCrum, Room 5D03B, Tel: 90 366364 (voice mail on 6th ring)
Email: IJ.McCrum_at_Ulster.ac.uk
Web site: http://www.eej.ulst.ac.uk
Last changed: 01/11/04 at 18:00
2
Designing Larger Digital Systems
  • We have seen how designing Finite state machines
    (FSMs) is relatively straightforward once the
    state diagram or design specification is drawn.
  • Together with combinational logic these design
    methods will stand you in good stead.
  • Of course there are problems that would be rather
    large or tedious to solve using these methods
    such as a system with a large number of inputs or
    one with a large variety of actions or steps to
    be performed.  
  • We can modify the FSM approach.
  • Having one FSM send inputs to and receive outputs
    from another FSM is a useful technique; such
    cascaded or coupled FSMs are found in real
    designs.
  • The design techniques used will depend on whether
    the two FSMs have synchronous clocks.
  • If not, then the system is an asynchronous one
    and will use handshake and control signals to effect
    synchronisation between the machines.
  • We will not dwell (sic) on such machines here
    except to note that testing asynchronous systems
    is difficult, error prone and can give a design
    which is difficult to modify late in the design
    cycle.

3
The Algorithmic State Machine method
  • Other modifications to the basic FSM method might
    add memory such as stack or heap structures and
    have state machines route data to and from these
    memory structures.
  • A more general approach is described below.
  • Another alternative is to use a computer or
    microprocessor system and write software.
  • Actually a computer is just an instance of a
    digital system, and the stored program concept on
    which it is based is similar to the design
    method below. It should come as no surprise that
    if you can master the method below you will
    understand how computers actually work, and
    could even design your own CPU.

4
The ASM Method
  • Instead of concentrating on simply moving from
    state to state we can decompose our problem into
    a number of sections.
  • If we must process input data and can identify
    simple operations to be performed on the data
    then we can sequence and control the flow of data
    to and from each data processing block using FSM
    design methods.
  • Thus we partition our system into a DATA
    PROCESSOR and a CONTROL LOGIC section.
  • The data processor has functional blocks that do
    something to the incoming data or locally
    generated data such as a count of items
    processed.
  • A good design rule is that each functional block
    should do one thing and be easily described. It
    might be a counter, an adder, a comparator or a
    shift register. It could even be a complete ALU.
  • The Control Logic sends control signals to each
    block and receives status signals or information
    about the data, but not the data itself. Many
    choices can be made by the designer, but as a rule
    this partition gives an easily designed, easily
    tested and easily modified system.

5
The ASM Method

An ALU or Arithmetic Logic Unit typically has 2
data inputs and a data output, all 8 or 9 bits
wide. It also has 3 or 4 inputs to indicate what
to do. The 3-bit binary number 000 to 111 might
specify F = A+B, A-B, B-A, A and B, A or B and
maybe F = A, F = B and F = 11111111.
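As a rough illustration of how a 3-bit function select might drive such an ALU, here is a minimal behavioural sketch in Python. The opcode assignment (000 for A+B through 111 for all ones), the function name `alu` and the 8-bit data width are assumptions made for the example; the slide does not fix any of them.

```python
def alu(op: int, a: int, b: int, width: int = 8) -> int:
    """Behavioural model of a small ALU; the opcode table is hypothetical."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    results = {
        0b000: a + b,   # F = A + B
        0b001: a - b,   # F = A - B
        0b010: b - a,   # F = B - A
        0b011: a & b,   # F = A and B
        0b100: a | b,   # F = A or B
        0b101: a,       # F = A
        0b110: b,       # F = B
        0b111: mask,    # F = all ones
    }
    # Wrap to the data width; any carry or borrow is simply discarded here.
    return results[op] & mask
```

In this sketch the control logic would only ever supply `op`; the data values `a` and `b` stay inside the data processor.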
6
Example of ASM method
  • Averaging 16 numbers, each of 8 bits in size.
  • Method 1: use eight 8-bit adders to add 8 pairs of
    numbers; this gives eight 9-bit numbers (worst case).
  • Use four 9-bit adders to give four 10-bit answers.
  • Use two 10-bit adders to give two 11-bit answers.
  • Finally use one 11-bit adder giving a 12-bit
    answer. We can use a trick to divide by 16:
    simply use the 8 leftmost bits of the 12-bit
    number, akin to shifting right 4 bits; this is
    division by 2^4.
  • This is obviously most wasteful of space, but
    achieves a reasonably fast answer: 4 add-times.
  • Actually adders are slow, though there are a
    number of special techniques to speed up
    addition, cf. carry-lookahead adders.
  • Clearly a more space-efficient system would be to
    do the calculation the way humans would do it:
    use a running total and add sequentially, i.e. use
    one adder and pass the data through it one number
    at a time.
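The two approaches can be compared with a short behavioural sketch in Python. It is only a model of the arithmetic, not of the hardware timing, and the function names `average_tree` and `average_running` are invented for the example.

```python
def average_tree(samples):
    """Method 1: four layers of adders, then keep the top 8 of 12 bits."""
    assert len(samples) == 16 and all(0 <= s < 256 for s in samples)
    layer = list(samples)
    while len(layer) > 1:  # 16 -> 8 -> 4 -> 2 -> 1 values
        layer = [layer[i] + layer[i + 1] for i in range(0, len(layer), 2)]
    total = layer[0]       # fits in 12 bits: 16 * 255 = 4080 < 4096
    return total >> 4      # shifting right 4 bits divides by 2**4 = 16


def average_running(samples):
    """Sequential method: one adder and a running total."""
    assert len(samples) == 16 and all(0 <= s < 256 for s in samples)
    total = 0
    for s in samples:      # in hardware, one addition per clock
        total += s
    return total >> 4
```

Both functions return the same truncated average; the difference in hardware is fifteen adders and four add-times against one adder and sixteen add-times.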

7
Example of ASM method
  • State equations
  • S0.D = S0./S + S2
  • S1.D = S0.S
  • S2.D = S5.EQ16
  • S3.D = S1 + S6
  • S4.D = S3
  • S5.D = S4
  • S6.D = S5./EQ16

Output equations: CLEAR = S1, ADD = S3, STROBE = S5,
COUNT = S6, DATAVALID = S2
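To see how one-hot equations of this kind sequence the machine, the sketch below steps them in Python. It reads the slide's equations as S0.D = S0./S + S2, S1.D = S0.S, S2.D = S5.EQ16, S3.D = S1 + S6, S4.D = S3, S5.D = S4 and S6.D = S5./EQ16, with "." as AND, "+" as OR and "/" as NOT; the operators are an assumption, since they did not survive in the transcript. The data processor is not modelled, so EQ16 is supplied as an input, and the names `next_state`, `outputs` and `trace` are invented for the example.

```python
def next_state(state, s, eq16):
    """state is a tuple (S0..S6) of booleans with exactly one True entry."""
    s0, s1, s2, s3, s4, s5, s6 = state
    return (
        (s0 and not s) or s2,   # S0.D = S0./S + S2
        s0 and s,               # S1.D = S0.S
        s5 and eq16,            # S2.D = S5.EQ16
        s1 or s6,               # S3.D = S1 + S6
        s3,                     # S4.D = S3
        s4,                     # S5.D = S4
        s5 and not eq16,        # S6.D = S5./EQ16
    )


def outputs(state):
    """Control signals decoded directly from the one-hot state bits."""
    s0, s1, s2, s3, s4, s5, s6 = state
    return {"CLEAR": s1, "ADD": s3, "STROBE": s5, "COUNT": s6, "DATAVALID": s2}


def trace(inputs):
    """Run from S0 over a list of (S, EQ16) pairs; return visited state numbers."""
    state = (True,) + (False,) * 6
    visited = [0]
    for s, eq16 in inputs:
        state = next_state(state, s, eq16)
        assert sum(state) == 1  # one-hot invariant: exactly one flip-flop set
        visited.append(state.index(True))
    return visited
```

With S pulsed once and EQ16 true on the second visit to S5, `trace` visits S0, S1, S3, S4, S5, S6, S3, S4, S5, S2 and then returns to S0.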
8
Signals to the outside world
  • Several unanswered problems remain with the
    previous design:
  • Exactly when the input arrives.
  • The DATAVALID pulse is only available for a short
    time.
  • It would be better (cheaper?) to use a
    countdown counter.
  • Often when doing an initial ASM design, the
    interface to the outside world (or the next
    machine in the chain) is not given much attention.
  • A typical, useful approach is to provide
    handshake lines to allow flow control. Thus:

Receiver driven: wait for REQUEST i/p, then o/p
data, then o/p DATAVALID, often just a timed
low-high-low pulse.
Sender driven: o/p data, then o/p strobe, and keep it
high until ack is seen from the far end.
9
ASM machines demand synchronous logic
  • Even simple latches are best driven in a
    synchronous manner. Although applying a
    latch or strobe signal to the clock inputs of a
    register (e.g. 8 D-type flip-flops) will work, a
    more testable circuit results if the master clock
    goes to every component.
  • Thus the D-types spend most of their time in a
    held state and only load data when the strobe
    signal is high.
  • This is easily achieved by adding multiplexors.
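A behavioural sketch of such a register in Python may make the multiplexor arrangement concrete. The class name `EnabledRegister` and its `clock` method are invented for the example; each call to `clock` stands for one edge of the master clock.

```python
class EnabledRegister:
    """D-types clocked by the master clock; a mux on D selects hold or load."""

    def __init__(self, width: int = 8):
        self.mask = (1 << width) - 1
        self.q = 0

    def clock(self, data_in: int, strobe: bool) -> int:
        # Multiplexor in front of the D inputs: new data when strobe is high,
        # otherwise the register's own output is fed back (the held state).
        d = data_in if strobe else self.q
        self.q = d & self.mask  # the flip-flops load D on every clock edge
        return self.q
```

The flip-flops are clocked on every edge, but the stored value only changes on edges where the strobe is high.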

10
Using a CLOCK
  • The role of the clock is very important in the
    ASM method.
  • As has been said before, having everything
    synchronised to a single clock can ease testing
    and last minute design modifications.
  • In very large systems you will find systems that
    use two phase clocks where the rising edge is
    used by one section of a system and the next
    section uses the falling edge.
  • Or latches are provided to isolate adjacent
    sections.
  • Multiphase clocks exist, a 4 phase solution
    allows the soldiers all to march in step.
  • Very large fast systems will have problems
    routing a clock signal from one edge of a chip to
    the other and several solutions exist to fix
    this.
  • Often the designer will lay down the clock
    distribution network before adding other gates.
  • A matrix of equal delay buffers may allow
    distribution with a low timing skew across chip.
  • Also used today is local generation of the clock
    and a system of phase locking (cf. www.altera.com
    for a description of their DPLL cells). This can
    also allow the clock frequency off-chip to be
    much lower than the clock on the chip, since the
    phase locking can be done at a submultiple of the
    clock frequency. I first saw this on a Transputer
    chip, where the chip internally worked at 20 MHz but
    you only needed to supply it with a 5 MHz
    oscillator. The PCB layout was less critical and
    the emitted RF noise was much lower with this
    approach. You may be aware it is used a lot in
    modern PC CPU design; sometimes the internal
    clocks run at 3.5 times the external clocks!
    (cf. www.tomshardwareguide.com)

11
Synchronous Control signals
  • A key to initial ASM designs is to have very
    strict synchronisation. This rule has even
    prompted some TTL companies to bring out two
    versions of their chips: the 74161 and 74163
    counters are identical except that the RESET
    action is synchronous on one version (the 74163)
    but asynchronous on the other (the 74161).
  • Once you are familiar with the method and have a
    dozen designs under your belt you may relax this
    strict rule somewhat.
  • Chips such as counters and shift registers can
    undertake various control actions: the RESET,
    LOAD, PRESET and DIRECTION controls for a counter
    are all VERBS of ACTION. An important part of
    the method is to recognise that whilst your
    control logic may assert these control inputs,
    they are NOT acted upon until the next clock
    pulse. Thus the ACTION is not taken until the
    clock pulse. This makes the design diagrams
    easier to follow.

12
The Design Method
  • There are two main steps, both graphical in
    nature: a block diagram of the data processor and
    the ASM chart describing the sequence of data
    operations to be performed. Different problems
    sometimes lend themselves to applying these in
    different orders. The data processor is a block
    diagram or circuit diagram where each block is a
    simple functional circuit. As a guide, each block
    should be available as a TTL chip, but if you have
    little experience of the TTL family a further
    guide is to ensure that it performs a
    single, easily explained task. Each block should
    be simple to design, such as a combinational
    problem or a very simple FSM.
  • All control signals MUST be synchronous.
    Combinational circuits such as ADDERS might have
    a synchronous ADD control signal, or you can just
    assume the answer pops out the bottom of the
    adder. You must ensure that the propagation
    delays of each data processor block do not cause
    problems; if these are all much shorter than the
    clock period then there will be no problem. It is
    possible to insert dummy states into the control
    logic to wait for answers to appear, or we must
    complicate our system by adding status signals,
    e.g. ADDER_COMPLETE.

13
The Design Method continued
  • The ASM chart is composed of boxes of just three
    types.
  • It superficially resembles a programming
    flowchart. There is one crucial difference:
    programming flowcharts are read sequentially from
    the top of the page to the bottom, and if there is
    only one CPU then this also represents the time
    behaviour of the program.
  • Obviously in a hardware circuit with a couple of
    counters the counting of one counter does not
    wait for the counting of another. Both pieces of
    hardware operate at the same time, concurrently.
  • In fact the different parts of the Data Processor
    in an ASM all operate at the same time. If we
    have a section of an ASM chart where a counter is
    told to count, an input is tested and an output
    is generated then these actions will all be
    scheduled to happen at the same time.
  • Of course it will take the next clock pulse to
    action the events.
  • Each state in an ASM chart has only one output
    box.
  • It may have a number of input testing boxes and
    output boxes conditional on some inputs but there
    must only be one main output box per state.
  • All arrows arriving at that state must go through
    this box.
  • We label the state by labelling that output box,
    but be clear where the dotted lines that form the
    boundary of our state lie; see Figure 2 overleaf.

14
The Design Method continued
  • Note some texts will name the state inside a
    bubble shown as a dotted circle. Here I have
    listed the state S0, with a state code of 0001.
    (I will use one-hot codes for the state code but
    there is no reason why a more efficient code
    couldn't be used.)
  • When in state zero you are in all boxes inside
    the dotted line simultaneously, depending on input
    conditions. Thus the single-bit input E is
    tested at the same time as the single-bit input
    F is tested. The PRESET or LOAD_ALL_ONES
    control signal of the 8-bit register R2 is
    asserted if E is high; it flickers if E flickers,
    but of course we should try to use synchronous
    inputs where possible. The adder (or counter?) A
    is to increment and the RESET signal of R1 is
    asserted.
  • Maybe you see now why all control signals are
    only acted upon at a clock pulse. All these
    control signals are set or cleared but NO
    action takes place until the clock pulse arrives
    that will take the machine to its next state,
    down one of the three arrows exiting the box.

15
The Design Method continued
  • One consequence of this method is that
    if a test is activated instantly on entering a
    state then it is based on the old values of the
    inputs.
  • If the state alters an input then we must be most
    careful. If the conditional boxes above tested
    the counter/adder A then the machine would exit
    depending on the old value of A, despite A altering
    as we left the state.
  • It is a good idea not to test a signal in the
    same state as you attempt to alter it.
  • It is easy to add dummy states (empty state
    boxes) to cause a one clock cycle delay and this
    can decouple the two effects. It is usually a
    good idea to avoid two tests within one state.  
  • These rules or guidelines can be broken but
    adherence will increase the likelihood that the
    system will work!
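The "old value" effect can be shown with a very small Python model of one clock edge. It assumes a hypothetical state that asserts COUNT on a counter and, in the same state, tests whether that counter equals some limit; the name `clock_edge` is invented for the example.

```python
def clock_edge(count: int, limit: int):
    """One clock edge in a state that both asserts COUNT and tests count == limit.

    Returns (new_count, exit_taken).
    """
    # The test is evaluated during the state, so it sees the value the
    # counter held BEFORE this edge.
    exit_taken = (count == limit)
    # COUNT was asserted during the state and is acted upon at the edge.
    new_count = count + 1
    return new_count, exit_taken
```

Calling `clock_edge(2, 3)` leaves the counter at 3 but does not take the exit, because the test saw 2; the exit is only taken one clock later, by which time the counter has moved on to 4. A dummy state between altering and testing removes this surprise.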

16
Counting 1s in a 16 bit word.
The previous example was extremely abstract; a
more typical application follows. We begin with
an English description of the problem: a
system is needed that will count the number of
ones in a 16-bit word. The design should be
easily modified for a 32-bit word. This is a
nice example because, as in real life, there are
many possible solutions. The good designer will
reject all but one of these, and the one that is
picked will be picked for a good reason! Here we will
adopt an ASM method to illustrate the design
method. Speed of response or cost may push a real
designer to different conclusions.
17
Solution 1b: create a 4-bit cell and iterate the
answer. Adders will be needed to combine the four
outputs, and this will be a slower but easier to
design solution.
Solution 1a
The answer will be between zero and 16 inclusive.
This needs 5 bits to represent it (00000 to 10000).
Solution 2: use a shift register and counter.
18
Solution 2 Shift Register and adder
  • This will demonstrate the ASM method quite
    nicely. Note that the two solutions trade space
    and time. The pure combinational approach is
    fastest but largest.
  • We will use a shift register and shift each bit
    out in turn; if it is a 1 we will increment a
    counter.
  • As is often the case we need to know when to
    stop. This could be done by having a loop counter
    keep track of how many shifts we have done.
    Beginners usually set up a counter to go from
    zero (or 1) to 16. This may be out by one, and a
    comparator is needed.
  • Experienced ASM designers (and programmers)
    preload a counter with 15 and decrement to zero,
    or find an alternative.
  • Here we will use a clever trick to save time. By
    shifting zeros into our word as we shift our data
    out, we can test for all zeros to exit our loop.
  • In the case where there are few ones this may
    give an impressive speed advantage, at the
    disadvantage that the execution time of our
    machine varies according to the input data; that
    is not always allowed.
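The shift-and-test idea can be sketched behaviourally in Python before any ASM chart is drawn. The function name `count_ones` is invented for the example, and shifting the least significant bit out first is an assumption; the slides do not say which end is used.

```python
def count_ones(word: int, width: int = 16):
    """Return (ones, shifts) for the shift-register method."""
    shift_reg = word & ((1 << width) - 1)
    ones = 0
    shifts = 0
    while shift_reg != 0:        # the all-zeros test replaces a loop counter
        bit_out = shift_reg & 1  # bit shifted out (LSB first, an assumption)
        shift_reg >>= 1          # a zero is shifted in at the other end
        shifts += 1
        if bit_out:
            ones += 1
    return ones, shifts
```

The second returned value shows the data-dependent run time: with this shift direction a word of 1 finishes after one shift, while 0x8000 needs all sixteen. Passing `width=32` covers the 32-bit case without a different loop counter.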

Initial sketch of Data Processor
19
Solution 2 Shift Register and adder
20
Solution 2 Shift Register and adder
The one-hot equations for this machine are as
follows:
T0.d = T0./S + T1.Z
T1.d = T3.E + T0.S
T2.d = T1./Z + T3./E
T3.d = T2 (this causes a one clock delay between
altering E and testing E)
Also the control signals are: LOAD = T0.S,
COUNT = T1, SHIFT = T2
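A clock-by-clock Python sketch of this machine is given below. It rests on several assumptions, because the data-processor figure is not reproduced here: the equations are read as T0.d = T0./S + T1.Z, T1.d = T3.E + T0.S, T2.d = T1./Z + T3./E and T3.d = T2; Z is taken to mean "shift register is all zeros"; E is taken to be a flip-flop holding the bit just shifted out; and LOAD is assumed to preset the 5-bit result counter to all ones so that the first COUNT in T1 brings it to zero. The name `run_count_ones` is invented for the example.

```python
def run_count_ones(word: int, max_clocks: int = 200):
    """Clock the one-hot controller and data processor; return (count, clocks)."""
    t0, t1, t2, t3 = True, False, False, False  # start in the idle state T0
    shift_reg, e, count = 0, False, 0
    count_mask = 0b11111                        # 5-bit result counter
    s = True                                    # START is pulsed for one clock
    for clocks in range(1, max_clocks + 1):
        z = shift_reg == 0                      # status: register is all zeros
        # Control signals, decoded from the current state.
        load, do_count, shift = t0 and s, t1, t2
        # Next-state equations.
        n0 = (t0 and not s) or (t1 and z)
        n1 = (t3 and e) or (t0 and s)
        n2 = (t1 and not z) or (t3 and not e)
        n3 = t2
        # Clock edge: every register updates together from the old values.
        if load:
            shift_reg = word & 0xFFFF
            count = count_mask                  # assumed preset to all ones
        if do_count:
            count = (count + 1) & count_mask
        if shift:
            e = bool(shift_reg & 1)             # bit shifted out
            shift_reg >>= 1                     # a zero is shifted in
        t0, t1, t2, t3 = n0, n1, n2, n3
        s = False
        if t0:                                  # back in T0: result is ready
            return count, clocks
    raise RuntimeError("machine did not finish")
```

Note that SHIFT is asserted in T2 but E is only tested in T3, one clock later, which is the delay the equations are arranged to provide.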
21
Try the tut questions!
See the file ASMTUTS.pdf on the website
The only trick to some of them is the use of a
pipeline, a line of registers to allow access to
older data
I'll do a DSP pipeline design on the board; it's
not hard. Remember real ADCs will need to be
given an SC control signal and will return an EOC
status signal. These stand for START_CONVERSION
and END_OF_CONVERSION.