Title: Architectures of Digital Information Systems Part 2: Programmable I/O and Multiprocessors
1. Architectures of Digital Information Systems, Part 2: Programmable I/O and Multiprocessors
- dr.ir. A.C. Verschueren, Eindhoven University of Technology, Section of Digital Information Systems
2. Programmable input/output controllers
- Many I/O control tasks can be done in software, using simple parallel ports and timers:
- Keyboard scanning and encoding
- Simple motor control
- Pulse counting for position encoding
- Other non-standard low-speed (but time-critical) tasks
- Don't use the main processor for this simple stuff!
- Use programmable I/O controllers here: modified single-chip microcomputers
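As an illustration of the kind of software task meant above, here is a minimal sketch of keyboard scanning and encoding with parallel ports. The 4x4 matrix, the key legend `KEYMAP` and the simulated input port `read_columns` are assumptions made for this example; they do not come from any specific controller.

```python
# Hypothetical 4x4 key matrix: rows are driven from an output port,
# columns are read back from an input port (both simulated here).
ROWS, COLS = 4, 4
KEYMAP = ["123A", "456B", "789C", "*0#D"]  # assumed key legend


def read_columns(pressed, row):
    """Simulated input port: bit c is set when the key at (row, c) is down."""
    bits = 0
    for (r, c) in pressed:
        if r == row:
            bits |= 1 << c
    return bits


def scan_keyboard(pressed):
    """Drive each row in turn and encode every closed switch as a character."""
    keys = []
    for row in range(ROWS):
        cols = read_columns(pressed, row)  # one port read per driven row
        for col in range(COLS):
            if cols & (1 << col):
                keys.append(KEYMAP[row][col])
    return keys
```

On a real controller this loop would typically run from a timer interrupt, with debouncing added; both are left out here to keep the sketch short.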
3. The Intel 8042 slave processor
4. The 8042 'master CPU interface'
- Flag 0 and 4 other status register bits are user-defined
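A small sketch of how a master program might decode the status register. The bit positions below follow the commonly documented UPI-41/42 layout (OBF, IBF, F0, F1, then four user-defined bits) and should be treated as an assumption to check against the data sheet; the function names are made up for this example.

```python
# Assumed status register layout (bit 0 is the least significant bit).
OBF = 0x01  # output buffer full: the slave has a byte ready for the master
IBF = 0x02  # input buffer full: the slave has not yet read the master's byte
F0 = 0x04   # Flag 0, user defined
F1 = 0x08   # Flag 1 (command/data indicator in the assumed layout)


def decode_status(status):
    """Split a status byte into its flags and the 4 other user-defined bits."""
    return {
        "obf": bool(status & OBF),
        "ibf": bool(status & IBF),
        "f0": bool(status & F0),
        "f1": bool(status & F1),
        "user": (status >> 4) & 0x0F,
    }


def master_may_write(status):
    """The master should only write when the input buffer is empty."""
    return not (status & IBF)
```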
5. The Z8090 Universal Peripheral Controller
- Based upon the Zilog Z8 microcomputer
- 8-bit CPU, 2 KB program ROM, 256 bytes of data RAM
- Memory-mapped I/O includes timers and parallel ports
- Master CPU interface differs a lot from the 8042
- Master reads/writes a 16-byte window in data RAM; the window location is controlled by the Z8090 program
- Simple form of DMA to/from data RAM; start and end locations are controlled by the Z8090 program
- Z8090 interrupts the master by setting an output bit
- Master interrupts the Z8090 by a dummy write action
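The windowed master interface can be pictured with a toy model. This is only a sketch of the idea (a 16-byte window into 256 bytes of data RAM, positioned by the controller's own program); the class and method names are hypothetical and do not mirror the real Z8090 registers.

```python
class UpcWindow:
    """Toy model: 256-byte data RAM with a 16-byte master-visible window."""

    RAM_SIZE = 256
    WINDOW_SIZE = 16

    def __init__(self):
        self.ram = bytearray(self.RAM_SIZE)
        self.base = 0  # chosen by the controller's program, not by the master

    def set_window(self, base):
        """Called by the controller program to move the window."""
        if not 0 <= base <= self.RAM_SIZE - self.WINDOW_SIZE:
            raise ValueError("window must lie inside data RAM")
        self.base = base

    def _address(self, offset):
        if not 0 <= offset < self.WINDOW_SIZE:
            raise IndexError("master can only reach 16 locations")
        return self.base + offset

    def master_write(self, offset, value):
        self.ram[self._address(offset)] = value & 0xFF

    def master_read(self, offset):
        return self.ram[self._address(offset)]
```

The master always uses offsets 0 to 15; which RAM bytes those offsets reach depends on where the controller program has placed the window.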
6. Co-processors: divide and conquer
- A 'co-processor' is hardware which takes over (software) functions from the main CPU
- This increases the speed of the system as a whole
- The CPU has fewer functions to perform
- Co-processors can use customised (fast) hardware instead of standard hardware running software
- Co-processors should not bother the CPU
- Use DMA to transfer data, commands and results
- Use interrupts to signal important things only; interrupts may run in both directions!
7. 'Closely coupled' co-processors
- Keep track of instructions executed by the main CPU
- Are actually controlled by these instructions
- Some instructions are treated as 'no-operation' by the main CPU
- These trigger the co-processor to start a specific operation
- Data transfer is done with DMA
- The address may be provided by the main CPU using a 'dummy' read cycle during execution of the 'no-operation' instruction
- Result codes are transferred with DMA or special I/O ports
- Synchronisation is absent or uses special hardware
- Used to extend the main CPU instruction set (for instance floating point)
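The mechanism can be sketched with a toy instruction stream that both units observe. The opcodes (`ADD`, `FLD`, `FADD`, `FMUL`) and the stack-based co-processor are invented for this illustration; DMA, dummy read cycles and synchronisation hardware are not modelled.

```python
COPROCESSOR_OPS = {"FLD", "FADD", "FMUL"}  # hypothetical extension opcodes


class MainCpu:
    def __init__(self):
        self.acc = 0

    def execute(self, op, arg):
        if op in COPROCESSOR_OPS:
            return  # treated as a 'no-operation' by the main CPU
        if op == "ADD":
            self.acc += arg
        else:
            raise ValueError(f"unknown opcode {op}")


class CoProcessor:
    def __init__(self):
        self.stack = []

    def snoop(self, op, arg):
        """Watch every instruction; act only on the extension opcodes."""
        if op not in COPROCESSOR_OPS:
            return
        if op == "FLD":
            self.stack.append(float(arg))
        else:
            b = self.stack.pop()
            a = self.stack.pop()
            self.stack.append(a + b if op == "FADD" else a * b)


def run(program):
    cpu, cop = MainCpu(), CoProcessor()
    for op, arg in program:
        cpu.execute(op, arg)  # both units see the same instruction stream
        cop.snoop(op, arg)
    return cpu.acc, cop.stack
```

For example, `run([("ADD", 2), ("FLD", 1.5), ("FLD", 2.0), ("ADD", 3), ("FADD", None)])` leaves 5 in the CPU accumulator and 3.5 on the co-processor stack.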
8. 'Loosely coupled' co-processors
- Have no connection with main CPU instructions
- May even execute their own programs!
- Commanded by explicit I/O actions from the CPU or command blocks in memory (with an attention signal)
- Return results through memory or explicit I/O actions after interrupting the main CPU
- Used to off-load complete I/O-related tasks from the main CPU (for instance the device drivers in an O.S.)
- Also used to speed up complex data processing tasks if the co-processor contains better hardware than the CPU
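A minimal sketch of the command-block style of control: the CPU builds a block in shared memory, raises an attention signal, and later receives an interrupt once the result has been written back. The block fields (`command`, `data`, `status`, `result`) and the single `SUM` command are assumptions made for this example.

```python
from collections import deque


class LooseCoProcessor:
    def __init__(self, memory):
        self.memory = memory      # shared main memory, modelled as a dict
        self.attention = deque()  # block addresses signalled by the CPU
        self.interrupts = []      # interrupts raised towards the main CPU

    def signal_attention(self, block_addr):
        """Called by the CPU after it has filled in a command block."""
        self.attention.append(block_addr)

    def step(self):
        """Process one pending command block; return False when idle."""
        if not self.attention:
            return False
        addr = self.attention.popleft()
        block = self.memory[addr]
        if block["command"] == "SUM":
            block["result"] = sum(block["data"])
            block["status"] = "done"
        else:
            block["status"] = "error"
        self.interrupts.append(addr)  # tell the CPU which block finished
        return True
```

Between `signal_attention` and the interrupt the CPU is free to do other work, which is the point of the loose coupling.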
9. DMA co-processor: programmable I/O
- Handle I/O tasks including high-speed transfer of data blocks (8042 DMA is low speed)
- Run their own programs (stored in DMA memory), controlled by 'messages' in main memory
10. Shared memory
- Direct Memory Access allows both the CPU and I/O devices access to the same main memory
- The fastest solution: multi-ported shared memory
- CPU and I/O memory accesses do not interfere
- Real 2-port memory is very expensive; 3 ports and up is not available!
11. Shared memory with an arbiter
- Multi-ported memory may be simulated with an arbiter and a higher-speed (normal) memory
- True simultaneous access is impossible!
- May have to wait!
- Fast memory is expensive!
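The waiting caused by an arbiter can be made concrete with a small simulation. This sketch assumes a round-robin policy and one memory access per cycle; the slides do not prescribe a policy, so both are choices made for the example, as are the function names.

```python
def arbitrate(pending, last_granted, n_ports):
    """Round-robin choice: first requesting port after the last one granted."""
    for step in range(1, n_ports + 1):
        port = (last_granted + step) % n_ports
        if port in pending:
            return port
    return None  # nobody is requesting this cycle


def simulate(request_schedule, n_ports=2):
    """Run one memory access per cycle; return the grants and total wait cycles."""
    pending = set()
    last_granted = n_ports - 1  # so that port 0 has priority on the first cycle
    grants = []
    wait_cycles = 0
    for cycle, new_requests in enumerate(request_schedule):
        pending |= set(new_requests)
        port = arbitrate(pending, last_granted, n_ports)
        if port is not None:
            pending.discard(port)
            grants.append((cycle, port))
            last_granted = port
        wait_cycles += len(pending)  # every port still pending waited this cycle
    return grants, wait_cycles
```

With the schedule `[{0, 1}, set(), {1}]`, ports 0 and 1 collide in the first cycle, so port 1 is served one cycle late: the 'may have to wait' case.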
12. Combine shared and private memory
- Simple to have more devices
- Communication confined to a small memory area
- CPU works mostly in private memory: using an arbiter does not degrade performance!
13. Modular systems
- Access to the system bus and shared memories requires arbitration (data traffic control)
14. Memory mapping
- Mapping done by address decoding hardware
- Which can place memories at different addresses!
- Shared local memories require complex arbiters
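Address decoding can be sketched as a lookup from an address to a memory and an offset within it. The `AddressDecoder` class and the example addresses are hypothetical; the point shown is that the same physical memory can appear at different addresses for different bus masters.

```python
class AddressDecoder:
    """Toy address decoder: maps address ranges onto memory blocks."""

    def __init__(self):
        self.regions = []  # list of (base, memory, name)

    def map_region(self, base, memory, name):
        end = base + len(memory)
        for other_base, other_mem, other_name in self.regions:
            if base < other_base + len(other_mem) and other_base < end:
                raise ValueError(f"{name} overlaps {other_name}")
        self.regions.append((base, memory, name))

    def _select(self, address):
        for base, memory, name in self.regions:
            if base <= address < base + len(memory):
                return memory, name, address - base
        raise LookupError(f"no memory selected at {address:#06x}")

    def decode(self, address):
        """Return which memory is selected and the offset inside it."""
        _, name, offset = self._select(address)
        return name, offset

    def read(self, address):
        memory, _, offset = self._select(address)
        return memory[offset]

    def write(self, address, value):
        memory, _, offset = self._select(address)
        memory[offset] = value & 0xFF
```

Usage sketch: if one decoder maps a shared 16-byte memory at 0x8000 and another maps the same object at 0x2000, a write to 0x8004 through the first is visible at 0x2004 through the second. Arbitration between the two masters is not modelled here.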
15. Standard system buses
- Standardisation needed for plug and play
- A lot of them exist (Multibus, VME, EISA, ...)
- Multibus designed by Intel for the 80x86 series
- VME bus designed by Motorola for the 680x0 series
- They compete for the most complex protocols?
- Bus signals optimised for one processor (series)
- Using an Intel processor on a VME bus is not simple
16. Special purpose co-processors (1)
- Relatively simple co-processors with a special data path can beat complex standard processors!
- Co-processors for standard algorithms exist
- Data encryption and decryption: DES and RSA devices are available. Separate devices are preferred for security reasons!
- Data compression and expansion: image (CCITT FAX, JPEG, MPEG) and data file (LZW, ZIP) (de-)compression devices exist
17. Special purpose co-processors (2)
- Parametrisation is possible with writable constants and programmable sequencing logic
- Fast Fourier Transform devices have programmable address generators and multiplication constants
- (In-)Finite Impulse Response filters are parametrised in the same way to generate different characteristics
- 2-D graphics image filter devices are more of the same
- Used for noise reduction, smoothing
- Edge detection, sharpening, contrast enhancement
- Removing distortions and blur (very complex!)
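Parametrisation with writable constants can be illustrated by a 2-D filter whose behaviour depends only on the kernel loaded into it. This is a sketch under stated assumptions: a fixed 3x3 kernel size, integer pixels, border pixels copied unchanged, and two example kernels (`BOX` and `EDGE`) chosen for the illustration.

```python
def filter2d(image, kernel, divisor=1):
    """Apply a 3x3 kernel to the interior pixels; border pixels are copied."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc // divisor  # integer scaling of the weighted sum
    return out


# The same datapath, two sets of 'writable constants':
BOX = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]           # smoothing, use divisor=9
EDGE = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]  # Laplacian-style edge detection
```

Loading `BOX` with `divisor=9` averages each pixel with its neighbours, while loading `EDGE` gives zero on flat areas and a large response on isolated bright pixels.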
18. Digital Signal Processing
- Lots of Digital Signal Processors (DSPs) have been designed for digital filtering operations
- One output requires l adds and (l + 1) multiplications
- The last l input values must be remembered and an array of (l + 1) constants must be available somewhere
- DSP: multiply-add datapath, more than one memory, loop addressing
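The operation counts above match the standard FIR filter form, assumed here to be y[n] = c[0]*x[n] + c[1]*x[n-1] + ... + c[l]*x[n-l]: that is (l + 1) products combined by l additions. The sketch below implements it with a circular buffer, which is a software stand-in for the loop (modulo) addressing a DSP provides in hardware. The class name `FirFilter` is made up for this example.

```python
class FirFilter:
    def __init__(self, coeffs):
        self.coeffs = list(coeffs)                # the (l + 1) constants
        self.history = [0.0] * len(self.coeffs)   # current input plus the last l
        self.pos = 0                              # where the next input is stored

    def step(self, x):
        """Take one input sample and return one output sample."""
        n = len(self.coeffs)
        self.history[self.pos] = x
        acc = 0.0  # starting from zero costs one extra add compared to the count above
        idx = self.pos
        for c in self.coeffs:
            acc += c * self.history[idx]  # one multiply-add per constant
            idx = (idx - 1) % n           # step back in time, wrapping around
        self.pos = (self.pos + 1) % n
        return acc
```

For example, a two-constant averaging filter `FirFilter([0.5, 0.5])` fed with 2.0, 4.0, 6.0 produces 1.0, 3.0, 5.0. On a DSP the coefficients and the history would sit in separate memories so that both operands of each multiply-add can be fetched in the same cycle.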
19. Digital Signal Processors
- Support standard CPU operations: more general purpose than FIR/IIR filter devices!
- They can take decisions based upon the filtered values and switch between different filter characteristics
- Needed for, for instance, telephone line modems
- They can be programmed for 'strange' input value addressing schemes
- As used in two-dimensional image filtering
20. High performance DSPs: parallel
- Multiple on-chip memories with parallel access using independent data and address buses
- Multiple I/O interfaces use DMA to read/write the memories in parallel with calculations
- Programmable address generators running in parallel with the actual multiply/add datapath
- Actual calculations use floating point for a wider 'dynamic range' and lower digital output noise
21. The ultimate in DSPs: real-time video
- Need on the order of 1 billion operations/second for 3-D picture generation or video filtering
- Intel's Multi-Media eXtension (MMX): 8 identical byte operations with one instruction
- Texas Instruments 320C80: 5 processors (with MMX) and 25 memories on one chip
- Philips TriMedia: 5 MMX-like instructions in one super-instruction
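The idea of 8 identical byte operations in one instruction can be imitated in software by packing 8 bytes into one 64-bit word and adding them without letting carries cross byte boundaries. This is a sketch of the principle only, similar in spirit to a wraparound packed byte add; the helper names `pack`, `unpack` and `packed_add_bytes` are made up for this example.

```python
MASK_LOW7 = 0x7F7F7F7F7F7F7F7F  # lower 7 bits of every byte lane
MASK_HIGH = 0x8080808080808080  # top bit of every byte lane


def pack(values):
    """Pack 8 byte values (0..255) into one 64-bit integer, lane 0 lowest."""
    return int.from_bytes(bytes(values), "little")


def unpack(word):
    """Split a 64-bit integer back into its 8 byte lanes."""
    return list(word.to_bytes(8, "little"))


def packed_add_bytes(a, b):
    """Add 8 byte lanes at once; each lane wraps around modulo 256."""
    # Add the low 7 bits of each lane: the carry lands in the lane's own top bit.
    low = (a & MASK_LOW7) + (b & MASK_LOW7)
    # Fold in the original top bits with XOR so nothing carries into the next lane.
    return low ^ ((a ^ b) & MASK_HIGH)
```

Adding the lanes `[1, 2, 3, 4, 250, 6, 7, 255]` and `[10, 20, 30, 40, 10, 60, 70, 1]` gives `[11, 22, 33, 44, 4, 66, 77, 0]`: the two overflowing lanes wrap around without disturbing their neighbours.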