Topic II Instruction-Set Architecture - PowerPoint PPT Presentation

About This Presentation
Title:

Topic II Instruction-Set Architecture

Slides: 45
Provided by: guang4

Transcript and Presenter's Notes

1
Topic II: Instruction-Set Architecture
  • Introduction
  • A Case Study: The MIPS Instruction-Set
    Architecture

2
Reading List
  • Slides: Topic2x
  • Hennessy & Patterson, Chapter 2
  • Other papers as assigned in class or homeworks

3
The Stored Memory Computer
  • Five parts of a computer:
  • Datapath (channels/changes bits)
  • Control (directs operations)
  • Memory (places to keep bits)
  • Input (get data from outside)
  • Output (send data to outside)

4
Steps in Executing an Instruction
  • Instruction Fetch: Fetch the next instruction
    from memory
  • Instruction Decode: Examine the instruction to
    determine:
  • What operation is performed by the instruction
    (e.g., addition)
  • What operands are required, and where the result
    goes
  • Operand Fetch: Fetch the operands
  • Execution: Perform the operation on the operands
  • Result Writeback: Write the result to the
    specified location
  • Next Instruction: Determine where to get the next
    instruction

5
What is Specified in an ISA?
  • Instruction Decode: How are operations and
    operands specified?
  • Operand Fetch: Where can operands be located? How
    many?
  • Execution: What operations can be performed? What
    data types and sizes?
  • Result Writeback: Where can results be written?
    How many?
  • Next Instruction: How can we choose the next
    instruction?

6
A Simple ISA: Memory-Memory
  • What operations can be performed? Basic arithmetic
    (for now)
  • What data types and sizes? 32-bit integers
  • Where can operands and results be located? Memory
  • How many operands and results? 2 operands, 1
    result
  • How are operations and operands specified?
  • OP DEST, SRC1, SRC2
  • How can we choose the next instruction? Next in
    sequence

7
Memory Model
  • Think of memory as being a large array of n
    integers, referenced by the index (Random Access
    Memory, or RAM)

For instance, M[1] contains the value 3. We can
read and write these locations. These are the
only locations available to us. All abstract
locations (such as variables in a C program) must
be assigned locations in M.

Address   Contents
0         14
1         3
2         99
. . .     . . .
N - 1     0

8
Simple Code Translation
  • Given the C code
  • A = B + C
  • Assuming that we could decide that variable A
    uses location 100, B uses 48, and C uses 76,
    convert the code above to the following
    assembly code
  • ADD M[100], M[48], M[76]
  • How would we express
  • A = (B + C) * (D + E)

9
Using a Temporary Location
  • Assume we put A in 100, B in 48, C in 76, D in
    20, and E in 32.
  • Now choose an unused memory location (e.g., 84).
  • ADD M[100], M[48], M[76]    # A = B + C
  • ADD M[84], M[20], M[32]     # temp = D + E
  • MUL M[100], M[100], M[84]   # A = A * temp
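The three-instruction sequence above can be checked with a small simulation. This is only a sketch: the tuple encoding of instructions, the function name run_memory_memory, and the sample values for B, C, D and E are assumptions made for illustration, and 32-bit overflow is ignored.

```python
# Hypothetical model of the memory-memory ISA: M is the memory array and
# every instruction has the form OP DEST, SRC1, SRC2 on memory addresses.
OPS = {
    "ADD": lambda x, y: x + y,
    "MUL": lambda x, y: x * y,
}


def run_memory_memory(program, memory):
    """Execute (op, dest, src1, src2) tuples in sequence; return memory."""
    for op, dest, src1, src2 in program:
        memory[dest] = OPS[op](memory[src1], memory[src2])
    return memory


# Sample values for B, C, D, E are made up for the demonstration.
M = [0] * 128
M[48], M[76], M[20], M[32] = 2, 3, 4, 5

program = [
    ("ADD", 100, 48, 76),   # A = B + C
    ("ADD", 84, 20, 32),    # temp = D + E
    ("MUL", 100, 100, 84),  # A = A * temp
]
run_memory_memory(program, M)
print(M[100])  # (2 + 3) * (4 + 5) = 45
```

Every instruction here names three full memory addresses, which is the cost the next slide complains about.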

10
Problems with Memory-Memory ISAs
  • Main memory is much slower than arithmetic circuits
  • This was as true in 1950 as it is today!
  • It takes a lot of room to specify memory
    addresses
  • Results are often used one or two instructions
    later
  • Remember: make the common case fast!
  • Solution: store temporary or intermediate results
    in fast memories near the arithmetic units.

11
Accumulator Machines
  • An accumulator machine keeps a single
    high-speed buffer (e.g., a set of D latches or
    flip-flops, one for each data bit) near the
    arithmetic logic.
  • In the simplest kind, only one operand can be
    specified; the accumulator is implicit. OP
    operand means
  • acc. ← acc. OP operand
  • Example
  • LOAD M[48]     # Load B into acc.
  • ADD M[76]      # Add C to acc. (now has B + C)
  • STORE M[100]   # Write acc. to A

12
Accumulator Machine Does A = (B + C) * (D + E)
  • LOAD M[20]     # Load D into acc.
  • ADD M[32]      # Add E to acc. (now has D + E)
  • STORE M[100]   # Write acc. to A
  • LOAD M[48]     # Load B into acc.
  • ADD M[76]      # Add C to acc. (now has B + C)
  • MUL M[100]     # Multiply acc. by A
  • STORE M[100]   # Write (B + C) * (D + E) to A
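The same computation can be traced on a toy accumulator machine. This is a minimal sketch under stated assumptions: the (op, address) tuple format, the function name run_accumulator, and the sample operand values are illustrative and not part of the slides.

```python
def run_accumulator(program, memory):
    """Run (op, address) pairs on a one-register machine; return acc."""
    acc = 0
    for op, addr in program:
        if op == "LOAD":
            acc = memory[addr]
        elif op == "STORE":
            memory[addr] = acc
        elif op == "ADD":
            acc = acc + memory[addr]
        elif op == "MUL":
            acc = acc * memory[addr]
        else:
            raise ValueError(f"unknown operation: {op}")
    return acc


# Sample values for B, C, D, E are made up for the demonstration.
acc_mem = [0] * 128
acc_mem[48], acc_mem[76], acc_mem[20], acc_mem[32] = 2, 3, 4, 5

acc_program = [
    ("LOAD", 20),    # acc = D
    ("ADD", 32),     # acc = D + E
    ("STORE", 100),  # A = D + E (memory is used as the temporary)
    ("LOAD", 48),    # acc = B
    ("ADD", 76),     # acc = B + C
    ("MUL", 100),    # acc = (B + C) * A
    ("STORE", 100),  # A = (B + C) * (D + E)
]
run_accumulator(acc_program, acc_mem)
print(acc_mem[100])  # 45
```

Note that the intermediate value D + E still has to make a round trip through memory, since there is only one fast location.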

13
Shortcomings of Accumulator Machines
  • Still requires storing lots of temporary and
    intermediate values in memory
  • Accumulator is only really beneficial for a chain
    (sequence) of calculations where the result of
    one is the input to the next.

14
Still, Accumulator Machines Were Common in Early
Computers
  • A simple design, and hence popular, especially
    for
  • Early computers
  • Early microprocessors (4004, 8008)
  • Low-end (cheap) models
  • Reason: accumulator logic was much more expensive
    than memory
  • Vacuum tubes vs. core memory
  • D flip-flops vs. DRAM
  • Precious space on processor chip vs. off-chip DRAM

15
Alternatives to Accumulator Machines
  • If more hardware resources are available, put
    more fast storage locations alongside the
    accumulator
  • Stack machines
  • Register machines
  • Special purpose
  • General purpose

16
Stack Machines
  • Idea: A pile of fast storage locations with a top
    and a bottom.

An instruction can only get at the top value, or
maybe the top two or three values. We can put
new values on the top (push) or take them off
the top (pop), but that's it. We can't get to
locations underneath the top unless we remove
everything above.

Address        Contents
top            14
2nd from top   3
3rd from top   99
. . .          . . .
bottom         0

17
Stack Machine ISA
  • Basic operations include:
  • Load: get value from memory and push onto stack
  • Store: pop value off of stack and put into memory
  • Arithmetic: pop 1 or 2 values off of stack; push
    result on stack
  • Dup: get value at top of stack without removing it;
    push new copy onto stack (why is this useful?)
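The basic operations above can be sketched as a tiny interpreter. The tuple encoding, the function name run_stack_machine, and the sample memory values are assumptions for illustration only. The second program hints at one use of Dup: squaring a value without loading it from memory twice.

```python
def run_stack_machine(program, memory, stack=None):
    """Run (op, [address]) tuples; return the final stack (top is last)."""
    stack = [] if stack is None else stack
    for instr in program:
        op = instr[0]
        if op == "LOAD":
            stack.append(memory[instr[1]])   # push value from memory
        elif op == "STORE":
            memory[instr[1]] = stack.pop()   # pop value into memory
        elif op in ("ADD", "MULT"):
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        elif op == "DUP":
            stack.append(stack[-1])          # copy the top without removing it
        else:
            raise ValueError(f"unknown operation: {op}")
    return stack


# Sample values for B, C, D, E are made up for the demonstration.
stk_mem = [0] * 128
stk_mem[48], stk_mem[76], stk_mem[20], stk_mem[32] = 2, 3, 4, 5

expr_program = [
    ("LOAD", 20), ("LOAD", 32), ("ADD",),   # D + E
    ("LOAD", 48), ("LOAD", 76), ("ADD",),   # B + C
    ("MULT",),                              # (B + C) * (D + E)
    ("STORE", 100),                         # A = result
]
# "XXX" stands for whatever was on the stack before we started.
final_stack = run_stack_machine(expr_program, stk_mem, stack=["XXX"])
print(stk_mem[100], final_stack)

# One use of DUP: compute x * x with a single memory load.
stk_mem[10] = 7
run_stack_machine([("LOAD", 10), ("DUP",), ("MULT",), ("STORE", 11)], stk_mem)
print(stk_mem[11])
```

No temporary memory location is needed here: both intermediate sums live on the stack until the multiply consumes them.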

18
Stack Machine Does A = (B + C) * (D + E)
Instruction    Stack afterwards (top first; XXX = earlier contents)
(start)        XXX
LOAD M[20]     (D) XXX
LOAD M[32]     (E) (D) XXX
ADD            (D + E) XXX
LOAD M[48]     (B) (D + E) XXX
(continued next slide)

19
Stack Machine (cont.)
Instruction    Stack afterwards (top first; XXX = earlier contents)
LOAD M[76]     (C) (B) (D + E) XXX
ADD            (B + C) (D + E) XXX
MULT           ((B + C) * (D + E)) XXX
STORE M[100]   XXX

Note that the stack is now the same as when we
began.

20
Stack Machines Used
  • Some early computers
  • x86 (x87) floating-point unit (sort of)
  • Java Virtual Machine (JVM)

21
Register Machines
  • Idea: Put more storage locations (registers)
    near the accumulator
  • Regs have names/numbers and can be used instead
    of memory
  • Accessed much faster than main memory
  • (1-2 CPU cycles vs. 10s to 100s of cycles)
  • Far fewer registers than memory locations
  • MIPS has 32 32-bit registers
  • Fewer regs, smaller addresses, fewer bits to name
    them
  • A scarce resource: use them carefully!

22
Special- vs. General-Purpose Registers
  • A special-purpose register is used for specific
    purposes, and there may be limitations on which
    operations can use it
  • Easier on the HW design: put the reg right where
    it's needed
  • More difficult for the compiler to use
    effectively
  • A general-purpose register can be used in any
    operation
  • Datapaths more general, but routing is more
    difficult

23
Special-Purpose Registers: The Z-80 CPU
  • Seven 8-bit registers: A, B, C, D, E, H, L (BC,
    DE, HL can be pairs)
  • Three 16-bit registers: SP, IX, IY, plus PC
    (program counter)
  • Add, subtract, shift can only be done to A (8-bit
    accumulator)
  • Increment and decrement can be done to all regs
    and reg pairs
  • Can fetch from memory at address (HL) and put in
    any 8-bit reg
  • A fetch from address (BC) or (DE) can only go to A
  • Fetches from (BC), (HL) and (IX) take different
    numbers of cycles
  • Anyone want to write a compiler for this?

24
General Purpose Register (GPR) Machines
  • MIPS (and similar processors) has 32 General
    Purpose Registers (GPRs), each 32 bits long. All
    can be read or written, except register 0,
    which is always 0 and can't be changed.
  • Register access time is uniform.

Register   Contents
0          0
1          3
2          99
. . .      . . .
31         14

25
GPR Machine Does A = (B + C) * (D + E)
  • ADD $1, M[48], M[76]    # R1 = B + C
  • ADD $2, M[20], M[32]    # R2 = D + E
  • MUL M[100], $1, $2      # A = R1 * R2
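A rough model of this register machine is sketched below. The class name GPRMachine, the ("r", n) and ("m", addr) operand tags, and the sample values are assumptions for illustration. Like the slide, it lets arithmetic instructions name memory operands directly, and it treats register 0 as a constant zero.

```python
class GPRMachine:
    """Toy machine with 32 registers; operands are ("r", n) or ("m", addr)."""

    def __init__(self, memory):
        self.regs = [0] * 32
        self.mem = memory

    def read(self, operand):
        kind, n = operand
        return self.regs[n] if kind == "r" else self.mem[n]

    def write(self, operand, value):
        kind, n = operand
        if kind == "r":
            if n != 0:              # register 0 always stays 0
                self.regs[n] = value
        else:
            self.mem[n] = value

    def run(self, program):
        for op, dest, src1, src2 in program:
            a, b = self.read(src1), self.read(src2)
            if op == "ADD":
                self.write(dest, a + b)
            elif op == "MUL":
                self.write(dest, a * b)
            else:
                raise ValueError(f"unknown operation: {op}")


# Sample values for B, C, D, E are made up for the demonstration.
gpr_mem = [0] * 128
gpr_mem[48], gpr_mem[76], gpr_mem[20], gpr_mem[32] = 2, 3, 4, 5

machine = GPRMachine(gpr_mem)
machine.run([
    ("ADD", ("r", 1), ("m", 48), ("m", 76)),   # R1 = B + C
    ("ADD", ("r", 2), ("m", 20), ("m", 32)),   # R2 = D + E
    ("MUL", ("m", 100), ("r", 1), ("r", 2)),   # A = R1 * R2
])
print(gpr_mem[100])  # 45
```

Both intermediate results stay in registers, so no temporary memory location is needed.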

26
Some Trends
  • From hardware technology: the number of registers
    that can be put on a chip has the potential to
    grow very fast (Moore's Law?)
  • A very large register set will have a slow access
    time.
  • Instruction-set evolution is slow to accommodate
    changes in the number of registers

27
Memory and Data Sizes
  • So far, we've only talked about uniform data
    sizes. Actual data come in many different sizes:
  • Single bits (boolean values, true or false)
  • Bytes (8 bits): Characters (ASCII), very small
    integers
  • Halfwords (16 bits): Characters (Unicode), short
    integers
  • Words (32 bits): Long integers, floating-point
    (FP) numbers
  • Double-words (64 bits): Very long integers,
    double-precision FP
  • Quad-words (128 bits): Quad-precision
    floating-point numbers

NOTE: There is another data size, called extended
double precision, which is 80 bits long. It is
used in x86 FPUs.

28
Different Data Sizes
  • How do we handle different data sizes?
  • Pick one size to be the unit stored in a single
    address
  • Store a larger datum in a set of contiguous memory
    locations
  • Store a smaller datum in one location; use shift
    and mask ops
  • Today, almost all machines (including MIPS) are
    byte-addressable: each addressable location in
    memory holds 8 bits.
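As an illustration of the shift and mask operations mentioned above, the sketch below reads and replaces a single byte inside a 32-bit word. The function names and the convention that byte 0 is the least significant byte are assumptions for this example.

```python
WORD_MASK = 0xFFFFFFFF  # keep results within 32 bits


def extract_byte(word, index):
    """Return byte `index` of a 32-bit word (0 = least significant byte)."""
    return (word >> (8 * index)) & 0xFF


def insert_byte(word, index, value):
    """Return `word` with byte `index` replaced by the low 8 bits of value."""
    shift = 8 * index
    cleared = word & ~(0xFF << shift) & WORD_MASK
    return cleared | ((value & 0xFF) << shift)


word = 0x000F4240
print(hex(extract_byte(word, 1)))        # 0x42
print(hex(insert_byte(word, 0, 0xFF)))   # 0xf42ff
```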

29
MIPS Memory
  • On a byte-addressable machine such as the MIPS,
    if we say a word (32 bits) is stored at address
    80, we mean it occupies locations 80-83. (The
    next word would start at 84.)
  • Normally, multi-byte loads and stores must be
    aligned: the address of an n-byte load/store
    must be a multiple of n. For instance, halfwords
    can only be stored at even addresses.
  • MIPS allows non-aligned loads and stores using
    special instructions, but they may be slower.
    (Most processors don't allow this at all!)
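The alignment rule can be written as a one-line check. The helper names below are hypothetical; the sketch only restates the rule that an n-byte access must start at a multiple of n, and lists the byte addresses a datum occupies.

```python
def is_aligned(address, size):
    """True if an access of `size` bytes at `address` is aligned."""
    return address % size == 0


def occupied_bytes(address, size):
    """Byte addresses covered by a datum of `size` bytes at `address`."""
    return list(range(address, address + size))


print(occupied_bytes(80, 4))   # [80, 81, 82, 83]
print(is_aligned(80, 4))       # True: 80 is a multiple of 4
print(is_aligned(82, 4))       # False: a word cannot start at 82
print(is_aligned(82, 2))       # True: halfwords may start at even addresses
```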

30
Byte-Order (Endianness)
  • For a multi-byte datum, which part goes in which
    byte?
  • If $1 contains 1,000,000 (F4240H) and we store it
    into address 80:
  • On a big-endian machine, the big end goes
    into address 80
  • On a little-endian machine, it's the other way
    around

Big-endian:
Address    79  80  81  82  83  84
Contents       00  0F  42  40

Little-endian:
Address    79  80  81  82  83  84
Contents       40  42  0F  00
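The two byte layouts can be reproduced with Python's built-in int.to_bytes. The bytearray standing in for memory and the helper name store_word are assumptions for illustration.

```python
def store_word(memory, address, value, byteorder):
    """Store a 32-bit value into 4 consecutive bytes of `memory`."""
    memory[address:address + 4] = value.to_bytes(4, byteorder=byteorder)


value = 1_000_000  # 0x000F4240

big_mem = bytearray(128)
little_mem = bytearray(128)
store_word(big_mem, 80, value, "big")
store_word(little_mem, 80, value, "little")

print(big_mem[80:84].hex(" "))      # 00 0f 42 40
print(little_mem[80:84].hex(" "))   # 40 42 0f 00

# Reading the bytes back with the wrong byte order gives a different number,
# which is the compatibility problem when exchanging multi-byte data.
print(int.from_bytes(big_mem[80:84], byteorder="little"))
```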

31
Big-Endian vs. Little-Endian
  • Big-endian machines: MIPS, Sparc, 68000
  • Little-endian machines: most Intel processors,
    Alpha, VAX, Intel 8086
  • No real reason one is better than the other
  • Compatibility problems arise when transferring
    multi-byte data between big-endian and
    little-endian machines. CAREFUL!
  • Read Appendix A-43 for more information.

32
Addressing Modes
  • An ISA's addressing modes answer the question:
    where can operands be located?
  • We have two types of storage in the MIPS (and
    most other machines): registers and main memory.
  • We can go to either or both for operands. A
    single operand can come from either a register or
    a memory location,
  • and addressing modes offer various ways of
    specifying this location.

33
Simple Addressing Modes
  • In these modes, a location or datum is given
    directly in the instruction

Mode name              Example         Meaning
Register               mov $1, $2      R2 → R1
Direct (or absolute)   mov $1, (40)    M[40] → R1
Immediate              mov $1, #40     40 → R1

34
Indirect Addressing Modes
  • One or more registers are used to produce a
    memory address

Mode name       Example           Meaning
Reg. Indirect   mov $1, ($2)      M[R2] → R1
Displacement    mov $1, 40($2)    M[40 + R2] → R1
Indexed         mov $1, $4($2)    M[R4 + R2] → R1
Mem. Indirect   mov $1, @($2)     M[M[R2]] → R1

35
Advanced Addressing Modes
  • Extra features to support features in high-level
    languages or reduce the number of instructions
    during common memory accesses

Mode name        Example              Meaning
Auto-increment   mov $1, 4($2)++      M[4 + R2] → R1
Auto-decrement   mov $1, 4($2)--      M[R2 - 4] → R1
Scaled           mov $1, 40($2)[s]    M[40 + R2 x s] → R1

36
Choices in Addressing Modes
  • Anything goes: any addressing mode may be used
    for any operand at any time
  • - Easier to map high-level statements directly
    to instructions
  • - Hard to design processor, due to all the
    complexity
  • Limited addressing: only allow a few modes,
    and/or restrict some operands to certain modes
  • - Harder for compiler/programmer to follow all
    the rules
  • - Code may be longer
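To make the modes from the preceding slides concrete, here is a sketch of how a source operand might be resolved under each one. The function name, the keyword parameters, and the register and memory contents are assumptions for illustration. The auto-increment and auto-decrement modes are left out because they also modify a register.

```python
def operand_value(mode, regs, mem, reg=None, const=None, index=None, scale=1):
    """Return the source operand selected by an addressing mode."""
    if mode == "register":
        return regs[reg]                        # R[reg]
    if mode == "immediate":
        return const                            # constant in the instruction
    if mode == "direct":
        return mem[const]                       # M[const]
    if mode == "reg_indirect":
        return mem[regs[reg]]                   # M[R[reg]]
    if mode == "displacement":
        return mem[const + regs[reg]]           # M[const + R[reg]]
    if mode == "indexed":
        return mem[regs[index] + regs[reg]]     # M[R[index] + R[reg]]
    if mode == "mem_indirect":
        return mem[mem[regs[reg]]]              # M[M[R[reg]]]
    if mode == "scaled":
        return mem[const + regs[reg] * scale]   # M[const + R[reg] * scale]
    raise ValueError(f"unknown addressing mode: {mode}")


# Made-up machine state: R2 = 8, R4 = 4, and a sparse memory.
am_regs = [0] * 32
am_regs[2], am_regs[4] = 8, 4
am_mem = {8: 12, 12: 77, 40: 5, 48: 6, 56: 9}

print(operand_value("displacement", am_regs, am_mem, reg=2, const=40))  # M[48]
print(operand_value("mem_indirect", am_regs, am_mem, reg=2))            # M[M[8]]
```

A processor that allows every mode on every operand needs this whole decision tree for each operand of each instruction, which is part of the design complexity noted above.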

37
Frequency of Addressing Modes
  • 3 programs measured on VAX, which supports all
    kinds of modes

Frequency of mode (%)
Mode name       Min.   Ave.   Max.
Displacement    32     42     55
Immediate       17     33     43
Reg. Indirect   3      13     24
Scaled          0      7      16
Mem. Indirect   1      3      6
Others          0      2      3

38
Empirical Data on Addressing Modes
  • How big do the displacements need to be?
  • In a study of SPECint92 and SPECfp92, 99% of
    displacements fell within ±2^15
  • How big do the immediates (constants) need to be?
  • Studies show 50-60% fit within 8 bits
  • 75-80% fit within 16 bits
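Whether a constant fits in a field of a given width is a simple range test. The sketch below assumes signed two's-complement fields, which the slide does not state explicitly, and the function name is hypothetical.

```python
def fits_signed(value, bits):
    """True if `value` is representable as a signed two's-complement
    number of the given width."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


for constant in (100, -128, 200, 32767, 40000):
    print(constant, fits_signed(constant, 8), fits_signed(constant, 16))
```

Under this assumption a 16-bit field covers -32768 through 32767.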

Exercise: search for current results (e.g., for
SPEC2005?)

39
How Do We Represent Instructions?
  • We need some bits to tell what operation is
    performed (e.g., add, sub, mul, etc.); this is
    called the opcode.
  • We need some bits for each operand and result (3
    total, in our case):
  • What type of addressing mode
  • Number of the register, memory address and/or
    immediate constant
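As one concrete way of packing these fields into bits, the sketch below assumes the MIPS R-type layout, which the slide itself does not spell out: a 6-bit opcode, three 5-bit register numbers (rs, rt, rd), a 5-bit shift amount, and a 6-bit function code. The function names are hypothetical.

```python
# (field name, width in bits), from most to least significant
R_TYPE_FIELDS = [("opcode", 6), ("rs", 5), ("rt", 5),
                 ("rd", 5), ("shamt", 5), ("funct", 6)]


def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six R-type fields into one 32-bit instruction word."""
    word = 0
    values = (opcode, rs, rt, rd, shamt, funct)
    for (name, width), value in zip(R_TYPE_FIELDS, values):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Split a 32-bit word back into its six R-type fields."""
    fields = {}
    shift = 32
    for name, width in R_TYPE_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# add $3, $1, $2: opcode 0, rs = 1, rt = 2, rd = 3, funct = 0x20
instruction = encode_r_type(0, 1, 2, 3, 0, 0x20)
print(hex(instruction))            # 0x221820
print(decode_r_type(instruction))
```

With only 32 registers, each operand costs 5 bits, so three operands and the operation fit comfortably in one 32-bit word.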

40
Variable-Length Instructions
  • Since the VAX allows any mode for any operand,
    there could be an instruction with three 32-bit
    addresses (direct addressing) => more than 12
    bytes in this instruction.
  • But registers need only a few bits to specify, so
    12 bytes would be wasteful for an instruction
    using 3 registers only!
  • Must use variable-length instructions. On the
    VAX, instructions can vary from 1 to 17 bytes!

41
Fixed-Length Instructions
  • If every instruction has the same number of bits
    (preferably a nice even number like 16 or 32),
    many components of the processor will be simpler.
  • But we either waste some amount of space or
    can't support all the addressing modes!

42
Loading Small Integers
  • All registers in MIPS are 32 bits
  • What if we load a byte or halfword into a reg?
  • Load the bits into the lowest 8 or 16 bits of the
    reg.
  • Unsigned load: all upper bits set to 0
  • Signed load: all upper bits set to sign bit (MSB
    of byte/halfword)
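The two kinds of load differ only in how the upper bits of the 32-bit register are filled. A minimal sketch, with hypothetical function names:

```python
def zero_extend(value, bits):
    """Unsigned load: keep the low `bits` bits, upper bits become 0."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Signed load: copy the sign bit (MSB of the byte/halfword) into all
    upper bits of a 32-bit register."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return ((value ^ sign_bit) - sign_bit) & 0xFFFFFFFF


print(hex(zero_extend(0x80, 8)))      # 0x80
print(hex(sign_extend(0x80, 8)))      # 0xffffff80
print(hex(sign_extend(0x7F, 8)))      # 0x7f
print(hex(sign_extend(0x8000, 16)))   # 0xffff8000
```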

43
The RISC Approach
  • In a Reduced Instruction Set Computer (RISC):
  • All instructions are the same size (32 bits on
    the MIPS)
  • Few addressing modes are supported (only the
    frequent ones)
  • Only a few instruction formats (makes decoding
    easier!)
  • Arithmetic instructions can only work on
    registers
  • Data in memory must be loaded into registers
    before processing
  • - This is called a load-store architecture

44
RISC Criteria [Colwell 85]
  • Single-cycle operation
  • Load/store machine
  • Hardwired control
  • Relatively few instructions and addressing modes
  • Fixed instruction format
  • More compile-time effort