Instructions: Language of the Computer - PowerPoint PPT Presentation

Instructions: Language of the Computer

Description:

Memory is big (lots of address bits) 7. CS 331. Xiaoyu Zhang, CSUSM. Memory-to-memory machine ... big endian byte 0. Alignment: require that objects fall on address ... – PowerPoint PPT presentation

Provided by: xiaoyu1


1
Instructions: Language of the Computer
  • Chapter 2

2
Instruction Set Architecture: a Critical
Interface
[Figure: layers labeled software / instruction set / hardware]
Portion of the machine that is visible to the
programmer or the compiler writer.
3
Good ISA
  • Lasts through many implementations (portability,
    compatibility)
  • Can be used for many different applications
    (generality)
  • Provide convenient functionality to higher levels
  • Permits an efficient implementation at lower
    levels

4
Von Neumann Machines
  • Von Neumann invented the stored-program computer
    in 1945
  • Instead of program code being hardwired, the
    program code (instructions) is placed in memory
    along with data

[Figure: Control, ALU, and a memory holding Program and Data]
5
Stored Program Concept
  • Instructions are bits
  • Programs are stored in memory to be read or
    written just like data
  • Fetch Execute Cycle
  • Instructions are fetched and put into a special
    register
  • Bits in the register "control" the subsequent
    actions
  • Fetch the next instruction and continue

memory for data, programs, compilers, editors,
etc.
6
Execution Cycle
Obtain instruction from program storage
Determine required actions and instruction size
Locate and obtain operand data
Compute result value or status
Deposit results in storage for later use
Determine successor instruction
7
Basic ISA Classes
  • Memory to Memory Machines
  • Every instruction contains a full memory address
    for each operand
  • Maybe the simplest ISA design
  • However memory is slow
  • Memory is big (lots of address bits)

8
Memory-to-memory machine
  • Assumptions
  • Two operands per operation
  • first operand is also the destination
  • Memory address 16 bits (2 bytes)
  • Operand size 32 bits (4 bytes)
  • Instruction code 8 bits (1 byte)
  • Example: A = B + C (hypothetical code)
  • mov A, B    ; A = B
  • add A, C    ; A = A + C
  • 5 bytes for instruction (1-byte code plus two
    2-byte addresses)
  • 4 bytes each to fetch the 1st and 2nd operands
  • 4 bytes to store results
  • add needs 17 bytes and mov needs 13 bytes
  • Total: 30 bytes of memory traffic
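The byte counts above can be rechecked with a short calculation. This is only a sketch under the slide's stated sizes; the function name and its parameters are illustrative and not part of any real ISA.

```python
# Sizes in bytes, taken from the slide's assumptions.
OPCODE, ADDRESS, OPERAND = 1, 2, 4

def mem_to_mem_traffic(operand_reads, result_writes):
    """Bytes moved for one two-address memory-to-memory instruction."""
    instruction = OPCODE + 2 * ADDRESS          # code + two full addresses
    return instruction + OPERAND * (operand_reads + result_writes)

mov_bytes = mem_to_mem_traffic(operand_reads=1, result_writes=1)  # mov A, B
add_bytes = mem_to_mem_traffic(operand_reads=2, result_writes=1)  # add A, C
total_bytes = mov_bytes + add_bytes
print(mov_bytes, add_bytes, total_bytes)
```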

9
Why CPU Storage?
  • A small amount of storage in the CPU
  • To reduce memory traffic by keeping repeatedly
    used operands in the CPU
  • Avoid re-referencing memory
  • Avoid having to specify full memory address of
    the operand
  • This is a perfect example of "make the common
    case fast."
  • Simplest Case
  • A machine with 1 cell of CPU storage: the
    accumulator

10
Accumulator Machine
  • Assumptions
  • Two operands per operation
  • 1st operand in the accumulator
  • 2nd operand in the memory
  • accumulator is also the destination (except for
    store)
  • Memory address 16 bits (2 bytes)
  • Operand size 32 bits (4 bytes)
  • Instruction code 8 bits (1 byte)
  • Example: A = B + C (hypothetical code)
  • Load B     ; acc = B
  • Add C      ; acc = acc + C
  • Store A    ; A = acc
  • 3 bytes for instruction
  • 4 bytes to load or store the second operand
  • 7 bytes per instruction
  • 21 bytes total memory traffic
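A minimal simulation of the accumulator machine, assuming the sizes listed on this slide, reproduces both the result and the traffic count. The tuple encoding of instructions and the sample values of B and C are hypothetical.

```python
OPCODE_BYTES, ADDRESS_BYTES, OPERAND_BYTES = 1, 2, 4

def run_accumulator(program, memory):
    """Run (op, address) pairs; return the bytes of memory traffic generated."""
    acc = 0
    traffic = 0
    for op, addr in program:
        # Every instruction fetches 3 bytes and moves one 4-byte operand.
        traffic += OPCODE_BYTES + ADDRESS_BYTES + OPERAND_BYTES
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        elif op == "store":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown op: {op}")
    return traffic

memory = {"A": 0, "B": 2, "C": 3}   # hypothetical values
traffic = run_accumulator([("load", "B"), ("add", "C"), ("store", "A")], memory)
print(memory["A"], traffic)
```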

11
Stack Machines
  • Instruction sets are based on a stack model of
    execution.
  • Aimed for compact instruction encoding
  • Most instructions manipulate top few data items
    (mostly top 2) of a pushdown stack.
  • Top few items of the stack are kept in the CPU
  • Ideal for evaluating expressions (stack holds
    intermediate results)
  • Were thought to be a good match for high level
    languages
  • Awkward
  • Becomes very slow if the stack grows beyond CPU
    local storage
  • No simple way to get data from middle of stack

12
Stack Machines
  • Binary arithmetic and logic operations
  • Operands: top 2 items on stack
  • Operands are removed from stack
  • Result is placed on top of stack
  • Unary arithmetic and logic operations
  • Operand: top item on the stack
  • Operand is replaced by result of operation
  • Data move operations
  • Push: place memory data on top of stack
  • Pop: move top of stack to memory
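The three kinds of stack operations can be sketched as a tiny interpreter. The mnemonics `add` and `neg` and the dictionary standing in for memory are assumptions made for illustration only.

```python
def run_stack_machine(program, memory):
    """Execute stack-machine instructions against a dict used as memory."""
    stack = []
    for instruction in program:
        op, *operands = instruction.split()
        if op == "push":            # place memory data on top of stack
            stack.append(memory[operands[0]])
        elif op == "pop":           # move top of stack to memory
            memory[operands[0]] = stack.pop()
        elif op == "add":           # binary: remove two operands, push result
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "neg":           # unary: replace top item with result
            stack.append(-stack.pop())
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return stack

memory = {"A": 0, "B": 2, "C": 3}   # hypothetical values
leftover = run_stack_machine(["push B", "push C", "add", "pop A"], memory)
print(memory["A"], leftover)
```

The program `push B, push C, add, pop A` computes A = B + C without naming any operand in the `add` itself, which is where the compact encoding comes from.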

13
General Purpose Register Machines
  • With stack machines, only the top two elements of
    the stack are directly available to instructions.
    In general purpose register machines, the CPU
    storage is organized as a set of registers which
    are equally available to the instructions
  • Frequently used operands are placed in registers
    (under program control)
  • Reduces instruction size
  • Reduces memory traffic

14
General Purpose Registers Dominate

  • 1975-present: all machines use general purpose
    registers
  • Advantages of registers
  • registers are faster than memory
  • registers are easier for a compiler to use
  • e.g., (A*B) - (C*D) - (E*F) can do multiplies in
    any order vs. stack
  • registers can hold variables
  • memory traffic is reduced, so program is sped up
    (since registers are faster than memory)
  • code density improves (since a register is named
    with fewer bits than a memory location)
15
Classifying General Purpose Register Machines
  • General purpose register machines are
    sub-classified based on whether or not memory
    operands can be used by typical ALU instructions
  • Register-memory machines: machines where some ALU
    instructions can specify at least one memory
    operand and one register operand
  • Load-store machines: the only instructions that
    can access memory are the load and the store
    instructions

16
Comparing number of instructions
  • Code sequence for A = B + C for five classes of
    instruction sets

Class                        Code sequence
Register (register-memory)   load R1, B; add R1, C; store A, R1
Register (load-store)        load R1, B; load R2, C; add R1, R1, R2; store A, R1
Stack                        push B; push C; add; pop A
Memory to memory             mov A, B; add A, C
Accumulator                  load B; add C; store A
MIPS is one of these
17
Instruction Set Definition
  • Objects = architecture entities = machine state
  • Registers
  • General purpose
  • Special purpose (e.g. program counter, condition
    code, stack pointer)
  • Memory locations
  • Linear address space: 0, 1, 2, ..., 2^s - 1
  • Operations = instruction types
  • Data operation
  • Arithmetic
  • Logical
  • Data transfer
  • Move (from register to register)
  • Load (from memory location to register)
  • Store (from register to memory location)
  • Instruction sequencing
  • Branch (conditional)
  • Jump (unconditional)

18
Registers (MIPS)
  • 32 registers provided (but not 32-useable
    registers!)
  • R0 .. R31
  • Register R0 is hard-wired to zero
  • Register R1 is reserved for assembler
  • Operands of arithmetic instructions must be
    registers

19
MIPS Software conventions for Registers
 0      zero    constant 0
 1      at      reserved for assembler
 2      v0      expression evaluation and
 3      v1      function results
 4      a0      arguments
 5      a1
 6      a2
 7      a3
 8      t0      temporary: caller saves
 ...            (callee can clobber)
 15     t7
 16     s0      callee saves
 ...            (callee must save)
 23     s7
 24     t8      temporary (cont'd)
 25     t9
 26     k0      reserved for OS kernel
 27     k1
 28     gp      pointer to global area
 29     sp      stack pointer
 30     fp      frame pointer
 31     ra      return address (HW)
20
Memory Organization
  • Viewed as a large, single-dimension array, with
    an address.
  • A memory address is an index into the array
  • "Byte addressing" means that the index points to
    a byte of memory.

Address   Contents
0         8 bits of data
1         8 bits of data
2         8 bits of data
3         8 bits of data
4         8 bits of data
5         8 bits of data
6         8 bits of data
...
21
Memory Addressing
  • Bytes are nice, but most data items use larger
    "words"
  • For MIPS, a word is 32 bits or 4 bytes.
  • 2 questions for design of ISA
  • Since one could read a 32-bit word as four loads
    of bytes from sequential byte addresses or as one
    load word from a single byte address,
  • How do byte addresses map to word addresses?
  • Can a word be placed on any byte boundary?


22
Addressing Objects: Endianness and Alignment
  • Big Endian: address of most significant byte =
    word address (xx00 = Big End of word)
  • IBM 360/370, Motorola 68k, MIPS, Sparc, HP PA
  • Little Endian: address of least significant byte =
    word address (xx00 = Little End of word)
  • Intel 80x86, DEC Vax, DEC Alpha (Windows NT)

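[Figure: byte numbering within a 32-bit word, msb on the left and lsb on the right. Little endian: bytes 3 2 1 0, so byte 0 is the lsb. Big endian: bytes 0 1 2 3, so byte 0 is the msb. Example word placements are marked Aligned and Not Aligned.]
Alignment: require that objects fall on an address
that is a multiple of their size.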
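Byte order and alignment are easy to check with Python's standard `struct` module. The sample word value and the helper name `is_aligned` are arbitrary choices for illustration.

```python
import struct

word = 0x0A0B0C0D   # arbitrary 32-bit value

# ">I" packs big endian: most significant byte at the lowest address.
big = struct.pack(">I", word)
# "<I" packs little endian: least significant byte at the lowest address.
little = struct.pack("<I", word)

def is_aligned(address, size):
    """An object is aligned when its address is a multiple of its size."""
    return address % size == 0

print(big.hex(), little.hex())
print(is_aligned(8, 4), is_aligned(6, 4))
```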
23
Assembly Language vs. Machine Language
  • Assembly provides convenient symbolic
    representation
  • much easier than writing down numbers
  • e.g., destination first
  • Machine language is the underlying reality
  • e.g., destination is no longer first
  • Assembly can provide 'pseudoinstructions'
  • e.g., move t0, t1 exists only in Assembly
  • would be implemented using add t0,t1,zero
  • When considering performance you should count
    real instructions

24
MIPS arithmetic
  • Design Principle: simplicity favors regularity.
    Why?
  • Of course this complicates some things...
  • C code: A = B + C + D; E = F - A;
  • MIPS code: add t0, s1, s2
  • add s0, t0, s3
  • sub s4, s5, s0
  • Operands must be registers, only 32 registers
    provided
  • Design Principle: smaller is faster. Why?

25
MIPS arithmetic
  • All ALU instructions have 3 operands
  • add R1, R2, R3
  • sub R1, R2, R3
  • Operand order is fixed (destination first)
  • Example C code: A = B + C
  • MIPS code: add s0, s1, s2 (registers
    associated with variables by compiler)

26
Execution assembly instructions
  • Program counter holds the instruction address
  • CPU fetches instruction from memory and puts it
    onto the instruction register
  • Control logic decodes the instruction and tells
    the register file, ALU and other registers what
    to do
  • An ALU operation (e.g. add): data flows from the
    register file, through the ALU and back to the
    register file

27
ALU Execution Example
28
ALU Execution example
29
Memory Instructions
  • Load and store instructions
  • lw t1, offset(t0)
  • sw t1, offset(t0)
  • Example C code: A[8] = h + A[8]; assume h in
    s2 and base address of the array A in s3
  • MIPS code: lw t0, 32(s3)
  • add t0, s2, t0
  • sw t0, 32(s3)
  • Store word has destination last
  • Remember arithmetic operands are registers, not
    memory!
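The offset 32 in the example is the array index 8 times the 4-byte word size. A small model of the load, add, store sequence makes that explicit; the helper functions, the array contents, and the values chosen for the base address and h are assumptions for illustration.

```python
WORD = 4  # bytes per MIPS word

def lw(memory, base, offset):
    """Load the word at byte address base + offset."""
    return memory[(base + offset) // WORD]

def sw(memory, base, offset, value):
    """Store a word at byte address base + offset."""
    memory[(base + offset) // WORD] = value

A = list(range(10, 20))   # hypothetical contents, so A[8] starts as 18
base, h = 0, 5            # hypothetical base address (s3) and h (s2)

t0 = lw(A, base, 8 * WORD)      # lw  t0, 32(s3)
t0 = h + t0                     # add t0, s2, t0
sw(A, base, 8 * WORD, t0)       # sw  t0, 32(s3)
print(A[8])
```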

30
Accessing Data
  • ALU generates address
  • Address goes to Memory address register
  • When memory is accessed, results are returned to
    Memory data register
  • Notice that data and instruction addresses can be
    the same: both just address memory

31
Memory Operations - Loads
  • Load data from memory
  • lw R6, 0(R5)    # R6 = mem[R5 + 0]

32
Memory Operations - Stores
  • Storing data to memory works essentially the same
    way
  • sw R6, 0(R5)
  • R6 = 200; let's assume R5 = 0x18
  • mem[0x18] = 200

33
So far we've learned
  • MIPS: loading words but addressing bytes;
    arithmetic on registers only
  • Instruction           Meaning
    add s1, s2, s3        s1 = s2 + s3
    sub s1, s2, s3        s1 = s2 - s3
    lw s1, 100(s2)        s1 = Memory[s2+100]
    sw s1, 100(s2)        Memory[s2+100] = s1

34
Use of Registers
  • Example
  • a = (b + c) - (d + e); // C statement
  • s0 - s4: a - e
  • add t0, s1, s2
  • add t1, s3, s4
  • sub s0, t0, t1
  • a = b + A[4]; // add an array element to a var
  • // s3 has address A
  • lw t0, 16(s3)
  • add s1, s2, t0

35
Use of Registers load and store
  • Example
  • A[8] = a + A[6]; // A is in s3, a is in s2
  • lw t0, 24(s3)     # t0 gets A[6] contents
  • add t0, s2, t0    # t0 gets the sum
  • sw t0, 32(s3)     # sum is put in A[8]


36
load and store
  • Ex
  • a = b + A[i]; // A is in s3; a, b, i in
    // s1, s2, s4
  • add t1, s4, s4    # t1 = 2 * i
  • add t1, t1, t1    # t1 = 4 * i
  • add t1, t1, s3    # t1 = addr. of A[i]
  •                   # (s3 + (4 * i))
  • lw t0, 0(t1)      # t0 = A[i]
  • add s1, s2, t0    # a = b + A[i]

37
Example Swap
  • Swapping words
  • s2 has the base address of the array v

temp = v[0]; v[0] = v[1]; v[1] = temp;
swap: lw t0, 0(s2)
      lw t1, 4(s2)
      sw t0, 4(s2)
      sw t1, 0(s2)
38
Machine Language
  • Instructions, like registers and words of data,
    are also 32 bits long
  • Example: add t0, s1, s2
  • registers have numbers: t0=8, s1=17, s2=18
  • Instruction Format:
    000000 10001 10010 01000 00000 100000
    op     rs    rt    rd    shamt funct
  • Can you guess what the field names stand for?

39
Arithmetic Operation
  • op: operation of the instruction
  • rs: first register source operand
  • rt: second register source operand
  • rd: register destination operand
  • shamt: shift amount
  • funct: function (select type of ALU operation)
  • add: 32
  • sub: 34
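Packing the six fields into a word is plain shifting and OR-ing. The sketch below encodes `add t0, s1, s2` using the field widths 6, 5, 5, 5, 5, 6 and the register numbers given on the previous slide; the function name is illustrative.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack the R-type fields (6, 5, 5, 5, 5, 6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

T0, S1, S2 = 8, 17, 18   # register numbers from the slide
ADD_FUNCT = 32

# add t0, s1, s2: destination t0 goes in rd, sources s1 and s2 in rs and rt.
word = encode_r_type(0, S1, S2, T0, 0, ADD_FUNCT)
print(f"{word:032b}")
print(f"0x{word:08X}")
```

Note that the destination comes first in assembly but sits in the fourth field of the machine word.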

40
Machine Language
  • Consider the load-word and store-word
    instructions,
  • What would the regularity principle have us do?
  • New principle: Good design demands a compromise
  • Introduce a new type of instruction format
  • I-type for data transfer instructions
  • other format was R-type for register
  • Example: lw t0, 32(s2)
    35   18   8    32
    op   rs   rt   16 bit number
  • Where's the compromise?
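The I-type layout can be sketched the same way as R-type: the last three R-type fields are merged into one 16-bit number. The function name is illustrative; the field values for `lw t0, 32(s2)` come from the slide.

```python
def encode_i_type(op, rs, rt, immediate):
    """Pack op (6 bits), rs (5), rt (5) and a 16-bit immediate into a word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

LW_OP = 35
T0, S2 = 8, 18

word = encode_i_type(LW_OP, S2, T0, 32)   # lw t0, 32(s2)
print(f"{word:032b}")
print(f"0x{word:08X}")
```

One way to read the compromise: all instructions stay 32 bits wide and the first three fields keep their positions, at the price of having more than one format.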

41
Generic Examples of Instruction Format Widths

Variable Fixed

42
Instruction Formats
  • If code size is most important, use variable
    length instructions
  • If performance is most important, use fixed
    length instructions
  • Recent embedded machines (ARM, MIPS) added an
    optional mode to execute a subset of 16-bit wide
    instructions (Thumb, MIPS16); per procedure,
    decide performance or density
  • Some architectures are actually exploring
    on-the-fly decompression for more density.

43
Example
  • C code: A[300] = h + A[300];
  • Assembler code
  • lw t0, 1200(s3)
  • add t0, s2, t0
  • sw t0, 1200(s3)
  • Binary code (decimal notation)

44
Example
  • Real binary code
  • The highlighted number shows the difference of
    only 1 bit in the op codes.
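The encodings of the three assembler instructions can be regenerated to see the one-bit difference. This is a sketch that assumes the register numbers from the conventions table (t0=8, s2=18, s3=19) and the standard MIPS opcodes 35 for lw and 43 for sw; the helper names are illustrative.

```python
def i_type(op, rs, rt, immediate):
    """Pack op(6) rs(5) rt(5) immediate(16) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

def r_type(op, rs, rt, rd, shamt, funct):
    """Pack op(6) rs(5) rt(5) rd(5) shamt(5) funct(6) into a 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

T0, S2, S3 = 8, 18, 19

lw_word = i_type(35, S3, T0, 1200)         # lw  t0, 1200(s3)
add_word = r_type(0, S2, T0, T0, 0, 32)    # add t0, s2, t0
sw_word = i_type(43, S3, T0, 1200)         # sw  t0, 1200(s3)

for word in (lw_word, add_word, sw_word):
    print(f"{word:032b}")

# lw and sw share every field except the opcode (100011 vs. 101011).
differing_bits = bin(lw_word ^ sw_word).count("1")
print(differing_bits)
```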

45
Constants
  • Small constants are used quite frequently (50% of
    operands), e.g., A = A + 5; B = B + 1;
    C = C - 18;
  • Solutions? Why not?
  • put 'typical constants' in memory and load them?
  • create hard-wired registers (like zero) for
    constants like one?
  • MIPS Instructions: addi s0, s0, 4
  • andi s0, s0, 6
  • ori s0, s0, 4
  • Make the common case fast

46
Loading Immediate Values
  • How do we put a constant (immediate) value into a
    register
  • addi R6, R0, 100
  • Put the value 100 into register R6: R6 = 0 + 100 = 100

47
Loading Immediate Values
R: op rs rt rd shamt funct
I: op rs rt 16 bit address
  • What should be the format of addi?
  • addi is in I format
  • What's the largest immediate value that can be
    loaded into a register?
  • But, how do we load larger numbers?
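As a worked answer to the question above: addi sign-extends its 16-bit immediate, so the field holds a two's-complement value. The helper name below is illustrative.

```python
IMMEDIATE_BITS = 16

# A 16-bit two's-complement field covers -2^15 .. 2^15 - 1.
largest = 2 ** (IMMEDIATE_BITS - 1) - 1
smallest = -(2 ** (IMMEDIATE_BITS - 1))

def fits_in_immediate(value):
    """True when value can be encoded directly in an addi immediate."""
    return smallest <= value <= largest

print(smallest, largest)
print(fits_in_immediate(100), fits_in_immediate(40000))
```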

48
Load Upper Immediate
  • Example: lui t0, 255

Transfers the immediate field into the register's
top 16 bits and fills the register's lower 16
bits with zeros: R8[31:16] = 255, lower 16 bits of R8 = 0
49
How about larger constants?
  • We'd like to be able to load a 32 bit constant
    into a register
  • Must use two instructions, new "load upper
    immediate" instruction: lui t0,
    1010101010101010
  • Then must get the lower order bits right,
    i.e., addi t0, t0, 0010101010101010
  • ori t0, t0, 0010101010101010

    1010101010101010 0000000000000000   (t0 after lui)
    0000000000000000 0010101010101010   (immediate)
    addi
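The two-instruction sequence can be modeled on 32-bit values. The functions below are simplified stand-ins for the real instructions and ignore overflow traps; they also show why ori is the safer choice for the low half, since addi sign-extends its immediate.

```python
MASK32 = 0xFFFFFFFF

def lui(immediate):
    """Place a 16-bit immediate in the upper half; lower half becomes zero."""
    return (immediate << 16) & MASK32

def ori(register, immediate):
    """OR with a zero-extended 16-bit immediate."""
    return register | (immediate & 0xFFFF)

def addi(register, immediate):
    """Add a sign-extended 16-bit immediate (overflow trap not modeled)."""
    signed = immediate - 0x10000 if immediate & 0x8000 else immediate
    return (register + signed) & MASK32

upper = 0b1010101010101010
lower = 0b0010101010101010

t0 = ori(lui(upper), lower)
print(f"{t0:032b}")
```

With this particular lower half, bit 15 is 0, so addi and ori give the same result. If bit 15 were 1, addi would subtract instead of filling in the low bits.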
50
Logical Operations Shifting Bits
  • Shift left or right with instructions sll and
    srl.
  • sll t2, s0, 2    # t2 = s0 << 2
  • srl t2, s0, 2    # t2 = s0 >> 2
  • Fill with zeros for shift operations
  • Example: sll t0, s2, 3
  • s2 = 0110 0000 0000 0000 1100 1000 0000 1111
  • t0 = 0000 0000 0000 0110 0100 0000 0111 1000
  • The sll and srl instructions are R format
    instructions

op    rs    rt    rd    shamt    funct
0     0     16    10    2        0
the shift amount field is used
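The example can be checked by modeling the shifts on 32-bit values. Python integers are unbounded, so the mask below is what stands in for the register width; the function names mirror the mnemonics but are otherwise illustrative.

```python
MASK32 = 0xFFFFFFFF

def sll(value, shamt):
    """Shift left logical: bits shifted past bit 31 are discarded."""
    return (value << shamt) & MASK32

def srl(value, shamt):
    """Shift right logical: vacated high bits are filled with zeros."""
    return (value & MASK32) >> shamt

s2 = 0b0110_0000_0000_0000_1100_1000_0000_1111
t0 = sll(s2, 3)   # sll t0, s2, 3
print(f"{t0:032b}")
```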
51
More Logical Operations
  • Logical Operations
  • AND
  • bit-wise AND between registers
  • and t1, s0, s1
  • OR
  • bit-wise OR between registers
  • or t1, s0, s1
  • NOR
  • Bit-wise NOR between registers
  • nor t1, s0, s1
  • nor t1, t0, zero    # t1 = NOT(t0)
  • Immediate modes
  • andi and ori

52
Example
  • Example: and R3, R10, R16
  • or R4, R10, R16
  • nor R5, R10, R16
  • R16 = 0000 0000 0000 0000 1100 1000 0000 1111
  • R10 = 0000 0000 0000 0110 0100 0000 0111 1000
  • R3  = 0000 0000 0000 0000 0100 0000 0000 1000
  • R4  = 0000 0000 0000 0110 1100 1000 0111 1111
  • R5  = 1111 1111 1111 1001 0011 0111 1000 0000
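The three results can be reproduced with Python's bitwise operators. NOR has no direct operator, so it is written as NOT of OR and masked to 32 bits; the variable names simply mirror the register names on the slide.

```python
MASK32 = 0xFFFFFFFF

r16 = 0b0000_0000_0000_0000_1100_1000_0000_1111
r10 = 0b0000_0000_0000_0110_0100_0000_0111_1000

r3 = r10 & r16                  # and R3, R10, R16
r4 = r10 | r16                  # or  R4, R10, R16
r5 = ~(r10 | r16) & MASK32      # nor R5, R10, R16

for value in (r3, r4, r5):
    print(f"{value:032b}")
```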

53
Example C Bit Fields
  • int data;
  • struct {
  • unsigned int ready: 1;
  • unsigned int enable: 1;
  • unsigned int receivedByte: 8;
  • } receiver;
  • data = receiver.receivedByte;
  • receiver.ready = 0;
  • receiver.enable = 1;

54
Example C Bit Fields
  • Assume data and receiver are in s0 and s1
  • sll s0, s1, 22
  • srl s0, s0, 24
  • andi s1, s1, 0xfffe
  • ori s1, s1, 0x0002
  • Alternative code sequence
  • srl s0, s1, 2
  • andi s0, s0, 0x00ff
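Both code sequences can be compared on a sample value. The sketch assumes the layout implied by the shift amounts (ready in bit 0, enable in bit 1, receivedByte in bits 2 to 9); C itself does not guarantee this layout, and the function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def extract_by_shifts(receiver):
    """sll by 22 then srl by 24: isolates bits 2..9."""
    return ((receiver << 22) & MASK32) >> 24

def extract_by_mask(receiver):
    """srl by 2 then andi 0x00ff: the alternative sequence."""
    return (receiver >> 2) & 0x00FF

def clear_ready_set_enable(receiver):
    """andi 0xfffe clears bit 0 (ready); ori 0x0002 sets bit 1 (enable)."""
    return (receiver & 0xFFFE) | 0x0002

# Hypothetical register contents: receivedByte = 0xA5, enable = 0, ready = 1.
receiver = (0xA5 << 2) | 0b01
data = extract_by_shifts(receiver)
updated = clear_ready_set_enable(receiver)
print(hex(data), bin(updated & 0b11))
```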

55
Instructions for Making Decisions
  • beq reg1, reg2, L1
  • Go to the statement labeled L1 if the value in
    reg1 equals the value in reg2
  • bne reg1, reg2, L1
  • Go to the statement labeled L1 if the value in
    reg1 does not equal the value in reg2
  • j L1
  • Unconditional jump
  • jr t0
  • jump register. Jump to the instruction
    specified in register t0

56
Making Decisions
  • Example
  • if (a != b) goto L1; // x,y,z,a,b mapped
    to s0-s4
  • x = y + z;
  • L1: x = x - a;
  • bne s3, s4, L1        # goto L1 if a != b
  • add s0, s1, s2        # x = y + z (ignored if
                          # a != b)
  • L1: sub s0, s0, s3    # x = x - a (always ex)
  • Reminder
  • Registers for variables in C code: s0 ... s7 =
    16 ... 23
  • Registers for temporary variables: t0 ... t7 =
    8 ... 15
  • Register zero is always 0

57
if-then-else
  • Example
  • if (a == b) x = y + z;
  • else x = y - z;
  • bne s3, s4, Else        # goto Else if a != b
  • add s0, s1, s2          # x = y + z
  • j Exit                  # goto Exit
  • Else: sub s0, s1, s2    # x = y - z
  • Exit:

58
Example Loop with array index
  • Loop: g = g + A[i]; i = i + j; if (i != h)
    goto Loop; ....
  • s1, s2, s3, s4 = g, h, i, j; array A base =
    s5
  • LOOP: add t1, s3, s3    # t1 = 2 * i
  • add t1, t1, t1          # t1 = 4 * i
  • add t1, t1, s5          # t1 = adr. of A[i]
  • lw t0, 0(t1)            # load A[i]
  • add s1, s1, t0          # g = g + A[i]
  • add s3, s3, s4          # i = i + j
  • bne s3, s2, LOOP

59
Loops
  • Example
  • while (A[i] == k) // i, j, k in s3, s4, s5
  • i = i + j; // A is in s6
  • Loop: sll t1, s3, 2    # t1 = 4 * i
  • add t1, t1, s6         # t1 = addr. of A[i]
  • lw t0, 0(t1)           # t0 = A[i]
  • bne t0, s5, Exit       # goto Exit if A[i] != k
  • add s3, s3, s4         # i = i + j
  • j Loop                 # goto Loop
  • Exit:

60
Other decisions
  • Set R1 on R2 less than R3: slt R1, R2, R3
  • Compares two registers, R2 and R3
  • R1 = 1 if R2 < R3, else R1 = 0
  • Example: slt t1, s1, s2
  • Branch less than
  • Example: if (A < B) goto LESS
  • slt t1, s1, s2        # t1 = 1 if A < B
  • bne t1, zero, LESS

61
Switch statement
  • switch (k) {
  • case 0: f = i + j; break;
  • case 1: f = g + h; break;
  • case 2: f = g - h; break;
  • case 3: f = i - j; break; }
  • f-k in s0-s5 and t2 contains 4 (maximum of var
    k)
  • The switch statement can be converted into a big
    chain of if-then-else statements.
  • A more efficient method is to use a jump address
    table of addresses of alternative instruction
    sequences and the jr instruction. Assume the
    table base address in t4

62
Switch cont.
  • slt t3, s5, zero      # is k < 0?
  • bne t3, zero, Exit    # if k < 0 goto Exit
  • slt t3, s5, t2        # is k < 4?
  • beq t3, zero, Exit    # if k >= 4 goto Exit
  • sll t1, s5, 2         # t1 = 4 * k
  • add t1, t1, t4        # t1 = addr. of t4[k]
  • lw t0, 0(t1)          # t0 = t4[k]
  • jr t0                 # jump to addr. in t0
  • # t4[0] = L0, t4[1] = L1, ...
  • L0: add s0, s3, s4    # f = i + j
  • j Exit
  • L1: add s0, s1, s2    # f = g + h
  • j Exit
  • L2: sub s0, s1, s2    # f = g - h
  • j Exit
  • L3: sub s0, s3, s4    # f = i - j
  • Exit:
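The jump-table idea carries over to any language with first-class code references. In this sketch a Python list of functions plays the role of the address table in t4, and the bounds check mirrors the two slt tests; the function name and argument order are illustrative.

```python
def switch(k, g, h, i, j, f):
    """Dispatch on k through a table instead of a chain of comparisons."""
    jump_table = [
        lambda: i + j,   # L0
        lambda: g + h,   # L1
        lambda: g - h,   # L2
        lambda: i - j,   # L3
    ]
    # Out-of-range k skips every case, leaving f unchanged (goto Exit).
    if k < 0 or k >= len(jump_table):
        return f
    return jump_table[k]()   # indexed load followed by an indirect jump

print(switch(0, g=1, h=2, i=3, j=4, f=99))
print(switch(3, g=1, h=2, i=3, j=4, f=99))
```

The cost of the dispatch is the same for every case, whereas an if-then-else chain grows with the number of cases tested before a match.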

63
MIPS Instruction Formats
  • More than one format for instructions,
    usually
  • Different kinds of instructions need different
    kinds of fields, data
  • Example: 3 MIPS instruction formats

R I J
64
Addresses in Branches and Jumps
  • Instructions
  • bne t4,t5,Label    Next instruction is at Label
    if t4 != t5
  • beq t4,t5,Label    Next instruction is at Label
    if t4 == t5
  • j Label            Next instruction is at Label
  • Formats
  • Addresses are not 32 bits. How do we handle
    this with large programs?
  • First idea: limit the branch space to the
    first 2^16 addresses

I: op rs rt 16 bit address
J: op 26 bit address
65
Addresses in Branches
  • Instructions
  • bne t4,t5,Label    Next instruction is at Label if
    t4 != t5
  • beq t4,t5,Label    Next instruction is at Label if
    t4 == t5
  • Formats
  • Treat the 16 bit number as an offset to the PC
    register: PC-relative addressing
  • Word offset instead of byte offset, why??
  • most branches are local (principle of locality)
  • Jump instructions just use the high order bits of
    PC: pseudodirect addressing
  • 32-bit jump address = 4 most significant bits of
    PC concatenated with 26-bit word address (or
    28-bit byte address)
  • Address boundaries of 256 MB

I: op rs rt 16 bit address
66
Conditional Branch Distance
25% of integer branches are 2 to 4 instructions
67
Conditional Branch Addressing
  • PC-relative since most branches are relatively
    close to the current PC
  • At least 8 bits suggested (+/-128 instructions)
  • Compare Equal/Not Equal most important for
    integer programs (86%)

68
PC-relative addressing
  • For larger distances Jump register jr required.

69
Example
  • LOOP: mult 9, 19, 10    # R9 = R19 * R10
  • lw 8, 1000(9)           # R8 = @(R9 + 1000)
  • bne 8, 21, EXIT
  • add 19, 19, 20          # i = i + j
  • j LOOP
  • EXIT: ......
  • Assume LOOP is placed at location 80000

70
Example
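  • LOOP: mult 9, 19, 10    # R9 = R19 * R10
  • lw 8, 1000(9)           # R8 = @(R9 + 1000)
  • bne 8, 21, EXIT
  • add 19, 19, 20          # i = i + j
  • j LOOP
  • EXIT: ...
  • Assume LOOP is placed at location 80000

[Encoding table: the bne address field holds 2 and the j address field holds 20000]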
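The address fields for this loop follow from PC-relative and pseudodirect addressing. The sketch below assumes 4-byte instructions laid out consecutively from 80000, so bne sits at 80008, j at 80016 and EXIT at 80020; the function names are illustrative.

```python
def branch_offset(branch_address, target_address):
    """Word offset stored in a branch, relative to the instruction after it."""
    return (target_address - (branch_address + 4)) // 4

def branch_target(branch_address, offset):
    """Inverse of branch_offset: PC + 4 plus the word offset times 4."""
    return branch_address + 4 + 4 * offset

def jump_field(target_address):
    """26-bit word address stored in a j instruction."""
    return (target_address >> 2) & 0x03FFFFFF

def jump_target(jump_address, field):
    """Top 4 bits of PC + 4 concatenated with the 28-bit byte address."""
    return ((jump_address + 4) & 0xF0000000) | (field << 2)

LOOP = 80000
names = ["mult", "lw", "bne", "add", "j", "EXIT"]
address = {name: LOOP + 4 * n for n, name in enumerate(names)}

bne_field = branch_offset(address["bne"], address["EXIT"])
j_field = jump_field(LOOP)
print(bne_field, j_field)
```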
71
MIPS Addressing Modes
CS 331
Xiaoyu Zhang, CSUSM
73
Procedure calls
  • Procedures or subroutines
  • Needed for structured programming
  • Steps followed in executing a procedure call
  • Place parameters in a place where the procedure
    (callee) can access them
  • Transfer control to the procedure
  • Acquire the storage resources needed for the
    procedure
  • Perform desired task
  • Place results in a place where the calling
    program (caller) can access them
  • Return control to the point of origin

74
Resources Involved
  • Registers used for procedure calling
  • a0 - a3: four argument registers in which to
    pass parameters
  • v0 - v1: two value registers in which to
    return values
  • ra: one return address register to return to
    the point of origin
  • Transferring the control to the callee
  • jal ProcedureAddress
  • jump-and-link to the procedure address
  • the return address (PC + 4) is saved in ra
  • Example: jal 20000
  • Returning the control to the caller
  • jr ra
  • instruction following jal is executed next

75
Memory Stacks
Useful for stacked environments/subroutine call &
return even if operand stack not part of
architecture
Stacks that Grow Up vs. Stacks that Grow Down
[Figure: memory addresses running between the low end (0, "Little") and the high end (inf., "Big"); the stack pointer SP marks the top of a stack holding a, b, c, drawn once growing up and once growing down]
76
Calling conventions
  • int func(int g, int h, int i, int j) {
  • int f;
  • f = (g + h) - (i + j);
  • return (f); }
  • // g,h,i,j - a0,a1,a2,a3, f in s0
  • func:
  • addi sp, sp, -12    # make room in stack for 3
                        # words
  • sw t1, 8(sp)        # save the regs we want to use
  • sw t0, 4(sp)
  • sw s0, 0(sp)
  • add t0, a0, a1      # t0 = g + h
  • add t1, a2, a3      # t1 = i + j
  • sub s0, t0, t1      # s0 has the result
  • add v0, s0, zero    # return reg v0 has f

77
Calling (cont.)
  • lw s0, 0(sp)       # restore s0
  • lw t0, 4(sp)       # restore t0
  • lw t1, 8(sp)       # restore t1
  • addi sp, sp, 12    # restore sp
  • jr ra
  • we did not have to restore t0-t9 (caller save)
  • we do need to restore s0-s7 (must be preserved
    by callee)

78
Nested Calls
Stacking of Subroutine Calls & Returns and
Environments
[Figure: call sequence A: CALL B, B: CALL C, C: RET, RET, with the stack of environments (A, B, C) growing and shrinking after each step]
  • Some machines provide a memory stack as part of
    the architecture (e.g., VAX, JVM)
  • Sometimes stacks are implemented via software
    convention

79
Compiling a String Copy Proc.
  • void strcpy(char x[], char y[]) {
  • int i = 0;
  • while ((x[i] = y[i]) != 0)
  • i++; }
  • // x and y base addr. are in a0 and a1
  • strcpy:
  • addi sp, sp, -4        # reserve 1 word space in
                           # stack
  • sw s0, 0(sp)           # save s0
  • add s0, zero, zero     # i = 0
  • L1: add t1, a1, s0     # addr. of y[i] in t1
  • lb t2, 0(t1)           # t2 = y[i]
  • add t3, a0, s0         # addr. of x[i] in t3
  • sb t2, 0(t3)           # x[i] = y[i]
  • beq t2, zero, L2       # if y[i] == 0 goto L2
  • addi s0, s0, 1         # i++
  • j L1                   # go to L1
  • L2: lw s0, 0(sp)       # restore s0
  • addi sp, sp, 4         # restore sp
  • jr ra                  # return

80
Array vs. Pointers
  • Clear1(int array[], int size) {
  • int i;
  • for (i = 0; i < size; i++)
  • array[i] = 0; }
  • Clear2(int *array, int size) {
  • int *p, i = 0;
  • for (p = &array[0]; p < &array[size]; p++)
  • *p = 0; }
  • // a0 addr. of array, a1 has size

81
Arrays vs. Pointers MIPS for array version
  • Clear1(int array[], int size) {
  • int i;
  • for (i = 0; i < size; i++)
  • array[i] = 0; }
  • // a0 addr. of array, a1 has size, t0 = i
  • move t0, zero            # i = 0
  • Loop1: add t1, t0, t0    # t1 = 2 * i
  • add t1, t1, t1           # t1 = 4 * i
  • add t2, a0, t1           # t2 = addr. of array[i]
  • sw zero, 0(t2)           # array[i] = 0
  • addi t0, t0, 1           # i++
  • slt t3, t0, a1           # check end of loop (i < size)
  • bne t3, zero, Loop1      # if i < size goto Loop1

82
Arrays vs. Pointers MIPS for pointer version
  • Clear2(int *array, int size) {
  • int *p, i = 0;
  • for (p = &array[0]; p < &array[size]; p++)
  • *p = 0; }
  • // a0 addr. of array, a1 has size, t0 = p
  • move t0, a0              # p = addr. of array[0]
  • add t1, a1, a1           # t1 = 2 * size
  • add t1, t1, t1           # t1 = 4 * size
  • add t2, a0, t1           # t2 = addr. of array[size]
  • Loop2: sw zero, 0(t0)    # store 0 in *p
  • addi t0, t0, 4           # p = p + 4
  • slt t3, t0, t2           # t3 = (p < &array[size])
  • bne t3, zero, Loop2      # if p < &array[size] goto Loop2
  • The pointer version reduces the number of
    instructions per iteration from 7 to 4
  • Many optimizing compilers will generate this
    code, even for array-based C code
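The saving can be put in numbers by counting executed instructions for both versions. This is a rough sketch that assumes size is at least 1, since both loops test at the bottom, and that treats every instruction as equal cost; the function names are illustrative.

```python
def array_version_instructions(size):
    """1 setup instruction (move) plus 7 instructions per iteration."""
    return 1 + 7 * size

def pointer_version_instructions(size):
    """4 setup instructions (move + 3 adds) plus 4 instructions per iteration."""
    return 4 + 4 * size

for size in (1, 10, 1000):
    print(size, array_version_instructions(size), pointer_version_instructions(size))
```

The pointer version pays three extra setup instructions once, so the two versions tie at size 1 and the pointer version pulls ahead for anything larger.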

83
Alternative Architectures
  • Design alternative
  • provide more powerful operations
  • goal is to reduce number of instructions executed
  • danger is a slower cycle time and/or a higher CPI
  • Sometimes referred to as RISC vs. CISC
  • virtually all new instruction sets since 1982
    have been RISC
  • VAX: minimize code size, make assembly language
    easy; instructions from 1 to 54 bytes long!
  • We'll look at PowerPC and Intel IA-32

84
PowerPC
  • PowerPC is a RISC architecture very similar to
    MIPS, but has some unique instructions
  • Indexed addressing
  • example: lw t1, a0+s3    # t1 = Memory[a0+s3]
  • What do we have to do in MIPS?
  • Update addressing
  • update a register as part of load (for marching
    through arrays)
  • example: lwu t0, 4(s3)   # t0 = Memory[s3+4];
                             # s3 = s3 + 4
  • What do we have to do in MIPS?
  • Others
  • load multiple/store multiple
  • a special counter register: bc Loop, ctr != 0
    # decrement counter, if not 0 goto loop

85
IA - 32
  • 1978: The Intel 8086 is announced (16 bit
    architecture)
  • 1980: The 8087 floating point coprocessor is
    added
  • 1982: The 80286 increases address space to 24
    bits, adds instructions
  • 1985: The 80386 extends to 32 bits, new
    addressing modes
  • 1989-1995: The 80486, Pentium, Pentium Pro add a
    few instructions (mostly designed for higher
    performance)
  • 1997: 57 new MMX instructions are added,
    Pentium II
  • 1999: The Pentium III added another 70
    instructions (SSE)
  • 2001: Another 144 instructions (SSE2)
  • 2003: AMD extends the architecture to increase
    address space to 64 bits, widens all registers
    to 64 bits and other changes (AMD64)
  • 2004: Intel capitulates and embraces AMD64
    (calls it EM64T) and adds more media extensions
  • This history illustrates the impact of the
    "golden handcuffs" of compatibility: "adding new
    features as someone might add clothing to a
    packed bag"; "an architecture that is difficult
    to explain and impossible to love"

86
IA-32 Overview
  • Complexity
  • Instructions from 1 to 17 bytes long
  • one operand must act as both a source and
    destination
  • one operand can come from memory
  • complex addressing modes e.g., base or scaled
    index with 8 or 32 bit displacement
  • Saving grace
  • the most frequently used instructions are not too
    difficult to build
  • compilers avoid the portions of the architecture
    that are slow
  • what the 80x86 lacks in style is made up in
    quantity, making it beautiful from the right
    perspective

87
To summarize
88
Summary
  • Instruction complexity is only one variable
  • lower instruction count vs. higher CPI / lower
    clock rate
  • Design Principles
  • simplicity favors regularity
  • smaller is faster
  • good design demands compromise
  • make the common case fast
  • Instruction set architecture
  • a very important abstraction indeed!