Lecture 4 Instruction Set Architecture

1
Lecture 4: Instruction Set Architecture
2
Instruction Set Architecture
  • 1950s to 1960s: Computer Architecture Course
    Computer Arithmetic
  • 1970 to mid 1980s: Computer Architecture Course
    Instruction Set Design, especially ISA
    appropriate for compilers
  • 1990s: Computer Architecture Course
    Design of CPU, memory system, I/O system, Multiprocessors

3
Languages of Computers
  • Machine Language
  • Programs consist of machine instructions
  • Directly executable without preprocessing
  • Direct manipulation of machine registers
  • Efficient in view of machine resource utilization
  • Difficult to program
  • Assembly language
  • Improved version of machine language with
    emphasis on user-friendliness
  • Symbolic machine language(symbols for operations
    and addresses)
  • Assembler is needed to translate into a machine
    language program
  • High-Level Language
  • Programs consist of statements, each of which can
    be translated into several machine language
    instructions
  • Need a compiler to translate into a machine
    language program
  • Relatively easy to program compared to ML or AL
  • Hardware resource utilization may be inefficient

4
Semantic Gap Between ML and HLL
  • As Hardware cost goes down, Software cost goes up
  • Shortage of programmers
  • Unreliable Software => Unreliable Computers
  • Response: Keep the programming cost down
  • Develop powerful, complex user-friendly HLL
  • HLL programmers are easy to train
  • Greater Semantic Gap between HLL and Machine
    Language
  • Execution inefficiency
  • Software complexity
  • Compiler complexity
  • To offset the semantic gap
  • Large instruction set
  • Variety of addressing modes
  • Hardware/Firmware implementation of HLL
    primitives

5
Instruction Set
  • Boundary between designers (architects) and
    programmers
  • For designers: Specification of the function of
    the CPU
  • For programmers: A pool of functions from which
    they choose to use in the program

One would expect that human language should
directly reflect the characteristics of human
intellectual capabilities, that language should be
a direct mirror of mind in ways which other
systems of knowledge and belief cannot. - Noam
Chomsky
  • Instruction Set
  • Language of a machine
  • Characterizes the machine's capability and
    behavior
  • Performance Issues
  • Memory Bandwidth is used 1/2 for Instructions and
    1/2 for Data
  • For efficient utilization of MB, instruction
    representation must be as compact as possible while
    still being compatible with data
  • von Neumann Bottleneck exists in MB

6
Memory Bandwidth Issue
  • Memory Bandwidth is used by CPU and I/O
  • Memory Bandwidth given to CPU is used for
    Instruction Fetches and Operand Fetches or
    Operand Stores
  • Consider an AC-machine: ADD X, or LDA X

7
Machine Language
  • Machine Language
  • Vocabulary
  • Operations
  • Addressing Modes for operand addresses and the
    next instruction address
  • Syntax
  • Methods of representing operation(OP-code),
    operands, addresses in an instruction
  • Instruction format
  • Encoding of Instruction fields
  • Grammar
  • Rules of using instructions to make a program

8
Components of an Instruction
  • Operation Code(OP-code)
  • Format specifier
  • Long / Short
  • Field definition
  • Operation
  • Types of operands
  • Operand Address(es)
  • Operand itself
  • Addresses themselves (including abbreviated)
  • Address modification specification
  • Automatic indexing
  • Relative address
  • Sequencing

9
Instruction Set and Computer Architecture
  • Computer Architectures are classified into three
    classes according to the Register Structures for
    operands storage
  • Stack Computer Architecture
  • AC Computer Architecture
  • General Purpose Register Computer Architecture

10
Stack Computer Architecture
Instruction                  Operation
PUSH X                       if F = 1, then S overflow
POP X                        if E = 1, then empty S
Unary Instr. (Shift Left)    if E = 1, then empty S
Binary Instr. (ADD)          if E = 1, then empty S
                             if SP = (n-1),
11
Characteristics of the Stack Architecture
  • Instruction length is short
  • No need to represent the address(es) of
    operand(s) in functional instructions
  • Instruction execution time is fast
  • Operand(s) access is fast because they are in the
    stack(register)
  • Operand(s) must be stored in the stack before
    operating on them
  • Inconvenient to prepare data in the stack
  • Frequent use of PUSH and POP instructions to
    prepare data in the stack - memory access

12
AC Computer Architecture
(CPA)
(ADD X)
Transfer Instruction
  • Characteristics
  • - Instruction execution time of binary
    instructions is slow
  • One of the operands must be read from memory
  • - Instruction length is longer than in the stack
    architecture
  • One operand's memory address must be
    specified in the instruction,
    although AC (a data register) can be implied
  • - Frequency of LDA/STA instructions is high
  • There is only one data register

13
GPR Computer Architecture
Unary Instruction
Binary Instruction
Transfer Instruction
  • Characteristics
  • - Instruction length is short because register
    addresses are used for operands
  • - Instruction execution time is fast because all
    the operands are in the registers
  • - Frequency of using LD/ST instructions depends on
    the number of registers
  • - Opportunities for storing the results of
    operations in GPRs are high because there are
    many registers
14
Computer Architecture?
  • . . . the attributes of a computing system as
    seen by the programmer, i.e. the conceptual
    structure and functional behavior, as distinct
    from the organization of the data flows and
    controls, the logic design, and the physical
    implementation.
  • Amdahl, Blaauw, and Brooks, 1964

SOFTWARE
15
Towards Evaluation of ISA and Organization
16
Interface Design
  • A Good Interface
  • Lasts through many implementations (portability,
    compatibility)
  • Is used in many different ways (generality)
  • Provides convenient functionality to higher
    levels
  • Permits an efficient implementation at lower
    levels

17
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
18
Evolution of Instruction Sets
  • Major advances in computer architecture are
    typically associated with landmark instruction
    set designs
  • Ex: Stack (B1700) vs. GPR (System/360)
  • Design decisions must take into account:
  • technology(component)
  • machine organization
  • programming languages
  • compiler technology
  • operating systems
  • And they in turn influence these

19
Design Space of ISA
  • Five Primary Dimensions
  • Number of explicit operands (0, 1, 2, 3)
  • Operand Storage: Where besides memory?
  • Effective Address: How is memory location
    specified?
  • Type and Size of Operands: byte, int, float,
    vector, . . .
  • How is it specified?
  • Operations: add, sub, mul, . . .
  • How is it specified?
  • Other Aspects
  • Successor: How is it specified?
  • Conditions: How are they determined?
  • Encoding: Fixed or variable? Wide?
  • Parallelism

20
Number of Explicit Operands
  • To optimize the memory bandwidth required by
    instructions(for fetching from Memory), the
    number of explicitly specified operands in the
    instruction needs to be reduced
  • 2 operands(GPR machine)
  • 2 source operands(1 of the source operands is
    destroyed after execution to store the result)
  • 1 operand(AC machine)
  • 1 of the operands is implied to a specific
    hardware register called Accumulator(AC)(result
    of the execution is also stored in this register)
  • 0 operand(Stack machine)
  • Both of the operands and the result are implied
    to a stack
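
As a rough illustration of the operand-count trade-off above, the following sketch runs A = B + C on a toy stack machine and a toy accumulator machine and counts how many memory-address fields each program has to encode. The mnemonics (PUSH, POP, ADD, LDA, STA) follow the slides; the tiny interpreters, the names `run_stack` and `run_accumulator`, and the memory contents are assumptions made for this example only.

```python
def run_stack(program, memory):
    """Run a toy 0-operand (stack) program.

    Returns the number of memory-address fields the program encodes.
    """
    stack = []
    address_fields = 0
    for op, *args in program:
        if op == "PUSH":            # PUSH X: push M[X] onto the stack
            stack.append(memory[args[0]])
            address_fields += 1
        elif op == "POP":           # POP X: pop the stack top into M[X]
            memory[args[0]] = stack.pop()
            address_fields += 1
        elif op == "ADD":           # operands and result are implied (stack top)
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction {op}")
    return address_fields


def run_accumulator(program, memory):
    """Run a toy 1-operand (AC) program; every instruction names one address."""
    ac = 0
    address_fields = 0
    for op, x in program:
        address_fields += 1
        if op == "LDA":             # AC <- M[X]
            ac = memory[x]
        elif op == "ADD":           # AC <- AC + M[X]
            ac = ac + memory[x]
        elif op == "STA":           # M[X] <- AC
            memory[x] = ac
        else:
            raise ValueError(f"unknown instruction {op}")
    return address_fields


stack_program = [("PUSH", "B"), ("PUSH", "C"), ("ADD",), ("POP", "A")]
ac_program = [("LDA", "B"), ("ADD", "C"), ("STA", "A")]

stack_memory = {"A": 0, "B": 2, "C": 3}
ac_memory = {"A": 0, "B": 2, "C": 3}

stack_fields = run_stack(stack_program, stack_memory)
ac_fields = run_accumulator(ac_program, ac_memory)
print(len(stack_program), stack_fields, len(ac_program), ac_fields)
```

In this toy case the stack program needs one more instruction, but its ADD carries no address at all, while every accumulator instruction carries exactly one.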

21
Operand Storage
  • Storage
  • Memory
  • - Long memory addressing
  • - Need to represent the address with a few bits
  • Relative addressing with displacement
  • Page/Segment addressing
  • Register
  • - General purpose register
  • Short register addressing
  • - AC
  • Stack(register)
  • - No need for addresses

22
Address Space and Storage Space
  • Address Space
  • Consists of addresses that programmers can use
  • Storage Space
  • Consists of physical storage locations
  • For a simple low cost machine, the Address Space
    and the Storage Space are identical
  • Programmers program with the actual storage
    addresses
  • Modern computers provide the storage systems with
    Independent Address and Storage Spaces
  • An Effective Address(EA) needs to be obtained
    from the Address used in the program to access
    the operand from the memory
  • Usually the Address Space is much larger than the
    Storage Space
  • Virtual Storage System

23
Effective Address
  • Address and Physical Storage Location are two
    different concepts.
  • Addresses of Operands are represented or
    implied in the instruction.
  • An operand's address needs to be mapped into an
    Effective Address of the physical
    storage location

Basic Addressing Modes (A or R in instructions)
Mode          Algorithm       Advantage          Disadvantage
Immediate     opd = A         no M refer         limited value
Direct        EA = A          simple             limited addr space
Indirect      EA = M[A]       large addr space   multiple M refer
Register      EA = R          no M refer         limited addr space
R Indirect    EA = [R]        large addr space   extra M refer
Displacement  EA = A + [R]    flexibility        complexity
Stack         opd = S[TOP]    no M refer         limited applications
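
The basic modes above can be sketched as a single operand-fetch function. This is a minimal illustration, not a model of any particular machine: the function name `fetch_operand`, the dictionary-based memory and register file, and the sample contents are all assumptions. It returns the operand together with the number of memory references the mode costs, which is the trade-off the list summarizes.

```python
def fetch_operand(mode, memory, regs, a=0, r=0):
    """Return (operand, memory_references) for one basic addressing mode.

    a is the address/immediate field A, r is the register field R.
    """
    if mode == "immediate":          # opd = A, no memory reference
        return a, 0
    if mode == "direct":             # EA = A
        return memory[a], 1
    if mode == "indirect":           # EA = M[A], so two memory references
        return memory[memory[a]], 2
    if mode == "register":           # operand is in register R
        return regs[r], 0
    if mode == "register_indirect":  # EA = contents of R
        return memory[regs[r]], 1
    if mode == "displacement":       # EA = A + contents of R
        return memory[a + regs[r]], 1
    raise ValueError(f"unknown addressing mode {mode}")


# Hypothetical memory and register contents for the demonstration.
memory = {10: 20, 20: 99, 24: 7}
regs = {1: 20, 2: 14}

for mode, fields in [
    ("immediate", {"a": 10}),
    ("direct", {"a": 10}),
    ("indirect", {"a": 10}),
    ("register", {"r": 1}),
    ("register_indirect", {"r": 1}),
    ("displacement", {"a": 10, "r": 2}),
]:
    print(mode, fetch_operand(mode, memory, regs, **fields))
```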
24
Specification of Type and Size of Operand
  • Specification of the Type of the operand
  • Usually different op-codes for different types of
    operands
  • Specification of the Size of the operand
  • op-code represents the resolution of the operand
    address
  • bit, byte, half word(upper/lower half) , word,
    ...
  • Length of operands
  • Implicit
  • Variable length
  • Specified explicitly in the instruction
  • Specified by a designated register
  • Specified by the delimiter marks in the operand
  • reserved-bit delimiter(field or word mark)
  • reserved-bit configuration(record or group mark)

26
Operation
  • Specification
  • Encoded to reduce the instruction length
  • Types
  • Minimal Instruction Set
  • Complex Instruction Set vs RISC

27
Four Types of Operations
  • Functional
  • ADD, AND, CPA, CPC, ROL, CLA, CLC, INC,
  • Transfer
  • LDA, STA(LD, ST),
  • Control
  • JMP, JNA, JZA, JZC(SMA, SZA, SZC),
  • Input/Output
  • INP, OUT,

28
Minimal Instruction Set
29
Why NOT Use a Minimal Instruction Set?
  • Inefficient
  • - Program size (M bandwidth)
  • - Large IC and CPI
  • Programming difficulty
30
Instruction Set Design: Operations to Include in
the Instruction Set
  • Trade-off: 3 E's (Elegance, Efficiency,
    Environment)
  • Elegance
  • Completeness(Even Bn instruction is complete)
  • Symmetry: AC <- f(AC, M[X]) and M[X] <-
    f(AC, M[X])
  • Flexibility, Generality
  • Efficiency
  • Space
  • Bit budget
  • Efficient specification of address
  • Fewer instructions require fewer bits to encode
    OP-code
  • Frequency of use arguments
  • Bandwidth arguments (NOP simply wastes memory
    bandwidth)
  • Ratio of overheads: non-functional to
    functional
  • Environment
  • Multiprogramming(Relocation, Protection, Sharing)
  • Code generation by compilers (compilers favor only
    a small portion of the instruction set)

31
ISA Metrics
  • Aesthetics
  • Orthogonal
  • No special registers, few special cases, all
    operand modes available with any data type or
    instruction type
  • Completeness
  • Support for a wide range of operations and target
    applications
  • Regularity
  • No overloading for the meanings of instruction
    fields
  • Streamlined
  • Resource needs easily determined
  • Ease of compilation (programming?)
  • Ease of implementation
  • Scalability

32
Powerful Instruction
Overhead for Execution(O)
(E)
  • Rich, Powerful Instruction
  • Instruction with longer Execution Time(E) to
    balance the overhead penalty(O)
  • Instruction which has a large E/O

33
Powerful Instructions
  • Extended Arithmetic Function
  • Multiply, divide, Trigonometric Functions, etc
  • Automatic Indexing
  • BCT R1, addr (R1 <- R1 - 1; if R1 != 0 then PC
    <- addr)
  • BXLE R1, R3, addr (R1 <- R1 + R3;
    if R3 is odd and R1 <= R3, PC <- addr;
    if R3 is even and R1 <= R(3+1), PC <- addr)
  • Subroutine Linkage
  • JMS X (M[X] <- PC, PC <- X + 1)
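
A minimal sketch of two of these instructions, assuming the semantics written on the slide: BCT decrements a register and branches while the result is non-zero, and JMS stores the return address at X and continues at X + 1. The function names, the dictionary register file, and the three-address toy program are hypothetical.

```python
def bct(regs, r1, pc, addr):
    """BCT R1, addr: R1 <- R1 - 1; branch to addr if R1 is non-zero."""
    regs[r1] -= 1
    return addr if regs[r1] != 0 else pc + 1


def jms(memory, pc, x):
    """JMS X: M[X] <- PC (the return address), then continue at X + 1."""
    memory[x] = pc
    return x + 1


# Toy program: address 0 is the loop body, address 1 is "BCT R1, 0",
# address 2 is the exit. One instruction replaces decrement + test + branch.
regs = {"R1": 5}
iterations = 0
pc = 0
while pc != 2:
    if pc == 0:
        iterations += 1
        pc = 1
    else:
        pc = bct(regs, "R1", pc, 0)

# Subroutine linkage: a call issued with PC = 101 to a routine at X = 200.
memory = {}
entry = jms(memory, 101, 200)
print(iterations, regs["R1"], memory[200], entry)
```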

34
Powerful Instructions
  • Process State Exchange(Context Switch)
  • Instructions required in the multiprogramming
    environments

Otherwise
    LD R1, addr
    LD R2, addr+1
    . . .
    LD R5, addr+4
XJ (Exchange Jump of CDC 6000 series)
35
Basic ISA Classes: Type of Internal Storage
36
Stack Machines
  • Instruction set
  • Arithmetic operators (+, -, *, /, . . .)
  • push A, pop A

37
The Case Against Stacks
  • Performance is derived from the existence of
    several fast registers, not from the way they are
    organized
  • Data does not always surface when needed
  • Constants, repeated operands, common
    sub-expressions
  • so TOP and Swap instructions are required
  • Code density is about equal to that of GPR
    instruction sets
  • Registers have short addresses
  • Keep things in registers and reuse them
  • Slightly simpler to write a poor compiler, but
    not an optimizing compiler

39
GPR Machines
GPR(General Purpose Register)
  • Faster than memory
  • Easier for a compiler to use
  • Used to hold variables, intermediate operands
  • the memory traffic reduces
  • the code density improves
  • How many registers?
  • depends on how they are used by the compiler

40
How Many Registers in RF
6 algorithms from CALGO (ACM) written in 4
languages: ALGOL, BASIC, BLISS, FORTRAN
We need to try to keep the live registers in the
RF
41
GPR Machines
  • Maximum number of operands(O)
  • two or three operands
  • Number of memory addresses(M)
  • 0,1,2,3

42
GPR Machines
Type: Register-register (0,3)
  • Advantages: Simple, fixed-length instr. encoding.
    Simple code generation model.
  • Disadvantages: Higher instruction count. Some
    instructions are short and bit encoding may be
    wasteful.
Type: Register-memory (1,2)
  • Advantages: Data can be accessed without loading
    first. Instruction format tends to be easy to
    encode and yields good density.
  • Disadvantages: A source operand is destroyed.
    Clocks per instruction varies by operand location.
Type: Memory-memory (3,3)
  • Advantages: Program becomes most compact. No waste
    of registers for temporaries.
  • Disadvantages: Large variation in instruction sizes
    and in work per instruction. Memory accesses
    create memory bottleneck.
43
R-R vs RM
A + B + C
RR instructions
    LD  R1, A
    LD  R2, B
    LD  R3, C
    ADD R4, R1, R2
    ADD R5, R4, R3
RM instructions
    LD  R1, A
    ADD R1, B
    ADD R1, C
RM instructions reduce IC
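
To make the comparison concrete, the sketch below executes both sequences for A + B + C on a toy interpreter and reports the instruction count and the number of data memory references. The interpreter `run_gpr`, the tuple encoding of instructions, and the values stored in A, B and C are assumptions for illustration.

```python
def run_gpr(program, memory):
    """Execute LD/ADD instructions; return (registers, instruction count, data memory refs)."""
    regs = {}
    memory_refs = 0
    for op, dst, *sources in program:
        values = []
        for src in sources:
            if src in memory:            # memory operand: costs one data reference
                values.append(memory[src])
                memory_refs += 1
            else:                        # register operand
                values.append(regs[src])
        if op == "LD":
            regs[dst] = values[0]
        elif op == "ADD":
            if len(values) == 1:         # two-operand form: dst <- dst + src
                values.insert(0, regs[dst])
            regs[dst] = values[0] + values[1]
        else:
            raise ValueError(f"unknown instruction {op}")
    return regs, len(program), memory_refs


memory = {"A": 1, "B": 2, "C": 3}

rr_program = [
    ("LD", "R1", "A"), ("LD", "R2", "B"), ("LD", "R3", "C"),
    ("ADD", "R4", "R1", "R2"), ("ADD", "R5", "R4", "R3"),
]
rm_program = [("LD", "R1", "A"), ("ADD", "R1", "B"), ("ADD", "R1", "C")]

rr_regs, rr_ic, rr_refs = run_gpr(rr_program, memory)
rm_regs, rm_ic, rm_refs = run_gpr(rm_program, memory)
print(rr_ic, rr_refs, rm_ic, rm_refs)
```

Both sequences touch memory three times for data; the RM version reaches the same sum with fewer instructions, which is the IC reduction the slide points to.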
44
What About Actual Programs
  • Consider a GPR machine with a large register
    file.
  • - It is highly probable that the intermediate data
    can be found in a register
  • - Thus, LD/ST instructions will be used less
    frequently
  • - However, the frequency of LD/ST instructions
    in computers that use RM instructions will
    be reduced further

45
VAX-11
Variable format, 2- and 3-address instructions
  • 32-bit word size, 16 GPR (4 reserved)
  • Rich set of addressing modes (apply to any
    operand)
  • Rich set of operations
  • bit field, stack, call, case, loop, string,
    poly, system
  • Rich set of data types (B, W, L, Q, O, F, D,
    G, H)
  • Condition codes

46
Kinds of Addressing Modes
Addressing Mode: the value shown is the operand
  • Register direct: Ri
  • Immediate (literal): v
  • Direct (absolute): M[v]
  • Register indirect: M[Ri]
  • Base+Displacement: M[Ri + v]
  • Base+Index: M[Ri + Rj]
  • Scaled Index: M[Ri + Rj*d + v], e.g. d = 8
  • Autoincrement: M[Ri], then Ri <- Ri + 1
  • Autodecrement: Ri <- Ri - 1, then M[Ri]
  • Memory Indirect: M[M[Ri]]
  • Indirection Chains
47
Memory Addressing Modes (VAX)
48
Operand Address bits: Displacement Values
  • This value is related to the operand address
    field when the address is represented by the
    displacement from the base address
  • Wide distribution
  • The vast majority --- positive
  • A majority of the large displacements --- negative

49
Operand Address bits: Immediate Addressing Mode
Percentage of operations that use immediates
50
Operand Address bits: Immediate Addressing Mode
51
Operations in the Instr. Set
Operator type            Examples
Arithmetic and logical   Add, Subtract
Data transfer            Load, Store, Move
Control                  Branch, Jump, Procedure Call, Return, Trap
System                   Operating System Call, VMM instructions
Floating Point           Floating Point Add
Decimal                  Decimal Add, Decimal-to-Character Conversion
String                   String Move, String Compare, String Search
Graphics                 Pixel operations, Compress/Decompress op.
52
Operations in the Instr. Set
Rank    80x86 instruction     Integer average (% total executed)
1       load                  22
2       conditional branch    20
3       compare               16
4       store                 12
5       add                    8
6       and                    6
7       sub                    5
8       move reg-reg           4
9       call                   1
10      return                 1
Total                         96
53
Control Flow Instructions
55
RISC
56
Instruction Execution Characteristics: Type of
Operations
Relative Dynamic Frequencies of statements in HLL
programs
  • What type of statements is most frequent?
  • Assignment statements dominate
  • Functional instructions and Transfer
    instructions
  • Movements of data must be made simple, thus fast
  • Conditional Statements(if and loop together)
  • Instructions with Control function
  • Sequence control mechanism is important

57
Instruction Execution Characteristics: Time
Consumed by Statements
  • Machine instruction weighted = Average no.
    of machine instructions per statement x
    frequency of occurrence
  • Memory reference weighted = Average no. of
    memory references per statement x frequency of
    occurrence
  • Most time-consuming statement is procedure
    CALL/RETURN
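
The two weightings can be worked through with a small calculation. The figures below are hypothetical and chosen only to show the mechanics; they are not the measurements behind the slide. Each weight is the per-statement cost multiplied by the frequency of occurrence, and the sketch additionally normalizes the products so that they sum to 1 (the normalization is a choice made here for readability).

```python
# Hypothetical data: statement -> (frequency of occurrence,
#                                  machine instructions per statement,
#                                  memory references per statement)
statements = {
    "ASSIGN": (0.45, 3, 2),
    "IF":     (0.30, 4, 2),
    "LOOP":   (0.05, 10, 6),
    "CALL":   (0.20, 30, 20),
}


def weighted_share(table, column):
    """Multiply frequency by the chosen cost column and normalize to shares."""
    products = {name: row[0] * row[column] for name, row in table.items()}
    total = sum(products.values())
    return {name: value / total for name, value in products.items()}


instruction_weighted = weighted_share(statements, 1)
memory_weighted = weighted_share(statements, 2)

for name in statements:
    print(f"{name:7s} {instruction_weighted[name]:.2f} {memory_weighted[name]:.2f}")
```

With these made-up numbers, CALL is far from the most frequent statement but dominates both weighted columns, which is the pattern the slide describes.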
58
Instruction Execution Characteristics: Type of
Operands
  • Majority of references are to scalars
  • 80% are local to a procedure
  • References to arrays/structure require index or
    pointer
  • Locations of operands(Average per instruction)
  • 0.5 operands in memory
  • 1.4 operands in registers

59
Instruction Execution Characteristics: Procedure
Calls
  • Two most significant aspects in implementing
    procedure Call/Returns
  • Number of parameters
  • Depth of nesting
  • Statistics on Number of Parameters
  • 98% of dynamically called procedures were passed
    fewer than 6 parameters
  • 92% of them used fewer than 6 local scalar
    variables

60
Multiple Register Sets
  • Multiple register sets
  • - Assume that we have several sets of registers,
    so that each set can be used by a different
    procedure
  • - Saves some time in procedure CALL/RETURN simply
    by changing the R set pointer value
61
Instruction Execution Characteristics: Depth of
Procedure Nesting
Procedure Nesting and Register Set Window
(Figure: procedure nesting depth plotted against time t)
  • Shifting the register set window: need to save the
    information in one register set in the memory so
    that a register set can be used by the new
    procedure
  • Statistics: a window depth of 8 will need to shift
    on less than 1% of calls and returns
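
A rough way to see why a small number of windows is enough: count how often a call or return actually forces a register set to be saved to or restored from memory. The sketch below is a simplified model under stated assumptions (one register set per active procedure, the oldest resident set is spilled on overflow); the function name `window_traffic` and both call traces are hypothetical.

```python
def window_traffic(trace, n_windows):
    """Count window overflows (saves) and underflows (restores) for a call trace.

    trace is a sequence of "call" / "return" events; depths low..depth are the
    procedure activations whose register sets are currently in the register file.
    """
    depth = 0
    low = 0
    overflows = 0
    underflows = 0
    for event in trace:
        if event == "call":
            depth += 1
            if depth - low + 1 > n_windows:   # no free set: spill the oldest one
                low += 1
                overflows += 1
        elif event == "return":
            depth -= 1
            if depth < low:                   # caller's set was spilled: restore it
                low = depth
                underflows += 1
        else:
            raise ValueError(f"unknown event {event}")
    return overflows, underflows


# Depth fluctuating in a narrow range: no memory traffic at all.
shallow = ["call", "return"] * 100
# A deep excursion: 10 nested calls followed by 10 returns.
deep = ["call"] * 10 + ["return"] * 10

print(window_traffic(shallow, 8), window_traffic(deep, 8))
```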
62
RISC Philosophy (1): Make the Most Frequent
Statements Execute Fast
Most frequent statements are Assignment Type
Statements, and each of them is translated by the
compiler into a set of Functional Instructions
and/or Transfer Instructions. Thus Functional and
Transfer Instructions need to be made to execute
fast.
Instruction Cycle of Functional Instruction or
Transfer Instruction
63
Assignment Statements
  • To make the Instruction Fetch fast
  • Short OP-code part: Small number of instructions
    in the instruction set
  • Short Operand Address part: Keep the operands in
    the registers instead of M
  • To make the Instruction Preparation fast
  • Fixed length instruction
  • Fixed format instruction
  • Simple addressing modes
  • To make the Operand Fetch fast
  • Make the operands available from registers
    instead of memory
  • Needs a large register file
  • To make the Instruction Execution fast
  • Multiple register set Overlapping MRS
  • Instruction execution pipeline

64
RISC Philosophy (2): Make the Most Time-Consuming
Statements Execute Fast
  • Methods of passing Parameters
  • Through memory
  • Parameters are stored in the memory locations
    which are commonly accessible by both calling
    and called procedures
  • Execution of CALL and RETURN instructions is
    very slow due to the memory accesses, especially
    when there are many parameters to pass
  • Through registers
  • Parameters are stored in the registers in CPU
  • Calling procedure needs to save the registers,
    which are not used for passing parameters, in
    the memory. This results in a lot of memory
    accesses and makes the execution times of these
    instructions slow.

65
CISC and RISC
  • RISC
  • A limited and simple instruction set
  • A large number of GPR(Register File)
  • An emphasis on optimizing the instruction
    pipeline

66
Large Register File
  • Quick access to operands is desirable
  • - Assignment Statements rely on Functional and
    Transfer Instructions
  • - Functional Instructions heavily rely on registers
  • - Frequency of Transfer Instructions depends on the
    number of registers in the register file
  • If the number of registers is small, a strategy is
    needed to keep the most frequently accessed
    operands in registers to minimize Register-Memory
    traffic
  • - Software approach: Maximize register usage by
    compiler (requires sophisticated program analysis)
  • - Hardware approach: More registers in the
    register file
67
Register Window
  • Fact
  • Statistically, most operand references are to
    local scalars (80%)
  • Local variables to a procedure cannot be accessed
    by other procedure(s)
  • Problem
  • Local changes with each procedure CALL/RETURN
  • CALL/RETURN occurs frequently
  • Parameters need to be passed around
  • Observations
  • Statistically, a few parameters (<6) and local
    variables (<6)
  • Statistically, depth of procedure activation
    fluctuates within a relatively narrow range (<8)
  • Solution
  • Multiple small sets of registers
  • Each set is assigned to a different procedure
  • Windows for adjacent procedures overlap to allow
    parameter passing

68
Multiple Register Set
  • Each Register Set is assigned to a different
    procedure
  • - Size of a Register Set is equal to the size of
    a window
  • - Parameters need to be copied into the
    called/calling procedure's Register Set; however,
    there is no need to copy all the registers from
    the switched-off register set
  • - Requires register move instructions
69
Overlapping Register Window
When multiple Register Sets are implemented in
a large Register File, we call a Register Set
a Register Window. Multiple register sets still
require copying the parameter values between
register sets.
  • Overlapping Register Window
  • - Portions of register windows overlap for passing
    parameters
  • - At any time only one window is visible
  • - No need for moving information for
    parameter passing
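
One way to picture the overlap is as an address calculation from (window, register number) to a physical register. The sketch below assumes a hypothetical layout in which every window holds incoming-parameter, local, and outgoing-parameter registers, and the outgoing registers of one window are physically the incoming registers of the next; the sizes (8 windows, 6 parameter registers, 10 locals) and the name `physical_index` are illustrative choices, not values taken from the lecture.

```python
def physical_index(window, reg, n_windows=8, n_param=6, n_local=10):
    """Map a window-relative register number to a physical register number.

    Window layout (assumed): [0, n_param) incoming parameters,
    then n_local locals, then n_param outgoing parameters.
    """
    window_size = 2 * n_param + n_local
    if not 0 <= reg < window_size:
        raise ValueError("register number outside the window")
    stride = n_param + n_local              # windows advance by less than their size
    return (window * stride + reg) % (n_windows * stride)


# The first outgoing register of window 0 is the first incoming register
# of window 1, so a parameter written by the caller needs no copy.
caller_out = physical_index(0, 16)
callee_in = physical_index(1, 0)
print(caller_out, callee_in)
```

Because the stride between windows is smaller than the window size, a CALL only has to advance the window pointer for the callee to see its parameters.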
How about global variables?
70
Global Variables
  • Global Variables are commonly accessible by all
    the procedures
  • Assign to memory locations by compiler
  • Straight forward but inefficient for the
    frequently accessed global variables because of
    frequent memory accesses
  • Set aside a set of Global Variable registers
  • Available to all procedures
  • Unified register numbering system to simplify
    instruction format
  • e.g. R0 - R7: Global
    R8 - R13: Current window

71
Linear Organization of Register Windows
75
Code Size
  • Smaller programs
  • Program takes less memory space
  • Smaller program improves performance
  • Fewer instructions
  • Fewer bytes to fetch
  • In paging environment, occupies fewer pages and
    reduces page faults
  • CISC
  • Smaller number of instructions in the
    program(program may be shorter but not
    necessarily smaller space)

76
Example
CISC
Memory Traffic: Instruction 56 bits, Data 32 x 3 = 96
bits, Total MB used 56 + 96 = 152 bits
RISC
    LD  Rb, B
    LD  Rc, C
    ADD Ra, Rb, Rc
    ST  Ra, A
Memory Traffic: Instruction 104 bits, Data 96 bits,
Total MB used 104 + 96 = 200 bits
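
The traffic figures can be rebuilt from field widths. The sketch below assumes an 8-bit opcode, 16-bit memory addresses, 4-bit register numbers and 32-bit data words; these widths are not stated on the slide, but they reproduce its 56-bit memory-to-memory instruction. Under the same assumptions the load/store sequence comes to 104 instruction bits and 200 bits in total.

```python
# Assumed field widths (not given on the slide).
OPCODE_BITS = 8
ADDRESS_BITS = 16
REGISTER_BITS = 4
WORD_BITS = 32


def memory_traffic(program):
    """Return (instruction bits, data bits, total bits) for a program.

    Each instruction is (memory address fields, register fields, data words moved).
    """
    instruction_bits = 0
    data_bits = 0
    for address_fields, register_fields, data_words in program:
        instruction_bits += (OPCODE_BITS
                             + address_fields * ADDRESS_BITS
                             + register_fields * REGISTER_BITS)
        data_bits += data_words * WORD_BITS
    return instruction_bits, data_bits, instruction_bits + data_bits


# One memory-to-memory add with three memory operands.
memory_to_memory = [(3, 0, 3)]
# LD Rb,B / LD Rc,C / ADD Ra,Rb,Rc / ST Ra,A
load_store = [(1, 1, 1), (1, 1, 1), (0, 3, 0), (1, 1, 1)]

print(memory_traffic(memory_to_memory), memory_traffic(load_store))
```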
77
Characteristic of RISC(1, 2)
  • (1) 1 Instruction per cycle (memory cycle)
  • Machine cycle = IF + IP + time to fetch the
    operands from registers + perform the operation
    + store the result in a register
  • RISC instruction <=> CISC micro-instruction
    => No need to microprogram (hardwired control)
  • (2) Register-to-Register operation
  • With only simple Load and Store operations for
    accessing memory(Load/Store Arch.)
  • Simplifies the instruction set, and control unit

78
Characteristic of RISC(3, 4)
  • (3) Simple Addressing Modes - Shorten EA
    generation time
  • Almost all instructions use register addressing
  • Relative addressing using PC, BAR, and Index
    address
  • Other complex modes may be synthesized by software
  • (4) Simple Instruction Format - Shorten
    instruction Decoding Time
  • Usually one format
  • Fixed length/align on word boundary
  • Fixed field length

79
Characteristic of RISC(5)
  • (5) Pipelining (We will learn this later in
    detail)
  • At this time, you just need to know that
  • - Instruction execution hardware can be made of
    a few inter-connected independent
    sub-modules, called pipeline STAGEs
  • - An instruction's execution progresses through
    each pipeline stage in sequence
  • - When an instruction completes its execution at
    the i-th stage, the next instruction commences
    its execution at the i-th stage
  • - Thus, in the ideal situation, throughput
    increases nearly n times, where n is the number
    of pipeline stages
  • - Branch instructions make the pipelined
    execution inefficient
80
Pipelined Execution
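(Figure: instructions I0 to I4 advancing through the pipeline stages up to S3, one stage per time step t)
  • 1 instruction execution: I0 completes at 4t
  • Execution of a sequence of instructions: I1
    completes at 5t, I2 at 6t, I3 at 7t, I4 at 8t
  • n instructions complete at (n+3)t
  • When n is large it becomes about nt
  • Thus, 1 instruction in every t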
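
The timing on this slide follows from a one-line formula for an ideal pipeline with no stalls: with k stages, the n-th instruction completes at (k + n - 1)t. A small sketch, with the function names and the default of 4 stages chosen to match the figure:

```python
def completion_time(n_instructions, n_stages=4):
    """Time, in units of t, at which the last of n instructions completes
    in an ideal (stall-free) pipeline."""
    return n_stages + n_instructions - 1


def speedup(n_instructions, n_stages=4):
    """Speedup over executing each instruction unpipelined in n_stages * t."""
    unpipelined = n_instructions * n_stages
    return unpipelined / completion_time(n_instructions, n_stages)


for n in (1, 5, 1000):
    print(n, completion_time(n), round(speedup(n), 3))
```

For 5 instructions this gives 8t, matching I4 on the slide, and for large n the speedup approaches the number of stages.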
81
A "Typical" RISC
  • 32-bit fixed format instruction (3 formats)
  • 32 32-bit GPR (R0 contains zero, DP takes a pair)
  • 3-address, R-R functional instruction
  • Single address mode for load/store: base +
    displacement
  • no indirection
  • Simple branch conditions
  • Delayed branch

see SPARC, MIPS, MC88100, AMD 29000, i960, i860,
PA-RISC, DEC Alpha, Clipper, CDC
6600, CDC 7600, Cray-1, Cray-2, Cray-3
82
Branch Displacement
83
Implementation of Conditional Branch Instructions
Evaluating branch conditions
  • Condition Code (CC)
  • - How condition is tested: Test special bits set
    by ALU operations, possibly under program control.
  • - Advantages: Sometimes condition is set for free;
    if not, 2 instrs for a branch.
  • - Disadvantages: CC is an extra state. CCs
    constrain the ordering of instrs since they pass
    info from one instr to a branch.
  • Condition Register
  • - How condition is tested: Test arbitrary registers
    set by the result of a comparison.
  • - Advantages: Simple. 2 instrs for a branch.
  • - Disadvantages: Uses up a register.
  • Compare and Branch
  • - How condition is tested: Compare is part of the
    branch. Often compare is limited to subset.
  • - Advantages: 1 instr. rather than 2 for a branch.
  • - Disadvantages: May be too much work per
    instruction.
84
Putting It All Together: DLX Architecture
  • Read Section 2.8 ---- MUST
  • DLX emphasizes
  • A simple load-store instruction set
  • Design for pipelining efficiency
  • A fixed instr. set encoding
  • Efficiency as a compiler target

85
Example MIPS
86
The Different Goals for VAX and MIPS
  • VAX - simple compilers and code density
  • powerful addressing modes
  • powerful instructions
  • efficient instruction encoding
  • few registers
  • MIPS - high performance via pipelining, ease of
    HW implementation, compatibility with highly
    optimizing compiler
  • simple instruction
  • simple addressing modes
  • fixed-length instruction formats
  • a large number of registers

87
VAX vs. MIPS
88
Fallacies and Pitfalls
  • Pitfall: Designing a high-level instruction set
    feature specifically oriented to
    supporting a high-level language structure.
  • Fallacy: There is such a thing as a typical
    program.
  • Fallacy: An architecture with flaws cannot be
    successful.
  • 80x86 supports segmentation while others support
    paging
  • Extended AC for integer, while others use
    GPR
  • Stack for FP operations, while others
    abandoned stack
  • Fallacy: You can design a flawless architecture.
  • All architecture design involves trade-offs made
    in the context of a set of HW and SW technologies.

89
Most Popular ISA of All Time: Intel 80x86
  • 1971: Intel invents microprocessor 4004/8008,
    8080 in 1974
  • 1975: Gordon Moore realized one more chance for
    new ISA before ISA locked in for decades
  • Hired CS people in Oregon
  • Weren't ready in 1977 (CS people did 432 in 1980)
  • Started crash effort for 16-bit microcomputer
  • 1978: 8086: dedicated registers, segmented
    address, 16-bit
  • 8088: 8-bit external bus version of 8086, added
    as afterthought

90
Most Popular ISA of All Time: Intel 80x86
  • 1980: IBM selects 8088 as basis for IBM PC
  • 1980: 8087 floating point coprocessor adds
    60 instructions using hybrid stack/register
    scheme
  • 1982: 80286: 24-bit address, protection, memory
    mapping
  • 1985: 80386: 32-bit address, 32-bit GP registers,
    paging
  • 1989: 80486; Pentium in 1992; faster, MP, few
    instructions

91
80x86 Addressing/Protection
92
80x86 Instruction Format
  • 8086 in blue; 80386 extensions in red

93
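80x86 Instruction Encoding: Address Specifier
Field Mod, Reg, R/M

reg   w=0   w=1 16b   w=1 32b
0     AL    AX        EAX
1     CL    CX        ECX
2     DL    DX        EDX
3     BL    BX        EBX
4     AH    SP        ESP
5     CH    BP        EBP
6     DH    SI        ESI
7     BH    DI        EDI

r/m   mod=0 16b    mod=0 32b   mod=1 16b     mod=1 32b     mod=2 16b      mod=2 32b      mod=3
0     addr=BX+SI   =EAX        mod=0 + d8    mod=0 + d8    mod=0 + d16    mod=0 + d32    same as reg field
1     addr=BX+DI   =ECX        mod=0 + d8    mod=0 + d8    mod=0 + d16    mod=0 + d32    same as reg field
2     addr=BP+SI   =EDX        mod=0 + d8    mod=0 + d8    mod=0 + d16    mod=0 + d32    same as reg field
3     addr=BP+DI   =EBX        mod=0 + d8    mod=0 + d8    mod=0 + d16    mod=0 + d32    same as reg field
4     addr=SI      =(sib)      SI+d8         (sib)+d8      SI+d16         (sib)+d32      same as reg field
5     addr=DI      =d32        DI+d8         EBP+d8        DI+d16         EBP+d32        same as reg field
6     addr=d16     =ESI        BP+d8         ESI+d8        BP+d16         ESI+d32        same as reg field
7     addr=BX      =EDI        BX+d8         EDI+d8        BX+d16         EDI+d32        same as reg field

For r/m = 0 to 3 with mod = 1 or 2, the address is the same as for mod = 0 plus the displacement shown.
Address Specifier: Reg = 3 bits, R/M = 3 bits, Mod = 2
bits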
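
Splitting the address specifier byte into its three fields is a matter of shifts and masks. The sketch below uses the standard x86 layout (Mod in bits 7-6, Reg in bits 5-3, R/M in bits 2-0); the function names and the sample byte values are chosen for illustration.

```python
def decode_modrm(byte):
    """Split an address specifier byte into (mod, reg, rm)."""
    mod = (byte >> 6) & 0b11      # 2 bits
    reg = (byte >> 3) & 0b111     # 3 bits
    rm = byte & 0b111             # 3 bits
    return mod, reg, rm


def needs_sib(mod, rm):
    """In 32-bit addressing, r/m = 4 with mod = 0, 1 or 2 means a scale/index/base byte follows."""
    return mod != 3 and rm == 4


for value in (0xD8, 0x44):
    mod, reg, rm = decode_modrm(value)
    print(hex(value), mod, reg, rm, needs_sib(mod, rm))
```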
94
80x86 Instruction Encoding: Sc/Index/Base field
Base + Scaled Index Mode: used when mod =
0, 1, 2 in 32-bit mode and r/m = 4
2-bit Scale Field, 3-bit Index Field, 3-bit Base Field

Value   Index      Base
0       EAX        EAX
1       ECX        ECX
2       EDX        EDX
3       EBX        EBX
4       no index   ESP
5       EBP        if mod=0, d32; if mod<>0, EBP
6       ESI        ESI
7       EDI        EDI
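
The scale/index/base byte can be decoded the same way. In this sketch the 2-bit scale field selects a multiplier of 1, 2, 4 or 8, index value 4 means no index, and base value 5 with mod = 0 means a 32-bit displacement replaces the base register, as in the table above. The function name `decode_sib` and the sample bytes are illustrative.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def decode_sib(byte, mod):
    """Return (scale factor, index register or None, base register or "d32")."""
    scale = 1 << ((byte >> 6) & 0b11)          # 2-bit scale field -> 1, 2, 4, 8
    index = (byte >> 3) & 0b111                # 3-bit index field
    base = byte & 0b111                        # 3-bit base field
    index_name = None if index == 4 else REG32[index]
    base_name = "d32" if (base == 5 and mod == 0) else REG32[base]
    return scale, index_name, base_name


print(decode_sib(0x88, 1))   # scale 4, index ECX, base EAX
print(decode_sib(0x25, 0))   # no index, base replaced by d32
```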

95
80x86 Addressing Mode Usage for 32-bit Mode
Addressing mode                    Usage in four programs (%)   Avg. (%)
Register indirect                  10    10     6     2          7
Base + 8-bit disp                  46    43    32     4         31
Base + 32-bit disp                  2     0    24    10          9
Indexed                             1     0     1     0          1
Based indexed + 8b disp             0     0     4     0          1
Based indexed + 32b disp            0     0     0     0          0
Base + Scaled Indexed              12    31     9     0         13
Base + Scaled Index + 8b disp       2     1     2     0          1
Base + Scaled Index + 32b disp      6     2     2    33         11
32-bit Direct                      19    12    20    51         26
96
80x86 Length Distribution
97
Instruction Counts: 80x86 vs. DLX
Program     80x86             DLX               DLX / 80x86
gcc         3,771,327,742     3,892,063,460     1.03
espresso    2,216,423,413     2,801,294,286     1.26
spice       15,257,026,309    16,965,928,788    1.11
nasa7       15,603,040,963    6,118,740,321     0.39
98
Intel Compiler vs. Compilers YOU Can Buy
66 MHz Pentium Comparison                 SpecInt92   SpecFP92
Intel Internal Optimizing Compiler        64.6        59.7
Best 486 Compiler (June 1993)             57.6        39.9
Typical 486 Compiler in 1990,
  when Intel started project              41.0        32.5
  • Integer: Intel 1.1X faster, FP 1.5X faster

486 Comparison                            SpecInt92   SpecFP92
Intel Internal Optimizing Compiler        35.5        17.5
Best 486 Compiler (June 1993)             32.2        16.0
Typical 486 Compiler in 1990,
  when Intel started project              23.0        12.8
  • Integer: Intel 1.1X faster, FP 1.1X faster

99
Intel Summary
  • Archeology: history of instruction design in a
    single product
  • Address size: 16-bit vs. 32-bit
  • Protection: Segmentation vs. paged
  • Temp. storage: accumulator vs. stack vs.
    registers
  • "Golden Handcuffs" of binary compatibility affect
    design 20 years later, as Moore predicted
  • Not too difficult to make faster, as Intel has
    shown
  • HP/Intel announcement of common future
    instruction set by 2000 means end of 80x86???
  • Beauty is in the eye of the beholder
  • At 50M/year sold, it is a beautiful business