Structure of Computer Systems - PowerPoint PPT Presentation

Provided by: sebes4

1
Structure of Computer Systems
  • Course 4
  • The Central Processing Unit - CPU

2
CPU - Central Processing Unit
  • Classic (idyllic) view
  • Incorporates 2 of the 5 components of von
    Neumann's classical model
  • ALU - Arithmetic and Logic Unit
  • CU - Control Unit
  • It is the brain (intelligent part) of a computer
  • Fetches (reads) an instruction, decodes/interprets
    it, reads the data, executes the instruction and
    stores the result
  • Does its job in a synchronized and sequential way:
    one thing at a time

3
CPU - Central Processing Unit
  • Today's view
  • Contains all kinds of computer components
  • Multiple CPUs
  • symmetric, asymmetric,
  • multiple cores,
  • multiple ALUs, specialized ALUs (e.g. floating
    point, multimedia MMX, SSE2)
  • Memory: multiple levels of cache memory (L0, L1,
    L2, trace cache)
  • Interfaces and Peripheral devices (in case of
    microcontrollers and DSPs)
  • Serial channels
  • Parallel interfaces,
  • Timers, counters
  • Converters (ADC, DAC)
  • Network interfaces
  • Interrupt system
  • Bus controller(s) and arbiter(s)
  • Memory management units
  • Executes instructions in parallel and in a
    speculative order
  • Intelligence may be distributed in memories and
    interfaces as well
  • Where is that nice idyllic image?

4
Starting with the beginning
  • A simple computer
  • Attributes: sequential, one (accumulator)
    register, one memory for instructions and data

Legend: CG - clock generator; PhG - phase
generator; PC - program counter; IR - instruction
register; Acc - accumulator
5
A simple computer
  • How does it work?
  • 4 phases
  • IF - Instruction fetch: read the instruction
    into IR
  • Dec - Decode the instruction: generate control
    signals
  • PreEx - Prepare execution: e.g. read the data
    from memory
  • Exe - Execute: e.g. addition, subtraction

6
A simple computer
  • Example 1: ADD Acc, M[100h]
  • IF: Sel=0 => Address=PC; IR_ld=impulse =>
    IR=ADD 100h
  • Dec: Sel=1 => Address=IR_adr=100h; Inc=1 =>
    increment PC
  • PreEx: Op_sel=code_add => the ALU performs an
    addition
  • Exe: Acc_ld => Acc=Acc+M[100h]

7
A simple computer
  • Example 2: JMP 200h
  • IF: Sel=0 => Address=PC; IR_ld=impulse =>
    IR=JMP 200h
  • Dec: Inc=1 => increment PC
  • PreEx: PC_ld=1 => PC=IR_addr=200h
  • Exe: no operation
  • Example 3: SHR Acc
  • IF and Dec: the same
  • PreEx: no operation
  • Exe: Acc_shr=1 => shift the accumulator one
    position to the right
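
The three examples can be traced with a small simulator. This is only a sketch of the four-phase sequence, not the actual hardware: the dictionary-based memory, the 16-bit accumulator width (WORD_MASK) and the function name run_instruction are assumptions made for illustration.

```python
WORD_MASK = 0xFFFF  # assumed 16-bit accumulator; the slides do not state the width


def run_instruction(state, memory):
    """Run one instruction through the four phases IF, Dec, PreEx, Exe."""
    # IF: Sel=0 puts the PC on the address bus; IR_ld loads the word into IR.
    state["IR"] = memory[state["PC"]]
    opcode, address = state["IR"]

    # Dec: Inc=1 increments the PC; for ADD, Sel=1 selects the address held in IR.
    state["PC"] += 1
    operand = memory[address] if opcode == "ADD" else None

    # PreEx: a jump loads the PC from the address field of IR (PC_ld=1).
    if opcode == "JMP":
        state["PC"] = address

    # Exe: the accumulator is loaded with the ALU result or shifted.
    if opcode == "ADD":
        state["Acc"] = (state["Acc"] + operand) & WORD_MASK
    elif opcode == "SHR":
        state["Acc"] >>= 1
    return state


# Hypothetical program: ADD Acc, M[100h]; SHR Acc; JMP 200h
memory = {0x000: ("ADD", 0x100), 0x001: ("SHR", None),
          0x002: ("JMP", 0x200), 0x100: 6}
state = {"PC": 0x000, "IR": None, "Acc": 4}
for _ in range(3):
    run_instruction(state, memory)
print(state["Acc"], hex(state["PC"]))  # prints: 5 0x200
```

Every instruction passes through all four phases even when a phase has nothing to do, which is exactly the inefficiency the Issues slide points out.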

8
A simple computer
  • Homework: try to implement
  • MOV M[addr], Acc
  • MOV Acc, M[addr]
  • Conditional jump (e.g. if Acc=0, >0, <0)
  • MOV Acc, 0

9
A simple computer
  • Issues
  • Every instruction is executed in a fixed number
    (4) of steps
  • Too many for simple instructions
  • Too few for complex instructions (e.g. multiply)
  • Only one internal register: hard to operate with
    data
  • No input and output devices
  • Limited number of possible operations: small
    instruction set
  • Possible improvements
  • Variable number of phases -> the phase generator
    should depend on the instruction code
  • Multiple internal registers -> 2 buses: input
    data and output data
  • Front panel with 7-segment LEDs and switches
  • Increase the number of instructions -> more
    complex Decoder and Command and Control Unit

10
A more sophisticated computer, but still simple:
the MIPS architecture
  • Attributes
  • Sequential
  • 32 internal registers of 32 bits
  • Instructions: fixed length, variable content
  • Harvard memory architecture: separate instruction
    and data memory
  • An instruction is executed in 5 phases
  • IF - instruction fetch
  • ID - decode the instruction and prepare (read)
    the data
  • Ex - execute the instruction
  • M - operation with the memory
  • Wb - write back: store the result
  • Instruction types
  • R - Register, e.g. ADD RD, RS, RT
  • I - Immediate, e.g. ADDI RT, RS, constant
    LW RT, offset(RS)
  • J - Jump, e.g. J target

11
MIPS architecture
  • Instruction formats
  • Fixed length (4 bytes) but multiple formats
  • R - register type instructions
  • <instr> rd, rs, rt
  • rd - destination register
  • rs - source register
  • rt - target register (the second source operand)
  • Ex.: add s1, s2, s3 (s1 = s2 + s3)

12
MIPS architecture Instruction formats
  • I - immediate type instructions, with an immediate
    value (constant)
  • <instr> rt, rs, IMM
  • rs - source register
  • rt - target register
  • Ex.: addi s1, s2, 55 (s1 = s2 + 55)
  • J - jump type instructions
  • <instr> LABEL
  • Ex.: j et1 (jump to label et1)
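
The three formats can be made concrete by packing the example instructions into 4-byte words. The slides do not give the bit layout, so the sketch below assumes the standard MIPS32 encoding (6-bit opcode, 5-bit register fields, 16-bit immediate, 26-bit jump target) and the usual register numbers for s1, s2, s3 (17, 18, 19). The byte address chosen for the label et1 is purely hypothetical.

```python
def encode_r(rs, rt, rd, funct, shamt=0, opcode=0):
    """R format: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(opcode, rs, rt, imm):
    """I format: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(opcode, byte_address):
    """J format: opcode(6) target(26); the target is a word address."""
    return (opcode << 26) | ((byte_address >> 2) & 0x03FFFFFF)


S1, S2, S3 = 17, 18, 19        # standard MIPS numbers of $s1, $s2, $s3
ET1 = 0x00400020               # hypothetical address of the label et1

add_word = encode_r(rs=S2, rt=S3, rd=S1, funct=0x20)   # add s1, s2, s3
addi_word = encode_i(0x08, rs=S2, rt=S1, imm=55)       # addi s1, s2, 55
j_word = encode_j(0x02, ET1)                           # j et1

for word in (add_word, addi_word, j_word):
    print(f"{word:08x}")
# prints 02538820, 22510037 and 08100008
```

All three words are 32 bits long; only the way the bits after the opcode are split differs, which is what "fixed length, variable content" means.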

13
MIPS architecture
  • Address generation and instruction fetch

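[Figure: address generation and instruction fetch. Blocks: PC, Program Memory, IR (Op_code, op_address), adder (Add), two multiplexers (MUX). Control signals: PC_MUX_Sel1, PC_MUX_Sel2, PC_ld, IR_ld. Adder inputs: the constants 4 and 0, the jump address.]

Increment the PC: PC = PC + 4
Absolute jump: PC = Jump_Address
Relative jump: PC = PC + Jump_Address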
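
The three ways of updating the program counter can be written as one function. This is a sketch under assumptions: the mode names stand in for the multiplexer selections PC_MUX_Sel1 and PC_MUX_Sel2, whose exact encoding is not given, and the PC is assumed to be 32 bits wide.

```python
PC_MASK = 0xFFFFFFFF  # assumed 32-bit program counter


def next_pc(pc, mode="increment", jump_address=0):
    """Return the next PC value for the three cases of the fetch unit."""
    if mode == "increment":          # PC = PC + 4
        return (pc + 4) & PC_MASK
    if mode == "absolute":           # PC = Jump_Address
        return jump_address & PC_MASK
    if mode == "relative":           # PC = PC + Jump_Address
        return (pc + jump_address) & PC_MASK
    raise ValueError(f"unknown mode: {mode}")


print(hex(next_pc(0x100)))                    # 0x104
print(hex(next_pc(0x100, "absolute", 0x200))) # 0x200
print(hex(next_pc(0x100, "relative", -8)))    # 0xf8
```

The increment is 4 because every instruction occupies 4 bytes.
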
14
MIPS architecture
  • Decode and data preparation

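[Figure: decode and data preparation. The decoder (DEC) turns the op_code field of the instruction register (IR) into Exec, Mem. and WB commands. The op1_ad and op2_ad fields address the register block (reg. 0 to reg. 31) through two multiplexers that deliver the operands A and B. The address field supplies I, the immediate value.]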
15
MIPS architecture
  • Execute and memory access

[Figure: execute and memory access stages (surviving label: Data out)]
16
MIPS architecture
  • Write back the result

17
MIPS architecture
  • The whole picture

[Figure: the complete MIPS datapath. Blocks: clock generator (Clk), phase generator, PC, instruction memory, IR, instruction decoder, register block (Regs), ALU, data memory.]
18
Pipeline execution
  • What does it mean?
  • Works like an assembly line
  • Idea: automobile assembly lines of the early
    1900s (Oldsmobile, then Ford)
  • How to do it?
  • Specialized components (units) for every phase of
    instruction execution
  • Store the partial results in temporary buffers
  • What can we achieve?
  • Higher execution speed at the same clock
    frequency
  • CPI = 1 (clock cycles per instruction)

19
Sequential vs. pipeline execution
  • Sequential execution: CPI = 5
  • Pipeline execution: CPI = 1 (in the ideal case)

      T1  T2  T3  T4  T5  T6  T7  T8  T9  T10
i1    IF  ID  Ex  M   Wb
i2        IF  ID  Ex  M   Wb
i3            IF  ID  Ex  M   Wb
i4                IF  ID  Ex  M   Wb
i5                    IF  ID  Ex  M   Wb
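
The CPI figures follow from counting clock periods. A minimal sketch, assuming an ideal pipeline without hazards; the function names are illustrative only.

```python
def sequential_cycles(n_instructions, n_phases=5):
    """Each instruction runs all its phases before the next one starts."""
    return n_instructions * n_phases


def pipelined_cycles(n_instructions, n_phases=5):
    """The first instruction fills the pipeline; each later one adds one cycle."""
    if n_instructions == 0:
        return 0
    return n_phases + n_instructions - 1


for n in (5, 1000):
    seq = sequential_cycles(n)
    pipe = pipelined_cycles(n)
    print(n, seq, pipe, seq / n, pipe / n)
# n = 5:    25 cycles sequential, 9 pipelined
# n = 1000: 5000 cycles sequential, 1004 pipelined (CPI 1.004)
```

For the five instructions in the diagram the pipeline needs 9 clock periods instead of 25; the CPI only approaches 1 for long instruction sequences, once the fill time is amortized.
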
20
Superscalar and superpipeline architectures
  • Superscalar
  • Multiple pipelines
  • 2 instructions are fetched every clock
  • CPI = 1/2
  • Superpipeline
  • phases require only half a clock period
  • CPI = 1/2

Superscalar (two instructions start in every clock period):
            T1  T2  T3  T4  T5  T6
instr. i    IF  ID  Ex  M   Wb
instr. i+1  IF  ID  Ex  M   Wb
instr. i+2      IF  ID  Ex  M   Wb
instr. i+3      IF  ID  Ex  M   Wb

Superpipeline (every phase takes half a clock period):
            T1      T2      T3      T4      T5      T6
instr. i    IF  ID  Ex  M   Wb
instr. i+1      IF  ID  Ex  M   Wb
instr. i+2          IF  ID  Ex  M   Wb
instr. i+3              IF  ID  Ex  M   Wb
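
Both CPI values can be checked with the same kind of counting as for the plain pipeline. This is a sketch for the ideal, hazard-free case; the parameter names width and subdivision are assumptions introduced here.

```python
import math


def superscalar_cycles(n_instructions, width=2, n_phases=5):
    """Clock periods when `width` instructions enter the pipelines per clock."""
    if n_instructions == 0:
        return 0
    return n_phases + math.ceil(n_instructions / width) - 1


def superpipeline_cycles(n_instructions, n_phases=5, subdivision=2):
    """Clock periods when every phase lasts 1/subdivision of a clock period."""
    if n_instructions == 0:
        return 0.0
    return (n_phases + n_instructions - 1) / subdivision


n = 1000
print(superscalar_cycles(n) / n)    # 0.504
print(superpipeline_cycles(n) / n)  # 0.502
```

For the four instructions shown in the diagrams the counts are 6 clock periods (superscalar) and 4 clock periods (superpipeline).
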
21
Pipelined MIPS architecture
22
Pipeline architecture
  • There is no free lunch!
  • Hazard cases
  • Data hazard
  • Data dependency between consecutive instructions
  • Control hazard
  • Jump/branch instructions change the normal
    (sequential) order of instruction execution
  • Structural hazard
  • Instructions in different phases use the same
    structural component (e.g. ALU, registers,
    memory, bus, etc.)
  • Result: hazards reduce the speed and the
    efficiency of the pipeline architecture

23
Hazard cases in pipeline architectures
  • Data hazard
  • Data hazard types
  • RAW - read after write
  • Occurs very often; avoided through forwarding
    (see Common data bus)
  • WAR - write after read
  • Rare in a classic pipeline; more frequent in
    superscalar pipelines
  • WAW - write after write
  • RAR - read after read: not a hazard
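
The hazard types can be told apart by comparing the registers two instructions read and write. A minimal sketch: the (destination, sources) tuple used to describe an instruction is an assumption made for this example.

```python
def data_hazards(first, second):
    """Classify the data hazards between two instructions in program order.

    Each instruction is a (destination, sources) pair; destination may be None.
    """
    dest1, sources1 = first
    dest2, sources2 = second
    hazards = set()
    if dest1 is not None and dest1 in sources2:
        hazards.add("RAW")   # second reads what first writes
    if dest2 is not None and dest2 in sources1:
        hazards.add("WAR")   # second writes what first reads
    if dest1 is not None and dest1 == dest2:
        hazards.add("WAW")   # both write the same register
    return hazards


add = ("r1", ["r2", "r3"])                   # r1 = r2 + r3
print(data_hazards(add, ("r4", ["r1", "r5"])))  # {'RAW'}
print(data_hazards(add, ("r2", ["r4", "r5"])))  # {'WAR'}
print(data_hazards(add, ("r1", ["r4", "r5"])))  # {'WAW'}
print(data_hazards(add, ("r6", ["r2", "r3"])))  # set(): read after read only
```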

24
Hazard cases in pipeline architectures
  • Data hazard (cont.)
  • Solutions
  • Detection and Stall phases
  • an instruction with an unresolved data dependency
    waits in the instruction fetch stage until the
    data is available
  • the following instructions are also stalled
  • Register renaming
  • multiple copies of a register (see alias
    registers for Pentium Pro)
  • instructions with no logical dependency between
    them can get different copies of the same
    register
  • avoids artificial data dependencies caused by the
    limited number of internal registers
  • Forwarding (see Common data bus)
  • transfers a result in advance, before it is
    written to its final place (register or memory
    location)
  • Out-of-order execution
  • speculative execution (see Pentium Pro
    architecture)
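
Register renaming can be illustrated in a few lines. This is a simplified sketch, not the Pentium Pro mechanism: it assumes an unlimited pool of physical registers named p0, p1, ... and gives every write a fresh copy, so only true (RAW) dependencies remain.

```python
from itertools import count


def rename_registers(program):
    """Rename the registers of a list of (destination, sources) instructions."""
    fresh = count()        # unlimited pool of physical registers (assumption)
    current = {}           # architectural register -> latest physical copy
    renamed = []

    def lookup(reg):
        if reg not in current:
            current[reg] = f"p{next(fresh)}"
        return current[reg]

    for dest, sources in program:
        phys_sources = [lookup(src) for src in sources]  # read the latest copies
        current[dest] = f"p{next(fresh)}"                # each write gets a new copy
        renamed.append((current[dest], phys_sources))
    return renamed


program = [
    ("r1", ["r2", "r3"]),   # r1 = r2 + r3
    ("r4", ["r1", "r5"]),   # r4 = r1 + r5   (RAW on r1: a true dependency)
    ("r1", ["r6", "r7"]),   # r1 = r6 + r7   (WAW and WAR on r1: artificial)
]
for instruction in rename_registers(program):
    print(instruction)
# ('p2', ['p0', 'p1'])
# ('p4', ['p2', 'p3'])
# ('p7', ['p5', 'p6'])
```

After renaming, the third instruction writes p7 instead of reusing r1, so it no longer conflicts with the first two and could execute before them.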

25
Hazard cases in pipeline architectures
  • Structural hazard
  • Solutions
  • Detection and Stall phases
  • Redundant functional units (see Pentium
    processors)
  • Harvard memory organization: separate code and
    data memory (see microcontrollers)
  • Multiple buses (see DSPs)
  • Out-of-order execution

26
Hazard cases in pipeline architectures
  • Control hazard
  • Solutions
  • Stall phases
  • Branch prediction
  • Out-of-order execution
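
The slides do not say which branch prediction scheme is meant. One common textbook scheme, shown here only as an illustration, is the 2-bit saturating counter: the prediction changes only after two consecutive mispredictions. The class and function names are hypothetical.

```python
class TwoBitPredictor:
    """2-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 1   # start in the "weakly not taken" state (assumption)

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_mispredictions(outcomes):
    """Feed a sequence of branch outcomes (True = taken) to a fresh predictor."""
    predictor = TwoBitPredictor()
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


loop_branch = [True] * 9 + [False]   # loop branch: taken 9 times, then falls through
print(count_mispredictions(loop_branch))  # 2
```

Every misprediction still costs stall phases, because the wrongly fetched instructions must be discarded; prediction only makes that case rare.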

27
Pipeline architecture hazard cases
  • Solving hazard cases
  • Detect hazard cases and introduce stall phases
  • Rearrange instructions
  • re-arrange instructions in order to reduce the
    dependences between consecutive instructions
  • Methods
  • Static scheduling: made before program execution;
    optimization made by the compiler or the user
  • Dynamic scheduling: made during program
    execution; optimization made by the processor
    (out-of-order execution)
  • Branch prediction techniques

28
Static vs. dynamic scheduling
  • Static scheduling
  • The optimal order of instructions is established
    by the compiler, based on information about the
    structure of the pipeline
  • Advantage: it is done once and pays off every
    time the code is executed
  • Drawback: the compiler must know the
    structure of the hardware (e.g. pipeline stages,
    phases of every instruction); the compiler must be
    changed when the processor version changes
  • Dynamic scheduling
  • The hardware has the capacity to reorder
    instructions to avoid or reduce the effect of
    hazard cases
  • Advantages: the processor knows its own
    structure best; optimization can be tied more
    closely to the hardware; some dependences are
    revealed only at run time
  • Drawbacks: reordering decisions are made every
    time the code is executed; more complex hardware
    is needed
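
The benefit of reordering can be measured on a small example. The sketch below assumes a simple in-order pipeline without forwarding, in which a result can be read 3 clock periods after the producing instruction was issued; this latency, the (destination, sources) instruction format and the function name are assumptions made for illustration.

```python
def last_issue_cycle(program, result_latency=3):
    """Clock period in which the last instruction of an in-order pipeline issues."""
    ready = {}     # register -> first cycle in which its new value can be read
    cycle = 0
    for dest, sources in program:
        earliest = cycle + 1                 # at most one instruction per cycle
        for src in sources:
            earliest = max(earliest, ready.get(src, 0))   # stall on RAW
        cycle = earliest
        ready[dest] = cycle + result_latency
    return cycle


original = [
    ("r1", ["r0"]),         # lw  r1, 0(r0)
    ("r2", ["r1", "r3"]),   # add r2, r1, r3   (needs r1 immediately)
    ("r4", ["r0"]),         # lw  r4, 4(r0)
    ("r5", ["r4", "r6"]),   # add r5, r4, r6   (needs r4 immediately)
]
# Same instructions, with the two independent loads moved to the front.
reordered = [original[0], original[2], original[1], original[3]]

for name, program in (("original", original), ("reordered", reordered)):
    stalls = last_issue_cycle(program) - len(program)
    print(name, stalls)
# original 4
# reordered 1
```

A compiler performs this kind of reordering once, before execution (static scheduling); an out-of-order processor makes an equivalent decision in hardware on every execution (dynamic scheduling).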