Introduction to VLSI Programming Lecture 7: Introduction to the DLX

About This Presentation
Title:

Introduction to VLSI Programming Lecture 7: Introduction to the DLX

Slides: 33
Provided by: keesvan
Learn more at: http://www.cs.unc.edu
Transcript and Presenter's Notes


1
Introduction to VLSI Programming Lecture 7
Introduction to the DLX
  • (course 2IN30)
  • Prof. dr. ir. Kees van Berkel

2
VLSI programming for
  • Low costs: introduce resource sharing.
  • Low delay (high throughput): introduce parallelism.
  • Low energy (low power): reduce activity.

3
VLSI programming for low costs
  • Keep it simple!!
  • Introduce resource sharing: commands, auxiliary
    variables, expressions, operators.
  • Enable resource sharing, by
      • reducing parallelism
      • making similar commands equal

4
Procedure definition vs declaration
  • Procedure definition: P = proc (). S
      • provides a textual shorthand (expansion)
      • each call generates a copy of the resource, i.e. no
        sharing
  • Procedure declaration: P : proc (). S
      • defines a sharable resource
      • each call generates access to this resource

5
Hints and Tips: optimization
  • When asked to optimize for area (low cost), it is
    allowed to invest time (execution time, extra
    iterations, ...).
  • When asked to optimize for speed, it is allowed
    to invest area (pipeline stages, parallelism, ...).

6
Hints and Tips: a known bug
  • Statement of the form: if ¬x then S0 else S1 fi
  • During simulation the wrong alternative is selected
    (e.g. S0 when x is true).
  • Workaround: remove the negation and swap the
    alternatives: if x then S1 else S0 fi

7
Instruction Set Architecture
  • ISA is interface between hardware and software.
  • Hence, a good ISA
  • allows easy programming (compilers, OS, ..)
  • allows efficient implementations (hardware)
  • has a long lifetime (survives many HW
    generations)
  • is general purpose.

8
ISA classification
Code sequence for C = A + B
[Table not preserved in this transcript.]

9
Reduced Instruction Set Computer
  • 1980: Patterson and Ditzel, "The Case for RISC"
      • fixed 32-bit instruction size, with few formats
      • load-store architecture
      • large register bank (32 registers), all general
        purpose
  • On processor organization:
      • hard-wired decode logic
      • pipelined execution
      • single clock-cycle execution

10
RISC processors
  • Advantages
      • smaller die size (single-chip processor)
      • shorter development time (simplicity)
      • higher performance
  • Disadvantages
      • poor code density
      • cannot execute x86 code

11
A Typical RISC
  • 32-bit instructions, 3 fixed formats
  • 32 general-purpose registers, each 32 bits wide
  • 3-address arithmetic instructions, reg-reg
  • single address mode for load/store: address +
    displacement
  • simple branch conditions; delayed branch
12
DLX ('Deluxe')
  • (AMD 29K + DECstation 3100 + HP850 + IBM801 +
    Intel i860 + MIPS M/120A + MIPS M/1000 + Motorola
    88K + RISC I + SGI 4D/60 + SPARCstation-1 + Sun
    4/110 + Sun-4/260) / 13 = DLX
  • Other RISC examples include
    Cray-1,2,3, AMD2900, DEC
    Alpha, ARM.
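
The arithmetic behind the name can be checked with a short sketch. It assumes that each machine contributes the number in its name and that the "K" suffixes count as plain 29 and 88; that reading is an assumption, chosen because it makes the average come out as a value whose Roman numeral is DLX.

```python
# Numbers read off the machine names on the slide.
# Assumption: "29K" and "88K" are counted as 29 and 88.
MACHINES = {
    "AMD 29K": 29,
    "DECstation 3100": 3100,
    "HP850": 850,
    "IBM801": 801,
    "Intel i860": 860,
    "MIPS M/120A": 120,
    "MIPS M/1000": 1000,
    "Motorola 88K": 88,
    "RISC I": 1,
    "SGI 4D/60": 60,
    "SPARCstation-1": 1,
    "Sun 4/110": 110,
    "Sun-4/260": 260,
}


def to_roman(n: int) -> str:
    """Convert a positive integer to a Roman numeral."""
    symbols = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, symbol in symbols:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


average = sum(MACHINES.values()) // len(MACHINES)
print(average, to_roman(average))
```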

13
DLX instruction formats
Field boundaries (bit positions): 31-26, 25-21, 20-16, 15-11, 10-0
[Format diagram not preserved in this transcript.]
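
A minimal sketch of how a 32-bit instruction word splits along these bit boundaries. The field names (opcode, rs1, rd, func, immediate, offset) follow the usual DLX description in Hennessy & Patterson and are not given in this transcript, so treat them as assumptions. The opcode value in the example is hypothetical.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Return bits hi..lo (inclusive) of a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Interpret a `bits`-wide value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode(word: int) -> dict:
    """Split a word along the boundaries 31-26, 25-21, 20-16, 15-11, 10-0.

    Which fields are meaningful depends on the format (assumed names):
      I-type: opcode, rs1, bits_20_16 (rd), immediate
      R-type: opcode, rs1, bits_20_16 (rs2), bits_15_11 (rd), func
      J-type: opcode, offset
    """
    return {
        "opcode": field(word, 31, 26),
        "rs1": field(word, 25, 21),
        "bits_20_16": field(word, 20, 16),
        "bits_15_11": field(word, 15, 11),
        "func": field(word, 10, 0),
        "immediate": sign_extend(field(word, 15, 0), 16),
        "offset": sign_extend(field(word, 25, 0), 26),
    }


# Hypothetical I-type word: opcode 0x23, rs1 = 0, rd = 1, immediate = 4.
example = (0x23 << 26) | (0 << 21) | (1 << 16) | 4
print(decode(example))
```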
14
Example instructions
15
GCD in GCL
  • x,y := X,Y
  • ; do x≠y → if x>y → x := x-y
  •             [] x<y → y := y-x
  •             fi
  •   od
  • {R: x = gcd(X,Y)}
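
The same subtraction-based algorithm as a small Python sketch. The function name and the requirement that both inputs are positive are assumptions made here for illustration; the loop and the two guarded alternatives mirror the GCL program.

```python
def gcd_by_subtraction(X: int, Y: int) -> int:
    """Greatest common divisor by repeated subtraction (X, Y > 0)."""
    if X <= 0 or Y <= 0:
        raise ValueError("both arguments must be positive")
    x, y = X, Y
    while x != y:          # do x != y ->
        if x > y:          #   if x > y -> x := x - y
            x = x - y
        else:              #   [] x < y -> y := y - x
            y = y - x
    return x               # postcondition R: x = gcd(X, Y)


print(gcd_by_subtraction(12, 18))
```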

16
GCD in DLX assembler
  • pre:  LW   R1,4(R0)    R1 := Mem[4+0]
  •       LW   R2,8(R0)    R2 := Mem[8+0]
  • loop: SUB  R3,R1,R2    R3 := R1-R2
  •       BEQZ R3,exit     if (R3=0) then PC := exit
  •       SLT  R4,R1,R2    R4 := (R1<R2)
  •       BEQZ R4,pos2     if (R4=0) then PC := pos2
  • pos1: SUB  R2,R2,R1    R2 := R2-R1
  •       J    loop        PC := loop
  • pos2: SUB  R1,R1,R2    R1 := R1-R2
  •       J    loop        PC := loop
  • exit: SW   20(R0),R1   Mem[20+0] := R1
  •       HLT
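
One way to check the program is to run it on a toy interpreter. The sketch below is not a DLX model: it covers only the six instructions used here, represents instructions as Python tuples instead of encoded words, uses symbolic labels for branch targets, and ignores delayed branches. The tuple layout and the `run` helper are assumptions introduced for this example.

```python
# Each entry: (label or None, mnemonic, operands).
PROGRAM = [
    ("pre",  "LW",   (1, 4, 0)),     # R1 := Mem[4 + R0]
    (None,   "LW",   (2, 8, 0)),     # R2 := Mem[8 + R0]
    ("loop", "SUB",  (3, 1, 2)),     # R3 := R1 - R2
    (None,   "BEQZ", (3, "exit")),   # if R3 = 0 then PC := exit
    (None,   "SLT",  (4, 1, 2)),     # R4 := (R1 < R2)
    (None,   "BEQZ", (4, "pos2")),   # if R4 = 0 then PC := pos2
    ("pos1", "SUB",  (2, 2, 1)),     # R2 := R2 - R1
    (None,   "J",    ("loop",)),
    ("pos2", "SUB",  (1, 1, 2)),     # R1 := R1 - R2
    (None,   "J",    ("loop",)),
    ("exit", "SW",   (20, 0, 1)),    # Mem[20 + R0] := R1
    (None,   "HLT",  ()),
]


def run(program, mem, limit=1000):
    """Execute until HLT; return the (updated) memory dictionary."""
    labels = {lab: i for i, (lab, _, _) in enumerate(program) if lab}
    reg = [0] * 32
    pc = 0
    for _ in range(limit):
        _, op, args = program[pc]
        pc += 1
        if op == "LW":
            rd, disp, rs1 = args
            reg[rd] = mem[reg[rs1] + disp]
        elif op == "SW":
            disp, rs1, rd = args
            mem[reg[rs1] + disp] = reg[rd]
        elif op == "SUB":
            rd, rs1, rs2 = args
            reg[rd] = reg[rs1] - reg[rs2]
        elif op == "SLT":
            rd, rs1, rs2 = args
            reg[rd] = int(reg[rs1] < reg[rs2])
        elif op == "BEQZ":
            rs, target = args
            if reg[rs] == 0:
                pc = labels[target]
        elif op == "J":
            pc = labels[args[0]]
        elif op == "HLT":
            return mem
        reg[0] = 0  # R0 always reads as zero
    raise RuntimeError("step limit reached before HLT")


result = run(PROGRAM, {4: 12, 8: 18})
print(result[20])
```

The inputs are placed at addresses 4 and 8 and the result is read back from address 20, matching the loads and the store in the listing.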

17
DLX instruction mixes
from Hennessy & Patterson, Figs 2.26, 2.27
[Figures not preserved in this transcript.]

18
DLX interface, state
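[Figure: DLX CPU with state pc and Reg (r0, r1, r2, ..., r31), connected to
an instruction memory and to Mem (data memory). Signals: address and
instruction (instruction memory); address, data and r/w (data memory);
clock; interrupt.]
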
19
DLX Moore machine(ignoring interrupts)
  • ⟨Reg[0], pc⟩ := ⟨0, 0⟩
  • ; do ⟨Mem[Reg[rs1] + immediate], pc, Reg[rd]⟩ :=
  •      ⟨ if SW → Reg[rd] fi
  •      , if J → pc + 4 + offset
  •        [] BEQZ → if Reg[rs] = 0 → pc + 4 + immediate
  •                  [] Reg[rs] ≠ 0 → pc + 4 fi
  •        [] else → pc + 4
  •        fi
  •      , if LW → Mem[Reg[rs1] + immediate]
  •        [] ADD → ALU(add, Reg[rs1], Reg[rs2])
  •        fi ⟩
  •   od

20
DLX 5-step sequential execution
21
DLX 5-step sequential execution
  • IF (instruction fetch), ID (instruction decode), EX (execute),
    MM (memory), WB (write back)

22
Bibliography
  • John L. Hennessy and David A. Patterson, Computer
    Architecture: A Quantitative Approach (3rd Ed.),
    Morgan Kaufmann Publishers Inc., 1996.
  • Steve Furber, ARM System Architecture, Addison
    Wesley, 1996.
  • Phil Lapsley et al. (Berkeley Design Technology
    Inc.), DSP Processor Fundamentals: Architectures
    and Features, IEEE, 1996.
  • www.handshakesolutions.com
  • www.arm.com/news/6936.html
  • www.research.philips.com/newscenter/archive/2004/handshake.html

24
Pipelining in Tangram (cont'd)
  • Output sequence b is identical for P0, P1, and P2.
  • P0 and P1 have the same communication behavior; P1
    is larger, slower, and warmer.
  • P2 vs P1 similar in size, energy, and latency,
    but up to 3 times higher throughput, depending
    on (relative) complexity of f0, f1, f2.

25
DLX 5-step sequential execution
  • IF (instruction fetch), ID (instruction decode), EX (execute),
    MM (memory), WB (write back)

26
DLX pipelined execution
[Figure: pipeline diagram. Horizontal axis: time in clock cycles
(1 2 3 4 5 6 7 8 ...). Vertical axis: program execution, in instructions.]
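
The overlap in such a diagram can be sketched as follows, assuming an ideal pipeline in which every stage takes one clock cycle and there are no stalls or hazards. The function names are made up for this example; the stage names are the five used in these slides.

```python
STAGES = ["IF", "ID", "EX", "MM", "WB"]


def sequential_cycles(n: int) -> int:
    """Cycles for n instructions when each finishes all steps before the next starts."""
    return n * len(STAGES)


def pipelined_cycles(n: int) -> int:
    """Cycles for n instructions when a new one enters the pipeline every cycle."""
    return len(STAGES) + n - 1 if n > 0 else 0


def diagram(n: int) -> str:
    """One row per instruction, one two-character column per clock cycle."""
    total = pipelined_cycles(n)
    rows = []
    for i in range(n):
        cells = ["  "] * total
        for s, name in enumerate(STAGES):
            cells[i + s] = name  # instruction i is in stage s during cycle i + s
        rows.append(f"i{i}: " + " ".join(cells).rstrip())
    return "\n".join(rows)


print(diagram(4))
print(sequential_cycles(4), "cycles sequential,",
      pipelined_cycles(4), "cycles pipelined")
```

Under these assumptions four instructions fit in the eight clock cycles shown on the time axis, against twenty cycles for strictly sequential 5-step execution.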
27
DLX pipelined execution
[Figure: pipelined datapath. Stages: Instruction Fetch, Inst. Decode,
EXecute, Memory, Write Back. Labelled elements: pc, 4, Instr. mem, Reg, Mem.]

28
DLX system organization
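[Figure: system_dlx() contains dlx(), rom() and ram() inside the system
boundary. Channels between dlx() and rom(): ROMaddr, ROMdata. Channels
between dlx() and ram(): RAMaddr, datatoRAM, datafromRAM. Files: gcd.bin,
RAMin, RAMout.]
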
29
dlx0.ht
  • include types.ht
  • dlx0 export proc ( ROMaddr!chan adtype
  • ROMdata?chan word
  • RAMaddr!chan rwadtype
  • datatoRAM!chan S30
  • datafromRAM?chan S30
  • ) .
  • begin
  • RF ram array U5 of S30
  • end

30
system_dlx0.ht
  • include "dlx0.ht"
  • dlx0 proc ( ROMaddr!chan adtype
  • ROMdata?chan word
  • RAMaddr!chan rwadtype
  • datatoRAM!chan S30
  • datafromRAM?chan S30
  • ) . import
  • env_dlx4 main proc (
  • ROMfile? chan word
  • RAMinfile? chan S30
  • RAMfile! chan S30 / <<address,data>> /
  • ) .
  • begin
  • next slide
  • end

31
system_dlx0.ht main body
  • begin
  • ROMaddr chan adtype
  • ROMdata chan word
  • RAMaddr chan rwadtype
  • datatoRAM chan S30
  • datafromRAM chan S30
  • ROMinterface proc() . begin .. end
  • RAMinterface proc() . begin .. end
  • initialise() ROMinterface()
    RAMinterface() dlx0( ROMaddr, ROMdata,
    RAMaddr, datatoRAM, datafromRAM )
  • end

32
script
  • htcomp -B system_dlx0
  • htsim -limit 1000 system_dlx0 gcd.bin RAMin RAMout
  • htview system_dlx0