Embedded System HW - PowerPoint PPT Presentation

Slides: 128
Provided by: wayne74

1
Embedded System HW
2
Why use microprocessors?
  • Alternatives: field-programmable gate arrays
    (FPGAs), custom logic, etc.
  • Microprocessors are often very efficient: they can
    use the same logic to perform many different functions.
  • Microprocessors simplify the design of families
    of products.

3
The performance paradox
  • Microprocessors use much more logic to implement
    a function than does custom logic.
  • But microprocessors are often at least as fast:
  • heavily pipelined
  • large design teams
  • aggressive VLSI technology.

4
Power
  • Custom logic is a clear winner for low power
    devices.
  • Modern microprocessors offer features to help
    control power consumption.
  • Software design techniques can help reduce power
    consumption.

5
Microprocessor varieties
  • Microcontroller: includes I/O devices, on-board
    memory.
  • Digital signal processor (DSP): microprocessor
    optimized for digital signal processing.
  • Typical embedded word sizes: 8-bit, 16-bit,
    32-bit.

6
Many Types of Programmable Processors
  • Past
  • Microprocessor
  • Microcontroller
  • DSP
  • Graphics Processor
  • Now / Future
  • Network Processor
  • Sensor Processor
  • Cryptoprocessor
  • Game Processor
  • Wearable Processor
  • Mobile Processor

7
Application-Specific Instruction Processors
(ASIPs)
  • Processors with instruction-sets tailored to
    specific applications or application domains
  • instruction-set generation as part of synthesis
  • Pluses
  • customization yields lower area, power etc.
  • Minuses
  • higher h/w and s/w development overhead
  • design, compilers, debuggers
  • higher time to market

8
Reconfigurable SoC
  • Triscend's A7 CSoC
  • Other examples: Atmel's FPSLIC (AVR + FPGA),
    Altera's Nios (configurable RISC on a PLD)

9
Instruction Sets
10
von Neumann architecture
  • Memory holds data, instructions.
  • Central processing unit (CPU) fetches
    instructions from memory.
  • Separate CPU and memory distinguishes
    programmable computer.
  • CPU registers help out: program counter (PC),
    instruction register (IR), general-purpose
    registers, etc.

11
CPU + memory
[Figure: the CPU sends the address held in the PC (200) to memory; memory location 200 holds ADD r5,r1,r3, which returns over the data lines into the IR.]
12
Harvard architecture
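[Figure: the CPU connects to a data memory and to a separate program memory, each over its own address and data lines; the PC supplies the program memory address.]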
13
von Neumann vs. Harvard
  • Harvard can't use self-modifying code.
  • Harvard allows two simultaneous memory fetches.
  • Most DSPs use Harvard architecture for streaming
    data:
  • greater memory bandwidth
  • more predictable bandwidth.

14
RISC vs. CISC
  • Complex instruction set computer (CISC)
  • many addressing modes
  • many operations.
  • Reduced instruction set computer (RISC)
  • load/store
  • pipelinable instructions.

15
Instruction set characteristics
  • Fixed vs. variable length.
  • Addressing modes.
  • Number of operands.
  • Types of operands.

16
Programming model
  • Programming model: registers visible to the
    programmer.
  • Some registers are not visible (IR).

17
Multiple implementations
  • Successful architectures have several
    implementations:
  • varying clock speeds
  • different bus widths
  • different cache sizes
  • etc.

18
ARM Architecture
  • Advanced RISC Machines (1990)
  • (Acorn and Apple Computer)

19
ARM Architecture
  • ARM versions.
  • ARM assembly language.
  • ARM programming model.

20
ARM versions
  • ARM architecture has been extended over several
    versions.
  • We will concentrate on ARMv5

21
Evolution of the ARM architecture versions
22
ARMv6 Improvement
  • Memory management
  • Multiprocessing
  • Multimedia support: SIMD capability

23
Evolution of the ARM architecture
ARM11
24
Introduction
  • To allow very small, yet high-performance
    implementations
  • RISC
  • Large uniform register file
  • Load/store architecture
  • Simple addressing modes
  • Uniform and fixed-length instruction fields
  • Auto-increment and auto-decrement addressing modes
  • Conditional execution of all instructions

25
ARM assembly language
  • Fairly standard assembly language
  • LDR r0,[r8] ; a comment
  • label ADD r4,r0,r1

26
Programming Model
27
ARM data types
  • Byte
  • Halfword: 16 bits
  • Must be aligned to two-byte boundaries
  • Word: 32 bits
  • Must be aligned to four-byte boundaries
  • ARM addresses can be 32 bits long.
  • Address refers to byte.
  • Address 4 starts at byte 4.
  • Can be configured at power-up as either little-
    or big-endian mode.

28
Processor modes
  • User (usr): normal program execution mode
  • FIQ (fiq): supports a high-speed data transfer or
    channel process
  • IRQ (irq): used for general-purpose interrupt
    handling
  • Supervisor (svc): a protected mode for the OS
  • Abort (abt): implements VM and/or memory
    protection
  • Undefined (und): supports software emulation of
    HW coprocessors
  • System (sys): runs privileged OS tasks
  • fiq, irq, svc, abt, und: exception modes

29
Registers
[Figure: general-purpose registers r0-r15, each 32 bits wide (bit 31 down to bit 0), plus the CPSR. r14 is the link register and r15 is the PC. The figure marks which registers are unbanked and which are banked.]
31
Endianness
  • Relationship between bit and byte/word ordering
    defines endianness

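[Figure: two layouts of a 32-bit word (bit 31 down to bit 0). Little-endian: byte 3 | byte 2 | byte 1 | byte 0, so byte 0 holds the least-significant bits. Big-endian: byte 0 | byte 1 | byte 2 | byte 3, so byte 0 holds the most-significant bits.]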
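
As a small illustration of the two byte orderings, the sketch below lists the bytes of one 32-bit word in increasing address order. The helper name `word_to_bytes` and the sample value are assumptions chosen for this example; they do not come from the slides.

```python
def word_to_bytes(value, byteorder):
    """Return the four bytes of a 32-bit word in increasing address order."""
    return list((value & 0xFFFFFFFF).to_bytes(4, byteorder))


word = 0x11223344  # hypothetical sample word

# Little-endian: the lowest address holds the least-significant byte.
little = word_to_bytes(word, "little")
# Big-endian: the lowest address holds the most-significant byte.
big = word_to_bytes(word, "big")

print([hex(b) for b in little])
print([hex(b) for b in big])
```

The same word yields reversed byte sequences, which is why data exchanged between a little-endian and a big-endian machine has to be byte-swapped.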
32
ARM status bits
  • Every arithmetic, logical, or shifting operation
    may set CPSR (current program status register)
    bits:
  • N (negative), Z (zero), C (carry), V (overflow).
  • Examples:
  • -1 + 1 = 0: NZCV = 0110.
  • 2^31 - 1 + 1 = -2^31: NZCV = 1001.
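
The flag rules can be checked with a short model of a 32-bit addition. This is a minimal sketch, not ARM's own definition: the function name `add_flags` is hypothetical, and it covers only ADD-style flag setting.

```python
MASK32 = 0xFFFFFFFF


def add_flags(a, b):
    """Add two 32-bit values; return (result, 'NZCV' string)."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    n = result >> 31                 # N: bit 31 of the result
    z = int(result == 0)             # Z: result is zero
    c = int(full > MASK32)           # C: carry out of bit 31
    # V: both operands have the same sign and the result's sign differs.
    v = ((a ^ result) & (b ^ result)) >> 31 & 1
    return result, f"{n}{z}{c}{v}"


print(add_flags(0xFFFFFFFF, 1))      # -1 + 1
print(add_flags(0x7FFFFFFF, 1))      # 2^31 - 1 + 1
```

For -1 + 1 the model gives Z and C set; for 2^31 - 1 + 1 it gives N and V set, because the result wraps to the most negative 32-bit value.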

33
ARM data processing operand addressing
  • Instruction syntax
  • <opcode>{<cond>}{S} <Rd>, <Rn>, <shifter_operand>
  • <shifter_operand> has 11 options

34
Condition field
  • Almost all ARM instructions can be conditionally executed

35
ARM data processing operand addressing
Data processing immediate shift
Data processing register shift
Data processing 32-bit immediate
36
Shifter operand
  • Immediate
  • 8-bit constant and a 4-bit rotate (0, 2, 4, ..., 30)
  • mov r0, #0
  • add r9, r9, #1
  • Register operand
  • mov r2, r0
  • Shifted register operand
  • ASR, LSL, LSR, ROR, RRX (by one bit)
  • mov r2, r0, LSL #2 ; shift r0 left by 2, write
    to r2 (r2 = r0 x 4)
  • sub r10, r9, r8, LSR #4 ; r10 = r9 - r8/16
  • sub r10, r9, r8, ROR r3 ; r10 = r9 - (r8 rotated by
    value of r3)
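
The immediate form (an 8-bit constant rotated right by an even amount) means that only some 32-bit constants fit in one instruction. The sketch below searches for a valid encoding. The names `ror32` and `encode_immediate` are hypothetical, and returning the first rotation found is an assumption of this example, not a rule from the slides.

```python
def ror32(x, n):
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF


def encode_immediate(value):
    """Return (imm8, rot) with value == ror32(imm8, 2 * rot), or None."""
    value &= 0xFFFFFFFF
    for rot in range(16):                       # 4-bit rotate field
        # Rotating left by 2*rot undoes the rotate right applied on decode.
        imm8 = ror32(value, (32 - 2 * rot) % 32)
        if imm8 <= 0xFF:
            return imm8, rot
    return None                                 # not encodable as an immediate


print(encode_immediate(0xFF000000))
print(encode_immediate(0x101))
```

A constant such as 0x101 has set bits more than 8 positions apart, so no rotation fits it into 8 bits, and it would have to be built some other way (for example, loaded from memory).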

37
ARM data-processing
  • AND
  • EOR
  • SUB: Rd = Rn - shifter operand
  • RSB: Rd = shifter operand - Rn
  • ADD
  • ADC (with carry)
  • SBC
  • RSC (reverse SBC)
  • TST: update flags after Rn AND shifter operand
  • TEQ
  • CMP
  • CMN (compare negated)
  • ORR (logical OR)
  • MOV
  • BIC
  • MVN (mov not)

38
ARM data-processing
  • Shift, Rotate: applied through the shifter operand
  • LSL, LSR: logical shift left/right
  • ASR: arithmetic shift right
  • ROR: rotate right
  • RRX: rotate right extended with C

39
Data operation varieties
  • Logical shift:
  • fills with zeroes.
  • Arithmetic shift:
  • fills with sign extension.
  • RRX performs 33-bit rotate, including C bit from
    CPSR above sign bit.
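
The difference between these shift varieties shows up most clearly on a value with the sign bit set. The following is a minimal sketch on 32-bit values. The function names mirror the mnemonics but are otherwise assumptions of this example, and shift amounts are taken to be in the range 0 to 31.

```python
MASK32 = 0xFFFFFFFF


def lsl(x, n):
    """Logical shift left: vacated low bits are filled with zeroes."""
    return (x << n) & MASK32


def lsr(x, n):
    """Logical shift right: vacated high bits are filled with zeroes."""
    return (x & MASK32) >> n


def asr(x, n):
    """Arithmetic shift right: vacated high bits copy the sign bit."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32                 # reinterpret as a negative integer
    return (x >> n) & MASK32


def ror(x, n):
    """Rotate right: bits leaving the bottom re-enter at the top."""
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def rrx(x, carry_in):
    """33-bit rotate right by one through the carry; returns (result, carry_out)."""
    x &= MASK32
    return ((carry_in & 1) << 31) | (x >> 1), x & 1


print(hex(lsr(0x80000000, 4)), hex(asr(0x80000000, 4)))
```

Here `lsr` yields 0x08000000 while `asr` yields 0xF8000000, since the arithmetic form preserves the sign.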

40
Load and Store instructions
  • Two types
  • 32-bit word or an 8-bit unsigned byte
  • Load and store halfword and load signed byte
  • Addressing modes
  • Base register
  • Any one of GPR (including the PC)
  • Offset
  • Three formats

41
Addressing modes
  • Offset
  • Immediate: unsigned number (12 bits or 8 bits)
  • Register: GPR (not the PC)
  • Scaled register: shifted by an immediate value
  • LSL, LSR, ASR, ROR, RRX
  • Three ways to form the memory address
  • EA = base register ± offset
  • Offset
  • Pre-indexed
  • Post-indexed

42
Addressing modes
  • Base-plus-offset addressing:
  • LDR r0,[r1,#16]
  • Loads from location r1+16
  • Pre-indexing increments base register:
  • LDR r0,[r1,#16]!
  • Post-indexing fetches, then does offset:
  • LDR r0,[r1],#16
  • Loads r0 from r1, then adds 16 to r1.
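
The three forms differ only in which address is used for the access and whether the base register is written back. A minimal sketch follows, modeling memory as a dictionary. `MEMORY`, its contents, and the function names are hypothetical values chosen for illustration.

```python
# Hypothetical word memory: address -> value.
MEMORY = {0x100: 111, 0x110: 222}


def ldr_offset(memory, base, offset):
    """Base-plus-offset: access base+offset, base register unchanged."""
    return memory[base + offset], base


def ldr_pre_indexed(memory, base, offset):
    """Pre-indexed: update the base first, then access the new address."""
    new_base = base + offset
    return memory[new_base], new_base


def ldr_post_indexed(memory, base, offset):
    """Post-indexed: access the old base address, then update the base."""
    return memory[base], base + offset


# Each call returns (value loaded, base register value afterwards).
print(ldr_offset(MEMORY, 0x100, 16))
print(ldr_pre_indexed(MEMORY, 0x100, 16))
print(ldr_post_indexed(MEMORY, 0x100, 16))
```

With a base of 0x100 and an offset of 16, the first two forms load from 0x110 while the post-indexed form loads from 0x100; only the indexed forms leave the base at 0x110.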

43
Load and store
  • LDR
  • LDRB
  • LDRH
  • LDRSB (signed byte)
  • LDRSH (signed halfword)
  • STR
  • STRB
  • STRH

44
Examples
  • LDR R1, [R0] ; load R1 from the address in R0
  • LDR R8, [R3, #4] ; EA = R3 + 4
  • LDR R8, [R3, #-4] ; EA = R3 - 4
  • STRB R10, [R7, -R4] ; EA = R7 - R4
  • LDR R11, [R3, R5, LSL #2] ; EA = R3 + (R5 x 4)
  • LDR R3, [R9], #4 ; EA = R9, R9 = R9 + 4
    (post-indexed)
  • LDR R1, [R0, #2]! ; EA = R0 + 2, R0 = R0 + 2
    (pre-indexed)
  • LDR R0, [PC, #0x40] ; load R0 from PC + 0x40 (=
    address of the instruction + 8 + 0x40)

45
Load and store multiple
  • Addressing modes
  • IA increment after
  • IB increment before
  • DA decrement after
  • DB decrement before

46
Load and store multiple
  • LDM
  • STM
  • Examples
  • LDMIA r0, {r5-r8} ; load multiple r5-r8 from
    the address in r0
  • STMDA r1!, {r2, r5, r7-r9, r11} ; update r1
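
The four addressing modes determine which block of words is transferred and where the base register ends up if write-back (`!`) is requested. The sketch below computes both. The function name and the base address 0x1000 are assumptions; it relies on the ARM convention that the lowest-numbered register uses the lowest address.

```python
def block_transfer_addresses(base, reg_count, mode):
    """Return (word addresses in ascending order, written-back base)."""
    size = 4 * reg_count
    if mode == "IA":      # increment after: first word at base
        start, new_base = base, base + size
    elif mode == "IB":    # increment before: first word at base + 4
        start, new_base = base + 4, base + size
    elif mode == "DA":    # decrement after: last word at base
        start, new_base = base - size + 4, base - size
    elif mode == "DB":    # decrement before: last word at base - 4
        start, new_base = base - size, base - size
    else:
        raise ValueError(f"unknown mode {mode!r}")
    return [start + 4 * i for i in range(reg_count)], new_base


# LDMIA with four registers, then STMDA with six registers.
print(block_transfer_addresses(0x1000, 4, "IA"))
print(block_transfer_addresses(0x1000, 6, "DA"))
```

For the six-register STMDA case the words occupy 0xFEC through 0x1000 and the written-back base is 0xFE8.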

47
Branch instructions
  • Conditional branch forwards or backwards up to 32
    MB
  • Sign-extending the 24-bit imm_data to 32 bits
  • Shifting the result left two bits
  • Adding this to the PC (the address of the branch + 8)
  • Range: approximately ±32 MB
  • B, BL
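
The target calculation described above can be written out directly. This is a minimal sketch; the function name `branch_target` and the sample addresses are assumptions made for the example.

```python
def branch_target(branch_address, imm24):
    """Compute the target of B/BL from the 24-bit immediate field."""
    imm24 &= 0xFFFFFF
    # Sign-extend the 24-bit field.
    offset = imm24 - (1 << 24) if imm24 & 0x800000 else imm24
    # Shift left two bits (word offset to byte offset).
    offset <<= 2
    # The PC reads as the address of the branch plus 8.
    return (branch_address + 8 + offset) & 0xFFFFFFFF


print(hex(branch_target(0x8000, 0)))
print(hex(branch_target(0x8000, 0xFFFFFE)))
```

An immediate of 0xFFFFFE (that is, -2 words) branches back to the branch instruction itself, while the largest positive immediate, 0x7FFFFF, reaches just under 32 MB ahead.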

48
Examples
  • B label
  • BCC label ; branch if carry flag is clear
  • BEQ label ; branch if zero flag is set
  • MOV PC, #0 ; branch to location zero
  • BL func ; subroutine call
  • MOV PC, LR ; return
  • MOV LR, PC
  • LDR PC, =func

49
ARM ADR pseudo-op
  • Cannot refer to an address directly in an
    instruction.
  • Generate value by performing arithmetic on PC.
  • ADR pseudo-op generates instruction required to
    calculate address
  • ADR r1,FOO

50
Examples
  • start MOV r0, #10
  • ADR r4, start ; => SUB r4, pc, #0xc
  • start = pc - 4 - 8 = pc - 12 = pc - 0xc

51
Example C assignments
  • C
  • x = (a + b) - c;
  • Assembler
  • ADR r4,a ; get address for a
  • LDR r0,[r4] ; get value of a
  • ADR r4,b ; get address for b, reusing r4
  • LDR r1,[r4] ; get value of b
  • ADD r3,r0,r1 ; compute a+b
  • ADR r4,c ; get address for c
  • LDR r2,[r4] ; get value of c

52
C assignment, contd.
  • SUB r3,r3,r2 ; complete computation of x
  • ADR r4,x ; get address for x
  • STR r3,[r4] ; store value of x

53
Example C assignment
  • C
  • y = a*(b+c);
  • Assembler
  • ADR r4,b ; get address for b
  • LDR r0,[r4] ; get value of b
  • ADR r4,c ; get address for c
  • LDR r1,[r4] ; get value of c
  • ADD r2,r0,r1 ; compute partial result
  • ADR r4,a ; get address for a
  • LDR r0,[r4] ; get value of a

54
C assignment, contd.
  • MUL r2,r2,r0 ; compute final value for y
  • ADR r4,y ; get address for y
  • STR r2,[r4] ; store y

55
Example C assignment
  • C
  • z = (a << 2) | (b & 15);
  • Assembler
  • ADR r4,a ; get address for a
  • LDR r0,[r4] ; get value of a
  • MOV r0,r0,LSL #2 ; perform shift
  • ADR r4,b ; get address for b
  • LDR r1,[r4] ; get value of b
  • AND r1,r1,#15 ; perform AND
  • ORR r1,r0,r1 ; perform OR

56
C assignment, contd.
  • ADR r4,z ; get address for z
  • STR r1,[r4] ; store value for z

57
Example if statement
  • C
  • if (a < b) { x = 5; y = c + d; } else x = c - d;
  • Assembler
  • ; compute and test condition
  • ADR r4,a ; get address for a
  • LDR r0,[r4] ; get value of a
  • ADR r4,b ; get address for b
  • LDR r1,[r4] ; get value for b
  • CMP r0,r1 ; compare a < b
  • BGE fblock ; if a >= b, branch to false block

58
If statement, contd.
  • ; true block
  • MOV r0,#5 ; generate value for x
  • ADR r4,x ; get address for x
  • STR r0,[r4] ; store x
  • ADR r4,c ; get address for c
  • LDR r0,[r4] ; get value of c
  • ADR r4,d ; get address for d
  • LDR r1,[r4] ; get value of d
  • ADD r0,r0,r1 ; compute y
  • ADR r4,y ; get address for y
  • STR r0,[r4] ; store y
  • B after ; branch around false block

59
If statement, contd.
  • ; false block
  • fblock ADR r4,c ; get address for c
  • LDR r0,[r4] ; get value of c
  • ADR r4,d ; get address for d
  • LDR r1,[r4] ; get value for d
  • SUB r0,r0,r1 ; compute c-d
  • ADR r4,x ; get address for x
  • STR r0,[r4] ; store value of x
  • after ...

60
Example Conditional instruction implementation
  • ; true block
  • MOVLT r0,#5 ; generate value for x
  • ADRLT r4,x ; get address for x
  • STRLT r0,[r4] ; store x
  • ADRLT r4,c ; get address for c
  • LDRLT r0,[r4] ; get value of c
  • ADRLT r4,d ; get address for d
  • LDRLT r1,[r4] ; get value of d
  • ADDLT r0,r0,r1 ; compute y
  • ADRLT r4,y ; get address for y
  • STRLT r0,[r4] ; store y

61
Conditional instruction implementation, contd.
  • ; false block
  • ADRGE r4,c ; get address for c
  • LDRGE r0,[r4] ; get value of c
  • ADRGE r4,d ; get address for d
  • LDRGE r1,[r4] ; get value for d
  • SUBGE r0,r0,r1 ; compute c-d
  • ADRGE r4,x ; get address for x
  • STRGE r0,[r4] ; store value of x

62
Example FIR filter
  • C
  • for (i=0, f=0; i<N; i++)
  • f = f + c[i]*x[i];
  • Assembler
  • ; loop initiation code
  • MOV r0,#0 ; use r0 for i
  • MOV r8,#0 ; use separate index for arrays
  • ADR r2,N ; get address for N
  • LDR r1,[r2] ; get value of N
  • MOV r2,#0 ; use r2 for f

63
FIR filter, contd.
  • ADR r3,c ; load r3 with base of c
  • ADR r5,x ; load r5 with base of x
  • ; loop body
  • loop LDR r4,[r3,r8] ; get c[i]
  • LDR r6,[r5,r8] ; get x[i]
  • MUL r4,r4,r6 ; compute c[i]*x[i]
  • ADD r2,r2,r4 ; add into running sum
  • ADD r8,r8,#4 ; add one word offset to array
    index
  • ADD r0,r0,#1 ; add 1 to i
  • CMP r0,r1 ; exit?
  • BLT loop ; if i < N, continue
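
As a cross-check on the assembly, the same multiply-accumulate loop can be written at a high level. This is a minimal sketch: the function name `fir` and the sample coefficients are assumptions, and 32-bit wraparound of the MUL and ADD results is not modeled.

```python
def fir(c, x, n):
    """Accumulate f = sum of c[i] * x[i] for i in 0..n-1."""
    f = 0                      # r2 in the assembly
    for i in range(n):         # r0 counts i; r8 would be the byte offset 4*i
        f += c[i] * x[i]       # LDR, LDR, MUL, ADD in the loop body
    return f


coefficients = [1, 2, 3]       # hypothetical sample data
samples = [4, 5, 6]
print(fir(coefficients, samples, 3))
```

One difference to keep in mind: the assembly tests the loop condition at the bottom, so its body runs at least once even when N is 0, whereas this loop does not.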

64
Nested subroutine calls
  • Nesting/recursion requires coding convention
  • f1 LDR r0,[r13] ; load arg into r0 from stack
  • ; call f2()
  • STR r14,[r13]! ; store f1's return adrs
  • STR r0,[r13]! ; store arg to f2 on stack
  • BL f2 ; branch and link to f2
  • ; return from f1()
  • SUB r13,#4 ; pop f2's arg off stack
  • LDR r15,[r13]! ; restore register and return

65
Summary
  • Load/store architecture
  • Most instructions are RISCy, operate in single
    cycle.
  • Some multi-register operations take longer.
  • All instructions can be executed conditionally.

66
MPC850
  • Integrated Communication Microprocessor

67
Reference Manuals
  • MPC850 Family User Manual
  • PowerPC Programming Environment Manual
  • Course Home Page:
    http://calab.kaist.ac.kr/maeng/cs310/micro02.htm
  • Motorola Home Page:
  • http://e-www.motorola.com

68
Overview
  • Versatile, one-chip, integrated communication
    processor
  • Embedded PowerPC core
  • Versatile memory controller
  • Communication processor module (CPM)
  • Serial communication controllers (SCCs)
  • One USB
  • Etc.

70
Embedded PowerPC core
  • Single issue, 32-bit version
  • Branch folding and prediction
  • 2-Kbyte I-cache, 1-Kbyte D-cache
  • 2-way set-associative
  • Physical
  • MMUs with 8-entry TLBs
  • 4K, 16K, 256K, 512K, and 8MB page sizes

71
Other Features
  • Dynamic data bus sizing: 8-, 16-, 32-bit
  • CPU clock: 0-80 MHz
  • System Integration Unit (SIU)
  • Memory Controller
  • General Purpose timer
  • CPM, SCCs, SMCs, etc.

72
PowerPC Architecture
73
PowerPC instruction set
  • Overview
  • Operand Conventions
  • PowerPC Registers and programming model
  • Addressing Modes
  • Instruction Set
  • Cache model
  • Exception Model
  • Memory management model

74
PowerPC Architecture
  • Motorola, IBM, Apple Computer
  • Power Architecture: RS/6000 family
  • 64-bit architecture with a 32-bit subset
  • Three levels of the architecture
  • Flexibility: degrees of SW compatibility
  • UISA (User instruction set architecture)
  • VEA (Virtual environment architecture)
  • OEA (Operating environment architecture)

75
Features not defined by the PowerPC Architecture
  • For flexibility
  • System bus interface signals
  • Cache design
  • The number and the nature of execution units
  • Other internal micro-architecture issues

76
Endianness
  • Relationship between bit and byte/word ordering
    defines endianness

[Figure: two layouts of a 32-bit word (bit 31 down to bit 0). Little-endian (byte 3 | byte 2 | byte 1 | byte 0): ARM, Intel. Big-endian (byte 0 | byte 1 | byte 2 | byte 3): PowerPC, IBM, Motorola.]
77
Programming Model Registers
79
PowerPC programming model - Register Set
  • User Model UISA (32-bit architecture)

  • General-purpose registers: GPR0-GPR31 (32 bits each)
  • Floating-point registers: FGPR0-FGPR31 (64 bits each)
  • Condition register: CR (32 bits)
  • FP status and control register: FPSCR (32 bits)
  • XER register: XER (32 bits)
  • Link register: LR (64/32 bits)
  • Count register: CTR (64/32 bits)
80
Condition Registers (CR)
  • For testing and branching

[Figure: the 32-bit CR (bits 0-31) is divided into eight 4-bit fields, CR0 through CR7.]
  • CR0: for all integer instructions
  • Bit 0: Negative (LT)
  • Bit 1: Positive (GT)
  • Bit 2: Zero (EQ)
  • Bit 3: Summary Overflow (SO)
  • CR1: FP
  • CRn field: compare instructions
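
To make the CR0 bit assignment concrete, the sketch below derives the 4-bit field from a 32-bit integer result. The function name `cr0_field` is hypothetical, and passing SO in as a parameter is a simplification: in the processor it is copied from the XER.

```python
def cr0_field(result, so=0):
    """Return the CR0 bits as a 4-bit value in the order LT, GT, EQ, SO."""
    result &= 0xFFFFFFFF
    # Interpret the 32-bit result as a signed integer.
    signed = result - (1 << 32) if result & 0x80000000 else result
    lt = int(signed < 0)       # bit 0: negative
    gt = int(signed > 0)       # bit 1: positive
    eq = int(signed == 0)      # bit 2: zero
    # PowerPC numbers bit 0 as the most-significant bit of the field.
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)


print(format(cr0_field(5), "04b"))
print(format(cr0_field(0xFFFFFFFF), "04b"))
```

Exactly one of LT, GT, and EQ is set for any result, so a conditional branch can test a single bit of the field.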
81
XER Register (XER)
82
XER Register (XER), contd
83
Link Register (LR), Count Register (CTR)
bclrx (bc to link register) Branch with link
update
84
Counter Register
  • Loop count

85
VEA Register Set Time Base
86
OEA Register Set
87
Machine State Register (MSR)
90
Addressing Modes
  • Effective Address Calculation
  • Register indirect with immediate index mode
  • Register indirect with index mode
  • Register indirect mode
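
The three effective-address calculations can be sketched as follows, using the PowerPC convention that a base register field of 0 means the value zero rather than the contents of GPR0. The function names and the register contents in `EXAMPLE_GPR` are assumptions chosen for illustration.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical register file contents for the example.
EXAMPLE_GPR = [0] * 32
EXAMPLE_GPR[0] = 0xDEAD0000
EXAMPLE_GPR[3] = 0x1000
EXAMPLE_GPR[4] = 0x20


def sign_extend16(d):
    """Sign-extend the 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def base_value(gpr, ra):
    """(rA|0): a register field of 0 selects the constant 0."""
    return 0 if ra == 0 else gpr[ra]


def ea_immediate_index(gpr, ra, d):
    """Register indirect with immediate index: EA = (rA|0) + d."""
    return (base_value(gpr, ra) + sign_extend16(d)) & MASK32


def ea_index(gpr, ra, rb):
    """Register indirect with index: EA = (rA|0) + (rB)."""
    return (base_value(gpr, ra) + gpr[rb]) & MASK32


def ea_indirect(gpr, ra):
    """Register indirect: EA = (rA|0)."""
    return base_value(gpr, ra) & MASK32


print(hex(ea_immediate_index(EXAMPLE_GPR, 3, 0xFFFC)))
print(hex(ea_index(EXAMPLE_GPR, 3, 4)))
```

With rA = 0 and a displacement, the immediate-index form addresses a fixed location near address 0 or near the top of memory, regardless of what GPR0 holds.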

91
Register Indirect with Immediate Index Addressing
92
Register Indirect with Index
93
Register Indirect
94
Instruction Formats
  • 4 bytes long and word-aligned
  • Bits 0-5 always specify the primary opcode
  • Extended opcode

95
Instruction set
  • Integer
  • Floating-point
  • Load and store
  • Flow control
  • Processor control
  • Memory synchronization
  • Memory control
  • External control

96
Integer Instructions
  • Arithmetic, compare, logical, rotate and shift
  • Integer arithmetic, shift, rotate, and string
    move
  • May update or read values from the XER
  • The CR may be updated if the Rc bit is set.
  • addic - addic.

100
Integer Compare
  • Algebraically, logically
  • crfD can be omitted if the result is to be placed
    in CR0
  • crfD field the target CR
  • The L bit has no effect on 32-bit operations

101
Integer compare, contd
102
Integer Logical
103
Integer Logical, contd
104
Rotate and Shift Instructions
  • SH: specifies the number of bits to rotate
  • MB: mask start
  • ME: mask stop
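
How SH, MB, and ME combine can be illustrated with a model of a rotate-left-then-mask instruction such as `rlwinm`. This is a minimal sketch; the helper names are assumptions, and the mask follows PowerPC bit numbering, where bit 0 is the most-significant bit and the mask wraps around when MB is greater than ME.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    n %= 32
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def mask(mb, me):
    """Ones from bit mb through bit me (bit 0 = MSB), wrapping if mb > me."""
    from_mb = MASK32 >> mb                    # ones in bits mb..31
    up_to_me = (MASK32 << (31 - me)) & MASK32  # ones in bits 0..me
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)


def rlwinm(rs, sh, mb, me):
    """Rotate rs left by sh, then AND with the MB..ME mask."""
    return rotl32(rs, sh) & mask(mb, me)


# Extract the most-significant byte: rotate it to the bottom, keep bits 24..31.
print(hex(rlwinm(0x12345678, 8, 24, 31)))
```

Shifts and field extractions are special cases of this single operation, for example a left shift by 4 is a rotate by 4 with the mask covering bits 0 through 27.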

105
Integer Rotate
106
Integer Shift
107
Load and Store
  • Integer load and store
  • Integer load and store with byte-reverse
  • Integer load and store multiple
  • FP load and store
  • Memory synchronization

112
Branch and Flow Control
  • EA calculation
  • Branch relative
  • Branch conditional to relative address
  • Branch to absolute address
  • Branch conditional to absolute address
  • Branch conditional to link register
  • Branch conditional to count register

113
Branch Relative
114
Branch conditional to relative
115
Branch to Absolute
116
Branch conditional to absolute
117
Branch conditional to LR
118
Branch conditional to count register
119
Conditional Branch control
120
Branch Instructions
121
CR logical Instructions
122
Trap, System Linkage
123
Processor Control
125
Memory Synchronization
126
Example
  • Test and Set
  • loop: lwarx r5,0,r3 # load and reserve
  • cmpwi r5,0 # done if word
  • bne $+12 # not equal to 0
  • stwcx. r4,0,r3 # try to store non-zero
  • bne- loop # loop if lost reservation
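
The control flow of this sequence can be mimicked in software. The following is a toy single-threaded model, not a description of the hardware: the class `ReservedMemory`, its methods, and `atomic_test_and_set` are all hypothetical names, and a reservation is assumed to be lost whenever another store hits the reserved address.

```python
class ReservedMemory:
    """Toy word memory with a single load-reserve / store-conditional slot."""

    def __init__(self):
        self.words = {}
        self.reservation = None

    def lwarx(self, address):
        """Load a word and place a reservation on its address."""
        self.reservation = address
        return self.words.get(address, 0)

    def store(self, address, value):
        """Ordinary store (e.g. from another processor); kills a matching reservation."""
        self.words[address] = value
        if self.reservation == address:
            self.reservation = None

    def stwcx(self, address, value):
        """Store only if the reservation is still held; report success."""
        if self.reservation != address:
            return False
        self.words[address] = value
        self.reservation = None
        return True


def atomic_test_and_set(memory, address, new_value):
    """Return the old word; store new_value only if the old word was 0."""
    while True:
        old = memory.lwarx(address)          # lwarx: load and reserve
        if old != 0:                         # cmpwi / bne: already set, done
            return old
        if memory.stwcx(address, new_value): # stwcx.: try to store non-zero
            return 0
        # bne- loop: reservation was lost, try again


lock_memory = ReservedMemory()
first = atomic_test_and_set(lock_memory, 0x100, 1)
second = atomic_test_and_set(lock_memory, 0x100, 1)
print(first, second)
```

The first caller sees 0 and sets the word; the second caller sees the non-zero value and leaves memory unchanged, which is the behavior a lock acquisition needs.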

127
Summary
  • UISA, VEA, OEA
  • Register set
  • Fixed size instruction - RISC
  • Load and store architecture
  • 3 addressing modes
  • Condition register update: Rc field
  • 8 condition register fields (CR0-CR7)
  • Branch addressing modes
  • BO, BI fields
  • Relative, absolute, LR, CTR