Assembly Process - PowerPoint PPT Presentation

About This Presentation
Title:

Assembly Process

Description:

In machine code, the target address in a branch must be specified as an offset ... word addresses are multiples of 4 there is no need to store the last two bits ... – PowerPoint PPT presentation

Slides: 19
Provided by: csU77
Learn more at: http://www.cs.uwm.edu
Transcript and Presenter's Notes

1
Assembly Process
2
Machine Code Generation
  • Assembling a program entails translating the
    assembly language into binary machine code
  • This requires more than simply mapping assembly
    instructions to machine instructions
  • Each instruction is bound to an address
  • Labels are bound to addresses
  • Assembly instructions which refer to labels
    generate machine instructions which contain the
    label's address
  • Pseudo-instructions are translated into one or
    more machine instructions

3
Instruction Format
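addi $13, $7, 50   (immediate format)

    opcode    rs       rt       immediate operand
    6 bits    5 bits   5 bits   16 bits
    001000    00111    01101    0000 0000 0011 0010

add $13, $7, $8   (register format)

    opcode    rs       rt       rd       shamt    extended opcode
    6 bits    5 bits   5 bits   5 bits   5 bits   6 bits
    000000    00111    01000    01101    00000    100000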
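The two layouts above can be reproduced with a few shifts. This is a minimal sketch, not part of the original slides; the function names `encode_i_type` and `encode_r_type` are made up for illustration.

```python
def encode_i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack an immediate-format word: opcode(6) | rs(5) | rt(5) | immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r_type(rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack a register-format word: opcode 0 | rs | rt | rd | shamt | extended opcode."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# addi $13, $7, 50: opcode 001000, source register 7, destination register 13
addi_word = encode_i_type(0b001000, 7, 13, 50)
# add $13, $7, $8: opcode 000000, extended opcode 100000
add_word = encode_r_type(7, 8, 13, 0, 0b100000)

print(f"{addi_word:032b}")
print(f"{add_word:032b}")
```

Printing the words in binary makes it easy to compare each field against the bit strings on the slide.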
4
The symbol table
  • The assembler scans the source code and generates
    the appropriate bit string for each line
    encountered
  • The assembler must remember
  • what memory locations have been allocated
  • to which address each label is bound
  • A symbol table is a list of (label, address)
    pairs
  • When the data and text segments have been
    generated, they are stored as an executable file
  • The file is used by a program called the loader
    to initialize memory to the appropriate state
    before execution

5
Instructions
  • The .text directive tells the assembler that the
    lines which follow are instructions.
  • By default, the text segment starts at 0x00400000
  • In some cases, a label may not have an assigned
    address yet when the assembler scans a line
    that refers to it (a forward reference)
  • A second pass through the code can update
    instructions containing unresolved labels
  • Maintain a list of addresses in which each
    unresolved label appears
  • When the label is added to the symbol table,
    all locations in the corresponding list are
    updated to hold the address associated with the
    label
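The list-based approach to unresolved labels can be sketched as follows. This is an illustrative toy, not the slides' assembler: the input format (a list of `("label", name)` and `("instr", referenced_label)` pairs) and the name `resolve_labels` are assumptions, and the pending list stores instruction indices rather than raw addresses.

```python
TEXT_BASE = 0x00400000  # default start of the text segment


def resolve_labels(program):
    """Return (symbol_table, operands); operands[i] is the address that
    instruction i refers to, or None if it refers to no label."""
    symbols = {}   # label -> address
    pending = {}   # unresolved label -> indices of instructions waiting for it
    operands = []  # one slot per instruction
    address = TEXT_BASE
    for kind, name in program:
        if kind == "label":
            symbols[name] = address
            # Update every location that was waiting for this label.
            for index in pending.pop(name, []):
                operands[index] = address
            continue
        if name is None:
            operands.append(None)
        elif name in symbols:
            operands.append(symbols[name])
        else:
            pending.setdefault(name, []).append(len(operands))
            operands.append(None)  # placeholder until the label is defined
        address += 4  # each instruction occupies one word
    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return symbols, operands


program = [
    ("instr", "done"),   # forward reference
    ("label", "loop"),
    ("instr", None),
    ("instr", "loop"),   # backward reference
    ("label", "done"),
    ("instr", None),
]
symbols, operands = resolve_labels(program)
```

The forward reference to `done` is filled in only when the label is finally seen, which avoids a full second pass over the code.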

6
Branch offset in the MIPS R2000
  • In machine code, the target address in a branch
    must be specified as an offset from the address
    of the branch.
  • During execution, this offset is simply added to
    the program counter to fetch the next instruction
  • PC contains the address
  • Offset is measured in words, not bytes
  • PC_NEW = offset × 4 + PC_OLD
  • To calculate the offset, the assembler uses the
    formula
  • offset = ((target instruction address) -
    (branch instruction address)) / 4
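A small sketch of the two formulas, assuming (as these slides do) that the offset is measured from the address of the branch itself. Real MIPS hardware adds the offset to the address of the instruction after the branch, so a hardware-accurate version would subtract one more word. The function names are hypothetical.

```python
def branch_offset(target: int, branch: int) -> int:
    """Word offset from the branch instruction to its target."""
    byte_distance = target - branch
    if byte_distance % 4 != 0:
        raise ValueError("instructions must sit on word boundaries")
    return byte_distance // 4


def next_pc(pc_old: int, offset: int) -> int:
    """PC_NEW = offset * 4 + PC_OLD."""
    return offset * 4 + pc_old


offset = branch_offset(target=0x00400008, branch=0x00400014)
pc_new = next_pc(0x00400014, offset)
```

Applying `next_pc` to the computed offset lands back on the target, which is a quick consistency check on the pair of formulas.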

7
Branch offset calculation
  • The offset is stored in the instruction as a word
    offset rather than a byte offset.
  • Instructions are only stored at word boundaries
  • For both target and branch instruction, the least
    two bits of the address are zero
  • An offset may be negative
  • This happens when the target instruction
    precedes the branch instruction
  • The offset is stored in the 16-bit immediate
    field
  • This means the branch can only jump about 2^15
    instructions before or after the current address
  • 2^15 instructions (words) = 2^17 bytes
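To make the 16-bit limit concrete, here is a hedged sketch of storing a signed word offset in the immediate field as a two's-complement value. The helper names `encode_offset16` and `decode_offset16` are illustrative only.

```python
def encode_offset16(offset: int) -> int:
    """Store a signed word offset in a 16-bit two's-complement field."""
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("branch target out of range")
    return offset & 0xFFFF


def decode_offset16(field: int) -> int:
    """Recover the signed word offset from the 16-bit field."""
    return field - 0x10000 if field & 0x8000 else field


max_words = 1 << 15        # about 2^15 instructions in either direction
max_bytes = max_words * 4  # 2^17 bytes
```

For example, `encode_offset16(-3)` gives `0xFFFD`, and `decode_offset16(0xFFFD)` gives `-3` back.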

8
Branch offset calculation
  • An entry in the SPIM instruction list

[0x00400068]  0x1440ffe6  bne $2, $0, -104 [__start-0x00400068]  ; 44: bnez $v0, __start

    instruction address:          0x00400068
    machine code:                 0x1440ffe6
    line number in source file:   44
    original assembly code:       bnez $v0, __start
    offset in bytes:              __start - 0x00400068
                                  = 0x00400000 - 0x00400068 = -104
                                  (the offset calculation, in bytes,
                                  ignores the PC increment)
    stored offset:                0xffe6 = -26 = -104/4
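The SPIM entry can be checked by pulling the fields back out of the machine word. This is a sketch under the same assumption as the slide (the offset is applied to the branch's own address, ignoring the PC increment); `decode_branch` is a hypothetical helper.

```python
def decode_branch(word: int):
    """Split a branch word into (opcode, rs, rt, signed word offset)."""
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    field = word & 0xFFFF
    offset = field - 0x10000 if field & 0x8000 else field
    return opcode, rs, rt, offset


opcode, rs, rt, offset = decode_branch(0x1440FFE6)
byte_offset = offset * 4
target = 0x00400068 + byte_offset  # should be the address of __start
```

The decoded registers are $2 and $0, matching the `bne $2, $0, -104` shown in the listing.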
9
Jump target calculation
  • The jump instruction has two forms
  • Pseudo-direct, for j and jal
  • Register direct for jr and jalr
  • jr and jalr specify a register containing the
    address to be loaded into the PC
  • j and jal specify most of the address of the
    target within the instruction.
  • However, they have a range of at most
    one-sixteenth of the memory space

[Figure: the address space divided into 16 regions, labeled by high-order hex digit from f down to 0]
10
Jump target calculation
  • The target address is a 32 bit quantity
  • Since all word addresses are multiples of 4 there
    is no need to store the last two bits
  • The jump instruction format has 26 bits for the
    target address
  • The remaining 6 bits of the instruction are used
    for the opcode
  • The highest-order 4 bits of the target are taken
    from the address currently stored in the program
    counter
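A minimal sketch of the pseudo-direct scheme described above, using the standard `j` opcode (000010). The function names are illustrative, and the program counter value is passed in by the caller.

```python
J_OPCODE = 0b000010  # opcode of the j instruction


def encode_jump(target: int) -> int:
    """Build a j word: 6-bit opcode plus the target with its last two bits dropped."""
    if target % 4 != 0:
        raise ValueError("jump target must be word aligned")
    return (J_OPCODE << 26) | ((target >> 2) & 0x03FFFFFF)


def jump_target(word: int, pc: int) -> int:
    """Rebuild the 32-bit target: top 4 bits from the PC, 26 bits from the
    instruction, and two zero bits at the bottom."""
    field = word & 0x03FFFFFF
    return (pc & 0xF0000000) | (field << 2)


word = encode_jump(0x00400008)
resolved = jump_target(word, pc=0x00400014)
```

Because the top four bits come from the PC, the same word resolves to a different address if the jump sits in a different 2^28-byte region.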

11
Jump Target Calculation
  • Jump instructions have a range of 2^26 words, or
    2^26 × 2^2 = 2^28 bytes
  • This range is NOT symmetric about the jump
    instruction

[Figure: address space regions f through 0, with a jump at 0x80000080 whose reach is +0x0fffff7c forward and -0x00000080 backward]
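The asymmetry follows from the fact that a jump can only reach addresses that share its top four bits. The sketch below computes how far a jump can reach in each direction; `jump_reach` is a hypothetical helper and the addresses are the ones from the figure.

```python
def jump_reach(pc: int):
    """Return (backward, forward) reach in bytes for a jump located at pc."""
    region_start = pc & 0xF0000000
    region_end = region_start + 0x0FFFFFFC  # last word in the same region
    return region_start - pc, region_end - pc


backward, forward = jump_reach(0x80000080)
print(hex(backward), hex(forward))
```

A jump near the start of its region can therefore go a long way forward but only a short way back.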
12
Program relocation
  • It is possible that program modules are developed
    separately by individual programmers. When these
    programs are to be loaded into memory they should
    not be assigned overlapping memory space.
  • Thus, the modules have to be relocated
  • Relative addresses are relocatable
  • Any absolute references must be "fixed" by the
    loader
  • Use a logical base address known at load time
  • Absolute addresses are stored as offsets from
    this TBD base
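One way to picture the loader's fix-up step is shown below. This is a simplified sketch with an invented module format, not a real object-file layout: absolute addresses are stored as offsets from a base of zero, and a list of fix-up positions says which words hold them.

```python
def relocate(words, fixups, base):
    """Return a copy of the module with every absolute reference rebased.

    words  -- 32-bit values assembled as if the module started at address 0
    fixups -- indices of the words that hold absolute addresses
    base   -- load address chosen at load time
    """
    loaded = list(words)
    for index in fixups:
        loaded[index] = (loaded[index] + base) & 0xFFFFFFFF
    return loaded


module = [0x00000010, 0x12345678, 0x00000004]
loaded = relocate(module, fixups=[0, 2], base=0x10000000)
```

Words that are not listed as fix-ups, such as relative offsets or plain data, are copied unchanged.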

13
From source to executable
high-level source code -> compiler -> asm
asm -> assembler -> obj   (one obj per asm module)
obj + obj + lib -> linker -> exe
exe -> loader -> memory
14
Some examples of assembling code
            .data
    a1:     .word 3
    a2:     .word 16, 16, 16, 16
    a3:     .word 5
            .text
    __start:
            la   $6, a2
    loop:
            lw   $7, 4($6)
            mul  $9, $10, $7
            b    loop
            li   $v0, 10
            syscall

15
Some examples of assembling code
  • Symbol Table

    symbol      address
    a1          1000 0000
    a2          1000 0004
    a3          1000 0014
    __start     0040 0000
    loop        0040 0008

  • Memory map of data section

    address      contents
    1000 0000    0000 0003
    1000 0004    0000 0010
    1000 0008    0000 0010
    1000 000c    0000 0010
    1000 0010    0000 0010
    1000 0014    0000 0005
  • Source code

            .data
    a1:     .word 3
    a2:     .word 16, 16, 16, 16
    a3:     .word 5
            .text
    __start:
            la   $6, a2
    loop:
            lw   $7, 4($6)
            mul  $9, $10, $7
            b    loop
            li   $v0, 10
            syscall
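The symbol table on this slide can be reproduced by walking the source with a location counter for each segment. This is a sketch with an invented input format; it assumes a data segment starting at 0x10000000 (as in the table), and that `la` and `mul` each expand to two machine instructions while everything else takes one word.

```python
DATA_BASE = 0x10000000
TEXT_BASE = 0x00400000

# Words occupied after pseudo-instruction expansion (assumed; default is 1).
SIZE_IN_WORDS = {"la": 2, "mul": 2}


def build_symbol_table(data, text):
    """data: list of (label, values); text: list of (label or None, mnemonic)."""
    symbols = {}
    address = DATA_BASE
    for label, values in data:
        symbols[label] = address
        address += 4 * len(values)  # one word per .word value
    address = TEXT_BASE
    for label, mnemonic in text:
        if label is not None:
            symbols[label] = address
        address += 4 * SIZE_IN_WORDS.get(mnemonic, 1)
    return symbols


data = [("a1", [3]), ("a2", [16, 16, 16, 16]), ("a3", [5])]
text = [("__start", "la"), ("loop", "lw"), (None, "mul"),
        (None, "b"), (None, "li"), (None, "syscall")]
symbols = build_symbol_table(data, text)
```

Note that `loop` lands at 0x00400008 rather than 0x00400004 because the preceding `la` occupies two words.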

16
Translate pseudo-instructions
  • Original code, with pseudo-instructions

            la   $6, a2
    loop:
            lw   $7, 4($6)
            mul  $9, $10, $7
            b    loop
            li   $v0, 10
            syscall

  • After translation

            lui  $6, 0x1000
            ori  $6, $6, 0x0004
    loop:
            lw   $7, 4($6)
            mult $10, $7
            mflo $9
            b    loop
            ori  $v0, $0, 10
            syscall
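The translation on this slide can be sketched as a small rewriting function. It covers only the three expansions shown here (`la`, `mul`, and `li` with a value that fits in 16 bits); the function name `expand` and the tuple representation of instructions are assumptions made for illustration.

```python
def expand(mnemonic, operands, symbols):
    """Rewrite one source instruction into a list of machine-level instructions."""
    if mnemonic == "la":
        reg, label = operands
        address = symbols[label]
        return [("lui", [reg, address >> 16]),
                ("ori", [reg, reg, address & 0xFFFF])]
    if mnemonic == "mul":
        rd, rs, rt = operands
        return [("mult", [rs, rt]), ("mflo", [rd])]
    if mnemonic == "li":
        reg, value = operands
        if not 0 <= value <= 0xFFFF:
            raise NotImplementedError("only 16-bit unsigned values are handled")
        return [("ori", [reg, "$0", value])]
    return [(mnemonic, operands)]  # not a pseudo-instruction in this sketch


symbols = {"a2": 0x10000004}
source = [("la", ["$6", "a2"]), ("lw", ["$7", "4($6)"]),
          ("mul", ["$9", "$10", "$7"]), ("b", ["loop"]),
          ("li", ["$v0", 10]), ("syscall", [])]
expanded = [instr for mnemonic, operands in source
            for instr in expand(mnemonic, operands, symbols)]
```

Six source instructions become eight, which is why the instruction addresses after `la` shift by one word.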

17
Translate to machine code
            lui  $6, 0x1000
            ori  $6, $6, 0x0004
    loop:
            lw   $7, 4($6)
            mult $10, $7
            mflo $9
            b    loop
            ori  $v0, $0, 10
            syscall

    address     contents
    00400000    3c06 1000    (lui)
    00400004    34c6 0004    (ori)
    00400008    8cc7 0004    (lw)
    0040000c    0147 0018    (mult)
    00400010    0000 4812    (mflo)
    00400014    1000 xxxx    (beq)
    00400018    3402 000a    (ori)
    0040001c    0000 000c    (syscall)
18
Resolve relative references
            lui  $6, 0x1000
            ori  $6, $6, 0x0004
    loop:
            lw   $7, 4($6)
            mult $10, $7
            mflo $9
            b    loop
            ori  $v0, $0, 10
            syscall

    address     contents
    00400000    3c06 1000
    00400004    34c6 0004
    00400008    8cc7 0004
    0040000c    0147 0018
    00400010    0000 4812
    00400014    1000 fffd    (-3)
    00400018    3402 000a
    0040001c    0000 000c

    (0x400008 - 0x400014)/4 = -12/4 = -3 = 0xfffd
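The final step, filling in the `xxxx` field of the branch, can be sketched as below. As in the slides, the offset is taken relative to the branch's own address; `patch_branch` is a hypothetical helper.

```python
def patch_branch(word: int, branch_address: int, target_address: int) -> int:
    """Fill the low 16 bits of a branch word with the word offset to its target."""
    offset = (target_address - branch_address) // 4
    if not -(1 << 15) <= offset < (1 << 15):
        raise ValueError("branch target out of range")
    return (word & 0xFFFF0000) | (offset & 0xFFFF)


# beq $0, $0 at 0x00400014 branching back to loop at 0x00400008
patched = patch_branch(0x10000000, 0x00400014, 0x00400008)
print(f"{patched:08x}")
```

The upper half of the word (opcode and registers) is left untouched, so only the relative reference changes.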