Crosscutting Issues: The Rôle of Compilers

Slides: 26
Provided by: csgw
2
Crosscutting Issues: The Rôle of Compilers
  • Architects must be aware of current compiler
    technology

3
Modern Compilers
  • E.g. procedure inlining, loop transformations
  • Register allocation
  • Machine-dependent optimisations
4
Compiler Technology
  • Multiple passes complicate matters
  • E.g. common subexpression elimination must assume
    that a register will be allocated for the
    temporary value
  • E.g. procedure inlining must be decided before the
    final code size is known
  • Register allocation is critical
  • Uses graph colouring techniques
  • Requires at least 16 registers to be effective
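
The graph colouring idea above can be sketched in a few lines of Python. This is a minimal greedy colouring under stated assumptions, not the allocator of any particular compiler: the function name colour_registers, the dictionary format of the interference graph, and the use of None to mark a spilled variable are all hypothetical choices made for illustration.

```python
def colour_registers(interference, num_registers):
    """Greedy colouring of an interference graph.

    interference maps each variable to the set of variables that are
    live at the same time (hypothetical input format).
    Returns a dict variable -> register number, or None if spilled.
    """
    assignment = {}
    # Colour the most constrained variables (highest degree) first.
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        # Registers already used by neighbours that have been coloured.
        taken = {assignment[n] for n in interference[var]
                 if assignment.get(n) is not None}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[var] = free[0] if free else None  # None means spill
    return assignment


# Hypothetical example: a, b and c are live together; d overlaps only c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(colour_registers(graph, 3))  # every variable gets a register
print(colour_registers(graph, 2))  # one variable has to be spilled
```

With fewer registers than the graph needs, some variable is spilled to memory, which is why the number of available registers matters so much to the allocator.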

5
Architectural Issues
  • How are variables allocated and addressed?
  • Stack: local variables, scalars
  • Global data area: global variables, constants,
    arrays
  • Heap: dynamic objects, not scalars
  • How many registers are needed?
  • Integer: 26 registers
  • FP: 20 registers

6
Aiding Compiler Writers
  • Architectures should
  • Be regular (orthogonal instruction set)
  • Provide primitives, not solutions
  • Simplify trade-offs among alternatives
  • Not require run-time interpretation of data known
    at compile-time
  • VAX CALLS

Keep it simple!
7
Compiler Support for Multimedia Instructions
  • SIMD instructions act on multiple smaller data
    items in a large word
  • Solutions, not primitives!
  • Too few registers!
  • Data types not found in programming languages!

Result: Only used by low-level graphics libraries.
8
Multimedia Instructions
  • These SIMD instructions act like a mini-vector
    architecture
  • E.g. MMX: 64 bits
  • 8 × 8-bit elements
  • 4 × 16-bit elements
  • 2 × 32-bit elements
  • SSE: 128 bits
  • Much more limited than genuine vector processors
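
To make the "multiple smaller data items in a large word" idea concrete, here is a minimal Python sketch of a partitioned add on a 64-bit word holding eight 8-bit lanes. It only models the arithmetic and is not MMX code; the helper names pack, unpack and partitioned_add are hypothetical.

```python
LANES = 8                      # eight 8-bit elements in one 64-bit word
LANE_BITS = 8
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack eight 8-bit values into one 64-bit integer (lane 0 lowest)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 64-bit integer back into its eight 8-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def partitioned_add(x, y):
    """Add lane by lane; carries never cross a lane boundary."""
    return pack([(a + b) & LANE_MASK for a, b in zip(unpack(x), unpack(y))])


a = pack([1, 2, 3, 4, 5, 6, 7, 255])
b = pack([1] * 8)
print(unpack(partitioned_add(a, b)))  # [2, 3, 4, 5, 6, 7, 8, 0]
```

The last lane wraps from 255 to 0 without disturbing its neighbour, which is the behaviour an ordinary 64-bit add would not give.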

9
Putting It All Together: MIPS
  • 64-bit load/store design
  • RISC features
  • GPR, load-store architecture
  • Small, simple instruction set
  • Designed for efficient pipelining (fixed length
    instructions)
  • Efficient compiler target

10
MIPS
  • 32 64-bit integer registers
  • R0 to R31
  • R0 is fixed at 0
  • 32 64-bit or 32-bit floating point registers
  • Supports paired single operations
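
A register that always reads as zero is easy to model. The following Python sketch is only an illustration of the integer register file described above; the class name RegisterFile and its methods are hypothetical, and the move-via-add idiom shown is one common use of a zero register.

```python
MASK64 = (1 << 64) - 1


class RegisterFile:
    """32 integer registers of 64 bits; register 0 always reads as 0."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, n):
        return 0 if n == 0 else self._regs[n]

    def write(self, n, value):
        if n != 0:                 # writes to R0 are silently discarded
            self._regs[n] = value & MASK64


regs = RegisterFile()
regs.write(1, 42)
regs.write(0, 99)                            # has no effect
regs.write(2, regs.read(1) + regs.read(0))   # "move R2, R1" as an add with R0
print(regs.read(0), regs.read(2))            # 0 42
```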

11
MIPS Data Types
  • Integer
  • Bytes, 16-bit halfwords, 32-bit words, 64-bit
    double words
  • Operations are all 64-bit
  • Floating point
  • 32-bit and 64-bit

12
MIPS Addressing Modes
  • Only immediate and displacement
  • 16-bit displacements/immediates
  • Register-indirect: set the displacement to 0
  • 16-bit absolute: use R0 as the base register
  • Byte addressable with 64-bit addresses
  • Big-endian or little-endian
  • Alignment required
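
The way one displacement mode covers the other cases can be sketched as follows. This is a simplified Python model, assuming the 16-bit displacement is sign-extended before it is added to the base register; the function names and the register contents are hypothetical.

```python
MASK64 = (1 << 64) - 1  # addresses are 64 bits wide


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, displacement):
    """Displacement addressing: base register contents plus a 16-bit offset."""
    return (base + sign_extend16(displacement)) & MASK64


def is_aligned(address, size):
    """True if an access of size bytes is naturally aligned."""
    return address % size == 0


r0, r4 = 0, 0x1000  # hypothetical register contents; R0 always reads as 0

print(hex(effective_address(r4, 8)))       # displacement: 0x1008
print(hex(effective_address(r4, 0)))       # register-indirect: 0x1000
print(hex(effective_address(r0, 0x0100)))  # 16-bit absolute: 0x100
print(is_aligned(effective_address(r4, 8), 8))
```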

13
MIPS Instructions
  • Three instruction formats
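
The slide does not list the formats, so the sketch below assumes the classic MIPS encoding: 32-bit words with a 6-bit opcode, in R-type (register), I-type (immediate) and J-type (jump) layouts. The decode function is hypothetical and deliberately simplified: it treats opcode 0 as R-type, opcodes 2 and 3 as J-type, and everything else as I-type, which is not true of every real opcode.

```python
def decode(word):
    """Split a 32-bit instruction word into its fields (simplified)."""
    opcode = (word >> 26) & 0x3F
    if opcode == 0:                       # R-type: register operands
        return {"format": "R", "opcode": opcode,
                "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
                "rd": (word >> 11) & 0x1F, "shamt": (word >> 6) & 0x1F,
                "funct": word & 0x3F}
    if opcode in (2, 3):                  # J-type: jump, jump-and-link
        return {"format": "J", "opcode": opcode, "target": word & 0x3FFFFFF}
    return {"format": "I", "opcode": opcode,  # I-type: 16-bit immediate
            "rs": (word >> 21) & 0x1F, "rt": (word >> 16) & 0x1F,
            "imm": word & 0xFFFF}


# ADD R3, R1, R2: opcode 0, funct 0x20
add_word = (1 << 21) | (2 << 16) | (3 << 11) | 0x20
# LW R2, 8(R1): opcode 0x23, base register in rs, destination in rt
lw_word = (0x23 << 26) | (1 << 21) | (2 << 16) | 8

print(decode(add_word))
print(decode(lw_word))
```

Because every field sits at a fixed position in a fixed-length word, decoding reduces to shifts and masks, which is part of what makes the design easy to pipeline.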

14
MIPS Operations
  • Load-store
  • ALU operations
  • Add, subtract, multiply, divide, and, or, xor,
    LUI (load upper immediate), shifts
  • Control transfer
  • Set conditions
  • Branch (reg = 0, reg ≠ 0, reg1 = reg2, reg1 ≠ reg2),
    jump, jump-and-link (call)
  • Conditional move
  • Floating point
  • Paired single operations
  • Multiply-add (DSP)

15
MIPS Instruction Usage
  • Integer applications
  • Load, add, branch, store, or, compare
  • FP applications
  • Add (int), load (int), load, multiply, add, store

Figure 2.34.
16
Another View: Trimedia Media Processor
  • Embedded processor for multimedia applications
  • E.g. set-top boxes (decoders, etc.) and TVs
  • Very different architecture
  • 128 32-bit registers (FP or int)
  • Partitioned (SIMD) instructions
  • 2's complement and saturating arithmetic
  • VLIW architecture
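
The difference between 2's complement and saturating arithmetic is easiest to see on a small word size. The following Python sketch uses signed 8-bit values purely for illustration; the function names are hypothetical and nothing here is specific to the Trimedia instruction set.

```python
def wrap_add8(a, b):
    """2's complement add on signed 8-bit values: overflow wraps around."""
    s = (a + b) & 0xFF
    return s - 256 if s >= 128 else s


def sat_add8(a, b):
    """Saturating add on signed 8-bit values: overflow clamps to the limit."""
    return max(-128, min(127, a + b))


print(wrap_add8(100, 100))  # -56
print(sat_add8(100, 100))   # 127
```

For media data such as pixel or audio samples, clamping at the limit is usually the less damaging outcome, which is why media processors offer it in hardware.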

17
Trimedia VLIW Approach
  • Compiler can group up to five instructions for
    simultaneous execution
  • Must be independent
  • Use NOPs if there are insufficient independent
    instructions
  • Large program size
  • Trimedia uses memory compression
  • Programs are 2-3 times larger than MIPS (even
    with compression)!
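
The effect of NOP padding on program size can be sketched with a toy bundler. This Python example is a simplified in-order packer, not the Trimedia compiler's scheduler: the tuple format (name, destination, sources), the function pack_bundles, and the rule that an instruction may not share a bundle with an earlier instruction that writes a register it reads or writes are all assumptions made for illustration.

```python
SLOTS = 5                      # up to five instructions per long word
NOP = ("nop", None, ())


def pack_bundles(instructions):
    """Pack instructions, in order, into bundles of SLOTS independent slots."""
    bundles, current, written = [], [], set()
    for instr in instructions:
        _, dest, sources = instr
        # Dependent if it reads or rewrites a register written in this bundle.
        depends = dest in written or any(s in written for s in sources)
        if depends or len(current) == SLOTS:
            bundles.append(current + [NOP] * (SLOTS - len(current)))
            current, written = [], set()
        current.append(instr)
        if dest is not None:
            written.add(dest)
    if current:
        bundles.append(current + [NOP] * (SLOTS - len(current)))
    return bundles


# Hypothetical program: the third and fourth instructions depend on earlier ones.
program = [
    ("add", "r1", ("r2", "r3")),
    ("sub", "r4", ("r5", "r6")),
    ("mul", "r7", ("r1", "r4")),
    ("add", "r8", ("r7", "r2")),
]
bundles = pack_bundles(program)
nops = sum(slot is NOP for bundle in bundles for slot in bundle)
print(len(bundles), "bundles,", nops, "NOPs")  # 3 bundles, 11 NOPs
```

Four useful instructions occupy fifteen slots here, which illustrates why VLIW code grows and why compression of the instruction stream becomes attractive.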

18
Fallacies and Pitfalls
  • Pitfall: Designing a high-level instruction set
    to support HLLs
  • Seldom provide an exact match
  • Often too general (VAX CALLS)

19
Fallacies and Pitfalls
  • Fallacy: There is such a thing as a typical
    program
  • Programs vary very significantly
  • Pitfall: Designing an architecture to reduce code
    size without considering compilers
  • Compilers have much greater impact on code size
  • Start with densest compiled code

20
Fallacies and Pitfalls
  • Pitfall: Expecting good compiled performance for
    DSPs
  • Hand-tuned assembler is faster and more compact
  • Fallacy: An architecture with flaws cannot be
    successful
  • 80x86!
  • Segments, accumulators, stack-based FP

21
Fallacies and Pitfalls
  • Fallacy: You can design a flawless architecture
  • All designs have trade-offs
  • VAX: code size treated as more important than
    easy decoding
  • Early RISCs: delayed branches
  • Address space

22
2.15. Concluding Remarks
  • 1960s: Stack architectures
  • Matched the compiler technology of the day
  • 1970s: CISC era
  • Tried to support HLL features in hardware
  • Today: RISC era
  • Simple, load-store architectures

23
Concluding Remarks
  • Trends in the 1990s
  • Move to 64 bits
  • Conditional instructions
  • Eliminating branches
  • Optimisation of cache access (prefetch
    instructions)
  • Support for multimedia
  • Faster floating point

24
The Future
  • Trend towards VLIW architectures
  • Increased use of conditional execution
  • Blending of general-purpose and DSP architectures
  • Emulating 80x86 architecture
