1
Computer Architecture, Lecture 4, 17th May, 2006
  • Abhinav Agarwal
  • Veeramani V.

2
Recap
  • Simple pipeline: hazards and solutions
  • Data hazards
  • Static compiler techniques: load delay slot, etc.
    (see the sketch after this list)
  • Hardware solutions: data forwarding,
    out-of-order execution, register renaming
  • Control hazards
  • Static compiler techniques
  • Hardware speculation through branch predictors
  • Structural hazards
  • Increase hardware resources
  • Superscalar out-of-order execution
  • Memory organisation
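
A minimal sketch in C of the load-use dependency that the compiler techniques above try to hide; the function and variable names are illustrative, not from the lecture. On a simple pipeline the loaded value is not ready in the very next cycle, so the compiler can schedule an independent instruction into that load delay slot instead of a stall.

    /* Hypothetical example: fill the load delay slot with independent work. */
    int load_then_use(const int *a, int scale)
    {
        int x = a[0];          /* load: result not ready in the next cycle  */
        int y = scale * 2;     /* independent work scheduled after the load */
        return x + y;          /* first use of x, one instruction later     */
    }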

3
Memory Organization in processors
  • Caches inside the chip
  • Faster, closer to the processor
  • Built from SRAM cells
  • They contain recently-used data
  • They contain data in blocks (illustrated in the sketch below)
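
A small sketch of why holding data in blocks pays off; the array size and function names are assumptions made for illustration. The row-major walk touches consecutive addresses, so every word of a fetched block is used before it is evicted; the column-major walk strides across blocks and misses far more often.

    #include <stddef.h>

    #define N 1024

    /* Consecutive addresses: each fetched cache block is fully reused. */
    long sum_row_major(const int m[N][N])
    {
        long s = 0;
        for (size_t i = 0; i < N; i++)
            for (size_t j = 0; j < N; j++)
                s += m[i][j];
        return s;
    }

    /* Stride of N ints between accesses: each access may touch a new block. */
    long sum_col_major(const int m[N][N])
    {
        long s = 0;
        for (size_t j = 0; j < N; j++)
            for (size_t i = 0; i < N; i++)
                s += m[i][j];
        return s;
    }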

4
Rationale behind caches
  • Principle of spatial locality
  • Principle of temporal locality
  • Replacement policy (LRU, LFU, etc.), sketched after this list
  • Principle of inclusivity
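
A minimal sketch of LRU replacement in a tiny 2-way set-associative cache; the block size, set count, and field names are illustrative assumptions, not parameters given in the lecture.

    #include <stdbool.h>
    #include <stdint.h>

    #define SETS       64
    #define BLOCK_BITS 6              /* 64-byte blocks */
    #define SET_BITS   6              /* log2(SETS)     */

    struct way { bool valid; uint64_t tag; };
    struct set { struct way way[2]; int lru; };  /* lru = index of LRU way */

    static struct set cache[SETS];

    /* Returns true on a hit; on a miss, evicts the least-recently-used way. */
    bool cache_access(uint64_t addr)
    {
        uint64_t set_idx = (addr >> BLOCK_BITS) & (SETS - 1);
        uint64_t tag     = addr >> (BLOCK_BITS + SET_BITS);
        struct set *s    = &cache[set_idx];

        for (int w = 0; w < 2; w++) {
            if (s->way[w].valid && s->way[w].tag == tag) {
                s->lru = 1 - w;       /* the other way becomes LRU */
                return true;
            }
        }
        int victim = s->lru;          /* miss: replace the LRU way */
        s->way[victim].valid = true;
        s->way[victim].tag   = tag;
        s->lru = 1 - victim;
        return false;
    }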

5
Outline
  • Instruction Level Parallelism
  • Thread-level Parallelism
  • Fine-Grain multithreading
  • Simultaneous multithreading
  • Sharable vs. non-sharable resources
  • Chip Multiprocessor
  • Some design issues

6
Instruction Level Parallelism
  • Overlap execution of many instructions
  • ILP techniques try to reduce data and control
    dependencies
  • Issue independent instructions out of order (see the sketch below)
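
A tiny C sketch of what the hardware exploits; the function names are made up for illustration. In the first function the two multiplies are independent and an out-of-order core can issue them together; in the second, each operation waits on the previous result, so there is little parallelism to extract.

    /* Two independent dependence chains: x and y can execute in parallel. */
    int two_chains(int a, int b, int c, int d)
    {
        int x = a * b;
        int y = c * d;        /* no dependence on x */
        return x + y;
    }

    /* One serial chain: every operation depends on the one before it. */
    int one_chain(int a, int b, int c, int d)
    {
        int x = a * b;
        int y = x * c;        /* depends on x */
        return y * d;         /* depends on y */
    }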

7
Thread Level Parallelism
  • Instructions from two different threads are largely
    independent of each other
  • Better utilization of functional units
  • Multi-threaded performance improves drastically (see the sketch below)
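
A minimal pthreads sketch of the idea, with placeholder work: the two threads accumulate into private counters, so their instruction streams have no dependences between them and can keep separate functional units busy.

    #include <pthread.h>
    #include <stdio.h>

    /* Placeholder work: each thread accumulates into its own counter. */
    static void *worker(void *arg)
    {
        long *sum = arg;
        for (long i = 0; i < 1000000; i++)
            *sum += i;
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        long s1 = 0, s2 = 0;

        /* Instructions from the two threads are independent of each other. */
        pthread_create(&t1, NULL, worker, &s1);
        pthread_create(&t2, NULL, worker, &s2);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        printf("%ld %ld\n", s1, s2);
        return 0;
    }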

8
A simple pipeline
Source: EV8 DEC Alpha Processor, (c) Intel
9
Superscalar pipeline
Source: EV8 DEC Alpha Processor, (c) Intel
10
Speculative execution
Source: EV8 DEC Alpha Processor, (c) Intel
11
Fine Grained Multithreading
Source: EV8 DEC Alpha Processor, (c) Intel
12
Simultaneous Multithreading
Source: EV8 DEC Alpha Processor, (c) Intel
13
Out of Order Execution
Source: EV8 DEC Alpha Processor, (c) Intel
14
SMT pipeline
Source: EV8 DEC Alpha Processor, (c) Intel
15
Resources: replication required
  • Program counters
  • Register maps

16
Replication not required
  • Register file (rename space)
  • Instruction queue
  • Branch predictor
  • First- and second-level caches, etc. (see the sketch after this list)
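
A rough C sketch of the split these two slides describe: state replicated per hardware thread versus structures shared by all threads on the core. The struct names, register count, and thread count are illustrative assumptions, not EV8 details.

    #include <stdint.h>

    #define ARCH_REGS  32
    #define HW_THREADS 4

    /* Must exist once per hardware thread. */
    struct thread_context {
        uint64_t pc;                   /* program counter                        */
        uint16_t reg_map[ARCH_REGS];   /* architectural -> physical register map */
    };

    /* Can be shared by every thread on the core. */
    struct smt_core {
        struct thread_context ctx[HW_THREADS];
        /* shared: physical (rename) register file, instruction queue,
           branch predictor, first- and second-level caches, functional units */
    };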

17
Chip multiprocessor
  • The number of transistors keeps going up
  • Put more than one core on the chip
  • The cores still share the caches

18
Some design issues
  • Trade-off in choosing the cache size
  • Power and performance
  • Super-pipelining trade-off
  • Higher clock frequency vs. speculation penalty and power
    (see the relation after this list)
  • Power consumption
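
One way to see the super-pipelining trade-off is the usual effective-CPI approximation (a textbook relation; the symbol names here are chosen for illustration):

    \mathrm{CPI}_{\mathrm{eff}} = \mathrm{CPI}_{\mathrm{base}} + f_{\mathrm{branch}} \cdot r_{\mathrm{mispredict}} \cdot P_{\mathrm{flush}}

A deeper pipeline raises the clock frequency, but the flush penalty P_flush grows roughly with the number of front-end stages, so each branch misprediction costs more cycles and more wasted speculative power.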

19
Novel techniques for power
  • Clock gating
  • Run non-critical elements at a slower clock
  • Reduce voltage swings (operating voltage)
  • Sleep mode / standby mode
  • Dynamic voltage and frequency scaling (DVFS); see the power relation below
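
The reason operating voltage matters so much is the standard approximation for dynamic CMOS power (a general relation, not a figure from the lecture):

    P_{\mathrm{dynamic}} \approx \alpha \, C \, V_{dd}^{2} \, f

where \alpha is the activity factor, C the switched capacitance, V_{dd} the supply voltage, and f the clock frequency. Lowering V_{dd} gives a quadratic saving, and DVFS lowers both V_{dd} and f when peak performance is not needed.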