CS184b: Computer Architecture (Abstractions and Optimizations)
1
CS184b: Computer Architecture (Abstractions and Optimizations)
  • Day 12: April 27, 2005
  • Caching Introduction

2
Today
  • Memory System
  • Issue
  • Structure
  • Idea
  • Cache Basics

3
Memory and Processors
  • Memory used to compactly store
  • state of computation
  • description of computation (instructions)
  • Memory access latency impacts performance
  • timing on load, store
  • timing on instruction fetch

4
Issues
  • Need big memories
  • hold large programs (many instructions)
  • hold large amounts of state
  • Big memories are slow
  • Memory takes up area
  • want dense memories
  • densest memories not fast
  • fast memories not dense
  • Memory capacity needed does not fit on a die
  • inter-die communication is slow

5
Problem
  • Desire to contain problem
  • implies large memory
  • Large memory
  • implies slow memory access
  • Programs need frequent memory access
  • e.g. 20% load operations
  • fetch required for every instruction
  • Memory is the performance bottleneck?
  • Programs run slowly?

6
Opportunity
  • Architecture mantra
  • exploit structure in typical problems
  • What structure exists?

7
Memory Locality
  • What percentage of accesses are to unique addresses
  • addresses distinct from the last N unique
    addresses
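The question on this slide is measurable: run over an address trace and count how often an access repeats one of the last N unique addresses. A minimal sketch, assuming the trace is simply a Python list of addresses (the function name and example trace are illustrative, not from the course):

```python
from collections import OrderedDict

def reuse_fraction(trace, n):
    """Fraction of accesses that repeat one of the last n unique addresses."""
    recent = OrderedDict()                  # recent unique addresses, oldest first
    hits = 0
    for addr in trace:
        if addr in recent:
            hits += 1
            recent.move_to_end(addr)        # refresh: most recently used again
        else:
            recent[addr] = None
            if len(recent) > n:
                recent.popitem(last=False)  # forget the oldest unique address
    return hits / len(trace)

print(reuse_fraction([0x10, 0x20, 0x10, 0x30, 0x20, 0x10], 2))
```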

8
Hierarchy/Structure Summary
from CS184a
  • Memory Hierarchy arises from area/bandwidth
    tradeoffs
  • Smaller/cheaper to store words/blocks
  • (saves routing and control)
  • Smaller/cheaper to handle long retiming in larger
    arrays (reduce interconnect)
  • High bandwidth out of registers/shallow memories

9
From AlphaSort: A Cache-Sensitive Parallel External Sort, ACM SIGMOD '94 Proceedings / VLDB Journal 4(4): 603-627 (1995).
10
Opportunity
  • Small memories are fast
  • Access to memory is not random
  • temporal locality
  • short and long retiming distances
  • Put commonly/frequently used data (instructions)
    in small memory

11
Memory System Idea
  • Don't build a single, flat memory
  • Build a hierarchy of speeds/sizes/densities
  • commonly accessed data in fast/small memory
  • infrequently used data in large/dense/cheap
    memory
  • Goal
  • achieve speed of small memory
  • with density of large memory

12
Hierarchy Management
  • Two approaches
  • explicit data movement
  • register file
  • overlays
  • transparent/automatic movement
  • invisible to model

13
Opportunity Model
  • Model is simple
  • read data and operate upon
  • timing not visible
  • Can vary timing
  • common case fast (in small memory)
  • all cases correct
  • can be answered from larger/slower memory

14
Cache Basics
  • Small memory (cache) holds commonly used data
  • Read goes to cache first
  • If cache holds data
  • return value
  • Else
  • get value from bulk (slow) memory
  • Stall execution to hide latency
  • full pipeline, scoreboarding
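A minimal sketch of this read path, assuming `cache` and `bulk_memory` are dict-like stores (names are illustrative); a real cache also bounds its size and stalls the pipeline during the slow access:

```python
def cache_read(cache, bulk_memory, addr):
    """Read goes to cache first; on a miss, fetch from slow bulk memory."""
    if addr in cache:
        return cache[addr]        # hit: answered at small-memory speed
    value = bulk_memory[addr]     # miss: processor stalls for this latency
    cache[addr] = value           # install so a repeat access hits
    return value
```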

15
Cache Questions
  • How manage contents?
  • decide what goes (is kept) in cache?
  • How know what we have in cache?
  • How make sure consistent?
  • between cache and bulk memory

16
Cache contents
  • Ideal cache should hold the N items that
    maximize the fraction of memory references which
    are satisfied in the cache
  • Problem
  • don't know the future
  • don't know what values will be needed in the
    future
  • partially limitation of model
  • partially data dependent
  • halting problem
  • (can't say whether a piece of code will execute)

17
Cache Contents
  • Look for heuristics which keep most likely set of
    data in cache
  • Structure temporal locality
  • high probability that recent data will be
    accessed again
  • Heuristic goal
  • keep the last N references in cache

18
Temporal Locality Heuristic
  • Move data into cache on access (load, store)
  • Remove old data from cache to make space

19
Ideal Locality Cache
  • Stores N most recent things
  • store any N things
  • know which N things accessed
  • know when last used

20
Ideal Locality Cache
  • Match address
  • If matched,
  • update cycle
  • Else
  • drop oldest
  • read from memory
  • store in newly free slot
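Slides 19 and 20 together describe an N-entry fully associative cache with true LRU replacement. A sketch, using Python's OrderedDict to stand in for the "know when last used" bookkeeping (class and parameter names are illustrative):

```python
from collections import OrderedDict

class IdealLocalityCache:
    """Fully associative store of the N most recently used addresses."""
    def __init__(self, n, bulk):
        self.n = n                          # capacity: "store any N things"
        self.bulk = bulk                    # slow backing memory (dict-like)
        self.slots = OrderedDict()          # addr -> data, oldest entry first

    def read(self, addr):
        if addr in self.slots:              # match address
            self.slots.move_to_end(addr)    # "update cycle": now newest
            return self.slots[addr]
        if len(self.slots) >= self.n:
            self.slots.popitem(last=False)  # drop oldest
        value = self.bulk[addr]             # read from memory
        self.slots[addr] = value            # store in newly freed slot
        return value
```

In software the OrderedDict makes each step cheap; the next slide's point is that matching hardware needs O(N) comparisons per access.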

21
Problems with Ideal Locality?
  • Need O(N) comparisons
  • Must find oldest
  • (also O(N)?)
  • Expensive

22
Relaxing Ideal
  • Keeping usage data (and comparing it) is expensive
  • Relax
  • Keep only a few bits on age
  • Don't bother
  • pick victim randomly
  • things have expected lifetime in cache
  • old things more likely than new things
  • if evict wrong thing, will replace
  • very simple/cheap to implement
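A sketch of the "pick victim randomly" relaxation, assuming `slots` is the dict of resident entries from the earlier sketch:

```python
import random

def evict_random(slots):
    """No age tracking: any resident entry may be the victim."""
    victim = random.choice(list(slots))  # uniformly random resident address
    del slots[victim]                    # evicting the wrong thing just costs a later re-fetch
```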

23
Fully Associative Memory
  • Store both
  • address
  • data
  • Can store any N addresses
  • approaches ideal of best N things

24
Relaxing Ideal
  • Comparison for every address is expensive
  • Reduce comparisons
  • deterministically map address to a small portion
    of memory
  • Only compare addresses against that portion

25
Direct Mapped
  • Extreme is a direct mapped cache
  • Memory slot is f(addr)
  • usually a few low bits of address
  • Go directly to address
  • check if the data we want is there

[Figure: direct mapped cache. Low address bits select the slot; the stored tag is compared with the high address bits to signal a hit.]
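A sketch of the lookup in the figure, assuming word addresses (no block offset) and parallel `tags`/`data` arrays with 2^index_bits slots (all names illustrative):

```python
def direct_mapped_lookup(tags, data, addr, index_bits):
    """Slot is f(addr): low bits index, high bits must match the stored tag."""
    index = addr & ((1 << index_bits) - 1)  # "Addr low" selects the slot
    tag = addr >> index_bits                # "Addr high" is the tag
    hit = tags[index] == tag
    return (hit, data[index] if hit else None)
```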
26
Direct Mapped Cache
  • Benefit
  • simple
  • fast
  • Cost
  • multiple addresses will need same slot
  • conflicts mean we don't really have the most recent N
    things
  • can have conflict between commonly used items

27
Set-Associative Cache
  • Between extremes set-associative
  • Think of M direct mapped caches
  • One comparison for each cache
  • Lookup in all M caches
  • Compare and see if any have target data
  • Can have M things which map to same address
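Continuing the direct mapped sketch, an M-way lookup over M banks, where `ways` is a list of (tags, data) array pairs (illustrative; hardware performs the M tag comparisons in parallel):

```python
def set_associative_lookup(ways, addr, index_bits):
    """Check the indexed slot in each of the M direct mapped banks."""
    index = addr & ((1 << index_bits) - 1)
    tag = addr >> index_bits
    for tags, data in ways:          # one tag comparison per way
        if tags[index] == tag:
            return True, data[index]
    return False, None
```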

28
Two-Way Set Associative
[Figure: two-way set associative cache. Low address bits index both banks; high address bits are compared against both stored tags.]
29
Two-way Set Associative
[Figure: two-way set associative cache organization, Hennessy and Patterson, Fig. 5.8 (2nd ed.)]
30
Set Associative
  • More expensive than direct mapped
  • Can decide expense
  • Slower than direct mapped
  • have to mux in correct answer
  • Can better approximate holding N most
    recently/frequently used things

31
Classify Misses
  • Compulsory
  • first reference
  • (any cache would have)
  • Capacity
  • misses due to size
  • (fully associative would have)
  • Conflict
  • miss because of limited places to put things

32
Set Associativity
[Figure: miss rate vs. set associativity, Hennessy and Patterson, Fig. 5.10 (2nd ed.)]
33
Absolute Miss Rates
[Figure: absolute miss rates by cache size and associativity, Hennessy and Patterson, Fig. 5.10 (2nd ed.)]
34
Policy on Writes
  • Keep memory consistent at all times?
  • Or may cache and memory hold different values?
  • Write through
  • all writes go to memory and cache
  • Write back
  • writes go to cache
  • update memory only on eviction
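A sketch contrasting the two policies, assuming a `dirty` set records write-back entries that main memory has not yet seen (all names illustrative):

```python
def cache_write(cache, bulk, dirty, addr, value, write_through):
    """Write-through updates both levels; write-back defers the memory update."""
    cache[addr] = value
    if write_through:
        bulk[addr] = value        # memory consistent at all times; every write pays memory latency
    else:
        dirty.add(addr)           # memory updated only when this entry is evicted

def evict(cache, bulk, dirty, addr):
    """Write-back makes eviction the slow step; write-through just overwrites."""
    if addr in dirty:
        bulk[addr] = cache[addr]  # flush the newer value before dropping it
        dirty.discard(addr)
    del cache[addr]
```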

35
Write Policy
  • Write through
  • easy to implement
  • eviction trivial
  • (just overwrite)
  • every write is slow (main memory time)
  • Write back
  • fast (writes to cache)
  • eviction slow/complicated

36
Cache Equation...
  • Assume hits satisfied in 1 cycle
  • CPI = Base CPI + Refs/Instr × (Miss Rate) × (Miss Latency)

37
Cache Numbers
  • CPI = Base CPI + Refs/Instr × (Miss Rate) × (Miss Latency)
  • From ch2/experience
  • load-stores make up 30% of operations
  • Miss rates
  • 1-10%
  • Main memory latencies
  • 50ns
  • Cycle times
  • 300ps and shrinking

38
Cache Numbers
300ps cycle, 30ns main memory (miss latency of 100 cycles)
  • No Cache
  • CPI = Base CPI + 0.3 × 100 = Base + 30
  • Cache at CPU Cycle (10% miss)
  • CPI = Base CPI + 0.3 × 0.1 × 100 = Base + 3
  • Cache at CPU Cycle (1% miss)
  • CPI = Base CPI + 0.3 × 0.01 × 100 = Base + 0.3
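A quick check of the arithmetic above, assuming Base CPI = 1.0 purely for illustration:

```python
def cpi(base, refs_per_instr, miss_rate, miss_latency):
    """CPI = Base CPI + Refs/Instr x (Miss Rate) x (Miss Latency)."""
    return base + refs_per_instr * miss_rate * miss_latency

# 300ps cycle, 30ns main memory -> miss latency of 100 cycles
print(cpi(1.0, 0.3, 1.00, 100))   # no cache:       31.0 (Base + 30)
print(cpi(1.0, 0.3, 0.10, 100))   # 10% miss rate:   4.0 (Base + 3)
print(cpi(1.0, 0.3, 0.01, 100))   # 1% miss rate:    1.3 (Base + 0.3)
```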

39
Wrapup
40
Big Ideas
  • Structure
  • temporal locality
  • Model
  • optimization preserving model
  • simple model
  • sophisticated implementation
  • details hidden

41
Big Ideas
  • Balance competing factors
  • speed of cache vs. miss rate
  • Getting best of both worlds
  • multi-level
  • speed of small
  • capacity/density of large