Title: Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers
1 - Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers
- Norman P. Jouppi
- Presenter: Shrinivas Narayani
2 - Contents
- Cache Basics
- Types of Cache Misses
- Cost of Cache Misses
- How to Remove Cache Misses
- Larger Block Size
- Adding Associativity (Reducing Conflict Misses)
- Miss Cache
- Victim Cache: an improvement over the miss cache
- Removing Capacity and Compulsory Misses
- Prefetch Techniques
- Stream Buffers
- Conclusion
3 - Mapping
- (Block Address) modulo (Number of blocks in the cache)
- The cache is indexed using the lower-order bits of the address.
- e.g. memory addresses 00001 and 11101 map to locations 001 and 101 in the cache.
- Data is identified using the tag (the higher-order bits of the address).
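The mapping rule above can be sketched directly. This is a minimal illustration, assuming an 8-line cache (3 index bits, locations 000 through 111) to match the example addresses:

```python
# Direct-mapped lookup: lower-order bits select the cache line (index),
# higher-order bits form the tag stored for comparison on each access.
NUM_LINES = 8  # assumed cache size: 3 index bits, locations 000..111

def map_address(block_address: int) -> tuple[int, int]:
    index = block_address % NUM_LINES   # (Block Address) modulo (cache lines)
    tag = block_address // NUM_LINES    # higher-order bits identify the block
    return index, tag

print(map_address(0b00001))  # (1, 0): maps to location 001
print(map_address(0b11101))  # (5, 3): maps to location 101
```

Both example addresses from the slide land where stated: 00001 at location 001 and 11101 at location 101, with different tags telling them apart from other blocks that share a line.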
4 - Direct-Mapped Cache
(Diagram: cache locations 000-111, with memory addresses 00001, 00101, 01001, 01101, and 10001 mapping into them by their lower three bits.)
5 - Cache Terminology
- Cache Hit
- Cache Miss
- Miss Penalty: the time to replace a block in the upper level with the corresponding block from the lower level.
6 - In a direct-mapped cache, there is only one place the newly requested item can go, and hence only one choice of what to replace.
7 - Types of Misses
- Compulsory: the first access to a block cannot be in the cache, so the block must be brought into the cache. These are also called cold-start misses or first-reference misses. (Misses even in an infinite cache.)
- Capacity: if the cache cannot contain all the blocks needed during execution of a program, capacity misses will occur due to blocks being discarded and later retrieved. (Misses even in a fully-associative cache of the same size.)
- Conflict: if the block-placement strategy is set-associative or direct-mapped, conflict misses (in addition to compulsory and capacity misses) will occur because a block can be discarded and later retrieved if too many blocks map to its set. These are also called collision misses or interference misses. (Misses in an N-way associative cache.)
- Coherence: misses that result from invalidations to preserve multiprocessor cache consistency.
8 - Conflict misses account for between 20% and 40% of all direct-mapped cache misses.
9 - Cost of Cache Misses
- Cycle time has been decreasing much faster than memory access time.
- The average number of machine cycles per instruction has also been decreasing dramatically. Together, these two effects multiply the relative cost of a cache miss.
- E.g. a cache miss on the VAX 11/780 cost only 60% of the average instruction execution time, so even if every instruction missed the cache, machine performance would degrade by only 60%.
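The VAX figure can be reproduced with a one-line calculation. The specific cycle counts below (6-cycle miss penalty, 10 cycles per instruction) are illustrative assumptions chosen to yield the 60% figure, not numbers taken from the slide:

```python
# Cost of one miss, expressed in average instruction-execution times.
# Example numbers are assumptions for illustration only.
def miss_cost_in_instructions(miss_penalty_cycles: float,
                              cycles_per_instruction: float) -> float:
    return miss_penalty_cycles / cycles_per_instruction

print(miss_cost_in_instructions(6, 10))   # 0.6: a miss costs 60% of an instruction
# As cycle time and CPI shrink while memory latency stays put, the same
# physical miss latency balloons to many instruction times:
print(miss_cost_in_instructions(100, 1))  # 100.0 instructions per miss
```

This is the trend the slide describes: the ratio grows as the denominator shrinks, so misses that were once cheap relative to computation dominate execution time on faster machines.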
10 - How to Reduce Cache Misses
- Increase Block Size
- Increase Associativity
- Use a Victim Cache
- Use a Pseudo-Associative Cache
- Hardware Prefetching
- Compiler-Controlled Prefetching
- Compiler Optimizations
11 - Increasing Block Size
- One way to reduce the miss rate is to increase the block size.
- Reduces compulsory misses. Why? It takes advantage of spatial locality.
- However, larger blocks have disadvantages:
- May increase the miss penalty (need to fetch more data)
- May increase hit time (need to read more data from the cache, and a larger mux)
- May increase conflict and capacity misses
12 - Adding Associativity: the Miss Cache
(Diagram: a direct-mapped cache between the processor and a small fully-associative miss cache; each miss-cache entry holds a tag-and-comparator and one cache line of data, ordered from MRU entry to LRU entry and refilled from the next lower cache.)
- When a miss occurs, data is returned to both the DM cache and the miss cache.
- On each access, the upper (DM) cache and the miss cache are probed in parallel.
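The fill policy above can be sketched as a small simulation. This is a minimal sketch, not the paper's hardware: the cache sizes, single-word lines, and string return values are assumptions for illustration:

```python
from collections import OrderedDict

# Sketch of a miss cache in front of a direct-mapped (DM) cache.
# Sizes and the one-word "line" granularity are illustrative assumptions.
class MissCache:
    def __init__(self, dm_lines=8, miss_entries=2):
        self.dm = [None] * dm_lines          # DM cache: one tag per line
        self.dm_lines = dm_lines
        self.miss = OrderedDict()            # fully associative, kept in LRU order
        self.miss_entries = miss_entries

    def access(self, block_addr):
        index = block_addr % self.dm_lines
        tag = block_addr // self.dm_lines
        if self.dm[index] == tag:
            return "hit"
        if block_addr in self.miss:          # probed in parallel with the DM cache
            self.miss.move_to_end(block_addr)
            self.dm[index] = tag             # short one-cycle on-chip refill
            return "miss-cache hit"
        # True miss: data returns to BOTH the DM cache and the miss cache.
        self.dm[index] = tag
        self.miss[block_addr] = True
        if len(self.miss) > self.miss_entries:
            self.miss.popitem(last=False)    # evict the LRU miss-cache entry
        return "miss"

mc = MissCache()
print(mc.access(0))   # miss: goes off-chip
print(mc.access(8))   # miss: conflicts with 0 (same DM line)
print(mc.access(0))   # miss-cache hit: conflict resolved on-chip
```

With two conflicting addresses (0 and 8 both map to DM line 0), the re-reference to 0 hits in the miss cache instead of paying the off-chip penalty, which is exactly the conflict-miss case the next slide credits it with removing.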
13 - Performance of the Miss Cache
- Replaces a long off-chip miss penalty with a short one-cycle on-chip miss.
- Removes more data conflict misses than instruction conflict misses.
14 - Disadvantage of the Miss Cache
- Waste of storage space in the miss cache due to duplication of data.
15 - Victim Cache
- An improvement over the miss cache.
- Loads the victim (evicted) line instead of the requested line, so no data is duplicated.
- On a miss that hits in the victim cache, the contents of the DM cache line and the victim cache line are swapped.
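The swap behavior can be sketched the same way as the miss cache. Again a minimal sketch under assumed sizes, not the paper's hardware; note the victim cache is filled with the *evicted* line, never the requested one:

```python
from collections import OrderedDict

# Sketch of a victim cache: on a DM eviction the VICTIM line is saved, and
# a hit in the victim cache swaps that line with the conflicting DM line.
# Sizes are illustrative assumptions.
class VictimCache:
    def __init__(self, dm_lines=8, victim_entries=2):
        self.dm = [None] * dm_lines
        self.dm_lines = dm_lines
        self.victims = OrderedDict()         # fully associative, LRU order
        self.victim_entries = victim_entries

    def access(self, block_addr):
        index = block_addr % self.dm_lines
        tag = block_addr // self.dm_lines
        if self.dm[index] == tag:
            return "hit"
        victim_tag = self.dm[index]
        if block_addr in self.victims:       # swap DM line and victim line
            del self.victims[block_addr]
            if victim_tag is not None:
                self._save(victim_tag * self.dm_lines + index)
            self.dm[index] = tag
            return "victim hit"
        if victim_tag is not None:           # save the evicted line, not the new one
            self._save(victim_tag * self.dm_lines + index)
        self.dm[index] = tag
        return "miss"

    def _save(self, addr):
        self.victims[addr] = True
        if len(self.victims) > self.victim_entries:
            self.victims.popitem(last=False)

vc = VictimCache()
print(vc.access(0), vc.access(8))  # two conflicting misses
print(vc.access(0), vc.access(8))  # ping-pong now stays on-chip (victim hits)
```

Because each entry holds a line the DM cache no longer has, none of the victim-cache storage duplicates DM contents, removing the waste noted on the previous slide.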
16 - The Effect of DM Cache Size on Victim Cache Performance
- As DM cache size increases, the likelihood that a conflict miss can be removed by the victim cache decreases.
17 - Reducing Capacity and Compulsory Misses
- Use a prefetch technique:
1. Prefetch always
2. Prefetch on miss
3. Tagged prefetch
18 - Prefetch always: prefetches after every reference.
- Prefetch on miss: always fetches the next line after a miss.
- Tagged prefetch: each block has a tag bit associated with it.
- When a block is prefetched, its tag bit is set to zero; it is set to one when the block is used.
- When a block's tag bit makes this zero-to-one transition, the next block is prefetched.
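The tagged-prefetch rule can be sketched in a few lines. A minimal sketch: the dict-based cache with no capacity limit and single-line prefetch distance are simplifying assumptions:

```python
# Sketch of tagged prefetch: each cached line carries a tag bit, 0 when the
# line was prefetched but not yet used, 1 once used. The 0 -> 1 transition
# triggers a prefetch of the next sequential line.
def tagged_prefetch_access(cache, line_addr):
    """cache maps line address -> tag bit. Returns addresses prefetched."""
    prefetched = []
    if line_addr not in cache:
        cache[line_addr] = 1                 # demand fetch: used immediately
        cache[line_addr + 1] = 0             # prefetch next line, tag bit = 0
        prefetched.append(line_addr + 1)
    elif cache[line_addr] == 0:
        cache[line_addr] = 1                 # first use: tag bit goes 0 -> 1 ...
        cache[line_addr + 1] = 0             # ... which triggers the next prefetch
        prefetched.append(line_addr + 1)
    return prefetched

lines = {}
print(tagged_prefetch_access(lines, 0))  # [1]: demand miss prefetches line 1
print(tagged_prefetch_access(lines, 1))  # [2]: first use of line 1 prefetches 2
print(tagged_prefetch_access(lines, 1))  # []: re-use triggers nothing
```

A sequential walk thus stays one line ahead of the processor, while repeated hits to an already-used line (tag bit 1) generate no extra memory traffic, unlike prefetch-always.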
19 - Stream Buffers
- Start the prefetch before a tag transition occurs.
20 - A stream buffer consists of a series of entries, each consisting of a tag, an available bit, and a data line.
- On a miss, it fetches successive lines starting at the miss target.
- Lines after the requested line are placed in the buffer, which avoids polluting the cache with data that is not needed.
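The behavior above can be sketched as a FIFO. A minimal sketch under assumptions: a 4-entry buffer, a set standing in for the cache, and the common simplification that only the buffer head is checked, with the buffer flushed and restarted on any other miss:

```python
from collections import deque

# Sketch of a single stream buffer: on a cache miss it prefetches the lines
# AFTER the miss target into a FIFO; a hit at the head moves that line into
# the cache and prefetches one more line down the stream.
class StreamBuffer:
    def __init__(self, depth=4):
        self.depth = depth
        self.fifo = deque()                  # prefetched line addresses, in order

    def access(self, cache, line_addr):
        if line_addr in cache:
            return "hit"
        if self.fifo and self.fifo[0] == line_addr:
            cache.add(self.fifo.popleft())   # head of buffer -> cache
            next_addr = (self.fifo[-1] + 1) if self.fifo else line_addr + 1
            self.fifo.append(next_addr)      # keep prefetching down the stream
            return "stream-buffer hit"
        # Miss elsewhere: flush, fetch the target, restart at the next lines.
        self.fifo = deque(line_addr + i for i in range(1, self.depth + 1))
        cache.add(line_addr)
        return "miss"

sb, cache = StreamBuffer(), set()
print(sb.access(cache, 10))  # miss: buffer starts prefetching 11..14
print(sb.access(cache, 11))  # stream-buffer hit: 11 enters the cache, 15 queued
print(sb.access(cache, 12))  # stream-buffer hit
```

Only lines the processor actually reaches ever enter the cache; the rest sit in the buffer, which is how the scheme avoids polluting the cache with unneeded data.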
21 - Multi-Way Stream Buffers
- A single stream buffer removes only 25% of data cache misses.
- Data references interleave streams from different sources.
- Solution: four stream buffers in parallel.
- Instruction stream performance is unchanged.
- Twice the performance of a single stream buffer on data misses.
22 - Stream Buffers vs. Prefetch
- Feasible to implement.
- Lower latency.
- The extra hardware required by stream buffers is comparable to the additional tag bits required by tagged prefetch.
23 - Stream Buffer Performance vs. Cache Size
- Only the data stream buffer's performance improves as cache size increases.
- It can then contain data for reference patterns that access several sets of data.
25 - Conclusion
- The miss cache is beneficial in removing data cache conflict misses.
- The victim cache is an improvement over the miss cache: it saves the victim of the cache miss instead of the target.
- Stream buffers reduce capacity and compulsory misses.
- Multi-way stream buffers are a set of stream buffers that can prefetch down several streams concurrently.
26 - References
- Norman P. Jouppi, "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers"
- Patterson D. and Hennessy J., "Computer Organization and Design"