1
  • Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers
  • Norman P. Jouppi
  • Presenter: Shrinivas Narayani

2
Contents
  • Cache Basics
  • Types of Cache misses
  • Cost of Cache misses
  • How to reduce cache misses
  • Larger Block size
  • Adding Associativity (Reducing Conflict Misses)
  • Miss Cache
  • Victim Cache: an improvement over the miss cache
  • Removing Capacity Misses and Compulsory Misses
  • Prefetch Technique
  • Stream Buffers
  • Conclusion

3
  • Mapping
  • (Block address) modulo (Number of blocks in the cache)
  • The cache is indexed using the lower-order bits of the address.
  • e.g., memory addresses 00001 and 11101 map to cache locations 001 and 101.
  • A block is identified by its tag (the higher-order bits of the address).
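A minimal C sketch of this mapping (the eight-block cache matches the slide's example; the code itself is illustrative, not from the original presentation):

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_BLOCKS 8u   /* eight cache blocks -> 3 index bits */

    int main(void) {
        /* the two block addresses from the slide: 00001 and 11101 */
        uint32_t addrs[] = { 0x01u, 0x1Du };
        for (int i = 0; i < 2; i++) {
            uint32_t index = addrs[i] % NUM_BLOCKS;  /* lower-order bits */
            uint32_t tag   = addrs[i] / NUM_BLOCKS;  /* higher-order bits */
            printf("block address %2u -> cache index %u, tag %u\n",
                   addrs[i], index, tag);
        }
        return 0;   /* prints cache index 1 (001) and index 5 (101) */
    }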

4
Direct Mapped Cache
[Figure: an eight-block direct-mapped cache with indices 000-111; memory addresses 00001, 00101, 01001, 01101, and 10001 all map to cache index 001.]

5
Cache Terminology
  • Cache Hit
  • Cache Miss
  • Miss Penalty: the time to replace a block in the upper level with the corresponding block from the lower level.

6
  • In a direct-mapped cache, there is only one place for a newly requested item, and hence only one choice of what to replace.

7
Types of Misses
  • Compulsory: the first access to a block cannot be in the cache, so the block must be brought into the cache. These are also called cold-start misses or first-reference misses. (Misses even in an infinite cache.)
  • Capacity: if the cache cannot contain all the blocks needed during execution of a program, capacity misses occur as blocks are discarded and later retrieved. (Misses in a fully-associative cache of the same size.)
  • Conflict: if the block-placement strategy is set-associative or direct-mapped, conflict misses (in addition to compulsory and capacity misses) occur because a block can be discarded and later retrieved when too many blocks map to its set. These are also called collision or interference misses. (Misses in an N-way set-associative cache.)
  • Coherence: misses that result from invalidations needed to preserve multiprocessor cache consistency.

8
  • Conflict misses account for between 20% and 40% of all direct-mapped cache misses.

9
Cost of Cache Misses
  • Cycle time has been decreasing much faster than memory access time.
  • The average number of machine cycles per instruction has also been decreasing dramatically. Together, these two effects increase the relative cost of a cache miss.
  • E.g., a cache miss on the VAX 11/780 costs only about 60% of the average instruction execution time, so even if every instruction missed, performance would degrade by only about 60%.
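The underlying arithmetic can be sketched as an effective-CPI calculation; all numbers below are illustrative assumptions, not measurements from the paper:

    #include <stdio.h>

    int main(void) {
        double base_cpi     = 1.5;   /* assumed cycles per instruction, no misses */
        double miss_rate    = 0.05;  /* assumed fraction of accesses that miss */
        double miss_penalty = 50.0;  /* assumed cycles to fetch from memory */

        /* As cycle times shrink and base CPI falls, the same memory
           latency costs relatively more cycles - the trend the slide
           describes. */
        double effective_cpi = base_cpi + miss_rate * miss_penalty;
        printf("effective CPI = %.2f (%.0f%% slower than base)\n",
               effective_cpi,
               100.0 * (effective_cpi - base_cpi) / base_cpi);
        return 0;
    }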

10
How to Reduce Cache Misses
  • Increase Block Size
  • Increase Associativity
  • Use a Victim Cache
  • Use a Pseudo Associative Cache
  • Hardware Prefetching
  • Compiler-Controlled Prefetching
  • Compiler Optimizations

11
Increasing Block Size
  • One way to reduce the miss rate is to increase the block size.
  • Reduces compulsory misses - why?
  • Larger blocks take advantage of spatial locality.
  • However, larger blocks have disadvantages:
  • May increase the miss penalty (more data to fetch).
  • May increase hit time (more data to read from the cache and a larger mux).
  • May increase conflict and capacity misses (fewer blocks fit in the same total size).

12
Adding Associativity
[Figure: miss cache organization. The processor accesses a direct-mapped cache and, in parallel, a small fully-associative miss cache placed between it and the next lower level. Each miss cache entry holds a tag with comparator and one cache line of data, ordered from the MRU entry to the LRU entry.]
  • When a miss occurs, the data is returned to both the direct-mapped cache and the miss cache.
  • On each access, the upper (direct-mapped) cache and the miss cache are probed in parallel.
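A hedged sketch of this behavior in C (the sizes and the tags-only representation are illustrative; the paper's miss caches held only a few entries):

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define DM_BLOCKS  64   /* direct-mapped cache size (assumed) */
    #define MC_ENTRIES  4   /* miss cache entries (assumed) */

    typedef struct { uint32_t tag; bool valid; } Line;

    static Line dm[DM_BLOCKS];    /* direct-mapped upper cache */
    static Line mc[MC_ENTRIES];   /* fully-associative miss cache,
                                     mc[0] = MRU ... mc[MC_ENTRIES-1] = LRU */

    /* Probe the DM cache and the miss cache; on a full miss the line is
       placed in BOTH, which is the duplication the next slide criticizes.
       (LRU reordering on a miss-cache hit is omitted for brevity.) */
    bool access_line(uint32_t block_addr) {
        uint32_t index = block_addr % DM_BLOCKS;
        uint32_t tag   = block_addr / DM_BLOCKS;

        if (dm[index].valid && dm[index].tag == tag)
            return true;                         /* upper-cache hit */

        for (int i = 0; i < MC_ENTRIES; i++) {   /* miss-cache probe */
            if (mc[i].valid && mc[i].tag == block_addr) {
                dm[index] = (Line){ .tag = tag, .valid = true };
                return true;   /* short one-cycle on-chip miss */
            }
        }

        /* full miss: fetch from the next lower level into both caches */
        memmove(&mc[1], &mc[0], (MC_ENTRIES - 1) * sizeof(Line));
        mc[0] = (Line){ .tag = block_addr, .valid = true };
        dm[index] = (Line){ .tag = tag, .valid = true };
        return false;
    }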

13
Performance of the Miss Cache
  • Replaces a long off-chip miss penalty with a short one-cycle on-chip miss.
  • Removes a larger fraction of data conflict misses than instruction conflict misses.

14
Disadvantage of Miss Cache
  • Storage space in the miss cache is wasted, because every line it holds is duplicated in the direct-mapped cache.

15
Victim Cache
  • An improvement over the miss cache.
  • Loads the victim line (the line just evicted from the direct-mapped cache) instead of the requested line.
  • On a miss that hits in the victim cache, the contents of the direct-mapped cache line and the matching victim cache line are swapped.
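A sketch of the swap in C, under the same illustrative assumptions as the miss cache sketch above (tags only, assumed sizes):

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define DM_BLOCKS  64   /* direct-mapped cache size (assumed) */
    #define VC_ENTRIES  4   /* victim cache entries (assumed) */

    typedef struct { uint32_t tag; bool valid; } Line;

    static Line dm[DM_BLOCKS];
    static Line vc[VC_ENTRIES];   /* vc[0] = MRU ... vc[VC_ENTRIES-1] = LRU */

    bool access_line(uint32_t block_addr) {
        uint32_t index = block_addr % DM_BLOCKS;
        uint32_t tag   = block_addr / DM_BLOCKS;

        if (dm[index].valid && dm[index].tag == tag)
            return true;                          /* DM hit */

        for (int i = 0; i < VC_ENTRIES; i++) {
            if (vc[i].valid && vc[i].tag == block_addr) {
                /* victim cache hit: swap the DM line with the victim
                   line instead of duplicating it */
                Line old = dm[index];
                dm[index] = (Line){ .tag = tag, .valid = true };
                vc[i].valid = old.valid;
                vc[i].tag   = old.tag * DM_BLOCKS + index; /* full address */
                return true;
            }
        }

        /* full miss: the evicted DM line (the victim) goes to the victim
           cache; the requested line is loaded ONLY into the DM cache */
        if (dm[index].valid) {
            memmove(&vc[1], &vc[0], (VC_ENTRIES - 1) * sizeof(Line));
            vc[0] = (Line){ .tag = dm[index].tag * DM_BLOCKS + index,
                            .valid = true };
        }
        dm[index] = (Line){ .tag = tag, .valid = true };
        return false;
    }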

16
The Effect of DM Cache Size on Victim Cache Performance
  • As the direct-mapped cache size increases, the likelihood that a given conflict miss can be removed by the victim cache decreases.

17
Reducing Capacity and Compulsory Misses
Use a prefetch technique:
1. prefetch always
2. prefetch on miss
3. tagged prefetch
18
  • Prefetch always: a prefetch is started after every reference.
  • Prefetch on miss: a miss always triggers a fetch of the next sequential line.
  • Tagged prefetch: each block has a tag bit associated with it.
  • When a block is prefetched, its tag bit is set to zero; the bit is set to one when the block is used.
  • When the bit changes from zero to one, a prefetch of the next block is initiated.
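A minimal sketch of tagged prefetch (the cache size and the direct placement of prefetched lines into the cache are assumptions for illustration):

    #include <stdbool.h>
    #include <stdint.h>

    #define NUM_BLOCKS 64   /* assumed cache size */

    typedef struct {
        uint32_t tag;
        bool valid;
        bool used;    /* the per-block tag bit: 0 when prefetched,
                         1 once the block has been referenced */
    } Line;

    static Line cache[NUM_BLOCKS];

    static void fetch(uint32_t block_addr, bool demand) {
        uint32_t index = block_addr % NUM_BLOCKS;
        cache[index] = (Line){ .tag   = block_addr / NUM_BLOCKS,
                               .valid = true,
                               .used  = demand };
    }

    void access_line(uint32_t block_addr) {
        uint32_t index = block_addr % NUM_BLOCKS;
        uint32_t tag   = block_addr / NUM_BLOCKS;

        if (cache[index].valid && cache[index].tag == tag) {
            if (!cache[index].used) {
                /* first use of a prefetched block: the tag bit flips
                   from 0 to 1, triggering a prefetch of the next block */
                cache[index].used = true;
                fetch(block_addr + 1, false);
            }
            return;
        }

        /* demand miss: fetch this block and prefetch its successor */
        fetch(block_addr, true);
        fetch(block_addr + 1, false);
    }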

19
Stream buffers
  • Start the prefetch before the tag transition occurs.

20
  • A stream buffer consists of a series of entries, each holding a tag, an available bit, and a data line.
  • On a miss, it begins fetching successive lines starting at the miss target.
  • Lines after the requested line are placed in the buffer, which avoids polluting the cache with data that may never be needed.
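A hedged sketch of a single stream buffer (the depth, line size, and synchronous handling of the available bit are simplifying assumptions):

    #include <stdbool.h>
    #include <stdint.h>

    #define SB_DEPTH 4   /* number of entries (assumed) */

    typedef struct {
        uint32_t tag;        /* line address */
        bool     available;  /* set when the prefetched data arrives */
        uint8_t  data[32];   /* one data line (assumed 32 bytes) */
    } SBEntry;

    typedef struct {
        SBEntry  entry[SB_DEPTH];  /* FIFO: entry[0] is the head */
        uint32_t next_fetch;       /* next sequential line to prefetch */
    } StreamBuffer;

    static void start_fetch(SBEntry *e, uint32_t line_addr) {
        e->tag = line_addr;
        e->available = false;   /* data fills in later from memory */
    }

    /* Called on a cache miss.  A hit at the head shifts the FIFO up and
       prefetches a new line at the tail; a head miss flushes the buffer
       and restarts it at the lines after the miss target. */
    bool stream_buffer_miss(StreamBuffer *sb, uint32_t miss_addr) {
        if (sb->entry[0].available && sb->entry[0].tag == miss_addr) {
            for (int i = 0; i < SB_DEPTH - 1; i++)
                sb->entry[i] = sb->entry[i + 1];
            start_fetch(&sb->entry[SB_DEPTH - 1], sb->next_fetch++);
            return true;   /* line supplied to the cache */
        }
        for (int i = 0; i < SB_DEPTH; i++)   /* flush and restart */
            start_fetch(&sb->entry[i], miss_addr + 1 + i);
        sb->next_fetch = miss_addr + 1 + SB_DEPTH;
        return false;      /* normal miss path fetches miss_addr itself */
    }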

21
Multi-Way Stream Buffers
  • A single data stream buffer removes only about 25% of data cache misses.
  • Data references interleave streams from several different sources.
  • Solution: four stream buffers operating in parallel.
  • Instruction stream performance is unchanged.
  • Roughly twice the performance of a single data stream buffer.
22
Stream Buffers vs. Prefetch
  • Feasible to implement.
  • Lower latency.
  • The extra hardware required by stream buffers is comparable to the additional tag bits required by tagged prefetch.

23
Stream Buffer Performance vs. Cache Size
  • Only the data stream buffer's performance improves as the cache size increases.
  • A larger cache can hold the data for reference patterns that access several sets of data.

25
Conclusion
  • The miss cache is beneficial in removing data cache conflict misses.
  • The victim cache is an improvement over the miss cache that saves the victim of the cache miss instead of the target line.
  • Stream buffers reduce capacity and compulsory misses.
  • Multi-way stream buffers are a set of stream buffers that can prefetch down several streams concurrently.

26
References
  • Jouppi, Norman P. "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers."
  • Patterson, D. and Hennessy, J. Computer Organization and Design.