Memory Hierarchy Adaptivity An Architectural Perspective - PowerPoint PPT Presentation

Learn more at: https://www.ics.uci.edu
Title: Memory Hierarchy Adaptivity An Architectural Perspective


1
Memory Hierarchy Adaptivity: An Architectural Perspective
  • Alex Veidenbaum
  • AMRM Project
  • sponsored by DARPA/ITO

2
Opportunities for Adaptivity
  • Cache organization
  • Cache performance assist mechanisms
  • Hierarchy organization
  • Memory organization (DRAM, etc)
  • Data layout and address mapping
  • Virtual Memory
  • Compiler assist

3
Opportunities - Contd
  • Cache organization: adapt what?
  • Size: NO
  • Associativity: NO
  • Line size: MAYBE
  • Write policy: YES (fetch, allocate, write-back/write-through)
  • Mapping function: MAYBE
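
As a rough illustration of why the write policy is a natural candidate for adaptation, the sketch below counts the writes that reach memory for a tiny direct-mapped, write-allocate cache under each policy. The cache geometry, the two traces, and every name in the code are assumptions made for illustration; none of them come from the presentation.

```python
def memory_writes(trace, policy, num_lines=4, line_bytes=16):
    """Count writes that reach memory for a direct-mapped, write-allocate cache.

    trace  -- list of (op, address) pairs, where op is "R" or "W"
    policy -- "write-through" or "write-back"
    Note: a write-back transfer moves a whole line, a write-through
    transfer moves a single word, so the counts are not equal in bytes.
    """
    tags = [None] * num_lines      # tag currently held by each line
    dirty = [False] * num_lines    # dirty bits, used by write-back only
    writes = 0
    for op, addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        tag = block // num_lines
        if tags[index] != tag:                 # miss: evict, then allocate
            if policy == "write-back" and dirty[index]:
                writes += 1                    # write the dirty victim back
            tags[index] = tag
            dirty[index] = False
        if op == "W":
            if policy == "write-through":
                writes += 1                    # every store goes to memory
            else:
                dirty[index] = True            # defer until eviction
    if policy == "write-back":
        writes += sum(dirty)                   # flush lines still dirty at the end
    return writes


# Repeated stores to one line.
hot = [("W", 0)] * 8
# Stores that each touch a different line mapping to the same index.
streaming = [("W", 64 * i) for i in range(8)]

hot_wt = memory_writes(hot, "write-through")
hot_wb = memory_writes(hot, "write-back")
stream_wt = memory_writes(streaming, "write-through")
stream_wb = memory_writes(streaming, "write-back")
```

Under these assumptions the repeated-store trace needs far fewer transfers with write-back, while the streaming trace needs the same number of transfers under both policies, each of which is a full line for write-back. That kind of program-dependent difference is what an adaptive write policy would try to exploit.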

4
Opportunities - Contd
  • Cache assist: prefetch, write buffer, victim cache, etc.,
    between different levels
  • Adapt what?
  • Which mechanism(s) to use
  • Mechanism parameters

5
Opportunities - Contd
  • Hierarchy Organization
  • Where are cache assist mechanisms applied?
  • Between L1 and L2
  • Between L1 and Memory
  • Between L2 and Memory
  • What are the data-paths like?
  • Are prefetch, victim cache, and write buffer data
    written into the cache?
  • How much parallelism is possible in the
    hierarchy?

6
Opportunities - Contd
  • Memory Organization
  • Cached DRAM?
  • Interleave change?
  • Processing in memory (PIM)

7
Opportunities - Contd
  • Data layout and address mapping
  • In theory, something can be done but
  • The multiprocessor (MP) case is even worse
  • Adaptive address mapping or hashing, based on ???
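
The slide leaves open what an adaptive mapping would be based on. Purely as an illustration of what a different mapping function can change, the sketch below compares conventional modulo set indexing with a simple XOR-folded index, a widely known hashing scheme that is assumed here and not named in the presentation. The function names and the 16-set geometry are hypothetical.

```python
def modulo_index(block, num_sets):
    """Conventional mapping: low-order block-address bits select the set."""
    return block % num_sets


def xor_index(block, num_sets):
    """XOR the next group of address bits into the low-order index bits.

    num_sets must be a power of two.
    """
    bits = num_sets.bit_length() - 1
    low = block & (num_sets - 1)
    high = (block >> bits) & (num_sets - 1)
    return low ^ high


num_sets = 16
# Block addresses accessed with a stride equal to the number of sets.
blocks = [i * num_sets for i in range(16)]
sets_modulo = {modulo_index(b, num_sets) for b in blocks}
sets_xor = {xor_index(b, num_sets) for b in blocks}
```

For this particular stride, modulo indexing sends every block to the same set, while the XOR-folded index spreads the blocks over all 16 sets. Other access patterns can favor the plain mapping, which is one reason the mapping function is a candidate for adaptation.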

8
Opportunities - Contd
  • Compiler assist
  • Can select initial configuration
  • Pass hints on to hardware
  • Generate code to collect run-time info and adjust
    execution
  • Adapt configuration after being called at
    certain intervals during execution
  • Select or optimize code at run time

9
Opportunities - Contd
  • Virtual Memory can adapt
  • Page size?
  • Mapping?
  • Page prefetching/read ahead
  • Write buffer (file cache)
  • The above under multiprogramming?

10
Applying Adaptivity
  • What Drives Adaptivity?
  • Performance impact, overall and/or relative
  • Effectiveness, e.g. miss rate
  • Processor Stall introduced
  • Program characteristics
  • When to perform adaptive action
  • Run time: use feedback from hardware
  • Compile time: insert code, set up hardware

11
Where to Implement
  • In software: compiler and/or OS
  • (Static) Knowledge of program behavior
  • Factored into optimization and scheduling
  • Extra code, overhead
  • Lack of dynamic run-time information
  • Rate of adaptivity
  • requires recompilation, OS changes

12
Where to Implement - Contd
  • Hardware
  • dynamic information available
  • fast decision mechanism possible
  • transparent to software (thus safe)
  • delay and clock rate limit algorithm complexity
  • difficult to maintain long-term trends
  • little knowledge about program behavior

13
Where to Implement - Contd
  • Hardware/software
  • Software can set coarse hardware parameters
  • Hardware can supply software dynamic info
  • Perhaps more complex algorithms can be used
  • Software modification required
  • Communication mechanism required

14
Current Investigation
  • L1 cache assist
  • See wide variability in assist mechanism
    effectiveness
  • Between individual programs
  • Within a program, as a function of time
  • Propose hardware mechanisms to select between
    assist types and allocate buffer space
  • Give compiler an opportunity to set parameters

15
Mechanisms Used
  • Prefetching
  • Stream Buffers
  • Stride-directed, based on address alone
  • Miss stride: prefetch the same address using the
    number of intervening misses
  • Victim Cache
  • Write Buffer, all after L1
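
To make the "stride-directed, based on address alone" item concrete, here is a minimal sketch of a stride detector that watches only the miss-address stream and uses no program counter. The class name, the rule of confirming a stride after seeing it twice in a row, and the example addresses are assumptions for illustration, not details from the presentation.

```python
class StridePrefetcher:
    """Detect a constant stride in the miss-address stream (no PC used)."""

    def __init__(self):
        self.last_addr = None      # previous miss address
        self.last_stride = None    # difference between the previous two misses

    def on_miss(self, addr):
        """Return an address to prefetch, or None if no stride is confirmed."""
        prefetch = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Prefetch only when the same non-zero stride repeats.
            if stride != 0 and stride == self.last_stride:
                prefetch = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prefetch


p = StridePrefetcher()
predictions = [p.on_miss(a) for a in (100, 132, 164, 196, 500)]
```

In this example the detector stays silent for the first two misses, issues prefetches once the stride of 32 repeats, and stops again when the pattern breaks at address 500.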

16
Mechanisms Used - Contd
  • A mechanism can be used by itself or
  • All are used at once
  • Buffer space size and organization fixed
  • No adaptivity involved

17
Observed Behavior
  • Programs benefit differently from each
    mechanism; none is a consistent winner
  • Within a program, the same holds between
    mechanisms over time

18
Observed Behavior - Contd
  • Both of the above facts indicate a likely
    improvement from adaptivity
  • Select a better one among mechanisms
  • Even more can be expected from adaptively
    re-allocating from the combined buffer pool
  • To reduce stall time
  • To reduce the number of misses

19
Proposed Adaptive Mechanism
  • Hardware
  • a common pool of 2-4 word buffers
  • a set of possible policies, a subset of:
  • Stride-directed prefetch
  • PC-based prefetch
  • History-based prefetch
  • Victim cache
  • Write buffer

20
Adaptive Hardware - Contd
  • Performance monitors for each type/buffer
  • misses, stall time on hit, thresholds
  • Dynamic buffer allocator among mechanisms
  • Allocation and monitoring policy
  • Predict future behavior from observed past
  • Observe over a time interval dT, set for next
  • Save performance trends in next-level tags (< 8 bits)
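
The "observe over dT, set for next" idea can be sketched as a function that takes the per-mechanism counts from the interval just observed and splits the shared buffer pool for the next interval. The proportional rule, the minimum of one buffer per mechanism, the mechanism names, and the counts are all assumptions chosen for illustration; the presentation does not specify the allocation policy.

```python
def reallocate(pool_size, hits, min_each=1):
    """Split a shared buffer pool for the next interval.

    hits maps each mechanism name to the number of useful buffer hits
    its monitor counted during the interval just observed.
    """
    names = sorted(hits)
    spare = pool_size - min_each * len(names)
    if spare < 0:
        raise ValueError("pool too small for the minimum allocation")
    total = sum(hits.values())
    if total == 0:
        # No evidence from the last interval: split the spare buffers evenly.
        shares = {n: spare / len(names) for n in names}
    else:
        # Otherwise share the spare buffers in proportion to observed hits.
        shares = {n: spare * hits[n] / total for n in names}
    # Whole buffers first, on top of the guaranteed minimum.
    alloc = {n: min_each + int(shares[n]) for n in names}
    leftover = pool_size - sum(alloc.values())
    # Hand any remaining buffers to the largest fractional shares.
    by_fraction = sorted(names, key=lambda n: shares[n] - int(shares[n]),
                         reverse=True)
    for n in by_fraction[:leftover]:
        alloc[n] += 1
    return alloc


observed = {"prefetch": 30, "victim": 10, "write": 0}
next_alloc = reallocate(16, observed)
```

Keeping at least one buffer per mechanism is a design choice in this sketch: it lets the monitors keep collecting data on a mechanism that was useless in the last interval, so the allocator can notice when that mechanism becomes useful again.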

21
Further opportunities to adapt
  • L2 cache organization
  • variable-size line
  • L2 non-sequential prefetch
  • In-memory assists (DRAM)

22
MP Opportunities
  • Even longer latency
  • Coherence, hardware or software
  • Synchronization
  • Prefetch under and beyond the above
  • Avoid coherence if possible
  • Prefetch past synchronization
  • Assist Adaptive Scheduling