Better than the Two: Exceeding Private and Shared Caches via Two-Dimensional Page Coloring

Lei Jin and Sangyeun Cho, Dept. of Computer Science, University of Pittsburgh



1
Better than the Two: Exceeding Private and Shared Caches via Two-Dimensional Page Coloring
  • Lei Jin and Sangyeun Cho
  • Dept. of Computer Science, University of Pittsburgh
2
Multicore distributed L2 caches
  • L2 caches typically sub-banked and distributed
    • IBM Power4/5: 3 banks
    • Sun Microsystems T1: 4 banks
    • Intel Itanium2 (L3): many sub-arrays
  • (Distributed L2 caches + switched NoC) → NUCA
  • Hardware-based management schemes
    • Private caching
    • Shared caching
    • Hybrid caching

3
Private and shared caching
  • Private caching
    • (+) short hit latency (always local)
    • (-) high on-chip miss rate
    • (-) long miss resolution time
    • (-) complex coherence enforcement
  • Shared caching
    • (+) low on-chip miss rate
    • (+) straightforward data location
    • (+) simple coherence (no replication)
    • (-) long average hit latency

4
Other approaches
  • Hybrid/flexible schemes
    • Core clustering [Speight et al., ISCA 2005]
    • Flexible CMP cache sharing [Huh et al., ICS 2004]
    • Flexible bank mapping [Liu et al., HPCA 2004]
  • Improving shared caching
    • Victim replication [Zhang and Asanovic, ISCA 2005]
  • Improving private caching
    • Cooperative caching [Chang and Sohi, ISCA 2006]
    • CMP-NuRAPID [Chishti et al., ISCA 2005]

5
Motivation
[Figure: hit latency vs. miss rate]
What is the optimal balance between miss rate and hit latency?
6
Talk roadmap
  • Data mapping, a key property [Cho and Jin, MICRO 2006]
  • Two-dimensional (2D) page coloring algorithm
  • Evaluation and results
  • Conclusion and future work

7
Data mapping
  • Data mapping
    • Memory data → location in L2 cache
  • Private caching
    • Data mapping determined by program location
    • Mapping created at miss time
    • No explicit control
  • Shared caching
    • Data mapping determined by address
    • slice number = (block address) mod (N_slice)
    • Mapping is static
    • No explicit control

8
Change mapping granularity
  • Block granularity: slice number = (block address) mod (N_slice)
  • Page granularity: slice number = (page address) mod (N_slice)
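
The difference between the two granularities can be sketched in a few lines of Python. This is only an illustration: the 64-byte block size, 4 KB page size, and 16 slices are assumed values, and the function names are hypothetical.

```python
# Assumed parameters for illustration only (not taken from this slide).
BLOCK_SIZE = 64    # bytes per cache block
PAGE_SIZE = 4096   # bytes per page
N_SLICE = 16       # number of L2 slices

def slice_by_block(addr: int) -> int:
    """Block granularity: slice number = (block address) mod N_slice."""
    return (addr // BLOCK_SIZE) % N_SLICE

def slice_by_page(addr: int) -> int:
    """Page granularity: slice number = (page address) mod N_slice."""
    return (addr // PAGE_SIZE) % N_SLICE

# Every block address inside physical page number 5.
page_base = 5 * PAGE_SIZE
addrs = [page_base + i * BLOCK_SIZE for i in range(PAGE_SIZE // BLOCK_SIZE)]

block_slices = {slice_by_block(a) for a in addrs}  # slices touched per block mapping
page_slices = {slice_by_page(a) for a in addrs}    # slices touched per page mapping

print(len(block_slices), page_slices)
```

With block-granularity mapping the blocks of one page are spread over all slices, while page-granularity mapping keeps the whole page in a single slice, which is what lets the OS steer a page to a chosen slice by picking its physical page number.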
9
OS controlled page mapping
[Figure: OS page allocation maps pages from the virtual address spaces of Program 1 and Program 2 to memory pages in the physical address space]
10
2D page coloring: the problem
[Figure: candidate locations for a page accessed by core P, each labeled with #access, #miss, and the resulting cost]
  #access   #miss   cost
  500       30      9000
  500       3       6900
  500       10      9000
  500       7       8100
  500       12      9600
Network latency / hop = 3 cycles
Memory latency = 300 cycles
Cost(color) = (#access x #hop x 3 cycles) + (#miss x 300 cycles)
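
The cost figures on this slide can be reproduced with a short Python sketch. The hop counts are not given in the transcript; the values 0 and 4 used below are assumptions, chosen because they reproduce the listed costs. The function and variable names are hypothetical.

```python
HOP_LATENCY = 3     # cycles per network hop (from the slide)
MEM_LATENCY = 300   # cycles per memory access (from the slide)

def cost(accesses: int, hops: int, misses: int) -> int:
    """Cost(color) = (#access x #hop x 3 cycles) + (#miss x 300 cycles)."""
    return accesses * hops * HOP_LATENCY + misses * MEM_LATENCY

# (accesses, misses, hops); hop counts are ASSUMED, not read from the slide.
candidates = [
    (500, 30, 0),   # local slice: no network hops, many misses
    (500, 3, 4),
    (500, 10, 4),
    (500, 7, 4),
    (500, 12, 4),
]

costs = [cost(a, h, m) for a, m, h in candidates]
best = min(range(len(costs)), key=costs.__getitem__)

print(costs)  # cost of each candidate placement
print(best)   # index of the cheapest candidate
```

Under these assumed hop counts the cheapest placement is a remote slice with few misses, not the local one, which is the trade-off the slide illustrates.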
11
2D coloring algorithm
  • Collect L2 reference trace
  • Derive conflict information [Sherwood et al., ICS 1999]

12
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 0 0               A 0 0 0
  B 0 0 0               B 0 0 0
  C 0 0 0               C 0 0 0
13
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 0 0               A 0 0 0
  B 1 0 0               B 0 0 0
  C 1 0 0               C 0 0 0
14
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 0 0               A 0 0 0
  B 1 0 0               B 0 0 0
  C 1 0 0               C 0 0 0
15
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 1 0               A 0 0 0
  B 1 0 0               B 0 0 0
  C 1 1 0               C 0 0 0
16
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 1 0               A 0 0 0
  B 0 0 0               B 1 0 0
  C 1 1 0               C 0 0 0
17
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 1 0               A 0 0 0
  B 0 0 0               B 1 0 0
  C 1 1 0               C 0 0 0
18
2D coloring algorithm (cont'd)
  • Derive conflict information

Reference Matrix      Conflict Matrix
    A B C                 A B C
  A 0 1 1               A 0 0 0
  B 0 0 1               B 1 0 0
  C 1 1 0               C 0 0 0
19
2D coloring algorithm (cont'd)
  • 2D Page coloring

Conflict Matrix
    A B C
  A 0 0 0
  B 1 0 0
  C 1 1 0

Access Counter: A = 1, B = 2, C = 1
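
The matrix updates shown across the preceding slides can be sketched as follows. This is one reading of the animation, not a confirmed description of the authors' method: on each reference to a page, any pages flagged in its reference-matrix row are counted as conflicts, the row is cleared, and the page's column is set for all other pages. The trace A, B, B, C is also an assumption, chosen because it reproduces the final conflict matrix and access counter; `derive_conflicts` is a hypothetical name.

```python
def derive_conflicts(trace, pages):
    """Build reference, conflict, and access-count structures from a page trace.

    Assumed update rule (inferred from the slides, not confirmed):
      1. count a conflict with every page flagged in this page's row,
      2. clear this page's row,
      3. flag this page in every other page's row.
    """
    idx = {p: i for i, p in enumerate(pages)}
    n = len(pages)
    reference = [[0] * n for _ in range(n)]
    conflict = [[0] * n for _ in range(n)]
    access = [0] * n

    for page in trace:
        p = idx[page]
        access[p] += 1
        for q in range(n):
            if reference[p][q]:
                conflict[p][q] += 1      # q was referenced since p's last reference
        reference[p] = [0] * n           # p is now the most recent reference
        for q in range(n):
            if q != p:
                reference[q][p] = 1      # remember that p came after q

    return reference, conflict, access

# Assumed trace; rows and columns are ordered A, B, C.
reference, conflict, access = derive_conflicts(["A", "B", "B", "C"], ["A", "B", "C"])

for row in conflict:
    print(row)
print(access)
```

The back-to-back references to B add no conflict, because no other page was touched in between; only intervening references to other pages are counted.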
20
2D coloring algorithm (cont'd)
  • 2D Page coloring

Conflict Matrix
    A B C
  A 0 0 0
  B 1 0 0
  C 1 1 0

Access Counter: A = 1, B = 2, C = 1

Cost(color, page) = α x (#Conflict(color) x mem latency) + (1-α) x (#Access x #hop(color) x hop delay)
Optimal color(page) = {C | Cost(C) = MIN[Cost(color, page)] for all colors}
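
A minimal Python sketch of the color selection follows, assuming the cost on this slide reads as: weight α times (conflicts x memory latency) plus (1 - α) times (accesses x hops x hop delay), minimized over all colors. The 4x4 mesh layout, the Manhattan hop distance, the latency constants, and the per-color conflict counts are all illustrative assumptions, and the function names are hypothetical.

```python
MEM_LATENCY = 300   # cycles (assumed, reusing the value from the earlier cost slide)
HOP_DELAY = 3       # cycles per hop (assumed, same source)
MESH_WIDTH = 4      # ASSUMED 4x4 mesh of 16 tiles, one color per tile

def hops(src: int, dst: int) -> int:
    """Manhattan distance between two tiles on the assumed mesh."""
    sx, sy = src % MESH_WIDTH, src // MESH_WIDTH
    dx, dy = dst % MESH_WIDTH, dst // MESH_WIDTH
    return abs(sx - dx) + abs(sy - dy)

def cost(alpha: float, conflicts: int, accesses: int, hop_count: int) -> float:
    """alpha weights the miss (conflict) term; (1 - alpha) weights the latency term."""
    miss_term = conflicts * MEM_LATENCY
    latency_term = accesses * hop_count * HOP_DELAY
    return alpha * miss_term + (1 - alpha) * latency_term

def optimal_color(alpha: float, core: int, accesses: int, conflicts_per_color) -> int:
    """Return the color (tile index) with the minimum cost for one page."""
    costs = [cost(alpha, c, accesses, hops(core, color))
             for color, c in enumerate(conflicts_per_color)]
    return min(range(len(costs)), key=costs.__getitem__)

# Hypothetical page: 400 accesses from core 0; every color would cause
# 30 conflicts except the far-corner tile 15, which would cause none.
conflicts = [30] * 15 + [0]

print(optimal_color(0.25, 0, 400, conflicts))  # latency-weighted choice
print(optimal_color(0.5, 0, 400, conflicts))   # more weight on conflicts
```

In this toy setting a small α keeps the page in the local slice, while a larger α pushes it to the distant, conflict-free slice, which shows how α trades hit latency against miss rate.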
21
Experimental setup
  • Experiments were carried out using a simulator derived from the SimpleScalar toolset.
  • The simulator models a 16-core tile-based CMP.
  • Each core has private 32KB I/D L1 caches and a 256KB slice of the globally shared L2 (4MB total).

22
Optimal page mapping
[Figure: optimal page mapping for gcc; # of pages per tile position (x, y), shown for α = 1/64 and α = 1/256]
23
Access distribution
24
Relative performance
25
Value of α
26
Conclusions
  • With careful data placement, there is substantial room for performance improvement.
  • Dynamic mapping schemes assisted by hardware-collected information may achieve similar performance improvements.
  • This method can also be applied to other optimization targets.

27
Current and future work
  • Dynamic mapping schemes
  • Performance
  • Power
  • Multiprogrammed and parallel workloads

28
Thank you! Questions?
29
Private caching
  • (+) short hit latency (always local)
  • (-) high on-chip miss rate
  • (-) long miss resolution time
  • (-) complex coherence enforcement
  • L1 miss → local L2 access
    • Hit
    • Miss → access directory
      • A copy on chip
      • Global miss
30
Shared caching
  • L1 miss → L2 access
    • Hit
    • Miss
  • (+) low on-chip miss rate
  • (+) straightforward data location
  • (+) simple coherence (no replication)
  • (-) long average hit latency

31
Performance