GC Advantage: Improving Program Locality

Slides: 36
Provided by: Kath285

1
GC Advantage: Improving Program Locality
  • Xianglong Huang, Zhenlin Wang,
  • Stephen M Blackburn, Kathryn S McKinley,
  • J Eliot B Moss, Perry Cheng

2
Motivation
  • Memory gap
  • How are Java programs affected?

3
Marksweep vs. Copying
[Figure: pseudojbb]

4
Motivation
  • Javac with perfect L1 and L2 caches
  • 16 KB L1, 256 KB L2
  • Appel-style collector, GCTk
  • Breadth-first copying order

5
Motivation
  • Copying collector can reorder objects
  • Goal: take advantage of copying collectors
  • Reorder objects to improve locality

6
Exploring The Space
  • Different policies for traversing roots
  • Class-oblivious traversal orders
  • Which traversal order is the best?
  • Class-based traversal orders
  • How to find the important data structure?

7
Different Root Traversal Policies
  • Two different types of roots:
  • Stack and global variables
  • Remembered sets (for generational collectors)
  • Different traversal orders:
  • Copy all roots before traversing any children
  • Copy each root and its children (root-by-root)
  • Split roots:
  • Stack roots and their children first
  • Remset roots and their children first
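
The two main root policies can be sketched on a toy heap. This is only an illustration: the heap, the root names, and the breadth-first scan of children are assumptions made for the example, not details taken from the slides.

```python
from collections import deque

# Hypothetical heap: object id -> ids of the objects it references.
HEAP = {
    "s1": ["a"], "s2": ["b"], "r1": ["c"],
    "a": ["d"], "b": [], "c": [], "d": [],
}
STACK_ROOTS = ["s1", "s2"]   # made-up stack/global roots
REMSET_ROOTS = ["r1"]        # made-up remembered-set roots


def _drain(queue, copied, order):
    """Breadth-first copy of everything reachable from the queued objects."""
    while queue:
        for child in HEAP[queue.popleft()]:
            if child not in copied:
                copied.add(child)
                order.append(child)
                queue.append(child)


def all_roots_first(roots):
    """Copy every root, then traverse the children."""
    copied, order, queue = set(), [], deque()
    for root in roots:
        if root not in copied:
            copied.add(root)
            order.append(root)
            queue.append(root)
    _drain(queue, copied, order)
    return order


def root_by_root(roots):
    """Copy each root together with everything reachable from it."""
    copied, order = set(), []
    for root in roots:
        if root not in copied:
            copied.add(root)
            order.append(root)
            _drain(deque([root]), copied, order)
    return order


if __name__ == "__main__":
    roots = STACK_ROOTS + REMSET_ROOTS
    print(all_roots_first(roots))
    print(root_by_root(roots))
```

Under root-by-root, each root ends up next to the objects reachable from it. Changing the order of the root list (for example REMSET_ROOTS + STACK_ROOTS) is one simple way to model the stack-first and remset-first splits.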

8
Experiment Setup
  • JikesRVM, JMTk
  • Generational copying collector with bounded
    nursery size of 4MB
  • PseudoAdaptive 2nd iteration

9
Different Root Traversal Policies
  • Root-by-root (RxR) has the best mutator locality

10
Different Root Traversal Policies
  • Total execution time

11
Exploring The Space
  • Different policies for traversing roots
  • Class-oblivious traversal orders
  • Which traversal order is the best?
  • Class-based traversal orders
  • How to find the important data structure?

12
Different Traversal Orders
  • Breadth first: 1, 2, 3, 4, 5, 6, 7
  • Pure depth first: 1, 2, 6, 3, 4, 7, 5
  • Pure depth first, LIFO: 1, 5, 4, 7, 3, 2, 6

[Figure: example object graph with nodes 1 to 7]

13
Different Traversal Orders
  • Breadth first: 1, 2, 3, 4, 5, 6, 7
  • Pure depth first: 1, 2, 6, 3, 4, 7, 5
  • Pure depth first, LIFO: 1, 5, 4, 7, 3, 2, 6
  • Partial depth first, 2 children: 1, 2, 6, 3, 4, 5, 7

[Figure: example object graph with nodes 1 to 7]
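
The four orders listed above can be reproduced with a short sketch. The object graph is an assumption: it is the tree (1 points to 2, 3, 4, 5; 2 points to 6; 4 points to 7) that is consistent with the orders on the slide. The partial depth-first rule used here (copy every child, descend immediately into the first two, defer the rest) is one interpretation that reproduces the slide's order, not necessarily the collector's actual algorithm.

```python
from collections import deque

# Assumed object graph, inferred from the orders listed on the slide.
GRAPH = {1: [2, 3, 4, 5], 2: [6], 3: [], 4: [7], 5: [], 6: [], 7: []}


def breadth_first(root):
    order, seen, queue = [root], {root}, deque([root])
    while queue:
        for child in GRAPH[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def depth_first(root):
    order, seen = [], set()

    def visit(obj):
        seen.add(obj)
        order.append(obj)
        for child in GRAPH[obj]:
            if child not in seen:
                visit(child)

    visit(root)
    return order


def depth_first_lifo(root):
    # Explicit stack: the child pushed last is copied first.
    order, seen, stack = [], {root}, [root]
    while stack:
        obj = stack.pop()
        order.append(obj)
        for child in GRAPH[obj]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return order


def partial_depth_first(root, k=2):
    # Copy every child; scan the first k children right away,
    # and defer scanning of the remaining children to a queue.
    order, seen, pending = [root], {root}, deque()

    def scan(obj):
        for i, child in enumerate(GRAPH[obj]):
            if child in seen:
                continue
            seen.add(child)
            order.append(child)
            if i < k:
                scan(child)
            else:
                pending.append(child)

    scan(root)
    while pending:
        scan(pending.popleft())
    return order


if __name__ == "__main__":
    for fn in (breadth_first, depth_first, depth_first_lifo, partial_depth_first):
        print(fn.__name__, fn(1))
```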
14
Class Oblivious Type
  • Different traversal policies
  • Partial DF is the best

15
Exploring The Space
  • Different policies for traversing roots
  • Class-oblivious traversal orders
  • Which traversal order is the best?
  • Class-based traversal orders
  • How to find the important data structure?

16
Class-based Traversal
  • Class-oblivious traversal orders are inflexible
  • Class-based object traversal
  • Static profiling
  • Dynamic sampling

17
Static Profiling
  • Profile object accesses
  • Find hot pairs with strong correlation
  • Example:
  • (1,4), (4,7) and (2,6) are strongly correlated
  • Order: 1, 4, 7, 2, 6, 3, 5

[Figure: example object graph with nodes 1 to 7]
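
A minimal sketch of how hot pairs could drive the copy order in this example. The graph is the same assumed tree (1 points to 2, 3, 4, 5; 2 points to 6; 4 points to 7), and the rule "follow hot edges immediately, defer the other children to a queue" is an illustrative assumption that happens to reproduce the order on the slide.

```python
from collections import deque

# Assumed object graph for the example on the slide.
GRAPH = {1: [2, 3, 4, 5], 2: [6], 3: [], 4: [7], 5: [], 6: [], 7: []}
# Hot (parent, child) pairs taken from the slide.
HOT_PAIRS = {(1, 4), (4, 7), (2, 6)}


def hot_pair_order(root):
    order, seen, cold = [], set(), deque([root])

    def copy_chain(obj):
        seen.add(obj)
        order.append(obj)
        # Follow hot edges first so correlated objects end up adjacent.
        for child in GRAPH[obj]:
            if (obj, child) in HOT_PAIRS and child not in seen:
                copy_chain(child)
        # Everything else is deferred.
        for child in GRAPH[obj]:
            if child not in seen:
                cold.append(child)

    while cold:
        obj = cold.popleft()
        if obj not in seen:
            copy_chain(obj)
    return order


if __name__ == "__main__":
    print(hot_pair_order(1))
```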
18
Online Profiling
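  • Use the adaptive compiler's sampling
  • Hot method
  • Hot basic block
  • Use field accesses to indicate hot fields
  • Example (in a hot method):
  • a is an instance of class A
  • a.b is accessed, so field b of class A is hot

[Figure: object of class A whose field b points to an object of class B]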
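
Class-based advice of this kind can be sketched as a table from class name to hot field names that the copier consults. Everything here is hypothetical: the Obj type, the HOT_FIELDS table, and the copy routine are illustrative names, not the JikesRVM or JMTk interfaces.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    cls: str                                     # class name of the object
    fields: dict = field(default_factory=dict)   # field name -> Obj or None


# Hypothetical advice from sampling: class name -> hot field names.
HOT_FIELDS = {"A": ["b"]}


def copy_with_advice(root):
    order, seen, pending = [], set(), deque([root])

    def copy(obj):
        seen.add(id(obj))
        order.append(obj)
        # Copy the referents of hot fields right behind their parent.
        for name in HOT_FIELDS.get(obj.cls, []):
            target = obj.fields.get(name)
            if target is not None and id(target) not in seen:
                copy(target)
        # Referents of the remaining fields are copied later.
        for target in obj.fields.values():
            if target is not None and id(target) not in seen:
                pending.append(target)

    while pending:
        obj = pending.popleft()
        if id(obj) not in seen:
            copy(obj)
    return order


if __name__ == "__main__":
    a = Obj("A", {"c": Obj("C"), "b": Obj("B")})
    print([o.cls for o in copy_with_advice(a)])
```

Because the advice is keyed by class, every instance of A is treated the same way; that is the limitation the later slides raise when asking whether instance-based advice is needed.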
19
Online Profiling
  • Micro benchmark results

20
Online Profiling
  • Geometric mean

21
Reasons
  • No advice for most of the objects copied
  • For jess, db and raytrace, we only pick <<1% of
    the objects as hot objects
  • 5% for javac
  • The hot fields are within the first 2 pointers
  • 90% of the advised objects for javac

22
Online Profiling
  • PseudoJBB mutator results
  • Generate advice for 23% of the copied objects
  • 75% of the objects have advised hot fields other
    than the first 2

23
Questions
  • Have we found all the hot objects?
  • Not all hot objects are connected?
  • Is class-based good enough?
  • For pseudojbb, we need instance-based?
  • Locality for the nursery objects?

24
Future Work
  • Sampling technique
  • Catch more hot object accesses
  • Lower the threshold
  • Hot objects that are not connected
  • Dynamically change the advice for phase changes
  • Nursery locality
  • Different traversal orders for cold objects
  • Instance-based

25
Conclusion
  • Reordering objects during copying collection can
    improve locality
  • Among class-oblivious traversal orders, partial depth
    first order is the best
  • Class-based traversal with online profiling is
  • more flexible, up to 50% better
  • very low overhead, 0%
  • Still mysteries

26
Questions?

27
Answers?
  • Lower the sampling threshold; sample more than just the
    hot methods
  • For objects with only 1 or 2 pointers, it may be
    easier to just use depth first
  • Maybe the nursery locality is more important
  • Instance-based advice

28
Online Profiling
  • Execution overhead

29
Online Profiling
  • Micro benchmark results for mutator time

30
Different Root Traversal Policies
[Figure: _227_mtrt]

31
Static Profiling
  • Results

32
Answers?
  • Most objects have only one pointer
  • Percentage of objects copied by advice (are they
    really hot?)
  • For pseudojbb 50%, for jess <<1%, for our micro
    benchmark 16%
  • Change! Half of the pairs do not form chains
    longer than 2
  • Maybe the nursery locality is more important

33
Class Oblivious Orderings
  • Different traversal policies
  • Partial DF is better

[Figure: pseudoJBB]

34
Motivation
  • MarkSweep vs. Copying Collector

[Figure: mutator time of _213_javac]

35
Motivation
[Figure: mutator L2 misses of _213_javac]