Title: GHC
1. GHC's Garbage Collector
- Simon Marlow
- Simon Peyton Jones
- Microsoft Research, Cambridge
2. Parameters
- Very fast allocation ( 1Gb/s)
- High infant mortality
- Truly mutable objects are rare
  - updates are common (single-mutation), but usually happen to young objects
- Lightweight Haskell threads
- Weak pointers, finalizers, pinning
3. A block allocation layer
- We chose to build the storage manager on top of a block layer for flexibility.
- Block allocator provides blocks of memory (e.g. 4k) singly or in contiguous groups.
- Block allocator manages the free list.
[Diagram: layers, top to bottom: Storage Manager; Block Allocator; malloc() / mmap()]
4. Block descriptors
- Each block has a fixed table of data: the block descriptor.

    struct bdescr {
        void *start;   /* start of the block */
        void *free;    /* 1st free byte in the block */
        void *link;    /* chains blocks together (or links to head of group) */
        int blocks;    /* number of blocks in group (0 if not the head) */
    };

    bdescr *Bdescr(void *);
    bdescr *allocBlocks(int blocks);
    void freeBlocks(bdescr *);

- Bdescr() maps an address to its block descriptor in a few instructions, with no memory accesses.
5. Where do block descriptors live?
- Choice 1: at the start of a block.
  - Easy to find the descriptor from an address.
  - Bad for cache behaviour.
  - Harder to group blocks together for large objects.
6. Where do block descriptors live? (cont.)
- Choice 2: the location of the descriptor is some function of the address of the block.
  - Harder to find the descriptor from an address.
  - We can keep block descriptors together, which is better for cache behaviour.
  - Groups of consecutive blocks are easier: the non-head blocks can have descriptors too.
7. Block Allocator (cont.)
[Diagram: a megablock of 2^m bytes, aligned to 2^m bytes, divided into Block 1, Block 2, ..., Block N of 2^k bytes each]
The block allocator requests memory from the
operating system in units of a Megablock, which
is divided into N blocks and block descriptors.
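The arithmetic behind this layout can be sketched in C. This is an illustration under stated assumptions, not GHC's actual code: the exponents (2^20-byte megablocks, 2^12-byte blocks to match the 4k example, 2^6-byte descriptors) are chosen here for concreteness, the function names are hypothetical, and the descriptor table is assumed to sit at the start of the megablock with one descriptor per block-sized slot.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes, for illustration only: megablock = 2^20 bytes,
 * block = 2^12 bytes (4k), block descriptor = 2^6 bytes. */
enum { MBLOCK_SHIFT = 20, BLOCK_SHIFT = 12, BDESCR_SHIFT = 6 };

/* Number of block-sized slots in one megablock. */
size_t slots_per_mblock(void) {
    assert(BDESCR_SHIFT < BLOCK_SHIFT && BLOCK_SHIFT < MBLOCK_SHIFT);
    return (size_t)1 << (MBLOCK_SHIFT - BLOCK_SHIFT);
}

/* Slots taken up by the descriptor table (one descriptor per slot),
 * rounded up to a whole number of blocks. */
size_t descriptor_slots(void) {
    size_t table_bytes = slots_per_mblock() << BDESCR_SHIFT;
    size_t block_bytes = (size_t)1 << BLOCK_SHIFT;
    return (table_bytes + block_bytes - 1) / block_bytes;
}

/* N: the blocks left over for the storage manager to use. */
size_t usable_blocks(void) {
    return slots_per_mblock() - descriptor_slots();
}
```

With these assumed sizes the descriptor table costs 4 of the 256 slots, so the space overhead of keeping descriptors together is under 2%.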
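8. Block Allocator (cont.)
[Diagram: the same megablock (2^m bytes, aligned to 2^m bytes; Block 1, Block 2, ..., Block N of 2^k bytes each), with block descriptors of 2^d bytes each]
Bdescr(p) = (((p & (2^m - 1)) >> k) << d) | (p & ~(2^m - 1))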
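The address-to-descriptor mapping can be written out as a small C function. This is a sketch only: the exponents M, K and D are assumed values picked for illustration, `bdescr_addr` is a hypothetical name, and the descriptor table is assumed to start at the megablock's base address.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed exponents, for illustration only: megablock 2^M bytes,
 * block 2^K bytes, descriptor 2^D bytes. */
enum { M = 20, K = 12, D = 6 };

/* Address of the descriptor for the block containing p, assuming the
 * descriptor table sits at the start of p's megablock. Pure bit
 * arithmetic on the address: no memory is read. */
uintptr_t bdescr_addr(uintptr_t p) {
    uintptr_t mblock_mask = ((uintptr_t)1 << M) - 1;
    uintptr_t index = (p & mblock_mask) >> K;  /* block index within megablock */
    uintptr_t base  = p & ~mblock_mask;        /* start of the megablock */
    return base | (index << D);
}

/* Self-check: every address inside one block maps to the same descriptor. */
void bdescr_self_check(void) {
    uintptr_t block5 = (uintptr_t)0x40000000u + 5u * 4096u;
    assert(bdescr_addr(block5) == bdescr_addr(block5 + 4095u));
    assert(bdescr_addr(block5) == (uintptr_t)0x40000000u + 5u * 64u);
}
```

Because the megablock is aligned to its own size, masking the low M bits recovers its base without consulting any table.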
9. Linear vs. block-structured
- Advantages of a block-structured heap:
  - Memory can be recycled quickly: less wastage, better cache behaviour
  - Flexible: dynamic resizing of generations is easy
  - Large objects can be stored in their own blocks, and managed separately.
- Disadvantages:
  - may be slightly slower
  - small amounts of wastage at the end of each block
  - Deciding whether an object is heap-allocated or not may be harder
10. The GC proper
- Highly flexible: runtime selection of
  - number of generations
  - aging within a generation
  - generation sizes
  - sizing policy for nursery: fixed size or use whatever's left
  - compaction or copying for the oldest generation, or automatically choose based on residency
11. Our approach to the write barrier
- Objects which contain old-to-new pointers are kept on a linked list.
- These include:
  - Updates: when an update writes to an old-gen object, chain the object onto the mutable list for that generation.
  - Truly mutable objects (mutable references, mutable arrays, thread stacks) are kept on the mutable list for the generation (alternative: we could track writes to mutable objects).
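The update case can be sketched as follows. Everything here is hypothetical scaffolding for illustration: the `Obj` layout, the `on_mut_list` flag (added so an object is never chained twice), the `MAX_GENS` bound and the `update` function are assumptions, not the runtime's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object layout, for illustration only. */
typedef struct Obj {
    int gen;               /* generation the object lives in (0 = youngest) */
    bool on_mut_list;      /* assumption: flag to avoid chaining twice */
    struct Obj *mut_link;  /* next object on this generation's mutable list */
    struct Obj *field;     /* the pointer field overwritten by an update */
} Obj;

enum { MAX_GENS = 4 };     /* assumed upper bound on generations */
Obj *mut_list[MAX_GENS];   /* one mutable-list head per generation */

/* Write barrier for an update: perform the write, and if the updated
 * object lives in an old generation, chain it onto that generation's
 * mutable list so the next GC can find its old-to-new pointer. */
void update(Obj *obj, Obj *new_value) {
    assert(obj != NULL && obj->gen >= 0 && obj->gen < MAX_GENS);
    obj->field = new_value;
    if (obj->gen > 0 && !obj->on_mut_list) {
        obj->mut_link = mut_list[obj->gen];
        mut_list[obj->gen] = obj;
        obj->on_mut_list = true;
    }
}
```

Updates to objects in the youngest generation pay only for the generation check, which matters because most updates happen to young objects.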
12. Aging
- It is well known that objects should be aged in a generation to avoid premature promotion.
- We divide a generation into steps: objects in step N have survived N GCs in that generation.
- During a GC, objects are copied to the next step.
13. Aging
- No write barrier between steps: all steps in a gen are collected together.
[Diagram: Generation 0]
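Aging by steps amounts to a rule for where a surviving object is copied. The sketch below assumes that an object leaving the last step of a generation moves to step 0 of the next generation, and that objects in the last step of the oldest generation stay put; the `Place` type and `next_place` function are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Hypothetical location of an object: (generation, step). */
typedef struct { int gen; int step; } Place;

/* Destination of an object that survives a GC.
 * steps_in_gen[g] is the number of steps configured for generation g. */
Place next_place(Place cur, const int *steps_in_gen, int num_gens) {
    assert(cur.gen >= 0 && cur.gen < num_gens);
    assert(cur.step >= 0 && cur.step < steps_in_gen[cur.gen]);
    Place dest = cur;
    if (cur.step + 1 < steps_in_gen[cur.gen]) {
        dest.step = cur.step + 1;   /* age within the same generation */
    } else if (cur.gen + 1 < num_gens) {
        dest.gen = cur.gen + 1;     /* assumed: promote to the next generation */
        dest.step = 0;
    }
    /* Otherwise the object is in the last step of the oldest generation
     * and its place does not change. */
    return dest;
}
```

With two steps in generation 0, an object must survive two collections before it reaches generation 1.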
14. Eager promotion
- Collecting gen 0: A is alive, but there is no point aging it.
- Eager promotion: we evacuate A directly to the generation of the object which points to it.
[Diagram: Generation 0 (Step 0, Step 1) and Generation 1, with objects A and B]
15. Eager promotion
- Old-to-new pointers arise from updates:
  - either an update of an old-gen object, or
  - promotion of an updated object (A updated to point to B in a younger step, A gets promoted).
[Diagram: Step 0, Step 1 and Generation 1, with objects A and B]
16. Eager promotion
- In general, old-to-new pointers may span several steps/generations, so we might be able to avoid multiple copies.
- Not always possible: the target object may already have been copied during this GC, and we can't move an object twice during GC (other objects may already point to the new location).
- Eager promotion itself may introduce old-to-new pointers, due to leapfrogging.
- We might decide to eagerly promote certain objects by type, e.g. large arrays.
17. Mutable objects and eager promotion
- Should we treat objects pointed to by mutable objects as candidates for eager promotion? NO!
- Only consider immutable old-to-new pointers.
[Diagram: Step 0, Step 1 and Generation 1, with objects A, B (shown twice) and C]
18. Eager promotion
- So we keep two lists per generation:
  - mutable objects
  - immutable objects with old-to-new ptrs, which are subject to eager promotion
- Both lists are traversed on every GC
19. Eager promotion
- We want to promote an object to the oldest gen that references it.
- We can't move an object more than once during GC, and it is impractical to find all the references before moving it.
- So scavenge pointers from oldest gens first, as far as possible.
- We counted failed promotions and tweaked the ordering to minimize them.
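The interaction between scavenging order and failed promotions can be illustrated with a toy model. The `Obj` fields, the `evacuate` signature and the `failed_promotions` counter are all hypothetical; the sketch only models which generation an object ends up in, not the copying itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-object GC state, for illustration only. */
typedef struct {
    int gen;         /* generation the object lived in before this GC */
    bool forwarded;  /* already copied during this GC? */
    int new_gen;     /* generation of the copy (valid once forwarded) */
} Obj;

int failed_promotions;  /* wanted an older gen, but the object had already moved */

/* Evacuate obj on behalf of an immutable referrer living in referrer_gen.
 * aged_gen is where ordinary aging would send obj. Returns the generation
 * the object ends up in. */
int evacuate(Obj *obj, int aged_gen, int referrer_gen) {
    if (obj->forwarded) {
        /* Can't move an object twice: note the missed opportunity. */
        if (obj->new_gen < referrer_gen) failed_promotions++;
        return obj->new_gen;
    }
    int dest = aged_gen > referrer_gen ? aged_gen : referrer_gen;
    obj->forwarded = true;
    obj->new_gen = dest;
    return dest;
}
```

If an object is referenced from both gen 1 and gen 2, reaching it from gen 2 first places it in gen 2 with no failure; reaching it from gen 1 first leaves it in gen 1 and the later gen 2 reference counts as a failed promotion.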
20. Large objects
- Large objects (e.g. arrays, thread stacks) are given block groups to themselves.
- We keep large objects in linked lists attached to each step, to avoid copying.
- Pinned objects, for passing to foreign language calls, can also be handled this way.
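"Evacuating" a large object then reduces to relinking it from one step's list to another while the object itself stays at the same address. The sketch below assumes a doubly linked list so that unlinking is constant-time; the `LargeObj` and `Step` types and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node embedded in each large object. */
typedef struct LargeObj {
    struct LargeObj *prev, *next;
} LargeObj;

/* Hypothetical step: just the head of its large-object list. */
typedef struct { LargeObj *large_objects; } Step;

/* Push o onto the front of a step's large-object list. */
void link_large(Step *to, LargeObj *o) {
    o->prev = NULL;
    o->next = to->large_objects;
    if (o->next != NULL) o->next->prev = o;
    to->large_objects = o;
}

/* Remove o from the list of the step it currently belongs to. */
void unlink_large(Step *from, LargeObj *o) {
    if (o->prev != NULL) {
        o->prev->next = o->next;
    } else {
        assert(from->large_objects == o);
        from->large_objects = o->next;
    }
    if (o->next != NULL) o->next->prev = o->prev;
    o->prev = o->next = NULL;
}

/* Move a large object between steps without copying its payload. */
void evacuate_large(Step *from, Step *to, LargeObj *o) {
    unlink_large(from, o);
    link_large(to, o);
}
```

Since the address never changes, the same mechanism suits pinned objects whose address has been handed to foreign code.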
21. Tuning the GC
- Default setup:
  - 2 generations
  - 2 steps in gen 0
  - allocation area (gen 0, step 0) fixed at 256k
  - gen 1 collected when it doubles in size (minimum 1M)
  - heap size unlimited
  - copying (not compaction) for the oldest gen
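The default trigger for collecting gen 1 can be read as a simple threshold rule. The sketch below is one interpretation, assuming "doubles in size" is measured against the live data left after the previous gen 1 collection; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_OLD_GEN_BYTES ((size_t)1024 * 1024)  /* the 1M minimum */

/* Size at which gen 1 is next collected, given the bytes that were
 * live after its previous collection (assumed interpretation). */
size_t next_collection_threshold(size_t live_bytes) {
    size_t doubled = live_bytes * 2;
    assert(doubled >= live_bytes);  /* guard against overflow */
    return doubled > MIN_OLD_GEN_BYTES ? doubled : MIN_OLD_GEN_BYTES;
}
```

Under this reading, a program with 3M of live data in gen 1 would next collect it at 6M, while a program with only 100k live would wait until gen 1 reaches the 1M minimum.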
22. Tuning
- Defaults are biased in favour of reducing memory consumption. Using more memory can give big speedups: e.g. -H64M says use at least 64M for the heap (using heuristics to keep within the bounds).
- Keeping the alloc area small to stay in the cache doesn't help much.
- 3 or more gens occasionally helps.
- Maximum heap size can be specified: the GC will switch to compaction when near the limit.
- Compaction is about 2x slower than copying.
- We haven't done any rigorous measurements to find better sets of parameters.
23. Conclusions
- Block layer: good
- Eager promotion: good (but perhaps only relevant with single mutation?)
- Too many knobs!