1
GHC's Garbage Collector
  • Simon Marlow
  • Simon Peyton Jones
  • Microsoft Research, Cambridge

2
Parameters
  • Very fast allocation (~1 GB/s)
  • High infant mortality
  • Truly mutable objects are rare
  • updates are common (single-mutation), but
    usually happen to young objects
  • Lightweight Haskell threads
  • Weak pointers, finalizers, pinning

3
A block allocation layer
  • We chose to build the storage manager on top of a
    block layer for flexibility.
  • Block allocator provides blocks of memory (e.g.
    4k) singly or in contiguous groups.
  • Block allocator manages the free list.

[Diagram: layers, top to bottom: Storage Manager, Block Allocator, malloc() / mmap()]
4
Block descriptors
  • Each block has a fixed table of data: the block
    descriptor.

struct bdescr {
    void *start;    /* start of the block */
    void *free;     /* 1st free byte in the block */
    void *link;     /* chains blocks together (or links to head of group) */
    int blocks;     /* number of blocks in group (0 if not the head) */
};
bdescr *Bdescr (void *);
bdescr *allocBlocks (int blocks);
void freeBlocks (bdescr *);
Bdescr() maps an address to its block descriptor in a
few instructions, with no memory accesses.
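As a rough illustration of how the free field might be used, here is a minimal sketch of bump allocation inside a single block. The 4k block size, the char * field types and the helper name block_alloc are assumptions made for this example, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 4096          /* assumed block size (4k) */

typedef struct bdescr_ {
    char *start;                 /* start of the block */
    char *free;                  /* first free byte in the block */
    struct bdescr_ *link;        /* next block in a chain */
    int blocks;                  /* blocks in the group (0 if not the head) */
} bdescr;

/* Hypothetical helper: carve n bytes out of one block by advancing `free`.
   Returns NULL when the block does not have n bytes left. */
static void *block_alloc(bdescr *bd, size_t n)
{
    size_t used = (size_t)(bd->free - bd->start);
    assert(used <= BLOCK_SIZE);
    if (n > BLOCK_SIZE - used) {
        return NULL;             /* caller would chain on a fresh block */
    }
    void *p = bd->free;
    bd->free += n;
    return p;
}
```

When block_alloc returns NULL, an allocator along these lines would ask the block layer for another block and chain it on through link.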
5
Where do block descriptors live?
  • Choice 1: at the start of a block.
  • Easy to find the descriptor from an address.
  • Bad for cache behaviour.
  • Harder to group blocks together for large
    objects.

6
  • Choice 2: the location of the descriptor is some
    function of the address of the block.
  • Harder to find the descriptor from an address.
  • We can keep block descriptors together, better
    for cache behaviour
  • Groups of consecutive blocks are easier: the
    non-head blocks can have descriptors too.

7
Block Allocator (cont.)
[Diagram: a Megablock of 2^m bytes, aligned on a 2^m-byte boundary, holding Block 1, Block 2, ..., Block N of 2^k bytes each]
The block allocator requests memory from the
operating system in units of a Megablock, which
is divided into N blocks and block descriptors.
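To make the split between blocks and descriptors concrete, the sketch below counts how many usable blocks remain once room is reserved for one descriptor per block-sized slot. It assumes the descriptor table is rounded up to whole blocks, and that each descriptor is 2^d bytes; the function name and the example sizes are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Usable blocks N in a megablock of 2^m bytes, with blocks of 2^k bytes
   and descriptors of 2^d bytes (one descriptor per block-sized slot).
   Assumption: the descriptor table is rounded up to whole blocks. */
static size_t blocks_per_megablock(unsigned m, unsigned k, unsigned d)
{
    assert(d < k && k < m);
    size_t slots        = (size_t)1 << (m - k);   /* block-sized slots */
    size_t block_size   = (size_t)1 << k;
    size_t table_bytes  = slots << d;              /* all descriptors */
    size_t table_blocks = (table_bytes + block_size - 1) / block_size;
    return slots - table_blocks;
}
```

With 1M megablocks, 4k blocks and 64-byte descriptors (m = 20, k = 12, d = 6) this gives 256 slots, 4 of which hold descriptors, leaving N = 252.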
8
Block Allocator (cont.)
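[Diagram: a Megablock of 2^m bytes, aligned on a 2^m-byte boundary, holding block descriptors of 2^d bytes each and Block 1, Block 2, ..., Block N of 2^k bytes each]
Bdescr(p) = (((p & (2^m - 1)) >> k) << d) | (p & ~(2^m - 1))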
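The address calculation can be sketched in C as follows. The sizes (m = 20, k = 12, d = 6) and the name bdescr_addr are assumptions chosen for the example; the point is only that the mapping is shifts and masks on the pointer value.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, for illustration only:
   2^20-byte megablocks, 2^12-byte blocks, 2^6-byte descriptors. */
enum { M_BITS = 20, K_BITS = 12, D_BITS = 6 };

#define MBLOCK_MASK (((uintptr_t)1 << M_BITS) - 1)

/* Map any address inside a block to the address of that block's
   descriptor. Nothing is read from memory. */
static uintptr_t bdescr_addr(uintptr_t p)
{
    uintptr_t mblock_base = p & ~MBLOCK_MASK;            /* megablock start */
    uintptr_t block_index = (p & MBLOCK_MASK) >> K_BITS; /* block within it */
    return mblock_base | (block_index << D_BITS);        /* table entry */
}
```

Under these assumptions every address in the fourth block-sized slot of the megablock at 0x40000000 (offsets 0x3000 to 0x3FFF) maps to the descriptor at 0x400000C0.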
9
Linear vs. block-structured
  • Advantages of a block-structured heap
  • Memory can be recycled quickly: less wastage,
    better cache behaviour
  • Flexible: dynamic resizing of generations is easy
  • Large objects can be stored in their own blocks,
    and managed separately.
  • Disadvantages
  • may be slightly slower
  • small amounts of wastage at the end of each block
  • Deciding whether an object is heap-allocated or
    not may be harder

10
The GC proper
  • Highly flexible: runtime selection of
  • number of generations
  • aging within a generation
  • generation sizes
  • sizing policy for nursery: fixed size or use
    whatever's left
  • compaction or copying for the oldest generation,
    or automatically choose based on residency

11
Our approach to the write barrier
  • Objects which contain old-to-new pointers are
    kept on a linked list.
  • These include
  • Updates: when an update writes to an old-gen
    object, chain the object onto the mutable list
    for that generation.
  • Truly mutable objects (mutable references,
    mutable arrays, thread stacks) are kept on the
    mutable list for the generation (alternative: we
    could track writes to mutable objects).
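A minimal sketch of the update case, assuming a toy object with one pointer field. The object layout, the on_mut_list flag (used here to avoid chaining an object twice) and the name update are all hypothetical, introduced only to illustrate chaining onto a per-generation list.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_GENS = 4 };              /* assumed upper bound for the sketch */

typedef struct object_ {
    int gen;                        /* generation the object lives in */
    int on_mut_list;                /* hypothetical flag: already chained? */
    struct object_ *mut_link;       /* next object on the mutable list */
    struct object_ *field;          /* the single pointer field */
} object;

static object *mut_list[MAX_GENS];  /* one mutable list per generation */

/* Perform an update and apply the write barrier: an old-generation
   object that is written to is chained onto its generation's list. */
static void update(object *obj, object *new_value)
{
    assert(obj->gen >= 0 && obj->gen < MAX_GENS);
    obj->field = new_value;
    if (obj->gen > 0 && !obj->on_mut_list) {
        obj->mut_link = mut_list[obj->gen];
        mut_list[obj->gen] = obj;
        obj->on_mut_list = 1;
    }
}
```

At the next collection of a younger generation, the collector would walk mut_list for each older generation and treat the recorded objects as roots.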

12
Aging
  • It is well-known that objects should be aged in a
    generation to avoid premature promotion.
  • We divide a generation into steps: objects in
    step N have survived N GCs in that generation.
  • During a GC, objects are copied to the next step.
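The aging rule can be sketched as a function from an object's current (generation, step) to its destination. The slide only says objects are copied to the next step; moving from the last step into step 0 of the next generation, and staying put in the oldest generation, are assumptions of this sketch, as are all the names.

```c
#include <assert.h>

typedef struct {
    int gen;
    int step;
} location;

/* Where a surviving object goes during a GC.
   steps_in_gen[g] is the number of steps configured for generation g. */
static location next_location(location cur, const int *steps_in_gen, int n_gens)
{
    assert(cur.gen >= 0 && cur.gen < n_gens);
    location next = cur;
    if (cur.step + 1 < steps_in_gen[cur.gen]) {
        next.step = cur.step + 1;     /* age within the generation */
    } else if (cur.gen + 1 < n_gens) {
        next.gen = cur.gen + 1;       /* assumed: promote after last step */
        next.step = 0;
    }
    /* else: last step of the oldest generation, the object stays there */
    return next;
}
```

With two steps in gen 0 and one in gen 1, an object allocated in (gen 0, step 0) survives one GC into step 1 and is promoted to gen 1 on the next.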

13
Aging
  • No write barrier between steps: all steps in a
    gen are collected together.

Generation 0
14
Eager promotion
  • Collecting gen 0: A is alive, but there is no
    point aging it.
  • Eager promotion: we evacuate A directly to the
    generation of the object which points to it.

[Diagram: Generation 0 (Step 0, Step 1) and Generation 1, with objects A and B]
15
Eager promotion
  • Old-to-new pointers arise from updates
  • either an update of an old-gen object, or
  • promotion of an updated object (A updated to
    point to B in a younger step, A gets promoted).

[Diagram: Step 0, Step 1 and Generation 1, with objects A and B]
16
Eager promotion
  • In general, old-to-new pointers may span several
    steps/generations, so we might be able to avoid
    multiple copies.
  • Not always possible: the target object may
    already have been copied during this GC, and we
    can't move an object twice during GC (other
    objects may already point to the new location).
  • Eager promotion itself may introduce old-to-new
    pointers, due to leapfrogging.
  • We might decide to eagerly promote certain
    objects by type, e.g. large arrays.

17
Mutable objects and eager promotion
  • Should we treat objects pointed to by mutable
    objects as candidates for eager promotion? NO!
  • Only consider immutable old-to-new pointers.

[Diagram: Step 0, Step 1 and Generation 1, with objects A, B and C (B is drawn twice)]
18
Eager promotion
  • So we keep two lists per generation
  • mutable objects
  • immutable objects with old-to-new ptrs, which are
    subject to eager promotion
  • Both lists are traversed on every GC

19
Eager promotion
  • We want to promote an object to the oldest gen
    that references it.
  • Can't move an object more than once during GC,
    and impractical to find all the references before
    moving it.
  • So scavenge pointers from oldest gens first, as
    far as possible.
  • We counted failed promotions and tweaked the
    ordering to minimize them.
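The interaction between eager promotion and the move-at-most-once rule can be sketched like this. The object layout, the forward pointer, the caller-supplied to-space slot and the failed_promotions counter are all hypothetical; a real collector would allocate the copy itself.

```c
#include <assert.h>
#include <stddef.h>

typedef struct object_ {
    int gen;                    /* generation the object lives in */
    struct object_ *forward;    /* set once copied during this GC */
} object;

static int failed_promotions;   /* promotions we wanted but could not do */

/* Evacuate obj on behalf of a referrer living in referrer_gen.
   default_gen is where aging alone would put the object; `to` is a
   to-space slot supplied by the caller to keep the sketch small. */
static object *evacuate(object *obj, int referrer_gen, int default_gen,
                        object *to)
{
    if (obj->forward != NULL) {
        /* Already moved in this GC: cannot move again. */
        if (obj->forward->gen < referrer_gen) {
            failed_promotions++;
        }
        return obj->forward;
    }
    to->gen = referrer_gen > default_gen ? referrer_gen : default_gen;
    to->forward = NULL;
    obj->forward = to;
    return to;
}
```

If the young referrer is scavenged first, the later request from gen 1 fails and is counted; scavenging the oldest referrer first avoids that, which is the motivation for the ordering described above.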

20
Large objects
  • Large objects (e.g. arrays, thread stacks) are
    given block groups to themselves
  • We keep large objects in linked lists attached to
    each step, to avoid copying.
  • Also pinned objects, for passing to foreign
    language calls, can be handled this way.

21
Tuning the GC
  • Default setup
  • 2 generations
  • 2 steps in gen 0
  • allocation area (gen 0, step 0) fixed at 256k
  • gen 1 collected when it doubles in size (minimum
    1M)
  • heap size unlimited
  • copying (not compaction) for the oldest gen
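One possible reading of the gen 1 rule, sketched as a threshold function: after a collection leaves live_bytes in gen 1, the next collection is triggered at twice that size, but never below 1M. The function name and this exact interpretation of "doubles in size" are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_OLD_GEN_BYTES ((size_t)1024 * 1024)   /* the 1M minimum */

/* Size at which gen 1 is next collected, given the bytes that were
   live in it after the previous collection (assumed interpretation). */
static size_t next_gc_threshold(size_t live_bytes)
{
    size_t doubled = 2 * live_bytes;
    return doubled < MIN_OLD_GEN_BYTES ? MIN_OLD_GEN_BYTES : doubled;
}
```

So with 512k live the threshold stays at the 1M floor, while 3M live moves it to 6M.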

22
Tuning
  • Defaults biased in favour of reducing memory
    consumption. Using more memory can give big
    speedups, e.g. -H64M says use at least 64M for
    the heap (using heuristics to keep within the
    bounds).
  • Keeping the alloc area small to stay in the cache
    doesn't help much.
  • 3 or more gens occasionally helps
  • Maximum heap size can be specified: the GC will
    switch to compaction when near the limit.
  • Compaction is about 2x slower than copying
  • We haven't done any rigorous measurements to find
    better sets of parameters.

23
Conclusions
  • Block layer good
  • Eager promotion good (but perhaps only relevant
    with single mutation?)
  • Too many knobs!