Title: Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms
1 Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms
Martin T. Vechev
David F. Bacon
Eran Yahav
University of Cambridge
IBM T.J. Watson Research Center
PLDI June 2006
2 Why Concurrent Garbage Collection?
- Java and C#
  - Garbage-collected languages are prevalent
- Multicore
  - Concurrency is becoming prevalent
- Cheap RAM
  - Large heaps are becoming prevalent
- Real-Time Systems
  - More widely used
3 Existing Way to Create a Concurrent GC
REQUIREMENTS: Throughput, Memory Consumption, Pause Time
ENVIRONMENT: Memory Model, Thread Model, Concurrency Primitives, CPU Primitives
TECHNIQUES:
- Tracing / reference counting
- moving
- Allocate White / Black
- Dijkstra / Steele / Yuasa Barrier
- Atomic / Incremental Stack Snapshot
- Write Barrier Atomic / Non-atomic
- Color toggle, stacklets, etc.
??
- Hard to verify/test
- Often buggy
- Did the monkey choose well??
Implementation
4 Concurrent GC algorithms and proofs are hard
[Figure: prior work placed under the labels FAMILY, ALGORITHMS, PROOFS, and THEOREM PROVING. Entries: Steele(C) 75, Dijkstra(C) 78, Ben-Ari Base 84 (shown twice), Ben-Ari Extended 84, Pixley 88, Yuasa 90, Boehm 91, Doligez(C) 93, Doligez 94, Azatchi 03, Barabash 03, Domani 03.]
5 Our Research Vision
ENVIRONMENT (Declarative Specification): Memory Model, Thread Model, Concurrency Primitives, CPU Primitives
REQUIREMENTS: Throughput, Memory Consumption, Pause Time
Automated System: Formally Defined Techniques
Optimal Correct Implementation
6 In This Work
FIXED ENVIRONMENT: Memory Model, Thread Model, Concurrency Primitives, CPU Primitives
REQUIREMENTS: Throughput, Pause Time, Memory Consumption
Automated System: Formally Defined Techniques for Tracing Non-Moving GC
Algorithm 1 < Algorithm 2 < Algorithm 3 < Algorithm N
7 Problem: Interference
[Figure: objects A, B, and C shown under the labels SYSTEM, MUTATOR, GC]
1. GC traced B
8 Problem: Interference
[Figure: objects A, B, and C shown under the labels SYSTEM, MUTATOR, GC]
1. GC traced B
2. Mutator links C to B
9 Problem: Interference
[Figure: objects A, B, and C shown under the labels SYSTEM, MUTATOR, GC; an X marks the removed link]
1. GC traced B
2. Mutator links C to B
3. Mutator unlinks C from A
10 Problem: Interference
[Figure: objects A, B, and C shown under the labels SYSTEM, MUTATOR, GC; annotated C LOST]
1. GC traced B
2. Mutator links C to B
3. Mutator unlinks C from A
4. GC traced A
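The four steps can be replayed in a few lines. The sketch below is a toy model rather than code from the talk: the `Obj` class, the field names `f` and `g`, the explicit worklist, and the hand-written interleaving are all illustrative assumptions. It runs a mark phase without any write barrier and shows that C ends up unmarked although it is still reachable through B.

```python
class Obj:
    """Toy heap object with named pointer fields (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.fields = {}
        self.marked = False

    def __repr__(self):
        return self.name


def trace(obj, worklist):
    """Collector step: scan every field of obj and queue unmarked targets."""
    for target in obj.fields.values():
        if target is not None and not target.marked:
            target.marked = True
            worklist.append(target)


def reachable(roots):
    """Ground truth: everything reachable from the roots right now."""
    seen, stack = set(), list(roots)
    while stack:
        o = stack.pop()
        if o in seen:
            continue
        seen.add(o)
        stack.extend(t for t in o.fields.values() if t is not None)
    return seen


a, b, c = Obj("A"), Obj("B"), Obj("C")
a.fields["f"] = c              # initially A -> C, and B points nowhere
roots = [a, b]
for r in roots:
    r.marked = True            # roots start out marked

worklist = []
trace(b, worklist)             # 1. GC traced B
b.fields["g"] = c              # 2. Mutator links C to B
a.fields["f"] = None           # 3. Mutator unlinks C from A
trace(a, worklist)             # 4. GC traced A (the pointer to C is gone)
while worklist:
    trace(worklist.pop(), worklist)

live = reachable(roots)
lost = {o for o in live if not o.marked}
print(lost)                    # {C}
```

A sweep that frees unmarked objects would now reclaim C while B still points to it.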
11 The 3 Families of Concurrent GC Algorithms
1. Mark C when C is linked to B (DIJKSTRA)
2. Mark C when the link to C is removed (YUASA)
3. Rescan B when C is linked to B (STEELE)
[Figure: objects A, B, and C under each of the three solutions]
- Solutions are applied uniformly for all objects
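As a rough illustration, the three families can be written as three variants of a write barrier and replayed against the four-step interleaving from the interference slides. This is a minimal sketch under assumptions: the `run` helper, the `scanned` flag, and the barrier names as strings are hypothetical, and real barriers are also conditioned on collector phase and other details not modelled here.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.fields = {}
        self.marked = False
        self.scanned = False


def run(barrier):
    """Replay the lost-object interleaving; return True if C ends up marked."""
    a, b, c = Obj("A"), Obj("B"), Obj("C")
    a.fields["f"] = c
    worklist = []

    def shade(o):
        if o is not None and not o.marked:
            o.marked = True
            worklist.append(o)

    def scan(o):
        o.scanned = True
        for t in o.fields.values():
            shade(t)

    def write(src, name, new):
        old = src.fields.get(name)
        if barrier == "dijkstra":
            shade(new)                  # mark the newly linked object
        elif barrier == "yuasa":
            shade(old)                  # mark the object whose link is removed
        elif barrier == "steele":
            if src.scanned:             # source already traced: rescan it later
                src.scanned = False
                worklist.append(src)
        src.fields[name] = new

    a.marked = b.marked = True
    scan(b)                             # 1. GC traced B
    write(b, "g", c)                    # 2. Mutator links C to B
    write(a, "f", None)                 # 3. Mutator unlinks C from A
    scan(a)                             # 4. GC traced A
    while worklist:                     # finish any remaining marking work
        scan(worklist.pop())
    return c.marked


results = {name: run(name) for name in ("none", "dijkstra", "yuasa", "steele")}
print(results)
# {'none': False, 'dijkstra': True, 'yuasa': True, 'steele': True}
```

Each barrier rescues C at a different moment: at the link, at the unlink, or when B is rescanned.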
12 Contributions
- Systematic Exploration
  - A new parametric model of concurrent GC
  - Better understanding
  - New, potentially useful algorithms
- Formal Relationship between algorithms
  - Space - Relative precision between algorithms
- Sharing Proof Burden
  - Correctness-preserving transformations
13 A Parametric Concurrent GC Skeleton
- Intuition: factor out as much common structure as possible
- Record the interaction history between collector and mutator during tracing
- Collector exposes hidden objects based on the entire interaction history
14 A Parametric Concurrent GC Skeleton
[Figure: timeline of one complete garbage collection. COLLECTOR: mark, Expose(L,D), mark, Expose(L,D), reclaim. MUTATOR, running alongside: Change Heap, Change Heap.]
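The skeleton can be sketched as a collector loop that alternates marking with an expose step driven by a log of mutator actions. Everything here is an assumption made for illustration: reading `L` as the interaction log and `D` as the dimension settings is a guess, since the slide does not define them, and the names `store`, `collect`, `record_link`, and `record_unlink` are hypothetical. Concurrency is simulated by running one mutator step after each root is traced.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Obj:
    name: str
    fields: dict = field(default_factory=dict)
    marked: bool = False


def store(log, src, fld, new):
    """Mutator 'Change Heap' action: log (source, old, new), then write."""
    log.append((src, src.fields.get(fld), new))
    src.fields[fld] = new


def mark(pending):
    """Mark everything reachable from the pending objects."""
    stack = list(pending)
    while stack:
        o = stack.pop()
        if o.marked:
            continue
        o.marked = True
        stack.extend(t for t in o.fields.values() if t is not None)


def expose(log, dims):
    """Choose possibly hidden objects from the whole log (assumed reading)."""
    candidates = []
    for _src, old, new in log:
        if dims["record_link"]:
            candidates.append(new)
        if dims["record_unlink"]:
            candidates.append(old)
    return [o for o in candidates if o is not None and not o.marked]


def collect(roots, heap, dims, mutator_steps):
    log = []
    steps = iter(mutator_steps)
    for root in roots:
        mark([root])
        step = next(steps, None)        # mutator interleaves with tracing
        if step is not None:
            step(log)
    hidden = expose(log, dims)
    while hidden:                       # mark / expose until nothing is hidden
        mark(hidden)
        hidden = expose(log, dims)
    return [o.name for o in heap if not o.marked]   # reclaim


def demo(dims):
    a, b, c, d = Obj("A"), Obj("B"), Obj("C"), Obj("D")   # D is true garbage
    a.fields["f"] = c

    def mutate(log):
        store(log, b, "g", c)           # link C to B
        store(log, a, "f", None)        # unlink C from A

    return collect([b, a], [a, b, c, d], dims, [mutate])


print(demo({"record_link": True, "record_unlink": False}))    # ['D']
print(demo({"record_link": False, "record_unlink": False}))   # ['C', 'D']
```

With some recording enabled only the genuinely unreachable D is reclaimed; with no recording, the reachable C is reclaimed as well, which is the interference bug again.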
15 Dimensions: an intuition
- The effect of each Mutator/GC action is controlled by a dimension
Action                      Dimension
Collector Scans Pointer     Wavefront Granularity
Mutator Creates Pointer     Counting
Mutator Overwrites Pointer  Snapshot
Mutator Allocates Object    Allocation Color
16 Implementation Choice: Wavefront
- Per-Field Wavefront
  - Exact information
  - One bit per field
  - More expensive
  - More synchronization
  - More garbage collected
- Per-Object Wavefront
  - Approximate information
  - One bit per object
  - Less expensive
  - Less synchronization
  - Less garbage collected
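The difference between the two granularities can be sketched with a small model. It assumes that a store needs recording only if the collector may already have passed the written slot, and that the per-object bit is set as soon as scanning of the object begins. Both are assumed conservative readings rather than details stated on the slide, and the class and function names are hypothetical.

```python
class PerFieldWavefront:
    """Exact: one bit per field."""

    def __init__(self):
        self.scanned = set()            # (object id, field index) pairs

    def advance(self, obj, index):
        self.scanned.add((id(obj), index))

    def behind(self, obj, index):
        return (id(obj), index) in self.scanned


class PerObjectWavefront:
    """Approximate: one bit per object, set when its scan starts."""

    def __init__(self):
        self.started = set()            # object ids

    def advance(self, obj, index):
        self.started.add(id(obj))

    def behind(self, obj, index):
        return id(obj) in self.started


def must_record(wavefront, obj, index):
    """A store into obj[index] is recorded only if the slot may be scanned."""
    return wavefront.behind(obj, index)


obj = [None, None]                      # an object with two pointer fields
results = {}
for wf in (PerFieldWavefront(), PerObjectWavefront()):
    wf.advance(obj, 0)                  # collector has scanned field 0 only
    results[type(wf).__name__] = must_record(wf, obj, 1)

print(results)
# {'PerFieldWavefront': False, 'PerObjectWavefront': True}
```

The per-object variant records a store the collector would have seen anyway, which is how the cheaper bookkeeping can end up retaining more garbage.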
17 Choice: Record on Link or Unlink
- Record on Link
  - More synchronization
  - More garbage collected
- Record on Unlink
  - Less synchronization
  - Less garbage collected
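18 Combined Choices
[Figure: four diagrams of objects A and B, one for each combination of wavefront granularity (Per-Object WF, Per-Field WF) and recording choice (Record on Link, Record on Unlink)]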
19 Combined Choices Per Object
[Figure: grid of diagrams of objects A and B in which each object gets its own choice. Wavefront labels: Per-Obj A Per-Obj B; Per-Obj A Per-Field B; Per-Field A Per-Obj B; Per-Field A Per-Field B. Recording labels: Rec. Link A Rec. Link B; Rec. Link A Unlink B; Rec. Unlink A Rec. Link B; Rec. Unlink A Rec. Unlink B.]
20 Correctness
- Transformations are proof steps
[Figure: lattice of algorithms, annotated START WITH A CORRECT ALGORITHM, running from RETAIN LESS GARBAGE at the APEX end to RETAIN MORE GARBAGE at the YUASA end. Nodes:]
APEX (U, U, U, U, )
STEELE
DIJKSTRA (stacks,U,,U,)
STEELE-YC
STEELE-D
STEELE-D-YC
DIJKSTRA-OLD
DIJKSTRA-YC
STEELE-BC
DIJKSTRA-BC
HYBRID-YC (stacks,A,,,)
STEELE-D-BC
YUASA (stacks, A, , , U)
21 Relative Precision
- Intuition: an algorithm is more precise than another if it collects more garbage
- An algorithm that is less precise (more conservative) than a correct algorithm is guaranteed to be correct
- Should be a reference point for practical comparisons: no ad-hoc methods
- Hard to do manually; need a tool to provide insights
- Finding the right definition was harder than proving safety, yet simpler than relative concurrency
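One way to make the first two bullets concrete, offered only as a sketch: write $\mathit{Live}(\pi)$ for the objects still reachable at the end of a collection cycle in execution $\pi$, and $\mathit{Ret}_X(\pi)$ for the objects algorithm $X$ retains. These symbols are illustrative assumptions, not the definition used in the talk, and they gloss over how two concurrent algorithms are compared on "the same" execution.

```latex
\begin{align*}
A \text{ is correct}
  &\iff \forall \pi.\ \mathit{Live}(\pi) \subseteq \mathit{Ret}_A(\pi) \\
A \text{ is at least as precise as } B
  &\iff \forall \pi.\ \mathit{Ret}_A(\pi) \subseteq \mathit{Ret}_B(\pi) \\
\mathit{Live}(\pi) \subseteq \mathit{Ret}_A(\pi) \subseteq \mathit{Ret}_B(\pi)
  &\implies \mathit{Live}(\pi) \subseteq \mathit{Ret}_B(\pi)
\end{align*}
```

Under this reading, the second bullet is transitivity of set inclusion: if $A$ is correct and $B$ is less precise than $A$, then $B$ retains every live object too.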
22 Precision
[Figure: the same lattice of algorithms, running from MORE PRECISE at the APEX end to LESS PRECISE at the YUASA end. Nodes:]
APEX (U, U, U, U, )
STEELE
DIJKSTRA (stacks,U,,U,)
STEELE-YC
STEELE-D
STEELE-D-YC
DIJKSTRA-OLD
DIJKSTRA-YC
STEELE-BC
DIJKSTRA-BC
HYBRID-YC (stacks,A,,,)
STEELE-D-BC
YUASA (stacks, A, , , U)
23 Conclusions
- Systematic exploration of an algorithm space
- Useful new algorithms
- Formal definition of relative precision between algorithms
- A first step towards automatic derivation of concurrent garbage collectors
24 The End