Title: Type Systems For Distributed Data Sharing
1. Type Systems For Distributed Data Sharing
- Ben Liblit, Alex Aiken, and Katherine Yelick
- University of California, Berkeley
2. Why Shared/Private Matters
- Data location management
- Cache coherence
- Race condition detection
- Program/algorithm documentation
- Consistency model relaxation
- Synchronization elimination
- Autonomous garbage collection
- Security
3. Highlights of This Talk
- Review of underlying memory model
- Originally in Liblit et al., POPL '00
- Captures representation but not sharing
- Suite of type systems for data sharing
- One size does not fit all
- Overview of type inference
- Selected experimental findings
4. Distributed Memory Model
- Multiple machines, each with local memory
- Global memory is union of local memories
- Distinguish two types of pointers
- Local points to local memory only: ⟨address⟩
- Global points anywhere: ⟨machine, address⟩
- Different representations and operations
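The two pointer representations above can be sketched in Python as follows. This is only an illustration: the class names, the `widen` and `narrow` helpers, and the use of plain integers for machine IDs and addresses are assumptions, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalPtr:
    """Hypothetical local pointer: a bare address, meaningful only on its own machine."""
    address: int


@dataclass(frozen=True)
class GlobalPtr:
    """Hypothetical global pointer: a (machine, address) pair, meaningful anywhere."""
    machine: int
    address: int


def widen(ptr: LocalPtr, here: int) -> GlobalPtr:
    """Coerce a local pointer to a global one by tagging it with the current machine."""
    return GlobalPtr(here, ptr.address)


def narrow(ptr: GlobalPtr, here: int) -> LocalPtr:
    """Coerce a global pointer back to local; requires a run-time locality check."""
    if ptr.machine != here:
        raise ValueError("global pointer does not refer to local memory")
    return LocalPtr(ptr.address)
```

Widening always succeeds, while narrowing can fail at run time, which is one way to read "different representations and operations".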
5. Type Grammar
- Integers and pointers (unboxed pairs in the paper)
- All indirection (boxing) is explicit
- Pointers are either local or global
- Coercion, not subtyping
6. Review of Global Dereferencing: Standard Approach Unsound
[Diagram: x, the value 5, and the result of dereferencing x]
7. Review of Global Dereferencing: Standard Approach Unsound
[Diagram: x, the value 5, and the result of dereferencing x, with one element marked "?"]
8. Review of Global Dereferencing: Sound With Type Expansion
[Diagram: x, the value 5, and the result of dereferencing x]
9. Review of Global Dereferencing: Sound With Type Expansion
10. Representation Versus Sharing
- Consider obvious assumptions
- local pointers address private data
- global pointers address shared data
11. Representation Versus Sharing
- Locally pointed-to data might not be private
12. Representation Versus Sharing
- Locally pointed-to data might not be private
- Because of local / global aliasing
[Diagram: x and the value 5]
13. Representation Versus Sharing
- Locally pointed-to data might not be private
- Because of transitivity and pointer widening
[Diagram: y, the value 5, and the result of dereferencing y]
14. Representation Versus Sharing
- Globally pointed-to data might not be shared
- What if the dereference of y never actually happens?
[Diagram: y, the value 5, and the result of dereferencing y]
15. Distinct, but not Independent
- Local pointer to shared data ✓
- Local pointer to private data ✓
- Global pointer to shared data ✓
- Global pointer to private data ?!?
- Several possible approaches
- Determines what private really means
- Determines which clients can benefit
16. Sharing Qualifiers
- Polymorphism needed in practice
- Mixed: supertype of shared and private
- Local access only, but assume others may be watching
- Top, but no bottom
[Diagram: qualifier lattice with mixed above shared and private]
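The qualifier ordering described on this slide can be sketched as a small Python helper. The function names are hypothetical; the only facts assumed are the ones stated above: mixed is a supertype of both shared and private, and there is no bottom element.

```python
SHARED, PRIVATE, MIXED = "shared", "private", "mixed"


def qualifier_leq(a, b):
    """True when qualifier a is a subtype of qualifier b (mixed is the top)."""
    return a == b or b == MIXED


def join(a, b):
    """Least upper bound: equal qualifiers stay put, anything else goes to mixed."""
    return a if a == b else MIXED


def meet(a, b):
    """Greatest lower bound, or None when no common lower bound exists."""
    if a == b:
        return a
    if a == MIXED:
        return b
    if b == MIXED:
        return a
    return None  # shared and private have nothing below them both: no bottom
```

Because `meet(SHARED, PRIVATE)` has no answer, code that must work on either kind of data has to be typed at mixed, which permits local access only.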
17. Augmented Type Grammar
- Allow subtyping of pointers
- But not across pointers, since we allow assignment
- Allocation is explicitly shared or private
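One possible reading of the subtyping rule above, sketched in Python. The type constructors (`Int`, `Pair`, `Boxed`) and the exact rule are assumptions made for illustration: a pointer's own sharing qualifier may be weakened toward mixed, but the type it points to must match exactly, because the target can be assigned through the pointer. Treating unboxed pairs as covariant is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    """Integer type."""


@dataclass(frozen=True)
class Pair:
    """Unboxed pair of two component types."""
    first: object
    second: object


@dataclass(frozen=True)
class Boxed:
    """Pointer type: width is 'local' or 'global'; sharing is a sharing qualifier."""
    width: str
    sharing: str
    target: object


def qualifier_leq(a, b):
    return a == b or b == "mixed"


def is_subtype(s, t):
    """Hypothetical subtype check: qualifier may widen, pointed-to type is invariant."""
    if isinstance(s, Boxed) and isinstance(t, Boxed):
        return (s.width == t.width
                and qualifier_leq(s.sharing, t.sharing)
                and s.target == t.target)  # no subtyping across the pointer
    if isinstance(s, Pair) and isinstance(t, Pair):
        return is_subtype(s.first, t.first) and is_subtype(s.second, t.second)
    return s == t
```

Under this sketch a pointer to private data can be used where a pointer to mixed data is expected, but a pointer to such a pointer cannot be reinterpreted the same way.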
18. Late Enforcement: Limited Use of Global Pointers
19. Late Enforcement Applicability
- Data location management
- Cache coherence
- Race condition detection
- Program/algorithm documentation
- Consistency model relaxation
- Synchronization elimination
- Autonomous garbage collection (in practice)
- Security
20. Why Garbage Collection Breaks
- Locally allocate some private data
21. Why Garbage Collection Breaks
- Locally allocate some private data
- Send its address to another machine
22. Why Garbage Collection Breaks
- Forget the original local pointer
23. Why Garbage Collection Breaks
- Forget the original local pointer
- Garbage collect unreachable private data
24. Why Garbage Collection Breaks
- Later, retrieve the global pointer
- Coerce back to local (runtime check)
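The failure sequence on the preceding slides can be replayed with a toy model. Everything here is a simplification invented for illustration: the `Machine` class, its root set, and a collector that looks only at local roots (which is what "autonomous" is taken to mean in this sketch).

```python
class Machine:
    """Toy machine with a local heap and a set of locally held addresses."""

    def __init__(self, ident):
        self.ident = ident
        self.heap = {}       # address -> value
        self.roots = set()   # addresses reachable through local pointers
        self.next_address = 0

    def allocate(self, value):
        addr = self.next_address
        self.next_address += 1
        self.heap[addr] = value
        self.roots.add(addr)
        return addr

    def collect(self):
        """Autonomous collection: consults this machine's roots only."""
        for addr in list(self.heap):
            if addr not in self.roots:
                del self.heap[addr]

    def narrow_and_read(self, global_ptr):
        machine, addr = global_ptr
        if machine != self.ident:   # the run-time locality check
            raise ValueError("not a local address")
        return self.heap[addr]      # KeyError if the cell was collected


def run_scenario(forget=True):
    owner = Machine(0)
    addr = owner.allocate(5)          # locally allocate some private data
    exported = (owner.ident, addr)    # send its address to another machine
    if forget:
        owner.roots.discard(addr)     # forget the original local pointer
    owner.collect()                   # collect "unreachable" private data
    try:
        return owner.narrow_and_read(exported)  # later, coerce back to local
    except KeyError:
        return "dangling"
```

The locality check passes in both cases; what goes wrong is that the collector never learned about the exported address.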
25. Export Enforcement: No Escape of Private Addresses
- Note that τ' might reference private data
- Autonomous garbage collection OK
- Security not OK
26. Early Enforcement: Shared is Transitively Closed
27. Recap of Enforcement Strategies
- Late enforcement
- Anything can point to anything
- Restricted global dereference and assignment
[Diagram: y and the values 5 and 3]
28. Recap of Enforcement Strategies
- Export enforcement
- Can only reveal shared addresses
- Still restrict global pointer operations
[Diagram: y and the values 5 and 3]
29. Recap of Enforcement Strategies
- Early enforcement
- Shared universe is transitively closed
- Global pointer restrictions trivially satisfied
[Diagram: y and the values 5 and 3]
30. Type Inference: Constraint Generation
- Type structure already known
- Including local / global
- Induce constraints on sharing qualifiers
- δ = shared from global deref / assign
- δ = δ' from assignments
- δ ≤ δ' from various other operations
- Stricter enforcement adds more constraints
- δ = shared ⇒ δ' = shared
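31. Type Inference: Constraint Resolution
[Diagram: constraint graph over δ, δ1, δ2, private, and shared]
- Given constraints
- Constraints relating δ to δ1 and shared to δ1
- Constraints relating δ to δ2 and private to δ2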
32Type InferenceConstraint Resolution
d1 ? mixed
d2 ? shared
private
shared
d ? shared
- Two minimal solutions
- d ? shared Þ d1 ? mixed Ù d2 ? shared
33. Type Inference: Constraint Resolution
[Diagram: constraint graph with δ1 = private, δ2 = mixed, δ = private]
- Two minimal solutions
- δ = shared ⇒ δ1 = mixed ∧ δ2 = shared
- δ = private ⇒ δ1 = private ∧ δ2 = mixed
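A brute-force check of the "two minimal solutions" claim, as a Python sketch. The constraint set in `satisfies` is an assumption chosen so that the sketch reproduces the two solutions listed above (δ ≤ δ1, private ≤ δ1, δ ≤ δ2, shared ≤ δ2); it is not taken from the slides. The variable names `d`, `d1`, `d2` stand for δ, δ1, δ2.

```python
from itertools import product

QUALIFIERS = ("shared", "private", "mixed")


def leq(a, b):
    """Qualifier ordering: mixed is above both shared and private."""
    return a == b or b == "mixed"


def satisfies(d, d1, d2):
    """Assumed constraint set; see the note above."""
    return (leq(d, d1) and leq("private", d1)
            and leq(d, d2) and leq("shared", d2))


def minimal_solutions():
    """All satisfying (d, d1, d2) triples with no strictly smaller solution."""
    solutions = [s for s in product(QUALIFIERS, repeat=3) if satisfies(*s)]

    def strictly_below(s, t):
        return s != t and all(leq(a, b) for a, b in zip(s, t))

    return [t for t in solutions
            if not any(strictly_below(s, t) for s in solutions)]
```

With no bottom qualifier, neither minimal solution is below the other, so there is no single least solution and the solver has to pick a bias.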
34. Type Inference: Biased Constraint Resolution
[Diagram: constraint graph over δ, δ1, δ2, private, and shared; shared reaches δ2]
- Push shared and mixed forward
35. Type Inference: Biased Constraint Resolution
[Diagram: constraint graph over δ, δ1, δ2, private, and shared; shared reaches δ2]
- Push shared and mixed forward
- Identify qualifiers which cannot be private
36. Type Inference: Biased Constraint Resolution
[Diagram: constraint graph with δ1 = private and δ = private; shared reaches δ2]
- Push shared and mixed forward
- Identify qualifiers which cannot be private
- Set all other qualifiers to private
37. Type Inference: Biased Constraint Resolution
[Diagram: constraint graph with δ1 = private and δ = private; both shared and private reach δ2]
- Identify qualifiers which cannot be private
- Set all other qualifiers to private
- Push private forward
38. Type Inference: Biased Constraint Resolution
[Diagram: constraint graph with δ1 = private, δ2 = mixed, δ = private]
- Set all other qualifiers to private
- Push private forward
- Set remaining qualifiers to shared or mixed
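The five steps above can be sketched as a small graph algorithm in Python. This is a simplified illustration, not the Titanium implementation: it assumes every constraint has the form lower ≤ upper with a variable on the right, it ignores equality and conditional constraints, and the function and variable names are invented.

```python
from collections import defaultdict

SHARED, PRIVATE, MIXED = "shared", "private", "mixed"


def join(a, b):
    """Least upper bound; None means no lower bound has arrived yet."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a == b else MIXED


def solve_biased(edges, variables):
    """edges: (lower, upper) pairs meaning lower <= upper; upper must be a variable."""
    successors = defaultdict(list)
    for lower, upper in edges:
        successors[lower].append(upper)

    # Steps 1-2: push shared and mixed forward; whatever they reach cannot be private.
    cannot_be_private, stack = set(), [SHARED, MIXED]
    while stack:
        node = stack.pop()
        for nxt in successors[node]:
            if nxt not in cannot_be_private:
                cannot_be_private.add(nxt)
                stack.append(nxt)

    # Step 3: set all other qualifiers to private.
    value = {v: None if v in cannot_be_private else PRIVATE for v in variables}
    value.update({SHARED: SHARED, PRIVATE: PRIVATE, MIXED: MIXED})

    # Steps 4-5: push private (and everything else) forward until nothing changes;
    # each remaining qualifier ends up as the join of what reaches it.
    changed = True
    while changed:
        changed = False
        for lower, upper in edges:
            if upper in cannot_be_private:
                new = join(value[upper], value[lower])
                if new != value[upper]:
                    value[upper] = new
                    changed = True
    return {v: value[v] for v in variables}
```

For example, with the assumed constraint set δ ≤ δ1, private ≤ δ1, δ ≤ δ2, shared ≤ δ2, the call `solve_biased([("d", "d1"), (PRIVATE, "d1"), ("d", "d2"), (SHARED, "d2")], ["d", "d1", "d2"])` selects the private-biased assignment shown on this slide.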
39. Implementation For Titanium
- Java SPMD extensions
- Objects, classes, interfaces, methods
- Multidimensional arrays, templates
- Local / global, communications primitives
- Sharing validation as type checking
- Sharing inference as compiler analysis
- Late or early enforcement
- Whole-program or partial
40. Experimental Findings: Consistency Model Relaxation
- Titanium has very weak consistency model
- Sequential model preferred, but too slow?
- Sequential is overkill for private data
- Weakly consistent on private data
- Sequentially consistent on shared data
- Compare to weak and fully sequential models
- Four-way Pentium III SMP at 550 MHz
41. Experimental Findings: Consistency Model Relaxation
42. Experimental Findings: Data Location Management
- Tally allocations by type at run time
- Tremendous variation
- 1% - 100% of allocated bytes are private
- 45% in large gas benchmark
- Sensitivity to enforcement policy
- amr: 74% late / 19% early
43. Summary
- Private might not mean what you think
- Generalize on earlier (often implicit) designs
- Amenable to efficient type inference
- Experimental implementation
- Ideas and algorithms scale to a real system
- More aggressive clients needed
- Potential for stronger, phase-aware inference