Title: STM in Managed Runtimes: High-Level Language Semantics (MICRO '07 Tutorial)
1 STM in Managed Runtimes: High-Level Language Semantics (MICRO '07 Tutorial)
- Dan Grossman
- University of Washington
- 2 December 2007
2 So...
- Hopefully you're convinced high-level language semantics is needed for transactions to succeed
- First session: focus on various notions of isolation
  - A taxonomy of ways weak isolation can surprise you
  - Ways to avoid surprises
    - Strong isolation
    - Restrictive type systems
- Second session
  - Formal model for high-level definitions and correctness proofs
  - Memory-model problems
  - Integrating exceptions, I/O, and multithreaded transactions
- Three-slide review of the first session follows
3 Notions of isolation
- Strong isolation: A transaction executes as though no other computation is interleaved
- Weak isolation?
  - Single-lock (weak-sla): A transaction executes as though no other transaction is interleaved
  - Single-lock + abort (weak undo): Like weak-sla, but a transaction can abort/retry, undoing changes
  - Single-lock + lazy update (weak on-commit): Like weak-sla, but buffer updates until commit
  - Real contention: Like weak undo or weak on-commit, but multiple transactions can run at once
  - Catch-fire: Anything can happen if there's a race
4 Partition
- Surprises arose from the same mutable locations being used inside and outside transactions by different threads
- Hopefully sufficient to forbid that
  - But unnecessary and probably too restrictive
    - Bans publication and privatization
  - cf. STM Haskell (PPoPP '05)
- For each allocated object (or word), require one of:
  - Never mutated
  - Only accessed by one thread
  - Only accessed inside transactions
  - Only accessed outside transactions
5 Static partition
- Recall our "what is a race" problem:
initially x=0, y=0, z=0
Thread 1: atomic { if (x<y) ++z; }
Thread 2: atomic { ++x; ++y; }
Thread 3: r = z; //race?   assert(z==0);
- So "accessed on valid control paths" is not enough
- Use a type system that conservatively assumes all paths are possible
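The point about control paths can be checked with a toy model. The sketch below is not an STM: it assumes transactions are lists of micro-steps, and the "weak" mode lets the micro-steps of the two transactions interleave with in-place updates, roughly what an eager-update implementation with real contention might do before it detects the conflict and aborts. All function names are hypothetical, and abort/undo is left out because the question is only what a non-transactional reader of z could see in the meantime.

```python
from itertools import combinations

def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

# Thread 1: atomic { if (x<y) ++z; } split into micro-steps
def t1_read_x(heap, regs): regs["x"] = heap["x"]
def t1_read_y(heap, regs): regs["y"] = heap["y"]
def t1_maybe_inc_z(heap, regs):
    if regs["x"] < regs["y"]:
        heap["z"] += 1  # in-place (eager) update

# Thread 2: atomic { ++x; ++y; }
def t2_inc_x(heap, regs): heap["x"] += 1
def t2_inc_y(heap, regs): heap["y"] += 1

T1 = [t1_read_x, t1_read_y, t1_maybe_inc_z]
T2 = [t2_inc_x, t2_inc_y]

def observable_z(schedules):
    """Values of z that thread 3's plain read could see at any point."""
    seen = set()
    for sched in schedules:
        heap, regs = {"x": 0, "y": 0, "z": 0}, {}
        seen.add(heap["z"])
        for step in sched:
            step(heap, regs)
            seen.add(heap["z"])
    return seen

strong = [T1 + T2, T2 + T1]         # transactions run one at a time
weak = list(interleavings(T1, T2))  # micro-steps may interleave before an abort

print(observable_z(strong))  # {0}
print(observable_z(weak))    # {0, 1}
```

Under the one-at-a-time schedules the branch x<y is never taken, so z is never written. Once the micro-steps interleave, thread 1 can read x before and y after thread 2's increments, take the "impossible" branch, and write z where thread 3 can see it.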
7 Why formal models
- Some really smart people didn't anticipate the surprises
- So maybe there are other surprises even with the partitioning type system
- Increase our confidence by modeling various forms of isolation (as mini-languages) and proving them equivalent given the type system
  - So far: weak-sla, weak undo
  - Future work: weak on-commit, real contention, thread-local, immutable
8 A formal program state
- a; H; e1 || ... || en
  - e: a thread (an expression that runs and terminates)
  - H: a heap (maps mutable labels to values)
  - a: either ∘ or •
    - • means one thread is in a transaction
    - ∘ means no thread is in a transaction
- A high-level model for programmers and compiler-writers
  - No TM implementation details!
9 Operational semantics
- Execution is a series of steps from one state to another
  - At each step, one thread runs some instruction
  - a; H; e1 || ... || en → a'; H'; e1' || ... || en'
- Isolation amounts to using the a to restrict interleavings
  - strong: If a = •, only the transaction can touch H
  - weak-sla: If a = •, no other thread can start a transaction
  - weak undo: Like weak-sla, but transactions log updates and can abort/retry by undoing them
    - a returns to ∘ after the abort is complete
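The way the a component restricts interleavings can be sketched as a tiny state-space explorer. This is a simplified stand-in for the model above, under stated assumptions: threads are instruction lists rather than expressions, the four instruction forms (begin, end, set, copy) are invented for the example, and a is represented as None (no thread in a transaction) or the index of the thread that is in one. Only strong and weak-sla are modeled.

```python
def step(mode, a, heap, threads, i):
    """Return the successor state if thread i may run its next instruction, else None."""
    if not threads[i]:
        return None
    ins, rest = threads[i][0], threads[i][1:]
    other_in_txn = a is not None and a != i
    # strong: nobody else may run; weak-sla: nobody else may start a transaction
    if other_in_txn and (mode == "strong" or ins[0] == "begin"):
        return None
    h = dict(heap)
    if ins[0] == "begin":
        a = i
    elif ins[0] == "end":
        a = None
    elif ins[0] == "set":      # ("set", loc, value)
        h[ins[1]] = ins[2]
    elif ins[0] == "copy":     # ("copy", dst, src)
        h[ins[1]] = h[ins[2]]
    new_threads = threads[:i] + (rest,) + threads[i + 1:]
    return a, tuple(sorted(h.items())), new_threads

def final_heaps(mode, heap, threads):
    """All heaps reachable once every thread has terminated."""
    start = (None, tuple(sorted(heap.items())), tuple(tuple(t) for t in threads))
    seen, stack, finals = {start}, [start], set()
    while stack:
        a, h, ts = stack.pop()
        if not any(ts):
            finals.add(h)
            continue
        for i in range(len(ts)):
            nxt = step(mode, a, h, ts, i)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return finals

txn = [("begin",), ("set", "x", 1), ("copy", "y", "x"), ("end",)]
racy = [txn, [("set", "x", 2)]]                               # x used inside and outside
partitioned = [txn, [("begin",), ("set", "x", 2), ("end",)]]  # x used only inside
init = {"x": 0, "y": 0}

for prog in (racy, partitioned):
    print(final_heaps("strong", init, prog) == final_heaps("weak-sla", init, prog))
# prints False, then True
```

For the racy program, weak-sla reaches a final heap with x=2, y=2 that strong cannot, because the plain write lands between the transaction's two steps. When both accesses to x are inside transactions, the two modes reach the same final heaps.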
10 A family of languages
- So strong, weak-sla, and weak undo are similar languages with different semantic rules
  - The AtomsFamily
  - Lots of Greek letters in the paper
- Theorem: If e1, ..., en type-check with our partition rules, then the set of states reachable from a; H; e1 || ... || en is the same for strong, weak-sla, and weak undo.
  - Not quite: weak undo has more transient states and can produce more garbage
11 Type-checking
- Code can be used inside transactions, outside transactions, or both
- Each memory location can be accessed only inside transactions or only outside transactions
- Form of type-checking:
  - ε ::= ot | wt | both
  - τ ::= int | τ*ε
  - Γ ::= · | Γ, x:τ
  - Γ; ε ⊢ e : τ
    - Assuming variables in Γ have those types, e has type τ and stays on the side of the partition required by ε
12 Type-checking
- Γ; ε ⊢ e : τ
  - Assuming variables in Γ have those types, e has type τ and stays on the side of the partition required by ε
- Three example rules (C-style syntax; specialized slightly to emphasize the partition):
  - If Γ(x) = τ*ε, then Γ; ε ⊢ x : τ*ε
  - If Γ; ε ⊢ e : τ*ε, then Γ; ε ⊢ *e : τ
  - If Γ; wt ⊢ e : τ, then Γ; ε ⊢ atomic{e} : τ
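A minimal sketch of a checker in the spirit of these rules, under loudly stated assumptions: expressions are nested tuples, a pointer type is written ("ptr", target, side) with side either "ot" or "wt", a dereference is allowed only when the pointer's side equals the current effect, and the body of atomic is checked with effect "wt". The encoding and the names check and PartitionError are hypothetical, not taken from the paper.

```python
class PartitionError(Exception):
    """The expression touches memory on the wrong side of the partition."""

def check(gamma, eps, e):
    """Return the type of e under environment gamma and effect eps, or raise."""
    kind = e[0]
    if kind == "int":                      # ("int", n)
        return "int"
    if kind == "var":                      # ("var", name)
        return gamma[e[1]]
    if kind == "deref":                    # ("deref", e1)
        t = check(gamma, eps, e[1])
        if t == "int":
            raise PartitionError("cannot dereference an int")
        _, target, side = t
        if side != eps:
            raise PartitionError(
                f"pointer into {side!r} memory dereferenced in {eps!r} code")
        return target
    if kind == "atomic":                   # ("atomic", e1): body runs within a transaction
        return check(gamma, "wt", e[1])
    raise ValueError(f"unknown expression {kind!r}")

gamma = {"p": ("ptr", "int", "wt"), "q": ("ptr", "int", "ot")}

print(check(gamma, "ot", ("atomic", ("deref", ("var", "p")))))  # int
print(check(gamma, "ot", ("deref", ("var", "q"))))              # int
try:
    check(gamma, "ot", ("atomic", ("deref", ("var", "q"))))
except PartitionError as err:
    print("rejected:", err)
```

In this encoding, code checked with effect "both" can never dereference a partitioned pointer, which is one conservative reading of "stays on the side of the partition required by ε".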
13 The proof
- The proofs are dozens of pages and a few person-months (lest a skipped step hold a surprise)
- But the high-level picture is illuminating
14 The proof
- If possible in strong, then possible in weak-sla
  - Trivial: don't ever violate isolation
[Diagram: strong, weak-sla, weak undo]
15 The proof
- If possible in weak-sla, then possible in weak undo
  - Trivial: don't ever abort
[Diagram: strong, weak-sla, weak undo]
16 The proof
- If possible in weak-sla, then possible in strong
  - Current transaction is serializable thanks to the type system (can permute with other threads)
  - Earlier transactions serializable by induction
[Diagram: strong, weak-sla, weak undo]
17 The proof
- If possible in weak undo, then possible in weak-sla?
  - Really need that abort is correct
  - And that's hard to show, especially with interleavings from weak isolation
[Diagram: strong, weak-sla, weak undo]
18 The proof
- If possible in weak undo, then possible in weak-sla?
  - Define strong undo for sake of the proof
  - Can show abort is correct without interleavings
[Diagram: strong undo, strong, weak-sla, weak undo]
19 Why we formalize, redux
- Thanks to the formal semantics, we:
  - Had to make precise definitions
  - Know we did not skip cases (at least in the model)
  - Learned the essence of why the languages are equivalent under partition
    - Weak interleavings are serializable
    - Abort is correct
    - And these two arguments compose
21 Relaxed memory models
- Modern languages don't provide sequential consistency
  - Lack of hardware support
  - Prevents otherwise sensible and ubiquitous compiler transformations (e.g., copy propagation)
- So safe languages need two complicated definitions:
  - (1) What is "properly synchronized"?
  - (2) What can compiler and hardware do with "bad code"?
  - (Unsafe languages need only (1))
- A flavor of simplistic ideas and the consequences
22 Ordering
- Can get "strange results" for bad code
  - Need rules for what is "good code"
initially x=0 and y=0
Thread 1: x = 1; y = 1;
Thread 2: r = y; s = x; assert(s>=r); //invalid
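Why the assertion is invalid can be seen by brute force. The sketch below enumerates every interleaving of the two threads, first keeping each thread's program order (sequential consistency), then also allowing each thread's two independent memory operations to be swapped, as a stand-in for what a compiler or the hardware might do. It is a toy enumeration, not a real memory model, and the helper names are hypothetical.

```python
from itertools import combinations, permutations

def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]

def write_x(m): m["x"] = 1
def write_y(m): m["y"] = 1
def read_y(m): m["r"] = m["y"]
def read_x(m): m["s"] = m["x"]

T1 = [write_x, write_y]  # x = 1; y = 1;
T2 = [read_y, read_x]    # r = y; s = x;

def outcomes(t1_orders, t2_orders):
    """Set of (r, s) results over all interleavings of the allowed per-thread orders."""
    results = set()
    for t1 in t1_orders:
        for t2 in t2_orders:
            for sched in interleavings(t1, t2):
                mem = {"x": 0, "y": 0, "r": None, "s": None}
                for op in sched:
                    op(mem)
                results.add((mem["r"], mem["s"]))
    return results

sc = outcomes([T1], [T2])
relaxed = outcomes(list(permutations(T1)), list(permutations(T2)))

print(sorted(sc))       # [(0, 0), (0, 1), (1, 1)]
print(sorted(relaxed))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Only the relaxed enumeration produces r=1, s=0, the one outcome that violates s>=r.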
23 Ordering
initially x=0 and y=0
Thread 1: x = 1; sync(lk){} y = 1;
Thread 2: r = y; sync(lk){} //same lock
          s = x; assert(s>=r); //valid
24 Ordering
initially x=0 and y=0
Thread 1: x = 1; atomic{} y = 1;
Thread 2: r = y; atomic{} s = x; assert(s>=r); //???
If this is good code, existing STMs are wrong
25 Ordering
initially x=0 and y=0
Thread 1: x = 1; atomic{z=1;} y = 1;
Thread 2: r = y; atomic{tmp=0*z;} s = x; assert(s>=r); //???
"Conflicting memory" is a slippery, ill-defined slope
26 Lesson
- It is not clear when transactions are ordered, but languages need memory models
- Corollary: This could/should delay adoption of transactions in well-specified languages
- I wish I had more answers.
27 Other operations
- So far every atomic block we have considered only:
  - read/wrote/allocated memory
  - called functions
- What about:
  - I/O
  - Exceptions (or first-class continuations)
  - Spawn a thread
28 I/O
- Can't have irreversible actions in transactions that might abort
- Need pragmatic, partial solutions such as:
  - Forbid irreversible actions in transactions
    - Trivial extension of our partition type system
  - Have "unabortable" transactions
  - Make actions reversible
    - Buffer output
    - Buffer (idempotent) input
29 I After O
- The real problem is input after output in a transaction
atomic { write_to_file(); read_from_file(); }
Contents read cannot depend on how the external world sees the write if the write is buffered
30 Native mechanism
- Can generalize: Require native code to have 2 versions
  - Runtime calls one version in transactions, the other outside transactions
  - Native code is responsible for keeping the 2 versions "the same"
- Transactional versions also need callbacks for pre-commit, post-commit, and pre-abort
  - Sufficient for buffering input and output
  - Sufficient for external transaction systems
- If the "in transaction" version causes abort, that just encodes safe dynamic failure/retry
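The two-version idea with callbacks can be sketched for buffered output. Everything here is hypothetical scaffolding: Transaction is a toy object that only keeps three callback lists, Abort is an exception used to request an abort, and Log stands in for a native output routine whose transactional version buffers lines until post-commit.

```python
class Abort(Exception):
    """Raised inside a transaction body to request an abort."""

class Transaction:
    """Toy transaction: runs a body, then fires the registered callbacks."""
    def __init__(self):
        self.pre_commit, self.post_commit, self.pre_abort = [], [], []

    def run(self, body):
        try:
            body(self)
        except Abort:
            for cb in self.pre_abort:
                cb()
            return False
        for cb in self.pre_commit:
            cb()
        for cb in self.post_commit:
            cb()
        return True

class Log:
    """A 'native' output routine with non-transactional and transactional versions."""
    def __init__(self):
        self.lines = []    # stands in for the external world
        self.pending = {}  # per-transaction output buffers

    def write(self, line):
        # Non-transactional version: takes effect immediately.
        self.lines.append(line)

    def write_in_txn(self, txn, line):
        # Transactional version: buffer, and register callbacks on first use.
        if txn not in self.pending:
            self.pending[txn] = []
            txn.post_commit.append(lambda: self.lines.extend(self.pending.pop(txn)))
            txn.pre_abort.append(lambda: self.pending.pop(txn))
        self.pending[txn].append(line)

log = Log()

def ok(txn):
    log.write_in_txn(txn, "a")
    log.write_in_txn(txn, "b")

def doomed(txn):
    log.write_in_txn(txn, "c")
    raise Abort()

Transaction().run(ok)
Transaction().run(doomed)
print(log.lines)  # ['a', 'b']
```

The aborted transaction's output never reaches the external world. The same buffering is what makes input after output awkward: until post-commit runs, nothing outside the transaction can have seen "a" or "b".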
31 Exceptions
- If code in atomic throws an exception to outside atomic:
  - A. Does the transaction commit or abort?
  - B. Where does control transfer to?
- Three obvious answers:
  - 1. Commit transaction, transfer to exception handler
    - My preference: exceptions in most HLLs are semantically just non-local jumps
    - Preserves design goal that atomic has no effect on single-threaded programs
32 Exceptions
- Three obvious answers
  - 2. Abort transaction, transfer to retry the transaction
    - Turns exceptions into aborts
    - Useful if exceptions are due to shared-memory state
    - But programmer can encode this:
atomic { try { s } catch (Throwable e) { abort; } }
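A sketch of how answer 2 can be layered on top of answer 1. It assumes a toy, single-threaded atomic over a dict: an Abort restores a snapshot and retries, while any other exception propagates with the updates kept. The bound max_tries, the helper abort_on_exception, and the withdraw example are all invented for the illustration; a real system would retry when the shared state might have changed rather than a fixed number of times.

```python
class Abort(Exception):
    """Explicit request to undo the transaction."""

def atomic(heap, body, max_tries=3):
    """Run body(heap) as a toy transaction over a dict.

    Abort: restore the snapshot and retry.
    Any other exception: keep the updates and propagate (answer 1).
    """
    for _ in range(max_tries):
        snapshot = dict(heap)
        try:
            return body(heap)
        except Abort:
            heap.clear()
            heap.update(snapshot)
    raise RuntimeError("gave up retrying")

def abort_on_exception(body):
    """Encode: atomic { try { s } catch (Throwable e) { abort; } }"""
    def wrapped(heap):
        try:
            return body(heap)
        except Abort:
            raise
        except Exception as exc:
            raise Abort() from exc
    return wrapped

def withdraw(heap):
    heap["balance"] -= 10
    if heap["balance"] < 0:
        raise ValueError("overdrawn")

h1 = {"balance": 5}
try:
    atomic(h1, withdraw)
except ValueError:
    pass
print(h1)  # {'balance': -5}

h2 = {"balance": 5}
try:
    atomic(h2, abort_on_exception(withdraw))
except RuntimeError:
    pass
print(h2)  # {'balance': 5}
```

With answer 1 the partial update is committed before the handler runs. With the encoding, every attempt is undone, so the handler never sees the transaction's writes.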
33 Exceptions
- Three obvious answers
  - 3. Abort transaction, transfer to exception handler
    - But the transaction never happened?!
    - What if the exception value uses memory allocated/written by the aborted transaction?!
34 Beyond exceptions
- Other non-local jumps are even harder to deal with
  - Example: Perhaps a coroutine jumps out of an atomic and then jumps back in
  - Then probably the jump out should continue the transaction (commit or abort) later
- It depends what you're trying to do, which is a problem if the same language feature (exceptions, continuations, etc.) is used for multiple idioms
- Tough policy questions; mechanism pretty easy
35 Multithreaded transactions
- What if code in atomic creates a new thread?
- Easy answers:
  - Dynamic failure
  - Thread not runnable unless/until transaction commits
- More interesting:
  - Parallelism within transaction
  - Isolation and concurrency are orthogonal
  - Controversial(?) claim: Necessary due to Amdahl's Law as core-count increases
36 Multithreaded transactions
- (Semantics done; implementation is work in progress)
- When does a multithreaded transaction commit?
  - After all spawned threads terminate
- What is hard for programmers?
  - Nested transactions now crucial for isolating parallel computations inside a larger transaction
- What is hard for implementors?
  - Transactional bookkeeping must be parallel
  - Unclear how hardware could best help
38 If I had another 2 hours
- Plenty more semantics to consider:
  - Open-nesting semantics
  - Message-passing within transactions
    - See recent work from Oregon, Purdue, UW
  - atomic { s1 } orelse { s2 }
    - Try s2 if s1 aborts
  - Fairness guarantees
  - Obstruction-freedom
  - ...
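The orelse form can be sketched in a few lines, again over a toy dict heap with snapshot-and-restore standing in for a real undo log. The names or_else, Abort, and take_from are hypothetical. Heap values are kept immutable (tuples) because the snapshot is a shallow copy.

```python
class Abort(Exception):
    """Raised by a transaction body that cannot proceed."""

def or_else(heap, s1, s2):
    """Toy version of atomic { s1 } orelse { s2 } over a dict heap."""
    snapshot = dict(heap)
    try:
        return s1(heap)
    except Abort:
        heap.clear()
        heap.update(snapshot)  # undo s1's partial effects
        return s2(heap)        # if s2 also aborts, the Abort propagates

def take_from(name):
    """Build a body that removes and returns the head of the queue called name."""
    def body(heap):
        heap["log"] += (f"tried {name}",)
        if not heap[name]:
            raise Abort()
        head, heap[name] = heap[name][0], heap[name][1:]
        return head
    return body

heap = {"a": (), "b": (7, 8), "log": ()}
print(or_else(heap, take_from("a"), take_from("b")))  # 7
print(heap)  # {'a': (), 'b': (8,), 'log': ('tried b',)}
```

The log entry written by the aborted first alternative is gone from the final heap, which is the sense in which s2 runs as though s1 had never started.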
39 Conclusions
- Weak isolation without type restrictions is surprising
- Interaction with other language features is non-trivial
- PL-style semantics has a huge role to play in bringing transactions to high-level languages
  - An essential complement to the core algorithms, compiler, and hardware work
  - Need cross-cultural understanding of the issues
wasp.cs.washington.edu