Spin Locks and Contention - PowerPoint PPT Presentation

1
Spin Locks and Contention
  • Companion slides for
  • The Art of Multiprocessor Programming
  • by Maurice Herlihy & Nir Shavit
  • Modified by Rajeev Alur
  • for CIS 640, University of Pennsylvania

2
Muddy children's puzzle (Common Knowledge)
  • A group of kids are playing. A stranger walks by
    and announces "Some of you have mud on your
    forehead"
  • Each kid can see everyone else's forehead, but
    not his/her own (and they don't talk to one
    another)
  • Stranger says "Raise your hand if you conclude
    that you have mud on your forehead". Nobody does.
  • Stranger keeps on repeating the statement.
  • If k kids have muddy foreheads, then exactly
    these k kids raise their hands after the stranger
    repeats the statement exactly k times

Art of Multiprocessor Programming
2
3
Muddy children's puzzle: Why does this happen?
  • For every k
  • If ≥ k kids have muddy foreheads, then in the
    first k-1 rounds nobody raises hands
  • If k kids have muddy foreheads, then in the k-th
    round, exactly the muddy kids raise their hands
  • This claim can be proved by induction on k
  • Base case: k = 1
  • Inductive case (assume for k, and prove for k+1)

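The claim on this slide can be checked with a small simulation. This is only a sketch of the rule that the induction yields (a kid who sees m muddy foreheads raises her hand in round m+1 if nobody has raised before), not a model of the epistemic reasoning itself. The class name MuddyChildren and the method firstRaiseRound are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MuddyChildren {
    /**
     * Returns the round in which hands first go up and fills 'raisers'
     * with the indices of the kids who raise them.
     */
    static int firstRaiseRound(boolean[] muddy, List<Integer> raisers) {
        int n = muddy.length;
        int total = 0;
        for (boolean m : muddy) if (m) total++;
        if (total == 0) throw new IllegalArgumentException("the stranger's statement p must be true");
        for (int round = 1; round <= n; round++) {
            for (int i = 0; i < n; i++) {
                int seen = total - (muddy[i] ? 1 : 0);   // muddy foreheads kid i can see
                // Nobody raised in rounds 1..round-1, so at least 'round' kids are muddy.
                // A kid who sees only round-1 muddy foreheads concludes she is muddy.
                if (seen == round - 1) raisers.add(i);
            }
            if (!raisers.isEmpty()) return round;
        }
        throw new AssertionError("unreachable when at least one kid is muddy");
    }

    public static void main(String[] args) {
        boolean[] muddy = {true, false, true, true, false};   // k = 3 muddy kids
        List<Integer> raisers = new ArrayList<>();
        int round = firstRaiseRound(muddy, raisers);
        System.out.println("round " + round + ", raisers " + raisers);   // round 3, raisers [0, 2, 3]
    }
}
```

With k = 3 muddy kids, nobody moves in rounds 1 and 2, and exactly the three muddy kids raise their hands in round 3.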
4
What is the role of the stranger's statement?
  • Let p stand for "> 0 kids have muddy foreheads"
  • Assuming > 1 kids are muddy, the stranger announcing p
    does not add to anyone's information
  • However, without the stranger's announcement, nobody
    will ever raise their hands
  • So what's going on?
  • Well, the base case for our proof fails, but
    exactly what information do kids acquire from the
    stranger's announcement?

5
Common Knowledge
  • E p: Everybody knows p
  • E E p: Everybody knows that everybody knows p
  • E^k p: defined similarly (k repetitions)
  • C p: p is common knowledge, the limit of "Everybody
    knows that everybody knows ..."
  • For k = 2, each kid knows p, but not E p, and after
    the stranger's announcement, each kid knows E p
  • If k kids are muddy, then before the announcement
    E^(k-1) p holds, but E^k p does not
  • Stranger makes p common knowledge

6
Mutual Exclusion: Focus so far: Correctness
  • Models
  • Accurate
  • But idealized
  • Protocols
  • Elegant
  • Important
  • But not used in practice

7
New Focus: Performance
  • Models
  • More complicated
  • Still focus on principles
  • Protocols
  • Elegant
  • Important
  • And realistic

8
Kinds of Architectures
  • SISD (Uniprocessor)
    • Single instruction stream
    • Single data stream
  • SIMD (Vector)
    • Single instruction
    • Multiple data
  • MIMD (Multiprocessors)
    • Multiple instruction
    • Multiple data

9
Kinds of Architectures
  • MIMD (Multiprocessors) is our space

10
MIMD Architectures
[Diagram: two MIMD organizations, Shared Bus and Distributed, each with memory]
  • Memory Contention
  • Communication Contention
  • Communication Latency

11
Today: Revisit Mutual Exclusion
  • Think of performance, not just correctness and
    progress
  • Begin to understand how performance depends on
    our software properly utilizing the
    multiprocessor machine's hardware
  • And get to know a collection of locking
    algorithms

12
What Should you do if you can't get a lock?
  • Keep trying
    • "spin" or "busy-wait"
    • Good if delays are short
  • Give up the processor
    • Good if delays are long
    • Always good on uniprocessor

13
What Should you do if you can't get a lock?
  • Keep trying ("spin" or "busy-wait") is our focus

14
Basic Spin-Lock
[Diagram: a spin lock in front of the critical section (CS); the lock is reset upon exit]
15
Basic Spin-Lock
[Same diagram] ...lock introduces sequential bottleneck
16
Basic Spin-Lock
[Same diagram]
17
Basic Spin-Lock
[Same diagram] Notice these are distinct phenomena
18
Test-and-Set Primitive
  • Boolean value
  • Test-and-set (TAS)
    • Swap true with current value
    • Return value tells if prior value was true or
      false
  • Can reset just by writing false
  • TAS is also known as "getAndSet"

19
Test-and-Set
public class AtomicBoolean {
  boolean value;

  // Atomically store newValue and return the value it replaced.
  public synchronized boolean getAndSet(boolean newValue) {
    boolean prior = value;
    value = newValue;
    return prior;
  }
}
20
Review Test-and-Set
[Code as on slide 19] Package java.util.concurrent.atomic
21
Review Test-and-Set
[Code as on slide 19] Swap old and new values
22
Test-and-Set
AtomicBoolean lock = new AtomicBoolean(false);
boolean prior = lock.getAndSet(true);
23
Test-and-Set
[Code as on slide 22] Swapping in true is called "test-and-set" or TAS
24
Test-and-Set Locks
  • Locking
    • Lock is free: value is false
    • Lock is taken: value is true
  • Acquire lock by calling TAS
    • If result is false, you win
    • If result is true, you lose
  • Release lock by writing false

25
Test-and-set Lock
class TASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (state.getAndSet(true)) {}   // spin until TAS returns false
  }

  void unlock() {
    state.set(false);
  }
}
26
Test-and-set Lock
[Code as on slide 25] Lock state is AtomicBoolean
27
Test-and-set Lock
[Code as on slide 25] Keep trying until lock acquired
28
Test-and-set Lock
[Code as on slide 25] Release lock by resetting state to false
29
Space Complexity
  • TAS spin-lock has small footprint
  • n-thread spin-lock uses O(1) space
  • As opposed to O(n) Peterson/Bakery
  • How did we overcome the Ω(n) lower bound?
  • We used a combined read-write operation

30
Performance
  • Experiment
  • n threads
  • Increment shared counter 1 million times
  • How long should it take?
  • How long does it take?
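The experiment could be reproduced with a harness along the following lines. This is a minimal sketch, not the authors' benchmark: the names TASLockDemo, run, and counter are assumptions made for illustration, and the thread counts in main are arbitrary.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TASLockDemo {
    // The TAS lock from the slides, written out in full.
    static final class TASLock {
        private final AtomicBoolean state = new AtomicBoolean(false);
        void lock()   { while (state.getAndSet(true)) { /* spin */ } }
        void unlock() { state.set(false); }
    }

    static long counter;   // shared counter, only touched while holding the lock

    /** Runs n threads that together do 'total' locked increments; returns elapsed nanoseconds. */
    static long run(int n, int total) {
        counter = 0;
        final TASLock lock = new TASLock();
        final int perThread = total / n;           // assumes n divides total
        Thread[] threads = new Thread[n];
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { counter++; } finally { lock.unlock(); }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 4, 8}) {
            long ns = run(n, 1_000_000);
            System.out.println(n + " threads: " + ns / 1_000_000 + " ms, counter = " + counter);
        }
    }
}
```

The total amount of work is fixed, so in the ideal case the elapsed time would not grow with the number of threads. The timings printed by main depend on the machine, so none are quoted here.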

31
Mystery 1
[Plot: time vs. number of threads; curves: TAS lock, Ideal]
What is going on?
32
Test-and-Test-and-Set Locks
  • Lurking stage
    • Wait until lock "looks" free
    • Spin while read returns true (lock taken)
  • Pouncing stage
    • As soon as lock "looks" available
    • Read returns false (lock free)
    • Call TAS to acquire lock
    • If TAS loses, back to lurking

33
Test-and-test-and-set Lock
class TTASlock {
  AtomicBoolean state = new AtomicBoolean(false);

  void lock() {
    while (true) {
      while (state.get()) {}        // lurk: spin on reads
      if (!state.getAndSet(true))   // pounce: try to acquire
        return;
    }
  }
}
34
Test-and-test-and-set Lock
[Code as on slide 33] Wait until lock looks free
35
Test-and-test-and-set Lock
[Code as on slide 33] Then try to acquire it
36
Mystery 2
[Plot: time vs. number of threads; curves: TAS lock, TTAS lock, Ideal]
37
Mystery
  • Both
    • TAS and TTAS
    • Do the same thing (in our model)
  • Except that
    • TTAS performs much better than TAS
    • Neither approaches ideal

38
Opinion
  • Our memory abstraction is broken
  • TAS & TTAS methods
    • Are provably the same (in our model)
    • Except they aren't (in field tests)
  • Need a more detailed model

39
Bus-Based Architectures
[Diagram: three caches connected by a shared bus to memory]
40
Bus-Based Architectures
[Same diagram] Random access memory (10s of cycles)
41
Bus-Based Architectures
  • Shared Bus
    • Broadcast medium
    • One broadcaster at a time
    • Processors and memory all "snoop"

42
Bus-Based Architectures
  • Per-Processor Caches
    • Small
    • Fast: 1 or 2 cycles
    • Address & state information

43
Jargon Watch
  • Cache hit
  • I found what I wanted in my cache
  • Good Thing
  • Cache miss
  • I had to go all the way to memory for that data
  • Bad Thing

44
Caveat
  • This model is still a simplification
  • But not in any essential way
  • Illustrates basic principles
  • Will discuss complexities later

45
Processor Issues Load Request
[Diagram: three caches on a bus; memory holds the data]
46
Processor Issues Load Request
[Same diagram] On the bus: "Gimme data"
47
Memory Responds
[Same diagram] On the bus: "Got your data right here"
48
Processor Issues Load Request
[Diagram: one cache and memory hold the data]
49
Processor Issues Load Request
[Same diagram] A request goes out on the bus
50
Processor Issues Load Request
[Same diagram] "I got data"
51
Other Processor Responds
[Same diagram] "I got data"; the data goes out on the bus
52
Other Processor Responds
[Diagram: two caches and memory hold the data]
53
Modify Cached Data
[Diagram: two caches and memory hold the data]
54
Modify Cached Data
[Same diagram]
55
Modify Cached Data
[Same diagram]
56
Modify Cached Data
[Same diagram] What's up with the other copies?
57
Cache Coherence
  • We have lots of copies of data
  • Original copy in memory
  • Cached copies at processors
  • Some processor modifies its own copy
  • What do we do with the others?
  • How to avoid confusion?

58
Write-Back Caches
  • Accumulate changes in cache
  • Write back when needed
  • Need the cache for something else
  • Another processor wants it
  • On first modification
  • Invalidate other entries
  • Requires non-trivial protocol

59
Write-Back Caches
  • Cache entry has three states
    • Invalid: contains raw seething bits
    • Valid: I can read but I can't write
    • Dirty: Data has been modified
      • Intercept other load requests
      • Write back to memory before using cache

60
Invalidate
[Diagram: two caches and memory hold the data]
61
Invalidate
[Same diagram] On the bus: "Mine, all mine!"
62
Invalidate
[Same diagram] "Uh, oh"
63
Invalidate
[Diagram: one cache and memory hold the data] Other caches lose read permission
64
Invalidate
[Same diagram] Other caches lose read permission; this cache acquires write permission
65
Invalidate
[Same diagram] Memory provides data only if not present in any
cache, so no need to change it now (expensive)
66
Another Processor Asks for Data
[Same diagram] A request goes out on the bus
67
Owner Responds
[Diagram: two caches and memory hold the data; the data goes out on the bus]
68
End of the Day
[Diagram: cached copies and memory hold the data] Reading OK, no writing
69
Mutual Exclusion
  • What do we want to optimize?
  • Bus bandwidth used by spinning threads
  • Release/Acquire latency
  • Acquire latency for idle lock

70
Simple TASLock
  • TAS invalidates cache lines
  • Spinners
  • Miss in cache
  • Go to bus
  • Thread wants to release lock
  • delayed behind spinners

71
Test-and-test-and-set
  • Wait until lock looks free
  • Spin on local cache
  • No bus use while lock busy
  • Problem when lock is released
  • Invalidation storm
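The difference between the two spinning styles can be made concrete with a toy model of a single cache line under the three-state write-back protocol (Invalid, Valid, Dirty) from the earlier slides. This is a deliberately simplified sketch, not a hardware model: it assumes every TAS counts as a write that needs exclusive ownership even when it fails, and the names CacheLineSim and spinTraffic are made up for illustration.

```java
class CacheLineSim {
    enum State { INVALID, VALID, DIRTY }

    final State[] cache;       // state of the lock's cache line in each processor's cache
    int busTransactions = 0;   // every miss or invalidation uses the shared bus

    CacheLineSim(int processors) {
        cache = new State[processors];
        java.util.Arrays.fill(cache, State.INVALID);
    }

    /** Load by processor p: free on a hit, one bus transaction on a miss. */
    void read(int p) {
        if (cache[p] != State.INVALID) return;            // cache hit
        busTransactions++;                                 // miss: go to the bus
        for (int i = 0; i < cache.length; i++)
            if (cache[i] == State.DIRTY) cache[i] = State.VALID;   // owner supplies data, keeps read-only copy
        cache[p] = State.VALID;
    }

    /** Store (or TAS) by processor p: needs exclusive ownership of the line. */
    void write(int p) {
        if (cache[p] == State.DIRTY) return;               // already the exclusive owner
        busTransactions++;                                 // broadcast an invalidation
        for (int i = 0; i < cache.length; i++) cache[i] = State.INVALID;
        cache[p] = State.DIRTY;
    }

    /** Processors 1..n-1 each spin 'rounds' times while processor 0 holds the lock. */
    static int spinTraffic(int n, int rounds, boolean testFirst) {
        CacheLineSim sim = new CacheLineSim(n);
        sim.write(0);                                      // the holder acquired the lock with a TAS
        int before = sim.busTransactions;
        for (int r = 0; r < rounds; r++)
            for (int p = 1; p < n; p++) {
                if (testFirst) sim.read(p);                // TTAS: lurk by reading
                else sim.write(p);                         // TAS: every attempt is a write
            }
        return sim.busTransactions - before;
    }

    public static void main(String[] args) {
        System.out.println("TAS  bus transactions: " + spinTraffic(4, 100, false));   // 300
        System.out.println("TTAS bus transactions: " + spinTraffic(4, 100, true));    // 3
    }
}
```

In this model three TAS spinners generate one bus transaction per attempt because each attempt invalidates the others, while three TTAS spinners miss once each and then spin in their own caches. The model stops before the lock is released, so it does not show the invalidation storm.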

72
Local Spinning while Lock is Busy
[Diagram: all three caches and memory hold "busy"]
73
On Release
[Diagram: one cache and memory hold "free"; the other two caches are invalid]
74
On Release
[Same diagram] Everyone misses, rereads
75
On Release
[Same diagram] Everyone tries TAS
76
Problems
  • Everyone misses
  • Reads satisfied sequentially
  • Everyone does TAS
  • Invalidates others' caches
  • Eventually quiesces after lock acquired
  • How long does this take?

77
Mystery Explained
[Plot: time vs. number of threads; curves: TAS lock, TTAS lock, Ideal]
Better than TAS but still not as good as ideal
78
Solution: Introduce Delay
  • If the lock looks free
  • But I fail to get it
  • There must be contention
  • Better to back off than to collide again

[Diagram: timeline of attempts on the spin lock, with delays d, r1d, r2d]
79
Dynamic Example: Exponential Backoff
[Diagram: timeline of attempts on the spin lock, with delays d, 2d, 4d]
  • If I fail to get lock
    • wait random duration before retry
    • Each subsequent failure doubles expected wait

80
Exponential Backoff Lock
public class Backoff implements Lock {
  public void lock() {
    int delay = MIN_DELAY;
    while (true) {
      while (state.get()) {}         // wait until lock looks free
      if (!state.getAndSet(true))    // if we win, return
        return;
      sleep(random() % delay);       // back off for random duration
      if (delay < MAX_DELAY)
        delay = 2 * delay;           // double max delay, within reason
    }
  }
}
81
Exponential Backoff Lock
[Code as on slide 80] Fix minimum delay
82
Exponential Backoff Lock
[Code as on slide 80] Wait until lock looks free
83
Exponential Backoff Lock
[Code as on slide 80] If we win, return
84
Exponential Backoff Lock
[Code as on slide 80] Back off for random duration
85
Exponential Backoff Lock
[Code as on slide 80] Double max delay, within reason
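The slide code leaves state, sleep, random, MIN_DELAY and MAX_DELAY undefined. A self-contained version might look like the sketch below. The delay bounds are arbitrary assumptions, LockSupport.parkNanos and ThreadLocalRandom stand in for the slide's sleep and random, and the names BackoffLockDemo and hammer are hypothetical test scaffolding.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class BackoffLockDemo {
    static final class BackoffLock {
        // Assumed bounds in nanoseconds; the slides do not give values.
        private static final long MIN_DELAY = 1_000;
        private static final long MAX_DELAY = 1_000_000;
        private final AtomicBoolean state = new AtomicBoolean(false);

        void lock() {
            long limit = MIN_DELAY;
            while (true) {
                while (state.get()) { /* wait until lock looks free */ }
                if (!state.getAndSet(true)) return;            // we won
                // We lost the race: back off for a random time in [1, limit] ns.
                LockSupport.parkNanos(ThreadLocalRandom.current().nextLong(limit) + 1);
                if (limit < MAX_DELAY) limit = 2 * limit;      // double the bound, within reason
            }
        }

        void unlock() { state.set(false); }
    }

    static int shared;   // only touched while holding the lock

    /** Starts 'threads' threads that each do 'perThread' locked increments; returns the final count. */
    static int hammer(int threads, int perThread) {
        shared = 0;
        final BackoffLock lock = new BackoffLock();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { shared++; } finally { lock.unlock(); }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000));   // 40000 if mutual exclusion holds
    }
}
```

Choosing MIN_DELAY and MAX_DELAY is the tuning problem this kind of lock is known for: good values depend on the number of processors and their speed.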
86
Spin-Waiting Overhead
[Plot: time vs. number of threads; curves: TTAS Lock, Backoff lock]
87
Backoff: Other Issues
  • Good
  • Easy to implement
  • Beats TTAS lock
  • Bad
  • Must choose parameters carefully
  • Not portable across platforms

88
Idea
  • Avoid useless invalidations
  • By keeping a queue of threads
  • Each thread
  • Notifies next in line
  • Without bothering the others

89
Anderson Queue Lock
[Diagram: counter "next" and array "flags" = T F F F F F F F]
90
Anderson Queue Lock
[Same diagram] getAndIncrement
91
Anderson Queue Lock
[Same diagram] getAndIncrement
92
Anderson Queue Lock
[Same diagram] Mine!
93
Anderson Queue Lock
[Same diagram]
94
Anderson Queue Lock
[Same diagram] getAndIncrement
95
Anderson Queue Lock
[Same diagram] getAndIncrement
96
Anderson Queue Lock
[Same diagram]
97
Anderson Queue Lock
[Diagram: "next" and "flags" = T T F F F F F F]
98
Anderson Queue Lock
[Same diagram] Yow!
99
Anderson Queue Lock
class ALock implements Lock {
  boolean[] flags = {true, false, ..., false};
  AtomicInteger next = new AtomicInteger(0);
  ThreadLocal<Integer> mySlot;
  // lock() and unlock() follow on slide 103
}
100
Anderson Queue Lock
[Code as on slide 99] One flag per thread
101
Anderson Queue Lock
[Code as on slide 99] Next flag to use
102
Anderson Queue Lock
[Code as on slide 99] Thread-local variable
103
Anderson Queue Lock
public void lock() {
  int slot = next.getAndIncrement() % n;   // take next slot
  mySlot.set(slot);
  while (!flags[slot]) {}                  // spin until told to go
  flags[slot] = false;                     // prepare slot for re-use
}

public void unlock() {
  flags[(mySlot.get() + 1) % n] = true;    // tell next thread to go
}
104
Anderson Queue Lock
[Code as on slide 103] Take next slot
105
Anderson Queue Lock
[Code as on slide 103] Spin until told to go
106
Anderson Queue Lock
[Code as on slide 103] Prepare slot for re-use
107
Anderson Queue Lock
[Code as on slide 103] Tell next thread to go
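Put together, the fields and methods from the last few slides could form a runnable class such as the one below. It is a sketch under stated assumptions rather than the slides' exact code: an array of AtomicBoolean replaces the plain boolean[] because writes to plain array elements are not guaranteed to become visible to a spinning thread in Java, Thread.yield() is added to the spin loop only to keep the demo responsive when threads outnumber cores, and the names ALockDemo and hammer are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class ALockDemo {
    static final class ALock {
        private final int capacity;                       // must be >= the number of contending threads
        private final AtomicBoolean[] flags;              // one flag per slot
        private final AtomicInteger next = new AtomicInteger(0);
        private final ThreadLocal<Integer> mySlot = new ThreadLocal<>();

        ALock(int capacity) {
            this.capacity = capacity;
            flags = new AtomicBoolean[capacity];
            for (int i = 0; i < capacity; i++) flags[i] = new AtomicBoolean(i == 0);   // slot 0 may go first
        }

        void lock() {
            int slot = Math.floorMod(next.getAndIncrement(), capacity);   // take next slot
            mySlot.set(slot);
            while (!flags[slot].get()) Thread.yield();    // spin until told to go
            flags[slot].set(false);                       // prepare slot for re-use
        }

        void unlock() {
            flags[(mySlot.get() + 1) % capacity].set(true);   // tell next thread to go
        }
    }

    static int shared;   // only touched while holding the lock

    /** Starts 'threads' threads that each do 'perThread' locked increments; returns the final count. */
    static int hammer(int threads, int perThread) {
        shared = 0;
        final ALock lock = new ALock(threads);
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { shared++; } finally { lock.unlock(); }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000));   // 40000 if mutual exclusion holds
    }
}
```

Each waiting thread spins on its own flag, so a release changes only the flag of the next thread in line. In a real implementation the flags would also be padded so that neighboring slots do not share a cache line.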
108
Performance
[Plot: time vs. number of threads; curves: TTAS, queue]
  • Shorter handover than backoff
  • Curve is practically flat
  • Scalable performance
  • FIFO fairness

109
Anderson Queue Lock
  • Good
  • First truly scalable lock
  • Simple, easy to implement
  • Bad
  • Space hog
  • One bit per thread
  • Unknown number of threads?
  • Small number of actual contenders?

110
CLH Lock
  • FIFO order
  • Small, constant-size overhead per thread

111
Initially
[Diagram: tail; node flag: false]
112
Initially
[Same diagram] Queue tail
113
Initially
[Same diagram] Lock is free
114
Initially
[Same diagram]
115
Purple Wants the Lock
[Same diagram]
116
Purple Wants the Lock
[Diagram: tail; node flags: true, false]
117
Purple Wants the Lock
[Same diagram] Swap
118
Purple Has the Lock
[Same diagram]
119
Red Wants the Lock
[Diagram: tail; node flags: true, false, true]
120
Red Wants the Lock
[Same diagram] Swap
121
Red Wants the Lock
[Same diagram]
122
Red Wants the Lock
[Same diagram]
123
Red Wants the Lock
[Same diagram] Implicit linked list
124
Red Wants the Lock
[Same diagram]
125
Red Wants the Lock
[Same diagram] Actually, it spins on cached copy (true)
126
Purple Releases
[Diagram: tail; node flags: false, false, true; cached copy: false] Bingo!
127
Purple Releases
[Diagram: tail; node flag: true]
128
Space Usage
  • Let
    • L = number of locks
    • N = number of threads
  • ALock
    • O(LN)
  • CLH lock
    • O(L+N)

129
CLH Queue Lock
class Qnode {
  AtomicBoolean locked = new AtomicBoolean(true);
}
130
CLH Queue Lock
class CLHLock implements Lock {
  AtomicReference<Qnode> tail;                                        // queue tail
  ThreadLocal<Qnode> myNode = ThreadLocal.withInitial(Qnode::new);    // thread-local Qnode

  public void lock() {
    Qnode node = myNode.get();
    node.locked.set(true);
    Qnode pred = tail.getAndSet(node);   // swap in my node
    while (pred.locked.get()) {}         // spin until predecessor releases lock
  }
}
131
CLH Queue Lock
[Code as on slide 130] Queue tail
132
CLH Queue Lock
[Code as on slide 130] Thread-local Qnode
133
CLH Queue Lock
[Code as on slide 130] Swap in my node
134
CLH Queue Lock
[Code as on slide 130] Spin until predecessor releases lock
135
CLH Queue Lock
class CLHLock implements Lock {
  public void unlock() {
    myNode.get().locked.set(false);   // notify successor
    myNode.set(pred);                 // recycle predecessor's node (pred as obtained in lock())
  }
}
136
CLH Queue Lock
[Code as on slide 135] Notify successor
137
CLH Queue Lock
[Code as on slide 135] Recycle predecessor's node
138
CLH Queue Lock
[Code as on slide 135] (Code in book shows how it's done using myPred
reference.)
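One way to carry the predecessor from lock() to unlock() is a second thread-local reference, as the note about myPred suggests. The sketch below follows that idea, with a few assumptions of its own: the tail starts out pointing at an unlocked dummy node, new nodes start unlocked, Thread.yield() in the spin loop only keeps the demo responsive when threads outnumber cores, and the names CLHLockDemo and hammer are hypothetical scaffolding.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

class CLHLockDemo {
    static final class QNode {
        final AtomicBoolean locked = new AtomicBoolean(false);
    }

    static final class CLHLock {
        // The tail starts at an unlocked dummy node, so the first thread does not wait.
        private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
        private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
        private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

        void lock() {
            QNode node = myNode.get();
            node.locked.set(true);                        // announce that I want the lock
            QNode pred = tail.getAndSet(node);            // swap in my node
            myPred.set(pred);                             // remember predecessor for unlock()
            while (pred.locked.get()) Thread.yield();     // spin until predecessor releases
        }

        void unlock() {
            myNode.get().locked.set(false);               // notify successor
            myNode.set(myPred.get());                     // recycle predecessor's node
        }
    }

    static int shared;   // only touched while holding the lock

    /** Starts 'threads' threads that each do 'perThread' locked increments; returns the final count. */
    static int hammer(int threads, int perThread) {
        shared = 0;
        final CLHLock lock = new CLHLock();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { shared++; } finally { lock.unlock(); }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000));   // 40000 if mutual exclusion holds
    }
}
```

A thread cannot reuse its own node right after unlock() because its successor may still be reading it. Taking over the predecessor's node is safe because nobody else refers to that node any more.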
139
CLH Lock
  • Good
  • Lock release affects successor only
  • Small, constant-sized space
  • Bad
  • Doesn't work for uncached NUMA architectures

140
NUMA Architectures
  • Acronym
    • Non-Uniform Memory Architecture
  • Illusion
    • Flat shared memory
  • Truth
    • No caches (sometimes)
    • Some memory regions faster than others

141
NUMA Machines
Spinning on local memory is fast
142
NUMA Machines
Spinning on remote memory is slow
143
CLH Lock
  • Each thread spins on predecessor's memory
  • Could be far away

144
MCS Lock
  • FIFO order
  • Spin on local memory only
  • Small, Constant-size overhead

145
Initially
[Diagram: tail; node flags: false, false]
146
Acquiring
[Diagram: tail; node flags: true, false, false] (allocate Qnode)
147
Acquiring
[Same diagram] swap
148
Acquiring
[Same diagram]
149
Acquired
[Same diagram]
150
Acquiring
[Diagram: tail; node flags: false, true] swap
151
Acquiring
[Same diagram]
152
Acquiring
[Same diagram]
153
Acquiring
[Same diagram]
154
Acquiring
[Diagram: tail; node flags: true, true, false]
155
Acquiring
[Diagram: tail; node flags: true, false, true] Yes!
156
MCS Queue Lock
class Qnode {
  volatile boolean locked = false;   // volatile so that spinning threads see updates
  volatile Qnode next = null;
}
157
MCS Queue Lock
class MCSLock implements Lock {
  AtomicReference<Qnode> tail;

  public void lock() {
    Qnode qnode = new Qnode();             // make a QNode
    Qnode pred = tail.getAndSet(qnode);    // add my node to the tail of queue
    if (pred != null) {                    // fix if queue was non-empty
      qnode.locked = true;
      pred.next = qnode;
      while (qnode.locked) {}              // wait until unlocked
    }
  }
}
158
MCS Queue Lock
[Code as on slide 157] Make a QNode
159
MCS Queue Lock
[Code as on slide 157] Add my node to the tail of queue
160
MCS Queue Lock
[Code as on slide 157] Fix if queue was non-empty
161
MCS Queue Lock
[Code as on slide 157] Wait until unlocked
162
MCS Queue Unlock
class MCSLock implements Lock {
  AtomicReference<Qnode> tail;

  public void unlock() {
    if (qnode.next == null) {                 // missing successor?
      if (tail.compareAndSet(qnode, null))    // if really no successor, return
        return;
      while (qnode.next == null) {}           // otherwise wait for successor to catch up
    }
    qnode.next.locked = false;                // pass lock to successor
  }
}
163
MCS Queue Lock
[Code as on slide 162] Missing successor?
164
MCS Queue Lock
[Code as on slide 162] If really no successor, return
165
MCS Queue Lock
[Code as on slide 162] Otherwise wait for successor to catch up
166
MCS Queue Lock
[Code as on slide 162] Pass lock to successor
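In the slide code, unlock() refers to a qnode that was created as a local variable in lock(). A runnable version has to keep that node somewhere. The sketch below uses a thread-local node that is reused across acquisitions, which is an assumption of this example and not something the slides show. Thread.yield() in the spin loops only keeps the demo responsive when threads outnumber cores, and the names MCSLockDemo and hammer are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

class MCSLockDemo {
    static final class QNode {
        volatile boolean locked = false;
        volatile QNode next = null;
    }

    static final class MCSLock {
        private final AtomicReference<QNode> tail = new AtomicReference<>(null);
        private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

        void lock() {
            QNode qnode = myNode.get();
            QNode pred = tail.getAndSet(qnode);           // add my node to the tail of the queue
            if (pred != null) {                           // queue was non-empty
                qnode.locked = true;
                pred.next = qnode;                        // link in behind the predecessor
                while (qnode.locked) Thread.yield();      // spin on my own node only
            }
        }

        void unlock() {
            QNode qnode = myNode.get();
            if (qnode.next == null) {                     // missing successor?
                if (tail.compareAndSet(qnode, null)) return;      // really no successor
                while (qnode.next == null) Thread.yield();        // successor is still linking in
            }
            qnode.next.locked = false;                    // pass lock to successor
            qnode.next = null;                            // reset so the node can be reused
        }
    }

    static int shared;   // only touched while holding the lock

    /** Starts 'threads' threads that each do 'perThread' locked increments; returns the final count. */
    static int hammer(int threads, int perThread) {
        shared = 0;
        final MCSLock lock = new MCSLock();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try { shared++; } finally { lock.unlock(); }
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000));   // 40000 if mutual exclusion holds
    }
}
```

Unlike CLH, each thread spins on a field of its own node, which is what makes the lock suitable for machines where remote memory is slow. The price is the extra wait in unlock() for a successor that has swapped the tail but not yet set next.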
167
Purple Release
[Diagram: node flags: false, false]
168
Purple Release
[Same diagram] By looking at the queue, I see another thread is
active
169
Purple Release
[Same diagram] By looking at the queue, I see another thread is
active. I have to wait for that thread to finish
170
Purple Release
[Diagram: node flags: true, false] prepare to spin
171
Purple Release
[Same diagram] spinning
172
Purple Release
[Diagram: node flags: true, false, false] spinning
173
Purple Release
[Same diagram] Acquired lock
174
Abortable Locks
  • What if you want to give up waiting for a lock?
  • For example
    • Timeout
    • Database transaction aborted by user

175
Back-off Lock
  • Aborting is trivial
    • Just return from lock() call
  • Extra benefit
    • No cleaning up
    • Wait-free
    • Immediate return

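For a lock that spins on a single flag, giving up really is just a return. The sketch below adds a timed tryLock to a TTAS-style lock to show this. The class name TimedTTASLock and the tryLock signature are assumptions modeled loosely on java.util.concurrent.locks.Lock and are not taken from the slides. Backoff is left out to keep the example short.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class TryLockDemo {
    static final class TimedTTASLock {
        private final AtomicBoolean state = new AtomicBoolean(false);

        /** Tries to acquire the lock, giving up once the timeout has elapsed. */
        boolean tryLock(long time, TimeUnit unit) {
            long deadline = System.nanoTime() + unit.toNanos(time);
            while (true) {
                while (state.get()) {
                    // Aborting is trivial: no queue entry or other state to clean up.
                    if (System.nanoTime() - deadline >= 0) return false;
                }
                if (!state.getAndSet(true)) return true;   // lock looked free and we won the TAS
            }
        }

        void unlock() { state.set(false); }
    }

    public static void main(String[] args) {
        TimedTTASLock lock = new TimedTTASLock();
        System.out.println(lock.tryLock(10, TimeUnit.MILLISECONDS));   // true: lock was free
        System.out.println(lock.tryLock(10, TimeUnit.MILLISECONDS));   // false: not reentrant, times out
        lock.unlock();
        System.out.println(lock.tryLock(10, TimeUnit.MILLISECONDS));   // true again after release
    }
}
```

A thread that times out leaves nothing behind, so no other thread can be stuck waiting on it. A queue lock cannot do this as cheaply, because a node that has been swapped into the queue is already some other thread's predecessor.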
176
Queue Locks
  • Can't just quit
    • Thread in line behind will starve
  • Need a graceful way out
  • Timeout Queue Lock

177
One Lock To Rule Them All?
  • TTAS+Backoff, CLH, MCS, ToLock
  • Each better than others in some way
  • There is no one solution
  • Lock we pick really depends on
    • the application
    • the hardware
    • which properties are important
