Title: Efficient Dependency Tracking for Relevant Events in Shared Memory Systems
1Efficient Dependency Tracking for Relevant Events
in Shared Memory Systems
- Anurag Agarwal (anurag_at_cs.utexas.edu)
- Vijay K. Garg (garg_at_ece.utexas.edu)
- PDS Lab
- University of Texas at Austin
2Outline
- Motivation
- Background
- Chain Clock
- Instances of Chain Clock
- Experimental Results
- Conclusion
3Motivation
- Dependency between events required for global state information
- Applications like monitoring and debugging
- Vector clock [Fidge 88, Mattern 89]
- O(N) operations for a system with N processes
- Dynamic creation of processes
5Relevant Events
- Events useful for application
- Predicate Detection
- There are no messages in the channel
[Figure: computation on four processes p1, p2, p3 and p4]
6Vector Clocks [Fidge 88, Mattern 89]
- Assigns N-tuple (V) to every relevant event
- e → f iff e.V < f.V (clock condition)
- Process Pi
  - V = (0, ..., 0)
  - On an event e
    - If e is receive of message m
      - V = max(V, m.V)
    - If e is a relevant event
      - V[i] = V[i] + 1
    - If e is a send of message m
      - m.V = V
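The three rules above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the class name `Process`, its method names and the helper `happened_before` are assumptions introduced here, and a message is modeled simply as the vector it carries.

```python
class Process:
    """Process P_i in a system of n processes (illustrative names)."""

    def __init__(self, i, n):
        self.i = i
        self.V = [0] * n                      # V = (0, ..., 0)

    def receive(self, m_V):
        # V = max(V, m.V), taken component-wise
        self.V = [max(a, b) for a, b in zip(self.V, m_V)]

    def relevant_event(self):
        # V[i] = V[i] + 1; the returned copy is the event's timestamp
        self.V[self.i] += 1
        return tuple(self.V)

    def send(self):
        # m.V = V: the message carries a copy of the sender's vector
        return tuple(self.V)


def happened_before(eV, fV):
    """Clock condition: e -> f iff e.V < f.V (<= everywhere, different somewhere)."""
    return all(a <= b for a, b in zip(eV, fV)) and eV != fV


p1, p2 = Process(0, 2), Process(1, 2)
e = p1.relevant_event()        # relevant event on p1
m = p1.send()                  # p1 sends a message to p2
g = p2.relevant_event()        # relevant event on p2, concurrent with e
p2.receive(m)
f = p2.relevant_event()        # relevant event on p2 after the receive
```

Here `happened_before(e, f)` holds because the message carries e's timestamp to p2, while e and g are incomparable in both directions, that is, concurrent.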
8Key Idea
- Any chain in the computation poset can function as a process
[Figure: computation poset with events a, b, c, d, e, f, g and h]
9Chain Clocks
- A component in the timestamp corresponds to a chain
- Change Rule II in the vector clock algorithm
  - If e is a relevant event
    - V[e.c] = V[e.c] + 1
- Theorem: Chain clocks guarantee the clock condition
- Goal: Online decomposition of the poset into as few chains as possible
10Outline
- Motivation
- Background
- Chain Clock
- Instances of Chain Clock
- DCC
- ACC
- VCC
- Experimental Results
- Conclusion
11Dynamic Chain Clocks (DCC)
- Shared vector Z maintains up-to-date values of all components
- Each process starts with an empty vector
- Rule II
  - e.c = j such that Z[j] = e.V[j]
  - Give preference to the component last updated by Pi
  - V[e.c] = V[e.c] + 1
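A minimal Python sketch of the DCC rules above, under stated assumptions: the class and method names are hypothetical, a lock stands in for whatever synchronization protects Z, a new component is opened when no j satisfies Z[j] = V[j] (the slides do not spell this case out), and ties among candidates are broken by lowest index.

```python
import threading


class SharedZ:
    """Shared vector Z: Z[j] is the latest value of component j."""

    def __init__(self):
        self.Z = []
        self.lock = threading.Lock()


class Process:
    def __init__(self, shared):
        self.shared = shared
        self.V = []            # each process starts with an empty vector
        self.last = None       # component this process updated last

    def _get(self, j):
        return self.V[j] if j < len(self.V) else 0

    def receive(self, m_V):
        # V = max(V, m.V); missing components count as 0
        n = max(len(self.V), len(m_V))
        self.V = [max(self._get(j), m_V[j] if j < len(m_V) else 0)
                  for j in range(n)]

    def send(self):
        return tuple(self.V)   # m.V = V

    def relevant_event(self):
        with self.shared.lock:
            Z = self.shared.Z
            # Rule II: pick e.c = j such that Z[j] == V[j]
            candidates = [j for j in range(len(Z)) if Z[j] == self._get(j)]
            if self.last in candidates:
                c = self.last              # prefer the component updated last
            elif candidates:
                c = candidates[0]          # assumption: lowest index wins ties
            else:
                Z.append(0)                # assumption: open a new component
                c = len(Z) - 1
            self.V.extend([0] * (len(Z) - len(self.V)))
            self.V[c] += 1
            Z[c] = self.V[c]
            self.last = c
            return tuple(self.V)


shared = SharedZ()
p1, p2 = Process(shared), Process(shared)
a = p1.relevant_event()        # first component is created for p1
b = p2.relevant_event()        # p2 has not seen a, so a second component opens
p1.receive(p2.send())
c = p1.relevant_event()        # p1 reuses the component it updated last
```

In this run the timestamps come out as (1), (0,1) and (2,1): the vectors grow only when a process cannot extend any existing chain.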
12DCC Example
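- If e is receive of message m
  - V = max(V, m.V)
- If e is a relevant event
  - e.c = i s.t. Z[i] = V[i]
  - V[e.c] = V[e.c] + 1
  - Z[e.c] = Z[e.c] + 1
- If e is a send of message m: m.V = V
[Figure: example DCC run on processes p1, p2 and p3, showing event timestamps such as (1), (0,1), (1,1) = max((1), (0,1)), (2,1), (3,1) and (3,2), together with the successive values of V1, V2, V3 and the shared vector Z]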
13Problem
- Number of processes can be much larger than minimal number of chains
[Figure: processes p1 to p4 whose timestamps grow by one component per process, from (1) on p1 to (0,1,1,1) and (1,2,2,2) on p4]
14Optimal Chain Decomposition
- Antichain: Set of pairwise concurrent elements
- Width: Maximum size of an antichain
- Dilworth's Theorem [1950]: A poset of width k can be partitioned into k chains and no fewer.
- Requires knowledge of complete poset
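To make the definitions of antichain and width concrete, here is a small brute-force Python sketch. It is only an illustration: the function `width`, the comparator `less` and the example poset are assumptions introduced here, and the exhaustive search is exponential, so it is usable only on tiny posets whose elements are all known in advance.

```python
from itertools import combinations


def width(elements, less):
    """Size of the largest antichain, found by exhaustive search."""
    def concurrent(a, b):
        return not less(a, b) and not less(b, a)

    for size in range(len(elements), 0, -1):
        for subset in combinations(elements, size):
            if all(concurrent(a, b) for a, b in combinations(subset, 2)):
                return size
    return 0


# Hypothetical poset: two independent chains 0 < 2 < 4 and 1 < 3 < 5.
def less(a, b):
    return a < b and a % 2 == b % 2


k = width(range(6), less)
```

For this poset the largest antichain has one even and one odd element, so the width is 2, and the two chains of even and odd numbers form the partition that Dilworth's theorem promises.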
15Online Chain Decomposition
- Elements of poset presented in a total order consistent with the poset
- Assign elements to chains as they arrive
- Can be modeled as a game between
  - Bob: Presents elements
  - Alice: Assigns them to chains
- Felsner [1997]: For a poset of width k, Bob can force Alice to use k(k+1)/2 chains
16Chain Partitioning Algorithm (ACC)
- Felsner gave an algorithm which meets the k(k+1)/2 bound
- Our algorithm is simpler and more efficient
- Sets of queues B1, ..., Bk with |Bi| = i
- For an element z
  - Insert into the first queue q in Bi with head < z
  - Swap queues in Bi and Bi-1 leaving q in its place
[Figure: element z being inserted into the queue sets B1, B2 and B3]
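The insertion and swap steps above can be sketched in Python as follows. This is one reading of the slide, not the authors' code: the class name `ChainPartitioner`, the comparator `less`, the assumption that the width bound k is known up front, and the treatment of an empty queue as always accepting z are choices made for this illustration.

```python
class ChainPartitioner:
    """Online chain partition using sets of queues B_1..B_k with |B_i| = i."""

    def __init__(self, k, less):
        self.less = less
        self.chains = []                       # contents of every queue
        self.B = [[self._new_queue() for _ in range(i)]
                  for i in range(1, k + 1)]

    def _new_queue(self):
        self.chains.append([])
        return len(self.chains) - 1            # a queue is named by its index

    def insert(self, z):
        for i, Bi in enumerate(self.B):
            for q in Bi:
                chain = self.chains[q]
                if not chain or self.less(chain[-1], z):
                    chain.append(z)            # z becomes the new head of q
                    if i > 0:
                        # swap B_{i-1} with B_i minus q; q stays in B_i
                        rest = [r for r in Bi if r != q]
                        self.B[i] = self.B[i - 1] + [q]
                        self.B[i - 1] = rest
                    return q
        raise ValueError("poset is wider than k")


# Hypothetical width-2 poset: evens and odds form two independent chains.
def less(a, b):
    return a < b and a % 2 == b % 2


acc = ChainPartitioner(2, less)
assignment = [acc.insert(z) for z in range(10)]   # a valid presentation order
used = [c for c in acc.chains if c]
```

Every non-empty queue in `used` is a chain of the poset, and with k = 2 the sketch never needs more than k(k+1)/2 = 3 of them.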
17Drawback of DCC and ACC
- Require a shared data structure
- Monitoring applications generally need a central server
- Hybrid clocks
- Multiple servers, each responsible for a subset of processes
- Finds chains within a process group
18Shared Memory System
- Accesses to shared variables induce dependencies
- Observation: Access events for a shared variable form a chain
- Variable-based Chain Clocks (VCC)
- Associate a component with every variable
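One way to realize the idea of a component per variable is sketched below in Python. The slides state only that each shared variable gets a component; keeping a vector on each variable and merging it with the accessing thread's vector, as well as the names `SharedVar` and `Worker` and the fixed number of variables, are assumptions made for this illustration.

```python
import threading


class SharedVar:
    """Shared variable that owns component c of the clock."""

    def __init__(self, c, n_vars, value=0):
        self.c = c
        self.value = value
        self.V = [0] * n_vars                  # timestamp of the last access
        self.lock = threading.Lock()


class Worker:
    """A thread of the application, carrying its own vector."""

    def __init__(self, n_vars):
        self.V = [0] * n_vars

    def access(self, var, new_value=None):
        """Read (new_value is None) or write var; returns the event timestamp."""
        with var.lock:                         # accesses to var form a chain
            merged = [max(a, b) for a, b in zip(self.V, var.V)]
            merged[var.c] += 1                 # advance the variable's component
            self.V = merged
            var.V = list(merged)
            if new_value is not None:
                var.value = new_value
            return tuple(merged)


x, y = SharedVar(0, 2), SharedVar(1, 2)
t1, t2 = Worker(2), Worker(2)
e1 = t1.access(x, 1)       # t1 writes x
e2 = t2.access(y, 1)       # t2 writes y, concurrent with e1
e3 = t2.access(x)          # t2 reads x, so it now depends on e1 and e2
```

The timestamps have as many components as there are variables, regardless of how many threads run, which is the appeal when threads greatly outnumber shared variables.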
19VCC Application Predicate Detection
- Predicate: (x = 1) and (y = 1)
- Only events changing x and y are relevant
- Associate one component of VCC with x and another with y
- Initially x = 0, y = 0
21Experiments
- Setup
- A multithreaded application
- Each thread generates a sequence of events
- Parameters
- Number of Processes
- Number of Events
- Probability of a relevant event (a)
- Metrics
- Number of components used
- Execution time
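22Components Used
[Figure: number of components used; Events = 100, a = 1]
23Execution Time
[Figure: execution time; Events = 100, a = 1]
24Effect of Relevancy
[Figure: effect of relevancy; Threads = 100, Events = 100]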
25Conclusion
- Generalized vector clocks to a class of algorithms called Chain Clocks
- Dynamic Chain Clock (DCC) can provide tremendous speedup and reduce memory requirement for applications
- Antichain-based Chain Clock (ACC) meets the lower bound for online chain decomposition
26Questions?
28Example Poset of width 2
- For a poset of width 2, Bob can force Alice to use 3 chains
[Figure: example poset of width 2 with node labels 1, 1, 2 and 3]
32Happened Before Relation (→) [Lamport 78]
- Distributed computation with N processes
- Every process executes a series of events
- Internal, send or receive event
[Figure: space-time diagram of processes p1 and p2]
- e → f if there is a path from e to f
- e || f if there is no path between e and f
33Future work
- Lower bound for online chain decomposition when a decomposition into N chains is already known
- Other chain decomposition strategies
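34Distributed System: Time vs Threads
[Figure: execution time vs. number of threads; Events = 100, a = 1]
35Distributed System: Events vs Time
[Figure: execution time vs. number of events; Threads = 100, a = 1]
36Effect of Number of Events
[Figure: effect of the number of events; Threads = 100, a = 1]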
40- Is the example for DCC appropriate?
- Is the content a bit too much for this amount?
- Where can I reduce it?
- Remove VCC or ACC?
- Chain clock
  - Generalizes vector clocks
  - Reduces the time and memory overhead
  - Elegantly handles dynamic process creation