Title: On Specifying and Monitoring Epistemic Properties of Distributed Systems
1. On Specifying and Monitoring Epistemic Properties of Distributed Systems
- Koushik Sen
- Abhay Vardhan
- Gul Agha
- Grigore Rosu
University of Illinois at Urbana-Champaign, USA
2. Software Reliability
- Software Validation
- Rigorous and Complete Methods
- Model Checking
- Theorem Proving
- Infeasible for large-scale open distributed systems (Actors)
- Non-determinism and Asynchrony
- Testing
- Widely used
- Ad-Hoc
- Good Test Coverage Required
- Runtime Monitoring
- Adds rigor to Testing
3. Centralized Monitoring Approach
- Monitoring: Use Formal Methods in Testing
- Synthesize lightweight Monitors from Specification
- Automata, Rewriting-based Monitors
- Instrument code to insert monitors
- Execute instrumented code
- Distributed System Monitoring
- Global state is distributed
- For every state update, send the state to a central monitor
- Central monitor assembles them to form consistent execution traces
- Sequence of global states
- Monitor execution traces
4. An Example
- Mobile node a requests a certain value from node b
- b computes the value and sends it to a
- Property: no node receives a value from another node to which it had not sent a request
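5. Centralized Monitoring Example
If a receives a value from b, then b calculated the value after receiving a request from a:
$valRcv \rightarrow \blacklozenge(valComputed \wedge \blacklozenge valReq)$
Here $\blacklozenge$ reads "sometime in the past".
[Figure: timelines of processes a and b with events valReq and valRcv at a and valComputed at b. The central monitor tracks the subformulas $valReq$, $\blacklozenge valReq$, $valComputed \wedge \blacklozenge valReq$, and $\blacklozenge(valComputed \wedge \blacklozenge valReq)$.]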
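The centralized check can be sketched as a small monitor that consumes one global state at a time and keeps one boolean per past-time subformula. This is only an illustration: the class name `CentralMonitor`, the helper `check`, and the representation of a global state as a set of proposition names are assumptions, and "sometime in the past" is taken to include the current state.

```python
from dataclasses import dataclass


@dataclass
class CentralMonitor:
    """Hypothetical monitor for: valRcv -> once(valComputed and once(valReq))."""

    once_req: bool = False       # has valReq held in some state so far?
    once_computed: bool = False  # has (valComputed and once(valReq)) held so far?

    def step(self, state: set) -> bool:
        """Consume one global state and report whether the property holds in it."""
        # Update subformulas bottom-up, innermost first.
        self.once_req = self.once_req or "valReq" in state
        self.once_computed = self.once_computed or (
            "valComputed" in state and self.once_req
        )
        # The implication fails only if a value is received without a prior computation.
        return ("valRcv" not in state) or self.once_computed


def check(trace):
    """Run a fresh monitor over a trace (a list of global states)."""
    monitor = CentralMonitor()
    return [monitor.step(set(state)) for state in trace]
```

For instance, `check([{"valReq"}, {"valComputed"}, {"valRcv"}])` reports no violation, whereas a trace in which the value is computed without a preceding request is flagged at the state where `valRcv` holds. The monitor needs only two bits of memory regardless of trace length.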
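6. Decentralized Monitoring Approach
If a receives a value from b, then b calculated the value after receiving a request from a:
$valRcv \rightarrow @_b(\blacklozenge(valComputed \wedge @_a(\blacklozenge valReq)))$
Here $\blacklozenge$ reads "sometime in the past" and $@_p$ refers to the latest state of process p known locally.
[Figure: timelines of processes a and b with events valReq and valRcv at a and valComputed at b. The local monitors track the subformulas $\blacklozenge valReq$, $@_a(\blacklozenge valReq)$, $valComputed \wedge @_a(\blacklozenge valReq)$, $\blacklozenge(valComputed \wedge @_a(\blacklozenge valReq))$, and the full formula.]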
7. Past-time Distributed Temporal Logic (pt-DTL)
- Based on epistemic logic
- [Aumann 76], [Meenakshi et al. 00]
- Properties with respect to a process, say p
8. Leader Election Example
- If a leader is elected and the current process is the leader, then, to its knowledge, none of the other processes (b and c) is a leader
- $elected \rightarrow (state = leader \rightarrow (@_b(state \neq leader) \wedge @_c(state \neq leader)))$
9. Leader Election (Stronger Property)
- Every process must know the name of the process that has been elected leader
- $elected \rightarrow (\text{let } k = leaderName \text{ in } (@_b(leaderName = k) \wedge @_c(leaderName = k)))$
10. Leader Election (Open System)
- There is an arbitrary number of processes whose names are not known beforehand
- $elected \rightarrow (\text{let } k = leaderName \text{ in } @_{\forall j : j \neq i}(leaderName = k))$
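To make the quantified property concrete, the following sketch evaluates it at one process against that process's current knowledge of the others. The function name `leader_known_everywhere` and the dictionaries `local_state` and `knowledge` (process name to last known remote state) are hypothetical names introduced for illustration; how the knowledge is gathered is not modeled here.

```python
def leader_known_everywhere(me, local_state, knowledge):
    """Evaluate, at process `me`:
    elected -> (let k = leaderName in @[forall j : j != me](leaderName = k)).

    `knowledge` maps each known process name to the latest state of that
    process that `me` is aware of (an assumed representation).
    """
    if not local_state.get("elected", False):
        return True  # the implication holds vacuously

    k = local_state["leaderName"]  # let k = leaderName (value bound locally)

    # Universal quantification over every other process currently known.
    return all(
        remote.get("leaderName") == k
        for j, remote in knowledge.items()
        if j != me
    )
```

Because the set of processes is read from `knowledge` at evaluation time, no process names have to be fixed in the formula, which is the point of the open-system variant.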
11. Extended Distributed Temporal Logic (xDTL)
- Suitable for Open Distributed Systems (Actors)
- Ids of all processes are not known before-hand
- Quantification over processes
- All processes satisfying a predicate
- $@_{\forall j : pred(j)}$
- Some process satisfying a predicate
- $@_{\exists j : pred(j)}$
- Value-binding (Increases Expressive Power)
- $\text{let } k = x \text{ in } F$
- To refer to values in remote states
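12. xDTL syntax
- $F_i ::= true \mid false \mid P(E_i) \mid \neg F_i \mid F_i \wedge F_i$ (propositional)
- $\mid \odot F_i \mid \blacklozenge F_i \mid \blacksquare F_i \mid F_i \; S \; F_i$ (temporal)
- $\mid @_{\forall J} F_j \mid @_{\exists J} F_j$ (epistemic)
- $\mid \text{let } k = E_i \text{ in } F_i$ (binding)
- $E_i ::= c \mid v_i \in V_i \mid f(E_i) \mid k$ (functional)
- $\mid @_j E_j$ (epistemic)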
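One way to picture the grammar is as a small abstract syntax tree. The sketch below is an assumption-laden simplification: all class names, the operator labels `"prev"`, `"once"`, `"always"`, and the `show` printer are invented here, and expressions and quantifier guards are kept as plain strings rather than parsed terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:   # predicate over local expressions, kept as text in this sketch
    text: str

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Past:   # unary past-time operator: "prev", "once" or "always"
    op: str
    arg: object

@dataclass(frozen=True)
class Since:
    left: object
    right: object

@dataclass(frozen=True)
class At:     # quantifier is "forall" or "exists"; guard selects the processes
    quantifier: str
    guard: str
    body: object

@dataclass(frozen=True)
class Let:    # let name = expr in body
    name: str
    expr: str
    body: object


def implies(a, b):
    """a -> b, encoded with the primitive connectives as not(a and not b)."""
    return Not(And(a, Not(b)))


def show(f) -> str:
    """Render a formula as plain text."""
    if isinstance(f, Atom):
        return f.text
    if isinstance(f, Not):
        return f"not({show(f.arg)})"
    if isinstance(f, And):
        return f"({show(f.left)} and {show(f.right)})"
    if isinstance(f, Past):
        return f"{f.op}({show(f.arg)})"
    if isinstance(f, Since):
        return f"({show(f.left)} since {show(f.right)})"
    if isinstance(f, At):
        return f"@[{f.quantifier} {f.guard}]({show(f.body)})"
    if isinstance(f, Let):
        return f"let {f.name} = {f.expr} in {show(f.body)}"
    raise TypeError(f"unknown formula node: {f!r}")


# The open-system leader election property written with these constructors.
open_leader = implies(
    Atom("elected"),
    Let("k", "leaderName", At("forall", "j != i", Atom("leaderName = k"))),
)
```

Frozen dataclasses give structural equality and hashing for free, which is convenient if a monitor later wants to share or cache the values of repeated subformulas.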
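13. Interpretation of $@_{\forall J} E_j$ at process i
[Figure: space-time diagram of processes p1, p2, and p3 exchanging messages m1, m2, m3, and m4, with labels $x = 7$, $x = 9$, and $@_1(x = 9)$.]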
14. Monitoring Algorithm
- Requirements
- Should be fast so that online monitoring is possible
- Little memory overhead
- Additional messages sent should be minimal, ideally zero
- Monitoring using KnowledgeVector
- Maintain knowledge of global state at each process
- Update knowledge with incoming messages
- Attach knowledge to outgoing messages
- At each process, monitor local knowledge
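The maintain/update/attach steps above can be sketched as follows, using the request and reply example from the earlier slides. This is a minimal illustration and not the talk's actual data structure: the names `Entry`, `Node`, `record`, `outgoing`, `incoming`, `knows`, and `run`, as well as the use of a per-process sequence number to decide which entry is newer, are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    seq: int = 0         # number of recorded events at the owning process
    value: bool = False  # truth value of the subformula that process tracks


@dataclass
class Node:
    name: str
    kv: dict = field(default_factory=dict)  # process name -> Entry

    def record(self, value: bool) -> None:
        """Local event: store the new value of this node's own subformula."""
        old = self.kv.get(self.name, Entry())
        self.kv[self.name] = Entry(old.seq + 1, value)

    def outgoing(self) -> dict:
        """Copy of the knowledge, to piggyback on a message sent anyway."""
        return {p: Entry(e.seq, e.value) for p, e in self.kv.items()}

    def incoming(self, attached: dict) -> None:
        """Merge attached knowledge, keeping the most recent entry per process."""
        for p, e in attached.items():
            if p != self.name and e.seq > self.kv.get(p, Entry()).seq:
                self.kv[p] = e

    def knows(self, p: str) -> bool:
        """Value of the remote subformula @p(...) according to local knowledge."""
        return self.kv.get(p, Entry()).value


def run(request_first: bool) -> bool:
    """Replay the example; return what a concludes when it receives the value."""
    a, b = Node("a"), Node("b")
    if request_first:
        a.record(True)            # at a: once(valReq) becomes true
        b.incoming(a.outgoing())  # the request message carries a's knowledge
    # b computes the value: once(valComputed and @a(once(valReq)))
    b.record(b.knows("a"))
    a.incoming(b.outgoing())      # the reply message carries b's knowledge
    return a.knows("b")           # valRcv holds at a, so the property is @b(...)
```

In this sketch `run(True)` models the correct protocol and `run(False)` models b sending a value that was never requested. Knowledge travels only on the application's own request and reply messages, so no monitoring-specific messages are introduced.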
15. Conclusion
- Decentralized technique to effectively verify open distributed systems at runtime
- No extra message overhead for monitoring
- xDTL can express interesting and useful safety properties of distributed systems
- How to instrument code running on all processes so that monitoring can be done?