Serdar Tasiran, Tayfun Elmas, Guven Bolukbasi, M. Erkan Keremoglu, Koç University, Istanbul, Turkey

1
A Novel Test Coverage Metric for
Concurrently-Accessed Software Components

Serdar Tasiran, Tayfun Elmas, Guven Bolukbasi, M. Erkan Keremoglu
Koç University, Istanbul, Turkey

2
Our Focus

Widely-used software systems are built on
concurrently-accessed software components
File systems, databases, internet services
Standard Java and C class libraries
Intricate synchronization mechanisms to improve
performance
Prone to concurrency errors
Concurrency errors
Data loss/corruption
Difficult to detect, reproduce through testing

3
The Location Pairs Metric

Goal of metric: To help answer the question
"If I am worried about concurrency errors only,
what unexamined scenario should I try to
trigger?"
Coverage metrics: Link between validation tools
Communicate partial results, testing goals
between tools
Direct tools toward unexplored, distinct new
executions
The location pairs (LP) metric
Directed at concurrency errors ONLY
Focus: High-level data races
Atomicity violations
Refinement violations
All variables may be lock-protected, but
operations not implemented atomically

4
Outline

Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The Location Pairs Metric
Discussion, Ongoing Work

5
Refinement as Correctness Criterion

Client threads invoke operations concurrently
Data structure operations should appear to be
executed
atomically
in a linear order
to client threads.

Component Implementation
6
Runtime Refinement Checking

Refinement
For each execution of Impl
there exists an equivalent, atomic execution of
data structure Spec
Spec: Atomized version of Impl
Client methods run one at a time
Obtained from Impl itself
Use refinement as correctness criterion
More thorough than assertions
More observability than pure testing
Runtime verification: Check refinement using
execution traces
Can handle industrial-scale programs
Intermediate between testing and exhaustive
verification

7
The VYRD Tool
[Figure: Vyrd architecture. A multi-threaded test drives Impl, which writes its actions to a log. Log entries shown include Call Insert(3), Call LookUp(3), Call Insert(4), Call Delete(3), A[0].elt = 3, A[1].elt = 4, A[0].elt = null, read A[0], Unlock A[0], Unlock A[1], Return success, Return true. The replay mechanism reads from the log, executes the logged actions on Impl_replay, and runs the methods atomically on Spec. The refinement checker compares trace_Impl with trace_Spec.]
At certain points for each method, take state
snapshots
Check consistency of data structure contents
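The log-and-replay check on this slide can be sketched roughly as follows. This is a minimal illustration, not Vyrd's actual code: the `ReplaySketch` class, the `Entry` log format, and the unbounded multiset used as Spec are all assumptions, and only return values are compared (no state snapshots).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReplaySketch {
    /** One logged method execution, listed in the order it took effect (hypothetical format). */
    static final class Entry {
        final String method;
        final int arg;
        final boolean result;

        Entry(String method, int arg, boolean result) {
            this.method = method;
            this.arg = arg;
            this.result = result;
        }
    }

    /**
     * Replays the log one method at a time on a sequential multiset Spec.
     * Returns the index of the first entry whose logged result differs from
     * the Spec's result, or -1 if every entry agrees.
     */
    static int firstMismatch(List<Entry> log) {
        Map<Integer, Integer> spec = new HashMap<>(); // element -> multiplicity
        for (int i = 0; i < log.size(); i++) {
            Entry e = log.get(i);
            boolean expected;
            switch (e.method) {
                case "Insert":
                    spec.merge(e.arg, 1, Integer::sum);
                    expected = true; // assumption: the Spec multiset never fills up
                    break;
                case "Delete":
                    expected = spec.containsKey(e.arg);
                    if (expected) {
                        int c = spec.get(e.arg);
                        if (c == 1) {
                            spec.remove(e.arg);
                        } else {
                            spec.put(e.arg, c - 1);
                        }
                    }
                    break;
                case "LookUp":
                    expected = spec.containsKey(e.arg);
                    break;
                default:
                    throw new IllegalArgumentException("unknown method: " + e.method);
            }
            if (expected != e.result) {
                return i;
            }
        }
        return -1;
    }
}
```

A log in which LookUp(3) returns true after Delete(3) has removed the only 3 would be reported at the index of that LookUp.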

8
The Vyrd Experience

Scalable method: Caught previously undetected,
serious but subtle bugs in industrial-scale
designs
Boxwood (30K LOC)
Scan Filesystem (Windows NT)
Java Libraries with known bugs
Reasonable runtime overhead
Key novelty: Checking refinement improves
observability
Catches bugs that are triggered but not observed
by testing
Significant improvement

9
Experience
The Boxwood Project
10
Refinement vs. Testing: Improved Observability

Using Vyrd, caught previously undetected bug in
Boxwood Cache
Scan File System (Windows NT)
Bug manifestation
Cache entry is correct, marked clean
Permanent storage has corrupted data
Hard to catch through testing
As long as Reads hit in Cache, return value
correct
Caught through testing only if
Cache fills, clean entry in Cache is evicted
Not written again to permanent storage since
entry is marked clean
Entry read from permanent storage after eviction
With no Writes to entry in the meantime
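The manifestation above can be made concrete with a toy model. This is only a sketch under stated assumptions: `ToyCache` is a hypothetical class, not Boxwood code, and the corruption is injected by hand through `flushWithInjectedCorruption` rather than produced by a real race. It shows why reads keep returning correct data until the clean entry is evicted.

```java
import java.util.HashMap;
import java.util.Map;

final class ToyCache {
    private final Map<Integer, String> cache = new HashMap<>();
    private final Map<Integer, String> storage = new HashMap<>();
    private final Map<Integer, Boolean> dirty = new HashMap<>();

    /** Writes go to the cache and mark the entry dirty. */
    void write(int handle, String data) {
        cache.put(handle, data);
        dirty.put(handle, true);
    }

    /**
     * Stand-in for the buggy outcome (assumption): permanent storage receives
     * corrupted data, yet the cache entry is marked clean.
     */
    void flushWithInjectedCorruption(int handle, String corrupted) {
        storage.put(handle, corrupted);
        dirty.put(handle, false);
    }

    /** Evicts an entry; only dirty entries are written back to storage. */
    void evict(int handle) {
        if (Boolean.TRUE.equals(dirty.get(handle))) {
            storage.put(handle, cache.get(handle));
        }
        cache.remove(handle);
        dirty.remove(handle);
    }

    /** Reads hit the cache first and fall back to permanent storage. */
    String read(int handle) {
        String hit = cache.get(handle);
        return hit != null ? hit : storage.get(handle);
    }
}
```

In this model a test that only reads while the entry is cached never observes the corruption, which is the observability gap that refinement checking is meant to close.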

11
Outline

Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The Location Pairs Metric
Discussion, Ongoing Work

12
Idea behind the LP metric

Observation: Bug occurs whenever
Method1 executes up to line X, context switch
occurs
Method2 starts execution from line Y
Provided there is a data dependency between
Method1's code right before line X (BlockX)
Method2's code right after line Y (BlockY)
Description of bug in the log follows pattern
above
Only requirement on program state, other threads,
etc.
Make the interleaving above possible
May require many other threads, complicated
program state, ...
A one-bit data abstraction captures error
scenario
depdt: Is there a data dependency between BlockX
and BlockY?

13
public synchronized StringBuffer append(StringBuffer sb) {
    int len = sb.length();
    int newCount = count + len;
    if (newCount > value.length)
        ensureCapacity(newCount);
    sb.getChars(0, len, value, count);
    count = newCount;
    return this;
}

public synchronized void setLength(int newLength) {
    ...
    if (count < newLength) {
        ...
    } else {
        count = newLength;
        ...
    }
}
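The interleaving behind this slide can be reproduced deterministically in a small model. The sketch below is an assumption-laden illustration, not the JDK's StringBuffer: `MiniBuffer` is a hypothetical class, and the `contextSwitch` callback stands in for a real context switch between reading `sb.length()` and calling `sb.getChars(...)`.

```java
import java.util.Arrays;

final class MiniBuffer {
    private char[] value;
    private int count;

    MiniBuffer(String s) {
        value = Arrays.copyOf(s.toCharArray(), Math.max(16, s.length()));
        count = s.length();
    }

    synchronized int length() {
        return count;
    }

    synchronized void setLength(int newLength) {
        if (newLength > value.length) {
            value = Arrays.copyOf(value, newLength);
        }
        if (count < newLength) {
            Arrays.fill(value, count, newLength, '\0');
        }
        count = newLength;
    }

    synchronized void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        if (srcBegin < 0 || srcBegin > srcEnd || srcEnd > count) {
            throw new StringIndexOutOfBoundsException("srcEnd " + srcEnd + " > count " + count);
        }
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }

    /** Holds this buffer's lock, but not sb's lock, between the two calls on sb. */
    synchronized MiniBuffer append(MiniBuffer sb, Runnable contextSwitch) {
        int len = sb.length();
        int newCount = count + len;
        if (newCount > value.length) {
            value = Arrays.copyOf(value, newCount);
        }
        contextSwitch.run(); // hypothetical hook: another thread may run here
        sb.getChars(0, len, value, count); // len may be stale by now
        count = newCount;
        return this;
    }

    /** Runs setLength(0) on the argument at the context-switch point. */
    static boolean staleLengthDetected() {
        MiniBuffer target = new MiniBuffer("ab");
        MiniBuffer source = new MiniBuffer("cd");
        try {
            target.append(source, () -> source.setLength(0));
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

Every field access is lock-protected here, yet `append` is not atomic with respect to `setLength` on its argument, which is the kind of high-level race the metric targets.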

14
Experience
Concurrency Bug in Cache
Flush() starts
Write(handle, AB) starts
Flush() ends
Write(handle, AB) ends
15
private static void CpToCache(byte[] buf, CacheEntry te, int lsn, Handle h) {
    for (int i = 0; i < buf.length; i++)
        te.data[i] = buf[i];
    ...
    te.lsn = lsn;
}

public static void Flush(int lsn) {
    ...
    lock (clean) {
        BoxMain.alloc.Write(h, te.data, te.data.length, 0,
                            0, WRITE_TYPE_RAW);
    }
}

16
Outline

Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The Location Pairs Metric
Discussion, Ongoing Work

17
public synchronized StringBuffer append(StringBuffer sb) {
    int len = sb.length();
    int newCount = count + len;
    if (newCount > value.length)
        ensureCapacity(newCount);
    sb.getChars(0, len, value, count);
    count = newCount;
    return this;
}

Actions of append, with locations marked between them:
    acquire(this)
    invoke sb.length()
  L1
    int len = sb.length()
  L2
    int newCount = count + len
    if (newCount > value.length)
        expandCapacity(newCount)
    invoke sb.getChars()
    sb.getChars(0, len, value, count)
    count = newCount
    return this
18
Coverage FSM State
(LX, pend1, LY, pend2, depdt)
LX: Location in the CFG of Method 1
pend1: Is an interesting action in Method 1 expected next?
LY: Location in the CFG of Method 2
pend2: Is an interesting action in Method 2 expected next?
depdt: Do actions following LX and LY have a data dependency?
19
Coverage FSM
(L1, !pend1, L3, !pend2, depdt)
t1: L1 → L2
t2: L3 → L4
t2: L3 → L4
t1: L1 → L2
20
Coverage Goal

The pend1 bit gets set when
The depdt bit is TRUE
Method2 takes an action
Intuition: Method1's dependent action must
follow
Must cover all (reachable) transitions of the
form
p = (LX_p, TRUE, LY, pend2_p, depdt_p) → q = (LX_q, pend1_q, LY, pend2_q, depdt_q)
p = (LX, pend1_p, LY_p, TRUE, depdt_p) → q = (LX, pend1_q, LY_q, pend2_q, depdt_q)
Separate coverage FSM for each method pair
FSM(Method1, Method2)
Cover required transitions in each FSM
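One way to picture the bookkeeping for a single FSM(Method1, Method2) is sketched below. The class names `CoverageState` and `CoverageTracker` are hypothetical, and `isRequired` encodes one reading of the two transition forms above: a transition must be covered if pend1 is TRUE in the source state and LY is unchanged, or if pend2 is TRUE in the source state and LX is unchanged.

```java
import java.util.HashSet;
import java.util.Set;

final class CoverageState {
    final int lx;        // location in the CFG of Method 1
    final boolean pend1; // interesting action of Method 1 expected next?
    final int ly;        // location in the CFG of Method 2
    final boolean pend2; // interesting action of Method 2 expected next?
    final boolean depdt; // do the actions following LX and LY depend on each other?

    CoverageState(int lx, boolean pend1, int ly, boolean pend2, boolean depdt) {
        this.lx = lx;
        this.pend1 = pend1;
        this.ly = ly;
        this.pend2 = pend2;
        this.depdt = depdt;
    }

    @Override
    public String toString() {
        return "(L" + lx + "," + pend1 + ",L" + ly + "," + pend2 + "," + depdt + ")";
    }
}

final class CoverageTracker {
    private final Set<String> covered = new HashSet<>();

    /** True if the transition p -> q matches one of the two required forms. */
    static boolean isRequired(CoverageState p, CoverageState q) {
        boolean method1Moves = p.pend1 && p.ly == q.ly;
        boolean method2Moves = p.pend2 && p.lx == q.lx;
        return method1Moves || method2Moves;
    }

    /** Records an observed transition if it is one that must be covered. */
    void observe(CoverageState p, CoverageState q) {
        if (isRequired(p, q)) {
            covered.add(p + "->" + q);
        }
    }

    boolean isCovered(CoverageState p, CoverageState q) {
        return covered.contains(p + "->" + q);
    }

    int coveredCount() {
        return covered.size();
    }
}
```

A tool built along these lines would compare the covered set against the reachable required transitions and report the difference to the programmer.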

21
Important Details

Action: Atomically executed code fragment
Defined by the language
Method calls
Call action: Method call, all lock acquisitions
Return action: Total net effect of method,
atomically executed lock releases
Separate coverage FSM for each method pair
FSM(Method1, Method2)
Cover required transitions in each FSM
But what if there is interesting concurrency
inside called method?
Considered separately when that method is
considered as one in the method pair
If Method1 calls Method3
Considered when FSM(Method3, Method2) is covered

22
Outline

Runtime Refinement Checking
Examples of Refinement/Atomicity Violations
The Location Pairs Metric
Discussion, Ongoing Work

23
Empirical evidence

Does this metric correspond well with high-level
concurrency errors?
Errors captured by metric
100% metric ⇒ Bug guaranteed to be triggered
Triggered vs. detected
May need view refinement checking to improve
observability
Preliminary study
Bugs in Java class libraries
Bug found in Boxwood cache
Bug found in Scan file system
Bug categories reported in E. Farchi, Y. Nir,
S. Ur, "Concurrent Bug Patterns and How to Test
Them," 17th Intl. Parallel and Distributed
Processing Symposium (IPDPS '03)
How many are covered by random testing? How does
coverage change over time?
Don't know yet. Implementing coverage measurement
tool.

24
Reducing the Coverage FSM

Method-local actions
Basic block consisting of method-local actions
considered a single atomic action
Pure blocks [Flanagan, Qadeer, ISSTA '04]
A pure execution of pure block does not affect
global state
Example: Acquire lock, read global variable,
decide resource not free, release lock
Considered a no-op
Modeled by bypass transition in coverage FSM.
Does not need to be covered
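The example on this slide might look as follows in Java. This is a sketch with a hypothetical `Resource` class; the `return false` path of `tryAcquire` is the pure execution, since it acquires the lock, reads the shared flag, and releases the lock without changing any shared state.

```java
final class Resource {
    private boolean free = true;

    /** Returns true if the resource was acquired; the failing path is a pure execution. */
    boolean tryAcquire() {
        synchronized (this) {   // acquire lock
            if (!free) {        // read global variable, decide resource not free
                return false;   // lock released on exit, shared state untouched
            }
            free = false;       // this path does change shared state
            return true;
        }
    }

    synchronized void release() {
        free = true;
    }

    synchronized boolean isFree() {
        return free;
    }
}
```

In the coverage FSM the failing call would be modeled by a bypass transition, so no interleaving involving it needs to be covered.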

25
Discussion

The metric is NOT for deciding when to stop
testing/verification
Intended use
Testing, runtime verification is applied to
program
List of non-covered coverage targets provided to
programmer
Intuition: Given an unexercised scenario, the
programmer must have a simple reason to believe
that
the scenario is not possible, or
the scenario is safe
Given uncovered coverage target, programmer
either provides hints to coverage tool to rule
target out
or, assumes that coverage target is a
possibility,
writes test to trigger it
or, makes sure that no concurrency error would
result if coverage target were to be exercised

26
Future Work: Approximating Reachable LP Set

Number of locations per method in Boxwood: 10, after
factoring out atomic and pure blocks
LP reachability undecidable
Metric only intended as aid to programmer
What have I tested?
What should I try to test?
Make sure LP does not lead to error if it looks
like it can be exercised.
Future work: Better approximate reachable LP set
Do conservative reachability analysis of coverage
FSM using predicate abstraction.
Programmer can add predicates for better FSM
reduction

28
Multiset
Implementation LookUp

Multiset data structure
M = {2, 3, 3, 3, 9, 8, 8, 5}
Has highly concurrent implementations of
Insert
Delete
InsertPair
LookUp

LookUp(x)
  for i = 1 to n
    acquire(A[i])
    if (A[i].content == x && A[i].valid)
      release(A[i])
      return true
    else
      release(A[i])
  return false
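A Java rendering of LookUp might look like the sketch below, assuming one monitor per array slot. The `MultisetSketch` class and its `Slot` type are hypothetical, and `insert` is a simplified helper added only so the example can be exercised; it is not the Insert discussed in the slides.

```java
final class MultisetSketch {
    private static final class Slot {
        int content;
        boolean valid;
    }

    private final Slot[] slots;

    MultisetSketch(int n) {
        slots = new Slot[n];
        for (int i = 0; i < n; i++) {
            slots[i] = new Slot();
        }
    }

    /** Scans the slots one at a time, holding only the current slot's lock. */
    boolean lookUp(int x) {
        for (Slot s : slots) {
            synchronized (s) { // acquire(A[i]); released when the block exits
                if (s.content == x && s.valid) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Simplified helper (assumption): claims the first invalid slot under its lock. */
    boolean insert(int x) {
        for (Slot s : slots) {
            synchronized (s) {
                if (!s.valid) {
                    s.content = x;
                    s.valid = true;
                    return true;
                }
            }
        }
        return false;
    }
}
```

Because only one slot lock is held at a time, other threads can modify slots that the scan has already passed or not yet reached.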
29
Multiset
Testing

Don't know which happened first:
Insert(3) or Delete(3)?
Should 3 be in the multiset at the end?
Must accept both possibilities as correct
Common practice
Run long multi-threaded test
Perform sanity checks on final state
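The common practice described here can be sketched as a test that accepts either outcome. The `FinalStateCheck` class is hypothetical, and a synchronized list stands in for the multiset; the point is that the final-state check has to allow both 0 and 1 occurrences of 3, so it says little about what happened during the run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class FinalStateCheck {
    /** Runs Insert(3) and Delete(3) concurrently and returns how many 3s remain. */
    static int runOnce() {
        List<Integer> multiset = Collections.synchronizedList(new ArrayList<>());
        Thread inserter = new Thread(() -> multiset.add(3));
        Thread deleter = new Thread(() -> multiset.remove(Integer.valueOf(3)));
        inserter.start();
        deleter.start();
        try {
            inserter.join();
            deleter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return Collections.frequency(multiset, 3);
    }

    /** Both orders are legal, so both final states must be accepted. */
    static boolean isAcceptable(int occurrencesOfThree) {
        return occurrencesOfThree == 0 || occurrencesOfThree == 1;
    }
}
```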

30
Multiset
I/O Refinement
31
View-refinement
View Variables

State correspondence
Hypothetical view variables must match at
commit points
view variable
Value of variable is abstract data structure
state
Updated atomically once by each method
For A[1..n]
Extract content if valid = true
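The view computation on this slide can be sketched as below. `ViewSketch` is a hypothetical class and the parallel `content`/`valid` arrays are an assumed representation of A[1..n]; the view is returned sorted so that two states holding the same multiset compare equal regardless of slot order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ViewSketch {
    /** Extracts the abstract multiset: the content of every slot whose valid bit is true. */
    static List<Integer> view(int[] content, boolean[] valid) {
        List<Integer> v = new ArrayList<>();
        for (int i = 0; i < content.length; i++) {
            if (valid[i]) {
                v.add(content[i]);
            }
        }
        Collections.sort(v); // canonical order, so slot position does not matter
        return v;
    }

    /** Check to run at a commit point: the two views must denote the same multiset. */
    static boolean viewsMatch(List<Integer> viewImpl, List<Integer> viewSpec) {
        List<Integer> a = new ArrayList<>(viewImpl);
        List<Integer> b = new ArrayList<>(viewSpec);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }
}
```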

32
View-refinement
[Figure: execution trace annotated with view variables. Actions shown: Call LookUp(3), Call Insert(3), A[0].elt = 3, Call Delete(3), Call Insert(4), A[1].elt = 4, Return true, A[0].elt = null, and three Return success. View values shown: viewImpl = viewSpec = {3}; viewImpl = viewSpec = {3, 4}; viewImpl = viewSpec = {4}.]
