Serdar Tasiran, Tayfun Elmas, Guven Bolukbasi, M. Erkan Keremoglu, Koç University, Istanbul, Turkey

1
A Novel Test Coverage Metric for
Concurrently-Accessed Software Components
  • Serdar Tasiran, Tayfun Elmas, Guven Bolukbasi,
    M. Erkan Keremoglu
  • Koç University, Istanbul, Turkey

2
Our Focus
  • Widely-used software systems are built on
    concurrently-accessed software components
  • File systems, databases, internet services
  • Standard Java and C class libraries
  • Intricate synchronization mechanisms to improve
    performance
  • Prone to concurrency errors
  • Concurrency errors
  • Data loss/corruption
  • Difficult to detect, reproduce through testing

3
The Location Pairs Metric
  • Goal of metric: To help answer the question
  • If I am worried about concurrency errors only,
    what unexamined scenario should I try to
    trigger?
  • Coverage metrics: Link between validation tools
  • Communicate partial results, testing goals
    between tools
  • Direct tools toward unexplored, distinct new
    executions
  • The location pairs (LP) metric
  • Directed at concurrency errors ONLY
  • Focus: High-level data races
  • Atomicity violations
  • Refinement violations
  • All variables may be lock-protected, but
    operations not implemented atomically

4
Outline
  • Runtime Refinement Checking
  • Examples of Refinement/Atomicity Violations
  • The Location Pairs Metric
  • Discussion, Ongoing Work

5
Refinement as Correctness Criterion
  • Client threads invoke operations concurrently
  • Data structure operations should appear to be
    executed
  • atomically
  • in a linear order
  • to client threads.

Component Implementation
6
Runtime Refinement Checking
  • Refinement
  • For each execution of Impl
  • there exists an equivalent, atomic execution of
    data structure Spec
  • Spec: Atomized version of Impl
  • Client methods run one at a time
  • Obtained from Impl itself
  • Use refinement as correctness criterion
  • More thorough than assertions
  • More observability than pure testing
  • Runtime verification: Check refinement using
    execution traces
  • Can handle industrial-scale programs
  • Intermediate between testing and exhaustive
    verification
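
One way to read "Spec: atomized version of Impl" is that every client method runs under a single global lock, so methods execute one at a time. The sketch below is only an illustration of that reading; `AtomizedSpec` and `runAtomically` are hypothetical names, not part of Vyrd.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper: runs each client method of a component one at a time. */
class AtomizedSpec {
    private final ReentrantLock global = new ReentrantLock();

    /** Executes one whole method body while holding the single global lock. */
    <T> T runAtomically(Supplier<T> method) {
        global.lock();
        try {
            return method.get();
        } finally {
            global.unlock();
        }
    }

    public static void main(String[] args) {
        AtomizedSpec spec = new AtomizedSpec();
        StringBuilder state = new StringBuilder();  // stands in for the component state
        int len = spec.runAtomically(() -> state.append("ab").length());
        System.out.println(len);
    }
}
```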

7
The VYRD Tool
[Figure: Vyrd architecture. A multi-threaded test drives Impl, which writes its actions to a log. Example log entries: Call Insert(3), A0.elt = 3, Call LookUp(3), Call Insert(4), A1.elt = 4, Call Delete(3), read A0, A0.elt = null, Unlock A0, Unlock A1, Return success, Return true. A replay mechanism reads from the log, executes the logged actions on Impl (replay), and runs the methods atomically on Spec. The refinement checker compares traceImpl with traceSpec.]
  • At certain points for each method, take state
    snapshots
  • Check consistency of data structure contents

8
The Vyrd Experience
  • Scalable method: Caught previously undetected,
    serious but subtle bugs in industrial-scale
    designs
  • Boxwood (30K LOC)
  • Scan Filesystem (Windows NT)
  • Java Libraries with known bugs
  • Reasonable runtime overhead
  • Key novelty: Checking refinement improves
    observability
  • Catches bugs that are triggered but not observed
    by testing
  • Significant improvement

9
Experience
The Boxwood Project
10
Refinement vs. Testing: Improved Observability
  • Using Vyrd, caught previously undetected bug in
  • Boxwood Cache
  • Scan File System (Windows NT)
  • Bug manifestation
  • Cache entry is correct, marked clean
  • Permanent storage has corrupted data
  • Hard to catch through testing
  • As long as Reads hit in Cache, return value
    correct
  • Caught through testing only if
  • Cache fills, clean entry in Cache is evicted
  • Not written again to permanent storage since
    entry is marked clean
  • Entry read from permanent storage after eviction
  • With no Writes to entry in the meantime

11
Outline
  • Runtime Refinement Checking
  • Examples of Refinement/Atomicity Violations
  • The Location Pairs Metric
  • Discussion, Ongoing Work

12
Idea behind the LP metric
  • Observation: Bug occurs whenever
  • Method1 executes up to line X, context switch
    occurs
  • Method2 starts execution from line Y
  • Provided there is a data dependency between
  • Method1's code right before line X (BlockX)
  • Method2's code right after line Y (BlockY)
  • Description of bug in the log follows pattern
    above
  • Only requirement on program state, other threads,
    etc.
  • Make the interleaving above possible
  • May require many other threads, complicated
    program state, ...
  • A one-bit data abstraction captures error
    scenario
  • Depdt: Is there a data dependency between BlockX
    and BlockY?
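
The slides do not say how the dependency is computed. A minimal sketch, assuming Depdt is decided from the read and write sets of the two blocks (the `Block` record and the variable names are hypothetical):

```java
import java.util.Collections;
import java.util.Set;

class Depdt {
    /** Read and write sets of a code block; variable names are illustrative. */
    record Block(Set<String> reads, Set<String> writes) {}

    /** True if one block writes a variable that the other reads or writes. */
    static boolean depdt(Block x, Block y) {
        return !Collections.disjoint(x.writes(), y.reads())
            || !Collections.disjoint(x.reads(), y.writes())
            || !Collections.disjoint(x.writes(), y.writes());
    }

    public static void main(String[] args) {
        Block blockX = new Block(Set.of("sb.count"), Set.of());  // reads the length of sb
        Block blockY = new Block(Set.of(), Set.of("sb.count"));  // overwrites the length of sb
        System.out.println(depdt(blockX, blockY));
    }
}
```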

13
public synchronized StringBuffer append(StringBuffer sb) {
    int len = sb.length();
    int newCount = count + len;
    if (newCount > value.length)
        ensureCapacity(newCount);
    sb.getChars(0, len, value, count);
    count = newCount;
    return this;
}

public synchronized void setLength(int newLength) {
    ...
    if (count < newLength) {
        ...
    } else {
        count = newLength;
        ...
    }
}
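
The slide pairs append with setLength. The sketch below assumes the violation meant here is the one in which another thread shortens sb between append's sb.length() and sb.getChars() calls. It replays those steps by hand in a single thread, so the interleaving is deterministic; `AppendInterleaving` is a hypothetical name.

```java
class AppendInterleaving {
    /** Replays append(sb)'s steps by hand, with sb.setLength(0) scheduled in between. */
    static boolean triggersViolation() {
        StringBuffer sb = new StringBuffer("hello");
        char[] value = new char[16];          // stands in for the target's buffer

        int len = sb.length();                // append: int len = sb.length()
        sb.setLength(0);                      // other thread: setLength runs to completion
        try {
            sb.getChars(0, len, value, 0);    // append: stale len exceeds sb's new length
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;                      // the stale length is rejected
        }
    }

    public static void main(String[] args) {
        System.out.println(triggersViolation());
    }
}
```

Both methods are synchronized, but on different objects: append holds the lock of the target buffer, not of sb, which is why every variable can be lock-protected while the operation as a whole is not atomic.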

14
Experience
Concurrency Bug in Cache
Flush() starts
Write(handle, AB) starts
Flush() ends
Write(handle, AB) ends
15
private static void CpToCache(byte[] buf, CacheEntry te, int lsn, Handle h) {
    ...
    for (int i = 0; i < buf.length; i++) {
        te.data[i] = buf[i];
    }
    ...
    te.lsn = lsn;
}

public static void Flush(int lsn) {
    ...
    lock (clean) {
        BoxMain.alloc.Write(h, te.data, te.data.length, 0,
                            0, WRITE_TYPE_RAW);
    }
    ...
}
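
The interleaving from the previous slide can be replayed by hand. This is a sketch, assuming the byte-copy loop in CpToCache can interleave with the Write issued by Flush; the two-byte buffers and the class name are illustrative and not taken from Boxwood.

```java
class FlushWriteInterleaving {
    /** Returns {cache entry data, storage data} after Flush runs in the middle of the copy. */
    static byte[][] run() {
        byte[] teData = {'X', 'X'};      // cache entry contents before Write(handle, AB)
        byte[] buf = {'A', 'B'};         // new data passed to Write(handle, AB)
        byte[] storage = new byte[2];    // permanent storage

        teData[0] = buf[0];              // CpToCache: first loop iteration
        System.arraycopy(teData, 0, storage, 0, teData.length);  // Flush: writes te.data
        teData[1] = buf[1];              // CpToCache: remaining iteration

        return new byte[][] {teData, storage};
    }

    public static void main(String[] args) {
        byte[][] result = run();
        System.out.println("cache:   " + new String(result[0]));
        System.out.println("storage: " + new String(result[1]));
    }
}
```

The cache entry ends up with the complete new data while permanent storage keeps a half-updated copy.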


16
Outline
  • Runtime Refinement Checking
  • Examples of Refinement/Atomicity Violations
  • The Location Pairs Metric
  • Discussion, Ongoing Work

17
public synchronized StringBuffer append(StringBuffer sb) {
1   int len = sb.length();
2   int newCount = count + len;
3   if (newCount > value.length)
4       ensureCapacity(newCount);
5   sb.getChars(0, len, value, count);
6   count = newCount;
7   return this;
8 }

Actions of append, with locations between them:
    acquire(this)
    invoke sb.length()
  L1
    int len = sb.length()
  L2
    int newCount = count + len
    if (newCount > value.length)
        expandCapacity(newCount)
    invoke sb.getChars()
    sb.getChars(0, len, value, count)
    count = newCount
    return this
18
Coverage FSM State
(LX, pend1, LY, pend2, depdt)
  • LX: Location in the CFG of Method 1
  • pend1: Is an interesting action in Method 1
    expected next?
  • LY: Location in the CFG of Method 2
  • pend2: Is an interesting action in Method 2
    expected next?
  • depdt: Do the actions following LX and LY have a
    data dependency?
19
Coverage FSM
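[Figure: coverage FSM fragment starting at state (L1, !pend1, L3, !pend2, depdt). From each state, either thread t1 moves Method 1 from L1 to L2 or thread t2 moves Method 2 from L3 to L4; the figure shows both orders of these two transitions.]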
20
Coverage Goal
  • The pend1 bit gets set when
  • The depdt bit is TRUE
  • Method2 takes an action
  • Intuition: Method1's dependent action must
    follow
  • Must cover all (reachable) transitions of the
    form
  • p = (LX_p, TRUE, LY, pend2_p, depdt_p) → q =
    (LX_q, pend1_q, LY, pend2_q, depdt_q)
  • p = (LX, pend1_p, LY_p, TRUE, depdt_p) → q = (LX,
    pend1_q, LY_q, pend2_q, depdt_q)
  • Separate coverage FSM for each method pair
    FSM(Method1, Method2)
  • Cover required transitions in each FSM
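
A minimal sketch of how a tool might represent the coverage goal, assuming the two transition forms mean "pend1 is TRUE in p and LY is unchanged" and "pend2 is TRUE in p and LX is unchanged". The `State` and `Transition` records and the `coverage` helper are hypothetical; the slides do not describe a tool implementation.

```java
import java.util.HashSet;
import java.util.Set;

class CoverageFsm {
    /** Coverage FSM state (LX, pend1, LY, pend2, depdt). */
    record State(String lx, boolean pend1, String ly, boolean pend2, boolean depdt) {}

    /** A transition p -> q of FSM(Method1, Method2). */
    record Transition(State p, State q) {}

    /** Required if a pending method takes the step while the other method stays put. */
    static boolean isRequired(Transition t) {
        boolean method1Moves = t.p().pend1() && t.p().ly().equals(t.q().ly());
        boolean method2Moves = t.p().pend2() && t.p().lx().equals(t.q().lx());
        return method1Moves || method2Moves;
    }

    /** Fraction of required transitions that a test run has exercised. */
    static double coverage(Set<Transition> reachable, Set<Transition> exercised) {
        Set<Transition> required = new HashSet<>();
        for (Transition t : reachable) {
            if (isRequired(t)) required.add(t);
        }
        if (required.isEmpty()) return 1.0;
        Set<Transition> hit = new HashSet<>(required);
        hit.retainAll(exercised);
        return (double) hit.size() / required.size();
    }
}
```

The transitions in `required` that are missing from `exercised` would form the list of uncovered targets for one method pair.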

21
Important Details
  • Action: Atomically executed code fragment
  • Defined by the language
  • Method calls
  • Call action: Method call, all lock acquisitions
  • Return action: Total net effect of method,
    atomically executed lock releases
  • Separate coverage FSM for each method pair
    FSM(Method1, Method2)
  • Cover required transitions in each FSM
  • But what if there is interesting concurrency
    inside called method?
  • Considered separately when that method is
    considered as one in the method pair
  • If Method1 calls Method3
  • Considered when FSM(Method3, Method2) is covered

22
Outline
  • Runtime Refinement Checking
  • Examples of Refinement/Atomicity Violations
  • The Location Pairs Metric
  • Discussion, Ongoing Work

23
Empirical evidence
  • Does this metric correspond well with high-level
    concurrency errors?
  • Errors captured by metric
  • 100% metric ⇒ Bug guaranteed to be triggered
  • Triggered vs. detected
  • May need view refinement checking to improve
    observability
  • Preliminary study
  • Bugs in Java class libraries
  • Bug found in Boxwood cache
  • Bug found in Scan file system
  • Bug categories reported in E. Farchi, Y. Nir,
    S. Ur, "Concurrent Bug Patterns and How to Test
    Them," 17th Intl. Parallel and Distributed
    Processing Symposium (IPDPS '03)
  • How many are covered by random testing? How does
    coverage change over time?
  • Don't know yet. Implementing coverage measurement
    tool.

24
Reducing the Coverage FSM
  • Method-local actions
  • Basic block consisting of method-local actions
    considered a single atomic action
  • Pure blocks [Flanagan & Qadeer, ISSTA '04]
  • A pure execution of pure block does not affect
    global state
  • Example: Acquire lock, read global variable,
    decide resource not free, release lock
  • Considered a no-op
  • Modeled by bypass transition in coverage FSM.
  • Does not need to be covered
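
The example above could look like this in code. A sketch with hypothetical names (`PureBlockExample`, `tryTake`, `resourceFree`); when the resource is already taken, the method only acquires the lock, reads, and releases, so that execution leaves global state untouched.

```java
import java.util.concurrent.locks.ReentrantLock;

class PureBlockExample {
    private final ReentrantLock lock = new ReentrantLock();
    private boolean resourceFree = true;   // global state guarded by lock

    /** Returns true if the caller obtained the resource. */
    boolean tryTake() {
        lock.lock();
        try {
            if (!resourceFree) {
                return false;              // pure execution: read, decide, release
            }
            resourceFree = false;          // impure execution: global state updated
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```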

25
Discussion
  • The metric is NOT for deciding when to stop
    testing/verification
  • Intended use
  • Testing, runtime verification is applied to
    program
  • List of non-covered coverage targets provided to
    programmer
  • Intuition: Given an unexercised scenario, the
    programmer must have a simple reason to believe
    that
  • the scenario is not possible, or
  • the scenario is safe
  • Given uncovered coverage target, programmer
  • either provides hints to coverage tool to rule
    target out
  • or, assumes that coverage target is a
    possibility,
  • writes test to trigger it
  • or, makes sure that no concurrency error would
    result if coverage target were to be exercised

26
Future Work: Approximating Reachable LP Set
  • Number of locations per method in Boxwood: 10, after
    factoring out atomic and pure blocks
  • LP reachability undecidable
  • Metric only intended as aid to programmer
  • What have I tested?
  • What should I try to test?
  • Make sure LP does not lead to error if it looks
    like it can be exercised.
  • Future work: Better approximate reachable LP set
  • Do conservative reachability analysis of coverage
    FSM using predicate abstraction.
  • Programmer can add predicates for better FSM
    reduction

28
Multiset
Implementation LookUp
  • Multiset data structure
  • M = {2, 3, 3, 3, 9, 8, 8, 5}
  • Has highly concurrent implementations of
  • Insert
  • Delete
  • InsertPair
  • LookUp

LookUp(x)
    for i = 1 to n
        acquire(A[i])
        if (A[i].content == x && A[i].valid)
            release(A[i])
            return true
        else
            release(A[i])
    return false
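
The LookUp pseudocode translates to Java roughly as follows. This is a sketch: `ArrayMultiset` and `Slot` are hypothetical names, and only LookUp is shown.

```java
import java.util.concurrent.locks.ReentrantLock;

class ArrayMultiset {
    /** One array slot A[i] with its own lock. */
    static class Slot {
        final ReentrantLock lock = new ReentrantLock();
        int content;
        boolean valid;
    }

    final Slot[] a;

    ArrayMultiset(int n) {
        a = new Slot[n];
        for (int i = 0; i < n; i++) a[i] = new Slot();
    }

    /** Scans the slots one at a time, holding only the current slot's lock. */
    boolean lookUp(int x) {
        for (Slot s : a) {
            s.lock.lock();
            try {
                if (s.content == x && s.valid) return true;
            } finally {
                s.lock.unlock();   // released on both the hit and the miss path
            }
        }
        return false;
    }
}
```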
29
Multiset
Testing
  • Don't know which happened first
  • Insert(3) or Delete(3) ?
  • Should 3 be in the multiset at the end?
  • Must accept both possibilities as correct
  • Common practice
  • Run long multi-threaded test
  • Perform sanity checks on final state

30
Multiset
I/O Refinement
31
View-refinement
View Variables
  • State correspondence
  • Hypothetical view variables must match at
    commit points
  • view variable
  • Value of variable is abstract data structure
    state
  • Updated atomically once by each method
  • For A[1..n]
  • Extract content if valid = true
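
A sketch of how the view variable could be computed from the array, assuming each slot carries a content field and a valid flag. `ViewVariable` and `Slot` are hypothetical names, and sorting is an added assumption so that two views compare equal regardless of slot order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ViewVariable {
    /** Snapshot of one array slot A[i]. */
    record Slot(int content, boolean valid) {}

    /** Abstract multiset contents: the content of every valid slot, in sorted order. */
    static List<Integer> view(Slot[] a) {
        List<Integer> v = new ArrayList<>();
        for (Slot s : a) {
            if (s.valid()) v.add(s.content());
        }
        Collections.sort(v);
        return v;
    }
}
```

At a commit point, the checker would compare `view(...)` computed on Impl's array with the one computed on Spec's state using `equals`.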

32
View-refinement
[Figure: execution trace with view variables. Trace events: Call LookUp(3), Call Insert(3), A0.elt = 3, Call Delete(3), Call Insert(4), A1.elt = 4, Return true, A0.elt = null, and three Return success events. At the marked points viewImpl and viewSpec agree, taking the values {3}, {3}, {3, 4}, and {4}.]