Using Set Operations on Code Coverage Data to Discover Program Properties

1
Using Set Operations on Code Coverage Data to
Discover Program Properties
  • by
  • Nick Rutar

2
Motivation
  • Many Programs already have code coverage data
  • Various Code Coverage Tools Available
  • Widely Explored Area of Research
  • Regression tests with coverage data becoming more
    common
  • Code coverage data contains a wealth of information
    about the program
  • Use of the data is usually limited to what the coverage tool reports
  • Want to milk the data for all it is worth
  • Possibly useful for finding errors in the program

3
Code Coverage
  • Three Main Types
    • Statement
      • Every line of code
    • Conditional
      • Every decision in the program (if/else)
    • Path
      • Every path in the program
  • Program usually Instrumented
    • Dynamic or Static
  • Usually presented as a composite of separate tests

4
Using Set Operations
  • Why use set operations?
  • Most developers familiar with sets
  • Data for statement coverage maps nicely onto sets
  • Possible to manipulate data easily and give
    glimpses of properties of the code
  • Most code coverage tools implicitly use sets
    anyway

5
Set Operations
  • Union
    • Traditional Coverage
  • Intersection
    • Lines run in all tests
  • Difference
    • Potential for Locating Errors
    • Probably the biggest stretch from what the data is
      currently being used for
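
The three operations can be sketched on per-test statement coverage as follows. This is a minimal illustration, not the presenter's tool: the test names, the line numbers, and the helper names `union_all` and `intersect_all` are all hypothetical.

```python
# Hypothetical per-test statement coverage:
# test name -> set of executed line numbers.
coverage = {
    "test_a": {1, 2, 3, 5, 8},
    "test_b": {1, 2, 4, 5, 8},
    "test_c": {1, 2, 3, 5, 9},
}


def union_all(sets):
    """Lines run by at least one test (the usual composite coverage)."""
    return set().union(*sets)


def intersect_all(sets):
    """Lines run by every test; empty if there are no tests."""
    sets = list(sets)
    return set.intersection(*sets) if sets else set()


all_lines = union_all(coverage.values())
common_lines = intersect_all(coverage.values())

# Difference: lines that only test_c reached.
others = union_all(s for name, s in coverage.items() if name != "test_c")
only_test_c = coverage["test_c"] - others

print(sorted(all_lines))     # [1, 2, 3, 4, 5, 8, 9]
print(sorted(common_lines))  # [1, 2, 5]
print(sorted(only_test_c))   # [9]
```

If test_c were the only failing test, line 9 would be the first place to look.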

6
Set Operations At Work
Inputs
    int main(int argc, char **argv) {
      int x, y, z;
      x = y = z = 0;
      if (argc == 2)
        x = atoi(argv[1]);
      if (x == 1)
        y = 3;
      else if (x == 2)
        y = 4;
      if (y > 0)
        z = 5;
      else
        z = -2;
      return z;
    }
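
The sets this example produces can be worked out with a small simulation. The sketch below mirrors the slide's C program in Python and records which statements each input executes. It assumes the operators lost in the transcript are `=` and `==` (for example `argc == 2`), and it uses statement text as set elements because the slide's line numbers and input table are not preserved. The function `run` and the chosen inputs are hypothetical.

```python
def run(args):
    """Mirror the slide's C program; return (result, executed statements)."""
    hit = set()
    x = y = z = 0
    hit.add("x = y = z = 0")
    hit.add("if (argc == 2)")
    if len(args) == 1:          # argc == 2 means exactly one user argument
        hit.add("x = atoi(argv[1])")
        x = int(args[0])        # int() stands in for atoi on numeric input
    hit.add("if (x == 1)")
    if x == 1:
        hit.add("y = 3")
        y = 3
    else:
        hit.add("if (x == 2)")
        if x == 2:
            hit.add("y = 4")
            y = 4
    hit.add("if (y > 0)")
    if y > 0:
        hit.add("z = 5")
        z = 5
    else:
        hit.add("z = -2")
        z = -2
    hit.add("return z")
    return z, hit


# Assumed inputs: no argument, then x = 1, 2 and 7.
inputs = {"none": [], "x=1": ["1"], "x=2": ["2"], "x=7": ["7"]}
runs = {name: run(args)[1] for name, args in inputs.items()}

union = set().union(*runs.values())        # every statement reached
common = set.intersection(*runs.values())  # statements reached by all runs
only_x1 = runs["x=1"] - runs["x=2"]        # what x=1 ran that x=2 did not

print(len(union))      # 11
print(sorted(common))
print(only_x1)         # {'y = 3'}
```

The difference isolates the one assignment that separates the two runs, which is the property the later slides rely on for locating errors.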
7
Off the Beaten Path Sets
  • Notation: Difference is -, Union is U, Intersection is I
  • (U or I of Bad Sets) - (U of Good Sets)
  • Sometimes give better basis for finding bad code
  • Closest example of prior work only dealt with one
    bad run at a time
  • Any given test - itself
  • Gives you the empty set
  • U (I of Sets ((U or I of Bad Sets) - (U of Good Sets)))
  • Gives you a very rough slice of program that went
    bad
  • Manipulate data as seen fit for what you are
    looking for
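
The "(U or I of Bad Sets) - (U of Good Sets)" recipe can be sketched as below. This is one possible reading of the slide, not the presenter's implementation: the function name `suspicious_lines`, its `combine` parameter, and the sample line numbers are hypothetical.

```python
from functools import reduce


def suspicious_lines(bad_runs, good_runs, combine="union"):
    """Lines run by failing tests but by no passing test.

    combine="union":        lines run by ANY failing test
    combine="intersection": lines run by EVERY failing test
    """
    if combine not in ("union", "intersection"):
        raise ValueError("combine must be 'union' or 'intersection'")
    if not bad_runs:
        return set()
    op = set.union if combine == "union" else set.intersection
    bad = reduce(op, bad_runs)
    good = set().union(*good_runs)
    return bad - good


# Hypothetical coverage sets for two failing and two passing runs.
bad = [{1, 2, 3, 7, 9}, {1, 2, 4, 7}]
good = [{1, 2, 3}, {1, 2, 4, 5}]

print(sorted(suspicious_lines(bad, good, "union")))         # [7, 9]
print(sorted(suspicious_lines(bad, good, "intersection")))  # [7]

# Any given test minus itself gives the empty set.
print(bad[0] - bad[0])  # set()
```

The intersection variant is stricter: it keeps only lines that every failing run touched, which narrows the candidate list when several failures share one cause.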

8
Other Code Coverage Info
  • Pareto principle
  • Better known as 80-20 rule
  • Pareto noticed 80% of the land in Italy was owned by
    20% of the people
  • Shows up in all kinds of domains
  • Nick's high school - 80% of girls dated 20% of
    the boys
  • Software 80-20 rule
  • 20% of the lines of code account for 80% of the runtime of
    the software
  • Code Coverage often has frequency information
  • Use that information for performance bottlenecks

9
Implementation
  • Create tool that can use the set information
  • Implementation details
  • Created in Java
  • Based on the output format of the LCOV coverage tool
  • Takes in pre-generated coverage information as
    input
  • Supports Union, Difference, and Intersection
  • Supports Frequency Information
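
The slides do not show how the tool reads its input, so the following is only a sketch of how pre-generated LCOV data could be turned into sets and frequency counts. It relies on three record types from the LCOV tracefile format: `SF:<path>` starts a source file, `DA:<line>,<count>` gives a line's execution count, and `end_of_record` closes the file. The function names `parse_lcov` and `covered` and the sample tracefile are hypothetical.

```python
def parse_lcov(text):
    """Return {source file: {line number: execution count}}."""
    result = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current = result.setdefault(line[3:], {})
        elif line.startswith("DA:") and current is not None:
            fields = line[3:].split(",")  # an optional checksum may follow
            lineno, count = int(fields[0]), int(fields[1])
            current[lineno] = current.get(lineno, 0) + count
        elif line == "end_of_record":
            current = None
    return result


def covered(records):
    """Set of (file, line) pairs executed at least once."""
    return {
        (path, lineno)
        for path, lines in records.items()
        for lineno, count in lines.items()
        if count > 0
    }


# Hypothetical tracefile for one test.
sample = """TN:test1
SF:/src/main.c
DA:3,1
DA:4,0
DA:5,2
end_of_record
"""

records = parse_lcov(sample)
print(records)                   # {'/src/main.c': {3: 1, 4: 0, 5: 2}}
print(sorted(covered(records)))  # [('/src/main.c', 3), ('/src/main.c', 5)]
```

Keeping the counts in `records` serves the frequency feature, while `covered` produces the plain sets that union, difference, and intersection operate on.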

10
Demo
11
Evaluation
  • Test a large program against its regression tests
  • Use Dyninst for evaluation
  • C++ program that does binary instrumentation
  • 100 Source Files
  • 30,000 LOC instrumented to create coverage data
  • Nightly build already has coverage capability
    with regression tests
  • Verify Union matches coverage data given by tool
  • Use Difference to try to find errors
  • Series of tests with various inputs
  • See which inputs cause failure and locate lines
    to discover error

12
Future Work
  • For the Tool
  • Create Template for Insertion into program
  • This program doesn't care what language you are
    using
  • Just needs input format to generate initial sets
  • Specify format in text file, program uses it to
    input data
  • Better Visualization to specify points of
    interest
  • Highlight source code that still has active lines
  • Usability
  • Right now more of a proof of concept than a
    battle-hardened tool
  • In General
  • More evaluation of using Diff for finding errors
    in the program
  • Evaluation of software bottlenecks
  • IDE integration

13
Questions???