Title: An Experimental Comparison of Usage-Based and Checklist-Based Reading
Slide 1: An Experimental Comparison of Usage-Based and Checklist-Based Reading
- Written by Thomas Thelin, Per Runeson, and Claes Wohlin
- Published in IEEE Transactions on Software Engineering, Volume 29, No. 8, August 2003, pp. 687-704
- Presented by Brian Simms and Nick Wilkinson
Slide 2: Outline
- Introduction
- Experiment Definition
- Experiment Planning
- Experiment Operation
- Data Analysis
- Conclusions and Future Work
Slide 3: Software Engineering
- Evolution over time
- Of particular import to this paper
- Usage-based testing
- Use cases
- Both focus on usage
- Motivation
- User-impacting faults are more important
Slide 4: Reading Techniques
- Checklist-Based Reading (CBR)
- List of issues to find faults
- Standard industry technique
- Baseline used in studies for reading techniques
Slide 5: Reading Techniques (cont)
- Defect-Based Reading (DBR)
- Focuses on finding specific types of faults
- Aims at finding same types of faults as CBR
- More structured technique with additional
information
Slide 6: Reading Techniques (cont)
- Perspective-Based Reading (PBR)
- Assigns different perspectives to reviewers
- Assumes a reviewer with a specific focus performs better than one looking for all faults
- Union of the different foci achieves full coverage
- Designer, tester, and user roles
Slide 7: Reading Techniques (cont)
- Traceability-Based Reading (TBR)
- Used to inspect object-oriented design specs
- Vertical
- Design vs. requirements
- Horizontal
- Design artifacts vs. each other
Slide 8: Reading Techniques (cont)
- Usage-Based Reading (UBR)
- Focus on critical faults
- Faults not equally important
- Prioritized (requirement level) use-case model
- Ranked-based reading
- Time-controlled reading
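The ranked-based reading idea can be sketched in a few lines; the use-case names and priority scheme below are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of rank-based UBR reading order (use-case names and
# priorities are invented examples): the reviewer works through the use
# cases in priority order, so faults affecting the most important usage
# are encountered first.
def ubr_inspection_order(use_cases):
    """use_cases: list of (name, priority); lower number = higher priority."""
    return [name for name, priority in sorted(use_cases, key=lambda uc: uc[1])]

use_cases = [("withdraw cash", 1), ("print receipt", 3), ("check balance", 2)]
print(ubr_inspection_order(use_cases))
# ['withdraw cash', 'check balance', 'print receipt']
```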
Slide 9: Reading Techniques (cont)
- UBR vs. PBR
- UBR utilizes existing use cases
- PBR develops use cases for the user perspective
- UBR scenarios are specific to each project
- PBR scenarios are general
Slide 10: Related Work
- Previous studies did not focus on UBR
Slide 11: Related Work (cont)
Slide 12: Outline
- Introduction
- Experiment Definition
- Experiment Planning
- Experiment Operation
- Data Analysis
- Conclusions and Future Work
Slide 13: Goal Definition
- Object of study: Analyze UBR as compared to CBR
- Purpose: Show UBR is more effective and efficient than CBR
- Quality focus: Effectiveness, efficiency, types of faults
- Perspective: The user's point of view
- Context: 23 fourth-year Master's students at Blekinge Institute of Technology in Sweden, reading a requirements document and a design document
Slide 14: Summary of Definition
- Analyze UBR as compared to CBR
- For the purpose of showing that UBR is more effective and efficient than CBR
- From the point of view of the users
- In the context of Master's students reading requirements and design documents
Slide 15: Outline
- Introduction
- Experiment Definition
- Experiment Planning
- Experiment Operation
- Data Analysis
- Conclusions and Future Work
Slide 16: Context Selection
- Off-line: Conducted in a controlled academic environment
- Student: Fourth-year students in a Master's program
- Specific: Subjects/objects may not be representative of the real world
- Toy problems: The documents are shorter than real software requirements documents
(Adapted from Ran's presentation of "Are the Perspectives Really Different?", Fall 2002)
Slide 17: Subjects and Objects
- Subjects
- 23 fourth-year Master's students at Blekinge Institute of Technology
- Objects
- Requirements document
- Design document
Slide 18: Variables
- Independent (treatments)
- Reading technique used (UBR or CBR)
- Controlled
- Experience level of reviewers, measured on an ordinal scale
- Based on a 7-question questionnaire
Slide 19: Variables (cont)
- Dependent
- Time spent on preparation (minutes)
- Time spent on inspection (minutes)
- Clock time when each fault was found (minutes), measured from the start of preparation
- Number of faults found by each reviewer
- Number of faults found by each experiment group
- Efficiency (faults/hour), measured as
- 60 x (number of faults found) / (preparation time + inspection time)
- Effectiveness (detection rate), measured as
- (number of faults found) / (total number of faults)
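These two measures are simple enough to express directly; a minimal sketch, where the numbers in the example are invented rather than taken from the experiment:

```python
# Efficiency and effectiveness as defined on this slide.
def efficiency(faults_found, prep_minutes, inspection_minutes):
    """Faults found per hour: 60 * faults / (preparation + inspection time)."""
    return 60 * faults_found / (prep_minutes + inspection_minutes)

def effectiveness(faults_found, total_faults):
    """Detection rate: fraction of all known faults that were found."""
    return faults_found / total_faults

# Invented example: 9 faults found in 30 min preparation + 60 min inspection,
# out of 38 known faults in the document.
print(efficiency(9, 30, 60))           # 6.0 faults/hour
print(round(effectiveness(9, 38), 3))  # 0.237
```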
Slide 20: Variable Selection
Slide 21: Experiment Type
- One factor with two treatments, using the controlled variable (subjects' experience) to get a balanced design
- Multiple objects
- Requirements document
- Design document
- Quasi-experiment
- Selection of subjects and objects was not random
- Subjects were selected by convenience sampling
Slide 22: Theory
- UBR is more efficient and effective in finding
faults of the most critical fault classes, i.e.,
UBR is assumed to find more faults per time unit
and to find a larger rate of the critical faults.
Slide 23: Theory
[Diagram: the independent variable (reading technique, UBR vs. CBR, applied to the requirements and design documents) leads to an observation (faults found), from which the dependent variables efficiency and effectiveness are derived; the theory is that UBR is more efficient and effective at finding critical faults than CBR.]
Slide 24: Hypotheses
- H0,eff: There is no difference in efficiency (i.e., faults found per hour) between the reviewers applying prioritized use cases and the reviewers using a checklist.
- H0,rate: There is no difference in effectiveness (i.e., rate of faults found) between the reviewers applying prioritized use cases and the reviewers using a checklist.
- H0,fault: The reviewers applying prioritized use cases do not detect different faults than the reviewers using a checklist.
- Tested using
- measures of efficiency
- measures of effectiveness
- unique faults found by each group
- the Mann-Whitney test
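The Mann-Whitney test compares the two groups by ranks rather than raw values. A minimal sketch of the U statistic, assuming no tied values (a real analysis would use a statistics library with tie correction and significance testing; the group data below are invented):

```python
# Minimal Mann-Whitney U statistic (no tie correction, no p-value):
# rank the pooled observations, then U_a = R_a - n_a(n_a + 1)/2.
def mann_whitney_u(a, b):
    pooled = sorted(a + b)                      # assumes no tied values
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    r_a = sum(rank[v] for v in a)               # rank sum of group a
    return r_a - len(a) * (len(a) + 1) / 2

# Invented efficiency values (faults/hour) for the two groups.
ubr = [6.0, 5.5, 7.2]
cbr = [3.1, 4.0, 2.5]
print(mann_whitney_u(ubr, cbr))  # 9.0 -> complete separation (max U = 3*3)
```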
Slide 25: Experiment Design
- Randomization
- Subject assignment was non-random
- Based on reported experience
- No blocking
- Standard one-factor, two-treatment design
- One group for each treatment (balanced on experience)
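One simple way to realize such a balanced design is sketched below; the exact pairing procedure is our assumption, since the paper only states that the groups were balanced on reported experience.

```python
# Hypothetical balancing scheme (not the authors' documented algorithm):
# sort subjects by experience score, then alternate assignment so both
# treatment groups get a similar experience profile.
def balanced_groups(subjects):
    """subjects: list of (name, experience) with experience on an ordinal scale."""
    ranked = sorted(subjects, key=lambda s: s[1], reverse=True)
    ubr_group = ranked[0::2]   # 1st, 3rd, 5th, ... most experienced
    cbr_group = ranked[1::2]   # 2nd, 4th, 6th, ...
    return ubr_group, cbr_group

subjects = [("s1", 5), ("s2", 2), ("s3", 4), ("s4", 3), ("s5", 1)]
ubr_group, cbr_group = balanced_groups(subjects)
print([n for n, _ in ubr_group])  # ['s1', 's4', 's5']
print([n for n, _ in cbr_group])  # ['s3', 's2']
```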
Slide 26: Validity Threats (as stated in the paper)
- Conclusion validity
- Subjects from the different groups may talk to each other about the other technique
- This allows a subject from the CBR group to use UBR, and vice versa
- We feel this is actually a social threat to internal validity (diffusion or imitation of treatments)
- No real mitigation, although the risk was considered low: the groups had no incentive to share, and were told not to talk to each other
Slide 27: Validity Threats (as stated in the paper)
- Internal validity
- Subjects not receiving the UBR treatment may suffer from resentful demoralization
- Mitigated by telling subjects they would receive training in the other treatment after the experiment
- Rivalry between groups
- Lack of motivation
- Grading based solely on attendance mitigates rivalry, but may cause a lack of motivation
Slide 28: Validity Threats (as stated in the paper)
- Construct validity
- Requirements developed after the use cases
- Listed as a low risk in the paper
- We feel this is a larger risk
- Not industry standard practice
- Poses a threat to generalizing the outcome of the experiment
Slide 29: Validity Threats (as stated in the paper)
- External validity
- Use of students as subjects
- Mitigated by the students' industry jobs
- Small size of the documents compared to industry documentation
- Mitigated by describing a real-world problem
- Industry documentation is often broken up into smaller documents
Slide 30: Validity Threats (as noted by the presenters)
- Construct validity
- Use of surrogates to prioritize the use cases
- May be unavoidable in this experiment, but still needs to be listed as a threat
- Internal validity (instrumentation)
- A single student was responsible for developing the requirements, design, code, and user documentation
- May not be representative of what an industry-trained developer could produce
- Industry rarely allows a single developer to operate in a vacuum
Slide 31: Outline
- Introduction
- Experiment Definition
- Experiment Planning
- Experiment Operation
- Data Analysis
- Conclusions and Future Work
Slide 32: Preparation
Slide 33: Data Collection and Validation
- Handouts from the subjects
- Faults found
- Time each fault was found
- Data validation
- No mention made in the paper
- Box plot shows no outliers
Slide 34: Outline
- Introduction
- Experiment Definition
- Experiment Planning
- Experiment Operation
- Data Analysis
- Conclusions and Future Work
Slide 35: Preparation and Inspection Time
Slide 36: Time vs. Class A and B Faults
Slide 37: Time vs. All Faults
Slide 38: Average Reviewer
Slide 39: Class A Faults
Slide 40: Class B Faults
Slide 41: Class C Faults
Slide 42: Efficiency
Slide 43: Effectiveness
Slide 44: Unique Faults
Slide 45: Results
Slide 46: Team Performance
- Simulation of teams
- UBR, CBR, and mixed groups
- Group sizes from 2 to 6
- All combinations of subjects
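The team simulation can be sketched as follows; the reviewer fault sets are invented, and we assume (as is standard for such simulations) that a team finds the union of its members' individually found faults:

```python
from itertools import combinations

# Sketch of the team-performance simulation: enumerate every team of a
# given size and take the union of the members' individually found faults.
def team_detection_rates(reviewers, size, total_faults):
    """reviewers: dict mapping reviewer name -> set of fault ids found."""
    rates = []
    for team in combinations(sorted(reviewers), size):
        found = set().union(*(reviewers[m] for m in team))
        rates.append(len(found) / total_faults)
    return rates

# Invented data: 3 reviewers, 6 known faults, all 2-person teams.
reviewers = {"r1": {1, 2, 3}, "r2": {2, 4}, "r3": {1, 5, 6}}
print(team_detection_rates(reviewers, 2, total_faults=6))
```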
Slide 47: All Faults
Slide 48: All Faults
Slide 49: Class A and B Faults
Slide 50: Class A and B Faults
Slide 51: Outline
- Introduction
- Experiment Definition
- Experiment Planning
- Experiment Operation
- Data Analysis
- Conclusions and Future Work
Slide 52: Conclusion
- UBR is more efficient
- All faults
- Crucial faults
- UBR is more effective
- Crucial faults
- UBR finds different faults
Slide 53: Future Work
- Replication is needed
- Time-controlled reading
- Inject more faults
- Hybrid UBR/CBR methods
Slide 54: Questions?