Applying Statistical Process Control (SPC) in Software, presented by Dr. Saswati Bhattacharyya


1
Applying Statistical Process Control (SPC) in
Software
Presented by Dr. Saswati Bhattacharyya
2
Agenda

Quantitative Techniques
Statistical Techniques

10/15/2009
3
Introduction

The term Statistical Process Control (SPC) is
typically used in context of manufacturing
processes (although it may also pertain to
services and other activities), to denote
statistical methods used to monitor and improve
the quality of the respective operations.
By gathering information about the various stages
of the process and performing statistical
analysis on that information, the SPC engineer is
able to take necessary action (often preventive)
to ensure that the overall process stays
in-control and to allow the product to meet all
desired specifications.
SPC involves monitoring processes, identifying
problem areas, recommending methods to reduce
variation and verifying that they work,
optimizing the process, assessing the reliability
of parts, and other analytic operations.

4
Introduction

SPC uses basic statistical quality control
methods as quality control charts (Shewhart,
Pareto, and others), capability analysis, gage
repeatability/reproducibility analysis, and
reliability analysis.
Specialized experimental methods (DOE) and other
advanced statistical techniques are often part of
global SPC systems.
Important components of effective, modern SPC
systems are real-time access to data and
facilities to document and respond to incoming QC
data on-line, efficient central QC data
warehousing, and groupware facilities allowing QC
engineers to share data and reports.

5
Why Measure

If you don't know where you are, a map won't
help.
If you don't know where you are going, any road
will do.

6
Use of Measures

Measures, by definition, are raw data
Unless this data is sorted in some way, or used
to derive information, decision making will not
be possible
Metrics are hence derived from measures by
Using contextual information
Applying a formula
Performing some calculations or computations
Some say measures and metrics are no
different
Measures can be either base or derived; we
prefer the former terminology due to its industry
usage

7
Type of Measures

Historically, industry has used two types
Process metrics/measures
Indicate how an activity was performed or
product delivered/developed or service
rendered
E.g., time taken to perform a task, effort,
schedule, number of attempts, number of
inspections, number of reviews, etc.
Product metrics/measures
Used to indicate an attribute of what was
delivered or performed, but not how it was
done.
E.g., defect density in modules, post-release
defects, number of customer complaints
Product measures will most likely reflect the
way the product was built (the process).
Process measures are a great way to predict
the quality of the product.

8
Examples

Identify whether these are process or product
metrics/measures:
Schedule Variance
Effort Variance
Review Efficiency: number of defects detected
per hour
Review Effectiveness: number of defects detected
in all reviews or per review
Mean Time to Enhance
Mean Time to Fix Bug
Reliability: number of times a product fails on
site
System Availability: percentage up time
Time spent in project management activities
Number of change requests open in last 3 months
Number of defects per LOC
Number of defects detected by Joe Smith in an
hour
Sales in this quarter
Number of New Customer accounts this year
Net Cash flow in this company

9
Uses and Abuses of Measures

There are three kinds of lies: lies, damned lies,
and statistics
Figures don't lie; liars figure
Many abusers are either ignorant or careless
Some others aim to mislead the
reader by emphasizing the data that support their
position

10
An example of Misuse

Average: one measure of central tendency
The average number of defects across the organization's
projects is 18
What if there were 7 projects that had 11, 13,
14, 12, 13, 52 and 11 defects?
Would 18 then seem like a typical number of
defects across the organization?
There is no objective set of criteria for which
average (mean, median or mode) should be reported every time
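
The point of this slide can be checked directly. The sketch below uses only the seven defect counts listed above and Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# Defect counts for the seven projects listed on the slide.
defects = [11, 13, 14, 12, 13, 52, 11]

avg = mean(defects)    # arithmetic mean, pulled upward by the 52-defect project
mid = median(defects)  # middle value of the sorted counts

print(f"mean = {avg}, median = {mid}")
```

The mean comes out at 18 while the median is 13, and six of the seven projects sit at 14 defects or fewer, so the single outlier is what makes 18 look "typical".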

11
Yet Another

"2 out of 3 software developers recommend Oracle
as the best ERP solution"
Surveyors' trick: report only the survey in which 2 of
3 respondents mentioned this ERP
More surveys are needed
Strong association between variables
Number of hours that one studies and the marks
scored: do more hours mean a higher score?
The two variables are related

12
Summary

Sometimes numbers can be deceptive
Statistics is meant to be precise
Understanding these will make you a better
consumer of Statistical information and help you
defend yourself against those who mislead!

13
Putting it all together

Quantitative Analysis Techniques

15
Quantitative Techniques

Quantitative Techniques of Analysis
Typically involve grouping data or combining
data by some key
E.g., sorting in an ascending or descending order
Depicting it in simple chart or graph
Interpreting or understanding the data and the
context
May not be used for any decision making
Cannot be used for prediction purposes

16
Data Grouping and Charting Techniques

Frequency Distribution
A frequency distribution is a grouping of data
into mutually exclusive classes showing the
number of observations in each class
Example: Actual effort (in man-months) of
projects completed to date
26, 35, 66, 15, 82, 74, 28, 35, 33, 78, 91, 51,
56, 37, 68, 48, 52, 72, 65, 57, 42, 59, 45, 62,
73, 54, 58, 63 (to be organized into a
frequency distribution)
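
One way to organize the effort data above into a frequency distribution is sketched below. The class width of 10 man-months is an assumption made for illustration; the slide does not state which classes were used.

```python
from collections import Counter

# Actual effort (man-months) for the 28 completed projects on the slide.
effort = [26, 35, 66, 15, 82, 74, 28, 35, 33, 78, 91, 51,
          56, 37, 68, 48, 52, 72, 65, 57, 42, 59, 45, 62,
          73, 54, 58, 63]

CLASS_WIDTH = 10  # assumed class width; not given in the source

# Map each value to the lower bound of its class, e.g. 26 -> 20.
counts = Counter((x // CLASS_WIDTH) * CLASS_WIDTH for x in effort)

# One (lower bound, upper bound, frequency) row per non-empty class.
freq_table = [(lo, lo + CLASS_WIDTH - 1, counts[lo]) for lo in sorted(counts)]

for lo, hi, n in freq_table:
    # The row of asterisks gives a rough text histogram of the class frequency.
    print(f"{lo:>2}-{hi:<2} {n:>2} {'*' * n}")
```

Because every value falls into exactly one class, the class frequencies add up to the 28 observations.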

17
Graphic Representations

A Histogram is a graph in which the classes are
marked on the horizontal axis and the class
frequencies on the vertical axis
The class frequencies are represented by the
heights of the bars and the bars are drawn
adjacent to each other

18
Graphic Representations

A frequency polygon consists of line segments
connecting the points formed by the class
midpoint and the class frequency

19
Graphic Representations

A cumulative frequency distribution (Ogive) is
used to determine how many or what proportion of
the data values are below or above a certain value
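
As an illustration, the values behind an ogive can be computed for the project effort data from the frequency distribution slide. The class boundaries (20, 30, ..., 100) are an assumption, not taken from the source.

```python
# Actual effort (man-months) from the frequency distribution example.
effort = [26, 35, 66, 15, 82, 74, 28, 35, 33, 78, 91, 51,
          56, 37, 68, 48, 52, 72, 65, 57, 42, 59, 45, 62,
          73, 54, 58, 63]

upper_bounds = list(range(20, 101, 10))  # assumed class boundaries

# Cumulative frequency: number of projects strictly below each boundary.
cum_freq = [sum(1 for x in effort if x < ub) for ub in upper_bounds]

# The same information as a percentage of all projects.
cum_pct = [100 * c / len(effort) for c in cum_freq]

for ub, c, pct in zip(upper_bounds, cum_freq, cum_pct):
    print(f"below {ub:>3}: {c:>2} projects ({pct:.1f}%)")
```

Plotting `cum_freq` (or `cum_pct`) against `upper_bounds` and joining the points gives the ogive; reading off a boundary answers "how many projects took less than this much effort".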

20
Bar Chart

A bar chart can be used to depict any of the
levels of measurement (nominal, ordinal, interval
or ratio)

21
Pie Charts

A pie chart is useful for displaying a relative
frequency distribution. A circle is divided
proportionally to the relative frequency and
portions of the circle are allocated for the
different groups
Example: A sample of 100 children was asked to
indicate their favorite color

23
Statistical Analysis Techniques

SPC for Software: Premises
The software process is performed by people, not
machines
The software process is (or can be) repeatable,
but not repetitive
The act of measuring and analyzing will change
behavior, potentially in dysfunctional ways
Another way to look at it
Enumerative studies: the aim is to determine "how
many" as opposed to "why so many"
Analytic studies: the aim is to predict or improve
behavior in the future

24
Enumerative Studies in Software

Inspections of code modules to detect and count
existing defects
Functional or system testing to ascertain the
extent to which a product has certain qualities
Measurement of software size to determine project
status or the amount of software under
configuration control
Measurement of staff hours so that the results
can be used to bill customers or track
expenditures against budget

25
Analytic Studies

Evaluating software tools, technologies or
methods
Tracking defect discovery rates to predict
product release dates
Evaluating defect discovery profiles to identify
focal areas for process improvement, predicting
schedules, costs or operational reliability
Using control charts to stabilize and improve
software processes or to assess process capability

26
Statistically Controlled

"A phenomenon will be said to be controlled when,
through the use of past experience, we can
predict, at least within limits, how the
phenomenon may be expected to vary in the future."
Walter A. Shewhart, 1931

27
Control Charts

Statistical Quality Control emphasizes in-process
control with the objective of controlling the
quality of a manufacturing process or service
operation using sampling techniques.
Statistical sampling techniques are used to aid
in the manufacturing of a product to
specifications rather than attempt to inspect
quality into the product after it is manufactured
Control charts are useful for monitoring a process

28
Causes of Variation

There is variation in all parts produced by a
process. There are two sources of variation
Chance Variation (Common Cause)
Cannot be completely eliminated unless there is a
major change in the equipment or material used in
the process
Random in nature
Assignable Variation (Special Cause)
Can be eliminated or reduced by investigating the
problem and finding the cause
Non-random in nature

29
Example 1

Bus travel time: 25 minutes from Point A to
Point B
Each run may not take exactly 25 min
Some may take longer and some less
Snowstorm or an accident
Chance Variation (Common Cause)
Driver may not hit the green lights at the right
time due to his driving inefficiency
Assignable Variation (Special Cause)

30
Example 2

Chance Variation (Common Causes)
Internal Machine friction
Temperature influences
Humidity influences
Vibrations transmitted from a passing forklift
Network clogging due to a snowstorm
System freeze due to a power failure
Assignable Variation (Special Cause)
Operator who continually sets up machine
incorrectly
Hole drilled into steel due to a dull drill
Processor speed low while requirement is for a
higher one
New programmer or inexperienced manager
Tired programmer due to overwork
Attrition midway on the project
Network or link failure on particular day
No knowledge of domain or lack of functional
knowledge

31
Why are we bothered about variation?

It will change the shape, dispersion and central
tendency of the characteristic being measured
Assignable variation can be corrected and
the process stabilized economically

32
Purpose of Quality Control Charts

The purpose of quality-control charts is to
portray graphically when an assignable cause
enters the production system so that it can be
identified and corrected
This is accomplished by periodically selecting a
random sample from the current production

33
Why Control Charts

Control charts let you know what processes can
do, so that you can set achievable goals
They represent the Voice of the process
Identify unusual events and point at fixable
problems and potential process improvements

34
Variable and Attributes Data

Variable Data
Represents measurements of a continuous
phenomenon
E.g., elapsed time, effort expended, years of
experience, memory utilization, cost of rework
Volume, weight, height, length, efficiency,
viscosity
Attributes Data
Represents data that can either conform or not
conform to a discrete set of values
E.g., number of defects found, number of
defectives per unit, % of projects using a formal
code inspection method

35
Simple Control Charts for software

X-Bar Chart
nP and p-Chart
C and U-Chart
I-X Chart (Individual X-Chart)
XmR Chart (Individuals and Moving Range Chart)

36
Types of Quality Control Charts - Variables

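The mean, or x-bar, chart is designed to
control variables such as weight, length, etc. The
upper control limit (UCL) and the lower control
limit (LCL) are obtained from the equations
$$\mathrm{UCL} = \bar{\bar{X}} + A_2 \bar{R} \quad \text{and} \quad \mathrm{LCL} = \bar{\bar{X}} - A_2 \bar{R}$$
where $\bar{\bar{X}}$ is the mean of the sample means, $\bar{R}$ is
the mean of the sample ranges, and $A_2$ is a tabulated
constant that depends on the sample size.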

37
Finding special causes by control charts

Consider Peer Review (PR) preparation rate for 5
samples (each sample is covering 4 PRs from 4
divisions)

38
Mean and Ranges

The table below shows the mean and ranges

39
Compute UCL and LCL

Compute the grand mean (X-double-bar) and the
average range.
Grand mean = (25.25 + 26.75 + ... + 25.25)/5 = 26.35
Mean range = (5 + 6 + ... + 3)/5 = 5.8
Determine the UCL and LCL for the average
preparation time
$\mathrm{UCL} = 26.35 + 0.729 \times 5.8 = 30.58$
$\mathrm{LCL} = 26.35 - 0.729 \times 5.8 = 22.12$
0.729 is the constant $A_2$ for samples of size 4, found from a statistical
table.
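
The arithmetic on this slide can be reproduced with a few lines of Python. This is a minimal sketch: the function name `xbar_limits` is hypothetical, and the `A2` dictionary holds the commonly tabulated factors for subgroup sizes 2 to 5 (the slide itself only quotes 0.729 for size 4).

```python
# Commonly tabulated A2 factors for X-bar charts, keyed by subgroup size.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}

def xbar_limits(grand_mean, mean_range, subgroup_size):
    """Return (LCL, UCL) for an X-bar chart using the A2 * R-bar method."""
    half_width = A2[subgroup_size] * mean_range
    return grand_mean - half_width, grand_mean + half_width

# Values from the slide: grand mean 26.35, mean range 5.8, 4 PRs per sample.
lcl, ucl = xbar_limits(26.35, 5.8, 4)
print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```

Rounded to two decimals, the limits agree with the slide (22.12 and 30.58).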

40
Control Charts for Means and Analysis

Is the process under control? Any special causes
of variation?
Control Charts are typically indicative after 25
or 30 samples

41
Detecting instabilities and Out of Control

Examine control charts for instances of behavior
and patterns that show nonrandom behavior
Values falling outside the control limits and
unusual patterns within the running record
suggest that assignable causes exist

42
Tests of Out of Control and Instabilities

Test 1: A single point falls outside the 3-sigma
control limits
Test 2: At least two of three successive values
fall on the same side of, and more than two
sigma units away from, the center line
Test 3: At least four out of five successive
values fall on the same side of, and more than
one sigma unit away from, the center line
Test 4: At least eight successive values fall on
the same side of the center line.
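
The four tests can be applied mechanically to a series of plotted values. The sketch below is one possible implementation, not a reference one: the function name, the `(index, test_number)` output format and the sample data are all hypothetical, and the caller is assumed to supply the center line and the sigma of the plotted statistic.

```python
def out_of_control_signals(values, center, sigma):
    """Return a sorted list of (index, test_number) for every signal found.

    The index is the position of the last point in the window that
    triggered the test.
    """
    z = [(v - center) / sigma for v in values]  # distance from center in sigma units
    signals = set()

    def run_test(test_no, window, needed, threshold):
        # Flag a window when at least `needed` of its `window` points lie
        # more than `threshold` sigma from the center line on the same side.
        for end in range(window - 1, len(z)):
            recent = z[end - window + 1:end + 1]
            above = sum(1 for p in recent if p > threshold)
            below = sum(1 for p in recent if p < -threshold)
            if above >= needed or below >= needed:
                signals.add((end, test_no))

    run_test(1, 1, 1, 3.0)  # Test 1: one point beyond 3 sigma
    run_test(2, 3, 2, 2.0)  # Test 2: two of three beyond 2 sigma, same side
    run_test(3, 5, 4, 1.0)  # Test 3: four of five beyond 1 sigma, same side
    run_test(4, 8, 8, 0.0)  # Test 4: eight in a row on the same side
    return sorted(signals)

# Hypothetical data: the third point lies 3.5 sigma above the center line.
demo = out_of_control_signals([10.2, 9.8, 13.5, 10.1, 9.9], center=10.0, sigma=1.0)
print(demo)
```

Points lying exactly on the center line are counted on neither side, which is a design choice of this sketch rather than something the slide specifies.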

43
Typical Causal Analysis

Test 1
Unusual single event
Accident, snowstorm, merger, attrition
Test 2
Chance variation
New operator or untrained operator
Test 3
New machine or new process/procedure
Test 4
Wear and tear of the machine
These are only typical; there could be atypical
causes as well!

44
Statistics in Action: An Example

Conviction of a person who bribed some players to
lose in betting
X-bar and R charts showed unusual betting
patterns and some contestants did not win as
expected
A QC Expert identified times when assignable
causes stopped and prosecutors were able to tie
this to the times of the arrest of the suspect
Canadian firm on the Japanese order in 1980s
Three defectives shipped separately

45
Types of Control Charts: nP and p

Used with discrete binomial data (number of
failures)
Likelihood of an item's failure is unaffected by
failure of the previous item in the sample
nP charts
$x_i$ = number of failures in a sample
fixed sample size $n$
average fraction non-conforming $p$
Mean = $np$
Constant control limits = $np \pm 3\sqrt{np(1 - p)}$
P charts
$p_i$ = proportion of failures in a sample
variable sample size $n_i$
Mean = $p$
Variable control limits = $p \pm 3\sqrt{p(1 - p)/n_i}$
Control limits tighten up for larger sample sizes
and relax for smaller sample sizes
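
A minimal sketch of both limit calculations follows. The function names and the sample figures are hypothetical, and clamping the limits to the range a count or proportion can actually take (no negative lower limit, no proportion above 1) is a common practice added here, not something stated on the slide.

```python
from math import sqrt

def np_chart_limits(failures, n):
    """Constant (LCL, center, UCL) for an nP chart with fixed sample size n."""
    p = sum(failures) / (n * len(failures))  # average fraction non-conforming
    center = n * p
    half_width = 3 * sqrt(n * p * (1 - p))
    return max(0.0, center - half_width), center, center + half_width

def p_chart_limits(failures, sizes):
    """Return p and one (LCL, UCL) pair per sample for variable sample sizes."""
    p = sum(failures) / sum(sizes)
    limits = []
    for n_i in sizes:
        half_width = 3 * sqrt(p * (1 - p) / n_i)
        limits.append((max(0.0, p - half_width), min(1.0, p + half_width)))
    return p, limits

# Hypothetical data: four samples of 50 items each for the nP chart.
np_lcl, np_center, np_ucl = np_chart_limits([4, 6, 5, 5], n=50)

# Hypothetical data: two samples of different sizes for the p chart.
p_bar, p_limits = p_chart_limits([5, 10], sizes=[50, 150])
print(np_lcl, np_center, np_ucl)
print(p_bar, p_limits)
```

In the p chart example the sample of 150 gets a narrower band than the sample of 50, which is the tightening effect described above.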

46
Types of Control Charts: c and u

Used with discrete Poisson data (count of
defects/sample)
independent events (defects)
probability proportional to area of opportunity
(sample size)
events are rare (< 10 possible defects)
C charts
$c_i$ = event count
constant area of opportunity
average number of events per sample $c_{avg}$
Mean = $c_{avg}$
Constant control limits = $c_{avg} \pm 3\sqrt{c_{avg}}$
U charts
$u_i$ = event count per unit area of opportunity
(defects/unit size)
variable area of opportunity $a_i$
Mean = $u_{avg}$
Variable control limits = $u_{avg} \pm 3\sqrt{u_{avg}/a_i}$
Control limits tighten up for larger sample sizes
and relax for smaller sample sizes
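
The c and u chart limits can be sketched the same way. Function names and data are hypothetical. The u chart limits use the standard form in which the average rate is divided by each sample's area of opportunity, which is what makes them vary from sample to sample, and the lower limit is clamped at zero as an added convention.

```python
from math import sqrt

def c_chart_limits(counts):
    """Constant (LCL, center, UCL) for a c chart (constant area of opportunity)."""
    c_avg = sum(counts) / len(counts)
    half_width = 3 * sqrt(c_avg)
    return max(0.0, c_avg - half_width), c_avg, c_avg + half_width

def u_chart_limits(counts, areas):
    """Return u_avg and one (LCL, UCL) pair per sample for varying areas."""
    u_avg = sum(counts) / sum(areas)  # defects per unit area of opportunity
    limits = []
    for a_i in areas:
        half_width = 3 * sqrt(u_avg / a_i)
        limits.append((max(0.0, u_avg - half_width), u_avg + half_width))
    return u_avg, limits

# Hypothetical data: defect counts from four equally sized modules.
c_lcl, c_center, c_ucl = c_chart_limits([4, 4, 4, 4])

# Hypothetical data: defect counts from modules of 4.0 and 1.0 size units.
u_avg, u_limits = u_chart_limits([8, 2], areas=[4.0, 1.0])
print(c_lcl, c_center, c_ucl)
print(u_avg, u_limits)
```

The larger module (area 4.0) gets the tighter upper limit, matching the note above about larger sample sizes.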

47
Individual X Chart

Sample of Individual Reviews
Can show the distribution of variation in cases
of non-homogeneous data

48
Individual X Chart
49
XmR Charts

Used with continuous data (measurements)
No assumptions about the underlying distribution
Appropriate for items that are not produced in
batches or when it is desirable to use all
available data
Two charts: X and mR (moving range of X)
$mR_{avg}$ is used to estimate sigma for X as well as for mR
$mR_i = |X_i - X_{i-1}|$
X chart mean = $X_{avg}$
X chart control limits = $X_{avg} \pm 2.660\, mR_{avg}$
mR chart mean = $mR_{avg}$
mR chart upper control limit = $3.268\, mR_{avg}$
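
A short sketch of the XmR calculation, using the constants 2.660 and 3.268 quoted on the slide. The function name, the dictionary keys and the sample measurements are hypothetical.

```python
def xmr_limits(x):
    """Compute center lines and control limits for an XmR chart pair."""
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    x_avg = sum(x) / len(x)
    mr_avg = sum(moving_ranges) / len(moving_ranges)
    return {
        "x_center": x_avg,
        "x_lcl": x_avg - 2.660 * mr_avg,
        "x_ucl": x_avg + 2.660 * mr_avg,
        "mr_center": mr_avg,
        "mr_ucl": 3.268 * mr_avg,
    }

# Hypothetical individual measurements, e.g. review effort in hours.
limits = xmr_limits([10, 12, 11, 13, 12])
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```

At least two observations are needed, since a single point has no moving range.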

50
Other Statistical Techniques

Control Charts are not enough
Confidence intervals
Prediction intervals
Test of hypotheses
ANOVA
Probability techniques
Regression
Correlation Analysis

51
Recent Studies

Although the software industry has made significant
progress in implementing metrics programs, a large
number of them fail
Howard Rubin's (Rubin Systems, Inc.)
1997 study indicated that four out of five
metrics programs fail
Success is defined as
A measurement program that lasts for more than two
years and that impacts the business decisions
made by the organization

52
Primary Reasons for Failure

Not tied to business goals
Irrelevant or not understood by key players
Perceived to be unfair or resisted
Motivated wrong behavior
Expensive, cumbersome
No action based on the numbers
No sustained management sponsorship

53
Success in Metrics Programs

Successful programs do more than collect data
Benefit and value come from decisions taken from
data
Sometimes choosing the right metrics becomes
overwhelming due to many opportunities that exist
especially in large organizations
Goal driven measurement is a must!
Ref.: Guidebook CMU/SEI-97-HB-003 by
William A. Florac, Robert E. Park, and Anita D.
Carleton