Applying Statistical Process Control (SPC) in Software, presented by Dr. Saswati Bhattacharyya

1
Applying Statistical Process Control (SPC) in Software
Presented by Dr. Saswati Bhattacharyya
2
Agenda
  • Quantitative Techniques
  • Statistical Techniques

10/15/2009
3
Introduction
  • The term Statistical Process Control (SPC) is
    typically used in context of manufacturing
    processes (although it may also pertain to
    services and other activities), to denote
    statistical methods used to monitor and improve
    the quality of the respective operations.
  • By gathering information about the various stages
    of the process and performing statistical
    analysis on that information, the SPC engineer is
    able to take necessary action (often preventive)
    to ensure that the overall process stays
    in-control and to allow the product to meet all
    desired specifications.
  • SPC involves monitoring processes, identifying
    problem areas, recommending methods to reduce
    variation and verifying that they work,
    optimizing the process, assessing the reliability
    of parts, and other analytic operations.

4
Introduction
  • SPC uses basic statistical quality control
    methods such as quality control charts (Shewhart,
    Pareto, and others), capability analysis, gage
    repeatability/reproducibility analysis, and
    reliability analysis.
  • Specialized experimental methods (DOE) and other
    advanced statistical techniques are often part of
    global SPC systems.
  • Important components of effective, modern SPC
    systems are real-time access to data and
    facilities to document and respond to incoming QC
    data on-line, efficient central QC data
    warehousing, and groupware facilities allowing QC
    engineers to share data and reports.

5
Why Measure
  • If you don't know where you are, a map won't
    help.
  • If you don't know where you are going, any road
    will do.

6
Use of Measures
  • Measures, by definition, are raw data
  • Unless this data is sorted in some way, or used
    to derive information, decision making will not
    be possible
  • Metrics are hence derived from measures by:
  • Using contextual information
  • Applying a formula
  • Performing some calculations or computations
  • Some say measures and metrics have no
    difference
  • Measures can be either base or derived; we
    prefer the former terminology due to its industry
    usage

7
Type of Measures
  • Historically, industry has used two types
  • Process metrics/measures
  • Indicate how an activity was performed or
    product delivered/developed or service
    rendered
  • E.g. time taken to perform a task, effort,
    schedule, number of attempts, number of
    inspections, number of reviews etc.
  • Product metrics/measures
  • Used to indicate an attribute of what was
    delivered or performed, but not how was it
    done.
  • E.g. defect density in modules, post-release
    defects, number of customer complaints
  • Product measures will most likely indicate the
    way the product was built, i.e. the process.
  • Process measures are a great way to predict
    quality of product.

8
Examples
  • Identify whether these are process or product
    metrics/measures
  • Schedule Variance
  • Effort Variance
  • Review Efficiency: number of defects detected
    per hour
  • Review Effectiveness: number of defects detected
    in all reviews or per review
  • Mean Time to Enhance
  • Mean Time to Fix Bug
  • Reliability: number of times a product fails on
    site
  • System Availability: percentage up time
  • Time spent in project management activities
  • Number of change requests open in last 3 months
  • Number of defects per LOC
  • Number of defects detected by Joe Smith in an
    hour
  • Sales in this quarter
  • Number of New Customer accounts this year
  • Net Cash flow in this company

9
Uses and Abuses of Measures
  • "There are three kinds of lies: lies, damned lies,
    and statistics"
  • "Figures don't lie, but liars figure"
  • Many abusers are either ignorant or careless
  • Some others have an objective to mislead the
    reader by emphasizing the data that support their
    position

10
An example of Misuse
  • Average: one measure of central tendency
  • Average defects across the organization's
    projects is 18
  • What if there were 7 projects that had 11, 13,
    14, 12, 13, 52 and 11 defects?
  • Would 18 then seem like a typical number of
    defects across the organization?
  • There is no objective set of criteria for which
    average should be reported every time
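
The point of this slide can be checked directly. The short sketch below uses the seven defect counts listed above and compares the mean with the median; choosing the median as the alternative "average" is an illustrative choice here, not something the slide prescribes.

```python
from statistics import mean, median

# Defect counts for the seven projects listed on the slide.
defects = [11, 13, 14, 12, 13, 52, 11]

avg = mean(defects)    # pulled upward by the single project with 52 defects
mid = median(defects)  # middle value after sorting; barely affected by that outlier

print(f"mean = {avg}, median = {mid}")
```

Six of the seven projects sit below the mean, which is why the mean alone gives a misleading picture of a "typical" project.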

11
Yet Another
  • "2 out of 3 software developers recommend Oracle
    as the best ERP solution"
  • Surveyors can mislead by reporting only the survey
    in which 2 of 3 respondents mentioned this ERP
  • More surveys are needed
  • Strong association between variables
  • Number of hours that one studies and the marks
    scored: do more hours mean a higher score?
  • The two variables are related

12
Summary
  • Sometimes numbers can be deceptive
  • Statistics is meant to be precise
  • Understanding these will make you a better
    consumer of Statistical information and help you
    defend yourself against those who mislead!

13
Putting it all together

14
  • Quantitative Analysis Techniques

15
Quantitative Techniques
  • Quantitative Techniques of Analysis
  • Typically involve grouping data or combining
    data by some key
  • E.g., sorting in an ascending or descending order
  • Depicting it in simple chart or graph
  • Interpreting or understanding the data and the
    context
  • May not be used for any decision making
  • Cannot be used for prediction purposes

16
Data Grouping and Charting Techniques
  • Frequency Distribution
  • A Frequency distribution is a grouping of data
    into mutually exclusive categories showing the
    number of observations in each class
  • Example: actual effort (in man-months) of
    projects completed to date
  • 26, 35, 66, 15, 82, 74, 28, 35, 33, 78, 91, 51,
    56, 37, 68, 48, 52, 72, 65, 57, 42, 59, 45, 62,
    73, 54, 58, 63 (to be organized into a
    frequency distribution)
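
As a minimal sketch, the effort data above can be grouped into mutually exclusive classes as follows. The class width of 10 man-months is an assumption made for illustration; the slide does not say which classes were used.

```python
from collections import Counter

# Actual effort (man-months) of completed projects, as listed on the slide.
effort = [26, 35, 66, 15, 82, 74, 28, 35, 33, 78, 91, 51, 56, 37,
          68, 48, 52, 72, 65, 57, 42, 59, 45, 62, 73, 54, 58, 63]

CLASS_WIDTH = 10  # assumed class width; not specified in the presentation


def frequency_distribution(values, width):
    """Count observations per class [lower, lower + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


freq = frequency_distribution(effort, CLASS_WIDTH)

# Print a simple text histogram: one '#' per observation in the class.
for lower, count in freq.items():
    print(f"{lower:>3}-{lower + CLASS_WIDTH - 1:<3} {'#' * count} ({count})")
```

Each observation falls into exactly one class, so the class counts add up to the number of projects.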

17
Graphic Representations
  • A Histogram is a graph in which the classes are
    marked on the horizontal axis and the class
    frequencies on the vertical axis
  • The class frequencies are represented by the
    heights of the bars and the bars are drawn
    adjacent to each other

18
Graphic Representations
  • A frequency polygon consists of line segments
    connecting the points formed by the class
    midpoint and the class frequency

19
Graphic Representations
  • A cumulative frequency distribution (Ogive) is
    used to determine how many or what proportion of
    the data values are below or above a certain value
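
The question an ogive answers ("how many values lie below x?") can also be computed directly. This sketch reuses the effort data from the Frequency Distribution slide; the helper names and the threshold values are illustrative assumptions.

```python
from bisect import bisect_left

# Effort data (man-months) from the Frequency Distribution slide, sorted.
effort = sorted([26, 35, 66, 15, 82, 74, 28, 35, 33, 78, 91, 51, 56, 37,
                 68, 48, 52, 72, 65, 57, 42, 59, 45, 62, 73, 54, 58, 63])


def count_below(sorted_values, threshold):
    """Number of observations strictly below the threshold."""
    return bisect_left(sorted_values, threshold)


def proportion_below(sorted_values, threshold):
    """Cumulative relative frequency below the threshold."""
    return count_below(sorted_values, threshold) / len(sorted_values)


# Cumulative counts at assumed class boundaries 20, 30, ..., 100.
for upper in range(20, 101, 10):
    n = count_below(effort, upper)
    print(f"below {upper:>3}: {n:>2} projects ({proportion_below(effort, upper):.0%})")
```

Plotting these cumulative counts against the class boundaries gives the ogive.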

20
Bar Chart
  • A bar chart can be used to depict any of the
    levels of measurement (nominal, ordinal, interval
    or ratio)

21
Pie Charts
  • A pie chart is useful for displaying a relative
    frequency distribution. A circle is divided
    proportionally to the relative frequency and
    portions of the circle are allocated for the
    different groups
  • Example: a sample of 100 children was asked to
    indicate their favorite color

23
Statistical Analysis Techniques
  • SPC for Software: Premises
  • Software process is performed by people, not
    machines
  • The software process is (or can be) repeatable,
    but not repetitive
  • The act of measuring and analyzing will change
    behavior, potentially in dysfunctional ways
  • Another way to look at it
  • Enumerative studies: the aim is to determine "how
    many" as opposed to "why so many"
  • Analytic studies: the aim is to predict or improve
    the behavior in future

24
Enumerative Studies in Software
  • Inspections of code modules to detect and count
    existing defects
  • Functional or system testing to ascertain the
    extent to which a product has certain qualities
  • Measurement of software size to determine project
    status or the amount of software under
    configuration control
  • Measurement of staff hours so that the results
    can be used to bill customers or track
    expenditures against budget

25
Analytic Studies
  • Evaluating software tools, technologies or
    methods
  • Tracking defect discovery rates to predict
    product release dates
  • Evaluating defect discovery profiles to identify
    focal areas for process improvement, predicting
    schedules, costs or operational reliability
  • Using control charts to stabilize and improve
    software processes or to assess process capability

26
Statistically Controlled
  • "A phenomenon will be said to be controlled when,
    through the use of past experience, we can
    predict, at least within limits, how the
    phenomenon may be expected to vary in the future"
  • Walter A. Shewhart, 1931

27
Control Charts
  • Statistical Quality Control emphasizes in-process
    control with the objective of controlling the
    quality of a manufacturing process or service
    operation using sampling techniques.
  • Statistical sampling techniques are used to aid
    in the manufacturing of a product to
    specifications rather than attempt to inspect
    quality into the product after it is manufactured
  • Control charts are useful for monitoring a process

28
Causes of Variation
  • There is variation in all parts produced by a
    process. There are two sources of variation
  • Chance Variation (Common Cause)
  • Cannot be completely eliminated unless there is a
    major change in the equipment or material used in
    the process
  • Random in nature
  • Assignable Variation (Special Cause)
  • Can be eliminated or reduced by investigating the
    problem and finding the cause
  • Non-random in nature

29
Example 1
  • Bus travel time: 25 minutes from Point A to
    Point B
  • Each run may not take exactly 25 min
  • Some may take longer and some less
  • Snowstorm or an accident
  • Chance Variation (Common Cause)
  • Driver may not hit the green lights at the right
    time due to his driving inefficiency
  • Assignable Variation (Special Cause)

30
Example 2
  • Chance Variation (Common Causes)
  • Internal Machine friction
  • Temperature influences
  • Humidity influences
  • Vibrations transmitted from a passing forklift
  • Network clogging due to a snowstorm
  • System freeze due to a power failure
  • Assignable Variation (Special Cause)
  • Operator who continually sets up machine
    incorrectly
  • Hole drilled into steel due to a dull drill
  • Processor speed low while requirement is for a
    higher one
  • New programmer or inexperienced manager
  • Tired programmer due to overwork
  • Attrition midway on the project
  • Network or link failure on particular day
  • No knowledge of domain or lack of functional
    knowledge

31
Why are we bothered about variation
  • It will change the shape, dispersion and central
    tendency of the characteristic being measured
  • Assignable variation can be corrected, and the
    process stabilized, economically

32
Purpose of Quality Control Charts
  • The purpose of quality-control charts is to
    portray graphically when an assignable cause
    enters the production system so that it can be
    identified and corrected
  • This is accomplished by periodically selecting a
    random sample from the current production

33
Why Control Charts
  • Control charts let you know what processes can
    do, so that you can set achievable goals
  • They represent the Voice of the process
  • Identify unusual events and point at fixable
    problems and potential process improvements

34
Variable and Attributes Data
  • Variable Data
  • Represents measurements of a continuous
    phenomenon
  • E.g. elapsed time, effort expended, years of
    experience, memory utilization, cost of rework
  • Volume, weight, height, length, efficiency,
    viscosity
  • Attributes Data
  • Represents data that can either conform or not
    conform to a discrete set of values
  • E.g. number of defects found, number of
    defectives per unit, percentage of projects using a formal
    code inspection method

35
Simple Control Charts for software
  • X-Bar Chart
  • nP and p-Chart
  • C and U-Chart
  • I-X Chart (Individual X-Chart)
  • XmR Chart (Moving Range Chart)

36
Types of Quality Control Charts - Variables
  • The mean or x-bar chart is designed to
    control variables such as weight, length etc. The
    upper control limit (UCL) and the lower control
    limit (LCL) are obtained from the equations
  • $UCL = \bar{\bar{X}} + A_2\bar{R}$ and $LCL = \bar{\bar{X}} - A_2\bar{R}$
  • where $\bar{\bar{X}}$ is the mean of the sample means, $\bar{R}$ is
    the mean of the sample ranges, and $A_2$ is a
    constant that depends on the sample size.

37
Finding special causes by control charts
  • Consider Peer Review (PR) preparation rate for 5
    samples (each sample is covering 4 PRs from 4
    divisions)

38
Mean and Ranges
  • The table below shows the mean and ranges

39
Compute UCL and LCL
  • Compute the Grand Mean (X-double bar) and the
    average range.
  • Grand Mean = (25.25 + 26.75 + … + 25.25)/5 = 26.35
  • Mean Range = (5 + 6 + … + 3)/5 = 5.8
  • Determine the UCL and LCL for the average
    preparation time
  • UCL = 26.35 + 0.729 × 5.8 = 30.58
  • LCL = 26.35 - 0.729 × 5.8 = 22.12
  • 0.729 is the constant (A2 for samples of size 4)
    found from a statistical table.
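
The arithmetic on this slide can be reproduced with a few lines. This is a sketch only: the function name `xbar_limits` is hypothetical, and the inputs are the grand mean, mean range and constant quoted above (the per-sample table is not part of this transcript).

```python
A2 = 0.729  # control-chart constant for samples of size 4, as quoted on the slide


def xbar_limits(grand_mean, mean_range, a2):
    """Return (LCL, UCL) for an x-bar chart: grand mean -/+ A2 * mean range."""
    margin = a2 * mean_range
    return grand_mean - margin, grand_mean + margin


lcl, ucl = xbar_limits(grand_mean=26.35, mean_range=5.8, a2=A2)
print(f"LCL = {lcl:.2f}, UCL = {ucl:.2f}")
```

A sample mean outside the interval from LCL to UCL would be a candidate special cause to investigate.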

40
Control Charts for Means and Analysis
  • Is the process under control? Any special causes
    of variation?
  • Control Charts are typically indicative after 25
    or 30 samples

41
Detecting instabilities and Out of Control
  • Examine control charts for instances of behavior
    and patterns that show nonrandom behavior
  • Values falling outside the control limits and
    unusual patterns within the running record
    suggest that assignable causes exist

42
Tests of Out of Control and Instabilities
  • Test 1: A single point falls outside the 3-sigma
    control limits
  • Test 2: At least two of three successive values
    fall on the same side of, and more than two
    sigma units away from, the center line
  • Test 3: At least four out of five successive
    values fall on the same side of, and more than
    one sigma unit away from, the center line
  • Test 4: At least eight successive values fall on
    the same side of the center line.
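
The four tests lend themselves to a mechanical check. The sketch below is one possible reading of them, not a definitive implementation: the function name `detect_signals` and the example series are hypothetical, and `sigma` is assumed to be the standard deviation of the plotted statistic (one third of the distance from the center line to a 3-sigma limit).

```python
def detect_signals(values, center, sigma):
    """Return, for each of the four tests, the indices at which it fires."""
    # Distance of each point from the center line, in sigma units.
    z = [(v - center) / sigma for v in values]

    def same_side(window, threshold):
        # Largest number of points in the window lying more than
        # `threshold` sigma units away on one side of the center line.
        above = sum(1 for x in window if x > threshold)
        below = sum(1 for x in window if x < -threshold)
        return max(above, below)

    signals = {1: [], 2: [], 3: [], 4: []}
    for i in range(len(z)):
        if abs(z[i]) > 3:                                     # Test 1
            signals[1].append(i)
        if i >= 2 and same_side(z[i - 2:i + 1], 2) >= 2:      # Test 2
            signals[2].append(i)
        if i >= 4 and same_side(z[i - 4:i + 1], 1) >= 4:      # Test 3
            signals[3].append(i)
        if i >= 7 and same_side(z[i - 7:i + 1], 0) >= 8:      # Test 4
            signals[4].append(i)
    return signals


# Hypothetical series with center line 10 and sigma 1 (illustrative values only).
series = [10.0, 12.5, 9.5, 12.2, 10.1, 13.5]
print(detect_signals(series, center=10.0, sigma=1.0))
```

Each reported index is the last point of the window that triggered the test, which is usually the point at which the investigation would start.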

43
Typical Causal Analysis
  • Test 1
  • Unusual Single event
  • Accident, snowstorm, merger, attrition
  • Test 2
  • Chance variation
  • New operator or untrained operator
  • Test 3
  • New machine or new process/procedure
  • Test 4
  • Wear and tear of the machine
  • These are only typical causes; there could be
    atypical ones as well!

44
Statistics in Action An Example
  • Conviction of a person who bribed some players to
    lose in betting
  • X-bar and R-bar charts showed unusual betting
    patterns and some contestants did not win as
    expected
  • A QC Expert identified times when assignable
    causes stopped and prosecutors were able to tie
    this to the times of the arrest of the suspect
  • Canadian firm on the Japanese order in 1980s
  • Three defectives shipped separately

45
Types of Control Charts nP and p
  • Used with discrete binomial data (number of
    failures)
  • Likelihood of an item's failure is unaffected by
    the failure of the previous item in the sample
  • nP charts
  • $x_i$: number of failures in a sample
  • fixed sample size $n$
  • average fraction non-conforming $\bar{p}$
  • Mean: $n\bar{p}$
  • Constant control limits: $n\bar{p} \pm 3\sqrt{n\bar{p}(1 - \bar{p})}$
  • P charts
  • $p_i$: proportion of failures in a sample
  • variable sample size $n_i$
  • Mean: $\bar{p}$
  • Variable control limits: $\bar{p} \pm 3\sqrt{\bar{p}(1 - \bar{p})/n_i}$
  • Control limits tighten up for larger sample sizes
    and relax for smaller sample sizes
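
A minimal sketch of both limit calculations follows. The function names and the sample data are hypothetical, and clipping a negative lower limit to 0 (and an upper proportion limit to 1) is a common convention assumed here rather than something stated on the slide.

```python
from math import sqrt


def np_chart_limits(failures, n):
    """nP chart: failures per sample, fixed sample size n. Returns (LCL, mean, UCL)."""
    p_bar = sum(failures) / (n * len(failures))  # average fraction non-conforming
    center = n * p_bar
    margin = 3 * sqrt(n * p_bar * (1 - p_bar))
    return max(0.0, center - margin), center, center + margin


def p_chart_limits(failures, sizes):
    """p chart: variable sample sizes. Returns (mean, [(LCL_i, UCL_i), ...])."""
    p_bar = sum(failures) / sum(sizes)
    limits = []
    for n_i in sizes:
        margin = 3 * sqrt(p_bar * (1 - p_bar) / n_i)
        limits.append((max(0.0, p_bar - margin), min(1.0, p_bar + margin)))
    return p_bar, limits


# Hypothetical data: 4 samples of 50 items each for the nP chart.
np_lcl, np_center, np_ucl = np_chart_limits([4, 6, 5, 5], n=50)
print(f"nP chart: LCL={np_lcl:.2f}, mean={np_center:.2f}, UCL={np_ucl:.2f}")

# Hypothetical data: samples of different sizes for the p chart.
p_bar, p_limits = p_chart_limits([2, 5, 10], sizes=[20, 50, 100])
for n_i, (lo, hi) in zip([20, 50, 100], p_limits):
    print(f"p chart, n={n_i:>3}: LCL={lo:.3f}, UCL={hi:.3f}")
```

In the p-chart output the limits narrow as the sample size grows, which is the tightening effect described in the last bullet.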

46
Types of Control Charts c and u
  • Used with discrete Poisson data (count of
    defects/sample)
  • independent events (defects)
  • probability proportional to area of opportunity
    (sample size)
  • events are rare (< 10 possible defects)
  • C charts
  • $c_i$: event count
  • constant area of opportunity
  • average number of events per sample $\bar{c}$
  • Mean: $\bar{c}$
  • Constant control limits: $\bar{c} \pm 3\sqrt{\bar{c}}$
  • U charts
  • $u_i$: event count per unit area of opportunity
    (defects/unit size)
  • variable area of opportunity $a_i$
  • Mean: $\bar{u}$
  • Variable control limits: $\bar{u} \pm 3\sqrt{\bar{u}/a_i}$
  • Control limits tighten up for larger sample sizes
    and relax for smaller sample sizes
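
The c and u chart limits can be sketched the same way. The function names and data are hypothetical; the u-chart limits use the usual form in which the variance term is divided by each sample's area of opportunity, and a negative lower limit is clipped to 0 by convention (an assumption, not stated on the slide).

```python
from math import sqrt


def c_chart_limits(counts):
    """c chart: event counts with constant area of opportunity. Returns (LCL, mean, UCL)."""
    c_bar = sum(counts) / len(counts)
    margin = 3 * sqrt(c_bar)
    return max(0.0, c_bar - margin), c_bar, c_bar + margin


def u_chart_limits(counts, areas):
    """u chart: event counts with variable area of opportunity (e.g. module size)."""
    u_bar = sum(counts) / sum(areas)  # average events per unit area
    limits = []
    for a_i in areas:
        margin = 3 * sqrt(u_bar / a_i)
        limits.append((max(0.0, u_bar - margin), u_bar + margin))
    return u_bar, limits


# Hypothetical defect counts from four inspections of equal-sized samples.
print(c_chart_limits([3, 5, 4, 4]))

# Hypothetical defect counts for modules of size 1, 2 and 4 units.
u_bar, u_limits = u_chart_limits([4, 8, 16], areas=[1.0, 2.0, 4.0])
print(u_bar, u_limits)
```

As with the p chart, the u-chart limits are widest for the smallest area of opportunity.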

47
Individual X Chart
  • Sample of Individual Reviews
  • Can show the distribution of variation in cases
    of non homogeneous data

48
Individual X Chart
49
XmR Charts
  • Used with continuous data (measurements)
  • no assumptions about underlying distribution
  • Appropriate for items that are not produced in
    batches or when it is desirable to use all
    available data
  • Two charts: X and mR (moving range of X)
  • $\overline{mR}$ is used to estimate $\sigma$ for X as well as mR
  • $mR_i = |X_i - X_{i-1}|$
  • X chart mean: $\bar{X}$
  • X chart control limits: $\bar{X} \pm 2.660\,\overline{mR}$
  • mR chart mean: $\overline{mR}$
  • mR chart control limit: $3.268\,\overline{mR}$
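
As a rough sketch, the XmR limits can be computed from a series of individual values using the constants 2.660 and 3.268 given on this slide. The function name `xmr_limits` and the example values are hypothetical.

```python
def xmr_limits(values):
    """Compute X-chart and mR-chart center lines and limits for individual values."""
    # Moving range: absolute difference between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    x_bar = sum(values) / len(values)
    return {
        "x_center": x_bar,
        "x_lcl": x_bar - 2.660 * mr_bar,
        "x_ucl": x_bar + 2.660 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.268 * mr_bar,
    }


# Hypothetical individual measurements (e.g. review preparation rates).
limits = xmr_limits([10, 12, 11, 13, 12])
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```

Because the limits come from the average moving range rather than from the overall standard deviation, a single extreme value inflates them less than it otherwise would.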

50
Other Statistical Techniques
  • Control Charts are not enough
  • Confidence intervals
  • Prediction intervals
  • Test of hypotheses
  • ANOVA
  • Probability techniques
  • Regression
  • Correlation Analysis

51
Recent Studies
  • Although the software industry has made significant
    progress in implementing metrics programs, a large
    number of them fail
  • Howard Rubin (Rubin Systems, Inc.)
  • 1997 study indicated that four out of five
    metrics programs fail
  • Success is defined as
  • A measurement program that lasts for more than two
    years and that impacts the business decisions
    made by the organization

52
Primary Reasons for Failure
  • Not tied to business goals
  • Irrelevant or not understood by key players
  • Perceived to be unfair or resisted
  • Motivated wrong behavior
  • Expensive, cumbersome
  • No action based on the numbers
  • No sustained management sponsorship

53
Success in Metrics Programs
  • Metrics programs are more than collecting data
  • Benefit and value come from decisions taken from
    data
  • Sometimes choosing the right metrics becomes
    overwhelming due to many opportunities that exist
    especially in large organizations
  • Goal driven measurement is a must!
  • Ref: Guidebook CMU/SEI-97-HB-003 by
    William A. Florac, Robert E. Park, Anita D.
    Carleton

54
  • THANK YOU