Managing Software Projects Analysis and Evaluation of Data - PowerPoint PPT Presentation

About This Presentation
Title:

Managing Software Projects Analysis and Evaluation of Data

Description:

Managing Software Projects Analysis and Evaluation of Data Reliable, Accurate, and Valid Data Distribution of Data Centrality and Dispersion Data Smoothing: Moving ... – PowerPoint PPT presentation

Slides: 30
Provided by: JayT155
Learn more at: http://www.letu.edu
Transcript and Presenter's Notes

Title: Managing Software Projects Analysis and Evaluation of Data


1
Managing Software Projects Analysis and
Evaluation of Data
  • Reliable, Accurate, and Valid Data
  • Distribution of Data
  • Centrality and Dispersion
  • Data Smoothing: Moving Averages
  • Data Correlation
  • Normalization of Data

(Source: Tsui, F. Managing Software Projects.
Jones and Bartlett, 2004)
2
Reliable, Accurate, and Valid Data
3
Definitions
  • Reliable data: Data that are collected and
    tabulated according to the defined rules of
    measurement and metric
  • Accurate data: Data that are collected and
    tabulated according to the defined level of
    precision of measurement and metric
  • Valid data: Data that are collected, tabulated,
    and applied according to the defined intention of
    applying the measurement

4
Distribution of Data
5
Definition
  • Data distribution: A description of a collection
    of data that shows the spread of the values and
    the frequency of occurrences of the values of the
    data

6
Example 1: Skew of the Distribution
The number of problems detected at each of five
severity levels:
  • Severity level 1: 23
  • Severity level 2: 46
  • Severity level 3: 79
  • Severity level 4: 95
  • Severity level 5: 110

(more on next slide)
7
Example 1 (continued)
Number of problems is skewed towards the
higher-numbered severity levels
[Figure: Number of Problems Found (0 to 120) plotted against Severity Level (1 to 5)]
8
Example 2: Range of Data Values
The number of severity level 1 problems by
functional area:
  • Functional area 1: 2
  • Functional area 2: 7
  • Functional area 3: 3
  • Functional area 4: 8
  • Functional area 5: 0
  • Functional area 6: 1
  • Functional area 7: 8

The range is from 0 to 8.
9
Example 3: Data Trends
The total number of problems found in a
specific functional area across the test time
period, in weeks:
  • Week 1: 20
  • Week 2: 23
  • Week 3: 45
  • Week 4: 67
  • Week 5: 35
  • Week 6: 15
  • Week 7: 10

10
Centrality and Dispersion
11
Definition
  • Centrality analysis: An analysis of a data set to
    find the typical value of that data set
  • Approaches:
    • Average value
    • Median value
    • Mode value
    • Variance and standard deviation
    • Control chart

12
Average, Median, and Mode
  • Average value (or mean): One type of centrality
    analysis that estimates the typical (or middle)
    value of a data set by summing all the observed
    data values and dividing the sum by the number of
    data points
  • This is the most common of the centrality
    analysis methods
  • Median: A value used in centrality analysis to
    estimate the typical (or middle) value of a data
    set. After the data values are sorted, the
    median is the data value that splits the data set
    into upper and lower halves
  • If there is an even number of values, the values
    of the middle two observations are averaged to
    obtain the median
  • Mode: The most frequently occurring value in a
    data set
  • If the data set contains floating point values,
    use the highest frequency of values occurring
    between two consecutive integers (inclusive)

13
Example
Data set: 2, 7, 3, 8, 0, 1, 8
Average: xavg = (2 + 7 + 3 + 8 + 0 + 1 + 8) / 7 = 4.1
Median: 3 (sorted values: 0, 1, 2, 3, 7, 8, 8)
Mode: 8
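The three centrality measures for this data set can be checked with a short script. This is a minimal sketch that assumes Python's standard `statistics` module; the variable names are illustrative and are not taken from the slides.

```python
from statistics import mean, median, mode

# Data set from the example slide.
data = [2, 7, 3, 8, 0, 1, 8]

average = mean(data)      # sum of the values divided by the number of values
middle = median(data)     # middle value once the data are sorted
most_common = mode(data)  # most frequently occurring value

print(f"average = {average:.1f}")
print(f"median  = {middle}")
print(f"mode    = {most_common}")
```

The average prints as 4.1 because it is rounded to one decimal place, as on the slide.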
14
Variance and Standard Deviation
  • Variance: The average of the squared deviations
    from the average value: s^2 = SUM((xi - xavg)^2)
    / (n - 1)
  • Standard deviation: The square root of the
    variance. A metric used to define and measure
    the dispersion of data from the average value in
    a data set
  • It is numerically defined as follows:

s = SQRT( SUM((xi - xavg)^2) / (n - 1) )
where
  SQRT = square root function
  SUM = sum function
  xi = ith observation
  xavg = average of all xi
  n = total number of observations
15
Standard Deviation Example
Data set: 2, 7, 3, 8, 0, 1, 8
xavg = (2 + 7 + 3 + 8 + 0 + 1 + 8) / 7 = 4.1
SUM((xi - xavg)^2) = 4.41 + 8.41 + 1.21 + 15.21 + 16.81 + 9.61 + 15.21 = 70.87
SUM((xi - xavg)^2) / (n - 1) = 70.87 / 6 = 11.81
Standard deviation: s = SQRT(11.81) = 3.44
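The same calculation can be reproduced step by step in code. This is a sketch only: it follows the (n - 1) formula from the previous slide and cross-checks the result against `statistics.stdev`, which uses the same divisor. Because the script keeps the unrounded average (about 4.143 rather than 4.1), the intermediate sum comes out near 70.86 instead of 70.87, while the variance and standard deviation still round to the slide's values.

```python
from statistics import mean, stdev

# Data set from the example slide.
data = [2, 7, 3, 8, 0, 1, 8]

x_avg = mean(data)
squared_deviations = [(x - x_avg) ** 2 for x in data]

# Divide by (n - 1), as in the formula on the previous slide.
s2 = sum(squared_deviations) / (len(data) - 1)
s = s2 ** 0.5

print(f"variance           = {s2:.2f}")
print(f"standard deviation = {s:.2f}")
print(f"library stdev      = {stdev(data):.2f}")
```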
16
Control Chart
  • Control chart: A chart used to assess and control
    the variability of some process or product
    characteristic
  • It usually involves establishing lower and upper
    limits (the control limits) of data variations
    from the data set's average value
  • If an observed data value falls outside the
    control limits, it triggers an evaluation
    of the characteristic

17
Control Chart (continued)

[Figure: control chart]
7.54 problems (upper control limit)
4.1 problems (average)
0.66 problems (lower control limit)
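The limits on this chart equal the average plus and minus one standard deviation from the earlier example (4.1 + 3.44 = 7.54 and 4.1 - 3.44 = 0.66). The sketch below computes limits that way and lists the observations that fall outside them. The function name `control_limits` and the multiplier `k` are illustrative assumptions; the slides do not prescribe how wide the limits should be in general.

```python
from statistics import mean, stdev

def control_limits(values, k=1.0):
    """Return (lower, center, upper), with limits k sample standard deviations from the average."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation, (n - 1) divisor
    return center - k * spread, center, center + k * spread

# Data set from the standard deviation example.
data = [2, 7, 3, 8, 0, 1, 8]

lower, center, upper = control_limits(data)
outside = [x for x in data if x < lower or x > upper]

print(f"lower = {lower:.2f}, average = {center:.2f}, upper = {upper:.2f}")
print("values outside the control limits:", outside)
```

With unrounded inputs the limits come out near 0.71 and 7.58 rather than 0.66 and 7.54. Either way, the two observations of 8 and the observation of 0 fall outside the limits.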
18
Data Smoothing: Moving Averages
19
Definitions
  • Moving average: A technique for expressing data
    by computing the average of a fixed grouping
    (e.g., data for a fixed period) of data values;
    it is often used to suppress the effects of one
    extreme data point
  • Data smoothing: A technique used to decrease the
    effects of individual, extreme variability in
    data values

20
Example
Test week   Problems found   2-week moving avg   3-week moving avg
1           20               -                   -
2           33               26.5                -
3           45               39                  32.7
4           67               56                  48.3
5           35               51                  49
6           15               25                  39
7           20               17.5                23.3
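The two moving-average columns can be reproduced with a small helper. This is a sketch under one assumption: each average covers the current week and the weeks immediately before it, which matches the values in the table. The function name `moving_average` is illustrative.

```python
def moving_average(values, window):
    """Trailing moving average; None where fewer than `window` values are available."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)  # not enough history yet
        else:
            group = values[i + 1 - window : i + 1]
            result.append(sum(group) / window)
    return result

# "Problems found" column from the table, weeks 1 to 7.
problems = [20, 33, 45, 67, 35, 15, 20]

two_week = moving_average(problems, 2)
three_week = moving_average(problems, 3)

for week, (found, a2, a3) in enumerate(zip(problems, two_week, three_week), start=1):
    print(week, found, a2, None if a3 is None else round(a3, 1))
```

The spike of 67 in week 4 becomes 56 in the 2-week series and about 48.3 in the 3-week series, which shows how a wider window suppresses a single extreme point more strongly.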
21
Data Correlation
22
Definition
  • Data correlation: A technique that analyzes the
    degree of relationship between sets of data
  • One sought-after relationship in software is that
    between some attribute prior to product release
    and the same attribute after product release
  • One popular way to examine data correlation is to
    analyze whether a linear relationship exists
  • Two sets of data are paired together and plotted
  • The resulting graph is reviewed to detect any
    relationship between the data sets

23
Linear Regression
  • Linear regression: A technique that estimates the
    relationship between two sets of data by fitting
    a straight line to the two sets of data values
  • This is a more formal method of doing data
    correlation
  • Linear regression uses the equation of a line, y =
    mx + b, where m is the slope and b is the
    y-intercept value
  • To calculate the slope, use the following: m =
    SUM((xi - xavg) * (yi - yavg)) / SUM((xi -
    xavg)^2)
  • To calculate the y-intercept, use the
    following: b = yavg - (m * xavg)

24
Example
Pre-release and Post-release Problems
SW Product   Pre-release   Post-release
A            10            24
B            5             13
C            35            71
D            75            155
E            15            34
F            22            50
G            7             16
H            54            112
25
Example (continued)
xavg = 27.9
yavg = 59.4
m = 2.0 (slope, approx.)
b = 3.6 (y-intercept, approx.)
y = 2x + 3.6
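The slope and intercept formulas from the Linear Regression slide can be applied directly to the eight products in the table. This is a minimal sketch; the function name `fit_line` is an illustrative assumption. Note that the script keeps full precision, so its intercept comes out near 3.1. The slide's value of 3.6 results from using the rounded figures (59.4 - 2.0 * 27.9).

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    x_avg = sum(xs) / n
    y_avg = sum(ys) / n
    numerator = sum((x - x_avg) * (y - y_avg) for x, y in zip(xs, ys))
    denominator = sum((x - x_avg) ** 2 for x in xs)
    m = numerator / denominator
    b = y_avg - m * x_avg
    return m, b

# Products A to H from the table.
pre_release = [10, 5, 35, 75, 15, 22, 7, 54]
post_release = [24, 13, 71, 155, 34, 50, 16, 112]

m, b = fit_line(pre_release, post_release)
print(f"slope m     = {m:.2f}")
print(f"intercept b = {b:.2f}")
```

Either line predicts roughly two post-release problems for every pre-release problem found.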
26
Example (continued)
[Figure: Number of Post-release Problems Found (0 to 200) plotted against Number of Pre-release Problems Found (10 to 80)]
27
Normalization of Data
28
Definition
  • Normalizing data: A technique used to bring data
    characterizations to some common or standard
    level so that comparisons become more meaningful
  • This is needed because a pure comparison of raw
    data sometimes does not provide an accurate
    comparison
  • The number of source lines of code is the most
    common means of normalizing data
  • Function points may also be used
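As an illustration of normalizing by source lines of code, the sketch below converts raw problem counts into problems per thousand source lines. The product names, problem counts, and code sizes are hypothetical values invented for this example, and `problems_per_ksloc` is an illustrative name; none of them come from the slides.

```python
def problems_per_ksloc(problems, sloc):
    """Problem count normalized to problems per 1,000 source lines of code."""
    return problems / (sloc / 1000)

# Hypothetical products: (problems found, source lines of code).
products = {
    "X": (40, 20_000),
    "Y": (90, 90_000),
}

density = {name: problems_per_ksloc(p, s) for name, (p, s) in products.items()}

for name, value in density.items():
    print(f"Product {name}: {value:.1f} problems per KSLOC")
```

In this made-up case, product Y has more problems in absolute terms but half the problem density of product X, which is the kind of comparison that raw counts alone would hide.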

29
Summary
  • Reliable, Accurate, and Valid Data
  • Distribution of Data
  • Centrality and Dispersion
  • Data Smoothing: Moving Averages
  • Data Correlation
  • Normalization of Data
