Title: A Systematic Review of Cross- vs. Within-Company Cost Estimation Studies
EEL 6883 Research Paper Presentation
- Barbara Kitchenham, Emilia Mendes, Guilherme Travassos
- IEEE Transactions on Software Engineering, vol. 33, no. 5, May 2007
Mustafa Ilhan Akbas, Omer Bilal Orhan
Outline
- Motivation
- Objective
- Method
- Results
- Conclusions
- Comments
Motivation
- Cross- vs. within-company cost estimation
- Early studies suggested calibrating general-purpose cost estimation models and using only single-company data. BUT
- Collecting enough data takes time
- Older projects may not reflect current technology
- Care is necessary in data collection
- Cross-company models are therefore favored. BUT
Motivation
- 1999 Maxwell: Within-company model is more accurate
- 1999 Briand: Cross-company model could be as accurate
- 2000 Briand (with Maxwell data): Cross-company model can be as good as a model built from within-company data
- 2002 Wieczorek, Ruhe: Same trend with Briand data
- 2005 Mendes: Same trend with another data set
- But
- 2000, 2001 Jeffery: Within-company models are superior
- 2003 Lefley, Shepperd: Within-company model is more accurate with Briand data
- 2004 Mendes, Kitchenham: Within-company models are significantly better
Motivation
- The evidence on whether cross-company models can be applied to effort estimation for single-company projects is contradictory.
Objective
- To determine under what conditions individual organisations are able to rely on cross-company-based estimation models
- To provide advice to researchers about the value of cross-company models.
Method
- Conduct a systematic review to determine the factors that influence the outcome of the studies.
- Discuss the variations in experimental procedure
Research Questions
- For the review, the authors follow the approach of Kitchenham's guidelines, Procedures for Performing Systematic Reviews
- The point of view of the review is defined by the research questions
- Question one
- What evidence is there that cross-company
estimation models are not significantly different
from within-company estimation models for
predicting effort for software/Web projects?
Research Questions
- Question two
- What characteristics of the study data sets and
the data analysis methods used in the study
affect the outcome of within-company and
cross-company effort estimation accuracy studies?
- Question three
- Which experimental procedure is most appropriate
for studies comparing within-company and
cross-company effort estimation models?
Method: Population, Intervention, Comparison, Outcome
- Population: Cross-company benchmarking databases of Web and software projects
- Intervention: Effort estimation models constructed from cross-company data, used to predict effort for single-company projects
- Comparison intervention: Effort estimation models constructed from within-company data only
- Outcome: The accuracy of the cross- and within-company models
Search Strategy Used for Primary Studies
- The search terms were constructed using the following strategy:
- Derive major terms from the questions by identifying the population, intervention and outcome
- Identify alternative spellings and synonyms for major terms, consulting field experts and/or subject librarians to identify the terms
- Check the keywords in any relevant papers we already have
- Use the Boolean OR to incorporate alternative spellings and synonyms
- Use the Boolean AND to link the major terms from population, intervention and outcome.
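The OR/AND construction above can be sketched in a few lines of Python. This is only an illustration: the function names and the shortened synonym lists are hypothetical and are not taken from the paper.

```python
def or_group(terms):
    """Join alternative spellings and synonyms with OR inside one group."""
    return "(" + " OR ".join(terms) + ")"


def build_search_string(term_groups):
    """Link the OR-groups of the major terms with AND."""
    return " AND ".join(or_group(group) for group in term_groups)


# Hypothetical, shortened synonym lists, for illustration only.
groups = [
    ["software", "Web"],
    ["cross company", "single company"],
    ["effort", "cost"],
    ["estimation", "prediction"],
]
query = build_search_string(groups)
print(query)
```

Each inner list holds the synonyms for one major term, so adding a synonym widens the search while adding a new group narrows it.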
The Main Search Terms
- Population: software, Web, project.
- Intervention: cross-company, project, effort, estimation, model.
- Comparison: single-company, project, effort, estimation, model.
- Outcomes: prediction, estimate, accuracy.
Sample Search String
(software OR application OR product OR Web) AND (method OR process OR system OR technique OR methodology OR procedure) AND (cross company OR multi organisation OR within organisation OR single company OR single-organisational OR company-specific) AND (model) AND (effort OR cost) AND (estimation OR prediction OR assessment)
- The complete set of search strings is given in the paper.
Initial Search Phase
- Identification of candidate primary sources based on the authors' knowledge and on searches of electronic databases using the derived search strings
- 1344 papers were retrieved; 25 of these hits corresponded to the set of 10 known papers.
- Manual scan of titles and/or abstracts of all 1344 papers
Databases/Journals Searched (from an earlier work)
- Electronic databases
- INSPEC
- Ei Compendex
- ScienceDirect
- Web of Science
- IEEE Xplore
- ACM Digital Library
- Individual journals (J) and conference proceedings (C)
- Empirical Software Engineering (J)
- Information and Software Technology (J)
- Software Process Improvement and Practice (J)
- Management Science (J)
- International Software Metrics Symposium (C)
- International Conference on Software Engineering (C)
- Evaluation and Assessment in Software Engineering (manual search) (C)
Secondary Search Phase
- Has two sub-phases
- Review the references of each primary source to find further candidate primary sources, repeating until no further relevant document is found.
- Contact researchers who authored the primary sources found in the first phase, or who could be working on the topic. Six researchers were contacted; none was working in the area.
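The first sub-phase (following references repeatedly until nothing new turns up) amounts to a breadth-first search over the citation graph. The sketch below assumes a hypothetical `references` mapping and a hypothetical `is_relevant` screening function; in the review itself the screening was done by reading the papers.

```python
from collections import deque


def snowball(seed_papers, references, is_relevant):
    """Follow reference lists until no new relevant paper is found."""
    found = set(seed_papers)
    queue = deque(seed_papers)
    while queue:
        paper = queue.popleft()
        for cited in references.get(paper, []):
            # Only papers that pass screening are kept and expanded further.
            if cited not in found and is_relevant(cited):
                found.add(cited)
                queue.append(cited)
    return found


# Hypothetical citation data: "P*" papers are on topic, "X1" is not.
references = {"P1": ["P2", "X1"], "P2": ["P3"], "P3": ["P1"]}
result = snowball(["P1"], references, lambda paper: paper.startswith("P"))
print(sorted(result))
```

The loop terminates because each paper enters the queue at most once, which mirrors the "until no further relevant document is found" stopping rule.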
Study Selection
- Criteria for including a primary study
- Any study that compared predictions of cross-company models with those of within-company models, based on analysis of single-company project data.
- Criteria for excluding a primary study
- Projects were collected from only a small number of different sources
- Models derived from a within-company data set were compared with predictions from a general cost estimation model.
Study Quality Assessment
- Part 1: The quality of the study itself.
- Has four top-level questions and an additional quality issue related to data set size. (Weight 1.5)
- Is the data analysis process appropriate?
- Did studies carry out a sensitivity or residual analysis?
- Were accuracy statistics based on the raw data scale?
- How good was the study comparison method?
Study Quality Assessment
- Part 2: The quality of the reporting.
- Has four top-level questions. (Weight 1)
- Is it clear what projects were used to construct each model?
- Is it clear how accuracy was measured?
- Is it clear what cross-validation method was used?
- Were all model construction methods fully defined?
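One way the two weights could be combined into a single score is a weighted sum, sketched below. The slides do not state how the parts are aggregated, so the weighted sum, the 0 to 1 answer scale, and all names here are assumptions made for illustration.

```python
PART1_WEIGHT = 1.5  # quality of the study itself
PART2_WEIGHT = 1.0  # quality of the reporting


def quality_score(part1_answers, part2_answers):
    """Weighted sum of per-question scores (assumed aggregation rule)."""
    return PART1_WEIGHT * sum(part1_answers) + PART2_WEIGHT * sum(part2_answers)


# Hypothetical answers on a 0..1 scale, one value per question.
score = quality_score([1, 0.5, 1, 1], [1, 1, 0, 1])
print(score)
```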
Quality
- Quality is used in two different ways:
- As a score, to ensure that results are not largely confounded with quality
- As an indicator of a source of difference between studies.
- Quality refers to the study, not to the model used.
- The overall quality is good.
- The factors that varied between papers are the size of the data set, the method used for predictions, and whether sensitivity analyses were performed.
Data Extraction Strategy
- For each paper, a reviewer was nominated at random as data extractor, checker, or adjudicator.
- Extractor: Reads the paper and completes the form
- Checker: Reads the paper and verifies the correctness of the form
- Adjudicator: If the first two disagree, reads the paper and makes the final decision.
Data Extraction Strategy
- Roles were assigned at random with the following restrictions:
- No one should be data extractor on a paper he/she authored.
- All reviewers should have an equal work load (as far as possible).
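A constrained random assignment of this kind might look like the sketch below. It is not the authors' procedure: the balancing rule (pick the eligible reviewer with the fewest extractions so far, ties broken at random), the reviewer and paper labels, and the authorship data are all hypothetical.

```python
import random
from collections import Counter

ROLES = ("extractor", "checker", "adjudicator")


def assign_roles(papers, reviewers, authors_of, seed=0):
    """Randomly assign roles; nobody extracts data from their own paper."""
    rng = random.Random(seed)
    extractions = Counter({reviewer: 0 for reviewer in reviewers})
    assignment = {}
    for paper in papers:
        eligible = [r for r in reviewers if r not in authors_of.get(paper, set())]
        if not eligible:
            raise ValueError(f"no eligible extractor for {paper}")
        # Balance the workload: choose among those with the fewest extractions.
        fewest = min(extractions[r] for r in eligible)
        extractor = rng.choice([r for r in eligible if extractions[r] == fewest])
        extractions[extractor] += 1
        others = [r for r in reviewers if r != extractor]
        rng.shuffle(others)
        assignment[paper] = dict(zip(ROLES, [extractor] + others))
    return assignment


# Hypothetical reviewers, papers and authorship.
reviewers = ["R1", "R2", "R3"]
papers = ["S1", "S2", "S3", "S4", "S5", "S6"]
authors_of = {"S1": {"R1"}, "S2": {"R1", "R2"}}
roles = assign_roles(papers, reviewers, authors_of)
for paper, who in roles.items():
    print(paper, who)
```

Only the extractor role is balanced here; balancing all three roles at once would need a slightly more involved scheme.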
Results: Question 1
What evidence is there that cross-company
estimation models are not significantly different
from within-company estimation models for
predicting effort for software/Web projects?
Results: Question 1
- The studies are organized into 3 groups
- Cross-company models are not significantly different from within-company models. (4 out of 10)
- Cross-company models are significantly worse than within-company models. (All accuracy statistics are better for within-company models) (4 out of 10)
- Studies that did not undertake formal statistical testing are inconclusive (2 of them, S1 and S7)
Results: Question 1
- Four studies state that cross-company models are not significantly different. They use leave-one-out cross-validation, which is biased in favor of within-company models.
- S6 is not independent (it uses S2 data), so it cannot be used as evidence in group 1.
- S1 and S7 did not test statistical significance. They are regarded as inconclusive and cannot be used as evidence either.
Results: Question 2
- What characteristics of the study data sets and the data analysis methods used in the study affect the outcome of within-company and cross-company effort estimation accuracy studies?
- S10 contradicts the view that quality control makes cross-company models as good as within-company models.
- S3 and S1 take a different view on quality control (ESA database): quality control isn't reliable.
- S2 and S6 both agree that stringent quality control was applied to data collection.
- Quality control cannot ensure that cross-company models perform as well as within-company models.
Results: Question 2
- No consistent evidence that the quality of the studies influences the results
- S2 and S3 have lower scores
- S10 has the highest quality score
Results: Question 2
- Number of projects in the within-company models.
- There is a noticeable difference in this number when S2, S3, S10 (median 63) and S4, S5, S8, S9 (median 10) are compared.
- All the studies where within-company predictions were significantly better than cross-company predictions used small within-company data sets of fair quality.
- A similar pattern applies to the range of effort values for the entire database
Results: Question 2
- Number of projects in the within-company models.
- No clear patterns were observed for the size metrics used, nor for the procedure used to build the within-company model
Results: Question 2
- The relationship between within-company and cross-company projects.
- Tukutuku suggests that the greater the difference between projects, the less likely it is that the cross-company model will provide accurate predictions for a single-company project.
- There is no clear indication that the strength of the cross-company relationship is a major factor in determining whether cross-company prediction models are as good as within-company models.
Results: Question 3
- Which experimental procedure is most appropriate for studies comparing within-company and cross-company estimation models?
- There is a large variation in the adopted procedures.
Results: Question 3
- Studies aimed at assessing the conditions that would favor (or not) the use of a cross-company model should adopt the following procedure:
- Use new within-company data sets independent of existing cross-company data sets
- Perform sensitivity analysis using residual analysis for non-regression-based methods and influence analysis for regression-based methods.
- Use regression analysis as the default model construction method.
- Use a stepwise approach on the cross-company data based on variables collected in the within-company data set.
- Apply data transformations appropriate to the specific application
- Perform statistical tests based on the absolute residuals on the raw data scale.
- Report the residuals for each model or the effort.
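To make "absolute residuals on the raw data scale" concrete, the sketch below fits a regression on log-transformed data and then back-transforms the predictions before computing residuals. The log-log model form, the function names and the data are assumptions for illustration; the slide only asks for transformations appropriate to the application.

```python
import math


def fit_log_log(sizes, efforts):
    """Ordinary least squares on ln(effort) = intercept + slope * ln(size)."""
    xs = [math.log(size) for size in sizes]
    ys = [math.log(effort) for effort in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_effort(model, size):
    """Back-transform the prediction from the log scale to the raw scale."""
    intercept, slope = model
    return math.exp(intercept + slope * math.log(size))


def absolute_residuals(model, sizes, efforts):
    """|actual - predicted| computed on the raw effort scale."""
    return [abs(e - predict_effort(model, s)) for s, e in zip(sizes, efforts)]


# Hypothetical projects where effort is exactly twice the size.
sizes = [10, 20, 40, 80]
efforts = [20, 40, 80, 160]
model = fit_log_log(sizes, efforts)
residuals = absolute_residuals(model, sizes, efforts)
print(model, residuals)
```

Residual lists produced this way for a cross-company and a within-company model, on the same hold-out projects, are what a paired statistical test would then compare.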
Results: Question 3
- The authors are unable to provide definitive advice on cross-validation, but they believe that leave-one-out cross-validation is not a sufficiently stringent criterion.
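For readers unfamiliar with the term, leave-one-out cross-validation can be sketched as follows. The simple effort-per-size-unit model and the data are hypothetical stand-ins chosen to keep the example short; they are not the models used in the reviewed studies.

```python
def fit_ratio(sizes, efforts):
    """Toy model: average effort per unit of size in the training data."""
    return sum(efforts) / sum(sizes)


def leave_one_out_abs_residuals(sizes, efforts):
    """Hold out each project in turn, fit on the rest, predict the held-out one."""
    residuals = []
    for i in range(len(sizes)):
        train_sizes = sizes[:i] + sizes[i + 1:]
        train_efforts = efforts[:i] + efforts[i + 1:]
        rate = fit_ratio(train_sizes, train_efforts)
        residuals.append(abs(efforts[i] - rate * sizes[i]))
    return residuals


# Hypothetical within-company data set with three projects.
sizes = [10, 20, 30]
efforts = [20, 40, 90]
loo = leave_one_out_abs_residuals(sizes, efforts)
print(loo)
```

Each held-out prediction comes from a model trained on all but one of the company's own projects, which is the setting the authors consider insufficiently stringent.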
Conclusions
- Some organizations would benefit from models derived from cross-company databases, while others would not.
- The review is not able to conclusively explain the reason for this, but it shows some trends.
Conclusions
- Some trends
- In all cases where within-company models significantly outperformed cross-company models, the data sets are small and the cross-validation method was not very stringent.
- The within-company data is a subset of the cross-company data in all studies that show no significant difference between the two.
- The within-company data sets had been collected separately in half of the studies that show within-company models to be significantly better.
Conclusions
- Authors' advice
- Consider the similarity of the projects in the cross-company data set to your project, and the characteristics of your own company.
- Further research is required.
- To researchers
- Come to a consensus about the appropriate experimental procedure for this type of study. (The authors suggest their own procedure?)
Comments
- No other reviews on the same topic had been conducted previously.
- The review criteria are not well-defined.
- Only 6 of 10 studies give results for Q1.
- No definitive results.
- There is no information about company size for some projects.
- If the projects undertaken in the company are similar to the data set of the cross-company model, it can be used. But deciding this similarity is another problem.
- The authors contributed to 3 of 10 studies.
- The paper cannot get far beyond its starting point.