A Systematic Review of Software Development Cost Estimation Studies

1
A Systematic Review of Software Development Cost
Estimation Studies
  • Authors: Magne Jørgensen and Martin Shepperd
  • Source: IEEE Transactions on Software
    Engineering, Volume 33, Issue No. 1
  • Date: January 2007
  • Presented by: Adriana Ogasawara and Joshua Mahaz

2
Introduction
  • The purpose of this study was to improve software
    estimation research by examining previous cost
    estimation studies
  • In short, the authors examined 304 journal papers
    by hand
  • Classified them according to a set of criteria
    they devised: estimation topic, estimation
    approach, research approach, study context, and
    data set
  • Observed trends in the data
  • Based on these trends, made suggestions on how
    software cost estimation might be improved

3
Introduction
  • According to the authors, this paper
    distinguishes itself from others in that:
  • Its aim is to direct future research, not to
    discuss specific estimation models
  • It is a more comprehensive and systematic review
  • The classification of studies it uses is unique
    to this study

4
Research Questions
5
Inclusion and Identification of Papers
  • Systematic search for papers by hand
  • Issue by issue, starting with Volume 1
  • Read titles and abstracts of all papers published
    in over 100 potentially relevant, peer-reviewed
    journals published in English
  • Ended up with 304 papers from 76 journals
  • Journals were found through:
  • Reading reference lists of cost estimation papers
  • Internet searches
  • Authors' prior experience

6
Classification of Papers
  • To facilitate answering the 8 research questions,
    papers were classified according to the
    following:
  • Research topic (estimation method, calibration of
    models, etc.)
  • Estimation approach (regression, analogy, etc.)
  • Research approach (theory, survey, etc.)
  • Study context (student projects, professional
    projects, etc.)
  • Data sets

7
Classification of Papers
  • Initial classification was performed by one
    author
  • Robustness of the classification was tested by
    the second author
  • Tested on a random sample of 30 papers (about
    10% of the 304)
  • Results of the classification testing showed the
    initial categories were too vague
  • Disagreements on 39% of the classifications
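
The robustness check on this slide amounts to comparing two raters' labels for the same sample and counting how often they differ. A minimal sketch in Python follows; the function name and the example labels are hypothetical and are not the study's data. Only the sample size (30) and total (304) come from the slides.

```python
def disagreement_rate(labels_a, labels_b):
    """Fraction of items on which two raters assigned different labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same items")
    if not labels_a:
        raise ValueError("no labels to compare")
    differing = sum(1 for a, b in zip(labels_a, labels_b) if a != b)
    return differing / len(labels_a)


# Hypothetical labels for five classifications (illustrative only).
first_author = ["regression", "analogy", "expert judgment", "regression", "theory"]
second_author = ["regression", "expert judgment", "expert judgment", "analogy", "theory"]

rate = disagreement_rate(first_author, second_author)

# Sample size over total number of papers, as reported on the slides.
sample_fraction = 30 / 304

print(f"disagreement rate: {rate:.0%}")
print(f"sample fraction:   {sample_fraction:.1%}")
```

A raw disagreement rate does not correct for agreement expected by chance; a chance-corrected statistic would be a natural refinement, but the slides do not say whether the authors used one.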

8
Classification of Papers
  • Most of the disagreements were due to different
    interpretations of the classification categories
  • Only approx. 3% of the papers were blatantly
    misclassified
  • Authors agreed the classification categories were
    accurate enough as long as they:
  • Clarified the descriptions of the vague
    categories (12 total)
  • Reclassified the papers that fell into the vague
    categories
  • Out of the 109 papers that were reread, only 21
    were reclassified

9
Results
10
Research Questions
  • RQ1: Which and how many journals include papers
    on software cost estimation?
  • Method: Determine which journals are the 10 most
    relevant by the proportion of cost estimation
    papers they contain
  • Support cost estimation researchers with a list
    of journals with potentially relevant papers.
11
RQ1 (cont'd)
  • Found 76 journals with software cost estimation
    papers
  • The top 10 journals still included only 2/3 of
    all the papers identified in this study
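
A coverage figure like the 2/3 above can be computed by ranking journals by paper count and summing the top k. The sketch below assumes ranking by raw count, and the journal names and counts are hypothetical placeholders, not the study's numbers.

```python
def top_k_share(papers_per_journal, k):
    """Share of all papers that appear in the k journals with the most papers."""
    counts = sorted(papers_per_journal.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no papers counted")
    return sum(counts[:k]) / total


# Hypothetical paper counts per journal (illustrative only).
hypothetical_counts = {
    "Journal A": 6,
    "Journal B": 3,
    "Journal C": 2,
    "Journal D": 1,
}

share = top_k_share(hypothetical_counts, k=2)
print(f"top-2 journals cover {share:.0%} of the papers")
```

The remaining share is the long tail of journals that a search limited to the top k would miss.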

12
Research Question
  • RQ2: To what extent are researchers aware of the
    breadth of potential estimation study sources?
  • Method: Reference lists from 30 randomly selected
    cost estimation journal papers were analyzed.
  • Identify possible shortcomings of cost estimation
    researchers' searches for related work.

13
RQ2 (cont'd)
On average, a typical paper references only three
of the top 10 most important software cost
estimation journals, and 7 of the 10 were
referenced in 3% or less of the papers.
14
RQ2 (cont'd)
15
RQ2 (cont'd)
Arrows define information flow
16
RQ2 (cont'd)
Journals outside the software engineering field
were practically ignored, even though they contain
highly relevant cost estimation data.
International Journal of Forecasting
International Journal of Project Management
Statistics
17
Research Question
  • RQ3: Which journal is the dominant software cost
    estimation journal? To what extent does this
    journal have research topic biases?
  • Method: Identify which journal contains:
  • A) the most cost estimation papers
  • B) the most references
  • Dominant journals have the potential to
    introduce publication biases, wherein a
    researcher's focus may be directed towards
    topics favored by the journal

18
RQ3 (cont'd)
19
RQ3 (cont'd)
  • The distribution of topics within IEEE TSE was
    compared with that of the total set of
    estimation papers
  • At a high level, IEEE TSE cost estimation
    papers reflect the total set of estimation papers
    quite well
  • However, the authors have no information on
    papers rejected by IEEE TSE, so publication
    biases that aren't readily apparent might still
    exist

20
Research Question
  • RQ4: How easy is it to identify relevant software
    cost estimation journal papers (using digital
    libraries)?
  • Method: Identify the recall rate of cost
    estimation papers in Google Scholar and Inspec
    using the search terms:
  • software cost estimation OR software effort
    estimation
  • software AND (cost OR effort)
  • A manual issue-by-issue search of papers is
    accurate, but very time-consuming and should be
    replaced with an automated tool.
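
Recall here means the fraction of known relevant papers that a query actually retrieves. The sketch below applies the second search term, software AND (cost OR effort), to paper titles by simple substring matching. That matching rule is an assumption for illustration and is not how Google Scholar or Inspec evaluate queries; the titles are hypothetical.

```python
def matches_query(title, required, any_of):
    """True if the title contains every `required` term and at least one `any_of` term."""
    text = title.lower()
    return all(term in text for term in required) and any(term in text for term in any_of)


def recall(relevant_titles, required, any_of):
    """Fraction of the known relevant titles that the query retrieves."""
    if not relevant_titles:
        raise ValueError("need at least one relevant title")
    found = [t for t in relevant_titles if matches_query(t, required, any_of)]
    return len(found) / len(relevant_titles)


# Hypothetical titles of papers already known to be relevant (illustrative only).
relevant = [
    "Software cost estimation with regression models",
    "Predicting software development effort by analogy",
    "Sizing and scheduling of system projects",
    "Effort prediction for web applications",
]

# software AND (cost OR effort)
query_recall = recall(relevant, required=["software"], any_of=["cost", "effort"])
print(f"recall: {query_recall:.0%}")
```

The last two hypothetical titles are missed because they say "system" or "web applications" rather than "software", which is the kind of synonym problem that lowers recall in keyword searches.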

21
RQ4 (cont'd)
22
RQ4 (cont'd)
  • However, the most typical reason for missing
    papers was that they used more specific terms,
    or synonyms, in place of "estimation" and
    "software"
  • Need for standardized use of search keywords
  • Authors suggest that a sufficiently wide search
    for cost estimation papers in digital libraries
    can result in a greater workload than a manual
    search
  • software AND (cost OR effort) resulted in
    278,000 records in Google Scholar

23
The Bigger Picture of RQs 1-4
  • Researchers need to increase the breadth of their
    search for relevant studies
  • Not sufficient to conduct searches in digital
    libraries or manual searches of the most
    important journals
  • Where completeness is essential to research,
    manually search for papers in a selected set of
    journals
  • Where completeness is not essential, combine
    manual searches and digital libraries

24
Research Questions
  • RQ5: How many researchers are there who have a
    long-term interest in software cost estimation?
    To what extent do the interests of these
    researchers affect the distribution of research
    topics?
  • Method: Gather data on the authors of the
    different journal papers, including papers
    published, recent activity, and topics covered.
  • Assess the vulnerability of software cost
    estimation research.
25
RQ5 (cont'd)
  • Only 13 researchers with more than 5 journal
    papers published.
  • 9 out of 13 are still active, with publications
    between 2000 and 2004.
  • The ratio of research topics/estimation
    approaches to active researchers is high.
  • The active researchers are generally covering a
    wide spectrum of topics.

26
RQ5 (cont'd)
  • With so few researchers with a long-term focus,
    topics requiring a wide breadth of experience
    are at risk:
  • Measures of estimation performance
  • Data set properties
  • Both require long-term experience but are also
    essential for creating methods for meaningful
    analysis and evaluation of cost estimation
    techniques.

27
Research Question
  • RQ6: What are the most investigated software cost
    estimation research topics and how has this
    changed over time?
  • Method: Separate papers into those published in
    1989 and earlier, 1990-1999, and 2000-2004, and
    sort them by research topic (e.g., estimation
    method, organizational issues, measures of
    performance)
  • Identify trends in papers and any shortcomings in
    research topic focus
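
The method on this slide is a two-level tally: assign each paper to a publication period, then count topics within each period. A minimal sketch, using the three periods named above; the function names and the (year, topic) records are hypothetical, not the study's data.

```python
from collections import Counter


def period_of(year):
    """Map a publication year to one of the three periods used in the review."""
    if year <= 1989:
        return "-1989"
    if year <= 1999:
        return "1990-1999"
    if year <= 2004:
        return "2000-2004"
    raise ValueError(f"year {year} is outside the reviewed range")


def topic_counts_by_period(papers):
    """Count research topics per period from (year, topic) pairs."""
    counts = {}
    for year, topic in papers:
        counts.setdefault(period_of(year), Counter())[topic] += 1
    return counts


# Hypothetical (year, topic) records (illustrative only).
papers = [
    (1985, "estimation method"),
    (1995, "estimation method"),
    (1997, "size measures"),
    (2001, "organizational issues"),
    (2003, "estimation method"),
]

counts = topic_counts_by_period(papers)
for period in ("-1989", "1990-1999", "2000-2004"):
    print(period, dict(counts.get(period, {})))
```

Comparing the per-period counters (or their proportions) is one way to see which topics gain or lose attention over time.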

28
RQ6 (cont'd)
29
RQ6 (cont'd)
30
Research Question
  • RQ7: What are the most investigated estimation
    methods and how has this changed over time?
  • Method: Separate papers into those published in
    1989 and earlier, 1990-1999, and 2000-2004, and
    sort them by estimation approach (e.g.,
    regression, analogy, expert judgement)
  • Identify trends in papers and any shortcomings in
    research topic focus

31
RQ7 (cont'd)
32
RQ7 (cont'd)
33
Research Question
  • RQ8: What are the most frequently applied research
    methods, and in what study context? How has this
    changed over time?
  • Method: Examine papers that proposed a new
    estimation method or evaluated an existing
    approach.
  • Identify trends and shortcomings in research
    topic focus.

34
RQ8 (cont'd)
35
RQ8 (cont'd)
  • Historical data does not contain the same
    realism as industry data
  • Evaluation of historical data depends on the
    availability of the data since not all companies
    keep this information
  • The lack of inclusion of conference papers from
    professionals and professional projects is an
    important shortcoming in cost estimation research

36
Pros Of This Study
  • The eight research questions, used to root out
    the key underlying trends in software development
    cost estimation, are the greatest strength of
    this paper
  • Uploading all the papers reviewed for the study
    into a freely available database provides a
    dense source of relevant information
  • The authors' cost estimation database can help
    younger researchers leapfrog into more complex
    topics sooner

37
Cons Of This Study
  • The extent of the material excluded from the
    study was the paper's resounding technical
    weakness
  • The final system of classification was said to
    lead to sufficiently high accuracy, but only one
    author's results from reclassification were
    provided, leaving one to wonder what the
    percentage change was between the first and
    second review
  • Exclusion of papers published by industry
    conferences, which would include past
    experiences, results, and real-life data from the
    software industry itself
  • Papers from other fields focusing on cost
    estimation were not used, which would have
    provided a great opportunity to apply analogical
    reasoning to the topic

38
Thoughts
  • We think software cost estimation research would
    be improved more by striving for greater
    collaboration with the software industry
  • Researchers need to move away from the more
    redundant research elements and into
    less-studied topics such as those in the software
    industry
  • The authors themselves mention the idea numerous
    times, but their own lack of conformance hurts
    their argument
  • Collaboration with the software industry would
    provide real-world data sets tenfold more
    relevant and accurate than the historical data
    sets that are so commonly referenced in papers on
    the subject matter