Title: Using Machine Learning to Predict Project Effort: Empirical Case Studies in Data-starved Domains
1. Using Machine Learning to Predict Project Effort
Empirical Case Studies in Data-starved Domains
- Gary D. Boetticher
- Department of Software Engineering
- University of Houston - Clear Lake
2. What Customers Want
3. What Requirements Tell Us
4. Standish Group [Standish94]
- Exceeded planned budget by 90%
- Exceeded planned schedule by 222%
- More than 50% of the projects had less than 50% of the requirements
5. Underlying Problems
- 85% of organizations are at CMM level 1 or 2 [CMU CMM95, Curtis93]
- Scarcity of data
6. Consequences
- Early life-cycle estimates use a factor of 4 [Boehm81, Heemstra92]
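
One reading of the factor of 4, following the early-phase estimate range usually attributed to Boehm, is that the actual effort may fall anywhere between one quarter and four times the point estimate. The sketch below only illustrates that arithmetic; the function name and the 100 person-day input are hypothetical and do not come from the slides.

```python
def early_estimate_range(point_estimate, factor=4.0):
    """Return (low, high) bounds around an early life-cycle point estimate.

    Assumption: the uncertainty is symmetric on a multiplicative scale,
    i.e. the actual value lies between estimate / factor and estimate * factor.
    """
    if point_estimate <= 0:
        raise ValueError("point_estimate must be positive")
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return point_estimate / factor, point_estimate * factor


# Hypothetical example: a 100 person-day estimate made early in the life cycle.
low, high = early_estimate_range(100.0)
print(f"Plausible range: {low} to {high} person-days")
```

Under this reading, the gap between the optimistic and pessimistic bounds is a factor of 16, which is the motivation for seeking better early estimates.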
7. Related Research: Economic Models
8. Why are Machine Learning algorithms not used more often for estimating early in the life cycle?
9. Related Research - 2
10. Goal
- Apply Machine Learning (Neural Network)
- early in the software lifecycle
- against Empirical Data
11. Neural Network
12. Data
- B2B Electronic Commerce Data
  - Delphi-based
  - 104 Vectors
- Fleet Management Software
  - Delphi-based
  - 433 Vectors
13. Experiment 1: Product-Based, Fleet to B2B
14. Experiment 1: Product Results
15. Experiment 2: Project-Based Results, Fleet to B2B
16. Experiment 3: Product-Based, B2B to Fleet
17. Extrapolation issue
- Largest SLOCs divided by each other
- 4398 / 2796 ≈ 1.57
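
The ratio above can be reproduced with a few lines of Python. The slide does not say which data set each SLOC value belongs to, and the rescaling helper is only an assumption about how such a ratio might be used to bring larger inputs into the range a model was trained on; both function names are hypothetical.

```python
def extrapolation_ratio(max_sloc_a, max_sloc_b):
    """Ratio of the larger maximum SLOC to the smaller one."""
    larger = max(max_sloc_a, max_sloc_b)
    smaller = min(max_sloc_a, max_sloc_b)
    if smaller <= 0:
        raise ValueError("SLOC values must be positive")
    return larger / smaller


def scale_into_range(sloc_values, ratio):
    """Assumed scaling step: shrink SLOC values by the ratio so the largest
    value no longer exceeds the smaller data set's maximum."""
    return [value / ratio for value in sloc_values]


ratio = extrapolation_ratio(4398, 2796)
print(round(ratio, 2))  # 1.57

# Illustrative inputs only; the first value is the larger maximum from the slide.
print(scale_into_range([4398, 2199], ratio))
```

A ratio well above 1 means that a model trained on the smaller range has to predict for inputs it has never seen, which is where neural networks tend to be least reliable.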
18. Experiment 3: Product Results
19. Experiment 4: Project-Based Results, B2B to Fleet
20. Results
21. Conclusions
- Bottom-up approach produced very good results on a project basis
- Results comparable between neural network and statistical approaches
- Scaling helped
- Estimation Approach is suitable for
Prototype/Iterative Development
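
The bottom-up conclusion can be illustrated with a small sketch: effort is predicted per product (component), the predictions are summed, and the total is compared with the actual project effort. The numbers below are invented purely for illustration and are not taken from the experiments; the function names are hypothetical.

```python
def relative_error(predicted, actual):
    """Absolute relative error of one prediction."""
    return abs(predicted - actual) / actual


def bottom_up_errors(product_predictions, product_actuals):
    """Return (per-product errors, project-level error).

    The project-level error compares the sum of the product-level
    predictions with the sum of the product-level actuals.
    """
    if len(product_predictions) != len(product_actuals):
        raise ValueError("prediction and actual lists must have equal length")
    per_product = [
        relative_error(p, a)
        for p, a in zip(product_predictions, product_actuals)
    ]
    project = relative_error(sum(product_predictions), sum(product_actuals))
    return per_product, project


# Invented effort values (e.g. hours) for four products in one project.
predicted = [12.0, 30.0, 8.0, 20.0]
actual = [10.0, 36.0, 10.0, 16.0]

per_product, project = bottom_up_errors(predicted, actual)
print([round(e, 3) for e in per_product])
print(round(project, 3))
```

In this made-up case the over- and under-estimates partly cancel when summed, so the project-level error is smaller than any single product-level error. That cancellation is one plausible reason a bottom-up estimate can look better at the project level than at the product level.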
22. Future Directions
- Explore an extrapolation function
- Apply other ML algorithms
- Collect additional metrics
- Integrate with COCOMO II
- Conduct more experiments (additional data)