Title: SoftLab, Boğaziçi University Department of Computer Engineering, Software Engineering Research Lab
1SoftLab
Boğaziçi University Department of Computer Engineering
Software Engineering Research Lab
http://softlab.boun.edu.tr/
2Research Challenges
- The trend towards large, heterogeneous, distributed software systems leads to an increase in system complexity
- Software and service productivity lags behind requirements
- Increased complexity takes software developers further from stakeholders
- The importance of interoperability, standardisation and reuse of software is increasing
3Research Challenges
- Service Engineering
- Complex Software Systems
- Open Source Software
- Software Engineering Research
4Software Engineering Research Approaches
- Balancing theory and praxis
- How engineering research differs from scientific research
- The role of empirical studies
- Models for SE research
5The need to link research with practice
Colin Potts, "Software Engineering Research Revisited", IEEE Software, September 1993
- Why, after 25 years of SE, has SE research failed to influence industrial practice and the quality of the resulting software?
- Potts argues that this failure is caused by treating research and its application by industry as separate, sequential activities.
- He calls this the research-then-transfer approach. The solution he proposes is the industry-as-laboratory approach.
6Industry-as-Laboratory Approach
- Stronger connection at the start, because knowledge of the problem is acquired from real practitioners in industry, often industrial partners in a research consortium.
- Connection is strengthened by practitioners and researchers constantly interacting to develop the solution
- Early evaluation and usage by industry lessens the Technology Transfer Gap.
- Reliance on Empirical Research
- Shift from solution-driven SE to problem-focused SE
- Solve problems that really do matter to practitioners
7Industry-as-Laboratory emphasizes Real Case Studies
- Advantages of case studies over studying problems in a research lab:
- Scale and complexity: small, simple (even simplistic) cases are avoided; these often bear little relation to real problems.
- Unpredictability: assumptions are thrown out as researchers learn more about real problems
- Dynamism: a real case study is more vital than a textbook account
- The real-world complications of industrial case studies are more likely to throw up representative problems and phenomena than research laboratory examples influenced by the researchers' preconceptions.
8Need to consider Human/Social Context in SE research
- Not all solutions in software engineering are solely technical.
- There is a need to examine organizational, social and cognitive factors systematically as well.
- Many problems are people problems, and require people-orientated solutions.
9Theoretical SE research
- While there is still a place for innovative, purely speculative research in Software Engineering, research which studies real problems in partnership with industry needs to be given a higher profile.
- These various forms of research ideally complement one another.
- Neither is particularly successful if it ignores the other.
- Research that is too industrially focused may lack adequate theory!
- Academically focused research may miss the practice!
10Software Engineering Research Approaches
- The Industry-as-Laboratory approach links theory and praxis
- Engineering research aims to improve existing processes and/or products
- Empirical studies are needed to validate Software Engineering research
- Models for SE research need to shift from the analytic to the empirical.
11Empirical SE Research
13Real life Problems
14Do parts connect?
16Research Questions
17Research Questions
18Research Questions
19Research Questions
21Our Research Question
- Software development lifecycle
- Requirements
- Design
- Development
- Test (takes 50% of overall time)
- Detect and correct defects before delivering software.
- Test strategies
- Expert judgment
- Manual code reviews
- Oracles/ Predictors as secondary tools
22In Practice
- Product quality
- Lower defect rates
- Less costly testing times
- Low maintenance cost
- Process quality
- Effort and cost estimation
- Process improvement
23Research Question
- How much test is enough?
- When to stop testing?
24Problem
- Decision making under uncertainty
25Solution
- CS claims it can be solved
- Artificial Intelligence
26SE Research
- Intersection of AI and Software Engineering
- An opportunity to use some of the most interesting computational techniques to solve some of the most important and rewarding questions
27AI Fields, Methods and Techniques
28What Can We Learn From Each Other?
29Software Development Reference Model
Intersection of AI and SE Research
Empirical Software Engineering
30Intersection of AI and SE Research
- Build Oracles to predict
- Defects
- Cost and effort
- Refactoring
- Measure
- Static code attributes
- Complexity and call graph structure
- Data collection
- Open repositories (NASA, Promise)
- Open source
- Softlab Data Repository (SDR)
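The idea of an oracle built from static code attributes can be illustrated with a minimal sketch. The slides do not name a learner, so a Gaussian Naive Bayes classifier is assumed here purely for illustration; the two attributes (lines of code, cyclomatic complexity) and the six training modules are hypothetical toy values, not data from NASA, PROMISE or SDR.

```python
import math

def fit_gaussian_nb(rows, labels):
    """Estimate a prior, per-attribute mean and variance for each class."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9  # small floor avoids division by zero
            for col, m in zip(columns, means)
        ]
        model[cls] = (n / len(rows), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one module."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)

# Hypothetical toy data: (lines of code, cyclomatic complexity) per module.
modules = [(20, 2), (35, 3), (40, 4), (300, 25), (260, 30), (410, 22)]
defective = [0, 0, 0, 1, 1, 1]  # 1 = a defect was reported for the module

oracle = fit_gaussian_nb(modules, defective)
print(predict(oracle, (30, 3)))    # small, simple module
print(predict(oracle, (350, 28)))  # large, complex module
```

In practice such an oracle would be trained on measured attributes from a real repository and used as a secondary tool next to expert judgment, not as a replacement for it.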
31Software Engineering Domain
- Classical ML applications
- Data miner performance
- The more data the better the performance
- Little or no meaning behind the numbers, no interesting stories to tell
32Software Engineering Domain
- Algorithm performance
- Understanding Data
- Change training data: over-/under-/micro-sampling
- Noise analysis
- Increase information content of data
- Feature analysis/ weighting
- Learn what you will predict later
- Cross company vs within company data
- Domain Knowledge
- SE
- ML
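One of the ways to change the training data listed above, under-sampling, might look like the sketch below. It assumes binary labels (1 = defective, 0 = clean) and simply discards randomly chosen majority-class rows until both classes are the same size; the function name, the seed parameter and the toy rows are hypothetical, and over-sampling and micro-sampling are not covered.

```python
import random

def under_sample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    defective = [(r, y) for r, y in zip(rows, labels) if y == 1]
    clean = [(r, y) for r, y in zip(rows, labels) if y == 0]
    minority, majority = sorted([defective, clean], key=len)
    kept = rng.sample(majority, len(minority))  # sample without replacement
    balanced = minority + kept
    rng.shuffle(balanced)
    return [r for r, _ in balanced], [y for _, y in balanced]

# Hypothetical toy data: 8 clean modules and 2 defective ones.
rows = [(10 * i, i) for i in range(1, 11)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

bal_rows, bal_labels = under_sample(rows, labels)
print(len(bal_rows), sorted(bal_labels))
```

Defect data sets tend to contain far more clean modules than defective ones, which is why rebalancing the training set is worth trying before blaming the learner.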
33Software Engineering Research
- Predictive Models
- Defect prediction and cost estimation
- Bioinformatics
- Process Models
- Quality Standards
- Measurement
34Major Research Areas
- Software Measurement
- Defect Prediction/ Estimation
- Effort and Cost Estimation
- Process Improvement (CMM)
35A Testing Workbench
36Static Code Attributes
void main()
{
    //This is a sample code
    //Declare variables
    int a, b, c;
    // Initialize variables
    a = 2;
    b = 5;
    //Find the sum and display c if greater than zero
    c = sum(a, b);
    if (c < 0)
        printf("%d\n", a);
    return;
}
LOC: Lines of Code
LOCC: Lines of Commented Code
V: Number of unique operands and operators
CC: Cyclomatic Complexity
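How such attributes might be collected can be sketched with a few lines of Python applied to a cleaned-up copy of the sample function on this slide. The counting rules are simplifying assumptions, not the rules of any particular tool: LOC counts every non-blank line, LOCC counts lines that start with //, and CC is approximated as one plus the number of decision keywords found outside comments.

```python
import re

# Cleaned-up copy of the sample function from the slide.
SAMPLE = '''void main()
{
    //This is a sample code
    //Declare variables
    int a, b, c;
    // Initialize variables
    a = 2;
    b = 5;
    //Find the sum and display c if greater than zero
    c = sum(a, b);
    if (c < 0)
        printf("%d\\n", a);
    return;
}
'''

# Assumption: each of these keywords adds one decision point.
DECISION_KEYWORDS = re.compile(r"\b(if|for|while|case)\b")

def static_attributes(source):
    """Count non-blank lines, comment lines and an approximate cyclomatic complexity."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    comment_lines = [ln for ln in lines if ln.startswith("//")]
    code_lines = [ln for ln in lines if not ln.startswith("//")]
    decisions = sum(len(DECISION_KEYWORDS.findall(ln)) for ln in code_lines)
    return {"loc": len(lines), "locc": len(comment_lines), "cc": decisions + 1}

print(static_attributes(SAMPLE))  # {'loc': 14, 'locc': 4, 'cc': 2}
```

A real metric collector works on a parsed representation of the code instead of raw text, so it also handles block comments, string literals and operators such as && correctly.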
37Prest
- A tool developed by Softlab
- Parser
- C, Java, C++, JSP
- Metric Collection
- Data Analysis
38Data Sources
- Public Datasets
- NASA (IV&V Facility, Metrics Program)
- PROMISE (Software Engineering Repository)
- Includes Softlab data now
- Open Source Projects (Sourceforge, Linux, etc.)
- Internet based small datasets
- University of Southern California (USC) Dataset
- Desharnais Dataset
- ISBSG Dataset
- NASA COCOMO and NASA 93 Datasets
- Softlab Data Repository (SDR)
- Local industry collaboration
- Total 20 companies, 25 projects over 5 years
39Tangible Benefit
40Matching reqs with defects (Requirements Analysis)
Call Graph / Refactoring (Design)
Test driven development (Coding)
Defect prediction (Test)
Refactoring (Maintenance)
41Emerging Research Topics
- Adding organizational factors to local prediction model
- Information about the development team, experience, coding practices, etc.
- Adding file metrics from version history
- Modified/added/deleted lines of code
- Selecting only modified files from each version in the prediction model
- Confidence Factor
- Using time factors
- Dynamic prediction: constructing a model
- for each application in a version
- for each module/package in an application
- for each developer by learning from his/her coding habits
- TDD
- Measuring test coverage
- Defect proneness
- Company wide implementation process
- Embedded systems
- Cost/ Effort Estimation
- Dynamic estimation per process
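The file metrics from version history mentioned above (modified/added/deleted lines) could be gathered along the lines of this sketch. It assumes the history lives in a Git repository and that the input is the tab-separated text printed by git log --numstat; the slides do not say which version control system was used, and the file names in the excerpt are hypothetical.

```python
from collections import defaultdict

# Hypothetical excerpt in the "added<TAB>deleted<TAB>path" format of `git log --numstat`.
NUMSTAT = (
    "12\t3\tsrc/parser.c\n"
    "0\t7\tsrc/lexer.c\n"
    "\n"
    "5\t5\tsrc/parser.c\n"
    "-\t-\tdocs/logo.png\n"
)

def churn_per_file(numstat_text):
    """Sum added and deleted lines per file and count how often each file changed."""
    churn = defaultdict(lambda: {"added": 0, "deleted": 0, "changes": 0})
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separators or commit header lines
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files carry no line counts
        churn[path]["added"] += int(added)
        churn[path]["deleted"] += int(deleted)
        churn[path]["changes"] += 1
    return dict(churn)

history = churn_per_file(NUMSTAT)
print(history["src/parser.c"])  # {'added': 17, 'deleted': 8, 'changes': 2}
```

Files that never appear in the result were not modified in the examined history, which gives a simple way to select only modified files for the prediction model.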