Visualization and Analysis of Open Source Software Evolution using An Evolution Curve Method

1
Visualization and Analysis of Open Source
Software Evolution using An Evolution Curve Method
  • Dr. Robertas Damaševičius
  • Software Engineering Department,
  • Kaunas University of Technology
  • Studentu 50-415, Kaunas, Lithuania
  • Email: robertas.damasevicius_at_ktu.lt
  • http://soften.ktu.lt/damarobe

2
Context and Problem
  • Software systems are
  • designed, constructed and used by people
  • components in larger socio-technical systems
  • Software design is
  • a social process embedded within organizational
    and cultural structures
  • influenced by social processes such as programmer
    collaboration in teams
  • Open source software systems
  • Free to use
  • Free availability of source code
  • Developed by many programmers
  • Continuously evolve
  • Aim: analysis of open source software evolution
    using metrics

3
What is software evolution?
  • Definition
  • a continuing process in time during which some
    essential software properties are changed
  • Activities
  • modification, adaptation, maintenance, and
  • other activities which occur after the delivery
    of the first operational release to the users
  • Importance
  • costs devoted to system maintenance and evolution
    account for more than 90% of total software costs
    (Erlikh, 1990)

4
Forces and factors of open source software
evolution
  • Evolution of open source systems
  • less strict control and management model
  • usually started by a single developer (seed)
  • attracted users become co-developers
  • governed by the needs of users and spontaneous
    collaboration of co-developers
  • Evolution mechanisms
  • natural selection, competition
  • variation-increasing and variation-decreasing
  • influenced by psychological, intellectual, social
    and cultural, economic and business factors

5
Software metrics
  • Common
  • Source lines of code
  • Cyclomatic complexity
  • Halstead metrics
  • Number of classes and interfaces
  • R.C. Martin's software package metrics
  • Cohesion, Coupling
  • Specific software evolution metrics
  • SDI metric
  • Lmetric
  • AICC metric
  • G-metric
  • Software development models
  • Statistical models
  • Rayleigh model
  • Halstead's Software Science model
  • COCOMO model

6
Lehman's Laws of Software Evolution
  • Formulated by M.M. Lehman in the 1980s
  • Law of Continuing Change
  • Law of Increasing Complexity
  • Law of Statistically Smooth Growth
  • Law of Organisational Stability
  • Law of Conservation of Familiarity
  • Law of Continuing Growth
  • Law of Declining Quality
  • Law of Feedback System
  • Evolution forces
  • Growth
  • Maintenance

7
Transition-based model of evolution
  • Stages: many, often overlapping
  • Transitions: breakpoints between stages, which
    represent significant changes. Transitions occur
    because as a system evolves, its structure must
    be regularly adapted to the changing requirements
    and environment
  • Gradual change: a slow process of incremental
    change caused by accumulating maintenance steps
    or gradual decay
  • Sudden change: significant changes in the
    evolving system or in the process by which it is
    evolved

8
Information-theoretic methods
  • Shannon entropy
  • A measure of the uncertainty associated with a
    random variable.
  • If an information source generates a series of
    symbols x_i belonging to an alphabet of size N
    according to a known probability distribution
    p(x_i), the entropy function H of a sequence X can
    be defined as
    H(X) = -\sum_{i=1}^{N} p(x_i) \log_2 p(x_i)
  • High entropy: higher complexity of the system's
    code
  • Low entropy: there are some repeated patterns of
    source code; code maintenance is required
  • Kolmogorov Complexity
  • Measures the complexity (i.e., information
    content) of an object by the length of the
    smallest program that generates it.
  • Kolmogorov Complexity K_f(x) of an object x in the
    description system f is the length of the
    shortest program capable of producing x
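
The two measures on this slide can be estimated for a source file roughly as follows. This is a minimal sketch, not the presenter's tooling: it assumes each byte is one symbol of the alphabet, and it uses the zlib-compressed length as a computable stand-in for Kolmogorov Complexity, which cannot be computed exactly. The function names are illustrative.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per symbol, treating each byte as one symbol."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compressed_length(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input.

    ASSUMPTION: used here only as a practical proxy for Kolmogorov
    Complexity; the slides do not say how the complexity was estimated.
    """
    return len(zlib.compress(data, 9))


# Hypothetical input; in practice this would be the concatenated
# source code of one version of the analyzed system.
source = b"int main(void) { return 0; }\n" * 10
print(shannon_entropy(source))
print(compressed_length(source))
```

Highly repetitive input compresses well, so its compressed length stays far below its raw length, which is the behaviour the proxy relies on.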

9
Evolution curve method (1)
  • Motivation: the addition of new features to a
    software system leads to the change of basic
    software characteristics (complexity/entropy) in
    the system.
  • Idea: use the change of software size and
    complexity as a means to determine different
    stages of evolution of a software system
  • Inspiration: Z-curve [1] and DNA walk [2] methods
    used in analyzing complex genetic sequences

[1] R. Zhang, C.T. Zhang. Z Curves, an Intuitive Tool for Visualizing and Analyzing DNA sequences. J. Biomol. Struct. Dynamics 11, 767-782, 1994.
[2] S. Paxia, A. Rudra, Y. Zhou, B. Mishra. A Random Walk down the Genomes: DNA Evolution in VALIS. IEEE Computer 35(7):73-79, 2002.

10
Evolution curve method (2)
  • E-curve is composed of a series of nodes whose
    coordinates are x_i and y_i (i = 1, 2, ..., N),
    where N is the number of versions of the analyzed
    software system.
  • The nodes are connected sequentially with
    straight segments.
  • The coordinates x_i and y_i are calculated
    iteratively from K_i and H_i, where
  • K_i is the Kolmogorov Complexity of the i-th
    version of a software system
  • H_i is the Shannon entropy of the i-th version
    of a system
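
The iterative construction of the nodes can be illustrated with a short sketch. The update rule itself is not given in this transcript, so the code below assumes one plausible choice: each coordinate accumulates the relative change of the per-version Kolmogorov Complexity (k) and Shannon entropy (h) values, starting from the origin. The function name and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def evolution_curve(k: Sequence[float], h: Sequence[float]) -> List[Tuple[float, float]]:
    """Build E-curve nodes from per-version complexity (k) and entropy (h).

    ASSUMPTION: each coordinate is the running sum of relative changes
    between consecutive versions; the first node is placed at (0, 0).
    Values in k and h are expected to be non-zero.
    """
    if len(k) != len(h):
        raise ValueError("k and h must hold one value per version")
    nodes: List[Tuple[float, float]] = []
    x = y = 0.0
    for i in range(len(k)):
        if i > 0:
            x += (k[i] - k[i - 1]) / k[i - 1]  # relative information content
            y += (h[i] - h[i - 1]) / h[i - 1]  # relative complexity
        nodes.append((x, y))
    return nodes


# Hypothetical values for four versions (not taken from the case studies).
k_values = [100.0, 110.0, 121.0, 121.0]
h_values = [4.0, 4.2, 4.2, 4.0]
for node in evolution_curve(k_values, h_values):
    print(node)
```

Connecting consecutive nodes with straight segments gives the curve; any plotting library can be used for that step.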

11
Evolution curve method (3)
  • Two dimensions of the Evolution curve
  • x (relative information content) and
  • y (relative complexity),
  • Represent two independent (orthogonal)
    characteristics of a software system
  • x-dimension amount of information contained in a
    software system and is an estimation of software
    size
  • y-dimension information entropy of a software
    system and is an estimation of software
    complexity.

12
Software evolution stages
  • Software Growth: system is actively developed
  • Software Maintenance: system becomes simpler,
    often at a cost of its size
  • Software Improvement: system becomes more complex
    and generic
  • Software Shrink: functionality of a system is
    reduced
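
One way to read the four stages off a curve is to look at the direction of each segment. The sketch below is an interpretation, not the presenter's procedure: it assumes dx is the change in size and dy the change in complexity between two consecutive versions, and it maps sign combinations to stages. The handling of zero changes and all names are assumptions.

```python
from typing import List, Sequence, Tuple


def classify_stage(dx: float, dy: float) -> str:
    """Label one segment by the signs of its size (dx) and complexity (dy) change.

    ASSUMED mapping:
      size up,        complexity up   -> growth
      size up/flat,   complexity down -> maintenance
      size down/flat, complexity up   -> improvement
      size down,      complexity down -> shrink
    """
    if dx == 0 and dy == 0:
        return "no change"
    if dy > 0:
        return "growth" if dx > 0 else "improvement"
    if dy < 0:
        return "shrink" if dx < 0 else "maintenance"
    # dy == 0: only the size changed
    return "growth" if dx > 0 else "shrink"


def stages_along(nodes: Sequence[Tuple[float, float]]) -> List[str]:
    """Classify every segment between consecutive (x, y) nodes."""
    return [
        classify_stage(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(nodes, nodes[1:])
    ]


# Hypothetical curve with one segment of each kind.
curve = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5), (1.5, 1.0), (1.0, 0.5)]
print(stages_along(curve))
```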

13
Trends of Evolution curve
  • Actively developed systems: long upward trends of
    growth
  • Mature, stable systems: long downward trends of
    maintenance

14
Case studies
  • Source: SourceForge
  • 7-zip
  • Archiver
  • 82 versions, 5 years, 160K LOC
  • Grip
  • CD player/ripper
  • 36 versions, 14K LOC
  • eMule
  • P2P file sharing client

15
Case study: eMule
  • eMule
  • one of the biggest P2P file sharing clients
  • coded in Microsoft Visual C++ using MFC
  • Free software, released under the GNU GPL
  • Source code first released at version 0.02 on
    July 6, 2002
  • Latest release contains 222,680 lines of code
  • Actively developed by 5 developers
  • Current development status is Production/Stable
  • For analysis, 68 versions of eMule source code
    were used

16
eMule Entropy
[Figure: entropy of eMule by version; labelled versions: 015a, 030a, 018a]

17
eMule Size
y = A + Bx + Cx^2, A = 7676.17, B = 4324.67, C = 177.488, r = 0.9935

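
A quadratic trend of this form can be fitted by ordinary least squares. The following is a minimal sketch under stated assumptions: it solves the 3x3 normal equations directly in plain Python, the data points are made up for illustration, and nothing here reproduces the eMule measurements or the tool used for the slide.

```python
import math
from typing import Sequence, Tuple


def fit_quadratic(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = A + B*x + C*x**2; returns (A, B, C).

    Needs at least three distinct x values, otherwise the system is singular.
    """
    # Power sums that make up the normal equations.
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    m = [[s[r + c] for c in range(3)] + [t[r]] for r in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def correlation(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = math.sqrt(sum((p - mu) ** 2 for p in u))
    sv = math.sqrt(sum((q - mv) ** 2 for q in v))
    return cov / (su * sv)


# Hypothetical data: version index vs. a size measure.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [12.0, 19.5, 31.0, 44.0, 62.5, 83.0]
A, B, C = fit_quadratic(xs, ys)
fitted = [A + B * x + C * x * x for x in xs]
print(A, B, C, correlation(ys, fitted))
```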
18
eMule's Evolution curve
[Figure: Evolution curve of eMule; labelled versions: 30e, 47c, 23b, 44b, 25b]

19
What does the changelog say?
20
Conclusions
  • Software evolution process can be divided into 4
    stages
  • software growth: the size and complexity of
    developed software are increasing
  • software maintenance: the aim is to contain
    complexity and fix software bugs
  • software improvement: the aim is to contain
    software system size at a cost of increasing
    complexity
  • software shrink: both software size and its
    complexity are trimmed
  • Evolution curve method can
  • identify software evolution stages
  • identify the initial development status of the
    analyzed software system
  • actively developed systems show long growth
    trends
  • mature systems show maintenance and improvement
    trends
  • The method is independent of the software
    implementation language

21
Ongoing Research and Further Work
  • Analysis of other entropy measures such as block
    entropy and Rényi entropies
  • paper submitted to Journal of Software
    Maintenance and Evolution
  • Dynamic models of software evolution
  • Differential equations, etc.
  • More case studies
  • paper submitted to Computing and Information
    Systems Journal

22
Thank You. Any Questions?
23
7-zip Evolution curve
24
Grip Evolution curve