Title: Two Case Studies of Open Source Software Development: Apache and Mozilla
1Two Case Studies of Open Source Software Development: Apache and Mozilla
- By Helen Gower, Drew Spencer, Mila Reid, Nigel Macarthur, Mohamed Hossain
2Introduction
- Development Process - Traditional
- What is Open Source Software
- Apache and Mozilla
- Apache Process
- Hypotheses
- Mozilla Process
- Hypotheses revisited
- Conclusion
- Research
- Any questions
3Development Process - Traditional
- Basically the waterfall cycle
- Predominantly used in commercial industry
- Advantages: well-established, structured procedures
- Disadvantages: management-related constraints, cannot go back a phase
4What is Open Source Software (OSS)?
- A new way to develop software
- Differences from traditional development
- Source code is freely available
- Communicate exclusively by email/bulletin boards
- Geographically distributed development
- Advantages: developer freedom, tacit knowledge
- Disadvantages: lacks traditional methods to coordinate development
5Open Source Software - Results
- OSS development has proven to be equivalent or superior to traditional methods
- Defects found and fixed more quickly
- Code written with more care and creativity
- An example of successful OSS is the Linux operating system
6What is Apache?
- Apache is a free, open source HTTP web server software system
- Works well on open source operating systems such as UNIX and Linux
- Also available for Windows and other operating systems
7What is Apache? cont
- Supports the PERL and PHP languages
- Provides services such as server-side scripting
- Industry leaders such as DEC, UUNet and Yahoo use Apache
- 70% of the world's web servers run on Apache (http://news.netcraft.com/web_server_survey.html)
8Why call it Apache?
- In early 1995, developers of some high-visibility web sites decided to pool their patches and enhancements to the NCSA/1.3 server to create "a patchy server"
- The Apache Group (AG) started in 1995
9What is Mozilla?
- The Mozilla Project is an open source software project
- Dedicated to development of the Mozilla web browser and application framework
- Available for many operating systems
- Firefox, a cross-platform browser, and
- Camino, a web browser for Mac OS X
10What is Mozilla? cont
- Includes a mail and news reader (Mozilla Thunderbird), an HTML editor and an IRC client
- Supports many technologies, including development tools
- CVS, Bugzilla, Bonsai, Tinderbox
- It is also used to build toolkit-type applications such as Komodo from ActiveState
11What is Mozilla? cont
- Mozilla uses a development process with commercial roots
- Mozilla.org exists as a group within Netscape
- Central point of contact responsible for coordinating development
12The Development Process
- Problems posed by OSS-style development
- Decentralised Workspaces
- Lack of communication and leadership
- Inconsistent dedication of time
- Solutions
- Concurrent Versions System (CVS) archive
- Mailing List
- Quorum Voting System
- Meritocracy
13Identifying work to be done
- Modification Requests (MRs) mailing list
- BUGDB
- USENET groups
- Showstoppers always addressed
- Others discussed on mailing list
14Assigning and Performing Work
- Core developers have their own areas
- New developers take on disowned areas or new features
- Great respect for core developers' expertise and experience
- No specific rights to code: meritocracy gives implicit ownership
15The Development Community
- 400 individual contributors of code
- 182 people contributed 695 fixes
- 249 people contributed 6,092 new code submissions
- 3,060 people submitted 3,975 bug reports
- 458 people submitted 591 reports that caused a change in the code
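The counts above can be turned into per-person rates with a short calculation. This is only a sketch using the figures quoted on this slide; the dictionary keys and the helper name `items_per_person` are illustrative choices, not terms from the paper.

```python
# Counts quoted on the slide: (people, items) for each kind of contribution.
community = {
    "fixes": (182, 695),
    "new code submissions": (249, 6092),
    "bug reports": (3060, 3975),
    "reports causing a change": (458, 591),
}


def items_per_person(people, items):
    """Average number of items contributed per person."""
    return items / people


rates = {name: items_per_person(p, n) for name, (p, n) in community.items()}

# Share of all bug reports that led to a change in the code.
share_causing_change = 591 / 3975

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} per person")
print(f"reports causing a change: {share_causing_change:.1%} of all reports")
```

On these figures, code contributors average far more items each than bug reporters, who average little more than one report per person.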
16Distribution of Work
- Top 15 developers contributed
- 83% of MRs for new features
- 66% of MRs for defects/bugs
- The wider development community is significant in defect repair
- Few outside the core group submit with any regularity
Developers contributing > 1 MR    Before   After   Both
New features                      49       49      25
Fixes                             120      140     25
17Commercial Project Comparison
Project     MR      KLOCA   Dev   MR/top dev/yr   LOCA/top dev/yr
A           3,300   5,000   101   30              38,600
B           2,500   1,000   91    30              11,700
C           1,100   81      17    90              6,100
D           200     21      8     20              5,400
E           700     90      16    60              10,000
A-E Avg.    -       -       47    46              14,360
Apache      6,000   220     388   110             4,300
18Commercial Project Comparison
- Top Apache developers handle around twice the number of MRs as those in commercial projects
- Apache's rate of development is within 2/3 of that of C and D in terms of LOCA
- B and E are about twice as productive
- A is about 10 times more productive
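The comparisons above can be checked against the per-developer columns of the comparison table. The sketch below copies those two columns from the slide; the variable names are illustrative and nothing here comes from the paper's own tooling.

```python
# Per-top-developer yearly figures copied from the comparison table.
mr_per_top_dev_year = {"A": 30, "B": 30, "C": 90, "D": 20, "E": 60, "Apache": 110}
loca_per_top_dev_year = {
    "A": 38600, "B": 11700, "C": 6100, "D": 5400, "E": 10000, "Apache": 4300,
}

commercial = ["A", "B", "C", "D", "E"]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Averages over the five commercial projects (the "A-E Avg." row).
avg_mr = mean([mr_per_top_dev_year[p] for p in commercial])
avg_loca = mean([loca_per_top_dev_year[p] for p in commercial])

# How many MRs a top Apache developer handles relative to the commercial average.
mr_ratio = mr_per_top_dev_year["Apache"] / avg_mr

# Each commercial project's LOCA rate relative to Apache's.
loca_vs_apache = {
    p: loca_per_top_dev_year[p] / loca_per_top_dev_year["Apache"] for p in commercial
}

print(f"A-E average: {avg_mr:.0f} MR and {avg_loca:.0f} LOCA per top dev per year")
print(f"Apache MR rate vs commercial average: {mr_ratio:.1f}x")
for project, ratio in loca_vs_apache.items():
    print(f"{project} LOCA rate vs Apache: {ratio:.1f}x")
```

The computed averages match the "A-E Avg." row of the table. The ratios come out at roughly 2.4x for MRs, about 1.3x to 1.4x for C and D, about 2.3x to 2.7x for B and E, and about 9x for A.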
19Reporting Problems
- The top 15 problem reporters contributed only 5% of PRs
- Of these 15, only 3 are also core developers
- Problem reporting belongs almost exclusively to the wider development community
20Ownership of Code
- It was thought likely that strong code ownership would evolve due to modular design and decentralisation
- This was not supported by analysis of files (.c files)
- 75% had > 2 developers contributing 10% of lines
- 50% had > 4 developers contributing 10% of lines
- High level of trust and recognition of expertise
21Defect Density
- Measured in defects/KLOCA
- Apache has the same defect density in pre-release and post-release tests
- Pre-release: fewer defects than commercial products
- Post-release: more defects than commercial products
22Resolving Problems
- How long does it take to resolve problems?
- 50% resolved within 24 hrs
- 75% resolved within 42 days
- 90% resolved within 140 days
- Slightly lower for documentation, OS-related and optional features
- Over two periods the average resolution interval decreased significantly while the number of users increased
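Figures such as "75% resolved within 42 days" are percentiles of the resolution interval. As a rough sketch of how they could be computed, the example below applies the nearest-rank method to a small, entirely hypothetical list of intervals. The data, the function name and the choice of nearest-rank are assumptions for illustration, not the paper's method.

```python
import math


def nearest_rank_percentile(values, pct):
    """Smallest value v such that at least pct% of the values are <= v."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical resolution intervals in days (NOT data from the paper).
intervals_days = [0.2, 0.5, 1, 1, 3, 10, 40, 42, 120, 300]

for pct in (50, 75, 90):
    days = nearest_rank_percentile(intervals_days, pct)
    print(f"{pct}% of problems resolved within {days} days")
```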
23What has been analysed?
- The structure of the development process
- The number of participants
- The distribution of work among different roles
- Rules of ownership of code
- Density of defects
- Time taken to resolve problems
24Hypothesis 1
- Implicit coordination mechanism
- Detailed knowledge of who has expertise in what area
- Customs and habits regarding how things are done
- What core members are doing
- Core of developers who control the code base
- No larger than 10-15 people
- Create approx. 80% of the new functionality
- (not fixes or problem reporting)
25Hypothesis 2
- Satellite projects created
- Divide and conquer: work split over core developers and satellite groups
Strict code ownership policy needs to be adopted
26Hypothesis 3
- A group around 10x larger than the core (10-15 people) will repair defects
- E.g. Apache, 182 people repaired defects
- A group 10x larger or more will report problems
- E.g. Apache, 3,060 people submitted bug reports
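The order-of-magnitude claim can be checked against the Apache figures quoted above. This sketch assumes a core of 15, the upper end of the 10-15 range, and assumes the reporters are to be compared with the fixers. Both choices, and the variable names, are illustrative.

```python
CORE_SIZE = 15    # assumed: upper end of the 10-15 range on the slide
FIXERS = 182      # people who repaired defects (Apache)
REPORTERS = 3060  # people who submitted bug reports (Apache)


def size_ratio(larger, smaller):
    """How many times larger one group is than another."""
    return larger / smaller


fixers_vs_core = size_ratio(FIXERS, CORE_SIZE)
reporters_vs_fixers = size_ratio(REPORTERS, FIXERS)
reporters_vs_core = size_ratio(REPORTERS, CORE_SIZE)

print(f"fixers are {fixers_vs_core:.1f}x the core")
print(f"reporters are {reporters_vs_fixers:.1f}x the fixers")
print(f"reporters are {reporters_vs_core:.0f}x the core")
```

Under these assumptions both steps exceed a factor of 10, which is consistent with the hypothesis for Apache.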
27Hypothesis 4
- Lack of resources leaves core developers overburdened
- Most people have only ever submitted 1 bug
- Apache: 3,060 people reported 3,975 bugs
- Wider community needed to free up core developers' time so they can develop new functionality
Projects without a wider community finding and
repairing defects will fail
28Hypothesis 5
Defect density (per 1,000 lines of code) will be
lower than commercial software
29Hypothesis 6
- Familiar with the features needed
- Familiar with desirable user behaviour
Developers are also experienced users of the
software they write
30Hypothesis 7
- Many eyeballs imply shallow bugs
- Free world of OSS
- Patches available to all customers nearly as soon as they are made
- Commercial developments
- Patches bundled into new releases and scheduled for release at specific times (long-term projects)
OSS developments exhibit rapid responses to
customer problems
31Mozilla: How Things Happen
- At the time the paper was written, development was done by 12 staff in mozilla.org
- Non-development staff concentrate on issues like testing or community milestone releases
- The content of future releases is specified in a road map
- Work within this is allocated according to developer preferences and expertise
32How Things Happen cont
- Developers can browse Bugzilla to choose areas on which they would like to work
- Mozilla web pages can be used to note areas where help is needed
- Mozilla operates on a daily build
- Each build is smoke tested by one of 6 pre-release test teams
- This is followed by inspections and managed release
33Mozilla Research Findings
- The points below summarise the research questions originally considered
- 486 people contributed code
- 412 contributed code to fixes
- 6,873 communicated problems (external community very large, small core)
- Code ownership is enforced
34Mozilla Findings cont
- The authors' hypotheses 1 and 2 are supported by the Mozilla data
- However, these hypotheses were modified as summarised below
- The core size is limited to 10-15 if only informal coordination is used
- Original hypothesis did not discuss the impact of coordination
- Project cores larger than 10-15 might require other mechanisms in addition to code ownership to improve coordination
35Mozilla Findings cont
- Hypothesis 3 (relative sizes of core/fixers/reporters) is weakly supported
- Core: 22 to 35 (larger than expected)
- Fixers: 47 to 129
- Reporters: 119 to 623
- Mozilla defect density is lower than in equivalent commercial projects
- Treat this with caution: more defects may be found later
36Mozilla - Conclusions
- Commercial/OSS has many possible hybrids
- These hybrids will require a large open source community to fix bugs
- They will also require an even larger community to find bugs
37Research
- A strong paper
- Good approach to measuring the metrics required to test the hypotheses
- Citation frequency in the following years suggests that it is regarded as authoritative
- However, the final conclusion, as mentioned previously, must be considered unproven as yet
38References
- http://www.cs.colostate.edu/cs656/reading/reading-paper.ppt1 (How to read and critique a technical paper, Colorado State University)
- Greenhalgh, Trisha, How to read a paper, London, BMJ, 1997
- The above summarised at
- http://www.bmj.com/archive/7102/7102ed.htm
- Note: the BMJ references above concern evidence-based medicine, but have some useful sections!
39- Any questions?
- preferably easy ones!