SENG 521 Software Reliability - PowerPoint PPT Presentation

1
SENG 521 Software Reliability Testing
  • Overview of Software Reliability Engineering

Department of Electrical and Computer Engineering,
University of Calgary
B.H. Far (far@enel.ucalgary.ca)
http://www.enel.ucalgary.ca/far/Lectures/SENG521/01/
2
Contents
  • About this course.
  • What is software reliability?
  • What factors affect software quality?
  • What is software reliability engineering?
  • Software reliability engineering process.

3
Section 1
  • Basic Concepts
  • Definitions

4
Realities
  • Software development is a very high-risk task.
  • About 20% of software projects are canceled
    (missed schedules, etc.).
  • About 84% of software projects are incomplete
    when released (need patches, etc.).
  • Almost all software projects exceed their initial
    cost estimates (cost overrun).

5
Software Engineering /1
  • Business software has a large number of parts
    that have many interactions (i.e., complexity).
  • Software engineering paradigms provide models and
    techniques that make it easier to handle
    complexity.
  • A number of contemporary software engineering
    paradigms have been proposed:
  • Object-orientation
  • Component-ware
  • Design patterns
  • Software architectures
  • etc.

6
Software Engineering /2
  • Evolution of software engineering paradigms
  • Assembly languages
  • Procedural and structured programming
  • Object Oriented programming
  • Component-ware
  • Design patterns
  • Software architectures
  • Software Agents

time
7
What Affects Software?
  • Timeliness:
  • Meeting the project deadline.
  • Reaching the market at the right time.
  • Cost:
  • Meeting the anticipated project costs.
  • Reliability:
  • Working correctly for the designated period on the
    designated system.

8
Definition: Failure & Availability
  • Failure: any departure of system behavior in
    execution from user needs.
  • Failure intensity: the number of failures per
    natural or time unit. Failure intensity is a way
    of expressing reliability.
  • Availability: the probability at any given time
    that a system or a capability of a system
    functions satisfactorily in a specified
    environment.
  • Given an average downtime per failure,
    availability implies a certain reliability.
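The downtime/reliability link above can be sketched in a few lines (a simplification assuming steady-state operation with a constant failure intensity; the function name and values are illustrative, not from the slides):

```python
def availability(failure_intensity, mean_downtime):
    """Steady-state availability, assuming 'failure_intensity' failures
    per hour and 'mean_downtime' hours of downtime per failure.
    Average uptime per failure cycle is 1/failure_intensity hours."""
    uptime_per_failure = 1.0 / failure_intensity
    return uptime_per_failure / (uptime_per_failure + mean_downtime)

# e.g., 0.01 failures/hour with 2 hours down per failure:
print(round(availability(0.01, 2.0), 4))  # ≈ 0.9804
```

Read the other way: a required availability plus a known mean downtime pins down the failure intensity the software must achieve.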

9
Definition: Verification & Validation
  • Verification:
  • For each development phase or module, are the
    outputs generated correctly from the inputs, and
    do they match?
  • Validation:
  • Does the software meet its requirements?

10
Definition: Reliability
  • Reliability is the probability that a system or a
    capability of a system functions without failure
    for a specified time or number of natural
    units in a specified environment. (Musa, et al.)
  • A recent survey of software consumers revealed
    that reliability was the most important quality
    attribute of the application software.
  • This course is concerned with the engineering of
    reliable software products.
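The definition above can be made concrete under a common simplifying assumption (not stated on the slide) of a constant failure intensity λ, for which R(t) = exp(-λt):

```python
import math

def reliability(failure_intensity, t):
    """R(t) = exp(-lambda * t): probability of operating without
    failure over [0, t], assuming a constant failure intensity
    (the exponential model; values below are illustrative)."""
    return math.exp(-failure_intensity * t)

# 0.001 failures/hour over a 100-hour mission:
print(round(reliability(0.001, 100), 4))  # exp(-0.1) ≈ 0.9048
```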

11
About This Course
  • The topics discussed include:
  • Concepts and relationships
  • Analytical models and supporting tools
  • Techniques for software reliability improvement,
    including:
  • Fault avoidance, fault elimination, fault
    tolerance
  • Error detection and repair
  • Failure detection and retraction
  • Risk management

12
Section 2
  • Reliability

13
Reliability: Natural System
  • Natural system life cycle.
  • Aging effect: the life span of a natural system is
    limited by the maximum reproduction rate of its
    cells.

14
Reliability: Hardware
  • Hardware life cycle.
  • Useful life span of a hardware system is limited
    by the age (wear out) of the system.

15
Reliability: Software
  • Software life cycle.
  • Software systems are changed (updated) many times
    during their life cycle.
  • Each update adds to the structural deterioration
    of the software system.

16
Software vs. Hardware
  • Software reliability doesn't decrease with time.
  • Hardware faults are mostly physical faults.
  • Software faults are mostly design faults, which
    are harder to measure, model, detect and correct.

17
Reliability Science
  • Exploring ways of implementing reliability in
    software products.
  • Reliability Science's goals:
  • Developing models and techniques to build
    reliable software.
  • Testing such models and techniques for adequacy,
    soundness and completeness.

18
Section 3
  • Reliability
  • Engineering

19
What is Engineering?
  • Engineering involves:
  • Analysis: What is the problem to be solved?
  • Design: What characteristics of the entity are
    used to solve the problem?
  • Construction: How will the entity be realized?
    How is it constructed?
  • Verification: What approach is used to uncover
    errors in design and construction?
  • Management: How will the entity be supported in
    the long term?

20
Reliability Engineering /1
  • Engineering of reliability in software products.
  • Reliability Engineering's goal:
  • Developing software to reach the market
  • with minimum development time,
  • with minimum development cost, and
  • with maximum reliability.

Software Quality
21
Reliability Engineering /2
Software quality means getting the right balance
among development cost, development time and
reliability.
  • Pick quantitative representations for the 3
    factors (cost, time and reliability) and measure
    them!

22
What is SRE? /1
  • Software Reliability Engineering (SRE) is a
    multi-faceted discipline covering the software
    product lifecycle.
  • It involves both technical and management
    activities in three basic areas:
  • Software development and maintenance,
  • Measurement and analysis of reliability data,
  • Feedback of reliability information into
    software lifecycle activities.

23
What is SRE? /2
  • SRE is a practice for quantitatively planning and
    guiding software development and test, with
    emphasis on reliability and availability.
  • SRE simultaneously does three things:
  • It ensures that product reliability and
    availability meet user needs.
  • It delivers the product to market faster.
  • It increases productivity, lowering product
    life-cycle cost.
  • In applying SRE, one can vary relative emphasis
    placed on these three factors.

24
Section 4
  • Software Reliability
  • Engineering (SRE) Process

25
SRE Process /1
  • There are five steps in the SRE process (for each
    system to test):
  • Define necessary reliability
  • Develop operational profiles
  • Prepare for test
  • Execute test
  • Apply failure data to guide decisions

26
SRE Process /2
  • The Develop Operational Profiles and Prepare for
    Test activities both start during the Requirements
    and Architecture phases of the software
    development process.
  • They both extend to varying degrees into the
    Design and Implementation phase, as they can be
    affected by it.
  • The Execute Test and Guide Test activities
    coincide with the Test phase.

27
SRE: Necessary Reliability
  • Define what failure means for the product.
  • Choose a common measure for all failure
    intensities, either failures per natural unit or
    failures per hour.
  • Set the total system failure intensity objective
    (FIO).
  • Compute a developed-software FIO by subtracting
    the total of the FIOs of all hardware and
    acquired software components from the system
    FIO.
  • Use the developed-software FIO to track
    reliability growth during system test.
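The FIO budgeting step above is a simple subtraction; a minimal sketch (function name and failure-intensity values are illustrative, in failures per hour):

```python
def developed_software_fio(system_fio, component_fios):
    """Developed-software failure intensity objective: the system FIO
    minus the FIOs of all hardware and acquired software components."""
    fio = system_fio - sum(component_fios)
    if fio <= 0:
        raise ValueError("component FIOs already exhaust the system FIO")
    return fio

# System objective 0.05 failures/hr; hardware 0.01, acquired software 0.015:
print(round(developed_software_fio(0.05, [0.01, 0.015]), 6))  # 0.025
```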

28
SRE: Operational Profile /1
  • An operation is a major system logical task,
    which returns control to the system when
    complete.
  • An operational profile is a complete set of
    operations with their probabilities of occurrence.

29
SRE: Operational Profile /2
  • There are four principal steps in developing an
    operational profile:
  • Identify the operation initiators
  • List the operations invoked by each initiator
  • Determine the occurrence rates
  • Determine the occurrence probabilities by
    dividing each occurrence rate by the total
    occurrence rate
  • There are three kinds of initiators: user types,
    external systems, and the system itself.
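The fourth step above is a normalization; a small sketch (the operation names and rates are hypothetical):

```python
def occurrence_probabilities(rates):
    """Divide each operation's occurrence rate by the total rate,
    turning rates (e.g., occurrences/hour) into a profile that sums to 1."""
    total = sum(rates.values())
    return {op: rate / total for op, rate in rates.items()}

# Hypothetical operations with occurrence rates in occurrences/hour:
rates = {"process_order": 800, "query_status": 150, "audit_report": 50}
print(occurrence_probabilities(rates))
# {'process_order': 0.8, 'query_status': 0.15, 'audit_report': 0.05}
```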

30
SRE: Operational Profile /3
  • Review the operational profile:
  • Review the functionality to be implemented to
    remove operations that are not likely to be worth
    their cost
  • Suggest operations where opportunities for reuse
    will be most cost-effective
  • Plan a more competitive release strategy using
    operational development. With operational
    development, development proceeds operation by
    operation, ordered by the operational profile.
    This makes it possible to deliver the most used,
    most critical capabilities to customers earlier
    than scheduled.
  • Allocate resources for requirements, design, and
    code reviews among operations to cut schedules
    and costs
  • Allocate system engineering, architectural
    design, development, and code resources among
    operations to cut schedules and costs
  • Allocate development, code, and test resources
    among modules to cut schedules and costs

31
SRE: Prepare for Test
  • The Prepare for Test activity uses the
    operational profiles to prepare test cases and
    test procedures.
  • Test cases are allocated in accordance with the
    operational profile.
  • Test cases are assigned to the operations by
    selecting from all the possible intra-operation
    choices with equal probability.
  • The test procedure is the controller that invokes
    test cases during execution.
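Allocating test cases in accordance with the operational profile can be sketched as below (a simple proportional scheme with remainders given to the most frequent operations; names and numbers are illustrative):

```python
def allocate_test_cases(total_cases, profile):
    """Allocate test cases to operations in proportion to the
    operational profile; leftover cases from integer truncation go
    to the most frequently occurring operations."""
    alloc = {op: int(total_cases * p) for op, p in profile.items()}
    leftover = total_cases - sum(alloc.values())
    for op in sorted(profile, key=profile.get, reverse=True)[:leftover]:
        alloc[op] += 1
    return alloc

profile = {"process_order": 0.8, "query_status": 0.15, "audit_report": 0.05}
print(allocate_test_cases(100, profile))
# {'process_order': 80, 'query_status': 15, 'audit_report': 5}
```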

32
SRE: Execute Test
  • Allocate test time among the associated systems
    and types of test (feature, load, regression,
    etc.).
  • Invoke the test cases at random times, choosing
    operations randomly in accordance with the
    operational profile.
  • Identify failures, along with when they occur.
  • This information will be used in Apply Failure
    Data and Guide Test.
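Choosing operations randomly in accordance with the operational profile is weighted random selection; a minimal sketch (operation names are hypothetical):

```python
import random

def next_operation(profile, rng=random):
    """Pick the next operation to exercise, weighted by its
    occurrence probability in the operational profile."""
    ops = list(profile)
    return rng.choices(ops, weights=[profile[o] for o in ops], k=1)[0]

profile = {"process_order": 0.8, "query_status": 0.15, "audit_report": 0.05}
random.seed(1)  # fixed seed so the run is repeatable
picks = [next_operation(profile) for _ in range(1000)]
print(picks.count("process_order"))  # roughly 800 of the 1000 picks
```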

33
Types of Test
  • Reliability Growth Test
  • Certification Test

34
SRE: Apply Failure Data
  • Plot each new failure as it occurs on a
    reliability demonstration chart.
  • Accept or reject software (operations) using
    reliability demonstration chart.
  • Track reliability growth as faults are removed.
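The accept/continue/reject decision behind a reliability demonstration chart is, in effect, a sequential probability ratio test. A minimal sketch, assuming a Poisson failure process; the discrimination ratio gamma and the risks alpha/beta are illustrative defaults, and normalized time means failure time multiplied by the failure intensity objective:

```python
import math

def demo_chart_decision(n_failures, normalized_time, gamma=2.0,
                        alpha=0.1, beta=0.1):
    """Sequential-sampling decision for a reliability demonstration
    chart (a sketch under a Poisson-process assumption). gamma is the
    discrimination ratio, alpha/beta the two decision risks."""
    # Log-likelihood ratio of 'intensity = gamma * objective'
    # versus 'intensity = objective' after n failures:
    llr = n_failures * math.log(gamma) - (gamma - 1) * normalized_time
    if llr >= math.log((1 - beta) / alpha):
        return "reject"       # failure intensity too high
    if llr <= math.log(beta / (1 - alpha)):
        return "accept"       # objective demonstrated
    return "continue"         # keep testing

print(demo_chart_decision(1, 0.5))   # early failure: keep testing
print(demo_chart_decision(0, 3.0))   # long failure-free run: accept
```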

35
Collect Field Data
  • SRE for the software product lifecycle.
  • Collect field data to use in succeeding releases
    either using automatic reporting routines or
    manual collection, using a random sample of field
    sites.
  • Collect data on failure intensity and on customer
    satisfaction and use this information in setting
    the failure intensity objective for the next
    release.
  • Measure operational profiles in the field and use
    this information to correct the operational
    profiles we estimated.
  • Collect information to refine the process of
    choosing reliability strategies in future
    projects.

36
Section 5
  • Error
  • Failure

37
Definition: Fault
  • A fault is a cause of either a failure of the
    program or an internal error (e.g., an incorrect
    state, incorrect timing).
  • A fault must be detected and then removed.
  • A fault can be removed without execution (e.g.,
    by code inspection or design review).
  • Fault removal through execution depends on the
    occurrence of the associated failure.
  • Occurrence depends on the length of execution
    time and the operational profile.

38
Definition: Error
  • Error has two meanings:
  • A discrepancy between a computed, observed or
    measured value or condition and the true,
    specified or theoretically correct value or
    condition.
  • A human action that results in software
    containing a fault.
  • Human errors are the hardest to detect.

39
More Definitions
  • Defect: refers to either a fault (cause) or a
    failure (effect).
  • Service: expected behavior of a software system.
  • Availability: system uptime divided by the sum of
    system uptime and downtime.

40
Failure Specification /1
Time-based failure specification:
  1. Time of failure
  2. Time interval between failures
  3. Cumulative failures up to a given time
  4. Failures experienced in a time interval

Failure no.  Failure time (hours)  Failure interval (hours)
 1            10                    10
 2            19                     9
 3            32                    13
 4            43                    11
 5            58                    15
 6            70                    12
 7            88                    18
 8           103                    15
 9           125                    22
10           150                    25
11           169                    19
12           199                    30
13           231                    32
14           256                    25
15           296                    40
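The two columns of the table above are interconvertible; a quick sketch using the table's own data:

```python
def failure_intervals(failure_times):
    """Interfailure intervals from cumulative failure times:
    each interval is the gap since the previous failure (or since 0)."""
    return [t - prev for prev, t in
            zip([0] + failure_times[:-1], failure_times)]

# Failure times (hours) from the table:
times = [10, 19, 32, 43, 58, 70, 88, 103, 125, 150,
         169, 199, 231, 256, 296]
print(failure_intervals(times)[:5])  # [10, 9, 13, 11, 15]
```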
41
Failure Specification /2
Failure-count-based failure specification:
  1. Time of failure
  2. Time interval between failures
  3. Cumulative failures up to a given time
  4. Failures experienced in a time interval

Time (s)  Cumulative failures  Failures in interval
  30       2                    2
  60       5                    3
  90       7                    2
 120       8                    1
 150      10                    2
 180      11                    1
 210      12                    1
 240      13                    1
 270      14                    1
42
Failure Specification /3
  • Many reliability modeling programs and the tools
    based on them (e.g., SMERFS and CASRE) can
    estimate model parameters from either
    failure-count data or time-between-failures data.

43
Failure Functions /1
Failure distribution:
  • Cumulative Failure Function (mean value
    function): denotes the average cumulative
    failures associated with each time point.

Failures in time period  Probability  Value x Probability
 0                        0.10         0.00
 1                        0.18         0.18
 2                        0.22         0.44
 3                        0.16         0.48
 4                        0.11         0.44
 5                        0.08         0.40
 6                        0.05         0.30
 7                        0.04         0.28
 8                        0.03         0.24
 9                        0.02         0.18
10                        0.01         0.10
Mean (expected cumulative failures)    3.04
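The 3.04 figure is the expected value of the distribution; a quick check in code (probabilities taken from the table):

```python
# P(k failures in the time period), from the table:
distribution = {0: 0.10, 1: 0.18, 2: 0.22, 3: 0.16, 4: 0.11,
                5: 0.08, 6: 0.05, 7: 0.04, 8: 0.03, 9: 0.02, 10: 0.01}

# Mean value function at this time point: sum of k * P(k).
mean_failures = sum(k * p for k, p in distribution.items())
print(round(mean_failures, 2))  # 3.04
```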
44
Failure Functions /2
  • Failure Intensity Function (FIF) represents the
    rate of change of cumulative failure function.
  • As faults are removed, failure intensity tends to
    drop and reliability tends to increase.

45
Failure Functions /3
  • Mean Time to Failure (MTTF): the expected time at
    which the next failure will be observed.
  • MTTF = ∫0→∞ R(x) dx, where R(x) is the
    reliability function.
  • Mean Time to Repair (MTTR): the expected time
    until the system is repaired.

46
Failure Functions /4
  • Failure Rate Function: the probability that a
    failure per unit time occurs in the interval
    [t, t + Δt], given that the failure has not
    occurred before t.
  • Mean Time Between Failures (MTBF):
    MTBF = MTTF + MTTR
  • Availability can also be defined as
    A = MTTF / MTBF = MTTF / (MTTF + MTTR)
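The MTBF/MTTF/MTTR relations can be checked with a short sketch (the hour values below are illustrative):

```python
def mtbf_and_availability(mttf, mttr):
    """MTBF = MTTF + MTTR, and availability = MTTF / MTBF
    (uptime over total time per failure cycle)."""
    mtbf = mttf + mttr
    return mtbf, mttf / mtbf

# e.g., 98 hours mean uptime and 2 hours mean repair time:
mtbf, avail = mtbf_and_availability(mttf=98.0, mttr=2.0)
print(mtbf, round(avail, 2))  # 100.0 0.98
```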

47
Failure Functions /5
Failures in time period  P (1 hour elapsed)  P (5 hours elapsed)
 0                        0.10                0.01
 1                        0.18                0.02
 2                        0.22                0.03
 3                        0.16                0.04
 4                        0.11                0.05
 5                        0.08                0.07
 6                        0.05                0.09
 7                        0.04                0.12
 8                        0.03                0.16
 9                        0.02                0.13
10                        0.01                0.10
11                        0.00                0.07
12                        0.00                0.05
13                        0.00                0.03
14                        0.00                0.02
15                        0.00                0.01
Mean                      3.04                7.77
48
Reliability Model
  • Fault removal: failure discovery (e.g., extent of
    execution, operational profile) and quality of
    the repair activity.
  • Fault introduction: characteristics of the
    product (e.g., program size) and the development
    process (e.g., SE tools and techniques, staff
    experience, etc.).
  • Environment.
  • These factors are the inputs to a reliability
    model.
49
Conclusion
  • Software Reliability Engineering (SRE) can offer
    metrics to help elevate a software development
    organization to the upper levels of software
    development maturity.