SENG 521 Software Reliability - PowerPoint PPT Presentation

1
SENG 521 Software Reliability Testing
  • Overview of Software Reliability Engineering

Department of Electrical and Computer Engineering,
University of Calgary
B.H. Far (far@enel.ucalgary.ca)
http://www.enel.ucalgary.ca/far/Lectures/SENG521/01/
2
Contents
  • About this course.
  • What is software reliability?
  • What factors affect software quality?
  • What is software reliability engineering?
  • Software reliability engineering process.

3
Section 1
  • Basic Concepts
  • Definitions

4
Realities
  • Software development is a very high-risk task.
  • About 20% of software projects are canceled
    (missed schedules, etc.).
  • About 84% of software projects are incomplete
    when released (need patches, etc.).
  • Almost all software projects exceed their initial
    cost estimates (cost overrun).

5
Software Engineering /1
  • Business software has a large number of parts
    that have many interactions (i.e., complexity).
  • Software engineering paradigms provide models and
    techniques that make it easier to handle
    complexity.
  • A number of contemporary software engineering
    paradigms have been proposed:
  • Object-orientation
  • Component-ware
  • Design patterns
  • Software architectures
  • etc.

6
Software Engineering /2
  • Evolution of software engineering paradigms
  • Assembly languages
  • Procedural and structured programming
  • Object Oriented programming
  • Component-ware
  • Design patterns
  • Software architectures
  • Software Agents

time
7
What Affects Software?
  • Timeliness:
  • Meeting the project deadline.
  • Reaching the market at the right time.
  • Cost:
  • Meeting the anticipated project costs.
  • Reliability:
  • Working correctly for the designated period on the
    designated system.

8
Definition: Failure & Availability
  • Failure: any departure of system behavior in
    execution from user needs.
  • Failure intensity: the number of failures per
    natural or time unit. Failure intensity is a way
    of expressing reliability.
  • Availability: the probability at any given time
    that a system or a capability of a system
    functions satisfactorily in a specified
    environment.
  • Given an average downtime per failure,
    availability implies a certain reliability.
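The downtime/reliability link above can be sketched in a few lines (a simplification assuming steady-state operation with a constant failure intensity; the function name and values are illustrative, not from the slides):

```python
def availability(failure_intensity, mean_downtime):
    """Steady-state availability, assuming 'failure_intensity' failures
    per hour and 'mean_downtime' hours of downtime per failure.
    Average uptime per failure cycle is 1/failure_intensity hours."""
    uptime_per_failure = 1.0 / failure_intensity
    return uptime_per_failure / (uptime_per_failure + mean_downtime)

# e.g., 0.01 failures/hour with 2 hours down per failure:
print(round(availability(0.01, 2.0), 4))  # ≈ 0.9804
```

Read the other way: a required availability plus a known mean downtime pins down the failure intensity the software must achieve.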

9
Definition: Verification & Validation
  • Verification:
  • For each development phase or module, are the
    outputs generated correctly from the inputs, and
    do they match?
  • Validation:
  • Does the software meet its requirements?

10
Definition: Reliability
  • Reliability is the probability that a system or a
    capability of a system functions without failure
    for a specified time or number of natural
    units in a specified environment. (Musa, et al.)
  • A recent survey of software consumers revealed
    that reliability was the most important quality
    attribute of the application software.
  • This course is concerned with the engineering of
    reliable software products.
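The definition above can be made concrete under a common simplifying assumption (not stated on the slide) of a constant failure intensity λ, for which R(t) = exp(-λt):

```python
import math

def reliability(failure_intensity, t):
    """R(t) = exp(-lambda * t): probability of operating without
    failure over [0, t], assuming a constant failure intensity
    (the exponential model; values below are illustrative)."""
    return math.exp(-failure_intensity * t)

# 0.001 failures/hour over a 100-hour mission:
print(round(reliability(0.001, 100), 4))  # exp(-0.1) ≈ 0.9048
```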

11
About This Course
  • The topics discussed include:
  • Concepts and relationships
  • Analytical models and supporting tools
  • Techniques for software reliability improvement,
    including:
  • Fault avoidance, fault elimination, fault
    tolerance
  • Error detection and repair
  • Failure detection and retraction
  • Risk management

12
Section 2
  • Reliability

13
Reliability: Natural System
  • Natural system life cycle.
  • Aging effect: the life span of a natural system is
    limited by the maximum reproduction rate of its
    cells.

14
Reliability: Hardware
  • Hardware life cycle.
  • Useful life span of a hardware system is limited
    by the age (wear out) of the system.

15
Reliability: Software
  • Software life cycle.
  • Software systems are changed (updated) many times
    during their life cycle.
  • Each update adds to the structural deterioration
    of the software system.

16
Software vs. Hardware
  • Software reliability doesn't decrease with time.
  • Hardware faults are mostly physical faults.
  • Software faults are mostly design faults, which
    are harder to measure, model, detect and correct.

17
Reliability Science
  • Exploring ways of implementing reliability in
    software products.
  • Reliability Science's goals:
  • Developing models and techniques to build
    reliable software.
  • Testing such models and techniques for adequacy,
    soundness and completeness.

18
Section 3
  • Reliability
  • Engineering

19
What is Engineering?
  • Engineering involves:
  • Analysis: What is the problem to be solved?
  • Design: What characteristics of the entity are
    used to solve the problem?
  • Construction: How will the entity be realized?
    How is it constructed?
  • Verification: What approach is used to uncover
    errors in design and construction?
  • Management: How will the entity be supported in
    the long term?

20
Reliability Engineering /1
  • Engineering of reliability in software products.
  • Reliability Engineering's goal:
  • Developing software to reach the market
  • with minimum development time,
  • with minimum development cost, and
  • with maximum reliability.

Software Quality
21
Reliability Engineering /2
Software quality means getting the right balance
among development cost, development time and
reliability.
  • Pick quantitative representations for the 3
    factors (cost, time and reliability) and measure
    them!

22
What is SRE? /1
  • Software Reliability Engineering (SRE) is a
    multi-faceted discipline covering the software
    product lifecycle.
  • It involves both technical and management
    activities in three basic areas:
  • Software development and maintenance,
  • Measurement and analysis of reliability data,
  • Feedback of reliability information into
    software lifecycle activities.

23
What is SRE? /2
  • SRE is a practice for quantitatively planning and
    guiding software development and test, with
    emphasis on reliability and availability.
  • SRE simultaneously does three things:
  • It ensures that product reliability and
    availability meet user needs.
  • It delivers the product to market faster.
  • It increases productivity, lowering product
    life-cycle cost.
  • In applying SRE, one can vary relative emphasis
    placed on these three factors.

24
Section 4
  • Software Reliability
  • Engineering (SRE) Process

25
SRE Process /1
  • There are five steps in the SRE process (for each
    system to test):
  • Define necessary reliability
  • Develop operational profiles
  • Prepare for test
  • Execute test
  • Apply failure data to guide decisions

26
SRE Process /2
  • The Develop Operational Profiles and Prepare for
    Test activities both start during the Requirements
    and Architecture phases of the software
    development process.
  • They both extend to varying degrees into the
    Design and Implementation phase, as they can be
    affected by it.
  • The Execute Test and Guide Test activities
    coincide with the Test phase.

27
SRE: Necessary Reliability
  • Define what failure means for the product.
  • Choose a common measure for all failure
    intensities, either failures per natural unit or
    failures per hour.
  • Set the total system failure intensity objective
    (FIO).
  • Compute a developed-software FIO by subtracting
    the total of the FIOs of all hardware and
    acquired software components from the system
    FIO.
  • Use the developed-software FIO to track
    reliability growth during system test.
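The FIO budgeting step above is a simple subtraction; a minimal sketch (function name and failure-intensity values are illustrative, in failures per hour):

```python
def developed_software_fio(system_fio, component_fios):
    """Developed-software failure intensity objective: the system FIO
    minus the FIOs of all hardware and acquired software components."""
    fio = system_fio - sum(component_fios)
    if fio <= 0:
        raise ValueError("component FIOs already exhaust the system FIO")
    return fio

# System objective 0.05 failures/hr; hardware 0.01, acquired software 0.015:
print(round(developed_software_fio(0.05, [0.01, 0.015]), 6))  # 0.025
```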

28
SRE: Operational Profile /1
  • An operation is a major system logical task,
    which returns control to the system when
    complete.
  • An operational profile is a complete set of
    operations with their probabilities of occurrence.

29
SRE: Operational Profile /2
  • There are four principal steps in developing an
    operational profile:
  • Identify the operation initiators
  • List the operations invoked by each initiator
  • Determine the occurrence rates
  • Determine the occurrence probabilities by
    dividing each occurrence rate by the total
    occurrence rate
  • There are three kinds of initiators: user types,
    external systems, and the system itself.
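The fourth step above is a normalization; a small sketch (the operation names and rates are hypothetical):

```python
def occurrence_probabilities(rates):
    """Divide each operation's occurrence rate by the total rate,
    turning rates (e.g., occurrences/hour) into a profile that sums to 1."""
    total = sum(rates.values())
    return {op: rate / total for op, rate in rates.items()}

# Hypothetical operations with occurrence rates in occurrences/hour:
rates = {"process_order": 800, "query_status": 150, "audit_report": 50}
print(occurrence_probabilities(rates))
# {'process_order': 0.8, 'query_status': 0.15, 'audit_report': 0.05}
```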

30
SRE: Operational Profile /3
  • Review the operational profile:
  • Review the functionality to be implemented to
    remove operations that are not likely to be worth
    their cost
  • Suggest operations where opportunities for reuse
    will be most cost-effective
  • Plan a more competitive release strategy using
    operational development. With operational
    development, development proceeds operation by
    operation, ordered by the operational profile.
    This makes it possible to deliver the most used,
    most critical capabilities to customers earlier
    than scheduled.
  • Allocate resources for requirements, design, and
    code reviews among operations to cut schedules
    and costs
  • Allocate system engineering, architectural
    design, development, and code resources among
    operations to cut schedules and costs
  • Allocate development, code, and test resources
    among modules to cut schedules and costs

31
SRE: Prepare for Test
  • The Prepare for Test activity uses the
    operational profiles to prepare test cases and
    test procedures.
  • Test cases are allocated in accordance with the
    operational profile.
  • Test cases are assigned to the operations by
    selecting from all the possible intra-operation
    choices with equal probability.
  • The test procedure is the controller that invokes
    test cases during execution.
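Allocating test cases in accordance with the operational profile can be sketched as below (a simple proportional scheme with remainders given to the most frequent operations; names and numbers are illustrative):

```python
def allocate_test_cases(total_cases, profile):
    """Allocate test cases to operations in proportion to the
    operational profile; leftover cases from integer truncation go
    to the most frequently occurring operations."""
    alloc = {op: int(total_cases * p) for op, p in profile.items()}
    leftover = total_cases - sum(alloc.values())
    for op in sorted(profile, key=profile.get, reverse=True)[:leftover]:
        alloc[op] += 1
    return alloc

profile = {"process_order": 0.8, "query_status": 0.15, "audit_report": 0.05}
print(allocate_test_cases(100, profile))
# {'process_order': 80, 'query_status': 15, 'audit_report': 5}
```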

32
SRE: Execute Test
  • Allocate test time among the associated systems
    and types of test (feature, load, regression,
    etc.).
  • Invoke the test cases at random times, choosing
    operations randomly in accordance with the
    operational profile.
  • Identify failures, along with when they occur.
  • This information will be used in Apply Failure
    Data and Guide Test.
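Choosing operations randomly in accordance with the operational profile is weighted random selection; a minimal sketch (operation names are hypothetical):

```python
import random

def next_operation(profile, rng=random):
    """Pick the next operation to exercise, weighted by its
    occurrence probability in the operational profile."""
    ops = list(profile)
    return rng.choices(ops, weights=[profile[o] for o in ops], k=1)[0]

profile = {"process_order": 0.8, "query_status": 0.15, "audit_report": 0.05}
random.seed(1)  # fixed seed so the run is repeatable
picks = [next_operation(profile) for _ in range(1000)]
print(picks.count("process_order"))  # roughly 800 of the 1000 picks
```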

33
Types of Test
  • Reliability Growth Test
  • Certification Test

34
SRE: Apply Failure Data
  • Plot each new failure as it occurs on a
    reliability demonstration chart.
  • Accept or reject software (operations) using
    reliability demonstration chart.
  • Track reliability growth as faults are removed.
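The accept/continue/reject decision behind a reliability demonstration chart is, in effect, a sequential probability ratio test. A minimal sketch, assuming a Poisson failure process; the discrimination ratio gamma and the risks alpha/beta are illustrative defaults, and normalized time means failure time multiplied by the failure intensity objective:

```python
import math

def demo_chart_decision(n_failures, normalized_time, gamma=2.0,
                        alpha=0.1, beta=0.1):
    """Sequential-sampling decision for a reliability demonstration
    chart (a sketch under a Poisson-process assumption). gamma is the
    discrimination ratio, alpha/beta the two decision risks."""
    # Log-likelihood ratio of 'intensity = gamma * objective'
    # versus 'intensity = objective' after n failures:
    llr = n_failures * math.log(gamma) - (gamma - 1) * normalized_time
    if llr >= math.log((1 - beta) / alpha):
        return "reject"       # failure intensity too high
    if llr <= math.log(beta / (1 - alpha)):
        return "accept"       # objective demonstrated
    return "continue"         # keep testing

print(demo_chart_decision(1, 0.5))   # early failure: keep testing
print(demo_chart_decision(0, 3.0))   # long failure-free run: accept
```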

35
Collect Field Data
  • SRE for the software product lifecycle.
  • Collect field data to use in succeeding releases
    either using automatic reporting routines or
    manual collection, using a random sample of field
    sites.
  • Collect data on failure intensity and on customer
    satisfaction and use this information in setting
    the failure intensity objective for the next
    release.
  • Measure operational profiles in the field and use
    this information to correct the operational
    profiles we estimated.
  • Collect information to refine the process of
    choosing reliability strategies in future
    projects.

36
Section 5
  • Error
  • Failure

37
Definition: Fault
  • A fault is a cause of either a failure of the
    program or an internal error (e.g., an incorrect
    state, incorrect timing).
  • A fault must be detected and then removed.
  • A fault can be removed without execution (e.g.,
    by code inspection or design review).
  • Fault removal through execution depends on the
    occurrence of the associated failure.
  • Occurrence depends on the length of execution
    time and the operational profile.

38
Definition: Error
  • Error has two meanings:
  • A discrepancy between a computed, observed or
    measured value or condition and the true,
    specified or theoretically correct value or
    condition.
  • A human action that results in software
    containing a fault.
  • Human errors are the hardest to detect.

39
More Definitions
  • Defect: refers to either a fault (cause) or a
    failure (effect).
  • Service: expected behavior of a software system.
  • Availability: system uptime divided by the sum of
    system uptime and downtime.

40
Failure Specification /1
Time-based failure specification:
  1. Time of failure
  2. Time interval between failures
  3. Cumulative failures up to a given time
  4. Failures experienced in a time interval

Failure no.  Failure time (hours)  Failure interval (hours)
 1            10                    10
 2            19                     9
 3            32                    13
 4            43                    11
 5            58                    15
 6            70                    12
 7            88                    18
 8           103                    15
 9           125                    22
10           150                    25
11           169                    19
12           199                    30
13           231                    32
14           256                    25
15           296                    40
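The two columns of the table above are interconvertible; a quick sketch using the table's own data:

```python
def failure_intervals(failure_times):
    """Interfailure intervals from cumulative failure times:
    each interval is the gap since the previous failure (or since 0)."""
    return [t - prev for prev, t in
            zip([0] + failure_times[:-1], failure_times)]

# Failure times (hours) from the table:
times = [10, 19, 32, 43, 58, 70, 88, 103, 125, 150,
         169, 199, 231, 256, 296]
print(failure_intervals(times)[:5])  # [10, 9, 13, 11, 15]
```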
41
Failure Specification /2
Failure-count-based failure specification:
  1. Time of failure
  2. Time interval between failures
  3. Cumulative failures up to a given time
  4. Failures experienced in a time interval

Time (s)  Cumulative failures  Failures in interval
  30       2                    2
  60       5                    3
  90       7                    2
 120       8                    1
 150      10                    2
 180      11                    1
 210      12                    1
 240      13                    1
 270      14                    1
42
Failure Specification /3
  • Many reliability modeling programs and the tools
    based on them (e.g., SMERFS and CASRE) can
    estimate model parameters from either
    failure-count data or time-between-failures data.

43
Failure Functions /1
Failure distribution:
  • Cumulative Failure Function (mean value
    function): denotes the average cumulative
    failures associated with each time point.

Failures in time period  Probability  Value x Probability
 0                        0.10         0.00
 1                        0.18         0.18
 2                        0.22         0.44
 3                        0.16         0.48
 4                        0.11         0.44
 5                        0.08         0.40
 6                        0.05         0.30
 7                        0.04         0.28
 8                        0.03         0.24
 9                        0.02         0.18
10                        0.01         0.10
Mean (expected cumulative failures)    3.04
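The 3.04 figure is the expected value of the distribution; a quick check in code (probabilities taken from the table):

```python
# P(k failures in the time period), from the table:
distribution = {0: 0.10, 1: 0.18, 2: 0.22, 3: 0.16, 4: 0.11,
                5: 0.08, 6: 0.05, 7: 0.04, 8: 0.03, 9: 0.02, 10: 0.01}

# Mean value function at this time point: sum of k * P(k).
mean_failures = sum(k * p for k, p in distribution.items())
print(round(mean_failures, 2))  # 3.04
```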
44
Failure Functions /2
  • Failure Intensity Function (FIF) represents the
    rate of change of cumulative failure function.
  • As faults are removed, failure intensity tends to
    drop and reliability tends to increase.

45
Failure Functions /3
  • Mean Time to Failure (MTTF): the expected time at
    which the next failure will be observed.
  • MTTF = ∫0→∞ R(x) dx, where R(x) is the
    reliability function.
  • Mean Time to Repair (MTTR): the expected time
    until the system is repaired.

46
Failure Functions /4
  • Failure Rate Function: the probability that a
    failure per unit time occurs in the interval
    [t, t + Δt], given that the failure has not
    occurred before t.
  • Mean Time Between Failures (MTBF):
    MTBF = MTTF + MTTR
  • Availability can also be defined as
    A = MTTF / MTBF = MTTF / (MTTF + MTTR)
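The MTBF/MTTF/MTTR relations can be checked with a short sketch (the hour values below are illustrative):

```python
def mtbf_and_availability(mttf, mttr):
    """MTBF = MTTF + MTTR, and availability = MTTF / MTBF
    (uptime over total time per failure cycle)."""
    mtbf = mttf + mttr
    return mtbf, mttf / mtbf

# e.g., 98 hours mean uptime and 2 hours mean repair time:
mtbf, avail = mtbf_and_availability(mttf=98.0, mttr=2.0)
print(mtbf, round(avail, 2))  # 100.0 0.98
```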

47
Failure Functions /5
Failures in time period  P (1 hour elapsed)  P (5 hours elapsed)
 0                        0.10                0.01
 1                        0.18                0.02
 2                        0.22                0.03
 3                        0.16                0.04
 4                        0.11                0.05
 5                        0.08                0.07
 6                        0.05                0.09
 7                        0.04                0.12
 8                        0.03                0.16
 9                        0.02                0.13
10                        0.01                0.10
11                        0.00                0.07
12                        0.00                0.05
13                        0.00                0.03
14                        0.00                0.02
15                        0.00                0.01
Mean                      3.04                7.77
48
Reliability Model
  • Fault removal: failure discovery (e.g., extent of
    execution, operational profile) and quality of
    the repair activity.
  • Fault introduction: characteristics of the
    product (e.g., program size) and the development
    process (e.g., SE tools and techniques, staff
    experience, etc.).
  • Environment.
  • These factors are the inputs to a reliability
    model.
49
Conclusion
  • Software Reliability Engineering (SRE) can offer
    metrics to help elevate a software development
    organization to the upper levels of software
    development maturity.