Dependability Theory and Methods Part 1: Introduction and definitions

1
Dependability Theory and Methods Part 1
Introduction and definitions
  • Andrea Bobbio
  • Dipartimento di Informatica
  • Università del Piemonte Orientale, A. Avogadro
  • 15100 Alessandria (Italy)
  • bobbio_at_unipmn.it - http://www.mfn.unipmn.it/bobbio

Bertinoro, March 10-14, 2003
2
Dependability Definition
Dependability is the property of a system to be
dependable in time, i.e. such that reliance can
justifiably be placed on the service it delivers.
Dependability extends the interest in the system
from the design and construction phase to the
operational phase (life cycle).
3
What dependability theory and practice wants to
avoid
4
Dependability Taxonomy
Measures of dependability: reliability, availability,
maintainability, safety, security.
5
Quantitative analysis
The quantitative analysis aims at numerically
evaluating measures to characterize the
dependability of an item
  • Risk assessment and safety
  • Design specifications
  • Technical assistance and maintenance
  • Life cycle cost
  • Market competition

6
Risk assessment and safety
The risk associated with an activity is
proportional to the probability of occurrence of
the activity and to the magnitude of its
consequences:
$R = P \cdot M$
A safety critical system is a system whose
incorrect behavior may cause a risk to occur,
causing undesirable consequences to the item, to
the operators, to the population, to the
environment.
7
Design specifications
  • Technological items must be dependable.
  • Sometimes, dependability requirements (both
    qualitative and quantitative) are part of the
    design specifications
  • Mean time between failures
  • Total down time

8
Technical assistance and maintenance
The planning of all activities related to
technical assistance and maintenance is linked to
the system dependability (expected number of
failures over time).
  • planning spare parts and maintenance crews
  • cost of the technical assistance (warranty
    period)
  • preventive vs reactive maintenance.

9
Market competition
  • The choice of the consumers is strongly
    influenced by the perceived dependability.
  • advertisement messages stress the
    dependability
  • the image of a product or of a brand may depend
    on the dependability.

10
Purpose of evaluation
  • Understanding a system
  • Observation
  • Operational environment
  • Reasoning
  • Predicting the behavior of a system
  • Need a model
  • A model is a convenient abstraction
  • Accuracy based on degree of extrapolation

11
Methods of evaluation
  • Measurement-Based
  • Most believable, most expensive
  • Not always possible or cost effective during
    system design
  • Model-Based
  • Less believable, Less expensive
  • Analytic vs Discrete-Event Simulation
  • Combinatorial vs State-Space Methods

12
Measurement-Based
  • Most believable, most expensive
  • Data are obtained observing the behavior of
    physical objects.
  • field observations
  • measurements on prototypes
  • measurements on components (accelerated tests).

13
Models
  • Analytic: closed-form answers or numerical solution
  • Simulation
All models are wrong; some models are useful.
14
Methods of evaluation
  • Measurements, models, data bank

15
The probabilistic approach
The mechanisms that lead to the failure of a
technological object are very complex and depend
on many physical, chemical, technical, human, and
environmental factors.
The time to failure cannot be expressed by a
deterministic law.
We are forced to treat the time to failure as a
random variable. The quantitative dependability
analysis is based on a probabilistic approach.
16
Reliability
Reliability is a measurable attribute of
dependability, defined as follows.
The reliability R(t) of an item at time t is the
probability that the item performs the required
function in the interval (0, t), given the stress
and environmental conditions in which it operates.
17
Basic Definitions cdf
  • Let X be the random variable representing the
    time to failure of an item.

The cumulative distribution function (cdf) F(t)
of the r.v. X is given by
$F(t) = \Pr\{X \le t\}$
F(t) represents the probability that the item has
already failed at time t (unreliability).
18
Basic Definitions cdf
  • Equivalent terminology for F(t)
  • CDF (cumulative distribution function)
  • Probability distribution function
  • Distribution function

19
Basic Definitions cdf
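[Figure: plot of the cdf F(t) versus t, rising from 0 toward 1, with F(a) and F(b) marked at times a and b.]
Properties: $F(0) = 0$; $\lim_{t \to \infty} F(t) = 1$; $F(t)$ is non-decreasing.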
20
Basic Definitions Reliability
  • Let X be the random variable representing the
    time to failure of an item.

The survivor function (sf) R(t) of the r.v. X is
given by
$R(t) = \Pr\{X > t\} = 1 - F(t)$
R(t) represents the probability that the item is
correctly working at time t and gives the
reliability function.
21
Basic Definitions
  • Equivalent terminology for $R(t) = 1 - F(t)$
  • Reliability
  • Complementary distribution function
  • Survivor function

22
Basic Definitions Reliability
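[Figure: plot of the reliability R(t) versus t, falling from 1 toward 0, with R(a) marked at time a.]
Properties: $R(0) = 1$; $\lim_{t \to \infty} R(t) = 0$; $R(t)$ is non-increasing.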
23
Basic Definitions density
  • Let X be the random variable representing the
    time to failure of an item and let F(t) be a
    differentiable cdf

The density function f(t) is defined as
$f(t) = \frac{dF(t)}{dt}$
$f(t)\,dt = \Pr\{t \le X < t + dt\}$
24
Basic Definitions Density
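[Figure: plot of the density f(t) versus t, with the interval from a to b marked on the t axis.]
$\int_a^b f(x)\,dx = \Pr\{a < X \le b\} = F(b) - F(a)$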
25
Basic Definitions Density
[Figure: plot of the density f(t) versus t.]
26
Basic Definitions
  • Equivalent terminology pdf
  • probability density function
  • density function
  • density
  • f(t)

For a non-negative random variable
27
Quiz 1: The higher the MTTF is, the higher the
item reliability is.
  • Correct
  • Wrong

The correct answer is "Wrong"!
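A small counterexample makes the quiz answer concrete. The sketch below compares two hypothetical items (all numbers and the Weibull parameterization are illustrative assumptions, not taken from the slides): item A has the larger MTTF, yet item B has the higher reliability at an early mission time.

```python
import math

def exp_reliability(t, mttf):
    """R(t) for an exponential lifetime with the given MTTF (rate = 1/MTTF)."""
    return math.exp(-t / mttf)

def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t/scale)**shape), one common Weibull parameterization."""
    return math.exp(-((t / scale) ** shape))

def weibull_mttf(shape, scale):
    """Mean of that Weibull lifetime: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)

# Hypothetical item A: exponential lifetime with MTTF = 1.2 (arbitrary units).
mttf_a = 1.2
# Hypothetical item B: Weibull with shape 5, scaled so that its MTTF is 1.0.
shape_b = 5.0
scale_b = 1.0 / math.gamma(1.0 + 1.0 / shape_b)
mttf_b = weibull_mttf(shape_b, scale_b)

t = 0.5  # mission time at which the two items are compared
r_a = exp_reliability(t, mttf_a)
r_b = weibull_reliability(t, shape_b, scale_b)
print(f"A: MTTF = {mttf_a:.2f}, R({t}) = {r_a:.3f}")
print(f"B: MTTF = {mttf_b:.2f}, R({t}) = {r_b:.3f}")
```

The MTTF summarizes the whole lifetime distribution in one number, whereas R(t) depends on the mission time t, so the ranking of two items can differ between the two measures.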
28
Hazard (failure) rate
  • $h(t)\,\Delta t$: conditional probability that the
    system will fail in $(t, t + \Delta t)$, given that
    it has survived until time t
  • $f(t)\,\Delta t$: unconditional probability that the
    system will fail in $(t, t + \Delta t)$

29
The Failure Rate of a Distribution
  • $h(t)\,\Delta t$ is the conditional probability that
    the unit will fail in the interval $(t, t + \Delta t)$,
    given that it is functioning at time t.
  • $f(t)\,\Delta t$ is the unconditional probability that
    the unit will fail in the interval $(t, t + \Delta t)$.
  • Difference between the two sentences:
  • probability that someone will die between 90 and
    91, given that he lives to 90
  • probability that someone will die between 90 and
    91
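The difference between the two probabilities can be checked numerically. The sketch below assumes a hypothetical Weibull lifetime model (the shape and scale values are illustrative only, not demographic data) and uses the identity Pr{a < X <= b | X > a} = (F(b) - F(a)) / R(a).

```python
import math

def weibull_cdf(t, shape, scale):
    """F(t) = 1 - exp(-(t/scale)**shape)."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def prob_fail_in(a, b, shape, scale):
    """Unconditional probability Pr{a < X <= b} = F(b) - F(a)."""
    return weibull_cdf(b, shape, scale) - weibull_cdf(a, shape, scale)

def prob_fail_in_given_alive(a, b, shape, scale):
    """Conditional probability Pr{a < X <= b | X > a} = (F(b) - F(a)) / R(a)."""
    survival_at_a = 1.0 - weibull_cdf(a, shape, scale)
    return prob_fail_in(a, b, shape, scale) / survival_at_a

# Hypothetical parameters, chosen only to make the two numbers clearly differ.
shape, scale = 8.0, 85.0
p_uncond = prob_fail_in(90.0, 91.0, shape, scale)
p_cond = prob_fail_in_given_alive(90.0, 91.0, shape, scale)
print(f"unconditional: {p_uncond:.3f}, conditional: {p_cond:.3f}")
```

The conditional probability is larger because it divides by the probability of reaching age 90, which is less than 1.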

30
Bathtub curve
[Figure: bathtub curve of the failure rate h(t) versus t, with three phases:
DFR, decreasing failure rate (infant mortality, burn-in);
CFR, constant failure rate (useful life);
IFR, increasing failure rate (wear-out phase).]
31
Infant mortality (dfr)
Also called infant mortality phase or reliability
growth phase. The failure rate decreases with
time.
  • Caused by undetected hardware/software defects
  • Can cause significant prediction errors if
    steady-state failure rates are used
  • Weibull Model can be used

32
Useful life (cfr)
The failure rate remains constant in time (age
independent) .
  • Failure rate much lower than in early-life
    period.
  • Failures caused by random effects (such as
    environmental shocks).

33
Wear-out phase (ifr)
The failure rate increases with age.
  • Characteristic of irreversible aging phenomena
    (deterioration, wear-out, fatigue, corrosion, etc.)
  • Applicable to mechanical and other systems
    (properly qualified electronic parts do not
    exhibit wear-out failures during their intended
    service life)
  • Weibull Failure Model can be used
34
Exponential Distribution
Failure rate is age-independent (constant).
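  • Cumulative distribution function: $F(t) = 1 - e^{-\lambda t}$
  • Reliability: $R(t) = e^{-\lambda t}$
  • Density function: $f(t) = \lambda e^{-\lambda t}$
  • Failure rate (CFR): $h(t) = \lambda$
  • Mean time to failure: $\mathrm{MTTF} = 1/\lambda$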
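The quantities listed for the exponential distribution can be written as a few one-line functions. This is a minimal sketch; the function names and the rate value `lam` are assumptions made for illustration.

```python
import math

def exp_cdf(t, lam):
    """F(t) = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t)

def exp_reliability(t, lam):
    """R(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

def exp_density(t, lam):
    """f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

def exp_hazard(t, lam):
    """h(t) = f(t) / R(t), which reduces to the constant lam."""
    return exp_density(t, lam) / exp_reliability(t, lam)

def exp_mttf(lam):
    """MTTF = 1 / lam."""
    return 1.0 / lam

lam = 2.0  # hypothetical failure rate, in failures per unit time
for t in (0.1, 1.0, 3.0):
    print(t, exp_cdf(t, lam), exp_reliability(t, lam), exp_hazard(t, lam))
print("MTTF =", exp_mttf(lam))
```

Computing the hazard as f(t)/R(t) rather than returning `lam` directly makes the age-independence visible: the ratio is the same at every t.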

35
The Cumulative Distribution Function of an
Exponentially Distributed Random Variable With
Parameter $\lambda = 1$
[Figure: plot of $F(t) = 1 - e^{-\lambda t}$ for t from 0 to 5.00.]
36
The Reliability Function of an Exponentially
Distributed Random Variable With Parameter $\lambda = 1$
[Figure: plot of R(t) for t from 0 to 5.00.]
37
Exponential Density Function (pdf)
[Figure: plot of the density f(t) versus t.]
$\mathrm{MTTF} = 1/\lambda$
38
Memoryless Property of the Exponential
Distribution
  • Assume $X > t$: we have observed that the
    component has not failed until time t
  • Let $Y = X - t$ be the remaining (residual) lifetime,
    with conditional distribution $G_t(y) = \Pr\{Y \le y \mid X > t\}$
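The distribution of the residual lifetime follows in one line from the definition of conditional probability. The derivation below is a sketch that assumes X is exponential with rate $\lambda$ and writes $G_t(y)$ for the conditional cdf of Y given $X > t$:

```latex
\begin{aligned}
G_t(y) &= \Pr\{Y \le y \mid X > t\}
        = \frac{\Pr\{t < X \le t + y\}}{\Pr\{X > t\}}
        = \frac{F(t+y) - F(t)}{1 - F(t)} \\
       &= \frac{e^{-\lambda t} - e^{-\lambda (t+y)}}{e^{-\lambda t}}
        = 1 - e^{-\lambda y}.
\end{aligned}
```

The age t cancels out of the final expression.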

39
Memoryless Property of the Exponential
Distribution (cont.)
  • Thus $G_t(y)$ is independent of t and is identical
    to the original exponential distribution of X
  • The distribution of the remaining life does not
    depend on how long the component has been
    operating
  • An observed failure is the result of some
    suddenly appearing failure, not due to gradual
    deterioration
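The memoryless property can be verified numerically. The sketch below (the rate, the horizon, and the ages are hypothetical values) computes the probability of surviving a further y time units for components of different ages; for an exponential lifetime the result is the same at every age.

```python
import math

def exp_reliability(t, lam):
    """R(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

def residual_reliability(y, age, lam):
    """Pr{X > age + y | X > age} = R(age + y) / R(age)."""
    return exp_reliability(age + y, lam) / exp_reliability(age, lam)

lam = 0.5   # hypothetical failure rate
y = 2.0     # additional operating time of interest
ages = (0.0, 1.0, 10.0)
values = [residual_reliability(y, age, lam) for age in ages]
for age, value in zip(ages, values):
    print(f"age {age:5.1f}: Pr(survive {y} more) = {value:.6f}")
```

Every line prints the same probability as a brand-new component, R(y) = exp(-lam * y).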

40
Quiz 3: If two components (say, A and B) have
independent identical exponentially distributed
times to failure, by the memoryless property,
which of the following is true?
  • 1. They will always fail at the same time
  • 2. They have the same probability of failing at
    time t during operation
  • 3. When these two components are operating
    simultaneously, the component which has been
    operational for a shorter duration of time will
    survive longer

41
Weibull Distribution
  • Distribution Function
  • Density Function
  • Reliability

42
Weibull Distribution
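The Weibull distribution has a shape parameter and a scale parameter.
Depending on the shape parameter, the failure rate is decreasing
(DFR, shape < 1), constant (CFR, shape = 1), or increasing (IFR, shape > 1).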
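The three failure-rate regimes can be reproduced with a short sketch. It assumes one common Weibull parameterization, R(t) = exp(-(t/scale)**shape), which may differ from the one used on the slides; the function names and sample times are illustrative.

```python
import math

def weibull_hazard(t, shape, scale=1.0):
    """h(t) = (shape/scale) * (t/scale)**(shape - 1) for R(t) = exp(-(t/scale)**shape)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def failure_rate_trend(shape, scale=1.0, times=(0.5, 1.0, 2.0)):
    """Classify the hazard as DFR, CFR or IFR by sampling it at increasing times."""
    rates = [weibull_hazard(t, shape, scale) for t in times]
    if all(math.isclose(r, rates[0]) for r in rates):
        return "CFR"
    return "IFR" if rates[0] < rates[-1] else "DFR"

for shape in (0.5, 1.0, 2.0):
    print(f"shape = {shape}: {failure_rate_trend(shape)}")
```

With shape = 1 the hazard reduces to the constant 1/scale, which is the exponential case.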
43
Failure Rate of the Weibull Distribution with
Various Values of ?
44
Weibull Distribution for Various Values of ?
Cdf
density
45
Failure Rate Models
  • We use a truncated Weibull Model
  • Infant mortality phase modeled by DFR Weibull and
    the steady-state phase by the exponential

Figure 2.34: Weibull Failure-Rate Model
[Plot of the failure-rate multiplier (0 to 7) versus operating time (0 to 17,520 hrs).]
46
Failure Rate Models (cont.)
  • This model has the form
  • where
  • steady-state failure rate
  • is Weibull shape parameter
  • Failure rate multiplier

47
Failure Rate Models (cont.)
  • There are several ways to incorporate time
    dependent failure rates in availability models
  • The easiest way is to approximate a continuous
    function by a piecewise constant step function


Discrete Failure-Rate Model
[Plot of the failure-rate multiplier (0 to 7) versus operating time (0 to 17,520 hrs), drawn as a step function.]
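A piecewise-constant approximation of a time-dependent failure rate can be sketched as follows. The continuous multiplier used here is a hypothetical decreasing curve that levels off at 1 after 8,760 hours; its exponent and the interval edges are assumptions for illustration, not the model from the slides.

```python
import bisect

def piecewise_constant(h, edges):
    """Approximate h on each interval [edges[i], edges[i+1]) by its midpoint value."""
    return [h(0.5 * (lo + hi)) for lo, hi in zip(edges, edges[1:])]

def step_rate(t, edges, levels):
    """Value of the step function at time t (the last level is held beyond the final edge)."""
    i = bisect.bisect_right(edges, t) - 1
    return levels[min(max(i, 0), len(levels) - 1)]

def multiplier(t, t_ss=8760.0, exponent=0.5):
    """Hypothetical failure-rate multiplier: decreasing before t_ss, equal to 1 afterwards."""
    return max(1.0, (t_ss / t) ** exponent)

# Interval edges in operating hours (quarter-year steps in the first year).
edges = [0.0, 2190.0, 4380.0, 6570.0, 8760.0, 17520.0]
levels = piecewise_constant(multiplier, edges)
for lo, hi, level in zip(edges, edges[1:], levels):
    print(f"[{lo:7.0f}, {hi:7.0f}) hrs: multiplier = {level:.3f}")
```

Each constant level can then be used as the failure rate of one phase in an availability model, which is what makes the step function convenient.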
48
Failure Rate Models (cont.)
  • Here the discrete failure-rate model is defined
    by

49
A lifetime experiment
[Figure: timeline starting at t = 0 showing the lifetimes $X_1, X_2, X_3, X_4, \ldots, X_N$ of components 1 through N.]
N i.i.d. components are put in a life test
experiment.
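A life test of this kind can be simulated to see how the observed lifetimes relate to the reliability function. The sketch below assumes exponentially distributed lifetimes and a fixed random seed (both are illustrative choices) and estimates R(t) as the fraction of components still working at time t.

```python
import math
import random

def simulate_life_test(n, lam, seed=0):
    """Draw n i.i.d. exponential lifetimes with failure rate lam."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) for _ in range(n)]

def empirical_reliability(lifetimes, t):
    """Fraction of components whose lifetime exceeds t."""
    return sum(1 for x in lifetimes if x > t) / len(lifetimes)

lam = 1.0            # hypothetical failure rate
lifetimes = simulate_life_test(10_000, lam)
t = 1.0
r_hat = empirical_reliability(lifetimes, t)
print(f"estimated R({t}) = {r_hat:.3f}, exact = {math.exp(-lam * t):.3f}")
```

With N components the estimate fluctuates around the true R(t), and the fluctuation shrinks as N grows.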
50
A lifetime experiment
[Figure: the lifetimes $X_1, X_2, X_3, X_4, \ldots, X_N$ of components 1 through N.]
51
Repairable systems: Availability
52
Repairable systems
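[Figure: UP/DOWN state of the system versus t, alternating between UP periods $X_1, X_2, X_3$ and DOWN periods $Y_1, Y_2$.]
$X_1, X_2, \ldots, X_n$: successive UP times
$Y_1, Y_2, \ldots, Y_n$: successive DOWN times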
53
Repairable systems
  • The usual hypothesis in modeling repairable
    systems is that
  • The successive UP times $X_1, X_2, \ldots, X_n$ are
    i.i.d. random variables, i.e. samples from a
    common cdf F(t)
  • The successive DOWN times $Y_1, Y_2, \ldots, Y_n$ are
    i.i.d. random variables, i.e. samples from a
    common cdf G(t)

54
Repairable systems
[Figure: UP/DOWN state of the system versus t, alternating between UP periods $X_1, X_2, X_3$ and DOWN periods $Y_1, Y_2$.]
  • The dynamic behaviour of a repairable system is
    characterized by
  • the r.v. X of the successive up times
  • the r.v. Y of the successive down times

55
Maintainability
  • Let Y be the r.v. of the successive down times
  • $G(t) = \Pr\{Y \le t\}$ (maintainability)
  • $g(t) = \frac{dG(t)}{dt}$ (density)
  • $h_g(t) = \frac{g(t)}{1 - G(t)}$ (repair rate)
  • $\mathrm{MTTR} = \int_0^\infty t\, g(t)\, dt$ (Mean Time To
    Repair)

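The repair rate and the MTTR can be evaluated numerically from the down-time density. This is a minimal sketch assuming exponentially distributed repair times with a hypothetical rate `mu`; the trapezoidal integrator and its truncation point `upper` are illustrative choices.

```python
import math

def mttr_numeric(g, upper, steps=100_000):
    """Trapezoidal approximation of the integral of t * g(t) over [0, upper]."""
    dt = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * t * g(t) * dt
    return total

def repair_rate(t, g, G):
    """h_g(t) = g(t) / (1 - G(t))."""
    return g(t) / (1.0 - G(t))

mu = 0.25  # hypothetical repair rate, in repairs per hour

def g(t):
    """Down-time density of the assumed exponential repair time."""
    return mu * math.exp(-mu * t)

def G(t):
    """Maintainability (cdf of the down time)."""
    return 1.0 - math.exp(-mu * t)

mttr = mttr_numeric(g, upper=200.0)
print(f"MTTR = {mttr:.4f} hours (exact value 1/mu = {1.0 / mu})")
print(f"repair rate at t = 3: {repair_rate(3.0, g, G):.4f}")
```

For the exponential case the repair rate is constant and the MTTR equals 1/mu; other down-time distributions can be tried by swapping `g` and `G`.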
56
Availability
The measure to characterize a repairable system
is the availability (unavailability)
The availability A(t) of an item at time t is the
probability that the item is correctly working at
time t.
57
Availability
  • The measure to characterize a repairable system
    is the availability (unavailability)
  • $A(t) = \Pr\{\text{system UP at time } t\}$
  • $U(t) = \Pr\{\text{system DOWN at time } t\}$
  • $A(t) + U(t) = 1$

58
Definition of Availability
  • An important difference between reliability and
    availability is
  • reliability refers to failure-free operation
    during an interval (0, t)
  • availability refers to failure-free operation at
    a given instant of time t (the time when a
    device or system is accessed to provide a
    required function), independently of the number
    of failure/repair cycles.

59
Definition of Availability
[Figure: System Failure and Restoration Process. The indicator function I(t) equals 1 while the system is operating and providing a required function (working) and 0 while it is failed and being restored.]
60
Availability evaluation
  • In the special case when times to failure and
    times to restoration are both exponentially
    distributed, the alternating process can be
    viewed as a two-state homogeneous Continuous Time
    Markov Chain

Time-independent failure rate: $\lambda$
Time-independent repair rate: $\mu$
61
2-State Markov Availability Model
  • Transient Availability analysis
  • for each state, we apply a flow balance equation
  • Rate of buildup = rate of flow IN - rate of flow
    OUT
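The flow-balance rule can be turned into a short numerical experiment. The sketch below assumes a failure rate `lam`, a repair rate `mu`, and a system that starts in the UP state (names and values are hypothetical); it integrates the balance equation for the UP state with small Euler steps and compares the result with the standard closed-form solution for the two-state chain, A(t) = mu/(lam+mu) + lam/(lam+mu) * exp(-(lam+mu) t).

```python
import math

def availability_closed_form(t, lam, mu):
    """Standard transient availability of the two-state model, starting UP."""
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)

def availability_euler(t_end, lam, mu, steps=100_000):
    """Integrate dP_up/dt = mu * P_down - lam * P_up with explicit Euler steps."""
    dt = t_end / steps
    p_up, p_down = 1.0, 0.0
    for _ in range(steps):
        # Rate of buildup of UP = flow IN (repairs) - flow OUT (failures).
        change = (mu * p_down - lam * p_up) * dt
        p_up += change
        p_down -= change
    return p_up

lam, mu = 0.01, 0.5   # hypothetical rates per hour
t = 5.0
a_exact = availability_closed_form(t, lam, mu)
a_euler = availability_euler(t, lam, mu)
print(f"A({t}) closed form = {a_exact:.6f}, Euler = {a_euler:.6f}")
print(f"limit for large t  = {mu / (lam + mu):.6f}")
```

Explicit Euler is used only because it mirrors the balance equation line by line; a production model would normally rely on the closed form or a proper ODE solver.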

62
2-State Markov Availability Model
63
2-State Markov Availability Model
[Figure: plot of A(t) versus t, starting at 1 and approaching the steady-state value $A_{ss}$.]
64
2-State Markov Model
1) Pointwise availability A(t)
2) Steady-state availability: the limiting value of A(t) as $t \to \infty$
  • If there is no restoration ($\mu = 0$), the
    availability becomes the reliability: $A(t) = R(t)$

65
Steady-state Availability
  • Steady-state availability
  • In many system models, the limit $A_{ss} = \lim_{t \to \infty} A(t)$
    exists and is called the steady-state availability

The steady-state availability represents the
probability of finding a system operational after
many fail-and-restore cycles.
66
Steady-state Availability
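[Figure: UP (1) and DOWN (0) state of the system versus t.]
Expected UP time: $E[U(t)] = \mathrm{MUT} = \mathrm{MTTF}$
Expected DOWN time: $E[D(t)] = \mathrm{MDT} = \mathrm{MTTR}$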
67
Availability Example (I)
Let a system have a steady-state availability
$A_{ss} = 0.95$. This means that, given a mission time T,
it is expected that the system works correctly
for a total time of 0.95 T. Or, alternatively,
it is expected that the system is out of service
for a total time $U_{ss}\, T = (1 - A_{ss})\, T$.
68
Availability Example (II)
Let a system have a rated productivity of W
/year. The loss due to the system being out of service can
be estimated as $U_{ss}\, W = (1 - A_{ss})\, W$. The
availability (unavailability) is an index to
estimate the real productivity, given the rated
productivity.
Alternatively, if the goal is to have a net
productivity of W /year, the plant must be
designed such that its rated productivity W'
satisfies $A_{ss}\, W' = W$.
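The two availability examples amount to simple arithmetic. The sketch below uses the 0.95 steady-state availability from Example (I); the mission time of 8,760 hours and the production figure of 1,000 units per year are hypothetical, and the net productivity is taken to be the availability times the rated productivity.

```python
def expected_downtime(a_ss, mission_time):
    """Expected total out-of-service time: (1 - A_ss) * T."""
    return (1.0 - a_ss) * mission_time

def productivity_loss(a_ss, rated_productivity):
    """Expected loss due to unavailability: (1 - A_ss) * W."""
    return (1.0 - a_ss) * rated_productivity

def required_rated_productivity(a_ss, net_target):
    """Rated productivity needed so that A_ss times rated equals the net target."""
    return net_target / a_ss

a_ss = 0.95
hours_per_year = 8760.0   # hypothetical mission time T
target = 1000.0           # hypothetical production units per year

downtime = expected_downtime(a_ss, hours_per_year)
loss = productivity_loss(a_ss, target)
rated = required_rated_productivity(a_ss, target)
print(f"expected downtime : {downtime:.1f} hours per year")
print(f"productivity loss : {loss:.1f} units per year")
print(f"rated productivity: {rated:.1f} units per year for a net of {target:.0f}")
```

Note that the required rated productivity is target / A_ss, which is slightly more than target * (1 + U_ss).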
69
Availability
We can show that
$A_{ss} = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}$
This result is valid without making any
assumptions on the form of the distributions of
times to failure and times to repair.
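The steady-state availability is commonly computed as MTTF / (MTTF + MTTR). The sketch below applies that ratio to hypothetical figures and, as a consistency check for the exponential case, compares it with mu / (lam + mu) where lam = 1/MTTF and mu = 1/MTTR.

```python
import math

def steady_state_availability(mttf, mttr):
    """A_ss = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

# Hypothetical figures: one failure every 1,000 hours, 10 hours to restore.
mttf, mttr = 1000.0, 10.0
a_ss = steady_state_availability(mttf, mttr)

# Same value expressed with rates, as in a two-state Markov model.
lam, mu = 1.0 / mttf, 1.0 / mttr
a_ss_rates = mu / (lam + mu)

print(f"A_ss = {a_ss:.6f}")
print(f"U_ss = {1.0 - a_ss:.6f}")
print(f"agrees with mu / (lam + mu): {math.isclose(a_ss, a_ss_rates)}")
```

Only the two means enter the formula, which is why halving the MTTR improves availability as much as doubling the MTTF.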
70
Motivation: High Availability
71
Maintainability
  • MDT (Mean Down Time or MTTR - mean time to
    restoration).
  • The total down time (Y ) consists of
  • Failure detection time
  • Alarm notification time
  • Dispatch and travel time of the repair person(s)
  • Repair or replacement time
  • Reboot time

72
Maintainability
  • The total down time (Y ) consists of
  • Logistic time
  • Administrative times
  • Dispatch and travel time of the repair
    person(s)
  • Waiting time for spares, tools
  • Effective restoration time
  • Access and diagnosis time
  • Repair or replacement time
  • Test and reboot time

73
Maintenance Costs
  • The total cost of a maintenance action consists
    of
  • Cost of spares and replaced parts
  • Cost of person-hours for repair
  • Down-time cost (loss of productivity)
  • The down-time cost (due to a loss of
    productivity) can be the most relevant cost
    factor.

74
Maintenance Policy
  • A maintenance policy is the sequence of actions
    that minimizes the total cost related to down time
  • Reactive maintenance
  • maintenance action is triggered by a failure.
  • Proactive maintenance
  • preventive maintenance policy.