Safety Related Systems - PowerPoint PPT Presentation

1
Safety Related Systems
  • The second of four lectures on the real world of
    computing.
  • Martyn Thomas

2
En Route ATC at Swanwick
3
Airspace
4
Control Room
5
RS 6000 workstations
6
A medium sized system
  • 114 controller workstations
  • 20 supervisory/management positions
  • 10 engineering positions
  • 48-workstation simulator
  • 2 15-workstation test systems
  • 2.5 million lines of software
  • >500 processors

7
Development
  • Project start 1989
  • Planned operational date 1996
  • Actual operational date, Jan 27th 2002
  • but Mitre Corp forecast 13 years
  • Safety Case audited by CAA
  • Top hazards: radio failure; plausible but incorrect data

8
Operational data
  • 1,667,381 flights in 2002
  • Continuous operation,
  • one 3-hour failure
  • other flight delays caused by NAS failures at
    West Drayton
  • 10Mb ARM data collected each minute - key
    measures better than forecast.

9
Challenges for the future
  • Current ATC safety depends on the controllers'
    ability to clear their sector with radio only.
  • Future traffic growth requires higher densities;
    controllers will not be able to maintain a mental
    picture of the traffic.
  • So future ATC will depend on automatic systems,
    which must not fail.
  • Target? At least the avionics standard: 10^-8 pfh
    (probability of failure per hour)
  • No current air traffic management systems are
    built to such standards.

10
Some safety-critical systems
  • Medical radiotherapy systems
  • Therac-25 deaths
  • Nuclear power-station control/shutdown
  • Avionics (TCAS, A320, Boeing 777, ...)
  • Railway signalling
  • Weapons systems (torpedo, Vincennes Aegis)
  • Control systems (dams, mines, Thames barrier etc)

11
Safety principles: ALARP
[Figure: ALARP triangle, with risk decreasing towards the narrow end]
  • Tolerable only if further risk reduction is not
    practicable (i.e. impossible, or unreasonably
    expensive).
  • ALARP region: the risk may be tolerable if the
    benefit is sufficiently great to justify it, and
    if the risk has been reduced As Low As Reasonably
    Practicable.
  • The lower the risk, the less it is practicable to
    spend on reducing it further. This reducing
    pressure to improve is represented by the shape
    of the triangle.
  • Broadly acceptable region: no detailed
    justification required.
  • Negligible risk.
12
Risk: defined in IEC 61508 Part 4 as the
probable rate of occurrence of a hazard causing
harm and the degree of severity of harm.

                    CONSEQUENCE
FREQUENCY     Catastrophic  Critical  Marginal  Negligible
Frequent      I             I         I         II
Probable      I             I         II        III
Occasional    I             II        III       III
Remote        II            III       III       IV
Improbable    III           III       IV        IV
Incredible    IV            IV        IV        IV

  • I - intolerable risk
  • II - undesirable risk, tolerable only if risk
    reduction is impractical or if the costs are
    grossly disproportionate to the improvement gained
  • III - tolerable risk if the cost of the risk
    reduction would exceed the improvement gained
  • IV - negligible (acceptable) risk
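The matrix above is a plain lookup from a (frequency, consequence) pair to a risk class. A minimal sketch of that lookup follows; the names `RISK_MATRIX` and `risk_class` are illustrative and do not come from IEC 61508.

```python
# Columns of the matrix, in the order used on the slide.
CONSEQUENCES = ("Catastrophic", "Critical", "Marginal", "Negligible")

# One row per frequency category; entries are risk classes I to IV.
RISK_MATRIX = {
    "Frequent":   ("I",   "I",   "I",   "II"),
    "Probable":   ("I",   "I",   "II",  "III"),
    "Occasional": ("I",   "II",  "III", "III"),
    "Remote":     ("II",  "III", "III", "IV"),
    "Improbable": ("III", "III", "IV",  "IV"),
    "Incredible": ("IV",  "IV",  "IV",  "IV"),
}


def risk_class(frequency: str, consequence: str) -> str:
    """Return the risk class (I to IV) for a hazard."""
    row = RISK_MATRIX[frequency]
    return row[CONSEQUENCES.index(consequence)]


print(risk_class("Remote", "Catastrophic"))  # II
```

A hazard judged Remote but Catastrophic therefore lands in class II: undesirable, and tolerable only if further reduction is impractical or grossly disproportionate in cost.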
13
Safety Integrity Levels (SILs)
SIL   Continuous / High Demand Mode (pfh)
4     >= 10^-9 to < 10^-8
3     >= 10^-8 to < 10^-7
2     >= 10^-7 to < 10^-6
1     >= 10^-6 to < 10^-5
IEC 61508 indicative probabilities
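As a rough illustration, the bands in the table can be written as a lookup from a claimed failure rate to a SIL. This is only a sketch of the table as shown on the slide; `SIL_BANDS` and `sil_for_pfh` are hypothetical names, and returning `None` outside the tabulated range is an assumption of this example.

```python
from typing import Optional

# (SIL, lower bound inclusive, upper bound exclusive), in failures per hour.
SIL_BANDS = (
    (4, 1e-9, 1e-8),
    (3, 1e-8, 1e-7),
    (2, 1e-7, 1e-6),
    (1, 1e-6, 1e-5),
)


def sil_for_pfh(pfh: float) -> Optional[int]:
    """Return the SIL whose band contains pfh, or None if no band does."""
    for sil, low, high in SIL_BANDS:
        if low <= pfh < high:
            return sil
    return None


print(sil_for_pfh(5e-9))  # 4
print(sil_for_pfh(1e-4))  # None
```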
14
Assurance
  • Assurance: showing that a system has the required
    safety
  • Much harder than just developing a system that is
    safe enough
  • What evidence is sufficient?
  • How safe is a system that has never failed?
  • What evidence does testing provide?
  • How can we do better?

15
How safe is a system that has never failed?
  • If it has run for n hours without failure, and if
    the operating conditions remain much the same,
    the best estimate for the probability of failure
    in the next n hours is
  • 0.5
  • So, to show that a system has a pfh of < 10^-4 with
    50% confidence, we need about 14 months of
    fault-free testing.
  • 10,000 hours is 13.89 months
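The slide does not say which statistical model gives the 0.5 figure. One simple model that reproduces it, assumed here purely for illustration, is a constant failure rate with a uniform prior on that rate: after t failure-free hours, the predictive probability of at least one failure in the next s hours is s / (t + s), which is 0.5 when s = t. The sketch below also checks the hours-to-months arithmetic, assuming 30-day months; both function names are hypothetical.

```python
def prob_failure_next(hours_run: float, hours_ahead: float) -> float:
    """Predictive probability of at least one failure in the next
    hours_ahead hours, given hours_run failure-free hours.

    Assumes a constant failure rate with a uniform prior on the rate
    (an assumption of this sketch, not stated on the slide).
    """
    return hours_ahead / (hours_run + hours_ahead)


def hours_to_months(hours: float, days_per_month: float = 30.0) -> float:
    """Convert continuous running hours to months of a fixed length."""
    return hours / 24.0 / days_per_month


print(prob_failure_next(10_000, 10_000))   # 0.5
print(round(hours_to_months(10_000), 2))   # 13.89
```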

16
What evidence does testing provide?
  • Testing shows the presence, not the absence, of
    bugs - Dijkstra
  • We cannot test every path.
  • Testing functions, or boundary conditions, may
    find faults, but tests that work provide no
    evidence of pfh.
  • Statistical testing, under operational
    conditions, provides evidence of pfh.
  • But it takes a very long time.

17
Statistical testing
  • To show an MTBF of n hours, with 99% confidence,
    takes around 10n hours of testing with no faults
    found. So to show the SIL 4 claim (10^-8 pfh)
    takes around 10^9 hours (> 100,000 years).
  • With good prior evidence, e.g. from a strong
    process, using a Bayesian approach may reduce
    this to < 10,000 years
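The arithmetic behind these figures can be sketched as follows. The first function assumes a constant failure rate, under which the failure-free test time needed for confidence c is -ln(1 - c) times the target MTBF (about 4.6n for 99%); the slide's "around 10n" is a rounder, more conservative multiplier. The function name and the 365.25-day year are assumptions of this sketch.

```python
import math

HOURS_PER_YEAR = 24 * 365.25


def failure_free_hours(mtbf_hours: float, confidence: float) -> float:
    """Failure-free test hours t such that a system whose true MTBF is
    at most mtbf_hours would survive t hours with probability no
    greater than 1 - confidence (constant-failure-rate assumption)."""
    return -math.log(1.0 - confidence) * mtbf_hours


mtbf = 1e8                                      # SIL 4 claim: 10^-8 pfh
model_hours = failure_free_hours(mtbf, 0.99)    # roughly 4.6e8 hours
slide_hours = 10 * mtbf                         # the slide's "around 10n"

print(round(model_hours / HOURS_PER_YEAR))      # tens of thousands of years
print(round(slide_hours / HOURS_PER_YEAR))      # 114077
```

Either way the conclusion is the same: the test duration is far beyond anything achievable in practice.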

18
"One can construct convincing proofs quite
readily of the ultimate futility of exhaustive
testing of a program and even of testing by
sampling. So how can one proceed? The role of
testing, in theory, is to establish the base
propositions of an inductive proof. You should
convince yourself, or other people, as firmly as
possible, that if the program works a certain
number of times on specified data, then it will
always work on any data. This can be done by an
inductive approach to the proof."
- C A R Hoare, 1969
19
The role of formal methods
  • Formal specifications make safety properties
    explicit and unambiguous.
  • Formal proof of safety invariants.
  • Formal demonstration of equivalence classes
    that can be tested.
  • Formal analysis of the impact of changes reduces
    the assurance effort after maintenance.

20
Dependability and strong software engineering
  • is the subject of next week's lecture.