
Safety Analysis: FMEA

Risk analysis
Lecture 8
Failure modes and effects analysis (FMEA)
• Why: to identify the contribution of component failures to
system failure
• How: progressively select the individual components or
functions within a system and investigate their possible
modes of failure
• Information analyzed: possible failure modes, possible
causes, local and system effects, and how to fix them
(remedial actions)
What is the proper level?
• Depends on the design stage: the analysis may be very
general or very detailed
• Hardware
– Down to off-the-shelf components
– Or to field-replaceable assemblies for which failure modes
are available
• Software as a single component
– Failure modes as worst possible effects
– Does not include humans

Example of hardware-oriented FMEA
Evaluation of FMEA
• “+”
• Makes it possible to identify redundancy, single-point
failures, inspection points, and how often the
system needs to be serviced
• The technique is complete
• “-”
• Time consuming
• Does not consider the effects of multiple or
common-cause failures
Some notes about FMEA
Hardware-oriented FMEA very often formulates
software requirements very vaguely, e.g.,
“modify software to detect failure”.
How to do it better?
Find a common model, i.e., a model that
serves as an intermediary between safety
analysis and software requirements
Example – a conveyor system

A conveyor system consists of a feed belt and an elevating rotary table. The feed belt
transports objects placed on its left end to the right end and then onto the table. The table
then elevates and rotates an object to make it available for processing by further
machines. The belt has photo-electric cells, which signal when an object has arrived at
either of its ends. The motor of the belt may be switched on and off: it has to be on while
waiting for a new object and has to be switched off when an object is at the end
of the belt but cannot be delivered onto the table because the table is not in the
proper position. The table lifts and rotates an object clockwise to a position for
further processing. When the object is taken, the table moves down and
counterclockwise to accept another object. For brevity we omit the details
of the table's implementation. We merely observe that the table moves between
two positions: one for loading an object from the feed belt and another for
unloading an object for further machines. Initially the table is in its loading
position and the feed belt is running empty while waiting for an object to be
placed.
Hazards:
– Object is jammed between the belt and the table
– Objects are piled up
Safety requirements
• Safety requirement 1: The feed belt may only
convey an object through its exit sensor if the
table is in the loading position.
• Safety requirement 2: A new object may only
be put on the feed belt after the exit sensor
confirms that the last one has arrived at the
end of the feed belt.
Statechart of fault free system
Safety requirements in terms of
statecharts
• Safety requirement 1:
– FB is in Delivering implies TAB is in
ReadyForLoading
• Safety requirement 2:
– EntrySen_On arrives only when FB is in VACANT
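
The two statechart conditions above can be read as simple predicates over the component states. The following is a minimal sketch in Python, not the notation used in the lecture: the class `SystemState`, the function names, and the table state `"Moving"` used in the usage note are assumptions made for illustration; only `Delivering`, `ReadyForLoading`, `VACANT` and `EntrySen_On` come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemState:
    """Hypothetical snapshot of the two component states."""
    fb: str   # feed belt state, e.g. "VACANT" or "Delivering"
    tab: str  # table state, e.g. "ReadyForLoading"


def requirement1_holds(state: SystemState) -> bool:
    # "FB is in Delivering implies TAB is in ReadyForLoading":
    # an implication A => B is the same as (not A) or B.
    return state.fb != "Delivering" or state.tab == "ReadyForLoading"


def requirement2_holds(state: SystemState, event: str) -> bool:
    # "EntrySen_On arrives only when FB is in VACANT":
    # any other event is irrelevant to this requirement.
    return event != "EntrySen_On" or state.fb == "VACANT"
```

For instance, `requirement1_holds(SystemState("Delivering", "Moving"))` evaluates to `False`, which corresponds to the jamming hazard: the belt delivers while the table is not in its loading position.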
Analysis
• The model of “reality” that the controller holds is
in fact a statechart model
• The controller keeps a record of the state and updates
it upon arrival of every event
• We introduce variables to model the states of
components
Example of “traditional” FMEA

Unit             Feed belt entry sensor
Failure mode     Stuck at zero
Possible cause   Primary sensor failure
Local effects    Sensor sends zero signal constantly
System effects   Arrival of an object is undetected. No control over the
                 distance between arriving objects. Danger of pile-up.
Remedial action  Ensure that the fault is always detected. Modify
                 software to detect the fault. If the fault occurs, switch
                 on the alarm and stop the system.
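
One way to keep such worksheet rows machine-readable is to store each row as a record whose fields mirror the table columns. This is only a sketch: the class name `FmeaEntry` and the helper `entries_for_unit` are assumptions introduced here, while the field values are copied from the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FmeaEntry:
    """One row of an FMEA worksheet (field names mirror the table columns)."""
    unit: str
    failure_mode: str
    possible_cause: str
    local_effects: str
    system_effects: str
    remedial_action: str


entry_sensor_stuck = FmeaEntry(
    unit="Feed belt entry sensor",
    failure_mode="Stuck at zero",
    possible_cause="Primary sensor failure",
    local_effects="Sensor sends zero signal constantly",
    system_effects="Arrival of an object is undetected; objects may pile up",
    remedial_action="Detect the fault in software, switch on alarm, stop the system",
)


def entries_for_unit(entries: List[FmeaEntry], unit: str) -> List[FmeaEntry]:
    # Hypothetical helper: collect every analysed failure mode of one unit.
    return [e for e in entries if e.unit == unit]
```

Keeping the rows as data makes it easier to check that every unit has been analysed and that every remedial action is traced to a software requirement.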
Formalizing FMEA
• The main idea is to use statecharts to express
how each error should be detected and
mitigated
• Two types of detection: Aberrant event and
Timeout
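
The two detection types can be illustrated with a small classifier. This is a minimal sketch under stated assumptions: the table `EXPECTED_EVENTS`, the function `detect_error` and its parameters are hypothetical and are not taken from the lecture's statecharts; only the state and event names reuse identifiers from the slides.

```python
from typing import Optional

# Hypothetical table: which events the controller expects in each state.
EXPECTED_EVENTS = {
    "Transporting": {"ExitSen_ON"},
    "Waiting": set(),  # motor is off, so no sensor event is expected
}


def detect_error(state: str, event: Optional[str],
                 timer_start: Optional[int], now: int,
                 max_time: int) -> Optional[str]:
    """Return the kind of error detected, or None if nothing is wrong."""
    # Aberrant event: an event arrived that the current state does not allow.
    if event is not None and event not in EXPECTED_EVENTS.get(state, set()):
        return "aberrant event"
    # Timeout: no event arrived and the active timer exceeded its limit.
    if event is None and timer_start is not None and now - timer_start > max_time:
        return "timeout"
    return None
```

An exit-sensor event while the belt is supposed to be waiting is classified as an aberrant event, whereas an object that never reaches the exit sensor within the allowed time is classified as a timeout.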
Examples of formalized FMEA
Statechart model of fail-safe conveyor system (see appendix 3)
Fail-safe systems

A fail-safe system is shut down upon detection of an error.

Fail-safe controller

Constants
/* Maximal time to reload an object from the feed belt to the table */
MaxDeliveringTime
/* Maximal time for an object to travel from the beginning to the end of the feed belt */
MaxTranspDelivTime

Procedures
/* Halts feed belt */
HaltFB = if FB=VACANT ∧ FB_st=VACANT
         then FB := HALTINGV || FB_st := HaltedVac
         elseif FB=FBLOADED ∧ FB_st=StartTransporting
         then FB := HALTINGL || FB_st := HStartTransporting
/* Immediately stops feed belt */
StopFB = if (FB=VACANT ∨ FB=FBLOADED)
         then FB := FBSTOPPED || FB_st := FBSTOPPED
/* Outputs message to an operator */
Warning (Msg: String) = output(Msg)

/* Timers are active when ON */
DeliveryTimer, TranspDelivTimer : {ON, OFF}
/* Time stamps record the time at which timers are activated */
DeliveryTimerStamp, TranspDelivTStamp : INT

/* Object arrived at the beginning -- activate timer TranspDelivTimer */
E=EntrySen_ON ∧ FB=VACANT ∧ FB_st=VACANT → E := NIL || FB := FBLOADED ||
   FB_st := StartTransporting || TranspDelivTimer := ON ||
   TranspDelivTStamp := t

/* Object arrived at the end -- deactivate timer TranspDelivTimer */
E=ExitSen_ON ∧ FB=FBLOADED ∧ FB_st=Transporting ∧ TAB=TVACANT ∧
   TAB_st=ReadyToLoad → FB_st := Delivering || TranspDelivTimer := OFF
[] E=ExitSen_ON ∧ FB=FBLOADED ∧ FB_st=Transporting ∧ (TAB=TVACANT ∨
   TAB=TLOADED) ∧ TAB_st ≠ ReadyToLoad → E := NIL || FB_st := Waiting ||
   TranspDelivTimer := OFF

/* Object passed the exit sensor while the motor was supposed to be OFF */
E=ExitSen_OFF ∧ FB=FBLOADED ∧ FB_st=Waiting → E := NIL || FB := FBFAILED
   || FB_st := FBFAILED || call Warning(“Feed belt motor fails to stop”) ||
   call StopTAB
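
To make the guarded commands easier to experiment with, they can be transcribed into an executable form. The following Python sketch covers only the transitions shown in this excerpt; the class `Controller`, the method `step`, the `warnings` list and the flag `tab_stop_requested` (standing in for `call StopTAB`, whose body is not shown) are assumptions for illustration. The transition from `StartTransporting` to `Transporting` is not part of the excerpt, so the usage note sets that state by hand.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Controller:
    # Variable names follow the pseudocode; initial values are assumed.
    FB: str = "VACANT"
    FB_st: str = "VACANT"
    TAB: str = "TVACANT"
    TAB_st: str = "ReadyToLoad"
    transp_deliv_timer: bool = False   # TranspDelivTimer (ON = True)
    transp_deliv_stamp: int = 0        # TranspDelivTStamp
    tab_stop_requested: bool = False   # placeholder for "call StopTAB"
    warnings: List[str] = field(default_factory=list)

    def step(self, event: str, t: int) -> None:
        """Apply at most one guarded command to the event received at time t."""
        if event == "EntrySen_ON" and self.FB == "VACANT" and self.FB_st == "VACANT":
            # Object arrived at the beginning: start the transport timer.
            self.FB, self.FB_st = "FBLOADED", "StartTransporting"
            self.transp_deliv_timer, self.transp_deliv_stamp = True, t
        elif (event == "ExitSen_ON" and self.FB == "FBLOADED"
              and self.FB_st == "Transporting"):
            if self.TAB == "TVACANT" and self.TAB_st == "ReadyToLoad":
                # Table is ready: deliver the object.
                self.FB_st = "Delivering"
                self.transp_deliv_timer = False
            elif self.TAB in ("TVACANT", "TLOADED") and self.TAB_st != "ReadyToLoad":
                # Table is not ready: the belt has to wait.
                self.FB_st = "Waiting"
                self.transp_deliv_timer = False
        elif event == "ExitSen_OFF" and self.FB == "FBLOADED" and self.FB_st == "Waiting":
            # Aberrant event: the object moved although the motor should be off.
            self.FB = self.FB_st = "FBFAILED"
            self.warnings.append("Feed belt motor fails to stop")
            self.tab_stop_requested = True
```

A short scenario: create `c = Controller()`, call `c.step("EntrySen_ON", 5)`, set `c.FB_st = "Transporting"` and `c.TAB_st = "Moving"` by hand (`"Moving"` is a made-up table state), then call `c.step("ExitSen_ON", 9)` followed by `c.step("ExitSen_OFF", 10)`. The controller ends in `FBFAILED` with the warning recorded.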
Conclusions on safety analysis
• Do not be scared by hardware terms in fault trees and FMEA! Your
knowledge of the simple failure modes of sensors and actuators that
we have already studied will be sufficient.
• But always deduce software requirements from safety analysis:
overlooked safety requirements are the greatest threat!
• Always start safety analysis by identifying hazards, i.e., by asking
yourself what you would like to prevent from happening in this system
• Draw a fault tree to analyse how it can happen
• Conduct FMEA to see which components, in which failure modes,
contribute to hazards
• Derive specifications of error detection procedures and remedial
actions, and make sure you implement them correctly!
Hazard and accident
• A hazard is a potential for an accident
• How do we judge whether a hazard is
acceptable?
• The importance of a hazard is related to the
accidents that may result from it.
• An accident is an unintended event or sequence of
events that causes death, injury, or environmental
or material damage

Risk
• Two factors are significant for an
accident:
– the potential consequences of any accident
that might result from the hazard
– the frequency (or probability) of such an
accident occurring
• Risk is a combination of the frequency
or probability of a specified hazardous
event, and its consequence.

Example
• Failure of a particular component is likely
to result in an explosion that could kill 100
people. It is estimated that this component
will fail once in every 10000 years. What is
the risk associated with that component?
• Risk = severity × frequency =
100 × 0.0001 = 0.01 deaths per year
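
The arithmetic of this example can be checked with a few lines of Python. This is only a sketch of the severity-times-frequency formula from the slide; the function name `risk` and its parameter names are chosen here for illustration.

```python
def risk(severity: float, frequency_per_year: float) -> float:
    """Risk as the product of severity and frequency of the hazardous event."""
    return severity * frequency_per_year


deaths_per_accident = 100          # severity: an explosion could kill 100 people
failures_per_year = 1 / 10_000     # frequency: one failure every 10000 years

component_risk = risk(deaths_per_accident, failures_per_year)
print(f"{component_risk:.2f} deaths per year")
```

The same function can be used to compare components: a component that kills one person but fails once in 100 years carries the same risk figure, even though the two situations differ greatly in severity.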

Categories of severity for military
systems
Category      Definition
Catastrophic  Multiple deaths
Critical      A single death, and/or multiple severe
              injuries or severe occupational illnesses
Marginal      A single severe injury or occupational
              illness, and/or multiple minor injuries or
              minor occupational illnesses
Negligible    At most a single injury or minor
              occupational illness

Accident probability ranges for
military systems
Accident frequency  Occurrence during operational life,
                    considering all instances of the system
Frequent            Likely to be continually experienced
Probable            Likely to occur often
Occasional          Likely to occur several times
Remote              Likely to occur some time
Improbable          Unlikely, but may exceptionally occur
Incredible          Extremely unlikely that the event will
                    occur at all

Risk classification

[Figure: the severity of the hazardous event and the frequency of the
hazardous event together determine the risk classification]
Why classify risks?
• Risks can be expressed qualitatively and
quantitatively
• Calculation of risk results in a risk class (or
risk level).
• Most standards define a number of risk
classes and then set out development and
design techniques appropriate for each
category of risk

Risk classes and interpretations for military
systems

As Low As Reasonably Practicable (ALARP) principle

The process of risk reduction

Difference between criticality of
systems
• Both an electric toaster and a nuclear reactor
protection system should be adequately safe, but
the meaning of “adequately” is different in
these two cases
• Hence the importance of safe operation differs
widely between applications
• Different safety requirements for different
projects mean that different levels of risk reduction
are required

