ARTICLE IN PRESS

Reliability Engineering and System Safety 95 (2010) 606–613

Contents lists available at ScienceDirect

Reliability Engineering and System Safety


journal homepage: www.elsevier.com/locate/ress

Application of Petri nets to reliability prediction of occupant safety systems with partial detection and repair
Andre Kleyner a,*, Vitali Volovoi b
a Delphi Corporation, Electronics and Safety Division, P.O. Box 9005, M.S. CTC 2E, Kokomo, IN 46904, USA
b School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Article info

Article history:
Received 17 July 2009
Received in revised form 15 January 2010
Accepted 19 January 2010
Available online 29 January 2010

Keywords:
Safety critical
Failure on demand
Occupant safety
Petri nets
System availability
Fault detection
Airbag electronics
IEC 61508
ISO 26262

Abstract

This paper presents an application of stochastic Petri nets (SPN) to calculate the availability of safety critical on-demand systems. Traditional methods of estimating system reliability include standards-based or field return-based reliability prediction methods. These methods do not take into account the effect of fault-detection capability and penalize the addition of detection circuitry due to the higher parts count. Therefore, calculating system availability, which can be linked to the system's probability of failure on demand (Pfd), can be a better alternative to reliability prediction. The process of estimating the Pfd of a safety system can be further complicated by the presence of system imperfections such as partial-fault detection by users and untimely or uncompleted repairs. Additionally, most system failures cannot be represented by Poisson process Markov chain methods, which are commonly utilized for the purposes of estimating Pfd, as these methods are not well-suited for the analysis of non-Poisson failures. This paper suggests a methodology and presents a case study of SPN modeling adequately handling most of the above problems. The model will be illustrated with a case study of an automotive airbag controller as an example of a safety critical on-demand system.

© 2010 Elsevier Ltd. All rights reserved.

* Corresponding author.
E-mail addresses: andre.v.kleyner@delphi.com (A. Kleyner), vitali.volovoi@ae.gatech.edu (V. Volovoi).
0951-8320/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ress.2010.01.008

1. Introduction

1.1. Reliability of safety-critical systems

Reliability of safety-critical systems receives special attention in many industries including automotive, aviation, energy and chemical. Examples of safety-critical systems include emergency power generators, fire alarms and occupant safety systems such as airbags, seatbelt pretensioners and knee bolsters. In many cases product specifications include the expected reliability or some other value reflecting the probability that the system will be operational when required. In some instances reliability numbers are legislated. For example, the International Electrotechnical Commission established an international standard, IEC 61508 [1], that applies to almost all electrical/electronic/programmable electronic safety-related systems, including the associated automotive industry standard ISO 26262. In the case of occupant safety systems such as airbags, reliability numbers are specified by the automotive OEMs and are treated as safety critical.

The ability to estimate the reliability of a safety system is critical to a successful system design, and there are various ways to approach this problem. A reliability prediction is one of the most common forms of reliability analysis for calculating failure rate and mean time between failures (MTBF). When actual product reliability data is not available, standards-based reliability predictions [2] may be used to evaluate design feasibility, compare design alternatives, identify potential failure areas, trade off system design factors and track reliability improvements [3]. However, reliability prediction of a safety system working in on-demand mode might require an approach involving the probability of the system being available on demand ($1 - P_{fd}$), where $P_{fd}$ is the probability of failure on demand.

A good example of an on-demand emergency system would be an occupant safety system such as an airbag controller, which includes functions such as crash sensing and generating the signal to deploy an airbag. Depending on design, an airbag controller may have a number of firing loops ranging from 2 to 24 or more (see for example [4]). Automotive occupant safety systems have been evolving for the past 20 years to reduce the number of automobile injuries and deaths. Initially, individual passive devices and features such as seat belts, airbags and knee bolsters were developed to help save lives and minimize injuries when an accident occurred. Today, heightened industry and consumer safety initiatives and increased government regulations strive to
provide increased protection to vehicle occupants under any condition.

In order to account for the fault detection and consequent repair of the system, the availability, or ($1 - P_{fd}$), should be evaluated instead of simple reliability. In addition, for safety-related systems, reliability requirements in product specifications are typically very high (0.9999 and higher), which corresponds to SIL 3–4 categories of IEC 61508 [1] or ASIL C–D of ISO 26262. Therefore, traditional reliability demonstration testing would be cost prohibitive due to the extremely large number of test samples required to demonstrate those kinds of numbers [5]. The only reasonable option to meet the specification would be to conduct a comprehensive modeling of the system availability. To this end, the use of stochastic Petri nets (SPN) can be suggested, as described next.

1.2. Reliability modeling using stochastic Petri nets

A graphical framework called Petri nets was introduced by C.A. Petri [6]. This framework focuses on modeling component states that comprise the system, so that the state of the system can be inferred from the states of its components. Possible states (called places) are denoted with circles, with the objects called tokens (denoted by small filled circles) occupying one of the places at a time. The combined position of all the tokens in a Petri net is referred to as marking. Possible paths of token movements among places are modeled using so-called transitions, depicted as filled rectangles. Movements of the tokens correspond to firing of transitions, where the tokens from all input places are removed, and tokens are deposited to all output places for this transition. Importantly, the firing of a transition can only occur when it is enabled, i.e., certain conditions are satisfied. For example, an inhibitor arc that connects a place anywhere in the model and a transition (the arc is depicted by using a hollow circle at the transition's end) can disable the transition if there is a sufficient number of tokens in the place (this place does not have to be either an input or an output place for this transition).

The original Petri net did not include the concept of time, so that enabled transitions fire immediately. Such Petri nets can be particularly useful in safety assessments as formal methods are available to analyze so-called reachability of undesirable (unsafe) states and identify non-trivial scenarios that can lead to unsafe states [7]. These scenarios are of great importance for safety and reliability as they are analogous to cut sets in fault trees in the dynamic context where the order of events is taken into account. However, the likelihood of those scenarios cannot be quantitatively evaluated without explicit account of timing. To this end, an extension called stochastic Petri nets (SPN) was developed some years later [8]; it is a subset of so-called non-autonomous Petri nets [9] and is of particular relevance to the modeling of time-dependent system reliability (see, for example [10,11]). SPN introduces delays between the enabling and firing of a transition; these delays are attributes of the transition and can be either absent, deterministic, or sampled from a given distribution (stochastic). An SPN model equivalent to the Markov representation can be obtained by using exponential distributions for firing delays. SPN is often used as a modeling preprocessor, so the model is internally converted to Markov state space and solved using standard Markov methods [12]. However, a discrete event (e.g. Monte Carlo) simulation can be used to solve SPN directly [13] as opposed to using the Markov method, which allows the use of non-exponential statistical distributions. Depending on the configuration of the system, the error due to the use of an exponential delay with the same mean value for non-exponential distributions can be quite significant [14,15].

If, as in colored Petri nets [16], tokens can have unique identities (labels), an alternative interpretation of firing facilitates the preservation of the information about the system's past states: rather than considering removing a token from the transition's input place and depositing a different token to the output place as two disjoint actions, one can unite these two actions into a single action of moving the same token from an input place to the output place. Memory can be assigned to tokens, which results in aging tokens [17]. Such tokens can move freely throughout the Petri net without losing their memory. While the proliferation of a great variety of versions and modeling styles used in SPN modeling can be construed as a testimony of its popularity and flexibility, it also creates confusion among reliability practitioners who are used to relatively rigid and standardized frameworks such as reliability block diagrams and fault trees. In this context, clarity and simplicity of modeling is of great importance [18], and the reader is invited to compare the models presented in this paper to the previously published models that address a similar application [19].

2. Model formulation

This section will present several modeling scenarios for the reliability/availability of an automotive occupant safety system simulated as an on-demand emergency system.

2.1. Modeling procedure

The function of the automotive occupant safety system is to provide an emergency function (e.g., airbag deployment) in response to an event such as a vehicle crash. The simplified version of an emergency on-demand system in Fig. 1 consists of the fault-detection system, power supply and user warning system, which can be as simple as a warning light.

Fig. 1. Simplified diagram of an automotive occupant safety system with fault detection.

The system's failure to perform its functions (i.e., to deploy an airbag in the case of a vehicle crash) can occur when the emergency system failure is combined with one of the following conditions:

1. System failure is not detected by the fault-detection system.
2. System detected the problem, but failed to notify the user.
3. System notified the user, but the user failed to take reparative action.
4. The repair was scheduled or initiated, but was not completed before the vehicle crash.
Condition 4 may occur because it takes a certain amount of time for the system to be repaired. In exponential form this feature is expressed through the repair rate $\mu$, the reciprocal of the mean time to repair (MTTR) [5].

It is important to note that most vehicles remain in use for extended time periods, often exceeding 10–15 years. It is also noteworthy that the human factor is involved in the key decision-making and repair processes. Therefore, the modeling of the safety system is further complicated by the list of factors classified as imperfections below:

1. Detectability of the system failure is less than 100%.
2. Emergency system power supply (e.g., vehicle battery) may fail during the crash.
3. Warning light can go unnoticed after the fault is detected (the human factor).
4. Repairs may not be initiated due to financial, timing, or other considerations. As an example, vehicle age and market value become important factors in the repair decision, where the percent of repaired vehicles diminishes as vehicle age increases and market value decreases.

Accounting for the factors above makes the modeling more challenging, but also more realistic.

2.2. System availability

If no repair of the safety system is considered, the reliability of the system by the end of the design life is $R(T)$. When system failures follow an exponential distribution, they can be represented by the constant failure rate $\lambda$. If all failures are detected and repaired with a mean time to repair MTTR $= 1/\mu$, where $\mu$ corresponds to a repair rate [20], the unavailability (probability that the system will not work on demand) can be well approximated by a steady-state solution, providing the following estimate:

$$P_{fd}^{s} = \frac{\lambda}{\lambda + \mu} \qquad (1)$$

where $P_{fd}^{s}$ is the probability of failure on demand (fd), the steady-state solution (s).

However, taking into account the considerations listed in Section 2.1, Eq. (1) might represent the system unavailability inaccurately. Because both detection and repairs take place less than 100% of the time, the real time-dependent probability of failure on demand $P_{fd}(t)$ will lie somewhere between this lower boundary and the unreliability of a system that never undergoes repairs:

$$P_{fd}^{s} \le P_{fd}(t) \le P_{fd}^{nr}(t) = 1 - R(t) \qquad (2)$$

where $P_{fd}^{nr}$ is the probability of failure on demand for the system that does not undergo repairs and $R(t)$ is the reliability of the system under the no-repair condition (conventional reliability function). Importantly, the bounds (2) are quite wide, which provides motivation for a more refined analysis.

Due to the factors listed as imperfections in Section 2.1, a certain percent of the vehicle population will not be subject to repair after the fault has been detected. In the simplest, static scenario, the total population of the system can be separated into two subpopulations based on whether a detected failure will be repaired or not. Let us define $\theta$ as the percent of the population of the vehicles subject to repair. This percent would include the drivers responding to the warning light as opposed to those who would ignore or not notice it. Next, we can consider the combined effects of failure of the detection system and the presence of a subpopulation that does not notice the warning or fails to act upon it. Consequently, a $(1 - \theta)$ fraction of systems will not be repaired after fault detection. Therefore, the overall probability of failure on demand under a perfect detection scenario, $P_{fd}^{pd}$, is easily calculated as

$$P_{fd}^{pd}(t) = (1 - \theta)\,[1 - R(t)] + \theta\,P_{fd}^{s}(t) \qquad (3)$$

where $P_{fd}^{pd}$ is the probability of failure on demand when all the failures are detected (perfect detection) and $1 - \theta$ is the portion of the population which would not repair the failed system due to economic reasons or a failure to notice the warning light.

It is important to note that in certain cases, the function $R(t)$ can be represented by a mixture of statistical distributions to reflect the change in failure rate, for example in accordance with the bathtub curve (see for example [21]). In those cases $R(t)$ should be addressed accordingly in the modeling process.

2.3. Dynamic modeling

Rather than separating the whole population into two sub-groups, let us assume that the decision as to whether to repair a detected failure is made every time the warning signal appears, with the probabilities $\theta$ and $1 - \theta$, respectively. This decision is considered to be independent of previous repair decisions for this system (e.g., the system that has been repaired the first time might not be repaired the second time). In addition, for the moment we consider that $\theta$ does not depend on time (this assumption will be relaxed later). While the difference is subtle, it results in a need for dynamic (i.e., state-space based) reliability modeling.

To this end, Markov analysis is widely used in modeling electronics reliability [12], but it has two well-recognized deficiencies. The first deficiency is related to the large number of possible system states (on the order of $k^n$, where $k$ is the number of possible states for each component and $n$ is the number of such components) that are needed to represent all possible permutations. Although this issue can be mitigated by the use of symmetry and hierarchical (nested) calculations, it remains an important limitation. The second limitation is the natural use of a constant transition rate (following an exponential distribution) due to the Markovian property.

To illustrate the dynamic solution, let us present the system described in Section 2.2 as a state-space solution (Fig. 2), i.e., a Markov chain.

Fig. 2. Markov chain for a system with imperfect detection ($\theta$: portion of the vehicles subject to repair).

Initially the system is in state A, which corresponds to a fully operational safety system with the detection system functioning as intended. Transitioning from state A to state B indicates that
the main function has failed with the corresponding transition rate $\lambda$, but the detection system is still operational, and hence the driver receives a warning. On the other hand, if the detection sub-system fails first (with the corresponding transition rate $\nu$), the system transitions to state D (note that the detection sub-system is considered to be non-repairable; this assumption can be relaxed, e.g., if periodic inspections are introduced). The transition from D to E corresponds to the failure of the main function of the safety system after the detection system has failed, so this failure cannot be detected and therefore is not repaired; hence there is no reverse transition from E to D. Once the system transitions into state B, a decision is made whether to repair the system or not, with the probabilities $\theta$ and $1 - \theta$, respectively. This decision is modeled using a fictitious transition rate $\psi$ that is very large (a specific value of $\psi$ is immaterial as long as it is several orders of magnitude larger than the other transition rates in the model). Assigning a transition rate $\psi\theta$ from state B to state C (ready to repair) and the rate $\psi(1 - \theta)$ from state B to state E (non-repairable system failure) ensures that B is a vanishing (transitional) state. A choice between repair (state C) and non-repair (state E) occurs with the desired probabilities.

Stochastic Petri nets are capable of addressing the main shortcomings of Markov chains. As mentioned in the Introduction, SPN focuses on modeling component states that comprise the system, so that the state of the system can be inferred from the states of its components rather than defined explicitly as required by Markov state space. Places in SPN are similar to Markov states, but SPN's tokens can represent individual components of the system and therefore allow differentiation among the state spaces for those components. As a result, marking (i.e., the combined position of all the tokens) provides a means to describe the system as a whole implicitly, without the need to explicitly depict the corresponding system state, thus potentially mitigating the state-space explosion. Effective system modeling using SPN involves its decomposition into a set of relevant entities, where each entity does not necessarily represent a physical component of the system, but describes a phase of operation or an environmental condition.

Fig. 3 provides the SPN model of the system shown in Fig. 2 as a Markov state space. The top two places describe the two possible states of the failure detection sub-system, while the bottom part describes the possible states of the main sub-system. Exponential delays with parameters $\lambda$, $\mu$, and $\nu$ are used for transitions ("System failure", "Repair", and "Det failure", respectively). If the detection system is operating normally (the token is in "Det system OK" place) when the main system fails (the token moves to "System failed" place), then two transitions are enabled at the same time (to "Ready to repair" and "No repair"). Just like in the Markov model (see Fig. 2), the decision is modeled by assigning those two transitions exponential rates $\psi\theta$ and $\psi(1 - \theta)$, respectively. However, when the detection system fails first, the corresponding token moves to "Det Sys Failed" and the inhibitor originating in that place prevents the transition of the system token to the "Ready to repair" place (the transition becomes disabled).

Fig. 3. SPN model for a system with imperfect detection and with a portion of the vehicles subject to repair.

2.4. Modeling time-dependent parameters of the problem related to vehicle aging

In the real world the owner's repair priorities often change with the vehicle age. With declining vehicle market value, the number of repairs considered by owners as non-essential increases. Since the cost of repair remains virtually the same while vehicle value declines, the number of owners who choose to ignore warning lights steadily increases with vehicle age when the problem is considered non-critical to vehicle performance.

In order to model that phenomenon we introduce here the renewal attrition function as a ratio of the number of repairs to the number of failures:

$$r(t) = \frac{\#\,\text{Parts Repaired}(t)}{\#\,\text{Parts Failed}(t)} \qquad (4)$$

The typical renewal attrition function will have the shape presented in Fig. 4, where $T_{Life}$ is the expected vehicle life (e.g., 10 years, 15 years, etc.) and $T_W$ is the warranty term duration (e.g., 3 years, 5 years, etc.). The assumption is made that while the vehicle is under warranty all the required repairs will be performed:

$$r(t) = \begin{cases} 1.0 & \text{when } t \le T_W \\ f(t) & \text{when } T_W < t \le T_{Life} \end{cases} \qquad (5)$$

The following conditions apply: $0 \le f(t) \le 1$ and $f(T_W) = 1$.

Fig. 4. Renewal attrition function.

Therefore, once past warranty $T_W$ the percent of the repaired population $\theta$ will be further reduced by the diminishing function $r(t)$, and (3) will take the form of

$$P_{fd}^{pd}(t) = (1 - \theta)\,[1 - R^{nr}(t)] + r(t)\,\theta\,P_{fd}^{s}(t) + [1 - r(t)]\,\theta\,[1 - R^{nr}(t)] \qquad (6)$$

and consequently

$$P_{fd}^{pd}(t) = \theta\,r(t)\,P_{fd}^{s}(t) + [1 - \theta\,r(t)]\,[1 - R^{nr}(t)] \qquad (7)$$

SPN provides a flexible tool to model even minute nuances of system behavior. To demonstrate this flexibility let us contrast the dynamic model in Fig. 3 with the model that represents a static choice (see Fig. 5). At the beginning of the simulation the token from place "Choice" can move either to "No repair" or "Repair"
place with the specified probabilities. If this token moves to "No repair" place, then the corresponding inhibitor precludes the repair, even if the detection system operates properly (the token in place "Det system OK" does not move). The results of the simulation using this SPN model are consistent with analytical Eq. (7). Finally, to demonstrate a more realistic SPN model, we incorporate both the possibility of attrition and demand (Fig. 6), where the choice whether to repair takes place dynamically. Here in place "Warranty" (the top portion of the model) the token represents the attrition: the "Warranty ends" transition is simply a fixed delay $T_W$. Note that if either of the top two tokens in Fig. 6 moves into "Repair Inhibited" by the time the system fails and the token moves from "System OK" to "System failed", then the inhibitor from "Repair Inhibited" prevents the system token from moving to "Ready to repair" place.

Fig. 5. SPN model describing the selection of detected failure for repair that takes place in a static manner (two sub-populations of drivers in regards to addressing the warning light).

Fig. 6. SPN model with attrition function and demand.

At the bottom of the model in Fig. 6, the demand for system operation is modeled. Here, the "Demand timing" transition can be assigned simply a uniform distribution, which would correspond to time averaging of the probability of failure on demand (accident), or any other statistical distribution appropriate for the task. When the token moves to "Demand" place there are two fixed transitions that have durations of $\varepsilon$ (an arbitrary small number) and $2\varepsilon$. If both those transitions are enabled the former will fire first and the token will move to "Success" place. On the other hand, if the first transition is disabled, then the token will move to "Failure" place. This transition should be disabled as long as the system token is not in "System OK" place, and we could have three inhibitors starting at the three other places where this token can be ("System failed", "Ready to Repair", and "No repair"). However, there is a more compact and direct way to model this situation by using a negative inhibitor (enabler) that acts in the opposite way of the regular inhibitor (and so it is denoted with a negative number). More precisely, additional conditions required for enabling a transition can be expressed by those enablers. Their action is opposite to that of a regular inhibitor: in the presence of a negative inhibitor of multiplicity $-k$ (here $k > 0$) the transition is only enabled if the number of tokens $n$ in the input place for that inhibitor satisfies $n \ge k$. In our case, for the transition from place "Demand" to place "Success" to be enabled, the enabler requires a token in "System OK" place.

2.5. Time averaging

As mentioned before, for an on-demand repairable system, the reliability function is of limited use and the availability function would be a more reasonable measure. However, because the system is not fully renewable and availability cannot reach a steady state, either a time averaging of availability should be considered or a full description of availability as a function of time should be given. Time averaging can be motivated by the following consideration: if a demand occurs at a random time (i.e., uniformly distributed throughout the life of the system), the relevant measure would be the probability that the system will be available on demand. This probability would be given by $1 - \bar{P}_{fd}$, where

$$\bar{P}_{fd} = \frac{1}{T} \int_{0}^{T} P_{fd}(t)\,dt \qquad (8)$$

While this formula can be used, SPN also provides an opportunity to simulate the demand directly as shown in Fig. 6. Moreover, non-uniform distributions can be used for the transition from the place "demand" if the demand for the safety-related system varies with time. When this transition fires the token into "System failed" place this token has a choice of moving either to "Failed on Demand" or "Success" depending on whether the inhibitor emanating from "System OK" place is engaged or not. This can be implemented by assigning the transition to "Success" a slightly smaller fixed delay as compared to the transition to "Failure", by assigning distinct colors for two tokens that can appear in the "System failed" place, and by implementing color-dependent transitions from this place. In Fig. 6 the difference between the delays of transitions to "Success" and to "Failure" is not important as long as the former fires first when both transitions are enabled. In the following case study the fixed delays $\varepsilon$ and $2\varepsilon$ are selected for transitions to "Success" and to "Failure" places, respectively, with an arbitrarily small value of $\varepsilon = 10^{-6}$ month.
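Eqs. (1), (7) and (8) combine into a short numerical calculation, sketched below in Python. This is a hedged illustration and not the SPN simulation itself: it assumes an exponential no-repair reliability $R(t) = e^{-\lambda t}$, a hypothetical linear attrition shape that satisfies the conditions on $f(t)$ in Eq. (5), and placeholder rates that are not taken from the paper. All function and variable names are invented for the example.

```python
import math


def steady_state_pfd(lam, mu):
    """Eq. (1): steady-state probability of failure on demand."""
    return lam / (lam + mu)


def pfd_with_attrition(t, lam, mu, theta, r):
    """Eq. (7), assuming an exponential no-repair reliability R(t) = exp(-lam * t).

    r is a callable returning the renewal attrition r(t) in [0, 1].
    """
    repaired_share = theta * r(t)
    unreliability = 1.0 - math.exp(-lam * t)
    return repaired_share * steady_state_pfd(lam, mu) + (1.0 - repaired_share) * unreliability


def time_average(f, horizon, steps=10_000):
    """Eq. (8): (1/T) times the integral of f over [0, T], by the midpoint rule."""
    h = horizon / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h / horizon


def linear_attrition(t, t_w=3.0, t_life=15.0, floor=0.5):
    """Assumed shape: 1 during warranty, then a straight line down to `floor` at t_life."""
    if t <= t_w:
        return 1.0
    return 1.0 - (1.0 - floor) * (t - t_w) / (t_life - t_w)


# Placeholder rates per year and repair fraction; illustrative only.
LAM, MU, THETA, T_LIFE = 0.01, 20.0, 0.9, 15.0
average = time_average(
    lambda t: pfd_with_attrition(t, LAM, MU, THETA, linear_attrition), T_LIFE
)
print(f"time-averaged Pfd = {average:.4e}")
```

Passing the attrition as a callable keeps the sketch independent of any particular $f(t)$; setting `theta` to 0 recovers the no-repair average, which has the closed form $1 - (1 - e^{-\lambda T})/(\lambda T)$ under the exponential assumption.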
3. Case study: automotive occupant safety system

The concept of an emergency, on-demand system is utilized in automotive safety systems and particularly in the design of an airbag controller unit. The original data in this case study has been modified to protect the proprietary nature of this information.

The modern airbag controller is a complicated electronic system containing crash sensors capable of detecting various types of crashes (e.g. side impact vs. front collision) and 4–24 firing loops. The number of firing loops depends on the occupant safety options of the vehicle such as driver and passenger airbags, side curtains, rear passenger protection, belt pretensioners and dual-phase deployment. On-time deployment triggered by the vehicle crash is a safety-critical feature of a controller [22]; therefore system reliability requirements are high and, depending on a specific automotive customer, could range from 0.9999 to 0.999999. Each modern airbag controller is equipped with a fault-detection circuit that detects a system failure and triggers a warning such as a light indicator to alert the driver. The subsequent action may be either to repair or replace the faulty component [19]. Conversely, the vehicle owner may not act on the warning due to either inability to heed the warning in a timely manner or a conscious decision not to repair the system for financial or other reasons.

In order to obtain a renewal attrition function for an airbag controller, an analysis of warehouse shipping history for this product was conducted (the details of this method are outside the scope of this paper). The following function was obtained:

$$r(t) = \begin{cases} 1.0 & \text{when } t \le T_W \\ A\,e^{-Bt} & \text{when } T_W < t \le T_{Life} \end{cases} \qquad (9)$$

where $A = 1.0942$, $B = 0.03$, $t$ is the time in years of service, and $T_W$ is the 3-year warranty.

Since an airbag is a typical on-demand repairable system, we would need to estimate its probability of failure on demand or availability instead of utilizing the traditional reliability function. Let us consider an example where demonstrated reliability for this system is 0.95 for 15 years. Assuming an exponential distribution, the corresponding constant failure rate is $\lambda = 3.42 \times 10^{-3}$ per year. Let us further consider a repair with the mean value of 14 days, which corresponds to the equivalent constant repair rate $\mu = 26.07143$ per year. If all the failures are detected and repaired, the unavailability (probability that the airbag will not work on demand) can be well approximated by a steady-state solution:

$$P_{fd}^{s} = \frac{\lambda}{\lambda + \mu} = 1.31 \times 10^{-4} \qquad (10)$$

In reality both detection and repairs take place less than 100% of the time, so this probability will be somewhere between this lower boundary and the end-of-life unreliability of a system that never undergoes repairs, per (2):

$$P_{fd}^{s} = 1.31 \times 10^{-4} < P_{fd} < 1 - R(15\ \text{years}) = 0.05 \qquad (11)$$

In order to model the imperfections listed in Section 2.1 let us assume $\theta = 0.98$, meaning that 98% of the population will decide to repair the faulty system. Fig. 7 shows a comparison of dynamic and static scenarios for this value. One can observe that the dynamic scenario shows a slightly higher probability of failure. Qualitatively this can be explained by the fact that the two populations are separated at the beginning (static scenario); the sub-group that makes the repairs is less likely to fail and therefore the effective, dynamic fraction of this population will be slightly higher than $\theta = 0.98$. The effect is minimal for the presented values and therefore can be neglected.

Fig. 7. Comparison of several approaches to account for the fact that only a $\theta = 0.98$ fraction of all detected failures are repaired: static (when two populations are separated in the beginning) and dynamic (when the decision is made upon demand).

Please note that the results shown in Fig. 7 for the dynamic model are presented using both Markov chains and SPN (obtained using 100 million Monte Carlo runs). The result for the scenario where all detected failures are repaired is also provided for reference purposes.

In some instances, the difference between the static and dynamic models can be more significant. For the case where demonstrated reliability $R(15\ \text{years}) = 0.5$ and $\theta = 0.5$ (a hypothetical scenario), the impact will be quite noticeable (see Fig. 8).

Fig. 8. Difference between two scenarios: static (when two populations are separated in the beginning) and dynamic (when the decision is made upon demand).

Please note that while the Markov model provides the description of the system as a whole, SPN focuses on system component behavior. If constant transition rates are used the results should be identical (see for example Fig. 7, where the results by SPN and Markov chains are practically indistinguishable for dynamic modeling). However, it is quite difficult to directly implement in the Markov chain model a static subdivision into two populations to provide a model analogous to the one given in Fig. 5.

Another important advantage of SPN is its ability to model transitions that have variable rates when simulation is used to
evaluate the model. To demonstrate this capability, let us investigate how the probability of failure on demand changes as various parameters of the failure distribution are considered. Specifically, let us compare Weibull distributions with shape parameters β = 1 (exponential), 2, and 3 (wear-out mode).
The corresponding scale parameters are calculated to match the reliability R(T) = 0.95 if no repairs are possible. This yields Weibull characteristic life of η = 292.436, 66.2309, and 40.3711 years, respectively. Larger values of β imply that the failures are relatively more likely to occur later rather than sooner, so if the reliability at the end of design life is matched and no repairs take place, the larger β implies smaller Pfd. Those values are: Pfd = 0.0252, 0.0168, and 0.0126, respectively. The results for the scenarios with no attrition and the decision to repair made dynamically when the failures are detected with the probability θ = 0.98 are shown in Fig. 9.
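The scale parameters and no-repair values quoted above can be checked with a short calculation. The sketch below is not the authors' SPN model. It assumes a design life of T = 15 years (taken from the 0.95-for-15-years example) and assumes that the no-repair Pfd is the time-averaged unreliability over the design life. The function names are illustrative.

```python
import math

T = 15.0                  # design life in years (from the example in the text)
R_T = 0.95                # reliability to be matched at time T
BETAS = (1.0, 2.0, 3.0)   # Weibull shape parameters compared in the text


def weibull_scale(beta, t=T, r=R_T):
    """Scale eta such that exp(-(t / eta) ** beta) equals r."""
    return t / (-math.log(r)) ** (1.0 / beta)


def mean_pfd_no_repair(beta, eta, t_end=T, steps=20000):
    """Time-averaged unreliability over [0, t_end] (trapezoidal rule).

    ASSUMPTION: with no repairs, Pfd is read as the average of
    F(t) = 1 - exp(-(t / eta) ** beta) over the design life.
    """
    h = t_end / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        f = 1.0 - math.exp(-((t / eta) ** beta))
        total += f if 0 < i < steps else 0.5 * f
    return total * h / t_end


for beta in BETAS:
    eta = weibull_scale(beta)
    pfd = mean_pfd_no_repair(beta, eta)
    print(f"beta={beta:.0f}  eta={eta:.4f} years  mean Pfd={pfd:.4f}")
```

For β = 1 the scale parameter is the reciprocal of the constant failure rate, so 1/292.436 ≈ 3.42 × 10⁻³ per year, which agrees with the exponential case discussed earlier. Under the averaging assumption, the computed means round to the three values quoted in the text.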
Note that unlike the cases with no repairs, the probability of failure on demand at the end of the design life increases with β rather than remaining relatively the same. However, the average value of probability of failure on demand still decreases; those values are: Pfd = 8.12 × 10⁻⁴, 6.33 × 10⁻⁴, and 5.28 × 10⁻⁴, for β = 1 (exponential), 2, and 3, respectively.

Fig. 11. Comparing probability of failure on demand as a function of time for different failure models with attrition that provide the same reliability if no repairs take place. Weibull with shape parameters β = 1 (exponential), 2, and 3.
Next, let us consider the effect of attrition and focus first on the exponential failures (see Fig. 10).
Note that for the first 3 years there is no difference in the results, since r(t) = 1 within the warranty period, see Eq. (9). Finally, let us observe how changing the assumptions about the failures impacts the probability of failure on demand. Using the same assumptions as above for the models without attrition, we can observe (see Fig. 11) that the negative impact of attrition increases with the value of the shape parameter β. Indeed, even the average value of probability of failure on demand will not always decrease; the corresponding values are: Pfd = 2.54 × 10⁻³, 2.68 × 10⁻³, and 2.51 × 10⁻³, for β = 1 (exponential), 2, and 3, respectively.
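The attrition function of Eq. (9) is simple to evaluate directly, which also explains why the results coincide during the first 3 years. The following is a minimal sketch. It assumes that the exponent in Eq. (9) is negative (this makes r(t) continuous at the warranty boundary) and that T_Life equals the 15-year design life used in the example. `T_LIFE` and the function name are illustrative.

```python
import math

A = 1.0942      # coefficient from Eq. (9)
B = 0.03        # decay constant per year of service, from Eq. (9)
T_W = 3.0       # warranty period in years
T_LIFE = 15.0   # ASSUMPTION: design life, taken from the 15-year example


def attrition(t):
    """Renewal attrition r(t) of Eq. (9), with t in years of service."""
    if not 0.0 <= t <= T_LIFE:
        raise ValueError("t must lie in [0, T_LIFE]")
    if t <= T_W:
        return 1.0                  # no attrition within the warranty period
    return A * math.exp(-B * t)     # exponential decay after the warranty


for year in (1, 3, 5, 10, 15):
    print(f"t = {year:2d} years  r(t) = {attrition(year):.4f}")
```

With these constants A·exp(−B·T_W) is approximately 1, so the two branches join smoothly at t = T_W.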
In the cases where power source survival (vehicle battery) is a design concern (see Fig. 1) its effect on the model can be easily accounted for by multiplying the probability of successful airbag deployment (1 − Pfd) by the probability of battery survival during the crash.
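Written as a formula, the battery adjustment is a single product. The symbols P_deploy and P_batt are introduced here for illustration only, and the product form presumes that battery survival is independent of the controller failure modes captured by Pfd:

```latex
P_{\mathrm{deploy}} = \left(1 - P_{fd}\right) \, P_{\mathrm{batt}}
```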
Fig. 9. Comparing probability of failure on demand as a function of time for different failure models that provide the same reliability if no repairs take place. Weibull with shape parameters β = 1 (exponential), 2, and 3.
4. Conclusions

The proposed method illustrates numerous advantages of applying the concept of availability and stochastic Petri nets in the early stages of design reliability analysis, especially when dealing with safety-critical on-demand systems. The proposed method combines various real-life factors, such as the probability that the user notices the warning signal, reliability of detection circuitry, the user's response time to the warning light, duration of repair, estimated down time, system age, and other relevant factors. It shows that the application of stochastic Petri nets (SPN) provides a clear advantage in performing such analyses. The presented case study with a real-life example illustrates the sensitivity of probability of failure on demand to the various factors that must be accounted for to provide an accurate solution to real-life problems. This innovative model presents a more realistic, flexible, and accurate estimate of the system's failure rates and reliability compared to the more traditional reliability analysis techniques. In addition, SPN provides graphical traceability of the solution, as opposed to some stochastic methods, such as custom-made Monte Carlo simulation.

Fig. 10. Effect of attrition for exponential failure.
This method can also easily accommodate time-dependent input variables, such as system age, which in turn may affect the renewal rate of the system. To add flexibility, the SPN method can be effectively combined with traditional reliability analysis techniques, such as Markov chains, standards-based reliability prediction, block diagrams, Weibull analysis, Monte Carlo simulation, etc. In summary, this method provides an efficient synthesis of a practical engineering approach with the academic rigor of modern stochastic simulation techniques.

References

[1] IEC 61508: Functional safety of electrical/electronic/programmable electronic safety related systems, 1998–2000.
[2] Foucher B, Boullie J, Meslet B, Das D. A review of reliability prediction methods for electronic devices. Microelectronics Reliability 2002;42:1155–62.
[3] Kleyner A, Volovoi V. Reliability prediction using Petri nets for on-demand safety systems with fault detection. In: Martorell S, Guedes Soares C, Barnett J, editors. Safety and reliability and risk analysis. Taylor and Francis; 2008. p. 1961–8.
[4] Product Information CG989 8-Loop Firing IC CG989 by Bosch (2006) <http://www.semiconductors.bosch.de/pdf/CG989_Product_Info.pdf>.
[5] Kleyner A. Reliability demonstration: theory and application. In: Reliability and maintainability symposium (RAMS) Tutorials CD, January 2008.
[6] Petri A. Kommunikation mit Automaten. PhD thesis, Institut für Instrumentelle Mathematik, Schriften des IIM, 1962.
[7] Sadou N, Demmou H. Reliability analysis of discrete event dynamic systems with Petri nets. Reliability Engineering and System Safety 2009;94:1848–61.
[8] Symons FJW. Modelling and analysis of communication protocols using numerical Petri nets. PhD thesis, Department of Electrical Engineering Science, University of Essex, Essex, England, 1978.
[9] David R, Alla H. Discrete, continuous, and hybrid Petri nets. Berlin, Heidelberg: Springer; 2005.
[10] Chew SP, Dunnett SJ, Andrews JD. Phased mission modeling of systems with maintenance-free operating periods using simulated Petri nets. Reliability Engineering and System Safety 2008;93:980–94.
[11] Clavereau J, Labeau P-E. A Petri net-based modelling of replacement strategies under technological obsolescence. Reliability Engineering and System Safety 2009;94:357–69.
[12] Trivedi SK. Probability and statistics with reliability, queuing and computer science applications, 2nd ed. John Wiley and Sons; 2002.
[13] Dutuit Y, Chatelet E, Signoret J-P, Thomas P. Dependability modeling and evaluation by using stochastic Petri nets: application to two test cases. Reliability Engineering and System Safety 1997;55:117–24.
[14] Faria JA, Matos MA. An analytical methodology for the dependability evaluation of non-Markovian systems with multiple components. Reliability Engineering and System Safety 2001;74(2):193–210.
[15] Khouas A, Derieux A. FDP: fault detection probability function for analog circuits. In: The 2001 IEEE international symposium on circuits and systems, ISCAS 2001, 6–9 May 2001, vol. 4. p. 17–20.
[16] Jensen K. Coloured Petri nets. Basic concepts, analysis methods and practical use, vol. 1. Berlin: Springer; 1993.
[17] Volovoi VV. Modeling of system reliability using Petri nets with aging tokens. Reliability Engineering and System Safety 2004;84(2):149–61.
[18] Schneeweiss WG. Tutorial: Petri nets as a graphical description medium for many reliability scenarios. IEEE Transactions on Reliability 2001;50(2):159–64.
[19] Yang SK, Liu TS. Failure analysis for an airbag inflator by Petri nets. Quality and Reliability Engineering International 1997;13:139–51.
[20] O'Connor P. Practical reliability engineering, 4th ed. Wiley; 2003.
[21] Kleyner A, Sandborn P. A warranty forecasting model based on piecewise statistical distributions and stochastic simulation. Reliability Engineering and System Safety 2005;88:207–14.
[22] Teng S-H, Ho S-Y. Reliability analysis for the design of an inflator. Quality and Reliability Engineering International 1995;11:203–14.
