Design of Reliability

Department of Mechanical Engineering, Subject: Product Design & Value Engineering
Faculty of Technology & Engineering, Class: S.S. BE-IV (Mechanical)

The Maharaja Sayajirao University of Baroda
DESIGN FOR RELIABILITY

What is Reliability?
Reliability is the ability of a system or component to perform its required functions under stated conditions
for a specified period of time.
Reliability may be defined in several ways:
• The idea that something is fit for purpose with respect to time;
• The capacity of a device or system to perform as designed;
• The resistance to failure of a device or system;
• The ability of a device or system to perform a required function under stated conditions for a
specified period of time;
• The probability that a functional unit will perform its required function for a specified interval under
stated conditions;
• The ability of something to "fail well" (fail without catastrophic consequences).
Reliability theory is the foundation of reliability engineering. For engineering purposes, reliability is defined
as:
the probability that a device will perform its intended function during a specified period of time under
stated conditions.
Reliability engineering is concerned with four key elements of this definition:
• First, reliability is a probability. This means that failure is regarded as a random phenomenon: it is a
recurring event, and we do not express any information on individual failures, the causes of failures, or
relationships between failures, except that the likelihood for failures to occur varies over time according
to the given probability function. Reliability engineering is concerned with meeting the specified
probability of success, at a specified statistical confidence level.
• Second, reliability is predicated on "intended function". Generally, this is taken to mean operation
without failure. However, if no individual part of the system fails but the system as a whole does
not do what was intended, then it is still charged against the system reliability. The system requirements
specification is the criterion against which reliability is measured.
• Third, reliability applies to a specified period of time. In practical terms, this means that a system has a
specified chance that it will operate without failure before time t. Reliability engineering ensures that
components and materials will meet the requirements during the specified time. Units other than time
may sometimes be used. The automotive industry might specify reliability in terms of miles, and the military
might specify reliability of a gun for a certain number of rounds fired. A piece of mechanical equipment
may have a reliability rating value in terms of cycles of use.
• Fourth, reliability is restricted to operation under stated conditions. This constraint is necessary because
it is impossible to design a system for unlimited conditions. A Mars Rover will have different specified
conditions than the family car. The operating environment must be addressed during design and testing.
If the time (t) over which a system must operate and the underlying distributions of failures for its constituent
elements are known, then the system reliability can be calculated by taking the integral (essentially the area
under the curve defined by the probability density function (PDF)) of the PDF from t to infinity, which can be
expressed mathematically as
$$R(t) = \int_{t}^{\infty} f(x)\,dx$$
Where, f(t) = the failure probability density function
t = the length of the period (which is assumed to start from time zero)
& R(t) = Reliability function.
If the underlying failure distribution is exponential, the above equation becomes
$$R(t) = e^{-\lambda t}$$
Where, λ = the failure rate (inverse of MTBF)
t = the length of time the system must function
e = the base of natural logarithms
R(t) = reliability over time t
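The step from the integral definition of reliability to the exponential form can be written out as a short derivation. This is a sketch that assumes the standard exponential PDF, f(t) = λe^(−λt) for t ≥ 0, which the notes imply but do not state explicitly; x is only a dummy variable of integration.

```latex
\begin{aligned}
f(t) &= \lambda e^{-\lambda t}, \qquad t \ge 0 \\
R(t) &= \int_{t}^{\infty} \lambda e^{-\lambda x}\,dx
      = \left[-e^{-\lambda x}\right]_{t}^{\infty}
      = e^{-\lambda t}
\end{aligned}
```

Setting t = 0 gives R(0) = 1, and R(t) tends to zero as t grows, as expected of a reliability function.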
The figure below shows the curve of the above equation. If the mean time between failures of a type of
equipment is 100 hours, we expect only 37% (if t = MTBF = 1/λ, then e^(−λt) = e^(−1) = 0.367879) of the population
of equipment to still be operating after 100 hours of operation. Put another way, when the time of operation
equals the MTBF, the reliability is 37%.
Figure: Exponential curve relating reliability and time.
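
The 37% figure can be checked numerically. The following is a minimal sketch, assuming a constant failure rate; the function name `reliability_exponential` is introduced here only for illustration.

```python
import math


def reliability_exponential(t, mtbf):
    """Reliability R(t) = exp(-t / MTBF) for a constant failure rate."""
    if mtbf <= 0:
        raise ValueError("mtbf must be positive")
    return math.exp(-t / mtbf)


# MTBF of 100 hours, as in the text above
mtbf_hours = 100.0
for t in (0, 50, 100, 200):
    r = reliability_exponential(t, mtbf_hours)
    print(f"t = {t:>3} h  R(t) = {r:.4f}")
```

At t = MTBF the function returns e^(−1), so roughly 37% of the population is expected to survive to that point.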

It may be also defined as,
R(t) = P{the system does not fail during [0, t]} = 1 − F(t)
Where, F(t) is the failure function
Properties of Reliability function:

1. Reliability is a non-increasing function of time t. That is, for t1 < t2, R(t1) ≥ R(t2).
2. It is usually assumed that R(0) = 1. As t becomes larger and larger, R(t) approaches zero, that is, R(∞) = 0.
Determining which distribution best describes the pattern of failures for an item is extremely important, since
the choice of distributions greatly affects the calculated value of reliability. Two of the continuous
distributions commonly used in reliability are shown in Table 1. Note that f(t) is called the probability
density function. It is also referred to as the PDF. For reliability, we are usually concerned with the
probability of an unwelcome event (failure) occurring.
Table 1. Commonly used continuous distributions
Reliability is a broad term that focuses on the ability of a product to perform its intended function.
Mathematically speaking, assuming that an item is performing its intended function at time equals zero,
reliability can be defined as the probability that an item will continue to perform its intended function
without failure for a specified period of time under stated conditions. Please note that the product defined
here could be an electronic or mechanical hardware product, a software product, a manufacturing process, or
even a service. Reliability engineers rely heavily on statistics, probability theory, and reliability theory.
Failure Probability Density Function (Bathtub Curve)
Historically, the failure of products over their total lifetimes can be classified into three major types of
failures:
1. Quality failures occur early in the life cycle and are due to quality defects caused by manufacturing
or design.
2. Stress-related failures occur at random over the total system lifetime and are caused by the
application of stresses that exceed the design's strength.
3. Wear-out failures occur when the product reaches the end of its effective life and begins to degenerate
and wear out.
The bathtub curve is widely used in reliability engineering, although the general concept is also applicable to
humans. It describes a particular form of the hazard function which comprises three parts:
• The first part is a decreasing failure rate, known as quality failures.
• The second part is a constant failure rate, known as stress-related failures.
• The third part is an increasing failure rate, known as wear-out failures.
The name is derived from the cross-sectional shape of a bathtub. The bathtub curve is generated by
mapping the rate of early "infant mortality" failures when the product is first introduced, the rate of random failures with
constant failure rate during its "useful life", and finally the rate of "wear out" failures as the product exceeds
its design lifetime.
In the early life of a product adhering to the bathtub curve, the failure rate is high but quickly decreasing as
defective products are identified and discarded, and early sources of potential failure such as handling and
installation error are surmounted. In the mid-life of a product - generally, once it reaches consumers - the
failure rate is low and constant. In the late life of the product, the failure rate increases, as age and wear take
their toll on the product. Many consumer products, such as computer processors, strongly reflect the
bathtub curve.
The bathtub curve, displayed in the figure above, does not depict the failure rate of a single item, but describes
the relative failure rate of an entire population of products over time. Some individual units will fail
relatively early (infant mortality failures), others (we hope most) will last until wear-out, and some will fail
during the relatively long period typically called normal life. Failures during infant mortality are highly
undesirable and are always caused by defects and blunders: material defects, design blunders, errors in
assembly, etc. Normal life failures are normally considered to be random cases of "stress exceeding
strength." However, as we'll see, many failures often considered normal life failures are actually infant
mortality failures. Wear-out is a fact of life due to fatigue or depletion of materials (such as lubrication
depletion in bearings). A product's useful life is limited by its shortest-lived component. A product
manufacturer must assure that all specified materials are adequate to function through the intended product
life.
Note that the bathtub curve is typically used as a visual model to illustrate the three key periods of product
failure and not calibrated to depict a graph of the expected behavior for a particular product family. It is rare
to have enough short-term and long-term failure information to actually model a population of products with
a calibrated bathtub curve.
Also note that the actual time periods for these three characteristic failure distributions can vary greatly.
Infant mortality does not mean "products that fail within 90 days" or any other defined time period. Infant
mortality is the time over which the failure rate of a product is decreasing, and may last for years.
Conversely, wear-out will not always happen long after the expected product life. It is a period when the
failure rate is increasing, and has been observed in products after just a few months of use.
While the bathtub curve is useful, not every product or system follows a bathtub curve hazard function.
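The three regions of the bathtub curve can be sketched numerically. The example below is an illustration only: it assumes that the hazard function is the sum of a decreasing Weibull-type term (shape < 1), a constant term, and an increasing Weibull-type term (shape > 1). This modelling choice and all parameter values are assumptions made for illustration; they are not taken from these notes and are not calibrated to any real product.

```python
def bathtub_hazard(t, beta_early=0.5, eta_early=1000.0,
                   lam_random=0.0002,
                   beta_wear=5.0, eta_wear=5000.0):
    """Illustrative hazard rate (failures per hour) at age t > 0 hours.

    early: decreasing term (shape < 1), infant mortality
    lam_random: constant term, useful life
    wear: increasing term (shape > 1), wear-out
    """
    if t <= 0:
        raise ValueError("t must be positive")
    early = (beta_early / eta_early) * (t / eta_early) ** (beta_early - 1)
    wear = (beta_wear / eta_wear) * (t / eta_wear) ** (beta_wear - 1)
    return early + lam_random + wear


for age in (10, 100, 1000, 3000, 5000):
    print(f"age = {age:>4} h  hazard = {bathtub_hazard(age):.6f} per hour")
```

With these assumed parameters the hazard rate falls during early life, flattens in mid-life, and rises again toward the end of life, which is the qualitative shape described above.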
Why is Reliability Important?
There are a number of reasons why product reliability is an important product attribute, including:
• Reputation. A company's reputation is very closely related to the reliability of its products. The
more reliable a product is, the more likely the company is to have a favorable reputation.
• Customer Satisfaction. While a reliable product may not dramatically affect customer satisfaction
in a positive manner, an unreliable product will negatively affect customer satisfaction severely.
Thus high reliability is a mandatory requirement for customer satisfaction.
• Warranty Costs. If a product fails to perform its function within the warranty period, the
replacement and repair costs will negatively affect profits, as well as attract unwanted negative
attention. Introducing reliability analyses is an important step in taking corrective action, ultimately
leading to a product that is more reliable.
• Repeat Business. A concentrated effort towards improved reliability shows existing customers that a
manufacturer is serious about its product and committed to customer satisfaction. This type of
attitude has a positive impact on future business.
• Cost Analysis. Manufacturers may take reliability data and combine it with other cost information to
illustrate the cost-effectiveness of their products. This life cycle cost analysis can prove that although
the initial cost of their product might be higher, the overall lifetime cost is lower than a competitor's
because their product requires fewer repairs or less maintenance.
• Customer Requirements. Many customers in today's market demand that their suppliers have an
effective reliability program. These customers have learned the benefits of reliability analysis from
experience.
• Competitive Advantage. Many companies will publish their predicted reliability numbers to help
gain an advantage over competitors who either do not publish their numbers or have lower
numbers.
What is the Difference Between Quality and Reliability?
Even though a product has a reliable design, when the product is manufactured and used in the field, its
reliability may be unsatisfactory. The reason for this low reliability may be that the product was poorly
manufactured. So, even though the product has a reliable design, it is effectively unreliable when fielded,
which is actually the result of a substandard manufacturing process. As an example, cold solder joints could
pass initial testing at the manufacturer, but fail in the field as the result of thermal cycling or vibration. This
type of failure did not occur because of an improper design, but rather it is the result of an inferior
manufacturing process. So while this product may have a reliable design, its quality is unacceptable because
of the manufacturing process.
Just like a chain is only as strong as its weakest link, a highly reliable product is only as good as the inherent
reliability of the product and the quality of the manufacturing process.
Reliability Parameters:
Requirements are specified using reliability parameters. The most common reliability parameter is the mean
time between failures (MTBF), which can also be specified as the failure rate or the number of failures during
a given period. These parameters are very useful for systems that are operated on a regular basis, such as
most vehicles, machinery, and electronic equipment. In other cases, reliability is specified as the probability
of mission success. For example, reliability of a scheduled aircraft flight can be specified as a dimensionless
probability or a percentage.
A special case of mission success is the single-shot device or system. These are devices or systems that
remain relatively dormant and only operate once. Examples include automobile airbags, thermal batteries and
missiles. Single-shot reliability is specified as a probability of success, or is subsumed into a related
parameter. Single-shot missile reliability may be incorporated into a requirement for the probability of hit.
For such systems, the probability of failure on demand (PFD) is the reliability measure. This PFD is derived
from failure rate and mission time for non-repairable systems. For repairable systems, it is obtained from
failure rate and MTTR and test interval. This measure may not be unique for a given system as this measure
depends on the kind of demand. In addition to system level requirements, reliability requirements may be
specified for critical subsystems. In all cases, reliability parameters are specified with appropriate statistical
confidence intervals.
• MTBF (Mean-Time-Between-Failure):
MTBF is the expected time between two consecutive failures. It is a commonly used measure of reliability for
repairable systems.
MTBF is also defined as "for a stated period in the life of a functional unit, the mean value of the lengths of
time between consecutive failures under stated conditions". MTBF is a key reliability metric for systems that
can be repaired or restored.
$$\mathrm{MTBF} = \frac{\text{Total operating time}}{\text{Number of failures}}$$
Mathematically, the MTBF is the sum of the MTTF (mean time to failure) and MTTR (mean time to repair).
MTBF is usually specified in hours, but can also be used with other units of measurement such as miles or
cycles. The higher the MTBF, the higher the reliability of the product.
$$\text{Reliability} = e^{-\text{Time}/\mathrm{MTBF}}$$
For example, assume you are testing a system that can be repaired when there is a failure. Each failure causes
the system to go down. The first failure happens at 10 hours and it takes 5 hours to fix. The second failure is
at 27 hours and the repair duration is 3 hours. Then after working for 13 hours, the system fails at 43 hours.
The repair lasts for 7 hours and the system is restored at 50 hours. This failure and repair process can be
illustrated using the following graph.
The MTBF = ½ × (T1 + T2) = ½ × (17 + 16) = 16.5 hours, if you use only the observations of complete cycles,
where T1 and T2 are the times between consecutive failures. You can add one more cycle by combining x0 (the
uptime before the first failure) and y3 (the last repair duration). Then the MTBF = ⅓ × (T1 + T2 + x0 + y3) =
⅓ × (17 + 16 + 10 + 7) ≈ 16.7 hours.
If all the uptime durations xi are independent and identically distributed (i.i.d) and all the repair durations yi
are i.i.d, then:
MTBF= MTTF + MTTR (Mean Time to Repair)
Above equation shows that the MTBF is the sum of the average uptime and the average downtime.
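The worked example can be reproduced with a few lines of Python. This is a minimal sketch: the variable names are introduced here for illustration, and the interpretation of x0 as the uptime before the first failure and y3 as the third repair duration is an assumption based on the definitions of xi and yi, since the graph itself is not reproduced.

```python
# Failure and repair data from the example (all values in hours)
failure_times = [10.0, 27.0, 43.0]   # clock time of each failure
repair_durations = [5.0, 3.0, 7.0]   # time needed to fix each failure

# Clock time at which the system is back up after each repair
restore_times = [f + r for f, r in zip(failure_times, repair_durations)]

# Uptimes x_i: from time zero (or the previous restore) to the next failure
starts = [0.0] + restore_times[:-1]
uptimes = [f - s for f, s in zip(failure_times, starts)]

# Complete failure-to-failure cycles T1 and T2
cycles = [b - a for a, b in zip(failure_times, failure_times[1:])]
mtbf_complete = sum(cycles) / len(cycles)

# Using all three uptime/repair pairs: MTBF = MTTF + MTTR
mttf = sum(uptimes) / len(uptimes)
mttr = sum(repair_durations) / len(repair_durations)
mtbf_all = mttf + mttr

print(f"uptimes = {uptimes}")
print(f"MTBF (complete cycles only) = {mtbf_complete:.2f} h")
print(f"MTTF = {mttf:.2f} h, MTTR = {mttr:.2f} h, MTBF = {mtbf_all:.2f} h")
```

Both routes give a similar figure: 16.5 hours from the two complete cycles, and about 16.7 hours when the first uptime and last repair are combined into a third cycle.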
A common misconception about MTBF is that it is equivalent to the expected number of operating hours
before a system fails, or the “service life”. It is not uncommon, however, to see an MTBF number on the
order of 1 million hours, and it would be unrealistic to think the system could actually operate continuously
for over 100 years without a failure. These numbers are often so high because they are based on
the rate of failure of the product while it is still in its “useful life” or “normal life”, and it is assumed that it
will continue to fail at this rate indefinitely. While in this phase of the product's life, the product is
experiencing its lowest (and constant) rate of failure. In reality, wear-out modes of the product would limit its
life much earlier than its MTBF figure. Therefore, no direct correlation should be made between the
service life of a product and its failure rate or MTBF. It is quite feasible to have a product with extremely
high reliability (MTBF) but a low expected service life. Take, for example, a human being:
There are 500,000 25-year-old humans in the sample population.

Over the course of a year, data is collected on failures (deaths) for this population.
The operational life of the population is 500,000 x 1 year = 500,000 people years.
Throughout the year, 625 people failed (died).
The failure rate is 625 failures / 500,000 people years = 0.125% / year.
The MTBF is the inverse of failure rate or 1 / 0.00125 = 800 years.
So, even though 25-year-old humans have high MTBF values, their life expectancy
(service life) is much shorter and does not correlate.
The reality is that human beings do not exhibit constant failure rates. As people get older, more failures occur
(they wear-out). Therefore, the only true way to compute an MTBF that would equate to service life would
be to wait for the entire sample population of 25-year-old humans to reach their end-of-life. Then, the average
of these life spans could be computed. Most would agree that this number would be on the order of 75-80
years.
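The arithmetic of the human-population example can be checked as follows. This is a small sketch using only the numbers quoted above; the variable names are introduced for illustration.

```python
population = 500_000        # 25-year-old humans in the sample
observation_years = 1.0     # length of the observation period
failures = 625              # deaths during the period

# Total exposure, in people-years
unit_years = population * observation_years

failure_rate = failures / unit_years   # failures per person-year
mtbf_years = unit_years / failures     # inverse of the failure rate

print(f"failure rate = {failure_rate:.5f} per year ({failure_rate:.3%})")
print(f"MTBF = {mtbf_years:.0f} years")
```

The 800-year result is valid only under the constant failure rate assumed by the calculation, which is why it says nothing about the service life.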
Characteristics of MTBF:
1. The value of MTBF is equal to MTTF if after each repair the system is as good as new.
2. MTBF =1/λ for exponential distribution, where λ is failure rate.
Applications of MTBF:
1. For a repairable system, MTBF is the average time in service between failures.
2. MTBF is used to predict steady state availability measures like inherent and operational availability
• MTTF (Mean-Time-To-Failure):
MTTF (mean time to failure) is the expected time to failure of a system. It is used as a measure of reliability
for non-repairable items such as bulbs, microchips and many electronic items.
Non-repairable systems can fail only once. Therefore, for a non-repairable system, MTTF is equivalent to the
mean of its failure time distribution. Mean time to failure (MTTF) is sometimes used instead of MTBF in
cases where a system is replaced after a failure, since MTBF denotes time between failures in a system which
is repaired.
Repairable systems can fail several times. In general, it takes more time for the first failure to occur than it
does for subsequent failures to occur. Therefore, MTTF for a repairable system can represent one of two
things: (1) the mean time to first failure (MTTFF) or (2) the mean uptime (MUT) within a failure-repair
cycle in a long run.
As discussed earlier, MTTF is a function of system age. The expected time to the first system failure is called
the mean time to first failure (MTTFF). MTTFF is important for systems where online repairs are tolerable
but not offline repairs. The use of MTTF for both MTTFF and steady-state MUT is another source of
confusion. It should be noted that for a single-component system, with perfect repair, MTTFF is equivalent
to MUT. Therefore, regardless of what MTTF refers to, its value is the same for single-component systems.
In the majority of systems, MDT or MTTR is negligible. In such cases, MTBF ≈ MTTF. Therefore, in most
practical cases, MTTF = MTBF. This is obviously another source of confusion.
Applications of MTTF:
1. MTTF is the average life of a non-repairable system.
2. For a repairable system, MTTF represents the average time before the first failure
3. MTTF is one of the popular contractual reliability measures for non-repairable systems.
• MTTR (Mean-Time-To-Repair):
The MTTR (Mean Time To Repair) for a system is the average time to repair a system if all the parts and
equipment are available.
This is not simple to determine and often is based on experimental estimates. In an operational system, repair
generally means replacing a failed hardware part. Thus, hardware MTTR could be viewed as mean time to
replace a failed hardware module. Taking too long to repair a product drives up the cost of the installation in
the long run, due to downtime until the new part arrives and the possible window of time required
to schedule the installation.
• MDT (Mean Down Time):
Mean downtime is defined as the total time necessary to return an item to serviceable condition. In other words,
it is the total time during which an item or system is not in operating condition. It is represented as,
MDT = MTTR + miscellaneous delay
Miscellaneous delay can be attributed to waiting for parts on order.
• Failure rate (λ):

Failure rate is the average number of failures expected from a group of items over a given period of time. It is
measured in units of time⁻¹, such as failures per million hours.
Failure is the termination of the ability of the product to perform its required function.
Failure rate is often used to express the reliability of simple items and components. It is one of the popular
contractual reliability measures among many industries including aerospace and defence.
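As an illustration of the units, a failure rate can be estimated from test or field data and converted to failures per million hours. The sketch below uses hypothetical numbers (200 units, 1000 hours each, 4 failures) and assumes every unit accumulates the full operating time, that is, failed units are repaired or replaced immediately.

```python
def failure_rate_per_hour(num_failures, num_units, hours_per_unit):
    """Estimate the failure rate as failures / total unit-hours."""
    total_unit_hours = num_units * hours_per_unit
    if total_unit_hours <= 0:
        raise ValueError("total operating time must be positive")
    return num_failures / total_unit_hours


# Hypothetical data, for illustration only
lam = failure_rate_per_hour(num_failures=4, num_units=200, hours_per_unit=1000)
fpmh = lam * 1e6    # failures per million hours
mtbf = 1.0 / lam    # valid for a constant failure rate

print(f"failure rate = {lam:.2e} per hour = {fpmh:.1f} failures per million hours")
print(f"MTBF = {mtbf:.0f} hours")
```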
Reliability Parameters for Repairable Systems:

In most reliability engineering literature, and particularly in theoretical literature such as research papers,
MTBF represents the mean time between failures. It is applicable when several system failures are expected.
This is possible only when the system is restored after a failure. The restoration can be performed by repair
or replacement of some of its failed components. Such systems are known as maintainable systems or
repairable systems.
After restoration, the system may not be as good as new. This is because the repair of the failed components
may be imperfect, warm components may still be present in the system, or all failed components may not
have been restored. Once a restored system is returned to operation, it can fail again after some time. The
failure of the system leads to downtime. Therefore, between two consecutive failures, the time can be divided
into uptime and downtime. The time between failures is referred to as a failure-repair cycle time. In most
cases, this time stochastically decreases with the age of the system. This means that although there are some
random variations in time, on average, there is a decreasing trend. Therefore, strictly speaking, the MTBF of
the system is a function of system age.
If all system failures can be restored, then in a long run, the estimate of the cycle time becomes constant with
respect to the system age. This is known as the steady-state condition. Theoretically, this condition exists as
time tends to infinity. However, for reliable systems where downtime is small in comparison to uptime, the
steady-state condition can be realized in a short time. Therefore, in practice, the MTBF is calculated by
assuming that the system has reached the steady-state condition. Because the MTBF is the expected value of
the failure-repair cycle time, it is sometimes referred to as the mean cycle time (MCT).
The values for uptimes and downtimes can also change with system age and reach their asymptotic values.
The expected values of the uptime and downtime in the steady-state condition are known as the mean uptime
(MUT) and mean downtime (MDT). Because the uptime is equivalent to the failure time, it is also known as
the mean time to failure (MTTF). The downtime can consist of repair time and other delays. If there are no
delays, then downtime is equivalent to the repair time. In this case, the mean downtime (MDT) is equivalent
to the mean time to repair (MTTR). MTTR is also known as mean corrective time. Under the steady-state
condition, the following well-known relationships exist:
MCT = MUT + MDT
When there are no delays in repair:
MTBF = MTTF + MTTR
Availability = MTTF / MTBF = MTTF / (MTTF + MTTR)
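The steady-state relationships can be expressed in a short function. This is a minimal sketch assuming no delays in repair (so MDT = MTTR); the function name and the numbers in the usage lines are hypothetical.

```python
def steady_state_availability(mttf, mttr):
    """Availability = MTTF / MTBF, with MTBF = MTTF + MTTR (no repair delays)."""
    if mttf <= 0 or mttr < 0:
        raise ValueError("mttf must be positive and mttr non-negative")
    mtbf = mttf + mttr
    return mttf / mtbf


# Hypothetical system: 95 h mean uptime, 5 h mean repair time
availability = steady_state_availability(mttf=95.0, mttr=5.0)
print(f"MTBF = {95.0 + 5.0:.0f} h, availability = {availability:.2%}")
```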
Reliability Parameters for Non-Repairable Systems
Some systems, such as spacecraft, cannot be repaired after a major failure. In other cases, even though
maintenance tasks can be performed offline, they cannot be performed during a mission. For all of these
types of non-repairable systems, the time to system failure is an important reliability characteristic. The
expected value is known as mean time to failure (MTTF). Because a non-repairable system can fail only
once, both MTTFF and MTTF refer to the same metric. Because the time to failure is equivalent to the time
before failure, some sources define MTBF as the mean time before failure, which actually means the MTTF.
Using MTBF to represent the mean time before failure is one of the major sources for confusion on this
topic.
• AVAILABILITY:
The difference between reliability and availability is often misunderstood. High availability and high
reliability often go hand in hand, but they are not interchangeable terms. Although reliability provides an
excellent description of the frequency with which a product performs satisfactorily, it does not address the
period of time during which the product is out of use after a failure does occur. Reliability measures the
number of failures, not the amount of time that the product is in use. The concept of availability was
developed to resolve this limitation by quantifying the amount of time that a product is in operational use.
Availability is the degree to which a system or component is operational and accessible when required for
use. In other words, availability is defined as the "ability to be in a state to perform as required" and is a measure
of the time the item is in an operable state compared to elapsed calendar time. In its simplest form, it
can be represented mathematically by the formula
$$\text{Availability} = \frac{\text{uptime}}{\text{uptime} + \text{downtime}}$$
Where, uptime is the time during which the system is available for use and downtime is the time during
which the system is not available for use. It can be seen that this calculation is simply a ratio, indicating the
percentage of the time that the system is up (or available).
Having considered the generic concept of availability there are a number of standard definitions that are
used depending on what is included within the measured downtime:
(a) Operational Availability:

Operational availability is defined as the probability that an item is properly operating at a given point of
time. It can be defined as,
$$A_O = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MDT}}$$
Where, MTBF is the mean time between all maintenance actions and
MDT is the mean downtime for each maintenance action.
Operational availability includes maintenance and logistics delays. It gives a more
realistic view of the levels of availability that can be achieved in service because it includes logistic
delays, but it is more difficult to measure, and so it is harder to arrive at a figure that is agreeable to everyone. What truly
constitutes logistic delay is a much debated topic with no clear answer and no clear rules that can be
applied to every corrective maintenance action.
(b) Inherent Availability:

Inherent availability reflects the percent of time a system would be available if delays due to
maintenance, supply, etc. are ignored. Inherent availability Ai is defined by the following equation
$$A_i = \frac{\mathrm{MTBF}}{\mathrm{MTBF} + \mathrm{MTTR}}$$
Where, MTBF is the mean time between failures and
MTTR is the mean time to repair.
As seen in the above equation, if the system never failed, the MTBF would be infinite and Ai would be
100%. Or, if it took no time at all to repair the system, MTTR would be zero and again the availability
would be 100%.
Inherent availability is a measure of the availability of the item under ideal conditions, i.e. assuming
that a trained maintainer, the spare parts, the tools and the test equipment required to undertake
corrective maintenance are all immediately at hand. It is the most common metric
included in a contract, as it only includes the downtime associated with carrying out corrective
maintenance, which is within the control of the design authority, and it focuses
attention on ensuring that downtime due to design is optimized. If inherent availability is used
within a specification, care must be taken to manage expectations, as it is very unlikely that it can be
achieved in service because there will always be some logistic delays that will need to be included.
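The gap between inherent and operational availability can be illustrated numerically. The sketch below uses hypothetical figures and, for simplicity, assumes the same MTBF in both formulas and MDT = MTTR + logistic delay. A real operational availability calculation may use a different mean time between maintenance actions.

```python
def inherent_availability(mtbf, mttr):
    """A_i = MTBF / (MTBF + MTTR): repair time only, no logistic delays."""
    return mtbf / (mtbf + mttr)


def operational_availability(mtbf, mttr, logistic_delay):
    """A_O = MTBF / (MTBF + MDT), with MDT = MTTR + logistic delay."""
    mdt = mttr + logistic_delay
    return mtbf / (mtbf + mdt)


# Hypothetical values in hours, for illustration only
mtbf, mttr, delay = 500.0, 4.0, 20.0
a_i = inherent_availability(mtbf, mttr)
a_o = operational_availability(mtbf, mttr, delay)
print(f"inherent availability    A_i = {a_i:.4f}")
print(f"operational availability A_O = {a_o:.4f}")
```

With these assumed numbers the logistic delay lowers availability from about 99.2% to about 95.4%, which is why a contractual inherent availability figure is rarely achieved in service.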
RELIABLE SYSTEM DESIGN
Design for Reliability (DFR) is an emerging discipline that refers to the process of designing reliability into
products. This process encompasses several tools and practices and describes the order of their deployment
that an organization needs to have in place in order to drive reliability into its products. Typically, the first
step in the DFR process is to set the system's reliability requirements. Reliability must be "designed in" to
the system. During system design, the top-level reliability requirements are then allocated to subsystems by
design engineers and reliability engineers working together.
Reliability design begins with the development of a model.
If the underlying distribution for each element is exponential and the failure rates (λi) for each element are
known, then the reliability of the system can be calculated using the exponential equation.
1. Series Reliability:
The block diagram of an "m"-unit series network or configuration is shown in Fig. 1. Each block represents
a system unit or component. If any one of the components fails, the system fails; that is, all of the series units
must work normally for the system to succeed. For independent units, the reliability of the system shown in
Fig. 1 is
$$R_S = R_1 \times R_2 \times R_3 \times \cdots \times R_m$$
Where, RS= series system reliability
m =number of units
Ri = reliability of unit i for i=1, 2, 3, ... , m
Figure 1 Series system block diagram.

For constant failure rates of the units, the above equation becomes
$$R_S(t) = e^{-\lambda_1 t}\, e^{-\lambda_2 t}\, e^{-\lambda_3 t} \cdots e^{-\lambda_m t}$$
Where, RS(t) = series system reliability at time t
λi = constant failure rate of unit i for i = 1, 2, 3, ..., m
For an individual item i, let ti be the age of the item, which should be more than or equal to the age t of the system. That
is, for the system to survive up to age t, item i should survive up to ti.
• Characteristics of reliability function of a series configuration:

The value of the reliability function of the system, RS(t), for a series configuration is less than or equal to the
minimum value of the individual reliability functions of the constituent items. That is
$$R_S(t) \le \min_{i=1,2,\ldots,m} R_i(t)$$
Example: Consider the system represented by the reliability block diagram (RBD) in the figure below.
Figure 2. Example reliability block diagram.

Components A and B in figure 2 are said to be in series, which means both must operate for the system to
operate. Since the components are in series, the system reliability can be found by adding together the failure
rates of the components and substituting the result as seen in equation 2.3. Furthermore, if the individual
reliabilities are calculated (the bottom values), we could find the system reliability by multiplying the
reliabilities of the two components as shown in equation 2.4.
$$R(t) = e^{-(\lambda_A + \lambda_B)t} = e^{-0.0025 \times 10} = 0.9753 \qquad (2.3)$$

$$R(t) = R_A(t) \times R_B(t) = 0.99000 \times 0.98510 = 0.9753 \qquad (2.4)$$
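The two routes to the series system reliability can be compared in code. This is a sketch under stated assumptions: the text gives only the combined failure rate 0.0025 and t = 10, so the split into λA = 0.0010 and λB = 0.0015 is assumed here for illustration (it is consistent with the rounded unit reliabilities quoted). The function names are hypothetical.

```python
import math


def series_reliability_from_rates(failure_rates, t):
    """R_S(t) = exp(-(sum of failure rates) * t) for independent exponential units."""
    return math.exp(-sum(failure_rates) * t)


def series_reliability_from_units(unit_reliabilities):
    """R_S = R_1 * R_2 * ... * R_m for independent units."""
    return math.prod(unit_reliabilities)


# Assumed split of the combined rate 0.0025 (only the sum is given in the text)
lam_a, lam_b = 0.0010, 0.0015
t = 10.0

r_units = [math.exp(-lam * t) for lam in (lam_a, lam_b)]
r_sys = series_reliability_from_rates([lam_a, lam_b], t)

print(f"R_A = {r_units[0]:.5f}, R_B = {r_units[1]:.5f}")
print(f"R_S from summed rates  = {r_sys:.4f}")
print(f"R_S from unit product  = {series_reliability_from_units(r_units):.4f}")
print(f"R_S <= min(R_i): {r_sys <= min(r_units)}")
```

Both routes agree, and the system reliability is lower than that of the weaker component, as the characteristic of a series configuration requires.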
2. Parallel Reliability:
This type of configuration can be used to improve a mechanical system's reliability during the design phase.
The block diagram of an "m"-unit parallel network is shown in Fig. 3. Each block in the diagram represents
a unit. This configuration assumes that all of its units are active and at least one unit must work normally for
the system to succeed. In other words, in a parallel configuration the system fails only when all the items of
the system fail.
Figure 3. Parallel system block diagram
Parallel components are introduced when the reliability requirements for the system are very high. The use of
more than one engine in aircraft is one of the obvious examples of the parallel configuration. However,
parallel items will increase cost, complexity and weight of the system. Hence, the number of items required
should be carefully determined and if possible optimised.
The reliability function of a parallel configuration can be obtained using the following argument. As the system
fails only when all the items fail, the failure function F_s(t) of the system (assuming the items fail
independently) is given by:
$F_s(t) = F_1(t) \times F_2(t) \times \cdots \times F_n(t)$
Where, F_i(t) is the failure function (cumulative distribution of the time to failure) of item i.
Substituting F_i(t) = 1 - R_i(t) in the above equation,

$F_s(t) = [1 - R_1(t)] \times [1 - R_2(t)] \times \cdots \times [1 - R_n(t)]$
Now, the system reliability function R_s(t) can be written as

$R_s(t) = 1 - F_s(t)$
$R_s(t) = 1 - [1 - R_1(t)] \times [1 - R_2(t)] \times \cdots \times [1 - R_n(t)]$
$R_s(t) = 1 - \prod_{i=1}^{n} [1 - R_i(t)]$
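Because each parallel item adds cost, complexity and weight, it can help to see how quickly the parallel formula saturates. The following is a minimal sketch, assuming identical, independent, active units; the unit reliability of 0.90, the target of 0.995 and the function names are hypothetical values chosen for illustration, not figures from the text.

```python
def parallel_reliability(reliabilities):
    """Active parallel system of independent units: 1 - product of unit unreliabilities."""
    probability_all_fail = 1.0
    for r in reliabilities:
        probability_all_fail *= (1.0 - r)
    return 1.0 - probability_all_fail


def units_needed(unit_reliability, target, max_units=50):
    """Smallest number of identical parallel units that reaches the target reliability.

    Returns None if the target is not reached within max_units.
    """
    if not 0.0 < unit_reliability < 1.0:
        raise ValueError("unit_reliability must be strictly between 0 and 1")
    for n in range(1, max_units + 1):
        if parallel_reliability([unit_reliability] * n) >= target:
            return n
    return None


# Hypothetical unit reliability and target, for illustration only.
UNIT_RELIABILITY = 0.90
TARGET = 0.995

for n in range(1, 5):
    print(n, round(parallel_reliability([UNIT_RELIABILITY] * n), 4))

print(units_needed(UNIT_RELIABILITY, TARGET))  # 3
```

With the assumed unit reliability of 0.90, two units give 0.99 and three give 0.999, so a 0.995 target needs three units. Each further unit buys a smaller gain.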
 Characteristics of reliability function of a parallel configuration:

The value of the reliability function of the system, R_S(t), for a parallel configuration is greater than or
equal to the reliability of any of the constituent items. That is,
$R_S(t) \ge \max_{i=1,2,\ldots,n} R_i(t)$
Example: The system represented by the RBD in figure 4 has the same components used in the series
configuration (A and B in series, denoted by one block labeled A-B), but two of each component are used in a
configuration referred to as redundant or parallel.
Figure 4. RBD of a system with redundant components.

Two paths of operation are possible: the top A-B path and the bottom A-B path. If either of the two paths is
intact, the system can operate. The reliability of the system is most easily calculated (equation 2.5) by finding
the probability of failure (1 - R(t)) for each path, multiplying the probabilities of failure (which gives the
probability of both paths failing), and then subtracting the result from 1. The reliability of each path was
found in the previous example. Next, the probability of a path failing is found by subtracting its reliability
from 1. Thus, the probability of each path failing is 1 - 0.9753 = 0.0247. The probability that both paths will
fail is 0.0247 x 0.0247 = 0.0006.
Finally, the reliability of the system is 1 - 0.0006 = 0.9994, about a 2.5% improvement over the series
configured system.
$R(t) = 1 - (1 - R_T(t)) \times (1 - R_B(t)) = 1 - (0.0247 \times 0.0247) = 0.9994$ (2.5)
Where, R_T is the reliability of the top path
R_B is the reliability of the bottom path
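The arithmetic of this example can be reproduced in a few lines. This is a sketch under the same assumptions as the text (two identical, independent paths, each with the reliability 0.9753 found in the series example); the function name is hypothetical.

```python
def redundant_paths_reliability(path_reliability, path_count=2):
    """System reliability when path_count identical, independent paths are in active parallel."""
    probability_all_fail = (1.0 - path_reliability) ** path_count
    return 1.0 - probability_all_fail


R_PATH = 0.9753  # reliability of one A-B path, from equations 2.3 and 2.4

path_failure = 1.0 - R_PATH
system_reliability = redundant_paths_reliability(R_PATH)
improvement_percent = (system_reliability - R_PATH) / R_PATH * 100.0

print(round(path_failure, 4))         # 0.0247
print(round(system_reliability, 4))   # 0.9994
print(round(improvement_percent, 1))  # 2.5
```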
Adding a component in parallel, i.e., redundancy, improves the system's ability to perform its function. This
aspect of reliability is called functional or mission reliability.
REDUNDANCY TECHNIQUES USED IN SYSTEM DESIGN
Redundancy, as it is understood in reliability and maintenance management, is the provision of alternative
means or parallel paths in a system such that all these means must fail before system failure can take
place. The provision of units or components in parallel, or the provision of parallel paths, results in an
increase in both system reliability and system mean life. The following techniques are used for incorporating
redundancy in systems:
1. System or Unit Redundancy:
The easiest and most straightforward method of providing redundancy, with a view to increasing the
system reliability and mean life, is to provide redundancy or a parallel path for the system as a whole.
2. Component Redundancy:
This method provides redundancy or a parallel path for each individual component.
3. Weakest-link Technique:
In this method, the weak components, from the reliability viewpoint, are identified and parallel paths
are provided for these components only. By this method, the reliability of the weak component is
improved, thereby increasing the system reliability.
4. Mixed Redundancy:
This method is a combination of the above-mentioned techniques. The actual combination used depends
upon system complexity and configuration and the system reliability requirement.
Example:
Consider a simple two-component series system with component reliabilities of 0.95 and 0.75 respectively. The
table below shows the system configurations and system reliabilities obtained by use of all four approaches
discussed above.
Now, if we apply the first approach, that of system or unit redundancy, the system reliability of the given 2-
component series system can be increased from 0.7125 to 0.9173, which is a significant gain. However, the
application of component redundancy works out to be more effective, since by this approach the system
reliability can be increased to 0.9352. Although system redundancy provides the simplest method of
increasing reliability, component redundancy provides higher system reliability than that obtained from
system redundancy for the same cost. The first two methods are both expensive, since the additional costs
involved in providing redundancy at the system (or unit) level, or for each component of the system, are
usually very high.
The weakest-link technique is cost effective: as is evident from the table, the incorporation of
just one parallel path, for component 2 only, raises the reliability from 0.7125 to 0.8906.
Mixed redundancy, which combines these techniques, will probably turn out to be the most expensive. The choice
of technique depends, therefore, on the system complexity and configuration, the system reliability
requirements and cost considerations.
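The reliabilities quoted in this example can be reproduced from the series and parallel formulas. The sketch below assumes independent components and one duplicate per redundant element; mixed redundancy is left out because its exact configuration is not stated in the text. The helper names are hypothetical.

```python
import math


def series(*reliabilities):
    """All independent elements must work: product of reliabilities."""
    return math.prod(reliabilities)


def parallel(*reliabilities):
    """At least one independent element must work: 1 - product of unreliabilities."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


r1, r2 = 0.95, 0.75  # component reliabilities from the example

base = series(r1, r2)                                    # no redundancy
system_redundancy = parallel(base, base)                 # duplicate the whole system
component_redundancy = series(parallel(r1, r1), parallel(r2, r2))  # duplicate each component
weakest_link = series(r1, parallel(r2, r2))              # duplicate component 2 only

print(round(base, 4))                  # 0.7125
print(round(system_redundancy, 4))     # 0.9173
print(round(component_redundancy, 4))  # 0.9352
print(round(weakest_link, 4))          # 0.8906
```

System and component redundancy both use four components in total, which is why the comparison between them is made at equal cost, while the weakest-link arrangement uses only three.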
DESIGN FOR RELIABILITY
Reliability consideration has tended to be more of an afterthought in the development of many new
products. Many companies' reliability activities have been performed primarily to satisfy internal procedures
or customer requirements. Where reliability is actively considered in product design, it tends to be done
relatively late in the development process. Some companies focus their efforts on developing reliability
predictions when this effort could be better spent on understanding and mitigating failure modes,
thereby improving product reliability. Organizations will go through repeated (and planned)
design/build/test iterations to develop higher reliability products. Overall, this focus is reactive in nature, and
the time pressures to bring a product to market limit the reliability improvements that might be made.
In an integrated product development environment, the orientation toward reliability must be changed and a
more proactive approach utilized. Reliability engineers need to be involved in product design at an early
point to identify reliability issues and concerns and begin assessing reliability implications as the design
concept emerges.
Use of computer-aided engineering (CAE) analysis and simulation tools at an early stage in the design can
improve product reliability at lower cost and in a shorter time than building and testing physical
prototypes. Tools such as finite element analysis, fluid flow analysis, thermal analysis, integrated reliability prediction
models, etc., are becoming more widely used, more user friendly and less expensive. Design of Experiments
techniques can provide a structured, proactive approach to improving reliability and robustness as compared
to unstructured, reactive design/build/test approaches. Further, these techniques consider the effect of both
product and process parameters on the reliability of the product and address the effect of interactions
between parameters. Finally, the company should begin establishing a mechanism to accumulate and apply
"lessons learned" from the past related to reliability problems as well as other producibility and
maintainability issues. These lessons learned can be very useful in avoiding making the same mistakes twice.
Specific Design for Reliability guidelines include the following:
 Design based on the expected range of the operating environment.

 Design to minimize or balance stresses and thermal loads and/or reduce sensitivity to these stresses
or loads. This design technique requires understanding of the physics of failure. It relies on
understanding the physical processes of stress, strength and failure at a very detailed level. Then the
material or component can be re-designed to reduce the probability of failure.
 De-rate components for added margin: Selecting components whose tolerance significantly exceeds
the expected stress, such as using a heavier gauge wire that exceeds the normal specification for the
expected electrical current.
 Provide subsystem redundancy: One of the most important design techniques for achieving
reliability of systems is redundancy. This means that if one part of the system fails, there is an
alternate success path, such as a backup system. An automobile brake light might use two light
bulbs. If one bulb fails, the brake light still operates using the other bulb. Redundancy significantly
increases system reliability, and is often the only viable means of doing so. However, redundancy is
difficult and expensive, and is therefore limited to critical parts of the system.
 Use proven component parts & materials with well-characterized reliability.
 Reduce parts count & interconnections (and their failure opportunities).
 Improve process capabilities to deliver more reliable components and assemblies.
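The guideline on reducing parts count follows from the series formula: when every part must work, each extra part multiplies the system reliability by a factor below 1. The sketch below illustrates this under the assumption of identical, independent parts; the part reliability of 0.999 and the part counts are hypothetical values, not data from the text.

```python
def series_reliability_identical(part_reliability, part_count):
    """Reliability of part_count independent, identical parts that must all work."""
    if not 0.0 <= part_reliability <= 1.0:
        raise ValueError("part_reliability must be between 0 and 1")
    if part_count < 0:
        raise ValueError("part_count must not be negative")
    return part_reliability ** part_count


# Hypothetical per-part reliability, for illustration only.
PART_RELIABILITY = 0.999

for count in (10, 50, 100):
    print(count, round(series_reliability_identical(PART_RELIABILITY, count), 4))
```

With these assumed numbers the system reliability falls from roughly 0.99 with 10 parts to roughly 0.95 with 50 and 0.90 with 100, even though every individual part is highly reliable.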