(2009) 9:185–192
DOI 10.1007/s11668-009-9226-1
FEATURE PEER-REVIEWED
Submitted: 30 September 2008 / in revised form: 28 January 2009 / Published online: 20 March 2009
ASM International 2009
Introduction
Failure represents an adverse situation wherein a component or assembly fails to satisfactorily perform its intended
function. In other words, failure can be defined as the gap between the expected performance and the actual
performance of any component or assembly. The purpose of failure analysis is to establish the mechanism and causes of
the failure and to recommend a solution to the problem. Often failures do not just happen but are caused, and
determining the cause of a failure helps to identify what exactly went wrong and what needs to be done to
avoid similar failures in the future. Even the most sophisticated simulation testing cannot adequately duplicate the
varied factors and the many unanticipated events that may lead to failure. Hence, failure analysis offers the most
reliable tool for ensuring the continuing safety of a component, an assembly, or a system.
S. K. Bhaumik
Failure Analysis & Accident Investigation, Materials Science
Division, National Aerospace Laboratories, Council of Scientific
and Industrial Research (CSIR), Bangalore 560 017, India
e-mail: subir@css.nal.res.in
Analysis of engineering failures is a formidable, complex, and challenging task. It is a task that requires
information from personnel with expertise in many areas. It
also demands tremendous responsibility and coordination
on the part of the analyst and a thorough knowledge of
materials science supplemented with an appreciation for,
and willingness to apply, related engineering disciplines.
The effort to identify the root cause of failure not only
helps to solve the immediate problem but also provides valuable guidance as to what needs to be done to prevent
recurrence of similar failures in a given system or organization. However, experience suggests that most failure
analyses fall short of this goal. This is because a significant
fraction of failure analysts incorrectly use the term root
cause when what they really establish is the primary cause
of failure or the simple physical process of failure [1, 2].
This aspect of the failure analysis process is discussed in
this paper. A few examples of service failures are cited
wherein the immediate causes of physical failures were
obvious, but the underlying causes or the root causes
leading to these failures were traced to errors involving
human factors that are often overlooked.
established procedure for the evaluation of failed components/systems, and in the process they ignore the complexity involved in establishing the facts of the failure.
They may adopt a preset procedure without even trying to
discover the situations or conditions under which the failure had occurred. The use of a preset procedure generally
occurs because of a lack of appreciation that the same
physical failure can be arrived at in many ways. Because of
the many potential paths to a specific physical failure, an
understanding of the relative importance of various factors
in the specific case at hand is essential. It must be borne in
mind that all failures are unique, and, hence, each of them
should be treated uniquely. As described by Dennies [1],
established recipe-type procedures are generally inadequate in determining the root cause(s) of a failure.
Analysis of a Few Investigations: Case Studies
The failure analysis trends in organizations vary widely
from one to another, and the process adopted is often
dependent on the organizational culture or rules [1, 2].
Root cause analysis shows that ultimately all failures are
caused by human errors. According to Zamanzadeh et al.
[2], these can be broadly classified into the following three
categories:
Errors of knowledge
Errors of performance (which might be caused by negligence)
Errors of intent (which might even be acts of greed or sabotage)
investigations of engineering failures are concluded without any attempt at root cause
determination. The investigation simply stops after the
identification of the physical cause(s) of the failure, and
the impact of human error is not thoroughly investigated.
However, the failure analyst may also contribute to such an
incomplete investigation because of a lack of expertise.
The following examples illustrate these points.
A Gear Failure: Deviation in Heat Treatment Procedure
Fabrication processes for aircraft components are generally
well designed and strictly controlled because of safety
concerns. This is supplemented by inspection at
various stages of the fabrication schedule and by maintenance of
records at each stage. Therefore, any deviation in the manufacturing process or parameters is easily traceable.
Although the engineers working in the production shop are
well aware of this fact, errors can still occur. The following
is an example.
Depainting
Sand blasting
Zinc metallizing
Oxyparkerizing (phosphating)
Repainting
Assembly and storage since 2003
Fig. 8 Dimple rupture (ductile) features at the steps and along the
rim of the bore
noticed during mandatory qualification tests of a few production batches. The acceptance criterion for fatigue life was
fixed at 10 × 10^6 cycles at a 20 kN load, and failures
occurred after as few as 1.86 × 10^5 cycles.
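To put the size of this shortfall in perspective, the short calculation below compares the earliest reported failure with the acceptance criterion. It is only an illustrative sketch: it reads the quoted figures as 10 × 10^6 and 1.86 × 10^5 cycles, and the variable names are assumptions introduced here, not part of the original investigation.

```python
# Values quoted in the text; the variable names are illustrative assumptions.
required_cycles = 10e6   # acceptance criterion: 10 x 10^6 cycles at 20 kN load
shortest_life = 1.86e5   # earliest reported failure: 1.86 x 10^5 cycles

# Fraction of the required fatigue life actually achieved by the worst component.
fraction_of_requirement = shortest_life / required_cycles

# How many times longer the component should have lasted to pass the test.
shortfall_factor = required_cycles / shortest_life

print(f"Life achieved: {fraction_of_requirement:.2%} of the requirement")
print(f"Shortfall factor: about {shortfall_factor:.0f}x")
```

On these figures, the worst component survived less than 2% of the required life, which suggests a gross defect in the assembly rather than ordinary scatter in fatigue test results.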
A prematurely cracked rod-end top is shown in Fig. 9.
Examination revealed that the fatigue crack had initiated at
stress concentrations arising from mechanical damage
caused by metal-to-metal contact (Fig. 10). Detailed
investigation identified two reasons for the
mechanical damage: (a) nonuse of the recommended
MoS2 coating at the bearing interface and (b) improper fit
within the bearing. The failures continued even after restoration of the MoS2 coating. Examination did not show any
deviation in the dimensions of the rod-end top bore and the
bearing or problems in the manufacturing procedure.
Hence, the manufacturer was asked to examine the staking
tools used for the assembly of the bearing, closely monitor
the process, and report. Despite this care, the failures
continued. Analysis of a few more failed components confirmed that, in all cases, there was nonuniform
contact at the bearing/bore interface. As an additional
check, the manufacturer was asked to examine the press
used for the staking process. After examination, the
supervisor/operator reported back saying there were no
abnormalities in the press.
The reasons for the improper fit and the nonuniform
contact were difficult to identify at that stage. Based on
the physical failure, no other conclusions could be established. Finally, it was suggested that the staking process for
a few components be carried out using a well-calibrated/
maintained universal testing machine. The components so
Fig. 9 (a) A prematurely cracked rod-end top. (b) Close-up view of the cracked region.
(c) Crack surface showing fatigue crack origin (arrow)
Summary
It has been shown that the identification of the causes for
the physical failure of a component or system is a part of
the total investigation process. This is only the first step in
the entire process and often sets the direction of the
investigation. The most important part of the investigation
is to determine the root causes, that is, what went wrong to
create the conditions for the physical failure to occur. In
this phase of investigation, dealing with human factors is
inevitable. Therefore, further progress in the investigation largely depends on the attitude of the customer/client
and the expertise of the failure analyst. In many cases, unfortunately, the investigation stops when the physical cause is
determined, and the conclusions are drawn without even
making an attempt to establish the root cause.