
The research register for this journal is available at

http://www.mcbup.com/research_registers

JQME
7,4

252

The current issue and full text archive of this journal is available at
http://www.emerald-library.com/ft

Rethinking Pareto analysis: maintenance applications of logarithmic scatterplots
Peter F. Knights

Mining Centre, Catholic University of Chile, Santiago, Chile


Keywords Pareto analysis, Maintenance, Management
Abstract Pareto histograms are commonly used to determine maintenance priorities by
ranking equipment failure codes according to their relative cost or downtime contribution.
However, such histograms do not readily enable identification of the dominant variables
influencing downtime and repair costs, namely the failure frequency, mean downtime and mean
repair cost associated with each failure code. Advances an alternative method for analysing
equipment downtime and repair costs using logarithmic (log) scatterplots. By applying limit
values, log scatterplots can be divided into four quadrants enabling failures to be classified
according to acute or chronic characteristics and facilitating root cause failure analysis. Log
scatterplots permit the identification of frequently occurring failures that consume relatively little
repair cost or downtime yet cause frequent operational disturbances leading to production losses.
In addition, by graphing the trend of failure data over successive time periods, log scatterplots
provide a useful visual means of evaluating the performance of maintenance improvement
initiatives. Provides examples of the practical application of log scatterplots by a number of
mining companies and mining equipment suppliers in Chile.

Practical implications
Pareto histograms of equipment failure codes ranked according to downtime or
repair costs do not enable the influence of the failure frequencies or the mean
downtime or repair cost to be clearly identified. Logarithmic scatterplots enable
failures to be classified according to acute or chronic characteristics, and
provide a better means of establishing maintenance priorities. In addition,
logarithmic plots can be used to graph trends in maintenance performance.
Journal of Quality in Maintenance Engineering, Vol. 7 No. 4, 2001, pp. 252-263. © MCB University Press, 1355-2511

The author would like to thank Komatsu Mining Systems Chile and Modular Mining Systems
Chile for supporting the development of the work outlined in this paper. The paper has also
benefited from the thesis work of Cristian Aranguiz and Carlos Turina, final year students of
the Mining Centre of the Catholic University of Chile.

Introduction
In the late nineteenth century, the Italian engineer Vilfredo Pareto (1848-1923)
constructed histograms of the distribution of wealth in Italy and concluded that
80 percent of the country's wealth was owned by 20 percent of the nation's
population. This trend was later found to be representative of the distribution
of other data populations. The 80:20 rule, or a variation known as ABC analysis
that uses an 80:15:5 classification rule, is now routinely used in many fields of
study. As applied to the field of maintenance engineering, Pareto analysis is
commonly used for identifying those failure codes responsible for the majority
of equipment maintenance cost or downtime (see Hall et al., 2000). Based on the
failure codes identified, action plans can be elaborated to lower maintenance
costs or improve equipment availability.
However, Pareto analysis suffers from several deficiencies:
• First, maintenance costs and downtime are the product of two factors:
  the number of failures that occurred in a particular time frame and the
  average associated repair cost, or mean downtime. A Pareto histogram
  based on downtime (or cost) alone cannot determine which factor, or
  factors, are dominant in contributing to the downtime or cost associated
  with individual failure codes.
• Second, Pareto analysis may miss identifying: individual events having
  high associated repair costs or downtime; or frequently occurring
  failures that consume relatively little repair cost or downtime yet cause
  frequent operational disturbances. An example of the former is the
  failure of the transmission in a mechanical mining truck. An example of
  the latter is a repair to the truck's driving lights. Whilst the high cost of
  the former is immediately evident, failures that frequently re-occur often
  have significant hidden costs. For example, if the truck has to return to
  the workshop to have a light replaced, the time lost travelling to and
  from the workshop may dramatically increase the opportunity costs
  associated with lost production.
• Third, Pareto histograms are not generally useful for trending
  comparisons. It can be difficult to directly compare ranked histograms
  of costs or downtime for two different time periods since the relative
  position of failure codes can change from one period to the other.
This paper outlines a simple, but powerful way of analysing data in order to
overcome these shortcomings.
Logarithmic scatterplots
The most convenient way of presenting the theory behind the new
methodology is via an example. Table I presents unplanned downtime data for
electrical failures in a fleet of 13 cable shovels at an open pit copper mine,
located in northern Chile. The data was collected over a one-month period.
Figure 1 shows the Pareto histogram for the unplanned electrical failures,
with failure codes ranked in descending order of the downtime
corresponding to each code. Applying the 80:20 rule, it is evident that priority
should be given to failure codes 1, 2, 11, 3, 10, 7, 12, 8 and 5. Of these,
maintenance can do little to reduce the downtime associated with failure codes
3 (substation changes or shovel moves) and 5 (substation power cuts).
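The ranking behind this conclusion can be reproduced from the Table I figures. The following is a minimal sketch, not the author's tooling: the function names are hypothetical, and the 80:20 rule is read here as "keep every code whose cumulative share of downtime stays at or below 80 percent", which is an assumption about how the cut-off is applied.

```python
# Table I data: failure code -> (number of failures, total downtime in minutes).
TABLE_I = {
    1: (30, 1015), 2: (15, 785), 11: (36, 745), 3: (27, 690), 10: (23, 685),
    7: (13, 600), 12: (7, 575), 8: (12, 555), 5: (21, 395), 15: (8, 355),
    6: (10, 277), 9: (26, 240), 4: (15, 225), 17: (6, 220), 14: (7, 165),
    16: (5, 155), 13: (9, 115),
}


def pareto_ranking(data):
    """Return (code, downtime, cumulative %) rows ranked by descending downtime."""
    total = sum(minutes for _, minutes in data.values())
    ranked = sorted(data.items(), key=lambda item: item[1][1], reverse=True)
    rows, running = [], 0
    for code, (_, minutes) in ranked:
        running += minutes
        rows.append((code, minutes, 100.0 * running / total))
    return rows


def priority_codes(data, cutoff=80.0):
    """Codes whose cumulative downtime share is at or below the cutoff.

    The cutoff convention is an assumed reading of the 80:20 rule.
    """
    return [code for code, _, cum in pareto_ranking(data) if cum <= cutoff]


print(priority_codes(TABLE_I))
```

Note that the ranking uses total downtime only, so it cannot show whether a code ranks highly because of many short stoppages or a few long ones.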
Maintenance costs and downtime can be represented by two equations:

\[ \mathrm{Cost}_i = n_i \times \mathrm{MRC}_i \tag{1} \]

and

\[ \mathrm{Downtime}_i = n_i \times \mathrm{MDT}_i \tag{2} \]


Table I. Unplanned shovel electrical downtime

Code  Description                          Quantity  Duration (min)  Time (%)  Cum. (%)
1     Electrical inspections                     30           1,015      13.0      13.0
2     Damaged feeder cable                       15             785      10.1      23.1
11    Motor overtemperature                      36             745       9.6      32.6
3     Change of substation or shovel move        27             690       8.8      41.5
10    Overload relay                             23             685       8.8      50.3
7     Auxiliary motors                           13             600       7.7      58.0
12    Earth faults                                7             575       7.4      65.3
8     Main motors                                12             555       7.1      72.5
5     Power cuts to substations                  21             395       5.1      77.5
15    Air compressor                              8             355       4.6      82.1
6     Rope limit protection                      10             277       3.6      85.6
9     Lighting system                            26             240       3.1      88.7
4     Coupling repairs or checks                 15             225       2.9      91.6
17    Overcurrent faults                          6             220       2.8      94.4
14    Control system                              7             165       2.1      96.5
16    Operator controls                           5             155       2.0      98.5
13    Miscellaneous                               9             115       1.5     100
      Total                                     270           7,797     100

Figure 1. Pareto histogram of unplanned shovel electrical downtime

where Cost_i and Downtime_i are the cost and downtime associated with the ith
failure code and n_i, MRC_i and MDT_i represent the number of failures, the mean
repair cost and mean downtime respectively.
Figure 2 shows an alternative means of presenting the failure data listed in
Table I. An x-y scatterplot is used to plot mean downtime against the number of
unplanned failures for each failure code. Curves of constant downtime are
represented by a family of hyperbolae as shown. It can be seen that the failures
that consume most downtime are those associated with failure codes 1, 2 and
11. Thus the order of priority observed in the Pareto analysis is preserved,
but a clearer picture is available as to which factor (failure frequency or
mean downtime) is dominant.

Figure 2. x-y dispersion plot of mean repair times versus number of failures

A disadvantage of Figure 2 is that the curves of constant downtime are
hyperbolae and can be difficult to plot. A solution to this is to take the
logarithm of equations (1) and (2). Thus:

\[ \log(\mathrm{Cost}_i) = \log(n_i) + \log(\mathrm{MRC}_i) \tag{3} \]

and

\[ \log(\mathrm{Downtime}_i) = \log(n_i) + \log(\mathrm{MDT}_i) \tag{4} \]

where log refers to log10. If an x-y graph is prepared of log(n_i) against
log(MDT_i), the curves of constant downtime now appear as straight lines with
uniform negative gradient (see Figure 3). Logarithmic scatterplots simplify the
identification of those failures which contribute most to total equipment
downtime or cost, whilst continuing to permit the visualisation of the influence
of failure frequency and mean downtime.
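The effect of the logarithmic transformation can be checked numerically. The sketch below is illustrative only and uses hypothetical helper names: it computes the log coordinates of a failure code and shows that points of equal total downtime fall on a line whose gradient is -1 in log-log space.

```python
import math


def log_coordinates(n_failures, total_downtime):
    """Return (log10 n, log10 MDT, log10 downtime) for one failure code."""
    mdt = total_downtime / n_failures  # mean downtime per failure
    return math.log10(n_failures), math.log10(mdt), math.log10(total_downtime)


def log_mdt_on_isoline(downtime, n_failures):
    """log10(MDT) on the constant-downtime line through `downtime` at n failures."""
    return math.log10(downtime) - math.log10(n_failures)


# Failure code 1 from Table I: 30 failures, 1,015 minutes of downtime.
x, y, d = log_coordinates(30, 1015)
print(x + y, d)  # the two values agree up to rounding

# Moving one decade to the right along a constant-downtime line
# lowers log10(MDT) by one decade, i.e. the gradient is -1.
drop = log_mdt_on_isoline(1000, 10) - log_mdt_on_isoline(1000, 100)
print(drop)
```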
Repairs that require lengthy downtime can be considered acute problems.
Those failures that frequently reoccur (i.e. high n) can be considered chronic
problems. By determining threshold limits, the log scatterplot can be divided
into four quadrants, as shown in Figure 4. The upper quadrants denote acute
failures, whilst the right-hand quadrants denote chronic failures. The upper
right-hand quadrant is a region of acute and chronic failures.
Limit determination
Thresholds can either be absolute values determined by company policy, or
relative values that depend on the relative magnitudes and quantity of data.
One approach for determining relative values is to use average values as
follows.

Figure 3. Log dispersion plot of mean repair times versus number of failures

Figure 4. Log scatterplot showing limit values

The total cost, C, or downtime, D, consumed by unplanned failures is given by:

\[ C = \sum_i \mathrm{Cost}_i \tag{5} \]

and

\[ D = \sum_i \mathrm{Downtime}_i \tag{6} \]

The total number of failures is:

\[ N = \sum_i n_i \tag{7} \]

Letting Q be the number of distinct failure codes used to categorise the repair
data, the threshold limit for acute failures can be defined as:

\[ \mathrm{Limit}_{\mathrm{MRC}} = \frac{C}{N} \tag{8} \]

or

\[ \mathrm{Limit}_{\mathrm{MDT}} = \frac{D}{N} \tag{9} \]

and the threshold limit for chronic failures can be determined as:

\[ \mathrm{Limit}_{n} = \frac{N}{Q} \tag{10} \]

In the case of the unplanned electrical failures for the fleet of shovels,
D = 7,797 minutes, N = 270 and Q = 17. Therefore, the limit value for acute
failures is 7,797/270 = 28.9 minutes and the limit value for chronic failures is
270/17 = 15.9 repairs.
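These two limit values follow directly from the column totals of Table I. A minimal sketch of the calculation, with a hypothetical function name, is:

```python
# Quantity and duration (minutes) columns of Table I, in table order.
QUANTITIES = [30, 15, 36, 27, 23, 13, 7, 12, 21, 8, 10, 26, 15, 6, 7, 5, 9]
DURATIONS = [1015, 785, 745, 690, 685, 600, 575, 555, 395, 355,
             277, 240, 225, 220, 165, 155, 115]


def relative_limits(quantities, durations):
    """Return (Limit_MDT, Limit_n) using the average-value approach.

    Limit_MDT = D / N  (mean downtime over all failures)
    Limit_n   = N / Q  (mean number of failures per failure code)
    """
    n_total = sum(quantities)   # N
    d_total = sum(durations)    # D
    q_codes = len(quantities)   # Q
    return d_total / n_total, n_total / q_codes


limit_mdt, limit_n = relative_limits(QUANTITIES, DURATIONS)
print(round(limit_mdt, 1), round(limit_n, 1))
```

The same function applies to repair costs if the duration column is replaced by cost per failure code.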
Jack-knife diagrams
When dealing with large data sets, it may be desirable to focus on only those
chronic failures having highest direct cost or downtime impact. To this effect,
the right-hand lower quadrant can be divided into two regions, A and B as
illustrated in Figure 4. The dividing limit is a line of constant cost or downtime,
defined by the product of the two limits shown in equations (8) and (10) or (9)
and (10) according to which parameter is of interest. The expression for this
line is:
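\[ \mathrm{Cost}_i = \frac{C}{Q} \quad \text{where } 0 < \mathrm{MRC}_i \le \frac{C}{N} \tag{11} \]

or

\[ \mathrm{Downtime}_i = \frac{D}{Q} \quad \text{where } 0 < \mathrm{MDT}_i \le \frac{D}{N} \tag{12} \]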

In a similar manner, the acute failures in the left-hand upper quadrant could be
divided according to direct cost or downtime impact. However, proportionally
greater benefit can be obtained by preventing the reoccurrence of a single acute
failure than preventing the reoccurrence of a chronic failure. For this reason, all
of the acute failures remain within the priority area defined by the limit shown
in Figure 4. The resulting graphs have been christened "jack-knife" diagrams
after the inverted V shape of the limit. In Table II the unplanned electrical
breakdowns for the shovel fleet have been classified according to jack-knife
principles.
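The classification can be sketched in a few lines of code. This is an illustration under stated assumptions rather than the author's implementation: the function name is hypothetical, strict inequalities are assumed at the limits (the paper does not say how points lying exactly on a limit are treated), and the limits are the relative values derived from the Table I totals.

```python
# Table I data: failure code -> (number of failures, total downtime in minutes).
TABLE_I = {
    1: (30, 1015), 2: (15, 785), 11: (36, 745), 3: (27, 690), 10: (23, 685),
    7: (13, 600), 12: (7, 575), 8: (12, 555), 5: (21, 395), 15: (8, 355),
    6: (10, 277), 9: (26, 240), 4: (15, 225), 17: (6, 220), 14: (7, 165),
    16: (5, 155), 13: (9, 115),
}


def jack_knife_classes(data):
    """Map each failure code to its jack-knife region."""
    n_total = sum(n for n, _ in data.values())   # N
    d_total = sum(d for _, d in data.values())   # D
    q_codes = len(data)                          # Q
    limit_mdt = d_total / n_total                # acute limit
    limit_n = n_total / q_codes                  # chronic limit
    limit_downtime = d_total / q_codes           # line dividing chronic A from B

    classes = {}
    for code, (n, downtime) in data.items():
        acute = downtime / n > limit_mdt         # assumed strict inequality
        chronic = n > limit_n                    # assumed strict inequality
        if acute and chronic:
            classes[code] = "acute and chronic"
        elif acute:
            classes[code] = "acute"
        elif chronic:
            classes[code] = "chronic A" if downtime > limit_downtime else "chronic B"
        else:
            classes[code] = "none"
    return classes


for code, region in jack_knife_classes(TABLE_I).items():
    print(code, region)
```

Codes that fall in the "none" region (for example rope limit protection, code 6) lie outside the priority area and do not appear in Table II.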
Root cause failure analysis and remedial action
To improve equipment availability, attention should be focussed on either
reducing or eliminating the number of unplanned failures, or reducing the time
necessary to diagnose and repair failures.


Table II. Unplanned shovel downtime: electrical maintenance problems prioritised according to jack-knife principles

Code  Description                          Quantity  Duration (min)  Time (%)  Average time (min)
Acute and chronic failures
1     Electrical inspections                     30           1,015      13.0                33.8
10    Overload relay                             23             685       8.8                29.8
      Sub total                                                          21.8                63.3
Acute failures
2     Damaged feeder cable                       15             785      10.1                52.3
7     Auxiliary motors                           13             600       7.7                46.2
12    Earth faults                                7             575       7.4                82.1
8     Main motors                                12             555       7.1                46.3
15    Air compressor                              8             355       4.6                44.4
17    Overcurrent faults                          6             220       2.8                36.7
16    Operator controls                           5             155       2.0                31.0
      Sub total                                                          41.7               339.0
Chronic failures type A
11    Motor overtemperature                      36             745       9.6                20.7
3     Change of substation or shovel move        27             690       8.8                25.6
      Sub total                                                          18.4                46.3
Chronic failures type B
5     Power cuts to substations                  21             395       5.1                18.8
9     Shovel lights                              26             240       3.1                 9.2
      Sub total                                                           8.2                28.0

Once a prioritised list of failure codes has been identified, hypotheses can be
made about the possible cause (or causes) of each problem. Experienced
maintenance and operating personnel are indispensable to this process, since
familiarity with the machine, the operating environment and with maintenance
and operating practices is required. A list of possible root causes is
illustrated in Table III. Although not necessarily exhaustive, these root causes
can be grouped according to whether they are inspection, maintenance,
operational, design, material quality or maintenance resource problems.
Chronic repairs are often associated with component quality defects, equipment
design problems, inappropriate operator practices or poor quality control in
upstream processes.
Two good examples of chronic repairs are provided by the data. Motor
overtemperature alarms (failure code 11) are more often than not a result of poor
blast fragmentation or shovel abuse. In both cases, corrective action should be
directed at mine operations. Outages to the shovel lighting system (failure code
9) typically result from wiring damage due to structural vibration or poor filament
reliability. Redesign of the wiring harness may be one way of tackling this
problem. Another chronic problem, power cuts to the feeder substation (failure
code 5) could be due to operational planning problems or the electrical company
supplying power to the mine. The maintenance department can do little other
than draw mine management's attention to the problem.

Table III. Root causes and possible actions

Root cause of failure or repair delay
1. Inspection
   A. Insufficient inspection frequency
   B. Inadequate inspection procedures
   C. Poor quality inspection
   D. Difficulty in accessing/diagnosing component
2. Maintenance
   A. Insufficient PM frequency
   B. Inadequate work procedures
   C. Poor quality PM
   D. Poor quality component installation
3. Operation
   A. Incorrect operation or operator abuse
   B. Poor quality control in upstream process
4. Design
   A. Original component or design inadequate for conditions
   B. Modified component or design inadequate for conditions
5. Materials
   A. Variation in component quality: one supplier
   B. Variation in component quality: many suppliers
6. Resources
   A. Wait on spares
   B. Wait on personnel
   C. Wait on shop space
   D. Wait on tools

Action
   A. Increase inspection frequency
   B. Revise inspection procedures and training
   C. Revise PM or inspection supervision
   D. Increase PM frequency
   E. Analyse criteria for replacing minor components
   F. Revise PM work procedures and training
   G. Revise installation procedures and training
   H. Design warning system for operator abuse
   I. Design warning system for impending failure
   J. Implement operator precautions
   K. Analyse extreme operating conditions of machine
   L. Modify or adapt machine or component design
   M. Change component supplier
   N. Standardise component supplier
   O. Analyse potential to extend service life of spares
   P. Analyse procedures for reconditioning spares
   Q. Revise spares stocking policy
   R. Contract extra labour
   S. Purchase/lease additional tools

The process of repairing an unplanned failure begins with an operator
notifying the maintenance department that the machine is faulty. Time is then
required for maintenance personnel to travel to the machine, access the faulty
component and diagnose the problem. If a spare part is required, more time is
lost in collecting and transporting the spare from the warehouse to the
machine. If the spare is not immediately available, the machine may have to
remain idle until a suitable component can be obtained. When the spare arrives
at the machine, active repair work begins. The final stage in a repair involves
testing the machine to verify that it has been returned to its normal operating
state. Possible repair delays include difficulty in accessing and/or diagnosing a
faulty component, and waiting on maintenance resources (spares, personnel,
tools or workshop space; see Table III). Good examples of problems subject to
extended repair delays are earth faults (failure code 12). Earth faults in
electrical circuits can be difficult to diagnose and isolate. In order to reduce the
time necessary to isolate earth faults, the mine could consider installing or
modifying shovel indicator panels so that they display more detailed
information concerning the electrical status of various points in a circuit.
Following the assignment of root causes to each failure code, a set of actions
should be formulated to eliminate or mitigate the factors causing unplanned
downtime. A list of possible actions for eliminating or reducing unplanned
downtime is shown in Table III. Table IV illustrates the application of these
principles to the maintenance priorities previously identified for the electric
shovel fleet.
Some maintenance actions may necessitate investment on the part of the
mine. An estimation of the expected reduction in downtime allows the
maintenance department to undertake a cost/benefit evaluation of
implementing the maintenance action plan. If the cost savings are projected
over, say, a five-year period, an NPV can be calculated for the maintenance
project. The advantage of this approach is that it permits senior management
to evaluate maintenance projects alongside competing project alternatives.
Maintenance need no longer be perceived as a costly overhead, but as a
strategic tool to maximise asset utilisation.
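As a rough illustration of the cost/benefit evaluation described above, the net present value of a maintenance project can be computed as below. This is a generic discounted cash flow sketch, not a method taken from the paper: the function name, the discount rate, the investment and the annual savings are all hypothetical figures chosen only to show the arithmetic.

```python
def npv(rate, initial_investment, annual_savings):
    """Net present value of a project.

    rate               -- discount rate per year (e.g. 0.10 for 10 percent)
    initial_investment -- outlay at time zero
    annual_savings     -- savings received at the end of years 1, 2, ...
    """
    discounted = sum(
        saving / (1.0 + rate) ** year
        for year, saving in enumerate(annual_savings, start=1)
    )
    return discounted - initial_investment


# Hypothetical example: a 100,000 investment that is expected to save
# 30,000 per year in downtime costs over five years, discounted at 10 percent.
print(round(npv(0.10, 100_000, [30_000] * 5), 2))
```

A positive result indicates that the projected savings outweigh the investment at the chosen discount rate, which is the basis on which the project can be compared with competing alternatives.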
Table IV. Proposed actions for reducing unplanned shovel electrical downtime

Code  Description                          Root cause(s)  Action
1     Electrical inspections               2A             B, F
10    Overload relay                       3A, 3B         J, K
2     Damaged feeder cable                 3A             J
7     Auxiliary motors                     2A             B, F
12    Earth faults                         1B, 1D         B
8     Main motors                          2A             B, F
15    Air compressor                       1B, 2C         B, C, F
17    Overcurrent faults                   3A             J
16    Operator controls                    4A             D
11    Motor overtemperature                3A, 3B         J, K
3     Change of substation or shovel move
5     Power cuts to substation
9     Shovel lights                        1A, 5A         A, N

Jack-knife trend plots
A further benefit of logarithmic scatterplots is that they provide a useful means
of visualising trends in maintenance performance. For example, Figure 5 shows
the evolution of four failure codes from a BE 495-B cable shovel working at an
open pit copper mine in Chile. Unplanned failures were analysed for a period of
three years, 1997 to 1999 inclusive. The threshold limits used in the graph were
calculated relative to the total unplanned failure data set for the three-year
period.

Figure 5. Trends in unplanned failures for BE 495-B cable shovel

It can be seen that significant improvement has been made with respect to
two of the failure codes over the period of the study. Unplanned failures to the
shovel lubrication system were chronic in 1997 and 1998, and not classified in
1999. Similarly, the total downtime due to failures of controls in the operator
cabin has decreased. However, unplanned failures to the swing system
(comprising the two swing motors, spur gears and main ring gear) are
obviously an area of concern, increasing from acute in 1997 to chronic and
acute in 1999. Likewise, unplanned stoppages due to motor over-temperature
alarms (alarms) also increased in both frequency and duration (data was not
available for the 1999 period to confirm this tendency).
Another potential application of jack-knife trend diagrams is to the
preparation of maintenance budgets. A log scatterplot of the repair costs
incurred during the most recent time period could assist a maintenance
manager to fix performance targets for forthcoming periods. It is postulated
that Windows-based software could be developed to help automate this
procedure. Using a mouse, the points representing failure codes in the log
scatterplot could be selected and dragged to desired target positions. The
software could then automatically calculate the resulting cost and downtime
reductions, as well as display the corresponding operating budget for the
maintenance department.
Establishing failure priorities from trend data
When data is available for two consecutive time periods, maintenance priorities
should be established by considering not only the chronic or acute
classification of the most recent data points, but also the trend in movement of
those points. As Table V shows, six possible combinations exist for the
movement of failure codes between two consecutive time periods. Δn_i, ΔMDT_i
and ΔDowntime_i refer to changes in the mean number of failures, mean
downtime and total downtime experienced by the ith failure code over
successive time periods (in the case that repair cost is the parameter of interest,
the parameters Δn_i, ΔMRC_i and ΔCost_i should be used).
Using these six possible combinations, failure priorities can be established
as shown in Table VI. Three priority classifications are suggested: high,
medium and low. High priority is assigned to those unplanned failures
determined to have increased in total downtime or cost and currently
positioned in the priority area defined by the jack-knife limit (comprising the
acute, chronic and acute, and chronic type A quadrants). Medium priority is
assigned to those failure codes that have:
• experienced a reduction in total downtime or cost (i.e. some progress has
  been made) yet are still currently located in the priority area defined by
  the jack-knife limit; and
• increased in total downtime or cost and are currently classified as
  chronic type B failures.
Remaining failure codes are classified as low priority.
Table V. Classes of possible failure code trends

Class  Δn_i      ΔMDT_i    ΔDowntime_i
I      Decrease  Decrease  Decrease
II     Decrease  Increase  Decrease
III    Increase  Decrease  Decrease
IV     Decrease  Increase  Increase
V      Increase  Decrease  Increase
VI     Increase  Increase  Increase

Table VI. Matrix to establish failure priorities (1 = high, 2 = medium, 3 = low)

       Most recent quadrant
Class  None  Chronic B  Chronic A  Acute  Acute and chronic
I      3     3          2          2      2
II     3     3          2          2      2
III    3     3          2          2      2
IV     3     2          1          1      1
V      3     2          1          1      1
VI     3     2          1          1      1

This classification scheme assumes that if the total downtime or cost
contribution of a failure code has reduced over two successive time periods,
then the maintenance department must be taking positive steps and need not
necessarily modify their policies regarding the corresponding component or
subsystem. (In practice, other factors may also influence failure frequencies and
associated repair times: for example, seasonal variations. If failure audits are
carried out on a regular basis, then such variations will eventually come to the
attention of maintenance personnel.) In the short term, the maintenance
department should focus its efforts on redefining maintenance, inspection or
operating policies for those failure codes that show adverse trends and are
associated with most machine downtime or repair cost.
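The trend classes and the priority matrix lend themselves to a simple lookup. The sketch below is one possible encoding, offered as an assumption-laden illustration: the function names and quadrant labels are hypothetical, a zero change is treated as "no increase", and sign combinations that do not appear among the six trend classes return None.

```python
def _sign(value):
    """Return +1 for an increase, -1 for a decrease and 0 for no change."""
    return (value > 0) - (value < 0)


# The six trend classes keyed by the signs of the changes in
# (number of failures, mean downtime, total downtime).
TREND_CLASSES = {
    (-1, -1, -1): "I",
    (-1, +1, -1): "II",
    (+1, -1, -1): "III",
    (-1, +1, +1): "IV",
    (+1, -1, +1): "V",
    (+1, +1, +1): "VI",
}

# Quadrants inside the priority area defined by the jack-knife limit.
PRIORITY_AREA = {"acute", "acute and chronic", "chronic A"}


def trend_class(delta_n, delta_mdt, delta_downtime):
    """Trend class for the given changes, or None if the combination is not listed."""
    key = (_sign(delta_n), _sign(delta_mdt), _sign(delta_downtime))
    return TREND_CLASSES.get(key)


def priority(quadrant, delta_downtime):
    """Priority 1 (high), 2 (medium) or 3 (low) for the most recent quadrant."""
    increased = delta_downtime > 0  # assumption: no change counts as not increased
    if quadrant in PRIORITY_AREA:
        return 1 if increased else 2
    if quadrant == "chronic B":
        return 2 if increased else 3
    return 3


print(trend_class(-2, 5.0, 40.0), priority("acute", 40.0))
```

Only the sign of the change in total downtime (or cost) and the most recent quadrant are needed to assign a priority; the trend class adds the information on which factor drove the change.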
Once failure priorities have been assigned, root cause failure analysis can be
undertaken and an action plan established. The methodology outlined here was
applied with considerable success to determine availability improvements for
the BE 495-B shovel. As an additional benefit, it was found that the
maintenance personnel at the mine were quick to come to terms with the
methodology and, as a result, more willing to accept the results and
recommendations of the failure analysis study.
Conclusions
This paper has identified important deficiencies in Pareto analysis methods
commonly used to determine failure priorities. An alternative means of
establishing failure priorities is proposed using logarithmic (log) scatterplots.
Log scatterplots preserve the basic information content of a Pareto histogram,
but enable the identification of the dominant factors influencing the failures,
namely the failure frequency and mean downtime or cost. By applying limit
values, log scatterplots can be divided into four quadrants in order to classify
failures according to acute or chronic characteristics. This classification
facilitates root cause failure analysis, and allows the identification of chronic
failures often associated with considerable hidden lost production costs. By
trending failure data over successive time periods, log scatterplots provide a
useful graphical means of analysing the performance of maintenance
improvement initiatives. The methodology described in this paper has been
applied and adopted by a number of mining companies and equipment
providers in Chile.
References and further reading
Aranguiz, C.P. (2000), "Analisis de reparaciones imprevistas de equipos mineros en faenas a rajo
abierto", final year thesis, Faculty of Engineering, Catholic University of Chile, Santiago.
Hall, R., Knights, P. and Daneshmend, L.K. (2000), "Pareto analysis and condition-based
maintenance of underground mining equipment", Trans. IMM, Section A: Mining
Industry, Vol. 109, pp. A14-A22.
Knights, P. (1999), "Analysing breakdowns", Mining Magazine, September, Vol. 181 No. 3,
pp. 165-71.
Turina, C. (2001), "Estudio comparativo de las intervenciones imprevistas en flotas de camiones
electricas operando en distintas faenas mineras en Chile", final year thesis, Faculty of
Engineering, Catholic University of Chile, Santiago.
