
Simulation Modelling Practice and Theory 16 (2008) 1689–1703


Detection and interactive isolation of faults in steam turbines to support maintenance decisions

Christer Karlsson a,*, Jaime Arriagada b, Magnus Genrup b

a Mälardalen University, Department of Public Technology, Process Diagnostics Group, P.O. Box 883, 721 23 Västerås, Sweden
b Lund University, Lund Institute of Technology, Department of Heat and Power Engineering, Division of Thermal Power Engineering, P.O. Box 118, 221 00 Lund, Sweden

Article info

Article history:
Received 30 January 2007
Received in revised form 30 June 2008
Accepted 21 August 2008
Available online 9 September 2008

Keywords:
Steam turbine maintenance
Artificial neural network fault detection
Bayesian network fault isolation
Decision support

Abstract
The maintenance of steam turbines is expensive, particularly if dismantling is required. A concept for the provision of support for the maintenance engineer in determining steam turbine status in relation to the recommended maintenance interval is presented here. The concept embodies an artificial neural network which is conditioned to recognise patterns known to be related to faults. The faults simulated are not known to be recognized on-line and the concept is in an early stage of development. An example of a Bayesian network structure containing expert knowledge is proposed to be used, in a dialogue with the operator, to isolate the root causes of a number of fault types. The aim is to be well informed about the status of the turbine in order to take earlier and better informed maintenance actions. The detection procedure has been validated in a simulation environment.
© 2008 Elsevier B.V. All rights reserved.

1. Introduction
Most of the productive equipment in heat and power generating plants is subject to degradation and requires maintenance. Advice on when to carry out major overhauls and other maintenance operations is usually provided by the manufacturers of the equipment. The intervals between these procedures are usually based on a periodic maintenance schedule. The
single largest cost of keeping a steam turbine in operation is related to these activities [19], with dismantling being one of the
most expensive procedures [3]. The dismantling itself introduces the risk of creating problems such as vibration and leakage.
Monitoring and fault diagnostic systems for steam turbines are efficient means of preventing avoidable and costly turbine
maintenance. There are examples of steam turbines operating for up to 17 years without dismantling [3]. This is only possible with efficient monitoring of the turbine.
If a steam turbine owner wishes to change from periodic maintenance to condition-based maintenance, on-line monitoring of the turbine performance is required to provide sufficient information to evaluate the risks of continued operation.
Methods of determining steam turbine condition include thermodynamic calculations to determine turbine efficiency and
leakage [7], and vibration analysis [13]. Methods that address detection and isolation have been reviewed from a risk assessment perspective [8], and research and methods aimed at solving the whole chain of tasks for diagnostics (including detection and isolation) are presented by [18].
In steam turbine engineering, the steps involved in determining the need for maintenance action include fault detection, isolation and identification. Faults are detected by observing abnormal patterns in data. Artificial neural networks (ANNs) are one means by which such patterns can be recognized [5]. ANNs learn patterns from input provided either from plant data or from extensive simulations and are therefore data-driven. Fault isolation can be performed by ANNs, but they do not fully explain the domain and human interaction is needed to compensate for the incomplete system domain knowledge. Isolation is a difficult task but among the possible methods for solving this kind of problem, Bayesian networks (BNs) have been found to be suitable [20]. The operator feeds evidence into a BN that handles uncertainties in variables. The structure of the BN makes it easy to monitor the effect of new evidence on the probability of a certain root cause.

* Corresponding author. Tel.: +46 21 101356; fax: +46 21 101370.
E-mail address: christer.karlsson@mdh.se (C. Karlsson).
1569-190X/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.simpat.2008.08.013

The operator of a heat and power plant solves most operational problems in co-operation with the maintenance staff, but
there are some tasks that require additional support. The identification and isolation of faults that develop slowly is one such
task. Our proposed concept for on-line decision support uses ANNs for fault detection and BNs for fault isolation. ANNs automate the detection of faults, and BNs automate the isolation of faults to some degree. In order to be cost effective, this system
is applied to only a small part of the domain in a heat and power plant: the part that has the greatest impact on maintenance
costs. The system is kept simple and focused on detection and decision support associated with the most expensive maintenance. The system is demonstrated here in an application that only addresses problems that can be determined by thermodynamic models and have an impact on turbine efficiency.
The concept is evaluated by detecting and isolating seven fault types introduced in a simulation of a steam turbine and
surrounding equipment. Training data is from simulations of plant operation at 100% turbine load with the fault types present. An ANN is trained to detect patterns of several fault types. For one of the seven fault types, an expert network has been
developed for a dialogue with the operator. The BNs for fault isolation are constructed from empirical knowledge of turbine
faults [4,12,17] and faults in surrounding equipment [9]. The parts of the concept considered here are the fault detection and
the fault isolation. Sensors in addition to those proposed here can increase the number of detectable faults and enhance performance of detection and isolation. The concept of using a combination of ANNs and a BN to provide a decision support
structure to the operator is presented in the following text, along with a case study and description of fault types used to
demonstrate the concept. Issues related to the selection of methods for each task and the division of the work between these
methods are discussed. Finally we discuss the results and the structure and draw conclusions from the study.
2. Concept of the fault diagnosis system
Previous work by this joint research group has demonstrated that ANNs have excellent capabilities for detecting faults in
thermal power systems. This includes the important capacity to detect incipient faults to permit the generation of early
warnings [1]. However, we have not achieved a satisfactory ANN approach to the performance of root cause analysis. On
the other hand, attempts to solve similar tasks with BNs have revealed their inherent ability to perform root cause analysis [11,22]. Discussion of the pros and cons of both tools has led to the basic idea presented in this
work: the combination of ANNs and BNs, both derived from the artificial intelligence field.
One of the key features of the concept is the dialogue with the operator. The dialogue gives the expert system access to
the common sense of the operator, his eyes and ears, and his ability to quantify relative measures (if a motor is in good or bad
condition, for example). The following paragraphs and Fig. 1 describe the working of the concept from detection to root cause
isolation.
When there is no sensor fault and the ANNs report a developing type 7 fault in the steam cycle, information about the detection of this fault type is sent to the operator. The fault types considered are those which do not require immediate actions, like shutdown, so the operator has time to consider the appropriate course of action. His next task is to isolate the fault to one of the causes of the type 7 fault. This task is carried out by collecting evidence in the form of observations such as manual temperature measurements, a check of the valve stem position, an estimate of equipment status, etc. This evidence is important in distinguishing one root cause from another, but is not important enough to be monitored on-line for process operation under normal conditions. This is another example of the importance of the operator as a user of informed judgment in the context of the constraints on the system, particularly in isolating faults.

Fig. 1. Fault diagnostic system.

Inputs to the BN from the operator via Observations affect the probability of different root causes. When the operator
decides that enough evidence has been collected, he can continue to interact with the system to test alternative hypotheses.
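The effect of one piece of operator evidence on the root-cause probabilities can be illustrated with a single application of Bayes' rule. The sketch below is not the authors' network: the three candidate causes, the prior values and the likelihood values are hypothetical numbers chosen for illustration, and it assumes that exactly one root cause is active.

```python
def update_root_causes(prior, likelihood):
    """Return P(cause | evidence), proportional to P(evidence | cause) * P(cause).

    Assumes the causes are mutually exclusive and exhaustive.
    """
    unnormalised = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("evidence is impossible under every listed cause")
    return {cause: value / total for cause, value in unnormalised.items()}


# Hypothetical prior beliefs about the cause of a detected type 7 fault.
prior = {
    "tube fouling": 0.5,
    "air in-leakage": 0.3,
    "vacuum pump degraded": 0.2,
}

# Hypothetical P(operator reports "vacuum pump in bad condition" | cause).
likelihood = {
    "tube fouling": 0.1,
    "air in-leakage": 0.1,
    "vacuum pump degraded": 0.8,
}

posterior = update_root_causes(prior, likelihood)
for cause, probability in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{cause}: {probability:.3f}")
```

Feeding a posterior back in as the prior for the next observation gives a sequential dialogue, provided the observations are conditionally independent given the cause. A full BN relaxes both simplifying assumptions.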
An important issue here is that maintenance decisions are not solely influenced by the plant owner. The maintenance
instructions of the equipment manufacturer are also important. The insurance company that insures the equipment and
other stakeholders must be convinced of the benefits of the new maintenance strategy. Their requirements must also be taken into account.
In this study, the ANNs for fault detection and the BNs for fault isolation are applied to a simulation of a steam turbine
and its surrounding components (Fig. 2). The application in the example
illustrates how maintenance decisions based on measurements can be supported. Additional measurements that provide
information about the condition of the plant and steam turbine can be incorporated to increase the effectiveness of the fault
detection and isolation. Performance of the detection and isolation tasks presented here usually requires special performance tests performed by experts. The detection levels of the faults are very low and are not achievable using thresholds
on single measurement devices.
We believe that this concept helps operators and plant owners to take appropriate maintenance actions earlier and
on a better foundation. The concept does not aim to replace steam turbine experts, but to provide more relevant information to support improved decision-making. When the correct information about the turbine status is available, a steam
turbine expert is probably consulted at an earlier stage than in the absence of the detection-isolation system.
3. Case study and fault types
The case study of the steam turbine and plant components is presented briefly in this section and the fault types are
described.
3.1. Steam turbine and steam cycle components
The thermal process studied is a biomass-fuelled heat and power co-generation plant with a flue gas condenser and catalytic flue gas cleaning. The boiler was installed in 1994, delivers steam at 540 °C and 100 bar(a), and generates 80 MWth before
losses. The steam powers a turbine that generates up to 23 MWe. The process model is illustrated in Fig. 2 [9]. District heating is obtained from two condensers that output a total of 55 MWth. The flue gas condenser for district heating is not included in the model. The previously developed process model included high and low pressure turbines, turbine gear,
generator, condensers, preheaters, leakage condenser and seals [9]. The components manipulated for simulation of the fault
types in the process model are illustrated in Fig. 2 and further described in the next section.
3.2. Fault types
Dismantling of the turbine is the largest single maintenance cost for a heat and power plant and considerable savings in maintenance costs are made with each year that passes without dismantling the turbine. The fault types and their causes, effects and in some cases treatments are presented below. Seven fault types were selected from the literature and from the experience of the authors [4,9,12,17]. Faults in the steam turbine and surrounding equipment considered are: (1) solid particle erosion on turbine first stage; (2) leakage in overflow valve; (3) fouling and deposits in stages 2 and 3; (4) fouling and deposits in stage 4; (5) damaged shell and rotor sealing; (6) ageing and wear; and (7) fouling and gassing in the condenser.

Fig. 2. Steam turbine and steam cycle [9, p. 136]. Fault types 1–7 and the indicated manipulated process components in the simulations.

The system proposed is intended for a subset of faults with increased emphasis on some examples of faults that are difficult to detect and may potentially cause major economic costs. The fault types 1, 3, 4, 5, and 6 have been chosen as examples of errors that require dismantling of the turbine for verification and correction. Fault types 2 and 7 do not require dismantling of the turbine but in common with faults 1, 3, 4, 5, and 6 are hard to detect and isolate and can cause plant shutdown or damage of the turbine. In this context "hard to detect" means that the installed sensors alone are insufficient to isolate the root cause of the fault. Dialogue with the operator is needed to isolate the fault.
3.2.1. Solid particle erosion on turbine first stage (fault type 1)
Solid particle erosion in the steam path is due to exfoliation of iron oxide and magnetite particles from the high temperature section of the boiler [4]. The impact of the particles on the first turbine stage causes damage to the blades (Fig. 3),
which increases the swallowing capacity of the turbine, and decreases the efficiency of the turbine stage. Solid particle erosion can to some extent be avoided by using a bypass valve that leads the steam to the condenser during start-up. Other
counter-measures to reduce the effects of solid particle erosion include the chemical treatment of the steam system to reduce exfoliation, and armouring the particle removal system and turbine with erosion-resistant coatings [15]. Recently the
use of fewer and larger blades in the first stage has been identified as the most important factor in eliminating solid particle
erosion [26].
3.2.2. Leakage in overflow valve (fault type 2)
Leakage in the overflow valve can be due to a broken spindle [6] or solid particle erosion [4]. Leakage in the overflow valve
causes loss of turbine performance [9]. Steam of high quality bypasses the first turbine stage and is either lost as condensate
in the leakage condenser or re-enters the turbine at a lower pressure. The leakage can be detected by measuring the steam
temperature downstream of the valve and by checking the valve position.
3.2.3. Fouling and carry-over in the turbine steam path (fault types 3 and 4)
Fouling originates from impurities in the raw water entering the steam system and from additives used in water processing. These impurities are transported from the boiler to the superheated steam by three different mechanisms: mechanical
carry-over, vaporous carry-over and attemperators (i.e. spray in a superheater) [17]. The degree of fouling and depositing is
dependent on the boiler drum pressure level, the separation efficiency, spraying in superheaters, and other factors [21]. Fouling in the turbine steam path causes degradation of turbine performance. Deposits change the blade profile and increase the
surface roughness as shown in Fig. 4. Compounds deposit on different turbine parts, depending on the temperature in the
steam path. Fault type 3 consists of fouling and deposits in stages 2 and 3, and fault type 4 consists of fouling and deposits
in stage 4. With correctly located sensors it is possible to distinguish between these two faults. Fouling and deposits can be
reduced by generally improving the quality of the processed water and by reducing spray in the superheaters.
3.2.4. Damaged shell and rotor sealing (fault type 5)
Internal leakage can be caused by factors such as the erosion of seal tips and the rubbing of seals. Vibrations and axial displacement beyond design limits result in contact and rubbing between the seals of rotor and shell which cause deformation of the seals (Fig. 5). Flow patterns in the steam path can also cause erosion of the seals. Internal leakage through imperfect seals causes inefficient passage of high value steam past a turbine stage, decreasing the turbine efficiency [12]. The seal tip geometry is very important for the efficiency of the turbine as even a small deformation can cause a considerable increase in internal leakage.

Fig. 3. Solid particle erosion, picture from [26].

Fig. 4. Fouling on turbine blades, detail of picture from [7].

3.2.5. Ageing and wear (fault type 6)
Increased surface roughness (Fig. 6) and degradation of mechanical components decrease turbine performance. The
causes of ageing and wear include temperature gradients, steam quality, and particles in the steam; the effect
is reduced turbine stage efficiency. This fault is detected as a uniform degradation in performance of the turbine.
3.2.6. Fouling and gassing in the condenser (fault type 7)
Fouling stems from residues in district heating water that build up and insulate the condenser tubes from the inside and thereby reduce heat transfer between the district heating water and the steam. Gassing occurs when non-condensable atmospheric gases form an insulating film on the tubes. This has a considerable impact on heat transfer between the steam and the condenser tubes even at low volume fractions of non-condensable gases [14]. In the low pressure part of the steam turbine the working pressures are lower than atmospheric pressure. External air continuously leaks into the condenser through low pressure turbine seals. This air should be removed from the condenser by vacuum pumps to avoid reduced heat transfer. Another source of non-condensable gases in the condenser is the raw water. The treatment of raw water removes most of these gases but some can reach the condenser, where they accumulate. Air can also leak into the condenser through seals and joints. Fouling and gassing reduce the efficiency of the condenser and increase the temperature difference at the condenser outlet.

Fig. 5. Damaged (bent) rotor sealing inside black circle, detail of picture from [7].

Fig. 6. Increased turbine blade roughness [12].

3.3. Simulation of fault types
The fault types reported in the literature referred to in this work have occurred in different power plants, and have been
investigated and documented [4,6,9,12,17,21,26]. The authors know of no data set from a real process where a plant has been
subject to all the faults presented.
Production of artificial data is preferred to full-scale testing with physical components because of the known relationships
between model parameters and fault types that make it possible to induce the faults in the mass and heat balance model, the
lack of risk to plant personnel and equipment, the low cost of experiments and output that is free from problems such as
outliers and missing data.
Generation of process data with the fault types present was conducted using a set of district heating loads and process
plant set-points. The outgoing steam pressure and temperature from the boiler were set to fixed values corresponding to
the full load. This mimics the goal of the operator to maintain high pressure and the target temperature at the outlet of the boiler
to maximize power generated in the turbine.
The simulation output was used to train the ANN to recognise and memorise fault type patterns. A model of a steam
cycle [9] that included a steam turbine was used in the simulation. The fault types were simulated by manipulating parameters in the model as indicated in the description of the fault types. The parameters and the associated fault types are summarized in Table 1. For fault types 2, 5 and 7, some relationships between cause and effect which permit the isolation of the
root cause of the fault are known. For fault type 7 a BN structure has been developed for operator interaction as an example
of how to isolate the root cause of this fault.
4. Fault detection by artificial neural networks
This section explains how ANNs can be used to detect faults. The basic idea is to make use of the pattern recognition
ability of ANNs to interpret combinations of data that would be difficult for a human expert to interpret. It is not possible to
set alarm thresholds on single measurement devices to detect the faults simulated because of strong interactions between
measurements, part load operations, and subtle changes in correlations between measurements that are related to the simulated faults.
Although the ANN procedure is similar to the procedure that is often used by operators in the control room, ANNs are able to handle systems with high degrees of interaction and multiple inputs and outputs. After the patterns have been recognised they can be classified into groups. In this way, different combinations of data representing several malfunctions can be sorted into different groups of pre-determined faults.

Table 1
Manipulated parameters in the process model by fault type

| Fault type | Description | Manipulated parameters |
|---|---|---|
| 1 | Solid particle erosion | Increased swallowing capacity and decreased efficiency for first stage |
| 2 | Leakage in overflow valve | Leakage flow through overflow valve |
| 3 | Carry-over and deposits on stages 2 and 3 | Turbine parts 2 and 3 decreased swallowing capacity and efficiency |
| 4 | Carry-over and deposits on stage 4 | Stage 4 decreased swallowing capacity and efficiency |
| 5 | Damaged shell and rotor sealing | Increased mass flow through sealing |
| 6 | Ageing and wear | Decreased efficiency for all turbine stages |
| 7 | Fouling and gassing in the condensers | Reduced heat transfer coefficients for both condensers |

4.1. Artificial neural networks: a short introduction
An ANN is a mathematical construction that can be used for modelling multi-dimensional systems, i.e. mapping of many
inputs onto many outputs [10]. Among their most common applications are pattern recognition and multi-dimensional nonlinear regression [5]. ANNs are not programmed like conventional code; instead they learn from experience. This experience is represented by data that is used to train the ANN. An ANN is therefore classified as a data-driven system.
There are different types of neural networks. The type of network depends on factors such as their architecture, the paradigm used to train them, and the direction in which the data flows through them, among others. When the data flows
strictly forward within an ANN, this is referred to as a feed-forward network. If feedback connections are introduced, then
the ANN is a feedback or recurrent network (which can be used to introduce time-dependency, for instance). Feed-forward
networks are trained in a supervised manner, meaning that the ANN is trained using patterns for which the desired output is
known beforehand.
The simplest feed-forward ANN consists of a single processing unit (or neuron) in which several parallel input signals $[x_1, x_2, \ldots, x_M]$ are summed after they have been multiplied by their respective synaptic connections, i.e., the weights $[w_1, w_2, \ldots, w_M]$. The resulting weighted signal (s) is the effective input to an activation or transfer function, F, which produces an output signal, y. An additional input, $x_0$, with the fixed value of +1, is added in order to introduce an off-set to F. Its weight $w_0$ is called the bias. The output from this simple network can be represented as a generic function of the inputs:

$$ y = F\left( \sum_{i=0}^{M} w_i x_i \right) \tag{1} $$

where $x_0 = +1$.
If F is a threshold function, then this network is called a Perceptron (see Fig. 7), which was first presented by Rosenblatt in
1958 [16], and if F is a linear function, then this is a linear classifier called ADALINE, introduced by Widrow and Hoff in 1960
[27]. Fig. 8 shows the threshold and the linear transfer functions.
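The single-neuron expression above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study: the step convention (output 1 for s ≥ 0, otherwise 0) and the example weights are assumptions, since the exact threshold function of Fig. 8 is not reproduced here.

```python
from typing import Callable, Sequence


def threshold(s: float) -> float:
    """Threshold (step) transfer function; the 0/1 convention is an assumption."""
    return 1.0 if s >= 0.0 else 0.0


def linear(s: float) -> float:
    """Linear transfer function, as used by ADALINE."""
    return s


def neuron_output(x: Sequence[float], w: Sequence[float],
                  F: Callable[[float], float]) -> float:
    """Compute y = F(sum_{i=0..M} w_i * x_i) with x_0 = +1, so w[0] is the bias."""
    if len(w) != len(x) + 1:
        raise ValueError("w must hold one bias weight plus one weight per input")
    s = w[0] * 1.0 + sum(wi * xi for wi, xi in zip(w[1:], x))
    return F(s)


# Hypothetical weights that make a Perceptron behave as a logical AND gate.
w_and = [-1.5, 1.0, 1.0]
and_table = [neuron_output([a, b], w_and, threshold)
             for a in (0.0, 1.0) for b in (0.0, 1.0)]
print(and_table)

# The same unit with a linear transfer function returns the weighted sum itself.
print(neuron_output([2.0, 3.0], [0.5, 1.0, -1.0], linear))
```

Swapping the transfer function is the only change needed to move between the Perceptron and the ADALINE form of the unit.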
This basic configuration can be expanded both in depth and length, i.e., units can be added in parallel to form a layer of
neurons, and more layers can be added after each other. In the latter case, the ANN is referred to as a multi-layer feed-forward network, which is formed by an input layer, an output layer, and one or more hidden layers, as shown in Fig. 9. The
input signals, the weights and the outputs can be arranged in vectors and matrices, in order to simplify the mathematical
notation.
The input layer collects but does not process the information from the environment (the system to be modeled or analyzed). The hidden and the output layers are the processing layers of the network. The weight matrices store the information
about the underlying governing relationships of the actual system. They are the long-term memory of the ANN [23]. The output vector sends the response of the ANN back out to the environment. Every input unit corresponds to an input parameter,
and every output neuron corresponds to an output parameter. This kind of ANN is called multi-layer perceptron (MLP), because of its kinship to the Perceptron.
Fig. 7. The Perceptron.

Fig. 8. Threshold and linear transfer function.

Fig. 9. Feed-forward multi-layer network.

It has been shown that one hidden layer is sufficient to carry out non-linear mapping of a continuous function if the number of hidden neurons may be increased [10]. Therefore, we only consider two-layered MLPs in this study (the input layer is not counted in the number of layers). The generic expression for a two-layered MLP with M input signals, H neurons in the hidden layer and N outputs has the following form:

$$ y_k = F_o\left( \sum_{j=0}^{H} w_{kj} \cdot F_h\left( \sum_{i=0}^{M} w_{ji} x_i \right) \right) \tag{2} $$

where $k = 1, \ldots, N$ and $x_0 = +1$.
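A forward pass through the two-layered MLP of Eq. 2 can be sketched as follows. The choice of tanh for the hidden layer and the logistic sigmoid for the output layer, the random weights and the 22-24-8 dimensions used in the example are assumptions for illustration. The j = 0 term of the outer sum is treated here as a fixed +1 bias unit, mirroring x_0 = +1 at the input.

```python
import math
import random


def logistic(s):
    """Logistic sigmoid, mapping any real input to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def mlp_forward(x, W_h, W_o, F_h=math.tanh, F_o=logistic):
    """Forward pass of a two-layered MLP.

    W_h has H rows of M + 1 weights (column 0 multiplies x_0 = +1).
    W_o has N rows of H + 1 weights (column 0 multiplies the hidden bias unit).
    """
    x_ext = [1.0] + list(x)
    if any(len(row) != len(x_ext) for row in W_h):
        raise ValueError("each hidden row needs M + 1 weights")
    hidden = [F_h(sum(w * v for w, v in zip(row, x_ext))) for row in W_h]
    h_ext = [1.0] + hidden
    if any(len(row) != len(h_ext) for row in W_o):
        raise ValueError("each output row needs H + 1 weights")
    return [F_o(sum(w * v for w, v in zip(row, h_ext))) for row in W_o]


def random_weights(rows, cols, rng):
    """Small random initial weights; the range is an arbitrary choice."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
M, H, N = 22, 24, 8                      # illustrative layer sizes
W_h = random_weights(H, M + 1, rng)
W_o = random_weights(N, H + 1, rng)
x = [rng.random() for _ in range(M)]     # made-up, already scaled inputs
y = mlp_forward(x, W_h, W_o)
print(len(y), [round(v, 3) for v in y])
```

With untrained weights the outputs carry no meaning; the example only shows the data flow and the shapes of the two weight matrices.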
Depending on the application, it may not be necessary for the hidden and output layers to have the same activation functions. For this reason Eq. (2) uses a different subscript on F for each layer. The training of the MLP requires supervised learning, i.e., that
a data set for which the targets are known is available. The most popular training algorithm is the backpropagation method,
which was popularized by Rumelhart et al. in 1986 [24]. The principle is to present an input to the ANN, compare the output
generated with the target, and to adjust the weights if there is an error. The updating of the weights continues until a least
mean square (LMS) error function reaches the training goal. The weight correction is assumed to be in the decreasing direction of the gradient of the error function with respect to the weights. During the training, the MLP learns the internal representations for the training data, and once the training is over, it can make predictions for new input patterns. This method
requires activation functions which are differentiable in their entire range, such as the sigmoidal activation functions, for
example. The two most popular functions are the logistic sigmoid and the tanh sigmoid, which are shown in Fig. 10.
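The training principle described above (present an input, compare the output with the target, move the weights down the error gradient) can be sketched for a very small two-layered MLP. This is a toy illustration under stated assumptions, not the training setup of the study: the 2-3-1 network, the four training patterns, the learning rate, the number of epochs and the use of batch rather than pattern-by-pattern updates are all arbitrary choices.

```python
import math
import random


def logistic(s):
    return 1.0 / (1.0 + math.exp(-s))


def forward(x, W_h, W_o):
    """Return extended input, extended hidden activations and outputs."""
    x_ext = [1.0] + list(x)
    h_ext = [1.0] + [math.tanh(sum(w * v for w, v in zip(row, x_ext))) for row in W_h]
    y = [logistic(sum(w * v for w, v in zip(row, h_ext))) for row in W_o]
    return x_ext, h_ext, y


def mean_squared_error(data, W_h, W_o):
    """Mean over patterns of 0.5 * sum_k (y_k - t_k)^2."""
    total = 0.0
    for x, t in data:
        _, _, y = forward(x, W_h, W_o)
        total += 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, t))
    return total / len(data)


def train_epoch(data, W_h, W_o, lr):
    """One batch gradient-descent step; W_h and W_o are updated in place."""
    g_h = [[0.0] * len(row) for row in W_h]
    g_o = [[0.0] * len(row) for row in W_o]
    for x, t in data:
        x_ext, h_ext, y = forward(x, W_h, W_o)
        # Output deltas: error times the derivative of the logistic sigmoid.
        d_o = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, t)]
        # Hidden deltas: propagate back through W_o, with tanh' = 1 - h^2.
        d_h = [(1.0 - h_ext[j + 1] ** 2)
               * sum(d_o[k] * W_o[k][j + 1] for k in range(len(W_o)))
               for j in range(len(W_h))]
        for k, dk in enumerate(d_o):
            for j, hj in enumerate(h_ext):
                g_o[k][j] += dk * hj
        for j, dj in enumerate(d_h):
            for i, xi in enumerate(x_ext):
                g_h[j][i] += dj * xi
    n = len(data)
    for W, g in ((W_h, g_h), (W_o, g_o)):
        for row, g_row in zip(W, g):
            for i in range(len(row)):
                row[i] -= lr * g_row[i] / n


rng = random.Random(0)
M, H, N = 2, 3, 1
W_h = [[rng.uniform(-0.5, 0.5) for _ in range(M + 1)] for _ in range(H)]
W_o = [[rng.uniform(-0.5, 0.5) for _ in range(H + 1)] for _ in range(N)]

# Toy patterns with known targets (supervised learning).
data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
        ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]

initial_error = mean_squared_error(data, W_h, W_o)
for _ in range(500):
    train_epoch(data, W_h, W_o, lr=0.5)
final_error = mean_squared_error(data, W_h, W_o)
print(initial_error, final_error)
```

In a real application the loop would end when the error reaches the training goal rather than after a fixed number of epochs.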
In practice, the available training data set is divided into two portions: one for training, and a second for cross validation
during training. When the performance of the ANN using this cross validation data set is satisfactory, the training procedure
is stopped and the weights remaining are kept unchanged. The ANN is then tested further on a third independent data set in
order to determine if it has a generalization capability, i.e. if its performance is also satisfactory for previously unknown patterns that were not used during the training phase [2]. An ANN with generalization capability is expected to perform well in
a real case application for the system for which it has been trained if the data used for training is representative of the
system.
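The data handling described in this paragraph can be sketched as a three-way split plus a stopping rule on the cross validation error. The split fractions, the random seed, the error goal and the list of validation errors below are hypothetical; the text does not state the values used in the study.

```python
import random


def split_patterns(patterns, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the patterns and split them into training, validation and test sets."""
    items = list(patterns)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_fraction)
    n_val = int(len(items) * val_fraction)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


def stop_epoch(val_errors, goal):
    """Return the first epoch whose validation error meets the goal, else None."""
    for epoch, error in enumerate(val_errors):
        if error <= goal:
            return epoch
    return None


train, val, test = split_patterns(range(10))
print(len(train), len(val), len(test))

# Made-up validation errors recorded after each training epoch.
val_errors = [0.50, 0.20, 0.09, 0.05]
print(stop_epoch(val_errors, goal=0.1))
```

The test set is left untouched until training has stopped, so it gives an independent check of the generalization capability.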

Fig. 10. Sigmoidal activation functions.


4.2. Structure of the artificial neural network system for fault detection and selection of the input parameters
ANNs of the type described above constitute the fault detection module of the monitoring system. Sensor validation is not
considered in this theoretical example, but is needed in an industrial application.
Twenty-three parameters that were measured in the real system are selected as input parameters for training the ANN,
based on empirical evidence. The input parameters for ANN#1 and ANN#2 as shown in Table 2 have been recognized to be
related to the faults studied. Later on, the same parameters are fed to the fault detection module (the trained ANNs) in order to identify the faults during the operation of the plant.
In total 29 data sets were generated with a heat and mass balance model of the plant. These included one for the Healthy condition (H), and one for each of four different levels (25%, 50%, 75% and 100%) of each fault. The data sets for the Healthy and for Faulty conditions at 100% fault level were used to train different ANNs by a trial-and-error procedure where the number of neurons in the hidden layer of the network was varied. The ANN that showed the best overall capacity to recognise the faults contained 24 hidden neurons, and is referred to as ANN#1. ANN#1 contains eight output neurons, corresponding to the number of conditions to be diagnosed, one Healthy, and seven Faulty. The architecture of ANN#1 is therefore 22-24-8 (22 input neurons, 24 hidden neurons and 8 output neurons).
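The bookkeeping in this paragraph can be checked with a short script: one healthy data set plus four levels of each of seven faults gives 29 data sets, of which eight (healthy and the 100% levels) are used for training. The weight count assumes one bias weight per hidden and output neuron, as in the expressions of Section 4.1; the label strings are illustrative names, not identifiers from the study.

```python
CONDITIONS = ["H", "F1", "F2", "F3", "F4", "F5", "F6", "F7"]
FAULT_LEVELS = [25, 50, 75, 100]  # percent of the fully developed fault

# One healthy data set, then one data set per fault type and fault level.
data_sets = [("H", 0)] + [(f"F{k}", level)
                          for k in range(1, 8) for level in FAULT_LEVELS]

# Training uses the healthy set and the fully developed (100%) faults only.
training_sets = [d for d in data_sets if d[0] == "H" or d[1] == 100]


def target_vector(condition):
    """Binary target: 1 at the position of the condition, 0 elsewhere."""
    return [1 if c == condition else 0 for c in CONDITIONS]


def n_weights(m, h, n):
    """Weights in an m-h-n MLP, assuming one bias per hidden and output neuron."""
    return (m + 1) * h + (h + 1) * n


print(len(data_sets), len(training_sets))
print(target_vector("F5"))
print(n_weights(22, 24, 8))
```

Under the same bias assumption, the count for any other architecture follows from `n_weights`.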
The eight outputs are defined as binary, i.e. either 1 or 0. When input data is presented to the ANN, the desired output is 1 at the
position corresponding to the predicted condition (H, F1, F2, F3, F4, F5, F6 or F7) and the remaining positions are occupied by
zeros. In practice the ANN produces values between 0 and 1 for each output neuron. The detection was divided into three classes: detected, not detected and not classified. Using arbitrarily chosen threshold values the fault is considered detected for values
≥ 0.7, not detected for values ≤ 0.3, and values in between are defined as not classified.
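The three-class rule can be written directly as a small function. The thresholds 0.7 and 0.3 are those given in the text; the example output vector is made up for illustration.

```python
CONDITIONS = ["H", "F1", "F2", "F3", "F4", "F5", "F6", "F7"]


def classify_output(value, detect_at=0.7, reject_at=0.3):
    """Map one output-neuron value in [0, 1] to one of the three detection classes."""
    if value >= detect_at:
        return "detected"
    if value <= reject_at:
        return "not detected"
    return "not classified"


def classify_outputs(outputs):
    """Apply the rule to all eight outputs of the detection network."""
    if len(outputs) != len(CONDITIONS):
        raise ValueError("expected one output value per condition")
    return {c: classify_output(v) for c, v in zip(CONDITIONS, outputs)}


# Made-up network response: F3 clearly flagged, F4 ambiguous.
example = [0.05, 0.02, 0.10, 0.88, 0.45, 0.01, 0.03, 0.02]
for condition, label in classify_outputs(example).items():
    print(condition, label)
```

Keeping the ambiguous band as a separate class avoids forcing a decision when the network response is weak.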
When testing ANN#1 with independent data (including data at lower Fault Levels) it is capable of recognizing most of the
faults that have been considered in this study. However, ANN#1 is not fully successful in distinguishing a healthy (H) steam
turbine from a steam turbine with fault type 5 (F5). Therefore, a second ANN (called ANN#2) has been trained for this specific
purpose. ANN#1 and ANN#2 are shown in Fig. 11. As previously stated, the inputs to ANN#1 are parameters that are measured in the real plant. ANN#2 is supplemented with an extra input parameter not previously measured in the facility but
that was discovered to be necessary for ANN#2 to work well. The extra input is a pressure measurement between turbine
stages inside the steam turbine and provides the extra information needed in order to better detect F5. Thus ANN#2 has one
more neuron in its input layer, making a total of 23. The trained ANN#2 delivered good results when 30 neurons were utilized in its hidden layer. Using the same binary classification method as in ANN#1, ANN#2 has only two outputs. The resulting architecture for ANN#2 is 23-30-2.
Fig. 11 shows how the module capacity can be subsequently expanded to include the detection of new faults. It also
shows how ANNs can be utilized to present an analysis indicating which measurement values should be added in order
to improve the detection capability of the module. By adding the new measurement value to the second network in series
with the first, a dramatically improved performance was obtained for fault type 5.

Table 2
List of input data to ANNs
1. Load on steam turbine
2. Flow principal steam before HP-Turbine
3. Outlet temperature LP-Turbine
4. Pressure at condenser 1
5. Temperature at condenser 1
6. Pressure at condenser 2
7. Temperature feed-water
8. Flow feed-water
9. Pressure in deaerator
10. Temperature in deaerator
11. Temperature outgoing district heating water
12. Temperature water after condenser 1
13. Temperature district heating water after condenser 2
14. Temperature district heating water after gland steam condenser
15. Flow condensate to deaerator
16. Active power
17. Position control valve group 1
18. Position control valve group 2
19. Position turbine overflow valve
20. Temperature of condensate after condenser
21. Flow condensate after condenser 1
22. Pressure in the turbine between two stages, also called steam flow in turbine
Additional parameter for ANN #2 is
23. Temperature at low pressure turbine extraction A0


Fig. 11. Structure of the neural networks for the fault detection module.

4.3. Results of the detection


Table 3 shows the performance of ANN#1 when used alone in comparison to the performance of ANN#1 in combination
with ANN#2, as shown in Fig. 11. The table only shows their performance at 100% fault level. The ability to recognise all the faults
is improved in the second case.
Table 4 shows the results of using ANN#1 and ANN#2 in combination for all fault levels, i.e. for fully developed faults
(fault level 100%) and developing faults (fault level 75%, 50% and 25%, respectively). It is apparent that the system finds it
difficult to recognise fault type 7 when it is not fully developed. However, around half of the faults can be discovered
at quite an early stage of their development.
5. Fault isolation by Bayesian network
An important part of the concept of combining ANN and BN is operator interaction. The operator uses the BN as an advisor
on what information is needed to discern root causes. The system structure contains expert knowledge (stored as acyclic

Table 3
Target ratio (%) for ANN#1 and the combination of ANN#1 & ANN#2 at fault level 100%

Fault type    ANN#1    ANN#1 & ANN#2
H             80.7     94.6
F1            100
F2            99.7
F3            100
F4            100
F5            75       98.9
F6            100
F7            100

Table 4
Target ratio (%) for the combination ANN#1 & ANN#2 for different fault levels

Fault type    Fault level (%)
              0       25      50      75      100
H             94.6
F1                    4.3     41.8    75.9    100
F2                    0       6.8     62.6    99.7
F3                    84      100     99.7    100
F4                    0       85      97.8    100
F5                    6       22      97.6    98.9
F6                    0       100     100     100
F7                    0       7.6     12.2    100


graph and conditional probability tables (CPTs)) of some faults and degeneration mechanisms that do not often appear but
may cause turbine breakdown or plant malfunction if not detected. Here the interaction and use of the expert knowledge
takes place through a BN, where the operator inputs information in observation nodes. The fault isolation begins with the
ANN recognising a pattern in sensor readings and classifying it as a fault type. For each fault type a separate BN is required
for root cause isolation. The BN observation node connected to the ANN has two states (0 and 1). To allow for the possibility
of false alarms and missed detection (type I and II errors), the ANN-node (ANN_F7) should show a probability for classification error derived from evaluation of the power of detection.
To compute probabilities for root causes, the operator information and ANN information are used and communicated to
the BN through observation nodes. The ANN observation node is the only information that is passed on to the BN from the
ANN, the other information needed to isolate a root cause being provided by the operator and the maintenance staff. In keeping
with the concept, the size of the BN is kept small by considering only the most important root causes. The combination of
ANN pattern recognition for classification and BN isolation of root cause has the feature of accepting the ANN classification (0
or 1) as absolute evidence.
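The effect of false alarms and missed detections on the ANN node can be made concrete with Bayes' rule. The Python sketch below is an illustration only: the prior fault probability, the detection rate and the false alarm rate are hypothetical numbers, not values from the study.

```python
def posterior_fault(prior: float, p_detect: float, p_false_alarm: float,
                    alarm: bool) -> float:
    """P(fault present | ANN node state), by Bayes' rule.

    prior          P(fault present) before the ANN reports (assumed value)
    p_detect       P(ANN = 1 | fault present), i.e. 1 - missed detection rate
    p_false_alarm  P(ANN = 1 | fault absent), the type I error rate
    alarm          True if the ANN node is in state 1, False if in state 0
    """
    if alarm:
        like_fault, like_healthy = p_detect, p_false_alarm
    else:
        like_fault, like_healthy = 1.0 - p_detect, 1.0 - p_false_alarm
    evidence = like_fault * prior + like_healthy * (1.0 - prior)
    return like_fault * prior / evidence


# Hypothetical numbers: 5 % prior, 95 % detection rate, 2 % false alarm rate.
p_after_alarm = posterior_fault(0.05, 0.95, 0.02, alarm=True)
p_after_silence = posterior_fault(0.05, 0.95, 0.02, alarm=False)
```

With these assumed rates an alarm raises the fault probability from 5 % to about 71 %, which is high but not certain. Treating the ANN output as absolute evidence corresponds to the limiting case of zero false alarm and missed detection rates.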
5.1. Bayesian networks summarized
A BN consists of a combination of a directed acyclic graph and conditional probability distributions. The graph structure is
where the causal structure of the domain is visualized [11]. A directed acyclic graph is not necessarily causal, but the directed
acyclic graphs used here are all causal. The second part of a BN is the conditional probability distributions, which can be estimated from historic data. If data are not available, an expert may estimate the distributions and create conditional probability tables. The conditional probability tables and the directed acyclic graph together form the BN for decision support.
Isolation is the task of finding a unique root cause of a detected symptom pattern. For some of these patterns, it is not
possible to distinguish a single root cause. As an example, we have built a BN for fault type 7. The BN structure supports the operator in isolating a root cause. The BN requires additional evidence beyond the information from ANN detection of the fault type
to isolate root causes. New evidence is collected manually by the operator and entered by dialogue into the BN to help distinguish the different root causes. If the same symptom has several potential causes, a constraint in the BN can simplify the
analysis [20]. The BN only considers the most important, but not all of the possible root causes. The transparency of the BN to
the operator and the interaction with the operator in collecting evidence are of great importance in this application.
5.2. Structure of Bayesian network for fault isolation
A BN consists of nodes and conditional probability tables assembled to model relationships between fault types. For each
fault type, there is a BN which models the relationship between the root cause and symptoms. The purpose of the BN is to
help the operator find a single root cause for detected fault types. It may also help eliminate specific faults as the cause of a
symptom. Fig. 12 shows an example of a BN that considers the root causes of fault type 7, demonstrating the principle of the
concept.
BN structures contain a layer of observations (also called inputs) of different types. In Fig. 12 observations are indicated by
the dashed circles. Examples of observations converted to inputs include the ANN signal, the operator estimate of current
conditions such as fouling of condenser tubes, direct observations of valve stem position, manual measurement of temperature, etc. All of these observations constitute evidence which enables the BN to help the operator isolate the fault. To reduce
the number of possible states to be estimated in the BN, an auxiliary node that imposes a single-fault constraint on the root causes
(called Constraint in Fig. 12) is included. The symptom nodes are in the middle layer (condFouling, pumpStatus, condHerm). These
nodes indicate the status of a physical component or the function of a component.

Fig. 12. Bayesian network of root causes for fouling and gassing in the condenser.


5.2.1. Example of operator dialogue using the Bayesian network for fault type 7
In the example in Fig. 12 we have used the software Hugin Researcher 6.6 to construct and simulate a BN. When F7 is not indicated by the ANN, the value for o_F7_ANN is set to "not F7" by the ANN. The Constraint node is set to "fault" for the single-fault assumption. The results are shown in Fig. 13a.
The operator may prepare the BNs with information about the process before a fault is detected. Prior data input is restricted to the parameters that do not change without maintenance action or due to degeneration. The root cause may be isolated directly using the BN based on prior input data. The BN may even detect a fault type before the ANN does. This is possible because the ANN and BN receive information from different sources. In addition, prior information may indicate if there is conflict in input data. If this occurs, the operator has time to consult an expert to resolve a conflict in observations that may have been overlooked in a more urgent situation, which could result in faulty diagnosis and wrong action being taken.
When the fault F7 is detected by the ANN, o_F7_ANN is set to "F7" and the Constraint node is set to "fault" for the single-fault assumption. Without any input in the top layer of observation nodes, the BN indicates the most probable root cause (Fig. 13b), given prior probabilities. The operator then sets the states of the observation nodes for which information is available, in order to narrow the isolation down to the most probable root cause, and receives updated probabilities (Fig. 13c). After the operator has input information about the states of the observation nodes, there are three possible results:
1. Based on the information he sees, the operator decides that the BN can isolate the root cause with acceptably high
probability.
2. The observations create conflict in the BN, meaning that the evidence is pointing in more than one direction. The operator
needs to check the states again and if this does not resolve the conflict, expert help is needed.
3. Based on the information, the operator decides that the BN cannot isolate the root cause with acceptably high probability. More
observations of node states or further inspection using methods outside of the BN are needed.
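The three outcomes listed above can be summarized as a simple triage rule. The Python sketch below is hypothetical: the acceptance level of 0.8, the function name and the example probabilities are assumptions, and in the concept described here the judgement remains with the operator rather than with a fixed threshold.

```python
from typing import Dict


def triage(posteriors: Dict[str, float], conflict: bool, accept: float = 0.8) -> str:
    """Return which of the three outcomes applies (illustrative rule only)."""
    if conflict:
        # Outcome 2: evidence points in more than one direction.
        return "recheck observation states; call an expert if conflict remains"
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] >= accept:
        # Outcome 1: root cause isolated with acceptably high probability.
        return "root cause isolated: " + best
    # Outcome 3: not yet conclusive.
    return "collect more observations or inspect further"


# Hypothetical updated probabilities for the three root causes of F7.
outcome = triage({"condFouling": 0.89, "pumpStatus": 0.07, "condHerm": 0.04},
                 conflict=False)
```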
The BN cannot cover the entire domain and therefore only represents a subset of the faults and root causes. The operator
also needs to consider the possibility that the ANN is reporting a false alarm. The BN supports the operator in a structured
search for a root cause and enables a faster judgement on when to gather more information and when to call for expert help.
Hypotheses proposed by experts about the root cause may be falsified using the BN. Feedback data is also used to update the
BN to improve performance.

Fig. 13a. Result in BN for F7 when no fault is detected (F7_ANN set to not F7, and Constraint set to fault).

Fig. 13b. Result in BN for F7 when fault is detected (F7_ANN set to F7 and Constraint set to fault).

Fig. 13c. Result in BN for F7 after input of collected evidence.


5.3. Resulting BN for isolation


A BN was developed for isolation of root causes related to loss of condenser heat power, denoted as fault type F7.
The BN for F7 shown in Fig. 12 indicates two possible symptoms: fouling of condenser tubes (condFouling), and degassing. Degassing may be due to one of two causes: reduced vacuum pump function (pumpStatus), or air leaking into the condenser (condHerm).
The interaction between operator and BN is conducted by manual entry of observations that provide putative evidence for
each root cause. The observation connected to each root cause is a manual estimate or manual reading entered
at the corresponding node. These observations are: the pressure drop over the condenser (dpCondenser), prior fouling of the condenser
that may affect the current state of fouling (priorFouling), the mechanical status of the pump (mechParts), the mechanical
status of the pump motor (pumpMotor), the hermetic status of the manhole cover (manhole), the general state of the condenser shell (shell), and the presence of water on the backpressure lid (waterBack). As the node Constraint assumes a single
fault, it forces the network to choose one of the above fault symptoms at a time. The CPTs for the BN are constructed from estimated probabilities. This work has resulted in rough estimates of the probabilities, which need further development before
implementation in a specific plant. The F7_ANN probability is a product of training the ANN for detection.
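The update from prior probabilities (Fig. 13b) to probabilities after evidence (Fig. 13c) can be sketched in Python as below. This is a strongly simplified stand-in for the BN of Fig. 12: it assumes that exactly one of the three causes is present (the single-fault constraint) and that observations are conditionally independent given the cause. All probabilities are hypothetical, and only three of the seven observation nodes are included.

```python
from typing import Dict

# Assumed prior over the single root cause, given that F7 is detected.
PRIOR = {"condFouling": 0.5, "pumpStatus": 0.3, "condHerm": 0.2}

# Assumed P(observation is abnormal | root cause); numbers are illustrative.
LIKELIHOOD = {
    "dpCondenser": {"condFouling": 0.80, "pumpStatus": 0.10, "condHerm": 0.10},
    "mechParts":   {"condFouling": 0.05, "pumpStatus": 0.70, "condHerm": 0.05},
    "manhole":     {"condFouling": 0.05, "pumpStatus": 0.05, "condHerm": 0.60},
}


def isolate(evidence: Dict[str, bool]) -> Dict[str, float]:
    """Posterior over root causes given operator observations (True = abnormal)."""
    scores = {}
    for cause, prior in PRIOR.items():
        score = prior
        for obs, abnormal in evidence.items():
            p = LIKELIHOOD[obs][cause]
            score *= p if abnormal else (1.0 - p)
        scores[cause] = score
    total = sum(scores.values())
    return {cause: s / total for cause, s in scores.items()}


before = isolate({})                     # no operator input: priors only
after = isolate({"dpCondenser": True})   # operator reports abnormal pressure drop
```

Under these assumed numbers, a single abnormal pressure drop reading moves the probability of condenser fouling from 0.5 to roughly 0.89. A BN tool such as the one used in the study performs the corresponding propagation for the full graph and CPTs.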

6. Division of detection and isolation tasks between ANN, BN and operator


Various types of information are used by the operator to make maintenance decisions. The data may be collected manually or automatically and may be qualitative or quantitative. A sensor reading that is automatically collected by the DCS and
stored in a database is considered quantitative. Manually collected data used by the BN may be either quantitative or qualitative. The ANN is limited to processing patterns from automatically collected data. Pattern recognition (used here for detection) is a task ANNs are well suited for, especially when there are databases available containing long time series of collected
data. ANNs may also be trained on artificial data when no real data is available for one or more fault types. In these cases
there is a need for a detailed simulator, such as the one used in this project. Human operators are not proficient at detecting
slowly developing faults. The ANN supports the operator in early detection of fault types related to correlations in several
variables.
The ANN is essentially used for pattern recognition and classification. A pattern in the data is recognized (detected) and a
corresponding fault type is selected. Input from the BN or the operator is unlikely to enhance the performance of this task for
slowly developing faults; therefore our concept does not support the transfer of information in this direction. The ANN detects and classifies the fault and passes on this information to the BN and the operator. The fault type is described beforehand
by experts and may be presented to the operator through a written manual or a pop-up window on a screen. The ANN is a
black box with inputs and outputs, and the operator does not require knowledge of the processing between input and output
to use it. This structure does not support explanation of the relationships between fault types and root causes. A BN, on the
other hand, may be represented by objects and links between them illustrating the relationship between fault type and root
cause. Once the ANN detection threshold of a fault is reached, the dialogue between the BN and the operator is a powerful
tool for rationalizing and isolating the fault.
6.1. Drawing the line between ANN and BN
Developing ANNs from time series data in a database has been found to be less time consuming than developing BNs,
which require expert input. The argument for separating ANNs and BNs is to make it possible to explain cause-effect relationships by building BNs for specific faults and using the representational power of ANNs to build the model and to detect
developing faults efficiently. This line of thought was promoted in a recent work comparing neural networks
and BNs [25].
ANNs differ from BNs for this particular case in that they are able to detect faults but may not be able to determine the
root causes of the faults using sensor readings collected by the distributed control system. If further isolation of the faults is
required, a BN is constructed and extended as far as the plant owner finds economically justified. The output node of the
ANN is the connection to the BN and is the same as the observation node (F7_ANN) in the BN example in Fig. 12.
6.1.1. Dialogue between operator and BN
Humans tend to forget and distort memories over time, but a BN structure remains intact and does not forget or distort
the correlations. Once the fault is detected and classified by the ANN, the corresponding BN observation node for that fault
type is set to 1. The detection and classification are still prone to false alarms and missed alarms (type I and type II errors). The
BN can be used in at least two ways by the operator. The first is to help discern between root causes when a fault is detected, and the second is to provide support in deciding whether a fault detected by the ANN is a false alarm. To discern between root causes, additional data is input into the observation nodes in the BN and the most probable root cause is updated.
If the observed data support a single root cause, the probability for that cause increases as data is added. If the observed data
that is entered is not consistent with one fault type, the probabilities of the root causes are less well defined or may point in more
than one direction, causing conflicting results. At this point, the limitation of the BN is reached: it can only say that the provided data does not point to a root cause that is part of the BN structure or that it points to two or more root causes. The
decision on the course of action is then down to the operator. The BN can no longer assist, and a turbine expert may need
to be called in.
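One measure of conflicting evidence known from the BN literature compares the product of the individual evidence probabilities with their joint probability, conf(e) = log2( P(e1) ... P(en) / P(e) ). A positive value suggests that the findings fit together worse than if they were independent. Whether this particular measure was used in the work described here is not stated, so the Python sketch below, including its example probabilities, is an assumption for illustration.

```python
import math
from typing import Sequence


def conflict_measure(marginals: Sequence[float], joint: float) -> float:
    """log2 of (product of single-finding probabilities / joint probability).

    marginals  P(e_i) for each entered finding, computed by the BN
    joint      P(e_1, ..., e_n), the probability of all findings together
    """
    if joint <= 0.0 or any(m <= 0.0 for m in marginals):
        raise ValueError("probabilities must be positive")
    product = 1.0
    for m in marginals:
        product *= m
    return math.log2(product / joint)


# Hypothetical probabilities for two operator observations.
consistent = conflict_measure([0.3, 0.2], joint=0.12)    # findings support each other
conflicting = conflict_measure([0.3, 0.2], joint=0.01)   # findings rarely occur together
```

In this made-up example the first value is negative (no indication of conflict) and the second is positive, which would prompt the operator to recheck the entered states.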

7. Discussion and conclusions


The ANNs used here can detect all the fault types introduced into the simulator environment. The detection part of the
system has been validated against simulated data (Tables 3 and 4). The BN part of the system provides the operator with
support for fault isolation. The system is intended to work as a decision support function for maintenance actions such as
dismantling the turbine, through operator dialogue using BNs. The costly maintenance decision to dismantle the turbine
relies on input not only from the plant owner but also from the steam turbine manufacturer and the insurance company.
An ANN has been trained to detect seven fault types in a steam turbine. An approach using a network of ANNs is seen to
be a powerful aid in reducing the number of mismatches at lower fault levels (75%, 50%, etc.). This is because ANN#2 focuses
on a much smaller amount of data and classifies it in only two classes, i.e. H or F5, this being easier than the original task performed by ANN#1 (classification in eight different classes). The addition of the extra input parameter also helps when distinguishing between these two conditions. Empirical evidence indicates that this parameter is related to fault 5 only, and its
inclusion in ANN#1 could lead to an incorrect classification of other faults. The approach presented here does not indicate
the extent of the developed faults, but only indicates whether they are present. However, incipient faults can be detected
at an early stage, as shown in Table 4. A possible approach to detecting developing faults at an early stage is to use a
graphical approach such as that discussed in [2]. It should also be mentioned that the simulated levels of the faults are small
in order to test the ANN for early fault detection. Failures in the steam turbine and surrounding equipment are normally detected by alarm thresholds set in the distributed control system. The detection threshold of the ANN is normally much lower
than these alarm thresholds.
The very good detection power for low fault levels is probably partly due to the absence of noise and errors in the data that
would be expected to appear in a dataset from a real process. The amount of unclassified data was also very low compared
to that previously seen by the authors. This system does not allow conclusions to be drawn from data that produce output indicating more than one fault. Though it may be tempting to regard the highest neuron output value as the most
probable and therefore detected, all the possibilities should be considered, including the presence of more than one fault
at a time. The ANN must be trained on datasets generated for multiple faults if it is to be able to detect multiple faults
reliably.
The ANN may be trained on datasets prepared to resemble real data by introducing small errors and noise in measurement data to improve its performance when working with data generated by a real system. Mesbahi [23] proposed
gradual identification of fault types using a graphical display (5 by 8 pixels). With this system, as the certainty of a detected
fault type increases, it is accompanied by a gradual increase in the resolution of the figure representing the number of the
fault type.
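As an illustration of preparing simulated data to resemble real measurements, the Python sketch below perturbs each simulated value with small multiplicative Gaussian noise. The noise model, the 1 % relative standard deviation and the example values are assumptions, not choices reported in this work.

```python
import random
from typing import List


def add_measurement_noise(samples: List[List[float]], rel_sigma: float = 0.01,
                          seed: int = 0) -> List[List[float]]:
    """Return a noisy copy of the samples; each value is scaled by (1 + N(0, rel_sigma))."""
    rng = random.Random(seed)  # fixed seed so the augmented dataset is reproducible
    return [[x * (1.0 + rng.gauss(0.0, rel_sigma)) for x in row] for row in samples]


# Two hypothetical simulated samples with three signals each.
clean = [[100.0, 540.0, 0.05], [80.0, 535.0, 0.06]]
noisy = add_measurement_noise(clean)
```

Multiplicative noise leaves zero-valued signals unchanged; an additive term or sensor-specific noise levels would be a natural refinement if sensor accuracy data are available.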
The combination of a BN and an ANN is the result of an effort to maintain a compact structure and to make use
of easily checked variables. We believe that making use of the operator's knowledge of the process and judgment as an expert input helps to minimize the extent and therefore the development cost of the BN and speeds up the isolation of a fault. We
also believe that the BN provides more information for the operator to use himself or to pass on to an external expert if required. The ability of the concept to estimate turbine condition by detecting a healthy state from on-line measurements and
to combine this with operator knowledge through a BN is valuable for making earlier maintenance decisions to reduce damage and to avoid making maintenance decisions that consume resources and capital unnecessarily.
Economic evaluations of indirect costs of maintenance actions rely on estimation of numerous parameters and limitations
such as current and future electricity and fuel prices, signed contracts, insurance, maintenance intervals and random events
affecting prices. Here the improvisation and experience of an operator is needed. The direct costs of a maintenance operation
may be integrated in the BN through influence diagrams [11]. The calculation of indirect costs should be done in a spreadsheet application in order to minimize the size of the influence diagram and for easy communication with other data used by
the decision-maker.
The action sequence of classification using the ANN followed by isolation performed by the interaction of operator and BN
is straightforward. The ANN is a black box and the classification cannot therefore be evaluated as in the case of BN cause-effect graphs. The ANN output becomes the sole criterion for selection of a certain BN. However, the operator is free to
set the ANN observation node to another state in order to test faults other than the one indicated by the ANN if the ANN
is believed to be providing faulty information.
It is the authors' opinion that work on integrating economics in the decision-making process using influence diagrams
should be taken further. The direct costs of a maintenance action, estimated through influence diagrams,
and the indirect costs, estimated by the operator, may be summarized in a spreadsheet. An alternative approach to using the
ANN for fault classification is to use a BN. This is potentially interesting as its application would likely simplify the training
of the BN. A library of BN structures for different graphs from observation to root cause is a requirement to achieve component-based detection and decision-making support. We believe that such a system would be of value to heat and power
plants that are less able to invest in fault detection and isolation systems of the size presented in [18].


Power plants are subject to design changes and the ANN-BN may need to be updated with changes in the plant. Degradation of the plant over time also affects the performance of the ANN-BN. These models mimic a specific data set and
changes in the plant over time may cause problems if the ANN-BN is not kept up to date. A brief discussion on how this
challenge may be met follows. If the real process is drifting slowly (e.g. through fouling, decreasing heat transfer coefficients)
the error in the ANN increases over time. The BN part of the system is almost unaffected by degeneration, because it is
uncoupled from measurements affected by degeneration of the plant. The states in the BN are relative and not directly coupled to absolute figures measured in the plant. The ANN training dataset is generated from a set of mass and heat balance
model parameters for the Healthy condition and datasets for the faults F1-F7. If the original model parameters change, the
performance of the ANN will decrease. The ANN must be adapted to the changes to maintain its accuracy. This will be the
subject of further work. Our preferred approach relies on updating the first principles model. Updated ANNs may be automatically generated from the updated model, and the system performance (detection threshold, low type I and type II error rates) can
be maintained over time in this way. In addition, the parameter estimation results provide valuable data on how parameter
changes can be used to inform maintenance decisions.
References
[1] J. Arriagada, On the Analysis and Fault-Diagnosis Tools for Small-Scale Heat and Power Plants, Ph.D. Thesis, Lund University, 2003.
[2] J. Arriagada, M. Genrup, A. Loberg, M. Assadi, Fault diagnosis system for an industrial gas turbine by means of neural networks, in: Proceedings of the International Gas Turbine Congress, Tokyo, 2003.
[3] R. Beebe, Condition monitoring of steam turbine by performance analysis, Journal of Quality in Maintenance Engineering 9 (2003) 102-112.
[4] C.P. Bellanca, Diagnostic monitoring of solid particle erosion in steam turbines, IEEE Transactions on Energy Conversion 3 (1988) 249-253.
[5] C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, USA, 1995.
[6] J.H. Bulloch, A.G. Callagy, Malfunctions of a steam turbine mechanical control system, Engineering Failure Analysis 5 (1995) 235-240.
[7] K.C. Cotton, Evaluating and Improving Steam Turbine Performance, Cotton Fact Inc., 1988.
[8] M.H. Faber, M.G. Stewart, Risk assessment for civil engineering facilities: critical overview and discussion, Reliability Engineering and System Safety 80 (2003) 173-184.
[9] M. Genrup, On Degradation and Monitoring Tools for Gas and Steam Turbines, Ph.D. Thesis, Lund University, 2005.
[10] S. Haykin, Neural Networks: A Comprehensive Foundation, Prentice-Hall, UK, 1995.
[11] F.V. Jensen, Bayesian Networks and Decision Graphs, Springer Verlag, New York, 2001.
[12] J. Kubiak, A. García-Gutiérrez, B. Urquiza, The diagnosis of turbine component degradation case stories, Applied Thermal Engineering 22 (2002) 1955-1963.
[13] A.Sh. Leyzerovich, Large Power Steam Turbines, PennWell Publishing Company, USA, 1997.
[14] O. Lyle, The Efficient Use of Steam, Her Majesty's Stationery Office, UK, 1958. ISBN: B0000CKCTV.
[15] T.H. McCloskey, C. Bellanca, Minimizing solid particle erosion in power plant steam turbines, Power Engineering 8 (1989) 35-38.
[16] F. Rosenblatt, The Perceptron: A probabilistic model for information storage and organization in the brain, Psychological Review 65 (1958) 386-408.
[17] R. Svoboda, M. Bodmer, Deposits and corrosion in steam turbines, ABB Power Generation TEZ 87-20, paper presented at Ringhals, Sweden, March 1987.
[18] V. Uraikul, C.W. Chan, P. Tontiwachwuthikul, Artificial intelligence for monitoring and supervisory control of process systems, Engineering Applications of Artificial Intelligence 20 (2007) 115-131.
[19] G. Waltenberger, Betriebskosten, VGB-Kraftverkstechnik 68 (1988) 244-248.
[20] G. Weidl, Root Cause Analysis and Decision support on Process Operation, Ph.D. Thesis, Mälardalen University, 2002.
[21] A. Whitehead, K.T. Sullivan, Carryover and its effects on efficiency, operation and maintenance of industrial steam turbines, Pulp and Paper Canada 11 (1983) 309-313.
[22] B. Widarsson, C. Karlsson, E. Dahlquist, Bayesian Network for Decision Support on Soot Blowing Superheaters in a Biomass Fuelled Boiler, International Conference on Probabilistic Methods Applied Power Systems 1 (2004) 212-217.
[23] E. Mesbahi, Artificial Neural Networks for Fault Diagnosis, Modelling and Control of Diesel Engines, Ph.D. Thesis, Department of Marine Technology, University of Newcastle Upon Tyne, UK, 2000.
[24] D. Rumelhart, G. Hinton, R. Williams, Learning internal representations by error propagation, in: Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1: Foundations, 1986, pp. 318-362.
[25] R. Zhang, A. Bivens, Comparing the use of Bayesian networks and neural networks in response time modeling for service-oriented systems, in: Proceedings of the 2007 Workshop on Service-Oriented Computing Performance: Aspects, Issues, and Approaches, Monterey, California, USA, pp. 67-74.
[26] A. Holmes, A European OEM's Experience, Presented at San Francisco Steam Turbine Retrofit Conference, San Francisco, USA, 16-17th September 2003.
[27] B. Widrow, M. Hoff, Adaptive switching circuits, IRE WESCON Convention Record, pt. 4, 1960, pp. 96-104.
