
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 50, NO. 3, JUNE 2003

A Comprehensive Review for Industrial Applicability of Artificial Neural Networks

Magali R. G. Meireles, Paulo E. M. Almeida, Student Member, IEEE, and Marcelo Godoy Simões, Senior Member, IEEE

Abstract—This paper presents a comprehensive review of the industrial applications of artificial neural networks (ANNs) in the last 12 years. Common questions that arise to practitioners and control engineers while deciding how to use NNs for specific industrial tasks are answered. Workable issues regarding implementation details, training, and performance evaluation of such algorithms are also discussed, based on a judiciously chronological organization of topologies and training methods effectively used in the past years. The most popular ANN topologies and training methods are listed and briefly discussed, as a reference to the application engineer. Finally, ANN industrial applications are grouped and tabulated by their main functions and what they actually performed in the referenced papers. The authors prepared this paper bearing in mind that an organized and normalized review would be suitable to help industrial managing and operational personnel decide which kind of ANN topology and training method would be adequate for their specific problems.

Index Terms—Architecture, industrial control, neural network (NN) applications, training.

I. INTRODUCTION

In engineering and physics domains, algebraic and differential equations are used to describe the behavior and functioning properties of real systems and to create mathematical models to represent them. Such approaches require accurate knowledge of the system dynamics and the use of estimation techniques and numerical calculations to emulate the system operation. The complexity of the problem itself may introduce uncertainties, which can make the modeling unrealistic or inaccurate. Therefore, in practice, approximate analysis is used and linearity assumptions are usually made. Artificial neural networks (ANNs) implement algorithms that attempt to achieve a neurologically related performance, such as learning from experience, making generalizations from similar situations, and judging states where poor results were achieved in the past.

ANN history begins in the early 1940s. However, only in the mid-1980s did these algorithms become scientifically sound and capable of application. Since the late 1980s, ANNs started to be utilized in a plethora of industrial applications.

Nowadays, ANNs are being applied to many real-world industrial problems, from functional prediction and system modeling (where physical processes are not well understood or are highly complex) to pattern recognition engines and robust classifiers, with the ability to generalize while making decisions about imprecise input data. The ability of ANNs to learn and approximate relationships between input and output is decoupled from the size and complexity of the problem [49]. Actually, as relationships based on inputs and outputs are enriched, approximation capability improves. ANNs offer ideal solutions for speech, character, and signal recognition.

There are many different types of ANNs. Some of the more popular include the multilayer perceptron (MLP) (which is generally trained with the back-propagation of error algorithm), learning vector quantization, radial basis function (RBF), Hopfield, and Kohonen networks, to name a few. Some ANNs are classified as feedforward while others are recurrent (i.e., implement feedback), depending on how data is processed through the network. Another way of classifying ANN types is by their learning method (or training), as some ANNs employ supervised training, while others are referred to as unsupervised or self-organizing.

This paper concentrates on industrial applications of neural networks (NNs). It was found that training methodology is more conveniently associated with a classification of how a certain NN paradigm is supposed to be used for a particular industrial problem. There are some important questions to answer in order to adopt an ANN solution for achieving accurate, consistent, and robust modeling. What is required to use an NN? How are NNs superior to conventional methods? What kind of problem functional characteristics should be considered for an ANN paradigm? What kind of structure and implementation should be used in accordance with an application in mind? This article will follow a procedure that will bring such managerial questions together into a framework that can be used to evaluate where and how such technology fits for industrial applications, by laying out a classification scheme by means of clustered concepts and distinctive characteristics of ANN engineering.

Manuscript received October 23, 2001; revised September 20, 2002. Abstract published on the Internet March 4, 2003. This work was supported by the National Science Foundation under Grant ECS 0134130.
M. R. G. Meireles was with the Colorado School of Mines, Golden, CO 80401 USA. She is now with the Mathematics and Statistics Department, Pontifical Catholic University of Minas Gerais, 30.535-610 Belo Horizonte, Brazil (e-mail: magali@pucminas.br).
P. E. M. Almeida was with the Colorado School of Mines, Golden, CO 80401 USA. He is now with the Federal Center for Technological Education of Minas Gerais, 30.510-000 Belo Horizonte, Brazil (e-mail: paulo@dppg.cefetmg.br).
M. G. Simões is with the Colorado School of Mines, Golden, CO 80401 USA (e-mail: msimoes@mines.edu).
Digital Object Identifier 10.1109/TIE.2003.812470
0278-0046/03$17.00 © 2003 IEEE

II. NN ENGINEERING

Before laying out the foundations for choosing the best ANN topology, learning method, and data handling for classes of industrial problems, it is important to understand how artificial intelligence (AI) evolved with required computational resources.

Artificial intelligence applications moved away from laboratory experiments to real-world implementations. Therefore, software complexity also became an issue, since conventional Von Neumann machines are not suitable for symbolic processing, nondeterministic computations, dynamic execution, parallel distributed processing, and management of extensive knowledge bases [118].

In many AI applications, the knowledge needed to solve a problem may be incomplete, because the source of the knowledge is unknown at the time the solution is devised, or the environment may be changing and cannot be anticipated at design time. AI systems should be designed with an open concept that allows continuous refinement and acquisition of new knowledge.

There exist engineering problems for which finding the perfect solution requires a practically impossible amount of resources and for which an acceptable solution would be fine. NNs can give good solutions for such classes of problems. Tackling the best ANN topology, learning method, and data handling themselves become engineering approaches. The success of using ANNs for any application depends highly on the data processing, i.e., data handling before or during network operation. Once variables have been identified and data has been collected and is ready to use, one can process it in several ways, to squeeze more information out of it and to filter it.

A common technique for coping with nonnormal data is to apply a nonlinear transform to the data. To apply a transform, one simply takes some function of the original variable and uses the functional transform as a new input to the model. Commonly used nonlinear transforms include powers, roots, inverses, exponentials, and logarithms [107].

A. Assessment of NN Performance

An ANN must be used in problems exhibiting knottiness, nonlinearity, and uncertainties that justify its utilization [45]. ANNs present the following features to cope with such complexities:

- learning from training data used for system identification; finding a set of connection strengths allows the network to carry out the desired computation [96];
- generalization from inputs not previously presented during the training phase; by accepting an input and producing a plausible response determined by the internal ANN connection structure, such a system is robust against noisy data, a feature exploited in industrial applications [59];
- mapping of nonlinearity, making them suitable for identification in process control applications [90];
- parallel processing capability, allowing fast processing for large-scale dynamical systems;
- applicability to multivariable systems; they naturally process many inputs and have many outputs;
- usability as a black-box approach (no prior knowledge about a system required) and implementability on compact processors for space- and power-constrained applications.

In order to select a good NN configuration, there are several factors to take into consideration. The major points of interest regarding the ANN topology selection are related to network design, training, and practical considerations [25].

B. Training Considerations

Considerations such as determining the input and output variables, choosing the size of the training data set, initializing network weights, choosing training parameter values (such as learning rate and momentum rate), and selecting training stopping criteria are important for several network topologies. There is no generic formula that can be used to choose the parameter values. Some guidelines can be followed as an initial trial. After a few trials, the network designer should have enough experience to set appropriate criteria that suit a given problem.

The initial weights of an NN play a significant role in the convergence of the training method. Without a priori information about the final weights, it is a common practice to initialize all weights randomly with small absolute values. In learning vector quantization and derived techniques, it is usually required to renormalize the weights at every training epoch. A critical parameter is the speed of convergence, which is determined by the learning coefficient. In general, it is desirable to have fast learning, but not so fast as to cause instability of learning iterations. Starting with a large learning coefficient and reducing it as the learning process proceeds results in both fast learning and stable iterations. The momentum coefficients are usually set according to a schedule similar to the one for the learning coefficients [128].

Selection of training data plays a vital role in the performance of a supervised NN. The number of training examples used to train an NN is sometimes critical to the success of the training process. If the number of training examples is not sufficient, then the network cannot correctly learn the actual input-output relation of the system. If the number of training examples is too large, then the network training time will be longer. For some applications, such as real-time adaptive neural control, training time is a critical variable. For others, such as training the network to perform fault detection, the training can be performed off-line, and more training data are preferred over using insufficient training data, to achieve greater network accuracy. Generally, rather than focusing on volume, it is better to concentrate on the quality and representational nature of the data set. A good training set should contain routine, unusual, and boundary-condition cases [8].

Popular criteria used to stop network training are a small mean-square training error and small changes in network weights. The definition of how small is usually up to the network designer and is based on the desired accuracy level of the NN. As an example, in a motor bearing fault diagnosis process, the authors used a learning rate of 0.01 and a momentum of 0.8; the training was stopped when the root mean-square error of the training set or the change in network weights was sufficiently small for that application (less than 0.005) [72]. Therefore, if any prior information about the relationship between inputs and outputs is available and used correctly, the network structure and training time can be reduced and the network accuracy can be significantly improved.
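The nonlinear input transforms discussed earlier in this section (powers, roots, inverses, exponentials, and logarithms) [107] can be sketched as a simple preprocessing step. The fragment below is illustrative only; the function name and sample values are not from the paper, and it assumes positive-valued inputs so that the root, inverse, and logarithm are defined.

```python
import math

def nonlinear_features(x):
    """Expand one positive-valued input sample with common nonlinear
    transforms (power, root, inverse, logarithm), so that each transform
    can be offered to the network as an additional input."""
    return [x, x ** 2, math.sqrt(x), 1.0 / x, math.log(x)]

# Each raw measurement becomes five candidate network inputs.
rows = [nonlinear_features(v) for v in (1.0, 2.0, 4.0)]
print(rows[2])  # [4.0, 16.0, 2.0, 0.25, 1.3862943611198906]
```

Transformed columns that do not improve performance on the testing set can simply be dropped, in line with the trial-and-error procedure described above.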
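The training considerations above (small random initial weights, a large-then-decaying learning rate, a momentum term, and an RMSE-based stopping criterion in the spirit of [72]) can be combined in a minimal training loop. The sketch below is illustrative, not the paper's implementation: the 1-4-1 MLP topology, the toy target function, and the parameter values are arbitrary assumptions.

```python
import math, random

random.seed(1)

# Toy task: approximate y = x^2 on [0, 1] with a 1-4-1 MLP.
data = [(i / 10.0, (i / 10.0) ** 2) for i in range(11)]

H = 4  # hidden units: start small, grow only if learning stalls
w_in = [[random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)] for _ in range(H)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(H + 1)]  # last entry is the bias

def forward(x):
    hidden = [math.tanh(w * x + b) for w, b in w_in]
    return hidden, sum(v * h for v, h in zip(w_out, hidden)) + w_out[H]

lr, momentum = 0.1, 0.5          # initial learning and momentum rates
vel_in = [[0.0, 0.0] for _ in range(H)]
vel_out = [0.0] * (H + 1)
rmse = float("inf")

for epoch in range(3000):
    sq = 0.0
    for x, target in data:
        hidden, y = forward(x)
        err = target - y
        sq += err * err
        # Hidden-layer error terms, computed before any weight changes.
        deltas = [err * w_out[i] * (1.0 - hidden[i] ** 2) for i in range(H)]
        for i in range(H):
            vel_out[i] = lr * err * hidden[i] + momentum * vel_out[i]
            w_out[i] += vel_out[i]
            vel_in[i][0] = lr * deltas[i] * x + momentum * vel_in[i][0]
            vel_in[i][1] = lr * deltas[i] + momentum * vel_in[i][1]
            w_in[i][0] += vel_in[i][0]
            w_in[i][1] += vel_in[i][1]
        vel_out[H] = lr * err + momentum * vel_out[H]
        w_out[H] += vel_out[H]
    rmse = math.sqrt(sq / len(data))
    if rmse < 0.005:             # stopping criterion in the spirit of [72]
        break
    lr = max(0.01, lr * 0.999)   # large-then-decaying learning-rate schedule
```

In practice, the learning and momentum rates, the stopping threshold, and the hidden-layer size would all be tuned by the trial-and-error procedure described above, and performance would be checked on a separate testing set.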

TABLE I. ORGANIZATION OF NNs BASED ON THEIR FUNCTIONAL CHARACTERISTICS

C. Network Design

Some of the design considerations include determining the number of input and output nodes to be used, the number of hidden layers in the network, and the number of hidden nodes used in each hidden layer. The number of input nodes is typically taken to be the same as the number of state variables. The number of output nodes is typically the number that identifies the general category of the state of the system. Each node constitutes a processing element and is connected through various weights to other elements. In the past, there was a general practice of increasing the number of hidden layers to improve training performance. Keeping the number of layers at three and adjusting the number of processing elements in the hidden layer can achieve the same goal. A trial-and-error approach is usually used to determine the number of hidden-layer processing elements, starting with a low number of hidden units and increasing this number as learning problems occur. Even though choosing these parameters is still a trial-and-error process, there are some guidelines that can be used (i.e., testing the network's performance). It is a common practice to choose a set of training data and a set of testing data that are statistically significant and representative of the system under consideration. The training data set is used to train the NN, while the testing data is used to test the network performance after the training phase finishes.

D. Practical Considerations

Practical considerations regarding the network accuracy, robustness, and implementation issues must be addressed for real-world implementation. For ANN applications, it is usually considered a good estimation performance when pattern recognition achieves more than 95% accuracy in overall and comprehensive data recalls [25]. Selection and implementation of the network configuration need to be carefully studied, since it is desirable to use the smallest possible number of nodes while maintaining a suitable level of conciseness. Pruning algorithms try to make NNs smaller by trimming unnecessary links or units, so that the cost of runtime, memory, and hardware implementation can be minimized and generalization is improved. Depending on the application, some system functional characteristics are important in deciding which ANN topology should be used [81]. Table I summarizes the most common ANN structures used for pattern recognition, associative memory, optimization, function approximation, modeling and control, image processing, and classification purposes.

III. EVOLUTION OF UNDERLYING FUNDAMENTALS THAT PRECEDED INDUSTRIAL APPLICATIONS

While there are several tutorials and reviews discussing the full range of NN topologies, learning methods, and algorithms, the authors in this paper intend to cover what has actually been applied in industrial applications. An initial historical perspective is important to get the picture for the age of industrial applications, which started in 1988, just after the release of [123] by Widrow.

It is well known that the concept of NNs came into existence around the Second World War. In 1943, McCulloch and Pitts proposed the idea that a mind-like machine could be manufactured by interconnecting models based on the behavior of biological neurons, laying out the concept of neurological networks [77]. Wiener gave this new field the popular name cybernetics, whose principle is the interdisciplinary relationship among engineering, biology, control systems, and brain functions [125]. At that time, computer architecture was not fully defined, and the research led to what is today defined as the Von Neumann-type computer. With the progress in research on the brain and computers, the objective changed from the mind-like machine to manufacturing a learning machine, for which Hebb's learning model was initially proposed [53].

In 1958, Rosenblatt from the Cornell Aeronautical Laboratory put together a learning machine, called the perceptron. That was the predecessor of current NNs. He gave specific design guidelines used by the early 1960s [91]. Widrow and Hoff proposed the ADALINE (ADAptive LINear Element), a variation on the perceptron, based on a supervised learning rule (the error-correction rule) which could learn in a faster and more accurate way: synaptic strengths were changed in proportion to the error (the difference between what the output is and what it should have been) multiplied by the input. Such a scheme was successfully used for echo cancellation in telephone lines and is considered to be the first industrial application of NNs [124]. During the 1960s, the forerunner of current associative memory systems was the work of Steinbuch with his Learning Matrix, which was a binary matrix accepting a binary vector as input, producing a binary vector as output, and capable of forming associations between pairs with a Boolean Hebbian learning procedure [108]. The perceptron received considerable excitement when it was first introduced, because of its conceptual simplicity. The ADALINE is a weighted sum of the inputs, together with a least-mean-square (LMS) algorithm to adjust the weights and to minimize the difference between the desired signal and the actual output. Because of the rigorous mathematical foundation of the LMS algorithm, ADALINE has become a powerful tool for adaptive signal processing and adaptive control, leading to work on competitive learning and self-organization. However, Minsky and Papert proved mathematically that the perceptron could not be used for a class of problems defined as nonseparable logic functions [80].

Very few investigators conducted research on NNs during the 1970s. Albus developed his adaptive Cerebellar Model Articulation Controller (CMAC), which is a distributed table-lookup system based on his view of models of human memory [1]. In 1974, Werbos originally developed the backpropagation algorithm. Its first practical application was to estimate a dynamic

model, to predict nationalism and social communications [120].

However, his work remained almost unknown in the scientific community for more than ten years. In the early 1980s, Hopfield introduced a recurrent-type NN, which was based on the Hebbian learning law. The model consisted of a set of first-order (nonlinear) differential equations that minimize a certain energy function [55]. In the mid-1980s, backpropagation was rediscovered by two independent groups, led by Parker and by Rumelhart et al., as the learning algorithm of feedforward NNs [88], [95]. Grossberg and Carpenter made significant contributions with the Adaptive Resonance Theory (ART) in the mid-1980s, based on the idea that the brain spontaneously organizes itself into recognition codes and that neurons organize themselves to tune to various and specific patterns, defined as self-organizing maps [20]. The dynamics of the network were modeled by first-order differential equations based on implementations of pattern clustering algorithms. Furthermore, Kosko extended some of the ideas of Grossberg and Hopfield to develop his adaptive Bi-directional Associative Memory (BAM) [67]. Hinton, Sejnowski, and Ackley developed the Boltzmann Machine, which is a kind of Hopfield net that settles into solutions by a simulated annealing process as a stochastic technique [54]. Broomhead and Lowe first introduced RBF networks in 1988 [15]. Although the basic idea of RBF had been developed before, under the name "method of potential functions," their work opened another NN frontier. Chen proposed functional-link networks (FLNs), where a nonlinear functional transform of the network inputs aimed at lower computational effort and fast convergence [22].

The 1988 DARPA NN study listed various NN applications, supporting the importance of such technology for commercial and industrial applications and triggering a lot of interest in the scientific community, leading eventually to applications in industrial problems. Since then, the application of ANNs to sophisticated systems has skyrocketed. NNs found widespread relevance in several different fields. Our literature review showed that practical industrial applications were reported in peer-reviewed engineering journals from as early as 1988. Extensive use has been reported in pattern recognition and classification for image and speech recognition; optimization in planning of actions, motions, and tasks; and modeling, identification, and control.

Fig. 1 shows some industrial applications of NNs reported in the last 12 years. The main purpose here is to give an idea of the most used ANN topologies and training algorithms and to relate them to common fields in the industrial area. For each entry, the type of the application, the ANN topology used, the training algorithm implemented, and the main authors are presented. The collected data give a good picture of what has actually migrated from academic research to practical industrial fields and show some of the authors and groups responsible for this migration.

IV. DILEMMATIC PIGEONHOLE OF NEURAL STRUCTURES

Choosing an ANN solution for an immediate application is a situation that requires a choice between options that are (or seem) equally unfavorable or mutually exclusive. Several issues must be considered regarding the problem point of view. The main features of an ANN can be classified as follows [81].

- Topology of the networks: multilayered, single-layered, or recurrent. The network is multilayered if it has distinct layers, such as input, hidden, and output. There are no connections among the neurons within the same layer. If each neuron can be connected with every other neuron in the network through directed edges, except the output node, the network is called single-layered (i.e., there is no hidden layer). A recurrent network distinguishes itself from the feedforward topologies in that it has at least one feedback loop [49].
- Data flow: recurrent or nonrecurrent. In a nonrecurrent or feedforward model, the outputs always propagate from left to right in the diagrams. The outputs from the input-layer neurons propagate to the right, becoming inputs to the hidden-layer neurons, and then the outputs from the hidden-layer neurons propagate to the right, becoming inputs to the output-layer neurons. An NN in which the outputs can propagate in both directions, forward and backward, is called a recurrent model.
- Types of input values: binary, bipolar, or continuous. Neurons in an artificial network can be defined to process different kinds of signals. The most common types are binary (restricted to either 0 or 1), bipolar (either -1 or 1), and continuous (continuous real numbers in a certain range).
- Forms of activation: linear, step, or sigmoid. Activation functions define the way neurons behave inside the network structure and, therefore, the kind of relationship that will occur between input and output signals.

A common classification of ANNs is based on the way in which their elements are interconnected. There is a study that presents the approximate percentages of network utilization as: MLP, 81.2%; Hopfield, 5.4%; Kohonen, 8.3%; and others, 5.1% [49]. This section will cover the main types of networks that have been used in industrial applications in a reasonable number of reports and applications. A comprehensive listing of all available ANN structures and topologies is out of the scope of this discussion.

A. MLPs

In this structure, each neuron output is connected to every neuron in subsequent layers, connected in cascade with no connections between neurons in the same layer. A typical diagram of this structure is detailed in Fig. 2.

MLPs have been reported in several applications. Some examples are speed control of dc motors [94], [117]; diagnostics of induction motor faults [24], [25], [41], [42]; induction motor control [17], [18], [56], [127]; and current regulation for pulsewidth-modulation (PWM) rectifiers [31]. Maintenance and sensor failure detection was reported in [82], as well as for check valves operating in a nuclear power plant [57], [114] and vibration monitoring in rolling element bearings [2]. It was widely applied in feedback control [19], [40], [52], [59], [87], [89], [109], [110] and fault diagnosis of robotic systems [116]. This structure was also used in a temperature control system

Fig. 1. Selected industrial applications reported since 1989.

[63], [64], monitoring feed water flow rate and thermal performance of pressurized water reactors [61], and fault diagnosis in a heat exchanger continuous stirred tank reactor system [102]. It was used in a controller for turbogenerators [117], digital current regulation of inverter drives [16], and welding process modeling and control [4], [32]. The MLP was used in modeling chemical process systems [12], to produce quantitative estimations of concentrations of chemical components [74], and to select powder metallurgy materials and process parameters [23]. Optimization of the gas industry was reported by [121], as well as prediction of the daily natural gas consumption needed by gas utilities [65]. The MLP is indeed the most used structure and has spread out across several disciplines, like identification and defect detection on woven fabrics [99], prediction of paper curl in the papermaking industry [39], a controller for steering a backup truck [85], and modeling of plate rolling processes [46].

Fig. 2. MLP basic topology.

B. Recurrent NNs (RNNs)

A network is called recurrent when the inputs to the neurons come from external inputs as well as from internal neurons, consisting of both feedforward and feedback connections between layers and neurons. Fig. 3 shows such a structure; it was demonstrated that recurrent networks could be effectively used for modeling, identification, and control of nonlinear dynamical systems [83].

Fig. 3. Typical recurrent network structure.

The trajectory-tracking problem of controlling the nonlinear dynamic model of a robot was evaluated using an RNN; this network was used to estimate the dynamics of the system and its inverse dynamic model [97]. It was also used to control robotic manipulators, facilitating the rapid execution of the adaptation process [60]. A recurrent network was used to approximate a trajectory tracking to a very high degree of accuracy [27]. It was applied to estimate the spectral content of noisy periodic waveforms that are common in engineering processes [36].

The Hopfield network model is the most popular type of recurrent NN. It can be used as associative memory and can also be applied to optimization problems. The basic idea of the Hopfield network is that it can store a set of exemplar patterns as multiple stable states. Given a new input pattern, which may be partial or noisy, the network can converge to one of the exemplar patterns nearest to the input pattern. As shown in Fig. 4, a Hopfield network consists of a single layer of neurons. The network is recurrent and fully interconnected (every neuron in the network is connected to every other neuron). Each input/output takes a discrete bipolar value of either 1 or -1 [81].

Fig. 4. Hopfield network structure.

A Hopfield network was used to show how to approach the problem of linear system identification, minimizing the least square of error rates of estimates of state variables [29]. A modified Hopfield structure was used to determine imperfection by the degree of orthogonality between the automatically extracted feature, from the send-through image, and the class feature of early good samples. The performance measure used for such an automatic feature extraction is based on a certain mini-max cost function useful for image classification [112]. Simulation results illustrated the Hopfield network's use, showing that this technique can be used to identify the frequency transfer functions of dynamic plants [28]. An approach to detect and isolate faults in linear dynamic systems was proposed, and system parameters were estimated by a Hopfield network [105].

C. Nonrecurrent Unsupervised Kohonen Networks

A Kohonen network is a structure of interconnected processing units that compete for the signal. It uses unsupervised learning and consists of a single layer of computational nodes (and an input layer). This type of network uses lateral feedback, which is a form of feedback whose magnitude is dependent on the lateral distance from the point of application. Fig. 5 shows the architecture with two layers. The first is the input layer and the second is the output layer, called the Kohonen layer.

Fig. 7. Adaptive resonance theory network.


Fig. 5. Kohonen network structure.

schematic diagram of this structure. This figure depicts the non-


linear input mapping in the Albus approach and a hashing oper-
ation that can be performed to decrease the amount of memory
needed to implement the receptive fields.
CMAC networks are considered local algorithms because, for
a given input vector, only a few receptive fields will be active
and contribute to the corresponding network output [3]. In the
same way, the training algorithm for a CMAC network should
affect only the weights corresponding to active fields, excluding
Every input neuron is connected to every output neuron with its associated weight. The network is nonrecurrent: input information propagates only from left to right. Continuous (rather than binary or bipolar) input values representing patterns are presented sequentially in time through the input layer, without specifying the desired output. The output neurons can be arranged in one or two dimensions. A neighborhood parameter, or radius, can be defined to indicate the neighborhood of a specific neuron. This network has been used as a self-organizing map for classification [98] and for pattern recognition purposes in general.

D. CMAC

The input mapping of the CMAC algorithm can be seen as a set of multidimensional interlaced receptive fields, each one with finite and sharp borders. Any input vector to the network excites some of these fields, while the majority of the receptive fields remain unexcited and do not contribute to the corresponding output. On the other hand, the weighted average of the excited receptive fields will form the network output. Fig. 6 shows a CMAC network structure. Training adapts only the excited fields, leaving untouched the majority of inactive fields in the network. This increases the efficiency of the training process, minimizing the computational effort needed to perform adaptation in the whole network.

Fig. 6. CMAC network structure.

CMAC was primarily applied to complex robotic systems involving multiple feedback sensors and multiple command variables. Common experiments involved controlling the position and orientation of an object using a video camera mounted at the end of a robot arm, and moving objects with arbitrary orientation relative to the robot [79]. This network was also used for air-to-fuel ratio control of automotive fuel-injection systems. Experimental results showed that the CMAC is very effective in learning the engine nonlinearities and in dealing with the significant time delays inherent in engine sensors [76].

E. Adaptive Resonance Theory (ART) (Recurrent, Unsupervised)

The main feature of ART, when compared to other similar structures, is its ability not to forget after learning. Usually, NNs are not able to learn new information without damaging what was previously ascertained. This is caused by the fact that, when a new pattern is presented to an NN in the learning phase, the network tries to modify the weights at node inputs, which represent only what was previously learned. The ART network is recurrent and self-organizing. Its structure is shown in Fig. 7. It has two basic layers and no hidden layers. The input layer is also called the comparing layer, while the output layer is called the recognizing layer; the two layers are completely interconnected in both directions [58]. ART was successfully used for sensor pattern interpretation problems [122], among others.
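The local training behavior of the CMAC is easy to see in code. Below is a minimal, illustrative 1-D CMAC sketch (all names and parameter values are our own assumptions, not taken from the paper): overlapping tilings play the role of receptive fields, the output is formed only from the few excited fields, and training touches only those fields, leaving the rest of the network untouched.

```python
import numpy as np

class TinyCMAC:
    """Illustrative 1-D CMAC: overlapping tilings act as receptive fields."""

    def __init__(self, n_tilings=8, n_tiles=32, x_min=0.0, x_max=1.0, lr=0.1):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.lr = lr
        self.x_min, self.x_max = x_min, x_max
        self.w = np.zeros((n_tilings, n_tiles + 1))  # one weight table per tiling

    def _active_tiles(self, x):
        # Each tiling is shifted slightly, so an input excites one tile per tiling.
        span = (self.x_max - self.x_min) / self.n_tiles
        for t in range(self.n_tilings):
            offset = t * span / self.n_tilings
            idx = int((x - self.x_min + offset) / span)
            yield t, min(idx, self.n_tiles)

    def predict(self, x):
        # Output is formed only from the excited receptive fields.
        return sum(self.w[t, i] for t, i in self._active_tiles(x))

    def train(self, x, target):
        # Only the few excited fields are adapted; inactive fields stay untouched.
        err = target - self.predict(x)
        for t, i in self._active_tiles(x):
            self.w[t, i] += self.lr * err / self.n_tilings

net = TinyCMAC()
for _ in range(200):
    for x in np.linspace(0.0, 1.0, 21):
        net.train(x, np.sin(2 * np.pi * x))
# After training, net.predict(x) approximates sin(2*pi*x) on [0, 1].
```

Because each update is confined to the excited fields, learning is fast and local, which is the property exploited in the robotic and engine-control applications cited above.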
Fig. 8. RBF network structure.

Fig. 9. Probabilistic ANN structure.
F. RBF Networks
Fig. 8 shows the basic structure of an RBF network. The input
nodes pass the input values to the connecting arcs and the first
layer connections are not weighted. Thus, each hidden node
receives each input value unaltered. The hidden nodes are the
RBF units. The second layer of connections is weighted and the
output nodes are simple summations. This network does not ex-
tend to more layers.
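As a concrete illustration of the two-layer arrangement just described, the following sketch (with made-up centers, widths, and weights; none of these values come from the paper) passes the input unaltered to Gaussian RBF hidden units and feeds a simple weighted summation output node.

```python
import numpy as np

def rbf_forward(x, centers, widths, out_weights):
    """Two-layer RBF pass: unweighted inputs -> Gaussian units -> weighted sum."""
    # First layer: each hidden node receives the input value unaltered and
    # responds according to its distance to the node's center (a radial basis).
    h = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2.0 * widths ** 2))
    # Second layer: the only weighted connections; the output node is a summation.
    return h @ out_weights

centers = np.array([[0.0], [0.5], [1.0]])   # hypothetical RBF unit centers
widths = np.array([0.3, 0.3, 0.3])          # hypothetical unit widths
out_weights = np.array([1.0, -2.0, 1.0])    # hypothetical output-layer weights
y = rbf_forward(np.array([0.5]), centers, widths, out_weights)
```

Because only the second layer carries trainable weights, those weights can be fitted with fast linear methods such as LMS, which is one reason the training of the two layers is decoupled and fast.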
For applications such as fault diagnosis, RBF networks offer advantages over MLPs. They are faster to train, because the training of the two layers is decoupled [70]. This network was used to improve the quantity and the quality of the galvannealed sheet produced on galvanizing lines, integrating various approaches, including quality monitoring, diagnosis, control, and optimization methods [13]. An RBF network was also trained to evaluate and compare the different grasping alternatives of a robotic hand, according to geometrical and technological aspects of object surfaces, as well as the specific task to be accomplished. The adoption of the RBF topology was justified by two reasons [34]:
1) in most cases, it presents higher training speed when compared with ANNs based on backpropagation training methods;
2) it allows an easier optimization of performance, since the only parameter that can be used to modify its structure is the number of neurons in the hidden layer.
Results using RBF networks were presented to illustrate that it is possible to successfully control a generator system [43]. Power electronic drive controllers have also been implemented with these networks, in digital signal processors (DSPs), to attain robust properties [38].

G. Probabilistic NNs (PNNs)

PNNs are somewhat similar in structure to MLPs. Some basic differences between them are the use of exponential activation functions and the connection patterns between neurons: the neurons at the internal layers are not fully connected, depending on the application at hand. Fig. 9 depicts this structure, showing its basic differences from the ordinary MLP structure. PNN training is normally easy and nearly instantaneous, because of the smaller number of connections. A practical advantage is that, unlike other networks, the PNN operates completely in parallel and the signal flows in a unique direction, without a need for feedback from individual neurons to the inputs. It can be used for mapping, classification, and associative memory, or to directly estimate a posteriori probabilities [103], [104]. Probabilistic NNs were used to assist operators in identifying transients in nuclear power plants, such as a plant accident scenario, an equipment failure, or an external disturbance to the system, at the earliest stages of their development [6].

H. Polynomial Networks

Fig. 10. Polynomial ANN structure.

Fig. 10 depicts a polynomial network. Its topology is formed during the training process; due to this feature, it is defined as a plastic network. The neuron activation function is based on elementary polynomials of arbitrary order. In this example, the network has seven inputs, although it uses only five of them, due to the automatic input selection capability of the training algorithm. Automatic feature selection is very useful in control applications when the plant model order is unknown. Each neuron output can be expressed by a second-order polynomial function y = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1 x_2 + a_4 x_1^2 + a_5 x_2^2, where x_1 and x_2 are inputs, a_0, ..., a_5 are polynomial coefficients which are equivalent to the network weights, and y is the neuron output. The Group Method of Data Handling (GMDH) is a statistics-based training method largely used in modeling economic, ecological, environmental, and medical problems. The GMDH training algorithm can be
used to adjust the polynomial coefficients and to find the network structure. This algorithm employs two sets of data: one for estimating the network weights and the other for testing which neurons should survive during the training process. A new form of implementation of a filter was proposed using a combination of a recurrent NN and a polynomial NN [101].

Fig. 11. Functional link ANN structure.

I. FLNs

Since NNs are used for adaptive identification and control, the learning capabilities of the networks can have significant effects on the performance of closed-loop systems. If the information content of the data input to a network can be modified online, then the network will more easily extract salient features of the data. The functional link acts on an element of an input vector, or on all the input vectors, by generating a set of linearly independent functions and then evaluating these functions with the pattern as the argument. Thus, both the training time and the training error of the network can be improved [113]. Fig. 11 shows a functional link NN, which can be considered a one-layer feedforward network with an input preprocessing element. Only the weights in the output layer are adjusted [66].

The application of an FLN was presented for heating, ventilating, and air conditioning (HVAC) thermal dynamic system identification and control. The use of an NN provided a means of adapting a controller online, in an effort to minimize a given cost index. The identification networks demonstrated the capacity to learn changes in the plant dynamics and to accurately predict future plant behavior [113]. A robust ANN controller using an FLN was also presented for the motion control of rigid-link electrically driven robots; the method did not require the robot dynamics to be exactly known [69]. Multilayer feedforward and FLN forecasters were used to model the complex relationship between weather parameters and previous gas intake with future consumption [65]. The FLN was also used to improve performance in the face of unknown nonlinear characteristics, by adding nonlinear effects to the linear optimal controller of robotic systems [66].

J. Functional Polynomial Networks (FPNs)

This network structure merges both the functional link and the polynomial network models, resulting in a very powerful ANN model, due to the automatic input selection capability of the polynomial networks. The FPN presents advantages such as fast convergence, no local minima problem, a structure automatically defined by the training process, and no adjustment of learning parameters. It has been tested for speed control of a dc motor, and the results have been compared with the ones provided by an indirect adaptive control scheme based on MLPs trained by backpropagation [101].

Fig. 12. Common grids of CNNs.

K. Cellular NNs (CNNs)

The most general definition for such networks is that they are arrays of identical dynamical systems, called cells, which are only locally connected: only adjacent cells interact directly with each other [78]. In the simplest case, a CNN is an array of simple, identical, nonlinear, dynamical circuits placed on a two-dimensional (2-D) geometric grid, as shown in Fig. 12. If these grids are duplicated in a three-dimensional (3-D) form, a multilayer CNN can be constructed [30]. It is an efficient architecture for performing image processing and pattern recognition [51]. This kind of network has been applied to problems of image classification for quality control. Gulglielmi et al. [47] described a fluorescent magnetic particle inspection, which is a nondestructive method for quality control of ferromagnetic materials.

V. TRAINING METHODS

There are basically two main groups of training (or learning) algorithms: supervised learning (which includes reinforcement learning) and unsupervised learning. Once the structure of an NN has been selected, a training algorithm must be attached to it, to minimize the prediction error made by the network (for supervised learning) or to compress the information from the inputs (for unsupervised learning). In supervised learning, the correct results (target values, desired outputs) are known and are given to the ANN during training, so that the ANN can adjust its weights to try to match its outputs to the target values. After training, an ANN is tested as follows: one gives it only input values, not target values, and sees how close the network comes to outputting the correct target values. Unsupervised learning involves no target values; it tries to auto-associate information from the inputs, with an intrinsic reduction of data dimensionality, similar to extracting principal components in linear systems. This is the role of the training algorithms (i.e., fitting the model represented by the network to the training data available). The error of a particular configuration of the network can be
determined by running all the training cases through the network and comparing the actual output generated with the desired or target outputs or clusters. In evaluating learning algorithms, one is interested in whether a particular algorithm converges, in its speed of convergence, and in the computational complexity of the algorithm.

In supervised learning, a set of inputs and correct outputs is used to train the network. Before the learning algorithms are applied to update the weights, all the weights are initialized randomly [84]. The network, using this set of inputs, produces its own outputs. These outputs are compared with the correct outputs, and the differences are used to modify the weights, as shown in Fig. 13. A special case of supervised learning is reinforcement learning, shown in Fig. 14, where there is no set of inputs and correct outputs: training is commanded only by signals indicating if the produced output is bad or good, according to a defined criterion.

Fig. 13. Supervised learning scheme.

Fig. 15 shows the principles of unsupervised learning, also known as self-organized learning, where a network develops its classification rules by extracting information from the inputs presented to the network. In other words, by using the correlation of the input vectors, the learning rule changes the network weights to group the input vectors into clusters. By doing so, similar input vectors will produce similar network outputs, since they belong to the same cluster.

A. Early Supervised Learning Algorithms

Early learning algorithms were designed for single-layer NNs. They are generally more limited in their applicability, but their historical importance is remarkable.

Perceptron Learning: A single-layer perceptron is trained as follows.
1) Randomly initialize all the network weights.
2) Apply the inputs and calculate the weighted sum of each unit, s = Σ_i w_i x_i.
3) The outputs from each unit are

   y = 1 if s > threshold, y = 0 otherwise.   (1)

4) Compute the error e = d − y, where d is the known desired output value.
5) Update each weight as w_i(t+1) = w_i(t) + η e x_i, where η is the learning rate.
6) Repeat steps 2)–5) until the errors reach a satisfactory level.

LMS or Widrow–Hoff Learning: The LMS algorithm is quite similar to the perceptron learning algorithm. The differences are as follows.
1) The error is based on the weighted sum of inputs to the unit, s, rather than on the binary-valued output of the unit, y. Therefore,

   e = d − s.   (2)

2) The linear sum of the inputs is passed through a bipolar sigmoid function, which produces the output +1 or −1, depending on the polarity of the sum.
This algorithm can be used in structures such as RBF networks, where it was successfully applied [43], [70]. Some CMAC approaches can also use this algorithm to adapt a complex robotic system involving multiple feedback sensors and multiple command variables [79].

Grossberg Learning: Sometimes known as in-star and out-star training, this algorithm is updated as follows:

   w(t+1) = w(t) + η [x(t) − w(t)]   (3)

where x(t) could be the desired input values (in-star training) or the desired output values (out-star training), depending on the network structure.

B. First-Order Gradient Methods

Backpropagation: Backpropagation is a generalization of the LMS algorithm. In this algorithm, an error function is defined as the mean-square difference between the desired output and the actual output of the feedforward network [45]. It is based on steepest-descent techniques, extended to each of the layers in the network by the chain rule. Hence, the algorithm computes the partial derivative ∂E/∂w of the error function with respect to the weights. The error function is defined as E = (1/2) Σ_k (d_k − y_k)², where d_k is the desired output and y_k is the network output. The objective is to minimize the error function by taking the error gradient with respect to the parameter or weight vector w that is to be adapted. The weights are then updated by using

   w(t+1) = w(t) − η g(t)   (4)

where η is the learning rate and

   g(t) = ∂E(t)/∂w(t).   (5)

This algorithm is simple to implement and computationally less complex than other modified forms. Despite some disadvantages, it is popularly used, and there are numerous extensions to improve it. Some of these techniques will be presented.

Backpropagation With Momentum (BPM): The basic improvement to the backpropagation algorithm is to introduce a momentum term in the weight-updating equation

   Δw(t+1) = −η g(t) + α Δw(t)   (6)

where the momentum factor α is commonly selected inside [0, 1]. Adding the momentum term improves the convergence speed and helps keep the network from being trapped in a local minimum.
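The effect of the momentum term in (6) can be sketched on a toy quadratic error surface as follows (the function and all parameter values here are illustrative assumptions, not from the paper):

```python
import numpy as np

def descend(grad, w0, eta=0.1, alpha=0.9, steps=100):
    """Gradient descent with momentum: dw(t+1) = -eta*g(t) + alpha*dw(t)."""
    w = np.asarray(w0, dtype=float)
    dw = np.zeros_like(w)
    for _ in range(steps):
        dw = -eta * grad(w) + alpha * dw  # momentum smooths the step direction
        w = w + dw
    return w

# A narrow quadratic valley, where plain gradient descent tends to oscillate
# along the steep axis; the momentum term damps the oscillation.
grad = lambda w: np.array([2.0 * w[0], 20.0 * w[1]])
w = descend(grad, [1.0, 1.0])
```

Here `w` converges toward the minimum at the origin; with `alpha = 0` the same routine reduces to the plain update of (4) and (5).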
Fig. 14. Reinforcement learning scheme.

Fig. 15. Unsupervised learning scheme.

A modification to (6) was proposed in 1990, inserting a constant λ, defined by the user [84]

   Δw(t+1) = −η g(t) + α Δw(t) + λ Δw(t−1)   (7)

The idea was to reduce the possibility of the network being trapped in a local minimum.

Delta–Bar–Delta (DBD): The DBD learning rules use adaptive learning rates to speed up the convergence. The adaptive learning rate adopted is based on a local optimization method. This technique uses gradient descent for the search direction and then applies individual step sizes for each weight. This means that the actual direction taken in weight space is not necessarily along the line of the steepest gradient. If the weight updates between consecutive iterations are in opposite directions, the step size is decreased; otherwise, it is increased. This is prompted by the idea that, if the weight changes are oscillating, the minimum is between the oscillations, and a smaller step size might find that minimum. The step size may be increased again once the error has stopped oscillating.

If η_ij denotes the learning rate for the weight w_ij, then

   η_ij(t+1) = η_ij(t) + Δη_ij(t)   (8)

and Δη_ij(t) is as follows:

   Δη_ij(t) = κ, if δ̄_ij(t−1) δ_ij(t) > 0; −φ η_ij(t), if δ̄_ij(t−1) δ_ij(t) < 0; 0, otherwise   (9)

where

   δ_ij(t) = ∂E(t)/∂w_ij(t)   (10)

   δ̄_ij(t) = (1 − θ) δ_ij(t) + θ δ̄_ij(t−1).   (11)

The positive constant κ and the parameters (φ, θ) are specified by the user. The quantity δ̄_ij(t) is basically an exponentially decaying trace of gradient values. When κ and φ are set to zero, the learning rates assume a constant value, as in the standard backpropagation algorithm.

Using momentum along with the DBD algorithm can enhance performance considerably. However, it can make the search diverge wildly, especially if κ is moderately large. The reason is that momentum magnifies learning-rate increments and quickly leads to inordinately large learning steps. One possible solution is to keep the factor κ very small, but this can easily lead to a slow increase in the learning rates and little speedup [84].

C. Second-Order Gradient Methods

These methods use the Hessian matrix H, which is the matrix of second derivatives of E with respect to the weights w. This matrix contains information about how the gradient changes in different directions in weight space:

   H = ∂²E/∂w²   (12)

Newton Method: The Newton method weight update is processed as follows:

   w(t+1) = w(t) − H(t)⁻¹ g(t)   (13)

However, the Newton method is not commonly used, because computing the Hessian matrix is computationally expensive. Furthermore, the Hessian matrix may not be positive definite at every point in the error surface. To overcome this problem, several methods have been proposed to approximate the Hessian matrix [84].

Gauss–Newton Method: The Gauss–Newton method produces a matrix that is an approximation to the Hessian matrix, having elements represented by

   h_ij = Σ_k (∂e_k/∂w_i)(∂e_k/∂w_j)   (14)
where e_k = d_k − y_k. However, the Gauss–Newton method may still have ill conditioning if the approximate Hessian is close to singular or is singular [75].

Levenberg–Marquardt (LM) Method: The LM method overcomes this difficulty by including an additional term, μI, which is added to the Gauss–Newton approximation of the Hessian, giving

   Ĥ(t) = H(t) + μI   (15)

where μ is a small positive value and I is the identity matrix. μ could also be made adaptive by having

   μ(t+1) = μ(t)/β, if E(t+1) < E(t); μ(t) β, if E(t+1) ≥ E(t)   (16)

where β is a value defined by the user. It is important to notice that, when μ is large, the algorithm becomes backpropagation with learning rate 1/μ and, when μ is small, the algorithm becomes Gauss–Newton. An NN trained by using this algorithm can be found in the diagnosis of various motor bearing faults, through appropriate measurement and interpretation of motor bearing vibration signals [72].

D. Reinforcement Learning Algorithms

Reinforcement learning has one of its roots in psychology, from the idea of rewards and penalties used to teach animals to do simple tasks with small amounts of feedback information. Barto and others proposed the Adaptive Critic Learning algorithm to solve discrete-domain decision-making problems in the 1980s. Their approach was later generalized to the NN field by Sutton, who used CMAC and other ANN structures to learn paths for mobile robots in maze-like environments [111].

Linear Reward–Penalty Learning: When the reinforcement signal is positive (+1), the learning rule is

   p_i(t+1) = p_i(t) + α [1 − p_i(t)]   (17)

   p_j(t+1) = (1 − α) p_j(t), for j ≠ i   (18)

If the reinforcement signal is negative (−1), the learning rule is

   p_i(t+1) = (1 − β) p_i(t)   (19)

   p_j(t+1) = β/(n − 1) + (1 − β) p_j(t), for j ≠ i   (20)

where α and β are learning rates, p_i(t) denotes the probability of action i at iteration t, and n is the number of actions taken. For positive reinforcement, the probability of the current action is increased, with a relative decrease in the probabilities of the other actions. The adjustment is reversed in the case of negative reinforcement.

Associative Search Learning: In this algorithm, the weights are updated as follows [9]:

   Δw_i(t) = η r(t) e_i(t)   (21)

where r(t) is the reinforcement signal and e_i(t) is the eligibility. Positive r(t) indicates the occurrence of a rewarding event and negative r(t) indicates the occurrence of a punishing event. It can be regarded as a measure of the change in the value of a performance criterion. Eligibility reflects the extent of activity in the pathway or connection link. Exponentially decaying eligibility traces can be generated using the following linear difference equation:

   e_i(t+1) = δ e_i(t) + (1 − δ) y(t) x_i(t)   (22)

where δ determines the trace decay rate, x_i(t) is the input, and y(t) is the output.

Adaptive Critic Learning: The weights update in a critic network is as follows [9]:

   Δv_i(t) = β [r(t) + γ p(t) − p(t−1)] x̄_i(t)   (23)

where γ, 0 ≤ γ < 1, is a constant discount factor, r(t) is the reinforcement signal from the environment, and x̄_i(t) is the trace of the input variable x_i(t). p(t) is the prediction at time t of eventual reinforcement, which can be described as a linear function p(t) = Σ_i v_i(t) x_i(t). The adaptive critic network output r̂(t), the improved or internal reinforcement signal, is computed from the predictions as follows:

   r̂(t) = r(t) + γ p(t) − p(t−1).   (24)

E. Unsupervised Learning

Hebbian Learning: Weights are updated as follows:

   w_ij(t+1) = w_ij(t) + Δw_ij(t)   (25)

   Δw_ij(t) = η x_i(t) y_j(t)   (26)

where w_ij is the weight from the ith unit to the jth unit at time step t, x_i(t) is the excitation level of the source (ith) unit, and y_j(t) is the excitation level of the destination (jth) output unit. In this system, learning is a purely local phenomenon, involving only two units and a synapse. No global feedback system is required for the neural pattern to develop. A special case of Hebbian learning is correlation learning, which uses binary activations and defines y_j as the desired excitation level for the destination unit. While Hebbian learning is performed in unsupervised environments, correlation learning is supervised [128].

Boltzmann Machine Learning: The Boltzmann Machine training algorithm uses a kind of stochastic technique known as simulated annealing, to avoid being trapped in local minima of the network energy function. The algorithm is as follows.
1) Initialize weights.
2) Calculate activation as follows.
   a) Select an initial temperature.
   b) Until thermal equilibrium is reached, repeatedly calculate the probability that the kth unit is active by (27).
   c) Exit when the lowest temperature is reached. Otherwise, reduce the temperature by a certain annealing schedule and repeat step 2)

   p_k = 1 / (1 + e^(−Δe_k / T))   (27)

Above, T is the temperature, Δe_k is the total input received by the kth unit, and the activation level of the unit is set according to this probability.
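A minimal sketch of the Hebbian update in (25) and (26), using made-up activation values (none of the numbers below come from the paper), shows how purely local the rule is: each synapse changes only as the product of the activities of its two end units.

```python
import numpy as np

def hebbian_step(w, x, y, eta=0.1):
    """Hebbian rule: delta w_ij = eta * x_i * y_j, a purely local update."""
    return w + eta * np.outer(x, y)

x = np.array([1.0, 0.0, 1.0])   # source (ith) unit excitation levels
y = np.array([1.0, 0.5])        # destination (jth) unit excitation levels
w = hebbian_step(np.zeros((3, 2)), x, y)
# Only synapses whose two end units are simultaneously active are strengthened;
# the silent middle unit leaves its row of weights at zero.
```

For the supervised correlation-learning variant mentioned above, `y` would instead hold the desired excitation levels of the destination units.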
Kohonen Self-Organizing Learning: The network is trained according to the following algorithm, frequently called the winner-takes-all rule.
1) Apply an input vector x.
2) Calculate the distance d_j (in n-dimensional space) between x and the weight vector w_j of each unit. In Euclidean space, this is calculated as follows:

   d_j = Σ_i (x_i − w_ij)²   (28)

3) The unit that has the weight vector closest to x is declared the winner unit. This weight vector, called w_c, becomes the center of a group of weight vectors that lie within a distance d from w_c.
4) For all weight vectors within a distance d of w_c, train this group of nearby weight vectors according to the formula that follows:

   w_j(t+1) = w_j(t) + η [x − w_j(t)]   (29)

5) Perform steps 1)–4), cycling through each input vector, until convergence.

F. Practical Considerations

NNs are unsurpassed at identifying patterns or trends in data, and they are well suited for prediction or forecasting needs, including sales and customer research, data validation, risk management, and industrial process control. One of the fascinating aspects of the practical implementation of NNs in industrial applications is the ability to manage data interaction between electrical and mechanical behavior, and often other disciplines as well. The majority of the reported applications involve fault diagnosis and detection, quality control, pattern recognition, and adaptive control [14], [44], [74], [115]. Supervised NNs can mimic the behavior of human control systems, as long as data corresponding to the human operator and the control input are supplied [7], [126]. Most of the existing successful applications in control use supervised learning, or some form of reinforcement learning approach that is also supervised. Unsupervised learning is not suitable, particularly for online control, due to the slow adaptation and the time required for the network to settle into stable conditions. Unsupervised learning schemes are used mostly for pattern recognition, by grouping patterns into a number of clusters or classes.

There are some advantages of NNs over multiple data regression. There is no need to select the most important independent variables in the data set: the synapses associated with irrelevant variables readily show negligible weight values, while relevant variables present significant synapse weight values. There is also no need to propose a model function, as required in multiple regressions. The learning capability of NNs allows them to discover more complex and subtle interactions between the independent variables, contributing to the development of a model with maximum precision. NNs are intrinsically robust, showing more immunity to noise, an important factor in modeling industrial processes. NNs have been applied within industrial domains to address the inherent complexity of interacting processes, under the lack of robust analytical models of real industrial processes. In many cases, network topologies and training parameters are systematically varied until satisfactory convergence is achieved.

Currently, the most widely used algorithm for training MLPs is the backpropagation algorithm. It minimizes the mean square error between the desired and the actual output of the network. The optimization is carried out with a gradient-descent technique. There are two critical issues in network learning: estimation error and training time. These issues may be affected by the network architecture and the training set. The network architecture includes the number of hidden nodes, the number of hidden layers, and the values of the learning parameters. The training set is related to the number of training patterns, inaccuracies of input data, and preprocessing of data. The backpropagation algorithm does not always find the global minimum, but may stop at a local minimum. In practice, the type of minimum has little importance, as long as the desired mapping or classification is reached with a desired accuracy. The optimization criterion of the backpropagation algorithm is not very good from the pattern recognition point of view: the algorithm minimizes the square error between the actual and the desired output, not the number of faulty classifications, which is the main goal in pattern recognition. The algorithm is also too slow for some practical applications, especially if many hidden layers are used. In addition, a backpropagation net has poor memory: when the net learns something new, it forgets the old. Despite its shortcomings, backpropagation is broadly used. Although the backpropagation algorithm has been a significant milestone, many attempts have been made to speed up the convergence, and significant improvements are observed by using various second-order approaches, namely, Newton's method, conjugate gradients, or the LM optimization technique [5], [10], [21], [48]. The issues to be dealt with are [84], [102] as follows:

1) slow convergence speed;
2) sensitivity to initial conditions;
3) trapping in local minima;
4) instability if the learning rate is too large.

One of the alternatives for the problem of being trapped in a local minimum is adding the momentum term using the BPM, which also improves the convergence speed. Another alternative, used when a backpropagation algorithm is difficult to implement, as in analog hardware, is the random weight change (RWC). This algorithm has shown to be immune to offset sensitivity and nonlinearity errors. It is a stochastic learning method that makes sure that the error function decreases on average, although it may be going up or down at any one time. It is often called simulated annealing, because of its operational similarity to annealing processes [73]. Second-order gradient methods use the matrix of second derivatives of E with respect to the weights w. However, computing this matrix is computationally expensive, and the methods presented tried to approximate this matrix, to make the algorithms more accessible.

Linear reward–penalty, associative search, and adaptive critic algorithms are characterized as a special case of supervised learning called reinforcement learning. They do not need to explicitly compute derivatives. The computation of derivatives usually introduces a lot of high-frequency noise in the control loop. Therefore, they are very suitable for some complex systems, where basic training algorithms may fail or produce suboptimal results. On the other hand, those methods present
slower learning processes; and, because of this, they are adopted themselves, have to be found from data in the training phase,
especially in the cases where only a single bit of information without supervision.
(for example, whether the output is right or wrong) is available.
C. Process Control
VI. TAXONOMY OF NN APPLICATIONS The NN makes use of nonlinearity, learning, parallel pro-
From the viewpoint of industrial applications, ANN applica- cessing, and generalization capabilities for application to
tions can be divided into four main categories. advanced intelligent control. They can be classified into some
major methods, such as supervised control, inverse control,
A. Modeling and Identification neural adaptive control, back-propagation of utility (which is
an extended method of a back-propagation through time) and
Modeling and identification are techniques to describe the adaptive critics (which is an extended method of reinforcement
physical principles existing between the input and the output learning algorithm) [2]. MLP structures were used for digital
of a system. The ANN can approximate these relationships current regulation of inverter drives [16], to predict trajectories
independent of the size and complexity of the problem. It has in robotic environments [19], [40], [52], [73], [79], [87], [89],
been found to be an effective system for learning discriminants, [110], to control turbo generators [117], to monitor feed water
for patterns from a body of examples. MLP is used as the flow rate and component thermal performance of pressurized
basic structure for a bunch of applications [4], [12], [17], [18], water reactors [61], to regulate temperature [64], and to predict
[32], [46], [56], [85], [94], [119], [127]. The Hopfield network natural gas consumption [65]. Dynamical versions of MLP
can be used to identify problems of linear time-varying or networks were used to control a nonlinear dynamic model of
time-invariant systems [28]. Recurrent network topology [36], a robot [60], [97], to control manufacturing cells [92], and to
[83] has received considerable attention for the identification implement a programmable cascaded low-pass filter [101]. A
of nonlinear dynamical systems. A functional-link NN ap- dynamic MLP is a classical MLP structure where the outputs
proach (FLN) was used to perform thermal dynamical system are fed back to the inputs by means of time delay elements.
identification [113]. Other structures can be found as functional link networks to
B. Optimization and Classification

Optimization is often required for design and for the planning of actions, motions, and tasks. However, as in the Traveling Salesman Problem, the number of parameters can make the amount of calculation tremendous, so that ordinary methods cannot be applied. An effective approach is to define an energy function and let the NN, with its parallel processing, learning, and self-organizing capabilities, operate in such a way that this energy is reduced. It has been shown that this optimization approach makes effective use of ANN sensing, recognition, and forecasting capabilities in the control of robotic manipulators with impact taken into account [45].

Classification using an ANN can also be viewed as an optimization problem, provided that the existing rules to distinguish the various classes of events/materials/objects can be described in functional form. In such cases, the network decides whether a particular input belongs to one of the defined classes by optimizing the functional rules and evaluating the achieved results a posteriori. Different authors [34], [70] have proposed RBF approaches. For applications such as fault diagnosis, RBF networks offer clear advantages over MLPs: they are faster to train, because layer training is decoupled [70]. Cellular networks [47], ART networks [122], and Hopfield networks [105], [112] can be used to detect and isolate faults and to promote industrial quality control. MLPs are also widely used for these purposes [23], [25], [35], [102], [114], [121]. This structure can be found in induction motor [41], [42] and bearing [72] fault diagnosis, in nondestructive evaluation of check valve performance and degradation [2], [57], in defect detection on woven fabrics [99], and in robotic systems [116].

Finally, it is important to mention clustering applications, which are special cases of classification where there is no supervision during the training phase. The relationships between elements of the existing classes, and even the classes themselves, must be discovered during training.

C. Control

control robots [69] and RBF networks to predict, from operating conditions and from features of a steel sheet, the thermal energy required to correct alloying [13]. RBF networks can be observed, as well, in predictive controllers for drive systems [38]. Hopfield structures were used for torque minimization control of redundant manipulators [33]. FPNs can be used for function approximation inside specific control schemes [100]. CMAC networks were implemented in research automobiles [76] and to control robots [79].

D. Pattern Recognition

Some specific ANN structures, such as Kohonen and probabilistic networks, are studied and applied mainly for image and voice recognition. Research in image recognition includes initial vision (stereo vision with both eyes, outline extraction, etc.) close to the biological (particularly brain) function, handwritten character recognition at the practical level, and cell recognition for mammalian cell cultivation using NNs [45]. Kohonen networks were used for image inspection and for disease identification from mammographic images [98]. Probabilistic networks were used for transient detection to enhance the operational safety of nuclear reactors [6]. As in the other categories, the MLP is widely used as well; the papermaking industry [39] is one such example.

VII. CONCLUSION

This paper has described theoretical aspects of NNs related to their relevance for industrial applications. Common questions that an engineer would ask when choosing an NN for a particular application were answered. Characteristics of industrial processes that justify ANN utilization were discussed, and some areas of importance were proposed. Important structures and training methods were presented, with relevant references that illustrate the utilization of those concepts.
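The decoupled-layer training that Section VI-B credits for the speed of RBF classifiers [70] can be sketched in a few lines. The sketch below is illustrative only and is not taken from any of the surveyed papers; the toy two-class "fault diagnosis" data, the hand-picked centers, and the Gaussian width are invented for the example. Once the hidden layer (centers and width) is fixed, the output layer reduces to a single linear least-squares solve, with no iterative backpropagation.

```python
import numpy as np

def rbf_design_matrix(X, centers, width):
    # Gaussian activation of each sample with respect to each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def train_rbf_classifier(X, y, centers, width):
    # Hidden layer is fixed, so output weights come from one closed-form
    # least-squares solve against one-hot targets -- the "decoupled" step.
    T = np.eye(int(y.max()) + 1)[y]
    Phi = rbf_design_matrix(X, centers, width)
    W, *_ = np.linalg.lstsq(Phi, T, rcond=None)
    return W

def predict(X, centers, width, W):
    # Class with the largest output-layer response wins.
    return (rbf_design_matrix(X, centers, width) @ W).argmax(axis=1)

# Toy two-class data: two clusters of 2-D feature vectors, one per "fault".
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(2.0, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
centers = np.array([[0.0, 0.0], [2.0, 2.0]])  # chosen by hand here;
                                              # k-means is typical in practice
W = train_rbf_classifier(X, y, centers, width=0.5)
print((predict(X, centers, width=0.5, W=W) == y).mean())  # fraction correct
```

Contrast this single linear solve with an MLP, where all layers must be adjusted jointly by an iterative gradient procedure; this is the training-speed advantage the survey attributes to RBF networks for fault-diagnosis tasks.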
This survey observed that, although ANNs have a history of more than 50 years, most industrial applications were launched in the last ten years, when investigators showed that ANNs provide either an alternative or a complement to other classical techniques. Those applications demonstrated adaptability features integrated with the industrial problem, thus becoming part of the industrial processes. The authors firmly believe that such an intricate field as NNs is just starting to permeate a broad range of interdisciplinary problem-solving streams. The potential of NNs will be integrated into a still larger and all-encompassing field of intelligent systems and will soon be taught to students and engineers as an ordinary mathematical tool.

REFERENCES

[1] J. S. Albus, "A new approach to manipulator control: The cerebellar model articulation controller," Trans. ASME, J. Dyn. Syst., Meas. Control, vol. 97, pp. 220-227, Sept. 1975.
[2] I. E. Algundigue and R. E. Uhrig, "Automatic fault recognition in mechanical components using coupled artificial neural networks," in Proc. IEEE World Congr. Computational Intelligence, June-July 1994, pp. 3312-3317.
[3] P. E. M. Almeida and M. G. Simões, "Fundamentals of a fast convergence parametric CMAC network," in Proc. IJCNN'01, vol. 3, 2001, pp. 3015-3020.
[4] K. Andersen, G. E. Cook, G. Karsai, and K. Ramaswamy, "Artificial neural networks applied to arc welding process modeling and control," IEEE Trans. Ind. Applicat., vol. 26, pp. 824-830, Sept./Oct. 1990.
[5] T. J. Andersen and B. M. Wilamowski, "A modified regression algorithm for fast one layer neural network training," in Proc. World Congr. Neural Networks, vol. 1, Washington, DC, July 17-21, 1995, pp. 687-690.
[6] I. K. Attieh, A. V. Gribok, J. W. Hines, and R. E. Uhrig, "Pattern recognition techniques for transient detection to enhance nuclear reactors operational safety," in Proc. 25th CNS/CNA Annu. Student Conf., Knoxville, TN, Mar. 2000.
[7] S. M. Ayala, G. Botura Jr., and O. A. Maldonado, "AI automates substation control," IEEE Comput. Applicat. Power, vol. 15, pp. 41-46, Jan. 2002.
[8] D. L. Bailey and D. M. Thompson, "Developing neural-network applications," AI Expert, vol. 5, no. 9, pp. 34-41, 1990.
[9] A. G. Barto, R. S. Sutton, and C. W. Anderson, "Neuronlike elements that can solve difficult control problems," IEEE Trans. Syst., Man, Cybern., vol. SMC-13, pp. 834-846, Sept./Oct. 1983.
[10] R. Battiti, "First- and second-order methods for learning: Between steepest descent and Newton's method," Neural Computation, vol. 4, no. 2, pp. 141-166, 1992.
[11] B. Bavarian, "Introduction to neural networks for intelligent control," IEEE Contr. Syst. Mag., vol. 8, pp. 3-7, Apr. 1988.
[12] N. V. Bhat, P. A. Minderman, T. McAvoy, and N. S. Wang, "Modeling chemical process systems via neural computation," IEEE Contr. Syst. Mag., vol. 10, pp. 24-30, Apr. 1990.
[13] G. Bloch, F. Sirou, V. Eustache, and P. Fatrez, "Neural intelligent control for a steel plant," IEEE Trans. Neural Networks, vol. 8, pp. 910-918, July 1997.
[14] Z. Boger, "Experience in developing models of industrial plants by large scale artificial neural networks," in Proc. Second New Zealand International Two-Stream Conf. Artificial Neural Networks and Expert Systems, 1995, pp. 326-329.
[15] D. S. Broomhead and D. Lowe, "Multivariable functional interpolation and adaptive network," Complex Syst., vol. 2, pp. 321-355, 1988.
[16] M. Buhl and R. D. Lorenz, "Design and implementation of neural networks for digital current regulation of inverter drives," in Conf. Rec. IEEE-IAS Annu. Meeting, 1991, pp. 415-423.
[17] B. Burton and R. G. Harley, "Reducing the computational demands of continually online-trained artificial neural networks for system identification and control of fast processes," IEEE Trans. Ind. Applicat., vol. 34, pp. 589-596, May/June 1998.
[18] B. Burton, F. Kamran, R. G. Harley, T. G. Habetler, M. Brooke, and R. Poddar, "Identification and control of induction motor stator currents using fast on-line random training of a neural network," in Conf. Rec. IEEE-IAS Annu. Meeting, 1995, pp. 1781-1787.
[19] R. Carelli, E. F. Camacho, and D. Patiño, "A neural network based feedforward adaptive controller for robots," IEEE Trans. Syst., Man, Cybern., vol. 25, pp. 1281-1288, Sept. 1995.
[20] G. A. Carpenter and S. Grossberg, "Associative learning, adaptive pattern recognition and cooperative-competitive decision making," in Optical and Hybrid Computing, H. Szu, Ed. Bellingham, WA: SPIE, 1987, vol. 634, pp. 218-247.
[21] C. Charalambous, "Conjugate gradient algorithm for efficient training of artificial neural networks," Proc. Inst. Elect. Eng., vol. 139, no. 3, pp. 301-310, 1992.
[22] S. Chen and S. A. Billings, "Neural networks for nonlinear dynamic system modeling and identification," Int. J. Control, vol. 56, no. 2, pp. 319-346, 1992.
[23] R. P. Cherian, L. N. Smith, and P. S. Midha, "A neural network approach for selection of powder metallurgy materials and process parameters," Artif. Intell. Eng., vol. 14, pp. 39-44, 2000.
[24] M. Y. Chow, P. M. Mangum, and S. O. Yee, "A neural network approach to real-time condition monitoring of induction motors," IEEE Trans. Ind. Electron., vol. 38, pp. 448-453, Dec. 1991.
[25] M. Y. Chow, R. N. Sharpe, and J. C. Hung, "On the application and design of artificial neural networks for motor fault detection—Part II," IEEE Trans. Ind. Electron., vol. 40, pp. 189-196, Apr. 1993.
[26] ——, "On the application and design of artificial neural networks for motor fault detection—Part I," IEEE Trans. Ind. Electron., vol. 40, pp. 181-188, Apr. 1993.
[27] T. W. S. Chow and Y. Fang, "A recurrent neural-network based real-time learning control strategy applying to nonlinear systems with unknown dynamics," IEEE Trans. Ind. Electron., vol. 45, pp. 151-161, Feb. 1998.
[28] S. R. Chu and R. Shoureshi, "Applications of neural networks in learning of dynamical systems," IEEE Trans. Syst., Man, Cybern., vol. 22, pp. 160-164, Jan./Feb. 1992.
[29] S. R. Chu, R. Shoureshi, and M. Tenorio, "Neural networks for system identification," IEEE Contr. Syst. Mag., vol. 10, pp. 31-35, Apr. 1990.
[30] L. O. Chua, T. Roska, T. Kozek, and Á. Zarándy, "The CNN paradigm—A short tutorial," in Cellular Neural Networks, T. Roska and J. Vandewalle, Eds. New York: Wiley, 1993, pp. 1-14.
[31] M. Cichowlas, D. Sobczuk, M. P. Kazmierkowski, and M. Malinowski, "Novel artificial neural network based current controller for PWM rectifiers," in Proc. 9th Int. Conf. Power Electronics and Motion Control, 2000, pp. 41-46.
[32] G. E. Cook, R. J. Barnett, K. Andersen, and A. M. Strauss, "Weld modeling and control using artificial neural network," IEEE Trans. Ind. Applicat., vol. 31, pp. 1484-1491, Nov./Dec. 1995.
[33] H. Ding and S. K. Tso, "A fully neural-network-based planning scheme for torque minimization of redundant manipulators," IEEE Trans. Ind. Electron., vol. 46, pp. 199-206, Feb. 1999.
[34] G. Dini and F. Failli, "Planning grasps for industrial robotized applications using neural networks," Robot. Comput. Integr. Manuf., vol. 16, pp. 451-463, Dec. 2000.
[35] M. Dolen and R. D. Lorenz, "General methodologies for neural network programming," in Proc. IEEE Applied Neural Networks Conf., Nov. 1999, pp. 337-342.
[36] ——, "Recurrent neural network topologies for spectral state estimation and differentiation," in Proc. ANNIE Conf., St. Louis, MO, Nov. 2000.
[37] M. Dolen, P. Y. Chung, E. Kayikci, and R. D. Lorenz, "Disturbance force estimation for CNC machine tool feed drives by structured neural network topologies," in Proc. ANNIE Conf., St. Louis, MO, Nov. 2000.
[38] Y. Dote, M. Strefezza, and A. Suyitno, "Neuro fuzzy robust controllers for drive systems," in Proc. IEEE Int. Symp. Industrial Electronics, 1993, pp. 229-242.
[39] P. J. Edwards, A. F. Murray, G. Papadopoulos, A. R. Wallace, J. Barnard, and G. Smith, "The application of neural networks to the papermaking industry," IEEE Trans. Neural Networks, vol. 10, pp. 1456-1464, Nov. 1999.
[40] M. J. Er and K. C. Liew, "Control of Adept One SCARA robot using neural networks," IEEE Trans. Ind. Electron., vol. 44, pp. 762-768, Dec. 1997.
[41] F. Filippetti, G. Franceschini, and C. Tassoni, "Neural networks aided on-line diagnostics of induction motor rotor faults," IEEE Trans. Ind. Applicat., vol. 31, pp. 892-899, July/Aug. 1995.
[42] F. Filippetti, G. Franceschini, C. Tassoni, and P. Vas, "Recent developments of induction motor drives fault diagnosis using AI techniques," IEEE Trans. Ind. Electron., vol. 47, pp. 994-1004, Oct. 2000.
[43] D. Flynn, S. McLoone, G. W. Irwin, M. D. Brown, E. Swidenbank, and B. W. Hogg, "Neural control of turbogenerator systems," Automatica, vol. 33, no. 11, pp. 1961-1973, 1997.
[44] D. B. Fogel, "Selecting an optimal neural network," in Proc. IEEE IECON'90, vol. 2, 1990, pp. 1211-1214.
[45] T. Fukuda and T. Shibata, "Theory and applications of neural networks for industrial control systems," IEEE Trans. Ind. Electron., vol. 39, pp. 472-489, Dec. 1992.
[46] A. A. Gorni, "The application of neural networks in the modeling of plate rolling processes," JOM-e, vol. 49, no. 4, electronic document, Apr. 1997.
[47] N. Guglielmi, R. Guerrieri, and G. Baccarani, "Highly constrained neural networks for industrial quality control," IEEE Trans. Neural Networks, vol. 7, pp. 206-213, Jan. 1996.
[48] M. T. Hagan and M. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Trans. Neural Networks, vol. 5, pp. 989-993, Nov. 1994.
[49] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. New York: Prentice-Hall, 1995.
[50] S. M. Halpin and R. F. Burch, "Applicability of neural networks to industrial and commercial power systems: A tutorial overview," IEEE Trans. Ind. Applicat., vol. 33, pp. 1355-1361, Sept./Oct. 1997.
[51] H. Harrer and J. Nossek, "Discrete-time cellular neural networks," in Cellular Neural Networks, T. Roska and J. Vandewalle, Eds. New York: Wiley, 1993, pp. 15-29.
[52] H. Hashimoto, T. Kubota, M. Sato, and F. Harashima, "Visual control of robotic manipulator based on neural networks," IEEE Trans. Ind. Electron., vol. 39, pp. 490-496, Dec. 1992.
[53] D. O. Hebb, The Organization of Behavior. New York: Wiley, 1949.
[54] G. E. Hinton and T. J. Sejnowski, "Learning and relearning in Boltzmann machines," in The PDP Research Group, D. Rumelhart and J. McClelland, Eds. Cambridge, MA: MIT Press, 1986.
[55] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," in Proc. Nat. Acad. Sci., vol. 79, Apr. 1982, pp. 2554-2558.
[56] C. Y. Huang, T. C. Chen, and C. L. Huang, "Robust control of induction motor with a neural-network load torque estimator and a neural-network identification," IEEE Trans. Ind. Electron., vol. 46, pp. 990-998, Oct. 1999.
[57] A. Ikonomopoulos, R. E. Uhrig, and L. H. Tsoukalas, "Use of neural networks to monitor power plant components," in Proc. American Power Conf., vol. 54-II, Apr. 1992, pp. 1132-1137.
[58] M. Jelínek. (1999) Everything you wanted to know about ART neural networks, but were afraid to ask. [Online]. Available: http://cs.felk.cvut.cz/~xjeline1/semestralky/nan
[59] S. Jung and T. C. Hsia, "Neural network impedance force control of robot manipulator," IEEE Trans. Ind. Electron., vol. 45, pp. 451-461, June 1998.
[60] A. Karakasoglu and M. K. Sundareshan, "A recurrent neural network-based adaptive variable structure model-following control of robotic manipulators," Automatica, vol. 31, no. 10, pp. 1495-1507, 1995.
[61] K. Kavaklioglu and B. R. Upadhyaya, "Monitoring feedwater flow rate and component thermal performance of pressurized water reactors by means of artificial neural networks," Nucl. Technol., vol. 107, pp. 112-123, July 1994.
[62] M. Kawato, Y. Uno, M. Isobe, and R. Suzuki, "Hierarchical neural network model for voluntary movement with application to robotics," IEEE Contr. Syst. Mag., vol. 8, pp. 8-15, Apr. 1988.
[63] M. Khalid and S. Omatu, "A neural network controller for a temperature control system," IEEE Contr. Syst. Mag., vol. 12, pp. 58-64, June 1992.
[64] M. Khalid, S. Omatu, and R. Yusof, "Temperature regulation with neural networks and alternative control schemes," IEEE Trans. Neural Networks, vol. 6, pp. 572-582, May 1995.
[65] A. Khotanzad, H. Elragal, and T. L. Lu, "Combination of artificial neural-network forecasters for prediction of natural gas consumption," IEEE Trans. Neural Networks, vol. 11, pp. 464-473, Mar. 2000.
[66] Y. H. Kim, F. L. Lewis, and D. M. Dawson, "Intelligent optimal control of robotic manipulators using neural network," Automatica, vol. 36, no. 9, pp. 1355-1364, 2000.
[67] B. Kosko, "Adaptive bi-directional associative memories," Appl. Opt., vol. 26, pp. 4947-4960, 1987.
[68] S. Y. Kung and J. N. Hwang, "Neural network architectures for robotic applications," IEEE Trans. Robot. Automat., vol. 5, pp. 641-657, Oct. 1989.
[69] C. Kwan, F. L. Lewis, and D. M. Dawson, "Robust neural-network control of rigid-link electrically driven robots," IEEE Trans. Neural Networks, vol. 9, pp. 581-588, July 1998.
[70] J. A. Leonard and M. A. Kramer, "Radial basis function networks for classifying process faults," IEEE Contr. Syst. Mag., vol. 11, pp. 31-38, Apr. 1991.
[71] F. L. Lewis, A. Yesildirek, and K. Liu, "Multilayer neural-net robot controller with guaranteed tracking performance," IEEE Trans. Neural Networks, vol. 7, pp. 388-399, Mar. 1996.
[72] B. Li, M. Y. Chow, Y. Tipsuwan, and J. C. Hung, "Neural-network based motor rolling bearing fault diagnosis," IEEE Trans. Ind. Electron., vol. 47, pp. 1060-1069, Oct. 2000.
[73] J. Liu, B. Burton, F. Kamran, M. A. Brooke, R. G. Harley, and T. G. Habetler, "High speed on-line neural network of an induction motor immune to analog circuit nonidealities," in Proc. IEEE Int. Symp. Circuits and Systems, June 1997, pp. 633-636.
[74] Y. Liu, B. R. Upadhyaya, and M. Naghedolfeizi, "Chemometric data analysis using artificial neural networks," Appl. Spectrosc., vol. 47, no. 1, pp. 12-23, 1993.
[75] L. Ljung, System Identification: Theory for the User. New York: Prentice-Hall, 1987.
[76] M. Majors, J. Stori, and D. Cho, "Neural network control of automotive fuel-injection systems," IEEE Contr. Syst. Mag., vol. 14, pp. 31-36, June 1994.
[77] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," Bull. Math. Biophys., vol. 5, pp. 115-133, 1943.
[78] M. Milanova, P. E. M. Almeida, J. Okamoto Jr., and M. G. Simões, "Applications of cellular neural networks for shape from shading problem," in Proc. Int. Workshop Machine Learning and Data Mining in Pattern Recognition (Lecture Notes in Artificial Intelligence), P. Perner and M. Petrou, Eds., Leipzig, Germany, Sept. 1999, pp. 52-63.
[79] W. T. Miller III, "Real-time application of neural networks for sensor-based control of robots with vision," IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 825-831, July/Aug. 1989.
[80] M. L. Minsky and S. Papert, Perceptrons: An Introduction to Computational Geometry. Cambridge, MA: MIT Press, 1969.
[81] T. Munakata, Fundamentals of the New Artificial Intelligence—Beyond Traditional Paradigms. Berlin, Germany: Springer-Verlag, 1998.
[82] S. R. Naidu, E. Zafiriou, and T. J. McAvoy, "Use of neural networks for sensor failure detection in a control system," IEEE Contr. Syst. Mag., vol. 10, pp. 49-55, Apr. 1990.
[83] K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Trans. Neural Networks, vol. 1, pp. 4-27, Mar. 1990.
[84] G. W. Ng, Application of Neural Networks to Adaptive Control of Nonlinear Systems. London, U.K.: Research Studies Press, 1997.
[85] D. H. Nguyen and B. Widrow, "Neural networks for self-learning control systems," IEEE Contr. Syst. Mag., vol. 10, pp. 18-23, Apr. 1990.
[86] J. R. Noriega and H. Wang, "A direct adaptive neural-network control for unknown nonlinear systems and its application," IEEE Trans. Neural Networks, vol. 9, pp. 27-34, Jan. 1998.
[87] T. Ozaki, T. Suzuki, T. Furuhashi, S. Okuma, and Y. Uchikawa, "Trajectory control of robotic manipulators using neural networks," IEEE Trans. Ind. Electron., vol. 38, June 1991.
[88] D. B. Parker, "A comparison of algorithms for neuron-like cells," in Neural Networks for Computing, J. S. Denker, Ed. New York: American Institute of Physics, 1986, pp. 327-332.
[89] P. Payeur, H. Le-Huy, and C. M. Gosselin, "Trajectory prediction for moving objects using artificial neural networks," IEEE Trans. Ind. Electron., vol. 42, pp. 147-158, Apr. 1995.
[90] M. H. Rahman, R. Fazlur, R. Devanathan, and Z. Kuanyi, "Neural network approach for linearizing control of nonlinear process plants," IEEE Trans. Ind. Electron., vol. 47, pp. 470-477, Apr. 2000.
[91] F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psych. Rev., vol. 65, pp. 386-408, 1958.
[92] G. A. Rovithakis, V. I. Gaganis, S. E. Perrakis, and M. A. Christodoulou, "Real-time control of manufacturing cells using dynamic neural networks," Automatica, vol. 35, no. 1, pp. 139-149, 1999.
[93] A. Rubaai and M. D. Kankam, "Adaptive real-time tracking controller for induction motor drives using neural designs," in Conf. Rec. IEEE-IAS Annu. Meeting, vol. 3, Oct. 1996, pp. 1709-1717.
[94] A. Rubaai and R. Kotaru, "Online identification and control of a DC motor using learning adaptation of neural networks," IEEE Trans. Ind. Applicat., vol. 36, pp. 935-942, May/June 2000.
[95] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[96] D. E. Rumelhart, B. Widrow, and M. A. Lehr, "The basic ideas in neural networks," Commun. ACM, vol. 37, no. 3, pp. 87-92, Mar. 1994.
[97] M. Saad, P. Bigras, L. A. Dessaint, and K. A. Haddad, "Adaptive robot control using neural networks," IEEE Trans. Ind. Electron., vol. 41, pp. 173-181, Apr. 1994.
[98] S. Sardy and L. Ibrahim, "Experimental medical and industrial applications of neural networks to image inspection using an inexpensive personal computer," Opt. Eng., vol. 35, no. 8, pp. 2182-2187, Aug. 1996.
[99] S. Sardy, L. Ibrahim, and Y. Yasuda, "An application of vision system for the identification and defect detection on woven fabrics by using artificial neural networks," in Proc. Int. Joint Conf. Neural Networks, 1993, pp. 2141-2144.
[100] A. P. A. Silva, P. C. Nascimento, G. L. Torres, and L. E. B. Silva, "An alternative approach for adaptive real-time control using a nonparametric neural network," in Conf. Rec. IEEE-IAS Annu. Meeting, 1995, pp. 1788-1794.
[101] L. E. B. Silva, B. K. Bose, and J. O. P. Pinto, "Recurrent-neural-network-based implementation of a programmable cascaded low-pass filter used in stator flux synthesis of vector-controlled induction motor drive," IEEE Trans. Ind. Electron., vol. 46, pp. 662-665, June 1999.
[102] T. Sorsa, H. N. Koivo, and H. Koivisto, "Neural networks in process fault diagnosis," IEEE Trans. Syst., Man, Cybern., vol. 21, pp. 815-825, July/Aug. 1991.
[103] D. F. Specht, "Probabilistic neural networks for classification, mapping, or associative memory," in Proc. IEEE Int. Conf. Neural Networks, July 1988, pp. 525-532.
[104] ——, "Probabilistic neural networks," Neural Networks, vol. 3, pp. 109-118, 1990.
[105] A. Srinivasan and C. Batur, "Hopfield/ART-1 neural network-based fault detection and isolation," IEEE Trans. Neural Networks, vol. 5, pp. 890-899, Nov. 1994.
[106] W. E. Staib and R. B. Staib, "The intelligence arc furnace controller: A neural network electrode position optimization system for the electric arc furnace," presented at the IEEE Int. Joint Conf. Neural Networks, New York, NY, 1992.
[107] R. Steim, "Preprocessing data for neural networks," AI Expert, pp. 32-37, Mar. 1993.
[108] K. Steinbuch and U. A. W. Piske, "Learning matrices and their applications," IEEE Trans. Electron. Comput., vol. EC-12, pp. 846-862, Dec. 1963.
[109] F. Sun, Z. Sun, and P.-Y. Woo, "Neural network-based adaptive controller design of robotic manipulators with an observer," IEEE Trans. Neural Networks, vol. 12, pp. 54-67, Jan. 2001.
[110] M. K. Sundareshan and C. Askew, "Neural network-assisted variable structure control scheme for control of a flexible manipulator arm," Automatica, vol. 33, no. 9, pp. 1699-1710, 1997.
[111] R. S. Sutton, "Generalization in reinforcement learning: Successful examples using sparse coarse coding," in Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 1996, vol. 8, pp. 1038-1044.
[112] H. H. Szu, "Automatic fault recognition by image correlation neural network techniques," IEEE Trans. Ind. Electron., vol. 40, pp. 197-208, Apr. 1993.
[113] J. Teeter and M. Y. Chow, "Application of functional link neural network to HVAC thermal dynamic system identification," IEEE Trans. Ind. Electron., vol. 45, pp. 170-176, Feb. 1998.
[114] L. Tsoukalas and J. Reyes-Jimenez, "Hybrid expert system-neural network methodology for nuclear plant monitoring and diagnostics," in Proc. SPIE Applications of Artificial Intelligence VIII, vol. 1293, Apr. 1990, pp. 1024-1030.
[115] R. E. Uhrig, "Application of artificial neural networks in industrial technology," in Proc. IEEE Int. Conf. Industrial Technology, 1994, pp. 73-77.
[116] A. T. Vemuri and M. M. Polycarpou, "Neural-network-based robust fault diagnosis in robotic systems," IEEE Trans. Neural Networks, vol. 8, pp. 1410-1420, Nov. 1997.
[117] G. K. Venayagamoorthy and R. G. Harley, "Experimental studies with a continually online-trained artificial neural network controller for a turbogenerator," in Proc. Int. Joint Conf. Neural Networks, vol. 3, Washington, DC, July 1999, pp. 2158-2163.
[118] B. W. Wah and G. J. Li, "A survey on the design of multiprocessing systems for artificial intelligence applications," IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 667-692, July/Aug. 1989.
[119] S. Weerasooriya and M. A. El-Sharkawi, "Identification and control of a DC motor using back-propagation neural networks," IEEE Trans. Energy Conversion, vol. 6, pp. 663-669, Dec. 1991.
[120] P. J. Werbos, "Beyond regression: New tools for prediction and analysis in the behavioral sciences," Ph.D. dissertation, Harvard Univ., Cambridge, MA, 1974.
[121] ——, "Maximizing long-term gas industry profits in two minutes in Lotus using neural network methods," IEEE Trans. Syst., Man, Cybern., vol. 19, pp. 315-333, Mar./Apr. 1989.
[122] J. R. Whiteley, J. F. Davis, A. Mehrotra, and S. C. Ahalt, "Observations and problems applying ART2 for dynamic sensor pattern interpretation," IEEE Trans. Syst., Man, Cybern. A, vol. 26, pp. 423-437, July 1996.
[123] B. Widrow, DARPA Neural Network Study. Fairfax, VA: Armed Forces Communications and Electronics Assoc. Int. Press, 1988.
[124] B. Widrow and M. E. Hoff Jr., "Adaptive switching circuits," in 1960 IRE Western Electric Show Conv. Rec., pt. 4, pp. 96-104, Aug. 1960.
[125] N. Wiener, Cybernetics. Cambridge, MA: MIT Press, 1961.
[126] M. J. Willis, G. A. Montague, D. C. Massimo, A. J. Morris, and M. T. Tham, "Artificial neural networks and their application in process engineering," in IEE Colloq. Neural Networks for Systems: Principles and Applications, 1991, pp. 71-74.
[127] M. Wishart and R. G. Harley, "Identification and control of induction machines using artificial neural networks," IEEE Trans. Ind. Applicat., vol. 31, pp. 612-619, May/June 1995.
[128] J. M. Zurada, Introduction to Artificial Neural Networks. Boston, MA: PWS-Kent, 1995.

Magali R. G. Meireles received the B.E. degree from the Federal University of Minas Gerais, Belo Horizonte, Brazil, in 1986, and the M.Sc. degree from the Federal Center for Technological Education, Belo Horizonte, Brazil, in 1998, both in electrical engineering.
She is an Associate Professor in the Mathematics and Statistics Department, Pontifical Catholic University of Minas Gerais, Belo Horizonte, Brazil. Her research interests include applied artificial intelligence and engineering education. In 2001, she was a Research Assistant in the Division of Engineering, Colorado School of Mines, Golden, where she conducted research in the Mechatronics Laboratory.

Paulo E. M. Almeida (S'00) received the B.E. and M.Sc. degrees from the Federal University of Minas Gerais, Belo Horizonte, Brazil, in 1992 and 1996, respectively, both in electrical engineering, and the Dr.E. degree from São Paulo University, São Paulo, Brazil.
He is an Assistant Professor at the Federal Center for Technological Education of Minas Gerais, Belo Horizonte, Brazil. His research interests are applied artificial intelligence, intelligent control systems, and industrial automation. In 2000-2001, he was a Visiting Scholar in the Division of Engineering, Colorado School of Mines, Golden, where he conducted research in the Mechatronics Laboratory.
Dr. Almeida is a member of the Brazilian Automatic Control Society. He received a Student Award and a Best Presentation Award from the IEEE Industrial Electronics Society at the 2001 IEEE IECON, held in Denver, CO.

Marcelo Godoy Simões (S'89-M'95-SM'98) received the B.S. and M.Sc. degrees in electrical engineering from the University of São Paulo, São Paulo, Brazil, in 1985 and 1990, respectively, the Ph.D. degree in electrical engineering from the University of Tennessee, Knoxville, in 1995, and the Livre-Docência (D.Sc.) degree in mechanical engineering from the University of São Paulo, in 1998.
He is currently an Associate Professor at the Colorado School of Mines, Golden, where he is working to establish several research and education activities. His interests are in the research and development of intelligent applications, fuzzy logic and neural networks applications to industrial systems, power electronics, drives, machine control, and distributed generation systems.
Dr. Simões is a recipient of a National Science Foundation (NSF) Faculty Early Career Development (CAREER) Award, which is the NSF's most prestigious award for new faculty members, recognizing activities of teacher/scholars who are considered most likely to become the academic leaders of the 21st century.