
Providing Understanding of the Behavior of Feedforward Neural Networks

Samuel H. Huang and Mica R. Endsley

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 27, NO. 3, JUNE 1997

Abstract—The advent of artificial neural networks has stirred the imagination of many in the field of knowledge acquisition. There is an expectation that neural networks will play an important role in automating knowledge acquisition and encoding; however, the problem solving knowledge of a neural network is represented at a subsymbolic level and hence is very difficult for a human user to comprehend. One way to provide an understanding of the behavior of neural networks is to extract their problem solving knowledge in terms of rules that can be provided to users. Several papers which propose extracting rules from feedforward neural networks can be found in the literature; however, these approaches can only deal with networks with binary inputs. Furthermore, certain approaches lack theoretical support and their usefulness and effectiveness are debatable. Upon carefully analyzing these approaches, we propose a method to extract fuzzy rules from networks with continuous-valued inputs. The method was tested using a real-life problem (decision-making by pilots involving combat situations) and found to be effective.

I. INTRODUCTION

The advent of artificial neural networks has stirred the imagination of many in the field of knowledge acquisition [1]. Neural networks belong to a family of models that are based on a learning-by-example paradigm in which problem solving knowledge is automatically generated according to actual examples presented to the network. The knowledge, however, is represented at a subsymbolic level in terms of connections and weights. Neural networks act like a black box providing little insight into how decisions are made. They have no explicit, declarative knowledge structure which allows the representation and generation of explanation structures [2].

Experience with rule-based expert systems has shown that the ability to generate explanations is absolutely crucial for user acceptance of artificial intelligence (AI) systems. Hence, it is very important to understand the behavior of neural networks. One way to generate an understanding of the behavior of neural networks is to extract their problem solving knowledge in terms of rules. This paper discusses the extraction of rules from feedforward neural networks. Section II of this paper explains the need for understanding the behavior of neural networks from a human factors point of view. Section III provides a literature review on methods for extracting rules from feedforward neural networks. These approaches are tested using a benchmarking problem in Section IV. One drawback with existing rule-extracting approaches is that they can only deal with networks with binary inputs. In Section V, we propose a method to extract fuzzy rules from networks with continuous-valued inputs. The method is tested using a real-life problem involving decision-making by pilots on combat situations. Conclusions are drawn regarding the utility and applicability of this technique.

Manuscript received August 12, 1995; revised March 19, 1996.
S. H. Huang is with EDS/Unigraphics Computer-Aided Manufacturing, Cypress, CA 90630 USA (e-mail: huangsh@ug.eds.com).
M. R. Endsley is with the Department of Industrial Engineering, Texas Tech University, Lubbock, TX 79416 USA.
Publisher Item Identifier S 1083-4419(97)02924-5.
1083–4419/97$10.00 © 1997 IEEE

II. UNDERSTANDING THE BEHAVIOR OF NEURAL NETWORKS: A HUMAN FACTORS PERSPECTIVE

Neural networks are expected to play an important role in the development of AI systems. Many researchers have realized that there is a need for a more explicit consideration of human factors in the AI/expert systems field [3]. Chignell and Peterson [4] have shown that knowledge engineering and the development of expert systems benefit from careful use and application of human factors techniques. This section explains the need for understanding the behavior of neural networks from a human factors point of view.

Neural networks have been inspired both by biological nervous systems and mathematical theories of learning. They are “massively parallel interconnected networks of simple (usually adaptive) elements and their hierarchical organizations which are intended to interact with objects of the real world in the same way as biological nervous systems do” [5]. Neural computing, which may be more similar to human cognition than current computing technology, is expected to facilitate the realization of automating human cognitive behavioral features such as learning and generalization abilities. Neural networks are expected to be an effective technique for automated knowledge acquisition.

Knowledge acquisition is the process of collecting domain knowledge from the expert and expressing it in the form of facts and rules [4]. Domain knowledge consists of three subsets: general knowledge, working-level knowledge, and expert knowledge. Expert knowledge results from an individual’s extensive problem-solving experience in a specific domain. It is heuristic in nature and has also been described as unwritten or unconscious knowledge [6]. As domain experts achieve greater competency, their ability to explain the fine details associated with problem solving strategies degrades. Intermediate solution steps are unconsciously performed as a matter of routine as strategies are compressed into a few major steps [7]. Thus, not all of the knowledge involved can be decoded from schema to a semantic representation which is available in human working memory, and hence is not accessible for people to report [8]. Current knowledge acquisition approaches, e.g., verbal protocol analysis, are inadequate for the task of obtaining complete data on the knowledge involved in the problem solving process because a protocol can only reflect the information available in working memory and verbalization is often much slower than the cognitive processes involved. Much information has to be inferred by analyzing large volumes of poorly articulated data. Since neural computing resembles human cognitive behavior at the micro-level, some researchers suggest that neural networks can be used for automated knowledge acquisition [9]. A neural network can be trained with examples to solve a problem. The knowledge within the trained network would be more objective and reliable than knowledge reasoned out by a specific person. The knowledge of a neural network, however, is implicitly embedded in its connections and weights. Developing a human understanding of the knowledge in neural networks remains an open research topic.

In traditional rule-based expert systems, knowledge is represented symbolically in expressions or data structures. These representations are subsequently manipulated or processed to produce useful results which are the logical result of existing representations. The situation is quite different in neural networks. In neural computing, the processing is the representation. In other words, knowledge is not represented symbolically, but in the form of distributed processing and localized decision rules [10]. This subsymbolic representation makes neural networks more resistant to noise than traditional symbolic representation; however, this representation is very difficult for a human user to comprehend.

Do we really need to know what is going on inside a neural network? This question needs to be answered from a human factors point of view. Madni [11] points out that “the success of expert systems depends not only on the quality and completeness of the knowledge elicited from experts but also on the compatibility of the recommendations and decisions with the user’s conceptualization of the task.” A study conducted by Lehner and Zirk [12] showed that when a human being and an intelligent machine cooperate to solve problems, but where each employs different problem-solving procedures, the user must have an accurate model of how that machine operates. This is because when people deal with complex, interactive systems, they usually build up their own conceptual mental model of the system. The model guides their actions and helps them interpret the system’s behavior. Such a model, when appropriate, can be very helpful or even necessary for dealing successfully with the system. However, if inappropriate or inadequate, it can lead to serious misconceptions or errors [13]. Kidd and Cooper [14] point out that

“If an expert system is to be responsible for complex decision-making and giving advice, then it is vital that there is compatibility, at the cognitive level, between the user’s model of the problem and the system’s. In other words, the knowledge representation and problem solving processes employed by the system must be readily intelligible to the user. Only if this is true will the user both be able to interact competently and efficiently with the system during its reasoning process and also be confident in the system’s reasoning and advice.”

Therefore, it is very important for people to be able to understand the behavior of neural networks. A good explanation facility has been described as having two main functions: 1) to make the system more intelligible to the user and 2) to uncover shortcomings in the knowledge base [15]. To make neural networks fit these requirements, we need to understand how knowledge is represented in neural networks. Ye and Salvendy [9] developed three hypotheses about knowledge representation in neural networks:

1) An entity (object, concept, etc.) is more likely to be locally encoded in a neural network.
2) The knowledge is implemented in an implicit way in the internal structure of the neural network (a group of associated hidden neurons and their connections to entity neurons), not in individual neurons or connections.
3) Different modules of a neural network, which implement different conceptual schema, have similar processing structures, but differ in their input and output.

The authors validated these hypotheses using three experiments based on a task of modulo arithmetic. These three experiments provided some potential insight into neural computing. Specifically, they found that neural computing provides a theoretical and constructive foundation for understanding human cognitive behavior. The fact that the problem solving knowledge is implicitly embedded in the internal structure of a neural network provides a good explanation for the difficulty involved in knowledge elicitation. The knowledge cannot be directly accessed but needs to be reasoned out through problem solving states. To overcome this natural limitation of human ability to verbalize knowledge, one can train a neural network to solve a problem and then obtain knowledge about the problem solving process from the internal structure of the neural network. However, the authors did not address the issue of how to extract the problem solving knowledge from a trained neural network.

Mozer and Smolensky [16] point out that it is much easier to understand the behavior of a feedforward neural network in terms of simple rules than in terms of a large number of weights and activation values. They proposed a skeletonization technique to determine the relevance of individual units in feedforward networks and to remove redundant units. The result of the procedure is a minimal network which consists of only those units that really contribute to the solution. Another weight elimination procedure was proposed by Weigend et al. [17]. Their method begins with a feedforward network that is too large for a given problem. A cost is associated with each connection in the network. If a given level of performance on the training set can be obtained with fewer weights, the cost function will encourage the reduction, and eventually elimination, of as many connections as possible. Weight elimination is then extended to unit elimination and hence the least important hidden units are removed from the network. This will make the network more transparent and it should thus be easier to understand the behavior of the network. However, the weight elimination method does not provide explicit rules, and it has advantages only when the network size is reasonably small.
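The pruning idea behind skeletonization and weight elimination can be illustrated with a minimal magnitude-based sketch. This is not the cost-function formulation of Weigend et al.; the threshold value and function names are our own illustrative assumptions:

```python
import numpy as np

def prune_small_weights(weights, threshold=0.1):
    """Zero out connections whose magnitude falls below a threshold.

    A crude stand-in for cost-driven weight elimination: connections
    that contribute little are removed, leaving a sparser network.
    """
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def prune_dead_units(weights):
    """Drop hidden units whose outgoing weights are all zero
    (weight elimination extended to unit elimination)."""
    alive = np.any(weights != 0.0, axis=1)   # one row per hidden unit
    return weights[alive]

# Three hidden units, two outgoing weights each (illustrative values).
W = np.array([[0.9, -0.05], [0.02, 0.03], [-0.7, 0.8]])
W = prune_small_weights(W, threshold=0.1)
W = prune_dead_units(W)   # the all-small middle unit disappears
```

The resulting smaller weight matrix is what makes the network "more transparent": fewer connections remain for a human reader to interpret.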

Diederich [2] discusses the ability of neural networks to generate explanations. He concludes that explanation is difficult to realize in unstructured neural networks. He argues that connectionist systems benefit from the explicit coding of relations and the use of highly structured networks in order to allow explanation. He then developed an explanation component for connectionist semantic networks (CSN). CSN are massively parallel systems that allow the drawing of certain kinds of inferences based on conceptual knowledge with extreme efficiency [18]. They are quite different from conventional neural networks. Traditional neural network learning algorithms such as back-propagation learning [19] cannot be applied in CSN, and how to provide an understanding of the behavior of the very popular feedforward neural networks remains a question.

III. EXTRACTING RULES FROM FEEDFORWARD NEURAL NETWORKS: A REVIEW

One way of understanding the behavior of neural networks is to extract their problem solving knowledge in terms of rules. By extracting rules from a neural network, an explanation facility can be built. Thus, a human user will have some understanding of the network. Several approaches for extracting rules from feedforward neural networks have been proposed in the literature; however, these approaches can only deal with networks with binary inputs. Furthermore, certain approaches lack theoretical support and their usefulness and effectiveness are debatable.

A. The Subset Approach

One approach for rule extraction is the so-called subset method provided by Towell and Shavlik [20]. This method explicitly searches for subsets of incoming weights that exceed the bias (threshold, denoted as θ) of a neuron. It represents the state-of-the-art in the published literature.

The subset method has a strong theoretical foundation. Recall that the output of an artificial neuron (Fig. 1) is defined in general as

  y = 1 if net > θ, y = −1 otherwise   (3.1)

in the binary case and

  y = 2 / (1 + e^(−(net − θ))) − 1   (3.2)

in the continuous case, in which

  net = Σ_i w_i x_i   (3.3)

where θ is the bias of the neuron. Therefore, if a certain combination of weighted input values of a neuron is greater than its bias then the status of the neuron is on; otherwise its status is off (or, for (3.2) correspondingly, a positive output indicates that the neuron is on, while a negative output indicates that the neuron is off).

Fig. 1. An artificial neuron.

A simple, breadth-first subset algorithm starts by determining whether any sets containing a single link (connection) are sufficient to guarantee that the bias is exceeded. If yes, then these sets are written as rules. The search proceeds by increasing the size of the subsets until all possible subsets have been explored. Finally, the algorithm removes redundant rules. Fig. 2 shows an example.

Fig. 2. Rule extracting using the subset algorithm [20].

A problem with the subset algorithm is that the cost of finding all subsets grows exponentially with the number of links. Several heuristic approaches have been proposed to deal with this problem [20]–[23]. Among these approaches, the one proposed by Towell and Shavlik is quite interesting. Their algorithm, called MofN, differs from the subset algorithm in that it explicitly searches for rules of the form:

  IF (M of the following N antecedents are true) THEN ...

The idea underlying MofN is that individual antecedents (links) do not have a unique importance. Rather, groups of antecedents form equivalence classes in which each antecedent has the same importance as, and is interchangeable with, other members of the class. Compared with the subset algorithm, MofN generates fewer rules for the same network. An experiment performed by Towell and Shavlik [20] indicated that the accuracy of rules derived by MofN was approximately equal to that of the network from which the rule set was extracted.
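The breadth-first subset search can be sketched as follows for a single neuron. The sketch assumes binary 0/1 inputs; the weights, bias value, and helper name are illustrative, not taken from the paper:

```python
from itertools import combinations

def subset_rules(weights, bias):
    """Breadth-first search for subsets of incoming links that
    guarantee a neuron turns on (weighted sum exceeds the bias),
    in the spirit of the subset method.

    With 0/1 inputs, a subset of positive links fires the neuron if
    its weight sum still exceeds the bias when every negative link
    is also active (the worst case).
    """
    worst_case = sum(w for w in weights if w < 0)   # all negative links on
    pos = [i for i, w in enumerate(weights) if w > 0]
    rules = []
    for size in range(1, len(pos) + 1):             # breadth-first by size
        for subset in combinations(pos, size):
            # skip supersets of an already-found rule (redundant)
            if any(set(r) <= set(subset) for r in rules):
                continue
            if sum(weights[i] for i in subset) + worst_case > bias:
                rules.append(subset)
    return rules

# Each returned tuple reads as "IF x0 AND x2 THEN neuron-on", etc.
print(subset_rules([3.0, -1.0, 2.5, 0.5], bias=2.0))   # [(0, 2), (0, 3)]
```

The exponential cost mentioned above is visible here: the inner loop enumerates all subsets of the positive links, which is why heuristics such as MofN were proposed.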


B. The Large-Weight Approach

Another approach for extracting rules from trained neural networks is the large-weight approach. Sestito and Dillon [24], [25] proposed such an approach to extract information from a single-layered network. Their method basically determines the inputs whose associated weights are within some range of the maximum weight for that particular output. For single-layer neural networks, this is straightforward. Suppose we have n inputs and we are considering output o. First, we need to find the maximum weight magnitude

  w_max = max_i |w_i|

such that each selected weight is within a particular range of it. In other words, we select the inputs x_i such that

  |w_i| ≥ w_max − δ

where δ defines the range of the maximum that the selected inputs have to be within. The rules constructed will then be

  IF x_i {and x_k}* THEN o

where {·}* denotes an optional statement which can be repeated.

A similar approach can be found in Yoon et al. [26]. The authors argued that neural computing could be viewed as a simple linear discriminant analysis that attempts to use input parameters to discriminate between a finite and mutually exclusive set of output variables. In single-layer networks, the connection weights are roughly equivalent to a discriminant function. Therefore, each weight represents the relative contribution of its associated variable.

In multi-layer networks, a common statistical technique known as factor analysis can be used. The hidden layers are comparable to factors. Although this comparison is far from exact, the analogy does prove useful for knowledge-based interpretation. Factors (i.e., hidden neurons) with large weights are interpreted as important factors. This method was applied to a neural network for dermatology diagnosis and found to be useful for the purpose of explanation.
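For one output neuron of a single-layer network, the selection step of the large-weight approach reduces to a few lines. The range parameter (`delta`, standing in for δ) and the example weights are illustrative:

```python
def large_weight_inputs(weights, delta=0.5):
    """Select the inputs whose weight magnitude lies within `delta`
    of the maximum weight magnitude for a given output neuron,
    in the spirit of the large-weight approach.
    """
    w_max = max(abs(w) for w in weights)
    return [i for i, w in enumerate(weights) if abs(w) >= w_max - delta]

# Weight vector of one output neuron (illustrative values):
# inputs 1 and 3 dominate, so the rule reads roughly
# "IF x1 AND x3 THEN output on".
print(large_weight_inputs([0.2, 2.1, -0.3, 1.9], delta=0.5))   # [1, 3]
```

Note how the choice of δ controls how many antecedents survive; a δ as large as the weight spread selects every input, which foreshadows the failure mode discussed in the case study when all weights have similar magnitude.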

C. The Similar-Weight Approach

In a multi-layer neural network, the tracing of the contribution of inputs to an output is much more difficult than in a single-layer one. This is because there are hidden neurons to deal with and hidden neurons usually do not have any explicit meanings. In order to have some means of determining the associations between the inputs and outputs, Sestito and Dillon [24] decided to extend the input set to include all the desired outputs. This means that outputs are used as additional inputs. The rationale behind this is to enable determination of the direct association between the inputs and the outputs.

Since the inputs and outputs are now at the same level, they can be directly compared. If the weights from an original input to the hidden neurons and those of an augmented input (i.e., an output) are similar (or identical), then it is postulated that there is a close association between the input and the output. On the basis of this postulation, the authors developed a method to extract one conjunctive rule for each output of a multi-layer neural network. The method was then tested in two example domains and found to be useful.

This approach, however, is heuristic in nature and somewhat unique in the literature. It is based on the postulation that if the outgoing weights of two neurons are similar then these two neurons are closely associated. There is no theoretical support for this postulation. Even if the sufficiency of the postulation could be proved, its necessity might never be proved due to the stochastic nature of neural networks. Hence, the usefulness of this similar-weight approach is debatable. The weakness of this approach will be shown in our case study.

IV. A CASE STUDY: NETWORKS WITH BINARY INPUTS

After reviewing the approaches for extracting rules from feedforward neural networks, a case study was conducted. In the case study, the n-m-n encoder/decoder problem was used to test the approaches. An “n-m-n encoder/decoder” means a three-layer neural network with n neurons for the input and output layer and m neurons for the hidden layer. The network is given n distinct input patterns, in each of which only one bit is turned on, and all other bits are turned off. The network should duplicate the input pattern in the output layer.
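The n-m-n task itself is easy to reproduce. A sketch using a ±1 coding (one bit on, the rest off); the helper name is ours:

```python
import numpy as np

def encoder_patterns(n):
    """Build the n one-hot training patterns of the n-m-n
    encoder/decoder task: exactly one bit on (+1), the rest off (-1).
    The target of each pattern is the pattern itself.
    """
    X = -np.ones((n, n))
    np.fill_diagonal(X, 1.0)
    return X, X.copy()   # inputs and identical targets

X, T = encoder_patterns(4)
print(X)
```

With m < n hidden units, the network is forced to learn a compressed code for the patterns, which is what makes the task a useful benchmark for rule extraction.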

A. Using the Subset and the Large-Weight Approach

A 4-2-4 neural network (Fig. 3) was trained using the error back-propagation algorithm. The input of each neuron was either 1 or −1. An input 1 means the neuron is on, while an input −1 means the neuron is off. The training examples were the four patterns in which exactly one input is on. The weights of the trained neural network are shown in Table I.

Fig. 3. The 4-2-4 encoder/decoder neural network.

TABLE I
WEIGHTS FOR THE 4-2-4 ENCODER/DECODER NEURAL NETWORK

Based on the subset approach, the following rules were obtained:

1. For the hidden layer,
Rule 1. IF 3 of ((not A), B, (not C), D) are true THEN X.
Rule 2. IF 3 of ((not A), (not B), C, D) are true THEN Y.
2. For the output layer,
Rule 3. IF (not X) and (not Y) THEN a.
Rule 4. IF X and (not Y) THEN b.
Rule 5. IF (not X) and Y THEN c.
Rule 6. IF X and Y THEN d.

Although these rules are somewhat difficult to comprehend, they can be used to classify the training examples correctly. The large-weight approach does not make too much sense in this case because most of the weights are of the same magnitude.

To further validate the subset and the large-weight approach, the network was trained with all of the 16 possible input patterns. The network was required to duplicate the input patterns in the output layer. As the 4-2-4 network could not converge when trained with those 16 patterns, the number of hidden neurons was increased from 2 to 4. The 4-4-4 network learned all of the 16 patterns. Its structure is shown in Fig. 4 and its weights are shown in Table II.

TABLE II
WEIGHTS FOR THE 4-4-4 ENCODER/DECODER NEURAL NETWORK

The large weights are highlighted in Table II. Based on the large-weight approach, the following rules were obtained:

For the hidden layer,
Rule 1. IF (not A) THEN R.
Rule 2. IF D THEN X.
Rule 3. IF (not B) THEN Y.
Rule 4. IF (not C) THEN Z.
For the output layer,
Rule 5. IF (not R) THEN a.
Rule 6. IF (not Y) THEN b.
Rule 7. IF (not Z) THEN c.
Rule 8. IF X THEN d.

The same result was obtained when the subset approach was used.
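Because the eight extracted rules compose to an identity mapping, they can be checked mechanically. The Python encoding below is our own sketch (hidden-unit labels R, X, Y, Z follow the rules above; True stands for "on"):

```python
from itertools import product

def hidden(A, B, C, D):
    # Rules 1-4: hidden-layer activations from the extracted rules
    return dict(R=not A, X=D, Y=not B, Z=not C)

def output(h):
    # Rules 5-8: output-layer activations a, b, c, d
    return (not h["R"], not h["Y"], not h["Z"], h["X"])

# Every one of the 16 binary patterns is duplicated at the output.
for bits in product([False, True], repeat=4):
    assert output(hidden(*bits)) == bits
print("all 16 patterns duplicated")
```

This mechanical check is exactly the sense in which the rule set "explains" the 4-4-4 network: its input-output behavior is reproduced by a handful of symbolic rules.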

By simplifying the eight rules, the following rules are obtained:

a. IF A THEN a (Rule 1 and Rule 5);
b. IF B THEN b (Rule 3 and Rule 6);
c. IF C THEN c (Rule 4 and Rule 7);
d. IF D THEN d (Rule 2 and Rule 8);

which means the 4-4-4 network will always duplicate the input pattern in the output neurons. This result is as was expected.

B. Using the Similar-Weight Approach

Sestito and Dillon’s similar-weight approach was then applied to the 4-2-4 encoder/decoder network. The network was trained with four patterns. The sum of squares error measurements SSE_ab were calculated and are shown in Table III.

TABLE III
SSE_ab OF THE 4-2-4 ENCODER/DECODER NETWORK

From Table III, the following rules are obtained:

1) Rule 1. IF A THEN a.
2) Rule 2. IF B THEN b.
3) Rule 3. IF C THEN c.
4) Rule 4. IF D THEN d.

This rule set is exactly the same as that obtained for the 4-4-4 neural network using the subset and the large-weight approach. This rule set can be used to duplicate the 16 input patterns. However, this rule set was obtained from the 4-2-4 encoder/decoder network, which could not learn to duplicate all of the 16 input patterns. Obviously, the performance of the 4-2-4 network cannot be the same as that of the obtained rule set. This means the similar-weight approach is not effective in this case.

To further validate the similar-weight approach, the network was retrained with four new patterns. The desired relation (rule set) is as follows:

1) Rule 1. IF A THEN a.
2) Rule 2. IF A and B THEN b.
3) Rule 3. IF A and B and C THEN c.
4) Rule 4. IF A and B and C and D THEN d.

The SSE_ab’s were calculated and are shown in Table IV.

TABLE IV
SSE_ab OF THE 4-2-4 ENCODER/DECODER NETWORK TRAINED WITH NEW PATTERNS

From Table IV, we really cannot conclude any useful rules since no similar weights can be found. Therefore, the similar-weight approach is not shown to be useful in this case.

The case study showed that although the three rule extraction approaches can help explain the behavior of a neural network, they are not always useful. Most often, rules obtained using the subset approach are accurate but difficult to comprehend. The large-weight approach fails when all the weights of a network are of the same magnitude. The similar-weight approach is not always useful due to the stochastic nature of neural network learning.
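The similar-weight comparison itself is just a sum-of-squares distance between the hidden-layer weight vectors of an original input and of an augmented input (the SSE_ab of Tables III and IV). A minimal sketch with made-up weight values:

```python
import numpy as np

def sse(w_input, w_output):
    """Sum-of-squares error between the hidden-layer weight vector
    of an original input and that of an augmented input (an output
    fed back in as an input); a small SSE is read as a close
    input-output association."""
    diff = np.asarray(w_input) - np.asarray(w_output)
    return float(np.sum(diff ** 2))

# Illustrative weight vectors into two hidden neurons:
print(sse([1.2, -0.8], [1.1, -0.7]))   # similar weights -> small SSE
print(sse([1.2, -0.8], [-0.9, 1.0]))   # dissimilar weights -> large SSE
```

As the case study shows, nothing forces a trained network to produce similar weight vectors for associated neurons, which is why a table of these distances can come back with no usable minima.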

V. EXTRACTING FUZZY RULES FROM NETWORKS WITH CONTINUOUS-VALUED INPUTS

As mentioned previously, the rule extraction approaches reported in the literature can only deal with networks with binary inputs. However, many real-life applications require that the input values be continuous. An approach based on fuzzy logic is proposed to deal with networks with continuous-valued inputs. A study by Buckley et al. [27] showed that under certain assumptions, a neural network can be approximated to any degree of accuracy using a fuzzy system, and vice versa. Therefore, it is justified to extract fuzzy rules from networks with continuous-valued inputs.

A continuous-valued input can be described using a linguistic term, such as large, medium, or small. Each linguistic term can be represented by a set. A continuous-valued input can thus be classified into a specific set and represented in a binary scheme. In this way, the continuous-valued inputs are converted to binary inputs. The large-weight approach and subset approach can then be applied for rule extraction. The rules extracted can be used to explain the behavior of the network in understandable linguistic terms. They can also be used for decision-making, provided that membership functions of the fuzzy sets corresponding to the linguistic terms are constructed properly.

The approach can be summarized as follows:

Step 1. Classify the continuous-valued inputs into sets.
Step 2. Represent the sets using a binary scheme.
Step 3. Construct a neural network with binary inputs.
Step 4. Train the network.
Step 5. Identify important inputs using the large-weight approach.
Step 6. Extract fuzzy rules using the subset approach.
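Steps 1 and 2 amount to binning each continuous input into one linguistic set and coding that set as a binary vector. A sketch with assumed cut points (the paper's actual set boundaries are defined later, per application):

```python
def fuzzify(value, cuts=(0.0, 0.3, 0.7)):
    """Classify a continuous input into none/low/medium/high
    (Step 1) and represent the chosen set as a binary vector
    (Step 2). The cut points here are illustrative assumptions."""
    none_cut, low_cut, high_cut = cuts
    if value <= none_cut:
        label = "none"
    elif value <= low_cut:
        label = "low"
    elif value <= high_cut:
        label = "medium"
    else:
        label = "high"
    terms = ["high", "medium", "low", "none"]
    return label, [1 if t == label else 0 for t in terms]

print(fuzzify(0.5))   # ('medium', [0, 1, 0, 0])
```

The binary vectors produced here are what Steps 3 and 4 feed to the network, after which the binary-input extraction techniques of Section III apply unchanged (Steps 5 and 6).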

aircraft). For more details about the networks, readers may

As the test case used to examine the previous techniques refer to Sundararajan [29].

involved binary inputs, this approach was validated using a Sundararajan [29] studied the data set and found that there

real-life problem involving decision-making by pilots regard- is a probability of 74% that the decision made by one pilot

ing combat situations. The data is drawn from the work of will be exactly duplicated by at least one other pilot (This is a

Endsley and Smith [28], a study investigating the effectiveness common issue in dealing with multiple experts, particularly in

of a tactical situation display (TSD) in the fighter aircraft this domain). He thus concluded that if the neural network’s

cockpit. A simple version of a TSD is shown in Fig. 5 [29]. average percentage generalization (PG, defined as the percent-

In this study, a TSD was presented on a computer screen. age of the network’s decisions that agree with at least one

pilot) is equal to at least 74%, then the decisions made by

On each trial, between 3 and 12 targets (enemy aircraft) were

the neural network will be as much in conformance with the

presented on the display for 5 s, at which point the targets were expert pilots’ decisions as those of the experts themselves.

blanked from the display. The subjects (experienced fighter Sundararajan’s dual-network indeed achieved a PG of 74%.

aircraft pilots) were instructed to verbally report the tactical This result confirmed that the pilots’ decisions can be modeled

action they would take if presented with the situation dis- by a neural network. It was also found that the accuracy of

played. Each subject was presented with 490 distinct displays, the dual-network was heavily dependent on the accuracy of

referred to as target sets, in a random order during the test. the primary network.

Ten (10) pilots were used in the experiment. The objective of the present study was to find out what

Sundararajan [29] developed a dual-network approach to kind of problem solving knowledge is acquired by the neural

model the pilots’ decision making process. The TSD was network through training. Since the accuracy of the network’s

decision is heavily dependent on the accuracy of the primary

divided into 12 imaginary sectors of 30 each. For each sector

network, only the primary network was investigated in this

that had at least one target in it (henceforth referred to as a live study. To facilitate rule extraction, the structure of the network

sector), the range and angular location of all the targets in that was modified. Instead of using a 12-dimensional output vector,

sector were integrated into a single value called a threat factor a one-dimensional output vector was used. The underlying

that was used as the input to a primary network. The threat assumption is that the pilot should use the same set of

factor for each live sector was calculated according to rules when dealing with each sector. Therefore, the modified


network has a one-dimensional output vector.

TABLE V
INFORMATION ABOUT THE CONVERTED TRAINING PATTERNS

A threat factor is a positive real number. It was classified into a set as being high, medium, low, or none based on the following:

  high    if Th > θ1
  medium  if θ2 < Th ≤ θ1
  low     if 0 < Th ≤ θ2       (5.2)
  none    if Th = 0

in which 0 < θ2 < θ1.
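The conversion in (5.2) can collapse distinct real-valued patterns onto one binary pattern. A sketch of the classification and of counting the resulting conflicting patterns; the threshold values and the sample data are assumed, not the study's:

```python
from collections import defaultdict

def classify(th, t_high=2.0, t_low=1.0):
    """Map a threat factor to a linguistic set per (5.2).
    The thresholds here are assumed values."""
    if th == 0:
        return "none"
    if th <= t_low:
        return "low"
    if th <= t_high:
        return "medium"
    return "high"

def count_conflicts(samples):
    """Count converted input patterns that collide: the same
    linguistic pattern but different recorded decisions."""
    seen = defaultdict(set)
    for threat_factors, decision in samples:
        key = tuple(classify(t) for t in threat_factors)
        seen[key].add(decision)
    return sum(1 for decisions in seen.values() if len(decisions) > 1)

data = [((0.5, 2.5, 0.0), "attack"),
        ((0.8, 3.0, 0.0), "hold"),   # same converted pattern, new label
        ((1.5, 0.0, 0.2), "attack")]
print(count_conflicts(data))   # 1
```

The first two samples convert to the same (low, high, none) pattern yet carry different decisions; this is exactly the indistinguishable-pattern problem discussed next.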

The linguistic terms high, medium, low, and none are rep-

resented using vectors , and , TABLE VI

WEIGHTS OF THE NETWORK

respectively. Using this set, the continuous-valued inputs (i.e.,

threat factors) were converted into binary inputs.
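The conversion from a continuous threat factor to a binary input vector described by (5.2) can be sketched as follows. The numeric thresholds and the encodings for medium, low, and none are assumptions made for illustration; the paper fixes only the encoding of high as [0, 0, 1].

```python
# Sketch of the continuous-to-binary conversion of (5.2).
# T1 < T2 < T3 are hypothetical classification thresholds; the paper's
# actual values are not reproduced here.
T1, T2, T3 = 0.2, 0.5, 0.8

def binarize(threat_factor):
    """Map a positive real threat factor to a 3-bit input vector."""
    if threat_factor >= T3:
        return [0, 0, 1]   # high (encoding given in the paper)
    elif threat_factor >= T2:
        return [0, 1, 0]   # medium (assumed encoding)
    elif threat_factor >= T1:
        return [1, 0, 0]   # low (assumed encoding)
    return [0, 0, 0]       # none (assumed encoding)

print(binarize(0.9))  # -> [0, 0, 1]
```

Any monotone thresholding of this form discards the magnitude information within each class, which is what later causes indistinguishable training patterns.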

A neural network was then developed using the binary

inputs as determined from the set. As previously discussed,

one sector at a time will be examined and hence a network with

one output node is required. The dimension of the input vector

needs to be determined. When dealing with a sector, the threat

factor of that sector should be considered, and in addition,

so should the threat factor of the other sectors. Although it

seems that all 12 sectors should be taken into consideration,

Sundararajan [29] showed that the most important information

comes from the sector under consideration and its immediately

adjacent sectors, so just three sectors need to be considered in

each decision.
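Because the TSD is circular, "adjacent sector" wraps around after sector 12. A minimal sketch of gathering the sector under consideration together with its neighbors, using modular indexing (the function name and the symmetric `width` parameter are illustrative, not from the paper):

```python
# Collect a sector and its neighbors on a circular 12-sector display.
# width=1 gives the sector plus its two immediately adjacent sectors;
# by the symmetry argument in the text, width grows in steps of 1 on
# each side (i.e., the total count grows by 2).

def neighborhood(sector, width=1, n_sectors=12):
    """Return sector indices from `width` left of `sector` to `width` right."""
    return [(sector + d) % n_sectors for d in range(-width, width + 1)]

print(neighborhood(0))     # -> [11, 0, 1]
print(neighborhood(5, 2))  # -> [3, 4, 5, 6, 7]
```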

To examine the problem data, there are 490 target sets. Each target set consists of 12 sectors, providing 5880 (490 × 12) patterns. The threat factors are converted into binary vectors using (5.2). After converting the threat factors into binary vectors, some input patterns became indistinguishable; yet their corresponding output patterns were not always the same. This problem is inevitable when continuous-valued inputs are converted into binary inputs. The more sectors considered (i.e., the larger the input dimension), the smaller the number of conflicting patterns will be. Since the TSD is symmetric in nature, the number of sectors considered should always be increased by 2. For example, if the second adjacent sector to the left is under consideration, one should also consider the second adjacent sector to the right. The information about the converted patterns is shown in Table V.

From Table V, it can be seen that if only 3 sectors are considered, the consistency of the decisions made is very low (40.6%). If 5 sectors are considered, the consistency increases to 80.6%. Although the more sectors considered, the higher the consistency will be, it is also desirable to keep the dimension of the input vector as small as possible. This will facilitate both network training and rule extraction. Therefore, it was decided to consider five sectors. To facilitate the rule extraction, a simple feedforward neural network with no hidden layers was used. The network configuration is shown in Fig. 6.

The network was trained using 431 converted binary patterns (182 unique patterns and 249 consistent patterns). The popular back-propagation learning algorithm was applied. The weights of the network are shown in Table VI.

Since the network considers only one sector at a time, the results of all 12 sectors within a target set need to be synthesized to make the final decision. The synthesis procedure is as follows:

Step 1. For each sector $i$ within the target set, calculate the output of the network, denoted as $o_i$, $i = 1, \dots, 12$.
Step 2. Find $k = \arg\max_i o_i$.
Step 3. Sector $k$ is declared as the sector to be attacked.


The network was used to test decision accuracy for the 490

target sets. In 316 cases, the sector selected to be attacked was

also selected by at least one pilot. This yields a PG of 64.3%,

which is lower than that of Sundararajan’s dual-network. The

reasons are most likely 1) the network is much simpler than

Sundararajan’s dual-network and 2) binary inputs instead of

continuous-valued inputs are used for decision making.
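The network of Fig. 6 is small enough to write out directly: a single sigmoid output unit over the binary inputs, assuming three bits per sector for the five sectors considered (15 inputs). The weights below are placeholders, not the trained values of Table VI.

```python
import math

# Minimal sketch of the no-hidden-layer network: one sigmoid output
# unit over 15 binary inputs.  Weights and bias are placeholders.

def forward(x, w, b):
    """x: 15 binary inputs; w: 15 weights; b: bias term."""
    net = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-net))  # sigmoid nodal function

# With zero weights and zero bias the output sits at the sigmoid midpoint:
print(round(forward([0] * 15, [0.0] * 15, 0.0), 2))  # -> 0.5
```

With no hidden layer, each weight ties one binary input directly to the output, which is precisely what makes the large-weight and subset rule-extraction arguments below tractable.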

From Table VI, one can see that connections [6], [9], and [12] have significantly large weights. According to the large-weight approach, the corresponding inputs are considered to be significant. All the other connection weights were then set to zero, except for these three connections. The modified network was again tested using the 490 target sets. In 302 cases, the sector selected by the new network was also selected by at least one pilot. This yields a PG of 61.6%. This means that those three inputs contribute 95.6% of the decision and indeed are significant inputs. Hence, only inputs 6, 9, and 12 will be considered. Notice that these three inputs are from the sector under consideration, its left most adjacent sector, and its right most adjacent sector, respectively. This confirms Sundararajan's observation that the most important information comes from the sector under consideration and its most adjacent sectors.

Next, the subset approach for rule extraction was applied. Only the bias and the weights associated with inputs 6, 9, and 12 are considered. The bias is −8.3, while the weights associated with inputs 6, 9, and 12 are −15.9, 23.7, and −18.9, respectively. Since an input is either 1 or 0, the network's output can be 1 only when the 6th input is 0, the 9th input is 1, and the 12th input is 0. In other words, the network's output will be 1 only when the threat factor of the sector under consideration is [0, 0, 1] and the threat factors of the two most adjacent sectors are not [0, 0, 1]. Therefore, the following rule can be concluded:

IF (threat factor of the sector is high) AND (threat factors of the two most adjacent sectors are not high) THEN attack.

This fuzzy rule can be used for the purpose of explanation. It can also be used to deal with the original target sets instead of the neural network, provided that the membership functions for the threat factor are constructed properly. The construction of fuzzy membership functions for the threat factor is independent of the classification method used for converting continuous-valued inputs to binary inputs. After all, a fuzzy membership is a real number in the range of 0 to 1.

Fig. 7. Membership function of threat factor.

A common method for constructing the fuzzy membership functions for the threat factor is shown in Fig. 7. For each sector, the membership for attack was calculated based on the fuzzy rule. Among the 12 sectors, the sector with the highest membership of attack is declared to be attacked. For the 490 target sets, in 329 cases, the sector selected was also selected by at least one pilot. This yields a PG of 67.1% using the explanatory rule generated from the network. The result is even better than that of the network (61.6%) because the original value rather than the converted value of the threat factor was used. In other words, the original information (continuous-valued inputs) rather than the compressed information (binary inputs) was used for decision making.

The fuzzy rule, however, is not quite as effective as Sundararajan's dual-network, which showed a 74% PG. The main reason is that our network is simpler than the dual-network. Furthermore, it uses information only from the most significant inputs. Despite these disadvantages, the fuzzy rule achieved a reasonably high accuracy.

VI. CONCLUSION

Neural networks have been used successfully to model human decision making processes. However, neural networks act like a black box, providing little insight into how decisions are made. The knowledge of a neural network is represented at a subsymbolic level and hence is very difficult for a human user to comprehend. This paper sheds light into the neural network black box by discussing methods for extracting the problem solving knowledge of feedforward neural networks in terms of rules. An innovative approach is also presented for extracting fuzzy rules from networks with continuous-valued inputs. The approach is applied to a real-life problem with promising results. However, the approach has some limitations. It can be applied only to feedforward neural networks with a sigmoid nodal function. Each neuron of the network must be associated with a certain concept. The approach works well with neural networks that have a simple structure. For networks with a complex structure, the effectiveness of the approach still needs to be assessed. We believe that neural networks and fuzzy systems are closely related. They should be integrated for the purpose of knowledge representation. The integration of neural networks with fuzzy systems shows promise for providing a higher level of understanding of intelligent systems.

REFERENCES

[1] K. G. Coleman and S. Watenpool, "Neural networks in knowledge acquisition," AI Expert, vol. 7, no. 1, pp. 36–39, 1992.
[2] J. Diederich, "Explanation and artificial neural networks," Int. J. Man–Machine Studies, vol. 37, pp. 335–355, 1992.
[3] A. S. Maida, "Selecting a humanly understandable representation for reasoning about knowledge," Int. J. Man–Machine Studies, vol. 22, pp. 151–161, 1985.
[4] H. Chignell and J. G. Peterson, "Strategic issues in knowledge engineering," Human Factors, vol. 30, no. 4, pp. 381–391, 1988.
[5] T. Kohonen, "An introduction to neural computing," Neural Networks, vol. 1, pp. 3–16, 1988.
[6] D. A. Mitta, "Knowledge acquisition: Human factors issues," in Proc. Human Factors Soc. 33rd Annu. Meet., 1989, pp. 351–355.
[7] B. R. Gaines, "An overview of knowledge acquisition and transfer," Int. J. Man–Machine Studies, vol. 26, pp. 453–472, 1987.
[8] K. A. Ericsson and H. A. Simon, Protocol Analysis: Verbal Reports as Data. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[9] N. Ye and G. Salvendy, "Cognitive engineering based knowledge representation in neural networks," Behav. Inf. Technol., vol. 10, no. 5, pp. 403–418, 1991.
[10] Y.-H. Pao and D. J. Sobajic, "Neural networks and knowledge engineering," IEEE Trans. Knowledge Data Eng., vol. 3, no. 2, pp. 185–192, 1991.
[11] A. M. Madni, "The role of human factors in expert systems design and acceptance," Human Factors, vol. 30, no. 4, pp. 395–414, 1988.
[12] P. E. Lehner and D. A. Zirk, "Cognitive factors in user/expert system interaction," Human Factors, vol. 29, no. 1, pp. 97–109, 1987.
[13] R. M. Young, "The machine inside the machine: Users' models of pocket calculators," Int. J. Man–Machine Studies, vol. 15, pp. 51–85, 1981.
[14] A. L. Kidd and M. B. Cooper, "Man–machine interface issues in the construction and use of an expert system," Int. J. Man–Machine Studies, vol. 22, pp. 91–102, 1985.
[15] R. Davis and D. B. Lenat, Knowledge-Based Systems in Artificial Intelligence. New York: McGraw-Hill, 1982.
[16] M. C. Mozer and P. Smolensky, "Using relevance to reduce network size automatically," Connection Sci., vol. 1, no. 1, pp. 3–16, 1989.
[17] A. S. Weigend, B. A. Huberman, and D. E. Rumelhart, Predicting the Future: A Connectionist Approach. Palo Alto, CA: System Sciences Lab., Xerox Palo Alto Res. Center, 1990.
[18] L. Shastri, Semantic Networks: An Evidential Formalization and its Connectionist Realization. San Mateo, CA: Morgan Kaufman, 1988.
[19] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. Cambridge, MA: MIT Press, 1986, pp. 318–362.
[20] G. G. Towell and J. W. Shavlik, "Extracting refined rules from knowledge-based neural networks," Mach. Learn., vol. 13, pp. 71–101, 1993.
[21] S. I. Gallant, Neural Network Learning and Expert Systems. Cambridge, MA: MIT Press, 1993.
[22] L.-M. Fu, "Rule learning by searching on adapted nets," in Proc. 9th Nat. Conf. Artificial Intelligence. Anaheim, CA: AAAI Press, 1991, pp. 590–595.
[23] K. Saito and R. Nakano, "Medical diagnostic expert system based on PDP model," in Proc. IEEE Int. Conf. Neural Networks, San Diego, CA, 1988, vol. 1, pp. 255–262.
[24] S. Sestito and T. Dillon, "Knowledge acquisition of conjunctive rules using multilayered neural networks," Int. J. Intelligent Syst., vol. 8, pp. 779–805, 1993.
[25] S. Sestito and T. Dillon, "Using single-layered neural networks for the extraction of conjunctive rules and hierarchical classifications," J. Appl. Intell., vol. 1, pp. 157–173, 1991.
[26] Y. Yoon, R. W. Brobst, P. R. Bergstresser, and L. L. Peterson, "A desktop neural network for dermatology diagnosis," J. Neural Network Comput., pp. 43–52, Summer 1989.
[27] J. J. Buckley, Y. Hayashi, and E. Czogala, "On the equivalence of neural nets and fuzzy expert systems," Fuzzy Sets Syst., vol. 53, pp. 129–134, 1993.
[28] M. R. Endsley and R. P. Smith, "Attention distribution and decision making in tactical air combat," unpublished.
[29] M. Sundararajan, "A neural network approach to model decision processes," M.S. thesis, Dept. Indust. Eng., Texas Tech Univ., Lubbock, 1993.

Samuel H. Huang received the B.S. degree in instrument engineering from Zhejiang University, P. R. China, in 1991, and the M.S. and Ph.D. degrees in industrial engineering from Texas Tech University, Lubbock, in 1992 and 1995, respectively. He is an R&D Engineer with EDS/Unigraphics Computer-Aided Manufacturing, Cypress, CA. His research interests are CAD/CAM, application of artificial intelligence in manufacturing, tolerance analysis, and environmentally conscious manufacturing. His work in these areas has been published in research journals including the International Journal of Production Research, the IEEE TRANSACTIONS ON COMPONENTS, PACKAGING, AND MANUFACTURING TECHNOLOGY, Computers in Industry, and the Journal of Engineering Design and Automation. Dr. Huang is a member of IIE, SME, and ASME.

Mica R. Endsley received the B.S. degree from Texas Tech University, Lubbock, and the M.S. degree from Purdue University, West Lafayette, IN, both in industrial engineering, and the Ph.D. degree in industrial and systems engineering with a specialization in human factors from the University of Southern California, Los Angeles. She is currently an Associate Professor of Industrial Engineering at Texas Tech University. She has been working on issues related to situation awareness in high performance aircraft for the past ten years, most recently expanding this research to air traffic control and maintenance for the Federal Aviation Administration. Prior to joining Texas Tech in 1990, she was an Engineering Specialist for the Northrop Corporation, serving as Principal Investigator of a research and development program focused on the areas of situation awareness, mental workload, expert systems, and interface design for the next generation of fighter cockpits. She is the author of over 40 scientific articles and reports on numerous subjects, including the implementation of technological change, the impact of automation, the design of expert system interfaces, new methods for knowledge elicitation for artificial intelligence system development, pilot decision-making, and various aspects of situation awareness. Dr. Endsley is the recipient of numerous awards for teaching, research, and contributions to system development.
