
SEMINAR ON THE TOPIC OF
NEURAL NETWORKS AND FUZZY SYSTEMS

PRESENTED BY
K. NEELIMA
B.TECH 2ND YEAR
CVR COLLEGE, IBP, HYDERABAD
Abstract

This paper deals with the concepts of neural networks and fuzzy systems. First,
an overview of Artificial Intelligence is given. The scope of this presentation
is to give a brief introduction to Artificial Neural Networks (ANNs) for
people who have no previous knowledge of them. We first give a brief
introduction to models of networks, and then describe ANNs in general terms.
As an application, we explain the backpropagation algorithm, since it is widely
used and many other algorithms are derived from it. Lastly, a brief introduction to
fuzzy systems is presented.
CONTENTS

Artificial Intelligence
Networks
Artificial Neural Networks
Perceptrons
The Error Back-Propagation Learning Algorithm
Successful Applications of Backpropagation
Fuzzy Systems
Bibliography

1. Artificial Intelligence
There are many definitions given for Artificial Intelligence(1). Here are some
of them.

1.1 Systems that think like humans (cognitive modelling)

“The exciting new effort to make computers think ... machines with minds, in the
full and literal sense” (Haugeland, 1985).
“The automation of activities that we associate with human thinking” (Bellman,
1978).

1.2 Systems that act like humans (Turing test)

“The art of creating machines that perform functions that require intelligence
when performed by people” (Kurzweil, 1990).
“The study of how to make computers do things which, at the moment, people
do better” (Rich and Knight, 1991).

1.3 Systems that think rationally (Aristotelian logic)

“The study of mental faculties through the use of computational models”
(Charniak and McDermott, 1985).
“The study of the computations that make it possible to perceive, reason
and act” (Winston, 1992).

1.4 Systems that act rationally (Goal oriented agents)

“A field of study that seeks to explain and emulate intelligent behaviour in terms
of computational processes” (Schalkoff, 1992).
“The branch of computer science that is concerned with the automation of
intelligent behaviour” (Luger and Stubblefield, 1992).

We can see that many definitions exist. But which one should we follow? In
fact, the answer is none of them; we have to build our own definition.

1.5 Typical applications of AI

• Game playing and Microworlds
• Computer vision
• Natural language processing
• Diagnosis systems
• Control
• Optimisation
• Robotics

1.6 Previous work

Now we will have a look at some of the work done previously in the field of AI:

• McCulloch and Pitts: Mathematical model of a neuron (1943)
• Marvin Minsky: Implemented the first neural network (1951)
• John McCarthy: First AI workshop (1956)
• Newell and Simon: Demonstrated their Logic Theorist (1956)

2. Networks
One efficient way of solving complex problems is to follow the maxim
“divide and conquer”. A complex system may be decomposed into simpler
elements in order to understand it. Likewise, simple elements may be
gathered to produce a complex system (Bar-Yam, 1997). Networks(2) are one
approach for achieving this. There are a large number of different types of
networks, but they are all characterized by the following components: a set of
nodes, and connections between nodes.
The nodes can be seen as computational units. They receive inputs, and
process them to obtain an output. This processing might be very simple (such
as summing the inputs), or quite complex (a node might contain another
network...). The connections determine the information flow between nodes.
They can be unidirectional, when the information flows in only one direction, or
bi-directional, when the information flows in either direction. The interactions of
nodes through the connections lead to a global behavior of the network, which
cannot be observed in the elements of the network. This global behavior is
said to be emergent. This means that the abilities of the network supersede
those of its elements, making networks a very powerful tool.

Networks are used to model a wide range of phenomena in physics, computer
science, biochemistry, ethology, mathematics, sociology, economics,
telecommunications, and many other areas. This is because many systems can
be seen as a network: proteins, computers, communities, etc.

2.1 Artificial neural networks

One type of network sees the nodes as ‘artificial neurons’. These are called
artificial neural networks(2) (ANNs). An artificial neuron is a computational
model inspired by natural neurons. Natural neurons receive signals through
synapses located on the dendrites or membrane of the neuron. When the
signals received are strong enough (surpass a certain threshold), the neuron is
activated and emits a signal through the axon. This signal might be sent to
another synapse, and might activate other neurons.
Figure 1. Neuron architecture(3)

The complexity of real neurons is highly abstracted when modelling artificial
neurons. These basically consist of inputs (like synapses), which are
multiplied by weights (the strength of the respective signals), and then computed
by a mathematical function which determines the activation (φ) of the
neuron. Another function (which may be the identity) computes the output of
the artificial neuron, sometimes in dependence on a certain threshold. ANNs
combine artificial neurons in order to process information, as the code sketch
below illustrates.

Figure 2: Artificial neuron (node) architecture
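A minimal sketch of this computation in Python, assuming a logistic activation (the function name and interface here are illustrative, not part of the original text):

```python
import math

def neuron_output(inputs, weights, bias):
    """One artificial neuron: multiply inputs by weights, sum, then activate.

    The logistic activation used here is one common choice among many.
    """
    v = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum
    return 1.0 / (1.0 + math.exp(-v))                       # activation φ(v)
```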

2.2 Perceptrons

• A perceptron(3) is a simple pattern classifier.
• Given a binary input vector x, a weight vector w, and a threshold value T, if

Σi wi xi > T

then the output is 1, indicating membership of a class; otherwise it is 0,
indicating exclusion from the class.
• w·x = T describes a hyperplane, and the goal of perceptron learning is to
find a weight vector w that results in correct classification for all training
examples. A minimal training sketch follows.
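The sketch below uses assumed conventions the text does not spell out: NumPy, a trainable bias standing in for the threshold T, and the classic error-correction update rule.

```python
import numpy as np

def perceptron_train(X, y, epochs=50, lr=1.0):
    """Perceptron learning sketch: adjust w until training examples classify correctly.

    X: binary input vectors, shape (n_samples, n_features); y: 0/1 class labels.
    The threshold T is folded in as a trainable bias on a constant input of 1.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, target in zip(Xb, y):
            out = 1 if w @ x > 0 else 0        # threshold unit: Σ wi xi > T
            w += lr * (target - out) * x       # update only on misclassification
    return w

# Example: learn the AND function, which is linearly separable.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w = perceptron_train(X, y)
```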
2.3 Multilayer Perceptrons

• In a multi-layer perceptron(3) (MLP), there is a layer of input nodes, a
layer of output nodes, and one or more intermediate layers of nodes that are
referred to as hidden nodes (hidden layers).
• In addition to this, each node in an MLP includes a non-linearity at its
output end.
• That is, the output function of the node is of the form:

φ(Σi wi xi)

where φ(x) is a differentiable (smooth) function, frequently the logistic function:

φ(x) = 1 / (1 + e^(−x))

Figure 3: Graph of the logistic function φ(x)

2.4 Overall Layout of MLP

• Typically also, each node in a layer (other than the output layer) is connected
to every node in the next layer by a trainable weight.
• The overall layout is illustrated in Figure 4, and a forward-pass sketch is given below.
Figure 4. A multi-layer network
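A compact sketch of a forward pass through such fully connected layers (representing each layer as a weight matrix and bias vector is an assumption made here for illustration):

```python
import numpy as np

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

def mlp_forward(x, layers):
    """Propagate input x through a list of fully connected layers.

    layers: list of (W, b) pairs; W has shape (n_out, n_in), so every
    node is connected to every node in the next layer by one weight.
    """
    y = np.asarray(x, dtype=float)
    for W, b in layers:
        y = logistic(W @ y + b)  # weighted sums, then the non-linearity φ
    return y
```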

2.5 Node Internals

• Figure 5 shows the internals of a node.
• The weight wj0 acts like a threshold.
• The yi are the outputs of other nodes (or perhaps inputs to the network).
• The first step is forming the weighted sum vj = Σi wji yi.
• The second step is applying the non-linearity function φ to vj to produce the
output yj.

Figure 5. Internal functioning of node j

2.6 The Error Back-Propagation Learning Algorithm

• This algorithm was discovered and rediscovered a number of times; for
details, see, e.g., chapter 4 of Haykin, S., Neural Networks - a Comprehensive
Foundation, 2nd ed., p. 156. This reference also contains the mathematical
details of the derivation of the backpropagation(3) equations, which we shall
omit.
• Back-propagation attempts to reduce the errors between the output of the
network and the desired result.
• However, assigning blame for errors to hidden nodes (i.e. nodes in the
intermediate layers) is not so straightforward. The error of the output nodes
must be propagated back through the hidden nodes.
• The contribution that a hidden node makes to an output node is related to the
strength of the weight on the link between the two nodes and the level of
activation of the hidden node when the output node was given the wrong level
of activation.
• This can be used to estimate the error value for a hidden node in the
penultimate layer, and that can, in turn, be used to make error estimates for
earlier layers.

2.7 Weight Change Equation

• The basic algorithm can be summed up in the following equation (the
delta rule) for the change to the weight wji from node i to node j:

Δwji = η δj yi

where Δwji is the weight change, η is the learning rate, yi is the input signal
to node j, and the local gradient δj is defined as follows:

1. If node j is an output node, then δj is the product of φ'(vj) and the error
signal ej, where φ(·) is the logistic function, vj is the total input to node j
(i.e. Σi wji yi), and ej is the error signal for node j (i.e. the difference between
the desired output and the actual output);
2. If node j is a hidden node, then δj is the product of φ'(vj) and the weighted
sum of the δ's computed for the nodes in the next hidden or output layer that
are connected to node j. [The actual formula is δj = φ'(vj) Σk δk wkj, where
k ranges over those nodes for which wkj is non-zero (i.e. nodes k that actually
have connections from node j). The δk values have already been computed, as
they are in the output layer (or a layer closer to the output layer than node j).]
A code sketch of both cases follows.
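This sketch assumes the logistic non-linearity, so that φ'(vj) can be computed from the node outputs; the variable names are illustrative, not from the original text.

```python
import numpy as np

def local_gradients(y_out, d, y_hidden, W_out):
    """Local gradients δ for an output layer and the hidden layer before it.

    y_out: output activations; d: desired outputs; y_hidden: hidden
    activations; W_out: hidden-to-output weights, shape (n_out, n_hidden).
    Uses φ'(v) = y(1 - y), valid for the logistic function.
    """
    e = d - y_out                              # error signals e_j
    delta_out = y_out * (1 - y_out) * e        # case 1: output nodes
    delta_hidden = y_hidden * (1 - y_hidden) * (W_out.T @ delta_out)  # case 2
    return delta_out, delta_hidden
```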

2.8 Two Passes of Computation

FORWARD PASS: weights fixed; input signals are propagated through the network
and outputs calculated. Outputs oj are compared with desired outputs dj, and the
error signal ej = dj − oj is computed.

BACKWARD PASS: starts with the output layer and recursively computes the local
gradient δj for each node. Then the weights are updated using the equation above
for Δwji, and we return to another forward pass. A sketch combining the two
passes follows.
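Putting the two passes together for a small two-layer network; this is a hedged sketch under the same assumptions as above (logistic activations throughout, illustrative names), not a definitive implementation.

```python
import numpy as np

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_step(x, d, W1, b1, W2, b2, lr=0.1):
    """One forward pass and one backward pass, updating weights in place."""
    # Forward pass: weights fixed, signals propagated, outputs calculated.
    y1 = logistic(W1 @ x + b1)           # hidden-layer outputs
    y2 = logistic(W2 @ y1 + b2)          # network outputs o_j
    # Backward pass: local gradients, output layer first, then hidden.
    delta2 = y2 * (1 - y2) * (d - y2)    # φ'(v) times error signal e_j
    delta1 = y1 * (1 - y1) * (W2.T @ delta2)
    # Delta-rule updates: Δw_ji = η δ_j y_i for every node pair (j, i).
    W2 += lr * np.outer(delta2, y1); b2 += lr * delta2
    W1 += lr * np.outer(delta1, x);  b1 += lr * delta1
    return y2
```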
2.9 Sigmoidal Nonlinearity

With the sigmoidal function φ(x) defined above, it is the case that
φ'(vj) = yj(1 − yj), a fact that simplifies the computations.
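This fact follows in one line from the definition of the logistic function: since
φ(x) = 1/(1 + e^(−x)), differentiating gives
φ'(x) = e^(−x) / (1 + e^(−x))² = φ(x)(1 − φ(x)),
and substituting yj = φ(vj) yields φ'(vj) = yj(1 − yj).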

2.10 Rate of Learning

• If the learning rate η is very small, then the algorithm proceeds slowly, but
accurately follows the path of steepest descent in weight space.
• If η is largish, the algorithm may oscillate ("bounce off the canyon walls").
• A simple method of effectively increasing the rate of learning is to modify the
delta rule by including a momentum term:
Δwji(n) = α Δwji(n-1) + η δj(n)yi(n)
where α is a positive constant termed the momentum constant. This is called the
generalized delta rule.

The effect is that if the basic delta rule is consistently pushing a weight in the
same direction, then it gradually gathers "momentum" in that direction.
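In code, the generalized delta rule only requires remembering the previous weight change (a sketch; the names are illustrative):

```python
def momentum_update(w, prev_dw, delta, y, lr=0.1, alpha=0.9):
    """Generalized delta rule: Δw(n) = α Δw(n−1) + η δ(n) y(n)."""
    dw = alpha * prev_dw + lr * delta * y   # momentum term plus basic delta rule
    return w + dw, dw                       # updated weight, change to remember
```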

2.11 Stopping Criterion

Two commonly used stopping criteria are:

• stop after a certain number of runs through all the training data (each run
through all the training data is called an epoch);
• stop when the total sum-squared error reaches some low level. By total
sum-squared error we mean Σp Σi ei², where p ranges over all of the
training patterns and i ranges over all of the output units (see the sketch below).
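A small sketch of the second criterion (the names are illustrative):

```python
def total_sum_squared_error(desired, outputs):
    """Σp Σi ei² over all training patterns p and output units i."""
    return sum((d - o) ** 2
               for dp, op in zip(desired, outputs)   # patterns p
               for d, o in zip(dp, op))              # output units i
```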

2.12 Initialization

The weights of a network to be trained by backpropagation must be initialized to
some non-zero values. The usual thing to do is to initialize the weights to small
random values, for example as below.
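For instance (the ±0.5 range and the layer sizes are illustrative choices, not prescribed by the text):

```python
import numpy as np

n_inputs, n_hidden = 4, 3                  # illustrative layer sizes
rng = np.random.default_rng(seed=0)
W1 = rng.uniform(-0.5, 0.5, size=(n_hidden, n_inputs))  # small non-zero values
```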

3. Successful Applications of Backpropagation

Backpropagation tends to work well in some situations where human experts are
unable to articulate a rule for what they are doing - e.g. in areas depending on raw
perception, and where it is difficult to determine the attributes (in the ID3 sense)
that are relevant to the problem at hand.
For example, there is a proprietary system, which includes a backpropagation
component, for assisting in classifying Pap smears.

• The system picks out from the image the most suspicious-looking cells.
• A human expert then inspects these cells.
• This reduces the problem from looking at maybe 10,000 cells to
looking at maybe 100 cells, which reduces the boredom-induced error rate.
• Other successful systems have been built for tasks like reading
handwritten postcodes.

4. Fuzzy Systems

4.1 Introduction

Fuzzy logic(4) is concerned with the theory of fuzzy sets, which describe
vagueness. It is a mathematical framework for describing human thinking
using linguistic variables.

• Experts rely on “common sense” and use vague and ambiguous
terms, which require underlying knowledge and experience.
If symptoms are X then diagnosis is probably Y and treatment is Z.
• Fuzzy logic allows us to compute with words the way the experts
themselves do.
• Boolean logic forces us to draw lines between class members and
non-members that do not describe the real situation.
– When does a hill become a mountain?

4.2 Fuzzy Sets and Membership Functions

Fuzzy sets are defined as sets of ordered pairs of objects and their
membership functions.

Example: Young = {(x, µ(x)) | x is a person’s age}
– Jerker is still young (µ = 0.49)

• Membership functions (MFs) always range over [0, 1].
• If an MF outputs only 0 or 1, we have a crisp set.
– Example: Young = {x | x < 32}, x a person’s age in years.
• MFs can be discrete or continuous. A sketch of a continuous MF is given below.
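As a sketch of a continuous membership function for Young (the breakpoints 25 and 45 are assumptions chosen only so that a value like µ = 0.49 is plausible; they are not from the original text):

```python
def mu_young(age):
    """Membership in 'Young': 1 below 25, 0 above 45, linear in between."""
    if age <= 25:
        return 1.0
    if age >= 45:
        return 0.0
    return (45.0 - age) / 20.0

# mu_young(35.2) ≈ 0.49: still young, to degree 0.49.
# Contrast the crisp set Young = {x | x < 32}: age < 32 returns only 0 or 1.
```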

4.3 Fuzzy Set Notation


BIBLIOGRAPHY

1. Artificial Intelligence
<http://www2.du.se/kurser/resurser.asp?iKursId=2097698285&kategoriID=2086658928>

2. Artificial Neural Networks for Beginners, Carlos Gershenson
(cgershen@vub.ac.be, carlos@jlagunez.iquimica.unam.mx)
<http://www.cogs.susx.ac.uk/users/carlos/doc/FCS-ANN-tutorial.htm>

3. Neural Networks and Error Backpropagation Learning, Haykin
<http://www.cogs.susx.ac.uk/users/carlos/doc/FCS-ANN-tutorial.htm>

4. Fuzzy Logic
<http://www2.du.se/kurser/resurser.asp?iKursId=2097698467&kategoriID=2086658216>
