
Artificial Neural Networks
Introduction to Neural Networks

Artificial Intellect: Who is stronger and why?

Applied Problems:
• Image, sound, and pattern recognition
• Decision making
• Knowledge discovery
• Context-dependent analysis
• …

NEUROINFORMATICS: the modern theory about principles and new mathematical models of information processing, which are based on the biological prototypes and mechanisms of human brain activity.
Principles of Brain Processing

How does our brain manipulate patterns? The process of pattern recognition and pattern manipulation is based on:

Massive parallelism. The brain, as an information or signal processing system, is composed of a large number of simple processing elements, called neurons. These neurons are interconnected by numerous direct links, called connections, and cooperate with each other to perform parallel distributed processing.

Connectionism. The brain is a system of highly interconnected neurons, such that the state of one neuron affects the potential of the large number of other neurons to which it is connected according to the weights (strengths) of the connections. The key idea of this principle is that the functional capacity of the brain resides in these weighted connections.

Associative distributed memory. Storage of information in the brain is supposed to be concentrated in the synaptic connections of the brain's neural network or, more precisely, in the pattern of these connections and in the strengths (weights) of the synaptic connections.
Brain Computer: What is it?

The human brain contains a massively interconnected net of 10^10 to 10^11 neurons (cortical cells), i.e., on the order of 10 billion.

The biological neuron is the simple "arithmetic computing" element.
Biological Neurons
1. The soma, or cell body, is a large, round central body in which almost all the logical functions of the neuron are realized.
2. The axon (output) is a nerve fibre attached to the soma which can serve as a final output channel of the neuron. An axon is usually highly branched.
3. The dendrites (inputs) represent a highly branching tree of fibres. These long, irregularly shaped nerve fibres (processes) are attached to the soma.
4. Synapses are specialized contacts on a neuron which are the termination points for the axons from other neurons.

[Figure: the schematic model of a biological neuron, showing the soma, the dendrites, the axon, and the synapses at which axons from other neurons terminate.]
Brain-like Computer

Artificial neural network: the new paradigm of computing mathematics consists of the combination of such artificial neurons into an artificial neural net.

Brain-like computer: a mathematical model of the human brain's principles of computation. This computer consists of elements that can be called biological neuron prototypes, which are interconnected by direct links called connections, and which cooperate to perform parallel distributed processing (PDP) in order to solve a given task.
ANN as a Brain-Like Computer

An NN is a model of a brain-like computer. An artificial neural network (ANN) is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. This means that:
• Knowledge is acquired by the network through a learning (training) process;
• The strength of the interconnections between neurons is implemented by means of the synaptic weights used to store the knowledge.
The learning process is a procedure of adapting the weights with a learning algorithm.

Brain: the human brain is still not well understood, and indeed its behavior is very complex! There are about 10 billion neurons in the human cortex and 60 trillion synapses (connections).

Applications of Artificial Neural Networks

Artificial intellect with neural networks is applied in:
• Intelligent control
• Advanced robotics
• Technical diagnostics
• Machine vision
• Intelligent data analysis and signal processing
• Image and pattern recognition
• Intelligent expert systems
• Intelligent medicine devices
• Intelligent security systems
Image Recognition: Decision Rule and Classifier
• Is it possible to formulate (and formalize!) the decision rule, using which we can classify or recognize our objects based on the selected features?
• Can you propose a rule by which we can definitely decide whether it is a tiger or a rabbit?
Image Recognition: Decision Rule and Classifier
• Once we know our decision rule, it is not difficult to develop a classifier, which will perform classification/recognition using the selected features and the decision rule.
• However, if the decision rule cannot be formulated and formalized, we should use a classifier which can develop the rule through a learning process.
Image Recognition: Decision Rule and Classifier
• In most recognition/classification problems, formalization of the decision rule is very complicated or outright impossible.
• A neural network is a tool that can accumulate knowledge through a learning process.
• After the learning process, a neural network is able to approximate a function which is supposed to be our decision rule.
Why a Neural Network?

• $f(x_1, \ldots, x_n)$ is an unknown multi-factor decision rule.
• A learning process uses a representative learning set.
• $(w_0, w_1, \ldots, w_n)$ is the set of weights (the weighting vector) resulting from the learning process.
• $\hat{f}(x_1, \ldots, x_n) = P(w_0 + w_1 x_1 + \cdots + w_n x_n)$ is a partially defined function, which is an approximation of the decision-rule function.
Mathematical Interpretation of Classification in Decision Making

1. Quantization of the pattern space into p decision classes:
$$f\colon \mathbb{R}^n \to \mathbb{R}^p$$
[Figure: input vectors $x_i \in \mathbb{R}^n$ are mapped by the family of functions $F \equiv \{f(t)\}$ to responses $y_i \in \mathbb{R}^p$; the target space is partitioned into class regions $m_1, m_2, \ldots, m_p$.]

2. Mathematical model of quantization ("learning by examples"): input patterns and responses
$$x_i = \begin{pmatrix} x_1^{(1)} \\ x_2^{(1)} \\ \vdots \\ x_n^{(1)} \end{pmatrix}, \qquad y_i = \begin{pmatrix} y_1^{(1)} \\ y_2^{(1)} \\ \vdots \\ y_n^{(1)} \end{pmatrix}$$
Intelligent Data Analysis in Engineering Experiment

Adaptive machine learning via a neural network proceeds as a pipeline: Data Acquisition → Data Analysis → Decision Making.
• Data acquisition yields signals and parameters;
• Data analysis yields characteristics and estimations;
• Interpretation and decision making yield rules and knowledge production;
• A knowledge base supports the whole pipeline.
Learning via the Self-Organization Principle

Self-organization is the basic principle of learning: structure reconstruction. Learning involves a change of structure.

[Figure: input images feed a neuroprocessor that produces a response; a teacher drives the learning rule that adjusts the neuroprocessor.]
Symbol Manipulation or Pattern Recognition?

Ill-formalizable tasks:
• Sound and pattern recognition
• Decision making
• Knowledge discovery
• Context-dependent analysis

What is the difference between the human brain and a traditional computer in their specific approaches to the solution of ill-formalizable tasks (those tasks that cannot be formalized directly)? Which way of imagination is best for you: symbol manipulation or pattern recognition?

Example facts that can be represented either way: a dove flies, a lion goes, a tortoise crawls, a donkey sits, a shark swims.
Artificial Neuron

A neuron has a set of n synapses associated with its inputs. Each of them is characterized by a weight $w_i$, $i = 1, \ldots, n$. A signal $x_i$ at the i-th input is multiplied (weighted) by the weight $w_i$. The weighted input signals are summed; thus a linear combination of the input signals, $w_1 x_1 + \cdots + w_n x_n$, is obtained. A "free weight" (or bias) $w_0$, which does not correspond to any input, is added to this linear combination, and this forms the weighted sum
$$z = w_0 + w_1 x_1 + \cdots + w_n x_n.$$
A nonlinear activation function $\varphi$ is applied to the weighted sum. The value of the activation function, $y = \varphi(z)$, is the neuron's output.
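To make the computation concrete, here is a minimal sketch of this forward pass in Python; the function name and the default tanh activation are illustrative choices, not something fixed by the slides.

```python
import math

def neuron_output(weights, inputs, activation=math.tanh):
    """Forward pass of a single artificial neuron.

    weights: (w0, w1, ..., wn), where w0 is the bias ("free weight")
    inputs:  (x1, ..., xn)
    """
    # z = w0 + w1*x1 + ... + wn*xn
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    # y = phi(z): apply the nonlinear activation to the weighted sum
    return activation(z)

# Example: a neuron with two inputs
print(neuron_output((0.5, 1.0, -2.0), (0.3, 0.1)))
```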
A Neuron

$$f(x_1, \ldots, x_n) = \varphi(w_0 + w_1 x_1 + \cdots + w_n x_n)$$

• $f$ is the function to be learned;
• $x_1, \ldots, x_n$ are the inputs;
• $\varphi$ is the activation function;
• $z = w_0 + w_1 x_1 + \cdots + w_n x_n$ is the weighted sum.
A Neuron
• A neuron's functionality is determined by the nature of its activation function: its main properties, its plasticity and flexibility, and its ability to approximate a function to be learned.
Artificial Neuron: Classical Activation Functions

Linear activation:
$$\varphi(z) = z$$

Logistic activation:
$$\varphi(z) = \frac{1}{1 + e^{-\alpha z}}$$

Threshold activation:
$$\varphi(z) = \operatorname{sign}(z) = \begin{cases} 1, & \text{if } z \ge 0, \\ -1, & \text{if } z < 0. \end{cases}$$

Hyperbolic tangent activation:
$$\varphi(z) = \tanh(\gamma z) = \frac{1 - e^{-2\gamma z}}{1 + e^{-2\gamma z}}$$
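All four functions are one-liners in code. A small sketch (the defaults α = 1 and γ = 1 are assumptions for illustration):

```python
import math

def linear(z):
    return z

def logistic(z, alpha=1.0):
    # 1 / (1 + e^(-alpha*z)), values in (0, 1)
    return 1.0 / (1.0 + math.exp(-alpha * z))

def threshold(z):
    # sign(z) with the convention sign(0) = 1, values in {1, -1}
    return 1 if z >= 0 else -1

def hyperbolic_tangent(z, gamma=1.0):
    # tanh(gamma*z) = (1 - e^(-2*gamma*z)) / (1 + e^(-2*gamma*z))
    return math.tanh(gamma * z)
```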
Principles of Neurocomputing

Connectionism. An NN is a highly interconnected structure, in such a way that the state of one neuron affects the potential of the large number of other neurons to which it is connected according to the weights of the connections.

Not programming but training. An NN is trained rather than programmed to perform a given task, since it is difficult to separate the hardware and the software in its structure. We program not the solution of tasks but the ability to learn to solve tasks.

Distributed memory. An NN presents a distributed memory, so that the adaptation of a synapse can take place everywhere in the structure of the network.
Principles of Neurocomputing

Learning and adaptation. NNs are capable of adapting themselves (the synaptic connections between units) to special environmental conditions by changing their structure or the strengths of their connections.

Non-linear functionality. Every new state of a neuron is a nonlinear function, $y = \varphi(x)$, of the input pattern created by the firing nonlinear activity of the other neurons.

Robustness of associativity. NN states are characterized by high robustness, or insensitivity, to noise and fuzziness in the input data, owing to the use of a highly redundant distributed structure.
Threshold Neuron (Perceptron)
• The output of a threshold neuron is binary, while its inputs may be either binary or continuous.
• If the inputs are binary, a threshold neuron implements a Boolean function.
• The Boolean alphabet {1, -1} is usually used in neural networks theory instead of {0, 1}. The correspondence with the classical Boolean alphabet {0, 1} is established as follows:
$$0 \to 1; \quad 1 \to -1; \qquad y \in \{0, 1\},\ x \in \{1, -1\} \ \Rightarrow\ x = 1 - 2y = (-1)^y$$
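A quick sketch of this encoding (the helper names are mine):

```python
def to_bipolar(y):
    # {0, 1} -> {1, -1}: x = 1 - 2y, equivalently (-1) ** y
    return 1 - 2 * y

def to_classical(x):
    # {1, -1} -> {0, 1}: the inverse mapping
    return (1 - x) // 2

assert to_bipolar(0) == 1 and to_bipolar(1) == -1
```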
Threshold Boolean Functions
• The Boolean function $f(x_1, \ldots, x_n)$ is called a threshold (linearly separable) function if it is possible to find a real-valued weighting vector $W = (w_0, w_1, \ldots, w_n)$ such that the equation
$$f(x_1, \ldots, x_n) = \operatorname{sign}(w_0 + w_1 x_1 + \cdots + w_n x_n)$$
holds for all values of the variables from the domain of the function f.
• Any threshold Boolean function may be learned by a single neuron with the threshold activation function.
Threshold Boolean Functions: Geometrical Interpretation

"OR" (disjunction) is an example of a threshold (linearly separable) Boolean function: the "-1"s are separated from the "1"s by a single line. XOR is an example of a non-threshold (not linearly separable) Boolean function: it is impossible to separate the "1"s from the "-1"s by any single line.

[Figure: the four input points (1, 1), (1, -1), (-1, 1), (-1, -1) in the plane, labeled with the OR values (separable by one line) and with the XOR values (not separable).]

Truth tables in the {1, -1} alphabet (columns x1, x2, f):

OR:             XOR:
 1   1   1       1   1   1
 1  -1  -1       1  -1  -1
-1   1  -1      -1   1  -1
-1  -1  -1      -1  -1   1
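The geometric picture can be confirmed by brute force: the sketch below searches a small grid of integer weights for a weighting vector realizing each function in the {1, -1} alphabet. It finds one for OR (for example, W = (-1, 1, 1) works) and none for XOR. The grid bounds are an assumption that suffices for two variables.

```python
from itertools import product

def sign(z):
    return 1 if z >= 0 else -1

points = [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def find_weights(outputs):
    """Return a weighting vector (w0, w1, w2) realizing the function, or None."""
    for w in product(range(-3, 4), repeat=3):
        if all(sign(w[0] + w[1] * x1 + w[2] * x2) == f
               for (x1, x2), f in zip(points, outputs)):
            return w
    return None

OR_values  = (1, -1, -1, -1)   # f at the four points above
XOR_values = (1, -1, -1, 1)
print(find_weights(OR_values))   # a valid weighting vector is found
print(find_weights(XOR_values))  # None: XOR is not linearly separable
```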
Threshold Neuron: Learning
• A main property of a neuron, and of a neural network as a whole, is the ability to learn from the environment and to improve performance through learning.
• A neuron (a neural network) learns about its environment through an iterative process of adjustments applied to its synaptic weights.
• Ideally, a network (a single neuron) becomes more knowledgeable about its environment after each iteration of the learning process.
Threshold Neuron: Learning
• Let us have a finite set of n-dimensional vectors that describe some objects belonging to some classes (let us assume, for simplicity but without loss of generality, that there are just two classes and that our vectors are binary). This set is called a learning set:
$$X_j = (x_1^j, \ldots, x_n^j);\quad X_j \in C_k,\ k = 1, 2;\ j = 1, \ldots, m;\quad x_i^j \in \{1, -1\}.$$
Threshold Neuron: Learning
• Learning of a neuron (of a network) is a process of its adaptation to the automatic identification of the membership of all vectors from a learning set, based on the analysis of these vectors: their components form the set of neuron (network) inputs.
• This process should be implemented through a learning algorithm.
Threshold Neuron: Learning
• Let T be the desired output of a neuron (of a network) for a certain input vector, and let Y be the actual output of the neuron.
• If T = Y, there is nothing to learn.
• If T ≠ Y, then the neuron has to learn, in order to ensure that, after adjustment of the weights, its actual output coincides with the desired output.
Error-Correction Learning
• If T ≠ Y, then $\delta = T - Y$ is the error.
• The goal of learning is to adjust the weights in such a way that for the new actual output we have $\tilde{Y} = Y + \delta = T$.
• That is, the updated actual output must coincide with the desired output.
Error-Correction Learning
• The error-correction learning rule determines how the weights must be adjusted to ensure that the updated actual output will coincide with the desired output. For $W = (w_0, w_1, \ldots, w_n)$ and $X = (x_1, \ldots, x_n)$:
$$\tilde{w}_0 = w_0 + \alpha\delta; \qquad \tilde{w}_i = w_i + \alpha\delta x_i,\ i = 1, \ldots, n.$$
• $\alpha$ is the learning rate (it should be equal to 1 for the threshold neuron when the function to be learned is Boolean).
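A direct transcription of the rule (a sketch; the function name is mine):

```python
def correct_weights(weights, x, delta, alpha=1):
    """One error-correction step: w0 += alpha*delta, wi += alpha*delta*xi."""
    w = list(weights)
    w[0] += alpha * delta
    for i, xi in enumerate(x, start=1):
        w[i] += alpha * delta * xi
    return w
```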
Learning Algorithm
• The learning algorithm consists of sequentially checking, for all vectors from the learning set, whether their membership is recognized correctly. If so, no action is required. If not, the learning rule must be applied to adjust the weights.
• This iterative process has to continue either until the membership of all vectors from the learning set is recognized correctly, or until membership remains misrecognized for only some acceptably small number of vectors (samples from the learning set).
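Putting the error-correction rule and this algorithm together gives the classical perceptron training loop. The sketch below follows the slides' conventions (threshold activation, {1, -1} alphabet, α = 1); the epoch cap is a safety assumption for the non-separable case.

```python
def sign(z):
    return 1 if z >= 0 else -1

def train_threshold_neuron(samples, n, max_epochs=100):
    """samples: list of (x, t) pairs with x in {1,-1}^n and t the desired output."""
    w = [0] * (n + 1)  # (w0, w1, ..., wn)
    for _ in range(max_epochs):
        errors = 0
        for x, t in samples:
            y = sign(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            delta = t - y
            if delta != 0:  # misclassified: apply the error-correction rule
                errors += 1
                w[0] += delta
                for i, xi in enumerate(x, start=1):
                    w[i] += delta * xi
        if errors == 0:  # every membership recognized correctly
            return w
    return w  # non-separable case: stop after an acceptable number of epochs

# Learning OR in the {1, -1} alphabet
or_set = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(train_threshold_neuron(or_set, n=2))
```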
When We Need a Network
• The functionality of a single neuron is limited. For example, the threshold neuron (the perceptron) cannot learn non-linearly separable functions.
• To learn those functions (mappings between inputs and outputs) that cannot be learned by a single neuron, a neural network should be used.
The Simplest Network

[Figure: two inputs x1 and x2 feed a minimal feedforward network.]
Solving the XOR Problem Using the Simplest Network

$$x_1 \oplus x_2 = \bar{x}_1 x_2 \vee x_1 \bar{x}_2 = f_1(x_1, x_2) \vee f_2(x_1, x_2)$$

[Figure: inputs x1 and x2 feed two hidden threshold neurons computing f1 and f2, whose outputs feed a third, output neuron.]
Solving the XOR Problem Using the Simplest Network

Neuron 3 takes the outputs of neurons 1 and 2 as its inputs:

#   Inputs      Neuron 1          Neuron 2          Neuron 3          XOR =
    x1   x2     W=(1,-3,3)        W=(3,3,-1)        W=(-1,3,3)        x1 ⊕ x2
                z    sign(z)      z    sign(z)      z    sign(z)
1)   1    1     1      1          5      1          5      1           1
2)   1   -1    -5     -1          7      1         -1     -1          -1
3)  -1    1     7      1         -1     -1         -1     -1          -1
4)  -1   -1     1      1          1      1          5      1           1
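The table is easy to check in code. This sketch hard-codes the three weighting vectors from the table and verifies that the network reproduces XOR (which, in the {1, -1} alphabet, equals the product x1·x2).

```python
def sign(z):
    return 1 if z >= 0 else -1

def neuron(w, x):  # w = (w0, w1, w2), x = (x1, x2)
    return sign(w[0] + w[1] * x[0] + w[2] * x[1])

W1, W2, W3 = (1, -3, 3), (3, 3, -1), (-1, 3, 3)  # weighting vectors from the table

for x in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    h = (neuron(W1, x), neuron(W2, x))  # outputs of neurons 1 and 2
    out = neuron(W3, h)                 # neuron 3 takes (h1, h2) as inputs
    assert out == x[0] * x[1]           # XOR in the {1,-1} alphabet is x1*x2
    print(x, '->', out)
```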
Threshold Functions and Threshold Neurons
• Threshold (linearly separable) functions can be learned by a single threshold neuron.
• Non-threshold (nonlinearly separable) functions cannot be learned by a single neuron; learning them requires a neural network built from threshold neurons (Minsky-Papert, 1969).
• The number of all Boolean functions of n variables equals $2^{2^n}$, but the number of threshold functions is substantially smaller. Indeed, for n = 2, fourteen of the sixteen functions (all except XOR and NOT XOR) are threshold; for n = 3 there are 104 threshold functions out of 256; and for n > 3 the following holds (T is the number of threshold functions of n variables):
$$\frac{T}{2^{2^n}} \to 0$$
• For example, for n = 4 there are only about 2000 threshold functions out of 65536.
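The count for n = 2 can be reproduced by brute force: enumerate all 16 Boolean functions of two variables and test each for linear separability over a small weight grid (the grid is an assumption that happens to suffice here).

```python
from itertools import product

def sign(z):
    return 1 if z >= 0 else -1

points = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
grid = [k / 2 for k in range(-8, 9)]  # candidate weights -4.0 ... 4.0

def is_threshold(outputs):
    return any(all(sign(w0 + w1 * x1 + w2 * x2) == f
                   for (x1, x2), f in zip(points, outputs))
               for w0, w1, w2 in product(grid, repeat=3))

count = sum(is_threshold(outs) for outs in product([1, -1], repeat=4))
print(count)  # 14: all functions except XOR and NOT XOR are threshold
```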
Is It Possible to Learn XOR, Parity n, and Other Non-Linearly Separable Functions Using a Single Neuron?
• Any classical monograph/textbook on neural networks claims that to learn the XOR function, a network of at least three neurons is needed.
• This is true for real-valued neurons and real-valued neural networks.
• However, it is not true for complex-valued neurons!
• A jump to the complex domain is the right way to overcome the Minsky-Papert limitation and to learn multiple-valued and Boolean nonlinearly separable functions using a single neuron.
XOR Problem

n = 2, m = 4: the complex plane is divided into four sectors (the quadrants), and W = (0, 1, i) is the weighting vector, so
$$z = w_0 + w_1 x_1 + w_2 x_2 = x_1 + i x_2.$$
The output is determined by the sector in which z lies: $P_B(z) = 1$ in the first and third quadrants, and $P_B(z) = -1$ in the second and fourth. For $x_1, x_2 \in \{1, -1\}$, this single neuron outputs exactly $x_1 \oplus x_2$.
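A sketch of this single complex-valued neuron (the sector activation is written here as alternating ±1 over the m sectors, which is my paraphrase of the rule pictured on the slide):

```python
import cmath, math

W = (0, 1, 1j)  # the weighting vector from the slide: w0=0, w1=1, w2=i

def P_B(z, m=4):
    """Sector activation: split the plane into m sectors and alternate +1/-1."""
    sector = int(cmath.phase(z) % (2 * math.pi) // (2 * math.pi / m))
    return 1 if sector % 2 == 0 else -1

for x1, x2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    z = W[0] + W[1] * x1 + W[2] * x2  # z = x1 + i*x2
    print(x1, x2, '->', P_B(z))       # matches XOR (= x1*x2 in this alphabet)
```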
Blurred Image Restoration (Deblurring) and Blur Identification by MLMVN
Blurred Image Restoration (Deblurring) and Blur Identification by MLMVN
• I. Aizenberg, D. Paliy, J. Zurada, and J. Astola, "Blur Identification by Multilayer Neural Network Based on Multi-Valued Neurons," IEEE Transactions on Neural Networks, vol. 19, no. 5, May 2008, pp. 883-898.
Problem Statement: Capturing
• Mathematically, a variety of capturing principles (photography, tomography, microscopy) can be described by the Fredholm integral of the first kind:
$$z(x) = \int_{\mathbb{R}^2} v(x, t)\, y(t)\, dt, \qquad x, t \in \mathbb{R}^2,$$
where v(x, t) is the point-spread function (PSF) of the system, y(t) is a function of the real object, and z(x) is the observed signal.
Image Deblurring: Problem Statement
• Mathematically, blur is caused by the convolution of an image with the distorting kernel.
• Thus, removal of the blur reduces to deconvolution.
• Deconvolution is an ill-posed problem, which results in the instability of a solution. The best way to solve it is to use some regularization technique.
• To use any kind of regularization technique, it is absolutely necessary to know the distorting kernel corresponding to the particular blur: so it is necessary to identify the blur.
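For intuition only, here is a minimal sketch of one common regularization technique, Tikhonov-regularized inverse filtering in the frequency domain. The slides do not prescribe this particular restoration method, and the regularization constant is an arbitrary assumption.

```python
import numpy as np

def deblur_tikhonov(blurred, psf, reg=1e-2):
    """Inverse filtering with Tikhonov regularization, in the Fourier domain.

    Y = conj(V) * Z / (|V|^2 + reg); assumes the PSF is registered at the
    origin (apply np.fft.ifftshift first for a centered kernel).
    """
    V = np.fft.fft2(psf, s=blurred.shape)  # transfer function of the blur
    Z = np.fft.fft2(blurred)
    Y = np.conj(V) * Z / (np.abs(V) ** 2 + reg)
    return np.real(np.fft.ifft2(Y))

# Example kernel: the 9x9 rectangular (boxcar) blur used in the experiments
psf = np.ones((9, 9)) / 81.0
```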
Blur Identification
• We use a multilayer neural network based on multi-valued neurons (MLMVN) to recognize Gaussian, motion, and rectangular (boxcar) blurs.
• We aim to identify both the blur and its parameters simultaneously, using a single neural network.
Degradation in the Frequency Domain

[Figure: the true image and its degradations by Gaussian, rectangular, horizontal-motion, and vertical-motion blur; the images and the logs of their power spectra, log |Z|, are shown.]
Examples of Training Vectors

[Figure: examples of training vectors obtained from the true image and from its Gaussian, rectangular, horizontal-motion, and vertical-motion blurred versions.]
Neural Network

[Figure: the MLMVN (structure 5356). Training (pattern) vectors 1, 2, …, n are fed into the hidden layers; each neuron of the output layer corresponds to one class: Blur 1, Blur 2, …, Blur N.]


Simulation

Experiment 1 (2700 training pattern vectors corresponding to 72 images): six types of blur with the following parameters (MLMVN structure: 5356):
1) The Gaussian blur with $\tau \in \{1, 1.33, 1.66, 2, 2.33, 2.66, 3\}$;
2) The linear uniform horizontal motion blur of lengths 3, 5, 7, 9;
3) The linear uniform vertical motion blur of lengths 3, 5, 7, 9;
4) The linear uniform diagonal motion blur (from South-West to North-East) of lengths 3, 5, 7, 9;
5) The linear uniform diagonal motion blur (from South-East to North-West) of lengths 3, 5, 7, 9;
6) The rectangular blur of sizes 3x3, 5x5, 7x7, 9x9.
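A sketch of how such blur kernels (PSFs) could be generated for building training data. The exact parameterizations used in the paper may differ; in particular, treating τ as the Gaussian standard deviation is an assumption.

```python
import numpy as np

def gaussian_psf(tau, size=9):
    """Gaussian PSF; tau is assumed here to act as the standard deviation."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * tau**2))
    return psf / psf.sum()

def motion_psf(length, horizontal=True):
    """Linear uniform motion blur of the given length."""
    psf = np.ones((1, length)) if horizontal else np.ones((length, 1))
    return psf / psf.sum()

def boxcar_psf(size):
    """Rectangular (boxcar) blur of size x size."""
    return np.ones((size, size)) / size**2
```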
Results

Classification results:

Blur                         | MLMVN (5356; 381 inputs;  | SVM (ensemble of 27 binary
                             | 2336 weights in total)    | decision SVMs; 25,717,500
                             |                           | support vectors in total)
No blur                      | 96.0%                     | 100.0%
Gaussian                     | 99.0%                     | 99.4%
Rectangular                  | 99.0%                     | 96.4%
Motion horizontal            | 98.5%                     | 96.4%
Motion vertical              | 98.3%                     | 96.4%
Motion North-East diagonal   | 97.9%                     | 96.5%
Motion North-West diagonal   | 97.2%                     | 96.5%
Restored Images

[Figure: a blurred noisy image degraded by a 9x9 rectangular blur and its restored version; a blurred noisy image degraded by a Gaussian blur with σ = 2 and its restored version.]
