Introduction to Artificial Neural Networks
Dr. M. V. Rajesh
9/29/2013 ANN Lecture 1
E-mail: mvrajeshihrd@gmail.com
From Biological to Artificial Neural Networks

Overview:
Why Artificial Neural Networks?
How do NNs and ANNs work?
An Artificial Neuron
Capabilities of Threshold Neurons
Linear and Sigmoidal Neurons
Learning in ANNs
Computers vs. Neural Networks

"Standard" Computers       Neural Networks
one CPU                    highly parallel processing
fast processing units      slow processing units
reliable units             unreliable units
static infrastructure      dynamic infrastructure
programming                learning / training
Why Artificial Neural Networks?

There are two basic reasons why we are interested in building artificial neural networks (ANNs):

• Technical viewpoint: Some problems such as character recognition or the prediction of future states of a system require massively parallel and adaptive processing.

• Biological viewpoint: ANNs can be used to replicate and simulate components of the human (or animal) brain, thereby giving us insight into natural information processing.
How do NNs and ANNs work?

• The "building blocks" of neural networks are the neurons.
• In technical systems, we also refer to them as units or nodes.
• Basically, each neuron
  – receives input from many other neurons,
  – changes its internal state (activation) based on the current input,
  – sends one output signal to many other neurons, possibly including its input neurons (recurrent network).
How do NNs and ANNs work?

• Information is transmitted as a series of electric impulses, so-called spikes.
• The frequency and phase of these spikes encode the information.
• In biological systems, one neuron can be connected to as many as 10,000 other neurons.
• Neurons of similar functionality are usually organized in separate areas (or layers).
• Often, there is a hierarchy of interconnected layers, with the lowest layer receiving sensory input and neurons in higher layers computing more complex functions.
Real Neuron

(Figure: a real neuron; Carlson, 1986; reprinted in Ellis & Humphreys, 1999)
What are Natural Neural Nets?

Our brains are made up of billions of neurons.

Each neuron receives many electrical signals via its dendrites.

It 'fires' along its axon if the sum of those signals is over some threshold.

(Figure: a single neuron, showing the nucleus, cell body, axon, dendrites, and a synapse.)
Graceful Degradation

* Every day some of our neurons die!
* The good news is that our neural nets exhibit graceful degradation.
* This means we can lose many cells before we notice any effect, and we have an amazing ability to recover and/or adjust.
* It would be useful if ML had this property.
How do NNs and ANNs work?

• NNs are able to learn by adapting their connectivity patterns so that the organism improves its behavior in terms of reaching certain (evolutionary) goals.
• The strength of a connection, or whether it is excitatory or inhibitory, depends on the state of a receiving neuron's synapses.
• The NN achieves learning by appropriately adapting the states of its synapses.
What are Artificial Neural Nets?

Artificial Neural Networks (ANNs) are networks of nodes, based on neurons, that perform a simple summing function.

(Figure: inputs of 1.0 weighted by 5.2, 23.1, and -3.4 sum to 24.9, which is passed through an activation function.)
A More Formal Definition

Given weights w_ji, a summing input function, and an activation function that defines the output, we have:

  in_i = Σ_j w_ji a_j = W_i · a
  a_i = g(in_i)

(Figure: inputs with weights w1 = 5.2, w2 = 23.1, w3 = -3.4 and a fixed input of 1.0 are summed and passed through the activation function g to give the output a_i = g(in_i).)
Neuron

Neuron k forms a weighted sum of its input signals x1, x2, …, xm through its synaptic weights wk1, wk2, …, wkm, with a fixed input x0 = +1 whose weight wk0 = bk acts as the bias:

  ν_k = Σ_{j=0}^{m} w_kj x_j

The output is obtained by applying the activation function:

  y_k = φ_bs(ν_k)

(Figure: input signals x1 … xm, synaptic weights wk1 … wkm, a summing junction Σ producing νk, the activation function φbs, and the output yk.)
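The neuron model above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the weights, inputs, and the choice of a step activation are assumptions made for the example.

```python
# A minimal sketch of the neuron above: the induced local field
# v_k = sum_j w_kj * x_j, with x_0 = +1 so that w_k0 acts as the bias b_k,
# followed by an activation function. All numbers are illustrative.

def induced_local_field(weights, inputs):
    """v_k = sum of w_kj * x_j; inputs[0] is the fixed +1 bias input."""
    return sum(w * x for w, x in zip(weights, inputs))

def step(v, theta=0.0):
    """A simple threshold activation: fires (1) when v >= theta."""
    return 1 if v >= theta else 0

weights = [-0.5, 1.0, 1.0]   # [b_k, w_k1, w_k2]; w_k0 = b_k = -0.5
inputs  = [1.0, 0.2, 0.9]    # [x_0, x_1, x_2]; x_0 = +1 carries the bias

v_k = induced_local_field(weights, inputs)   # -0.5 + 0.2 + 0.9 = 0.6
y_k = step(v_k)                              # 1, since 0.6 >= 0
```

Folding the bias into the weight vector via x0 = +1, as the slide does, keeps the summation uniform over all inputs.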
Activation Functions

The input function is usually just summation of its inputs.

The activation function, however, can vary considerably.

(Figure: three common activation functions g plotted as a_i against in_i: the step function, the sign function, and the sigmoid function.)
Sigmoid transfer functions

Binary (Logistic) Sigmoid Function:

  y_k = φ_bs(ν_k) = 1 / (1 + e^(-α ν_k))

Hyperbolic Tangent Sigmoid Function:

  y_k = φ_hts(ν_k) = (1 - e^(-2α ν_k)) / (1 + e^(-2α ν_k))

(Figure: φ_bs(ν_k) plotted from 0 to 1 for ν_k in [-5, 5], and φ_hts(ν_k) plotted from -1 to 1 for ν_k in [-10, 10], each for α = 0.5, 1.0, and 2.0.)
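The two transfer functions above are easy to sketch directly; note that φ_hts is exactly tanh(αν_k). The sample values of ν and α below are assumptions for illustration.

```python
import math

# Sketch of the two sigmoid transfer functions above:
#   phi_bs(v)  = 1 / (1 + e^(-alpha*v))                  (binary/logistic)
#   phi_hts(v) = (1 - e^(-2*alpha*v)) / (1 + e^(-2*alpha*v))  (tanh-shaped)

def phi_bs(v, alpha=1.0):
    return 1.0 / (1.0 + math.exp(-alpha * v))

def phi_hts(v, alpha=1.0):
    e = math.exp(-2.0 * alpha * v)
    return (1.0 - e) / (1.0 + e)

# phi_hts coincides with tanh(alpha * v) at every point:
for v in (-5.0, -0.3, 0.0, 2.0):
    assert abs(phi_hts(v, alpha=0.5) - math.tanh(0.5 * v)) < 1e-12

# phi_bs is monotone, bounded in (0, 1), and passes through 0.5 at v = 0:
assert 0.0 < phi_bs(-10) < phi_bs(0) == 0.5 < phi_bs(10) < 1.0
```

The algebraic identity (1 - e^(-2x)) / (1 + e^(-2x)) = tanh(x) is why the hyperbolic tangent sigmoid outputs values in (-1, 1) rather than (0, 1).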
An Artificial Neuron

Neuron i receives the outputs o1, o2, …, on of other neurons through its synapses, weighted by wi1, wi2, …, win.

Net input signal:

  net_i(t) = Σ_{j=1}^{n} w_ij(t) o_j(t)

Output:

  o_i(t) = f_i(net_i(t))
The Above Model of AN is Called McCulloch-Pitts Neuron

Warren McCulloch and Walter Pitts (1943) introduced this idea of using a neural network for computation.

McCulloch-Pitts Neuron (M-P Unit)
- The neuron produces a binary output (0/1).
- A specific number of inputs must be excited to fire.
- The only delay is the transmission time on the synapse.
- Any nonzero inhibitory input prevents firing.
- Fixed network structure (no learning).
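The M-P unit properties listed above translate directly into code. This is a sketch under the stated assumptions (binary inputs, a hand-chosen threshold, absolute inhibition); the wiring is made up for illustration.

```python
# Sketch of an M-P unit: binary output, a fixed firing threshold, and
# any nonzero inhibitory input vetoing the output ("absolute inhibition").

def mp_unit(excitatory, inhibitory, threshold):
    """Fire (1) iff no inhibitory input is active and enough
    excitatory inputs are excited to reach the threshold."""
    if any(i != 0 for i in inhibitory):
        return 0  # any nonzero inhibitory input prevents firing
    return 1 if sum(excitatory) >= threshold else 0

# A unit that needs at least 2 of its 3 excitatory inputs excited:
assert mp_unit([1, 1, 0], inhibitory=[], threshold=2) == 1
assert mp_unit([1, 0, 0], inhibitory=[], threshold=2) == 0
# Absolute inhibition: one active inhibitory input vetoes everything.
assert mp_unit([1, 1, 1], inhibitory=[1], threshold=2) == 0
```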
Similarities with neurons

Neurons            N.Net
Neuron             Unit
Synapse            Connection
Action potential   Output
Analogue input     Analogue input
Discrete output    Discrete output
Differences

Neurons                              N.Net
Different neurotransmitters          Ignored
Complex calculation of excitation    Simple summation of inputs
--                                   Learning algorithm
Large, complex                       Small, simple
Capabilities of Threshold Neurons

What can threshold neurons do for us?

To keep things simple, let us consider such a neuron with two inputs:

(Figure: inputs o1 and o2 with weights wi1 and wi2 feeding neuron i, which outputs oi.)

The computation of this neuron can be described as the inner product of the two-dimensional vectors o and w_i, followed by a threshold operation.
Capabilities of Threshold Neurons

Let us assume that the threshold θ = 0 and illustrate the function computed by the neuron for sample vectors w_i and o:

(Figure: sample vectors w_i and o plotted against the first and second vector components, with a dotted line dividing the plane.)

Since the inner product is positive for -90° < α < 90°, in this example the neuron's output is 1 for any input vector o to the right of or on the dotted line, and 0 for any other input vector.
Capabilities of Threshold Neurons

By choosing appropriate weights w_i and threshold θ we can place the line dividing the input space into regions of output 0 and output 1 in any position and orientation.

Therefore, our threshold neuron can realize any linearly separable function R^n → {0, 1}.

Although we only looked at two-dimensional input, our findings apply to any dimensionality n.

For example, for n = 3, our neuron can realize any function that divides the three-dimensional input space along a two-dimensional plane (general term: (n-1)-dimensional hyperplane).
Linear Separability

Examples (two dimensions):

  x1  x2 | OR | XOR
  0   0  |  0 |  0
  0   1  |  1 |  1
  1   0  |  1 |  1
  1   1  |  1 |  0

The OR function is linearly separable; the XOR function is linearly inseparable.

This means that a single threshold neuron cannot realize the XOR function.
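The separability claim above can be checked empirically. The sketch below searches a small, illustrative grid of weights and thresholds for a single threshold neuron: OR is found, XOR never is. (The grid is an assumption for the demo; the impossibility for XOR of course holds for all real weights, by the argument in the text, not just for this grid.)

```python
# Search a small grid of (w1, w2, theta) for a single threshold neuron
# computing a given 2-input truth table.

def neuron(w1, w2, theta, x1, x2):
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

def realizable(truth_table):
    grid = [v / 2.0 for v in range(-4, 5)]   # -2.0 ... 2.0 in steps of 0.5
    for w1 in grid:
        for w2 in grid:
            for theta in grid:
                if all(neuron(w1, w2, theta, x1, x2) == t
                       for (x1, x2), t in truth_table.items()):
                    return True
    return False

OR  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

assert realizable(OR) is True     # e.g. w1 = w2 = 1, theta = 0.5
assert realizable(XOR) is False   # no single threshold neuron works
```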
GRAPHICAL INTERPRETATION OF THE OR AND XOR PROBLEMS

OR problem:
(x1,x2) = (1,1), (1,0) and (0,1) give the output 1 (i.e. the red group);
(0,0) produces the output 0 (blue).

XOR problem:
(x1,x2) = (1,0) and (0,1) give the output 1 (i.e. the pink group);
(0,0) and (1,1) produce the output 0 (blue group).
Limitations of perceptrons

A single perceptron is limited to dividing a hyperspace into two regions by a single hyperplane.

A perceptron can only separate linearly separable patterns.
Sample McCulloch-Pitts Networks

(Figure: three example networks, each unit with threshold θ = 2. Left: x1 and x2 feed a unit y with weights 1 and 1, so y fires only when both inputs are excited. Middle: x1 and x2 feed a unit y with weights 2 and 2, so y fires when either input is excited. Right: a two-layer network over x1 … x4 with excitatory weights 2 and inhibitory weights -1 feeding the output unit y.)
Capabilities of Threshold Neurons

What do we do if we need a more complex function?

We can also combine multiple artificial neurons to form networks with increased capabilities.

For example, we can build a two-layer network with any number of neurons in the first layer giving input to a single neuron in the second layer.

The neuron in the second layer could, for example, implement an AND function.
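The two-layer idea above already breaks the single-neuron limit. As a sketch (with hand-chosen, illustrative weights and thresholds), a first layer of OR and NAND threshold neurons feeding a second-layer AND neuron computes XOR, which no single threshold neuron can realize:

```python
# Two-layer threshold network: the first layer carves the input space,
# the second-layer neuron ANDs the results. OR AND NAND = XOR.

def threshold_neuron(weights, theta, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= theta else 0

def two_layer_xor(x1, x2):
    h_or   = threshold_neuron([1, 1],   0.5,  [x1, x2])   # fires unless both 0
    h_nand = threshold_neuron([-1, -1], -1.5, [x1, x2])   # fires unless both 1
    return threshold_neuron([1, 1], 1.5, [h_or, h_nand])  # AND of the two

assert [two_layer_xor(a, b)
        for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]
```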
Non-Linear Classification

Consider a simple, 2-input perceptron.

We can plot its input weights on a 2D graph.

If its output is two-valued (0.0 or 1.0), we can show the output on the graph as an 'x' or an 'o'.

Unlike the linear example, these points can NOT be 'linearly separated', i.e. no straight line can separate them.

We need a non-linear separator, and therefore a multilayer ANN.

(Figure: points marked 'x' and 'o' plotted against Weight 1 and Weight 2, with no straight line separating the two groups.)
Capabilities of Neural Networks

(Figure: a two-layer network; inputs o1, o2 feed several first-layer neurons, whose outputs feed a single second-layer neuron with output oi.)

What kind of function can such a network realize?
Capabilities of Neural Networks

Assume that the dotted lines in the diagram represent the input-dividing lines implemented by the neurons in the first layer:

(Figure: dotted lines in the plane of the 1st and 2nd input components, bounding a polygon.)

Then, for example, the second-layer neuron could output 1 if the input is within a polygon, and 0 otherwise.
Capabilities of Neural Networks

However, we still may want to implement functions that are more complex than that.

An obvious idea is to extend our network even further.

Let us build a network that has three layers, with arbitrary numbers of neurons in the first and second layers and one neuron in the third layer.

The first and second layers are completely connected, that is, each neuron in the first layer sends its output to every neuron in the second layer.
Capabilities of Neural Networks

(Figure: a feed-forward, fully connected three-layer ANN; inputs o1, o2 feed the first layer, which feeds the second layer, which feeds a single third-layer neuron with output oi.)

What type of function can a three-layer network realize?
Capabilities of Neural Networks

Assume that the polygons in the diagram indicate the input regions for which each of the second-layer neurons yields output 1:

(Figure: several polygons in the plane of the 1st and 2nd input components.)

Then, for example, the third-layer neuron could output 1 if the input is within any of the polygons, and 0 otherwise.
Capabilities of Neural Networks

The more neurons there are in the first layer, the more vertices the polygons can have.

With a sufficient number of first-layer neurons, the polygons can approximate any given shape.

The more neurons there are in the second layer, the more of these polygons can be combined to form the output function of the network.

With a sufficient number of neurons and appropriate weight vectors w_i, a three-layer network of threshold neurons can realize any (!) function R^n → {0, 1}.
Terminology

Usually, we draw neural networks in such a way that the input enters at the bottom and the output is generated at the top. Arrows indicate the direction of data flow.

The first layer, termed input layer, just contains the input vector and does not perform any computations.

The second layer, termed hidden layer, receives input from the input layer and sends its output to the output layer.

After applying their activation function, the neurons in the output layer contain the output vector.
Terminology

Example: Network function f: R^3 → {0, 1}^2

(Figure: a three-component input vector enters the input layer; the hidden layer feeds the output layer, which produces a two-component output vector.)
Multilayer Network Architectures

1. Feed Forward

(Figure: input units → hidden units → output units; the signal flows forward while the error flows backward.)
Multilayer Network Architectures

2. Feed Back Network

Example: the Hopfield network, a network with no self-feedback.

A feedback network is also called a recurrent network.

(Figure: three units i1, i2, i3 with mutual feedback connections.)
Multilayer Network Architectures

3. Lattice Structure

Single- or multi-dimensional lattice structures are possible.
Linear Neurons

Obviously, the fact that threshold units can only output the values 0 and 1 restricts their applicability to certain problems.

We can overcome this limitation by eliminating the threshold and simply turning f_i into the identity function so that we get:

  o_i(t) = net_i(t)

With this kind of neuron, we can build networks with m input neurons and n output neurons that compute a function f: R^m → R^n.
Linear Neurons

Linear neurons are quite popular and useful for applications such as interpolation.

However, they have a serious limitation: Each neuron computes a linear function, and therefore the overall network function f: R^m → R^n is also linear.

This means that if an input vector x results in an output vector y, then for any factor φ the input φ·x will result in the output φ·y.

Obviously, many interesting functions cannot be realized by networks of linear neurons.
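The linearity argument above can be verified numerically. The sketch below stacks two layers of identity-activation neurons (the weight matrices are made-up examples) and checks the homogeneity property φ·x → φ·y:

```python
# With identity activations, two stacked weight layers form one linear map,
# so scaling the input by phi scales the output by phi.

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

W1 = [[1.0, 2.0], [0.5, -1.0]]   # first-layer weights (illustrative)
W2 = [[2.0, 0.0], [1.0, 3.0]]    # second-layer weights (illustrative)

def linear_net(x):
    return matvec(W2, matvec(W1, x))

x = [0.4, -1.2]
y = linear_net(x)

phi = 3.0
y_scaled = linear_net([phi * xi for xi in x])

# Homogeneity: input phi*x gives output phi*y.
assert all(abs(ys - phi * yi) < 1e-9 for ys, yi in zip(y_scaled, y))
```

This is exactly why stacking linear layers buys nothing: the composition W2·W1 is itself a single matrix, so a multi-layer linear network is no more expressive than a one-layer one.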
Sigmoidal Neurons

Sigmoidal neurons accept any vectors of real numbers as input, and they output a real number between 0 and 1.

Sigmoidal neurons are the most common type of artificial neuron, especially in learning networks.

A network of sigmoidal units with m input neurons and n output neurons realizes a network function f: R^m → (0,1)^n.
Sigmoidal Neurons

  f_i(net_i(t)) = 1 / (1 + e^(-(net_i(t) - θ)/τ))

(Figure: f_i(net_i(t)) plotted from 0 to 1 for net_i(t) in [-1, 1], for τ = 1 and τ = 0.1.)

The parameter τ controls the slope of the sigmoid function, while the parameter θ controls the horizontal offset of the function in a way similar to the threshold neurons.
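The roles of τ and θ described above can be checked directly. This is a sketch with illustrative parameter values:

```python
import math

# The sigmoidal neuron's activation from the slide:
#   f_i(net_i) = 1 / (1 + e^(-(net_i - theta)/tau))
# tau controls the slope, theta the horizontal offset.

def f(net, theta=0.0, tau=1.0):
    return 1.0 / (1.0 + math.exp(-(net - theta) / tau))

# Smaller tau -> steeper transition, so near the midpoint the tau = 0.1
# curve is already much higher than the tau = 1 curve:
assert f(0.1, tau=0.1) > f(0.1, tau=1.0)

# theta shifts the curve horizontally: the output is exactly 0.5
# where net_i equals theta.
assert abs(f(0.7, theta=0.7) - 0.5) < 1e-12
```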
Learning in ANNs

In supervised learning, we train an ANN with a set of vector pairs, so-called exemplars.

Each pair (x, y) consists of an input vector x and a corresponding output vector y.

Whenever the network receives input x, we would like it to provide output y.

The exemplars thus describe the function that we want to "teach" our network.

Besides learning the exemplars, we would like our network to generalize, that is, give plausible output for inputs that the network had not been trained with.
Learning in ANNs

There is a tradeoff between a network's ability to precisely learn the given exemplars and its ability to generalize (i.e., inter- and extrapolate).

This problem is similar to fitting a function to a given set of data points.

Let us assume that you want to find a fitting function f: R → R for a set of three data points.

You try to do this with polynomials of degree one (a straight line), two, and nine.
Learning in ANNs

(Figure: f(x) plotted against x for three data points, fitted by polynomials of degree 1, 2, and 9.)

Obviously, the polynomial of degree 2 provides the most plausible fit.
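The fitting comparison above can be reproduced with pencil-and-paper formulas: a degree-1 least-squares line leaves a residual on three points, while the unique degree-2 interpolating polynomial passes through all of them exactly. The three data points below are made up for illustration:

```python
# Three sample data points (illustrative).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 2.0, 1.5]

# Degree 1: closed-form least squares for slope a and intercept b.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
     / sum((x - mx) ** 2 for x in xs))
b = my - a * mx

# Degree 2: the Lagrange interpolating polynomial through all three points.
def lagrange(x):
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for m, xm in enumerate(xs):
            if m != j:
                term *= (x - xm) / (xj - xm)
        total += term
    return total

line_error = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))
quad_error = sum((lagrange(x) - y) ** 2 for x, y in zip(xs, ys))

# The quadratic fits the exemplars exactly; the line does not.
assert quad_error < 1e-18 < line_error
```

This mirrors the slide's point: more degrees of freedom fit the exemplars better, but (as with the degree-9 polynomial in the figure) a fit that is *too* flexible generalizes implausibly between and beyond the data.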
LEARNING / TRAINING IN ANN

# Learning is a relatively permanent change brought out in behavior due to experience.

Learning processes can be organized as follows:
- Learning algorithms (rules): error correction, Boltzmann learning, Hebbian learning, competitive learning.
- Learning paradigms: supervised (with a teacher), unsupervised (without a teacher), reinforcement.
Learning in ANNs

The same principle applies to ANNs:

• If an ANN has too few neurons, it may not have enough degrees of freedom to precisely approximate the desired function.

• If an ANN has too many neurons, it will learn the exemplars perfectly, but its additional degrees of freedom may cause it to show implausible behavior for untrained inputs; it then exhibits poor generalization ability.

Unfortunately, there are no known equations that could tell you the optimal size of your network for a given application; you always have to experiment.
Training Data

(Figure: the training loop. The input is fed to the network, which produces an output; the output is compared with the target (desired output) to form an error; an objective function of this error drives the training algorithm (an optimization method), which performs the weight updation.)
Neural networks as cognition

Input units = sensory input; hidden units = processing; output units = behaviour.

Pattern of activation = mental state.
Changing weights = learning.
Pattern of weights = long-term memory.
Reinstatement of pattern of activation = retrieval.
Learning

Hard-wired
– Weights specified by designer

Error correction / Supervised
– Responses compared to target output and weights adjusted accordingly
  • Back propagation

Hebbian Learning
Hebbian Learning

If 2 units are simultaneously activated, increase the strength of the connection between them.

(Figure: a visual input unit and an auditory input unit are both activated by "CAT", so the connection between them is strengthened.)
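The rule above ("units that fire together wire together") can be sketched as a weight update proportional to the co-activation of the two units. The update form Δw = η·x·y is the usual minimal Hebbian rule; the learning rate and activations here are assumptions for illustration:

```python
# Minimal Hebbian rule: strengthen the connection in proportion to the
# simultaneous activation of the two units, delta_w = eta * x * y.

def hebbian_update(w, x, y, eta=0.1):
    """Return the new weight after one Hebbian step."""
    return w + eta * x * y

w = 0.0
# Both units active together (e.g. "CAT" is seen and heard): weight grows.
for _ in range(5):
    w = hebbian_update(w, x=1.0, y=1.0)
assert abs(w - 0.5) < 1e-12

# If either unit is inactive, the connection is unchanged.
assert hebbian_update(0.5, x=1.0, y=0.0) == 0.5
```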
Supervised Learning

A teacher provides correct responses for given inputs during a training process.

Error feedback is created by comparing desired responses to actual responses to simulate a teacher.

– The learning process can be viewed as moving a point on an error surface.
– If gradient information is available, the search can move in the direction of the steepest descent of the gradient.
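The "moving a point on an error surface" picture above can be made concrete with a toy one-dimensional surface. The surface E(w) = (w - 3)^2 and the step size are assumptions chosen for the demo:

```python
# Steepest descent on a toy error surface E(w) = (w - 3)^2,
# whose gradient is dE/dw = 2 * (w - 3).

def grad_E(w):
    return 2.0 * (w - 3.0)

w = 0.0                      # starting point on the error surface
for _ in range(100):
    w -= 0.1 * grad_E(w)     # step in the direction of steepest descent

# The point has slid down the surface to the minimum at w = 3.
assert abs(w - 3.0) < 1e-6
```

Each step multiplies the distance to the minimum by 0.8, so the point converges geometrically; real error surfaces are high-dimensional and non-convex, but the same descent idea applies.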
Unsupervised Learning

Reinforcement Learning
– Receives a reinforcement signal from the environment that is converted to rewards and punishments in the system using a heuristic.
– There is no detailed information to guide the system for intermediate steps.
– Temporal credit assignment problem.

Unsupervised Learning
– Uses a problem-independent performance measure.
– The system must extract statistical information to form internal representations.
– Competitive learning is an example.
SUPERVISED Vs UNSUPERVISED LEARNING

How to classify the given patterns/objects?

2 ways: which is correct?

(Figure: the same set of patterns grouped into classes C1 and C2 in two different ways.)
Applications of learning

• Pattern recognition
• Function approximation
• Pattern association
• Control
• Filtering
Pattern Association
Autoassociation
– Learn a set of patterns.
– Given a corrupted pattern, retrieve the full pattern.
Heteroassociation
– Learn a set of pattern pairs.
– Given a pattern, retrieve the associated pattern.
[Diagram: a Pattern Associator operating in a "storage phase" and then a "recall phase"]
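Autoassociation can be illustrated with a Hopfield-style network: a Hebbian outer-product rule implements the storage phase, and iterated threshold updates implement the recall phase. This is a sketch under assumed bipolar (±1) patterns, not the lecture's own formulation.

```python
import numpy as np

def hopfield_train(patterns):
    """Storage phase: Hebbian outer-product rule over bipolar patterns."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)  # no self-connections
    return W

def hopfield_recall(W, x, steps=10):
    """Recall phase: repeatedly threshold the weighted sums until stable."""
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1, -1)
    return x

# Store one bipolar pattern, then recover it from a corrupted copy.
stored = np.array([[1, -1, 1, -1, 1, -1, 1, -1]])
W = hopfield_train(stored)
corrupted = stored[0].copy()
corrupted[0] = -1            # flip one bit to corrupt the pattern
recalled = hopfield_recall(W, corrupted)
```

For heteroassociation the same outer-product idea applies, but the rule stores pairs (W += outer(q, p)) so that presenting p retrieves its associate q.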
Pattern Recognition
[Diagram: pattern → Feature Extractor → Classifier → label; the feature extractor can be an unsupervised neural network and the classifier a supervised neural network]
Function Approximation
Given a set of I/O pairs from an unknown function f of the form {xi, f(xi)},
train a network to compute a function F(·) such that
|| F(x) - f(x) || < ε, where ε is an error bound.
– System identification: train a system to model I/O pairs.
– Inverse system: train a system that generates the input for a given output.
[Diagram: for system identification, the unknown function and the neural network receive the same input xi; the difference between the desired response di and the network output yi gives the error ei used for training. The inverse-system configuration instead trains the network to recover xi from the function's output.]
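A minimal sketch of the error-bound criterion above. For brevity the "network" here is a polynomial model fitted by least squares rather than a trained multilayer network; the choice f = sin, the sample grid, and the value of ε are all illustrative assumptions.

```python
import numpy as np

# Approximate an unknown function f from samples {xi, f(xi)} so that
# |F(x) - f(x)| < eps on the training points. F is a degree-5
# polynomial fitted by least squares, standing in for a trained network.
f = np.sin
x = np.linspace(0, np.pi, 50)
d = f(x)                                   # desired responses f(xi)

X = np.vander(x, 6)                        # polynomial features x^5 .. x^0
coef, *_ = np.linalg.lstsq(X, d, rcond=None)
F = lambda t: np.vander(np.atleast_1d(t), 6) @ coef

eps = 1e-2
max_err = np.max(np.abs(F(x) - d))         # || F(x) - f(x) || on the samples
```

The fit easily satisfies max_err < ε here; a neural network would reach a similar bound by iteratively reducing the error ei = di - yi shown in the diagram.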
Control and Filtering
Control involves the development of a neural network to model a "plant". The simulated plant is then used to estimate a Jacobian matrix that can be used in an error-correcting learning algorithm.
Filtering involves the extraction of selected information from an input. This includes applications such as smoothing and prediction.
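The prediction side of filtering can be sketched with an adaptive linear filter trained by the LMS error-correction rule (an assumed choice; the lecture does not name a specific algorithm). The test signal and step size below are illustrative.

```python
import numpy as np

# Filtering as prediction: an adaptive linear filter learns, via the
# LMS error-correcting rule, to predict the next sample of a signal
# from its two previous samples.
signal = np.sin(0.2 * np.arange(2000))
w = np.zeros(2)   # filter weights
mu = 0.2          # LMS step size
errors = []
for t in range(2, len(signal)):
    x = signal[t-2:t]        # two past samples as filter input
    y = w @ x                # predicted next sample
    e = signal[t] - y        # prediction error
    w += mu * e * x          # error-correcting weight update
    errors.append(abs(e))
# After adaptation, the prediction error becomes small.
```

The same error-feedback loop appears in the control setting, where the modelled plant's Jacobian supplies the gradient information for the weight updates.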
How is an ANN Materialized?
Software: MATLAB, C programming, and other suitable platforms.
Hardware: for example, the Intel ETANN 80170 chip, with 4 K of on-chip weight memory, a neuron function unit computing U = W^T X, a barrel shifter, and input/output stages.
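The neuron function unit's computation U = W^T X can be illustrated in software with a single threshold neuron. The AND-gate wiring below (weights and threshold values) is an illustrative assumption, not a description of the ETANN chip.

```python
import numpy as np

def neuron(x, w, theta=0.0):
    """A single artificial neuron: weighted sum u = w^T x
    followed by a hard threshold activation."""
    u = w @ x                     # neuron function unit: U = W^T X
    return 1 if u >= theta else 0

# Example: a 2-input threshold neuron wired as a logical AND gate.
w = np.array([1.0, 1.0])
print(neuron(np.array([1, 1]), w, theta=1.5))  # -> 1
print(neuron(np.array([1, 0]), w, theta=1.5))  # -> 0
```

A hardware implementation performs the same inner product in parallel across its stored weights, which is where the speed advantage over sequential software comes from.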
Applications of ANN
IMAGE PROCESSING & COMPUTER VISION
SIGNAL PROCESSING
PATTERN RECOGNITION
MEDICINE / BIOMEDICAL ENGINEERING
MILITARY SYSTEMS
FINANCIAL SYSTEMS
PLANNING, CONTROL & RESEARCH
ARTIFICIAL INTELLIGENCE
POWER SYSTEMS
REFERENCES
1. Simon Haykin, Neural Networks: A Comprehensive Foundation, Pearson India.
2. Robert J. Schalkoff, Artificial Neural Networks, McGraw-Hill Publishing Co.
3. Stamatios V. Kartalopoulos, Prentice Hall India Ltd.
4. Jacek M. Zurada, Introduction to Artificial Neural Systems, Jaico Publishers.
5. Hagan, Demuth, Beale, Neural Network Design, Vikas Publishing House (Thomson Learning).
6. B. Yegnanarayana, Artificial Neural Networks, Prentice-Hall of India Private Limited.
7. Fredric M. Ham & Ivica Kostanic, Principles of Neurocomputing for Science and Engineering, Tata McGraw-Hill Publishing Co.
THANK YOU