Introduction to Neural Networks
CS405
What are connectionist neural
networks?
• Connectionism refers to a computer modeling
approach to computation that is loosely based
upon the architecture of the brain.
• Many different models, but all include:
– Multiple, individual “nodes” or “units” that operate at
the same time (in parallel)
– A network that connects the nodes together
– Information is stored in a distributed fashion among the
links that connect the nodes
– Learning can occur with gradual changes in connection
strength
Neural Network History
• History traces back to the 50’s but became popular in the 80’s
with work by Rumelhart, Hinton, and McClelland
– A General Framework for Parallel Distributed Processing in Parallel
Distributed Processing: Explorations in the Microstructure of
Cognition
• Peaked in the 90’s. Today:
– Hundreds of variants
– Less a model of the actual brain than a useful tool, but still some
debate
• Numerous applications
– Handwriting, face, speech recognition
– Vehicles that drive themselves
– Models of reading, sentence production, dreaming
• Debate for philosophers and cognitive scientists
– Can human consciousness or cognitive abilities be explained by a
connectionist model or does it require the manipulation of symbols?
Comparison of Brains and Traditional
Computers
Brain:
• 200 billion neurons, 32 trillion synapses
• Element size: 10⁻⁶ m
• Energy use: 25 W
• Processing speed: ~100 Hz
• Parallel, distributed
• Fault tolerant
• Learns: yes
• Intelligent/conscious: usually

Traditional computer:
• 1 billion bytes of RAM but trillions of bytes on disk
• Element size: 10⁻⁹ m
• Energy use: 30-90 W (CPU)
• Processing speed: 10⁹ Hz
• Serial, centralized
• Generally not fault tolerant
• Learns: some
• Intelligent/conscious: generally no
Biological Inspiration
“My brain: It's my second favorite organ.”
- Woody Allen, from the movie Sleeper
Idea: to make the computer more robust, intelligent, and able to learn,
let’s model our computer software (and/or hardware) after the brain
Neurons in the Brain
• Although heterogeneous, at a low level
the brain is composed of neurons
– A neuron receives input from other neurons
(generally thousands) through its synapses
– Inputs are approximately summed
– When the input exceeds a threshold the neuron
sends an electrical spike that travels from the
body, down the axon, to the next neuron(s)
Learning in the Brain
• Brains learn
– Altering the strength of connections between neurons
– Creating/deleting connections
• Hebb’s Postulate (Hebbian Learning)
– When an axon of cell A is near enough to excite a cell B and repeatedly
or persistently takes part in firing it, some growth process or metabolic
change takes place in one or both cells such that A's efficiency, as one
of the cells firing B, is increased.
• Long Term Potentiation (LTP)
– Cellular basis for learning and memory
– LTP is the long-lasting strengthening of the connection between two
nerve cells in response to stimulation
– Discovered in many regions of the cortex
Perceptrons
• Initial proposal of connectionist networks
• Rosenblatt, 50’s and 60’s
• Essentially a linear discriminant composed of
nodes, weights
[Diagram: inputs I1, I2, I3 connect through weights W1, W2, W3 to a single unit with threshold θ and output O of 0 or 1.]

Activation Function:
O = 1 if ∑_i w_i I_i + θ > 0, 0 otherwise
Perceptron Example
Inputs: I1 = 2, I2 = 1; weights: W1 = 0.5, W2 = 0.3; θ = -1
2(0.5) + 1(0.3) + (-1) = 0.3 > 0, so O = 1
Learning Procedure:
Randomly assign weights (between 0-1)
Present inputs from training data
Get output O, nudge the weights to move the results toward our
desired output T
Repeat; stop when no errors, or enough epochs completed
Perceptron Training
w_i(t+1) = w_i(t) + Δw_i(t)
Δw_i(t) = (T − O) I_i
Weights include the threshold. T = desired output, O = actual output.
Example: T=0, O=1, W1=0.5, W2=0.3, I1=2, I2=1, θ=-1
w_1(t+1) = 0.5 + (0−1)(2) = −1.5
w_2(t+1) = 0.3 + (0−1)(1) = −0.7
w_θ(t+1) = −1 + (0−1)(1) = −2
If we present this input again, we’d output 0 instead
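A minimal sketch of this training rule in Python (the step activation, initial weights, and stopping condition follow the slides; the function names and the single-example data are illustrative, not from the lecture):

```python
# Minimal perceptron trained with the rule w_i(t+1) = w_i(t) + (T - O) * I_i.
# The threshold theta is treated as an extra weight with a constant input of 1.

def step(net):
    """Step activation: 1 if the weighted sum exceeds 0, else 0."""
    return 1 if net > 0 else 0

def train_perceptron(examples, epochs=20):
    # weights[:-1] are the input weights, weights[-1] is theta (the bias),
    # initialized to the values used in the slide example.
    weights = [0.5, 0.3, -1.0]
    for _ in range(epochs):
        errors = 0
        for inputs, target in examples:
            net = sum(w * x for w, x in zip(weights, inputs + [1.0]))
            output = step(net)
            if output != target:
                errors += 1
                for i, x in enumerate(inputs + [1.0]):
                    weights[i] += (target - output) * x
        if errors == 0:          # stop once every example is classified correctly
            break
    return weights

# The single example from the slide: I = (2, 1), desired output T = 0.
print(train_perceptron([([2.0, 1.0], 0)]))   # -> [-1.5, -0.7, -2.0]
```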
How might you use a perceptron
network?
• This (and other networks) are generally used to learn how
to make classifications
• Say you have collected some data regarding the diagnosis
of patients with heart disease
– Age, Sex, Chest Pain Type, Resting BPS, Cholesterol, …,
Diagnosis (<50% diameter narrowing, >50% diameter narrowing)
– 67,1,4,120,229,…, 1
– 37,1,3,130,250,… ,0
– 41,0,2,130,204,… ,0
• Train network to predict heart disease of new patient
Perceptrons
• Can add learning rate to speed up the learning process; just
multiply in with delta computation
• Essentially a linear discriminant
• Perceptron theorem: If a linear discriminant exists that can
separate the classes without error, the training procedure is
guaranteed to find that line or plane.
[Figure: two clusters labeled Class1 and Class2 separated by a line.]
[Figure: the four XOR points on a unit square, with (0,1) and (1,0) in one class and (0,0) and (1,1) in the other.]
XOR Problem: Not Linearly Separable!
Exclusive Or (XOR) Problem
Input: 0,0 Output: 0
Input: 0,1 Output: 1
Input: 1,0 Output: 1
Input: 1,1 Output: 0
We could, however, construct multiple layers of perceptrons to get around
this problem. A typical multi-layered system minimizes the LMS error.
LMS Learning
LMS = Least Mean Square learning systems, more general than the
previous perceptron learning rule. The concept is to minimize the total
error, as measured over all training examples P. O is the raw output,
as calculated by ∑_i w_i I_i + θ.

Distance_LMS = (1/2) ∑_P (T_P − O_P)²

E.g. if we have two patterns and T1=1, O1=0.8, T2=0, O2=0.5 then
D = (0.5)[(1−0.8)² + (0−0.5)²] = 0.145

We want to minimize the LMS error.
[Figure: error E plotted against a weight W, showing a step from W(old) toward W(new) at a rate set by the learning rate c.]
LMS Gradient Descent
• Using LMS, we want to minimize the error. We can do this by finding
the direction on the error surface that most rapidly reduces the error
rate; this is finding the slope of the error function by taking the
derivative. The approach is called gradient descent (similar to hill
climbing).
To compute how much to change the weight for link k:

Δw_k = −c · ∂Error/∂w_k

Chain rule:

∂Error/∂w_k = (∂Error/∂O_j) · (∂O_j/∂w_k)

∂Error/∂O_j = ∂/∂O_j [ (1/2) ∑_P (T_P − O_P)² ] = ∂/∂O_j [ (1/2)(T_j − O_j)² ] = −(T_j − O_j)

We can remove the sum since we are taking the partial derivative with respect to O_j.

With O_j = f(W·I), where f is the activation function:

∂O_j/∂w_k = I_k f′(W·I)

Δw_k = c (T_j − O_j) I_k f′(ActivationFunction)
Activation Function
• To apply the LMS learning rule, also
known as the delta rule, we need a
differentiable activation function.
Δw_k = c I_k (T_j − O_j) f′(ActivationFunction)

Old (step function):
O = 1 if ∑_i w_i I_i + θ > 0, 0 otherwise

New (differentiable sigmoid):
O = 1 / (1 + e^−(∑_i w_i I_i + θ))
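As a small sketch, the sigmoid and the delta rule for a single unit might look like this in Python (the learning rate, inputs, and target below are made up for illustration):

```python
import math

def sigmoid(net):
    """Differentiable activation: O = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def delta_rule_step(weights, theta, inputs, target, c=0.5):
    """One delta-rule update: dw_k = c * I_k * (T - O) * f'(net), where f'(net) = O(1-O)."""
    net = sum(w * x for w, x in zip(weights, inputs)) + theta
    output = sigmoid(net)
    grad = (target - output) * output * (1.0 - output)    # (T - O) f'(net)
    new_weights = [w + c * x * grad for w, x in zip(weights, inputs)]
    new_theta = theta + c * grad                           # bias input is constant 1
    return new_weights, new_theta, output

# Illustrative call: two inputs and a target of 0, as in the earlier perceptron example.
w, th, o = delta_rule_step([0.5, 0.3], -1.0, [2.0, 1.0], target=0.0)
print(w, th, o)
```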
LMS vs. Limiting Threshold
• With the new sigmoidal function that is differentiable, we
can apply the delta rule toward learning.
• Perceptron Method
– Forces the output to 0 or 1, while LMS uses the net (raw) output
– Guaranteed to separate the classes if a zero-error linear separation exists
• Otherwise it may not converge
• Gradient Descent Method:
– May oscillate and not converge
– May converge to wrong answer
– Will converge to some minimum even if the classes are not
linearly separable, unlike the earlier perceptron training method
Backpropagation Networks
• Attributed to Rumelhart and McClelland, late 70’s
• To bypass the linear classification problem, we can
construct multilayer networks. Typically we have fully
connected, feedforward networks.
[Diagram: a fully connected feedforward network with input units I1, I2, I3, hidden units H1, H2, and output units O1, O2. Weights W_i,j connect the input layer to the hidden layer and W_j,k connect the hidden layer to the output layer; a constant input of 1 to each layer supplies the bias.]

H_x = 1 / (1 + e^−(∑_i w_x,i I_i))

O_x = 1 / (1 + e^−(∑_j w_x,j H_j))
Backprop - Learning
Learning Procedure:
Randomly assign weights (between 0-1)
Present inputs from training data, propagate to outputs
Compute outputs O, adjust weights according to the delta
rule, backpropagating the errors. The weights will be
nudged closer so that the network learns to give the
desired output.
Repeat; stop when no errors, or enough epochs completed
Backprop - Modifying Weights
We had computed:

Δw_k = c I_k (T_j − O_j) f′(ActivationFunction), with f(sum) = 1 / (1 + e^−sum)

For the output unit k, f(sum) = O_k, so this becomes

Δw_k = c I_k (T_j − O_j) f(sum)(1 − f(sum))

and for the output-layer weights:

Δw_j,k = c H_j (T_k − O_k) O_k (1 − O_k)

[Diagram: layers I → H → O, with weights W_i,j from inputs to hidden units and W_j,k from hidden to output units.]

For the Hidden units (skipping some math), this is:

Δw_i,j = c I_i H_j (1 − H_j) ∑_k (T_k − O_k) O_k (1 − O_k) w_j,k
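A compact sketch of these update rules for a network with one hidden layer, trained on XOR (Python; the hidden-layer size, learning rate, epoch count, and random seed are illustrative choices, not values from the slides):

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_ih, w_ho):
    """Forward pass; the last entry of each weight row is the bias (constant input of 1)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1]) for row in w_ih]
    outputs = [sigmoid(sum(w * h for w, h in zip(row[:-1], hidden)) + row[-1]) for row in w_ho]
    return hidden, outputs

def train_xor(epochs=20000, c=0.5, n_hidden=3, seed=1):
    random.seed(seed)
    w_ih = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]      # 2 inputs + bias
    w_ho = [[random.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(1)]  # hidden + bias
    data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
    for _ in range(epochs):
        for inputs, targets in data:
            hidden, outputs = forward(inputs, w_ih, w_ho)
            # Output deltas: (T_k - O_k) * O_k * (1 - O_k)
            d_out = [(t - o) * o * (1 - o) for t, o in zip(targets, outputs)]
            # Hidden deltas: H_j * (1 - H_j) * sum_k d_out_k * w_j,k
            d_hid = [h * (1 - h) * sum(d_out[k] * w_ho[k][j] for k in range(len(d_out)))
                     for j, h in enumerate(hidden)]
            for k, row in enumerate(w_ho):               # dW_j,k = c * H_j * d_out_k
                for j in range(len(hidden)):
                    row[j] += c * hidden[j] * d_out[k]
                row[-1] += c * d_out[k]
            for j, row in enumerate(w_ih):               # dW_i,j = c * I_i * d_hid_j
                for i in range(len(inputs)):
                    row[i] += c * inputs[i] * d_hid[j]
                row[-1] += c * d_hid[j]
    return w_ih, w_ho

w_ih, w_ho = train_xor()
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, round(forward(x, w_ih, w_ho)[1][0], 3))
```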
Backprop
• Very powerful - given enough hidden units, the network can approximate
essentially any function.
• Have the same problems of Generalization vs.
Memorization. With too many units, we will tend to
memorize the input and not generalize well. Some schemes
exist to “prune” the neural network.
• Networks require extensive training, many parameters to
fiddle with. Can be extremely slow to train. May also fall
into local minima.
• Inherently parallel algorithm, ideal for multiprocessor
hardware.
• Despite the cons, a very powerful algorithm that has seen
widespread successful deployment.
Backprop Demo
• QwikNet
– Learning XOR, Sin/Cos functions
Unsupervised Learning
• We just discussed a form of supervised learning
– A “teacher” tells the network what the correct output is
based on the input until the network learns the target
concept
• We can also train networks where there is no
teacher. This is called unsupervised learning.
The network learns a prototype based on the
distribution of patterns in the training data. Such
networks allow us to:
– Discover underlying structure of the data
– Encode or compress the data
– Transform the data
Unsupervised Learning – Hopfield
Networks
• A Hopfield network is a type of content-
addressable memory
– Non-linear system with attractor points that
represent concepts
– Given a fuzzy input the system converges to the
nearest attractor
• Possibility of “spurious” attractors that are blends
of multiple stored patterns
• Also possible to have chaotic patterns that never
converge
Standard Binary Hopfield Network
• Recurrent; Every unit is connected to every other unit
• Weights connecting units are symmetrical
– wij = wji
• If the weighted sum of the inputs exceeds a threshold, its
output is 1 otherwise its output is -1
• Units update themselves asynchronously as their inputs
change
[Diagram: four units A, B, C, D, each connected to every other unit by symmetric weights w_AB, w_AC, w_AD, w_BC, w_BD, w_CD.]
Hopfield Memories
• Setting the weights:
– A pattern is a setting of on or off for each unit
– Given a set of Q patterns to store
• For every weight connecting units i and j:
• This is a form of a Hebbian rule which makes the weight
strength proportional to the product of the firing rates of the
two interconnected units
w_ij = ∑_{p=1..Q} x_i^p x_j^p
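A minimal sketch of this storage rule together with the asynchronous ±1 update described on the previous slide (Python; the pattern contents, network size, and update order are illustrative):

```python
import random

def store(patterns):
    """Hebbian storage: w_ij = sum over patterns of x_i * x_j (no self-connections)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for x in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += x[i] * x[j]
    return w

def recall(w, state, steps=100, seed=0):
    """Asynchronous updates: pick a unit, set it to +1 if its weighted input exceeds 0, else -1."""
    random.seed(seed)
    state = list(state)
    for _ in range(steps):
        i = random.randrange(len(state))
        net = sum(w[i][j] * state[j] for j in range(len(state)))
        state[i] = 1 if net > 0 else -1
    return state

patterns = [[1, 1, 1, -1, -1, -1], [-1, -1, 1, 1, 1, -1]]
w = store(patterns)
print(recall(w, [1, 1, -1, -1, -1, -1]))   # a noisy version of the first pattern
```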
Hopfield Network Demo
• http://www.cbu.edu/~pong/ai/hopfield/hopfieldapplet.html
• Properties
– Settles into a minimal energy state
– Storage capacity low, only 13% of number of units
– Can retrieve information even in the presence of noisy
data, similar to associative memory of humans
Unsupervised Learning – Self
Organizing Maps
• Self-organizing maps (SOMs) are a data
visualization technique invented by Professor
Teuvo Kohonen
– Also called Kohonen Networks, Competitive Learning,
Winner-Take-All Learning
– Generally reduces the dimensions of data through the
use of self-organizing neural networks
– Useful for data visualization; humans cannot visualize
high dimensional data so this is often a useful technique
to make sense of large data sets
Basic “Winner Take All” Network
• Two layer network
– Input units, output units, each input unit is connected to each
output unit
[Diagram: input units I1, I2, I3 in the input layer, each connected by weights W_i,j to output units O1 and O2 in the output layer.]
Basic Algorithm
– Initialize Map (randomly assign weights)
– Loop over training examples
• Assign input unit values according to the values in the current
example
• Find the “winner”, i.e. the output unit that most closely
matches the input units, using some distance metric, e.g.
• Modify weights on the winner to more closely match the input
For all output units j = 1 to m and input units i = 1 to n, find the output unit j that minimizes:

∑_{i=1..n} (W_ij − I_i)²

Then update the winner’s weights:

ΔW^{t+1} = c (X_i^t − W^t)

where c is a small positive learning constant
that usually decreases as the learning proceeds
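A small sketch of this winner-take-all loop (Python; the example data, number of output units, and learning-rate schedule are illustrative choices):

```python
import random

def winner(weights, example):
    """Index of the output unit whose weight vector minimizes sum_i (W_ij - I_i)^2."""
    return min(range(len(weights)),
               key=lambda j: sum((w - x) ** 2 for w, x in zip(weights[j], example)))

def train_wta(examples, n_outputs, epochs=50, c=0.2, seed=0):
    random.seed(seed)
    n_inputs = len(examples[0])
    weights = [[random.random() for _ in range(n_inputs)] for _ in range(n_outputs)]
    for epoch in range(epochs):
        rate = c * (1 - epoch / epochs)          # learning constant decreases over time
        for x in examples:
            j = winner(weights, x)
            # Move only the winner's weights toward the input: dW = rate * (X - W)
            weights[j] = [w + rate * (xi - w) for w, xi in zip(weights[j], x)]
    return weights

examples = [[0.1, 0.2], [0.15, 0.1], [0.9, 0.8], [0.85, 0.95]]   # two rough clusters
weights = train_wta(examples, n_outputs=2)
print([winner(weights, x) for x in examples])
```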
Result of Algorithm
• Initially, some output nodes will randomly be a
little closer to some particular type of input
• These nodes become “winners” and the weights
move them even closer to the inputs
• Over time nodes in the output become
representative prototypes for examples in the input
• Note there is no supervised training here
• Classification:
– Given new input, the class is the output node that is the
winner
Typical Usage: 2D Feature Map
• In typical usage the output nodes form a 2D “map”
organized in a grid-like fashion and we update weights in a
neighborhood around the winner
I1
I2
Input Layer
I3
Output Layers
O11 O12 O13 O14 O15
O21 O22 O23 O24 O25
O31 O32 O33 O34 O35
O41 O42 O43 O44 O45
O51 O52 O53 O54 O55
…
Modified Algorithm
– Initialize Map (randomly assign weights)
– Loop over training examples
• Assign input unit values according to the values in the current
example
• Find the “winner”, i.e. the output unit that most closely
matches the input units, using some distance metric, e.g.
• Modify weights on the winner to more closely match the input
• Modify weights in a neighborhood around the winner so
the neighbors on the 2D map also become closer to the
input
– Over time this will tend to cluster similar items closer on the
map
Updating the Neighborhood
• Node O44 is the winner
– Color indicates scaling to update neighbors
Output Layers
O11 O12 O13 O14 O15
O21 O22 O23 O24 O25
O31 O32 O33 O34 O35
O41 O42 O43 O44 O45
O51 O52 O53 O54 O55
ΔW^{t+1} = c (X_i^t − W^t)

c = 1 for the winner, c = 0.75 for its immediate neighbors, and c = 0.5 for the next ring out.

Consider if O42 is the winner for some other input; the two winners then “fight” over claiming O43, O33, O53.
Selecting the Neighborhood
• Typically, a “Sombrero Function” or Gaussian
function is used
• Neighborhood size usually decreases over time to
allow initial “jockeying for position” and then
“fine-tuning” as algorithm proceeds
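A hedged sketch of a Gaussian neighborhood update on a 2D grid of weight vectors (Python; the map size, learning-rate and width schedules, and the toy color data are made-up choices):

```python
import math, random

def gaussian_neighborhood(winner_rc, node_rc, sigma):
    """Scaling factor that falls off with grid distance from the winning node."""
    d2 = (winner_rc[0] - node_rc[0]) ** 2 + (winner_rc[1] - node_rc[1]) ** 2
    return math.exp(-d2 / (2 * sigma ** 2))

def som_epoch(grid, examples, c, sigma):
    """One pass over the data: find the winner, then pull its whole neighborhood toward the input."""
    rows, cols = len(grid), len(grid[0])
    for x in examples:
        wr, wc = min(((r, co) for r in range(rows) for co in range(cols)),
                     key=lambda rc: sum((w - xi) ** 2 for w, xi in zip(grid[rc[0]][rc[1]], x)))
        for r in range(rows):
            for co in range(cols):
                scale = c * gaussian_neighborhood((wr, wc), (r, co), sigma)
                grid[r][co] = [w + scale * (xi - w) for w, xi in zip(grid[r][co], x)]

random.seed(0)
grid = [[[random.random() for _ in range(3)] for _ in range(5)] for _ in range(5)]  # 5x5 map of RGB-like vectors
for epoch in range(100):
    # Both the learning rate and the neighborhood width shrink as training proceeds.
    som_epoch(grid, [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
              c=0.5 * (1 - epoch / 100), sigma=2.0 * (1 - epoch / 100) + 0.1)
print([round(v, 2) for v in grid[0][0]])
```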
Color Example
• http://davis.wpi.edu/~matt/courses/soms/applet.html
Kohonen Network Examples
• Document Map: http://websom.hut.fi/websom/milliondemo/html/root.html
• Poverty Map: http://www.cis.hut.fi/research/som-research/worldmap.html
SOM for Classification
• A generated map can also be used for classification
• Human can assign a class to a data point, or use the
strongest weight as the prototype for the data point
• For a new test case, calculate the winning node and
classify it as the class it is closest to
• Handwriting recognition example: http://fbim.fh-regensburg.de/~saj39122/begrolu/kohonen.html
Psychological and Biological
Considerations of Neural Networks
• Psychological
– Neural network models learn, exhibit some behavior similar to
humans, based loosely on brains
– Create their own algorithms instead of being explicitly programmed
– Operate under noisy data
– Fault tolerant and graceful degradation
– Knowledge is distributed, yet still some localization
• Lashley’s search for engrams
• Biological
– Learning in the visual cortex shortly after birth seems to correlate
with the pattern discrimination that emerges from Kohonen Networks
– Criticisms of the mechanism to update weights; mathematically
driven; feedforward supervised network unrealistic
Connectionism
• What’s hard for neural networks? Activities beyond
recognition, e.g.:
– Variable binding
– Recursion
– Reflection
– Structured representations
• Connectionist and Symbolic Models
– The Central Paradox of Cognition (Smolensky et al., 1992):
– "Formal theories of logical reasoning, grammar, and other higher
mental faculties compel us to think of the mind as a machine for rule-
based manipulation of highly structured arrays of symbols. What we
know of the brain compels us to think of human information
processing in terms of manipulation of a large unstructured set of
numbers, the activity levels of interconnected neurons. Finally, the full
richness of human behavior, both in everyday environments and in the
controlled environments of the psychological laboratory, seems to defy
rule-based description, displaying strong sensitivity to subtle statistical
factors in experience, as well as to structural properties of information.
To solve the Central Paradox of Cognition is to resolve these
contradictions with a unified theory of the organization of the mind, of
the brain, of behavior, and of the environment."
Possible Relationships?
• Symbolic systems implemented via
connectionism
– Possible to create hierarchies of networks with
subnetworks to implement symbolic systems
• Hybrid model
– System consists of two separate components;
low-level tasks via connectionism, high-level
tasks via symbols
Proposed Hierarchical Model
• Jeff Hawkins
• Founder: Palm Computing, Handspring
• Deep interest in the brain all his life
• Book: “On Intelligence”
– Variety of neuroscience research as input
– Includes his own ideas, theories, guesses
– Increasingly accepted view of the brain
The Cortex
• Hawkins’s point of interest in the brain
– “Where the magic happens”
• Hierarchically-arranged in regions
• Communication up the hierarchy
– Regions classify patterns of their inputs
– Regions output a ‘named’ pattern up the hierarchy
• Communication down the hierarchy
– A high-level region has made a prediction
– Alerts lower-level regions what to expect
Hawkins Quotes
“The human cortex is particularly large and
therefore has a massive memory capacity.
It is constantly predicting what you will see,
hear and feel, mostly in ways you are
unconscious of. These predictions are our
thoughts, and when combined with sensory
inputs, they are our perceptions. I call this
view of the brain the memory-prediction
framework of intelligence.”
Hawkins Quotes
“Your brain constantly makes predictions
about the very fabric of the world we live
in, and it does so in a parallel fashion. It
will just as readily detect an odd texture, a
misshapen nose, or an unusual motion. It
isn’t obvious how pervasive these mostly
unconscious predictions are, which is
perhaps why we missed their importance.”
Hawkins Quotes
“Suppose when you are out, I sneak over to
your home and change something about
your door. It could be almost anything. I
could move the knob over by an inch,
change a round knob into a thumb latch, or
turn it from brass to chrome…. When you
come home that day and attempt to open the
door, you will quickly detect that
something is wrong.”
Prediction
• Prediction means that the neurons involved
in sensing your door become active in
advance of them actually receiving sensory
input.
– When the sensory input does arrive, it is
compared with what is expected.
– Two way communication; classification up the
hierarchy, prediction down the hierarchy
Prediction
• Prediction is not limited to patterns of low-level
sensory information like hearing and seeing
• Mountcastle’s principle: we have lots of
different neurons, but they basically do the
same thing (particularly in the neocortex)
– What is true of low-level sensory areas must be true
for all cortical areas. The human brain is more
intelligent than that of other animals because it can
make predictions about more abstract kinds of
patterns and longer temporal pattern sequences.
Visual Hierarchies
• Lowest visual level inputs pixels
• Second level recognizes edges, lines, etc
from known patterns of pixels
• Third level recognizes shapes from known
patterns of edges, lines, etc
• Fourth level recognizes objects from known
patterns of shapes
Layers
Not there yet…
• Many issues remain to be addressed by
Hawkins’ model
– Missing lots of details on how his model could
be implemented in a computer
– Creativity?
– Evolution?
– Planning?
– Rest of the brain, not just neocortex?
Links and Examples
• http://davis.wpi.edu/~matt/courses/soms/applet.html
• http://websom.hut.fi/websom/milliondemo/html/root.html
• http://www.cis.hut.fi/research/som-research/worldmap.html
• http://www.patol.com/java/TSP/index.html
Editor’s Notes
• Recognizing pictures, making decisions, functioning with motor skills in reaction to the environment, etc. The most abstract concepts (the big ideas) sit at the top of the hierarchy, and the features, details, and pixels of images at the bottom. More on the naming shortly.