2. McCulloch-Pitts Neuron
• Introduced by McCulloch and Pitts in 1943
• Usually called the M-P neuron
• Neurons are connected by directed weighted paths
• Activation is binary: the neuron either fires (1) or does not fire (0)
• Has both excitatory connections (positive weights) and inhibitory connections (negative weights)
$$f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} \ge \theta \\ 0 & \text{if } y_{in} < \theta \end{cases}$$
[Figure: M-P neuron Y with n excitatory inputs $x_1 \dots x_n$, each carrying weight $w$, and m inhibitory inputs $x_{n+1} \dots x_{n+m}$, each carrying weight $-p$; the output is $y$]
3. McCulloch-Pitts Neuron
• For absolute inhibition (firing must be prevented even when all n excitatory inputs are active): $\theta > nw - p$
• For firing when at least k excitatory inputs are active: $kw \ge \theta > (k - 1)w$
• No particular training algorithm
• Analysis has to be done to determine the values of the weights and the threshold
• Used to make a neuron perform a simple logic function
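As an illustration (not from the slides), the following minimal Python sketch shows an M-P neuron realizing the logical AND function; the values $w = 1$ and $\theta = 2$ come from exactly the kind of analysis described above.

```python
def mp_neuron(inputs, weights, theta):
    """McCulloch-Pitts neuron: fires (1) iff the net input reaches the threshold."""
    y_in = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_in >= theta else 0

# AND with two excitatory inputs: analysis gives w = 1 for each input, theta = 2
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mp_neuron([x1, x2], [1, 1], theta=2))
```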
4. Linear Separability
• An ANN does not give an exact solution for a nonlinear problem; rather, it provides a possible approximate solution
• Linear separability: separation of the input space into regions based on whether the network response is positive or not
• The decision line (also called the decision boundary or linear-separability line) separates the positive responses from the negative ones
$$y_{in} = b + \sum_{i=1}^{n} x_i w_i$$
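A small Python sketch (illustrative; the weight and bias values are arbitrary) showing how the sign of the net input splits a two-dimensional input space, with the decision line $y_{in} = 0$ given by $x_2 = -(w_1/w_2)x_1 - b/w_2$:

```python
def net_input(x, w, b):
    """Net input y_in = b + sum_i x_i * w_i."""
    return b + sum(xi * wi for xi, wi in zip(x, w))

w, b = [1.0, 1.0], -1.5    # arbitrary example weights and bias

for point in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    side = "positive" if net_input(point, w, b) > 0 else "negative"
    print(point, "->", side)

# Setting y_in = 0 gives the decision line x2 = -(w1/w2) * x1 - b/w2
print("decision line: x2 = %.2f * x1 + %.2f" % (-w[0] / w[1], -b / w[1]))
```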
5. Hebb Network
Hebb Rule: $w_i(\text{new}) = w_i(\text{old}) + x_i y$
The Algorithm:
Step 0 : Initialize the weights. In this network they may simply be set to zero, i.e., $w_i = 0$ for $i = 1$ to $n$
Step 1 : Perform Steps 2 – 4 for each input training vector and target output pair $s : t$
Step 2 : Set the activations of the input units. Generally, the activation function of the input layer is the identity function: $x_i = s_i$ for $i = 1$ to $n$
Step 3 : Set the activation of the output unit: $y = t$
Step 4 : Adjust the weights and bias:
$w_i(\text{new}) = w_i(\text{old}) + x_i y$
$b(\text{new}) = b(\text{old}) + y$
[Flowchart: Start → initialize weights → for each $s : t$ pair: activate input units ($x_i = s_i$), activate output unit ($y = t$), update weights $w_i(\text{new}) = w_i(\text{old}) + x_i y$ and bias $b(\text{new}) = b(\text{old}) + y$ → when no pairs remain, Stop]
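A minimal Python sketch of the Hebb algorithm above, trained on the AND function with bipolar inputs and targets (the training set is a standard illustration, not taken from the slides):

```python
# Hebb rule training for the AND function with bipolar inputs/targets
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]

w = [0.0, 0.0]  # Step 0: initialize weights to zero
b = 0.0

for s, t in samples:           # Step 1: loop over s:t pairs
    x = s                      # Step 2: input activations x_i = s_i
    y = t                      # Step 3: output activation y = t
    for i in range(len(w)):    # Step 4: weight and bias updates
        w[i] += x[i] * y
    b += y

print("weights:", w, "bias:", b)   # AND gives w = [2, 2], b = -2
```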
6. Perceptron Networks
• Also known as the simple perceptron
• Single-layer feed-forward network
• Consists of three units: a sensory unit (input unit), an associator unit (hidden unit) and a response unit (output unit)
• Sensory units are connected to associator units by fixed weights taking values 1, 0 or -1, assigned at random
• A binary activation function is used in the sensory and associator units
• The response unit has an activation of 1, 0 or -1; the binary step with fixed threshold $\theta$ is used as the activation for the associator
• The output signal sent from the associator unit to the response unit is binary
7. Perceptron Networks
• The output of the perceptron network is given by
$y = f(y_{in})$
where $f(y_{in})$ is the activation function, defined as
$$f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} > \theta \\ 0 & \text{if } -\theta \le y_{in} \le \theta \\ -1 & \text{if } y_{in} < -\theta \end{cases}$$
• The perceptron learning rule is used in the weight update between the associator unit and the response unit
• For each training input, the network calculates the response and determines whether an error has occurred
• Error calculation is based on the comparison of the values of targets with those of the calculated outputs
• The weights on the connections from the units that send the nonzero signal will get adjusted suitably
8. Perceptron Networks
If an error has occurred for a particular training pattern, the weights are adjusted based on the learning rule:
$w_i(\text{new}) = w_i(\text{old}) + \alpha t x_i$
$b(\text{new}) = b(\text{old}) + \alpha t$
• If there is no error, no weight update takes place and hence the training process may be stopped
• In the above equation the target value 𝑡 is +1 or −1 and 𝛼 is the learning rate
• The associator unit is found to consist of a set of sub-circuits called feature predicates
• Feature predicates are hardwired to detect the specific feature of a pattern and are equivalent to the feature detectors
• The weights present in the input layers are all fixed, while the weights on the response unit are trainable
9. Original Perceptron Networks
[Figure: original perceptron network. A sensory unit (a sensor grid representing any pattern) connects to associator units through fixed weights taking values 1, 0, -1 at random; the associator outputs $x_1 \dots x_n$ connect to response units $Y_1 \dots Y_m$ through trainable weights $w_{11} \dots w_{nm}$ with thresholds $\theta$; the 0-or-1 outputs $y_1 \dots y_m$ are compared with the desired outputs $t$]
10. Perceptron Learning Rule
The activation function applied over the network input is as follows:
$$f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} > \theta \\ 0 & \text{if } -\theta \le y_{in} \le \theta \\ -1 & \text{if } y_{in} < -\theta \end{cases}$$
The weight update in the perceptron is as follows:
If $y \ne t$, then
$w(\text{new}) = w(\text{old}) + \alpha t x$
else
$w(\text{new}) = w(\text{old})$
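A runnable Python sketch of this training rule (illustrative; the AND training set, learning rate $\alpha = 1$, and threshold $\theta = 0.2$ are assumed values, not from the slides):

```python
# Perceptron training on AND with bipolar inputs and targets
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
alpha, theta = 1.0, 0.2   # assumed learning rate and threshold

def activation(y_in):
    """Bipolar step with a dead zone: returns 1, 0 or -1 depending on theta."""
    if y_in > theta:
        return 1
    if y_in < -theta:
        return -1
    return 0

w, b = [0.0, 0.0], 0.0
changed = True
while changed:                      # repeat epochs until no weight changes
    changed = False
    for x, t in samples:
        y_in = b + sum(xi * wi for xi, wi in zip(x, w))
        y = activation(y_in)
        if y != t:                  # update only when an error occurs
            for i in range(len(w)):
                w[i] += alpha * t * x[i]
            b += alpha * t
            changed = True

print("weights:", w, "bias:", b)    # converges to w = [1, 1], b = -1
```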
12. Flowchart
[Flowchart: Start → initialize weights and bias, set $\alpha$ (0 to 1) → for each $s : t$ pair: activate input units ($x_i = s_i$), calculate net input $y_{in}$, apply activation to obtain $y = f(y_{in})$; if $y \ne t$, set $w_i(\text{new}) = w_i(\text{old}) + \alpha t x_i$ and $b(\text{new}) = b(\text{old}) + \alpha t$, otherwise leave $w_i$ and $b$ unchanged → if any weight changed, repeat; otherwise Stop]
13. Perceptron Network Testing Algorithm
Step 0 : The initial weights to be used here are taken from the training algorithms (the final weights obtained during training).
Step 1 : For each input vector X to be classified, perform steps 2-3.
Step 2 : Set the activations of the input units.
Step 3 : Obtain the response of the output unit
$$y_{in} = \sum_{i=1}^{n} x_i w_i$$
$$y = f(y_{in}) = \begin{cases} 1 & \text{if } y_{in} > \theta \\ 0 & \text{if } -\theta \le y_{in} \le \theta \\ -1 & \text{if } y_{in} < -\theta \end{cases}$$
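Continuing the illustrative training sketch above, testing reduces to a single forward pass with the final weights (bias included, as in training):

```python
# Classify new input vectors with the trained weights from the sketch above
w, b, theta = [1.0, 1.0], -1.0, 0.2   # final values from the training example

def classify(x):
    """Step 3: compute the net input and apply the bipolar step activation."""
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

for x in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    print(x, "->", classify(x))
```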
14. Adaptive Linear Neuron (ADALINE)
• Network with single linear unit
• The input-output relationship is linear
• Uses bipolar activation
• Trained using delta rule, also known as Least Mean Square (LMS) rule or Widrow-Hoff rule
• The learning rule minimizes the mean squared error between the activation and the target value
• Delta rule for adjusting the weight of the ith pattern ($i = 1$ to $n$):
$$\Delta w_i = \alpha (t - y_{in}) x_i$$
where $\Delta w_i$ is the change of weight, $\alpha$ the rate of learning, $(t - y_{in})$ the difference between the target and the net input to the output unit, and $x_i$ the vector of activation of the input unit.
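A minimal Python sketch of delta-rule (LMS) training for an Adaline (illustrative; the AND training set, $\alpha = 0.1$, initial weights, and stopping tolerance are assumptions):

```python
# Adaline trained with the delta (LMS) rule on bipolar AND data
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
alpha = 0.1                      # assumed learning rate

w, b = [0.1, 0.1], 0.1           # small nonzero initial weights (assumed)
for epoch in range(100):
    total_error = 0.0
    for x, t in samples:
        y_in = b + sum(xi * wi for xi, wi in zip(x, w))
        err = t - y_in           # delta rule uses the raw net input, not a step
        for i in range(len(w)):
            w[i] += alpha * err * x[i]
        b += alpha * err
        total_error += err ** 2
    if total_error < 1e-3:       # stop when the squared error is small
        break

print("weights:", w, "bias:", b)  # approaches w = [0.5, 0.5], b = -0.5
```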
19. Multiple Adaptive Linear Neuron
• Multiple adaptive linear neurons (MADALINE): many Adalines in parallel with a single output unit
• The output is based on certain selection rules
• Majority vote rule – the answer is true or false
• AND rule – true if and only if both inputs are true
• Weights from the Adaline layer to the Madaline output unit are fixed, whereas the weights between the input and Adaline layers are adjusted during the training process
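An illustrative Python sketch of a Madaline forward pass using the majority-vote rule (the three hidden Adalines and their weights are arbitrary example values, not a trained network):

```python
# Madaline forward pass: several Adalines in parallel, majority-vote output
def adaline_out(x, w, b):
    """Each Adaline applies a bipolar step to its net input."""
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if y_in >= 0 else -1

# Arbitrary example weights for three hidden Adalines
hidden = [([0.5, 0.5], -0.5), ([1.0, -1.0], 0.0), ([-0.3, 0.8], 0.1)]

def madaline(x):
    votes = [adaline_out(x, w, b) for w, b in hidden]
    return 1 if sum(votes) > 0 else -1   # majority vote over Adaline outputs

print(madaline([1, 1]), madaline([-1, -1]))
```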
23. Back-Propagation Network
One of the most important developments in neural networks (Bryson and Ho, 1969; Werbos, 1974; LeCun, 1985; Parker, 1985; Rumelhart, 1986)
Applied to multilayer feed-forward networks consisting of processing elements with continuous differentiable activation functions
Networks associated with back-propagation learning algorithm are called back-propagation networks (BPNs)
Provides procedure for change of weights in BPN to classify patterns correctly
Training involves a balance between memorization of the training patterns and generalization to new inputs
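A compact Python sketch of back-propagation for a single-hidden-layer network trained on XOR (illustrative; the network size, learning rate, and epoch count are assumptions):

```python
import math, random

random.seed(0)
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))

# XOR training data (binary inputs, binary targets)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

H = 4                                   # assumed hidden layer size
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0
alpha = 0.5                             # assumed learning rate

for epoch in range(5000):
    for x, t in data:
        # forward pass through the hidden and output layers
        h = [sigmoid(sum(w1[j][i] * x[i] for i in range(2)) + b1[j]) for j in range(H)]
        y = sigmoid(sum(w2[j] * h[j] for j in range(H)) + b2)
        # backward pass: deltas use the sigmoid derivative f'(z) = f(z)(1 - f(z))
        dy = (t - y) * y * (1 - y)
        dh = [dy * w2[j] * h[j] * (1 - h[j]) for j in range(H)]
        # weight and bias updates
        for j in range(H):
            w2[j] += alpha * dy * h[j]
            for i in range(2):
                w1[j][i] += alpha * dh[j] * x[i]
            b1[j] += alpha * dh[j]
        b2 += alpha * dy

for x, t in data:
    h = [sigmoid(sum(w1[j][i] * x[i] for i in range(2)) + b1[j]) for j in range(H)]
    y = sigmoid(sum(w2[j] * h[j] for j in range(H)) + b2)
    print(x, t, round(y, 2))            # outputs approach the XOR targets
```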
29. Learning Factors
• Initial Weights
• Learning Rate
• Momentum Factor
• Generalization
• Number of training data
• Number of hidden layers
31. Radial Basis Function Network
• The Radial Basis Function (RBF) network is a classification and function-approximation neural network developed by M.J.D. Powell
• Uses either sigmoidal or Gaussian kernel functions
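An illustrative Python sketch of an RBF forward pass with Gaussian kernels (the centres, widths, and output weights are arbitrary example values; in practice they would be learned or chosen from the data):

```python
import math

def gaussian(x, centre, sigma):
    """Gaussian radial basis function exp(-||x - c||^2 / (2 sigma^2))."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-d2 / (2 * sigma ** 2))

# Arbitrary example centres/widths for two hidden RBF units
centres = [([0.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w = [0.8, -0.4]          # example output weights
b = 0.1

def rbf_net(x):
    phi = [gaussian(x, c, s) for c, s in centres]   # hidden layer responses
    return b + sum(wi * pi for wi, pi in zip(w, phi))

print(rbf_net([0.5, 0.5]))
```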
35. Time Delay Neural Network
• Responds to sequences of patterns
• Produces a particular output sequence in response to a particular sequence of inputs
[Figures: time delay neural network (FIR filter); TDNN with output feedback (IIR filter)]
36. Associative Memory Networks
• Can store a set of patterns as memories
• Content-addressable memories (CAM)
• Data is associated with, and recalled by, content rather than a storage address
38. Auto-associative Memory Network
• Training input and the target output vectors are the same
• Determination of weights of the net is called storing of vectors
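An illustrative Python sketch of storing one bipolar vector in an auto-associative net with the Hebb (outer-product) rule and recalling it (the stored vector is an arbitrary example):

```python
# Auto-associative memory: store s with W = s^T s, recall with a bipolar step
s = [1, 1, -1, -1]                      # example stored vector

n = len(s)
W = [[s[i] * s[j] for j in range(n)] for i in range(n)]  # outer product

def recall(x):
    y_in = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return [1 if v >= 0 else -1 for v in y_in]

print(recall([1, 1, -1, -1]))    # the stored vector recalls itself
print(recall([1, 1, -1, 1]))     # a noisy probe is corrected toward s
```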
42. Heteroassociative Memory Network
• The training input and the target output are different
• The weights are determined such that the net can store a set of pattern associations
• Determination of the weights is done by either the Hebb rule or the delta rule
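A matching sketch for the heteroassociative case using the Hebb rule, where $W = \sum s^T t$ maps input vectors to different target vectors (the example s : t pairs are assumptions):

```python
# Heteroassociative memory: W[i][j] = sum over pairs of s[i] * t[j]
pairs = [([1, -1, 1], [1, -1]), ([-1, 1, -1], [-1, 1])]   # example s:t pairs

n, m = 3, 2
W = [[sum(s[i] * t[j] for s, t in pairs) for j in range(m)] for i in range(n)]

def recall(x):
    y_in = [sum(x[i] * W[i][j] for i in range(n)) for j in range(m)]
    return [1 if v >= 0 else -1 for v in y_in]

print(recall([1, -1, 1]))   # -> [1, -1], the associated target
```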
44. Bidirectional Associative Memory (BAM)
• Developed by Kosko in 1988
• Performs forward and backward associative searches for stored stimulus responses
• A recurrent pattern-matching network that encodes binary or bipolar patterns using the Hebbian rule
• Two types: discrete and continuous
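An illustrative sketch of discrete BAM recall: the same Hebbian weight matrix is used forward (x → y via W) and backward (y → x via Wᵀ). The stored pair is an arbitrary example:

```python
# Discrete BAM: store a bipolar pair (s, t), then search in both directions
s, t = [1, -1, 1, -1], [1, 1, -1]       # example stored association

n, m = len(s), len(t)
W = [[s[i] * t[j] for j in range(m)] for i in range(n)]   # Hebbian encoding

step = lambda v: 1 if v >= 0 else -1

def forward(x):                          # X layer -> Y layer through W
    return [step(sum(x[i] * W[i][j] for i in range(n))) for j in range(m)]

def backward(y):                         # Y layer -> X layer through W^T
    return [step(sum(W[i][j] * y[j] for j in range(m))) for i in range(n)]

print(forward(s))    # -> t
print(backward(t))   # -> s
```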
47. Hopfield Networks
Developed by John J. Hopfield in 1982
Conforms to the asynchronous nature of biological neurons
Promoted the design of the first analog VLSI neural chip
Two types: discrete and continuous
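An illustrative sketch of a discrete Hopfield net: Hebbian storage with a zeroed diagonal and asynchronous unit-by-unit updates, matching the asynchronous behaviour noted above (the stored pattern is an example):

```python
import random
random.seed(1)

s = [1, -1, 1, -1, 1]                    # example stored bipolar pattern
n = len(s)
# Hebbian weights with zero self-connections
W = [[0 if i == j else s[i] * s[j] for j in range(n)] for i in range(n)]

def recall(x, sweeps=5):
    x = list(x)
    for _ in range(sweeps):
        for i in random.sample(range(n), n):   # asynchronous random order
            net = sum(W[i][j] * x[j] for j in range(n))
            x[i] = 1 if net >= 0 else -1
    return x

print(recall([1, -1, -1, -1, 1]))   # noisy probe settles to the stored pattern
```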
52. Iterative Auto-associative Memory Networks
• Also known as recurrent auto-associative networks
• Developed by James Anderson in 1977
• Brain-in-the-box model, an extension of the linear associator model described in 1972
• An activity pattern inside the box receives positive feedback on certain components, which has the effect of forcing it outward
56. Kohonen Self-organizing Feature Maps
Feature mapping converts patterns of arbitrary dimensionality into the response of a one- or two-dimensional array of neurons (i.e., it converts a wide pattern space into a feature space)
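An illustrative Python sketch of Kohonen SOM training in its simplest form, winner-take-all with neighbourhood radius 0 (the map size, learning rate, decay, and data are assumptions):

```python
import random
random.seed(0)

# 1-D map of 4 units, each with a 2-D weight vector (random initial values)
m, dim = 4, 2
W = [[random.random() for _ in range(dim)] for _ in range(m)]
alpha = 0.5                              # assumed initial learning rate

data = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]

for epoch in range(20):
    for x in data:
        # winner: the unit whose weight vector is closest to the input
        d = [sum((x[i] - W[j][i]) ** 2 for i in range(dim)) for j in range(m)]
        win = d.index(min(d))
        # move the winner's weights toward the input
        for i in range(dim):
            W[win][i] += alpha * (x[i] - W[win][i])
    alpha *= 0.9                         # decay the learning rate

print(W)    # winning units have moved toward the two clusters in the data
```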
62. Learning Vector Quantization
A process of classifying patterns in which each output unit represents a particular class
Several units should be used for each class
The output weight vector is called the reference vector or code book vector for the class which the unit represents
A special case of a competitive net
Minimizes misclassification
Used in optical character recognition, speech processing, etc.
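An illustrative sketch of the LVQ1 update: move the winning code book vector toward the input if its class matches, away otherwise (the code book initialization, learning rate, and data are assumptions):

```python
# LVQ1: each output unit holds a code book (reference) vector and a class label
codebook = [([0.0, 0.0], 0), ([1.0, 1.0], 1)]   # (reference vector, class)
alpha = 0.3                                      # assumed learning rate

data = [([0.2, 0.1], 0), ([0.9, 0.8], 1), ([0.1, 0.3], 0), ([0.7, 0.9], 1)]

for epoch in range(10):
    for x, label in data:
        # winner: closest code book vector (squared Euclidean distance)
        d = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c, _ in codebook]
        j = d.index(min(d))
        c, cls = codebook[j]
        sign = 1 if cls == label else -1         # attract if correct, repel if not
        codebook[j] = ([ci + sign * alpha * (xi - ci) for xi, ci in zip(x, c)], cls)

print(codebook)
```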
67. Counter Propagation Networks
Proposed by Hecht-Nielsen in 1987
Based on input, output and clustering layers
Used in data compression, function approximation and pattern association
Constructed from instar-outstar model
80. Adaptive Resonance Theory Network
Developed by Steven Grossberg and Gail Carpenter in 1987
Based on competition
Finds categories autonomously and learns new categories if needed
Two types: ART 1, designed for clustering binary vectors; ART 2, designed to accept continuous-valued vectors
91. References
Rajasekaran, S., & Pai, G. V. (2017). Neural Networks, Fuzzy Systems and Evolutionary Algorithms: Synthesis and Applications. PHI Learning Pvt. Ltd.
Haykin, S. (2010). Neural Networks and Learning Machines, 3/E. Pearson Education India.
Sivanandam, S. N., & Deepa, S. N. (2007). Principles of Soft Computing. John Wiley & Sons.