The slides cover the basic concepts and designs of artificial neural networks. They explain and justify the use of the McCulloch-Pitts model, Adaline network, Perceptron algorithm, Backpropagation algorithm, Hopfield network, and Kohonen network, along with their practical applications.
2. Artificial Neural Network (ANN)
- A neural network is a network of neurons, as found in the brain
- An ANN is a network of artificial neurons that approximately represents parts of the brain
- Artificial neurons are approximations of brain neurons, realized as physical devices or mathematical models
3. Human brain and ANN
- In an ANN, the brain is modelled with the following correspondences:
a) Neuron → soma
b) Input → dendrite
c) Output → axon
d) Weight → synapse
- Neurons encode activations, or outputs, as a series of electrical pulses
- The soma processes incoming activations and converts them into output activations
- Dendrites receive activations from other neurons
- The axon transmits output activations to other neurons
- The junction between an axon and a dendrite is called a synapse
4. Basic Terminologies
Weighting Factor (w)
- The value given to each input to determine its strength
- The neuron computes the weighted sum of its inputs and compares it with a threshold
- If the sum is less than the threshold, the neuron outputs -1; otherwise, the neuron activates and outputs +1 (see the sketch after this slide)
Threshold
- The minimum value of the weighted input sum required to activate the node
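A minimal Python sketch of this weighted-sum-and-threshold rule; the inputs, weights, and threshold below are illustrative values, not taken from the slides:

    # Compare the weighted sum of the inputs against a threshold:
    # the neuron fires (+1) if the sum reaches it, otherwise outputs -1.
    def neuron_output(inputs, weights, threshold):
        weighted_sum = sum(i * w for i, w in zip(inputs, weights))
        return 1 if weighted_sum >= threshold else -1

    print(neuron_output([1, -1], [0.4, 0.6], 0.5))  # sum = -0.2 < 0.5, so -1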
5. Basic Terminologies
Activation Function (f)
- The function that performs a mathematical operation on the output signal
- Common choices: sign function, sigmoid function, step function, linear function
- Sign function : f(x) = +1 if x >= t, else -1
- Step function : f(x) = +1 if x >= t, else 0
- Sigmoid function : f(x) = 1 / (1 + e^(-x))
- Linear function : f(x) = x
10. Applications of Neural Network
1. Financial Modelling - Stock Market Value Prediction
2. Robotics - Automatically Adaptable Robots
3. Data Analysis
4. Predictive Models - Weather Prediction
5. Bioinformatics - DNA Sequencing
11. Learning Process
1. Supervised Learning:
- You have labelled training data
- You classify the inputs into one of the labels
2. Unsupervised Learning:
- You have unlabelled data
- You cluster the inputs into different classes
12. Learning Rate
- A constant that affects the speed of learning
- Affects the performance of the algorithm
- High learning rate : trains faster, but accuracy may suffer because updates can overshoot
- Low learning rate : higher accuracy, but training is slow
- The learning rate should be adjusted as necessary after a complete analysis of the system
13. Activation Function (f)
- Performs mathematical operation on the signal output
- Step Function : f(x) = 1 if x >= T ; 0 otherwise
- Sign Function : f(x) = 1 if x >= T ; -1 otherwise
- Sigmoid Function : f(x) = 1 / (1 + e^(-x))
- Linear Function : f(x) = x
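The four functions above, written out as a small Python sketch (the threshold T is an illustrative value):

    import math

    T = 0.0  # threshold for the step and sign functions (illustrative)

    def step(x):    return 1 if x >= T else 0
    def sign(x):    return 1 if x >= T else -1
    def sigmoid(x): return 1 / (1 + math.exp(-x))
    def linear(x):  return x

    for f in (step, sign, sigmoid, linear):
        print(f.__name__, f(0.5), f(-0.5))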
14. McCulloch-Pitts Model
- Early model of neural network
- Introduced by Warren McCulloch and Walter Pitts in 1943
- Also known as linear threshold gate
- It models a neuron with a set of inputs I1, I2, …, In and one output Y.
- It classifies the set of inputs into two different classes, so the output is binary.
- Mathematically, it is modelled as:
sum = I1 * w1 + I2 * w2 + … + In * wn
Y = f(sum)
where w1, w2, …, wn are weight values in the range (0, 1) or (-1, 1)
17. McCulloch-Pitts Model (Example - AND Gate)
Inputs : I1, I2
Output : Y
Threshold Function :
f(x) = 0 ; x < T
= 1; x >= T
Where, T is the threshold value
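A sketch of this AND gate in Python. The slide leaves the weights and threshold open; w1 = w2 = 1 with T = 2 is one common assignment, assumed here:

    # McCulloch-Pitts neuron: weighted sum followed by the threshold function.
    def mp_neuron(inputs, weights, T):
        s = sum(i * w for i, w in zip(inputs, weights))
        return 1 if s >= T else 0

    for i1 in (0, 1):
        for i2 in (0, 1):
            print(i1, i2, '->', mp_neuron([i1, i2], [1, 1], T=2))
    # Only (1, 1) reaches the threshold of 2, so only it outputs 1.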
18. Adaline Network
- Inputs are -1 or +1
- It produces an output of -1 or +1
- It uses a bias input
- Weights are trained using the delta rule (least mean squares)
- During training, the activation function is the identity function
- After training, the activation function is a threshold function
19. Adaline Network (Algorithm)
1. Initialize weights to small random values and select learning rate (alpha)
2. For each input vector s, with target output t, set input to s.
3. Compute neuron inputs: y_in = b + summation(xi * wi)
4. Use delta rule to update bias and weights.
b(new) = b(old) + alpha * (t - y_in)
wi(new) = wi(old) + alpha * (t - y_in) * xi
5. Repeat from step 2 until the largest weight change across all training samples is less than a specified tolerance value.
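A minimal sketch of steps 1-5 in Python, using the learning rate and tolerance from the AND-gate example that follows; it assumes the data is learnable at this alpha, otherwise the loop would not terminate:

    # Delta-rule (LMS) training for an Adaline.
    def train_adaline(samples, w, b, alpha=0.1, tolerance=0.1):
        while True:
            largest_change = 0.0
            for x, t in samples:                                  # step 2
                y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # step 3
                delta = alpha * (t - y_in)                        # step 4
                b += delta
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                largest_change = max(largest_change, abs(delta),
                                     *(abs(delta * xi) for xi in x))
            if largest_change < tolerance:                        # step 5
                return w, b

    # Bipolar AND-gate data as ((I1, I2), target), with the slides' initial weights.
    samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
    w, b = train_adaline(samples, w=[0.2, 0.3], b=0.1)
    print(w, b)  # close to the slides' solution; the slides round intermediate results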
20. Adaline Network (Realization of AND Gate)
Assume: learning rate = 0.1
Tolerance = 0.1
Assume a two-input AND gate, which is true only if both inputs are true.
So, inputs are I1, I2 and Bias(B)
Activation Function:
Y = 1 ; y_in >= 0
= -1 ; y_in < 0
Initialization:
Assign small random weights to each input:
w1 = 0.2 w2 = 0.3 b = 0.1
22. Adaline Network (Realization of AND Gate)
First Cycle:
# Run 2
Inputs = [1, 1, -1] (listed as [bias, I1, I2]; the bipolar target here is t = -1)
Y_in = 0.04
b(new) = 0.04
w1(new) = 0.14
w2(new) = 0.44
So,
Largest weight change = 0.1
23. Adaline Network (Realization of AND Gate)
First Cycle:
# Run 3
Inputs = [1, -1, 1]
Y_in = 0.34
b(new) = -0.09
w1(new) = 0.27
w2(new) = 0.31
So,
Largest weight change = 0.13
24. Adaline Network (Realization of AND Gate)
First Cycle:
# Run 4
Inputs = [1, -1, -1]
Y_in = -0.67
b(new) = -0.27
w1(new) = 0.43
w2(new) = 0.47
So,
Largest weight change = 0.16
25. Adaline Network (Realization of AND Gate)
Since the largest weight change in the first cycle is not below the tolerance value, we proceed to a second cycle.
26. Adaline Network (Realization of AND Gate)
Second Cycle:
# Run 1
Inputs = [1, 1, 1]
Y_in = 0.63
b(new) = -0.233
w1(new) = 0.46
w2(new) = 0.5
So,
Largest weight change = 0.03
27. Adaline Network (Realization of AND Gate)
Second Cycle:
# Run 2
Inputs = [1, 1, -1]
Y_in = -0.273
b(new) = -0.3
w1(new) = 0.39
w2(new) = 0.57
So,
Largest weight change = 0.07
28. Adaline Network (Realization of AND Gate)
Second Cycle:
# Run 3
Inputs = [1, -1, 1]
Y_in = -0.12
b(new) = -0.38
w1(new) = 0.47
w2(new) = 0.48
So,
Largest weight change = 0.09
29. Adaline Network (Realization of AND Gate)
Second Cycle:
# Run 4
Inputs = [1, -1, -1]
Y_in = -1.33
b(new) = -0.34
w1(new) = 0.43
w2(new) = 0.44
So,
Largest weight change = 0.04
30. Adaline Network (Realization of AND Gate)
In the second cycle, the largest weight change across all the training samples is less than the tolerance value.
So, the solution is:
B = -0.34
W1 = 0.43
W2 = 0.44
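A quick Python check that these learned weights realize AND under the bipolar threshold activation:

    # Y = 1 if y_in >= 0, else -1, with the learned parameters above.
    w1, w2, b = 0.43, 0.44, -0.34
    for i1, i2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
        y_in = b + w1 * i1 + w2 * i2
        print(i1, i2, '->', 1 if y_in >= 0 else -1)
    # Only (1, 1) gives y_in = 0.53 >= 0, matching the AND truth table.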
33. Significance of Bias
- During training, the weights change.
- But changing the weights only changes the steepness of the activation curve.
- The curve always passes through the origin.
- Some problems require the same curve, but shifted away from the origin.
- For such situations, the bias plays an important role.
34. Perceptron Network
- Single-layer feed-forward neural network
- Primary use : binary classification
- Able to learn any linearly separable function
- Activation function : step function
35. Perceptron Network (Algorithm)
1. Initially random weights are assigned to input variables in
the range [-0.5, 0.5]
2. Inputs are provided and the output is observed.
3. Weights are adjusted if an error is present, using the rule:
Wi = Wi + (alpha * Xi * e), where e = target output - actual output
4. This process continues until a single epoch passes with no error for the whole input set (a sketch follows below).
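A minimal sketch of this loop in Python. Exact fractions stand in for the decimal values so the trace matches the worked example that follows; the sample ordering is one that reproduces its epoch-5 result:

    # Perceptron training: step activation, update rule Wi = Wi + alpha * Xi * e.
    from fractions import Fraction as F

    def train_perceptron(samples, w, theta, alpha):
        epoch_has_error = True
        while epoch_has_error:                  # stop after one clean epoch
            epoch_has_error = False
            for x, t in samples:
                y = 1 if sum(xi * wi for xi, wi in zip(x, w)) >= theta else 0
                e = t - y                       # error drives the update
                if e != 0:
                    w = [wi + alpha * xi * e for wi, xi in zip(w, x)]
                    epoch_has_error = True
        return w

    # Binary AND-gate data as ((I1, I2), target).
    samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    w = train_perceptron(samples, [F(3, 10), F(-1, 10)],
                         theta=F(1, 5), alpha=F(1, 10))
    print([float(wi) for wi in w])              # -> [0.1, 0.1]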
36. Perceptron Network (Realization of AND Gate)
Assume: learning rate = 0.1
threshold = 0.2
Assume a two-input AND gate, which is true only if both inputs are true.
So, inputs are I1, I2
Activation Function:
Y = 1 ; y_in >= 0
= 0 ; y_in < 0
where y_in = I1 * w1 + I2 * w2 - θ and θ is the threshold (0.2)
Initialization:
Assign small random weights to each input:
w1 = 0.3 w2 = -0.1
42. Perceptron Network (Realization of AND Gate)
In epoch 5, no errors occur.
So, the desired solution is:
W1 = 0.1
W2 = 0.1
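Checking this solution against the binary AND truth table:

    # With W1 = W2 = 0.1 and threshold 0.2, only (1, 1) fires.
    for i1, i2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(i1, i2, '->', 1 if 0.1 * i1 + 0.1 * i2 >= 0.2 else 0)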
43. Back Propagation Network
- Multilayer feed-forward neural network
- Calculates the gradient of the loss function with respect to all weights in the network
- The gradient is supplied to an optimization method, which updates the weights to minimize the loss
- Two phases : propagation and weight update
44. Back Propagation Network (Algorithm)
Assumption:
A = number of units in the input layer
C = number of units in the output layer
B = number of units in the hidden layer
xi = activation level of units in the input layer
hi = activation level of units in the hidden layer
oi = activation level of units in the output layer
w1ij = weight from input to hidden layer
w2ij = weight from hidden to output layer
45. Back Propagation Network (Algorithm)
1. Initialize weights randomly between [-0.1, 0.1]
2. Initialize activations of thresholding units (x0 and h0)
3. Choose input-output pair (xi and yi). Assign activation
level to input units.
4. Propagate activations from input layer to hidden layer
using activation function:
h(j) = 1 / [1 + e^(-Σ(i=0 to A) w1ij * xi)], for j = 1, …, B
5. Propagate activations from the hidden layer to the output layer
using the activation function:
o(j) = 1 / [1 + e^(-Σ(i=0 to B) w2ij * hi)], for j = 1, …, C
46. Back Propagation Network (Algorithm)
6. Compute errors of units in output layer (d2j)
d2j = oj * (1 - oj) * (yj - oj) …. For j = 1 to C
7. Compute errors of units in hidden layer (d1j)
d1j = hj * (1 - hj) * Σ(i=1 to C) [d2i * w2ji], for j = 1 to B
8. Adjust weights between hidden layer and output layer
w2ij (new) = (alpha * d2j * hi) + w2ij
47. Back Propagation Network (Algorithm)
9. Adjust weights between input layer and hidden layer
w1ij (new) = (alpha * d1j * xi) + w1ij
10. Repeat from step 3 (choosing the next input-output pair) until the network converges
48. Back Propagation Network (Application)
- To design neural networks for linearly inseparable functions, such as XOR (see the sketch below).
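A compact sketch of the algorithm on slides 44-47, learning XOR. The layer sizes, learning rate, and epoch count are illustrative; with an unlucky random seed the run may stall in a local minimum and need a restart:

    import numpy as np

    rng = np.random.default_rng(0)
    A, B, C = 2, 2, 1                            # input, hidden, output units
    w1 = rng.uniform(-0.1, 0.1, (A + 1, B))      # step 1; extra row = bias x0
    w2 = rng.uniform(-0.1, 0.1, (B + 1, C))      # extra row = bias h0

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    Y = np.array([[0], [1], [1], [0]])           # XOR targets
    sigmoid = lambda z: 1 / (1 + np.exp(-z))
    alpha = 0.5

    for _ in range(20000):
        for x, y in zip(X, Y):                   # step 3
            xb = np.append(1.0, x)               # x0 = 1 (thresholding unit)
            h = sigmoid(xb @ w1)                 # step 4
            hb = np.append(1.0, h)               # h0 = 1
            o = sigmoid(hb @ w2)                 # step 5
            d2 = o * (1 - o) * (y - o)           # step 6
            d1 = h * (1 - h) * (w2[1:] @ d2)     # step 7
            w2 += alpha * np.outer(hb, d2)       # step 8
            w1 += alpha * np.outer(xb, d1)       # step 9

    for x in X:                                  # outputs approach 0, 1, 1, 0
        h = sigmoid(np.append(1.0, x) @ w1)
        print(x, '->', sigmoid(np.append(1.0, h) @ w2).round(2))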
49. Hopfield Network
- A network with memory
- Features of Hopfield network are:
a) Distributed representation
b) Asynchronous control
c) Content addressable memory
d) Fault Tolerance
50. Hopfield Network (Steps)
- Processing units are in one of two states: active or inactive
- A positive weight indicates that two units tend to activate each other
- A negative weight indicates that an active unit tends to deactivate a neighbouring unit
51. Hopfield Network (Algorithm)
1. A random unit is chosen.
2. If any of its neighbours are active, the unit computes the sum of the weights on the connections to its active neighbours.
3. If the sum is positive, the unit becomes active.
4. Otherwise, it becomes inactive.
5. Repeat from step 1 until the network reaches a stable state.
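A minimal sketch of these steps; the weight matrix below is an illustrative example (symmetric, zero diagonal), with two mutually exciting units that both inhibit a third:

    import random

    def hopfield_settle(weights, state, steps=100):
        n = len(state)
        for _ in range(steps):
            u = random.randrange(n)                        # step 1: random unit
            active = [v for v in range(n) if v != u and state[v] == 1]
            if active:                                     # step 2
                s = sum(weights[u][v] for v in active)
                state[u] = 1 if s > 0 else 0               # steps 3-4
        return state

    W = [[ 0,  1, -1],
         [ 1,  0, -1],
         [-1, -1,  0]]
    # Which stable pattern is reached depends on the random update order.
    print(hopfield_settle(W, [1, 0, 1]))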
53. Kohonen Network
- Unsupervised learning network based on the concept of graded learning
- Feed-forward network with an input layer and an output (Kohonen) layer
- Every neuron of the input layer is connected to every neuron of the Kohonen layer
- Each connection is associated with a weight.
- Also known as a self-organizing map (SOM)
54. Kohonen Network (Algorithm)
1. Initialize all the weights randomly
2. Initialize neighbourhood and learning rate
3. For each input vector, presented in random order:
a) Select an input vector at random
b) Find the Kohonen neuron j whose associated weight vector is closest to the input vector
c) Modify the weights of all neurons in a neighbourhood of radius r around the selected neuron using:
wj(t+1) = wj(t) + alpha * [x(t) - wj(t)]
55. Kohonen Network (Algorithm)
d) Update the value of alpha by reducing it gradually
e) Reduce neighbourhood radius r gradually
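A small sketch of this loop for a one-dimensional Kohonen layer; the data, map size, and decay schedules are illustrative choices:

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.random((200, 2))          # 2-D input vectors in the unit square
    weights = rng.random((10, 2))        # 10 Kohonen neurons          (step 1)
    alpha, radius = 0.5, 3               # learning rate, neighbourhood (step 2)

    for epoch in range(60):
        for x in rng.permutation(data):                             # step 3a
            j = int(np.argmin(np.linalg.norm(weights - x, axis=1))) # step 3b
            lo, hi = max(0, j - radius), min(len(weights), j + radius + 1)
            weights[lo:hi] += alpha * (x - weights[lo:hi])          # step 3c
        alpha *= 0.95                    # step d: reduce alpha gradually
        if epoch % 20 == 19 and radius > 0:
            radius -= 1                  # step e: shrink the neighbourhood

    print(weights.round(2))              # neighbouring neurons map nearby inputs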
56. Kohonen Network (Practical Application)
1. Clustering of genes in the medical field
2. Analysis of multimedia and web-based content
3. Analysis of remotely sensed satellite images