ARTIFICIAL NEURAL NETWORKS
AIMS Education
Introduction
• Simple computational elements forming a large
network
– Emphasis on learning (pattern recognition)
– Local computation (neurons)
• Configured for a particular application
– Pattern recognition/data classification
• ANN algorithm
– Modeled after brain
• Brain: roughly 100,000 times slower response per neuron
– Yet performs complex tasks (image and sound recognition, motion control)
– Roughly 10,000,000,000 times more efficient in energy consumption per operation
Introduction (Contd….)
• Artificial Intelligence
• Structure
– Inputs vs Dendrites
– Weights vs Synaptic gap
– Neurons vs Soma
– Output vs Axon
History
Definition
A neural network is a massively parallel
distributed processor made up of simple
processing units, which has a natural tendency
for storing experiential knowledge and making
it available for use
Introduction (Contd….)
• The threshold value determines the final output
– If the summation < threshold, -1 is the output
– If the summation > threshold, +1 is the output
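A minimal Python sketch of this thresholding step (the threshold value here is illustrative):

```python
def threshold_output(summation, threshold=0.0):
    """Return +1 if the weighted sum exceeds the threshold, -1 otherwise."""
    return 1 if summation > threshold else -1

print(threshold_output(0.7, threshold=0.5))   # +1
print(threshold_output(0.2, threshold=0.5))   # -1
```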
Introduction (Contd….)
• The neuron is the basic information
processing unit of an ANN.
• It consists of:
– A set of links, describing the neuron inputs, with weights W1, W2, …, Wm
– An adder function (linear combiner) for computing the weighted sum of the inputs (real numbers):
u = W1x1 + W2x2 + … + Wmxm
Introduction (Contd….)
– Activation function (squashing function) for
limiting the amplitude of the neuron output.
• The bias b has the effect of applying an affine
transformation to the weighted sum u
v = u + b
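A minimal Python sketch of this neuron model, assuming a simple step activation for concreteness (all names and values are illustrative):

```python
def neuron(inputs, weights, bias, activation):
    """Basic processing unit: weighted sum, plus bias, through an activation."""
    u = sum(w * x for w, x in zip(weights, inputs))  # adder / linear combiner
    v = u + bias                                     # affine shift by the bias b
    return activation(v)                             # limit the output amplitude

step = lambda v: 1 if v > 0 else -1
print(neuron([0.5, -1.0], [0.8, 0.2], bias=0.1, activation=step))  # +1
```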
Activation Functions
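The original slide presents these graphically; below is a minimal Python sketch of a few commonly used activation functions (illustrative examples, not necessarily the exact ones pictured):

```python
import math

def step(v):       # threshold / sign function: outputs +1 or -1
    return 1 if v > 0 else -1

def sigmoid(v):    # logistic squashing function: outputs in (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

def tanh(v):       # hyperbolic tangent: outputs in (-1, 1)
    return math.tanh(v)
```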
Designing an ANN
• Designing an ANN consists of
– Arranging neurons in various layers
– Deciding the type of connections among neurons for
different layers, as well as among the neurons within a
layer
– Deciding the way a neuron receives input and
produces output
• Determining the strength of connections within the network by allowing the network to learn the appropriate values of connection weights using a training data set
• The process of designing a neural network is an
iterative process
Designing an ANN (Contd….)
• Layers
– Single layer: Input and output layer
• No computation at the input layer
• Computation takes place at the output layer
– Multi layer: Input, hidden(s) and output layers
• Computations are done at the hidden(s) and the output layer
• Connection
– Fully connected
• Each neuron on the first layer is connected to every neuron on
the second layer
– Partially connected
• A neuron of the first layer does not have to be connected to all
neurons on the second layer
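As a concrete illustration of these layer and connection choices, here is a minimal NumPy sketch of a fully connected multilayer network's forward pass (layer sizes and weights are illustrative; no computation happens at the input layer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fully connected: every neuron in one layer feeds every neuron in the next.
W_hidden = rng.normal(size=(3, 4))   # input layer (3) -> hidden layer (4)
W_output = rng.normal(size=(4, 1))   # hidden layer (4) -> output layer (1)

def forward(x):
    h = np.tanh(x @ W_hidden)        # computation at the hidden layer
    return np.tanh(h @ W_output)     # computation at the output layer

print(forward(np.array([0.2, -0.5, 1.0])))
```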
Designing an ANN (Contd….)
• Biological systems are very complex webs of interconnected neurons
– ANNs are built from much simpler interconnected units
• Each unit takes in a number of real-valued
inputs
– Possibly outputs of other units
• Produces a single real-valued output
– May become the input to many other units
Single Layer
Multi Layer
Appropriate problems for Neural
Network Learning
• Instances are represented by many attribute-
value pairs
– The target function to be learned is defined over
instances that can be described by a vector of
predefined features
– Input attributes may be highly correlated or
independent of one another
– Input values can be any real values
Appropriate problems for Neural
Network Learning (Contd….)
• The target function output may be discrete-
valued, real-valued, or a vector of several real- or
discrete-valued attributes
• Training examples may contain errors
– Robust to noisy data
• Long training times are acceptable
– Network training algorithms typically require longer training times than many other learning methods
– Depends on factors such as
• The number of weights in the network
• The number of training examples considered
Appropriate problems for Neural
Network Learning (Contd….)
• Fast evaluation of the learned target function
may be required
– Although learning times are relatively long, evaluating the learned network in order to apply it to subsequent instances is typically very fast
• The ability of humans to understand the
learned target function is not important
– Weights learned are often difficult for humans to
interpret
Perceptron
Perceptron (Contd….)
• Takes a vector of real-valued inputs
• Calculates a linear combination of these
inputs
• Outputs a 1 if the result is greater than some
threshold
• Outputs a -1 otherwise
• Weights: information that allows the ANN to achieve the desired results; this information changes during the learning process
Perceptron (Contd….)
• Given inputs x1 through xn, the output o(x1, …, xn) computed by the perceptron is
o(x1, …, xn) = 1 if w0 + w1x1 + w2x2 + … + wnxn > 0, and −1 otherwise
• where each wi is a real-valued constant, or weight, that determines the contribution of input xi to the perceptron output
Perceptron (Contd….)
• An additional constant input x0 = 1 allows us to write the inequality as
∑ from i = 0 to n of wixi > 0, or in vector form, w · x > 0
• For brevity, we will sometimes write the perceptron function as o(x) = sgn(w · x), where
sgn(y) = 1 if y > 0, and −1 otherwise
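A minimal Python sketch of this perceptron function, with the constant input x0 = 1 folded into the weight vector (the example weights are illustrative):

```python
def perceptron(x, w):
    """o(x) = sgn(w . x), where x is prepended with the constant input x0 = 1."""
    xs = [1.0] + list(x)              # x0 = 1 absorbs the threshold into w0
    dot = sum(wi * xi for wi, xi in zip(w, xs))
    return 1 if dot > 0 else -1

# Example: weights chosen (illustratively) so the perceptron computes AND on +/-1 inputs
w = [-0.8, 0.5, 0.5]
print(perceptron([1, 1], w))    # +1
print(perceptron([1, -1], w))   # -1
```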
Representational Power of
Perceptrons
• A perceptron can be seen as representing a hyperplane decision surface in the n-dimensional space of instances (i.e., points)
• It outputs a 1 for instances lying on one side of the hyperplane
• Outputs a −1 for instances lying on the other side
Representational Power of
Perceptrons (Contd….)
• The equation for the decision surface is w · x = 0
• Some sets of positive and negative examples cannot be separated by any hyperplane
• Those that can be separated are called linearly separable sets of examples
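A small check of these definitions: AND over ±1 inputs is linearly separable, while XOR is the classic non-separable case. The weight vector below (illustrative) separates AND but, like every other weight vector, fails on XOR:

```python
def sgn_dot(x, w):
    xs = [1.0] + list(x)              # constant input x0 = 1
    return 1 if sum(wi * xi for wi, xi in zip(w, xs)) > 0 else -1

and_examples = {(1, 1): 1, (1, -1): -1, (-1, 1): -1, (-1, -1): -1}
xor_examples = {(1, 1): -1, (1, -1): 1, (-1, 1): 1, (-1, -1): -1}

w = [-0.8, 0.5, 0.5]
print(all(sgn_dot(x, w) == t for x, t in and_examples.items()))  # True
print(all(sgn_dot(x, w) == t for x, t in xor_examples.items()))  # False: this w fails on XOR
```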
Perceptron Training Rule
• How to learn the weights for a single perceptron?
• The task is to learn a weight vector that causes
the perceptron to produce the correct ±1 output
for each of the given training examples
• Perceptron rule and delta rule algorithms
– Provide the basis for learning networks of many units
Perceptron Training Rule (Contd….)
• One way to learn an acceptable weight vector is to
– Begin with random weights
– Iteratively apply the perceptron to each training
example
– Modify the perceptron weights whenever it
misclassifies an example
– The process is repeated
• Iterating through the training examples as many times
as needed
• Until the perceptron classifies all training examples
correctly
Perceptron Training Rule (Contd….)
• Weights are modified at each step according to the perceptron training rule, which revises the weight wi associated with input xi as
wi ← wi + Δwi, where Δwi = η(t − o)xi
• Here 𝑡 is the target output for the current training example
• 𝑜 is the output generated by the perceptron
• 𝜂 is a positive constant called the learning rate
– Moderates the degree to which weights are changed at each
step
– Usually set to some small value (e.g., 0.1)
– Sometimes made to decay as the number of weight-tuning
iterations increases
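A minimal Python sketch of this procedure, assuming ±1 targets and the sgn unit shown earlier (the data and constants are illustrative):

```python
import random

def train_perceptron(examples, n_inputs, eta=0.1, max_epochs=100):
    """Perceptron training rule: w_i <- w_i + eta * (t - o) * x_i."""
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]  # random initial weights
    for _ in range(max_epochs):
        all_correct = True
        for x, t in examples:                 # iterate over the training examples
            xs = [1.0] + list(x)              # constant input x0 = 1
            o = 1 if sum(wi * xi for wi, xi in zip(w, xs)) > 0 else -1
            if o != t:                        # modify weights only on a misclassification
                w = [wi + eta * (t - o) * xi for wi, xi in zip(w, xs)]
                all_correct = False
        if all_correct:                       # every training example classified correctly
            return w
    return w

and_data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(train_perceptron(and_data, n_inputs=2))
```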
Perceptron Training Rule (Contd….)
• Why should it converge to successful weight
values?
– Suppose the training example is correctly classified
already by the perceptron
– In this case 𝑡 −𝑜 is zero
– Makes Δ𝑤𝑖 zero
– No weights are updated
– Now suppose the perceptron outputs a −1 when the target output is +1
– Weights must be altered to increase the value of w · x
• For example, if xi > 0, then increasing wi will bring the perceptron closer to correctly classifying this example
• Can be shown to converge within a finite number of applications of the perceptron training rule, provided the training examples are linearly separable and η is sufficiently small
Gradient Descent and the Delta Rule
• The perceptron rule works when the training examples are linearly separable
– Otherwise it can fail to converge
• The delta rule is designed to overcome this hurdle
• If the training examples are not linearly separable, the delta rule converges to a best-fit approximation to the target concept
Delta Rule (Contd….)
• Becomes the basis for learning in interconnected (multilayer) networks
• Consider training an unthresholded perceptron: a linear unit for which the output o is given by
– o(x) = w · x
– This corresponds to the first stage of a perceptron, without the threshold
Delta Rule (Contd….)
• The training error E of a weight vector w, measured relative to the training examples, is
E(w) = ½ ∑ over d ∈ D of (td − od)²
• where D is the set of training examples
• td is the target output for training example d
• od is the output of the linear unit for training example d
• E(w) is simply half the squared difference between the target output td and the linear unit output od, summed over all training examples
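A minimal Python sketch of gradient descent on this error for a linear unit; the batch update Δwi = η ∑d (td − od) xid follows from differentiating E(w). The data below are illustrative and deliberately not linearly separable:

```python
def train_linear_unit(examples, n_inputs, eta=0.05, epochs=200):
    """Gradient descent on E(w) = 1/2 * sum_d (t_d - o_d)^2 for a linear unit."""
    w = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)                 # accumulates -dE/dw_i over all examples
        for x, t in examples:
            xs = [1.0] + list(x)              # constant input x0 = 1
            o = sum(wi * xi for wi, xi in zip(w, xs))    # unthresholded output o = w . x
            for i, xi in enumerate(xs):
                grad[i] += (t - o) * xi
        w = [wi + eta * gi for wi, gi in zip(w, grad)]   # delta rule weight update
    return w

data = [((0.0,), -1.0), ((1.0,), 1.0), ((2.0,), 1.0), ((1.5,), -1.0)]
print(train_linear_unit(data, n_inputs=1))    # best-fit weights despite the noise
```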
