UNIT I INTRODUCTION
Neural Networks - Application Scope of Neural Networks - Artificial Neural Network: An Introduction - Evolution of Neural Networks - Basic Models of Artificial Neural Network - Important Terminologies of
ANNs - Supervised Learning Network.
1. INTRODUCTION TO NEURAL NETWORK:
Definition 1: Neural networks are computational models that mimic the complex functions of the
human brain.
Definition 2: The neural networks consist of interconnected nodes or neurons that process and
learn from data, enabling tasks such as pattern recognition and decision making
in machine learning.
Definition 3: A neural network is a method in artificial intelligence that teaches computers to
process data in a way that is inspired by the human brain.
EVOLUTION OF NEURAL NETWORKS:
Since the 1940s, there have been a number of noteworthy advancements in the field of neural
networks:
1940s-1950s: Early Concepts: Neural networks began with the introduction of the first
mathematical model of artificial neurons by McCulloch
and Pitts. But computational constraints made progress
difficult.
1960s-1970s: Perceptrons: This era is defined by the work of Rosenblatt on
perceptrons. Perceptrons are single-layer networks whose
applicability was limited to problems that are
linearly separable.
1980s: Back propagation and Connectionism: Multi-layer network training was made
practical when Rumelhart, Hinton, and Williams popularized
the back propagation method. With its emphasis on learning
through interconnected nodes, connectionism gained appeal.
1990s: Boom and Winter: With applications in image identification, finance, and other
fields, neural networks saw a boom. Neural network research
did, however, experience a “winter” due to exorbitant
computational costs and inflated expectations.
2000s: Resurgence and Deep Learning: Larger datasets, innovative structures, and
enhanced processing capability spurred a comeback. Deep
learning has shown amazing effectiveness in a number of
disciplines by utilizing numerous layers.
2010s-Present: Deep Learning Dominance: Convolutional neural networks (CNNs) and
recurrent neural networks (RNNs), two deep learning
architectures, dominated machine learning. Their power was
demonstrated by innovations in gaming, picture recognition,
and natural language processing.
ARCHITECTURE OF NEURAL NETWORK:
Neural networks extract identifying features from data, lacking pre-programmed
understanding.
Network components include neurons, connections, weights, biases, propagation
functions, and a learning rule.
Neurons receive inputs, governed by thresholds and activation functions. Connections
involve weights and biases regulating information transfer.
Learning, which adjusts the weights and biases, occurs in three stages:
Input computation,
Output generation,
Iterative refinement,
which together enhance the network’s proficiency in diverse tasks.
The learning process follows this sequence of events:
1. The neural network is stimulated by a new environment.
2. The free parameters of the neural network are then changed as a result of this
stimulation.
3. The neural network then responds in a new way to the environment because of the
changes in its free parameters.
HOW DO NEURAL NETWORKS WORK?
Example: Email Classification
Let’s understand with an example of how a neural network works: Consider a neural network
for email classification.
The input layer takes features like email content, sender information, and subject.
These inputs, multiplied by adjusted weights, pass through hidden layers.
The network, through training, learns to recognize patterns indicating whether an email is
spam or not.
The output layer, with a binary activation function, predicts whether the email is spam (1)
or not (0).
As the network iteratively refines its weights through back propagation, it becomes adept at
distinguishing between spam and legitimate emails, showcasing the practicality of neural
networks in real-world applications like email filtering.
WORKING OF A NEURAL NETWORK
Neural networks are complex systems that mimic some features of the functioning of the
human brain.
A network is composed of an input layer, one or more hidden layers, and an output layer, each made up of
interconnected artificial neurons.
The two stages of the basic process are called
a. Forward propagation
b. Back propagation.
a. Forward propagation:
Input Layer: Each feature in the input layer is represented by a node on the network,
which receives input data.
Weights and Connections: The weight of each neuronal connection indicates how
strong the connection is. Throughout training, these weights are changed.
Hidden Layers: Each hidden layer neuron processes inputs by multiplying them by
weights, adding them up, and then passing them through an activation function. By
doing this, non-linearity is introduced, enabling the network to recognize intricate
patterns.
Output: The final result is produced by repeating the process until the output layer is
reached.
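The forward pass described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the sigmoid activation, the 2-2-1 layer sizes, and all weight values are assumptions chosen only for the example.

```python
import math


def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """One layer: each neuron sums its weighted inputs, adds its bias,
    and passes the result through the activation function."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs


def forward(inputs, layers):
    """Repeat the layer computation until the output layer is reached."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),  # hidden layer (weights, biases)
    ([[1.0, -1.0]], [0.0]),                     # output layer (weights, biases)
]
prediction = forward([1.0, 1.0], layers)
```

Each row of a weight list holds the incoming weights of one neuron, so adding a neuron means adding a row and a bias.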
b. Back propagation:
Loss Calculation: The network’s output is evaluated against the real goal values, and a
loss function is used to compute the difference. For a regression problem, the Mean
Squared Error (MSE) is commonly used as the cost function.
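Loss Function: $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$, where $y_i$ is the target value, $\hat{y}_i$ is the network’s output, and $n$ is the number of samples.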
Gradient Descent: Gradient descent is then used by the network to reduce the loss. To
lower the inaccuracy, weights are changed based on the derivative of the loss with
respect to each weight.
Adjusting weights: The weights are adjusted at each connection by applying this
iterative process, or back propagation, backward across the network.
Training: During training with different data samples, the entire process of forward
propagation, loss calculation, and back propagation is done iteratively, enabling the
network to adapt and learn patterns from the data.
Activation Functions: Activation functions such as the rectified linear unit (ReLU) or
sigmoid introduce non-linearity into the model. They determine whether, and how strongly, a neuron
“fires” based on its total weighted input.
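The loop of forward pass, loss calculation, and weight adjustment can be illustrated on the smallest possible case: a single linear neuron trained by gradient descent on the MSE. This is a sketch under stated assumptions, not full back propagation through hidden layers; the function names, the learning rate, the epoch count, and the toy data are all hypothetical.

```python
def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by repeatedly stepping against the MSE gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Forward pass: compute the neuron's output for every sample.
        predictions = [w * x + b for x in xs]
        # Derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(predictions, ys)) / n
        # Move each parameter a small step in the direction that lowers the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy data generated from y = 2x + 1
w, b = train_linear_neuron(xs, ys)
final_loss = mse([w * x + b for x in xs], ys)
```

In a multi-layer network the same update is applied to every weight, with the chain rule carrying the loss derivative backward from the output layer.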
TYPES OF LEARNINGS IN NEURAL NETWORKS:
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
a. Supervised Learning:
As the name suggests, supervised learning is a type of learning that is overseen
by a supervisor.
It is like learning with a teacher. There are input training pairs that contain a set of
input and the desired output.
Here the output from the model is compared with the desired output and an error is
calculated; this error signal is sent back into the network to adjust the weights.
This adjustment is done till no more adjustments can be made and the output of the
model matches the desired output. In this, there is feedback from the environment to
the model.
b. Unsupervised Learning:
Unlike supervised learning, there is no supervisor or a teacher here.
In this type of learning, there is no feedback from the environment, there is no
desired output and the model learns on its own.
During the training phase, the inputs are formed into classes that define the
similarity of the members.
Each class contains similar input patterns.
On inputting a new pattern, it can predict to which class that input belongs based
on similarity with other patterns. If there is no such class, a new class is formed.
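The class-forming behavior described above can be sketched without any labels. The code below is not a neural network; it only illustrates the idea of assigning a pattern to the most similar existing class and opening a new class when none is close enough. The Euclidean distance measure, the `max_distance` cutoff, and all names are assumptions made for the example.

```python
import math


def assign_to_classes(patterns, max_distance):
    """Group patterns by similarity; open a new class when none is close enough."""
    prototypes = []  # one representative pattern per class
    labels = []      # class index assigned to each input pattern
    for pattern in patterns:
        best_class, best_distance = None, None
        for index, prototype in enumerate(prototypes):
            distance = math.dist(pattern, prototype)
            if best_distance is None or distance < best_distance:
                best_class, best_distance = index, distance
        if best_class is not None and best_distance <= max_distance:
            labels.append(best_class)          # similar class found
        else:
            prototypes.append(pattern)         # no such class: form a new one
            labels.append(len(prototypes) - 1)
    return labels, prototypes


patterns = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.2, 0.1)]
labels, prototypes = assign_to_classes(patterns, max_distance=1.0)
# labels is [0, 0, 1, 1, 0]: two classes were formed without any desired output.
```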
c. Reinforcement Learning:
It draws on both supervised learning and
unsupervised learning.
It is like learning with a critic. Here there is no exact feedback from the
environment; rather, there is critic feedback.
The critic tells how close our solution is.
Hence the model learns on its own based on the critic information. It is similar to
supervised learning in that it receives feedback from the environment, but it is
different in that it does not receive the desired output information; rather, it receives
critic information.
TYPES OF NEURAL NETWORKS:
There are five types of neural networks that can be used.
i. Feed forward Networks
ii. Multilayer Perceptron (MLP)
iii. Convolutional Neural Network (CNN)
iv. Recurrent Neural Network (RNN)
v. Long Short-Term Memory (LSTM)
i. Feed forward Networks: A feed forward network is a simple artificial neural network architecture
in which data moves from input to output in a single direction.
It has input, hidden, and output layers; feedback loops are
absent. Its straightforward architecture makes it appropriate for
a number of applications, such as regression and pattern
recognition.
ii. Multilayer Perceptron (MLP): MLP is a type of feed forward neural network with three
or more layers, including an input layer, one or more
hidden layers, and an output layer. It uses nonlinear
activation functions.
iii. Convolutional Neural Network (CNN): A Convolutional Neural Network (CNN) is a
specialized artificial neural network
designed for image processing. It employs
convolutional layers to automatically learn
hierarchical features from input images,
enabling effective image recognition and
classification. CNNs have revolutionized
computer vision and are pivotal in tasks like
object detection and image analysis.
iv. Recurrent Neural Network (RNN): An artificial neural network type intended for
sequential data processing is called a Recurrent
Neural Network (RNN). It is appropriate for
applications where contextual dependencies are
critical, such as time series prediction and natural
language processing, since it makes use of feedback
loops, which enable information to survive within
the network.
v. Long Short-Term Memory (LSTM): LSTM is a type of RNN that is designed to
overcome the vanishing gradient problem in
training RNNs. It uses memory cells and gates to
selectively read, write, and erase information.
ADVANTAGES OF NEURAL NETWORKS
Neural networks are widely used in many different applications because of their many benefits:
Adaptability: Neural networks are useful for activities where the link between inputs and
outputs is complex or not well defined because they can adapt to new situations and learn
from data.
Pattern Recognition: Their proficiency in pattern recognition makes them effective in
tasks such as audio and image identification, natural language processing, and other tasks involving intricate
data patterns.
Parallel Processing: Because neural networks are capable of parallel processing by nature,
they can process numerous jobs at once, which speeds up and improves the efficiency of
computations.
Non-Linearity: Neural networks are able to model and comprehend complicated
relationships in data by virtue of the non-linear activation functions found in neurons, which
overcome the drawbacks of linear models.
DISADVANTAGES OF NEURAL NETWORKS
Neural networks, while powerful, are not without drawbacks and difficulties:
Computational Intensity: Large neural network training can be a laborious and
computationally demanding process that demands a lot of computing power.
Black box Nature: As “black box” models, neural networks pose a problem in important
applications since it is difficult to understand how they make decisions.
Over fitting: Over fitting is a phenomenon in which neural networks commit training
material to memory rather than identifying patterns in the data. Although regularization
approaches help to alleviate this, the problem still exists.
Need for Large datasets: For efficient training, neural networks frequently need sizable,
labeled datasets; otherwise, their performance may suffer from incomplete or skewed data.
2. APPLICATION SCOPE OF NEURAL NETWORK:
Neural networks have several use cases across many industries, such as the following:
Medical diagnosis by medical image classification
Targeted marketing by social network filtering and behavioral data analysis
Financial predictions by processing historical data of financial instruments
Electrical load and energy demand forecasting
Process and quality control
Chemical compound identification
We give four of the important applications of neural networks below.
i. Computer vision: Computer vision is the ability of computers to extract information and insights
from images and videos. With neural networks, computers can distinguish and recognize images
similar to humans. Computer vision has several applications, such as the following:
Visual recognition in self-driving cars so they can recognize road signs and other road users.
Content moderation to automatically remove unsafe or inappropriate content from image
and video archives.
Facial recognition to identify faces and recognize attributes like open eyes, glasses, and
facial hair.
Image labeling to identify brand logos, clothing, safety gear, and other image details
ii. Speech recognition: Neural networks can analyze human speech despite varying speech
patterns, pitch, tone, language, and accent. Virtual assistants like Amazon Alexa and automatic
transcription software use speech recognition to do tasks like these:
Assist call center agents and automatically classify calls
Convert clinical conversations into documentation in real time
Accurately subtitle videos and meeting recordings for wider content reach.
iii. Natural language processing: Natural language processing (NLP) is the ability to process
natural, human-created text. Neural networks help computers gather insights and meaning from
text data and documents. NLP has several use cases, including in these functions:
Automated virtual agents and chatbots.
Automatic organization and classification of written data.
Business intelligence analysis of long-form documents like emails and forms.
Indexing of key phrases that indicate sentiment, like positive and negative comments on
social media.
Document summarization and article generation for a given topic.
iv. Recommendation engines: Neural networks can track user activity to develop personalized
recommendations. They can also analyze all user behavior and discover new products or services
that interest a specific user. IPT uses neural networks to automatically find and recommend
products relevant to the user’s social media activity. Consumers don't have to hunt through
online catalogs to find a specific product from a social media image.
DEEP LEARNING VS MACHINE LEARNING: NEURAL NETWORKS
ASPECT | MACHINE LEARNING | NEURAL NETWORKS
Definition | Subset of Artificial Intelligence | Specific architecture of Machine Learning
Focus | Broad range of algorithms and techniques | Subset of Machine Learning for deep learning
Learning Approach | Supervised, Unsupervised, Semi-Supervised, etc. | Often associated with supervised learning
Data Requirement | Labeled or Unlabeled data | Typically requires labeled data for training
Key Components | Algorithms, Feature Engineering, Model Selection | Input Layer, Hidden Layers, Output Layer
Structure | Diverse structures based on the algorithm chosen | Interconnected network of artificial neurons
Pattern Recognition | Various pattern recognition techniques used | Core capability for recognizing patterns
Application | Diverse applications across industries | Image recognition, NLP, Speech recognition, etc.
Usage | General-purpose for data-driven decision making | Deep learning for complex representations
Complexity | Less complex compared to Neural Networks | More complex architecture with interconnected neurons
3. INTRODUCTION - ARTIFICIAL NEURAL NETWORKS (ANN)
Artificial Neural Networks (ANNs) are algorithms based on brain function and are used to
model complicated patterns and solve forecasting problems.
The Artificial Neural Network (ANN) is a deep learning method that arose from the concept
of the Biological Neural Networks of the human brain.
The development of ANN was the result of an attempt to replicate the workings of the human
brain.
The workings of ANN are similar to those of biological neural networks, although
they are not identical. A basic ANN algorithm accepts only numeric and structured data.
Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) are used to
accept unstructured and non-numeric data forms such as image, text, and speech.
WHAT IS ARTIFICIAL NEURAL NETWORK (ANN)?
An Artificial Neural Network (ANN) is a computational model inspired by the human brain’s
neural structure.
It consists of interconnected nodes (neurons) organized into layers.
Information flows through these nodes, and the network adjusts the connection strengths
(weights) during training to learn from data, enabling it to recognize patterns, make
predictions, and solve various tasks in machine learning and artificial intelligence.
Basics of Perceptrons in ANN:
A single-layer feed forward neural network was introduced in the late 1950s by Frank
Rosenblatt. It marked an early phase of deep learning and artificial neural networks.
The perceptron is one of the first and most straightforward models of artificial neural networks.
WHAT IS PERCEPTRON?
Perceptron is one of the simplest ANN architectures. It was introduced by Frank Rosenblatt
in 1957. It is the simplest type of feed forward neural network, consisting of a single layer
of input nodes that are fully connected to a layer of output nodes.
It can learn linearly separable patterns. It uses a slightly different type of artificial neuron
known as the threshold logic unit (TLU), which was first introduced by McCulloch and Pitts
in the 1940s.
TYPES OF PERCEPTRON:
i. Single-Layer Perceptron
ii. Multilayer Perceptron
i. Single-Layer Perceptron: This type of perceptron is limited to learning linearly separable
patterns. It is effective for tasks where the data can be divided into
distinct categories by a straight line.
ii. Multilayer Perceptron: Multilayer perceptrons possess enhanced processing capabilities as
they consist of two or more layers, making them adept at handling more complex
patterns and relationships within the data.
BASIC COMPONENTS OF PERCEPTRON:
A perceptron, the basic unit of a neural network, comprises essential components that collaborate
in information processing.
Input Features: The perceptron takes multiple input features, each input feature
represents a characteristic or attribute of the input data.
Weights: Each input feature is associated with a weight, determining the significance of
each input feature in influencing the perceptron’s output. During training, these weights
are adjusted to learn the optimal values.
Summation Function: The perceptron calculates the weighted sum of its inputs using
the summation function. The summation function combines the inputs with their
respective weights to produce a weighted sum.
Activation Function: The weighted sum is then passed through an activation function.
The perceptron uses the Heaviside step function, which takes the summed value as
input, compares it with the threshold, and provides the output as 0 or 1.
Output: The final output of the perceptron is determined by the activation function’s result.
For example, in binary classification problems, the output might represent a predicted class
(0 or 1).
Bias: A bias term is often included in the perceptron model. The bias allows the model to
make adjustments that are independent of the input. It is an additional parameter that is
learned during training.
Learning Algorithm (Weight Update Rule): During training, the perceptron learns by
adjusting its weights and bias based on a learning algorithm. A common approach is the
perceptron learning algorithm, which updates weights based on the difference between the
predicted output and the true output.
These components work together to enable a perceptron to learn and make predictions. While a
single perceptron can perform binary classification, more complex tasks require the use of multiple
perceptrons organized into layers, forming a neural network.
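The components listed above (input features, weights, bias, summation function, and step activation) fit into a few lines of Python. This is a minimal sketch with hypothetical names; the weights and bias are hand-picked for illustration rather than learned, and the threshold is assumed to be 0 so that the bias does the shifting.

```python
def heaviside(z, threshold=0.0):
    """Step activation: 1 when the summed value reaches the threshold, else 0."""
    return 1 if z >= threshold else 0


def perceptron_output(inputs, weights, bias):
    """Summation function followed by the step activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return heaviside(weighted_sum)


# Hand-picked (not learned) parameters that make the perceptron act as a logical OR.
weights = [1.0, 1.0]
bias = -0.5
or_table = {(a, b): perceptron_output([a, b], weights, bias)
            for a in (0, 1) for b in (0, 1)}
# or_table maps (0, 0) to 0 and every other input pair to 1.
```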
HOW DOES PERCEPTRON WORK?
A weight is assigned to each input node of a perceptron, indicating the significance of that
input to the output.
The perceptron’s output is the weighted sum of its inputs passed through an
activation function, which decides whether or not the perceptron will fire. It computes the
weighted sum of its inputs as:
$$z = \sum_{i=1}^{n} w_i x_i = w_1 x_1 + w_2 x_2 + \dots + w_n x_n$$
The activation function that perceptrons use most frequently is the step function, which
compares this weighted sum to a threshold and outputs 1 if the sum reaches the threshold
value and 0 otherwise.
The most common step function used in perceptrons is the Heaviside step function:
$$h(z) = \begin{cases} 0 & \text{if } z < \text{threshold} \\ 1 & \text{if } z \ge \text{threshold} \end{cases}$$
A perceptron has a single layer of threshold logic units with each TLU connected to all inputs.
When all the neurons in a layer are connected to every neuron of the previous layer, it is
known as a fully connected layer or dense layer.
The output of the fully connected layer can be written as:
$$f_{W,b}(X) = h(XW + b)$$
where X is the input, W is the weight for each input neuron, b is the bias, and h is the step
function.
During training, the perceptron’s weights are adjusted to minimize the difference between the
predicted output and the actual output. Usually, supervised learning algorithms like the delta
rule or the perceptron learning rule are used for this:
$$w_{i,j} \leftarrow w_{i,j} + \eta\,(y_j - \hat{y}_j)\,x_i$$
Here $w_{i,j}$ is the weight between the ith input and jth output neuron, $x_i$ is the ith input value,
$y_j$ and $\hat{y}_j$ are the jth actual and predicted values, and $\eta$ is the learning rate.
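As a rough sketch of the weight update just described, the code below trains a single perceptron on the logical AND function using the rule "new weight = old weight + learning rate × (actual − predicted) × input". The function names, the learning rate of 0.5, the zero initial weights, and the epoch count are assumptions for the example; the bias is treated as a weight on a constant input of 1.

```python
def step(z):
    """Heaviside step with the threshold folded into the bias."""
    return 1 if z >= 0 else 0


def predict(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, learning_rate=0.5, epochs=20):
    """Apply the perceptron learning rule to every sample, epoch after epoch."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(inputs, weights, bias)  # actual minus predicted
            # Each weight moves in proportion to the error and to its own input.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
learned = [predict(x, weights, bias) for x, _ in and_samples]
```

AND is linearly separable, so the rule settles on weights that classify all four samples correctly; the same loop would never settle on XOR.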
ANN ARCHITECTURE:
There are three kinds of layers in the network architecture: the input layer, one or more hidden
layers, and the output layer. Because of its numerous layers, the network is sometimes referred to as the
MLP (Multi-Layer Perceptron).
To understand the concept of the architecture of an artificial neural network, we have to
understand what a neural network consists of.
A neural network consists of a large number of artificial neurons, termed units,
arranged in a sequence of layers.
Artificial Neural Network primarily consists of three layers:
a. Input Layer
b. Hidden layer
c. Output Layer
a. Input Layer: As the name suggests, it accepts inputs in several different formats provided by
the programmer.
b. Hidden Layer: The hidden layer lies between the input and output layers. It performs all
the calculations to find hidden features and patterns.
c. Output Layer: The input goes through a series of transformations in the hidden layer,
which finally results in output that is conveyed using this layer.
The artificial neural network takes the inputs, computes their weighted sum,
and adds a bias. This computation is represented in the form of a transfer function.
The weighted total is then passed as input to an activation function to produce the
output. Activation functions decide whether a node should fire or not.
Only the nodes that fire pass their signal on to the output layer. Different activation
functions are available, and the choice depends on the sort of task being performed.
4. BASIC MODELS OF ANN:
The arrangement of neurons to form layers and the connection pattern formed within and
between layers is called the network architecture. There are five fundamental classes of
ANN architecture. They are:
a. Single-layer feed-forward network
b. Multilayer feed-forward network
c. Single node with its own feedback
d. Single-layer recurrent network
e. Multilayer recurrent network
a. Single-layer feed-forward network
In this type of network, we have only two layers, the input layer and the output layer, but
the input layer does not count because no computation is performed in this layer.
The output layer is formed when different weights are applied to the input nodes and the
cumulative effect per node is taken.
After this, the neurons of the output layer collectively compute the output
signals.
b. Multilayer feed-forward network
This network also has a hidden layer that is internal to the network and has no direct
contact with the external layer.
The existence of one or more hidden layers enables the network to be computationally
stronger. It is called a feed-forward network because information flows from the input,
through the intermediate computations, to the output Z.
There are no feedback connections in which outputs of the model are fed back into
itself.
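One way to see why a hidden layer makes the network computationally stronger is the XOR function, which no single-layer network of threshold units can compute because it is not linearly separable. The sketch below uses hand-chosen (not learned) weights and hypothetical function names; information flows strictly from input to hidden layer to output, with no feedback.

```python
def step(z):
    """Threshold unit: fires (1) when the net input is non-negative."""
    return 1 if z >= 0 else 0


def neuron(inputs, weights, bias):
    """Weighted sum plus bias, followed by the step activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_network(x1, x2):
    # Hidden layer: one unit behaves like OR, the other like NAND.
    h_or = neuron([x1, x2], [1.0, 1.0], -0.5)
    h_nand = neuron([x1, x2], [-1.0, -1.0], 1.5)
    # Output layer: fires only when both hidden units fire (AND).
    return neuron([h_or, h_nand], [1.0, 1.0], -1.5)


xor_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
# xor_table maps (0, 1) and (1, 0) to 1, and (0, 0) and (1, 1) to 0.
```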
c. Single node with its own feedback
Single Node with own Feedback
When outputs can be directed back as inputs to the same layer or preceding layer nodes,
then it results in feedback networks.
Recurrent networks are feedback networks with closed loops.
The above figure shows a single recurrent network having a single neuron with feedback
to itself.
d. Single-layer recurrent network
The above network is a single-layer network with a feedback connection in which the
processing element’s output can be directed back to itself or to another processing
element or both.
A recurrent neural network is a class of artificial neural networks where connections
between nodes form a directed graph along a sequence.
This allows it to exhibit dynamic temporal behavior for a time sequence.
Unlike feed forward neural networks, RNNs can use their internal state (memory) to
process sequences of inputs.
e. Multilayer recurrent network
In this type of network, processing element output can be directed to the processing element
in the same layer and in the preceding layer forming a multilayer recurrent network.
They perform the same task for every element of a sequence, with the output being
dependent on the previous computations.
Inputs are not needed at each time step.
The main feature of a Recurrent Neural Network is its hidden state, which captures some
information about a sequence.
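The role of the hidden state can be illustrated with a single recurrent unit. This is a minimal sketch, assuming a tanh activation and scalar weights `w_x`, `w_h`, and `b` whose names and values are hypothetical; the same computation is applied at every step, and each new state depends on the previous one.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: mix the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same step to every element, carrying the hidden state forward."""
    h = 0.0  # initial hidden state
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


states_a = run_rnn([1.0, 0.0, 0.0])  # an input at the first step only
states_b = run_rnn([0.0, 0.0, 0.0])  # no input at all
```

In `states_a` the hidden state stays above zero at the second and third steps even though those inputs are zero, which is the "memory" of the earlier input; in `states_b` it remains at zero throughout.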
5. IMPORTANT TERMINOLOGIES OF ANNs:
The ANN (Artificial Neural Network) is based on the BNN (Biological Neural Network), as its
primary goal is to imitate the human brain and its functions.
Similar to the brain having neurons interlinked to each other, the ANN also has neurons that
are linked to each other in various layers of the networks which are known as nodes.
The ANN learns through various learning algorithms that are described as supervised or
unsupervised learning.
In supervised learning algorithms, the target values are labeled. Its goal is to try to reduce the
error between the desired output (target) and the actual output for optimization. Here, a
supervisor is present.
In unsupervised learning algorithms, the target values are not labeled and the network learns by
itself by identifying the patterns through repeated trials and experiments.
ANN Terminology:
Weights: Each neuron is linked to the other neurons through connection links that carry
weights. The weight holds information about the input signal. The output
depends solely on the weights and the input signal. The weights can be presented in a
matrix form that is known as the connection matrix.
If there are “n” nodes with each node having “m” weights, then it is represented as:
$$W = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nm} \end{bmatrix}$$
Bias: Bias is a constant that is added to the weighted sum of the inputs to calculate the
net input. It is used to shift the result to the positive or negative side. The net input
is increased by a positive bias and decreased by a negative bias.
Here, {1, x1, ..., xn} are the inputs, and the output (Y) of the neuron is computed from the function g(x),
which sums all the inputs and adds the bias.
The role of the activation function is to produce the output from the result of the summation
function.
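One common way to write the summation and activation just described is given below. It is a sketch under assumptions: each input $x_i$ is taken to be multiplied by its connection weight $w_i$ (as described under Weights), the constant input 1 carries the bias $b$, and the symbols $y_{in}$ for the net input and $f$ for the activation function are notation introduced here for illustration.

```latex
y_{in} = b + \sum_{i=1}^{n} x_i w_i, \qquad Y = f(y_{in})
```

A positive $b$ raises $y_{in}$ and a negative $b$ lowers it, which is the shifting effect attributed to the bias.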
Threshold: A threshold value is a constant value that is compared to the net input to get the
output. The activation function is defined based on the threshold value to calculate the
output.
Learning Rate: The learning rate is denoted α. It ranges from 0 to 1. It controls the size of the
weight adjustments made during the learning of the ANN.
Target value: Target values are correct values of the output variable and are also known as just
targets.
Error: It is the inaccuracy of predicted output values compared to Target Values.
ADVANTAGES OF ANN’s:
Problems suited to ANNs can have instances that are represented by many attribute-value pairs.
ANNs can be used for problems in which the target function output is discrete-valued, real-
valued, or a vector of several real- or discrete-valued attributes.
ANN learning methods are quite robust to noise in the training data. The training examples
may contain errors, which have little effect on the final output.
ANNs are generally used where fast evaluation of the learned target function may be
required.
ANNs can tolerate long training times, which depend on factors such as the number of weights in the
network, the number of training examples considered, and the settings of various learning
algorithm parameters.
6. SUPERVISED LEARNING NETWORK:
A prescribed set of well-defined rules for the solution of a learning problem is called
a learning algorithm.
A variety of learning algorithms exist, each of which offers advantages of its
own.
Basically, learning algorithms differ from each other in the way in which the
adjustment Δwkj to the synaptic weight wkj is formulated.
Fundamentals on learning and training:
Learning is a process by which the free parameters (weights and biases) of a neural
network are adapted through a continuing process of stimulation by the
environment.
This definition of the learning process implies the following sequence of events:
1. The neural network is stimulated by an environment.
2. The neural network is changed (internal structure) as a result of this
stimulation.
3. The neural network responds in a new way to the environment.
Supervised learning network paradigms:
Supervised Learning in Neural Networks: Perceptrons and Multilayer
Perceptrons.
Training set: A training set (named P) is a set of training patterns, which we use to
train our neural net.
Batch training of a network proceeds by making weight and bias changes based on an
entire set (batch) of input vectors.
Incremental training changes the weights and biases of a network as needed after
presentation of each individual input vector. Incremental training is sometimes
referred to as “on line” or “adaptive” training.
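The difference between batch and incremental training can be sketched on a single linear neuron with one weight and one bias. The update used here (change proportional to error times input) is an assumed illustration, and the function names, learning rate, and toy samples are hypothetical.

```python
def incremental_epoch(samples, w, b, lr):
    """Change the weight and bias after each individual input (on-line training)."""
    for x, target in samples:
        error = target - (w * x + b)
        w += lr * error * x
        b += lr * error
    return w, b


def batch_epoch(samples, w, b, lr):
    """Accumulate the changes over the entire set, then apply them once."""
    delta_w = 0.0
    delta_b = 0.0
    for x, target in samples:
        error = target - (w * x + b)  # every error uses the same, unchanged w and b
        delta_w += lr * error * x
        delta_b += lr * error
    return w + delta_w, b + delta_b


samples = [(1.0, 2.0), (2.0, 4.0)]
w_inc, b_inc = incremental_epoch(samples, 0.0, 0.0, 0.1)
w_bat, b_bat = batch_epoch(samples, 0.0, 0.0, 0.1)
```

Starting from the same parameters, the two modes end one pass at different values (about 0.88 and 0.54 for incremental, about 1.0 and 0.6 for batch), because incremental training evaluates the second sample with already-updated parameters.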
Hebbian learning rule, suggested by Hebb in his classic book Organization of
Behavior: The basic idea is that if two units j and k are active simultaneously, their
interconnection must be strengthened. If j receives input from k, the simplest version
of Hebbian learning prescribes modifying the weight $w_{jk}$ with
$$\Delta w_{jk} = \gamma\, y_j\, y_k$$
where $y_j$ and $y_k$ are the activations of units j and k, and $\gamma$ is a positive constant of proportionality representing the learning rate.
Another common rule uses not the actual activation of unit k but the difference
between the actual and desired activation for adjusting the weights:
$$\Delta w_{jk} = \gamma\, y_j\, (d_k - y_k)$$
in which $d_k$ is the desired activation provided by a teacher. This is often called the Widrow-Hoff
rule or the delta rule.
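The two rules can be compared side by side in code. This is a small sketch assuming the usual forms of the updates, a weight change of γ·y_j·y_k for Hebbian learning and γ·y_j·(d_k − y_k) for the delta rule; the function names and the numeric values are hypothetical.

```python
def hebbian_update(w_jk, y_j, y_k, gamma):
    """Strengthen the connection when both units are active together."""
    return w_jk + gamma * y_j * y_k


def delta_rule_update(w_jk, y_j, y_k, d_k, gamma):
    """Adjust the connection in proportion to the gap between desired and actual activation."""
    return w_jk + gamma * y_j * (d_k - y_k)


# Hebbian: the weight grows only when both activations are non-zero.
w_both_active = hebbian_update(0.0, y_j=1.0, y_k=1.0, gamma=0.5)    # 0.5
w_one_inactive = hebbian_update(0.0, y_j=1.0, y_k=0.0, gamma=0.5)   # 0.0

# Delta rule: the weight changes only while the output differs from the teacher's value.
w_with_error = delta_rule_update(0.0, y_j=1.0, y_k=0.25, d_k=1.0, gamma=0.5)  # 0.375
w_no_error = delta_rule_update(0.0, y_j=1.0, y_k=1.0, d_k=1.0, gamma=0.5)     # 0.0
```

The Hebbian rule needs no teacher, while the delta rule stops changing the weight once the actual activation matches the desired one.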
Error-correction learning:
Error-correction learning diagram
Let $d_k(n)$ denote some desired response or target response for neuron $k$ at time $n$. Let
the corresponding value of the actual response (output) of this neuron be denoted by
$y_k(n)$.
Typically, the actual response $y_k(n)$ of neuron $k$ is different from the desired response
$d_k(n)$. Hence, we may define an error signal
$$e_k(n) = d_k(n) - y_k(n)$$
The ultimate purpose of error-correction learning is to minimize a cost function based
on the error signal $e_k(n)$.
A criterion commonly used for the cost function is the instantaneous value of the mean
square-error criterion
$$J(n) = \frac{1}{2} \sum_k e_k^2(n)$$
The network is then optimized by minimizing $J(n)$ with respect to the synaptic
weights of the network. Thus, according to the error-correction learning rule (or delta
rule), the synaptic weight adjustment is given by
$$\Delta w_{kj}(n) = \eta\, e_k(n)\, x_j(n)$$
where $\eta$ is a positive constant that sets the learning rate and $x_j(n)$ is the input signal on the connection from unit $j$ to neuron $k$.
Let $w_{kj}(n)$ denote the value of the synaptic weight $w_{kj}$ at time $n$. At time $n$ an adjustment
$\Delta w_{kj}(n)$ is applied to the synaptic weight $w_{kj}(n)$, yielding the updated value
$$w_{kj}(n+1) = w_{kj}(n) + \Delta w_{kj}(n)$$
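A single update step can be worked through by hand. The numbers below are illustrative assumptions: a linear neuron $k$ with two inputs $x_1 = 1$, $x_2 = 2$, initial weights $w_{k1} = 0.5$, $w_{k2} = -0.5$, desired response $d_k = 1$, and learning rate $\eta = 0.1$. The error is taken as desired minus actual, $e_k = d_k - y_k$, so that a positive error raises the output.

```latex
\begin{aligned}
y_k &= w_{k1} x_1 + w_{k2} x_2 = (0.5)(1) + (-0.5)(2) = -0.5 \\
e_k &= d_k - y_k = 1 - (-0.5) = 1.5 \\
\Delta w_{k1} &= \eta\, e_k\, x_1 = (0.1)(1.5)(1) = 0.15 \\
\Delta w_{k2} &= \eta\, e_k\, x_2 = (0.1)(1.5)(2) = 0.30 \\
w_{k1} &\leftarrow 0.5 + 0.15 = 0.65, \qquad w_{k2} \leftarrow -0.5 + 0.30 = -0.20
\end{aligned}
```

With the updated weights the output becomes $(0.65)(1) + (-0.20)(2) = 0.25$, so the error falls from $1.5$ to $0.75$ and the instantaneous cost $\tfrac{1}{2}e_k^2$ falls from $1.125$ to about $0.28$.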
The perceptron’ training algorithm
The Perceptron Learning Rule
Perceptron’s are trained on examples of desired behavior, which can besummarized by a
set of input-output pairs
{𝒑𝟏, 𝒕𝟏}, {𝒑𝟐, 𝒕𝟐}, … , {𝒑𝑸, 𝒕𝑸}
The objective of training is to reduce the error e, which is the difference 𝒕 – 𝒂
between the target vector 𝒕 and the perceptron output 𝒂.
This is done by adjusting the weights (W) and biases (b) of the perceptron network
according to the following equations, where p is the input vector:
W(new) = W(old) + e pᵀ
b(new) = b(old) + e
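A single update step could be sketched as follows. The sketch assumes the common form of the rule, W(new) = W(old) + e pᵀ and b(new) = b(old) + e with e = t − a, applied to one neuron with a 0/1 hard-limit output. The helper names hardlim and perceptron_step are hypothetical.

```python
def hardlim(n):
    """Hard-limit transfer function (assumed here): 1 if n >= 0, else 0."""
    return 1 if n >= 0 else 0


def perceptron_step(w, b, p, t):
    """Apply one perceptron update for input vector p and target t.

    Returns the new weights, the new bias, and the error e = t - a.
    """
    a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)  # perceptron output
    e = t - a                                              # error
    w_new = [wi + e * pi for wi, pi in zip(w, p)]          # W_new = W_old + e * p^T
    b_new = b + e                                          # b_new = b_old + e
    return w_new, b_new, e


if __name__ == "__main__":
    # Zero weights give output 1 for any input, so a target of 0 produces e = -1.
    print(perceptron_step([0.0, 0.0], 0.0, p=[2.0, 1.0], t=0))
```

When the output already equals the target, e is zero and neither the weights nor the bias change.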
Diagram of a neuron:
The neuron computes the weighted sum of the input signals and compares the result
with a threshold value, 𝜽. If the net input is less than the threshold, the neuron output is
-1. But if the net input is greater than or equal to the threshold, the neuron becomes
activated and its output attains a value +1.
The neuron uses the following transfer or activation function:
X = Σ (i = 1 to n) xi wi
Y = +1 if X ≥ 𝜽, and Y = -1 if X < 𝜽
Training algorithm for a single neuron
In 1958, Frank Rosenblatt introduced a training algorithm that provided the first
procedure for training a simple ANN: a perceptron.
Single-layer two-input perceptron
The Perceptron
The operation of Rosenblatt’s perceptron is based on the McCulloch and Pitts neuron
model. The model consists of a linear combiner followed by a hard limiter.
The weighted sum of the inputs is applied to the hard limiter, which produces an
output equal to +1 if its input is positive and -1 if it is negative.
The aim of the perceptron is to classify inputs, x1, x2. . . xn, into one of two
classes, say A1 and A2.
In the case of an elementary perceptron, the n-dimensional space is divided by a
hyperplane into two decision regions. The hyperplane is defined by the linearly
separable function:
Σ (i = 1 to n) xi wi − 𝜽 = 0
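For two inputs the hyperplane reduces to a straight line. As a worked illustration, assuming the boundary has the form Σ xi wi − θ = 0 and choosing the hypothetical values w1 = w2 = 1 and θ = 1.5:

```latex
\begin{aligned}
x_1 w_1 + x_2 w_2 - \theta &= 0 \\
x_2 &= \frac{\theta - w_1 x_1}{w_2} = 1.5 - x_1
\qquad (w_1 = w_2 = 1,\ \theta = 1.5)
\end{aligned}
```

With these values the point (1, 1) gives x1 + x2 − θ = 0.5 > 0 and falls in one class, while (0, 0), (0, 1) and (1, 0) give negative values and fall in the other, which is the separation required for the AND operation.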
Linear separability in perceptrons
How does the perceptron learn its classification tasks?
This is done by making small adjustments in the weights to reduce the difference between
the actual and desired outputs of the perceptron. The initial weights are randomly assigned,
usually in the range [-0.5, 0.5], and then updated to obtain the output consistent with the
training examples.
If at iteration p, the actual output is Y(p) and the desired output is Yd(p), then the error
is given by:
e(p) = Yd(p) − Y(p)
where p = 1, 2, 3, . . .
Iteration p here refers to the pth training example presented to the perceptron.
If the error, e(p), is positive, we need to increase perceptron output Y(p), but if it is
negative, we need to decrease Y(p).
The perceptron learning rule
wi(p + 1) = wi(p) + α · xi(p) · e(p)
where p = 1, 2, 3, . . .
• α is the learning rate, a positive constant less than unity.
• The perceptron learning rule was first proposed by Rosenblatt in 1960. Using
this rule we can derive the perceptron training algorithm for classification
tasks.
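The training procedure described above could be sketched in Python as follows. This is an illustration under stated assumptions rather than the algorithm exactly as given in the notes: weights start at zero for reproducibility (the notes use random values in [-0.5, 0.5]), the threshold is adapted like a weight attached to a constant input of -1, targets are encoded as -1 and +1, and the names activate and train_perceptron are hypothetical.

```python
def activate(x, w, theta):
    """Hard limiter: +1 if the weighted sum reaches the threshold, otherwise -1."""
    net = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if net >= theta else -1


def train_perceptron(samples, alpha=0.5, max_epochs=100):
    """Train one perceptron with w_i(p+1) = w_i(p) + alpha * x_i(p) * e(p).

    samples is a list of (inputs, desired_output) pairs with outputs in {-1, +1}.
    Returns the weights, the threshold, and a flag telling whether a full pass
    over the data produced no errors within max_epochs.
    """
    n = len(samples[0][0])
    w = [0.0] * n          # assumption: zero start instead of random weights
    theta = 0.0            # assumption: the threshold is learned as well
    for _ in range(max_epochs):
        mistakes = 0
        for x, y_d in samples:
            e = y_d - activate(x, w, theta)       # e(p) = Yd(p) - Y(p)
            if e != 0:
                mistakes += 1
                w = [wi + alpha * xi * e for wi, xi in zip(w, x)]
                theta -= alpha * e                # threshold moves opposite to the error
        if mistakes == 0:                         # every example classified correctly
            return w, theta, True
    return w, theta, False


AND = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
OR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

if __name__ == "__main__":
    for name, data in (("AND", AND), ("OR", OR)):
        w, theta, converged = train_perceptron(data)
        print(name, w, theta, converged)
```

The learning rate of 0.5 was chosen only so that every update is an exact multiple of 1, which keeps the arithmetic free of rounding effects; any positive value below unity fits the rule.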
Two-dimensional plots of basic logical operations
A perceptron can learn the operations AND and OR, but not Exclusive-OR.
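This claim can be checked numerically on a small scale. The sketch below is a brute-force search, not a proof: it tries every combination of w1, w2 and θ on a coarse grid (an arbitrary choice made for this illustration) and reports whether any single threshold unit with 0/1 output reproduces each truth table.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}

# Candidate values -2.0, -1.5, ..., 2.0 for each weight and for the threshold.
GRID = [i / 2 for i in range(-4, 5)]


def unit_output(x, w1, w2, theta):
    """Single threshold unit: 1 if x1*w1 + x2*w2 >= theta, otherwise 0."""
    return 1 if x[0] * w1 + x[1] * w2 >= theta else 0


def find_separator(targets):
    """Return the first (w1, w2, theta) on the grid that matches targets, or None."""
    for w1, w2, theta in product(GRID, repeat=3):
        if [unit_output(x, w1, w2, theta) for x in INPUTS] == targets:
            return w1, w2, theta
    return None


if __name__ == "__main__":
    for name, targets in TARGETS.items():
        print(name, find_separator(targets))
```

On this grid the search finds weights for AND and OR but returns None for XOR, which is consistent with Exclusive-OR not being linearly separable.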
17. What is a neural network?
A neural network is an artificial system made of interconnected nodes (neurons) that process
information, modeled after the structure of the human brain. It is employed in machine
learning tasks where patterns are extracted from data.
18. How does a neural network work?
Layers of connected neurons process data in neural networks. The network processes input
data, modifies weights during training, and produces an output depending on patterns that it
has discovered.
19. What are the common types of neural network architectures?
Feed forward neural networks, recurrent neural networks (RNNs), convolutional neural
networks (CNNs), and long short-term memory networks (LSTMs) are examples of common
architectures that are each designed for a certain task.
20. What is the difference between supervised and unsupervised learning in neural networks?
In supervised learning, labeled data is used to train a neural network so that it may learn to
map inputs to matching outputs. Unsupervised learning works with unlabeled data and looks
for structures or patterns in the data.
21. How do neural networks handle sequential data?
The feedback loops that recurrent neural networks (RNNs) incorporate allow them to process
sequential data and, over time, capture dependencies and context.
22. Explain multilayer perceptron.
Ans.: The Multilayer Perceptron (MLP) is a feed-forward neural network built from several
interconnected layers. Each neuron in one layer has directed connections to the neurons of the
next layer. It consists of three types of layers: the input layer, one or more hidden layers,
and the output layer.
23. Explain back propagation.
Ans.: Back propagation is a training method used for a multi-layer neural network. It is also
called the generalized delta rule. It is a gradient descent method which minimizes the total
squared error of the output computed by the net.
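To make the idea concrete, the following sketch applies gradient descent on the squared error to a tiny network with two inputs, two sigmoid hidden units and one sigmoid output. The network size, the starting weights, the learning rate and all function names are assumptions chosen for illustration, and only a single training pair is used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Forward pass: return the hidden activations h and the scalar output y."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, y


def loss(x, d, W1, b1, W2, b2):
    """Squared error 0.5 * (d - y)^2 for one training pair."""
    _, y = forward(x, W1, b1, W2, b2)
    return 0.5 * (d - y) ** 2


def backprop(x, d, W1, b1, W2, b2):
    """Backward pass: gradients of the squared error for every weight and bias."""
    h, y = forward(x, W1, b1, W2, b2)
    delta_out = (y - d) * y * (1.0 - y)              # error term at the output unit
    delta_hid = [delta_out * w * hi * (1.0 - hi)     # error sent back to each hidden unit
                 for w, hi in zip(W2, h)]
    grad_W2 = [delta_out * hi for hi in h]
    grad_W1 = [[dh * xi for xi in x] for dh in delta_hid]
    return grad_W1, delta_hid, grad_W2, delta_out    # bias gradients equal the deltas


def gradient_step(x, d, W1, b1, W2, b2, lr=0.5):
    """One gradient-descent update; lr is an arbitrary illustrative learning rate."""
    gW1, gb1, gW2, gb2 = backprop(x, d, W1, b1, W2, b2)
    W1 = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    b2 = b2 - lr * gb2
    return W1, b1, W2, b2


if __name__ == "__main__":
    params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
    x, d = [1.0, 0.0], 1.0
    print("loss before:", loss(x, d, *params))
    params = gradient_step(x, d, *params)
    print("loss after: ", loss(x, d, *params))
```

The error term of each hidden unit is computed from the error term of the output unit, which is why the method is described as a generalization of the delta rule.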
24. What are hyperparameters?
Ans.: Hyperparameters are parameters whose values control the learning process and thereby
influence the values of the model parameters that a learning algorithm ends up learning.
25. Explain the need for hidden layers.
Ans:
1. A network with only two layers (input and output) can only represent the input with whatever
representation already exists in the input data.
2. If the data is discontinuous or non-linearly separable, the innate representation is inconsistent,
and the mapping cannot be learned using two layers (Input and Output).
3. Therefore, hidden layer(s) are used between input and output layers.
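As an illustration of the point above, the Exclusive-OR function, which is not linearly separable, can be computed once a hidden layer is inserted. The weights and thresholds in this sketch are hand-picked for the example, not learned, and the function names are hypothetical.

```python
def step(net):
    """Threshold unit with 0/1 output: fires when the net input is non-negative."""
    return 1 if net >= 0 else 0


def xor_network(x1, x2):
    """Two-input network with one hidden layer; all weights are hand-picked."""
    h_or = step(x1 + x2 - 0.5)       # hidden unit 1 behaves like OR
    h_and = step(x1 + x2 - 1.5)      # hidden unit 2 behaves like AND
    return step(h_or - h_and - 0.5)  # output fires for OR but not AND


if __name__ == "__main__":
    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x1, x2, xor_network(x1, x2))
```

The hidden units re-represent the input as the pair (OR, AND), and in that representation the two classes can be separated by a single threshold unit.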
26. Explain activation functions.
Ans.: An activation function, also known as a transfer function, maps the net input of a neuron
to its output. It can also normalize the output to a fixed range, such as 0 to 1 or -1 to 1. The
activation function is one of the most important factors in a neural network, because it decides
whether or not a neuron is activated and its signal is passed on to the next layer.
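Three commonly used activation functions could be sketched as follows. The selection (binary step, logistic sigmoid, hyperbolic tangent) is an assumption made for illustration, since the notes do not name specific functions here.

```python
import math


def binary_step(z):
    """Output 1 when z >= 0, otherwise 0."""
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    """Logistic function with outputs between 0 and 1; avoids overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def tanh(z):
    """Hyperbolic tangent with outputs between -1 and 1."""
    return math.tanh(z)


if __name__ == "__main__":
    for z in (-2.0, 0.0, 2.0):
        print(z, binary_step(z), sigmoid(z), tanh(z))
```

The sigmoid corresponds to the 0 to 1 range and the hyperbolic tangent to the -1 to 1 range.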