ARTIFICIAL NEURAL NETWORKS
Akash Ranjan Das, Aman Jaiswal
Computer Science and Engineering, Siddaganga Institute Of Technology
B.H. Road, Tumakuru, Karnataka, India
akash.ranjan199@gmail.com
jaiswalaman97@gmail.com
Abstract: The human brain consists of roughly 86 billion neurons, and each neuron is formed from four basic parts: dendrites, soma, axon, and synapses. A neuron collects signals through its dendrites; the soma sums all the collected signals, and when the summation reaches a threshold, the signal passes through the axon to other neurons. The synapses indicate the strength of the interconnections between neurons.
Similar to the brain, the artificial neural network (ANN) imitates this biological neural network of the human body. Conventional computer programs execute exactly the commands defined by the programmer, but an ANN, like the brain, learns through examples and experience rather than from predefined commands.
Keywords: Neurons, Activation function, Sigmoid function
I. INTRODUCTION
Deep Learning is the most exciting and powerful
branch of Machine Learning. It's a technique that
teaches computers to do what comes naturally to
humans: learn by example. Deep learning is a key
technology behind driverless cars, enabling them to
recognize a stop sign or to distinguish a pedestrian
from a lamppost. It is the key to voice control in
consumer devices like phones, tablets, TVs, and
hands-free speakers. Deep learning is getting lots of
attention lately and for good reason. It’s achieving
results that were not possible before.
In deep learning, a computer model learns to perform
classification tasks directly from images, text, or
sound. Deep learning models can achieve state-of-
the-art accuracy, sometimes exceeding human-level
performance. Models are trained by using a large set
of labeled data and neural network architectures that
contain many layers.
Deep Learning models can be used for a variety of
complex tasks:
1. Artificial Neural Networks (ANN) for regression and classification
2. Convolutional Neural Networks (CNN) for computer vision
3. Recurrent Neural Networks (RNN) for time-series analysis
4. Self-organizing maps for feature extraction
5. Deep Boltzmann machines for recommendation systems
6. Autoencoders for recommendation systems
In this paper, we focus on Artificial Neural Networks.
“Artificial Neural Networks or ANN is an
information processing paradigm that is inspired by
the way the biological nervous system such as a brain
processes information. It is composed of a large
number of highly interconnected processing
elements(neurons) working in unison to solve a
specific problem.”
II. NEURONS
Biological Neurons (also called nerve cells) or simply
neurons are the fundamental units of the brain and
nervous system, the cells responsible for receiving
sensory input from the external world via dendrites,
process it and give the output through Axons.
Fig 1. Human Neuron
I. Cell body (Soma): The body of the neuron cell contains the nucleus and carries out the biochemical transformations necessary to the life of the neuron.
II. Dendrites: Each neuron has fine, hair-like tubular
structures (extensions) around it. They branch out
into a tree around the cell body. They accept
incoming signals.
III. Axon: It is a long, thin, tubular structure that
works like a transmission line.
IV. Synapse: Neurons are connected to one another in a complex spatial arrangement. When the axon reaches its final destination, it branches again; this branching is called terminal arborization. At the ends of these branches are highly complex and specialized structures called synapses. The connection between two neurons takes place at these synapses.
Dendrites receive input through the synapses of other
neurons. The soma processes these incoming signals
over time and converts that processed value into an
output, which is sent out to other neurons through the
axon and the synapses.
Fig 2. The Flow of Electric Signals through neurons
The following diagram represents the general model of an ANN, inspired by a biological neuron. A single-layer neural network of this kind is called a perceptron, and it gives a single output.
Fig 3. Perceptron
In the above figure, for one single observation, x0, x1, x2, ..., xn represent the various inputs (independent variables) to the network. Each of these inputs is multiplied by a connection weight, or synapse. The weights are represented as w0, w1, w2, ..., wn. A weight shows the strength of a particular connection.
b is a bias value. A bias value allows you to shift the
activation function up or down.
In the simplest case, these products are summed together with the bias, fed to a transfer function (activation function) to generate a result, and this result is sent as the output.
Mathematically,
x1*w1 + x2*w2 + x3*w3 + ... + xn*wn = ∑ xi*wi
and the activation function is then applied to this weighted sum plus the bias: Φ(∑ xi*wi + b).
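As a concrete illustration, here is a minimal Python sketch of this computation using NumPy. The input, weight, bias, and threshold values are arbitrary and chosen purely for illustration:

```python
import numpy as np

def perceptron(x, w, b, activation):
    # Weighted sum of the inputs plus the bias, passed through an activation.
    z = np.dot(w, x) + b          # z = sum(xi * wi) + b
    return activation(z)

def step(z, threshold=0.0):
    # Simple threshold activation: 1 if z exceeds the threshold, else 0.
    return 1.0 if z > threshold else 0.0

# Illustrative values: three inputs with arbitrary weights and bias.
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.3, -0.1])
b = 0.2
print(perceptron(x, w, b, step))
```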
III. ACTIVATION FUNCTIONS
The activation function is important for an ANN to learn and make sense of something really complicated. Its main purpose is to convert the input signal of a node in an ANN to an output signal, which is then used as input to the next layer in the stack.
The activation function decides whether a neuron should be activated or not by calculating the weighted sum and adding the bias to it. The motive is to introduce non-linearity into the output.
If we do not apply an activation function, the output signal is simply a linear function (a one-degree polynomial). A linear function is easy to solve, but it is limited in complexity and has less representational power. Without an activation function, our model cannot learn and model complicated data such as images, videos, audio, and speech.
Non-linear functions are those with a degree of more than one; they have curvature. We need a neural network to be able to learn and represent almost anything: any arbitrarily complex function that maps inputs to outputs.
Types of Activation Functions:
A. Threshold Activation Function — (Binary step
function)
A binary step function is a threshold-based activation function: if the input value is above a certain threshold, the neuron is activated and sends exactly the same signal to the next layer; otherwise it is not activated.
Fig 4. A binary step function
A = 1 if y > threshold, and A = 0 otherwise; in other words, the neuron is “activated” only when y exceeds the threshold.
This function is fine for a binary classifier (output 1 or 0), but if you want multiple such neurons connected to represent more classes (Class 1, Class 2, Class 3, etc.), there is a problem: when more than one neuron outputs 1, we cannot decide which class to choose.
B. Sigmoid Activation Function — (Logistic
function)
A sigmoid function is a mathematical function with a characteristic “S”-shaped (sigmoid) curve whose output ranges between 0 and 1; it is therefore used in models where we need to predict a probability as the output.
Fig 5. Sigmoid Curve
The sigmoid function is differentiable, which means we can find the slope of the curve at any point.
The drawback of the sigmoid activation function is that it can cause the neural network to get stuck during training when strongly negative inputs are provided: the output saturates near zero, so the gradients become very small.
C. Hyperbolic Tangent Function — (tanh)
It is similar to the sigmoid but often performs better. It is non-linear in nature, so layers can be stacked. The function's output ranges between (-1, 1).
Fig 6. Hyperbolic tangent function
The main advantage of this function is that strongly negative inputs are mapped to negative outputs, and only zero-valued inputs are mapped to near-zero outputs, so the network is less likely to get stuck during training.
D. Rectified Linear Units — (ReLu)
ReLU is the most widely used activation function in CNNs and ANNs; its output ranges from zero to infinity, [0, ∞).
Fig 7. ReLu
It gives an output x if x is positive, and 0 otherwise. It may look like it has the same problem as a linear function, since it is linear on the positive axis; however, ReLU is non-linear in nature, and combinations of ReLUs are also non-linear. In fact, ReLU is a good approximator: any function can be approximated by a combination of ReLUs.
ReLU has been reported to train roughly six times faster than the hyperbolic tangent function.
ReLU should only be applied to the hidden layers of a neural network. For the output layer, use a softmax function for classification problems and a linear function for regression problems.
One problem here is that some gradients are fragile during training and can die: a weight update can push a neuron into a region where it never activates on any data point again. In other words, ReLU can produce dead neurons.
To fix the problem of dying neurons, Leaky ReLU was introduced. Leaky ReLU adds a small slope on the negative side to keep the updates alive, and its output ranges from -∞ to +∞.
Fig 8. ReLu vs Leaky ReLu
The leak increases the range of the ReLU function; usually the slope is a = 0.01 or so. When a is not fixed but chosen randomly, the function is called Randomized ReLU.
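Each of the activation functions above is only a line or two of code. The sketch below implements them in Python with NumPy; the threshold and the leak parameter a are assumptions, set to the typical values mentioned above:

```python
import numpy as np

def binary_step(z, threshold=0.0):
    return np.where(z > threshold, 1.0, 0.0)   # output exactly 0 or 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))            # output in (0, 1)

def tanh(z):
    return np.tanh(z)                          # output in (-1, 1)

def relu(z):
    return np.maximum(0.0, z)                  # output in [0, inf)

def leaky_relu(z, a=0.01):
    return np.where(z > 0, z, a * z)           # small slope for z < 0

z = np.linspace(-3.0, 3.0, 7)
for f in (binary_step, sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, np.round(f(z), 3))
```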
IV. HOW DOES A NEURAL NETWORK WORK?
Let us take the example of the price of a property. To start with, we have different factors assembled in a single row of data: Area, Bedrooms, Distance to City, and Age.
Fig 9.
The input values go through the weighted synapses
straight over to the output layer. All four will be
analyzed, an activation function will be applied, and
the results will be produced.
This is simple enough but there is a way to amplify
the power of the Neural Network and increase its
accuracy by the addition of a hidden layer that sits
between the input and output layers.
Fig 10. A neural network with a hidden layer (only showing non-zero weights)
Now, in the above figure, all four variables are connected to the neurons via synapses. However, not all of the synapses carry weight: each has either a zero or a non-zero value. A non-zero value indicates that the input is important to that neuron; a zero value means the input is discarded by it.
Suppose, for example, that Area and Distance to City are non-zero for the first neuron: they are weighted and matter to that neuron. The other two variables, Bedrooms and Age, aren't weighted and so are not considered by the first neuron.
You may wonder why that first neuron is only
considering two of the four variables. In this case, it
is common on the property market that larger homes
become cheaper the further they are from the city.
That’s a basic fact. So what this neuron may be doing
is looking specifically for properties that are large but
are not so far from the city.
Now, this is where the power of neural networks
comes from. There are many of these neurons, each
doing similar calculations with different
combinations of these variables.
Once this criterion has been met, the neuron applies the activation function and does its calculation. The next neuron down may have weighted synapses for Distance to City and Bedrooms.
This way, the neurons work and interact in a very flexible way, allowing the network to look for specific things and therefore make a comprehensive search for whatever it is trained for.
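To make this concrete, the sketch below pushes one row of the property data through a small hidden layer in Python. The weight values, including the zeros that switch Bedrooms and Age off for the first neuron, are invented for illustration only, not learned from data:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# One observation: [Area (sq ft), Bedrooms, Distance to city (km), Age (yrs)]
x = np.array([1400.0, 3.0, 12.0, 20.0])

# Hidden layer of 3 neurons; a zero entry means that input is ignored
# by that neuron. All values are illustrative, not learned.
W_hidden = np.array([
    [0.002, 0.0, -0.05,  0.0],   # neuron 1: Area and Distance only
    [0.0,   0.4, -0.03,  0.0],   # neuron 2: Bedrooms and Distance only
    [0.001, 0.0,  0.0,  -0.02],  # neuron 3: Area and Age only
])
W_out = np.array([50.0, 30.0, 40.0])  # output weights (illustrative)

h = relu(W_hidden @ x)          # hidden-layer activations
price = W_out @ h               # predicted price
print(price)
```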
V. HOW DO NEURAL NETWORKS LEARN?
Looking at an analogy may be useful in
understanding the mechanisms of a neural network.
Learning in a neural network is closely related to how
we learn in our regular lives and activities — we
perform an action and are either accepted or corrected
by a trainer or coach to understand how to get better
at a certain task. Similarly, neural networks require a
trainer in order to describe what should have been
produced as a response to the input. Based on the
difference between the actual value and the predicted
value, an error value, also called the cost function, is computed and sent back through the system.
Cost function: one half of the squared difference between the actual and predicted values, C = ½ (ŷ - y)².
For each layer of the network, the cost function is analyzed and used to adjust the thresholds and weights for the next input. Our aim is to minimize the cost function: the lower it is, the closer the predicted value is to the actual value. In this way, the error becomes marginally smaller with each run as the network learns how to analyze values.
We feed the resulting data back through the entire
neural network. The weighted synapses connecting
input variables to the neuron are the only thing we
have control over.
As long as there is a disparity between the actual value and the predicted value, we need to adjust those weights. Once we tweak them a little and run the neural network again, a new cost value is produced, hopefully smaller than the last.
We need to repeat this process until we have driven the cost function down as close to its minimum as possible.
Fig 11.
The procedure described above is known as back-propagation and is applied repeatedly through the network until the error value reaches a minimum.
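A minimal sketch of this loop for a single sigmoid neuron, using the cost C = ½(ŷ - y)² and the chain rule for the weight updates; the toy data, learning rate, and epoch count are assumptions chosen for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: four observations with two inputs each and binary targets.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 1.])
w, b, lr = np.zeros(2), 0.0, 0.5

for epoch in range(2000):
    y_hat = sigmoid(X @ w + b)             # forward pass
    cost = 0.5 * np.sum((y_hat - y) ** 2)  # C = 1/2 * sum((y_hat - y)^2)
    # Back-propagation: dC/dw via the chain rule through the sigmoid.
    delta = (y_hat - y) * y_hat * (1 - y_hat)
    w -= lr * (X.T @ delta)                # adjust weights downhill
    b -= lr * np.sum(delta)

print(np.round(sigmoid(X @ w + b), 2), round(cost, 4))
```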
There are basically three ways to adjust the weights:
A. Brute-force method
This method is best suited to a single-layer feed-forward network. Here you try a large number of possible weights and eliminate all of them except the one right at the bottom of the U-shaped cost curve.
The optimal weight can be found using simple elimination techniques, but this process only works if you have one weight to optimize. With a complex neural network containing many weights, the method fails because of the curse of dimensionality.
The alternative approach that we have is called Batch
Gradient Descent.
B. Batch-Gradient Descent
Batch gradient descent is a first-order iterative optimization algorithm; its responsibility is to find the minimum of the cost (loss) in the process of training the model by updating the weights.
Fig 12. Gradient Descent
In gradient descent, instead of going through every weight one at a time and ticking every wrong weight off as you go, we look at the slope of the cost curve at the current weight.
If the slope is negative, we move the weight to the right, down the curve; if the slope is positive, we move it to the left. Either way, the update w ← w - η·(dC/dw) steps downhill.
This way, a vast number of incorrect weights are eliminated at once. The cost, however, is that each update uses the entire data set: if we have 3 million samples, every single weight update requires looping through all 3 million of them to compute the cost.
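The following sketch applies this idea to a single weight, assuming a simple convex cost C(w) = (w - 2)² whose minimum is known to be at w = 2:

```python
# Batch gradient descent on one weight with cost C(w) = (w - 2)^2.
def cost_slope(w):
    return 2.0 * (w - 2.0)   # dC/dw

w, lr = -5.0, 0.1
for step in range(100):
    slope = cost_slope(w)    # in true batch GD this slope would be
                             # averaged over the entire data set
    w -= lr * slope          # negative slope -> move right, positive -> left
print(round(w, 4))           # converges near the minimum at w = 2
```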
C. Stochastic Gradient Descent (SGD)
Gradient descent works fine when we have a convex curve, just like the one in the figure above. But if the cost curve is not convex, gradient descent can get stuck in a local minimum.
The word ‘stochastic’ refers to a system or process linked with random probability. Hence, in stochastic gradient descent, a few samples are selected randomly for each iteration instead of the whole data set.
In SGD, we take one row of data at a time, run it through the neural network, and then adjust the weights. For the second row, we run it, compare the cost function, and adjust the weights again, and so on.
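Here is a sketch of that row-at-a-time update for a simple linear model; the synthetic data and learning rate are assumptions. Note how the weights are adjusted immediately after every row, rather than after a pass over the whole data set:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))            # 1000 rows, 2 features
y = X @ np.array([3.0, -1.0]) + 0.5       # true weights and bias

w, b, lr = np.zeros(2), 0.0, 0.01
for epoch in range(5):
    for i in rng.permutation(len(X)):     # visit rows in random order
        err = (X[i] @ w + b) - y[i]       # error on this single row
        w -= lr * err * X[i]              # adjust weights immediately
        b -= lr * err
print(np.round(w, 3), round(b, 3))        # approaches [3, -1] and 0.5
```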
SGD helps us avoid the problem of local minima. It is also much faster per update than batch gradient descent, because it processes one row at a time and does not have to load the whole data set into memory for each computation.
One thing to note is that, because SGD is generally noisier than batch gradient descent, it usually takes a larger number of iterations to reach the minimum, owing to the randomness of its descent. Even so, each iteration is so much cheaper that SGD is still computationally far less expensive than batch gradient descent overall. Hence, in most scenarios, SGD is preferred over batch gradient descent for optimizing a learning algorithm.
VI. CONCLUSION
Neural networks are a comparatively new concept whose potential we have only begun to explore. They may be used for a variety of different tasks, and they learn through the specific mechanism of backpropagation and error correction during the training phase. By properly minimizing the error, these multi-layered systems may one day be able to learn and conceptualize ideas on their own, without human correction.