Master's Thesis Defense:
Bio-inspired Algorithms for
Evolving the Architecture of
Convolutional Neural Networks
By Ashray Bhandare
Thesis Advisor: Dr. Devinder Kaur
Page  2
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyperparameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
Mapping GA Chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Particle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  3
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  4
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  5
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  6
Introduction
With traditional machine learning algorithms, a programmer has to tell the computer what kinds of things it should be looking for (feature extraction).
Because of this, the success of the algorithm depends on the programmer and their understanding of the data.
Deep networks solve this problem, as they are capable of finding the right features on their own, requiring very little assistance from the programmer.
The convolutional neural network (CNN) is one such type of deep network.
Page  7
Introduction contd.
Many researchers are exploring the use of CNNs in machine learning problems such as image recognition, video analysis, and natural language processing.
A CNN architecture consists of various layers, and each layer has many hyperparameters.
The vast number of architectures that can be generated from the possible hyperparameter choices makes an exhaustive manual search impossible.
Page  8
Problem Statement
In this thesis, three bio-inspired algorithms, viz. the genetic algorithm (GA), the particle swarm optimizer (PSO), and the grey wolf optimizer (GWO), are used to optimally determine the architecture of a convolutional neural network (CNN) that classifies handwritten digits.
Currently, there is no standard way to automatically determine the architecture of a CNN. Domain knowledge and human expertise are required to design one; typically, architectures are created by experimenting with and modifying a few existing networks.
The bio-inspired algorithms determine the exact architecture of a CNN by evolving the various hyperparameters of the architecture for a given application.
Page  9
MNIST Dataset
The MNIST dataset consists of scanned images of handwritten digits; the associated labels describe which digit (0-9) each image contains.
This classification problem is one of the benchmark problems widely used in deep learning research. It is a popular dataset because it allows researchers to study their proposed methods in a controlled environment.
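As an illustration (not from the slides; it assumes a TensorFlow-backed Keras install), the dataset can be loaded with Keras' built-in helper, and the 50,000-image training set used later in the thesis can be carved out of the standard 60,000-image split:

from tensorflow import keras

# Standard MNIST split: 60,000 training and 10,000 test images of size 28x28.
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# The experiments in this thesis train on 50,000 images; keep the first 50,000.
x_train, y_train = x_train[:50000], y_train[:50000]

# Scale pixels to [0, 1] and add a channel axis so the images fit a ConvNet input.
x_train = (x_train / 255.0).reshape(-1, 28, 28, 1)
x_test = (x_test / 255.0).reshape(-1, 28, 28, 1)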
Page  10
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  11
Convolutional Neural Network
A convolutional neural network (or ConvNet) is a type of feed-forward artificial neural network.
The architecture of a ConvNet is designed to take advantage of the 2D structure of an input image.
A ConvNet is comprised of one or more convolutional layers (often with a pooling step) followed by one or more fully connected layers, as in a standard multilayer neural network.
Page  12
Motivation behind ConvNets
Consider an image of size 200x200x3 (200 wide, 200 high, 3 color channels):
– a single fully-connected neuron in the first hidden layer of a regular neural network would have 200*200*3 = 120,000 weights.
– Because there are several such neurons, this full connectivity is wasteful, and the huge number of parameters would quickly lead to overfitting.
In a ConvNet, however, the neurons in a layer are connected only to a small region of the layer before it, instead of to all of its neurons in a fully-connected manner.
– The final output layer has dimensions 1x1xN, because by the end of the ConvNet architecture the full image is reduced to a single vector of class scores (for N classes) arranged along the depth dimension.
Page  13
MLP vs ConvNet
A regular 3-layer neural network.
A ConvNet arranges its neurons in three dimensions (width, height, depth), as visualized in one of the layers.
Page  14
How ConvNet Works
For example, a ConvNet takes an image as input (a two-dimensional array of pixels) and classifies it as 'X' or 'O'.
In a simple case, 'X' would look like:
(image of an 'X' → CNN → 'X' or 'O')
Page  15
How ConvNet Works
What about trickier cases? A distorted 'X' should still be classified as 'X', and a distorted 'O' as 'O'.
Page  16
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 1 -1 -1 -1
-1 -1 1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 1 -1 -1
-1 -1 -1 1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
= ?
How ConvNet Works – What Computer Sees
Page  17
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 1 -1 -1 -1
-1 -1 1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 1 -1 -1
-1 -1 -1 1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
= ✗ (no exact pixel-for-pixel match)
How ConvNet Works
Page  18
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 X -1 -1 -1 -1 X X -1
-1 X X -1 -1 X X -1 -1
-1 -1 X 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 X -1 -1
-1 -1 X X -1 -1 X X -1
-1 X X -1 -1 -1 -1 X -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
How ConvNet Works – What Computer Sees
Since the pattern does not match exactly, the computer will not be able to
classify this as ‘X’
Page  19
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  20
ConvNet Layers (At a Glance)
CONV layer computes the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume.
RELU layer applies an element-wise activation function, such as max(0, x) thresholding at zero. This leaves the size of the volume unchanged.
POOL layer performs a downsampling operation along the spatial dimensions (width, height).
FC (i.e. fully-connected) layer computes the class scores, resulting in a volume of size [1x1xN], where each of the N numbers corresponds to a class score among the N categories.
Page  21
Recall – What Computer Sees
What got changed? Only a handful of pixel values (marked X below), yet because the pattern does not match exactly, the computer will not be able to classify this as 'X'.
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 X -1 -1 -1 -1 X X -1
-1 X X -1 -1 X X -1 -1
-1 -1 X 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 X -1 -1
-1 -1 X X -1 -1 X X -1
-1 X X -1 -1 -1 -1 X -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Page  22
Convolutional Layer
The convolution layer works to identify patterns (features) instead of individual pixels.
(In the slides, three small patches of the 'X' image are each shown to equal one of three features: two diagonal segments and a crossing.)
Page  23
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
Convolutional Layer – Filters
The CONV layer's parameters consist of a set of learnable filters.
Every filter is small spatially (along width and height) but extends through the full depth of the input volume.
During the forward pass, we slide (more precisely, convolve) each filter across the width and height of the input volume and compute dot products between the entries of the filter and the input at every position.
Page  24
Convolutional Layer – Filters
Sliding a filter over the width and height of the input gives a two-dimensional activation map containing that filter's response at every spatial position.
Page  25
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Strides = 1, Filter Size = 3 X 3 X 1, Padding = 0
Convolutional Layer – Filters – Navigation Example
(The 3 X 3 filter starts at the top-left of the image and visits every position, moving one stride at a time.)
Page  26
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  27
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  28
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  29
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  30
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  31
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  32
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  33
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Navigation Example
Page  34
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  35
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  36
1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  37
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  38
1 1 1
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  39
1 1 1
1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  40
1 1 1
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  41
1 1 1
1 1 1
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  42
1 1 1
1 1 1
1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  43
1 1 1
1 1 1
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  44
1
1 1 1
1 1 1
1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  45
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  46
1 1 -1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  47
1 1 -1
1 1 1
-1 1 1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  48
1
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
1 1 -1
1 1 1
-1 1 1
55
1 1 -1
1 1 1
-1 1 1
EECS6960 Research and Thesis
Convolutional Layer – Filters – Computation Example
Page  49
Convolutional Layer – Strides
• The distance the filter moves across the input from one application to the next is referred to as the stride.
Stride: 1 Stride: 2
Page  50
Convolutional Layer – Padding
Sometimes it is convenient to pad the input volume with zeros around the border.
Zero padding allows us to preserve the spatial size of the output volume.
Padding: 1 Padding: 2
Page  51
1 -1 -1
-1 1 -1
-1 -1 1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
=
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
Input Size (W): 9 X 9
Filter Size (F): 3 X 3
Stride (S): 1
Filters: 1
Padding (P): 0
Feature Map Size = 1 + (W – F + 2P)/S = 1 + (9 – 3 + 2 X 0)/1 = 7, so the 9 X 9 input gives a 7 X 7 feature map
Convolutional Layer – Filters – Computation Example
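As a quick sanity check (illustrative; not from the slides), the output-size formula can be evaluated in Python:

def conv_output_size(w, f, p=0, s=1):
    """Spatial size of a convolution output: 1 + (W - F + 2P) / S."""
    return 1 + (w - f + 2 * p) // s

print(conv_output_size(9, 3))       # -> 7, matching the 7 x 7 map above
print(conv_output_size(9, 3, p=1))  # -> 9: padding of 1 preserves the input size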
Page  52
1 -1 -1
-1 1 -1
-1 -1 1
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
=
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-1 -1 1
-1 1 -1
1 -1 -1
1 -1 1
-1 1 -1
1 -1 1
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
=
=
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Output feature map of one complete convolution:
– Filters: 3
– Filter Size: 3 X 3
– Stride: 1
Conclusion:
– Input Image: 9 X 9
– Output of Convolution: 7 X 7 X 3
Convolutional Layer – Filters – Output Feature Map
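The full forward pass of this layer is easy to express with NumPy. The sketch below (illustrative; the function and variable names are my own) slides a 3 X 3 filter over the 9 X 9 image with stride 1 and normalizes each window's sum of products by the number of filter entries, which reproduces the values shown above (for example, 0.77 = 7/9 in the top-left corner):

import numpy as np

def convolve(image, filt, stride=1):
    """Valid sliding-window match score: sum of element-wise products,
    normalized by the number of filter entries (as in the slides)."""
    f = filt.shape[0]
    out = (image.shape[0] - f) // stride + 1
    fmap = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            window = image[i*stride:i*stride+f, j*stride:j*stride+f]
            fmap[i, j] = (window * filt).sum() / filt.size
    return fmap

# Build the 9x9 'X' image from the slides: +1 on the two diagonals, -1 elsewhere.
image = -np.ones((9, 9))
for i in range(1, 8):
    image[i, i] = image[i, 8 - i] = 1

# The diagonal filter from the slides.
filt = np.array([[ 1, -1, -1],
                 [-1,  1, -1],
                 [-1, -1,  1]])

print(np.round(convolve(image, filt), 2))  # top-left entry is 0.77 (= 7/9)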
Page  53
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
Convolutional Layer – Output
Page  54
Rectified Linear Units (ReLUs)
The ReLU layer applies f(x) = max(0, x) element-wise: every negative value in the feature map becomes zero, and every positive value passes through unchanged.
Feature map before ReLU:
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
After ReLU:
0.77 0 0.11 0.33 0.55 0 0.33
0 1.00 0 0.33 0 0.11 0
0.11 0 1.00 0 0.11 0 0.55
0.33 0.33 0 0.55 0 0.33 0.33
0.55 0 0.11 0 1.00 0 0.11
0 0.11 0 0.33 0 1.00 0
0.33 0 0.55 0.33 0.11 0 0.77
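In NumPy, this rectification is a one-liner (illustrative):

import numpy as np

def relu(fmap):
    """Element-wise max(0, x): negative activations become zero."""
    return np.maximum(fmap, 0)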
Page  58
ReLU layer
0.77 0 0.11 0.33 0.55 0 0.33
0 1.00 0 0.33 0 0.11 0
0.11 0 1.00 0 0.11 0 0.55
0.33 0.33 0 0.55 0 0.33 0.33
0.55 0 0.11 0 1.00 0 0.11
0 0.11 0 0.33 0 1.00 0
0.33 0 0.55 0.33 0.11 0 0.77
0.33 0 0.11 0 0.11 0 0.33
0 0.55 0 0.33 0 0.55 0
0.11 0 0.55 0 0.55 0 0.11
0 0.33 0 1.00 0 0.33 0
0.11 0 0.55 0 0.55 0 0.11
0 0.55 0 0.33 0 0.55 0
0.33 0 0.11 0 0.11 0 0.33
0.33 0 0.55 0.33 0.11 0 0.77
0 0.11 0 0.33 0 1.00 0
0.55 0 0.11 0 1.00 0 0.11
0.33 0.33 0 0.55 0 0.33 0.33
0.11 0 1.00 0 0.11 0 0.55
0 1.00 0 0.33 0 0.11 0
0.77 0 0.11 0.33 0.55 0 0.33
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
0.77 -0.11 0.11 0.33 0.55 -0.11 0.33
-0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11
0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55
0.33 0.33 -0.33 0.55 -0.33 0.33 0.33
0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11
-0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11
0.33 -0.11 0.55 0.33 0.11 -0.11 0.77
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11
0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11
-0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55
0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33
Page  59
Pooling Layer
The pooling layer down-samples the previous layer's feature map.
Its function is to progressively reduce the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network.
The pooling layer often uses the max operation to perform the downsampling. A sketch of this operation follows.
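A minimal NumPy sketch of max pooling (illustrative; edge windows are allowed to be smaller than the pool size, which is how the 7 x 7 rectified map pools to the 4 x 4 result shown below):

import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Keep the maximum of each window; edge windows may be truncated."""
    rows = range(0, fmap.shape[0], stride)
    cols = range(0, fmap.shape[1], stride)
    return np.array([[fmap[i:i+size, j:j+size].max() for j in cols] for i in rows])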
Page  60
1.00
Pooling
Pooling Filter Size = 2 X 2, Stride = 2
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  61
1.00 0.33
Pooling
EECS6960 Research and Thesis
Pooling Filter Size = 2 X 2, Stride = 2
Page  62
1.00 0.33 0.55
Pooling
EECS6960 Research and Thesis
Pooling Filter Size = 2 X 2, Stride = 2
Page  63
1.00 0.33 0.55 0.33
Pooling
 Pooling Filter Size = 2 X 2, Stride = 2
EECS6960 Research and Thesis
Pooling Filter Size = 2 X 2, Stride = 2
Page  64
1.00 0.33 0.55 0.33
0.33
Pooling
EECS6960 Research and Thesis
Pooling Filter Size = 2 X 2, Stride = 2
Page  65
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
Pooling
EECS6960 Research and Thesis
Pooling Filter Size = 2 X 2, Stride = 2
Page  66
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
0.33 0.55 1.00 0.77
0.55 0.55 1.00 0.33
1.00 1.00 0.11 0.55
0.77 0.33 0.55 0.33
0.55 0.33 0.55 0.33
0.33 1.00 0.55 0.11
0.55 0.55 0.55 0.11
0.33 0.11 0.11 0.33
Pooling
Page  67
Layers get stacked
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
1.00 0.33 0.55 0.33
0.33 1.00 0.33 0.55
0.55 0.33 1.00 0.11
0.33 0.55 0.11 0.77
0.33 0.55 1.00 0.77
0.55 0.55 1.00 0.33
1.00 1.00 0.11 0.55
0.77 0.33 0.55 0.33
0.55 0.33 0.55 0.33
0.33 1.00 0.55 0.11
0.55 0.55 0.55 0.11
0.33 0.11 0.11 0.33
Page  68
Deep stacking
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
1.00 0.55
0.55 1.00
0.55 1.00
1.00 0.55
1.00 0.55
0.55 0.55
Page  69
Fully Connected Layer
Fully connected layers are the normal flat feed-forward neural network layers.
These layers may have a non-linear activation function or a softmax activation in order to predict classes.
To compute the output, we simply re-arrange the output matrices into a 1-D array.
1.00 0.55
0.55 1.00
0.55 1.00
1.00 0.55
1.00 0.55
0.55 0.55
1.00
0.55
0.55
1.00
1.00
0.55
0.55
0.55
0.55
1.00
1.00
0.55
Page  70
Fully Connected Layer
A summation of the products of inputs and weights at each output node determines the final prediction.
X
O
0.55
1.00
1.00
0.55
0.55
0.55
0.55
0.55
1.00
0.55
0.55
1.00
Page  71
Putting it all together
-1 -1 -1 -1 -1 -1 -1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 -1 -1 1 -1 -1 -1 -1
-1 -1 -1 1 -1 1 -1 -1 -1
-1 -1 1 -1 -1 -1 1 -1 -1
-1 1 -1 -1 -1 -1 -1 1 -1
-1 -1 -1 -1 -1 -1 -1 -1 -1
X
O
Page  72
Hyperparameters
Convolution
– Filter Size
– Number of Filters
– Padding
– Stride
Pooling
– Window Size
– Stride
Fully Connected
– Number of neurons
Page  73
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  74
Genetic Algorithm (GA)
The genetic algorithm (GA) is inspired by the natural process of evolution.
It is based on two foundations:
– Foundation I: Darwin's theory of natural selection
– Foundation II: Mendel's theory of genetics
Page  75
Genetic Algorithm (GA)
Page  76
Selection
Selection operators give preference to better solutions (chromosomes), allowing them to pass on their 'genes' to the next generation of the algorithm.
The best solutions are determined using some form of objective function (also known as a 'fitness function' in genetic algorithms) before being passed to the crossover operator.
Page  77
Tournament Selection
In tournament selection, K individuals are selected at random from the population, and the best of these becomes a parent. K is known as the tournament selection size.
In the example above, K = 3.
Page  78
Crossover
Crossover is the process of taking more than one parent solution (chromosome) and producing a child solution from them.
By recombining portions of good solutions, the genetic algorithm is more likely to create a better solution.
 A single-point crossover selects a single pivot point (crossover point) on the parent chromosomes.
 All data beyond this pivot point is swapped between the two parent chromosomes, resulting in two offspring chromosomes.
(Diagram: Chromosome X and Chromosome Y are cut at the pivot point, producing Offspring A and Offspring B.)
Page  79
Mutation
The purpose of the mutation operator is to encourage genetic diversity amongst the chromosomes.
If the chromosomes are too similar to each other, the genetic algorithm converges to a local minimum; the mutation operator prevents this from happening.
 The mutation operator flips a randomly selected gene in a chromosome. A sketch of all three operators follows.
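A compact sketch of the three operators on bit-string chromosomes (illustrative; the function names are my own, and the mutation shown flips each bit with a small probability, a common variant of the single-gene flip described above):

import random

def tournament_select(population, fitness, k=3):
    """Pick k individuals at random and return the fittest of them."""
    return max(random.sample(population, k), key=fitness)

def single_point_crossover(x, y):
    """Swap everything beyond a random pivot point, producing two offspring."""
    pivot = random.randrange(1, len(x))
    return x[:pivot] + y[pivot:], y[:pivot] + x[pivot:]

def mutate(chromosome, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]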
Page  80
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  81
Hyperparameters in CNN
Convolution
– Filter Size
– Number of Filters
– Padding
– Stride
Pooling
– Window Size
– Stride
Fully Connected
– Number of neurons
Page  82
Hyperparameters in CNN
Hyperparameter (Range)
No. of Epochs (0 - 127)
Batch Size (0 - 256)
No. of Convolution Layers (0 - 8)
No. of Filters at each Convo layer (0 - 64)
Convo Filter Size at each Convo layer (0 - 8)
Activations used at each Convo layer (sigmoid, tanh, relu, linear)
Maxpool layer after each Convo layer (true, false)
Maxpool Pool Size for each Maxpool layer (0 - 8)
No. of Feed-Forward Hidden Layers (0 - 8)
No. of Feed-Forward Hidden Neurons at each layer (0 - 64)
Activations used at each Feed-Forward layer (sigmoid, tanh, softmax, relu)
Optimizer (Adagrad, Adadelta, RMS, SGD)
Page  83
Mapping of GA Chromosome to CNN Hyperparameters
Each candidate architecture is encoded as a 77-bit chromosome, shown below as a 7 x 11 bit matrix. Consecutive groups of bits, read row-major, decode to the individual hyperparameters, as summarized on the following slide.
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Page  84 EECS6960 Research and Thesis
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Epochs
100
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and Thesis
Page  85
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Batch Size
64
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  86
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Convolutions
2
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  87
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Filters at 1st Convolution
10
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  88
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Filter Size at 1st Convolution
5
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  89
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Activations used at 1st Convolution
1 = TanH
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  90
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Maxpool layer after 1st
Convolution
1 = True
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  91
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Maxpool Pool Size for 1st
Maxpool
5
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  92
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1No. of Filters at 2nd
Convolution
15
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  93
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1Filter Size at 2nd
Convolution layer
3
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  94
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1Activations used at
2nd Convolution
0= Sigmoid
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  95
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Maxpool layer after 2nd
Convolution layer
1 = True
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  96
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Maxpool Pool Size for 2nd
Maxpool
5
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  97
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Feed-Forward Hidden
Layers
3
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  98
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Feed-Forward Hidden
Neurons at 1st layer
32
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  99
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Activations used at 1st Feed-
Forward layer
0 = Sigmoid
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  100
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Feed-Forward Hidden
Neurons at 2nd layer
50
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  101
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Activations used at 2nd Feed-
Forward layer
2 = Linear
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  102
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
No. of Feed-Forward Hidden
Neurons at 3rd layer
10
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  103
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Activations used at 3rd Feed-
Forward layer
2 = Softmax
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  104
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Optimizer
0 = Adagrad
Mapping of GA Chromosome to CNN Hyperparameters
EECS6960 Research and ThesisEECS6960 Research and Thesis
Page  105
Mapping of GA Chromosome to CNN Hyperparameters
1 1 0 0 1 0 0 No. of Epochs: 100
0 1 0 0 0 0 0 0 Batch Size: 64
0 1 0 No. of Convolutions: 2
0 0 1 0 1 0 No. of Filters at 1st Convolution : 10
1 0 1 Filter Size at 1st Convolution : 5
0 1 Activations used at 1st Convolution : Tanh
1 Maxpool layer after 1st Convolution layer : True
1 0 1 Maxpool Pool Size for 1st Maxpool : 5
0 0 1 1 1 1 No. of Filters at 2nd Convolution : 15
0 1 1 Filter Size at 2nd Convolution layer : 3
0 0 Activations used at 2nd Convolution: Sigmoid
1 Maxpool layer after 2nd Convolution layer : True
1 0 1 Maxpool Pool Size for 2nd Maxpool : 5
0 1 1 No. of Feed-Forward Hidden Layers : 3
1 0 0 0 0 0 No. of Feed-Forward Hidden Neurons at 1st layer: 32
0 0 Activations used at 1st Feed-Forward layer : Sigmoid
1 1 0 0 1 0 No. of Feed-Forward Hidden Neurons at 2nd layer: 50
1 1 Activations used at 2nd Feed-Forward layer : Linear
0 0 1 0 1 0 No. of Feed-Forward Hidden Neurons at 3rd layer: 10
1 0 Activations used at 3rd Feed-Forward Layer: Softmax
0 0 Optimizer: Adagrad
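Decoding is plain binary-to-integer conversion over fixed-width groups of bits, read row-major across the bit matrix. A minimal sketch for the first three fields (illustrative):

def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned binary number."""
    return int("".join(str(b) for b in bits), 2)

chromosome = [1,1,0,0,1,0,0,0,1,0,0,  # row 1 of the bit matrix above
              0,0,0,0,0,1,0]          # start of row 2
epochs     = bits_to_int(chromosome[0:7])    # 1100100 -> 100
batch_size = bits_to_int(chromosome[7:15])   # 01000000 -> 64
num_convs  = bits_to_int(chromosome[15:18])  # 010 -> 2
print(epochs, batch_size, num_convs)         # 100 64 2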
Page  106
Fitness Function
The fitness function used in this study is the classification accuracy, which measures the fraction of correctly classified patterns.
This classification accuracy (ranging from 0 to 1) is the fitness value of a particular CNN architecture.
For the evaluation of the CNN, Keras, a high-level neural networks API written in Python, is used to train the convolutional neural networks. It is a deep learning library that allows easy and fast prototyping; it supports all the layers of a CNN and can train the network using various optimization algorithms.
Keras reports a classification accuracy once a CNN architecture is fully trained; a sketch of this evaluation follows.
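A sketch of the fitness evaluation (illustrative, not the thesis code; it assumes the hyperparameters have already been decoded into a dict `hp`, and that the last dense layer encodes the 10 output classes with a softmax, as in the decoded example above):

from tensorflow import keras  # assumes a TensorFlow-backed Keras install

def fitness(hp, x_train, y_train, x_test, y_test):
    """Train a CNN described by decoded hyperparameters; fitness = test accuracy."""
    model = keras.Sequential([keras.Input(shape=(28, 28, 1))])
    for conv in hp["convs"]:            # e.g. {"filters": 10, "size": 5,
        model.add(keras.layers.Conv2D(  #       "act": "tanh", "pool": 5}
            conv["filters"], conv["size"], activation=conv["act"], padding="same"))
        if conv.get("pool"):
            model.add(keras.layers.MaxPooling2D(conv["pool"], padding="same"))
    model.add(keras.layers.Flatten())
    for layer in hp["dense"]:           # e.g. {"units": 32, "act": "sigmoid"};
        model.add(keras.layers.Dense(   # the last entry: 10 units, softmax
            layer["units"], activation=layer["act"]))
    model.compile(optimizer=hp["optimizer"],  # e.g. "adagrad"
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=hp["epochs"],
              batch_size=hp["batch_size"], verbose=0)
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return accuracy  # in [0, 1], used directly as the fitness value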
Page  107
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  108
Evaluation
The genetic algorithm tuner was run on the MNIST dataset, with 50,000 images as its training set and another 10,000 images as its testing set.
The genetic algorithm, with a population of 10 randomly generated chromosomes, was executed 10 times, each time starting from a fresh random population.
Page  109
Results – GA Tuning
Experiment No. Highest Fitness Value
1 0.987799989104
2 0.978100001216
3 0.947200008678
4 0.954100004768
5 0.961800005841
6 0.985799998164
7 0.991900001359
8 0.98910000065
9 0.986600002062
10 0.990600002396
(Figure: GA Tuner – Classification Accuracy vs. Generation; convergence process of GA tuning over 10 generations)
Page  110
Generated Output after GA Tuning
Page  111
Final CNN Architecture after GA Tuning
Page  112
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  113
Particle Swarm Optimization Algorithm (PSO)
PSO is inspired by the social behavior and dynamic, communicative movements of insects, birds, and fish in nature.
It uses a number of agents (particles) that constitute a swarm moving around the search space looking for the best solution.
Each particle adjusts its travelling speed dynamically according to its own flying experience and that of its colleagues.
Page  114
Particle Swarm Optimization Algorithm (PSO)
Page  115
Position Update Rule
The position of a particle i is given by x_i, an L-dimensional vector in R^L.
The change of position of a particle is denoted by Δx_i, a vector that is added to the position coordinates in order to move the particle from one iteration t to the next, t + 1:
x_i(t + 1) = x_i(t) + Δx_i(t + 1)
The vector Δx_i is commonly referred to as the velocity v_i of the particle.
Page  116
Velocity Update Rule
The particle swarm algorithm samples the search space by modifying the velocity of each particle.
The velocity term Δx_i(t + 1) at iteration t + 1 is influenced by the current velocity Δx_i(t), the location of the particle's best success so far, P_i, and the best position found by any member of the swarm, P_g:
Δx_i(t + 1) = Δx_i(t) + φ1 (P_i − x_i(t)) + φ2 (P_g − x_i(t))
Here φ1 and φ2 represent positive random vectors composed of numbers drawn from uniform distributions.
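One iteration of these update rules in NumPy (a minimal sketch; the upper bound phi_max on the uniform draws is an assumption):

import numpy as np

def pso_step(x, v, p_best, g_best, phi_max=2.0, rng=np.random.default_rng()):
    """One velocity + position update for a swarm of shape (n_particles, L)."""
    phi1 = rng.uniform(0, phi_max, x.shape)  # random pull toward personal bests
    phi2 = rng.uniform(0, phi_max, x.shape)  # random pull toward the global best
    v = v + phi1 * (p_best - x) + phi2 * (g_best - x)
    return x + v, v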
Page  117
PSO – Simulation
Page  118
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  119
Mapping of PSO Particle to CNN Hyperparameters
A particle's position is a vector of continuous values in [0, 1]. As the example below shows, each component is thresholded at 0.5 to obtain a bit (0.69 → 1, 0.48 → 0), and the resulting bit matrix is decoded just like a GA chromosome:
0.69 0.59 0.48 0.36 0.61 0.02 0.17 0.45 0.95 0.32 0.19
0.25 0.31 0.42 0.17 0.29 0.68 0.11 0.46 0.36 0.86 0.05
0.46 0.27 0.95 0.73 0.56 0.99 0.23 0.54 0.68 0.23 0.14
0.69 0.73 0.96 0.89 0.13 0.59 0.95 0.82 0.19 0.48 0.25
0.37 0.31 0.16 0.43 0.85 0.53 0.28 0.19 0.93 0.25 0.75
0.55 0.37 0.29 0.88 0.27 0.57 0.43 0.79 0.39 0.27 0.04
0.88 0.24 0.93 0.36 0.73 0.27 0.92 0.65 0.56 0.33 0.67
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
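In NumPy the binarization is a single expression (illustrative):

import numpy as np

position = np.array([0.69, 0.59, 0.48, 0.36, 0.61])  # first few components above
bits = (position > 0.5).astype(int)
print(bits)  # [1 1 0 0 1], matching the start of the first row of the bit matrix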
Page  120
Mapping of PSO Particle to CNN Hyperparameters
The thresholded bit matrix is decoded to hyperparameters exactly as for the GA chromosome (see the summary of the decoded fields on Page 105).
Page  121
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  122
Evaluation
The PSO tuner was run on the MNIST dataset, with 50,000 images as its training set and another 10,000 images as its testing set.
The particle swarm optimizer, with 10 randomly generated particles, was executed 10 times, each time starting from a fresh random swarm.
Page  123
Results – PSO Tuning
Exp No. Highest Fitness Value
1 0.984499992943
2 0.973899998105
3 0.988800008184
4 0.993600005358
5 0.947799991965
6 0.949000005102
7 0.983099997652
8 0.979799999475
9 0.956399999567
10 0.992350000068
(Figure: PSO Tuner – Classification Accuracy vs. Generation; convergence process of PSO tuning over 10 generations)
Page  124
Generated Output after PSO Tuning
Page  125
Final Architecture after PSO Tuning
Page  126
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  127
Grey Wolf Optimization Algorithm (GWO)
The GWO algorithm, proposed by Mirjalili et al. in 2014, mimics the leadership hierarchy and hunting mechanism of grey wolves in nature.
Four types of grey wolves (alpha, beta, delta, and omega) are employed to simulate the leadership hierarchy:
α (Alpha) > β (Beta) > δ (Delta) > ω (Omega)
Page  128 EECS6960 Research and ThesisEECS6960 Research and Thesis
In addition to the social hierarchy of wolves, group hunting is another interesting social behavior of grey wolves. The main phases of grey wolf hunting are as follows:
• Tracking, chasing, and approaching the prey
• Pursuing, encircling, and harassing the prey until it stops moving
• Attacking the prey
(Figure: hunting behavior of grey wolves: (A) chasing, approaching, and tracking prey; (B–D) pursuing, harassing, and encircling; (E) stationary situation and attack)
Grey Wolf Optimization Algorithm (GWO)
Page  129 EECS6960 Research and ThesisEECS6960 Research and Thesis
Grey Wolf Optimizer – Encircling the Prey
Encircling is mathematically modelled as follows:
D = |C · X_p(t) − X(t)|
X(t + 1) = X_p(t) − A · D
where t indicates the current iteration, A and C are coefficient vectors, X_p is the position vector of the prey, and X indicates the position vector of a grey wolf. A and C are given by:
A = 2a · r1 − a
C = 2 · r2
where the components of a are linearly decreased from 2 to 0 over the course of the iterations, and r1, r2 are random vectors in the interval [0, 1].
Page  130 EECS6960 Research and ThesisEECS6960 Research and Thesis
Grey Wolf Optimizer – Attacking the Prey
Grey wolves have the ability to recognize the location of prey and encircle it. The hunt is usually guided by the alpha; the beta and delta might also participate occasionally.
We assume that the alpha (the best candidate solution), beta, and delta have better knowledge of the potential location of prey.
The first three best solutions obtained so far are saved (α, β, and δ), and the positions of the other search agents (the omegas) are updated according to the positions of these best search agents; a new beta and delta can emerge in each iteration as all the wolves update their positions.
Page  131 EECS6960 Research and ThesisEECS6960 Research and Thesis
Grey Wolf Optimizer – Attacking the Prey
Attacking is mathematically modelled with the following equations, and a code sketch follows:
D_α = |C1 · X_α − X|
D_β = |C2 · X_β − X|
D_δ = |C3 · X_δ − X|
X1 = X_α − A1 · D_α
X2 = X_β − A2 · D_β
X3 = X_δ − A3 · D_δ
X(t + 1) = (X1 + X2 + X3) / 3
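One GWO position update in NumPy (a minimal sketch of the equations above; `wolves` is an (n, L) array of positions, and alpha, beta, delta are the three best positions found so far):

import numpy as np

def gwo_step(wolves, alpha, beta, delta, a, rng=np.random.default_rng()):
    """Move every wolf toward the mean of three leader-guided positions."""
    def pull_toward(leader):
        r1 = rng.random(wolves.shape)
        r2 = rng.random(wolves.shape)
        A = 2 * a * r1 - a            # |A| shrinks as `a` decays from 2 to 0
        C = 2 * r2
        D = np.abs(C * leader - wolves)
        return leader - A * D
    return (pull_toward(alpha) + pull_toward(beta) + pull_toward(delta)) / 3

# `a` is decreased linearly from 2 to 0, e.g. a = 2 - 2 * t / max_iterations.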
Page  132 EECS6960 Research and ThesisEECS6960 Research and Thesis
Grey Wolf Optimization Algorithm (GWO)
Page  133
Agenda
Introduction
Convolutional Neural Network
– How ConvNet Works
ConvNet Layers
– Convolutional Layer
– Pooling Layer
– Normalization Layer (ReLU)
– Fully-Connected Layer
Hyper Parameters
Genetic Algorithm (GA)
– Workings of GA
– Selection
– Crossover
– Mutation
EECS6960 Research and Thesis
EECS6960 Research and Thesis
Mapping GA chromosome
GA Tuner Evaluation & Results
Particle Swarm Optimmization (PSO)
– Workings of PSO
– PSO Simulation
Mapping PSO Paticle
PSO Tuner Evaluation & Results
Grey Wolf Optimization (GWO)
– Workings of GWO
Mapping GWO Candidate Solution
GWO Tuner Evaluation & Results
Conclusion
Page  134
Mapping of GWO Candidate Solution to CNN Hyperparameters
As with PSO, each candidate solution is a vector of continuous values in [0, 1]; each component is thresholded at 0.5 to obtain the bit matrix:
0.69 0.59 0.48 0.36 0.61 0.02 0.17 0.45 0.95 0.32 0.19
0.25 0.31 0.42 0.17 0.29 0.68 0.11 0.46 0.36 0.86 0.05
0.46 0.27 0.95 0.73 0.56 0.99 0.23 0.54 0.68 0.23 0.14
0.69 0.73 0.96 0.89 0.13 0.59 0.95 0.82 0.19 0.48 0.25
0.37 0.31 0.16 0.43 0.85 0.53 0.28 0.19 0.93 0.25 0.75
0.55 0.37 0.29 0.88 0.27 0.57 0.43 0.79 0.39 0.27 0.04
0.88 0.24 0.93 0.36 0.73 0.27 0.92 0.65 0.56 0.33 0.67
1 1 0 0 1 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0 1 0
0 0 1 1 1 1 0 1 1 0 0
1 1 0 1 0 1 1 1 0 0 0
0 0 0 0 1 1 0 0 1 0 1
1 0 0 1 0 1 0 1 0 0 0
1 0 1 0 1 0 1 1 1 0 1
Page  135
Mapping of GWO Solution to CNN Hyperparameters
Each group of bits is decoded into one hyperparameter, either as an unsigned binary integer or as an index into a list of choices:

1 1 0 0 1 0 0 → No. of Epochs: 100
0 1 0 0 0 0 0 0 → Batch Size: 64
0 1 0 → No. of Convolutions: 2
0 0 1 0 1 0 → No. of Filters at 1st Convolution: 10
1 0 1 → Filter Size at 1st Convolution: 5
0 1 → Activation used at 1st Convolution: Tanh
1 → Maxpool layer after 1st Convolution layer: True
1 0 1 → Maxpool Pool Size for 1st Maxpool: 5
0 0 1 1 1 1 → No. of Filters at 2nd Convolution: 15
0 1 1 → Filter Size at 2nd Convolution layer: 3
0 0 → Activation used at 2nd Convolution: Sigmoid
1 → Maxpool layer after 2nd Convolution layer: True
1 0 1 → Maxpool Pool Size for 2nd Maxpool: 5
0 1 1 → No. of Feed-Forward Hidden Layers: 3
1 0 0 0 0 0 → No. of Feed-Forward Hidden Neurons at 1st layer: 32
0 0 → Activation used at 1st Feed-Forward layer: Sigmoid
1 1 0 0 1 0 → No. of Feed-Forward Hidden Neurons at 2nd layer: 50
1 1 → Activation used at 2nd Feed-Forward layer: Linear
0 0 1 0 1 0 → No. of Feed-Forward Hidden Neurons at 3rd layer: 10
1 0 → Activation used at 3rd Feed-Forward Layer: Softmax
0 0 → Optimizer: Adagrad
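A hedged sketch of this decoding (the group widths and the activation table are read off the example above; the offsets assume the groups are read sequentially from the flattened chromosome, and all names are illustrative rather than taken from the thesis code):

    # Activation codes read off the example: 00=Sigmoid, 01=Tanh, 10=Softmax, 11=Linear.
    ACTIVATIONS = {0: 'sigmoid', 1: 'tanh', 2: 'softmax', 3: 'linear'}

    def to_int(bits):
        """Interpret a sequence of bits as an unsigned binary integer."""
        return int(''.join(str(b) for b in bits), 2)

    def decode_solution(bits):
        params = {}
        params['epochs'] = to_int(bits[0:7])                           # 1100100 -> 100
        params['batch_size'] = to_int(bits[7:15])                      # 01000000 -> 64
        params['n_conv'] = to_int(bits[15:18])                         # 010 -> 2
        params['conv1_filters'] = to_int(bits[18:24])                  # 001010 -> 10
        params['conv1_filter_size'] = to_int(bits[24:27])              # 101 -> 5
        params['conv1_activation'] = ACTIVATIONS[to_int(bits[27:29])]  # 01 -> tanh
        params['conv1_maxpool'] = bool(bits[29])                       # 1 -> True
        params['conv1_pool_size'] = to_int(bits[30:33])                # 101 -> 5
        # ...the 2nd convolution, the feed-forward layers, and the
        # optimizer decode the same way from the remaining groups.
        return params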
Evaluation
The GWO algorithm tuner was evaluated on the MNIST dataset, using 50,000 images as the training set and another 10,000 images as the testing set.

The grey wolf optimization algorithm, with a population of 10 randomly generated solutions, was executed 10 times, each run starting from a different random initialization.
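Each fitness evaluation trains a candidate CNN and scores it on the test set. A minimal sketch of that step, assuming the decode_solution helper sketched earlier, an illustrative build_and_train helper, and a Keras-style evaluate; none of these names are taken from the thesis code:

    import numpy as np

    def fitness(position, x_train, y_train, x_test, y_test):
        """Score one wolf: decode its position into a CNN, train, test."""
        bits = (position >= 0.5).astype(int)               # binarize the solution
        params = decode_solution(bits)                     # bits -> hyperparameters
        model = build_and_train(params, x_train, y_train)  # assumed helper
        _, accuracy = model.evaluate(x_test, y_test)       # Keras-style evaluation
        return accuracy                                    # fitness = test accuracy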
Results – GWO Tuning
Experiment No.   Highest Fitness Value
 1               0.946400008178
 2               0.948899995995
 3               0.994200000004
 4               0.97359999752
 5               0.961999999666
 6               0.877199997282
 7               0.985900000003
 8               0.899900003791
 9               0.959000001717
10               0.932900003999
[Figure: GWO Tuner, classification accuracy (score) vs. generation for generations 1 to 10. Caption: Convergence process of GWO tuning.]
Generated Output after GWO Tuning
Final CNN Architecture after GWO Tuning
Conclusion
In this thesis, three bio-inspired algorithms, viz. GA, PSO, and GWO, were used to generate fully trained CNN architectures for the MNIST dataset.

It has been demonstrated that the proposed method is capable of choosing relevant hyperparameters and thus forming an optimum CNN architecture. The architectures were generated automatically, without any human intervention.

All experiments carried out using the GA and PSO algorithms yielded classification accuracies of more than 90%, with the highest accuracies being 99.2% and 99.36%, respectively. The GWO experiments yielded classification accuracies of more than 85%, with the highest accuracy being 99.4%.
Conclusion contd.
In the future, this work can be extended to other bio-inspired algorithms. It can also be applied to other datasets, which may contain colored images and may be larger in size, provided better processing power is available.
Algorithm                                 Approx. Processing Time (hours)   Best Run   Worst Run
Genetic Algorithm                         4-5                               0.9919     0.9472
Particle Swarm Optimization Algorithm     4-5                               0.9936     0.9478
Grey Wolf Optimization Algorithm          5-6                               0.9942     0.8772

(Best Run and Worst Run report classification accuracy.)
References
 Karpathy, A. (n.d.). CS231n Convolutional Neural Networks for Visual Recognition. Retrieved from http://cs231n.github.io/convolutional-networks/#overview
 Rohrer, B. (n.d.). How do Convolutional Neural Networks work? Retrieved from http://brohrer.github.io/how_convolutional_neural_networks_work.html
 Brownlee, J. (n.d.). Crash Course in Convolutional Neural Networks for Machine Learning. Retrieved from http://machinelearningmastery.com/crash-course-convolutional-neural-networks/
 Lidinwise (n.d.). The revolution of depth. Retrieved from https://medium.com/@Lidinwise/the-revolution-of-depth-facf174924f5#.8or5c77ss
 Nervana (n.d.). Tutorial: Convolutional neural networks. Retrieved from https://www.nervanasys.com/convolutional-neural-networks/
 de Castro, L. N. (2006). Fundamentals of Natural Computing: Basic Concepts, Algorithms, and Applications. Chapman and Hall/CRC.
 Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). "Grey Wolf Optimizer," Advances in Engineering Software, vol. 69, pp. 46-61.
 Bhandare, A., & Kaur, D. (2017). "Comparative Analysis of Swarm Intelligence Techniques," International Conference on Artificial Intelligence.
Questions
Thank you!!

More Related Content

What's hot

Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Bio-inspired computing Algorithms.pptx
Bio-inspired computing Algorithms.pptxBio-inspired computing Algorithms.pptx
Bio-inspired computing Algorithms.pptxpawansher2002
 
Metaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical AnalysisMetaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical AnalysisXin-She Yang
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimizationAbhishek Agrawal
 
Reinforcement learning and the Frozen Lake Problem
Reinforcement learning and the Frozen Lake ProblemReinforcement learning and the Frozen Lake Problem
Reinforcement learning and the Frozen Lake ProblemVishal Kumar
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimizationSuman Chatterjee
 
Particle Swarm optimization
Particle Swarm optimizationParticle Swarm optimization
Particle Swarm optimizationmidhulavijayan
 
Simulated Annealing
Simulated AnnealingSimulated Annealing
Simulated AnnealingJoy Dutta
 
Artificial fish swarm optimization
Artificial fish swarm optimizationArtificial fish swarm optimization
Artificial fish swarm optimizationAhmed Fouad Ali
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimizationMahesh Tibrewal
 
Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms Xin-She Yang
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimizationanurag singh
 
Semantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSemantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSungjoon Choi
 
Optimization Shuffled Frog Leaping Algorithm
Optimization Shuffled Frog Leaping AlgorithmOptimization Shuffled Frog Leaping Algorithm
Optimization Shuffled Frog Leaping AlgorithmUday Wankar
 
Sca a sine cosine algorithm for solving optimization problems
Sca a sine cosine algorithm for solving optimization problemsSca a sine cosine algorithm for solving optimization problems
Sca a sine cosine algorithm for solving optimization problemslaxmanLaxman03209
 
Genetic Algorithm by Example
Genetic Algorithm by ExampleGenetic Algorithm by Example
Genetic Algorithm by ExampleNobal Niraula
 

What's hot (20)

Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Bio-inspired computing Algorithms.pptx
Bio-inspired computing Algorithms.pptxBio-inspired computing Algorithms.pptx
Bio-inspired computing Algorithms.pptx
 
Swarm intelligence algorithms
Swarm intelligence algorithmsSwarm intelligence algorithms
Swarm intelligence algorithms
 
Firefly algorithm
Firefly algorithmFirefly algorithm
Firefly algorithm
 
Metaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical AnalysisMetaheuristic Algorithms: A Critical Analysis
Metaheuristic Algorithms: A Critical Analysis
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimization
 
Reinforcement learning and the Frozen Lake Problem
Reinforcement learning and the Frozen Lake ProblemReinforcement learning and the Frozen Lake Problem
Reinforcement learning and the Frozen Lake Problem
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimization
 
Particle Swarm optimization
Particle Swarm optimizationParticle Swarm optimization
Particle Swarm optimization
 
Simulated Annealing
Simulated AnnealingSimulated Annealing
Simulated Annealing
 
Artificial fish swarm optimization
Artificial fish swarm optimizationArtificial fish swarm optimization
Artificial fish swarm optimization
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimization
 
Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms Nature-Inspired Optimization Algorithms
Nature-Inspired Optimization Algorithms
 
Particle swarm optimization
Particle swarm optimizationParticle swarm optimization
Particle swarm optimization
 
Semantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep LearningSemantic Segmentation Methods using Deep Learning
Semantic Segmentation Methods using Deep Learning
 
Optimization Shuffled Frog Leaping Algorithm
Optimization Shuffled Frog Leaping AlgorithmOptimization Shuffled Frog Leaping Algorithm
Optimization Shuffled Frog Leaping Algorithm
 
Sca a sine cosine algorithm for solving optimization problems
Sca a sine cosine algorithm for solving optimization problemsSca a sine cosine algorithm for solving optimization problems
Sca a sine cosine algorithm for solving optimization problems
 
Genetic algorithm
Genetic algorithm Genetic algorithm
Genetic algorithm
 
Hands-on Introduction to Machine Learning
Hands-on Introduction to Machine LearningHands-on Introduction to Machine Learning
Hands-on Introduction to Machine Learning
 
Genetic Algorithm by Example
Genetic Algorithm by ExampleGenetic Algorithm by Example
Genetic Algorithm by Example
 

Similar to Bio-inspired Algorithms for Evolving the Architecture of Convolutional Neural Networks

Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNNAshray Bhandare
 
Convolutional Neural Networks
Convolutional Neural NetworksConvolutional Neural Networks
Convolutional Neural NetworksAshray Bhandare
 
IRJET-Multiple Object Detection using Deep Neural Networks
IRJET-Multiple Object Detection using Deep Neural NetworksIRJET-Multiple Object Detection using Deep Neural Networks
IRJET-Multiple Object Detection using Deep Neural NetworksIRJET Journal
 
2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx
2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx
2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptxssuser67281d
 
Artificial Neural Network Implementation on FPGA – a Modular Approach
Artificial Neural Network Implementation on FPGA – a Modular ApproachArtificial Neural Network Implementation on FPGA – a Modular Approach
Artificial Neural Network Implementation on FPGA – a Modular ApproachRoee Levy
 
A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...eSAT Publishing House
 
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...Bomm Kim
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IRamez Abdalla, M.Sc
 
Garbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning TechniquesGarbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning TechniquesIRJET Journal
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classificationijtsrd
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentationOwin Will
 
IRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET Journal
 
Web spam classification using supervised artificial neural network algorithms
Web spam classification using supervised artificial neural network algorithmsWeb spam classification using supervised artificial neural network algorithms
Web spam classification using supervised artificial neural network algorithmsaciijournal
 
Efficiency of Neural Networks Study in the Design of Trusses
Efficiency of Neural Networks Study in the Design of TrussesEfficiency of Neural Networks Study in the Design of Trusses
Efficiency of Neural Networks Study in the Design of TrussesIRJET Journal
 

Similar to Bio-inspired Algorithms for Evolving the Architecture of Convolutional Neural Networks (20)

Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNN
 
CNN Basics.pdf
CNN Basics.pdfCNN Basics.pdf
CNN Basics.pdf
 
Convolutional Neural Networks
Convolutional Neural NetworksConvolutional Neural Networks
Convolutional Neural Networks
 
IRJET-Multiple Object Detection using Deep Neural Networks
IRJET-Multiple Object Detection using Deep Neural NetworksIRJET-Multiple Object Detection using Deep Neural Networks
IRJET-Multiple Object Detection using Deep Neural Networks
 
2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx
2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx
2. NEURAL NETWORKS USING GENETIC ALGORITHMS.pptx
 
Artificial Neural Network Implementation on FPGA – a Modular Approach
Artificial Neural Network Implementation on FPGA – a Modular ApproachArtificial Neural Network Implementation on FPGA – a Modular Approach
Artificial Neural Network Implementation on FPGA – a Modular Approach
 
A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...A simplified design of multiplier for multi layer feed forward hardware neura...
A simplified design of multiplier for multi layer feed forward hardware neura...
 
Final Poster
Final PosterFinal Poster
Final Poster
 
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...(Im2col)accelerating deep neural networks on low power heterogeneous architec...
(Im2col)accelerating deep neural networks on low power heterogeneous architec...
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part I
 
Garbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning TechniquesGarbage Classification Using Deep Learning Techniques
Garbage Classification Using Deep Learning Techniques
 
Hand Written Digit Classification
Hand Written Digit ClassificationHand Written Digit Classification
Hand Written Digit Classification
 
Mnist report ppt
Mnist report pptMnist report ppt
Mnist report ppt
 
Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
IRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural NetworkIRJET-Breast Cancer Detection using Convolution Neural Network
IRJET-Breast Cancer Detection using Convolution Neural Network
 
Mnist report
Mnist reportMnist report
Mnist report
 
Gaze detection
Gaze detectionGaze detection
Gaze detection
 
L1102017479
L1102017479L1102017479
L1102017479
 
Web spam classification using supervised artificial neural network algorithms
Web spam classification using supervised artificial neural network algorithmsWeb spam classification using supervised artificial neural network algorithms
Web spam classification using supervised artificial neural network algorithms
 
Efficiency of Neural Networks Study in the Design of Trusses
Efficiency of Neural Networks Study in the Design of TrussesEfficiency of Neural Networks Study in the Design of Trusses
Efficiency of Neural Networks Study in the Design of Trusses
 

Recently uploaded

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 

Recently uploaded (20)

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 

Bio-inspired Algorithms for Evolving the Architecture of Convolutional Neural Networks

  • 1. Masters Thesis Defense: Bio-inspired Algorithms for Evolving the Architecture of Convolutional Neural Networks By Ashray Bhandare Thesis Advisor: Dr. Devinder Kaur
  • 2. Page  2 Agenda Introduction Convolutional Neural Network – How ConvNet Works ConvNet Layers – Convolutional Layer – Pooling Layer – Normalization Layer (ReLU) – Fully-Connected Layer Hyper Parameters Genetic Algorithm (GA) – Workings of GA – Selection – Crossover – Mutation EECS6960 Research and Thesis EECS6960 Research and Thesis Mapping GA chromosome GA Tuner Evaluation & Results Particle Swarm Optimmization (PSO) – Workings of PSO – PSO Simulation Mapping PSO Paticle PSO Tuner Evaluation & Results Grey Wolf Optimization (GWO) – Workings of GWO Mapping GWO Candidate Solution GWO Tuner Evaluation & Results Conclusion
  • 3. Page  3 Agenda Introduction Convolutional Neural Network – How ConvNet Works ConvNet Layers – Convolutional Layer – Pooling Layer – Normalization Layer (ReLU) – Fully-Connected Layer Hyper Parameters Genetic Algorithm (GA) – Workings of GA – Selection – Crossover – Mutation EECS6960 Research and Thesis EECS6960 Research and Thesis Mapping GA chromosome GA Tuner Evaluation & Results Particle Swarm Optimmization (PSO) – Workings of PSO – PSO Simulation Mapping PSO Paticle PSO Tuner Evaluation & Results Grey Wolf Optimization (GWO) – Workings of GWO Mapping GWO Candidate Solution GWO Tuner Evaluation & Results Conclusion
  • 4. Page  4 Agenda Introduction Convolutional Neural Network – How ConvNet Works ConvNet Layers – Convolutional Layer – Pooling Layer – Normalization Layer (ReLU) – Fully-Connected Layer Hyper Parameters Genetic Algorithm (GA) – Workings of GA – Selection – Crossover – Mutation EECS6960 Research and Thesis EECS6960 Research and Thesis Mapping GA chromosome GA Tuner Evaluation & Results Particle Swarm Optimmization (PSO) – Workings of PSO – PSO Simulation Mapping PSO Paticle PSO Tuner Evaluation & Results Grey Wolf Optimization (GWO) – Workings of GWO Mapping GWO Candidate Solution GWO Tuner Evaluation & Results Conclusion
  • 5. Page  5 Agenda Introduction Convolutional Neural Network – How ConvNet Works ConvNet Layers – Convolutional Layer – Pooling Layer – Normalization Layer (ReLU) – Fully-Connected Layer Hyper Parameters Genetic Algorithm (GA) – Workings of GA – Selection – Crossover – Mutation EECS6960 Research and Thesis EECS6960 Research and Thesis Mapping GA chromosome GA Tuner Evaluation & Results Particle Swarm Optimmization (PSO) – Workings of PSO – PSO Simulation Mapping PSO Paticle PSO Tuner Evaluation & Results Grey Wolf Optimization (GWO) – Workings of GWO Mapping GWO Candidate Solution GWO Tuner Evaluation & Results Conclusion
  • 6. Page  6 Introduction A programmer has to tell the computer what kinds of things it should be looking for (Feature Extraction) when dealing with Traditional Machine Learning algorithms. Due to this, the success of the algorithm is dependent on the programmer and his understanding of the data. Deep networks can solve this problem as it is capable of finding the right features on its own, requiring very little assistance from the programmer. Convolutional Neural Network (CNN) is one such type of deep networks. EECS6960 Research and Thesis EECS6960 Research and Thesis
  • 7. Page  7 Introduction contd. Many researchers are exploring the use of CNN in machine learning problems like image recognition, video analysis, natural language processing and so on. A CNN architecture consists of various layers and each layer consists of many hyperparameters. The vast amount of architectures that can be generated based on the choices of hyperparameters makes it impossible for an exhaustive manual search. EECS6960 Research and Thesis EECS6960 Research and Thesis
  • 8. Page  8 Problem Statement In this thesis, three bio-inspired algorithms viz. genetic algorithm, particle swarm optimizer (PSO) and grey wolf optimizer (GWO) are used to optimally determine the architecture of a convolutional neural network (CNN) that is used to classify handwritten numbers. Currently, there is no standard way to automatically determine the architecture of a CNN. Domain knowledge and human expertise are required in order to design a CNN architecture. Typically architectures are created by experimenting and modifying a few existing networks. The bio-inspired algorithms determine the exact architecture of a CNN by evolving the various hyperparameters of the architecture for a given application. EECS6960 Research and Thesis EECS6960 Research and Thesis
  • 9. Page  9 MNIST Dataset EECS6960 Research and Thesis EECS6960 Research and Thesis  The MNIST dataset is scanned images of handwritten digits and the associated labels describe which digit 0-9 is contained in each image.  This classification problem is one of the benchmark problems and is widely used in deep learning research. It is one of the popular datasets as it allows researchers to study their proposed methods in a controlled environment.
  • 10. Page  10 Agenda Introduction Convolutional Neural Network – How ConvNet Works ConvNet Layers – Convolutional Layer – Pooling Layer – Normalization Layer (ReLU) – Fully-Connected Layer Hyper Parameters Genetic Algorithm (GA) – Workings of GA – Selection – Crossover – Mutation EECS6960 Research and Thesis EECS6960 Research and Thesis Mapping GA chromosome GA Tuner Evaluation & Results Particle Swarm Optimmization (PSO) – Workings of PSO – PSO Simulation Mapping PSO Paticle PSO Tuner Evaluation & Results Grey Wolf Optimization (GWO) – Workings of GWO Mapping GWO Candidate Solution GWO Tuner Evaluation & Results Conclusion
  • 11. Page  11 Convolutional Neural Network A convolutional neural network (or ConvNet) is a type of feed-forward artificial neural network The architecture of a ConvNet is designed to take advantage of the 2D structure of an input image.   A ConvNet is comprised of one or more convolutional layers (often with a pooling step) and then followed by one or more fully connected layers as in a standard multilayer neural network. EECS6960 Research and Thesis VS EECS6960 Research and Thesis
  • 12. Page  12 Motivation behind ConvNets Consider an image of size 200x200x3 (200 wide, 200 high, 3 color channels) – a single fully-connected neuron in a first hidden layer of a regular Neural Network would have 200*200*3 = 120,000 weights. – Due to the presence of several such neurons, this full connectivity is wasteful and the huge number of parameters would quickly lead to overfitting However, in a ConvNet, the neurons in a layer will only be connected to a small region of the layer before it, instead of all of the neurons in a fully- connected manner. – the final output layer would have dimensions 1x1xN, because by the end of the ConvNet architecture we will reduce the full image into a single vector of class scores (for N classes), arranged along the depth dimension EECS6960 Research and Thesis EECS6960 Research and Thesis
  • 13. Page  13 MLP vs ConvNet A regular 3-layer Neural Network. A ConvNet arranges its neurons in three dimensions (width, height, depth), as visualized in one of the layers. EECS6960 Research and Thesis EECS6960 Research and Thesis
  • 14. Page  14 How ConvNet Works For example, a ConvNet takes the input as an image which can be classified as ‘X’ or ‘O’ In a simple case, ‘X’ would look like: X or OCNN A two-dimensional array of pixels EECS6960 Research and Thesis
  • 15. Page  15 How ConvNet Works What about trickier cases? CNN X CNN O EECS6960 Research and Thesis
  • 16. Page  16 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 = ? EECS6960 Research and Thesis How ConvNet Works – What Computer Sees
  • 17. Page  17 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 =x EECS6960 Research and Thesis How ConvNet Works
  • 18. Page  18 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 X -1 -1 -1 -1 X X -1 -1 X X -1 -1 X X -1 -1 -1 -1 X 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 X -1 -1 -1 -1 X X -1 -1 X X -1 -1 X X -1 -1 -1 -1 X -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 How ConvNet Works – What Computer Sees Since the pattern does not match exactly, the computer will not be able to classify this as ‘X’ EECS6960 Research and Thesis
  • 19. Page  19 Agenda Introduction Convolutional Neural Network – How ConvNet Works ConvNet Layers – Convolutional Layer – Pooling Layer – Normalization Layer (ReLU) – Fully-Connected Layer Hyper Parameters Genetic Algorithm (GA) – Workings of GA – Selection – Crossover – Mutation EECS6960 Research and Thesis EECS6960 Research and Thesis Mapping GA chromosome GA Tuner Evaluation & Results Particle Swarm Optimmization (PSO) – Workings of PSO – PSO Simulation Mapping PSO Paticle PSO Tuner Evaluation & Results Grey Wolf Optimization (GWO) – Workings of GWO Mapping GWO Candidate Solution GWO Tuner Evaluation & Results Conclusion
  • 20. Page  20 ConvNet Layers (At a Glance) CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. RELU layer will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged. POOL layer will perform a downsampling operation along the spatial dimensions (width, height). FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1xN], where each of the N numbers correspond to a class score, such as among the N categories. EECS6960 Research and Thesis EECS6960 Research and Thesis
  • 21. Page  21 Since the pattern does not match exactly, the computer will not be able to classify this as ‘X’ What got changed? -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 X -1 -1 -1 -1 X X -1 -1 X X -1 -1 X X -1 -1 -1 -1 X 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 X -1 -1 -1 -1 X X -1 -1 X X -1 -1 X X -1 -1 -1 -1 X -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Recall – What Computer Sees EECS6960 Research and Thesis
  • 22. Page  22 = = = Convolution layer will work to identify patterns (features) instead of individual pixels EECS6960 Research and Thesis Convolutional Layer
  • 23. Page  23 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 Convolutional Layer - Filters The CONV layer’s parameters consist of a set of learnable filters. Every filter is small spatially (along width and height), but extends through the full depth of the input volume. During the forward pass, we slide (more precisely, convolve) each filter across the width and height of the input volume and compute dot products between the entries of the filter and the input at any position. EECS6960 Research and Thesis
  • 24. Page  24 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 Convolutional Layer - Filters Sliding the filter over the width and height of the input gives 2-dimensional activation map that responds to that filter at every spatial position. EECS6960 Research and Thesis
  • 25. Page  25 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Strides = 1, Filter Size = 3 X 3 X 1, Padding = 0 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 26. Page  26 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 27. Page  27 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 28. Page  28 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 29. Page  29 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 30. Page  30 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 31. Page  31 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 32. Page  32 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 33. Page  33 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Navigation Example
  • 34. Page  34 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 35. Page  35 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 36. Page  36 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 37. Page  37 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 38. Page  38 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 39. Page  39 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 40. Page  40 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 41. Page  41 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 42. Page  42 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 43. Page  43 1 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 44. Page  44 1 1 1 1 1 1 1 1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 45. Page  45 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 46. Page  46 1 1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 47. Page  47 1 1 -1 1 1 1 -1 1 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 48. Page  48 1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 1 -1 1 1 1 -1 1 1 55 1 1 -1 1 1 1 -1 1 1 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 49. Page  49 Convolutional Layer - Strides • The distance that filter is moved across the input from the previous layer each activation is referred to as the stride. EECS6960 Research and Thesis Stride: 1 Stride: 2
  • 50. Page  50 Convolutional Layer - Padding Sometimes it is convenient to pad the input volume with zeros around the border. Zero padding is allows us to preserve the spatial size of the output volumes EECS6960 Research and Thesis EECS6960 Research and Thesis Padding: 1 Padding: 2
  • 51. Page  51 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 = 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 Input Size (W): 9 Filter Size (F): 3 X 3 Stride (S): 1 Filters: 1 Padding (P): 09 X 9 7 X 7 Feature Map Size = 1+ (W – F + 2P)/S = 1+ (9 – 3 + 2 X 0)/1 = 7 EECS6960 Research and Thesis Convolutional Layer – Filters – Computation Example
  • 52. Page  52 1 -1 -1 -1 1 -1 -1 -1 1 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 = 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -1 -1 1 -1 1 -1 1 -1 -1 1 -1 1 -1 1 -1 1 -1 1 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 = = -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 Output Feature Map of One complete convolution: – Filters: 3 – Filter Size: 3 X 3 – Stride: 1 Conclusion: – Input Image: 9 X 9 – Output of Convolution: 7 X 7 X 3 EECS6960 Research and ThesisEECS6960 Research and Thesis Convolutional Layer – Filters – Output Feature Map
  • 53. Page  53 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 EECS6960 Research and Thesis Convolutional Layer – Output
  • 54. Page  54 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.77 EECS6960 Research and Thesis
  • 55. Page  55 0.77 0 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 EECS6960 Research and Thesis
  • 56. Page  56 0.77 0 0.11 0.33 0.55 0 0.33 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 EECS6960 Research and Thesis
  • 57. Page  57 0.77 0 0.11 0.33 0.55 0 0.33 0 1.00 0 0.33 0 0.11 0 0.11 0 1.00 0 0.11 0 0.55 0.33 0.33 0 0.55 0 0.33 0.33 0.55 0 0.11 0 1.00 0 0.11 0 0.11 0 0.33 0 1.00 0 0.33 0 0.55 0.33 0.11 0 0.77 Rectified Linear Units (ReLUs) 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 EECS6960 Research and Thesis
  • 58. Page  58 ReLU layer 0.77 0 0.11 0.33 0.55 0 0.33 0 1.00 0 0.33 0 0.11 0 0.11 0 1.00 0 0.11 0 0.55 0.33 0.33 0 0.55 0 0.33 0.33 0.55 0 0.11 0 1.00 0 0.11 0 0.11 0 0.33 0 1.00 0 0.33 0 0.55 0.33 0.11 0 0.77 0.33 0 0.11 0 0.11 0 0.33 0 0.55 0 0.33 0 0.55 0 0.11 0 0.55 0 0.55 0 0.11 0 0.33 0 1.00 0 0.33 0 0.11 0 0.55 0 0.55 0 0.11 0 0.55 0 0.33 0 0.55 0 0.33 0 0.11 0 0.11 0 0.33 0.33 0 0.55 0.33 0.11 0 0.77 0 0.11 0 0.33 0 1.00 0 0.55 0 0.11 0 1.00 0 0.11 0.33 0.33 0 0.55 0 0.33 0.33 0.11 0 1.00 0 0.11 0 0.55 0 1.00 0 0.33 0 0.11 0 0.77 0 0.11 0.33 0.55 0 0.33 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 0.77 -0.11 0.11 0.33 0.55 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.11 -0.11 0.11 -0.11 1.00 -0.33 0.11 -0.11 0.55 0.33 0.33 -0.33 0.55 -0.33 0.33 0.33 0.55 -0.11 0.11 -0.33 1.00 -0.11 0.11 -0.11 0.11 -0.11 0.33 -0.11 1.00 -0.11 0.33 -0.11 0.55 0.33 0.11 -0.11 0.77 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.11 0.33 -0.77 1.00 -0.77 0.33 -0.11 0.11 -0.55 0.55 -0.77 0.55 -0.55 0.11 -0.55 0.55 -0.55 0.33 -0.55 0.55 -0.55 0.33 -0.55 0.11 -0.11 0.11 -0.55 0.33 EECS6960 Research and Thesis
  • 59. Page  59 Pooling Layer The pooling layers down-sample the previous layers feature map. Its function is to progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network The pooling layer often uses the Max operation to perform the downsampling process EECS6960 Research and ThesisEECS6960 Research and Thesis
  • 60. Page  60 1.00 Pooling Pooling Filter Size = 2 X 2, Stride = 2 EECS6960 Research and ThesisEECS6960 Research and Thesis
  • 61. Page  61 1.00 0.33 Pooling EECS6960 Research and Thesis Pooling Filter Size = 2 X 2, Stride = 2
  • 62. Page  62 1.00 0.33 0.55 Pooling EECS6960 Research and Thesis Pooling Filter Size = 2 X 2, Stride = 2
  • 63. Page  63 1.00 0.33 0.55 0.33 Pooling  Pooling Filter Size = 2 X 2, Stride = 2 EECS6960 Research and Thesis Pooling Filter Size = 2 X 2, Stride = 2
  • 64. Page  64 1.00 0.33 0.55 0.33 0.33 Pooling EECS6960 Research and Thesis Pooling Filter Size = 2 X 2, Stride = 2
  • 65. Page  65 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 Pooling EECS6960 Research and Thesis Pooling Filter Size = 2 X 2, Stride = 2
  • 66. Page  66 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 0.33 0.55 1.00 0.77 0.55 0.55 1.00 0.33 1.00 1.00 0.11 0.55 0.77 0.33 0.55 0.33 0.55 0.33 0.55 0.33 0.33 1.00 0.55 0.11 0.55 0.55 0.55 0.11 0.33 0.11 0.11 0.33 EECS6960 Research and Thesis Pooling
  • 67. Page  67 Layers get stacked -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 1.00 0.33 0.55 0.33 0.33 1.00 0.33 0.55 0.55 0.33 1.00 0.11 0.33 0.55 0.11 0.77 0.33 0.55 1.00 0.77 0.55 0.55 1.00 0.33 1.00 1.00 0.11 0.55 0.77 0.33 0.55 0.33 0.55 0.33 0.55 0.33 0.33 1.00 0.55 0.11 0.55 0.55 0.55 0.11 0.33 0.11 0.11 0.33 EECS6960 Research and Thesis
  • 68. Page  68 Deep stacking [figure: convolution, ReLU, and pooling layers are repeated, shrinking the image into ever smaller stacks of filtered maps, here down to three 2 × 2 feature maps]
  • 69. Page  69 Fully connected layer Fully connected layers are the standard flat feed-forward neural network layers. These layers may use a non-linear activation function, or a softmax activation in order to predict classes. To compute the output, the output matrices are simply rearranged into a 1-D array. [figure: the three 2 × 2 feature maps flattened into a single 12-element vector]
  • 70. Page  70 Fully connected layer A weighted sum of the inputs at each output node determines the final prediction. [figure: the 12-element feature vector fully connected to the two output nodes, X and O]
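The weighted-sum prediction described above can be sketched in a few lines of NumPy. The feature values are the ones shown on the slide; the weights are random stand-ins, since the trained weights are not given.

```python
import numpy as np

# The three 2x2 feature maps from the previous slide, flattened to 1-D.
features = np.array([1.00, 0.55, 0.55, 1.00,
                     0.55, 1.00, 1.00, 0.55,
                     1.00, 0.55, 0.55, 0.55])
# One weight column per output node (class X and class O); random
# stand-ins because the trained weights are not shown on the slide.
weights = np.random.rand(12, 2)
scores = features @ weights          # weighted sum at each output node
prediction = "XO"[scores.argmax()]   # class with the largest score
```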
  • 71. Page  71 Putting it all together [figure: the complete pipeline from the raw image of an X through stacked convolution, ReLU, and pooling layers into the fully connected layer that votes between X and O]
  • 72. Page  72 Hyperparameters
Convolution – Filter Size, Number of Filters, Padding, Stride
Pooling – Window Size, Stride
Fully Connected – Number of neurons
  • 73. Page  73 Agenda (repeated as a section divider; next: Genetic Algorithm (GA))
  • 74. Page  74 Genetic Algorithm (GA) The Genetic Algorithm (GA) is inspired by the natural process of evolution. It is based on two foundations – Foundation I: Darwin's Theory of Natural Selection – Foundation II: Mendel's Theory of Genetics
  • 75. Page  75 Genetic Algorithm (GA) [figure: workflow of the genetic algorithm]
  • 76. Page  76 Selection Selection operators give preference to better solutions (chromosomes), allowing them to pass on their 'genes' to the next generation of the algorithm. The best solutions are determined using some form of objective function (known as a 'fitness function' in genetic algorithms) before being passed to the crossover operator.
  • 77. Page  77 Tournament Selection In tournament selection, K individuals are selected at random from the population, and the best of them becomes a parent. K is known as the tournament selection size. [figure: an example tournament with K = 3]
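A minimal Python sketch of tournament selection as described above; the function name and signature are illustrative.

```python
import random

def tournament_select(population, fitness, k=3):
    """Draw k chromosomes at random; the fittest of them becomes a parent."""
    contestants = random.sample(population, k)
    return max(contestants, key=fitness)
```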
  • 78. Page  78 Crossover Crossover is the process of taking two or more parent solutions (chromosomes) and producing a child solution from them. By recombining portions of good solutions, the genetic algorithm is more likely to create a better solution.  A single-point crossover selects one pivot point (crossover point) on the parent chromosomes.  All data beyond this pivot point is swapped between the two parent chromosomes, which results in two offspring. [figure: Chromosome X and Chromosome Y split at the pivot point, yielding Offspring A and Offspring B]
  • 79. Page  79 Mutation The purpose of the mutation operator is to encourage genetic diversity among the chromosomes. If the chromosomes become too similar to each other, the genetic algorithm converges to a local minimum; the mutation operator prevents this from happening.  The mutation operator flips a randomly selected gene in a chromosome.
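The single-point crossover and bit-flip mutation operators described on the last two slides can be sketched as follows. The mutation rate of 0.01 is an assumed value, as the slides do not state one.

```python
import random

def single_point_crossover(parent_x, parent_y):
    """Pick one pivot and swap all genes beyond it, giving two offspring."""
    pivot = random.randrange(1, len(parent_x))
    return (parent_x[:pivot] + parent_y[pivot:],
            parent_y[:pivot] + parent_x[pivot:])

def mutate(chromosome, rate=0.01):
    """Flip each gene with a small probability to maintain diversity."""
    return [1 - g if random.random() < rate else g for g in chromosome]
```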
  • 80. Page  80 Agenda (repeated as a section divider; next: Mapping GA Chromosome)
  • 81. Page  81 Hyperparameters in CNN (recap of the list on page 72: convolution filter size, number of filters, padding, and stride; pooling window size and stride; number of fully connected neurons)
  • 82. Page  82 Hyperparameters in CNN
Hyperparameter – Range
No. of Epochs – 0–127
Batch Size – 0–256
No. of Convolution Layers – 0–8
No. of Filters at each Convo Layer – 0–64
Convo Filter Size at each Convo Layer – 0–8
Activations used at each Convo Layer – sigmoid, tanh, relu, linear
Maxpool Layer after each Convo Layer – true, false
Maxpool Pool Size for each Maxpool Layer – 0–8
No. of Feed-Forward Hidden Layers – 0–8
No. of Feed-Forward Hidden Neurons at each Layer – 0–64
Activations used at each Feed-Forward Layer – sigmoid, tanh, softmax, relu
Optimizer – Adagrad, Adadelta, RMS, SGD
  • 83. Page  83 Mapping of GA Chromosome to CNN Hyperparameters An example 77-bit chromosome: 1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 1 1 0 1 1 0 0 1 1 0 1 0 1 1 1 0 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 1 0 1 0 1 0 0 0 1 0 1 0 1 0 1 1 1 0 1
  • 84.–104. Pages 84–104 The chromosome is decoded field by field, one hyperparameter per slide, each slide highlighting the corresponding bit segment (the full decoding is summarized on page 105).
  • 105. Page  105 Mapping of GA Chromosome to CNN Hyperparameters
1 1 0 0 1 0 0 → No. of Epochs: 100
0 1 0 0 0 0 0 0 → Batch Size: 64
0 1 0 → No. of Convolutions: 2
0 0 1 0 1 0 → No. of Filters at 1st Convolution: 10
1 0 1 → Filter Size at 1st Convolution: 5
0 1 → Activation used at 1st Convolution: 1 = Tanh
1 → Maxpool layer after 1st Convolution: True
1 0 1 → Maxpool Pool Size for 1st Maxpool: 5
0 0 1 1 1 1 → No. of Filters at 2nd Convolution: 15
0 1 1 → Filter Size at 2nd Convolution: 3
0 0 → Activation used at 2nd Convolution: 0 = Sigmoid
1 → Maxpool layer after 2nd Convolution: True
1 0 1 → Maxpool Pool Size for 2nd Maxpool: 5
0 1 1 → No. of Feed-Forward Hidden Layers: 3
1 0 0 0 0 0 → No. of Feed-Forward Hidden Neurons at 1st layer: 32
0 0 → Activation used at 1st Feed-Forward layer: 0 = Sigmoid
1 1 0 0 1 0 → No. of Feed-Forward Hidden Neurons at 2nd layer: 50
1 1 → Activation used at 2nd Feed-Forward layer: Linear
0 0 1 0 1 0 → No. of Feed-Forward Hidden Neurons at 3rd layer: 10
1 0 → Activation used at 3rd Feed-Forward layer: 2 = Softmax
0 0 → Optimizer: 0 = Adagrad
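A hedged sketch of how such a bit string could be decoded into hyperparameters. The field widths follow the ranges on page 82 (7 bits for epochs, 8 for batch size, 3 for the number of convolutions), but only the first three fields are shown and all names are illustrative.

```python
def bits_to_int(bits):
    """Interpret a list of 0/1 genes as an unsigned binary integer."""
    return int("".join(map(str, bits)), 2)

def decode(chromosome):
    """Decode leading fields of the chromosome, as on pages 84-105."""
    params, i = {}, 0
    for name, width in [("epochs", 7), ("batch_size", 8), ("n_conv", 3)]:
        params[name] = bits_to_int(chromosome[i:i + width])
        i += width
    return params

# First 18 genes of the example chromosome from page 105.
chromosome = [1,1,0,0,1,0,0, 0,1,0,0,0,0,0,0, 0,1,0]
print(decode(chromosome))  # {'epochs': 100, 'batch_size': 64, 'n_conv': 2}
```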
  • 106. Page  106 Fitness Function The fitness function used in this study is the classification accuracy, which measures the proportion of correctly classified patterns. This classification accuracy (ranging from 0 to 1) is the fitness value of a particular CNN architecture. For the evaluation of the CNN, Keras, a high-level neural networks API written in Python, is used to train the convolutional neural networks. It is a deep learning library that allows easy and fast prototyping; it supports all the layers of a CNN and can train the network using various optimization algorithms. Keras reports a classification accuracy once a CNN architecture is fully trained.
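A minimal sketch of such a fitness function using the Keras Sequential API: build a CNN from the decoded hyperparameters, train it, and return the test accuracy as the fitness value. The dictionary keys and the fixed 28 × 28 × 1 input shape (MNIST) are assumptions; the thesis code may construct the layers differently.

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

def fitness(params, x_train, y_train, x_test, y_test):
    """Train a CNN described by `params` and return its test accuracy."""
    model = Sequential()
    model.add(Conv2D(params["n_filters"], params["filter_size"],
                     activation=params["conv_activation"],
                     input_shape=(28, 28, 1)))          # assumed MNIST shape
    model.add(MaxPooling2D(pool_size=params["pool_size"]))
    model.add(Flatten())
    model.add(Dense(params["hidden_neurons"],
                    activation=params["ff_activation"]))
    model.add(Dense(10, activation="softmax"))          # 10 MNIST classes
    model.compile(optimizer=params["optimizer"],        # e.g. "adagrad"
                  loss="categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=params["epochs"],
              batch_size=params["batch_size"], verbose=0)
    _, accuracy = model.evaluate(x_test, y_test, verbose=0)
    return accuracy  # fitness value in [0, 1]
```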
  • 107. Page  107 Agenda (repeated as a section divider; next: GA Tuner Evaluation & Results)
  • 108. Page  108 Evaluation The genetic algorithm tuner was evaluated on the MNIST dataset, with 50,000 images as its training set and another 10,000 images as its testing set. The genetic algorithm, with a population of 10 randomly generated chromosomes, was executed 10 times, each run starting from a fresh random population.
  • 109. Page  109 Results – GA Tuning
Experiment No. – Highest Fitness Value
1 – 0.9878
2 – 0.9781
3 – 0.9472
4 – 0.9541
5 – 0.9618
6 – 0.9858
7 – 0.9919
8 – 0.9891
9 – 0.9866
10 – 0.9906
[figure: GA Tuner – classification accuracy vs. generation, showing the convergence process of GA tuning]
  • 110. Page  110 Generated Output after GA Tuning [figure: the tuner's generated output for the best GA run]
  • 111. Page  111 Final CNN Architecture after GA Tuning [figure: the final CNN architecture produced by GA tuning]
  • 112. Page  112 Agenda (repeated as a section divider; next: Particle Swarm Optimization (PSO))
  • 113. Page  113 Particle Swarm Optimization Algorithm (PSO) Inspired by the social behavior and coordinated, communicative movements of insects, birds, and fish. Uses a number of agents (particles) that constitute a swarm moving around the search space looking for the best solution. Each particle dynamically adjusts its velocity according to its own flying experience and that of its colleagues.
  • 114. Page  114 Particle Swarm Optimization Algorithm (PSO) [figure: a swarm of particles moving through the search space]
  • 115. Page  115 Position Update Rule The position of a particle i is given by x_i, an L-dimensional vector in ℝ^L. The change of position of a particle is denoted by Δx_i, a vector that is added to the position coordinates in order to move the particle from one iteration t to the next, t + 1: x_i(t + 1) = x_i(t) + Δx_i(t + 1) The vector Δx_i is commonly referred to as the velocity v_i of the particle.
  • 116. Page  116 Velocity Update Rule The particle swarm algorithm samples the search space by modifying the velocity of each particle. The velocity term Δx_i(t + 1) at iteration t + 1 is influenced by the current velocity Δx_i(t), the location of the particle's best success so far, P_i, and the best position found by any member of the swarm, P_g: Δx_i(t + 1) = Δx_i(t) + φ1 ⊗ (P_i − x_i(t)) + φ2 ⊗ (P_g − x_i(t)) where ⊗ denotes component-wise multiplication, and φ1 and φ2 represent positive random vectors composed of numbers drawn from uniform distributions.
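A minimal NumPy sketch of the position and velocity updates above for a single particle. Following the slide, no inertia weight or acceleration constants are applied, which differs from many common PSO variants.

```python
import numpy as np

def pso_step(x, v, p_best, g_best, rng=None):
    """One PSO iteration for a single particle (sketch).

    phi1 and phi2 are fresh uniform random vectors each step, pulling
    the particle toward its personal best and the swarm's global best."""
    rng = rng or np.random.default_rng()
    phi1 = rng.uniform(0.0, 1.0, size=x.shape)
    phi2 = rng.uniform(0.0, 1.0, size=x.shape)
    v_new = v + phi1 * (p_best - x) + phi2 * (g_best - x)
    return x + v_new, v_new   # new position and new velocity
```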
  • 117. Page  117 PSO – Simulation [figure: simulation of the swarm converging toward the best position over successive iterations]
  • 118. Page  118 Agenda (repeated as a section divider; next: Mapping PSO Particle)
  • 119. Page  119 Mapping of PSO Particle to CNN Hyperparameters [figure: a real-valued particle position in [0, 1]^77 is binarized into the same 77-bit string used for the GA chromosome; each component ≥ 0.5 maps to 1 and each component < 0.5 maps to 0, so for example 0.69 0.59 0.48 0.36 0.61 0.02 0.17 becomes 1 1 0 0 1 0 0]
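The thresholding step can be sketched in one line of NumPy. The 0.5 threshold is inferred from the example values on the slide (the first seven components reproduce the epoch bits 1 1 0 0 1 0 0).

```python
import numpy as np

# First seven components of the example particle position on this slide.
position = np.array([0.69, 0.59, 0.48, 0.36, 0.61, 0.02, 0.17])
bits = (position >= 0.5).astype(int)   # binarize at the inferred 0.5 threshold
print(bits)  # [1 1 0 0 1 0 0]  -> decodes to No. of Epochs = 100
```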
  • 120. Page  120 Mapping of PSO Particle to CNN Hyperparameters (the binarized particle decodes to exactly the same set of hyperparameters as the GA chromosome on page 105)
  • 121. Page  121 Agenda (repeated as a section divider; next: PSO Tuner Evaluation & Results)
  • 122. Page  122 Evaluation The PSO tuner was evaluated on the MNIST dataset, with 50,000 images as its training set and another 10,000 images as its testing set. The particle swarm optimizer, with a swarm of 10 randomly generated particles, was executed 10 times, each run starting from a fresh random swarm.
  • 123. Page  123 Results – PSO Tuning
Experiment No. – Highest Fitness Value
1 – 0.9845
2 – 0.9739
3 – 0.9888
4 – 0.9936
5 – 0.9478
6 – 0.9490
7 – 0.9831
8 – 0.9798
9 – 0.9564
10 – 0.9924
[figure: PSO Tuner – classification accuracy vs. generation, showing the convergence process of PSO tuning]
  • 124. Page  124 Generated Output after PSO Tuning [figure: the tuner's generated output for the best PSO run]
  • 125. Page  125 Final Architecture after PSO Tuning [figure: the final CNN architecture produced by PSO tuning]
  • 126. Page  126 Agenda (repeated as a section divider; next: Grey Wolf Optimization (GWO))
  • 127. Page  127 Grey Wolf Optimization Algorithm (GWO) The GWO algorithm, proposed by Mirjalili et al. in 2014, mimics the leadership hierarchy and hunting mechanism of grey wolves in nature. Four types of grey wolves, namely alpha, beta, delta, and omega, are employed for simulating the leadership hierarchy. [figure: dominance pyramid with α (Alpha) at the top, followed by β (Beta), δ (Delta), and ω (Omega)]
  • 128. Page  128 Grey Wolf Optimization Algorithm (GWO) In addition to the social hierarchy of wolves, group hunting is another interesting social behavior of grey wolves. The main phases of grey wolf hunting are: • Tracking, chasing, and approaching the prey • Pursuing, encircling, and harassing the prey until it stops moving • Attacking the prey [figure: hunting behavior of grey wolves – (A) chasing, approaching, and tracking prey; (B–D) pursuing, harassing, and encircling; (E) stationary situation and attack]
  • 129. Page  129 Grey Wolf Optimizer – Encircling the Prey Encircling is mathematically modelled as follows:
D = |C · X_p(t) − X(t)|
X(t + 1) = X_p(t) − A · D
where t indicates the current iteration, A and C are coefficient vectors, X_p is the position vector of the prey, and X indicates the position vector of a grey wolf. A and C are given by:
A = 2a · r1 − a
C = 2 · r2
where the components of a are linearly decreased from 2 to 0 over the course of the iterations, and r1, r2 are random vectors in the interval [0, 1].
  • 130. Page  130 Grey Wolf Optimizer – Attacking the Prey Grey wolves have the ability to recognize the location of prey and encircle it. The hunt is usually guided by the alpha; the beta and delta might also participate in hunting occasionally. A new beta and delta emerge in each iteration as all the other wolves update their positions.  We assume that the alpha (the best candidate solution), beta, and delta have better knowledge about the potential location of prey. The first three best solutions obtained so far (α, β, and δ) are saved, and the positions of the other search agents (the omegas) are updated according to the positions of these best search agents.
  • 131. Page  131 Grey Wolf Optimizer – Attacking the Prey Attacking is mathematically modelled with the following equations:
D_α = |C1 · X_α − X|,  D_β = |C2 · X_β − X|,  D_δ = |C3 · X_δ − X|
X1 = X_α − A1 · D_α,  X2 = X_β − A2 · D_β,  X3 = X_δ − A3 · D_δ
X(t + 1) = (X1 + X2 + X3) / 3
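A minimal NumPy sketch of this update for the whole pack; the schedule for a (decreasing linearly from 2 to 0) is assumed to be handled by the caller.

```python
import numpy as np

def gwo_step(wolves, x_alpha, x_beta, x_delta, a, rng=None):
    """One GWO position update for the whole pack (sketch).

    Each wolf moves to the mean of three points computed from the
    alpha, beta, and delta positions, as in the equations above."""
    rng = rng or np.random.default_rng()
    updated = []
    for x in wolves:
        candidates = []
        for x_lead in (x_alpha, x_beta, x_delta):
            A = 2 * a * rng.uniform(0.0, 1.0, size=x.shape) - a  # A = 2a*r1 - a
            C = 2 * rng.uniform(0.0, 1.0, size=x.shape)          # C = 2*r2
            D = np.abs(C * x_lead - x)
            candidates.append(x_lead - A * D)                    # X1, X2, X3
        updated.append(sum(candidates) / 3)                      # their mean
    return updated
```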
  • 132. Page  132 Grey Wolf Optimization Algorithm (GWO) [figure: overview of the GWO algorithm]
  • 133. Page  133 Agenda (repeated as a section divider; next: Mapping GWO Candidate Solution)
  • 134. Page  134 Mapping of GWO Candidate Solution to CNN Hyperparameters [figure: a real-valued candidate solution in [0, 1]^77 is binarized into the same 77-bit string used for the GA chromosome, with each component ≥ 0.5 mapping to 1]
  • 135. Page  135 Mapping of GWO Solution to CNN Hyperparameters (the binarized solution decodes to exactly the same set of hyperparameters as the GA chromosome on page 105)
  • 136. Page  136 Agenda (repeated as a section divider; next: GWO Tuner Evaluation & Results)
  • 137. Page  137 Evaluation The GWO tuner was evaluated on the MNIST dataset, with 50,000 images as its training set and another 10,000 images as its testing set. The grey wolf optimizer, with 10 randomly generated candidate solutions, was executed 10 times, each run starting from a fresh random set of solutions.
  • 138. Page  138 Results – GWO Tuning
Experiment No. – Highest Fitness Value
1 – 0.9464
2 – 0.9489
3 – 0.9942
4 – 0.9736
5 – 0.9620
6 – 0.8772
7 – 0.9859
8 – 0.8999
9 – 0.9590
10 – 0.9329
[figure: GWO Tuner – classification accuracy vs. generation, showing the convergence process of GWO tuning]
  • 139. Page  139 Generated Output after GWO Tuning [figure: the tuner's generated output for the best GWO run]
  • 140. Page  140 Final CNN Architecture after GWO Tuning [figure: the final CNN architecture produced by GWO tuning]
  • 141. Page  141 Agenda (repeated as a section divider; next: Conclusion)
  • 142. Page  142 Conclusion In this thesis, three bio-inspired algorithms, viz. GA, PSO, and GWO, were used to generate fully trained CNN architectures for the MNIST dataset. It has been demonstrated that the proposed method is capable of choosing relevant hyperparameters, thus forming an optimal CNN architecture. The architectures were generated automatically, without any human intervention. All experiments carried out using the GA and PSO algorithms yielded classification accuracies above 90%, with the highest accuracies being 99.19% and 99.36% respectively. The GWO experiments yielded classification accuracies above 85%, with the highest accuracy being 99.42%.
  • 143. Page  143 Conclusion contd. In the future, this work can be extended to other bio-inspired algorithms. It can also be applied to other datasets, which may consist of colored images and may be larger in size, provided better processing power is available.
Algorithm – Approx. Processing Time (hours) – Best Run – Worst Run
Genetic Algorithm – 4–5 – 0.9919 – 0.9472
Particle Swarm Optimization Algorithm – 4–5 – 0.9936 – 0.9478
Grey Wolf Optimization Algorithm – 5–6 – 0.9942 – 0.8772
  • 144. Page  144 References
 Karpathy, A. (n.d.). CS231n: Convolutional Neural Networks for Visual Recognition. Retrieved from http://cs231n.github.io/convolutional-networks/#overview
 Rohrer, B. (n.d.). How do Convolutional Neural Networks work? Retrieved from http://brohrer.github.io/how_convolutional_neural_networks_work.html
 Brownlee, J. (n.d.). Crash Course in Convolutional Neural Networks for Machine Learning. Retrieved from http://machinelearningmastery.com/crash-course-convolutional-neural-networks/
 Lidinwise (n.d.). The revolution of depth. Retrieved from https://medium.com/@Lidinwise/the-revolution-of-depth-facf174924f5#.8or5c77ss
 Nervana (n.d.). Tutorial: Convolutional neural networks. Retrieved from https://www.nervanasys.com/convolutional-neural-networks/
 L. N. de Castro, Fundamentals of Natural Computing: Basic Concepts, Algorithms, and Applications, Chapman and Hall/CRC, 2006.
 S. Mirjalili, S. M. Mirjalili and A. Lewis, "Grey Wolf Optimizer," Advances in Engineering Software, vol. 69, pp. 46–61, 2014.
 A. Bhandare and D. Kaur, "Comparative Analysis of Swarm Intelligence Techniques," in International Conference on Artificial Intelligence, 2017.
  • 145. Page  145 Questions
  • 146. Page  146 Thank you!!