MOST IMPORTANT LINKS
Simple explanation of convolutional neural network | Deep Learning Tutorial 23 (Tensorflow & Python) - YouTube
Implementing a Neural Network from Scratch in Python · Denny's Blog (dennybritz.com)
pip install tensorflow -- to install TensorFlow on your system
Running a Jupyter server
Accessing a Jupyter notebook
Most popular DL frameworks!
PyTorch
• By Facebook
TensorFlow
• By Google
• Keras is not a full-fledged framework but rather a convenient wrapper around TensorFlow, CNTK (by Microsoft) and Theano. It just makes programming easier.
• Since TensorFlow 2.0, Keras is part of the TensorFlow library suite.
The two most popular deep learning frameworks are (a) PyTorch and (b) TensorFlow.
The slope of a vertical line is always undefined.
The slope of a horizontal line is always 0.
The beauty of a CNN is that there is no need to provide explicit filters; it detects appropriate filters automatically on its own!
Classification through a deep network (all nodes connected to all others).
A CNN is not necessarily dense, meaning all nodes are not necessarily connected to all other nodes.
We are providing thousands of photos of koalas here, so the CNN will use backpropagation to automatically generate appropriate filters. It is part of learning.
The only parameters we specify are:
- How many filters you want
- What the size of the filters will be
- No need to provide values for the filters!
CNN architecture components
• Convolution
• Padding
• Stride
• Pooling
• SoftMax
• Fully Connected NN
The forward pass of a kernel
• During the forward pass, the kernel slides across the height and width of the image, producing an image representation of each receptive region.
• This produces a two-dimensional representation of the image known as an activation map, which gives the response of the kernel at each spatial position of the image.
• The sliding step size of the kernel is called the stride.
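The following minimal sketch (mine, not from the slides) shows a single-channel kernel sliding over an image with a given stride, producing an activation map; no padding is assumed.

import numpy as np

def conv2d(image, kernel, stride=1):
    H, W = image.shape
    f, _ = kernel.shape
    out = (H - f) // stride + 1
    amap = np.zeros((out, out))
    for i in range(out):
        for j in range(out):
            # receptive field at this spatial position
            patch = image[i*stride:i*stride+f, j*stride:j*stride+f]
            amap[i, j] = np.sum(patch * kernel)  # kernel response
    return amap

img = np.random.rand(10, 10)
print(conv2d(img, np.ones((3, 3))).shape)  # (8, 8): ((10-3)/1)+1 = 8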
Sample CNN
What is a local receptive field?
• A subset of a feature map or input image
Parameters?
• The # of parameters in a given layer is the count of its "learnable" elements.
• Parameters in general are weights that are learnt during training.
• They are weight matrices that contribute to the model's predictive power and are updated during the back-propagation process.
# of parameters in an Input Layer
• The input layer has nothing to learn; at its core, it just provides the input image's shape.
• So there are no learnable parameters here. Thus the number of parameters = 0.
# of parameters in a Convolutional Layer
• Consider a convolutional layer which takes
• "l" feature maps as the input
• "k" feature maps as output.
• The filter size is "n*m".
• Example: here the input has l=3 feature maps as inputs, k=96 feature maps as outputs, and the filter size is n=11 and m=11.
• It is important to understand that we don't simply have an 11*11 filter; we actually have an 11*11*3 filter, as our input has 3 channels.
• As the output of the first conv layer, we learn 96 different filters, whose total weight count is "n*m*l*k". Then there is a bias term for each output feature map. So the total number of parameters is "(n*m*l+1)*k".
# params = ((11 * 11 * 3) + 1) * 96 = 34,944
Formula for Output Shape
• N = input size
• F = size of filter
• P = # of padding
• S = # of strides
Output_shape = ((N - F + 2P) / S) + 1
Example: Output_shape = ((10 - 3 + 2(0)) / 1) + 1 = 7 + 1 = 8
Sample Calculation: Output_shape & # Params
# params in 1 filter = 3 x 3 + 1 = 10 (including 1 bias per filter)
# params in 5 filters = 10 * 5 = 50
Output_shape = ((10 - 3 + 2(0)) / 1) + 1 = 7 + 1 = 8, giving (8 x 8 x 5)
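As a quick sanity check, here is a small helper of my own implementing the two formulas above; the outputs match the slide's examples.

def output_size(N, F, P=0, S=1):
    # Output_shape = ((N - F + 2P) / S) + 1
    return (N - F + 2*P) // S + 1

def conv_params(n, m, l, k):
    # (n*m*l + 1) * k, the +1 being the bias per filter
    return (n*m*l + 1) * k

print(output_size(10, 3))          # 8
print(conv_params(3, 3, 1, 5))     # 50
print(conv_params(11, 11, 3, 96))  # 34,944 for the 11x11x3, 96-filter example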
Main benefits of Pooling layer
• Reduced Size
• Translation invariance
• Feature Enhancement (Max-Pooling)
• No need of training
• No learning parameters
Source:
Pooling Layer in CNN | MaxPooling in Convolutional Neural Network – YouTube.
https://www.youtube.com/watch?v=DwmGefkowCU
Benefits of Pooling: Size Reduction
Pooling = Sub-Sampling
Sub-sampling handles translation invariance
In both figures, A and B, the digit '8' is slightly shifted from the origin, but after the subsampling / max-pooling filter is applied, both resulting images are centered at the origin; however, some details are lost.
Therefore, generally speaking, pooling (min, avg) focuses on higher-level features and ignores minute details, except for max-pooling, where the features are actually enhanced.
Benefits of Pooling: Feature Enhancement
In the case of max-pooling, you take a small area of the input image and keep its most dominant (maximum) value.
You are actually selecting the strongest / brightest weight from the receptive field, which yields the most enhanced feature.
Caution: this holds only for max-pooling; it is not applicable to other forms of pooling.
Benefits of Pooling: No need of training
In a convolution layer, the weights in the filter are found by applying backpropagation. Pooling, however, is just an aggregate operation (min, max, avg), therefore no training is required.
All you need to tell the model is:
- the local receptive field (pool size)
- the value of the stride
- the type of pooling (avg, min, max)
The pooling layer is faster for this reason: there is no backpropagation involved. A minimal sketch follows.
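A minimal max-pooling sketch (my own): a window of the given size slides with the given stride and keeps the dominant value of each receptive field; nothing is learned.

import numpy as np

def max_pool2d(x, size=2, stride=2):
    H, W = x.shape
    out_h, out_w = (H - size) // stride + 1, (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(x))  # 4x4 -> 2x2: only the 4 dominant values survive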
Types of pooling in Keras
• Max-pooling
• Avg-pooling
• Global pooling
  • Global max-pooling
  • Global avg-pooling: simply the average of the receptive field
Usually, in the majority of cases, max-pooling is used, but sometimes avg-pooling is also used.
Global Max-pooling
• You convert an entire feature map into a single 1x1 scalar value.
• For global average pooling, you take the average of all values of an input feature map.
• For global max pooling, you take the max of all values of an input feature map.
• Where to use? In the end stage of a CNN, when you are flattening your data, you can use global max pooling as a replacement for flattening, to reduce over-fitting.
For global max pooling of an input of 4x4x3 feature maps, you get an output of 1 x 3: one value for each feature map.
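An illustrative sketch, assuming a (4,4,3) feature map: global max pooling reduces each of the 3 maps to a single scalar, giving the "1 x 3" output described above.

import numpy as np

fmap = np.random.rand(4, 4, 3)       # H x W x channels
print(fmap.max(axis=(0, 1)).shape)   # global max pooling -> (3,)
print(fmap.mean(axis=(0, 1)).shape)  # global avg pooling -> (3,)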
Disadvantages of Pooling: Location
• Translation invariance makes the location of a feature irrelevant to its detection. This is quite helpful in some classification tasks, where for example you need to identify whether the image contains a cat or not, regardless of its position/location in the input image.
• However, in some computer vision tasks the location of the feature is very important, such as in image segmentation tasks.
• Thus pooling is not used in image-segmentation tasks.
In image segmentation tasks, the location of the car is important: the features must all be in the same location where the car is present.
Disadvantages of Pooling: Information loss
• A lot of information is lost.
• For example, pooling from 4x4 = 16 values down to 2x2 = 4 values discards 75% of the values.
• However, it all depends on the application and on the information vs. computational complexity tradeoff.
LeNet-5 Architecture
(Diagram: each convolution + pooling pair is considered as one layer; the input is a monochrome/greyscale image; a flatten layer feeds the fully connected ANN.)
LeNet-5 / TensorFlow
# Adding libraries
import numpy as np
import tensorflow
from tensorflow import keras
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D
from keras import Sequential
from keras.datasets import mnist
# Loading dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# MNIST images are 28x28, but LeNet-5 expects 32x32x1:
# pad 2 pixels per side and add a channel axis
X_train = np.pad(X_train, ((0, 0), (2, 2), (2, 2)))[..., np.newaxis]
X_test = np.pad(X_test, ((0, 0), (2, 2), (2, 2)))[..., np.newaxis]
# Generating model through the Keras library
model_lenet5 = Sequential()
model_lenet5.add(Conv2D(6, kernel_size=(5, 5), padding='valid', activation='tanh',
                        input_shape=(32, 32, 1)))
model_lenet5.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='valid'))
model_lenet5.add(Conv2D(16, kernel_size=(5, 5), padding='valid', activation='tanh'))
model_lenet5.add(MaxPooling2D(pool_size=(2, 2), strides=2, padding='valid'))
model_lenet5.add(Flatten())
model_lenet5.add(Dense(120, activation='tanh'))
model_lenet5.add(Dense(84, activation='tanh'))
model_lenet5.add(Dense(10, activation='softmax'))
# Generating model summary
model_lenet5.summary()
How to calculate the # of learnable parameters for a convolution layer
To produce 6 feature maps at the output, 6 filters are required, each of size m x n x l. The filter is 3-dimensional because its 3rd dimension comes from the number of input channels: for an RGB (3-channel) input image, l = 3.
LeNet-5 – Parameter Estimation
Legend: fs = filter size (n x m); #f = number of filters applied; l = # of channels at input; k = # of output feature maps; p = padding; s = stride.
Output size per dimension for input size a: ((a - n + 2p) / s) + 1.
Conv layer parameters: (n x m x l + 1) * k; equivalently fs^2 * input channels * output channels + one bias per output channel, e.g. 5*5*1*6 + 6 = 156.

First layer, Conv2D, fs=(5x5), p=0, s=1, #f=6:
Input (32, 32, 1); output ((32 - 5 + 2(0)) / 1) + 1 = 27 + 1 = 28, i.e. (28 x 28 x 6)
# of learnable parameters = (5 x 5 x 1 + 1) * 6 = 156

First layer, Max-Pool, fs=(2x2), p=0, s=2:
Input (28 x 28 x 6); output ((28 - 2 + 2(0)) / 2) + 1 = 13 + 1 = 14, i.e. (14 x 14 x 6)
# of learnable parameters = 0

Second layer, Conv2D, fs=(5x5), p=0, s=1, #f=16:
Input (14 x 14 x 6); output ((14 - 5 + 2(0)) / 1) + 1 = 9 + 1 = 10, i.e. (10 x 10 x 16)
# of learnable parameters = (5 x 5 x 6 + 1) * 16 = 2,416

Second layer, Max-Pool, fs=(2x2), p=0, s=2:
Input (10 x 10 x 16); output ((10 - 2 + 2(0)) / 2) + 1 = 4 + 1 = 5, i.e. (5 x 5 x 16)
# of learnable parameters = 0

Flatten layer: (5 x 5 x 16) -> (1, 400), a 1D array; parameters = 0
First Dense layer (neurons=120): (1, 400) -> (1, 120); parameters = (input * neurons) + biases = (400 * 120) + 120 = 48,120
Second Dense layer (neurons=84): (1, 120) -> (1, 84); parameters = (120 * 84) + 84 = 10,164
Final output layer (neurons=10): (1, 84) -> (1, 10); parameters = (84 * 10) + 10 = 850

Total learnable parameters: 61,706 (about 241 KB at 4 bytes each)
Parameter Estimation – a second worked example (15x15 greyscale input)

First layer, Conv2D, fs=(9x9), p=0, s=1, #f=3:
Input (15, 15, 1); output ((15 - 9 + 2(0)) / 1) + 1 = 6 + 1 = 7, i.e. (7 x 7 x 3)
# of parameters = (9 x 9 x 1 + 1) * 3 = 82 * 3 = 246

First layer, Max-Pool, fs=(2x2), p=0, s=2:
Input (7 x 7 x 3); output floor((7 - 2 + 2(0)) / 2) + 1 = 2 + 1 = 3 (floored, since 7 - 2 is odd), i.e. (3 x 3 x 3)
# of parameters = 0

Flatten layer: (3 x 3 x 3) -> (1, 27); parameters = 0
First Dense layer (nodes=27): (1, 27) -> (1, 27); parameters = (input * nodes) + biases = (27 * 27) + 27 = 756
Final output layer (nodes=3): (1, 27) -> (1, 3); parameters = (27 * 3) + 3 = 84
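The hand calculation can be cross-checked in Keras; the sketch below (layer choices are mine, mirroring the table above) should report 246, 756 and 84 parameters for the trainable layers.

from keras import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

m = Sequential([
    Conv2D(3, kernel_size=(9, 9), strides=1, padding='valid',
           activation='tanh', input_shape=(15, 15, 1)),  # 246 params
    MaxPooling2D(pool_size=(2, 2), strides=2),           # 0 params, output (3,3,3)
    Flatten(),                                           # (27,)
    Dense(27, activation='tanh'),                        # 756 params
    Dense(3, activation='softmax'),                      # 84 params
])
m.summary()  # total = 246 + 756 + 84 = 1,086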
Object Detector types
Single-shot object detection
• Single-shot object detection uses a single pass
of the input image to make predictions about the
presence and location of objects in the image.
• It processes the entire image in a single pass, making it computationally efficient.
• It is generally less accurate than other methods, and less effective at detecting small objects.
• Such algorithms can be used to detect objects in
real time in resource-constrained environments.
• YOLO is a single-shot detector that uses a fully
convolutional neural network (CNN) to process an
image.
Two-shot object detection
• Uses two passes of the input image to make
predictions about the presence and location of objects.
• The first pass is used to generate a set of proposals or
potential object locations, and the second pass is
used to refine these proposals and make final
predictions.
• This approach is more accurate than single-shot object
detection but is also more computationally expensive.
• Generally, single-shot object detection is better suited
for real-time applications, while two-shot object
detection is better for applications where accuracy is
more important.
Object Detection Method Types
Metrics: Intersection over Union (IoU)
• Intersection over Union is a popular metric to
measure localization accuracy and calculate
localization errors in object detection models.
• To calculate the IoU between the predicted and the ground-truth bounding boxes for the same object, we take the area of overlap between the two boxes, called the "Intersection", and the total area covered by the two boxes, known as the "Union".
• The Intersection divided by the Union gives us the ratio of the overlap to the total area, providing a good estimate of how close the predicted bounding box is to the original bounding box.
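A minimal IoU sketch (my own; boxes given as (x1, y1, x2, y2) corner coordinates):

def iou(boxA, boxB):
    # intersection rectangle
    x1, y1 = max(boxA[0], boxB[0]), max(boxA[1], boxB[1])
    x2, y2 = min(boxA[2], boxB[2]), min(boxA[3], boxB[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    areaA = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
    areaB = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])
    union = areaA + areaB - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143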
Metrics: Average Precision (AP)
• Average Precision (AP) is calculated as the area under a precision vs. recall curve
for a set of predictions.
• Recall is the ratio of the model's true positive predictions for a class to the total number of existing (ground-truth) labels for that class.
• Precision is the ratio of true positives to the total predictions made by the model.
• Recall and precision offer a trade-off that is graphically represented as a curve by varying the classification threshold. The area under this precision vs. recall curve gives us the Average Precision per class for the model. The average of this value, taken over all classes, is called mean Average Precision (mAP).
• In object detection, precision and recall aren't computed on class predictions alone; they measure the quality of the predicted bounding boxes. An IoU value > 0.5 is taken as a positive prediction, while an IoU value < 0.5 is a negative prediction.
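A rough sketch of AP as the area under a precision vs. recall curve, using trapezoidal integration; the prediction list and ground-truth count are made-up example values, and real evaluators often use interpolated variants.

import numpy as np

tp = np.array([1, 1, 0, 1, 0, 1])  # predictions sorted by confidence; 1 = IoU > 0.5
n_ground_truth = 5

cum_tp = np.cumsum(tp)
precision = cum_tp / np.arange(1, len(tp) + 1)
recall = cum_tp / n_ground_truth
print(np.trapz(precision, recall))  # area under the P-R curve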
YOLO from Ultralytics
• You Only Look Once (YOLO)
proposes using an end-to-end
neural network that makes
predictions of bounding boxes
and class probabilities all at
once.
• It differs from the approach
taken by previous object
detection algorithms, which
repurposed classifiers to
perform detection.
• YOLO performs all of its predictions
with the help of a single fully
connected layer.
Ultralytics YOLOv8 | State-of-the-Art Vision AI
YOLO Vs. Others
YOLO Algorithm / Architecture
YOLO Algorithm for Object Detection Explained [+Examples] (v7labs.com)
YOLO History: YOLO = You Only Look Once
YOLO has outperformed the previous R-CNN, Fast R-CNN and Faster R-CNN methods for object detection.
It can make its predictions in one forward pass, which is why it is called You Only Look Once (YOLO).
What is YOLO algorithm? | Deep Learning Tutorial 31 (Tensorflow, Keras & Python) - YouTube
Object Localization
YOLO: Multi-Object Detection: multi-grid and center-of-body approach
Training YOLO on multi-grid vectors
YOLO prediction
First issue with YOLO: multiple objects overlap, but the centers of the objects are not in one cell
• It may detect multiple bounding boxes for the same object (as shown).
• We don't know which box contains the person and which contains the dog.
• But every bounding box will have its own probability.
Non-Max Suppression
• We try to find the overlap area, which is the Intersection over Union.
We apply the technique of "Non-Max Suppression" (NMS) to get the two distinct boxes, as shown on the right; a minimal sketch follows.
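A minimal non-max suppression sketch (mine), reusing the iou() helper sketched earlier: keep the highest-scoring box, drop boxes that overlap it beyond a threshold, and repeat.

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # discard remaining boxes that overlap the chosen box too much
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep  # indices of the surviving boxes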
Second issue with YOLO: multiple objects overlap and the centers of both objects are in one cell
When one cell contains the centers of two objects, we have a representation problem.
Should we generate two separate vectors of depth 7, or should we combine them into one anchor-box vector of 14 values?
In real life, it is rare for the centers of multiple objects to fall in the same small cell. Handling situations where at most two objects have their centers in one cell is sufficient for most cases.
CNN with two anchor boxes: A solution
Neural Network Types and Data
(Figure: matching data types to ANN, CNN, and RNN architectures.)
Shifting from the sigmoid to the ReLU activation function drastically improved the computation of the gradient descent algorithm. This enabled the use of larger networks.
yhat = output; y = ground truth; the loss function finds the difference between them.
Logistic Regression cost function
Gradient Descent
• J(w, b) = cost function
• The plot of w, b and J(w, b) is a surface.
• J(w, b) is a convex function.
• The target is to find the minimum of J(w, b).
• It is not a non-convex function, which would have lots of local minima.
• This convex nature is the reason why we use this cost function.
• For the logistic cost function, due to its convexity, it is not necessary to initialize w, b at 0; you can start from any point on the surface.
Initialization of w and b
• To find a good value for the parameters, we initialize w and b to some initial value, denoted by the little red dot.
• For logistic regression, almost any initialization method works.
• Usually you initialize the values to 0.
• Random initialization also works, but people don't usually do that for logistic regression.
• Because this function is convex, no matter where you initialize, you should get to the same point or roughly the same point.
• What gradient descent does is start at that initial point and take a step in the steepest downhill direction.
• So after one step of gradient descent, you might end up there, because it's trying to take a step downhill in the direction of steepest descent, moving downhill as quickly as possible.
• That's one iteration of gradient descent. After two iterations you might step there, three iterations and so on.
We need to converge to this point here, which is the absolute minimum or global optimum.
Gradient Descent: assume one parameter 'w'
Suppose the initial w is at point 1 (positive slope):
w := w - α(+slope); because the slope is positive, the updated w will be lower, moving down the curve.
Suppose the initial w is at point 2 (negative slope):
w := w - α(-slope); because the slope is negative, the updated w will be higher, again moving down the curve.
α (alpha) = learning rate.
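A tiny sketch of the update rule on one parameter (my own example, using J(w) = (w - 3)^2):

def grad_descent_1d(w, lr=0.1, steps=100):
    for _ in range(steps):
        slope = 2 * (w - 3)   # dJ/dw for J(w) = (w - 3)^2
        w = w - lr * slope    # positive slope lowers w, negative slope raises it
    return w

print(grad_descent_1d(w=10.0))  # converges toward the minimum at w = 3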
Gradient Descent: for both parameters 'w' and 'b'
If J is a function of one variable, we use the simple derivative "d"; if J is a function of 2 or more variables, we use the partial derivative "∂".
Coding convention: in code, dw stands for ∂J/∂w and db for ∂J/∂b.
The derivative of a straight line is constant; the derivative of a (non-linear) curve is not constant.
Computation Graph
Going in reverse order (backpropagation) is the easier way to calculate derivatives: dJ/dv first, then dJ/da, then dJ/du, dJ/db and dJ/dc via the chain rule.
Computing dJ/da
Computing dJ/db and dJ/dc
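A small sketch consistent with the variables mentioned above (u, v, J, with J = 3(a + b*c)); the numeric values are my own:

a, b, c = 5.0, 3.0, 2.0
u = b * c           # u = 6
v = a + u           # v = 11
J = 3 * v           # J = 33

dJ_dv = 3.0
dJ_da = dJ_dv * 1.0         # dv/da = 1
dJ_du = dJ_dv * 1.0         # dv/du = 1
dJ_db = dJ_du * c           # du/db = c
dJ_dc = dJ_du * b           # du/dc = b
print(dJ_da, dJ_db, dJ_dc)  # 3.0 6.0 9.0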
Logistic Regression Derivatives
Derivation
Derivation of dL/dz
(Deep Learning Specialization / Neural Networks and Deep Learning - DeepLearning.AI)
Derivation of dz = a - y
ŷ (y-hat) is denoted as 'a' here; in ML code notation, dz means dL(a,y)/dz.
Derivation of dw1, dw2 and db:
dw1 = (-y/a + (1-y)/(1-a)) * a(1-a) * d/dw1 (w1x1 + w2x2 + b)
    = (a - y) * d/dw1 (w1x1 + w2x2 + b)
    = (a - y) * x1
    = x1 * dz
dw2 = (-y/a + (1-y)/(1-a)) * a(1-a) * d/dw2 (w1x1 + w2x2 + b)
    = (a - y) * x2
    = x2 * dz
db = (-y/a + (1-y)/(1-a)) * a(1-a) * d/db (w1x1 + w2x2 + b)
   = (a - y) * 1
   = dz
dw1, dw2, db and dz are all ML code notations, meaning the derivative of L(a,y) with respect to w1, w2, b and z respectively.
Computing over the entire dataset [m samples]
A major (outer) for loop runs over the m samples; a minor (inner) for loop, also required, runs over the n weights.
We need vectorization to get rid of these for loops and write efficient code. This is necessary when m is very large; a vectorized sketch follows.
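A vectorized sketch of one gradient step (my own, following the slide notation dz = a - y; X stacks the m samples as columns, b is a scalar), replacing both loops at once:

import numpy as np

def gradient_step(w, b, X, y, lr=0.1):
    # X: (n, m), y: (1, m), w: (n, 1)
    m = X.shape[1]
    a = 1 / (1 + np.exp(-(w.T @ X + b)))  # forward pass (sigmoid)
    dz = a - y                            # dL/dz for every sample at once
    dw = (X @ dz.T) / m                   # replaces the inner loop over weights
    db = np.sum(dz) / m                   # replaces the outer loop over samples
    return w - lr * dw, b - lr * db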
What is Vectorization?
GPUs and CPUs can execute parallel instructions. If you use a built-in function such as np.dot, which doesn't require explicitly implementing a for loop, numpy can exploit parallelism and your computation runs faster. Vectorization can significantly speed up your code.
# Program to demonstrate how vectorization improves computational performance
# by comparing a vector dot product (parallel implementation) vs a for-loop
# implementation (sequential execution)
# Ammar Ahmed
import time
import numpy as np

# Getting details about the underlying hardware (running on Google Colab)
import platform
print("Machine :" + str(platform.machine()))
print("Platform version :" + str(platform.version()))
print("Platform :" + str(platform.platform()))

# Generating arrays of elements
a = np.random.rand(1000000)
b = np.random.rand(1000000)

# Vectorized implementation
tic = time.time()
result = np.dot(a, b)
toc = time.time()
t1 = (toc - tic) * 1000
print("Execution time of vectorized version = " + str(t1) + " ms; Computed value: " + str(result))

# Non-vectorized / loop implementation
result = 0
tic = time.time()
for i in range(1000000):
    result += a[i] * b[i]
toc = time.time()
t2 = (toc - tic) * 1000
print("Execution time of sequential loop version = " + str(t2) + " ms; Computed value: " + str(result))
print("Time difference = " + str(t2 - t1) + " ms")  # fixed: was t2 - 21
Vector implementation using Python: numpy
Logistic regression derivatives
Python broadcasting
cal is already in the right shape for the division through broadcasting. However, .reshape makes sure you are doing it right; it can be omitted here.
Refer to the Python/NumPy documentation for the more general principles of broadcasting.
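An illustrative broadcasting sketch, in the spirit of the calories example above (the numbers are illustrative): a (3,4) matrix divided by a (1,4) row vector.

import numpy as np

A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])
cal = A.sum(axis=0).reshape(1, 4)  # column totals; .reshape is optional here
print(100 * A / cal)               # (3,4) / (1,4): cal is broadcast down the rows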
KEY LEARNING!
• a * b = element-wise multiplication
• np.dot(a, b) = matrix multiplication
Vectorization
Tip: "Don't use rank-1 arrays." Rank-1 arrays are neither row vectors nor column vectors, therefore matrix and vector operations are not consistent with them. Always initialize with the proper structure and size.
Use a = a.reshape((5, 1)) to convert a rank-1 array into a column vector.
a.shape == (5, 1): column vector
a.shape == (1, 5): row vector
assert(a.shape == (5, 1)) to simplify your code. ALWAYS USE COLUMN OR ROW VECTORS.
Python code: wrong implementation
(Screenshot notes: a rank-1 array is neither a row vector nor a column vector; the transpose is not applied properly to the vector, which is wrong; and the dot product is not computed as required.)
Python code: correct implementation
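A short sketch contrasting a rank-1 array with a proper column vector:

import numpy as np

a = np.random.randn(5)             # rank-1 array, shape (5,)
print(a.shape, (a.T == a).all())   # (5,) True: transposing does nothing

a = np.random.randn(5, 1)          # column vector, shape (5, 1)
print((a @ a.T).shape)             # (5, 5): a proper outer product
assert a.shape == (5, 1)           # cheap guard against shape bugs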
Why do we use numpy and not Python's math library?
Actually, we rarely use the "math" library in deep learning because the inputs of its functions are single real numbers, whereas in deep learning we mostly use matrices and vectors. This is why numpy is more useful.
Numpy version of sigmoid:
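A numpy sigmoid sketch: it works elementwise on scalars, vectors and matrices, which math.exp cannot do.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(np.array([1, 2, 3])))  # applied to every element at once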
Vectorization of an RGB image | Reshaping Arrays
image2vector: flatten / vectorize an image matrix using .reshape.
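A minimal image2vector sketch: reshape an (H, W, C) image into a single column vector.

import numpy as np

def image2vector(image):
    h, w, c = image.shape
    return image.reshape(h * w * c, 1)

print(image2vector(np.random.rand(4, 4, 3)).shape)  # (48, 1)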
Normalizing Rows
Normalization - code
Note: in normalize_rows(), you can try to print the shapes of x_norm and x, and then rerun the assessment. You'll find that they have different shapes. This is normal, given that x_norm takes the norm of each row of x: x_norm has the same number of rows but only 1 column. So how did dividing x by x_norm work? This is called broadcasting, and we'll talk about it now!
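A normalize_rows sketch matching the note above: each row of x is divided by its L2 norm, with x_norm of shape (rows, 1) broadcast across the columns.

import numpy as np

def normalize_rows(x):
    x_norm = np.linalg.norm(x, axis=1, keepdims=True)  # shape (rows, 1)
    return x / x_norm                                  # broadcasting

x = np.array([[0.0, 3.0, 4.0],
              [1.0, 6.0, 4.0]])
print(normalize_rows(x))  # first row becomes [0, 0.6, 0.8]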
Softmax – A normalizing function
Softmax is a normalizing function used
when the algorithm needs to classify two
or more classes.
Softmax – Python Code
If you print the shapes of x_exp, x_sum and s above and rerun the assessment cell, you will see that x_sum has shape (2, 1) while x_exp and s have shape (2, 5). x_exp/x_sum works due to Python broadcasting.
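A row-wise softmax sketch matching the shapes discussed above (the sample values are illustrative):

import numpy as np

def softmax(x):
    x_exp = np.exp(x)                             # (2, 5)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)  # (2, 1)
    return x_exp / x_sum                          # broadcasting gives (2, 5)

x = np.array([[9, 2, 5, 0, 0],
              [7, 5, 0, 0, 0]])
print(softmax(x))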
Key points to remember
• np.exp(x) works for any np.array x and applies the exponential function to
every coordinate
• the sigmoid function and its gradient
• image2vector is commonly used in deep learning
• np.reshape is widely used.
• numpy has efficient built-in functions
• broadcasting is extremely useful
Implement L1 loss function
Implement L2 loss function
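Hedged sketches of both losses (yhat = predictions, y = labels; the sample values are illustrative):

import numpy as np

def L1(yhat, y):
    return np.sum(np.abs(y - yhat))

def L2(yhat, y):
    return np.sum((y - yhat) ** 2)

yhat = np.array([0.9, 0.2, 0.1, 0.4, 0.9])
y = np.array([1, 0, 0, 1, 1])
print(L1(yhat, y))  # 1.1
print(L2(yhat, y))  # 0.43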
Some interpreter directives
import numpy as np
import copy
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset   # course-provided helper
from public_tests import *          # course-provided tests
%matplotlib inline     # IPython directive: render plots inline in the notebook
%load_ext autoreload   # IPython directive: reload edited modules automatically
%autoreload 2
Trick to learn
ANN as simple cat detector
What is deepcopy?
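A short sketch of copy.deepcopy: it recursively copies nested objects, so mutating the original leaves the copy untouched (a shallow copy would share the inner lists).

import copy

params = {"W1": [[1, 2], [3, 4]], "b1": [0, 0]}
snapshot = copy.deepcopy(params)
params["W1"][0][0] = 99
print(snapshot["W1"][0][0])  # still 1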
Required Functions to implement
Neural Network Representation
Two-layer network, as we don't count the input layer.
The superscript [l] represents the layer number; the subscript n in a_n represents the node number within the layer.
Implementing the above four sets of equations using a loop would be very slow. We need to VECTORIZE them.
Vectorized representation
Converting to stacked matrices / column-vector notation; a sketch follows.
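A vectorized two-layer forward pass sketch (sizes are my own assumptions; X stacks the m samples as columns), replacing the slow per-node loop:

import numpy as np

n_x, n_h, n_y, m = 3, 4, 1, 5  # assumed layer sizes and batch size
X = np.random.randn(n_x, m)
W1, b1 = np.random.randn(n_h, n_x), np.zeros((n_h, 1))
W2, b2 = np.random.randn(n_y, n_h), np.zeros((n_y, 1))

Z1 = W1 @ X + b1            # (n_h, m); b1 is broadcast across columns
A1 = np.tanh(Z1)
Z2 = W2 @ A1 + b2           # (n_y, m)
A2 = 1 / (1 + np.exp(-Z2))  # sigmoid output, shape (1, 5)
print(A2.shape)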
Convolution animations
No padding, no strides | Arbitrary padding, no strides | Half padding, no strides | Full padding, no strides | No padding, strides | Padding, strides | Padding, strides (odd)
GitHub - vdumoulin/conv_arithmetic: A technical report on convolution arithmetic in the context of deep learning
