A Beginner’s Approach to Deep Learning
What is Deep Learning?
• Artificial intelligence (AI): algorithms that mimic human intelligence with some logic rules, which may or may not be trained on data
• Machine learning (ML): provides computers or computing systems the ability to automatically learn and improve from experience without being explicitly programmed
• Deep learning (DL): uses multiple layers to progressively extract higher-level features from the raw input, with the features and classification both learned from data
Courtesy: Semiconductor Engineering
Machine Learning vs Deep Learning
Deep Learning in Healthcare
• Disease diagnosis
• Medical imaging
• Smart health records
• Disease prediction
• Personalized medicine
Hierarchy of ML
Machine Learning
• Supervised Learning
  • Classification
  • Regression
• Unsupervised Learning
  • Clustering: centroid-based, density-based, distribution-based, hierarchical
  • Density Estimation: parametric, non-parametric
  • Dimensionality Reduction
• Reinforcement Learning
  • Positive
  • Negative
Neural Networks
• The input layer takes in numerical features
• Input layers are often connected to hidden layers and finally to the output layer
• These connections are called edges
• Edges typically have a weight that adjusts as learning proceeds
• Each circular unit is called a node
• The inputs at each node are multiplied by the corresponding weights, summed with a bias, and passed through an activation function to obtain the output
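A minimal sketch of that node computation in Python (NumPy assumed; the inputs, weights, and bias below are made up for illustration):

```python
import numpy as np

def node_output(inputs, weights, bias):
    # Multiply inputs by weights, add the bias, then apply an activation (sigmoid here)
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # numerical features from the input layer
w = np.array([0.4, 0.1, -0.7])   # edge weights (adjusted as learning proceeds)
b = 0.2                          # bias
print(node_output(x, w, b))
```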
Perceptron – Simplest Neural Network
Courtesy: Towards Data Science
A perceptron is a single-layer neural network.
[Figure: inputs are multiplied by weights, summed with a bias, then passed through an activation]
The perceptron consists of 4 parts:
• Inputs
• Weights and Bias
• Net sum
• Activation Function
Weights show the strength of a particular edge.
A bias value allows you to shift the activation function curve up or down.
Activation functions are used to map the input to a required range such as (0, 1) or (-1, 1).
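A minimal perceptron sketch, assuming a step activation and hypothetical weights chosen to behave like an AND gate:

```python
import numpy as np

def perceptron(x, w, b):
    net = np.dot(x, w) + b      # net sum: inputs x weights + bias
    return 1 if net > 0 else 0  # step activation maps the net sum to 0 or 1

w = np.array([1.0, 1.0])  # hypothetical weights
b = -1.5                  # hypothetical bias
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron(np.array(x, dtype=float), w, b))
```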
Multi-Layer Perceptron (MLP)
Courtesy: Towards Data Science
An MLP has more than a single layer.
Layers between the input and output are hidden layers.
An MLP is trained with a supervised learning technique called backpropagation.
It can distinguish data that is not linearly separable.
As we increase the number of layers in an MLP, we enter deep learning.
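A minimal Keras sketch of an MLP with one hidden layer (the feature count and layer sizes are illustrative assumptions); backpropagation is configured via compile and run by fit:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(4,)),              # 4 numerical input features (assumed)
    keras.layers.Dense(8, activation="relu"),    # hidden layer
    keras.layers.Dense(1, activation="sigmoid")  # output layer for binary classification
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=10)  # trains the weights via backpropagation
```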
Convolutional Neural Networks (CNN)
Courtesy: Towards Data Science
ConvNets have the ability to learn image filters automatically.
There are fewer parameters to train than in an MLP when the input size is large.
CNNs have four types of layers:
• Convolution layer
• Pooling layer
• Dense/Fully connected layer
• Activation layer
CNNs are well-suited for image classification tasks.
Popular CNN architectures include LeNet, AlexNet, VGGNet, etc.
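A minimal Keras sketch stacking all four layer types (filter counts and input shape are illustrative assumptions):

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(28, 28, 1)),         # e.g. small grayscale images
    keras.layers.Conv2D(16, (3, 3)),               # convolution layer
    keras.layers.Activation("relu"),               # activation layer
    keras.layers.MaxPooling2D((2, 2)),             # pooling layer
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation="softmax"),  # dense/fully connected layer
])
model.summary()
```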
Convolution Layer
Courtesy: IBM Research
The convolutional layer is the core building block of a CNN.
Filter values are the weights, which are learned during training.
A convolution kernel, or filter, moves across the receptive fields of the image, checking whether a feature is present.
A dot product is calculated between the input pixels and the filter.
The filter shifts by a stride, repeating the process until the kernel has swept across the entire image.
The final output from this series of dot products between the input and the filter is known as a feature map or activation map.
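A minimal NumPy sketch of this sliding dot product (no padding; the image and filter values are hypothetical):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    # Slide the kernel across the image, taking a dot product at each position
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            feature_map[i, j] = np.sum(patch * kernel)  # dot product with the filter
    return feature_map

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)  # vertical-edge filter
print(conv2d(image, kernel))  # 3x3 feature map
```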
Pooling Layer
Courtesy: Towards Data Science
The pooling layer conducts dimensionality reduction, reducing the number of parameters in the input.
The pooling layer does not have any trainable weights.
There are two main types of pooling:
• Max pooling
• Average pooling
Pooling layers help to reduce complexity, improve efficiency, and limit the risk of overfitting.
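A minimal max-pooling sketch in NumPy (2×2 windows with stride 2 are assumed defaults; note there are no weights to learn):

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i*stride:i*stride+size, j*stride:j*stride+size]
            pooled[i, j] = window.max()  # swap in window.mean() for average pooling
    return pooled

print(max_pool(np.arange(16, dtype=float).reshape(4, 4)))  # 4x4 -> 2x2
```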
Fully Connected Layer
Courtesy: Towards Data Science
In the fully connected layer, each node in the output layer connects directly to every node in the previous layer.
This layer performs the task of classification based on the features extracted through the previous layers and their different filters.
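A minimal NumPy sketch of a fully connected layer as a matrix-vector product (the feature values and random weights are hypothetical):

```python
import numpy as np

def dense(x, W, b):
    # Every output node is connected to every input node: one weight per pair
    return W @ x + b

x = np.array([0.2, 0.8, -0.5, 1.0])  # flattened features from earlier layers
W = np.random.randn(3, 4)            # 3 output classes x 4 inputs
b = np.zeros(3)
scores = dense(x, W, b)
probs = np.exp(scores) / np.exp(scores).sum()  # softmax turns scores into class probabilities
print(probs)
```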
Activation Functions
Courtesy: Towards Data Science
An activation function maps a node's input values to the desired output of the node.
It is used to determine the output of a neural network, such as yes or no, by mapping the resulting values into a range such as 0 to 1 or -1 to 1.
Sigmoid or Logistic Activation Function
The main reason we use the sigmoid function is that its output lies between 0 and 1.
Therefore, it is especially used for models where we have to predict a probability as the output.
The function is differentiable, so we can find the slope of the sigmoid curve at any point.
However, the logistic sigmoid function can cause a neural network to get stuck during training, since its gradient vanishes for inputs far from zero.
Activation Functions
Courtesy: Towards Data Science
Softmax Activation Function
The softmax function is a more generalized logistic activation function, used for multiclass classification.
Activation Functions
Courtesy: Towards Data Science
Tanh or Hyperbolic Tangent Activation Function
The range of the tanh function is from -1 to 1.
The advantage is that negative inputs are mapped strongly negative and zero inputs are mapped near zero in the tanh graph.
The tanh function is mainly used for classification between two classes.
Activation Functions
Courtesy: Towards Data Science
ReLU (Rectified Linear Unit) Activation Function
The range of the ReLU function is from 0 to positive infinity.
The disadvantage is that any negative input given to the ReLU activation function turns into zero immediately in the graph.
Activation Functions
Courtesy: Towards Data Science
Leaky ReLU Activation Function
It is an attempt to solve the dying ReLU problem.
Usually, the value of the negative slope a is 0.01.
The range of the Leaky ReLU is -infinity to infinity.
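Minimal NumPy sketches of the activation functions above (a = 0.01 for Leaky ReLU, as stated):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # maps to (0, 1)

def softmax(x):
    e = np.exp(x - np.max(x))         # shift for numerical stability
    return e / e.sum()                # probabilities over classes

def relu(x):
    return np.maximum(0.0, x)         # negative inputs become zero

def leaky_relu(x, a=0.01):
    return np.where(x > 0, x, a * x)  # small slope instead of zero for negatives

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x), np.tanh(x), relu(x), leaky_relu(x), softmax(x), sep="\n")
```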
Training a CNN
Courtesy: Andrej Karpathy
Once the architecture is fixed, training is possible.
Training is the finding of optimal weights for the convolutional and fully connected layers.
Backpropagation of error is used for updating the weights.
A loss function is optimized with respect to the weights.
Training typically uses Stochastic Gradient Descent (SGD) optimization.
Parameters include learning rate, optimizer, batch size, validation split, metric, and loss.
Training maps a set of inputs to a set of outputs from training data.
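A minimal sketch of the stochastic gradient descent update on a toy one-weight regression problem (the data and learning rate are made up for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # target relation: y = 2x

w = 0.0
learning_rate = 0.05
rng = np.random.default_rng(0)
for step in range(300):
    i = rng.integers(len(x))             # one random sample: the "stochastic" part
    grad = 2 * (w * x[i] - y[i]) * x[i]  # gradient of squared error w.r.t. w
    w -= learning_rate * grad            # SGD weight update
print(w)  # approaches 2.0
```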
Losses
Courtesy: Machine Learning Mastery
A loss function is used to evaluate the performance of a prediction; it is a measure of the error in the prediction.
• Binary classification problems: Cross-Entropy
• Multi-class classification problems: Categorical Cross-Entropy, Sparse Categorical Cross-Entropy
• Regression problems: Mean Squared Error (MSE)
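These losses map directly onto Keras built-ins; a brief sketch with hypothetical labels and predictions:

```python
import numpy as np
from tensorflow import keras

y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7])
print(float(keras.losses.BinaryCrossentropy()(y_true, y_pred)))  # binary classification
print(float(keras.losses.MeanSquaredError()(y_true, y_pred)))    # regression

labels = np.array([2, 0])                             # integer class labels
probs = np.array([[0.1, 0.2, 0.7], [0.8, 0.1, 0.1]])  # predicted class probabilities
print(float(keras.losses.SparseCategoricalCrossentropy()(labels, probs)))
```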
Metrics
Courtesy: Towards Data Science
Supervised Learning
• Classification problems: Confusion Matrix, Accuracy, Precision, Recall, F1 score, ROC, AUC
• Regression problems: Mean Absolute Error (MAE), Mean Squared Error (MSE), Coefficient of Determination (R²)
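A brief sketch of the classification metrics computed with scikit-learn (labels and predictions are hypothetical):

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))
print("accuracy: ", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
```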
Getting Started
Courtesy: Towards Data Science
• Install Keras and TensorFlow: pip install keras and pip install tensorflow
• Prepare the training and testing data: make separate folders, without repeating data, and put each class in its own folder within the training data
• Build the CNN layers using the TensorFlow library: use a Sequential model and keep stacking layers
• Select the optimizer: the most common choice is Adam
• Train the network: use suitable parameters based on the problem
• Finally, test the model on the test data
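Putting the steps together, a minimal end-to-end sketch (assumes a recent TensorFlow; the folder paths, image size, and class count are placeholder assumptions):

```python
from tensorflow import keras

# Step 2: one subfolder per class inside data/train and data/test (hypothetical paths)
train_ds = keras.utils.image_dataset_from_directory(
    "data/train", image_size=(64, 64), batch_size=32)
test_ds = keras.utils.image_dataset_from_directory(
    "data/test", image_size=(64, 64), batch_size=32)

# Step 3: stack layers with a Sequential model
model = keras.Sequential([
    keras.layers.Rescaling(1.0 / 255, input_shape=(64, 64, 3)),
    keras.layers.Conv2D(16, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(2, activation="softmax"),  # assuming 2 classes
])

# Steps 4-5: Adam optimizer, then train
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)

# Step 6: evaluate on the held-out test data
model.evaluate(test_ds)
```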
Other Networks
• Text recognition: RNTN, RNN
• Image recognition: CNN, DBN
• Object recognition: CNN, RNTN
• Time series analysis: RNN
• Video analysis: RNN