Muhammad Usman Akhtar
Ph.D. Scholar, School of Computer Science
Wuhan University, Wuhan, China
DEEP LEARNING
UNDERSTANDING FUNDAMENTALS
Outline
1 Machine Learning (ML)
2 Ingredients for training ML
3 Types of ML algorithms
3.1 Supervised Learning
3.2 Unsupervised Learning
3.3 Reinforcement Learning
4 Deep Learning (DL)
4.1 Why is DL useful?
4.2 Applications
5 Architectures
6 Activation Function
7 Popular Neural Network Architecture
7.1 Feedforward Neural network
7.2 Recurrent Neural network
7.3 Convolutional neural network
MACHINE LEARNING
1 Machine Learning
◂ Machine learning is the scientific study of algorithms and statistical models
that computer systems use to perform a specific task without using explicit
instructions, relying on patterns and inference instead. It is seen as a subset
of artificial intelligence.
2 Ingredients for Training an ML Algorithm
 Data
 Model
 Objective function
 Optimization Algorithm
Data
◂ First, we must prepare a certain amount of data to train with. Usually
this is historical data that is readily available.
Model
◂ The simplest model we can train is a linear model.
◂ In a weather-forecasting problem, that would mean finding some coefficients, multiplying
each variable by them, and summing everything to get the output.
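As a rough sketch in NumPy, with made-up feature names and coefficients (training would normally find the weights for us):

```python
import numpy as np

# Hypothetical weather features: [temperature, humidity, wind speed]
x = np.array([22.0, 0.65, 12.0])

# Illustrative coefficients (weights) and intercept (bias);
# training would find these values for us.
w = np.array([0.8, -5.0, -0.1])
b = 3.0

# Linear model: multiply each variable by its coefficient and sum everything.
y_hat = np.dot(w, x) + b
print(y_hat)
```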
Objective Function
(Diagram: data is fed into the model to obtain the output.)
We want the output to be as close to reality as possible. That's where the objective function comes in: it estimates how correct the model's outputs are.
Here our goal is to minimize the objective function, i.e., the error.
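A minimal sketch of one common objective function, the mean squared error (the toy values are illustrative):

```python
import numpy as np

def mse(targets, predictions):
    # Average squared distance between model outputs and reality;
    # the closer the outputs are to the targets, the smaller the error.
    return np.mean((targets - predictions) ** 2)

targets = np.array([3.0, -0.5, 2.0])
predictions = np.array([2.5, 0.0, 2.0])
print(mse(targets, predictions))  # ~0.167
```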
Optimization Algorithm
◂ It consists of the mechanics through which we vary the parameters of the
model to optimize the objective function.
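The most common such mechanism is gradient descent. A minimal sketch, assuming the mean-squared-error objective above and a toy dataset generated from the rule y = 2x + 1:

```python
import numpy as np

# Toy dataset following the rule y = 2x + 1
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

w, b = 0.0, 0.0
lr = 0.05                          # learning rate (a hyperparameter)

for _ in range(2000):
    y_hat = w * X + b
    error = y_hat - y
    # Gradients of the mean-squared-error objective w.r.t. w and b
    grad_w = 2 * np.mean(error * X)
    grad_b = 2 * np.mean(error)
    # Vary the parameters a small step against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # approaches 2 and 1
```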
3 Types of ML Algorithms
◂ Supervised Learning
◂ Unsupervised Learning
◂ Reinforcement learning
3.1 Supervised Learning
◂ Learning with a labeled training set. Starting from the analysis of a known training
dataset, the learning algorithm produces an inferred function to make predictions
about the output values. The system is able to provide targets for any new input
after sufficient training.
◂ Example: email classification and tea making, trained on labeled data.
3.2 Unsupervised Learning
◂ Unsupervised learning studies how systems can infer a function to describe
a hidden structure from unlabeled data. The system is not told the right
output; instead it explores the data and can draw inferences from it.
Grouping similar examples this way is called clustering.
◂ Example: house prices and an animal classifier
3.3 Reinforcement learning
◂ The agent interacts with its environment by producing actions and discovers errors or
rewards. This method allows machines and software agents to
automatically maximize their performance.
◂ Example: learn to play Go, reward: win or lose
DEEP LEARNING
4 Deep Learning (DL)
◂ Deep learning is a machine learning technique that teaches computers to do what
comes naturally to humans: learn by example, from visual, text, and sound data.
◂ Deep learning algorithms attempt to learn (multiple levels of) representation by
using a hierarchy of multiple layers; learning can be supervised, semi-supervised, or
unsupervised.
◂ If you provide the system tons of information, it begins to understand it and
respond in useful ways.
4.1 Why is DL useful?
 Manually designed features are often over-specified, incomplete, and take a long time
to design and validate
 Learned features are easy to adapt and fast to learn
 DL can utilize large amounts of training data
In ~2010, DL started outperforming other ML techniques, first in speech and vision, then in NLP.
4.2 Applications
◂ DL is a key technology behind driverless cars. It is the key to voice control in
consumer devices like phones, tablets, TVs, and hands-free speakers.
◂ Medical Research
◂ Several big improvements in recent years in NLP
 Machine Translation
 Sentiment Analysis
 Dialogue Agents
 Question Answering
 Text Classification …
5 Architecture
◂ Inputs are combined linearly through a weight matrix. With 3 inputs and 4 outputs, the weight matrix holds 3 × 4 = 12 parameters:

W = | w11  w12  w13  w14 |
    | w21  w22  w23  w24 |
    | w31  w32  w33  w34 |

◂ Multiplying the input by the weights gives the output shape: (1 × 3)(3 × 4) = (1 × 4).
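The same shape arithmetic as a minimal NumPy sketch (the weight values are random placeholders):

```python
import numpy as np

x = np.random.rand(1, 3)   # one sample with 3 input features: shape (1, 3)
W = np.random.rand(3, 4)   # weight matrix with 3 * 4 = 12 parameters
b = np.zeros((1, 4))       # one bias per output unit

out = x @ W + b            # (1, 3) @ (3, 4) -> (1, 4)
print(out.shape)           # (1, 4)
```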
5.1 Layers
◂ Each layer combines its inputs linearly and then applies a non-linearity; the non-linearity changes only the linearity of the expression, not its shape.
5.2 Packages
◂ NumPy: a third-party package used for computations; allows us to work with
multi-dimensional arrays.
◂ Matplotlib: a 2D plotting package, especially designed for visualizing
Python and NumPy computations.
◂ TensorFlow: machine learning, especially deep learning.
◂ scikit-learn: features various algorithms like support vector machines, random
forests, and k-nearest neighbors, and also supports Python numerical and
scientific libraries like NumPy and SciPy.
5.3 Hyperparameters vs. Parameters
Hyperparameters (pre-set by us):
◂ Width
◂ Depth
◂ Learning Rate
Parameters (found by optimizing):
◂ Weights (w)
◂ Biases (b)
5.4 Vanishing Gradient
◂ Each of the neural networks weights receives an update proportion to partial
derivation of error function with respect to the current weight in each iteration
of training.
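A small illustration of why this matters: sigmoid derivatives are at most 0.25, so multiplying one such factor per layer, as backpropagation does, shrinks the gradient exponentially with depth:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)   # at most 0.25, reached at z = 0

grad = 1.0
for layer in range(10):
    # Backprop multiplies one activation derivative per layer;
    # weights are ignored here to isolate the activation's effect.
    grad *= sigmoid_grad(0.0)
    print(f"after layer {layer + 1}: {grad:.2e}")
# After 10 layers the gradient has shrunk by a factor of ~10^6.
```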
5.5 Underfitting and Overfitting
◂ An underfit model is too simple to capture the structure in the training data; an overfit model memorizes noise in the training data and fails to generalize to new data.
5.6 Training Loss and Validation Loss
◂ If the training loss keeps decreasing while the validation loss starts rising, the model has begun to overfit.
6 Activation Functions
◂ A neural network without an activation function would simply be a linear
regression model, which is limited in complexity and has less power to learn
complex functional mappings such as images, videos, audio, speech,
etc.
6.1 Sigmoid (Logistic Function)
◂ A sigmoid activation squishes values between 0 and 1. That is helpful for
updating or forgetting data: any number multiplied by 0 is 0, causing
values to disappear or be "forgotten," while any number multiplied by 1 keeps the
same value and is "kept." The network can thus
learn which data is unimportant and can be forgotten, and which data is
important to keep.
6.2 Activation: Tanh
◂ The tanh activation is used to help regulate the values flowing through the
network. The tanh function squishes values to always be between -1 and 1.
6.3 ReLU
◂ Takes a real-valued number and thresholds it at zero: f(x) = max(0, x).
◂ Used within hidden layers; for the output layer, softmax is used instead.
◂ Helps prevent the vanishing-gradient problem.
6.4 Softmax
◂ A function that takes as input a vector of K real numbers and normalizes
it into a probability distribution of K probabilities, each proportional to
the exponential of its input number: softmax(z)_i = exp(z_i) / Σ_j exp(z_j).
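Minimal NumPy sketches of the four activations above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))        # squishes values into (0, 1)

def tanh(z):
    return np.tanh(z)                      # squishes values into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)              # thresholds at zero

def softmax(z):
    e = np.exp(z - np.max(z))              # subtract max for numerical stability
    return e / e.sum()                     # K probabilities that sum to 1

z = np.array([2.0, -1.0, 0.5])
print(sigmoid(z), tanh(z), relu(z), softmax(z), sep="\n")
```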
7 Popular Neural Networks
7.1 Feedforward Neural Network
◂ In a feedforward neural network, information flows in only the forward direction:
from the input nodes, through the hidden layers, to the output nodes.
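A minimal sketch of one forward pass through a two-layer feedforward network (the layer sizes and random weights are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1, 3))          # input layer: 3 features

W1 = rng.random((3, 4)); b1 = np.zeros((1, 4))   # hidden layer: 4 units
W2 = rng.random((4, 2)); b2 = np.zeros((1, 2))   # output layer: 2 units

h = np.maximum(0.0, x @ W1 + b1)   # ReLU in the hidden layer
out = h @ W2 + b2                  # information flows only forward
print(out.shape)                   # (1, 2)
```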
7.2 Recurrent NN
◂ Recurrent Neural Networks, or RNNs, were designed to work with
sequence prediction problems rather than local features.
◂ Sequence prediction problems come in many forms and are best
described by the types of inputs and outputs supported.
◂ Unlike a feedforward network, which considers only its current input, an RNN
handles sequential data, memorizes time-series input, and considers all
previous inputs as well (a minimal recurrent cell is sketched after the lists below).
Use RNNs For:
◂ Text data
◂ Speech data
◂ Classification prediction problems
◂ Regression prediction problems
◂ Machine Translation
Don’t Use RNNs For:
◂ Tabular data
◂ Image data
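A minimal sketch of a vanilla recurrent cell, assuming tanh activation and illustrative shapes; it processes a sequence one time step at a time, reusing the same weights:

```python
import numpy as np

rng = np.random.default_rng(0)
seq = rng.random((5, 3))           # 5 time steps, 3 features each

Wx = rng.random((3, 4))            # input-to-hidden weights
Wh = rng.random((4, 4))            # hidden-to-hidden (recurrent) weights
b = np.zeros(4)

h = np.zeros(4)                    # hidden state carries memory across steps
for x_t in seq:
    # The same weights are reused at every time step;
    # h depends on the current input AND on all previous inputs.
    h = np.tanh(x_t @ Wx + h @ Wh + b)

print(h)                           # final state summarizes the whole sequence
```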
7.3 Convolutional NN
◂ CNNs were designed to map image data to an output variable. They have the ability
to develop an internal representation of a two-dimensional image.
◂ The CNN input is traditionally two-dimensional, a field or matrix, but it can also be
changed to one-dimensional, allowing the network to develop an internal
representation of a one-dimensional sequence. This lets CNNs be used
more generally on other types of data that have a spatial relationship
(a minimal convolution is sketched after the lists below).
◂ For example, there is an order relationship between words in a document of
text, and an ordered relationship between the time steps of a time series.
Use CNNs For:
◂ Image data
◂ Classification prediction problems
◂ Regression prediction problems
Try CNNs On:
◂ Text data
◂ Time series data
◂ Sequence input data
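A minimal sketch of the core CNN operation, a single 2D convolution with stride 1 and no padding (the kernel here is an illustrative edge detector):

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Slide the kernel over the image; each output value
            # summarizes one local patch of the input.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
edge_kernel = np.array([[1.0, -1.0],
                        [1.0, -1.0]])   # responds to left-right differences
print(conv2d(image, edge_kernel))       # shape (4, 4)
```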
Application Example:
IMDB Movie reviews sentiment classification
◂ https://uofi.box.com/v/cs510DL
◂ 50K reviews: 25K positive, 25K negative
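As a hedged sketch, Keras ships a preprocessed copy of this dataset (assuming TensorFlow is installed; the 10,000-word vocabulary cutoff is an arbitrary choice, and this copy may differ from the file behind the link above):

```python
from tensorflow.keras.datasets import imdb

# Keep only the 10,000 most frequent words (an illustrative cutoff).
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)

print(len(x_train), len(x_test))   # 25000 25000
print(y_train[:5])                 # 1 = positive review, 0 = negative
```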
Application Example:
Relation Extraction from text
Useful for:
• knowledge base completion
• social media analysis
• question answering
• …
Possible Questions
• When should supervised learning be used?
• Which algorithm is best for time-series-dependent problems?
• What is 10-fold cross-validation? (sketched below)
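On the last question: 10-fold cross-validation splits the data into ten folds, trains on nine and validates on the held-out one, rotating so each fold serves as the validation set exactly once. A minimal scikit-learn sketch (the iris data and logistic regression model are placeholders):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# cv=10 -> ten folds: each fold is used once for validation.
scores = cross_val_score(model, X, y, cv=10)
print(scores.mean(), scores.std())
```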
Thank you!
You can find me at
◂ ua@uetpeshawar.edu.pk
Q&A
Editor's Notes

  • #10 w1 and w2 are the parameters that will change. For each set of parameters we compute the objective function, and then we choose the model with the highest predictive power.
  • #13 In supervised learning, we provide the algorithm with inputs and their corresponding desired outputs; based on this information, it learns how to produce outputs closer to the ones we are looking for.
  • #14 Sometimes we don't have the time or resources to label the whole dataset. Unsupervised learning discovers patterns in unlabeled data: we don't tell the algorithm what our goal is, but instead ask it to find some sort of dependence or underlying logic in the data provided.
  • #15 The agent learns to act based on feedback/reward.
  • #20 Remember that we combine the inputs linearly and then add a non-linearity. How? The inputs are X, and to join them linearly we need weights; in this example the weights form a 3 × 4 matrix, so (1 × 3)(3 × 4) = (1 × 4) and we get a 1 × 4 vector. Non-linearities don't change the shape of the expression, just its linearity.
  • #22 TensorFlow is the leading library for neural networks (deep NNs, convolutional NNs, recurrent NNs), released by Google in 2015. For k-means and random forests, scikit-learn is the better choice.
  • #28 Sigmoid neurons saturate and kill gradients: the network barely learns when a neuron's activations are near 0 or 1 (saturation), since the gradient in those regions is almost zero and almost no signal flows to the weights. If the initial weights are too large, most neurons saturate. Sigmoid is especially used for models where we have to predict a probability as the output, since probabilities exist only in the range 0 to 1.
  • #29 Like sigmoid, tanh neurons saturate; unlike sigmoid, the output is zero-centered. Tanh is a scaled sigmoid: tanh(x) = 2·sigmoid(2x) − 1.
  • #30 Most deep networks use ReLU nowadays. Range: [0, ∞). It trains much faster, accelerating the convergence of SGD thanks to its linear, non-saturating form; it uses less expensive operations, implemented by simply thresholding a matrix at zero; and it is more expressive. The problem with ReLU is that some gradients can be fragile during training and can die: a weight update can make a neuron never activate on any data point again, producing dead neurons. Leaky ReLU was introduced to fix this; it adds a small slope for negative inputs to keep the updates alive.
  • #31 The softmax function outputs a vector that represents the probability distribution over a list of potential outcomes.