INTRODUCTION TO DEEP LEARNING
Big Guns
Google Trends
Real?
Look Around!
FR - Image Captioning API
Deep Learning can describe images well
For Visually Impaired People
Self Driving Cars - Detection Mechanisms
MIT 6.S094 - Self-Driving
So Many Applications
1. How FB clusters images - OpenFace
2. Google Neural Machine Translation - Seq2Seq
3. Google Assistant / Siri - Seq2Seq, Attention Mechanisms
4. Self-Driving Cars - Detection, Deep Reinforcement Learning
5. Signal Processing - Discriminative and separable feature extractors
Yes… Even On Wall Street!
Algorithmic Trading With Bots
THE GAP BETWEEN ACADEMIA AND
INDUSTRY IS REALLY, REALLY SMALL
Where To Start?
● This is different from traditional Machine Learning
● SVM, Decision Trees, Random Forest, Naive Bayes, Regression
Where to use traditional ML?
● Good for problems with linear classification boundaries
Not Lucky Enough…
● Non-Linear Boundaries
● We need algorithms with more non-linearity
Problems Getting Complex With Big Data
1. Data is the new oil
2. Economy is data driven
3. So many patterns
4. So many connections
Mining the Big Data
Again, more NON-LINEARITY
The Best Fit is a Deep Neural Architecture
AI vs ML vs DL
● In a nutshell
Machine Learning Basics - Very Brief
1. Supervised Learning - Neural Nets, SVM, Naive Bayes, Random Forest
2. Unsupervised Learning - K-Means Clustering
3. Reinforcement Learning - Model-Free Learning, MDPs, Q-Learning
4. Semi-Supervised Learning - GANs (new)
Supervised vs Unsupervised
Reinforcement Learning
Semi-Supervised Learning (GAN)
Again, Why Deep Learning?
❏ Traditional ML works with a more rule-based structure
❏ Not deep enough to extract complex patterns
❏ Selected features must be carefully fed in: PCA, handcrafted features (Haar)
Haar Features
SIFT Features
Deep Learning Allows: End-to-End Training
Where to Start…
★ Neural Architecture
Single Layer Neural Net
Architecture - Basics
● Classification Algorithm
● Trainable Weight Set
● Labeled Data
Let’s start with a node
Perceptron
Nonlinearity applied inside a node
What Kind Of Nonlinearity?
Nonlinearity to mine complex boundaries
Basic Idea:
These units are there to squash the information, which means they have a
working range.
Ex: Activation range of the sigmoid (a) and tanh (b) neurons
Note the near-linear region in the middle of each curve
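A minimal NumPy sketch of a single node (the input and weight values below are made-up examples): the node takes a dot product of inputs and weights, adds a bias, and squashes the result with a nonlinearity so the output stays inside the activation's working range.

    import numpy as np

    def sigmoid(z):
        # squashes any real input into the range (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    def node(x, w, b):
        # dot product + bias, then the nonlinearity applied inside the node
        return sigmoid(np.dot(w, x) + b)

    x = np.array([0.5, -1.2, 3.0])   # example inputs (assumed values)
    w = np.array([0.4, 0.6, -0.1])   # trainable weights
    b = 0.1                          # trainable bias

    print(node(x, w, b))                  # always between 0 and 1
    print(np.tanh(np.dot(w, x) + b))      # tanh squashes into (-1, 1) instead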
Understanding the Neural Model
● Training Phase: Supervised Learning | Labeled data
● Inference / Testing Phase: Checking the model on unseen data
Architecture
Forward Pass
• Calculating the loss
Backward Pass
• Backpropagation Algorithm
• Distributing Gradients
Optimizing
• Reducing the loss
• Updating the weight matrix
Updating weights - SGD
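To make the forward pass / backward pass / update cycle concrete, here is a small NumPy sketch of training a single sigmoid node with gradient descent. The toy data, learning rate, and step count are assumed for illustration, and real SGD would use mini-batches rather than the full batch used here.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))            # toy labeled data (made up)
    y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float).reshape(-1, 1)

    W = rng.normal(scale=0.1, size=(3, 1))   # trainable weight set
    b = np.zeros((1,))
    lr = 0.5                                  # assumed learning rate

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for step in range(200):
        # Forward pass: predictions and loss (mean squared error here)
        p = sigmoid(X @ W + b)
        loss = np.mean((p - y) ** 2)

        # Backward pass: the chain rule distributes gradients to W and b
        dp = 2 * (p - y) / len(X)             # dLoss/dp
        dz = dp * p * (1 - p)                 # through the sigmoid
        dW = X.T @ dz                         # dLoss/dW
        db = dz.sum(axis=0)                   # dLoss/db

        # Optimizing: the gradient-descent update reduces the loss
        W -= lr * dW
        b -= lr * db

    print(f"final loss: {loss:.4f}")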
YOU CAN’T TRAIN
IF THERE ARE NO GRADIENTS
Deep Neural Networks
This went deeper!
How?
With the help of two superheroes!
Deep Neural Networks
What makes this special?
Hierarchical Feature Representation
Why is this feature hierarchy so effective?
Then what happened ?
Deep Neural Nets became harder and harder to train!
DEEP NETS!
Y U NO EASY?
Number Of Parameters?
These nets are huge!
Many layers, many nodes, many parameters
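For a sense of scale (the layer sizes below are an assumed example, not from the slides): even a small fully connected net already has hundreds of thousands of trainable parameters.

    # parameter count for an assumed MLP: 784 -> 512 -> 512 -> 10
    layers = [784, 512, 512, 10]
    total = sum(n_in * n_out + n_out            # weights plus biases per layer
                for n_in, n_out in zip(layers, layers[1:]))
    print(total)  # 669706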
BUT DON’T KNOW
HOW MANY HIDDEN
LAYERS / NODES TO
USE?
WHEN YOU WANT TO
BUILD A NEURAL NET
Overfitting
The Algorithm Fails To Generalize
❏ When predicting, the algorithm should be logical: it should take decisions
based on the patterns it has learned.
❏ Otherwise it will eventually fail on unseen data
❏ As networks become bigger and bigger, they run into the overfitting issue
more and more
Common Regularization wasn’t enough
● Adding some kind of penalty when tweaking parameters
Common Regularization Methods
Not Enough!
Left - L2 Regularization | Right - L1 Regularization
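A minimal sketch of what "adding a penalty" means in code (the penalty strength lam is an assumed hyperparameter): the L2 version punishes large weights, while the L1 version pushes many weights toward exactly zero.

    import numpy as np

    def loss_with_l2(pred, target, W, lam=1e-3):
        # data loss plus an L2 penalty on the weights
        return np.mean((pred - target) ** 2) + lam * np.sum(W ** 2)

    def loss_with_l1(pred, target, W, lam=1e-3):
        # L1 penalty instead: encourages sparse weights
        return np.mean((pred - target) ** 2) + lam * np.sum(np.abs(W))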
Then new methods came along
1. Dropout
2. Modified SGD for optimization - Adam, RMSprop, Adagrad
3. Different architectures like CNN
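A short PyTorch sketch showing two of these methods together (the layer sizes and hyperparameters are assumed examples): dropout between layers, and Adam in place of plain SGD.

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Dropout(p=0.5),   # randomly zeroes half the activations during training
        nn.Linear(256, 10),
    )
    # Adam adapts the step size per parameter instead of using one global rate
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)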
Killing the training process
Neuron in a neural network
Sigmoid activation function
Batch Normalization Helped DL a LOT!
A neural net should have more unsaturated neurons.
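A sketch of where batch normalization sits in a network (layer sizes assumed): normalizing each layer's inputs keeps activations inside the non-saturated range of the nonlinearity.

    import torch.nn as nn

    block = nn.Sequential(
        nn.Linear(256, 256),
        nn.BatchNorm1d(256),   # zero mean, unit variance per feature, per batch
        nn.ReLU(),
    )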
● Backpropagation is basically applying the chain rule
● In order to make that happen, we need to calculate local gradients for each node
and connection
Network with ReLU activation function
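A quick NumPy illustration of why this matters for the chain rule (the probe values of z are arbitrary): a saturated sigmoid has a near-zero local gradient, so repeated multiplication through many layers kills the signal, while ReLU passes a gradient of exactly 1 for any active unit.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    z = np.array([-10.0, 0.0, 10.0])   # arbitrary probe inputs

    s = sigmoid(z)
    print(s * (1 - s))             # [~0.00005, 0.25, ~0.00005]: vanishes when saturated

    print((z > 0).astype(float))   # [0., 0., 1.]: ReLU's local gradient for active units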
Feature Engineering
Canny edge detection filter
Can we design a set of features for machines?
No way!
We may design some high level features!
But our machines deal with PIXELS!
(In other domains like NLP also)
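As an example of such a hand-engineered feature, here is an OpenCV sketch of the Canny edge detector mentioned above (the file name and the two hysteresis thresholds are assumed values):

    import cv2

    img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)   # assumed input image
    edges = cv2.Canny(img, 100, 200)                      # lower/upper thresholds
    cv2.imwrite("edges.jpg", edges)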
What if we let the machine extract its own features!
Deep Learning is all about End-to-End Training
Automated Feature Extraction
Visualization of the self-learned filters of a CNN. Each layer learns different features.
Some Cool Applications
Self driving cars
Generative Chatbots
Detectors
VQA networks
Introduction to deep learning workshop

Editor's Notes

  • #2 Topics: 1. Very short intro to what ML is 2. Difference between ML and DL 3. Why DL is so effective 4. Modern-day applications
  • #7 Some applications - making DL more familiar.
  • #8 In FR we use this
  • #12 Applications and their papers or algorithms
  • #20 If we can mimic the brain it will be a good solution
  • #22 Give a little lead-in; the coming slides have the information
  • #26 Here just give a hint of why deep learning is needed
  • #27 From here onwards I will give some theoretical stuff in o
  • #28 Explaining the neural architecture
  • #30 Simple introduction to the smallest unit of a neural network. How the dot product and non-linearity are applied
  • #32 Make sure they get the idea that these functions can mine complex non-linear boundaries
  • #33 Then I should speak about the training phase
  • #34 Forward pass, loss function, reducing the loss, tweaking the weights. FINAL LAYER
  • #35 Speak generally about the importance of gradients and the backward pass
  • #36 Very brief idea - this is a 3-dimensional loss function, but in reality it is MULTI-DIMENSIONAL
  • #37 What is happening during training. We don't use naive optimization
  • #38 Somehow you should calculate the gradients
  • #41 At that time these networks outperformed everything, so they needed to develop them deeper…
  • #43 Show them how the number of nodes, parameters, and weights has increased. Same process
  • #44 The hierarchical feature representation and classifying. NOT JUST FOR IMAGES!
  • #45 Bottom-to-top approach. Why does this structure matter so much?
  • #46 Will explain briefly how the brain likes to see things. They used the cortex of a cat
  • #47 Why people had to move away from deep neural nets
  • #48 Give them the idea of how much the number of nodes and parameters has increased
  • #49 Give them the idea that these nets are complex and have many things to calculate. I will talk about a bit of the problems
  • #50 Very brief
  • #55 Mathematical complexity
  • #57 Most of the neurons have died
  • #58 Deep Learning, NOT Feature Engineering
  • #59 Feature engineering - these are human-engineered features. Then people tried to make this automated. This is the starting point of the CNN
  • #60 This is like deciding how to make a filter
  • #61 We don't work with the pixels. We use some high-level features
  • #62 New way of thinking
  • #63 Some features don't even make sense to us