Deep Learning
Presentation By:
Yasas Senarath, Research Assistant,
University of Moratuwa
Lecturer in Charge:
Dr. Uthayasanker Thayasivam
Overview
1. Artificial Neural Network Basics
2. Introduction to Deep Learning
3. Convolutional Neural Networks (CNNs)
4. Recurrent Neural Networks (RNNs)
5. Practical Session
Artificial Neural Network Basics
Artificial Neural Network
• Computational model based on the structure and functions of
biological neural networks
The structure of a single artificial neuron
The structure of a basic biological neuron
A Neuron - Function
• Receiving information: the processing unit receives the inputs x1, x2, ..., xn.
• Weighting: each input xi is multiplied by its corresponding weight wi; w0 acts as the bias (the weight of a constant input of 1).
• Activation: an activation function f is applied to z, the sum of all the weighted inputs.
• Output: the output y = f(z) is generated.
The structure of a single artificial neuron
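The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the lecture: the function name `neuron`, the choice of a sigmoid for f, and the treatment of w0 as a separate bias argument are all assumptions made for the example.

```python
import math

def neuron(inputs, weights, bias):
    """Single artificial neuron: weighted sum followed by an activation.

    `weights` plays the role of w1..wn and `bias` the role of w0
    (an assumption; the slide only lists w0..wn).
    """
    # Weighting: multiply each input by its weight and sum, plus the bias.
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    # Activation: the sigmoid is used here as one possible choice of f.
    y = 1.0 / (1.0 + math.exp(-z))
    return z, y

z, y = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(z, y)  # z is 0.0, so the sigmoid output is 0.5
```

Swapping the sigmoid for another activation changes only the line that computes `y`.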
Activation Function
• Threshold Function
• Sigmoid Function
• Hyperbolic Tangent Function
• Rectified Linear Units
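The four activation functions listed above might be written as follows. This is a sketch using only the standard library; placing the threshold at 0 and returning 1 for z >= 0 is a common convention assumed here, not something the slide specifies.

```python
import math

def threshold(z):
    # Step function: fires (1) when the input reaches 0, otherwise 0.
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    # Squashes any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Squashes any real number into the interval (-1, 1).
    return math.tanh(z)

def relu(z):
    # Rectified linear unit: passes positive values, clips negatives to 0.
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, threshold(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```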
Feedforward Neural Network
• Connections between the nodes do not form a cycle
The structure of a fully connected 3-Layer neural network
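A forward pass through a small fully connected network can be sketched as below. The layer sizes, the weight values, and the helper names `dense` and `forward` are hypothetical; the point is only that data flows from layer to layer in one direction, with no cycles.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: every output unit sees every input."""
    return [
        activation(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]

def forward(x, layers):
    """Pass x through the layers in order; no connection loops back."""
    for weights, biases in layers:
        x = dense(x, weights, biases, sigmoid)
    return x

# Hypothetical weights: 2 inputs -> 3 hidden units -> 1 output.
layers = [
    ([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.5]], [0.0]),
]
output = forward([1.0, 2.0], layers)
print(output)
```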
Deep Neural Networks
The Black Box
[Figure: recognition pipeline with the stages Preprocessing, Feature Extraction, Classifier and Post Processing; output label: CAR]
Issues with Hand-Engineered Features
• Feature engineering is the step most critical for accuracy
• It is also the most time-consuming step in development
• What is the best feature?
• What is next? Keep on crafting better features?
• Let's learn the feature representation directly from data.
Learning Features and Classifier Together
• A non-linear mapping that takes raw pixels directly to labels
• How to build it?
• By combining simple building blocks (i.e. layers in a neural network)
[Figure: two alternatives labelled Option 1 and Option 2, with the question "Which is better?" The slide's answer: Option 2 is better.]
Intuition behind Deep Neural Nets
• Each layer has parameters that are learned
• Composing the layers makes a highly non-linear system
• In the case of classification:
• The final layer outputs a probability distribution over the categories.
[Figure labels: A Layer, Final Layer]
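One common way for a final layer to produce a probability distribution over categories is the softmax function. The slides do not name the function used, so treating it as softmax is an assumption; the scores below are made up for illustration.

```python
import math

def softmax(scores):
    """Turn arbitrary real-valued scores into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability; the result is unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three categories.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest score receives the largest probability, and every category keeps a non-zero share.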
Training a Deep Neural Network
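• Compute the loss on small batches (forward propagation)
• Compute the gradient of the loss w.r.t. the parameters (backpropagation)
• Use the gradient to update the parameters
[Figure: network diagram labelled X, y1, y and Error. Design choices listed: number of hidden units, number of hidden layers, type of layer, loss function]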
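The three training steps can be sketched on the simplest possible model, a line y = w * x + b fitted with mini-batch gradient descent. This is an illustrative toy, not the lecture's code: the model, the squared-error loss, the learning rate and the data are all assumptions, and the gradients are written out by hand instead of being computed by backpropagation through several layers.

```python
import random

def train(data, lr=0.05, epochs=200, batch_size=4, seed=0):
    """Fit y = w * x + b with mini-batch gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Forward propagation: prediction errors on the batch.
            errors = [(w * x + b) - y for x, y in batch]
            # Gradient of the mean squared error w.r.t. w and b.
            grad_w = sum(2 * e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            grad_b = sum(2 * e for e in errors) / len(batch)
            # Update: step against the gradient.
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = train(data)
print(w, b)  # should end up close to 2 and 1
```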
Types of Layers
• Dense Layer (activation=ReLU)
• Convolutional Layer in Convolutional Neural Network (CNN)
• Recurrent Neural Network (RNN) Layer
  • Simple RNN cell
  • LSTM cell
  • GRU cell
Convolutional Neural Networks (CNNs)
• Also known as ConvNets
• Regular (fully connected) neural nets don’t scale well to full images
• Neurons are arranged in 3D volumes
  • Depth
  • Height
  • Width
• Mainly used in
  • Image Processing
  • Natural Language Processing
Consider an Image…
Example: 1000 x 1000 image
1 M hidden units, each connected to every pixel
10^12 (1 trillion) parameters to optimize!
Reduce connections to local regions
Example: 1000 x 1000 image
1 M hidden units
Filter size: 10 x 10
100 M parameters
Reuse the same kernel everywhere
Why?
Because interesting features (e.g. edges) can occur anywhere in the image
Share the same parameters across different locations
Convolution with learned kernels
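Sliding one small kernel over every location of the image is what parameter sharing looks like in code. The sketch below uses a hand-picked 2 x 2 edge kernel instead of a learned one, and computes the "valid" cross-correlation that deep learning libraries usually call convolution; the image and kernel values are made up.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: the same kernel is applied at every location."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# A 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2x2 kernel that responds to left-to-right intensity increases.
kernel = [
    [-1, 1],
    [-1, 1],
]
print(conv2d(image, kernel))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The four kernel weights detect the edge in every row, which is why the parameter count no longer depends on the image size.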
Convolutional Neural Nets
Learn Multiple Filters
Example: 1000 x 1000 image
100 Filters
Filter size: 10 * 10
10 K parameters
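Multiplying out the three scenarios (fully connected, locally connected, and shared filters) shows how quickly the weight count drops. The variable names are illustrative and biases are ignored.

```python
pixels = 1000 * 1000          # inputs in a 1000 x 1000 image
hidden_units = 1_000_000      # 1 M hidden units
filter_weights = 10 * 10      # weights in one 10 x 10 filter
num_filters = 100

# Fully connected: every hidden unit has a weight for every pixel.
fully_connected = pixels * hidden_units
# Locally connected: every hidden unit sees only one 10 x 10 patch.
locally_connected = hidden_units * filter_weights
# Convolutional: each filter's weights are shared across all locations.
convolutional = num_filters * filter_weights

print(f"{fully_connected:.0e}")    # 1e+12
print(f"{locally_connected:.0e}")  # 1e+08
print(f"{convolutional:,}")        # 10,000
```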
Handling Multiple Channels
• An image may contain multiple channels
• E.g. a 3-channel (R, G, B) image
• A separate k x k filter is applied to each of the 3 channels, and the per-channel responses are summed into a single feature map
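A minimal sketch of the multi-channel case, assuming the usual convention that the per-channel responses are added element-wise into one feature map. The tiny image and the 1 x 1 kernels are hypothetical and chosen so the result is easy to check by hand.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of one channel with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def conv2d_multichannel(channels, kernels):
    """Apply one k x k kernel per channel and add the responses element-wise."""
    responses = [conv2d(ch, k) for ch, k in zip(channels, kernels)]
    rows, cols = len(responses[0]), len(responses[0][0])
    return [
        [sum(resp[r][c] for resp in responses) for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical 2x2 RGB image with one 1x1 kernel per channel.
red = [[1, 2], [3, 4]]
green = [[10, 20], [30, 40]]
blue = [[100, 200], [300, 400]]
kernels = [[[1]], [[1]], [[1]]]
print(conv2d_multichannel([red, green, blue], kernels))  # [[111, 222], [333, 444]]
```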
Translation Invariance
Assume we want to build an eye detector
Problem: How can we make the detection robust to the exact location of the eye?
Translation Invariance
Solution: Use pooling (max / average) on the filter responses
• Provides robustness to the exact spatial location of features
• Also sub-samples the image, allowing the next layer to look at larger spatial regions
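The effect of max pooling can be seen on two feature maps in which the same strong response sits at slightly different positions. The function below is a sketch with non-overlapping windows (stride equal to the window size); the feature maps are made up.

```python
def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window (stride = size)."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# The same strong response (9) at two slightly different positions.
a = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
b = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(max_pool(a))  # [[9, 0], [0, 0]]
print(max_pool(b))  # [[9, 0], [0, 0]]
```

Both inputs pool to the same 2 x 2 output, which is also half the size of the input in each dimension.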
Summary of Complete CNN
• Doing all of this (convolution, followed by optional normalization and pooling) constitutes one layer
• Pooling and normalization are optional
• Stack such layers and train them just like multilayer neural nets
• Multiple convolutional layers can be used to learn high-level features
• The final layer is usually a fully connected neural net with output size == number of classes
Recurrent Neural Network (RNN)
• Takes the order of the elements in a sequence into account
• Used in forecasting
• Applications
  • Language Modelling
  • Machine Translation
  • Conversation Bots
  • Image Description
  • Image Search
Structure of RNN
• Performs the same task for every element of a sequence, with the output depending on the previous computations
• Has a “memory” which captures information about what has been calculated so far
An unrolled recurrent neural network.
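The recurrence can be sketched with a scalar RNN: the same function and the same weights are applied at every step, and the hidden state h carries the "memory" forward. The tanh update rule is the standard simple-RNN form, assumed here because the slides give no equations; the weight values are arbitrary.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One step of a scalar RNN: the new state mixes the input and the old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    h = 0.0  # the "memory" starts empty
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)  # same weights at every step
        states.append(h)
    return states

# An input at the first step still influences the state two steps later.
print(run_rnn([1.0, 0.0, 0.0]))
print(run_rnn([0.0, 0.0, 0.0]))
```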
A Simple RNN
• Performs the same task for every element of a sequence, with the output depending on the previous computations
Unrolled RNN
The Problem of Long-Term Dependencies
• Consider a language model trying to predict the next word based on the previous ones
• The larger the gap between the relevant context and the word to predict, the harder it is for an RNN to learn the connection
• Theoretically this should be possible, but in practice simple RNNs fail to learn long-term dependencies
Short gap: "The clouds are in the sky"
Long gap: "I grew up in France … I speak fluent French"
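The slides do not say why long gaps are hard, but the usual explanation is that the influence of an early state on a later one shrinks multiplicatively with every step (the vanishing gradient problem). The sketch below measures that influence for a scalar RNN with the input term omitted; the update rule, the recurrent weight 0.8 and the starting state are assumptions chosen for illustration.

```python
import math

def influence_of_first_state(steps, w_h=0.8):
    """d h_T / d h_0 for the scalar RNN h_t = tanh(w_h * h_{t-1})."""
    h, grad = 0.5, 1.0
    for _ in range(steps):
        h = math.tanh(w_h * h)
        # Chain rule: the derivative of tanh(u) is 1 - tanh(u)^2.
        grad *= w_h * (1.0 - h * h)
    return grad

for gap in (1, 5, 20, 50):
    print(gap, influence_of_first_state(gap))
```

Each step multiplies the influence by a factor below 0.8, so after 50 steps it is smaller than 0.8 to the power of 50.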
LSTM - Hochreiter & Schmidhuber (1997)
• A special kind of RNN
• Capable of learning long-term dependencies
• Remembering information for long periods of time is practically their default behaviour, not something they struggle to learn!
An unrolled LSTM
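A scalar sketch of one LSTM step, using the standard gate formulation (forget, input and output gates plus a candidate value). The slides give no equations, so the formulation, the parameter layout in `p` and the parameter values are assumptions. The values are picked so that the forget gate stays open and the input gate stays shut, which makes the cell state persist.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell. `p` maps gate name -> (w_x, w_h, bias)."""
    def gate(name, squash):
        w_x, w_h, b = p[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to write
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # the new information itself
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

# Hypothetical parameters: forget gate saturated open, input gate saturated shut.
params = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 1.0
for x in [0.3, -0.7, 0.5] * 10:  # 30 steps of unrelated input
    h, c = lstm_step(x, h, c, params)
print(c)  # stays very close to the initial value 1.0
```

Because the cell state is updated additively and gated, information can be carried across many steps without being repeatedly squashed.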
LSTM Modes and examples
Example Model (Image Captioning)
Practical Session
• See https://online.mrt.ac.lk/mod/folder/view.php?id=65448
• Follow the instructions in Moodle to get started with Colab
• Then follow the instructions in the Python notebook
Resources
1. http://cs231n.github.io/convolutional-networks/
2. http://www.cs.umd.edu/~djacobs/CMSC733/CNN.pdf
3. http://colah.github.io/posts/2015-08-Understanding-LSTMs/
4. https://wiki.tum.de/display/lfdv/Artificial+Neural+Networks


Editor's Notes

  1. Similar to what we discussed in the past few slides. However, a traditional deep network, in which each unit takes all the outputs of the previous layer as input, is not scalable. Images, which are one of the things that led to the development of this network, are complex to learn from and large as well; that is, the traditional neural networks are …
  2. These are neurons
  3. In the second sentence, recent information suggests that the next word is probably the name of a language, but if we want to narrow down which language, we need the context of France, from further back. It’s entirely possible for the gap between the relevant information and the point where it is needed to become very large.