
Convolutional Neural Networks (D1L3 Deep Learning for Speech and Language)


Published at: https://telecombcn-dl.github.io/2017-dlsl/

Winter School on Deep Learning for Speech and Language. UPC BarcelonaTech ETSETB TelecomBCN.

The aim of this course is to train students in methods of deep learning for speech and language. Recurrent Neural Networks (RNN) will be presented and analyzed in detail to understand the potential of these state-of-the-art tools for time series processing. Engineering tips and scalability issues will be addressed to solve tasks such as machine translation, speech recognition, speech synthesis or question answering. Hands-on sessions will provide development skills so that attendees can become competent in contemporary data analytics tools.

Published in: Data & Analytics


  1. [course site] Day 1 Lecture 3: Convolutional Neural Networks. Elisa Sayrol
  2. The Deep Neural Network. The i-th layer is defined by a matrix Wi and a vector bi; its activation is the matrix product Wi·x plus the bias bi, followed by a nonlinearity. Number of parameters to learn at the i-th layer: the entries of Wi plus those of bi.
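The parameter count of a fully connected layer follows directly from the shapes of Wi and bi. A minimal check in Python (the layer sizes here are illustrative, not from the slides):

```python
# Hypothetical layer sizes, for illustration only.
n_in, n_out = 1024, 512

# The layer computes y = W @ x + b, so it learns every entry of
# W (n_out x n_in) plus every entry of the bias vector b (n_out).
num_params = n_out * n_in + n_out
print(num_params)  # 524800
```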
  3. From Neurons to Convolutional Neural Networks. What if the input is a 2D signal? (images or spectrograms; the same ideas also apply to 1D signals)
  4. From Neurons to Convolutional Neural Networks. For a 200x200 image, a fully connected layer has 4x10^4 neurons, each with 4x10^4 inputs, that is 16x10^8 parameters, for only one layer! Figure credit: Ranzato
  5. From Neurons to Convolutional Neural Networks. For a 200x200 image, with 4x10^4 neurons each connected to a 10x10 "local connection" (also called receptive field), we get 4x10^6 parameters. What else can we do to reduce the number of parameters? Figure credit: Ranzato
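The two counts on slides 4 and 5 are simple arithmetic; spelling them out makes the 400x reduction from local connectivity explicit:

```python
height = width = 200
n_neurons = height * width               # 4x10^4 neurons, one per pixel

# Fully connected: every neuron sees every pixel of the image.
fc_params = n_neurons * (height * width)  # 16x10^8
# Locally connected: every neuron sees only a 10x10 receptive field.
local_params = n_neurons * (10 * 10)      # 4x10^6
print(fc_params, local_params)  # 1600000000 4000000
```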
  6. From Neurons to Convolutional Neural Networks. Translation invariance: we can use the same parameters to capture a specific "feature" in any area of the image, and try different sets of parameters to capture different features. These operations are equivalent to performing convolutions with different filters. Ex: with 100 different filters (or feature extractors) of size 10x10, the number of parameters is 10^4. Figure credit: Ranzato
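Weight sharing is what makes the count independent of image size: one filter is reused at every position. A sketch of the count and of the sliding-window operation itself (as in most CNN libraries, this computes cross-correlation, which the literature also calls convolution; the 5x5 input and averaging filter are illustrative):

```python
import numpy as np

# 100 shared filters of size 10x10 need only 100 * 10 * 10 = 10^4
# weights in total (plus one bias per filter), for any image size.
n_filters, fh, fw = 100, 10, 10
print(n_filters * fh * fw)  # 10000

def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (stride 1, no padding)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(25.0).reshape(5, 5)
k = np.ones((3, 3)) / 9.0          # simple averaging filter
print(conv2d_valid(img, k).shape)  # (3, 3)
```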
  7. From Neurons to Convolutional Neural Networks. ...and don't forget the activation function! ReLU, PReLU. Figure credit: Ranzato
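Both activations named on the slide are one-liners; a minimal sketch (the PReLU slope `a` is a learned parameter in practice, fixed here for illustration):

```python
def relu(x):
    # ReLU: pass positive values, zero out negative ones.
    return max(0.0, x)

def prelu(x, a=0.1):
    # PReLU: like ReLU, but with a (learned) slope `a` for negative inputs.
    return x if x > 0 else a * x

print(relu(-2.0), relu(3.0))    # 0.0 3.0
print(prelu(-2.0), prelu(3.0))  # -0.2 3.0
```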
  8. From Neurons to Convolutional Neural Networks. Most ConvNets use pooling (or subsampling) to reduce dimensionality and provide invariance to small local changes. Pooling options: max, average, stochastic pooling. Figure credit: Ranzato
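Max and average pooling can be sketched in a few lines of NumPy; this version uses non-overlapping 2x2 windows, the most common configuration (stochastic pooling, which samples an activation at random, is omitted):

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling over size x size windows."""
    h, w = x.shape
    x = x[:h - h % size, :w - w % size]  # trim to a multiple of `size`
    blocks = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    return blocks.mean(axis=(1, 3))      # average pooling

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [0., 0., 1., 1.],
              [0., 4., 1., 1.]])
print(pool2d(x, mode="max"))
# [[4. 8.]
#  [4. 1.]]
```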
  9. From Neurons to Convolutional Neural Networks. Padding (P): when computing the convolution at the borders, you may add values outside the image so the filter still fits. When the added values are zero, which is the most common choice, the technique is called zero-padding. When padding is not used, the output size is reduced. FxF=3x3
  10. From Neurons to Convolutional Neural Networks. Padding (P): when computing the convolution at the borders, you may add values outside the image so the filter still fits. When the added values are zero, which is the most common choice, the technique is called zero-padding. When padding is not used, the output size is reduced. FxF=5x5
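The effect of padding on the output size follows the standard formula floor((N - F + 2P) / S) + 1 for input size N, filter size F, padding P and stride S (the formula is standard CNN practice, not stated on the slides; the input size 32 is illustrative):

```python
def conv_output_size(n, f, p=0, s=1):
    # Standard formula: floor((N - F + 2P) / S) + 1
    return (n - f + 2 * p) // s + 1

# Without padding, a 3x3 filter shrinks the map; P=1 preserves its size.
print(conv_output_size(32, 3, p=0))  # 30
print(conv_output_size(32, 3, p=1))  # 32
# A 5x5 filter needs P=2 to preserve the size.
print(conv_output_size(32, 5, p=2))  # 32
```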
  11. From Neurons to Convolutional Neural Networks. Stride (S): when performing the convolution, or another operation such as pooling, we may slide the filter not pixel by pixel but every 2 or more pixels. The number of pixels we skip is the value of the stride. It can be used to reduce the dimensionality of the output.
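The same output-size formula shows how stride reduces dimensionality: a stride of 2 roughly halves each spatial dimension (input size 32 again illustrative):

```python
def conv_output_size(n, f, p=0, s=1):
    # floor((N - F + 2P) / S) + 1
    return (n - f + 2 * p) // s + 1

# 3x3 convolution, padding 1: stride 2 halves the 32x32 map to 16x16.
print(conv_output_size(32, 3, p=1, s=2))  # 16
# 2x2 pooling with stride 2 does the same.
print(conv_output_size(32, 2, p=0, s=2))  # 16
```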
  12. From Neurons to Convolutional Neural Networks. Example: most ConvNets contain several convolutional layers, interspersed with pooling layers, and followed by a small number of fully connected layers. A layer is characterized by its width, height and depth (that is, the number of filters used to generate the feature maps). An architecture is characterized by its number of layers. LeNet-5, from LeCun '98.
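The LeNet-5 stack on the slide can be traced as a shape walkthrough; the layer sizes below follow the figures in LeCun '98 (32x32 input, 5x5 valid convolutions, 2x2 subsampling), tracking only the spatial dimension:

```python
def conv(n, f):
    return n - f + 1   # valid convolution, stride 1

def pool(n, s=2):
    return n // s      # non-overlapping 2x2 subsampling

n = 32                                  # input: 32x32 image
n = conv(n, 5); print("C1:", n)         # C1: 28  (6 feature maps)
n = pool(n);    print("S2:", n)         # S2: 14
n = conv(n, 5); print("C3:", n)         # C3: 10  (16 feature maps)
n = pool(n);    print("S4:", n)         # S4: 5
# Followed by fully connected layers: 120 -> 84 -> 10 outputs.
```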
  13. From Neurons to Convolutional Neural Networks. Example 1: CNN for SL. "Character-based Neural Machine Translation", Marta R. Costa-Jussà and José A. R. Fonollosa.
  14. From Neurons to Convolutional Neural Networks. Example 2: CNN for SL. "Convolutional Neural Networks for Speech Recognition", Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn, and Dong Yu. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 10, October 2014.
