Deep learning in practice: Speech recognition and beyond
Abdel HEBA
27 September 2017
Outline
● Part 1: Basics of Machine Learning (Deep and Shallow) and of Signal Processing
● Part 2: Speech Recognition
  ● Acoustic representation
  ● Probabilistic speech recognition
● Part 3: Neural Network Speech Recognition
  ● Hybrid neural networks
  ● End-to-End architecture
● Part 4: Kaldi
Reading Material
Books:
Bengio, Yoshua (2009). "Learning Deep Architectures for AI".
L. Deng and D. Yu (2014). "Deep Learning: Methods and Applications". http://research.microsoft.com/pubs/209355/DeepLearning-NowPublishing-Vol7-SIG-039.pdf
D. Yu and L. Deng (2014). "Automatic Speech Recognition: A Deep Learning Approach" (Publisher: Springer).
Part I: Machine Learning (Deep/Shallow) and Signal Processing
Current view of Artificial Intelligence, Machine Learning & Deep Learning
Edureka blog – what-is-deep-learning
Current view of Machine Learning foundations & disciplines
Edureka blog – what-is-deep-learning
Machine Learning Paradigms: An Overview
[Figure labels: Machine learning; Data Analysis/Statistics; Programs]
Supervised Machine Learning (classification)
Training data set: measurements (features) & associated ‘class’ labels (colors used to show class labels)
Training algorithm
Learned model: parameters/weights (and sometimes structure)
Training phase (usually offline)
Supervised Machine Learning (classification)
Input: test data point, measurements (features) only
Learned model: structure + parameters
Output: predicted class label or label sequence (e.g. sentence)
Test phase (run time, online)
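The two phases can be illustrated with a very small classifier. The sketch below is not from the slides: it assumes a nearest-centroid model, and the names `train`, `predict` and the toy data are hypothetical. Training turns labelled measurements into parameters (one centroid per class); the test phase sees measurements only and returns a predicted label.

```python
import numpy as np

def train(features, labels):
    """Training phase: learn one centroid (mean feature vector) per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, x):
    """Test phase: return the label of the closest centroid for one data point."""
    classes, centroids = model
    distances = np.linalg.norm(centroids - x, axis=1)
    return classes[np.argmin(distances)]

# Toy training set (assumed data): 2-D measurements with their class labels.
X = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]])
y = np.array(["red", "red", "blue", "blue"])
model = train(X, y)                       # offline
label = predict(model, np.array([0.1, 0.0]))  # run time, features only
```

Here the learned model is just the pair (classes, centroids); a neural network would replace it with weights, but the train/predict split stays the same.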
What Is Deep Learning?
[Figure labels: Deep learning; Machine learning]
Deep learning (deep machine learning, or deep structured learning, or hierarchical learning, or sometimes DL) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures, with complex structures or otherwise, composed of multiple non-linear transformations.
Evolution of Machine Learning
(Slide from: Yoshua Bengio)
Face Recognition
Y LeCun, MA Ranzato
[Figure, modified from Y LeCun and MA Ranzato: map of learning models labelled SHALLOW and DEEP, grouped into Neural Networks and Probabilistic Models and marked Supervised or Unsupervised. Models shown: Perceptron, AE, D-AE, RBM, DBN, DBM, GMM, BayesNP, SVM, Sparse Coding, Decision Tree, Boosting, Conv. Net, Deep Neural Net, RNN, Bayes Nets.]
Part II: Speech Recognition
Human Communication: verbal & non-verbal information
Speech recognition problem
● Automatic speech recognition
● Spontaneous vs read speech
● Large vocabulary
● In noise
● Low resource
● Far-Field
● Accent-independent
● Speaker-adaptive
● Speaker identification
● Speech enhancement
● Speech separation
Speech representation
● Same word: « Appeler » (French for "to call")
We want a low-dimensional representation that is invariant to speaker, background noise, speaking rate, etc.
● Fourier analysis shows energy in different frequency bands
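A minimal sketch of this kind of Fourier analysis, assuming NumPy, a 16 kHz signal, 25 ms frames with a 10 ms hop and four equal-width bands (all of these are illustrative choices, and `band_energies` is a hypothetical helper name):

```python
import numpy as np

def band_energies(signal, frame_len=400, hop=160, n_bands=4):
    """Short-time Fourier analysis: energy per frequency band for each frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.zeros((n_frames, n_bands))
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2      # power spectrum of the frame
        bands = np.array_split(power, n_bands)       # equal-width frequency bands
        energies[t] = [b.sum() for b in bands]
    return energies

# One second of a 3 kHz tone sampled at 16 kHz (assumed test signal).
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 3000 * t)
E = band_energies(tone)
```

Each row of `E` is one frame; for the 3 kHz tone almost all the energy falls in the second band (roughly 2 to 4 kHz).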
Acoustic representation
Vowel triangle as seen from the formants 1 & 2
● Features used in speech recognition
  ● Mel Frequency Cepstral Coefficients – MFCC
  ● Perceptual Linear Prediction – PLP
  ● RASTA-PLP
  ● Filter Bank Coefficients – F-BANKs
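As an illustration of the filter bank features listed above, here is a sketch of a mel filter bank applied to one power spectrum. It assumes the common mel formula 2595·log10(1 + f/700) and triangular filters; the function names and parameter values are illustrative, not taken from the slides. MFCCs are typically obtained by applying a discrete cosine transform to these log energies.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale.
    Returns an array of shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2, n_fft // 2 + 1)
    fbank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def log_mel(power_spectrum, fbank):
    """F-BANK features: log of the mel-filtered power spectrum."""
    return np.log(fbank @ power_spectrum + 1e-10)

fb = mel_filterbank(n_filters=23, n_fft=512, sample_rate=16000)
features = log_mel(np.ones(257), fb)   # flat spectrum, just to show the shapes
```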
Speech Recognition as transduction: from signal to language
Probabilistic speech recognition
● The speech signal is represented as an acoustic observation sequence
● We want to find the most likely word sequence W
● We model this with a Hidden Markov Model
  ● The system has a set of discrete states
  ● Transitions from state to state follow transition probabilities (Markovian: memoryless)
  ● The acoustic observation emitted when making a transition is conditioned on the state alone: P(o|c)
● We seek to recover the state sequence and, consequently, the word sequence
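The search for the most likely word sequence is usually written with Bayes' rule. The notation below is the standard textbook form rather than the slide's own: O stands for the whole observation sequence, and the split into an acoustic model and a language model is the usual interpretation of the two factors.

```latex
\hat{W} = \arg\max_{W} P(W \mid O)
        = \arg\max_{W} \frac{p(O \mid W)\, P(W)}{p(O)}
        = \arg\max_{W} \underbrace{p(O \mid W)}_{\text{acoustic model}} \;
                       \underbrace{P(W)}_{\text{language model}}
```

The denominator p(O) does not depend on W, which is why it can be dropped from the maximization.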
Speech Recognition as transduction - Phone Recognition
● Training algorithm (N iterations)
  ● Align data & text
  ● Compute probabilities P(o|p) of each segment o
  ● Update boundaries
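The loop above can be sketched in a few lines. This is a simplified stand-in, not the talk's actual recipe: it assumes scalar features, one unit-variance Gaussian per phone (so the log of P(o|p) reduces to a negative squared distance to the phone mean), a uniform initial segmentation, and a Viterbi-style re-alignment. The function names are hypothetical.

```python
import numpy as np

def align(frames, phones, means):
    """Best monotonic assignment of frames to the phone sequence (dynamic programming).
    The cost of a frame under phone p is its squared distance to that phone's mean."""
    T, N = len(frames), len(phones)
    cost = np.array([[(frames[t] - means[p]) ** 2 for p in phones] for t in range(T)])
    best = np.full((T, N), np.inf)
    back = np.zeros((T, N), dtype=int)
    best[0, 0] = cost[0, 0]
    for t in range(1, T):
        for n in range(N):
            stay = best[t - 1, n]
            move = best[t - 1, n - 1] if n > 0 else np.inf
            back[t, n] = n if stay <= move else n - 1
            best[t, n] = min(stay, move) + cost[t, n]
    path = [N - 1]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]          # position in `phones` for every frame

def train(frames, phones, n_iter=5):
    T, N = len(frames), len(phones)
    seg = [min(t * N // T, N - 1) for t in range(T)]   # uniform initial boundaries
    for _ in range(n_iter):
        means = {}
        for p in set(phones):                          # re-estimate each phone model
            idx = [t for t in range(T) if phones[seg[t]] == p]
            means[p] = float(np.mean(frames[idx]))
        seg = align(frames, phones, means)             # update boundaries
    return seg, means

frames = np.array([0.1, 0.0, 0.2, 0.9, 1.1, 1.0, 1.0, 0.1, 0.0])  # toy data
seg, means = train(frames, ["a", "b", "a"])
```

With the toy data the boundary between the "b" segment and the final "a" segment moves by one frame after the first iteration, which is the "update boundaries" step in action.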
Speech Recognition as transduction - Lexicon
● Construct graph using Weighted Finite State Transducers (WFST)
Speech Recognition as transduction
● Compose Lexicon FST with Grammar FST: L ∘ G
● Transduction via composition
  ● Map output labels of the lexicon to input labels of the language model
  ● Join and optimize the end-to-end graph
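To make the composition step concrete, here is a toy sketch. It assumes epsilon-free transducers stored as plain dictionaries and weights that add along a path (as with negative log probabilities). The data structure, the `compose` function and the symbols are all hypothetical; real toolkits such as OpenFst also handle epsilon labels and graph optimization, which this sketch leaves out.

```python
from collections import deque

def compose(a, b):
    """Compose two epsilon-free weighted transducers.
    Each transducer is a dict with 'start', 'finals' (a set) and 'arcs',
    a list of (src, in_label, out_label, weight, dst)."""
    start = (a["start"], b["start"])
    arcs, finals, seen = [], set(), {start}
    queue = deque([start])
    while queue:
        qa, qb = queue.popleft()
        if qa in a["finals"] and qb in b["finals"]:
            finals.add((qa, qb))
        for (sa, ia, oa, wa, da) in a["arcs"]:
            if sa != qa:
                continue
            for (sb, ib, ob, wb, db) in b["arcs"]:
                if sb == qb and ib == oa:   # output label of A must match input label of B
                    dst = (da, db)
                    arcs.append(((qa, qb), ia, ob, wa + wb, dst))
                    if dst not in seen:
                        seen.add(dst)
                        queue.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}

# Toy "lexicon": pronunciation variant -> word (made-up symbols and weights).
L = {"start": 0, "finals": {0},
     "arcs": [(0, "tomayto", "tomato", 0.5, 0),
              (0, "tomahto", "tomato", 1.2, 0),
              (0, "soop", "soup", 0.0, 0)]}
# Toy "grammar": accepts the word sequence "tomato soup" with a cost per word.
G = {"start": 0, "finals": {2},
     "arcs": [(0, "tomato", "tomato", 1.0, 1),
              (1, "soup", "soup", 2.0, 2)]}
LG = compose(L, G)
```

The composed graph reads pronunciations on its input side and emits words on its output side, with the lexicon and grammar costs summed on each arc.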
Different steps of acoustic modeling
Decoding
● We want to find the most likely word sequence W given the observation o in the graph
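Finding the best path through a graph of states is commonly done with the Viterbi algorithm. The sketch below works on a plain HMM with log probabilities rather than on a full decoding graph, and the two-state model used to exercise it is a textbook toy example, not data from the talk.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Most likely HMM state sequence.
    log_init: (S,) initial log probabilities
    log_trans: (S, S) log transition probabilities, indexed [from, to]
    log_emit: (T, S) log P(o_t | state) for each frame t"""
    T, S = log_emit.shape
    score = log_init + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans      # cand[i, j]: best path ending in i, then i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):              # follow the back-pointers
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(score.max())

# Toy two-state model (assumed numbers).
init = np.log([0.6, 0.4])
trans = np.log([[0.7, 0.3], [0.4, 0.6]])
emit = np.log([[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]])   # three observed frames
path, logp = viterbi(init, trans, emit)
```

In a speech decoder the states would come from the composed graph and the emission scores from the acoustic model; the word sequence is read off the output labels along the best path.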
Part III: Neural Networks for Speech Recognition
Three main paradigms for neural networks for speech
● Use neural networks to compute nonlinear feature representations
  ● « Bottleneck » or « tandem » features
● Use neural networks to estimate phonetic unit probabilities (hybrid networks)
● Use end-to-end neural networks
Neural network features
● Train a neural network to discriminate classes.
● Use the output or a low-dimensional bottleneck layer representation as features.
Hybrid Speech Recognition System
● Train the network as a classifier with a softmax across the phonetic units.
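A short sketch of the softmax output layer, assuming NumPy. The second function shows a common practice in hybrid systems that is not spelled out on the slide: dividing the posteriors by the class priors to obtain scaled likelihoods that can stand in for the HMM emission scores. The function names and the numbers are illustrative.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: turns network outputs into posteriors P(unit | frame)."""
    z = logits - logits.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def scaled_log_likelihoods(logits, priors):
    """log P(unit | o) - log P(unit), which is proportional to log p(o | unit)."""
    return np.log(softmax(logits)) - np.log(priors)

# Two frames, three phonetic units (assumed values).
logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
post = softmax(logits)
sll = scaled_log_likelihoods(logits, priors=np.full(3, 1 / 3))
```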
Neural network architectures for speech recognition
● Fully connected
● Convolutional Networks (CNNs)
● Recurrent neural networks (RNNs)
● LSTMs
● GRUs
● Convolutional Neural Network
● Recurrent Neural Network
End-To-End Neural Networks for Speech Recognition: CTC Loss Function
End-To-End Speech Recognition: CTC Input
● Grapheme-based model: c ∈ {A, B, C, …, Z, blank, space}
● P(c = HHH_E_LL_LO___ | x) = P(c₁ = H | x) P(c₂ = H | x) … P(c₆ = blank | x) …
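The frame-level string above maps to a word by merging repeated symbols and then removing blanks, and its probability is the product of the per-frame outputs. The sketch below assumes "_" as the blank symbol, as in the example; the function names and the small probability matrix are illustrative.

```python
import numpy as np
from itertools import groupby

BLANK = "_"

def collapse(path):
    """CTC mapping: merge repeated symbols, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(s for s in merged if s != BLANK)

def path_probability(path, probs, alphabet):
    """Product over frames of P(c_t | x); probs has shape (T, len(alphabet))."""
    index = {s: i for i, s in enumerate(alphabet)}
    return float(np.prod([probs[t, index[s]] for t, s in enumerate(path)]))

word = collapse("HHH_E_LL_LO___")

# Two frames over a three-symbol alphabet (assumed numbers).
probs = np.array([[0.5, 0.25, 0.25], [0.5, 0.25, 0.25]])
p = path_probability("A_", probs, alphabet=["_", "A", "B"])
```

The blank between the two L groups is what lets the double letter survive the merge; without it, "LL" would collapse to a single "L".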
Connectionist Temporal Classification (CTC)
● CTC Loss Function :
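The formula from the slide did not survive extraction. For reference, the usual definition of the CTC loss is given below; it is assumed, not confirmed, to be what the slide showed. Here y is the target label sequence, c a frame-level path of length T, and 𝓑 the mapping that merges repeated symbols and removes blanks.

```latex
p(y \mid x) = \sum_{c \,\in\, \mathcal{B}^{-1}(y)} \; \prod_{t=1}^{T} P(c_t \mid x),
\qquad
\mathcal{L}_{\text{CTC}}(x, y) = -\log p(y \mid x)
```

The sum runs over every frame-level path that collapses to y, so the network is never told which alignment is the right one.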
● Updating the network with the CTC loss function:
● Backpropagation:
Take-home message
● Speech Recognition systems
● HMM-GMM traditional system
● Hybrid ASR system
● Use Neural Networks for feature representation
● Or use Neural Networks for phoneme recognition
● End-To-End Neural Networks system
● Grapheme-based model
● Needs a lot of data to perform well
● Complex modeling
Part IV: Kaldi
The Kaldi Toolkit
● Kaldi is designed specifically for speech recognition research applications
● Kaldi training tools
  ● Data preparation (link text to wav, speaker to utterance, etc.)
  ● Feature extraction: MFCC, PLP, F-BANKs, Pitch, LDA, HLDA, fMLLR, MLLT, VTLN, etc.
  ● Scripts for building finite state transducers: converting the lexicon & language model to FST format
  ● Traditional HMM-GMM system
  ● Hybrid system
  ● Online decoding
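The data preparation step mostly consists of writing a few small text files that link utterances to audio, transcripts and speakers. The sketch below builds the contents of the usual `wav.scp`, `text`, `utt2spk` and `spk2utt` files of a Kaldi data directory. The helper name, the utterance ids and the paths are made up for illustration, and the files are returned as strings rather than written to disk.

```python
from collections import defaultdict

def kaldi_data_files(utterances):
    """Build the contents of a Kaldi data directory.
    utterances: list of (utt_id, speaker_id, wav_path, transcript)."""
    utterances = sorted(utterances)                 # Kaldi expects sorted utterance ids
    spk2utt = defaultdict(list)
    for utt, spk, _, _ in utterances:
        spk2utt[spk].append(utt)
    return {
        "wav.scp": "".join(f"{utt} {wav}\n" for utt, _, wav, _ in utterances),
        "text": "".join(f"{utt} {txt}\n" for utt, _, _, txt in utterances),
        "utt2spk": "".join(f"{utt} {spk}\n" for utt, spk, _, _ in utterances),
        "spk2utt": "".join(f"{spk} {' '.join(utts)}\n"
                           for spk, utts in sorted(spk2utt.items())),
    }

# Hypothetical corpus with one speaker and two utterances.
files = kaldi_data_files([
    ("spk1-utt2", "spk1", "/data/b.wav", "BONJOUR"),
    ("spk1-utt1", "spk1", "/data/a.wav", "HELLO WORLD"),
])
```

Prefixing each utterance id with its speaker id, as done here, keeps the utterance and speaker sort orders consistent with each other.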
Kaldi Architecture
LinSTT uses Kaldi
Site             CLIPS   ENST   IRENE   LIA     LIMSI      LIUM      LORIA   Linagora
WER (%)          40.7    45.4   35.4    26.7    11.9       23.6      27.6    26.23
Audio corpus     90h     90h    90h     90h     90h+100h   90h+90h   90h     90h
#states          1,500   114    6,000   3,600   12,000     7,000     6,000   15,000
#gaussians       24k     14k    200k    230k    370k       154k      90k     500k
#pronunciations  38k     118k   118k    130k    276k       107k      112k    105k
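WER in the table is the word error rate: the number of word substitutions, deletions and insertions needed to turn the recognizer output into the reference, divided by the number of reference words. A minimal sketch of the computation follows, with made-up sentences; scoring tools usually also normalize the text first, which is left out here.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j]: edit distance between the first i reference words
    # and the first j hypothesis words
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)
    return dist[-1][-1] / len(ref)

wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

In this example one of six reference words is missing from the hypothesis, so the WER is 1/6, or about 16.7 %.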
Thanks for your attention
LINAGORA – headquarters
80, rue Roque de Fillol
92800 PUTEAUX
FRANCE
Phone : +33 (0)1 46 96 63 63
Info : info@linagora.com
Web : www.linagora.com
facebook.com/Linagora/
@linagora

More Related Content

What's hot

Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural NetworkAtul Krishna
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionJinwon Lee
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketakiKetaki Patwari
 
Deep Learning - Overview of my work II
Deep Learning - Overview of my work IIDeep Learning - Overview of my work II
Deep Learning - Overview of my work IIMohamed Loey
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNNShuai Zhang
 
07 regularization
07 regularization07 regularization
07 regularizationRonald Teo
 
Chapter 09 classification advanced
Chapter 09 classification advancedChapter 09 classification advanced
Chapter 09 classification advancedHouw Liong The
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Simplilearn
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNNNoura Hussein
 
Text detection and recognition from natural scenes
Text detection and recognition from natural scenesText detection and recognition from natural scenes
Text detection and recognition from natural sceneshemanthmcqueen
 
Denclue Algorithm - Cluster, Pe
Denclue Algorithm - Cluster, PeDenclue Algorithm - Cluster, Pe
Denclue Algorithm - Cluster, PeTauhidul Khandaker
 
A completed modeling of local binary pattern operator
A completed modeling of local binary pattern operatorA completed modeling of local binary pattern operator
A completed modeling of local binary pattern operatorWin Yu
 
Artificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computationArtificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computationMohammed Bennamoun
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learningleopauly
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function홍배 김
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 

What's hot (20)

Unit 1
Unit 1Unit 1
Unit 1
 
Artificial Neural Network
Artificial Neural NetworkArtificial Neural Network
Artificial Neural Network
 
PR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object DetectionPR-217: EfficientDet: Scalable and Efficient Object Detection
PR-217: EfficientDet: Scalable and Efficient Object Detection
 
Transfer Learning
Transfer LearningTransfer Learning
Transfer Learning
 
CNN and its applications by ketaki
CNN and its applications by ketakiCNN and its applications by ketaki
CNN and its applications by ketaki
 
Deep Learning - Overview of my work II
Deep Learning - Overview of my work IIDeep Learning - Overview of my work II
Deep Learning - Overview of my work II
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
 
07 regularization
07 regularization07 regularization
07 regularization
 
Chapter 09 classification advanced
Chapter 09 classification advancedChapter 09 classification advanced
Chapter 09 classification advanced
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
 
Text detection and recognition from natural scenes
Text detection and recognition from natural scenesText detection and recognition from natural scenes
Text detection and recognition from natural scenes
 
Support Vector Machines ( SVM )
Support Vector Machines ( SVM ) Support Vector Machines ( SVM )
Support Vector Machines ( SVM )
 
Denclue Algorithm - Cluster, Pe
Denclue Algorithm - Cluster, PeDenclue Algorithm - Cluster, Pe
Denclue Algorithm - Cluster, Pe
 
Naive Bayes Presentation
Naive Bayes PresentationNaive Bayes Presentation
Naive Bayes Presentation
 
A completed modeling of local binary pattern operator
A completed modeling of local binary pattern operatorA completed modeling of local binary pattern operator
A completed modeling of local binary pattern operator
 
Artificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computationArtificial Neural Networks Lect1: Introduction & neural computation
Artificial Neural Networks Lect1: Introduction & neural computation
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 

Viewers also liked

Blockchain Economic Theory
Blockchain Economic TheoryBlockchain Economic Theory
Blockchain Economic TheoryMelanie Swan
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageRoelof Pieters
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalBhaskar Mitra
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document RankingBhaskar Mitra
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Association for Computational Linguistics
 
Blockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain ExplainedBlockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain ExplainedMelanie Swan
 
Deep Learning for Chatbot (1/4)
Deep Learning for Chatbot (1/4)Deep Learning for Chatbot (1/4)
Deep Learning for Chatbot (1/4)Jaemin Cho
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS MeetupLINAGORA
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelYanbin Kong
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingYanbin Kong
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersRoelof Pieters
 
Philosophy of Deep Learning
Philosophy of Deep LearningPhilosophy of Deep Learning
Philosophy of Deep LearningMelanie Swan
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchBhaskar Mitra
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRoelof Pieters
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Roelof Pieters
 
How we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.noHow we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.noHenning Spjelkavik
 

Viewers also liked (20)

Blockchain Economic Theory
Blockchain Economic TheoryBlockchain Economic Theory
Blockchain Economic Theory
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
 
Using Text Embeddings for Information Retrieval
Using Text Embeddings for Information RetrievalUsing Text Embeddings for Information Retrieval
Using Text Embeddings for Information Retrieval
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Neural Models for Document Ranking
Neural Models for Document RankingNeural Models for Document Ranking
Neural Models for Document Ranking
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Blockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain ExplainedBlockchain Smartnetworks: Bitcoin and Blockchain Explained
Blockchain Smartnetworks: Bitcoin and Blockchain Explained
 
Deep Learning for Chatbot (1/4)
Deep Learning for Chatbot (1/4)Deep Learning for Chatbot (1/4)
Deep Learning for Chatbot (1/4)
 
Advanced Node.JS Meetup
Advanced Node.JS MeetupAdvanced Node.JS Meetup
Advanced Node.JS Meetup
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative Model
 
Cs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and UnderstandingCs231n 2017 lecture12 Visualizing and Understanding
Cs231n 2017 lecture12 Visualizing and Understanding
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ers
 
Philosophy of Deep Learning
Philosophy of Deep LearningPhilosophy of Deep Learning
Philosophy of Deep Learning
 
Vectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for SearchVectorland: Brief Notes from Using Text Embeddings for Search
Vectorland: Brief Notes from Using Text Embeddings for Search
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and Graphs
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
How we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.noHow we sleep well at night using Hystrix at Finn.no
How we sleep well at night using Hystrix at Finn.no
 

Similar to Deep Learning in practice : Speech recognition and beyond - Meetup

Deep learning Techniques JNTU R20 UNIT 2
Deep learning Techniques JNTU R20 UNIT 2Deep learning Techniques JNTU R20 UNIT 2
Deep learning Techniques JNTU R20 UNIT 2EXAMCELLH4
 
Implemetation of parallelism in HMM DNN based state of the art kaldi ASR Toolkit
Implemetation of parallelism in HMM DNN based state of the art kaldi ASR ToolkitImplemetation of parallelism in HMM DNN based state of the art kaldi ASR Toolkit
Implemetation of parallelism in HMM DNN based state of the art kaldi ASR ToolkitShubham Verma
 
Looking into the Black Box - A Theoretical Insight into Deep Learning Networks
Looking into the Black Box - A Theoretical Insight into Deep Learning NetworksLooking into the Black Box - A Theoretical Insight into Deep Learning Networks
Looking into the Black Box - A Theoretical Insight into Deep Learning NetworksDinesh V
 
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...Yun-Nung (Vivian) Chen
 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspectiveAnirban Santara
 
Trends of ICASSP 2022
Trends of ICASSP 2022Trends of ICASSP 2022
Trends of ICASSP 2022Kwanghee Choi
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Wanjin Yu
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.pptyang947066
 
Sepformer&DPTNet.pdf
Sepformer&DPTNet.pdfSepformer&DPTNet.pdf
Sepformer&DPTNet.pdfssuser849b73
 
deeplearning
deeplearningdeeplearning
deeplearninghuda2018
 
Short story presentation
Short story presentationShort story presentation
Short story presentationStutiAgarwal36
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningBigDataCloud
 
Speaker identification
Speaker identificationSpeaker identification
Speaker identificationTriloki Gupta
 
Natural Language Generation / Stanford cs224n 2019w lecture 15 Review
Natural Language Generation / Stanford cs224n 2019w lecture 15 ReviewNatural Language Generation / Stanford cs224n 2019w lecture 15 Review
Natural Language Generation / Stanford cs224n 2019w lecture 15 Reviewchangedaeoh
 
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...
End-to-end Speech Recognition with Recurrent Neural Networks (D3L6 Deep Learn...Universitat Politècnica de Catalunya
 
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...kevig
 
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...kevig
 

Similar to Deep Learning in practice : Speech recognition and beyond - Meetup (20)

Deep learning Techniques JNTU R20 UNIT 2
Deep learning Techniques JNTU R20 UNIT 2Deep learning Techniques JNTU R20 UNIT 2
Deep learning Techniques JNTU R20 UNIT 2
 
Implemetation of parallelism in HMM DNN based state of the art kaldi ASR Toolkit
Implemetation of parallelism in HMM DNN based state of the art kaldi ASR ToolkitImplemetation of parallelism in HMM DNN based state of the art kaldi ASR Toolkit
Implemetation of parallelism in HMM DNN based state of the art kaldi ASR Toolkit
 
Looking into the Black Box - A Theoretical Insight into Deep Learning Networks
Looking into the Black Box - A Theoretical Insight into Deep Learning NetworksLooking into the Black Box - A Theoretical Insight into Deep Learning Networks
Looking into the Black Box - A Theoretical Insight into Deep Learning Networks
 
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Lan...
 
Deep learning from a novice perspective
Deep learning from a novice perspectiveDeep learning from a novice perspective
Deep learning from a novice perspective
 
Trends of ICASSP 2022
Trends of ICASSP 2022Trends of ICASSP 2022
Trends of ICASSP 2022
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
Deep Learning in practice : Speech recognition and beyond - Meetup

  • 1. Deep learning in practice: Speech recognition and beyond. Abdel HEBA, 27 September 2017
  • 2. Outline
      ● Part 1: Basics of Machine Learning (Deep and Shallow) and of Signal Processing
      ● Part 2: Speech Recognition
      ● Acoustic representation
      ● Probabilistic speech recognition
      ● Part 3: Neural Network Speech Recognition
      ● Hybrid neural networks
      ● End-to-End architecture
      ● Part 4: Kaldi
  • 3. Reading Material
  • 4. Reading Material
      Books:
      ● Bengio, Yoshua (2009). "Learning Deep Architectures for AI".
      ● L. Deng and D. Yu (2014). "Deep Learning: Methods and Applications". http://research.microsoft.com/pubs/209355/DeepLearning-NowPublishing-Vol7-SIG-039.pdf
      ● D. Yu and L. Deng (2014). "Automatic Speech Recognition: A Deep Learning Approach" (Publisher: Springer).
  • 5. Reading Material
  • 6. Part I: Machine Learning (Deep/Shallow) and Signal Processing
  • 7. Current view of Artificial Intelligence, Machine Learning & Deep Learning (source: Edureka blog – what-is-deep-learning)
  • 8. Current view of Machine Learning foundations & disciplines (source: Edureka blog – what-is-deep-learning)
  • 9. Machine Learning Paradigms: An Overview. Diagram labels: Machine learning; Data Analysis/Statistics; Programs.
  • 10. Supervised Machine Learning (classification). Training phase (usually offline): a training data set of measurements (features) and associated 'class' labels (colors used to show class labels) is fed to a training algorithm, which produces the learned model: parameters/weights (and sometimes structure).
  • 11. Supervised Machine Learning (classification). Test phase (run time, online): an input test data point, given as measurements (features) only, is passed to the learned model (structure + parameters), which outputs a predicted class label or label sequence (e.g. a sentence).
  • 12. What Is Deep Learning? Deep learning (deep machine learning, or deep structured learning, or hierarchical learning, or sometimes DL) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures, with complex structures or otherwise, composed of multiple non-linear transformations.[1](p198)[2][3][4] Diagram labels: Deep learning; Machine learning.
  • 13. Evolution of Machine Learning (slide from Yoshua Bengio)
  • 14. Face Recognition
  • 15. [Figure: learning models placed on a SHALLOW to DEEP axis: Perceptron, AE, D-AE, RBM, DBN, DBM, GMM, BayesNP, SVM, Sparse Coding, Decision Tree, Boosting, Conv. Net, Neural Net, RNN, Bayes Nets. Modified from Y. LeCun and M.A. Ranzato.]
  • 16. [Figure: the same SHALLOW to DEEP chart, with the models grouped into Neural Networks and Probabilistic Models, and "Neural Net" relabelled "Deep Neural Net". Modified from Y. LeCun and M.A. Ranzato.]
  • 17. [Figure: the same chart, with regions marked Supervised, Supervised and Unsupervised, and question marks next to GMM and Bayes Nets. Modified from Y. LeCun and M.A. Ranzato.]
  • 18. Part II: Speech Recognition
  • 19. Human Communication: verbal & non-verbal information
  • 20. Speech recognition problem
  • 21. Speech recognition problem
      ● Automatic speech recognition
      ● Spontaneous vs read speech
      ● Large vocabulary
      ● In noise
      ● Low resource
      ● Far-field
      ● Accent-independent
      ● Speaker-adaptive
      ● Speaker identification
      ● Speech enhancement
      ● Speech separation
  • 22. Speech representation
      ● Same word: « Appeler » (French for "to call")
  • 23. Speech representation
      We want a low-dimensional representation that is invariant to speaker, background noise, rate of speaking, etc.
      ● Fourier analysis shows the energy in different frequency bands
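To make the Fourier-analysis point concrete, here is a minimal sketch in plain Python. It is an illustration only, not the feature pipeline used in the talk: the function names `dft_power` and `band_energies`, the 64-sample toy frame and the four equal-width bands are all assumptions, and a real front end would use an FFT with windowing.

```python
import cmath
import math

def dft_power(frame):
    """Power spectrum |X[k]|^2 for bins k = 0..N/2 of one frame (naive O(N^2) DFT)."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(x_k) ** 2)
    return power

def band_energies(power, n_bands):
    """Sum the power spectrum into n_bands equal-width frequency bands."""
    width = len(power) / n_bands
    return [sum(power[int(b * width):int((b + 1) * width)]) for b in range(n_bands)]

# Toy frame (assumption): 64 samples of a sinusoid with exactly 8 cycles per frame.
N = 64
frame = [math.sin(2 * math.pi * 8 * t / N) for t in range(N)]
power = dft_power(frame)
energies = band_energies(power, n_bands=4)
peak_bin = max(range(len(power)), key=power.__getitem__)
```

For this toy frame almost all of the energy lands in the second of the four bands, which is the kind of compact summary the slide asks for.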
  • 24. Acoustic representation: vowel triangle as seen from formants 1 and 2
  • 25. Acoustic representation
      ● Features used in speech recognition
      ● Mel Frequency Cepstral Coefficients (MFCC)
      ● Perceptual Linear Prediction (PLP)
      ● RASTA-PLP
      ● Filter Bank Coefficients (F-BANKs)
  • 26. Speech Recognition as transduction: from signal to language
  • 27. Speech Recognition as transduction: from signal to language
  • 28. Speech Recognition as transduction: from signal to language
  • 29. Probabilistic speech recognition
      ● The speech signal is represented as an acoustic observation sequence
      ● We want to find the most likely word sequence W
      ● We model this with a Hidden Markov Model
      ● The system has a set of discrete states
      ● Transitions from state to state follow transition probabilities (Markovian: memoryless)
      ● The acoustic observation emitted when making a transition is conditioned on the state alone: P(o|c)
      ● We seek to recover the state sequence and, consequently, the word sequence
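Recovering the state sequence of a Hidden Markov Model is commonly done with the Viterbi algorithm; the slide does not name it, so treat the following as one possible sketch. The two states `a` and `b`, the discrete observation symbols `lo` and `hi`, and every probability below are made-up illustration values.

```python
import math

def viterbi(observations, states, log_init, log_trans, log_emit):
    """Return the most likely state sequence and its log-probability."""
    # best[s] = best log-probability of any state path that ends in s so far
    best = {s: log_init[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_best, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[r] + log_trans[r][s])
            new_best[s] = best[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=best.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {key: math.log(value) for key, value in table.items()}

# Hypothetical toy model: two states, two discrete observation symbols.
states = ["a", "b"]
log_init = logs({"a": 0.9, "b": 0.1})
log_trans = {"a": logs({"a": 0.6, "b": 0.4}), "b": logs({"a": 0.1, "b": 0.9})}
log_emit = {"a": logs({"lo": 0.8, "hi": 0.2}), "b": logs({"lo": 0.3, "hi": 0.7})}

path, log_prob = viterbi(["lo", "lo", "hi", "hi"], states, log_init, log_trans, log_emit)
```

Working in log-probabilities avoids numerical underflow on long utterances; in a real recogniser the emission scores would come from an acoustic model rather than a lookup table.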
  • 30. Speech Recognition as transduction: Phone Recognition
      ● Training algorithm (N iterations)
      ● Align data & text
      ● Compute the probabilities P(o|p) of each segment o
      ● Update boundaries
  • 31. Speech Recognition as transduction: Lexicon
      ● Construct the graph using Weighted Finite State Transducers (WFST)
  • 32. Speech Recognition as transduction
      ● Compose the Lexicon FST with the Grammar FST: L ∘ G
      ● Transduction via composition
      ● Map the output labels of the lexicon to the input labels of the language model
      ● Join and optimize the end-to-end graph
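The composition L ∘ G can be sketched on a toy example. Everything here is an assumption made for illustration: the dictionary-based transducer format, the two-word lexicon (`hi` = h ay, `hey` = h ey), the grammar costs, and the helper names `compose` and `transduce`. Weights are costs that add along a path. The sketch handles epsilon outputs in the first transducer only and does no determinization or minimization, which production toolkits such as OpenFst provide.

```python
EPS = "<eps>"

def compose(t1, t2):
    """Compose t1 with t2 by matching t1 output labels to t2 input labels."""
    start = (t1["start"], t2["start"])
    arcs, finals = {}, set()
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        q1, q2 = state
        arcs[state] = []
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals.add(state)
        for i1, o1, w1, n1 in t1["arcs"].get(q1, []):
            if o1 == EPS:
                # t1 emits nothing here: advance t1 only, t2 stays where it is.
                candidates = [(i1, EPS, w1, (n1, q2))]
            else:
                candidates = [(i1, o2, w1 + w2, (n1, n2))
                              for i2, o2, w2, n2 in t2["arcs"].get(q2, [])
                              if i2 == o1]
            for arc in candidates:
                arcs[state].append(arc)
                if arc[3] not in seen:
                    seen.add(arc[3])
                    stack.append(arc[3])
    return {"start": start, "finals": finals, "arcs": arcs}

def transduce(t, inputs):
    """List (outputs, cost) for every accepting path (assumes no input epsilons)."""
    results = []
    def walk(state, pos, outputs, cost):
        if pos == len(inputs):
            if state in t["finals"]:
                results.append((outputs, cost))
            return
        for i, o, w, nxt in t["arcs"].get(state, []):
            if i == inputs[pos]:
                walk(nxt, pos + 1, outputs + ([o] if o != EPS else []), cost + w)
    walk(t["start"], 0, [], 0.0)
    return results

# Hypothetical lexicon L: phones in, words out. Arcs are (input, output, cost, next).
LEXICON = {"start": 0, "finals": {0}, "arcs": {
    0: [("h", "hi", 0.0, 1), ("h", "hey", 0.0, 2)],
    1: [("ay", EPS, 0.0, 0)],
    2: [("ey", EPS, 0.0, 0)],
}}
# Hypothetical grammar G: a unigram language model over the two words.
GRAMMAR = {"start": 0, "finals": {0}, "arcs": {
    0: [("hi", "hi", 0.5, 0), ("hey", "hey", 1.5, 0)],
}}

LG = compose(LEXICON, GRAMMAR)
words = transduce(LG, ["h", "ay", "h", "ey"])
```

The composed graph reads phones and writes words while accumulating the language-model cost, which is the mapping of lexicon output labels to language-model input labels described above.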
  • 33. Different steps of acoustic modeling
  • 35. Decoding
      ● We want to find the most likely word sequence W in the graph, given the observation o
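The decoding objective is usually written with Bayes' rule. The slide states only the goal, so the decomposition below is the standard textbook form rather than something taken from the deck; it uses the slide's symbols W for the word sequence and o for the observation sequence.

```latex
\hat{W} = \arg\max_{W} P(W \mid o)
        = \arg\max_{W} \frac{p(o \mid W)\, P(W)}{p(o)}
        = \arg\max_{W} \underbrace{p(o \mid W)}_{\text{acoustic model}} \; \underbrace{P(W)}_{\text{language model}}
```

The denominator p(o) does not depend on W, so it can be dropped from the maximization.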
  • 36. Part III: Neural Networks for Speech Recognition
  • 37. Three main paradigms for neural networks for speech
      ● Use neural networks to compute a nonlinear feature representation
      ● "Bottleneck" or "tandem" features
      ● Use neural networks to estimate phonetic unit probabilities (hybrid networks)
      ● Use end-to-end neural networks
  • 38. Neural network features
      ● Train a neural network to discriminate classes
      ● Use the output or a low-dimensional bottleneck layer representation as features
  • 39. Hybrid Speech Recognition System
      ● Train the network as a classifier with a softmax across the phonetic units
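A small sketch of the softmax output layer follows. The second function reflects how hybrid systems are commonly described, where the network posterior P(s|o) is divided by the state prior P(s) to obtain a scaled likelihood for the HMM; that step is not on the slide and is an assumption here, as are the function names and the three-unit toy values.

```python
import math

def softmax(logits):
    """Turn raw network outputs into a probability distribution over phonetic units."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_log_likelihoods(posteriors, priors):
    """log P(s|o) - log P(s): proportional to log p(o|s) by Bayes' rule."""
    return [math.log(p) - math.log(q) for p, q in zip(posteriors, priors)]

# Hypothetical values for three phonetic units on one frame.
logits = [2.0, 1.0, 0.1]
priors = [0.5, 0.3, 0.2]  # e.g. relative unit frequencies in the training alignment
posteriors = softmax(logits)
scores = scaled_log_likelihoods(posteriors, priors)
```

The scores play the role of P(o|c) in the HMM described earlier, so the rest of the decoding graph can stay unchanged.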
  • 40. Hybrid Speech Recognition System
  • 41. Neural network architectures for speech recognition
      ● Fully connected
      ● Convolutional Networks (CNNs)
      ● Recurrent neural networks (RNNs)
      ● LSTMs
      ● GRUs
  • 42. Neural network architectures for speech recognition
      ● Convolutional Neural Network
  • 43. Neural network architectures for speech recognition
      ● Recurrent Neural Network
  • 44. Neural network architectures for speech recognition
      ● Recurrent Neural Network
  • 45. Neural network architectures for speech recognition
      ● Recurrent Neural Network
  • 46. Neural network architectures for speech recognition
      ● Recurrent Neural Network
  • 47. End-To-End Neural Networks for Speech Recognition: CTC Loss Function
  • 48. End-To-End Speech Recognition: CTC Input
      ● Grapheme-based model: c ∈ {A, B, C, …, Z, blank, space}
      ● P(c = HHH_E_LL_LO___ | x) = P(c₁ = H | x) P(c₂ = H | x) … P(c₆ = blank | x) …
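The frame-level string above maps to a transcript by merging repeated symbols and then dropping blanks, and its probability is the product of the per-frame probabilities. A minimal sketch, assuming `_` as the blank symbol and hypothetical helper names `ctc_collapse` and `path_probability`:

```python
from itertools import groupby

BLANK = "_"

def ctc_collapse(path):
    """Merge consecutive repeats, then remove blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(s for s in merged if s != BLANK)

def path_probability(path, frame_probs):
    """Product over frames of P(c_t | x), with one distribution per frame."""
    prob = 1.0
    for symbol, dist in zip(path, frame_probs):
        prob *= dist[symbol]
    return prob

transcript = ctc_collapse("HHH_E_LL_LO___")

# Made-up per-frame distributions for a two-frame input.
frame_probs = [{"A": 0.6, BLANK: 0.4}, {"A": 0.5, BLANK: 0.5}]
p = path_probability("A_", frame_probs)
```

The blank between `LL` and `LO` is what lets the model output the double letter in the transcript; without it the repeated L would be merged into one.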
  • 49. Connectionist Temporal Classification (CTC)
      ● CTC loss function:
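The equation itself was an image and is not in the transcript. As a reminder of the standard definition (not recovered from the slide), the CTC loss sums the path probabilities over every frame-level sequence c that collapses to the target transcript y. Here B denotes the collapse mapping (merge repeats, remove blanks) and T the number of frames; both symbols are notation introduced for this note.

```latex
P(y \mid x) = \sum_{c \,\in\, \mathcal{B}^{-1}(y)} \; \prod_{t=1}^{T} P(c_t \mid x),
\qquad
\mathcal{L}_{\mathrm{CTC}}(x, y) = -\log P(y \mid x)
```

In practice the sum is computed with a forward-backward dynamic program rather than by enumerating paths.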
  • 50. Connectionist Temporal Classification (CTC)
      ● Updating the network with the CTC loss function:
      ● Backpropagation:
  • 51. Take-home message
      ● Speech recognition systems
      ● Traditional HMM-GMM system
      ● Hybrid ASR system
      ● Use neural networks for feature representation
      ● Or use neural networks for phoneme recognition
      ● End-to-end neural network system
      ● Grapheme-based model
      ● Needs a lot of data to perform well
      ● Complex modeling
  • 52. Part IV: Kaldi
  • 53. The Kaldi Toolkit
      ● Kaldi is designed specifically for speech recognition research applications
      ● Kaldi training tools
      ● Data preparation (link text to wav, speaker to utterance, ...)
      ● Feature extraction: MFCC, PLP, F-BANKs, Pitch, LDA, HLDA, fMLLR, MLLT, VTLN, etc.
      ● Scripts for building finite state transducers: converting the lexicon & language model to FST format
      ● Traditional HMM-GMM system
      ● Hybrid system
      ● Online decoding
  • 54. Kaldi Architecture
  • 55. LinSTT uses Kaldi
      Site             CLIPS  ENST  IRENE  LIA    LIMSI      LIUM      LORIA  Linagora
      WER              40.7   45.4  35.4   26.7   11.9       23.6      27.6   26.23
      Audio corpus     90h    90h   90h    90h    90h +100h  90h +90h  90h    90h
      #states          1,500  114   6,000  3,600  12,000     7,000     6,000  15,000
      #gaussians       24k    14k   200k   230k   370k       154k      90k    500k
      #pronunciations  38k    118k  118k   130k   276k       107k      112k   105k
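The WER (word error rate) row can be read with the usual definition: substitutions, deletions and insertions needed to turn the hypothesis into the reference, divided by the number of reference words. The sketch below computes it with a word-level edit distance; the function name and the example sentences are made up, and the scoring tool used for the figures in the table is not stated in the slides.

```python
def word_error_rate(reference, hypothesis):
    """WER in percent: word-level edit distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # previous[j] = edit distance between the reference prefix seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return 100.0 * previous[-1] / len(ref)  # assumes a non-empty reference

# One deletion ("le") and one substitution ("client" -> "clients") over 4 reference words.
wer = word_error_rate("appeler le service client", "appeler service clients")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.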
  • 56. Thanks for your attention
      LINAGORA headquarters: 80, rue Roque de Fillol, 92800 PUTEAUX, FRANCE
      Phone: +33 (0)1 46 96 63 63
      Info: info@linagora.com
      Web: www.linagora.com
      facebook.com/Linagora/
      @linagora