We assume that the speech given to our system contains
no noise other than what comes from the user.
In certain places we use a stored database that is
generated after the training sets have been processed.
11/14/2012 2YoGiV
To implement the above system we use three
subsystems:
1. ASR (Automatic Speech Recognition)
2. Dialogue Management
3. Spoken Language Generation
ASR is the first subsystem of the SDS (spoken dialogue
system). It takes voice as input, converts it into
grammatically correct speech, and stores the result in
the system. Its main job is to turn the raw voice
(noise included) into well-defined speech that the next
subsystem can use. This is our main area of focus.
Dialogue management mainly handles the output
produced by ASR according to the identity of the
individual speaker, and stores it in the system for use
by the next subsystem.
Spoken language generation uses the stored speech and
generates spoken language (English, in our case).
From here on we deal with
ASR (Automatic Speech Recognition).
ASR takes voice as input and converts it into
understandable speech.
Questions that arise:
● How can the system distinguish between different
speakers?
● How can the system distinguish between ambient
noise and someone speaking?
● How can the system derive meaning from what was
said?
To answer these questions we start with the most
important part: “Speech”.
Some factors to keep in mind when taking speech as
input:
a) Biological Factors
b) Phonology
c) Frequency of Sounds
d) Timing
Biological factors:
1. The way the mouth moves to produce a sound
affects the features of the sound itself.
2. The structure of the mouth produces multiple
waves in certain patterns.
3. When we shape the mouth to make certain letters,
say 't', we push out more air at once, which makes a
higher-frequency sound. So the first thing to take
care of is the frequency of speech, and along with
frequency we take amplitude and pitch into
consideration.
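The point about frequency and amplitude can be made concrete with a small sketch. It is only an illustration, not part of the original system: it builds a synthetic tone (a stand-in for recorded speech) and recovers its dominant frequency with a naive discrete Fourier transform, plus its peak amplitude. The function names, the 8000 Hz sample rate and the 440 Hz tone are assumptions chosen for the example.

```python
import cmath
import math


def make_tone(freq_hz, amplitude, sample_rate=8000, n_samples=400):
    """Synthetic sine wave standing in for a short stretch of recorded speech."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
        for n in range(n_samples)
    ]


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin, ignoring bin 0 (DC)."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2):
        # Naive DFT coefficient for bin k; fine for a few hundred samples.
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate / n


def peak_amplitude(samples):
    """Largest absolute sample value in the signal."""
    return max(abs(s) for s in samples)


# Hypothetical input: a 440 Hz tone with amplitude 0.5.
tone = make_tone(440.0, 0.5)
frequency = dominant_frequency(tone, 8000)
loudness = peak_amplitude(tone)
```

A real front end would use a fast Fourier transform over short overlapping windows rather than one transform over the whole signal; the naive version is kept here because it needs no external library.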
● Phonology describes how we use sound to convey
meaning in a language.
● In English it covers the characteristics of sounds
such as vowels and consonants.
● A phoneme is the smallest segmental unit of sound
in a language. Each phoneme has features in its
sound that distinguish it from other phonemes.
Phonemes combine to represent words and sentences.
English has about 40-50 phonemes, so we use
phonemes to help remove noise from the sound.
● Different vowels have different pitches; they are
similar to musical notes.
● For example, 'i' is the highest and 'u' is the lowest.
● Consonant phonemes have more waves, oscillating
from different parts of the mouth.
● So, based on these different frequencies, we can
store words with their different phonemes in the system.
There is a lot of information in timing. The breaks
between words and the breaks between one sentence
and another must all be considered in order to
distinguish between different words. According to
research, vowels last longer than consonants.
Looking at the above factors, we have to:
● Translate from frequencies to a representation of a
phoneme.
● Discard useless information such as noise.
● Make sure the sentence created makes some sense.
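The role of timing, that is, using breaks to tell words apart, can be sketched with a simple energy threshold. This is a minimal illustration under stated assumptions, not the method used in the slides: the frame size, the threshold and the toy signal are all made up, and real recordings would need a noise-adaptive threshold.

```python
def frame_energies(samples, frame_size):
    """Mean squared amplitude of each complete, non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_size]) / frame_size
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]


def find_segments(samples, frame_size=4, threshold=0.01):
    """Return (start, end) sample indices of stretches louder than the threshold."""
    energies = frame_energies(samples, frame_size)
    segments = []
    start = None
    for index, energy in enumerate(energies):
        if energy >= threshold and start is None:
            start = index * frame_size            # a sound begins
        elif energy < threshold and start is not None:
            segments.append((start, index * frame_size))  # a break begins
            start = None
    if start is not None:
        # The signal ended while a sound was still in progress.
        segments.append((start, len(energies) * frame_size))
    return segments


# Hypothetical signal: silence, a loud "word", a break, a quieter "word", silence.
signal = [0.0] * 8 + [0.5, -0.5] * 6 + [0.0] * 8 + [0.3, -0.3] * 4 + [0.0] * 4
segments = find_segments(signal)
```

Each returned pair marks one candidate word; the gaps between pairs are the breaks the slide refers to.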
To solve the above problems we use two models and one
database:
● Acoustic Model
● Dictionary
● Language Model
The acoustic model is based on all the features of a
sound wave:
● Frequency
● Pitch
● Amplitude
● Time information
● The acoustic model is the statistical mapping from
the units of speech to all the features of speech.
● It converts speech sounds to phonemes and then to
words, statistically.
● It carries information about the phonology of the
language.
● It can learn from a training set.
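The idea of a mapping between sound features and phonemes that is learned from a training set can be sketched with a nearest-mean classifier. This is a deliberately simple stand-in for a real acoustic model, which would use far richer statistics. The feature pairs (relative frequency, relative duration) and their values are invented for illustration; they only mirror the slide's claims that 'i' is higher than 'u' and that vowels last longer than consonants.

```python
import math


def train(examples):
    """Average the feature vectors seen for each phoneme label."""
    totals = {}
    for label, features in examples:
        running, count = totals.get(label, ([0.0] * len(features), 0))
        totals[label] = ([r + f for r, f in zip(running, features)], count + 1)
    return {
        label: [value / count for value in running]
        for label, (running, count) in totals.items()
    }


def classify(model, features):
    """Pick the phoneme whose mean feature vector is closest (Euclidean)."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Made-up training set: (phoneme, [relative frequency, relative duration]).
training_set = [
    ("i", [9.0, 3.0]), ("i", [8.0, 3.2]),
    ("u", [2.0, 3.1]), ("u", [3.0, 2.9]),
    ("t", [6.0, 0.5]), ("t", [5.0, 0.7]),
]
model = train(training_set)
guess = classify(model, [8.2, 3.0])
```

The design choice here is only to show the train-then-map shape of the component; any probabilistic model could replace the distance rule.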
The dictionary lists each word broken down into the
phoneme sounds it is typically made of.
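A dictionary of this kind can be sketched as a plain lookup table from words to phoneme sequences. The three entries and the ARPAbet-style symbols below are illustrative assumptions, not the dictionary used in the slides.

```python
# Hypothetical pronunciation dictionary: word -> phoneme sequence.
PRONUNCIATIONS = {
    "cat": ("K", "AE", "T"),
    "cab": ("K", "AE", "B"),
    "at": ("AE", "T"),
}


def phonemes_for(word):
    """Phoneme sequence for a word, or None if the word is unknown."""
    return PRONUNCIATIONS.get(word.lower())


def words_for(phonemes):
    """All dictionary words whose pronunciation matches the given phonemes."""
    target = tuple(phonemes)
    return sorted(
        word for word, sequence in PRONUNCIATIONS.items() if sequence == target
    )


lookup = phonemes_for("Cat")
candidates = words_for(["AE", "T"])
```

The reverse lookup is what links the acoustic model's phoneme output to candidate words for the language model to rank.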
● The language model provides word-level structure for a language.
● It uses formal grammar rules to form sentences, since we use context to
place a particular word at a particular position.
To implement this context matching in a system we use probability: we
calculate the probability of the next word from the probabilities of the
previous words.
The probability of a word is based on the last N-1 words:
$P(Y) = \sum_{X} P(Y \mid X)\, P(X)$ (sum over X)
X = the words already present in the sentence.
Y = the observed sequence.
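The sum above can be checked on a toy example. The sketch below assumes the simplest case, N = 2 (a bigram model), where X is the previous word and Y is the next word. The three training sentences are made up, and exact fractions are used so the arithmetic is easy to follow.

```python
from collections import Counter
from fractions import Fraction


def train_bigrams(sentences):
    """Count (previous word, next word) pairs and how often each previous word occurs."""
    pair_counts = Counter()
    prev_counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    return pair_counts, prev_counts


def conditional(pair_counts, prev_counts, nxt, prev):
    """P(Y = nxt | X = prev), estimated from the counts."""
    if prev_counts[prev] == 0:
        return Fraction(0)
    return Fraction(pair_counts[(prev, nxt)], prev_counts[prev])


def marginal(pair_counts, prev_counts, nxt):
    """P(Y = nxt) as the sum over X of P(Y | X) * P(X)."""
    total = sum(prev_counts.values())
    return sum(
        conditional(pair_counts, prev_counts, nxt, prev)
        * Fraction(prev_counts[prev], total)
        for prev in prev_counts
    )


# Hypothetical training sentences.
corpus = ["i like speech", "i like noise", "we like speech"]
pairs, prevs = train_bigrams(corpus)
p_speech_given_like = conditional(pairs, prevs, "speech", "like")
p_speech = marginal(pairs, prevs, "speech")
```

Here "like" precedes "speech" in two of its three occurrences, and "like" accounts for three of the six word pairs, so the sum reduces to (2/3) × (3/6) = 1/3.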
B.Tech, Computer Science
JIIT, Noida

Automatic Speech Recognition

  • 1.
  • 2. We assume that the speech given to our system contains no noise other than that coming from the user. In certain places we use a stored database that is generated after training on the training sets. 11/14/2012 2YoGiV
  • 3. To implement the above system we have three subsystems: 1. ASR (Automatic Speech Recognition); 2. Dialogue Management; 3. Spoken Language Generation.
  • 4. ASR is the first subsystem used in the SDS. It takes voice as input, converts it into grammatically correct speech, and stores it in the system. Its focus is on turning the voice input (including noise) into recognized speech that the next subsystem can use. This is our main area of focus.
  • 5. Dialogue management mainly focuses on managing the output produced by ASR according to the individual identity, and stores it in the system for use in the next subsystem.
  • 6. This subsystem uses the stored speech and generates spoken language (English, in our case).
  • 8. In our case we are dealing with ASR (Automatic Speech Recognition).
  • 9. ASR takes voice as input and converts it into understandable speech. Questions arise: How can the system distinguish between different speakers? How can the system distinguish between ambient noise and someone speaking? How can the system derive meaning from what was said? To answer these questions, we start by describing the most important part: “Speech”.
  • 10. Some factors to keep in mind when taking speech as input: a) Biological Factors; b) Phonology; c) Frequency of Sounds; d) Timing.
  • 11. 1. The way our mouths move to produce certain sounds affects the features of the sound itself. 2. The structure of the mouth produces multiple waves in certain patterns. 3. When we shape our mouths to make certain letters, say ‘t’, we push out more air at once, making a higher-frequency sound. From this, one thing we must take care of is the frequency of speech, and along with frequency we take amplitude and pitch into consideration.
  • 12. Phonology shows how we use sound to convey meaning in a language. In English it describes the characteristics of sounds such as vowels and consonants. A phoneme is the smallest segmental unit of sound in a language. Each phoneme has sound features that distinguish it from other phonemes, and phonemes combine to represent words and sentences. English has about 40-50 phonemes. So we use phonemes to remove noise from the sound.
  • 13. Different vowels have different pitches; they are similar to musical notes, e.g. 'i' being the highest and 'u' being the lowest. Consonant phonemes have more waves, oscillating from different parts of the mouth. So, according to their different frequencies, the system can store words with different phonemes.
  • 14. There is a lot of information in timing: breaks between words and breaks between one sentence and another must all be considered to distinguish between different words. According to research, vowels last longer than consonants. Considering the factors above, we have to: translate from frequencies to a representation of a phoneme; discard useless information such as noise; ensure that the sentence created makes sense.
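The timing cue on slide 14 (breaks between words and sentences) can be illustrated with a short-time energy segmenter. This is only a sketch under assumptions, not the method used in the presentation: the function names `frame_energies` and `split_on_pauses`, the frame length, the energy threshold, and the minimum pause length are all hypothetical choices made for the example.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    return energies


def split_on_pauses(samples, frame_len=4, threshold=0.01, min_pause_frames=2):
    """Return (start, end) sample ranges of segments separated by pauses.

    A pause is a run of at least `min_pause_frames` consecutive frames
    whose energy is below `threshold` (both values are assumptions).
    """
    energies = frame_energies(samples, frame_len)
    segments = []
    seg_start = None   # frame index where the current segment began
    quiet_run = 0      # number of consecutive quiet frames seen so far
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if seg_start is None:
                seg_start = i
            quiet_run = 0
        elif seg_start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                seg_end = i - quiet_run + 1  # index of the first quiet frame
                segments.append((seg_start * frame_len, seg_end * frame_len))
                seg_start = None
                quiet_run = 0
    if seg_start is not None:
        # Close the last segment, dropping any trailing quiet frames.
        seg_end = len(energies) - quiet_run
        segments.append((seg_start * frame_len, seg_end * frame_len))
    return segments
```

A fixed threshold only works for the noise-free input assumed on slide 2; with background noise the threshold would have to adapt to the noise level.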
  • 15. For the above problems we use two models and one database: Acoustic Model; Dictionary; Language Model.
  • 16. The acoustic model is based on all the features of a sound wave: frequency, pitch, amplitude, and time information.
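As a rough illustration of the sound-wave features listed on slide 16, the sketch below computes duration, amplitude, and a crude frequency estimate for one block of samples. The names `rms_amplitude`, `zero_crossing_frequency`, and `describe_frame` are hypothetical, and zero-crossing counting is a stand-in chosen for brevity: it is only reasonable for clean, nearly pure tones, and a practical recognizer would use spectral analysis instead.

```python
import math


def rms_amplitude(samples):
    """Root-mean-square amplitude of the samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_frequency(samples, sample_rate):
    """Estimate frequency in Hz by counting sign changes.

    A pure tone crosses zero twice per period, so the frequency is
    roughly crossings / (2 * duration).
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = (len(samples) - 1) / sample_rate
    return crossings / (2 * duration)


def describe_frame(samples, sample_rate):
    """Collect time, amplitude, and frequency features for one frame."""
    return {
        "duration_s": len(samples) / sample_rate,
        "rms_amplitude": rms_amplitude(samples),
        "frequency_hz": zero_crossing_frequency(samples, sample_rate),
    }
```

For example, calling `describe_frame` on one second of a 100 Hz sine wave sampled at 8000 Hz gives a frequency estimate close to 100 Hz.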
  • 17. The acoustic model is the statistical mapping from the units of speech to all the features of speech. It converts speech sound to phonemes and then to words statistically. It carries information about the phonology of the language and can learn from a training set.
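The idea of a statistical mapping learned from a training set can be sketched with one Gaussian per phoneme over a single feature value. This is a deliberately tiny stand-in, not the model used in the presentation: the names `train`, `log_likelihood`, and `classify` are hypothetical, the single feature is an assumption, and real acoustic models work on many features per frame.

```python
import math
from collections import defaultdict


def train(examples):
    """Fit a mean and variance per phoneme.

    `examples` is a list of (phoneme, feature_value) pairs, i.e. the
    training set. Returns {phoneme: (mean, variance)}.
    """
    grouped = defaultdict(list)
    for phoneme, value in examples:
        grouped[phoneme].append(value)
    model = {}
    for phoneme, values in grouped.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        # Floor the variance so a single-example phoneme cannot divide by zero.
        model[phoneme] = (mean, max(variance, 1e-6))
    return model


def log_likelihood(value, mean, variance):
    """Log density of `value` under a Gaussian with the given parameters."""
    return -0.5 * (math.log(2 * math.pi * variance) + (value - mean) ** 2 / variance)


def classify(model, value):
    """Return the phoneme whose Gaussian gives `value` the highest likelihood."""
    return max(model, key=lambda phoneme: log_likelihood(value, *model[phoneme]))
```

Trained on a handful of labelled feature values for two vowels, `classify` picks whichever vowel's distribution best explains a new value. The same principle extends to many features and many phonemes.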
  • 18. The dictionary checks words broken down into the phoneme sounds they are typically made of.
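A pronunciation dictionary of this kind can be sketched as a mapping from each word to its phoneme sequence. The three entries and the ARPAbet-style symbols below are chosen purely for illustration, and the names `PRONUNCIATIONS`, `phonemes_for`, and `words_for` are hypothetical.

```python
# Toy pronunciation dictionary: word -> list of phoneme symbols.
PRONUNCIATIONS = {
    "cat": ["K", "AE", "T"],
    "act": ["AE", "K", "T"],
    "tack": ["T", "AE", "K"],
}


def phonemes_for(word, dictionary=PRONUNCIATIONS):
    """Return the phoneme sequence for a word, or None if it is unknown."""
    return dictionary.get(word.lower())


def words_for(phonemes, dictionary=PRONUNCIATIONS):
    """Return all words whose pronunciation matches the phoneme sequence."""
    target = list(phonemes)
    return sorted(word for word, pron in dictionary.items() if pron == target)
```

The three toy words use the same phonemes in different orders, which shows why the dictionary must match the whole sequence rather than the set of sounds.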
  • 19. The language model provides word-level structure for a language. It uses formal grammar rules to form sentences, since we use context to place a particular word at a particular place. To implement this context matching in a system we use probability: we calculate the probability of the next word from the preceding words, so the probability of a word is based on the last N-1 terms. P(Y) = ∑_X P(Y|X) P(X), summed over X, where X ranges over the existing words in the sentence and Y is the observed sequence.
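The "last N-1 terms" idea can be sketched for N = 2 (a bigram model), where the probability of the next word depends only on the previous word. This is a minimal illustration using relative frequencies with no smoothing. The names `train_bigrams` and `next_word_probability` and the `<s>` start marker are assumptions made for the example, not part of the presentation.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count contexts and word pairs in a list of training sentences."""
    contexts = Counter()  # how often each word appears with a following word
    bigrams = Counter()   # how often each (previous, word) pair appears
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams


def next_word_probability(contexts, bigrams, previous, word):
    """Estimate P(word | previous) as count(previous, word) / count(previous)."""
    if contexts[previous] == 0:
        return 0.0  # unseen context: no estimate without smoothing
    return bigrams[(previous, word)] / contexts[previous]
```

Trained on "the cat sat", "the cat ran", and "the dog sat", the model gives P(cat | the) = 2/3, so after "the" it would prefer "cat" over "dog".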
  • 21. B.Tech, Computer Science, JIIT, Noida