Speech Recognition 
Created By : 
Kanjariya Hardik G. 
Roll No : 17
Introduction
• Speech recognition technology has recently reached a level
of performance and robustness that makes it practical to
communicate by talking.
• Speech recognition is the process of decoding an acoustic
speech signal, captured by a microphone or telephone, into a
set of words.
• The speech is recognized word by word until the whole
utterance has been decoded.
Types of SR
• There are two main types of speaker models: speaker independent
and speaker dependent.
• Speaker independent models recognize the speech patterns of a large
group of people.
• Speaker dependent models recognize speech patterns from only one
person. Both models use mathematical and statistical formulas to yield
the best word match for speech. A third variation of speaker models is
now emerging, called speaker adaptive.
• Speaker adaptive systems usually begin with a speaker independent
model and adjust it more closely to each individual during a
brief training period.
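
The speaker adaptive idea can be sketched in a few lines. This is only an illustration under strong assumptions: the "model" here is a single mean feature vector, and adaptation is a weighted blend toward the speaker's training samples. The function name `adapt_model` and the `weight` parameter are hypothetical and are not taken from any particular engine.

```python
def adapt_model(independent_mean, speaker_samples, weight=0.3):
    """Blend a speaker independent mean toward one speaker's samples.

    Assumption: a model is just a mean feature vector. Real systems
    adapt far richer statistical models.
    """
    if not speaker_samples:
        return list(independent_mean)
    n = len(speaker_samples)
    dims = len(independent_mean)
    # Average the speaker's training samples, one dimension at a time.
    speaker_mean = [sum(s[d] for s in speaker_samples) / n for d in range(dims)]
    # Move the generic model part of the way toward the speaker.
    return [(1 - weight) * m + weight * sm
            for m, sm in zip(independent_mean, speaker_mean)]


generic = [1.0, 2.0]                    # speaker independent starting point
training = [[2.0, 2.0], [4.0, 4.0]]     # brief training period for one user
adapted = adapt_model(generic, training, weight=0.5)
print(adapted)
```

A larger `weight` trusts the individual speaker more; a weight of 0 leaves the speaker independent model unchanged.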
How does it work?
• Speech produces a sound pressure wave, which forms an
acoustic signal.
• The microphone receives the acoustic signal and converts it
to an analogue signal.
• To store the analogue signal, it must be converted to a
digital signal.
• A speech recognizer tries to transform the digitally
encoded acoustic signal of speech in a natural language
into text in that language.
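
The analogue-to-digital step can be illustrated with a small sketch. It assumes a pure sine tone stands in for the analogue signal, and the sample rate, tone frequency, and bit depth are illustrative values chosen here, not figures from the slides.

```python
import math


def digitize(duration_s=0.01, sample_rate=8000, freq_hz=440.0, bits=8):
    """Sample a sine 'analogue' signal and quantize it to signed integers."""
    max_level = 2 ** (bits - 1) - 1            # 127 for 8-bit samples
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                    # time of this sample in seconds
        analogue = math.sin(2 * math.pi * freq_hz * t)   # value in [-1, 1]
        samples.append(round(analogue * max_level))      # quantization
    return samples


pcm = digitize()
print(len(pcm), pcm[:5])
```

Sampling fixes the points in time at which the signal is measured, and quantization fixes the set of values each measurement may take; together they give the digital signal that the recognizer works on.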
Speech Waveform/Spectrogram
[Figure: waveform and spectrogram of the utterance "speech lab"; time axis in seconds (s), frequency axis in Hz]
• The spectrogram is an alternative way to characterize speech.
• The louder the sound, the greater the amplitude on the y-axis.
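
A spectrogram can be computed by cutting the digital signal into short frames and measuring how much energy each frame holds at each frequency. The sketch below uses a plain discrete Fourier transform from the standard library so that it stays self-contained; the frame size, hop size, and the 1000 Hz test tone are assumptions made for the example.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of DFT magnitudes (bins 0..frame_size//2) per frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        mags = []
        for k in range(frame_size // 2 + 1):
            # Correlate the frame with a complex sinusoid at bin k.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
spec = spectrogram(tone)

# The strongest bin of the first frame, converted back to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), peak_bin, peak_hz)
```

Each inner list is one column of the spectrogram image: position in the list is frequency, and the magnitude is what a plot would render as darkness.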
Speech Recognition Process Flow
The major components
• Audio input
• Grammar
• Acoustic Model
• Recognized text
Audio I/O
• It is important to understand that this audio
stream is rarely pristine.
• It contains not only the speech data (what was
said) but also background noise.
• This noise can interfere with the recognition
process, and the speech engine must handle (and
possibly even adapt to) the environment within
which the audio is spoken.
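
One simple way to picture how an engine might separate speech from background noise is an energy threshold. The sketch below assumes the first few frames contain only noise and flags later frames whose energy is well above that floor. The function names, frame size, and `factor` are hypothetical, and real engines use far more sophisticated methods.

```python
def frame_energies(samples, frame_size):
    """Mean squared amplitude of each non-overlapping frame."""
    return [sum(x * x for x in samples[i:i + frame_size]) / frame_size
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def detect_speech(samples, frame_size=4, noise_frames=2, factor=4.0):
    """Flag frames whose energy exceeds a multiple of the noise floor.

    Assumption: the first `noise_frames` frames hold background noise only.
    """
    energies = frame_energies(samples, frame_size)
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    return [e > factor * noise_floor for e in energies]


# Quiet background, one loud burst standing in for speech, then quiet again.
audio = [1, -1, 1, -1,  -1, 1, -1, 1,  10, -9, 8, -10,  1, -1, 1, -1]
flags = detect_speech(audio)
print(flags)
```

Re-estimating the noise floor as conditions change is one way an engine could adapt to its environment rather than merely tolerate it.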
Acoustic Model + Grammar
• Once the speech data is in the proper format, the engine
searches for the best match.
• It does this by taking into consideration the words and phrases
it knows about (the active grammars), along with its
knowledge of the environment in which it is operating.
• The knowledge of the environment is provided in the form of
an acoustic model.
• Once it identifies the most likely match for what was said, it
returns what it recognized as a text string.
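
The search for the best match can be sketched as follows, assuming the acoustic model has already produced a score for each candidate phrase. The scores, the phrases, and the function name `best_match` are made up for the example.

```python
def best_match(acoustic_scores, active_grammar):
    """Pick the highest-scoring hypothesis that the active grammar allows."""
    allowed = {h: s for h, s in acoustic_scores.items() if h in active_grammar}
    if not allowed:
        return None            # nothing the engine knows about was heard
    return max(allowed, key=allowed.get)


grammar = {"open file", "close file", "save file"}

# "open pile" sounds most likely on its own, but the grammar rules it out.
scores = {"open file": 0.62, "open pile": 0.71, "close file": 0.10}

result = best_match(scores, grammar)
print(result)
```

The example shows why the grammar matters: it removes acoustically plausible but meaningless candidates before the engine returns its text string.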
About SR Engine
• SR requires a software application "engine" with logic
built in to decipher and act on the spoken word.
• Sound card
– Converts the analogue signal from the microphone to a
digital signal.
• Function of the SR engine
– The SR engine converts this digital signal to
phonemes, and the phonemes to words.
Different SR engines
• CMU Sphinx
• Microsoft SAPI
• IBM ViaVoice
Decoding process.
Recognition Process Flow Summary
Step 1: User Input
The system captures the user's voice in the form of an analog
acoustic signal.
Step 2: Digitization
Digitize the analog acoustic signal.
Step 3: Phonetic Breakdown
Break the signal into phonemes.
Step 4: Statistical Modeling
Map phonemes to their phonetic representation
using a statistical model.
Step 5: Matching
Based on the grammar, the phonetic representation, and the
dictionary, the system returns an n-best list (i.e., words,
each with a confidence score).
• Grammar: the set of words or phrases that constrains the
range of input or output in the voice application.
• Dictionary: the mapping table between phonetic
representations and words (e.g., "thu" and "thee" both map to "the").
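
Step 5 can be sketched as a dictionary lookup followed by ranking. The phoneme symbols, the `PRONUNCIATIONS` table, and the confidence values below are hypothetical and chosen only to mirror the "thu"/"thee" example; they do not come from any real engine.

```python
# Dictionary: phonetic representation -> word (illustrative entries only).
PRONUNCIATIONS = {
    ("th", "uh"): "the",    # "thu"
    ("th", "ee"): "the",    # "thee"
    ("th", "ey"): "they",
}


def n_best(phoneme_hypotheses, grammar, n=2):
    """Return up to n (word, confidence) pairs, best first.

    phoneme_hypotheses: list of (phoneme tuple, confidence) pairs.
    Hypotheses missing from the dictionary or the grammar are dropped.
    """
    scored = {}
    for phonemes, confidence in phoneme_hypotheses:
        word = PRONUNCIATIONS.get(tuple(phonemes))
        if word is None or word not in grammar:
            continue
        # Keep the best confidence seen for each word.
        scored[word] = max(scored.get(word, 0.0), confidence)
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


hyps = [(("th", "uh"), 0.8), (("th", "ee"), 0.6),
        (("th", "ey"), 0.3), (("z", "uh"), 0.9)]
result = n_best(hyps, grammar={"the", "they"})
print(result)
```

Both pronunciations of "the" collapse to a single entry, and the unknown sequence is discarded even though it had the highest raw confidence.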
REPRESENTATION OF SOFTWARE 
Challenges and Difficulties of SR
Speech recognition is still a difficult problem. The main
difficulties are:
• Speaker Variability
Two speakers, or even the same speaker, will pronounce the
same word differently.
• Channel Variability
The quality and position of the microphone and the background
environment will affect the output.
Current Software Options for PC
• Dragon Systems – NaturallySpeaking
• Philips – FreeSpeech
• IBM – ViaVoice
• Lernout & Hauspie – Voice Xpress

Editor's Notes

  1. Speech recognition technology has recently reached a level of performance and robustness that makes it practical to communicate by talking.
  2. The waveform of the utterance “speech lab” shows time in seconds along the x-axis and the pressure level on the y-axis; the louder the sound, the greater the amplitude on the y-axis. The spectrogram is an alternative way to characterize speech. Time is still on the x-axis, but the y-axis shows frequency (in hertz), and intensity is shown by the degree of darkness in the image.
  3. In step 4, there is an internal structure called the dictionary.