SlideShare a Scribd company logo
1 of 21
Visit www.seminarlinks.blogspot.in to Download
Introduction
• Speech recognition is the process of converting an acoustic signal, captured
by a microphone or a telephone, to a set of words.
• The recognized words can be an end in themselves, as for applications such
as commands & control, data entry, and document preparation.
• They can also serve as the input to further linguistic processing in order to
achieve speech understanding.
• It is also known as Automatic Speech Recognition (ASR) ,computer speech
recognition, speech to text (STT).
History
• Around since the 1960s, ASR has seen steady, incremental improvement
over the years.
• It has benefited greatly from increased processing speed of computers in
the last decade, entering the marketplace in the mid-2000s.
• Early systems were acoustic phonetics-based and worked with small
vocabularies to identify isolated words.
• Over the years, vocabularies have grown while ASR systems have become
statistics-based
• They now have large vocabularies and can recognize continuous speech.
Basic Structure
Digital Sampling
• When you speak, you create vibrations in the air. The analog-to-digital
converter (ADC) translates this analog wave into digital data that the
computer can understand.
• To do this, it samples, or digitizes, the sound by taking precise
measurements of the wave at frequent intervals.
• The system filters the digitized sound to remove unwanted noise, and
sometimes to separate it into different bands of frequency.
Acoustic model
• Next the signal is divided into small segments as short as a few
hundredths of a second, or even thousandths in the case of plosive
consonant sounds -- consonant stops produced by obstructing airflow
in the vocal tract -- like "p" or "t."
• The program then matches these segments to known phonemes in
the appropriate language.
• A phoneme is the smallest element of a language -- a representation
of the sounds we make and put together to form meaningful
expressions.
Language model
• The program examines phonemes in the context of the other
phonemes around them.
• It runs the contextual phoneme plot through a complex statistical
model and compares them to a large library of known words, phrases
and sentences.
• The program then determines what the user was probably saying and
either outputs it as text or issues a computer command.
Statistical Modeling Systems
• These systems use probability and mathematical functions to
determine the most likely outcome.
• The two models that dominate the field today are the Hidden Markov
Model and Neural Networks.
• These methods involve complex mathematical functions, but
essentially, they take the information known to the system to figure
out the information hidden from it.
Hidden Markov Model (HMM)
• In this model, each phoneme is like a link in a chain, and the
completed chain is a word.
• The chain branches off in different directions as the program
attempts to match the digital sound with the phoneme that's most
likely to come next.
• During this process, the program assigns a probability score to each
phoneme, based on its built-in dictionary and user training.
Markov Model
Neural Networks
A class of statistical models may be called "neural" if they consist of

• sets of adaptive weights, i.e. numerical parameters that are tuned by
a learning algorithm, and
• are capable of approximating non-linear functions of their inputs.
The adaptive weights are conceptually connection strengths between
neurons, which are activated during training and prediction.
Each circular node represents an artificial neuron and an arrow represents a
connection from the output of one neuron to the input of another.
Program Training
• The process is more complicated for phrases and sentences -- the system
has to figure out where each word stops and starts.
• The statistical systems need lots of exemplary training data to reach their
optimal performance.

• Sometimes on the order of thousands of hours of human-transcribed
speech and hundreds of megabytes of text.
• The training data are used to create acoustic models of words, word lists
and multi-word probability networks.
• The details can make the difference between a well-performing system and
a poorly-performing system -- even when using the same basic algorithm.
Applications
• Transcription
• dictation, information retrieval

• Command and control
• data entry, device control, navigation, call routing

• Information access
• airline schedules, stock quotes, directory assistance

• Problem solving
• travel planning, logistics
Weaknesses and Flaws
• Low signal-to-noise ratio - The program needs to "hear" the words
spoken distinctly, and any extra noise introduced into the sound will
interfere with this.
• Overlapping speech- Current systems have difficulty separating
simultaneous speech from multiple users.

• Intensive use of computer power.
• Homonyms e.g. "There" and "their," "air" and "heir," "be" and "bee"
Major Challenges
• Making a system that can flawlessly handle roadblocks like
slang, dialects, accents and background noise.
• The different grammatical structures used by languages can also pose
a problem. For example, Arabic sometimes uses single words to
convey ideas that are entire sentences in English.
The Future of Speech Recognition
• The Defense Advanced Research Projects Agency (DARPA) has three teams
of researchers working on Global Autonomous Language Exploitation
(GALE), a program that will take in streams of information from foreign
news broadcasts and newspapers and translate them.
• It hopes to create software that can instantly translate two languages with
at least 90 percent accuracy.
• "DARPA is also funding an R&D effort called TRANSTAC to enable the
soldiers to communicate more effectively with civilian populations in nonEnglish-speaking countries.
Conclusion
At some point in the future, speech recognition may become speech
understanding.
The statistical models that allow computers to decide what a person
just said may someday allow them to grasp the meaning behind the
words.
Although it is a huge leap in terms of computational power and
software sophistication, some researchers argue that speech
recognition development offers the most direct line from the
computers of today to true artificial intelligence.
References
• http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speechrecognition.htm
• http://project.uet.itgo.com/speech.htm
• http://www.hitl.washington.edu/scivw/EVE/I.D.2.d.VoiceRecognition.html
• http://msdn.microsoft.com/en-us/library/hh378337(v=office.14).aspx
• http://www.plumvoice.com/resources/blog/speech-recognition/
• http://en.wikipedia.org/wiki/Hidden_Markov_model
• http://en.wikipedia.org/wiki/Automatic_translation
Speech Recognition Technology

More Related Content

What's hot

Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionRHIMRJ Journal
 
Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition TechnologySrijanKumar18
 
speech processing and recognition basic in data mining
speech processing and recognition basic in  data miningspeech processing and recognition basic in  data mining
speech processing and recognition basic in data miningJimit Rupani
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognitionCharu Joshi
 
Speech recognition system seminar
Speech recognition system seminarSpeech recognition system seminar
Speech recognition system seminarDiptimaya Sarangi
 
Voice Recognition
Voice RecognitionVoice Recognition
Voice RecognitionAmrita More
 
A seminar report on speech recognition technology
A seminar report on speech recognition technologyA seminar report on speech recognition technology
A seminar report on speech recognition technologySrijanKumar18
 
Speech recognition An overview
Speech recognition An overviewSpeech recognition An overview
Speech recognition An overviewsajanazoya
 
TEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptxTEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptxNsaroj kumar
 
Artificial intelligence for speech recognition
Artificial intelligence for speech recognitionArtificial intelligence for speech recognition
Artificial intelligence for speech recognitionsowmith chatlapally
 
silent sound technology
silent sound technologysilent sound technology
silent sound technologykamesh0007
 
Speech Recognition by Iqbal
Speech Recognition by IqbalSpeech Recognition by Iqbal
Speech Recognition by IqbalIqbal
 
Introduction to text to speech
Introduction to text to speechIntroduction to text to speech
Introduction to text to speechBilgin Aksoy
 
Artificial intelligence Speech recognition system
Artificial intelligence Speech recognition systemArtificial intelligence Speech recognition system
Artificial intelligence Speech recognition systemREHMAT ULLAH
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognitionRichie
 
Speech recognition techniques
Speech recognition techniquesSpeech recognition techniques
Speech recognition techniquessonukumar142
 
Automatic speech recognition system
Automatic speech recognition systemAutomatic speech recognition system
Automatic speech recognition systemAlok Tiwari
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognitionananth
 
Artificial intelligence in speech recognition
Artificial intelligence in speech recognitionArtificial intelligence in speech recognition
Artificial intelligence in speech recognitionRajanivetha G
 

What's hot (20)

Artificial Intelligence for Speech Recognition
Artificial Intelligence for Speech RecognitionArtificial Intelligence for Speech Recognition
Artificial Intelligence for Speech Recognition
 
Speech Recognition Technology
Speech Recognition TechnologySpeech Recognition Technology
Speech Recognition Technology
 
speech processing and recognition basic in data mining
speech processing and recognition basic in  data miningspeech processing and recognition basic in  data mining
speech processing and recognition basic in data mining
 
Speech recognition
Speech recognitionSpeech recognition
Speech recognition
 
Speech recognition system seminar
Speech recognition system seminarSpeech recognition system seminar
Speech recognition system seminar
 
Voice Recognition
Voice RecognitionVoice Recognition
Voice Recognition
 
A seminar report on speech recognition technology
A seminar report on speech recognition technologyA seminar report on speech recognition technology
A seminar report on speech recognition technology
 
Speech recognition An overview
Speech recognition An overviewSpeech recognition An overview
Speech recognition An overview
 
TEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptxTEXT-SPEECH PPT.pptx
TEXT-SPEECH PPT.pptx
 
Artificial intelligence for speech recognition
Artificial intelligence for speech recognitionArtificial intelligence for speech recognition
Artificial intelligence for speech recognition
 
silent sound technology
silent sound technologysilent sound technology
silent sound technology
 
Speech Recognition by Iqbal
Speech Recognition by IqbalSpeech Recognition by Iqbal
Speech Recognition by Iqbal
 
Introduction to text to speech
Introduction to text to speechIntroduction to text to speech
Introduction to text to speech
 
Artificial intelligence Speech recognition system
Artificial intelligence Speech recognition systemArtificial intelligence Speech recognition system
Artificial intelligence Speech recognition system
 
Automatic speech recognition
Automatic speech recognitionAutomatic speech recognition
Automatic speech recognition
 
Speech recognition techniques
Speech recognition techniquesSpeech recognition techniques
Speech recognition techniques
 
Automatic speech recognition system
Automatic speech recognition systemAutomatic speech recognition system
Automatic speech recognition system
 
Voice recognition
Voice recognitionVoice recognition
Voice recognition
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognition
 
Artificial intelligence in speech recognition
Artificial intelligence in speech recognitionArtificial intelligence in speech recognition
Artificial intelligence in speech recognition
 

Viewers also liked

Universal Patient Identity: eliminating duplicate records, medical identity t...
Universal Patient Identity: eliminating duplicate records, medical identity t...Universal Patient Identity: eliminating duplicate records, medical identity t...
Universal Patient Identity: eliminating duplicate records, medical identity t...3GDR
 
The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry
The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry
The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry RightPatient®
 
Voice & Speech Recognition Technology in Healthcare
Voice &  Speech Recognition Technology in HealthcareVoice &  Speech Recognition Technology in Healthcare
Voice & Speech Recognition Technology in HealthcareCaroline Macleod
 
Noise Adaptive Training for Robust Automatic Speech Recognition
Noise Adaptive Training for Robust Automatic Speech RecognitionNoise Adaptive Training for Robust Automatic Speech Recognition
Noise Adaptive Training for Robust Automatic Speech Recognitionأحلام انصارى
 
Medical Records Destruction Guide
Medical Records Destruction GuideMedical Records Destruction Guide
Medical Records Destruction GuideShred Nations
 
Medical Transcription
Medical TranscriptionMedical Transcription
Medical Transcriptionaadhar14_b
 
Translation and Transcription Process | Medical Transcription Service Company
Translation and Transcription Process | Medical Transcription Service Company  Translation and Transcription Process | Medical Transcription Service Company
Translation and Transcription Process | Medical Transcription Service Company amar519
 
Introduction to medical transcription
Introduction to medical transcriptionIntroduction to medical transcription
Introduction to medical transcriptionjeanrummy
 
Transcription
TranscriptionTranscription
Transcriptionjoyjulie
 
Medical Records Role and its Maintenance.
Medical Records Role and its Maintenance.Medical Records Role and its Maintenance.
Medical Records Role and its Maintenance.Healthcare consultant
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By MatlabAnkit Gujrati
 

Viewers also liked (15)

Uses of speech recognition system
Uses of speech recognition systemUses of speech recognition system
Uses of speech recognition system
 
What is medical transcription
What is medical transcriptionWhat is medical transcription
What is medical transcription
 
Universal Patient Identity: eliminating duplicate records, medical identity t...
Universal Patient Identity: eliminating duplicate records, medical identity t...Universal Patient Identity: eliminating duplicate records, medical identity t...
Universal Patient Identity: eliminating duplicate records, medical identity t...
 
The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry
The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry
The Impact of Duplicate Medical Records and Overlays on the Healthcare Industry
 
Voice & Speech Recognition Technology in Healthcare
Voice &  Speech Recognition Technology in HealthcareVoice &  Speech Recognition Technology in Healthcare
Voice & Speech Recognition Technology in Healthcare
 
Noise Adaptive Training for Robust Automatic Speech Recognition
Noise Adaptive Training for Robust Automatic Speech RecognitionNoise Adaptive Training for Robust Automatic Speech Recognition
Noise Adaptive Training for Robust Automatic Speech Recognition
 
Medical Records Destruction Guide
Medical Records Destruction GuideMedical Records Destruction Guide
Medical Records Destruction Guide
 
Medical Transcription
Medical TranscriptionMedical Transcription
Medical Transcription
 
Translation and Transcription Process | Medical Transcription Service Company
Translation and Transcription Process | Medical Transcription Service Company  Translation and Transcription Process | Medical Transcription Service Company
Translation and Transcription Process | Medical Transcription Service Company
 
Introduction to medical transcription
Introduction to medical transcriptionIntroduction to medical transcription
Introduction to medical transcription
 
Medical Transcription Power Point Show
Medical Transcription Power Point ShowMedical Transcription Power Point Show
Medical Transcription Power Point Show
 
Transcription
TranscriptionTranscription
Transcription
 
Medical Records Role and its Maintenance.
Medical Records Role and its Maintenance.Medical Records Role and its Maintenance.
Medical Records Role and its Maintenance.
 
Speech Recognition System By Matlab
Speech Recognition System By MatlabSpeech Recognition System By Matlab
Speech Recognition System By Matlab
 
Medical records ppt
Medical records pptMedical records ppt
Medical records ppt
 

Similar to Speech Recognition Technology

Sequence to sequence model speech recognition
Sequence to sequence model speech recognitionSequence to sequence model speech recognition
Sequence to sequence model speech recognitionAditya Kumar Khare
 
Recent advances in LVCSR : A benchmark comparison of performances
Recent advances in LVCSR : A benchmark comparison of performancesRecent advances in LVCSR : A benchmark comparison of performances
Recent advances in LVCSR : A benchmark comparison of performancesIJECEIAES
 
Artificial Intelligence- An Introduction
Artificial Intelligence- An IntroductionArtificial Intelligence- An Introduction
Artificial Intelligence- An Introductionacemindia
 
Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction acemindia
 
Speech recognition using neural + fuzzy logic
Speech recognition using neural + fuzzy logicSpeech recognition using neural + fuzzy logic
Speech recognition using neural + fuzzy logicSnehal Patel
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptxAmanBadesra1
 
Speech recognizers & generators
Speech recognizers & generatorsSpeech recognizers & generators
Speech recognizers & generatorsPaul Kahoro
 
NLP, Expert system and pattern recognition
NLP, Expert system and pattern recognitionNLP, Expert system and pattern recognition
NLP, Expert system and pattern recognitionMohammad Ilyas Malik
 
AI for voice recognition.pptx
AI for voice recognition.pptxAI for voice recognition.pptx
AI for voice recognition.pptxJhalakDashora
 
Efficient Intralingual Text To Speech Web Podcasting And Recording
Efficient Intralingual Text To Speech Web Podcasting And RecordingEfficient Intralingual Text To Speech Web Podcasting And Recording
Efficient Intralingual Text To Speech Web Podcasting And RecordingIOSR Journals
 
Wreck a nice beach: adventures in speech recognition
Wreck a nice beach: adventures in speech recognitionWreck a nice beach: adventures in speech recognition
Wreck a nice beach: adventures in speech recognitionStephen Marquard
 
Integration of speech recognition with computer assisted translation
Integration of speech recognition with computer assisted translationIntegration of speech recognition with computer assisted translation
Integration of speech recognition with computer assisted translationChamani Shiranthika
 
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionZachary S. Brown
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguisticsshrey bhate
 
Course report-islam-taharimul (1)
Course report-islam-taharimul (1)Course report-islam-taharimul (1)
Course report-islam-taharimul (1)TANVIRAHMED611926
 
Speech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law compandingSpeech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law compandingiosrjce
 

Similar to Speech Recognition Technology (20)

Sequence to sequence model speech recognition
Sequence to sequence model speech recognitionSequence to sequence model speech recognition
Sequence to sequence model speech recognition
 
Recent advances in LVCSR : A benchmark comparison of performances
Recent advances in LVCSR : A benchmark comparison of performancesRecent advances in LVCSR : A benchmark comparison of performances
Recent advances in LVCSR : A benchmark comparison of performances
 
Artificial Intelligence- An Introduction
Artificial Intelligence- An IntroductionArtificial Intelligence- An Introduction
Artificial Intelligence- An Introduction
 
Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction Artificial Intelligence - An Introduction
Artificial Intelligence - An Introduction
 
Kc3517481754
Kc3517481754Kc3517481754
Kc3517481754
 
Speech recognition using neural + fuzzy logic
Speech recognition using neural + fuzzy logicSpeech recognition using neural + fuzzy logic
Speech recognition using neural + fuzzy logic
 
NLP,expert,robotics.pptx
NLP,expert,robotics.pptxNLP,expert,robotics.pptx
NLP,expert,robotics.pptx
 
Speech recognizers & generators
Speech recognizers & generatorsSpeech recognizers & generators
Speech recognizers & generators
 
NLP, Expert system and pattern recognition
NLP, Expert system and pattern recognitionNLP, Expert system and pattern recognition
NLP, Expert system and pattern recognition
 
speech enhancement
speech enhancementspeech enhancement
speech enhancement
 
AI for voice recognition.pptx
AI for voice recognition.pptxAI for voice recognition.pptx
AI for voice recognition.pptx
 
Efficient Intralingual Text To Speech Web Podcasting And Recording
Efficient Intralingual Text To Speech Web Podcasting And RecordingEfficient Intralingual Text To Speech Web Podcasting And Recording
Efficient Intralingual Text To Speech Web Podcasting And Recording
 
Wreck a nice beach: adventures in speech recognition
Wreck a nice beach: adventures in speech recognitionWreck a nice beach: adventures in speech recognition
Wreck a nice beach: adventures in speech recognition
 
Integration of speech recognition with computer assisted translation
Integration of speech recognition with computer assisted translationIntegration of speech recognition with computer assisted translation
Integration of speech recognition with computer assisted translation
 
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech RecognitionTeaching Machines to Listen: An Introduction to Automatic Speech Recognition
Teaching Machines to Listen: An Introduction to Automatic Speech Recognition
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguistics
 
Course report-islam-taharimul (1)
Course report-islam-taharimul (1)Course report-islam-taharimul (1)
Course report-islam-taharimul (1)
 
H010625862
H010625862H010625862
H010625862
 
Speech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law compandingSpeech to text conversion for visually impaired person using µ law companding
Speech to text conversion for visually impaired person using µ law companding
 
VOICE BROWSER
VOICE BROWSERVOICE BROWSER
VOICE BROWSER
 

More from Seminar Links

Artificial Intelligence (A.I.) in Schools (PPT)
Artificial Intelligence (A.I.) in Schools (PPT)Artificial Intelligence (A.I.) in Schools (PPT)
Artificial Intelligence (A.I.) in Schools (PPT)Seminar Links
 
Sustainable Materials Management (SMM)
Sustainable Materials Management (SMM)Sustainable Materials Management (SMM)
Sustainable Materials Management (SMM)Seminar Links
 
Are Top Grades Enough (PPT)
Are Top Grades Enough (PPT)Are Top Grades Enough (PPT)
Are Top Grades Enough (PPT)Seminar Links
 
AI and Youth Employment (PPT)
AI and Youth Employment (PPT)AI and Youth Employment (PPT)
AI and Youth Employment (PPT)Seminar Links
 
Environmental Impacts of COVID-19 Pandemic: PPT
Environmental Impacts of COVID-19 Pandemic: PPTEnvironmental Impacts of COVID-19 Pandemic: PPT
Environmental Impacts of COVID-19 Pandemic: PPTSeminar Links
 
20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging TechnologiesSeminar Links
 
Claytronics | Programmable Matter | PPT
Claytronics | Programmable Matter | PPTClaytronics | Programmable Matter | PPT
Claytronics | Programmable Matter | PPTSeminar Links
 
Three-dimensional Holographic Projection Technology PPT | 2018
Three-dimensional Holographic Projection Technology PPT | 2018Three-dimensional Holographic Projection Technology PPT | 2018
Three-dimensional Holographic Projection Technology PPT | 2018Seminar Links
 
MicroLED : Latest Display Technology | PPT
MicroLED : Latest Display Technology | PPTMicroLED : Latest Display Technology | PPT
MicroLED : Latest Display Technology | PPTSeminar Links
 
Performance of 400 kV line insulators under pollution | PDF | DOC | PPT
Performance of 400 kV line insulators under pollution | PDF | DOC | PPTPerformance of 400 kV line insulators under pollution | PDF | DOC | PPT
Performance of 400 kV line insulators under pollution | PDF | DOC | PPTSeminar Links
 
Box Pushing Technique
Box Pushing TechniqueBox Pushing Technique
Box Pushing TechniqueSeminar Links
 
Highest Largest Tallest Longest in India 2018
Highest Largest Tallest Longest in India 2018Highest Largest Tallest Longest in India 2018
Highest Largest Tallest Longest in India 2018Seminar Links
 
Atmospheric Vortex Engine (AVE)
Atmospheric Vortex Engine (AVE) Atmospheric Vortex Engine (AVE)
Atmospheric Vortex Engine (AVE) Seminar Links
 
Artificial photosynthesis PPT
Artificial photosynthesis PPTArtificial photosynthesis PPT
Artificial photosynthesis PPTSeminar Links
 
How to prevent WannaCry Ransomware
How to prevent WannaCry RansomwareHow to prevent WannaCry Ransomware
How to prevent WannaCry RansomwareSeminar Links
 
Babbitt material ppt
Babbitt material pptBabbitt material ppt
Babbitt material pptSeminar Links
 
Carbon Foam Military Applications
Carbon Foam Military ApplicationsCarbon Foam Military Applications
Carbon Foam Military ApplicationsSeminar Links
 

More from Seminar Links (20)

Artificial Intelligence (A.I.) in Schools (PPT)
Artificial Intelligence (A.I.) in Schools (PPT)Artificial Intelligence (A.I.) in Schools (PPT)
Artificial Intelligence (A.I.) in Schools (PPT)
 
Sustainable Materials Management (SMM)
Sustainable Materials Management (SMM)Sustainable Materials Management (SMM)
Sustainable Materials Management (SMM)
 
Are Top Grades Enough (PPT)
Are Top Grades Enough (PPT)Are Top Grades Enough (PPT)
Are Top Grades Enough (PPT)
 
AI and Youth Employment (PPT)
AI and Youth Employment (PPT)AI and Youth Employment (PPT)
AI and Youth Employment (PPT)
 
Environmental Impacts of COVID-19 Pandemic: PPT
Environmental Impacts of COVID-19 Pandemic: PPTEnvironmental Impacts of COVID-19 Pandemic: PPT
Environmental Impacts of COVID-19 Pandemic: PPT
 
20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies20 Latest Computer Science Seminar Topics on Emerging Technologies
20 Latest Computer Science Seminar Topics on Emerging Technologies
 
Claytronics | Programmable Matter | PPT
Claytronics | Programmable Matter | PPTClaytronics | Programmable Matter | PPT
Claytronics | Programmable Matter | PPT
 
Three-dimensional Holographic Projection Technology PPT | 2018
Three-dimensional Holographic Projection Technology PPT | 2018Three-dimensional Holographic Projection Technology PPT | 2018
Three-dimensional Holographic Projection Technology PPT | 2018
 
MicroLED : Latest Display Technology | PPT
MicroLED : Latest Display Technology | PPTMicroLED : Latest Display Technology | PPT
MicroLED : Latest Display Technology | PPT
 
Performance of 400 kV line insulators under pollution | PDF | DOC | PPT
Performance of 400 kV line insulators under pollution | PDF | DOC | PPTPerformance of 400 kV line insulators under pollution | PDF | DOC | PPT
Performance of 400 kV line insulators under pollution | PDF | DOC | PPT
 
Box Pushing Technique
Box Pushing TechniqueBox Pushing Technique
Box Pushing Technique
 
Highest Largest Tallest Longest in India 2018
Highest Largest Tallest Longest in India 2018Highest Largest Tallest Longest in India 2018
Highest Largest Tallest Longest in India 2018
 
Atmospheric Vortex Engine (AVE)
Atmospheric Vortex Engine (AVE) Atmospheric Vortex Engine (AVE)
Atmospheric Vortex Engine (AVE)
 
Artificial photosynthesis PPT
Artificial photosynthesis PPTArtificial photosynthesis PPT
Artificial photosynthesis PPT
 
How to prevent WannaCry Ransomware
How to prevent WannaCry RansomwareHow to prevent WannaCry Ransomware
How to prevent WannaCry Ransomware
 
Dams PPT
Dams PPTDams PPT
Dams PPT
 
Bio mass Energy
Bio mass EnergyBio mass Energy
Bio mass Energy
 
Babbitt material ppt
Babbitt material pptBabbitt material ppt
Babbitt material ppt
 
Ceramic Bearing ppt
Ceramic Bearing pptCeramic Bearing ppt
Ceramic Bearing ppt
 
Carbon Foam Military Applications
Carbon Foam Military ApplicationsCarbon Foam Military Applications
Carbon Foam Military Applications
 

Recently uploaded

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 

Recently uploaded (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Speech Recognition Technology

  • 2. Introduction • Speech recognition is the process of converting an acoustic signal, captured by a microphone or a telephone, to a set of words. • The recognized words can be an end in themselves, as for applications such as commands & control, data entry, and document preparation. • They can also serve as the input to further linguistic processing in order to achieve speech understanding. • It is also known as Automatic Speech Recognition (ASR) ,computer speech recognition, speech to text (STT).
  • 3. History • Around since the 1960s, ASR has seen steady, incremental improvement over the years. • It has benefited greatly from increased processing speed of computers in the last decade, entering the marketplace in the mid-2000s. • Early systems were acoustic phonetics-based and worked with small vocabularies to identify isolated words. • Over the years, vocabularies have grown while ASR systems have become statistics-based • They now have large vocabularies and can recognize continuous speech.
  • 5. Digital Sampling • When you speak, you create vibrations in the air. The analog-to-digital converter (ADC) translates this analog wave into digital data that the computer can understand. • To do this, it samples, or digitizes, the sound by taking precise measurements of the wave at frequent intervals. • The system filters the digitized sound to remove unwanted noise, and sometimes to separate it into different bands of frequency.
  • 6. Acoustic model • Next the signal is divided into small segments as short as a few hundredths of a second, or even thousandths in the case of plosive consonant sounds -- consonant stops produced by obstructing airflow in the vocal tract -- like "p" or "t." • The program then matches these segments to known phonemes in the appropriate language. • A phoneme is the smallest element of a language -- a representation of the sounds we make and put together to form meaningful expressions.
  • 7. Language model • The program examines phonemes in the context of the other phonemes around them. • It runs the contextual phoneme plot through a complex statistical model and compares them to a large library of known words, phrases and sentences. • The program then determines what the user was probably saying and either outputs it as text or issues a computer command.
  • 8.
  • 9. Statistical Modeling Systems • These systems use probability and mathematical functions to determine the most likely outcome. • The two models that dominate the field today are the Hidden Markov Model and Neural Networks. • These methods involve complex mathematical functions, but essentially, they take the information known to the system to figure out the information hidden from it.
  • 10. Hidden Markov Model (HMM) • In this model, each phoneme is like a link in a chain, and the completed chain is a word. • The chain branches off in different directions as the program attempts to match the digital sound with the phoneme that's most likely to come next. • During this process, the program assigns a probability score to each phoneme, based on its built-in dictionary and user training.
  • 12. Neural Networks A class of statistical models may be called "neural" if they consist of • sets of adaptive weights, i.e. numerical parameters that are tuned by a learning algorithm, and • are capable of approximating non-linear functions of their inputs. The adaptive weights are conceptually connection strengths between neurons, which are activated during training and prediction.
  • 13. Each circular node represents an artificial neuron and an arrow represents a connection from the output of one neuron to the input of another.
  • 14. Program Training • The process is more complicated for phrases and sentences -- the system has to figure out where each word stops and starts. • The statistical systems need lots of exemplary training data to reach their optimal performance. • Sometimes on the order of thousands of hours of human-transcribed speech and hundreds of megabytes of text. • The training data are used to create acoustic models of words, word lists and multi-word probability networks. • The details can make the difference between a well-performing system and a poorly-performing system -- even when using the same basic algorithm.
  • 15. Applications • Transcription • dictation, information retrieval • Command and control • data entry, device control, navigation, call routing • Information access • airline schedules, stock quotes, directory assistance • Problem solving • travel planning, logistics
  • 16. Weaknesses and Flaws • Low signal-to-noise ratio - The program needs to "hear" the words spoken distinctly, and any extra noise introduced into the sound will interfere with this. • Overlapping speech- Current systems have difficulty separating simultaneous speech from multiple users. • Intensive use of computer power. • Homonyms e.g. "There" and "their," "air" and "heir," "be" and "bee"
  • 17. Major Challenges • Making a system that can flawlessly handle roadblocks like slang, dialects, accents and background noise. • The different grammatical structures used by languages can also pose a problem. For example, Arabic sometimes uses single words to convey ideas that are entire sentences in English.
  • 18. The Future of Speech Recognition • The Defense Advanced Research Projects Agency (DARPA) has three teams of researchers working on Global Autonomous Language Exploitation (GALE), a program that will take in streams of information from foreign news broadcasts and newspapers and translate them. • It hopes to create software that can instantly translate two languages with at least 90 percent accuracy. • "DARPA is also funding an R&D effort called TRANSTAC to enable the soldiers to communicate more effectively with civilian populations in nonEnglish-speaking countries.
  • 19. Conclusion At some point in the future, speech recognition may become speech understanding. The statistical models that allow computers to decide what a person just said may someday allow them to grasp the meaning behind the words. Although it is a huge leap in terms of computational power and software sophistication, some researchers argue that speech recognition development offers the most direct line from the computers of today to true artificial intelligence.
  • 20. References • http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speechrecognition.htm • http://project.uet.itgo.com/speech.htm • http://www.hitl.washington.edu/scivw/EVE/I.D.2.d.VoiceRecognition.html • http://msdn.microsoft.com/en-us/library/hh378337(v=office.14).aspx • http://www.plumvoice.com/resources/blog/speech-recognition/ • http://en.wikipedia.org/wiki/Hidden_Markov_model • http://en.wikipedia.org/wiki/Automatic_translation