Speech Recognition, Text-To-Speech, and Voice Interfaces
By:
Taryne Cahalin
Stephanie Sirico
Christiana Vasquez
Adelphi University - Mobile Learning, Fall 2013
What is Speech Recognition?
Instead of listening to an automated recording and pressing buttons,
a person speaks specific words into a device and issues commands
with the help of a speech recognition program.
The Uses
Individuals With Disabilities – Assists those who have a visual
impairment, limited hand mobility, dyslexia, etc.
Medical Transcription – Reduces delays in writing out
medical transcriptions.
Dictation – Converts spoken words to text in emails or other
documents (also helpful for English Language Learners).
Access Menu Commands – Opens files using voice commands.
Using Dragon Mobile
How does it work?
Speech recognition functions as a pipeline:
The pipeline converts PCM (pulse code modulation) digital audio
from a sound card into recognized speech.
Transforming PCM Digital Audio

• 16,000 PCM values per second, a “wavy line”, that repeat while the user speaks
• Information is converted for better recognition in the program
• Fast Fourier transform identifies the frequency components of a specific sound
• The program can approximate how our ears distinguish the sound
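The idea of finding the frequency components in a stream of PCM values can be sketched in a few lines. This is only an illustration, not how any particular product works: the tone generator stands in for recorded speech, the function names are made up for this example, and a plain discrete Fourier transform is used in place of a fast Fourier transform (it gives the same values, only more slowly).

```python
import cmath
import math

SAMPLE_RATE = 16_000  # PCM values per second, as stated on the slide


def make_tone(freq_hz, n_samples, amplitude=10_000):
    """Return integer PCM samples of a pure sine tone (a stand-in for speech)."""
    return [
        round(amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(n_samples)
    ]


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 Hz up to half the sample rate.

    Plain discrete Fourier transform; an FFT computes the same values faster.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(total))
    return mags


def dominant_frequency(samples):
    """Frequency in Hz of the strongest component, ignoring the 0 Hz bin."""
    mags = dft_magnitudes(samples)
    peak_bin = max(range(1, len(mags)), key=lambda k: mags[k])
    return peak_bin * SAMPLE_RATE / len(samples)


if __name__ == "__main__":
    frame = make_tone(500, 160)  # 160 samples is 1/100 s at 16,000 per second
    print(dominant_frequency(frame))
```

With 160 samples per frame, each frequency bin is 100 Hz wide, so this sketch can only tell apart tones that differ by about that much.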
Transform PCM digital audio
using Fast Fourier Transform
The fast Fourier transform analyzes every 1/100th of a second
of audio and converts it into frequency data.

Each 1/100th of a second produces an amplitude graph.
The program keeps a database of such graphs called a “codebook”.
Each sound is matched to the most similar entry in the codebook.
The sound is given a number that describes it, called the “feature
number”.
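Matching a sound to the codebook can be pictured as a nearest-neighbor lookup. The sketch below is a toy under stated assumptions: the four-entry codebook and its values are invented for illustration (real codebooks are built from recorded speech and are much larger), each "graph" is reduced to four amplitude values, and the function names are hypothetical.

```python
import math

# Hypothetical codebook: each entry is an amplitude profile over four
# frequency bands. The position in the list serves as the feature number.
CODEBOOK = [
    [0.9, 0.1, 0.0, 0.0],  # 0: energy mostly in the lowest band
    [0.1, 0.8, 0.1, 0.0],  # 1
    [0.0, 0.1, 0.8, 0.1],  # 2
    [0.0, 0.0, 0.1, 0.9],  # 3: energy mostly in the highest band
]


def feature_number(spectrum, codebook=CODEBOOK):
    """Return the index of the codebook entry closest to `spectrum`."""
    return min(
        range(len(codebook)),
        key=lambda i: math.dist(spectrum, codebook[i]),
    )


def quantize(frames, codebook=CODEBOOK):
    """Turn a sequence of per-frame spectra into a sequence of feature numbers."""
    return [feature_number(frame, codebook) for frame in frames]


if __name__ == "__main__":
    frames = [
        [0.8, 0.2, 0.0, 0.0],
        [0.05, 0.1, 0.7, 0.15],
        [0.0, 0.0, 0.2, 0.8],
    ]
    print(quantize(frames))
```

Closeness here is plain Euclidean distance, chosen for simplicity; the slides do not say which similarity measure a real program uses.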
Two Categories

Small vocabulary / many users:
• Leaves room for speech disparity (e.g., accents)
• Limited, preset number of commands that can be used

Large vocabulary / limited users:
• Best for business settings
• System is trained to work with a small number of users
• Accuracy rate will increase as it learns its users
Discrete vs. Continuous Speech
Discrete
• Easier for program to understand
• Noticeable pause after each word
Continuous
• Allows speaking at conversational speed
• Used in most modern systems
Programs now can recognize accents and pronunciations better. In
earlier programs, accents, pronunciations, speed, and background noise
were all variables that made sounds difficult for programs to understand.
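One way to see why discrete speech is easier for a program: if every word is followed by a noticeable pause, word boundaries can be found just by looking for quiet stretches. The sketch below assumes the audio has already been reduced to one loudness value per frame; the threshold, the minimum pause length, and the function name are illustrative choices, not values from any real system.

```python
def split_on_pauses(frame_energies, threshold=0.1, min_pause_frames=3):
    """Return (start, end) frame ranges for words separated by pauses.

    A frame is "loud" when its energy is at least `threshold`. A word ends
    once `min_pause_frames` quiet frames in a row follow it; shorter quiet
    gaps are treated as part of the word. `end` is exclusive.
    """
    words = []
    start = None  # index of the first frame of the current word, if any
    quiet = 0     # number of consecutive quiet frames seen inside a word
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause_frames:
                words.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Audio ended mid-word: drop any trailing quiet frames.
        words.append((start, len(frame_energies) - quiet))
    return words


if __name__ == "__main__":
    energies = [0, 0, .5, .6, .4, 0, 0, 0, 0, .7, .8, 0, .6, 0, 0]
    print(split_on_pauses(energies))
```

In continuous speech the long quiet runs mostly disappear, so a rule like this finds few or no boundaries, which is one reason continuous recognition is the harder problem.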
Using Talk – Text to Voice

This app lets you type text and then have the device read it aloud.
In this case, instead of saying Taryne as “Ta-rin”, the device
pronounced it as “Ta-reen”. This shows that speech programs still
need work, here on which syllable to emphasize. The codebook did
not have Taryne in it, so the app was unable to pronounce her name
correctly.
The Future of Assistive Technology
in Schools
Students who need assistance in their writing skills because
they have stronger oral skills.
Students who were absent for a class, have poor memory, or
need assistance hearing the lesson.
Students who need assistance during Guided Reading.

Students who are English Language Learners.

Students with visual/hearing impairments and learning
disabilities regarding reading/spelling/writing.
