SPEECH RECOGNITION 
Made by: Fathi Tarek
Email: ftarek@fcih1.com
History of speech 
recognition: 
• 1950s and 1960s: Baby Talk
• The first speech recognition systems could understand only digits.
(Given the complexity of human language, it makes sense that
inventors and engineers first focused on numbers.) In 1952, Bell
Laboratories designed the "Audrey" system, which
recognized digits spoken by a single voice. Ten years later, IBM
demonstrated its "Shoebox" machine at the 1962 World's Fair;
it could understand 16 words spoken in English.
• Labs in the United States, Japan, England, and the Soviet Union
developed other hardware dedicated to recognizing spoken
sounds, expanding speech recognition technology to support four
vowels and nine consonants.
• They may not sound like much, but these first efforts were an
impressive start, especially when you consider how primitive
computers themselves were at the time.
1970S: SPEECH RECOGNITION 
TAKES OFF 
• Speech recognition technology made major strides in the 1970s, thanks to
interest and funding from the U.S. Department of Defense. The DoD's DARPA
Speech Understanding Research (SUR) program, from 1971 to 1976, was one of
the largest of its kind in the history of speech recognition, and among other
things it was responsible for Carnegie Mellon's "Harpy" speech-understanding
system.
• Harpy could understand 1,011 words, approximately the vocabulary of an
average three-year-old. Harpy was significant because it introduced a more
efficient search approach, called beam search, to "prove the finite-state network
of possible sentences," according to Readings in Speech Recognition by Alex
Waibel and Kai-Fu Lee. (The story of speech recognition is very much tied to
advances in search methodology and technology, as Google's entrance into
speech recognition on mobile devices proved just a few years ago.)
• The '70s also marked a few other important milestones in speech recognition
technology, including the founding of the first commercial speech recognition
company, Threshold Technology, as well as Bell Laboratories' introduction of a
system that could interpret multiple people's voices.
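The beam search idea credited to Harpy above can be illustrated with a small sketch. This is not Harpy's actual implementation: the network, the words, and the probabilities below are invented for illustration, and a real recognizer would score each step against the audio rather than use fixed numbers. The point is only that, at each step, all but the best few partial sentences are discarded instead of exploring every path.

```python
import math

# Hypothetical finite-state network of possible sentences (an assumption for
# illustration): each state maps to a list of (next_state, word, probability).
NETWORK = {
    "START": [("S1", "show", 0.6), ("S1", "list", 0.4)],
    "S1": [("S2", "all", 0.7), ("S2", "new", 0.3)],
    "S2": [("END", "files", 0.8), ("END", "papers", 0.2)],
}


def beam_search(network, start="START", end="END", beam_width=2):
    """Search the network, keeping only the beam_width best partial paths per step."""
    beam = [(0.0, start, [])]  # (log probability, current state, words so far)
    finished = []
    while beam:
        # Extend every surviving partial path by one word.
        candidates = []
        for logp, state, words in beam:
            for nxt, word, prob in network.get(state, []):
                candidates.append((logp + math.log(prob), nxt, words + [word]))
        # Prune: only the best beam_width candidates survive this step.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = []
        for cand in candidates[:beam_width]:
            if cand[1] == end:
                finished.append(cand)
            else:
                beam.append(cand)
    finished.sort(key=lambda c: c[0], reverse=True)
    return [(words, math.exp(logp)) for logp, _, words in finished]


best_words, best_prob = beam_search(NETWORK, beam_width=2)[0]
print(" ".join(best_words), round(best_prob, 3))
```

A narrower beam is faster but can prune the path that would have turned out best later; a wider beam approaches exhaustive search.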
1980S: SPEECH RECOGNITION 
TURNS TOWARD PREDICTION 
Over the next decade, thanks to new approaches to understanding what 
people say, speech recognition vocabulary jumped from about a few hundred 
words to several thousand words, and had the potential to recognize an 
unlimited number of words. One major reason was a new statistical method 
known as the hidden Markov model. 
Rather than simply using templates for words and looking for sound patterns, 
HMM considered the probability of unknown sounds' being words. This 
foundation would be in place for the next two decades (see Automatic Speech 
Recognition—A Brief History of the Technology Development by B.H. Juang 
and Lawrence R. Rabiner). 
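As a rough sketch of what "the probability of unknown sounds being words" means, the example below scores a sequence of sound symbols against one small hidden Markov model per word and picks the word whose model makes the sounds most likely. Everything here is an assumption made for illustration: the two word models, their states, the sound symbols, and all probabilities are invented, and real systems work on continuous acoustic features rather than letters.

```python
def forward_likelihood(obs, start, trans, emit):
    """Forward algorithm: probability that the model produces the observed sequence."""
    states = list(start)
    # Probability of starting in each state and emitting the first symbol.
    alpha = {s: start[s] * emit[s].get(obs[0], 0.0) for s in states}
    for symbol in obs[1:]:
        # Sum over every way of reaching state s, then emit the next symbol.
        alpha = {
            s: sum(alpha[p] * trans[p].get(s, 0.0) for p in states)
            * emit[s].get(symbol, 0.0)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical left-to-right word models: (start, transition, emission) tables.
WORD_MODELS = {
    "yes": (
        {"y": 1.0, "eh": 0.0, "s": 0.0},
        {"y": {"y": 0.5, "eh": 0.5}, "eh": {"eh": 0.5, "s": 0.5}, "s": {"s": 1.0}},
        {
            "y": {"Y": 0.8, "N": 0.1, "E": 0.1},
            "eh": {"E": 0.8, "O": 0.2},
            "s": {"S": 0.9, "E": 0.1},
        },
    ),
    "no": (
        {"n": 1.0, "ow": 0.0},
        {"n": {"n": 0.5, "ow": 0.5}, "ow": {"ow": 1.0}},
        {"n": {"N": 0.8, "Y": 0.2}, "ow": {"O": 0.8, "E": 0.2}},
    ),
}


def recognize(obs):
    """Return the word whose model assigns the observed sounds the highest probability."""
    return max(WORD_MODELS, key=lambda w: forward_likelihood(obs, *WORD_MODELS[w]))


print(recognize(["Y", "E", "S"]))
print(recognize(["N", "O"]))
```

Unlike template matching, a noisy or unexpected sound does not cause an outright mismatch; it only lowers the probability of the affected word models.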
Equipped with this expanded vocabulary, speech recognition started to work 
its way into commercial applications for business and specialized industry (for 
instance, medical use). It even entered the home, in the form of Worlds of
Wonder's Julie doll (1987), which children could train to respond to their voice.
("Finally, the doll that understands you.")
In 1990, Dragon launched the first consumer speech recognition
product, Dragon Dictate, for an incredible price of $9000. Seven years
later, the much-improved Dragon NaturallySpeaking arrived. The
application recognized continuous speech, so you could speak, well,
naturally, at about 100 words per minute. However, you had to train the
program for 45 minutes, and it was still expensive at $695.
The advent of the first voice portal, VAL from BellSouth, was in
1996; VAL was a dial-in interactive voice recognition system that
was supposed to give you information based on what you said on the
phone. VAL paved the way for all the inaccurate voice-activated menus
that would plague callers for the next 15 years and beyond.
2000s: Speech Recognition Plateaus, Until Google Comes Along
By 2001, computer speech recognition had topped out at 80 percent accuracy,
and, near the end of the decade, the technology’s progress seemed to
be stalled. Recognition systems did well when the language universe
was limited, but they were still “guessing,” with the assistance of
statistical models, among similar-sounding words, and the known
language universe continued to grow as the Internet grew.
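To make the "guessing among similar-sounding words" concrete, here is a minimal sketch of one common kind of statistical model, a bigram language model with add-one smoothing. The tiny training corpus and the two candidate transcriptions are invented for illustration; a real system would combine such a score with an acoustic score and train on far more text.

```python
import math
from collections import Counter

# Hypothetical training text (an assumption for illustration only).
CORPUS = [
    "it is hard to recognize speech",
    "machines recognize speech with statistical models",
    "we walked along a nice beach",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs, with a sentence-start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_score(sentence, unigrams, bigrams):
    """Log probability of a sentence under the bigram model (add-one smoothing)."""
    vocab = len(unigrams)
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab))
    return total


unigrams, bigrams = train_bigrams(CORPUS)
candidates = ["recognize speech", "wreck a nice beach"]  # similar-sounding guesses
best = max(candidates, key=lambda c: log_score(c, unigrams, bigrams))
print(best)
```

The larger the vocabulary, the more candidate word sequences compete, which is one reason accuracy is easier to achieve when the language universe is limited.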
Did you know speech recognition and voice commands were built
into Windows Vista and Mac OS X? Many computer users weren’t aware
that those features existed. Windows Speech Recognition and OS X’s
voice commands were interesting, but not as accurate or as easy to use
as a plain old keyboard and mouse.
In 2010, Google added “personalized recognition” to Voice Search
on Android phones, so that the software could record users’ voice searches
and produce a more accurate speech model. The company also added
Voice Search to its Chrome browser in mid-2011. Remember how we
started with 10 to 100 words, and then graduated to a few thousand?
Google’s English Voice Search system now incorporates 230 billion words
from actual user queries.
And now along comes Siri. Like Google’s Voice Search, Siri relies on cloud-based
processing. It draws on what it knows about you to generate
a contextual reply, and it responds to your voice input with personality. (As
my PCWorld colleague David Daw points out: “It’s not just fun but funny.
When you ask Siri the meaning of life, it tells you ‘42’ or ‘All evidence to
date points to chocolate.’ If you tell it you want to hide a body, it helpfully
volunteers nearby dumps and metal foundries.”)
Speech recognition has gone from utility to entertainment. The
child seems all grown up.
THE FUTURE 
Accurate, Ubiquitous Speech 
The explosion of voice recognition apps indicates that
speech recognition’s time has come, and that you can expect plenty
more apps in the future. These apps will not only let you control your PC
by voice or convert voice to text; they’ll also support multiple languages,
offer assorted speaker voices for you to choose from, and integrate into
every part of your mobile devices (that is, they’ll overcome Siri’s
shortcomings).
The quality of speech recognition apps will improve, too. For
instance, Sensory’s TrulyHandsfree Voice Control can hear and
understand you, even in noisy environments.
WHAT IS SPEECH RECOGNITION?
Speech recognition is the ability of a machine or program to identify 
words and phrases in spoken language and convert them to a machine-readable 
format. 
Another definition 
Speech recognition is an alternative to typing on a keyboard. Put
simply, you talk to the computer or mobile device and your words appear on the
screen. The software has been developed to provide a fast method of
writing on a computer and can help people with a variety of disabilities. 
It is useful for people with physical disabilities who often find typing 
difficult, painful or impossible. Voice-recognition software can also help 
those with spelling difficulties, including users with dyslexia, because 
recognized words are almost always correctly spelled.
However, speech is more than sequences of phones that form words and
sentences. Some aspects of speech carry additional information: for example,
the prosody of the speech indicates grammatical structure, and the stress of a
word signals its importance or topicality. This information is sometimes called
the paralinguistic content of speech.
• Advantages
  • Speech is a very natural way to interact, and it is
not necessary to sit at a keyboard or work with a
remote control.
  • No training required for users!
• Disadvantages
  • Even the best speech recognition systems
sometimes make errors. If there is noise or some
other sound in the room (e.g. the television or a
kettle boiling), the number of errors will increase.
  • Speech recognition works best if the
microphone is close to the user (e.g. in a phone,
or if the user is wearing a microphone). More
distant microphones (e.g. on a table or wall) will
tend to increase the number of errors.
Voice recognition software 
• Voice-recognition software programs work by analyzing
sounds and converting them to text. They also use
knowledge of how English is usually spoken to decide
what the speaker most probably said. Once correctly set
up, the systems should recognize around 95% of what is
said if you speak clearly.
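One way to read a figure like "around 95%" is as word accuracy: the share of spoken words that survive into the transcript without being substituted, dropped, or padded with extra words. The slides do not say how the figure was measured, so the sketch below is an assumption; it uses the word-level edit distance that is commonly used for this purpose, and the two sentences are invented.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions, and insertions."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # word missing from transcript
                cur[j - 1] + 1,                        # extra word in transcript
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """1 minus the word error rate, measured against what was actually said."""
    n = len(reference.split())
    if n == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n


said = ("please send the report to the whole team before the meeting "
        "starts on friday morning at nine in the office")
heard = ("please send the report to the whole team before the meeting "
         "starts on friday morning at wine in the office")
print(f"{word_accuracy(said, heard):.0%}")
```

Here one wrong word out of twenty gives 95% accuracy, which still means a correction is needed roughly once per sentence.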
Voice recognition in 
operating systems 
• Mobile Devices / Smartphones
Many cell phone handsets have basic dial-by-voice
features built in. Smartphones such as the
iPhone or BlackBerry also support this. A
number of third-party apps have implemented
natural language speech recognition support,
including:
• Smartphones and mobile devices are in the middle
of major innovations in technology to provide
hands-free access to features and navigation, often
called voice commands, voice-enabled, voice
actions or speech recognition. This technology has
major implications as assistive technology for people who have
disabilities. As long as a user
has a strong, clear voice, these devices become
easier to use and give increased access to the
Internet, mobile devices and communication.
• Windows 7 built-in speech recognition
• Windows Speech Recognition by Microsoft is the speech recognition
system that comes built into Windows Vista and Windows 7. Windows
Vista and Windows 7 include version 8.0 of the Microsoft speech recognition
engine. Speech Recognition is available only in English, French, Spanish,
German, Japanese, Simplified Chinese, and Traditional Chinese and only in
the corresponding version of Windows. That means that you cannot use the
French speech recognition engine if you use an English version of Windows.
• Windows XP or 2000 only
• e-Speaking – software for Windows XP that facilitates use of
the Microsoft Speech API by adding the ability to create commands to perform
custom actions.
• Microsoft Speech API – Speech recognition functionality included as part of
Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC
Edition. It can also be downloaded as part of the Speech SDK 5.1 for
Windows applications, but since that is aimed at developers building speech
applications, the pure SDK form lacks any user interface, and thus is
unsuitable for end users.
• Vestec Inc. – Specializing in Natural Language Understanding and Speech
Recognition solutions. ASR, NLU and TTS engines support 17 languages in
server, embedded (on low-cost chip) or cloud-based environments.
Macintosh
Types of speech recognition 
1. Text-To-Speech: 
As the name suggests, Text-To-Speech (or TTS)
converts a string of text into an audio clip.
It helps blind people use
computers but can also be used simply to
improve the computer experience. There are
several programs available that perform TTS, 
some of which are command-line based 
(ideal for scripting) and others which provide 
a handy GUI.
2. Simple Voice Control/Commands: 
This is the most basic form of Speech-To-Text 
application. These are designed to recognize 
a small number of specific, typically one-word 
commands and then perform an action. This 
is often used as an alternative to an
application launcher, allowing the user, for
instance, to say the word “firefox” and have
the OS open a new browser window.
3. Full dictation/recognition:
Full dictation/recognition software allows the
user to speak full sentences or paragraphs and
translates that speech into text on the fly. This
could be used, for instance, to dictate an
entire letter into the window of an email
client. In some cases, these types of
applications need to be trained to your voice
and can improve in accuracy the more they
are used.
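The simple voice control described under type 2 comes down to a lookup from a small set of recognized words to actions. The sketch below assumes the recognizer has already turned the audio into a text string; the command names and the actions are hypothetical, and the actions only return a description instead of really launching a program so that the example runs anywhere.

```python
# Hypothetical command table: one recognized word maps to one action.
COMMANDS = {
    "firefox": lambda: "open a new browser window",
    "mail": lambda: "open the email client",
}


def handle_command(recognized_text, commands=COMMANDS):
    """Run the action for a recognized one-word command, or return None if unknown."""
    word = recognized_text.strip().lower()
    action = commands.get(word)
    if action is None:
        return None  # not in the small command vocabulary, so ignore it
    return action()


print(handle_command("Firefox"))
print(handle_command("hello"))
```

Because the vocabulary is a handful of fixed words, this kind of system needs far less from the recognizer than full dictation, which has to handle arbitrary sentences.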
Thank you

muralinath2
 

Recently uploaded (20)

Seminar of U.V. Spectroscopy by SAMIR PANDA
 Seminar of U.V. Spectroscopy by SAMIR PANDA Seminar of U.V. Spectroscopy by SAMIR PANDA
Seminar of U.V. Spectroscopy by SAMIR PANDA
 
ESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptxESR_factors_affect-clinic significance-Pathysiology.pptx
ESR_factors_affect-clinic significance-Pathysiology.pptx
 
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
Astronomy Update- Curiosity’s exploration of Mars _ Local Briefs _ leadertele...
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
NuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final versionNuGOweek 2024 Ghent - programme - final version
NuGOweek 2024 Ghent - programme - final version
 
Lab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerinLab report on liquid viscosity of glycerin
Lab report on liquid viscosity of glycerin
 
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
Circulatory system_ Laplace law. Ohms law.reynaults law,baro-chemo-receptors-...
 
filosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
Comparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebratesComparative structure of adrenal gland in vertebrates
Comparative structure of adrenal gland in vertebrates
 
extra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdfextra-chromosomal-inheritance[1].pptx.pdfpdf
extra-chromosomal-inheritance[1].pptx.pdfpdf
 
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
 
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of LipidsGBSN - Biochemistry (Unit 5) Chemistry of Lipids
GBSN - Biochemistry (Unit 5) Chemistry of Lipids
 
EY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptxEY - Supply Chain Services 2018_template.pptx
EY - Supply Chain Services 2018_template.pptx
 
Large scale production of streptomycin.pptx
Large scale production of streptomycin.pptxLarge scale production of streptomycin.pptx
Large scale production of streptomycin.pptx
 
Structures and textures of metamorphic rocks
Structures and textures of metamorphic rocksStructures and textures of metamorphic rocks
Structures and textures of metamorphic rocks
 
Viksit bharat till 2047 India@2047.pptx
Viksit bharat till 2047  India@2047.pptxViksit bharat till 2047  India@2047.pptx
Viksit bharat till 2047 India@2047.pptx
 
Cancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate PathwayCancer cell metabolism: special Reference to Lactate Pathway
Cancer cell metabolism: special Reference to Lactate Pathway
 
erythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptxerythropoiesis-I_mechanism& clinical significance.pptx
erythropoiesis-I_mechanism& clinical significance.pptx
 

Speech Recognition

  • 1. SPEECH RECOGNITION Made by: Fathi Tarek Email: ftarek@fcih1.com
  • 2. History of speech recognition: 1950s and 1960s: Baby Talk. The first speech recognition systems could understand only digits. (Given the complexity of human language, it makes sense that inventors and engineers first focused on numbers.) In 1952, Bell Laboratories designed the "Audrey" system, which recognized digits spoken by a single voice. Ten years later, at the 1962 World's Fair, IBM demonstrated its "Shoebox" machine, which could understand 16 words spoken in English. Labs in the United States, Japan, England, and the Soviet Union developed other hardware dedicated to recognizing spoken sounds, expanding speech recognition technology to support four vowels and nine consonants. They may not sound like much, but these first efforts were an impressive start, especially when you consider how primitive computers themselves were at the time.
  • 3. 1970s: SPEECH RECOGNITION TAKES OFF Speech recognition technology made major strides in the 1970s, thanks to interest and funding from the U.S. Department of Defense. The DoD's DARPA Speech Understanding Research (SUR) program, which ran from 1971 to 1976, was one of the largest of its kind in the history of speech recognition, and among other things it was responsible for Carnegie Mellon's "Harpy" speech-understanding system. Harpy could understand 1011 words, approximately the vocabulary of an average three-year-old. Harpy was significant because it introduced a more efficient search approach, called beam search, to "prove the finite-state network of possible sentences," according to Readings in Speech Recognition by Alex Waibel and Kai-Fu Lee. (The story of speech recognition is very much tied to advances in search methodology and technology, as Google's entrance into speech recognition on mobile devices proved just a few years ago.) The '70s also marked a few other important milestones in speech recognition technology, including the founding of the first commercial speech recognition company, Threshold Technology, as well as Bell Laboratories' introduction of a system that could interpret multiple people's voices.
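The beam search idea mentioned on slide 3 can be illustrated with a minimal sketch. This is not Harpy's implementation: the word network, its probabilities, and the names `NETWORK`, `beam_search`, `successors`, and `is_final` are all hypothetical, chosen only to show how keeping a fixed number of best partial sentences at each step limits the search.

```python
import math

# Hypothetical finite-state network: each word maps to (next_word, probability).
NETWORK = {
    "<s>": [("call", 0.6), ("dial", 0.4)],
    "call": [("home", 0.7), ("work", 0.3)],
    "dial": [("nine", 0.9), ("home", 0.1)],
    "home": [("</s>", 1.0)],
    "work": [("</s>", 1.0)],
    "nine": [("</s>", 1.0)],
}


def beam_search(start, successors, is_final, beam_width=2, max_steps=10):
    """Return the best (log_prob, path) found, keeping beam_width paths per step."""
    beam = [(0.0, [start])]  # each entry: (log-probability, path so far)
    finished = []
    for _ in range(max_steps):
        # Extend every surviving path by one word.
        candidates = []
        for score, path in beam:
            for nxt, prob in successors(path[-1]):
                candidates.append((score + math.log(prob), path + [nxt]))
        if not candidates:
            break
        # Prune: only the beam_width most probable extensions survive.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = []
        for cand in candidates[:beam_width]:
            (finished if is_final(cand[1][-1]) else beam).append(cand)
        if not beam:
            break
    return max(finished, key=lambda c: c[0]) if finished else None


best = beam_search(
    "<s>",
    successors=lambda word: NETWORK.get(word, []),
    is_final=lambda word: word == "</s>",
)
print(best[1], round(math.exp(best[0]), 2))
```

With a beam width of 2, the less likely continuations ("call work", "dial home") are dropped after the second step, so fewer paths are scored than in an exhaustive search. The trade-off is that a pruned path can never be recovered, even if it would have ended up best.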
  • 4. 1980s: SPEECH RECOGNITION TURNS TOWARD PREDICTION Over the next decade, thanks to new approaches to understanding what people say, speech recognition vocabulary jumped from a few hundred words to several thousand words, and had the potential to recognize an unlimited number of words. One major reason was a new statistical method known as the hidden Markov model (HMM). Rather than simply using templates for words and looking for sound patterns, the HMM considered the probability of unknown sounds being words. This foundation would be in place for the next two decades (see Automatic Speech Recognition – A Brief History of the Technology Development by B.H. Juang and Lawrence R. Rabiner). Equipped with this expanded vocabulary, speech recognition started to work its way into commercial applications for business and specialized industry (for instance, medical use). It even entered the home, in the form of Worlds of Wonder's Julie doll (1987), which children could train to respond to their voice. ("Finally, the doll that understands you.")
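To make the HMM idea on slide 4 concrete, here is a minimal sketch of the forward algorithm, which computes how probable a sequence of sounds is under a given word model. Everything in it is an assumption made for illustration: the two toy word models, their states `s1` and `s2`, the probabilities, and the quantised sound symbols "a" and "b". Real recognizers use far larger models trained on recorded speech.

```python
def forward_likelihood(observations, start, trans, emit):
    """P(observations | model), computed with the forward algorithm."""
    states = list(start)
    # Initialise with the first observation.
    alpha = {s: start[s] * emit[s].get(observations[0], 0.0) for s in states}
    # Fold in each later observation, summing over all predecessor states.
    for obs in observations[1:]:
        alpha = {
            s: emit[s].get(obs, 0.0)
            * sum(alpha[p] * trans[p].get(s, 0.0) for p in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical left-to-right topology shared by both toy word models.
START = {"s1": 1.0, "s2": 0.0}
TRANS = {"s1": {"s1": 0.5, "s2": 0.5}, "s2": {"s2": 1.0}}

# Hypothetical emission probabilities for two word models.
WORD_MODELS = {
    "yes": {"s1": {"a": 0.9, "b": 0.1}, "s2": {"a": 0.2, "b": 0.8}},
    "no": {"s1": {"a": 0.1, "b": 0.9}, "s2": {"a": 0.8, "b": 0.2}},
}

sounds = ["a", "a", "b"]  # an unknown utterance, already quantised
scores = {
    word: forward_likelihood(sounds, START, TRANS, emit)
    for word, emit in WORD_MODELS.items()
}
best_word = max(scores, key=scores.get)
print(best_word, scores)
```

The recognizer picks the word whose model assigns the highest probability to the observed sounds, rather than requiring an exact match against a stored template.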
  • 5. In 1990, Dragon launched the first consumer speech recognition product, Dragon Dictate, for an incredible price of $9,000. Seven years later, the much-improved Dragon NaturallySpeaking arrived. The application recognized continuous speech, so you could speak, well, naturally, at about 100 words per minute. However, you had to train the program for 45 minutes, and it was still expensive at $695. The advent of the first voice portal, VAL from BellSouth, was in 1996; VAL was a dial-in interactive voice recognition system that was supposed to give you information based on what you said on the phone. VAL paved the way for all the inaccurate voice-activated menus that would plague callers for the next 15 years and beyond.
  • 6. 2000s: Speech Recognition Plateaus–Until Google Comes Along By 2001, computer speech recognition had topped out at 80 percent accuracy, and, near the end of the decade, the technology’s progress seemed to be stalled. Recognition systems did well when the language universe was limited–but they were still “guessing,” with the assistance of statistical models, among similar-sounding words, and the known language universe continued to grow as the Internet grew. Did you know speech recognition and voice commands were built into Windows Vista and Mac OS X? Many computer users weren’t aware that those features existed. Windows Speech Recognition and OS X’s voice commands were interesting, but not as accurate or as easy to use as a plain old keyboard and mouse.
  • 7. In 2010, Google added “personalized recognition” to Voice Search on Android phones, so that the software could record users’ voice searches and produce a more accurate speech model. The company also added Voice Search to its Chrome browser in mid-2011. Remember how we started with 10 to 100 words, and then graduated to a few thousand? Google’s English Voice Search system now incorporates 230 billion words from actual user queries. And now along comes Siri. Like Google’s Voice Search, Siri relies on cloud-based processing. It draws on what it knows about you to generate a contextual reply, and it responds to your voice input with personality. (As my PCWorld colleague David Daw points out: “It’s not just fun but funny. When you ask Siri the meaning of life, it tells you ’42’ or ‘All evidence to date points to chocolate.’ If you tell it you want to hide a body, it helpfully volunteers nearby dumps and metal foundries.”) Speech recognition has gone from utility to entertainment. The child seems all grown up.
  • 8. THE FUTURE: Accurate, Ubiquitous Speech The explosion of voice recognition apps indicates that speech recognition’s time has come, and that you can expect plenty more apps in the future. These apps will not only let you control your PC by voice or convert voice to text–they’ll also support multiple languages, offer assorted speaker voices for you to choose from, and integrate into every part of your mobile devices (that is, they’ll overcome Siri’s shortcomings). The quality of speech recognition apps will improve, too. For instance, Sensory’s TrulyHandsfree Voice Control can hear and understand you, even in noisy environments.
  • 9. WHAT IS SPEECH RECOGNITION? Speech recognition is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. Another definition: speech recognition is an alternative to typing on a keyboard. Put simply, you talk to the computer or mobile device and your words appear on the screen. The software has been developed to provide a fast method of writing on a computer and can help people with a variety of disabilities. It is useful for people with physical disabilities who often find typing difficult, painful or impossible. Voice-recognition software can also help those with spelling difficulties, including users with dyslexia, because recognized words are almost always correctly spelled.
  • 10. However, speech is more than sequences of phones that form words and sentences. Other aspects of speech also carry information: for example, the prosody of speech indicates grammatical structure, and the stress on a word signals its importance or topicality. This information is sometimes called the paralinguistic content of speech.
  • 11. Advantages: Speech is a very natural way to interact, and it is not necessary to sit at a keyboard or work with a remote control. No training required for users! Disadvantages: Even the best speech recognition systems sometimes make errors. If there is noise or some other sound in the room (e.g. the television or a kettle boiling), the number of errors will increase. Speech recognition works best if the microphone is close to the user (e.g. in a phone, or if the user is wearing a microphone). More distant microphones (e.g. on a table or wall) will tend to increase the number of errors.
  • 12. Voice recognition software: Voice-recognition software programs work by analyzing sounds and converting them to text. They also use knowledge of how English is usually spoken to decide what the speaker most probably said. Once correctly set up, the systems should recognize around 95% of what is said if you speak clearly.
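A figure such as "around 95%" on slide 12 is usually checked by comparing the recognizer's output with a reference transcript. One common measure is the word error rate (WER): the number of word substitutions, deletions, and insertions divided by the number of words in the reference. The sketch below is one way to compute it; the function name and the sample sentences are made up for illustration, and the slides do not say how their 95% figure was measured.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Assumes the reference contains at least one word.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word was missed
                curr[j - 1] + 1,     # insertion: extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "please call my office today",
    "please call my orifice today",
)
print(f"WER = {wer:.0%}, word accuracy = {1 - wer:.0%}")
```

One wrong word out of five gives a WER of 20%, so a system at 95% word accuracy would make roughly one such error every twenty words.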
  • 13. Voice recognition in operating systems: Mobile Devices / Smartphones. Many cell phone handsets have basic dial-by-voice features built in. Smartphones such as the iPhone or BlackBerry also support this. A number of third-party apps have implemented natural language speech recognition support, including:
  • 14.
  • 15. Smartphones and mobile devices are in the middle of major innovations in technology to provide hands-free access to features and navigation, often called voice commands, voice-enabled, voice actions or speech recognition. This technology has major implications for use by people who have disabilities as assistive technology. As long as a user has a strong, clear voice, these devices become easier to use and give increased access to use of the Internet, use of mobile devices and communication accessibility.
  • 16. Windows 7 built-in speech recognition: Windows Speech Recognition by Microsoft is the speech recognition system that comes built into Windows Vista and Windows 7. Windows Vista and Windows 7 include version 8.0 of the Microsoft speech recognition engine. Speech Recognition is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding version of Windows. That means that you cannot use the French speech recognition engine if you use an English version of Windows. Windows XP or 2000 only: e-Speaking – software for Windows XP that facilitates use of the Microsoft Speech API by adding the ability to create commands to perform custom actions. Microsoft Speech API – speech recognition functionality included as part of Microsoft Office and on Tablet PCs running Microsoft Windows XP Tablet PC Edition. It can also be downloaded as part of the Speech SDK 5.1 for Windows applications, but since that is aimed at developers building speech applications, the pure SDK form lacks any user interface, and thus is unsuitable for end users. Vestec Inc. – specializing in Natural Language Understanding and Speech Recognition solutions. ASR, NLU and TTS engines support 17 languages in server, embedded (on low-cost chip) or cloud-based environments.
  • 18. Types of speech recognition 1. Text-To-Speech: As it sounds, Text-To-Speech (or TTS) converts a string of text into an audio clip. It helps blind people use computers, but it can also simply improve the computing experience. There are several programs available that perform TTS, some of which are command-line based (ideal for scripting) and others which provide a handy GUI.
  • 19. 2. Simple Voice Control/Commands: This is the most basic form of Speech-To-Text application. These applications are designed to recognize a small number of specific, typically one-word commands and then perform an action. This is often used as an alternative to an application launcher, allowing the user, for instance, to say the word “firefox” and have the OS open a new browser window.
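The command-style recognition described on slide 19 comes down to looking up a recognized word in a small table of actions. The sketch below assumes the audio has already been transcribed by some recognizer and only shows the lookup step; the `COMMANDS` table, the program names in it, and `match_command` are hypothetical.

```python
# Hypothetical table mapping a spoken word to the program it should start.
COMMANDS = {
    "firefox": ["firefox"],
    "terminal": ["xterm"],
    "editor": ["gedit"],
}


def match_command(recognized_text, commands=COMMANDS):
    """Return the program to launch for a one-word command, or None."""
    word = recognized_text.strip().lower()
    return commands.get(word)


print(match_command(" Firefox "))              # a known command
print(match_command("open the pod bay doors"))  # not in the table
```

A launcher built on this would pass a non-`None` result to something like `subprocess.Popen`. Because the vocabulary is tiny and fixed, anything outside the table can simply be ignored, which is why this kind of application needs far less accuracy than full dictation.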
  • 20. 3. Full dictation/recognition: Full dictation/recognition software allows the user to speak full sentences or paragraphs and converts that speech into text on the fly. This could be used, for instance, to dictate an entire letter into the window of an email client. In some cases, these types of applications need to be trained to your voice and can improve in accuracy the more they are used.