Athens University of Economics
Communicating with the PC
 Traditional ways
     Keyboard
     Mouse
     Printer
 Modern ways
     Touch
     Speech
     Movement
Speech
 Speech Synthesis
 Speech Recognition
Speech Synthesis
 Input: Text
 Output: Audio stream
Speech Recognition
 Input: Audio stream
 Output: Text
Used In
 Movies
 Automatic translation
 Learning foreign languages
 Mobiles
 Robotics
 Games
     Nintendo Wii
     Project Natal (Kinect)
What options do we have today?
 Acapela
 Java Speech API
 Dictaphones
 etc.
 Still a long way to go…
What we will see here
 Windows Speech API (SAPI) with .NET 4.0!
 System.Speech
Why SAPI?
 Free
 Quite accurate
 Easily programmable
History of SAPI
 1994: SAPI 1.0
     Windows 95 / Windows NT
 1998: SAPI 4.0
     C++ wrapper classes
     ActiveX for Visual Basic
 2006: SAPI 5.3
     Windows Vista
 2009: SAPI 5.4
     Windows 7
Changes in Windows Vista & 7
 Upgraded Speech Recognition engine
 Separate application with its own GUI
 Can control the user interface
 Supports more languages:
     English (US & UK), Chinese (Traditional & Simplified), Japanese, German, French, Spanish
 Managed code speech API (.NET 3.0)
What we use
Technologies
• .NET Framework 4.0
• C# programming language
• Windows Presentation Foundation
Tools
• Windows 7
• Visual Studio 2010
• FREE @ MSDNAA
Windows Speech Synthesis
 Converts text into speech
 Configurable settings such as:
     Volume
     Voice (pronunciation)
     Inclusion of WAV files
 By default, uses the Microsoft Anna voice
DEMO 1
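
The synthesis settings listed above can be tried out with a few lines of C#. This is only a minimal sketch, not the code from the demo: it assumes a Windows machine, a console project that references System.Speech.dll, and at least one installed voice. The spoken sentence and the chosen volume and rate values are made up for illustration.

```csharp
using System;
using System.Speech.Synthesis; // needs a project reference to System.Speech.dll (Windows only)

class SynthesisSketch
{
    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            // Print the voices installed on this machine (on Windows 7 this
            // typically includes Microsoft Anna).
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                Console.WriteLine(voice.VoiceInfo.Name);
            }

            // Illustrative values: Volume ranges from 0 to 100, Rate from -10 to 10.
            synth.Volume = 80;
            synth.Rate = 0;

            // Input: text. Output: audio on the default playback device.
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("Hello from the speech synthesizer.");
        }
    }
}
```

Speak blocks until the sentence has finished playing; SpeakAsync is the non-blocking counterpart and is usually the better fit for a WPF user interface.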
Windows Speech Recognition
 Uses machine learning algorithms
 Continuously trained
 Trains using the user's voice
 Can be used for remote control of the PC
DEMO 2
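
Voice control of the PC can be sketched with a small command grammar. Again, this is a hedged illustration rather than the demo's actual code: it assumes Windows, a reference to System.Speech.dll, an installed speech recognizer and a working microphone. The three command phrases are hypothetical.

```csharp
using System;
using System.Speech.Recognition; // needs a project reference to System.Speech.dll (Windows only)

class RecognitionSketch
{
    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        {
            // Hypothetical command phrases; replace with whatever the application needs.
            var commands = new Choices("open notepad", "close window", "stop listening");

            // The grammar culture must match the recognizer culture, so copy it over.
            var builder = new GrammarBuilder(commands);
            builder.Culture = recognizer.RecognizerInfo.Culture;
            recognizer.LoadGrammar(new Grammar(builder));

            // Input: audio stream. Output: recognized text plus a confidence score.
            recognizer.SpeechRecognized += (sender, e) =>
            {
                Console.WriteLine("Heard: {0} (confidence {1:F2})",
                                  e.Result.Text, e.Result.Confidence);
            };

            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple); // keep listening after each phrase

            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
            recognizer.RecognizeAsyncCancel();
        }
    }
}
```

For free-form text instead of fixed commands, a DictationGrammar can be loaded in place of the command grammar.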
Links
 Venus
 StudentGuru
 Exploring Speech Recognition & Synthesis
 Speech Recognition with C# - Dictation and custom grammar
Thank you
Vangos Pterneas
www.vangos.eu
www.vangos.eu/blog

Text to-speech & voice recognition