Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/1u1Jvi3.
Simon MacDonald explains how to use speech recognition effectively on mobile platforms, covering the W3C Web Speech API specification and its current implementation status. Filmed at qconnewyork.com.
Simon MacDonald has over 15 years of development experience and has worked on a variety of projects including object oriented databases, police communication systems, speech recognition and unified messaging. His current focus is building a secure version of Android, bringing speech recognition to the masses and enabling developers to create cross platform mobile apps with Web technologies.
2. Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/state-speech-recognition-mobile
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
3. Presented at QCon New York
www.qconnewyork.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
8. The future won't be like Star Trek.
Scott Adams, creator of Dilbert
13. Here's a conversation between two Cape
Bretoners
P1: jeet?
P2: naw, jew?
P1: naw, t'rly t'eet bye.
14. And here's the translation
P1: jeet?
P1: Did you eat?
P2: naw, jew?
P2: No, did you?
P1: naw, t'rly t'eet bye.
P1: No, it's too early to eat buddy.
33. Why is that?
When two people talk
comprehension rates are better
than 97%
34. A really good english language
speech recognition system is
right 92% of the time
35. Where does that extra 5% in
error rate come from?
Vocabulary size and confusability
Speaker dependence vs independence
Isolated or continuous speech
Initiated vs spontaneous speech
Adverse conditions
36. Mobile Speech Recognition
OS Application SDK
Android Google Now Java API
iOS Siri Many 3rd party Obj-C SDK's
Windows Phone Cortana C# API
46. But I want to do
something more
exciting with the
result
47. Let's do something a little less
trivial
recognition.onresult = function(event) {
var result = event.results[0][0].transcript;
var music = document.getElementById("music");
switch(result) {
case "jazz":
music.src="jazz.mp3";
music.play();
break;
case "rock":
music.src="rock.mp3";
music.play();
break;
case "stop":
default:
music.pause();
}
};
Click to Speak
63. Availability
OS Recognition Synthesis
Android ✓ ✓
iOS* Active development Native to iOS 7.0
Windows Phone × ×
* Working with Julio César (@jcesarmobile) to get iOS done
64. Getting started
cordova create speech com.example.speech speech
cd speech
cordova build android
cordova local plugin add https://github.com/macdonst/SpeechRecognitionPlugin
cordova local plugin add https://github.com/macdonst/SpeechSynthesisPlugin
cordova install android
65. For more information on hybrid
applications
Check out
Christophe
presentation on
Coenraets
Creating Native-Like Mobile
Apps with AngularJS, Ionic and
Cordova 3:00pm today right
here in Salon C.