PhoneGap Day US 2013 - Simon MacDonald: Speech Recognition

Learn how you can add speech recognition capabilities to your PhoneGap application by leveraging the Web Speech API plugin. Simon MacDonald will give a quick overview of the API and show how to make a quick question/answer app using Speech Rec.

Published in: Technology

Transcript of "PhoneGap Day US 2013 - Simon MacDonald: Speech Recognition"

  1. JavaScript Speech Recognition Applications and maybe some other HTML5 goodness
  2. Who is this guy? PhoneGap core contributor, nutty about speech recognition, IBM'er, @macdonst, macdonst on GitHub, simonmacdonald.com
  3. Why do I care about speech rec? Here's a conversation between two Cape Bretoners:
      P1: jeet?
      P2: naw, jew?
      P1: naw, t'rly t'eet bye.
  4. And here's the translation:
      P1: jeet? = Did you eat?
      P2: naw, jew? = No, did you?
      P1: naw, t'rly t'eet bye. = No, it's too early to eat buddy.
  5. What is speech recognition?
  6. Speech recognition is the process of translating
  7. the spoken word into text.
  8. The process of speech rec includes:
      1. Record and digitize the audio data
      2. Split data into phonemes
      3. Apply the phonemes to the recognition model
      4. Analyze the results against the grammar
      5. Return a confidence weighted result
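The last step on slide 8, returning a confidence weighted result, can be sketched in plain JavaScript. This is only an illustration of the idea, not how any recognizer works internally: `pickBestAlternative` is a hypothetical helper, and its input is assumed to be an array of objects carrying a `transcript` string and a `confidence` number, like the alternatives the Web Speech API hands back.

```javascript
// Hypothetical helper (not part of any spec or plugin): pick the
// alternative with the highest confidence score.
// Each alternative is assumed to look like
//   { transcript: "did you eat", confidence: 0.9 }
function pickBestAlternative(alternatives) {
  if (!alternatives || alternatives.length === 0) {
    return null; // nothing was recognized
  }
  var best = alternatives[0];
  for (var i = 1; i < alternatives.length; i++) {
    if (alternatives[i].confidence > best.confidence) {
      best = alternatives[i];
    }
  }
  return best;
}
```

Returning `null` for an empty list keeps the "nothing recognized" case explicit, so a caller can decide whether to ask the user to repeat themselves.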
  9. Basically...
  10. So how do we add speech rec to our app?
  11. You may look at the W3C Speech API Specification
  12. but only Chrome on the desktop has implemented that spec
  13. But that's okay!
  14. The spec looks like this:
          interface SpeechRecognition : EventTarget {
              // recognition parameters
              attribute SpeechGrammarList grammars;
              attribute DOMString lang;
              attribute boolean continuous;
              attribute boolean interimResults;
              attribute unsigned long maxAlternatives;
              attribute DOMString serviceURI;

              // methods to drive the speech interaction
              void start();
              void stop();
              void abort();
          };
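As a rough sketch of how the recognition parameters on slide 14 might be set before calling `start()`: the `configureRecognition` helper and its default values ("en-US", one alternative) are assumptions made for illustration, not something from the talk or the plugin.

```javascript
// Hypothetical helper: copy a few recognition parameters from the
// interface above onto a recognition object. The defaults used here
// are assumptions, not values required by the spec.
function configureRecognition(recognition, options) {
  options = options || {};
  recognition.lang = options.lang || "en-US";
  recognition.continuous = Boolean(options.continuous);
  recognition.interimResults = Boolean(options.interimResults);
  recognition.maxAlternatives = options.maxAlternatives || 1;
  return recognition;
}

// Possible usage where a SpeechRecognition constructor exists:
//   var recognition = configureRecognition(new SpeechRecognition(), {
//     lang: "en-CA",
//     maxAlternatives: 3
//   });
//   recognition.start();
```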
  15. With additional event handlers to control behaviour:
          attribute EventHandler onaudiostart;
          attribute EventHandler onsoundstart;
          attribute EventHandler onspeechstart;
          attribute EventHandler onspeechend;
          attribute EventHandler onsoundend;
          attribute EventHandler onaudioend;
          attribute EventHandler onresult;
          attribute EventHandler onnomatch;
          attribute EventHandler onerror;
          attribute EventHandler onstart;
          attribute EventHandler onend;
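One way to get a feel for the order in which the handlers on slide 15 fire is to log every one of them. The sketch below is a debugging aid under stated assumptions: `attachLifecycleLogging` and `RECOGNITION_EVENTS` are hypothetical names, and because it assigns every `on...` attribute it would replace any `onresult` handler set earlier.

```javascript
// Event names taken from the handler attributes listed above.
var RECOGNITION_EVENTS = [
  "audiostart", "soundstart", "speechstart", "speechend",
  "soundend", "audioend", "result", "nomatch", "error",
  "start", "end"
];

// Hypothetical helper: route every recognition event to one callback.
// Note: this overwrites handlers that were assigned before the call.
function attachLifecycleLogging(recognition, log) {
  RECOGNITION_EVENTS.forEach(function (name) {
    recognition["on" + name] = function (event) {
      log(name, event);
    };
  });
  return recognition;
}

// Possible usage:
//   attachLifecycleLogging(recognition, function (name) {
//     console.log("speech event: " + name);
//   });
```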
  16. Let's recognize some speech
      Click to Speak
      hello world
          var recognition = new SpeechRecognition();
          recognition.onresult = function(event) {
              // results[0][0] is the first alternative of the first result
              if (event.results.length > 0) {
                  var test1 = document.getElementById("test1");
                  test1.innerHTML = event.results[0][0].transcript;
              }
          };
          recognition.start();
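Slide 16 calls `new SpeechRecognition()` directly. Desktop Chrome exposes the constructor under the prefixed name `webkitSpeechRecognition`, so code shared between a browser and an app might look the constructor up first. A minimal sketch, where `getSpeechRecognition` is a hypothetical helper name:

```javascript
// Hypothetical helper: find a SpeechRecognition constructor on the
// given global object, falling back to Chrome's prefixed name.
// Returns null when speech recognition is not available.
function getSpeechRecognition(scope) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// Possible usage in a page or web view:
//   var Recognition = getSpeechRecognition(window);
//   if (Recognition) {
//     var recognition = new Recognition();
//     recognition.start();
//   }
```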
  17. So that's pretty cool...
  18. ...if taking dictation gets you going
  19. But I want to do something more exciting with the
  20. result
  21. Let's do something a little less trivial
      Click to Speak
          recognition.onresult = function(event) {
              var result = event.results[0][0].transcript;
              var music = document.getElementById("music");
              switch(result) {
                  case "jazz":
                      music.src="jazz.mp3";
                      music.play();
                      break;
                  case "rock":
                      music.src="rock.mp3";
                      music.play();
                      break;
                  case "stop":
                  default:
                      music.pause();
              }
          };
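The `switch` on slide 21 compares the raw transcript against exact strings, so a result such as " Jazz" would fall through to the default branch. A small sketch of normalizing the transcript before matching; `normalizeCommand` and `chooseTrack` are hypothetical names, and the track table simply reuses the file names from the slide.

```javascript
// Hypothetical helper: trim whitespace and lowercase the transcript
// so that " Jazz" and "jazz" are treated as the same command.
function normalizeCommand(transcript) {
  return String(transcript).trim().toLowerCase();
}

// Hypothetical helper: map a spoken command to an audio file, or
// null when the command is not a known track (for example "stop").
function chooseTrack(transcript) {
  var tracks = { jazz: "jazz.mp3", rock: "rock.mp3" };
  var command = normalizeCommand(transcript);
  return Object.prototype.hasOwnProperty.call(tracks, command)
    ? tracks[command]
    : null;
}

// Possible usage inside onresult:
//   var track = chooseTrack(event.results[0][0].transcript);
//   if (track) { music.src = track; music.play(); } else { music.pause(); }
```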
  22. Which seems much cooler to me
  23. Let's ask the web a question
      Click to Speak
      Q: what day is it today
      A: Friday July 19th, 2013
  24. Works pretty good... ...but ugly!
  25. Let's style our button with some CSS
  26. Markup and CSS for the button:
          <a class="speechinput">
              <img src="images/mic.png">
          </a>

          #speechinput input {
              cursor:pointer;
              margin:auto;
              margin:15px;
              color:transparent;
              background-color:transparent;
              border:5px;
              width:15px;
              -webkit-transform: scale(3.0, 3.0);
          }
  27. And we'll add some color using Pure-CSS-Speech-Bubbles (Speech Bubbles by Nicholas Gallagher)
  28. Then pull it all together!
  29. Q: what is steve jobs middle name
      A: Steven Paul Jobs
  30. But wait, why am I using my eyes like a sucker
  31. We'll output the answer using SpeechSynthesis
  32. The SpeechSynthesis spec looks like this:
          interface SpeechSynthesis {
              readonly attribute boolean pending;
              readonly attribute boolean speaking;
              readonly attribute boolean paused;

              void speak(SpeechSynthesisUtterance utterance);
              void cancel();
              void pause();
              void resume();
              SpeechSynthesisVoiceList getVoices();
          };
  33. The SpeechSynthesisUtterance spec looks like this:
          interface SpeechSynthesisUtterance : EventTarget {
              attribute DOMString text;
              attribute DOMString lang;
              attribute DOMString voiceURI;
              attribute float volume;
              attribute float rate;
              attribute float pitch;
          };
  34. With additional event handlers to control behaviour:
          attribute EventHandler onstart;
          attribute EventHandler onend;
          attribute EventHandler onerror;
          attribute EventHandler onpause;
          attribute EventHandler onresume;
          attribute EventHandler onmark;
          attribute EventHandler onboundary;
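Putting slides 32 to 34 together, speaking an answer might look roughly like the sketch below. It only uses attributes and methods shown in the interfaces above; `speakAnswer` is a hypothetical helper, the "en-US" language and neutral volume, rate, and pitch values are assumptions, and the synthesizer and utterance constructor are passed in so the same code could run against a browser or a plugin.

```javascript
// Hypothetical helper: build an utterance from the attributes listed
// above and hand it to a SpeechSynthesis-like object.
//   synth      - object with a speak(utterance) method
//   Utterance  - constructor for utterance objects
//   onDone     - optional callback fired by the utterance's onend
function speakAnswer(text, synth, Utterance, onDone) {
  var utterance = new Utterance();
  utterance.text = text;
  utterance.lang = "en-US"; // assumed language
  utterance.volume = 1.0;   // assumed neutral settings
  utterance.rate = 1.0;
  utterance.pitch = 1.0;
  if (typeof onDone === "function") {
    utterance.onend = onDone;
  }
  synth.speak(utterance);
  return utterance;
}

// Possible usage in a browser that implements the spec:
//   speakAnswer("Chicago Blackhawks", window.speechSynthesis,
//               window.SpeechSynthesisUtterance);
```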
  35. Q: who won the stanley cup this year
      A: Chicago Blackhawks
  36. Plugin repos:
      SpeechRecognitionPlugin - https://github.com/macdonst/SpeechRecognitionPlugin
      SpeechSynthesisPlugin - https://github.com/macdonst/SpeechSynthesisPlugin
  37. THE END
