Speech-Enabling Web Apps
 


An overview of the technology options for adding speech to web applications. It covers the HTML5 Speech Input API for speech recognition, the audio tag combined with third-party APIs for text-to-speech, and application possibilities for WebRTC.

Presented at the Atlanta Ruby Users Group meeting on November 13, 2013.

Usage Rights

© All Rights Reserved


Speech-Enabling Web Apps Presentation Transcript

  • 1. Speech-Enabling Web Apps
  • 5. CAN YOU SPEAK MAGIC? Ben Klang
  • 12. ADD SPEECH TO THE WEB
      • Speech Input API
      • Text-To-Speech (<Audio/>)
      • WebRTC
      http://bit.ly/HTML5_Speech_Input_API
      http://www.w3.org/TR/webrtc/
  • 16. SPEECH INPUT API
      <input type="text" x-webkit-speech />
  • 17. ANNYANG!
  • 19. DEMO
  • 27. SPEECH INPUT API CAVEATS
      • Chrome Only :(
      • Uses Google ASR (duh)
      • Partial Firefox implementation from GSoC
      • Requires ASR Server
      • Only Google runs one today
      • serviceURI attribute not yet implemented
      • Specification maturity seems slow
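Once a speech-enabled input hands the page a transcript, the application still has to map that text to an action, which is the job a voice-command library such as annyang takes on. The sketch below is not annyang's implementation or API. It is a simplified illustration in which the `VoiceCommand` shape, the `dispatch` function, and the trailing `*name` capture convention are all assumptions made for the example.

```typescript
// Hypothetical names throughout; this is NOT annyang's implementation.
type Handler = (captured: string) => string;

interface VoiceCommand {
  pattern: string; // e.g. "show me *term": a trailing "*name" captures the rest
  handler: Handler;
}

// Normalise text so matching ignores case, punctuation and repeated spaces.
function normalize(transcript: string): string {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Run the first command whose pattern matches; return null when none does.
function dispatch(transcript: string, commands: VoiceCommand[]): string | null {
  const spoken = normalize(transcript);
  for (const command of commands) {
    const starIndex = command.pattern.indexOf("*");
    if (starIndex === -1) {
      // No capture: the whole utterance must equal the pattern.
      if (spoken === normalize(command.pattern)) return command.handler("");
    } else {
      // Capture: the utterance must begin with the fixed words before "*".
      const fixedPart = normalize(command.pattern.slice(0, starIndex));
      if (spoken.indexOf(fixedPart + " ") === 0) {
        return command.handler(spoken.slice(fixedPart.length + 1));
      }
    }
  }
  return null;
}
```

Because recognizers return free-form text, normalising before matching keeps the commands tolerant of capitalisation and punctuation differences.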
  • 28. TEXT-TO-SPEECH
  • 29. TTS API + <AUDIO/>
  • 33. TTS API OPTIONS
      • AT&T: http://developer.att.com
      • Nuance NDEV: http://nuancemobiledeveloper.com/
      • Google: http://translate.google.com/translate_tts?tl=en&q=TEXT
  • 40. <AUDIO/> CAVEATS
      • You can’t pay for Google TTS
      • No specified Mandatory To Implement (MTI) codecs
      • Broad consensus
      • Everyone: MP3 (+containers H.264, MP4)
      • Everyone except IE: Ogg/Vorbis, Opus, WebM
      • http://bit.ly/Browser_Audio_Codecs
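With no mandatory codec, a page that plays synthesized speech has to ask the browser what it can decode and pick a matching file. The sketch below shows one way to do that with the standard `canPlayType` method of media elements, which returns `""`, `"maybe"`, or `"probably"`. The `CanPlay` and `AudioSource` interfaces and the `pickSource` function are hypothetical names for this example. Taking the player as a parameter keeps the logic usable outside a browser.

```typescript
// The one method of an audio element that this sketch relies on.
interface CanPlay {
  canPlayType(mimeType: string): string; // "", "maybe" or "probably"
}

interface AudioSource {
  url: string;
  mimeType: string;
}

// Prefer the first source the player rates "probably"; otherwise fall back
// to the first one rated "maybe"; return null when nothing is playable.
function pickSource(player: CanPlay, sources: AudioSource[]): AudioSource | null {
  let fallback: AudioSource | null = null;
  for (const source of sources) {
    const answer = player.canPlayType(source.mimeType);
    if (answer === "probably") return source;
    if (answer === "maybe" && fallback === null) fallback = source;
  }
  return fallback;
}
```

In a browser, the `player` argument could be the element returned by `document.createElement("audio")`, and listing an MP3 source alongside an Ogg source covers both groups named on the slide.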
  • 44. WHAT IS WEBRTC TO ME? Telephones in Web Browsers!
  • 45. How does WebRTC Work?
  • 56. [Diagram, built up across slides 46-56: Alice and Bob with an http:// server between them. Alice's request, "Get me Bob please!", carries an SDP offer (v=0, o=alice ..., m=audio ... RTP/SAVPF 109); an SDP from Bob (v=0, o=bob ..., m=audio ... RTP/SAVPF 109) comes back; SRTP media then flows between Alice and Bob.]
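The exchange in the diagram rests on each side reading the other's SDP to learn who is calling and how the media will be carried. As a rough illustration, the sketch below pulls the origin user and the media lines out of an SDP body. The `SAMPLE_OFFER` values are placeholders, since the slide's exact values are not legible, and `summarizeSdp` with its interfaces is a hypothetical helper, not part of any WebRTC API.

```typescript
interface MediaLine {
  kind: string;      // e.g. "audio"
  port: number;
  protocol: string;  // e.g. "RTP/SAVPF"
  formats: string[]; // payload type numbers, kept as strings
  secure: boolean;   // true for the SAVP family of profiles, which use SRTP
}

interface SdpSummary {
  version: string;
  originUser: string;
  media: MediaLine[];
}

// Hypothetical offer: placeholder values, not the ones shown on the slide.
const SAMPLE_OFFER = [
  "v=0",
  "o=alice 2005 1 IN IP4 0.0.0.0",
  "s=-",
  "t=0 0",
  "m=audio 5460 RTP/SAVPF 109",
].join("\r\n");

// Read the "v=", "o=" and "m=" lines; every other line is ignored.
function summarizeSdp(sdp: string): SdpSummary {
  const summary: SdpSummary = { version: "", originUser: "", media: [] };
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.length < 2 || line.charAt(1) !== "=") continue;
    const field = line.charAt(0);
    const value = line.slice(2);
    if (field === "v") {
      summary.version = value;
    } else if (field === "o") {
      summary.originUser = value.split(" ")[0] || "";
    } else if (field === "m") {
      const parts = value.split(" ");
      const protocol = parts[2] || "";
      summary.media.push({
        kind: parts[0] || "",
        port: parseInt(parts[1] || "", 10),
        protocol: protocol,
        formats: parts.slice(3),
        secure: protocol.indexOf("SAVP") !== -1,
      });
    }
  }
  return summary;
}
```

In a real application the browser generates and consumes the SDP itself, and the page only relays it between the two parties, so a parser like this is mainly useful for logging and debugging.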
  • 62. [Diagram, built up across slides 57-62: Alice sends "Get me Bob please!" with an SDP offer (v=0, o=alice ..., m=audio ... RTP/SAVPF 109); Bob is notified that Alice is calling, along with a second SDP (v=0, o=freeswitch ..., m=audio ... RTP/SAVPF 109); SRTP legs appear on both Alice's and Bob's sides.]
  • 64. Example RTC Apps: 2 Examples
  • 65. “Communicating isn’t going to be what you’re doing; it’s what you’ll be doing while you’re doing something else” - Geoff Hollingworth, Ericsson, Head of AT&T Foundry
  • 66. 1. Incident Response
  • 69. INCIDENT RESPONSE
      • Timely, Contextual Information
      • Adapt for mobile vs. desktop users
      • Group-based communication
      • Inherit from existing organizational groups
      • Allow ad-hoc participants (“guest” parties)
      • Federate with external services
      • Incident recording/logging
      • “Lessons learned” and process improvement
      • Links from/to issue tracking systems
  • 70. 2. Medical Records Management
  • 73. MEDICAL RECORDS MGMT
      • Automate Medical Claims
      • Secure Caller Authentication
      • Reuse primary auth via website
      • Verify with voice biometrics
      • Cross-check against caller location
      • Call recording/transcription
      • Medical advice given to patient automatically added to patient file
      • Auditing/Service Quality Assurance
  • 74. HTTPS://TALKY.IO/ATLRUG
  • 81. WEBRTC CAVEATS
      • Bleeding edge, developing standard
      • Only available on Chrome, Firefox
      • Only available on Desktop
      • Well funded/backed development
      • Expect to see it mainstream (Desktop + Mobile) as soon as 2014
      • http://iswebrtcreadyyet.com/
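Given that support is limited to certain browsers, a page normally checks for the peer connection constructor before offering any calling feature. The sketch below looks for the standard name and the vendor-prefixed names that early Chrome and Firefox builds exposed. `findPeerConnectionName` is a hypothetical helper, and the scope object is passed in as a parameter (an assumption made so the check can run outside a browser).

```typescript
// Standard constructor name first, then the vendor-prefixed forms used by
// early Chrome (webkit) and Firefox (moz) builds.
const PEER_CONNECTION_NAMES = [
  "RTCPeerConnection",
  "webkitRTCPeerConnection",
  "mozRTCPeerConnection",
];

// Return the first constructor name found on the given scope, or null when
// the scope exposes none of them (so the page can hide its call button).
function findPeerConnectionName(globalScope: { [key: string]: unknown }): string | null {
  for (const candidate of PEER_CONNECTION_NAMES) {
    if (typeof globalScope[candidate] === "function") return candidate;
  }
  return null;
}
```

In a browser the argument would be the `window` object, cast to the index-signature type as needed. A `null` result is the cue to fall back to a non-WebRTC experience.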
  • 83. adhearsionconf.com Early Bird Discount: atlrug
  • 84. @bklang, bklang@mojolingo.com
      http://bit.ly/HTML5_Speech_Input_API
      http://www.w3.org/TR/webrtc/
      http://iswebrtcreadyyet.com/
      http://mojolingo.com, @MojoLingo
      Early Bird Discount: atlrug