An overview of the technology options for adding speech to web applications. It covers the HTML5 Speech Input API for speech recognition, the Audio tag combined with third-party APIs for text-to-speech, and the application possibilities of WebRTC.
Presented at the Atlanta Ruby Users Group meeting on November 13, 2013.
11. CAN YOU SPEAK MAGIC?
ADD SPEECH TO THE WEB
•Speech Input API
•Text-To-Speech (<Audio/>)
•WebRTC
http://bit.ly/HTML5_Speech_Input_API
http://www.w3.org/TR/webrtc/
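The "Text-To-Speech (<Audio/>)" item pairs an audio tag with a third-party text-to-speech service. A minimal sketch of that idea follows. The endpoint `https://tts.example.com/speak` and the `text` and `lang` query parameters are hypothetical and stand in for whatever a real provider documents.

```typescript
// Builds the URL that an <audio> element would load to hear `text` spoken.
// The query parameter names `text` and `lang` are hypothetical.
function buildTtsUrl(endpoint: string, text: string, lang: string): string {
  const query =
    "text=" + encodeURIComponent(text) + "&lang=" + encodeURIComponent(lang);
  // Append to an existing query string if the endpoint already has one.
  const separator = endpoint.indexOf("?") === -1 ? "?" : "&";
  return endpoint + separator + query;
}

// Renders <audio> markup for the URL, escaping characters special to HTML.
function audioTagFor(url: string): string {
  const escaped = url
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  return '<audio src="' + escaped + '" autoplay></audio>';
}

// Hypothetical endpoint, used only for illustration.
const ttsUrl = buildTtsUrl("https://tts.example.com/speak", "Hello world", "en-US");
const ttsMarkup = audioTagFor(ttsUrl);
```

Keeping URL construction separate from markup generation lets the same URL be assigned to an existing audio element's `src` instead of rendering new markup.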
27. CAN YOU SPEAK MAGIC?
SPEECH INPUT API CAVEATS
•Chrome Only :(
•Uses Google ASR (duh)
•Partial Firefox implementation from GSoC
•Requires ASR Server
•Only Google runs one today
•serviceURI attribute not yet implemented
•Specification maturity seems slow
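Because support is limited to one browser, a page would normally check for the recognizer before using it. The sketch below assumes the JavaScript interface from the Web Speech API draft (the draft that defines `serviceURI`), exposed as `SpeechRecognition` or, in Chrome, as the prefixed `webkitSpeechRecognition`. The `SpeechHost` and `RecognizerLike` types and both helper functions are illustrative names, not part of any browser API.

```typescript
// Minimal shape of the recognizer used here; real browsers expose more.
interface RecognizerLike {
  lang: string;
  start(): void;
}

type RecognizerCtor = new () => RecognizerLike;

// Anything that might carry a recognizer constructor, such as `window`.
interface SpeechHost {
  SpeechRecognition?: RecognizerCtor;
  webkitSpeechRecognition?: RecognizerCtor;
}

// Returns the constructor if the host offers one, otherwise null.
function findRecognizer(host: SpeechHost): RecognizerCtor | null {
  return host.SpeechRecognition || host.webkitSpeechRecognition || null;
}

// Creates a recognizer set to `lang`, or returns null when unsupported.
function createRecognizer(host: SpeechHost, lang: string): RecognizerLike | null {
  const Ctor = findRecognizer(host);
  if (Ctor === null) {
    return null;
  }
  const recognizer = new Ctor();
  recognizer.lang = lang;
  return recognizer;
}
```

In a browser the host would be `window` cast to `SpeechHost`; a `null` result is the cue to fall back to a plain text input.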
51. CAN YOU SPEAK MAGIC?
[Diagram: Alice sends "Get me Bob please!" over http:// together with her SDP (v=0, o=alice ..., s=, t=, m=audio ... RTP/SAVPF 109). Bob returns his own SDP (v=0, o=bob ..., s=, t=, m=audio ... RTP/SAVPF 109).]
55. CAN YOU SPEAK MAGIC?
[Diagram: the same exchange, now with SRTP flowing between Alice and Bob.]
56. CAN YOU SPEAK MAGIC?
[Diagram: same as slide 55, with an "X" added.]
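The diagram shows Alice and Bob swapping SDP descriptions before SRTP media flows. As a rough illustration of what each side reads out of such a description, the sketch below parses the `m=` (media) line. The sample SDP is made up for this example: only the `RTP/SAVPF 109` portion mirrors the slide, while the port and the `o=` values are placeholders. The function names are illustrative as well.

```typescript
interface MediaLine {
  media: string;
  port: number;
  protocol: string;
  formats: string[];
}

// Parses one media line of the form "m=<media> <port> <proto> <fmt> ...".
// The "<port>/<count>" form is not handled in this sketch.
function parseMediaLine(line: string): MediaLine | null {
  if (line.indexOf("m=") !== 0) {
    return null;
  }
  const [media, portText, protocol, ...formats] = line.slice(2).trim().split(/\s+/);
  if (!media || !portText || !protocol || formats.length === 0) {
    return null;
  }
  if (!/^\d+$/.test(portText)) {
    return null;
  }
  return { media, port: Number(portText), protocol, formats };
}

// Collects every audio media line found in an SDP description.
function audioMediaLines(sdp: string): MediaLine[] {
  const result: MediaLine[] = [];
  for (const raw of sdp.split(/\r?\n/)) {
    const parsed = parseMediaLine(raw.trim());
    if (parsed !== null && parsed.media === "audio") {
      result.push(parsed);
    }
  }
  return result;
}

// Placeholder description; the values are not taken from the slide.
const sampleSdp = [
  "v=0",
  "o=alice 1 1 IN IP4 0.0.0.0",
  "s=-",
  "t=0 0",
  "m=audio 5000 RTP/SAVPF 109",
].join("\r\n");
const sampleAudio = audioMediaLines(sampleSdp);
```

In a browser the SDP text would come from the peer connection's offer or answer rather than a hand-written string.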
64. CAN YOU SPEAK MAGIC?
Example RTC Apps
2 Examples
65. CAN YOU SPEAK MAGIC?
“Communicating isn’t going to be what you’re doing, it’s what you’ll be doing while you’re doing something else”
- Geoff Hollingworth
Ericsson Head of AT&T Foundry
69. CAN YOU SPEAK MAGIC?
INCIDENT RESPONSE
•Timely, Contextual Information
•Adapt for mobile vs. desktop users
•Group-based communication
•Inherit from existing organizational groups
•Allow ad-hoc participants (“guest” parties)
•Federate with external services
•Incident recording/logging
•“Lessons learned” and process improvement
•Links from/to issue tracking systems
70. CAN YOU SPEAK MAGIC?
2. Medical Records Management
73. CAN YOU SPEAK MAGIC?
MEDICAL RECORDS MGMT
•Automate Medical Claims
•Secure Caller Authentication
•Reuse primary auth via website
•Verify with voice biometrics
•Cross-check against caller location
•Call recording/transcription
•Medical advice given to patient automatically added to patient file
•Auditing/Service Quality Assurance
81. CAN YOU SPEAK MAGIC?
WEBRTC CAVEATS
•Bleeding edge, developing standard
•Only available on Chrome, Firefox
•Only available on Desktop
•Well funded/backed development
•Expect to see it mainstream (Desktop + Mobile) as soon as 2014
•http://iswebrtcreadyyet.com/