Andrew Sutherland Presentation


Published in: Education, Business

  1. Voice-enabled web apps with WAMI
     By Andrew Sutherland, Founder,
  2. Who am I?
       • Founder of Quizlet, an online flashcards and study tool
           • Founded in 2005 in high school
           • 500,000 registered users
           • 32,000,000 flashcards uploaded
       • Sophomore at MIT
           • I should be in Chemistry lecture right now…
  3. What is WAMI?
       • A research project at MIT
       • A free web service API
       • Plug and play:
           • voice recognition
           • audio recording
  4. How WAMI works
       • Microphone is activated with a Java applet
       • Audio streams to WAMI servers
       • WAMI processes audio in real time
       • JavaScript receives structured data describing what the person said
  5. WAMI is a web service
       • Plug-and-play JavaScript one-liner
           • You don’t have to maintain audio processing servers
       • reCAPTCHA model
       • More apps -> more utterances -> better quality voice recognition for all
  6. WAMI lets JavaScript do the work
       • JavaScript can activate the microphone
           • myWami.startRecording()
       • JavaScript receives the text of what you said
       • No clunky extra UI necessary: you build your web app how you like.
  7. WAMI is fast
       • WAMI can send results before you finish your sentence:
           • “Put an X…”
           • JavaScript displays an “X”
           • “…on square five”
           • JavaScript moves that “X” to square five.
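
The incremental behaviour on this slide can be sketched without the WAMI service itself. The snippet below is only an illustration, assuming each partial result arrives as a growing text string; `applyPartial`, the `board` object, and the `SQUARES` table are hypothetical names made up for this example and are not part of the WAMI API.

```javascript
// Hypothetical sketch: react to partial recognition text as it grows.
// Nothing here calls WAMI; the partial strings are supplied by hand.
const SQUARES = new Map([
  ["one", 1], ["two", 2], ["three", 3], ["four", 4], ["five", 5],
  ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9],
]);

function applyPartial(board, text) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  // As soon as the mark is heard, show it in a "pending" slot.
  const mark = words.find((w) => w === "x" || w === "o");
  if (mark) board.pending = mark.toUpperCase();
  // Once a square is named, move the pending mark onto that square.
  const idx = words.indexOf("square");
  const square = idx >= 0 ? SQUARES.get(words[idx + 1]) : undefined;
  if (board.pending && square) {
    board.cells[square] = board.pending;
    board.pending = null;
  }
  return board;
}

const board = { pending: null, cells: {} };
applyPartial(board, "put an x");                 // mark shown, not yet placed
const afterFirst = board.pending;
applyPartial(board, "put an x on square five");  // mark moved to square 5
console.log(afterFirst, board.cells);
```

The point of the design is that the same handler runs on every partial result, so the page can update early and then refine once the rest of the sentence arrives.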
  8. WAMI is grammar-based
       • Recognition is restricted to a grammar defined by your app
       • Grammar is compiled on page load or recompiled at any time
       • Very flexible JSGF format
  9. What’s a grammar?
       #JSGF V1.0;
       grammar SampleGrammar;
       public <top> = turtle | giraffe | pony;
  10. What’s a grammar?
        #JSGF V1.0;
        grammar SampleGrammar;
        public <top> = turtle {[id=1]} | giraffe {[id=2]} | pony {[id=3]};
  11. What’s a grammar?
        #JSGF V1.0;
        grammar SampleGrammar;
        public <top> = i [really] want (a <animal>)+;
        <animal> = turtle {[id=1]} | giraffe {[id=2]} | pony {[id=3]};
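
To make the last grammar concrete, the sketch below checks which word sequences fit the rule `i [really] want (a <animal>)+` and returns the id tag of each animal. It mirrors the rule with a hand-written regular expression and is not how WAMI's recognizer works; `parseWish` and `ANIMAL_IDS` are hypothetical names for this illustration.

```javascript
// Hypothetical sketch: which word sequences fit
//   <top> = i [really] want (a <animal>)+;
// This mirrors the rule with a regular expression; it is not WAMI's recognizer.
const ANIMAL_IDS = new Map([["turtle", 1], ["giraffe", 2], ["pony", 3]]);

function parseWish(utterance) {
  const text = utterance.trim().toLowerCase().replace(/\s+/g, " ");
  // "i", optional "really", "want", then one or more "a <animal>" groups.
  const rule = /^i( really)? want(( a (turtle|giraffe|pony))+)$/;
  const match = rule.exec(text);
  if (!match) return null; // outside the grammar
  // Collect the id attached to each animal, in spoken order.
  const animals = match[2].trim().split(" ").filter((w) => w !== "a");
  return animals.map((name) => ANIMAL_IDS.get(name));
}

console.log(parseWish("I want a pony"));                    // returns [3]
console.log(parseWish("i really want a turtle a giraffe")); // returns [1, 2]
console.log(parseWish("i want pony"));                      // returns null
```

The third call fails because the rule requires the word "a" before every animal, which is the kind of restriction the grammar imposes on recognition.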
  12. Getting started
        <script src="…"></script>
        <script>
        myWami = new WamiApp($('wamiDiv'), {
          onRecognitionResult : receiveWAMIguess,
          onReady : startApp
        });
        myWami.setGrammar("#JSGF V1.0 …");
        </script>
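
Since the grammar is passed to `setGrammar` as a plain string, an app could assemble that string from its own data. The helper below is a sketch under that assumption: `buildGrammar` is a hypothetical function, not part of WAMI, and it only reproduces the `{[id=N]}` tag style shown on the earlier slides.

```javascript
// Hypothetical helper: assemble a JSGF grammar string from a word list,
// using the {[id=N]} tag style shown on the slides.
function buildGrammar(name, words) {
  if (words.length === 0) throw new Error("grammar needs at least one word");
  const alternatives = words
    .map((word, i) => `${word} {[id=${i + 1}]}`)
    .join(" | ");
  return [
    "#JSGF V1.0;",
    `grammar ${name};`,
    `public <top> = ${alternatives};`,
  ].join("\n");
}

const grammar = buildGrammar("SampleGrammar", ["turtle", "giraffe", "pony"]);
console.log(grammar);
// In a page that has loaded WAMI, this string would be the argument to
// myWami.setGrammar(...), as on the "Getting started" slide.
```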
  13. JavaScript data receiver
        function receiveWAMIguess(obj) {
          // "You want a giraffe"
          alert("You want a " + obj.hyps[0].text);
        }
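
A slightly more defensive receiver might look like the sketch below. The `obj.hyps[0].text` shape is taken from the slide; whether WAMI ever delivers an empty hypothesis list is an assumption, and `describeGuess` and the hand-made result objects are hypothetical stand-ins so the example runs without a browser or the service.

```javascript
// Hypothetical sketch: a receiver that tolerates a result with no hypotheses.
// The obj.hyps[0].text shape comes from the slide; the guard is an assumption.
function describeGuess(obj) {
  const best = obj && Array.isArray(obj.hyps) ? obj.hyps[0] : undefined;
  if (!best || typeof best.text !== "string" || best.text === "") {
    return "Sorry, I didn't catch that.";
  }
  return "You want a " + best.text;
}

// Hand-made result objects standing in for what WAMI would pass in.
console.log(describeGuess({ hyps: [{ text: "giraffe" }] }));
console.log(describeGuess({ hyps: [] }));
```

Returning the message instead of calling `alert` directly keeps the logic testable outside a page; the real handler could pass the returned string to whatever UI the app uses.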
  14. WAMI saves your audio
        • Instantly replay the user’s audio.
        • You can download audio files to your server for long-term storage.
  15. Real-world application
        • Built WAMI into a studying tool.
        • Users control vocabulary games by voice.
        • Thousands of students are using it now
            • Over 1 million utterances recorded
  16. Live DEMO!
  17. WAMI to do:
        • Complete the real-time improvement system.
        • Switch from Java to Flash
  18. Please complete an evaluation.
  19. Questions?
        Contact me: [email_address]
        More about WAMI: