
Omar Tawakol at AI Frontiers: The Rise Of Voice-Activated Assistants In The Workplace



The market is already demonstrating strong value for voice-activated AI in the home, but the work environment has yet to catch up. Omar will explain why voice-activated AI is the most important development to come to the workplace. He will draw on his experience creating Eva, the first enterprise voice assistant focused on making meetings more actionable, and dive into the specific challenges of ASR (Automatic Speech Recognition), NLP and neural networks in building these kinds of voice-activated assistants. He will also share how his team has overcome these challenges.

Published in: Data & Analytics


  1. Enterprise Voice AI
  2. Exoskeletons, not Robots
     • Four attributes of exoskeletons
       – Enhance, not replace
       – Collective intelligence
       – Fits your workflow
       – Human-in-the-loop (optional)
  3. The most adopted form of enterprise collaboration
     [Bubble chart. Axes: Employee time spent (low to high) and Information Generation (low to high). Bubbles: Email, Meetings (Voice), IM, Enterprise Apps. Size of bubble represents activation opportunity. Annotation: "lacks activation".]
  5. Voicera = voice collaboration
     • Connect what you say with what you do
     • Meet Eva, your in-meeting AI assistant that takes notes
       – Step 1: call or invite to your meetings
       – Step 2: interact through voice cues or “taps”
       – Step 3: review email and share through Voicera
  6. Secular trends
     [Diagram labels: Enterprise, Voice, Collaboration, Consumer]
     • “Gartner predicts that by 2020, 60% of meetings with three or more participants will involve a virtual assistant.”
  7. Horizontal use: collaborate and share information with clarity…
     [Diagram labels: Agenda, Actions, Decisions, Artifacts, Feedback, Meeting Threads, conversations inbox]
  8. Post Meeting Inbox View
  9. Post Meeting Inbox View
  10. A different type of competitive advantage
      • Oracle Data Cloud & Classic Data Network Effects
        [Diagram "Network Effect": More Data Sellers, More Data Buyers, Better Monetization]
      • AI can create a compounding competitive advantage*
        [Diagram "Compounding advantage": Better Experience, More interaction data, Better algorithmic results, Deeper preferences learned]
      • …but producing this type of advantage isn’t business as usual.
      *GGVC term
  11. Building the data pipeline
      • Bootstrap through acquiring data & labels
      • Generate production data
      • Process for accurate, continuous labels (e.g. FP, TP, and FN)
      • Compress learning cycles w/ model automation:
        – Creation
        – Judgement
        – Parameter tuning/learning
        – Deployment
  12. Example: Key Word Spotting
      • Goal: an utterance in which the keyword is spoken should receive higher confidence than any utterance in which the keyword is not spoken
      • The most common measure for evaluating keyword spotters is AUC (area under the precision-recall curve)
      • Alternatively, we also use recall at near-100% precision
  13. Technical Challenges
      • Telephony is the lowest common denominator
        – 8 kHz sampling rate
      • A wide variety of microphones & meeting environments
      • High social cost of false triggers
      • Online decoding: very fast & small footprint
      • Handle different accents and pronunciations
  14. Avoiding Judgement Errors: Survivor Bias Example
      • A keyword spotter (KWS) produces false positives (FP) and true positives (TP), and misses false negatives (FN)
      • FP and TP are easily labeled (FN are harder)
      • Survivor bias misjudges the performance of the next candidate
      • On FP, the evaluation is biased in favor of the new algorithm
        – because it won’t generate the same false positives
        – but it would generate its own false positives
      • On FN, the evaluation is biased against the new algorithm
        – because it is judged against the previous TP and fails on more than 0% of them
        – but it could correctly identify the previous algorithm’s FN
  15. Results
      • We train a number of models for various keywords
      • On average we achieve:
        – Precision: ~0.0005% false triggers, about one every three 1-hour meetings
        – Recall: ~90%, i.e. 1 of 10 voice commands missed
      • The results vary dramatically based on environments
      • Our models are trained online, continuously
      • Please visit: to sign up and use
      [Chart: performance over time; series: recall, precision]
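
The two evaluation measures named on the Key Word Spotting slide, area under the precision-recall curve and recall at near-100% precision, can be sketched as follows. This is a minimal illustration and not the evaluation code used for Eva: the function names, the toy confidence scores and labels, the step-wise area estimate, and the 0.99 precision threshold are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (precision, recall)


def precision_recall_points(scores: Sequence[float],
                            labels: Sequence[int]) -> List[Point]:
    """Sweep the confidence threshold from high to low and return one
    (precision, recall) point per distinct score value.

    labels[i] is 1 if utterance i really contains the keyword, else 0.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive utterance")
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points: List[Point] = []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all utterances tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points: Sequence[Point]) -> float:
    """Step-wise estimate of the area under the precision-recall curve:
    sum over thresholds of (recall gain) * (precision at that threshold)."""
    area = 0.0
    prev_recall = 0.0
    for precision, recall in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


def recall_at_precision(points: Sequence[Point], min_precision: float) -> float:
    """Highest recall reachable at any threshold whose precision is at
    least min_precision (0.0 if no threshold qualifies)."""
    return max((r for p, r in points if p >= min_precision), default=0.0)


# Toy data (hypothetical): six utterances, four of which contain the keyword.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40]
labels = [1, 1, 0, 1, 0, 1]

points = precision_recall_points(scores, labels)
print(f"area under PR curve: {average_precision(points):.4f}")
print(f"recall at >= 99% precision: {recall_at_precision(points, 0.99):.2f}")
```

On this toy data, the first false trigger is ranked third, so only the top two detections can be accepted at 99% precision and recall at that operating point is 0.5. A measure of this kind reflects the "high social cost of false triggers" noted under Technical Challenges, since it asks how many commands are caught when false triggers are held close to zero.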