
Sanjeev Satheesh, Research Scientist, Baidu at The AI Conference 2017


Sanjeev Satheesh leads the Deep Speech team at Baidu’s Silicon Valley AI Lab (SVAIL). SVAIL is focused on developing hard AI technologies to impact hundreds of millions of people.

The Story of End to End Models in Deep Learning
The past few years have seen the explosive arrival of end-to-end deep learning models in computer vision, speech recognition, machine translation, text-to-speech, and other areas. In this talk, we look at this trend to identify what has worked well, and try to make some predictions for the future based on the next set of unsolved problems.

Published in: Technology

  1. Silicon Valley AI Lab: Story of end-to-end covfefe; Sanjeev Satheesh; June 2, 2017
  2. Silicon Valley AI Lab: Story of end-to-end models; Sanjeev Satheesh; June 2, 2017
  3. What are End-to-End Models? Gaussian Mixture Model over Spectrograms; Hidden Markov Model over Phonemes; Lexicon + Language Model of text; CAT
  4. What are End-to-End Models? English
  5. End-to-end models: Object Recognition; Speech Recognition; Image Captioning; Language Translation
  6. Why End-to-end models? Accuracy; Data + Model Size; Deep End-to-End model; ML workflow-2; ML workflow-1
  7. Why End-to-end models? Traditional machine learning pipelines are fairly complicated and typically need a lot of domain knowledge to build.
  8. Why End-to-end models? Easier to obtain a large amount of data; Easier on practitioners
  9. Why End-to-end models? Idea; Code; Results
  10. Why End-to-end models? We built Deep Speech with no superior knowledge of speech recognition or the Mandarin language
  11. Challenges: Need large amount of data
  12. Challenges: Need large amount of data; Lots of compute to explore architectures
  13. Challenges: Idea; Code; Results
  14. Challenges: Need large amount of data; Lots of compute to explore architectures; Lots of compute needed for deployment
  15. Batch Dispatch for Efficiency: Time
  16. What’s coming next (immediately): Speech Recognition; Speech Synthesis; Semantic Understanding; More natural interfaces
  17. What’s coming next (likely): Composition of E2E models; Super personalization; Tasks we are not solving because there’s not enough compute
  18. What’s probably NOT coming (immediately): Autonomous driving; General Dialog systems
  19. Thank You!
  20. Sanjeev Satheesh; sanjeevsatheesh@baidu.com; http://research.baidu.com; Silicon Valley AI Lab
