Jeff Dean at AI Frontiers: Trends and Developments in Deep Learning Research


In this talk at the AI Frontiers conference, Jeff Dean discusses recent trends and developments in deep learning research. He touches on the significant progress this research has produced in a number of areas, including computer vision, language understanding, translation, healthcare, and robotics. These advances are driven both by new algorithmic approaches to some of these problems and by the ability to scale computation for training ever larger models on larger datasets. Finally, one of the reasons for the rapid spread of the ideas and techniques of deep learning has been the availability of open source libraries such as TensorFlow. He gives an overview of why these software libraries play an important role in making the benefits of machine learning available throughout the world.

Published in: Technology

  1. 1. Trends and Developments in Deep Learning Research Jeff Dean Google Brain team In collaboration with many other people at Google
  2. 2. ConvNets
  3. 3. Accuracy Scale (data size, model size) 1980s and 1990s neural networks other approaches
  4. 4. more compute Accuracy Scale (data size, model size) neural networks other approaches 1980s and 1990s
  5. 5. more compute Accuracy Scale (data size, model size) neural networks other approaches Now
  6. 6. 5% errors humans 2011 26% errors
  7. 7. 2016 3% errors 2011 5% errors humans 26% errors
  8. 8. Growing Use of Deep Learning at Google Android Apps drug discovery Gmail Image understanding Maps Natural language understanding Photos Robotics research Speech Translation YouTube … many others ... Across many products/areas: # of directories containing model description files
  9. 9. Google Brain Team ● Research: 27 papers in ICML, NIPS, and ICLR in 2016, plus others in venues like ACL, ICASSP, CVPR, ICER, and OSDI; Brain Residency Program, hosted ~50 interns in 2016, ... ● Product impact: Many dozens of high impact collaborations in Google products/efforts like Search, Ads, Photos, Translate, GMail, Maps, Cloud ML, speech recognition, self-driving cars, robotics, … ● Open-source tools for ML: TensorFlow, Magenta, visualization tools, …
  10. 10. Need to build the right tools
  11. 11. What do you want in a machine learning system? ● Ease of expression: for lots of crazy ML ideas/algorithms ● Scalability: can run experiments quickly ● Portability: can run on wide variety of platforms ● Reproducibility: easy to share and reproduce research ● Production readiness: go from research to real products
  12. 12. Open, standard software for general machine learning Great for Deep Learning in particular First released Nov 2015 Apache 2.0 license and
  13. 13.
  14. 14. Why Did We Build TensorFlow? Wanted system that was flexible, scalable, and production-ready DistBelief, our first system, was good on two of these, but lacked flexibility Most existing open-source packages were also good on 2 of 3 but not all 3
  15. 15. TensorFlow Goals Establish common platform for expressing machine learning ideas and systems Make this platform the best in the world for both research and production use Open source it so that it becomes a platform for everyone, not just Google
  16. 16. Facts and Figures Launched on Nov. 9, 2015 Initial launch was reasonably fully-featured: auto differentiation, queues, control flow, fairly comprehensive set of ops, … tutorials made system accessible out-of-the-box support for CPUs, GPUs, multiple devices, multiple platforms
  17. 17. Some Stats 500+ contributors, most of them outside Google 12,000+ commits since Nov, 2015 (~30 per day) 1M+ binary downloads #15 most popular repository on GitHub by stars (just passed Linux!) Used in ML classes at quite a few universities now: Toronto, Berkeley, Stanford, … Many companies/organizations using TensorFlow: Google, DeepMind, OpenAI, Twitter, Snapchat, Airbus, Uber, ...
  18. 18. December 2015 (0.6) Python 3.3+ Faster on GPUs
  19. 19. February 2016 (0.7) Dynamic Loading of Kernels CuDNN v4
  20. 20. April 2016 (0.8)
  21. 21. June 2016 (0.9) iOS GPUs on Mac
  22. 22. August 2016 (0.10) Slim
  23. 23. October 2016 (0.11) CuDNN v5
  24. 24. November 2016 (0.12)
  25. 25. Languages
  26. 26. Version 1.0 upcoming Stability Usability Backwards Compatibility Documentation Libraries Models
  27. 27. Appeared in OSDI 2016
  28. 28. Strong External Adoption GitHub launch Nov. 2015 GitHub launch Sep. 2013 GitHub launch Jan. 2012 GitHub launch Jan. 2008 1M+ binary installs since November, 2015 GitHub launch Apr. 2015
  29. 29. Experiment Turnaround Time and Research Productivity ● Minutes, Hours: ○ Interactive research! Instant gratification! ● 1-4 days ○ Tolerable ○ Interactivity replaced by running many experiments in parallel ● 1-4 weeks ○ High value experiments only ○ Progress stalls ● >1 month ○ Don’t even try
  30. 30. Just-In-Time Compilation via TensorFlow’s XLA, "Accelerated Linear Algebra" compiler 0x00000000 movq (%rdx), %rax 0x00000003 vmovaps (%rax), %xmm0 0x00000007 vmulps %xmm0, %xmm0, %xmm0 0x0000000b vmovaps %xmm0, (%rdi) ... TF graphs go in, Optimized & specialized assembly comes out. Let's explain that!
  31. 31. Demo: Inspect JIT code in TensorFlow IPython shell XLA:CPU XLA:GPU
  32. 32. What are some ways that deep learning is having a significant impact at Google? All of these examples implemented using TensorFlow or our predecessor system
  33. 33. “How cold is it outside?” Deep Recurrent Neural Network Acoustic Input Text Output Reduced word errors by more than 30% Speech Recognition Google Research Blog - August 2012, August 2015
  34. 34. “ocean” Deep Convolutional Neural Network Your Photo Automatic Tag Search personal photos without tags. Google Photos Search Google Research Blog - June 2013
  35. 35. Google Photos Search
  36. 36. Reuse same model for completely different problems Same basic model structure trained on different data, useful in completely different contexts Example: given image → predict interesting pixels
  37. 37. We have tons of vision problems Image search, StreetView, Satellite Imagery, Translation, Robotics, Self-driving Cars,
  38. 38. MEDICAL IMAGING Using similar model for detecting diabetic retinopathy in retinal images
  39. 39. Performance on par or slightly better than the median of 8 U.S. board-certified ophthalmologists (F-score of 0.95 vs. 0.91).
  40. 40. Computers can now see Large implications for machine learning for robotics
  41. 41. Combining Vision with Robotics “Deep Learning for Robots: Learning from Large-Scale Interaction”, Google Research Blog, March, 2016 “Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection”, Sergey Levine, Peter Pastor, Alex Krizhevsky, & Deirdre Quillen, Arxiv,
  42. 42. Better language understanding
  43. 43. Score for doc,query pair Deep Neural Network Query & document features Query: “car parts for sale”, Doc: “Rebuilt transmissions …” Launched in 2015 Third most important search ranking signal (of 100s) RankBrain in Google Search Ranking Bloomberg, Oct 2015: “Google Turning Its Lucrative Web Search Over to AI Machines”
  44. 44. Sequence-to-Sequence Model [Sutskever & Vinyals & Le NIPS 2014] Deep LSTM Input sequence: A B C D Target sequence: X Y Z Q
  45. 45. Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014] Input sentence: Quelle est votre taille? <EOS> Target sentence: How
  46. 46. Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014] Input sentence: Quelle est votre taille? <EOS> Target sentence: How tall
  47. 47. Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014] Input sentence: Quelle est votre taille? <EOS> Target sentence: How tall are
  48. 48. Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014] Input sentence: Quelle est votre taille? <EOS> Target sentence: How tall are you?
  49. 49. Sequence-to-Sequence Model: Machine Translation [Sutskever & Vinyals & Le NIPS 2014] Input sentence: Quelle est votre taille? <EOS> At inference time: beam search to choose the most probable among the possible output sequences
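The beam search step mentioned on the slide above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `beam_search` and `toy_step` names, the beam width, and the hand-written probability table that stands in for a trained decoder. It is not the decoder used in the talk.

```python
import math

def beam_search(step_fn, start_token, eos_token, beam_width=2, max_len=10):
    """Return (tokens, log_prob) for the best sequence found by beam search.

    step_fn(prefix) must return a dict mapping each possible next token
    to its probability given the prefix (a stand-in for a real decoder).
    """
    beams = [([start_token], 0.0)]  # unfinished hypotheses
    finished = []                   # hypotheses that ended with eos_token
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, prob in step_fn(tokens).items():
                if prob > 0.0:
                    candidates.append((tokens + [token], score + math.log(prob)))
        if not candidates:
            break
        # Keep only the beam_width most probable extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == eos_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if needed
    return max(finished, key=lambda c: c[1])

# Hypothetical next-token probabilities, keyed by the previous token only.
TOY_MODEL = {
    "<s>": {"How": 0.7, "What": 0.3},
    "How": {"tall": 0.9, "<EOS>": 0.1},
    "What": {"<EOS>": 1.0},
    "tall": {"are": 0.8, "<EOS>": 0.2},
    "are": {"you?": 0.9, "<EOS>": 0.1},
    "you?": {"<EOS>": 1.0},
}

def toy_step(prefix):
    return TOY_MODEL[prefix[-1]]

best_tokens, best_log_prob = beam_search(toy_step, "<s>", "<EOS>")
print(" ".join(best_tokens), best_log_prob)
```

Scores are summed log-probabilities, so the comparison is between whole sequences rather than individual words. Real systems typically add refinements such as length normalization, which this sketch leaves out.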
  50. 50. April 1, 2009: April Fool’s Day joke Nov 5, 2015: Launched Real Product Feb 1, 2016: >10% of mobile Inbox replies Smart Reply
  51. 51. Small Feed-Forward Neural Network Incoming Email Activate Smart Reply? yes/no Smart Reply Google Research Blog - Nov 2015
  52. 52. Small Feed-Forward Neural Network Incoming Email Activate Smart Reply? Deep Recurrent Neural Network Generated Replies yes/no Smart Reply Google Research Blog - Nov 2015
  53. 53. Combining vision and language
  54. 54. Image Captioning W __ A young girl A young girl asleep [Vinyals et al., CVPR 2015]
  55. 55. Model: A close up of a child holding a stuffed animal. Human: A young girl asleep on the sofa cuddling a stuffed bear. Model: A baby is asleep next to a teddy bear. Image Captions Research
  56. 56. Translation as a sign of better language understanding
  57. 57. Great quality improvements ...but challenging scalability issues
  58. 58. Google Neural Machine Translation Model [Diagram: encoder LSTMs and decoder LSTMs, 8 layers placed across Gpu1 to Gpu8, connected through attention, with a SoftMax output] One model replica: one machine w/ 8 GPUs
  59. 59. Model + Data Parallelism ... Params Many replicas Parameters distributed across many parameter server machines Params Params ...
  60. 60. Neural Machine Translation [Chart: translation quality on a 0 to 6 axis, with "perfect translation" marked, comparing phrase-based (PBMT), neural (GNMT), and human translation for English > Spanish, English > French, English > Chinese, Spanish > English, French > English, Chinese > English] Closes gap between old system and human-quality translation by 58% to 87% Enables better communication across the world
  61. 61. More widespread use of: Transfer and multi-task learning, zero-shot learning
  62. 62. Currently: Most models are trained from scratch for a single task This is quite inefficient: Data inefficient: needs lots of data for each task Computation inefficient: starting from scratch is a lot of work Human ML expert inefficient: substantial effort required for each task
  63. 63. Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation, Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean
  64. 64. Bigger models, but sparsely activated
  65. 65. Per-Example Routing
  66. 66. Outrageously Large Neural Networks: The Sparsely-gated Mixture-of-Experts Layer, Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le & Jeff Dean Submitted to ICLR 2017, Per-Example Routing
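Per-example routing through a sparsely gated layer can be sketched as follows. This is a simplified illustration under stated assumptions: it keeps only the top-k idea (softmax over the k largest gate logits, all other experts skipped) and omits the noise term and load-balancing losses described in the paper. The function names, gate weights, and expert functions are hypothetical toy values.

```python
import math

def top_k_gates(logits, k):
    """Softmax over the k largest logits; every other gate is exactly zero."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]

def moe_layer(x, gate_weights, experts, k=2):
    """Route one example x through its top-k experts and mix their outputs."""
    # One gate logit per expert: a dot product between x and that expert's row.
    logits = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    gates = top_k_gates(logits, k)
    output = None
    for gate, expert in zip(gates, experts):
        if gate == 0.0:
            continue  # inactive expert: its computation is never run
        y = expert(x)
        if output is None:
            output = [0.0] * len(y)
        output = [o + gate * v for o, v in zip(output, y)]
    return output, gates

# Toy setup: four experts, of which only two run for any given example.
experts = [
    lambda x: [2.0 * v for v in x],
    lambda x: [v + 1.0 for v in x],
    lambda x: [-v for v in x],
    lambda x: [0.0 for _ in x],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, gates = moe_layer([1.0, 2.0], gate_weights, experts, k=2)
print(gates, output)
```

The point of the design is that model capacity grows with the number of experts while the per-example cost grows only with k.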
  67. 67. Automated machine learning (“learning to learn”)
  68. 68. Current: Solution = ML expertise + data + computation Can we turn this into: Solution = data + 100X computation ???
  69. 69. Idea: model-generating model trained via RL (1) Generate ten models (2) Train them for a few hours (3) Use loss of the generated models as reinforcement learning signal
  70. 70. CIFAR-10 Image Recognition Task
  71. 71. “Normal” LSTM cell Cell discovered by architecture search Penn Tree Bank Language Modeling Task
  72. 72. More computational power needed Deep learning is transforming how we design computers
  73. 73. Special computation properties reduced precision ok: about 1.2 × about 0.6 = about 0.7 NOT 1.21042 × 0.61127 = 0.73989343
  74. 74. Special computation properties reduced precision ok: about 1.2 × about 0.6 = about 0.7 NOT 1.21042 × 0.61127 = 0.73989343 handful of specific operations
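The reduced-precision point can be checked numerically with a small sketch. IEEE 754 half precision is used here only as a convenient stand-in for "reduced precision"; the slides do not say which number format is meant, and the `to_float16` helper is a hypothetical name.

```python
import struct

def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

a, b = 1.21042, 0.61127  # the operands shown on the slide

exact = a * b
# Round both inputs and the product, as low-precision hardware might.
approx = to_float16(to_float16(a) * to_float16(b))
relative_error = abs(approx - exact) / exact

print(f"full precision:    {exact:.8f}")
print(f"reduced precision: {approx:.8f}")
print(f"relative error:    {relative_error:.2e}")
```

The relative error stays small, in line with the slide's claim that an approximate product is acceptable for neural net computations.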
  75. 75. Tensor Processing Unit Custom Google-designed chip for neural net computations In production use for >20 months: used on every search query, for neural machine translation, for AlphaGo match, ...
  76. 76. Example queries of the future Which of these eye images shows symptoms of diabetic retinopathy? Please fetch me a cup of tea from the kitchen Describe this video in Spanish Find me documents related to reinforcement learning for robotics and summarize them in German
  77. 77. Conclusions Deep neural networks are making significant strides in speech, vision, language, search, robotics, healthcare, … If you’re not considering how to use deep neural nets to solve some problems, you almost certainly should be
  78. 78. More info about our work
  79. 79. Thanks!