
Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission


I will present several advances in deep learning from OpenAI. First, I will present OpenAI Five, a neural network that learned to play on par with some of the strongest professional Dota 2 teams in the world in an 18-hero version of the game. Next, I will present Dactyl, a human-like robot hand trained entirely in simulation with reinforcement learning that has achieved unprecedented dexterity on a physical robot. I will also present our results on unsupervised learning in language, which show that pre-training and fine-tuning can achieve a significant improvement over the state of the art. Finally, I will present an overview of the historical progress in the field.


  1. Progress towards the OpenAI mission. Ilya Sutskever, Co-founder and Chief Scientist, OpenAI. November 9, 2018
  2. OpenAI’s mission: OpenAI’s mission is to ensure that artificial general intelligence (AGI) — by which we mean highly autonomous systems that outperform humans at most economically valuable work — benefits all of humanity. — The OpenAI Charter
  3. Technical progress from OpenAI
  4. OpenAI Five
  5. Dota
  6. Dota is hard: partial observability; 120 heroes (we integrated 18); 20,000 actions per game (a massive action space); pros dedicate their lives to the game, with 10K+ hours of deliberate practice
  7. Dota is popular: the largest professional scene, with an annual prize pool of $40M+
  8. Dota
  9. Our approach: very large-scale reinforcement learning (millennia of practice); an LSTM policy (roughly honeybee-brain scale); self-play; reward shaping
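
The ingredients on slide 9 (self-play against a frozen past copy of the policy, plus reward shaping to densify the sparse win/loss signal) can be sketched in miniature. Everything below is illustrative: the toy "aggression" game, the shaping weights, and the crude finite-difference update are stand-ins for the actual large-scale PPO training; `train_self_play`, `shaped_reward`, and the `aggression` parameter are invented for this sketch.

```python
import random

def shaped_reward(win, gold_delta, last_hits):
    """Hypothetical reward shaping: sparse win signal plus dense
    intermediate signals (gold earned, last hits). Weights are
    illustrative, not OpenAI Five's actual shaping terms."""
    return 1.0 * win + 0.001 * gold_delta + 0.01 * last_hits

def self_play_episode(policy, opponent, rng):
    """One episode of a toy symmetric game: each side draws a noisy
    score around its 'aggression' parameter; the higher draw wins.
    This stands in for a full Dota rollout."""
    a = rng.gauss(policy["aggression"], 1.0)
    b = rng.gauss(opponent["aggression"], 1.0)
    win = 1.0 if a > b else 0.0
    return shaped_reward(win, gold_delta=100 * win, last_hits=5 * win)

def train_self_play(steps=200, lr=0.1, seed=0):
    """Self-play: the current policy trains against a frozen past
    copy of itself, refreshed periodically so the opponent keeps up."""
    rng = random.Random(seed)
    policy = {"aggression": 0.0}
    opponent = dict(policy)              # frozen past self
    for step in range(steps):
        # crude finite-difference policy improvement
        up, down = dict(policy), dict(policy)
        up["aggression"] += 0.5
        down["aggression"] -= 0.5
        r_up = sum(self_play_episode(up, opponent, rng) for _ in range(20))
        r_down = sum(self_play_episode(down, opponent, rng) for _ in range(20))
        policy["aggression"] += lr * (1 if r_up > r_down else -1)
        if step % 20 == 0:
            opponent = dict(policy)      # refresh the frozen opponent
    return policy
```

The design point is the opponent refresh: training against a fixed opponent overfits to it, while training against a periodically updated past self produces a curriculum that grows with the policy.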
  10. Reinforcement learning (RL) actually works! Nearly all RL experts believed that RL couldn’t solve tasks as hard as Dota: the horizon is too long
  11. Results
  12. Dactyl
  13. Dexterity
  14. Diverse objects
  15. Strategy: sim-to-real
  16. Domain randomization: train in simulation, randomizing perception and physics
  17. Domain randomization
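
The idea on slide 16 (randomize perception and physics in simulation so that, to the policy, the real world looks like just another simulator variant) can be sketched as follows. The parameter names and ranges are illustrative, not the ones actually used for Dactyl.

```python
import random

def randomized_sim_params(rng):
    """Sample one simulator variant. Ranges here are invented for
    illustration; a real setup randomizes many more quantities."""
    return {
        "object_mass":   rng.uniform(0.5, 2.0),     # physics randomization
        "friction":      rng.uniform(0.3, 1.2),
        "motor_gain":    rng.uniform(0.8, 1.2),
        "camera_offset": rng.uniform(-0.05, 0.05),  # perception randomization
        "light_level":   rng.uniform(0.2, 1.0),
    }

def train_with_domain_randomization(n_episodes, seed=0):
    """Each training episode runs in a freshly randomized simulator,
    so the policy cannot overfit to any single set of dynamics and
    must learn behavior that is robust across the whole range."""
    rng = random.Random(seed)
    episodes = []
    for _ in range(n_episodes):
        params = randomized_sim_params(rng)
        # run_episode(policy, params) would go here in a real setup
        episodes.append(params)
    return episodes
```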
  18. Curiosity-based exploration
  19. Core idea: novel states = reward. Fix all bugs (very hard to do)
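
The core idea on slide 19 can be made concrete with the simplest possible novelty bonus: count how often each state has been visited and reward rarely visited states. (OpenAI's large-scale curiosity work rewards the prediction error of a learned model rather than explicit counts, which is what scales to games like Montezuma's Revenge; the count-based version below is only a minimal illustration of "novel states = reward".)

```python
import math
from collections import defaultdict

class NoveltyBonus:
    """Count-based intrinsic reward: the more often a state has been
    seen, the smaller its bonus, so the agent is pushed toward novel
    states even when the environment gives no extrinsic reward."""

    def __init__(self):
        self.counts = defaultdict(int)

    def reward(self, state):
        """Record a visit and return the intrinsic reward 1/sqrt(n)."""
        self.counts[state] += 1
        return 1.0 / math.sqrt(self.counts[state])
```

The slide's "fix all bugs" caveat is why this is hard in practice: a curious agent actively seeks out states its reward machinery mishandles, so any simulator or reward bug becomes an attractor.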
  20. Montezuma’s Revenge
  21. Montezuma’s Revenge
  22. Mario
  23. Curiosity early in training
  24. Curiosity late in training
  25. The OpenAI Mission
  26. OpenAI’s mission: OpenAI’s mission is to ensure that artificial general intelligence (AGI) — by which we mean highly autonomous systems that outperform humans at most economically valuable work — benefits all of humanity. — The OpenAI Charter
  27. Impact of AGI: generate massive wealth; potential to end poverty and achieve material abundance
  28. Impact of AGI: generate massive wealth (potential to end poverty and achieve material abundance); generate science and technology (cure disease, extend life, superhuman healthcare; mitigate global warming, clean the oceans; massively improve education and psychological well-being)
  29. Why is OpenAI’s mission relevant today? We review progress in the field over the past 6 years. Our conclusion: near-term AGI should be taken as a serious possibility
  30. Algorithms
  31. Deep learning at the root of it all: during the past 6 years, deep learning (a multilayer perceptron trainable with backpropagation) repeatedly and rapidly broke through “insurmountable” barriers
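
The "multilayer perceptron trainable with backpropagation" of slide 31 fits in a few lines. The sketch below does one forward pass and one backpropagation/gradient-descent step for a tiny tanh network with a squared loss; the network size, initialization, and learning rate are arbitrary choices for illustration.

```python
import math

def mlp_step(x, y, W1, b1, W2, b2, lr=0.1):
    """One training step of a one-hidden-layer perceptron:
    tanh hidden units, linear output, squared loss.
    Returns the pre-update loss and the updated parameters."""
    # forward pass
    h = [math.tanh(sum(wij * xj for wij, xj in zip(wi, x)) + bi)
         for wi, bi in zip(W1, b1)]
    yhat = sum(v * hj for v, hj in zip(W2, h)) + b2
    loss = 0.5 * (yhat - y) ** 2
    # backward pass (chain rule, layer by layer)
    dy = yhat - y
    dW2 = [dy * hj for hj in h]
    db2 = dy
    dh = [dy * v * (1 - hj ** 2) for v, hj in zip(W2, h)]  # tanh' = 1 - tanh^2
    dW1 = [[dhi * xj for xj in x] for dhi in dh]
    db1 = dh
    # gradient-descent update
    W2 = [v - lr * g for v, g in zip(W2, dW2)]
    b2 = b2 - lr * db2
    W1 = [[w - lr * g for w, g in zip(wi, gi)] for wi, gi in zip(W1, dW1)]
    b1 = [bi - lr * gi for bi, gi in zip(b1, db1)]
    return loss, (W1, b1, W2, b2)
```

Repeating this step drives the loss down; the modern advances the following slides survey are, at root, this same procedure scaled up in data, depth, and compute.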
  32. Vision (2012-2016). HOG feature (2005); image by Antonio Torralba
  33. Vision (2012-2016). ImageNet classification error (top 5): 2011 (XRCE) 26.0; 2012 (AlexNet) 16.4; 2013 (ZF) 11.7; 2014 (VGG) 7.3; 2014 (GoogLeNet) 6.7; 2015 (ResNet) 3.6; 2016 (GoogLeNet-v4) 3.1; human 5.0
  34. Translation (2014-2018): BLEU score on EN-to-FR translation on the WMT dataset
  35. Image generation (2014-2018). GANs over the years: 2014 (Goodfellow et al., 2014)
  36. Image generation (2014-2018). GANs over the years: 2015 (Radford et al., 2015)
  37. Image generation (2014-2018). GANs over the years: 2017 (Karras et al., 2017)
  38. Image generation (2014-2018). GANs over the years: 2018 (Brock et al., 2018)
  39. Reinforcement learning (2013-2018): DQN (2013); Mnih et al., 2013
  40. Reinforcement learning (2013-2018): TRPO (2015); Schulman et al., 2015
  41. Reinforcement learning (2013-2018): AlphaGo (2016); Silver et al., 2016
  42. Reinforcement learning (2013-2018): OpenAI Five (2018). Very large scale: 100,000+ CPU cores, 1,000+ GPUs
  43. Reinforcement learning (2013-2018): Dactyl (2018)
  45. Compute grows rapidly: neural networks can usefully consume all available compute; neural networks are extremely parallelizable; 300,000x increase in the compute used for the largest neural-net experiments over the past 6 years
  46. Petaflop/s-days to train (chart; labeled entries include “early production convnet”)
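
Slide 45's figure of a 300,000x increase over 6 years pins down the implied growth rate. Assuming smooth exponential growth:

```python
import math

def doubling_time_months(growth_factor, years):
    """Months per doubling implied by a total growth factor over a
    given span, under an exponential-growth assumption."""
    doublings = math.log2(growth_factor)
    return years * 12 / doublings
```

log2(300,000) is about 18.2 doublings in 72 months, i.e. a doubling roughly every 3.5-4 months, compared with the roughly 18-24 months of Moore's law.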
  47. Conclusion
  48. Formidable challenges remain: unsupervised learning; robust classification; reasoning; abstraction; ???
  49. We’ve been breaking through barriers for 6 years. Will this trend continue, or will it stop? And if so, when?
  50. Means proactively thinking about risks: machines pursuing goals misspecified by their operator; malicious humans subverting deployed systems; an out-of-control economy that grows without resulting in improvements to human lives. This talk’s goal is to present evidence that, while highly uncertain, near-term AGI should be taken as a serious possibility.
  51. Thank you
