Where will AGI come from?
Y Conf, June 10, 2017
andrej @karpathy
“Deep Learning” search popularity
2012+ image recognition,
2010+ speech recognition,
2014+ machine translation,
(from @ML_Hipster)
CS231n: Convolutional Neural Networks for Visual Recognition
(Stanford Class)
2015: 150 students
2016: 330 students
2017: 750 students
2018: ??? (max students per class is capped at 999)
The Current State of Machine Intelligence 3.0 [Shivon Zilis]
In popular media...
Two comments:
1. AI today is still very narrow.
2. But thanks to Deep Learning, we can repurpose solution components faster.
Example: AlphaGo
(see my Medium post “AlphaGo, in context”)
Convenient properties of Go:
1. Deterministic. No noise in the game.
2. Fully observed. Each player has complete information.
3. Discrete action space. Finite number of actions possible.
4. Perfect simulator. The effect of any action is known exactly.
5. Short episodes. ~200 actions per game.
6. Clear + fast evaluation. According to Go rules.
7. Huge dataset available. Human vs human games.
Q: “Can we run AlphaGo on a robot for the Amazon Picking Challenge?”
A: check how each convenient property of Go holds up for robot picking:
1. Deterministic: OK
2. Fully observed: OKish
3. Discrete action space: OK
4. Perfect simulator: TROUBLE
5. Short episodes: challenge
6. Clear + fast evaluation: challenge
7. Huge dataset available: not good
Summary so far:
1. Growing interest in AI.
2. AI is still narrow.
3. AI tech works in some cases and can be repurposed much faster.
“What if we succeed in making it not narrow?”
Nick Bostrom
Stephen Hawking
Bill Gates
Elon Musk
Sam Altman
Stuart Russell
Eliezer Yudkowsky
Normal hype cycle
AI is different.
Meanwhile, in Academia...
“AGI imminent.”
“Oh no, AI winter imminent. My funding is about to dry up again.”
Where could AGI come from?
Talk Outline:
- Supervised learning - “it works, just scale up!”
- Unsupervised learning - “it will work, if we only scale up!”
- AIXI - “guys, I can write down optimal AI.”
- Brain simulation - “this will work one day, right?”
- Artificial Life - “just do what nature did.”
- Something not on our radar
Supervised Learning:
Collect lots of labeled data, train a neural network on it.
How do we get labels of
intelligent behavior?
Short Story on AI: A Cognitive Discontinuity.
Nov 14, 2015
Amazon Mechanical Turk
CORE IDEA: collect data from
humans, then train a big Neural
Net to mimic what humans do.
Amazon Mechanical Turk ++
[Diagram, step 1: lots of examples pairing joint positions/velocities, joint torques, etc. with the ACTION taken by the human. The network's output and the human's action are marked “Make these equal”.]
[Diagram, step 2: joint positions/velocities, joint torques, etc.]
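The core idea above (record what humans do, then fit a model so its output matches the human's action) can be sketched in a few lines. This is only a minimal illustration under loud assumptions: a linear least-squares policy stands in for the "big Neural Net", the data is synthetic, and the names `fit_imitation_policy` and `act` are hypothetical.

```python
import numpy as np

def fit_imitation_policy(states, human_actions, ridge=1e-3):
    """Fit a linear policy so that [state, 1] @ W matches the human's action."""
    X = np.hstack([states, np.ones((len(states), 1))])  # append a bias column
    A = X.T @ X + ridge * np.eye(X.shape[1])            # ridge-regularized normal equations
    return np.linalg.solve(A, X.T @ human_actions)

def act(W, state):
    """Predict an action for one state with the fitted policy."""
    return np.append(state, 1.0) @ W

# Synthetic stand-in for logged demonstrations (assumption, not real robot data):
# 6 numbers per state (e.g. joint positions/velocities), 2 numbers per action.
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 6))
true_W = rng.normal(size=(6, 2))
human_actions = states @ true_W + 0.01 * rng.normal(size=(500, 2))

W = fit_imitation_policy(states, human_actions)
predicted = act(W, states[0])
```

The squared-error fit is the "make these equal" step; a neural network would replace the linear map but keep the same training signal.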
What would this AI look like?
Possible hint: char-rnn
The cat sat on a ma_?
[Diagram: the true next character and the next character predicted by the network. Make these equal.]
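As a toy illustration of next-character prediction, here is a sketch that uses bigram counts instead of a recurrent network. It is not char-rnn: it only looks at the last character, and the tiny corpus and the function names are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every character, which characters follow it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, context):
    """Return the most frequent follower of the last character, or None if unseen."""
    last = context[-1]
    if last not in counts:
        return None
    return counts[last].most_common(1)[0][0]

# Hypothetical training text; a real model would train on a large corpus.
corpus = "the cat sat on a mat. the cat sat on a mat. "
model = train_bigram(corpus)
guess = predict_next(model, "The cat sat on a ma")
```

A recurrent network plays the same game with a learned summary of the whole preceding context rather than a single character.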
Possible hint: generate text from the model.
[Samples shown at first, after training for a bit, and after training more.]
Training data example: open source textbook on algebraic geometry (LaTeX source).
The low-level gestalt is right, but the high-level,
long-term structure is missing. This is mitigated
with more data / larger models.
AIs in this approach…
- Imitate/generate human-like actions
- Can these AIs be creative?
- Can they assemble a room of chairs/tables?
- Can they make human domination schemes?
(Kind of)
Unsupervised Learning: Big generative models.
1. Initialize a Big Neural Network
2. Train it to compress a huge amount of
data on the internet
3. ???
4. Profit
Example 2: (variational) autoencoders
Train the network on the identity function through an information bottleneck: 30 numbers.
(must compress the data to 30 numbers to reconstruct later)
Meddle with the code, then “decode” to the image.
Also see: autoregressive models, Generative Adversarial Networks.
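The bottleneck idea can be sketched with a linear autoencoder computed by SVD. This is a deliberately simplified stand-in, not a variational autoencoder: the encoder and decoder are linear, the data is synthetic (constructed to lie in a 30-dimensional subspace so that 30 numbers suffice), and all names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, code_size=30):
    """Return encode/decode functions that squeeze each row of X through code_size numbers."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = Vt[:code_size]                      # top principal directions, shape (code_size, d)

    def encode(x):
        return (x - mean) @ basis.T             # d numbers -> code_size numbers

    def decode(z):
        return z @ basis + mean                 # code_size numbers -> d numbers

    return encode, decode

# Synthetic data: 200 samples of 100 numbers that really depend on only 30 factors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30)) @ rng.normal(size=(30, 100))

encode, decode = fit_linear_autoencoder(X, code_size=30)
code = encode(X[0])            # the 30-number "code"
recon = decode(code)           # reconstruction of the input

meddled = code.copy()
meddled[0] += 1.0              # meddle with the code...
changed = decode(meddled)      # ...then decode to see what changes
```

With real images and a nonlinear network the reconstruction is only approximate, and moving along a code dimension changes some learned attribute of the decoded image.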
Another example: work at OpenAI, “Unsupervised Sentiment Neuron” (Alec Radford et al.)
1. Train a large char-rnn on a large corpus of unlabeled reviews from Amazon.
2. One of the neurons automagically “discovers” a small sentiment classifier (this high-level feature must help predict the next character).
(char-rnn also optimizes compression of data; prediction and compression are closely linked.)
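The link between prediction and compression can be made concrete: an ideal code spends about -log2 p bits on a symbol the model assigned probability p, so better predictions mean fewer bits. The sketch below compares a uniform model with a frequency model on a toy string. It is an illustration only: the frequencies are fit on the same text they encode, and the cost of transmitting the model itself is ignored.

```python
import math
from collections import Counter

def code_length_bits(text, prob):
    """Ideal code length of text, in bits, under a per-character probability model."""
    return sum(-math.log2(prob(ch)) for ch in text)

text = "aaaaaaab"
alphabet = sorted(set(text))
counts = Counter(text)

def uniform(ch):
    # Knows only the alphabet: every character is equally likely.
    return 1.0 / len(alphabet)

def fitted(ch):
    # Uses character frequencies, so it predicts the common character better.
    return counts[ch] / len(text)

bits_uniform = code_length_bits(text, uniform)
bits_fitted = code_length_bits(text, fitted)
```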
Basic idea: [diagram: “all of ...” fed into a Big Neural Network]
What would this AI look like?
- The neural network has a powerful “brain state”:
- Given any input data, could get e.g. 10,000
numbers of the network’s “thoughts” about
the data.
- Given any vector of 10,000 numbers, we
could maybe ask the network to generate
samples of data that correspond.
- Does it want to take over the world? (no; has no
agency, no planning, etc.)
AIXI:
- Algorithmic information theory applied to general artificial
intelligence. (Marcus Hutter)
- Allows for a formal definition of “Universal Intelligence”
(Shane Legg)
- Bayesian Reinforcement Learning agent over the
hypothesis space of all Turing machines.
System identification: which Turing machine am I in? If I knew, I could plan perfectly.
Prior probability over Turing machines: “simpler worlds” are more likely.
Likelihood probability over Turing machines: which TMs are consistent with my experience so far?
Multiply the two (vertically, in the figure) to get a posterior.
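The prior-times-likelihood step can be sketched over a tiny, finite hypothesis set. This is nowhere near the space of all Turing machines: the three hypotheses and their description lengths in bits are invented for illustration, and the likelihood is simply 1 if a hypothesis reproduces the history and 0 otherwise.

```python
def posterior(hypotheses, history):
    """Weight each hypothesis by 2^-length if it matches the history, then normalize."""
    weights = {}
    for name, (length_bits, predict) in hypotheses.items():
        prior = 2.0 ** -length_bits                      # simpler worlds are more likely
        consistent = all(predict(t) == obs for t, obs in enumerate(history))
        weights[name] = prior if consistent else 0.0     # likelihood is 0 or 1 here
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("no hypothesis is consistent with the history")
    return {name: w / total for name, w in weights.items()}

# Hypothetical worlds: (description length in bits, observation at time t).
hypotheses = {
    "all_zeros": (2, lambda t: 0),
    "alternating": (4, lambda t: t % 2),
    "zero_then_ones": (6, lambda t: 0 if t == 0 else 1),
}

beliefs = posterior(hypotheses, history=[0, 1])
```

After seeing 0 then 1, the simplest hypothesis is ruled out and the remaining mass splits 4:1 in favor of the shorter of the two survivors.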
We can write down the optimal agent’s action at time t. [Equation shown on the slide; its annotations follow.]
- Complete history of interactions up to this point (time t).
- All possible futures, from time t to time m.
- Weighted average of the total discounted reward, across all possible Turing Machines.
- The weights are [prior] x [likelihood] for each Turing machine (description length of the TM, number of bits).
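For reference, the action rule in Hutter's formulation of AIXI is usually written as below. This is the standard published form, not a transcription of the slide's figure, and it sums undiscounted rewards up to a horizon m where the slide's annotation speaks of discounted reward. Here U is a universal Turing machine, q ranges over programs (candidate environments), \ell(q) is the length of q in bits, and a, o, r are actions, observations and rewards.

```latex
a_t := \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_t + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The innermost sum is the prior-times-likelihood weight: only programs consistent with the history contribute, each weighted by 2 to the minus its description length.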
There’s just a few problems...
Attempts have been made...
I like “A Monte-Carlo AIXI Approximation” from Veness et al. 2011.
What would this agent look like?
- We need to feed it a reward signal. Might be very hard to write
down. Might lead to “perverse instantiations” (e.g. paper clip
maximizers etc.)
- Or maybe humans have a dial that gives the reward. But its
actions might not be fully observable to humans.
- Very computationally intractable. Also, people are really not
good at writing complex code. (e.g. for “AIXI approximation”).
- This agent could be quite scary. Definitely has agency.
Brain simulation
BRAIN initiative, Human Brain Project, optogenetics,
multi-electrode arrays, connectomics, Neuralink, ...
Brain simulation
- How to measure a complete brain state?
- At what level of abstraction?
- How to model the dynamics?
- How do you simulate the “environment” to
feed into senses?
- Various ethical dilemmas
- Timescale-bearish neuroscientists.
Artificial Life:
How did intelligence arise in nature?
We don’t have to redo 4B years of evolution.
- Work at a higher level of abstraction. We don’t have to
simulate chemistry etc. to get intelligent networks.
- Intelligent design. We can meddle with the system and
initialize with RL agents, etc.
Intelligence is the ability to win, in the face of world dynamics
and a changing population of other intelligent agents with
similar goals.
Intelligence “Cognitive toolkit” includes but is not limited to:
● attention: the at-will ability to selectively "filter out" parts of the input that are judged not to be relevant for a current top-down goal, e.g. the "cocktail party effect".
● working memory: some structures/processes that temporarily store and manipulate information (7 +/- 2). Related to this, the phonological loop: a special part of working memory dedicated to storing a few seconds of sound (e.g. when you repeat a 7-digit phone number in your mind to keep it in memory). Also: the visuospatial sketchpad and an episodic buffer.
● long-term memory of quite a few suspected different types: procedural memory (e.g. driving a car), semantic memory (e.g. the name of the current President), episodic memory (for autobiographical sequences of events, e.g. where one was during 9/11).
● knowledge representation: the ability to rapidly learn and incorporate facts into some "world model" that can be inferred over in what looks to be approximately Bayesian ways. The ability to detect and resolve contradictions, or propose experiments that disambiguate cases. The ability to keep track of what source provided a piece of information and later down-weight its confidence if the source is suddenly judged not trustworthy.
● spatial reasoning: some crude "game engine" model of a scene and its objects and attributes. All the complex biases we have built in that only get properly revealed with optical illusions. Spatial memory: cells in the brain that keep track of the connectivity of the world and do something like an automatic "SLAM", putting together a lot of information from different senses to position the brain in the world.
● reasoning by analogy, e.g. applying a proverb such as "that’s locking the barn door after the horse has gone" to a current situation.
● emotions: heuristics that make our genes more likely to spread, e.g. frustration.
● a forward simulator, which lets us roll forward and consider abstract events and situations.
● various skill acquisition heuristics: practicing something repeatedly, including the abstract idea of "resetting" an experiment, or deciding when an experiment is finished, or what its outcomes were. The heuristic inclination for "fun", experimentation, and curiosity. The heuristic of empowerment, or the idea that it is better to take actions that leave more options available in the future.
● consciousness / theory of mind: the understanding that other agents are like me but also slightly different in unknown ways. Empathy (e.g. the cringy feeling when seeing someone else get hurt). Imitation learning, or the heuristic of paying attention to and then later repeating what the other agents are doing.
Conclusion: we need to create environments that
incentivize the emergence of a cognitive toolkit.
Doing it wrong: incentivizes a lookup table of correct moves.
Doing it right: incentivizes cognitive tools.
Benefits of multi-agent environments:
- variety - the environment is parameterized by its agent
population, so an optimal strategy must be dynamically
derived, and cannot be statically “baked” as behaviors /
reflexes into a network.
- natural curriculum - the difficulty of the environment is
determined by the skill of the other agents.
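The "variety" point can be illustrated with a toy game where the best action depends entirely on what the rest of the population plays, so no fixed behavior can be baked in. Rock-paper-scissors is my choice of example here, not something from the talk, and the function name is hypothetical.

```python
ACTIONS = ["rock", "paper", "scissors"]

# PAYOFF[i][j]: reward for playing ACTIONS[i] against an opponent playing ACTIONS[j].
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def best_response(population):
    """Return the action with the highest expected reward against a mix of opponents.

    population[j] is the fraction of other agents currently playing ACTIONS[j].
    """
    expected = [
        sum(PAYOFF[i][j] * population[j] for j in range(len(ACTIONS)))
        for i in range(len(ACTIONS))
    ]
    return ACTIONS[expected.index(max(expected))]

# The optimal move shifts as the population of other agents shifts.
vs_rock_players = best_response([1.0, 0.0, 0.0])
vs_paper_players = best_response([0.0, 1.0, 0.0])
vs_scissors_players = best_response([0.0, 0.0, 1.0])
```

Because the answer changes whenever the population changes, an agent has to work out its strategy from what it observes rather than memorize one move.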
Why? Trends.
Q: What about the optimization?
A: Optimize over the whole thing: the architecture, the
initialization, the learning rule.
Write very little (or no) explicit code.
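A minimal sketch of what "optimize over the whole thing" could look like: a mutate-and-select loop over a genome that bundles an architecture choice, an initialization scale and a learning-rule parameter. Everything here is an assumption made for illustration. In particular `fitness` is a made-up stand-in score, where a real system would train and evaluate an agent for each genome.

```python
import random

def fitness(genome):
    """Hypothetical stand-in score (higher is better); not a real training run."""
    return (
        -((genome["hidden_units"] - 64) ** 2) / 1000.0
        - (genome["init_scale"] - 0.1) ** 2
        - (genome["learning_rate"] - 0.01) ** 2
    )

def mutate(genome, rng):
    """Return a randomly perturbed copy of the genome."""
    child = dict(genome)
    child["hidden_units"] = max(1, child["hidden_units"] + rng.choice([-8, 0, 8]))
    child["init_scale"] = max(1e-4, child["init_scale"] * rng.uniform(0.8, 1.25))
    child["learning_rate"] = max(1e-5, child["learning_rate"] * rng.uniform(0.8, 1.25))
    return child

def evolve(start, generations=50, children=8, seed=0):
    """Keep the best genome seen so far; each generation tries mutated copies of it."""
    rng = random.Random(seed)
    best, best_score = start, fitness(start)
    for _ in range(generations):
        candidates = [mutate(best, rng) for _ in range(children)]
        for child in candidates:
            score = fitness(child)
            if score > best_score:
                best, best_score = child, score
    return best, best_score

start = {"hidden_units": 8, "init_scale": 1.0, "learning_rate": 0.5}
best, best_score = evolve(start)
```

The explicit code is only the outer loop and the scoring hook; the design itself is found by search.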
(example small
tensorflow graph)
In Computer Vision... (chart)
Data (how large they are):
- (10^0; single image)
- Caltech 101 (~10^4 images)
- Pascal VOC (~10^5 images)
- [name not recovered] (~10^6 images)
- Images on the web (~10^9+ images)
Methods (how well they work):
- Hard Coded (edge detection etc., no learning), 70s - 90s
- Image Features (SIFT etc., learning linear classifiers on top), 90s - 2012
- [name not recovered] (learn the features, structure hard-coded)
- [name not recovered] (learn the weights and the structure)
Zone of “not going to happen.”
In Reinforcement Learning (chart)
Environments (how much they measure / incentivise general intelligence):
- Cartpole etc. (and bandits, gridworld, ...few toy tasks)
- [name not recovered] (~few dozen envs)
- [name not recovered] (simple multi-agent envs)
- Digital worlds (complex multi-agent envs)
Trend: more multi-agent / non-stationary / real-world-like.
Agents (how impressive they are):
- Hard Coded (LISP programs, no learning) (SHRDLU etc), 70s - 90s
- Value Iteration etc. (~discrete MDPs, linear function approximators), 90s - 2012
- [name not recovered] (deep nets, hard-coded various tricks)
- [name not recovered] (learn the RL [...], structure fixed)
- [name not recovered] (learn structure and learning algorithm)
Trend: more learning, more compute.
Zone of “not going to happen.”
With increasing computational resources, the trend
is towards more learning/optimization, and less
explicit design.
1970: one of many explicit (LISP)
programs that made up SHRDLU.
50 years
Large-Scale Evolution of Image Classifiers
“Learning to Cooperate, Compete, and Communicate”
OpenAI blog post, 2017
- 4 red agents cooperate to
chase 2 green agents
- 2 green agents want to
reach blue “water”
What would this look like?
- Achieve completely uninterpretable “proto-AIs” first, similar
to simple animals, but with fairly complete cognitive toolkits.
- Evolved AIs are a synthetic species that lives among us.
- We will shape them to love humans, similar to how we
shaped dogs.
- “AI safety” will become a primarily empirical discipline, not a
mathematical one as it is today.
- Some might try to evolve bad AIs, equivalent to combat dogs.
- We might have to make it illegal to evolve AI strains, or
upper bound the amount of computation per person and
closely track all computational resources on Earth.
Something not on our radar:
Data from very large VR MMORPG worlds?
Combination of some of the above?
- E.g. take the artificial life
approach, but allow agents to
access the high-level
representations of a big,
pre-trained generative model.
In order of promisingness:
- Artificial Life - “just do what nature did.”
- Something not on our radar
- Supervised learning - “it works, just scale up!”
- Unsupervised learning - “it will work, if we only scale up!”
- AIXI - “guys, I can write down optimal AI.”
- Brain simulation - “this will work one day, right?”
What do you think?
(Thank you!)
Cool Related Pointers
Sebastian’s post, which inspired the title of this talk
Rodney Brooks paper

Y conf talk - Andrej Karpathy

  • 1. Where will AGI come from? Y Conf, June 10, 2017 andrej @karpathy
  • 2. “Deep Learning” search popularity 2012 2012+ image recognition, 2010+ speech recognition, 2014+ machine translation, etc.
  • 3.
  • 5.
  • 6. CS231n: Convolutional Neural Networks for Visual Recognition (Stanford Class) 2015: 150 students 2016: 330 students 2017: 750 students 2018: ??? (max students per class is capped at 999)
  • 7. The Current State of Machine Intelligence 3.0 [Shivon Zilis]
  • 9. Two comments: 1. AI today is still very narrow*. 2. *But thanks to Deep Learning, we can repurpose solution components faster.
  • 10. Example: AlphaGo (see my Medium post “AlphaGo, in context”)
  • 11. Convenient properties of Go: 1. Deterministic. No noise in the game. 2. Fully observed. Each player has complete information. 3. Discrete action space. Finite number of actions possible. 4. Perfect simulator. The effect of any action is known exactly. 5. Short episodes. ~200 actions per game. 6. Clear + fast evaluation. According to Go rules. 7. Huge dataset available. Human vs human games.
  • 12. Q: “Can we run AlphaGo on a robot for the Amazon Picking Challenge”?
  • 13. Q: “Can we run AlphaGo on a robot for the Amazon Picking Challenge”? A:
  • 14. 1. Deterministic. No noise in the game. 2. Fully observed. Each player has complete information. 3. Discrete action space. Finite number of actions possible. 4. Perfect simulator. The effect of any action is known exactly. 5. Short episodes. ~200 actions per game. 6. Clear + fast evaluation. According to Go rules. 7. Huge dataset available. Human vs human games.
  • 15. 1. Deterministic. No noise in the game. (OK) 2. Fully observed. Each player has complete information. (OKish) 3. Discrete action space. Finite number of actions possible. (OK) 4. Perfect simulator. The effect of any action is known exactly. (TROUBLE.) 5. Short episodes. ~200 actions per game. (challenge) 6. Clear + fast evaluation. According to Go rules. (challenge) 7. Huge dataset available. Human vs human games. (not good)
  • 16. Summary so far: 1. Rising interest in AI 2. AI is still narrow 3. AI tech works in some cases and can be repurposed much faster
  • 17. “What if we succeed in making it not narrow?” Nick Bostrom, Stephen Hawking, Bill Gates, Elon Musk, Sam Altman, Stuart Russell, Eliezer Yudkowsky, ... ~2014+
  • 20. “AGI imminent.” “Oh no, AI winter imminent. My funding is about to dry up again.” Meanwhile, in Academia...
  • 21. Talk Outline: - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?” - Artificial Life - “just do what nature did.” - Something not on our radar Where could AGI come from?
  • 23. Supervised Learning: Collect lots of labeled data, train a neural network on it.
  • 24. How do we get labels of intelligent behavior?
  • 25. Short Story on AI: A Cognitive Discontinuity. Nov 14, 2015 see: link
  • 26. Amazon Mechanical Turk CORE IDEA: collect data from humans, then train a big Neural Net to mimic what humans do.
  • 27. Amazon Mechanical Turk ++ SSH lots of train data
  • 29. Amazon Mechanical Turk ++ Step 2: autonomy. Big Neural Network. STATE: vision, audio, joint positions/velocities. TASK description. ACTION: joint torques, etc.
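The core idea on these slides (record what humans do in each state, then fit a model that maps STATE to ACTION) can be sketched at toy scale. The block below is only an illustration under loud assumptions: a linear least-squares fit stands in for the "Big Neural Network", and the demonstrator, the 6-number state and the 2-number action are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical demonstrator: maps a 6-number state to a 2-number action.
# In the slides this role is played by humans on Mechanical Turk.
true_policy = rng.normal(size=(6, 2))

# "Collect data from humans": 1000 observed (state, action) pairs with a little noise.
states = rng.normal(size=(1000, 6))
actions = states @ true_policy + 0.01 * rng.normal(size=(1000, 2))

# "Train to mimic": here an ordinary least-squares fit, not a neural network.
weights, *_ = np.linalg.lstsq(states, actions, rcond=None)

def policy(state):
    """Imitation policy: predict the action the demonstrator would have taken."""
    return state @ weights

print(policy(states[0]), actions[0])
```

The fitted policy can only reproduce behavior that appears in the demonstrations, which is the limitation the later "AIs in this approach…" slides raise.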
  • 30. What would this AI look like?
  • 31. Possible hint: char-rnn The cat sat on a ma_?
  • 33.
  • 34. at first: Generate text from the model
  • 35. at first: train for a bit
  • 36. at first: train for a bit train more train more
  • 37.
  • 38. Open source textbook on algebraic geometry (LaTeX source)
  • 39.
  • 40. The low-level gestalt is right, but the high-level, long-term structure is missing. This is mitigated with more data / larger models.
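The char-rnn slides rest on next-character prediction ("The cat sat on a ma_?") followed by sampling from the trained model. A minimal sketch of that loop is below. It is not a char-rnn: a table of character counts replaces the recurrent network, and the tiny corpus and all function names are made up for illustration.

```python
import random
from collections import Counter, defaultdict

def train_char_model(text, order=3):
    """Count which character follows each context of `order` characters."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    return counts

def predict_next(counts, context, order=3):
    """Most frequent next character after the last `order` characters, or None."""
    followers = counts.get(context[-order:])
    return followers.most_common(1)[0][0] if followers else None

def generate(counts, seed, length=40, order=3, rng=None):
    """Sample text one character at a time, in proportion to the counts."""
    rng = rng or random.Random(0)
    out = seed
    for _ in range(length):
        followers = counts.get(out[-order:])
        if not followers:
            break
        chars, weights = zip(*followers.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

# Made-up training corpus.
corpus = "the cat sat on a mat. the dog sat on a mat. the cat sat on a hat. " * 20
model = train_char_model(corpus)
guess = predict_next(model, "the cat sat on a ma")
sample = generate(model, "the")
print(guess)
print(sample)
```

A model with a three-character context gets local spelling right and has no way to keep longer-range structure, which is a small-scale version of the point on slide 40.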
  • 41. AIs in this approach… - Imitate/generate human-like actions - Can these AIs be creative? - Can they assemble a room of chairs/tables? - Can they make human domination schemes?
  • 42. AIs in this approach… - Imitate/generate human-like actions - Can these AIs be creative? (Kind of) - Can they assemble a room of chairs/tables? (Yes) - Can they make human domination schemes? (No.)
  • 43. Talk Outline: - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?” - Artificial Life - “just do what nature did.” - Something not on our radar Where could AGI come from?
  • 44. Unsupervised Learning: Big generative models. 1. Initialize a Big Neural Network 2. Train it to compress a huge amount of data on the internet 3. ??? 4. Profit
  • 45. Example 2: (variational) autoencoders. Diagram: identity function through an information bottleneck of 30 numbers (must compress the data to 30 numbers to reconstruct later). Also see: autoregressive models, Generative Adversarial Networks, etc.
  • 46. Example 2: (variational) autoencoders. Meddle with the code, then “decode” to the image.
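The bottleneck idea (compress each input to 30 numbers, reconstruct it, then meddle with the code and decode again) can be sketched with a linear autoencoder. This is a simplification, not the variational autoencoder of the slides: for a linear encoder and decoder the best choice is given by the top principal directions, so the sketch uses an SVD instead of training a network. The synthetic data and all sizes other than the 30-number code are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 500 vectors of 100 numbers with roughly 30 degrees of freedom.
latent = rng.normal(size=(500, 30))
mixing = rng.normal(size=(30, 100))
data = latent @ mixing + 0.01 * rng.normal(size=(500, 100))

mean = data.mean(axis=0)
# The top right-singular vectors form the best linear encoder/decoder pair.
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
basis = vt[:30]  # shape (30, 100): the information bottleneck

def encode(x):
    """Compress a 100-number input to a 30-number code."""
    return (x - mean) @ basis.T

def decode(code):
    """Reconstruct a 100-number output from a 30-number code."""
    return code @ basis + mean

code = encode(data[0])
reconstruction = decode(code)
error = np.linalg.norm(reconstruction - data[0]) / np.linalg.norm(data[0])

# "Meddle with the code, then decode": change one code entry and reconstruct.
meddled = code.copy()
meddled[0] += 5.0
changed = decode(meddled)
print(code.shape, error)
```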
  • 47. Another example: work at OpenAI, “Unsupervised Sentiment Neuron” (Alec Radford et al.) 1. Train a large char-rnn on a large corpus of unlabeled reviews from Amazon 2. One of the neurons automagically “discovers” a small sentiment classifier (this high-level feature must help predict the next character) (char-rnn also optimizes compression of data; prediction and compression are closely linked.)
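The link between prediction and compression can be made concrete: an ideal coder driven by a predictive model spends about -log2 p bits on a symbol the model assigned probability p, so better next-character prediction means fewer bits. The sketch below measures that cost for a small character-count model. It is an illustration only; the count model, the add-one smoothing and the made-up review text are assumptions, not the setup of the OpenAI work.

```python
import math
from collections import Counter, defaultdict

def bits_per_char(train_text, test_text, order=2):
    """Average ideal code length (bits/char) on test_text under an order-`order`
    character model with add-one smoothing, plus the uniform-code baseline."""
    alphabet = set(train_text) | set(test_text)
    counts = defaultdict(Counter)
    for i in range(order, len(train_text)):
        counts[train_text[i - order:i]][train_text[i]] += 1
    total_bits = 0.0
    n = 0
    for i in range(order, len(test_text)):
        followers = counts[test_text[i - order:i]]
        p = (followers[test_text[i]] + 1) / (sum(followers.values()) + len(alphabet))
        total_bits += -math.log2(p)  # cost of this character under the model
        n += 1
    return total_bits / n, math.log2(len(alphabet))

# Made-up "reviews" for training and a held-out sentence for evaluation.
train = "this product is great. i love it. this is bad. i hate it. " * 30
held_out = "i love this product. it is great. "
model_bits, uniform_bits = bits_per_char(train, held_out)
print(model_bits, uniform_bits)
```

On this toy text the model needs fewer bits per character than a code that treats every character as equally likely.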
  • 48. Basic idea: all of the internet + Big Neural Network + compression objective
  • 49. What would this AI look like? - The neural network has a powerful “brain state”: - Given any input data, could get e.g. 10,000 numbers of the network’s “thoughts” about the data. - Given any vector of 10,000 numbers, we could maybe ask the network to generate samples of data that correspond. - Does it want to take over the world? (no; has no agency, no planning, etc.)
  • 50. Talk Outline: - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?” - Artificial Life - “just do what nature did.” - Something not on our radar Where could AGI come from?
  • 51. AIXI - Algorithmic information theory applied to general artificial intelligence. (Marcus Hutter) - Allows for a formal definition of “Universal Intelligence” (Shane Legg) - Bayesian Reinforcement Learning agent over the hypothesis space of all Turing machines.
  • 52. System identification: which Turing machine am I in? If I knew, I could plan perfectly. Prior probability P over Turing machines: “simpler worlds” are more likely. Likelihood probability P over Turing machines: which TMs are consistent with my experience so far? Multiply vertically to get a posterior.
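The prior-times-likelihood update on this slide can be sketched on a toy hypothesis space. The block is an illustration under strong assumptions, not AIXI: instead of all Turing machines, the "worlds" are bit patterns that repeat forever, a pattern's length in bits plays the role of description length, and the likelihood is 1 or 0 because each world is deterministic.

```python
from itertools import product

def hypotheses(max_len=4):
    """Toy 'worlds': each repeats a fixed bit pattern forever (pattern = program)."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def posterior(observed, max_len=4):
    """Posterior over patterns: (unnormalized) prior 2^-length times 0/1 likelihood."""
    weights = {}
    for pattern in hypotheses(max_len):
        prior = 2.0 ** -len(pattern)  # simpler worlds are more likely
        repeats = len(observed) // len(pattern) + 1
        predicted = (pattern * repeats)[:len(observed)]
        if predicted == observed:  # consistent with experience so far
            weights[pattern] = prior
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the observations")
    return {p: w / total for p, w in weights.items()}

def predict_next_bit(observed, max_len=4):
    """Posterior probability that the next bit is 1."""
    post = posterior(observed, max_len)
    return sum(w for p, w in post.items() if p[len(observed) % len(p)] == "1")

print(posterior("010"))
print(predict_next_bit("010"))
```

After seeing "010", the shortest consistent pattern "01" gets half of the posterior mass, and the prediction for the next bit is a posterior-weighted vote over all patterns that are still consistent.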
  • 53. We can write down the optimal agent’s action at time t: (from where
  • 54. Annotations on the equation: complete history of interactions up to this point; time t; time m; all possible future action-state sequences; weighted average of the total discounted reward, across all possible Turing Machines. The weights are [prior] x [likelihood] for each Turing machine (description length of the TM, number of bits).
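The equation these annotations refer to was an image and is not in the transcript. For reference, the standard finite-horizon statement of the AIXI action rule from Hutter's work is given below. The notation on the slide may differ, and this form sums rewards up to a horizon m rather than writing an explicit discount. Here U is a universal Turing machine, q ranges over programs (environments), ℓ(q) is the length of q in bits, and a, o, r are actions, observations and rewards.

```latex
a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m}
\bigl[\, r_t + \cdots + r_m \,\bigr]
\sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The innermost sum is the [prior] x [likelihood] weight: each program consistent with the full interaction history contributes its prior 2^{-ℓ(q)}, and inconsistent programs contribute nothing.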
  • 55.
  • 56. There’s just a few problems... !!! !!! !!!!!!!!!!!!11
  • 57. Attempts have been made... I like “A Monte-Carlo AIXI Approximation” from Veness et al. 2011,
  • 58. What would this agent look like? - We need to feed it a reward signal. Might be very hard to write down. Might lead to “perverse instantiations” (e.g. paper clip maximizers etc.) - Or maybe humans have a dial that gives the reward. But its actions might not be fully observable to humans. - Very computationally intractable. Also, people are really not good at writing complex code. (e.g. for “AIXI approximation”). - This agent could be quite scary. Definitely has agency.
  • 59. Talk Outline: - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?” - Artificial Life - “just do what nature did.” - Something not on our radar Where could AGI come from?
  • 60. Brain simulation: BRAIN Initiative, Human Brain Project, optogenetics, multi-electrode arrays, connectomics, Neuralink, ...
  • 61. Brain simulation - How to measure a complete brain state? - At what level of abstraction? - How to model the dynamics? - How do you simulate the “environment” to feed into senses? - Various ethical dilemmas - Timescale-bearish neuroscientists.
  • 62. Talk Outline: - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?” - Artificial Life - “just do what nature did.” - Something not on our radar Where could AGI come from?
  • 63. How did intelligence arise in nature?
  • 64. We don’t have to redo 4B years of evolution. - Work at a higher level of abstraction. We don’t have to simulate chemistry etc. to get intelligent networks. - Intelligent design. We can meddle with the system and initialize with RL agents, etc.
  • 65. Intelligence is the ability to win, in the face of world dynamics and a changing population of other intelligent agents with similar goals.
  • 66. Intelligence: “Cognitive toolkit” includes but is not limited to:
      ● attention: the at-will ability to selectively "filter out" parts of the input that are judged not to be relevant for a current top-down goal, e.g. the "cocktail party effect".
      ● working memory: some structures/processes that temporarily store and manipulate information (7 +/- 2). Related to this, the phonological loop: a special part of working memory dedicated to storing a few seconds of sound (e.g. when you repeat a 7-digit phone number in your mind to keep it in memory). Also: the visuospatial sketchpad and an episodic buffer.
      ● long-term memory of quite a few suspected different types: procedural memory (e.g. driving a car), semantic memory (e.g. the name of the current President), episodic memory (for autobiographical sequences of events, e.g. where one was during 9/11).
      ● knowledge representation: the ability to rapidly learn and incorporate facts into some "world model" that can be inferred over in what looks to be approximately Bayesian ways. The ability to detect and resolve contradictions, or propose experiments that disambiguate cases. The ability to keep track of what source provided a piece of information and later down-weight its confidence if the source is suddenly judged not trustworthy.
      ● spatial reasoning: some crude "game engine" model of a scene and its objects and attributes. All the complex biases we have built in that only get properly revealed with optical illusions. Spatial memory: cells in the brain that keep track of the connectivity of the world and do something like an automatic "SLAM", putting together a lot of information from different senses to position the brain in the world.
      ● reasoning by analogy, e.g. applying a proverb such as "that’s locking the barn door after the horse has gone" to a current situation.
      ● emotions: heuristics that make our genes more likely to spread, e.g. frustration.
      ● a forward simulator, which lets us roll forward and consider abstract events and situations.
      ● various skill acquisition heuristics: practicing something repeatedly, including the abstract idea of "resetting" an experiment, or deciding when an experiment is finished, or what its outcomes were. The heuristic inclination for "fun", experimentation, and curiosity. The heuristic of empowerment, or the idea that it is better to take actions that leave more options available in the future.
      ● consciousness / theory of mind: the understanding that other agents are like me but also slightly different in unknown ways. Empathy (e.g. the cringy feeling when seeing someone else get hurt). Imitation learning, or the heuristic of paying attention to and then later repeating what the other agents are doing.
  • 67. Conclusion: we need to create environments that incentivize the emergence of a cognitive toolkit.
  • 68. Conclusion: we need to create environments that incentivize the emergence of a cognitive toolkit. Doing it wrong: incentivizes a lookup table of correct moves.
  • 69. Conclusion: we need to create environments that incentivize the emergence of a cognitive toolkit. Doing it wrong: incentivizes a lookup table of correct moves. Doing it right: incentivizes cognitive tools.
  • 70. Benefits of multi-agent environments: - variety - the environment is parameterized by its agent population, so an optimal strategy must be dynamically derived, and cannot be statically “baked” as behaviors / reflexes into a network. - natural curriculum - the difficulty of the environment is determined by the skill of the other agents.
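The "variety" point (the best strategy depends on who else is in the population, so it cannot be baked in as a fixed reflex) can be illustrated with rock-paper-scissors. The example is not from the talk; the game, the payoff values and the population mixes are assumptions chosen to keep the sketch small.

```python
# Each move beats exactly one other move.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def expected_payoff(move, population):
    """Expected score (+1 win, -1 loss, 0 tie) against a random member of
    `population`, given as a mapping from move to population share."""
    total = 0.0
    for other, share in population.items():
        if BEATS[move] == other:
            total += share
        elif BEATS[other] == move:
            total -= share
    return total

def best_response(population):
    """The move with the highest expected payoff against this population."""
    return max(BEATS, key=lambda move: expected_payoff(move, population))

rock_heavy = {"rock": 0.6, "paper": 0.2, "scissors": 0.2}
paper_heavy = {"rock": 0.2, "paper": 0.6, "scissors": 0.2}
print(best_response(rock_heavy))   # paper
print(best_response(paper_heavy))  # scissors
```

The same agent needs a different move as soon as the population shifts, so a fixed lookup table of "correct moves" stops being correct.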
  • 71. Why? Trends. Q: What about the optimization? A: Optimize over the whole thing: the architecture, the initialization, the learning rule. Write very little (or no) explicit code. (example small TensorFlow graph)
  • 72. In Computer Vision... [Figure: datasets (how large they are) vs. models (how well they work), with a possibility frontier and a zone of “not going to happen.” Datasets: Lena (10^0; single image), Caltech 101 (~10^4 images), Pascal VOC (~10^5 images), ImageNet (~10^6 images), Google/FB images on the web (~10^9+ images). Models: Hard Coded (edge detection etc., no learning); Image Features (SIFT etc., learning linear classifiers on top); ConvNets (learn the features, structure hard-coded); CodeGen (learn the weights and the structure). Timeline labels: 70s - 90s, 90s - 2012, 2013, 2017, projection.]
  • 73. In Reinforcement Learning... [Figure: environments (how much they measure / incentivise general intelligence; toward more multi-agent / non-stationary / real-world-like) vs. agents (how impressive they are; toward more learning, more compute), with a possibility frontier and a zone of “not going to happen.” Environments: BlocksWorld (SHRDLU etc.), Cartpole etc. (and bandits, gridworld, ...few toy tasks), MuJoCo/ATARI/Universe (~few dozen envs), (simple multi-agent envs), Digital worlds (complex multi-agent envs), Reality. Agents: Hard Coded (LISP programs, no learning); Value Iteration etc. (~discrete MDPs, linear function approximators); DQN, PG (deep nets, hard-coded various tricks); RL^2 (learn the RL algorithm, structure fixed); CodeGen (learn structure and learning algorithm). Timeline labels: 70s - 90s, 90s - 2012, 2013, 2017, projection.]
  • 74. With increasing computational resources, the trend is towards more learning/optimization, and less explicit design. 1970: one of many explicit (LISP) programs that made up SHRDLU. 50 years later: “Neural Architecture Search with Reinforcement Learning” (Zoph & Le) and “Large-Scale Evolution of Image Classifiers”.
  • 75. “Learning to Cooperate, Compete, and Communicate” OpenAI blog post, 2017 - 4 red agents cooperate to chase 2 green agents - 2 green agents want to reach blue “water”
  • 76. What would this look like? - Achieve completely uninterpretable “proto-AIs” first, similar to simple animals, but with fairly complete cognitive toolkits. - Evolved AIs are a synthetic species that lives among us. - We will shape them to love humans, similar to how we shaped dogs. - “AI safety” will become a primarily empirical discipline, not a mathematical one as it is today. - Some might try to evolve bad AIs, equivalent to combat dogs. - We might have to make it illegal to evolve AI strains, or upper bound the amount of computation per person and closely track all computational resources on Earth.
  • 77. Talk Outline: - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?” - Artificial Life - “just do what nature did.” - Something not on our radar Where could AGI come from?
  • 78. + Data from very large VR MMORPG worlds?
  • 79. Combination of some of the above? - E.g. take the artificial life approach, but allow agents to access the high-level representations of a big, pre-trained generative model.
  • 80. Conclusion. In order of promisingness: - Artificial Life - “just do what nature did.” - Something not on our radar - Supervised learning - “it works, just scale up!” - Unsupervised learning - “it will work, if we only scale up!” - AIXI - “guys, I can write down optimal AI.” - Brain simulation - “this will work one day, right?”
  • 81. What do you think? (Thank you!) SL UL AIXI BrainSim ALife Other
  • 82. Cool Related Pointers: Sebastian’s post, which inspired the title of this talk; Rodney Brooks paper