Building Artificial
General Intelligence
Peter Morgan
www.deeplp.com
© Peter Morgan, May 2019
Outline of Talk
• Physical Systems
• Biological
• Non-biological
• Deep Learning
• Description
• Types
• Reinforcement Learning
• Latest Research
• Towards AGI
• Overview
• Comparisons
• Building AGI
• Conclusions
Motivation
• Solve (general) intelligence
• Use it to solve everything else
• Medicine
• Cancer
• Brain disease (Alzheimer's, etc.)
• Longevity
• Physics
• Maths
• Materials science
• Social
What is Intelligence?
How will we get there?
It takes a village to create an AGI: Physics, Computer Science, Neuroscience, Psychology
Physical Systems
• Biological
• Plants, bacteria, insects, reptiles, mammals; biological brains
• Classical (non-biological)
• CPU - Intel Xeon SP, AMD Ryzen, Qualcomm, IBM PowerPC, ARM
• GPU - Nvidia (Volta), AMD (Vega)
• FPGA - Intel (Altera), Xilinx, etc.
• ASIC - Google TPU, Graphcore IPU, Intel Nervana, Wave, …
• Neuromorphic - Human Brain Project (SpiNNaker, BrainScaleS); IBM TrueNorth; Intel Loihi, …
• Quantum
• IBM, Microsoft, Intel, Google, D-Wave, Rigetti, …
• Quantum biology? (photosynthesis, navigation, …)
• QuantumML, Quantum Intelligence
Types of Physical Computation Systems*
*Can we find a theory that unifies them all (classical, quantum, biological, non-biological)
Digital Neuromorphic
Quantum Biological
Biology
Biological
Systems are
Hierarchical
Biological
Neuron
Microstructure
Biological
Neuron
Hand drawn neuron
types
From "Structure of the Mammalian Retina"
c.1900, by Santiago Ramon y Cajal.
Neuron -
scanning
electron
microscope
Cortical
columns in
the cortex
Human
Connectome
Central
Nervous
System (CNS)
Social
Systems
A
Comparison
of Neuron
Models
Non-biological
Hardware
• Digital
• CPU
• GPU
• FPGA
• ASIC
• Neuromorphic
• Various architectures
• SpiNNaker, BrainScaleS, TrueNorth, …
• Quantum
• Different qubits
• Anyons, superconducting, photonic, …
Digital
Computing
• Abacus
• Charles Babbage
• Ada Lovelace
• Vacuum tubes (valves)
• Turing
• Von Neumann
• ENIAC
• Transistor (Bardeen, Brattain, Shockley, 1947)
• Intel
• ARM
• Nvidia
• ASICs
Cray-1
1976
160 MFlops
CPU – Intel
Xeon
Up to 32 cores, ~1 TFlops
GPU –
Nvidia Volta
V100
21 billion transistors, 120 TFlops
DGX-2 - released Mar 2018
16 V100s, 2 PFlops, 30 TB storage ($400k)
ASIC – Google TPU 3.0
360 TFlops! - Announced Google I/O, May 2018
ASIC - Graphcore IPU
>200 TFlops
Cloud TPUs
Over 100 PetaFlops!
Summit (US)
• IBM AC922 system
• 4,608 servers, each containing two 22-core IBM Power9 CPUs and six Nvidia Tesla V100 GPUs
• ~200 PFlops (3 ExaFlops mixed precision!)
• Area of two tennis courts
• 250 Petabytes storage
• 13 MW power
• $200 million
• Announced 5 June 2018
HPC –
what’s
next?
Currently 200 PFlops (Summit)
By 2020 – beyond Exascale
HPC
Timeline
Aurora 21
Exascale compute by 2021, Argonne National Lab, Intel + Cray
Processor
Performance
(MFlops)
Biology vs
Digital
Neuromorphic
Computing
• Biologically inspired
• First proposed Carver Mead, Caltech, 1980’s
• Uses analogue signals – spiking neural networks (SNN)
• SpiNNaker (Manchester, HBP, Furber)
• BrainScaleS (Heidelberg, HBP, Schemmel)
• TrueNorth (IBM, Modha)
• Intel Loihi
• Startups (Knowm, Spaun, etc.)
• Up to 1 million cores, 1 billion "neurons" (mouse scale)
• Need to scale ~100X to reach human-brain scale
• Relatively low power
• Available on the (HBP) cloud today
SpiNNaker
Neuromorphic
Computer
Neuromorphic
vs von
Neumann
TrueNorth
Performance
Neuromorphic v ASIC
Analogue v Digital
Quantum
Computing
• First proposed by Richard Feynman, Caltech, 1980’s
• Qubits – |0⟩, |1⟩ and superposition states (QM)
• (Nature is) fundamentally probabilistic at atomic scale
• Have to be kept cold (mKelvin) to avoid
noise/decoherence
• Building is an engineering problem (theory is known)
• Several approaches - superconductors, trapped ions,
semiconductors, topological structures
• Several initiatives (with access available)
• Microsoft, IBM, Google, Intel, D-Wave, Rigetti, etc.
• Can login today
• Many applications – optimization, cryptography, drug
discovery, etc.
IBM 50 Qubit
Quantum
Computer
Quantum Logic Gates
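The gates in question are just small unitary matrices acting on a state vector, which can be sketched in a few lines of pure Python (a hand-rolled toy simulator, not any particular quantum SDK; the gate choice and qubit indexing are illustrative assumptions): a Hadamard on qubit 0 followed by a CNOT produces the entangled Bell state (|00⟩ + |11⟩)/√2.

```python
import math

H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]  # Hadamard gate

def apply_single(gate, state, target):
    """Apply a 2x2 gate to the `target` qubit of a state vector."""
    new = [0.0] * len(state)
    for i, amp in enumerate(state):
        b = (i >> target) & 1                      # current value of the target bit
        for r in (0, 1):                           # distribute amplitude to both outcomes
            j = (i & ~(1 << target)) | (r << target)
            new[j] += gate[r][b] * amp
    return new

def cnot(state, control, target):
    """Flip the target bit of every basis state whose control bit is 1."""
    return [state[i ^ (1 << target)] if (i >> control) & 1 else state[i]
            for i in range(len(state))]

state = [1.0, 0.0, 0.0, 0.0]              # two qubits in |00>
state = apply_single(H, state, 0)         # superposition on qubit 0
bell = cnot(state, control=0, target=1)   # entangle: (|00> + |11>)/sqrt(2)
```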
Summary – Now have three non-biological stacks
Algorithms
Distributed Layer
OS
Hardware
Classical Neuromorphic Quantum
Outline
• Physical Systems
• Biological
• Non-biological
• Deep Learning
• Description
• Types
• Reinforcement Learning
• Latest Research in DL
• Towards AGI?
• Overview
• Comparisons
• AGI
• Conclusions
Early papers
Nodes and Layers
More Neural Networks (“Neural Network Zoo”)
Computation in each node
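The per-node computation is a weighted sum of the inputs followed by a nonlinear activation; a minimal sketch (the weights, bias and choice of sigmoid are illustrative, not taken from the slide):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def node(inputs, weights, bias):
    # weighted sum of inputs, then a nonlinear activation
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

y = node([1.0, 2.0], [0.5, -0.25], 0.0)   # z = 0.5 - 0.5 = 0, so y = sigmoid(0) = 0.5
```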
Universal Approximation Theorem
• A feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of Rn, under mild assumptions on the activation function
• We can define F(x) = Σi vi σ(wi·x + bi) as an approximate realization of f(x)
• One of the first versions of the theorem was proved by George Cybenko in 1989 for sigmoid activation functions
• Kurt Hornik showed in 1991 that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture, that gives neural networks the potential to be universal approximators
• Cybenko, G., Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems, 2(4), 303-314, 1989
• Hornik, K., Approximation Capabilities of Multilayer Feedforward Networks, Neural Networks, 4(2), 251–257, 1991
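The theorem can be checked numerically: a minimal pure-Python sketch (assuming a toy target f(x) = x² on [0, 1] and hand-set rather than trained weights) builds a "staircase" of steep sigmoid units and measures the worst-case error on a fine grid:

```python
import math

def sigmoid(z):
    # numerically stable sigmoid (avoids overflow for large negative z)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def one_layer_approx(f, n=50, k=2000.0):
    """Approximate f on [0, 1] with a single hidden layer of n steep sigmoid units."""
    steps = [(i - 0.5) / n for i in range(1, n + 1)]                 # unit "thresholds"
    weights = [f(i / n) - f((i - 1) / n) for i in range(1, n + 1)]   # output weights
    bias = f(0.0)
    return lambda x: bias + sum(v * sigmoid(k * (x - c)) for v, c in zip(weights, steps))

F = one_layer_approx(lambda x: x * x)
max_err = max(abs(F(x / 500) - (x / 500) ** 2) for x in range(501))
```

With 50 units the staircase tracks the target to within a few hundredths, and the error shrinks as more units are added, which is the content of the theorem.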
Computation Graph
https://www.tensorflow.org/programmers_guide/graph_viz
Hyperparameters
• Activation function
• Optimizations
• Loss (cost) function
• Learning rate
• Initialization
• Batch normalization
• Automation
• Hyperparameter tuning
• AutoML
• https://research.googleblog.com/2018/03/using-machine-learning-to-discover.html
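Hyperparameter tuning at its simplest is a search loop over candidate settings; a toy sketch (grid-searching the learning rate for gradient descent on a one-parameter quadratic loss — an illustrative problem, not AutoML):

```python
def train(lr, steps=50):
    """Gradient descent on loss(w) = (w - 3)^2 starting from w = 0; returns final loss."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)
        w -= lr * grad
    return (w - 3.0) ** 2

# simple grid search over the learning rate: too small converges slowly,
# too large diverges, and the grid picks out the sweet spot
grid = [1e-3, 1e-2, 1e-1, 0.5, 1.1]
best_lr = min(grid, key=train)
```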
Deep
Learning
Performance
Image classification
Deep Learning Performance
The ImageNet error rate is now around 2.2%, less than half that of humans (~5%)
Convolutional Neural
Networks
• First developed in the 1970s.
• Widely used for image recognition and classification.
• Inspired by biological processes, CNNs are a type of feed-forward ANN.
• The individual neurons are tiled in such a way that they respond to overlapping regions in the visual field.
• Yann LeCun – Bell Labs, 1990s
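The overlapping receptive fields amount to sliding a small kernel over the image; a minimal pure-Python valid convolution (the edge-detecting kernel and the tiny image are illustrative):

```python
def conv2d(image, kernel):
    """Valid cross-correlation of a 2-D image with a kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # each output unit responds to one overlapping patch of the input
            out[i][j] = sum(kernel[a][b] * image[i + a][j + b]
                            for a in range(kh) for b in range(kw))
    return out

# a vertical-edge detector on a small image with an edge down the middle
image = [[0, 0, 1, 1]] * 4
kernel = [[1, -1], [1, -1]]
edges = conv2d(image, kernel)   # strong response only where the edge sits
```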
Recurrent Neural Networks
• First developed in the 1970s.
• RNNs are neural networks that are used to predict the next element in a sequence or time series.
• This could be, for example,
words in a sentence or letters in
a word.
• Applications include predicting
or generating music, stories,
news, code, financial instrument
pricing, text, speech, in fact the
next element in any event
stream.
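The defining recurrence is that the hidden state mixes the current input with the previous state; a scalar toy sketch with hand-set (untrained) weights shows how past inputs persist in the state:

```python
import math

def rnn_forward(xs, w_x=0.5, w_h=0.8, w_y=1.0, h=0.0):
    """Scalar RNN: the hidden state h carries a summary of everything seen so far."""
    ys = []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)   # recurrence: new state mixes input and old state
        ys.append(w_y * h)                 # readout: prediction for the next element
    return ys

# a single pulse followed by zero inputs: the output decays but stays nonzero,
# i.e. the network "remembers" the earlier input
ys = rnn_forward([1.0, 0.0, 0.0])
```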
GANs
Generative Adversarial Networks – introduced by Ian Goodfellow et al in 2014 (see references)
A class of artificial intelligence algorithms used in unsupervised deep learning
A theory of adversarial examples, resembling what we have for normal supervised learning
Implemented as a system of two neural networks: a discriminator D and a generator G
D and G contest with each other in a zero-sum game framework
The generator generates candidates and the discriminator evaluates them
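The zero-sum objective can be checked on a toy discrete example: for fixed data and generator distributions, the discriminator maximizing V(D) should approach the known optimum D*(x) = p_data(x) / (p_data(x) + p_g(x)) (the two-point distributions below are illustrative):

```python
import math

# toy discrete distributions over two points for the data and the generator
p_data = [0.8, 0.2]
p_gen = [0.3, 0.7]

def value(d):
    """GAN objective V(D) = E_data[log D(x)] + E_gen[log(1 - D(x))]."""
    return sum(p_data[x] * math.log(d[x]) + p_gen[x] * math.log(1.0 - d[x])
               for x in range(2))

# brute-force search for the discriminator that maximises V
grid = [i / 100 for i in range(1, 100)]
best = max(((d0, d1) for d0 in grid for d1 in grid), key=value)

# theory: the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_gen(x))
d_star = [p_data[x] / (p_data[x] + p_gen[x]) for x in range(2)]
```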
Stacked Generative Adversarial Networks
https://arxiv.org/abs/1612.04357v1
NN Models
AlexNet (Toronto)
VGG (Oxford)
ResNet (Microsoft)
Inception (Google)
DenseNet (Cornell)
SqueezeNet (Berkeley)
MobileNet (Google)
NASNet (Google)
And many (hundreds) more ...
Deep
Learning
Frameworks
Top 20 ML/DL Frameworks
KD Nuggets Feb 2018 https://www.kdnuggets.com/2018/02/top-20-python-ai-machine-learning-open-source-projects.html
(Chart legend: * = Deep Learning, o = Machine Learning; deep learning entries include MXNet and CNTK)
TensorFlow
• TensorFlow is the open-source deep learning library from Google (released Nov 2015)
• It is their second generation system for the implementation and deployment of
large-scale machine learning models
• Written in C++ with a Python interface, originated from research and deploying
machine learning projects throughout a wide range of Google products and
services
• Initially TF ran only on a single node (your laptop, say), but now runs on distributed
clusters
• Available across all the major cloud providers (TFaaS)
• Second most popular framework on GitHub
• Over 100,000 stars as of May 2018
• https://www.tensorflow.org/
TensorFlow supports many platforms
• CPU
• GPU
• TPU / Cloud TPU
• Android
• iOS
• Raspberry Pi
Growth of Deep Learning at Google
Directories containing model description files, and many more …
TensorFlow Popularity
Other
Frameworks
• CNTK (Microsoft)
• MXNet (Amazon)
• Keras (Open source community)
• PyTorch (Facebook)
• Neon (Intel)
• Chainer (Preferred Networks)
Data Sets
• Text, speech, images, video, time series
• Examples - MNIST and Labeled Faces in the Wild (LFW).
Open Source
• ML Frameworks – open source (e.g., TensorFlow)
• Operating systems – open source (Linux)
• Hardware – open source (OCP = Open Compute Project)
• Data sets – open source (see previous slide)
• Research – open source (see arXiv)
• The fourth industrial revolution will be (is) open source
Reinforcement
Learning
• Goal driven
• Reward and penalty
• TD Learning
• DQN
• AlphaGo
• Latest research
• http://metalearning-symposium.ml
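TD learning can be shown in a few lines: TD(0) value estimation on a deterministic two-state chain (the chain, learning rate and episode count are illustrative):

```python
def td0(episodes=200, alpha=0.1, gamma=1.0):
    """TD(0) on a two-state chain: A -> B -> terminal, reward 1 on the final step."""
    V = {"A": 0.0, "B": 0.0, "T": 0.0}
    for _ in range(episodes):
        # transition A -> B, reward 0: bootstrap from the estimate of B
        V["A"] += alpha * (0.0 + gamma * V["B"] - V["A"])
        # transition B -> terminal, reward 1
        V["B"] += alpha * (1.0 + gamma * V["T"] - V["B"])
    return V

V = td0()   # both values converge toward the true return of 1
```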
RL Research
Directions
• Deep Reinforcement Learning Symposium, NIPS 2017
• https://sites.google.com/view/deeprl-symposium-nips2017/home
• Berkeley (BAIR) http://bair.berkeley.edu
• Pieter Abbeel
• Sergey Levine
• Deepmind https://deepmind.com
• IMPALA (DMLab) https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/
• OpenAI https://openai.com
• Research white papers
• Graphcore - Bristol ASIC company
• https://www.graphcore.ai/posts/directions-of-ai-research
Outline
• Physical Systems
• Biological
• Non-biological
• Deep Learning
• Description
• Types
• Reinforcement Learning
• Latest Research in DL
• Towards AGI
• Overview
• Comparisons
• Building AGI
• Conclusions
Towards AGI
• What do we need?
• Active Inference
• Other approaches
• Applications
• Building AGI
AGI = Artificial General Intelligence
Comparisons - ANN vs BNN
• Neural circuits in the brain develop via synaptic pruning; a process by which connections
are overproduced and then eliminated over time
• In contrast, computer scientists typically design networks by starting with an initially
sparse topology and gradually adding connections
• AI (specific) vs AGI (general)
• Yann LeCun – CNNs at Bell Labs in the '80s/'90s – "mathematical, not biological"
• We have gone as far as we can with "just" mathematics
• Now almost every researcher looking to biology for inspiration
• Costa et al, 2018, etc. (see “Bio-plausible Deep Learning” in reference section)
ANN = Artificial Neural Networks
BNN = Biological Neural Networks
Approaches to AGI
• Helmholtz (late 1800s)
• Friston – Active Inference
• Tishby – Information bottleneck
• Bialek – Biophysics
• Hutter - AIXI
• Schmidhuber – Gödel Machine
• Etc.
Active Inference
• Free Energy Principle
• Systems act to minimize their expected free energy
• Reduce uncertainty (or surprisal)
• F = Complexity – Accuracy
• Prediction error = expected outcome – actual outcome = surprise
• Theory of Everything (ToE)
• In physics, a theory unifying gravity and quantum mechanics is called a ToE
• But Active Inference is actually even more encompassing than this
• It encompasses all interactions and dynamics (physical phenomena)
• Over all time scales
• Over all distance scales
• Also see Constructor Theory
• David Deutsch (Oxford)
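The F = complexity − accuracy decomposition can be verified on a two-state discrete model (the prior and likelihood values are illustrative): the free energy of the exact posterior equals the surprise −ln p(o), and any other belief q scores worse:

```python
import math

# generative model: prior over hidden states s, likelihood of the observed outcome o
prior = [0.5, 0.5]   # p(s)
lik = [0.9, 0.1]     # p(o | s) for the observed o

def free_energy(q):
    """F = complexity - accuracy = KL[q(s) || p(s)] - E_q[ln p(o|s)]."""
    complexity = sum(q[s] * math.log(q[s] / prior[s]) for s in range(2) if q[s] > 0)
    accuracy = sum(q[s] * math.log(lik[s]) for s in range(2) if q[s] > 0)
    return complexity - accuracy

evidence = sum(prior[s] * lik[s] for s in range(2))            # p(o)
posterior = [prior[s] * lik[s] / evidence for s in range(2)]   # exact Bayesian posterior

F_post = free_energy(posterior)
surprise = -math.log(evidence)   # F is minimised when q is the posterior, where F = surprise
```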
What are the principles?
• Newtonian mechanics – three laws
• Special relativity – invariance of laws under a Lorentz transformation
• GR – Principle of Equivalence
• Electromagnetism – Maxwell's equations
• Thermodynamics – three laws
• Quantum mechanics – uncertainty principle
• Relativistic QM – Dirac equation
• Dark energy/dark matter – we don't know yet
• All of the above = Principle of Least Action
Key Concepts (general, hierarchical, cognitive)
• Bayesian inference
• Predictive coding
• Generative models
• Cortical organization
• Perception and action
• Learning
• Decision making
• Affect (emotional intelligence)
• Computational psychiatry
• Developmental psychology
• Social interactions
Analogy – Einstein’s General Theory of Relativity
• Made some very general (and insightful)
assumptions about the laws of physics in a
gravitational field (non-inertial frames)
• Equivalence principle
• Covariance of laws of physics
• Generalised coordinate system –
Riemannian geometry
• Spacetime is curved
• Standing on the shoulders of giants
• After ten years of hard work he finally
wrote down his now famous field equations
All known physics – Field theoretic
Active Inference - Information theoretic (uses generalised free energy)

Perceptual inference: $Q(s) = \arg\min_Q F(\pi,\tau)$

Policy selection: $Q(\pi) = \arg\min_Q \mathbb{E}_Q[F(\pi,\tau)] + D[Q(\pi)\,\|\,P(\pi)]$

$F(\pi,\tau) = \mathbb{E}_Q[\ln Q(s_\tau|\pi) - \ln P(o_\tau,s_\tau|\pi)]$ (entropy $-$ energy)

$\ln P(\pi) = -G(\pi,\tau)$

$G(\pi,\tau) = \mathbb{E}_Q[\ln Q(s_\tau|\pi) - \ln P(o_\tau,s_\tau|\pi)]$ (entropy $-$ energy, with the expectation taken under the predictive density $Q(o_\tau,s_\tau|\pi)$)
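Policy selection can be illustrated numerically: compute G for two toy policies and pass −G through a softmax to obtain the policy posterior (the outcome distributions, ambiguity values and preferences below are illustrative, and this sketch uses the cost-plus-ambiguity decomposition of expected free energy):

```python
import math

P_pref = [0.9, 0.1]   # prior preferences over outcomes, P(o)

def expected_free_energy(Q_out, ambiguity):
    """G = expected cost (KL from preferred outcomes) + expected ambiguity."""
    cost = sum(q * math.log(q / p) for q, p in zip(Q_out, P_pref) if q > 0)
    return cost + ambiguity

# two toy policies: one leads reliably to preferred outcomes, one is uninformative
G = [expected_free_energy([0.85, 0.15], 0.1),
     expected_free_energy([0.5, 0.5], 0.5)]

# policy posterior: Q(pi) = softmax(-G), so low expected free energy means high probability
Z = sum(math.exp(-g) for g in G)
Q_pi = [math.exp(-g) / Z for g in G]
```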
$G(\pi,\tau) = \mathbb{E}_{Q(o,s)}[F(\pi,\tau)]$
$\quad = \mathbb{E}_{Q(o,s)}[\ln Q(o_\tau|\pi) + \ln Q(s_\tau|\pi) - \ln P(o_\tau,s_\tau)]$ (entropy $-$ energy)
$\quad = \underbrace{D[Q(s_\tau|\pi)\,\|\,P(s_\tau)]}_{\text{expected cost}} - \underbrace{D[Q(o_\tau,s_\tau|\pi)\,\|\,Q(s_\tau|\pi)\,Q(o_\tau|\pi)]}_{\text{epistemic value (mutual information)}}$

Generalised free energy – with some care: for past time points ($\tau \le t$) the outcome has already been observed, so $Q(o_\tau|s_\tau)$ is replaced by the observed $o_\tau$; for future time points ($\tau > t$) it is the likelihood $P(o_\tau|s_\tau)$, and $P(s_\tau)$ denotes the prior over future states.
Active Inference
Karl Friston - UCL
Discrete formulation (policy selection):
$\ln P(\pi) = -G(\pi)$
$\pi = \arg\min_\pi G(\pi,\tau)$
$G(\pi,\tau) = \mathbb{E}_Q[\ln Q(s_\tau|\pi) - \ln P(o_\tau,s_\tau)]$ (expected entropy $-$ expected energy)
$\quad = \underbrace{D[Q(s_\tau|\pi)\,\|\,P(s_\tau)]}_{\text{expected cost}} + \underbrace{\mathbb{E}_Q[H[P(o_\tau|s_\tau)]]}_{\text{expected ambiguity}}$

Dynamic formulation (action): active states minimise expected free energy, $a[t] = \arg\min_a G(a[t])$, with sensory states mediating the influence of external states on internal states.

[Figure: expected surprise and free energy in the discrete and dynamic formulations, mapped onto brain regions – outcomes $o_t$ in occipital cortex, $G$ in hippocampus, policies $\pi$ in striatum, precision $\beta$ in prefrontal cortex and VTA/SN, action $u_t$ in motor cortex.]
What is free-energy?
Free-energy is basically prediction error
where small errors mean low surprise
General Principle – Systems act to minimize uncertainty (their
expected free energy)
sensations – predictions
= prediction error
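A one-variable sketch of perception as prediction-error minimization (the learning rate and values are illustrative): repeatedly updating the prediction toward the sensation drives the error toward zero.

```python
def perceive(sensation, prediction, lr=0.3, steps=20):
    """Perception: move the prediction toward the sensation to reduce prediction error."""
    for _ in range(steps):
        error = sensation - prediction   # prediction error = sensations - predictions
        prediction += lr * error         # update beliefs to explain the error away
    return prediction

mu = perceive(sensation=5.0, prediction=0.0)   # converges close to the sensation
```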
The Markov blanket of cells to brains
External states: $\dot{\psi} = f_\psi(s, a, \psi)$
Sensory states: $\dot{s} = f_s(s, a, \psi)$
Active states: $\dot{a} = f_a(s, a, \lambda)$
Internal states: $\dot{\lambda} = f_\lambda(s, a, \lambda)$
[Figure: the same Markov-blanket partition at the scale of a cell and of a brain]
But what about the Markov blanket?
Internal and active states both perform a gradient ascent on log model evidence (surprise is bounded by free energy, $F \ge -\ln p(s|m)$):
Perception: $\dot{\lambda} = Q_\lambda \nabla_\lambda \ln p(s|m)$
Action: $\dot{a} = Q_a \nabla_a \ln p(s|m)$
This single principle connects several formulations:
• Value (Pavlov) – reinforcement learning, optimal control and expected utility theory
• Surprise (Barlow) – infomax, minimum redundancy and the free-energy principle
• Entropy (Haken) – self-organisation, synergetics and homoeostasis
• Model evidence (Helmholtz) – Bayesian brain, evidence accumulation and predictive coding
Application
Summary
• Biological agents resist the second law of thermodynamics
• They must minimize their average surprise (entropy)
• They minimize surprise by suppressing prediction error (free-energy)
• Prediction error can be reduced by changing predictions (perception)
• Prediction error can be reduced by changing sensations (action)
• Perception entails recurrent message passing in the brain to optimise predictions
• Action makes predictions come true (and minimises surprise)
Simulations:
• Perception – birdsong and categorization; simulated lesions
• Action – active inference; goal-directed reaching
• Policies – control and attractors; the mountain-car problem
Building AGI
Can we build general intelligence?
• We have the theory – active inference
• We have the algorithms/software
• We have the hardware (ASIC, neuromorphic)
• We have the data sets (Internet plus open data sets)
• Need to build out libraries
• A TensorFlow for general intelligence
• Open source? (Open/closed)
• Apollo Project of our time – “Fourth Revolution”
• Human Brain Project
• Deepmind
• BRAIN project
• Should we build AGI/ASI? – safety, ethics, singularity?
Competitive Landscape
Major AI Efforts
Other AGI
Projects
• OpenCog – Ben Goertzel (US)
• Numenta – Jeff Hawkins (US)
• Vicarious – Dileep George (US)
• NNAISENSE – Jürgen Schmidhuber (Switzerland)
• AGI Innovations – Peter Voss (US)
• GoodAI – Marek Rosa (Czech)
• Curious AI – (Finland)
• Eurisko – Doug Lenat (US)
• SOAR – CMU
• ACT-R – CMU
• Sigma – Paul Rosenbloom – USC
• Plus many more
Implementations & Applications
• BNN Simulation Frameworks – SPM, PyNN, NEST, NEURON, Brian
• Various open source frameworks on GitHub
• Hearing aids - GN Group (DK)
• Order of Magnitude - Christian Kaiser (SV)
• Turing.AI – Our company (London)
Turing.AI - Two main, very broad, product areas – vision and language (stealth mode)
1. Active Eyes – Adaptive, learning cameras for 'walk through' airports and 'counterless' stores
2. True NLP – An Active Inference based NLP for home, retail & commercial applications
https://turing-ai.co
Conclusions
• Deep Learning (ANN) is lacking many of the characteristics and attributes needed
for a general theory of intelligence
• Active inference is such a theory (A ToE* which includes AGI)
• ANN research groups are now (finally) turning to biology for inspiration
• Bioplausible models are starting to appear
• Some groups are starting to look at active inference
• AGI in five years? Ten years?
• Still have to wait for hardware to mature
• Neuromorphic might be the platform that gets us there
* ToE = Theory of Everything
References
Neuroscience - Books
• Saxe, G. et al, Brain entropy and human intelligence: A resting-state fMRI study, PLOS One,
Feb 12, 2018
• Sterling, P. and Laughlin, S., Principles of Neural Design, MIT Press, 2017
• Slotnick, S., Cognitive Neuroscience of Memory, Cambridge Univ Press, 2017
• Engel, Friston & Kragic, Eds, The Pragmatic Turn - Toward Action-Oriented Views in
Cognitive Science, MIT Press, 2016
• Marcus G., & J. Freeman, Eds, The Future of the Brain, Princeton, Univ Press, 2015
• Gerstner, W. et al, Neuronal Dynamics, Cambridge Univ Press, 2014
• Kandel, E., Principles of Neural Science, 5th ed, McGraw-Hill, 2012
• Rabinovich, Friston and Varona, Eds, Principles of Brain Dynamics, MIT Press, 2012
• Jones, E. G., Thalamus, Cambridge Univ. Press, 2007
• Dayan, P. and L. Abbott, Theoretical Neuroscience, MIT Press, 2005
Neuroscience - Papers
• Crick, F., The recent excitement about neural networks, Nature 337, 129–132, 1989
• Rao RP and DH Ballard, Predictive coding in the visual cortex, Nature Neuroscience 2:79–87, 1999
• Izhikevich, E. M., Solving the distal reward problem through linkage of STDP and dopamine
signalling, Cereb. Cortex 17, 2443–2452, 2007
• How the brain constructs the world, 2018 https://medicalxpress.com/news/2018-02-brain-world.html
• Lamme, V. A. F. and Roelfsema, P. R., The distinct modes of vision offered by feedforward and recurrent
processing, Trends Neurosci. 23, 571–579, 2000
• Sherman, S. M., Thalamus plays a central role in ongoing cortical functioning, Nat. Neurosci. 16, 533–
541, 2016
• Harris, K. D. and Shepherd, G. M. G., The neocortical circuit: themes and variations, Nat.
Neurosci. 18, 170–181, 2015
• van Kerkoerle, T. et al, Effects of attention and working memory in the different layers of monkey
primary visual cortex, Nat. Commun. 8, 13804, 2017
• Roelfsema, P.R. and A. Holtmaat, Control of synaptic plasticity in deep cortical networks, Nature
Reviews Neuroscience, 19, pages 166–180, 2018
Hardware
• Wang, Z. et al, Fully memristive neural networks for pattern classification with
unsupervised learning, Nature Electronics, 8 Feb, 2018
• Microsoft Research, The Future is Quantum, Jan 17, 2018
https://www.microsoft.com/en-us/research/blog/future-is-quantum-with-dr-krysta-
svore/?OCID=MSR_podcast_ksvore_fb
• Suri, M. Advances in Neuromorphic Hardware, Springer, 2017
• Nanalyze, 12 AI Hardware Startups Building New AI Chips, May 2017
https://www.nanalyze.com/2017/05/12-ai-hardware-startups-new-ai-chips/
• Lacey, G. et al, Deep Learning on FPGAs: Past, Present, and Future, Feb 2016
https://arxiv.org/abs/1602.04283
• Human Brain Project, Silicon Brains https://www.humanbrainproject.eu/en/silicon-
brains/
• Artificial Brains http://www.artificialbrains.com
Classical Deep Learning
• Schmidhuber, Jurgen, Deep learning in neural networks: An overview, Neural Networks, 61:85–117, 2015
• Goodfellow, I., Y. Bengio and A. Courville, Deep Learning, MIT Press, 2016
• LeCun, Y., Bengio, Y., and Hinton, G., Deep Learning, Nature, v.521, p.436–444, May 2015
http://www.nature.com/nature/journal/v521/n7553/abs/nature14539.html
• Britz, D. et al, Massive Exploration of Neural Machine Translation Architectures, Mar 2017
https://arxiv.org/abs/1703.03906
• Liu H. et al, Hierarchical representations for efficient architecture search, 2017
https://arxiv.org/abs/1711.00436
• NIPS 2017 Proceedings https://papers.nips.cc/book/advances-in-neural-information-processing-systems-30-
2017
• Deepmind papers https://deepmind.com/blog/deepmind-papers-nips-2017/
• Jeff Dean, Building Intelligent Systems with Large Scale Deep Learning, TensorFlow slides, Google Brain,
2017
• Rawat, W. and Z. Wang, Deep Convolutional Neural Networks for Image Classification: A Comprehensive
Review, Neural Computation, 29(9), Sept 2017
New Ideas in Deep Learning
• Pham H. et al, Efficient Neural Architecture Search via Parameter Sharing, Feb 2018,
https://arxiv.org/abs/1802.03268
• Pearl, Judea, Theoretical Impediments to Machine Learning With Seven Sparks from the Causal
Revolution, Jan 2018, https://arxiv.org/abs/1801.04016
• Marcus, Gary, Deep Learning: A Critical Appraisal, Jan 2018, https://arxiv.org/abs/1801.00631
• Chaudhari, P. and S. Soatto, Stochastic gradient descent performs variational inference, Jan 2018,
https://arxiv.org/abs/1710.11029
• Vidal, R. et al, The Mathematics of Deep Learning, Dec 2017, https://arxiv.org/abs/1712.04741
• Sabour, S. et al, Dynamic Routing Between Capsules, Nov 2017, https://arxiv.org/abs/1710.09829
• Jaderberg, M. et al, Population Based Training of Neural Networks, 28 Nov, 2017,
https://arxiv.org/abs/1711.09846
• Chaudhari, P. and S. Soatto, On the energy landscape of deep networks, Apr 2017,
https://arxiv.org/abs/1511.06485
• Scellier, B. and Y. Bengio, Equilibrium propagation: bridging the gap between energy-based models
and backpropagation, Front. Comput. Neurosci. 11, 24, 2017
Bio-plausible Deep Learning
• Bengio, Y. et al, Towards Biologically Plausible Deep Learning, Aug 2016
https://arxiv.org/abs/1502.04156
• Marblestone, A.H. et al, Toward an Integration of Deep Learning and Neuroscience, Front
Comput Neurosci., 14 Sept, 2016
• Costa, R.P. et al, Cortical microcircuits as gated-recurrent neural networks, Jan 2018
https://arxiv.org/abs/1711.02448
• Lillicrap T.P. et al, Random synaptic feedback weights support error backpropagation for
deep learning, Nature Communications 7:13276, 2016
• Hassabis, D. et al, Neuroscience-Inspired Artificial Intelligence, Neuron, 95(2), July 2017
• Sacramento, J. et al, Dendritic error backpropagation in deep cortical microcircuits, Dec
2017 https://arxiv.org/abs/1801.00062
• Guerguiev, J. et al, Towards deep learning with segregated dendrites, eLife Neuroscience, 5
Dec, 2017
Cognitive Science
• Dissecting artificial intelligence to better understand the human brain, Cognitive Neuroscience
Society, March 25, 2018 https://medicalxpress.com/news/2018-03-artificial-intelligence-human-
brain.html
• Barbey, A., Network Neuroscience Theory of Human Intelligence, Trends in Cognitive Sciences,
22(1), Jan 2018
• Navlakha, B. et al, Network Design and the Brain, Trends in Cognitive Sciences, 22 (1), Jan 2018
• Lake, B. et al, Building Machines That Learn and Think Like People, Nov 2016
https://arxiv.org/abs/1604.00289
• Lake, B., et al, Human-level concept learning through probabilistic program induction, Science,
350(6266) Dec 2015
• Tenenbaum, J.B. et al, How to Grow a Mind: Statistics, Structure, and Abstraction, Science,
331(1279) March 2011
• Trends in Cognitive Sciences, Special Issue: The Genetics of Cognition 15 (9), Sept 2011
• William Bialek publications, Princeton https://www.princeton.edu/~wbialek/categories.html
Active Inference
• Friston, K., The free-energy principle: a unified brain theory? Nature Reviews
Neuroscience, 11(2), 2010
• Friston, K., Life as we know it, Journal of the Royal Society Interface, 3 July, 2013
• Friston, K. et al, Active Inference: A Process Theory, Neural Computation, 29(1), Jan 2017
• Friston, K., Consciousness is not a thing, but a process of inference, Aeon, 18 May, 2017
• Kirchhoff, M. et al, The Markov blankets of life, Journal of the Royal Society Interface, 17
Jan, 2018
• Frassle, S. et al, A generative model of whole-brain effective connectivity, Neuroimage, 25
May, 2018
• Friston, K. et al, Deep temporal models and active inference, Neuroscience &
Biobehavioral Reviews, May 2018
https://www.researchgate.net/publication/325017738_Deep_temporal_models_and_ac
tive_inference
AGI
• Schmidhuber, J., Goedel Machines: Self-Referential Universal Problem Solvers Making Provably
Optimal Self-Improvements, Dec 2006, https://arxiv.org/abs/cs/0309048
• Wolpert, D., Physical limits of inference, Oct 2008, https://arxiv.org/abs/0708.1362
• Veness, J. et al, A Monte Carlo AIXI Approximation, Dec 2010, https://arxiv.org/abs/0909.0801
• Sunehag, P. and M. Hutter, Principles of Solomonoff Induction and AIXI, Nov 2011,
https://arxiv.org/abs/1111.6117
• Hutter, M., One Decade of Universal Artificial Intelligence, Feb 2012,
https://arxiv.org/abs/1202.6153
• Silver, D. et al, Mastering the game of Go without human knowledge, Nature, Vol 550, 19 Oct, 2017
• Goertzel, B., Toward a Formal Model of Cognitive Synergy, Mar 2017,
https://arxiv.org/abs/1703.04361
• Hauser, Hermann, Are Machines Better than Humans? Evening lecture on machine intelligence at
SCI, London, 25 October 2017 https://www.youtube.com/watch?v=SVOMyEeXUow
Information Theory
• Shwartz-Ziv, R. and N. Tishby, Opening the Black Box of Deep Neural Networks via
Information, Apr 29, 2017, https://arxiv.org/abs/1703.00810
• Chaitin, G.J., From Philosophy to Program Size, Mar 2013,
https://arxiv.org/abs/math/0303352
• Solomonoff, R.J., Machine Learning — Past and Future, Revision of lecture given at
AI@50, The Dartmouth Artificial Intelligence Conference, July 13-15, 2006
• Publications of A. N. Kolmogorov, Annals of Probability, 17(3), July 1989
• Levin, L. A., Universal Sequential Search Problems, Problems of Information Transmission,
9(3), 1973
• Shannon, C.E., A Mathematical Theory of Communication, Bell System Technical Journal,
27 (3):379–423, July 1948
• AIT https://en.m.wikipedia.org/wiki/Algorithmic_information_theory
Classic Papers
• Marletto, Chiara, Constructor Theory of Life, Journal of the Royal Society Interface, 12(104), 2015
• Crick F., The recent excitement about neural networks, Nature 337:129–132, 1989
• Rumelhart DE, Hinton GE, Williams RJ, Learning representations by back-propagating errors, Nature
323:533–536, 1986
• Solomonoff, R.J., A Formal Theory of Inductive Inference, Part 1, Information and Control, 7(1), Mar,
1964, http://world.std.com/~rjs/1964pt1.ps
• F. Rosenblatt, A probabilistic model for information storage and organization in the brain, Psych.
Rev. 62, 386-407, 1958
• Turing, A.M., Computing Machinery and Intelligence, Mind 59:433-460, 1950
• Schrödinger, E., What is Life? Based on lectures delivered at Trinity College, Dublin, Feb 1943
http://www.whatislife.ie/downloads/What-is-Life.pdf
• McCulloch, W.S. and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bulletin of
Mathematical Biophysics, 5(4):115–133, 1943
• Kolmogorov, A., On Analytical Methods in the Theory of Probability, Mathematische Annalen,
104(1), 1931
Books
• Sutton, R. S. & A.G. Barto, Reinforcement Learning, 2nd ed., MIT Press, 2018
• Goodfellow, I. et al, Deep Learning, MIT Press, 2016
• Li, Ming and Paul Vitanyi, An Introduction to Kolmogorov Complexity and Its
Applications. Springer-Verlag, N.Y., 2008
• Hutter M., Universal Artificial Intelligence, Springer–Verlag, 2004
• MacKay, David, Information theory, inference and learning algorithms, Cambridge
University Press, 2003
• Wolfram, S., A New Kind of Science, Wolfram Media, 2002
• Hebb, D. O. The Organization of Behavior, A Neuropsychological Theory, John Wiley &
Sons, 1949
Final Word …
https://www.youtube.com/watch?v=7ottuFZYflg
Questions

Cortical columns in the cortex
© Peter Morgan, May 2019
Non-biological Hardware
• Digital
• CPU
• GPU
• FPGA
• ASIC
• Neuromorphic
• Various architectures
• SpiNNaker, BrainScaleS, TrueNorth, …
• Quantum
• Different qubits
• Anyons, superconducting, photonic, …
© Peter Morgan, May 2019
Digital Computing
• Abacus
• Charles Babbage
• Ada Lovelace
• Vacuum tubes (valves)
• Turing
• Von Neumann
• ENIAC
• Transistor (Bardeen, Brattain, Shockley, 1947)
• Intel
• ARM
• Nvidia
• ASICs
© Peter Morgan, May 2019
CPU – Intel Xeon
Up to 32 cores, ~1 TFlops
© Peter Morgan, May 2019
GPU – Nvidia Volta V100
21 billion transistors, 120 TFlops
© Peter Morgan, May 2019
DGX-2 – released Mar 2018
16 V100s, 2 PFlops, 30 TB storage ($400k)
© Peter Morgan, May 2019
ASIC – Google TPU 3.0
360 TFlops – announced at Google I/O, May 2018
© Peter Morgan, May 2019
ASIC – Graphcore IPU
>200 TFlops
© Peter Morgan, May 2019
Cloud TPUs
Over 100 PetaFlops!
© Peter Morgan, May 2019
Summit – US IBM AC922 system
• 4,608 servers, each with two 22-core IBM Power9 CPUs and six Nvidia Tesla V100 GPUs
• 200 PFlops (3 ExaFlops mixed precision!)
• The area of two tennis courts
• 250 Petabytes of storage
• 13 MW of power
• $200 million
• Announced 5 June 2018
© Peter Morgan, May 2019
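Summit's headline numbers are easy to sanity-check from the GPU count. A back-of-the-envelope calculation in Python, where the per-V100 peak figures are assumptions taken from Nvidia's published specs (~7.8 TFlops FP64, ~125 TFlops tensor-core mixed precision):

```python
# Sanity check of Summit's quoted peak performance.
servers = 4608
gpus_per_server = 6
v100_fp64_tflops = 7.8      # assumed per-GPU double-precision peak
v100_tensor_tflops = 125.0  # assumed per-GPU mixed-precision (tensor core) peak

total_gpus = servers * gpus_per_server                      # 27,648 GPUs
fp64_pflops = total_gpus * v100_fp64_tflops / 1000          # double-precision peak
mixed_eflops = total_gpus * v100_tensor_tflops / 1_000_000  # mixed-precision peak

print(total_gpus, round(fp64_pflops), round(mixed_eflops, 2))
```

The result is roughly 216 PFlops and 3.5 EFlops mixed precision, consistent with the ~200 PFlops and ~3 ExaFlops figures on the slide.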
HPC – what's next?
Currently 200 PFlops (Summit); by 2020 – beyond exascale
© Peter Morgan, May 2019
Aurora 21
Exascale compute by 2021 – Argonne National Lab, Intel + Cray
© Peter Morgan, May 2019
Biology vs Digital
© Peter Morgan, May 2019
Neuromorphic Computing
• Biologically inspired; first proposed by Carver Mead, Caltech, 1980s
• Uses analogue signals – spiking neural networks (SNNs)
• SpiNNaker (Manchester, HBP, Furber)
• BrainScaleS (Heidelberg, HBP, Schemmel)
• TrueNorth (IBM, Modha)
• Intel Loihi
• Startups (Knowm, Spaun, etc.)
• Up to 1 million cores, 1 billion "neurons" (mouse scale)
• Needs to scale 100× to reach the human brain
• Relatively low power
• Available on the (HBP) cloud today
© Peter Morgan, May 2019
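Spiking neural networks model neurons with membrane dynamics rather than static activations. A minimal leaky integrate-and-fire (LIF) neuron can sketch the idea; all constants here are illustrative, not taken from any particular chip:

```python
# Minimal leaky integrate-and-fire (LIF) neuron: the membrane potential
# leaks toward rest, integrates input current, and emits a spike on
# crossing threshold, after which it resets.
def simulate_lif(current, steps=100, dt=1.0, tau=10.0,
                 v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    v = v_rest
    spikes = []
    for t in range(steps):
        dv = (-(v - v_rest) + current) * dt / tau
        v += dv
        if v >= v_thresh:
            spikes.append(t)
            v = v_reset
    return spikes

# A constant input above threshold produces regular spiking;
# a weak input never reaches threshold.
print(len(simulate_lif(1.5)), len(simulate_lif(0.5)))
```

Information is carried in spike timing and rate, which is what makes analogue, event-driven hardware like SpiNNaker or Loihi a natural substrate.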
Neuromorphic vs ASIC
Analogue vs Digital
© Peter Morgan, May 2019
Quantum Computing
• First proposed by Richard Feynman, Caltech, 1980s
• Qubits – 0, 1 and superposition states (QM)
• Nature is fundamentally probabilistic at the atomic scale
• Qubits have to be kept cold (mK) to avoid noise/decoherence
• Building one is an engineering problem (the theory is known)
• Several approaches – superconductors, trapped ions, semiconductors, topological structures
• Several initiatives with access available – Microsoft, IBM, Google, Intel, D-Wave, Rigetti, etc.; you can log in today
• Many applications – optimization, cryptography, drug discovery, etc.
© Peter Morgan, May 2019
IBM 50-Qubit Quantum Computer
© Peter Morgan, May 2019
Quantum Logic Gates
© Peter Morgan, May 2019
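To make the gate picture concrete, here is the single-qubit Hadamard gate acting on |0⟩ in plain NumPy; measurement probabilities are the squared amplitudes (a classical simulation sketch, not real quantum hardware):

```python
import numpy as np

# Hadamard gate: takes |0> to the equal superposition (|0> + |1>)/sqrt(2).
H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)

ket0 = np.array([1, 0])     # |0>
state = H @ ket0            # superposition state
probs = np.abs(state) ** 2  # Born rule: measurement probabilities

print(probs)                # equal 50/50 outcome probabilities
```

Applying H twice recovers |0⟩, since H is its own inverse – a small example of the reversibility of quantum logic gates.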
Summary – we now have three non-biological stacks
Algorithms
Distributed Layer
OS
Hardware
(Classical, Neuromorphic, Quantum)
© Peter Morgan, May 2019
Outline
• Physical Systems
• Biological
• Non-biological
• Deep Learning
• Description
• Types
• Reinforcement Learning
• Latest Research in DL
• Towards AGI?
• Overview
• Comparisons
• AGI
• Conclusions
© Peter Morgan, May 2019
Early papers
© Peter Morgan, May 2019
Nodes and Layers
© Peter Morgan, May 2019
More Neural Networks ("Neural Network Zoo")
© Peter Morgan, May 2019
Computation in each node
© Peter Morgan, May 2019
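The per-node computation is just a weighted sum passed through a nonlinearity, y = σ(w·x + b). A sketch with illustrative numbers (the inputs and weights are made up, not from the slide):

```python
import math

def sigmoid(z):
    # Logistic activation squashes any real input into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def node_output(x, w, b):
    # One artificial neuron: weighted sum of inputs plus bias,
    # then a nonlinear activation.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Illustrative inputs and weights.
print(node_output(x=[1.0, 2.0], w=[0.5, -0.25], b=0.1))
```

A layer is just many such nodes evaluated on the same input vector; a deep network stacks layers so that each layer's outputs become the next layer's inputs.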
Universal Approximation Theorem
• A feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of Rⁿ, under mild assumptions on the activation function
• We can define a finite sum G(x) as an approximate realization of f(x)
• One of the first versions of the theorem was proved by George Cybenko in 1989 for sigmoid activation functions
• Kurt Hornik showed in 1991 that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture itself, that gives neural networks the potential to be universal approximators
• Cybenko, G., Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems, 2(4), 303–314, 1989
• Hornik, K., Approximation capabilities of multilayer feedforward networks, Neural Networks, 4(2), 251–257, 1991
© Peter Morgan, May 2019
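The slide's approximate realization of f(x) (the formula itself did not survive conversion) is Cybenko's finite sum of sigmoids:

```latex
G(x) \;=\; \sum_{j=1}^{N} \alpha_j \,\sigma\!\left(w_j^{\mathsf{T}} x + \theta_j\right),
\qquad
\left| G(x) - f(x) \right| < \varepsilon \quad \text{for all } x \text{ in the domain,}
```

where σ is a sigmoidal activation, the α's, w's and θ's are the network's output weights, input weights and biases, and the bound holds uniformly on the compact domain.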
Hyperparameters
• Activation function
• Optimizations
• Loss (cost) function
• Learning rate
• Initialization
• Batch normalization
• Automation
• Hyperparameter tuning
• AutoML
• https://research.googleblog.com/2018/03/using-machine-learning-to-discover.html
© Peter Morgan, May 2019
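Hyperparameter tuning is, at its simplest, a search loop over candidate settings. A minimal random search over the learning rate, using gradient descent on a toy quadratic as the stand-in training run (the loss function and search range are illustrative):

```python
import random

def final_loss(lr, steps=50):
    # Gradient descent on f(x) = x^2 starting from x = 5;
    # the learning rate controls how fast (or whether) it converges.
    x = 5.0
    for _ in range(steps):
        x -= lr * 2 * x        # gradient of x^2 is 2x
    return x * x

random.seed(0)
# Sample candidate learning rates log-uniformly and keep the best one.
candidates = [10 ** random.uniform(-4, 0) for _ in range(20)]
best_lr = min(candidates, key=final_loss)
print(best_lr, final_loss(best_lr))
```

AutoML systems automate exactly this loop (with far smarter search strategies) over many hyperparameters, and even over architectures.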
Deep Learning Performance
ImageNet error rate is now around 2.2%, less than half that of average humans
© Peter Morgan, May 2019
Convolutional Neural Networks
• First developed in the 1970s
• Widely used for image recognition and classification
• Inspired by biological processes, CNNs are a type of feed-forward ANN
• The individual neurons are tiled so that they respond to overlapping regions of the visual field
• Yann LeCun – Bell Labs, 1990s
© Peter Morgan, May 2019
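The core CNN operation is sliding a small filter over the input and taking a dot product at each position. A minimal "valid" 2D convolution (strictly, cross-correlation, as implemented in most DL frameworks) in NumPy, with a made-up edge-detecting kernel:

```python
import numpy as np

def conv2d(image, kernel):
    # 'Valid' cross-correlation: slide the kernel over the image and
    # take the elementwise-product sum at each position.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge detector on a tiny image with a dark-left/bright-right edge.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)  # responds to left-to-right increase
print(conv2d(image, kernel))
```

The output is large exactly where the edge sits, which is the "overlapping receptive fields" idea from the slide: the same filter weights are reused at every spatial position.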
Recurrent Neural Networks
• First developed in the 1970s
• RNNs are neural networks used to predict the next element in a sequence or time series
• This could be, for example, words in a sentence or letters in a word
• Applications include predicting or generating music, stories, news, code, financial instrument pricing, text and speech – in fact, the next element in any event stream
© Peter Morgan, May 2019
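An RNN carries a hidden state forward, mixing each new input with what it has already seen: h_t = tanh(w_x·x_t + w_h·h_{t-1} + b). A single-unit sketch with illustrative (untrained) weights:

```python
import math

def rnn_states(inputs, w_x=0.5, w_h=0.8, b=0.0):
    # One recurrent unit: the hidden state at each step depends on the
    # current input and on the previous hidden state.
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# The same input value yields different states at different positions,
# because the hidden state remembers the earlier part of the sequence.
print(rnn_states([1.0, 0.0, 1.0]))
```

This memory of context is what lets RNNs predict the next element in a stream; variants such as LSTMs and GRUs add gating to preserve it over longer sequences.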
GANs
• Generative Adversarial Networks – introduced by Ian Goodfellow et al. in 2014 (see references)
• A class of artificial intelligence algorithms used in unsupervised deep learning
• A theory of adversarial examples, resembling what we have for normal supervised learning
• Implemented as a system of two neural networks, a discriminator D and a generator G
• D and G contest with each other in a zero-sum game framework
• The generator generates candidate samples and the discriminator evaluates them
© Peter Morgan, May 2019
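The zero-sum game on this slide is Goodfellow et al.'s minimax objective, which the discriminator maximizes and the generator minimizes:

```latex
\min_{G}\,\max_{D}\; V(D, G)
= \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
+ \mathbb{E}_{z \sim p_{z}}\!\left[\log\!\left(1 - D(G(z))\right)\right]
```

Here D(x) is the discriminator's probability that x is real, and G(z) maps random noise z to a generated sample; D tries to label real data 1 and generated data 0, while G tries to fool it.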
  • 66. Stacked Generative Adversarial Networks https://arxiv.org/abs/1612.04357v1 © Peter Morgan, May 2019
• 67. NN Models AlexNet (Toronto) VGG (Oxford) ResNet (Microsoft) Inception (Google) DenseNet (Cornell) SqueezeNet (Berkeley) MobileNet (Google) NASNet (Google) And many (hundreds) more ... © Peter Morgan, May 2019
• 69. Top 20 ML/DL Frameworks • KD Nuggets, Feb 2018 • https://www.kdnuggets.com/2018/02/top-20-python-ai-machine-learning-open-source-projects.html • (Chart: deep learning vs machine learning projects; includes MXNet and CNTK) © Peter Morgan, May 2019
• 70. TensorFlow • TensorFlow is the open-sourced deep learning library from Google (Nov 2015) • It is their second-generation system for the implementation and deployment of large-scale machine learning models • Written in C++ with a Python interface, it originated from research on and deployment of machine learning projects throughout a wide range of Google products and services • Initially TF ran only on a single node (your laptop, say), but it now runs on distributed clusters • Available across all the major cloud providers (TFaaS) • Second most popular framework on GitHub • Over 100,000 stars as of May 2018 • https://www.tensorflow.org/ © Peter Morgan, May 2019
• 71. TensorFlow supports many platforms • Raspberry Pi, Android, iOS, TPU, GPU, CPU, Cloud TPU © Peter Morgan, May 2019
• 72. Growth of Deep Learning at Google • (Chart: number of directories containing model description files over time) • and many more ... © Peter Morgan, May 2019
  • 73. TensorFlow Popularity © Peter Morgan, May 2019
• 74. Other Frameworks • CNTK (Microsoft) • MXNet (Amazon) • Keras (Open source community) • PyTorch (Facebook) • Neon (Intel) • Chainer (Preferred Networks) © Peter Morgan, May 2019
  • 75. Data Sets • Text, speech, images, video, time series • Examples - MNIST and Labeled Faces in the Wild (LFW). MNIST LFW © Peter Morgan, May 2019
  • 76. Open Source • ML Frameworks – open source (e.g., TensorFlow) • Operating systems – open source (Linux) • Hardware – open source (OCP = Open Compute Project) • Data sets – open source (see previous slide) • Research – open source (see arXiv) • The fourth industrial revolution will be (is) open source © Peter Morgan, May 2019
  • 77. Reinforcement Learning • Goal driven • Reward and penalty • TD Learning • DQN • AlphaGo • Latest research • http://metalearning-symposium.ml © Peter Morgan, May 2019
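TD learning, listed above, can be sketched in a few lines: each state's value estimate is nudged toward the observed reward plus the discounted value of the next state. The three-state chain below is a hypothetical example, not a benchmark:

```python
def td0(episodes, alpha=0.1, gamma=1.0, n_states=3):
    """Tabular TD(0): V(s) += alpha * (r + gamma * V(s') - V(s)).
    Each episode is a list of (state, reward, next_state) transitions;
    next_state None marks the terminal state (value 0)."""
    V = [0.0] * n_states
    for ep in episodes:
        for s, r, s_next in ep:
            target = r + (gamma * V[s_next] if s_next is not None else 0.0)
            V[s] += alpha * (target - V[s])
    return V

# Chain 0 -> 1 -> 2 -> terminal, with reward 1 only on the final step.
ep = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
V = td0([ep] * 100)
print(V)  # the reward propagates backwards: all values approach 1
```

DQN replaces the table V with a deep network and bootstraps the same temporal-difference target.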
• 78. RL Research Directions • Deep Reinforcement Learning Symposium, NIPS 2017 • https://sites.google.com/view/deeprl-symposium-nips2017/home • Berkeley (BAIR) http://bair.berkeley.edu • Pieter Abbeel • Sergey Levine • Deepmind https://deepmind.com • IMPALA (DMLab) https://deepmind.com/blog/impala-scalable-distributed-deeprl-dmlab-30/ • OpenAI https://openai.com • Research white papers • Graphcore - Bristol ASIC company • https://www.graphcore.ai/posts/directions-of-ai-research © Peter Morgan, May 2019
  • 79. Outline • Physical Systems • Biological • Non-biological • Deep Learning • Description • Types • Reinforcement Learning • Latest Research in DL • Towards AGI • Overview • Comparisons • Building AGI • Conclusions © Peter Morgan, May 2019
  • 80. Towards AGI • What do we need? • Active Inference • Other approaches • Applications • Building AGI AGI = Artificial General Intelligence © Peter Morgan, May 2019
• 85. Comparisons - ANN vs BNN • Neural circuits in the brain develop via synaptic pruning, a process by which connections are overproduced and then eliminated over time • In contrast, computer scientists typically design networks by starting with an initially sparse topology and gradually adding connections • AI (specific) vs AGI (general) • Yann LeCun – CNNs at Bell Labs in the '80s/'90s – "mathematical, not biological" • We have gone as far as we can with "just" mathematics • Now almost every researcher is looking to biology for inspiration • Costa et al, 2018, etc. (see "Bio-plausible Deep Learning" in the reference section) ANN = Artificial Neural Networks BNN = Biological Neural Networks © Peter Morgan, May 2019
• 86. Approaches to AGI • Helmholtz (late 1800s) • Friston – Active Inference • Tishby – Information bottleneck • Bialek – Biophysics • Hutter – AIXI • Schmidhuber – Gödel Machine • Etc. © Peter Morgan, May 2019
• 87. Active Inference • Free Energy Principle • Systems act to minimize their expected free energy • Reduce uncertainty (or surprisal) • F = Complexity – Accuracy • Prediction error = expected outcome – actual outcome = surprise • Theory of Everything (ToE) • In physics we try to unify gravity and quantum mechanics and call this a ToE • But Active Inference is more encompassing than even this • It encompasses all interactions and dynamics (physical phenomena) • Over all time scales • Over all distance scales • Also see Constructor Theory • David Deutsch (Oxford) © Peter Morgan, May 2019
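The decomposition F = Complexity − Accuracy can be computed directly for a discrete two-state example: complexity is the KL divergence of the recognition density Q(s) from the prior P(s), and accuracy is the expected log likelihood of the observation. The numbers below are illustrative:

```python
import math

def free_energy(q, prior, lik_o):
    """Variational free energy F = complexity - accuracy
       = KL[Q(s) || P(s)] - E_Q[ln P(o|s)]   (discrete states)."""
    complexity = sum(qs * math.log(qs / ps) for qs, ps in zip(q, prior) if qs > 0)
    accuracy = sum(qs * math.log(l) for qs, l in zip(q, lik_o) if qs > 0)
    return complexity - accuracy

prior = [0.5, 0.5]            # P(s)
lik_o = [0.9, 0.2]            # P(o | s) for the observed o
evidence = sum(p * l for p, l in zip(prior, lik_o))
posterior = [p * l / evidence for p, l in zip(prior, lik_o)]

print(free_energy(posterior, prior, lik_o))   # equals -ln P(o): the bound is tight
print(free_energy([0.5, 0.5], prior, lik_o))  # any other Q gives a larger F
```

This illustrates why minimising F over Q performs (approximate) Bayesian inference: the minimum is attained at the exact posterior, where F equals surprise, -ln P(o).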
  • 88. What are the principles? Newtonian mechanics – three laws Special relativity – invariance of laws under a Lorentz transformation GR – Principle of Equivalence Electromagnetism – Maxwell’s equations Thermodynamics – three laws Quantum mechanics – uncertainty principle Relativistic QM – Dirac equation Dark energy/dark matter – we don’t know yet All of the above = Principle of Least Action © Peter Morgan, May 2019
  • 90. Analogy – Einstein’s General Theory of Relativity • Made some very general (and insightful) assumptions about the laws of physics in a gravitational field (non-inertial frames) • Equivalence principle • Covariance of laws of physics • Generalised coordinate system – Riemannian geometry • Spacetime is curved • Standing on the shoulders of giants • After ten years of hard work he finally wrote down his now famous field equations © Peter Morgan, May 2019
  • 91. All known physics – Field theoretic © Peter Morgan, May 2019
• 92. Active Inference - Information theoretic (uses generalised free energy) • Perceptual inference: Q(s_τ) = argmin_Q F, where the variational free energy is F = E_Q[ln Q(s_τ) − ln P(o_τ, s_τ)] = D_KL[Q(s_τ) || P(s_τ)] − E_Q[ln P(o_τ | s_τ)] (energy minus entropy) • Policy selection: π = argmin_π G(π), where the expected free energy is G(π) = Σ_τ G(π, τ), with G(π, τ) = E_Q[ln Q(s_τ | π) − ln P(o_τ, s_τ | π)] = D_KL[Q(o_τ | π) || P(o_τ)] − I_Q[o_τ; s_τ] (expected cost minus epistemic value, the mutual information) • Generalised free energy – with some care: for future time points τ > t outcomes are not yet observed, so P(o_τ | s_τ) is replaced by Q(o_τ | s_τ) © Peter Morgan, May 2019
  • 93. Active Inference Karl Friston - UCL © Peter Morgan, May 2019
• 94. Expected surprise and free energy • Discrete formulation: π = argmin_π G(π), with G(π, τ) = E_Q[ln Q(s_τ | π) − ln P(o_τ, s_τ)] = D_KL[Q(s_τ | π) || P(s_τ)] + E_Q[H[P(o_τ | s_τ)]] (expected cost plus expected ambiguity) • Dynamic formulation: active states perform a gradient descent on free energy, ȧ = f_a(b) ∝ −∂F/∂a, with sensory states ṡ = f_s(ψ, b) and external states ψ̇ = f_ψ(ψ, b) coupled through the Markov blanket b • (Figure: these quantities mapped onto brain regions – prefrontal cortex, VTA/SN, motor cortex, occipital cortex, striatum, hippocampus) © Peter Morgan, May 2019
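The discrete decomposition into expected cost plus expected ambiguity can be computed directly: a policy is penalised both for visiting dispreferred states and for visiting states whose outcomes are uninformative. The two-state, two-outcome numbers below are illustrative:

```python
import math

def expected_free_energy(q_s, p_s, lik):
    """G(pi) = risk + ambiguity
       = KL[Q(s|pi) || P(s)] + E_Q[ H[P(o|s)] ]   (discrete case).
    lik[s] is the outcome distribution P(o|s) for state s."""
    risk = sum(q * math.log(q / p) for q, p in zip(q_s, p_s) if q > 0)
    ambiguity = sum(q * -sum(l * math.log(l) for l in dist if l > 0)
                    for q, dist in zip(q_s, lik))
    return risk + ambiguity

p_s = [0.9, 0.1]                      # prior preferences over states
lik = [[0.95, 0.05], [0.5, 0.5]]      # the second state's outcomes are ambiguous
g_pref = expected_free_energy([0.9, 0.1], p_s, lik)   # matches preferences
g_ambig = expected_free_energy([0.1, 0.9], p_s, lik)  # costly and ambiguous
print(g_pref, g_ambig)
```

Policy selection favours the first policy because it has both lower risk and lower ambiguity.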
• 95. What is free-energy? • Free-energy is basically prediction error, where small errors mean low surprise • sensations – predictions = prediction error • General principle – systems act to minimize uncertainty (their expected free energy) © Peter Morgan, May 2019
• 96. The Markov blanket of cells to brains • External states: ψ̇ = f_ψ(ψ, s, a) + ω_ψ • Sensory states: ṡ = f_s(ψ, s, a) + ω_s • Active states: ȧ = f_a(s, a, λ) + ω_a • Internal states: λ̇ = f_λ(s, a, λ) + ω_λ • Internal and external states are conditionally independent given the blanket (sensory and active) states • (Figure: the same partition illustrated for a cell and for a brain) © Peter Morgan, May 2019
• 97. But what about the Markov blanket? • Perception: internal states minimise free energy, λ = argmin_λ F(s̃, λ) • Action: active states minimise the same quantity, a = argmin_a F(s̃, λ) • Free energy is an upper bound on surprise, F ≥ −ln p(s̃ | m), so minimising it connects: • Value – reinforcement learning, optimal control and expected utility theory (Pavlov) • Entropy – self-organisation, synergetics and homoeostasis (Haken) • Model evidence – the Bayesian brain, evidence accumulation and predictive coding (Helmholtz) • Surprise – infomax, minimum redundancy and the free-energy principle (Barlow) © Peter Morgan, May 2019
• 99. Summary • Biological agents resist the second law of thermodynamics • They must minimize their average surprise (entropy) • They minimize surprise by suppressing prediction error (free-energy) • Prediction error can be reduced by changing predictions (perception) • Prediction error can be reduced by changing sensations (action) • Perception entails recurrent message passing in the brain to optimise predictions • Action makes predictions come true (and minimises surprise) • (Worked examples: birdsong and categorization, simulated lesions; active inference, goal-directed reaching; control and attractors, the mountain-car problem) © Peter Morgan, May 2019
  • 100. Building AGI © Peter Morgan, May 2019
  • 101. Can we build general intelligence? • We have the theory – active inference • We have the algorithms/software • We have the hardware (ASIC, neuromorphic) • We have the data sets (Internet plus open data sets) • Need to build out libraries • A TensorFlow for general intelligence • Open source? (Open/closed) • Apollo Project of our time – “Fourth Revolution” • Human Brain Project • Deepmind • BRAIN project • Should we build AGI/ASI? – safety, ethics, singularity? © Peter Morgan, May 2019
• 103. Other AGI Projects • OpenCog – Ben Goertzel (US) • Numenta – Jeff Hawkins (US) • Vicarious – Dileep George (US) • NNAISENSE – Jürgen Schmidhuber (Switzerland) • AGI Innovations – Peter Voss (US) • GoodAI – Marek Rosa (Czech Republic) • Curious AI – (Finland) • Eurisko – Doug Lenat (US) • SOAR – CMU • ACT-R – CMU • Sigma – Paul Rosenbloom – USC • Plus many more
  • 104. Implementations & Applications • BNN Simulation Frameworks – SPM, PyNN, NEST, NEURON, Brian • Various open source frameworks on GitHub • Hearing aids - GN Group (DK) • Order of Magnitude - Christian Kaiser (SV) • Turing.AI – Our company (London) © Peter Morgan, May 2019
• 105. Turing.AI - Two main, very broad, product areas – vision and language (stealth mode) • 1. Active Eyes – Adaptive, learning cameras for 'walk through' airports and 'counterless' stores • 2. True NLP – An Active Inference based NLP for home, retail & commercial applications • https://turing-ai.co © Peter Morgan, May 2019
  • 106. Conclusions • Deep Learning (ANN) is lacking many of the characteristics and attributes needed for a general theory of intelligence • Active inference is such a theory (A ToE* which includes AGI) • ANN research groups are now (finally) turning to biology for inspiration • Bioplausible models are starting to appear • Some groups are starting to look at active inference • AGI in five years? Ten years? • Still have to wait for hardware to mature • Neuromorphic might be the platform that gets us there * ToE = Theory of Everything © Peter Morgan, May 2019
  • 108. Neuroscience - Books • Saxe, G. et al, Brain entropy and human intelligence: A resting-state fMRI study, PLOS One, Feb 12, 2018 • Sterling, P. and Laughlin, S., Principles of Neural Design, MIT Press, 2017 • Slotnick, S., Cognitive Neuroscience of Memory, Cambridge Univ Press, 2017 • Engel, Friston & Kragic, Eds, The Pragmatic Turn - Toward Action-Oriented Views in Cognitive Science, MIT Press, 2016 • Marcus G., & J. Freeman, Eds, The Future of the Brain, Princeton, Univ Press, 2015 • Gerstner, W. et al, Neuronal Dynamics, Cambridge Univ Press, 2014 • Kandel, E., Principles of Neural Science, 5th ed, McGraw-Hill, 2012 • Rabinovich, Friston and Varona, Eds, Principles of Brain Dynamics, MIT Press, 2012 • Jones, E. G., Thalamus, Cambridge Univ. Press, 2007 • Dayan, P. and L. Abbott, Theoretical Neuroscience, MIT Press, 2005 © Peter Morgan, May 2019
• 109. Neuroscience - Papers • Crick, F., The recent excitement about neural networks, Nature 337, 129–132, 1989 • Rao RP and DH Ballard, Predictive coding in the visual cortex, Nature Neuroscience 2:79–87, 1999 • Izhikevich, E. M., Solving the distal reward problem through linkage of STDP and dopamine signalling, Cereb. Cortex 17, 2443–2452, 2007 • How the brain constructs the world, 2018 https://medicalxpress.com/news/2018-02-brain-world.html • Lamme, V. A. F. and Roelfsema, P. R., The distinct modes of vision offered by feedforward and recurrent processing, Trends Neurosci. 23, 571–579, 2000 • Sherman, S. M., Thalamus plays a central role in ongoing cortical functioning, Nat. Neurosci. 19, 533–541, 2016 • Harris, K. D. and Shepherd, G. M. G., The neocortical circuit: themes and variations, Nat. Neurosci. 18, 170–181, 2015 • van Kerkoerle, T. et al, Effects of attention and working memory in the different layers of monkey primary visual cortex, Nat. Commun. 8, 13804, 2017 • Roelfsema, P.R. and A. Holtmaat, Control of synaptic plasticity in deep cortical networks, Nature Reviews Neuroscience, 19, pages 166–180, 2018 © Peter Morgan, May 2019
  • 110. Hardware • Wang, Z. et al, Fully memristive neural networks for pattern classification with unsupervised learning, Nature Electronics, 8 Feb, 2018 • Microsoft Research, The Future is Quantum, Jan 17, 2018 https://www.microsoft.com/en-us/research/blog/future-is-quantum-with-dr-krysta- svore/?OCID=MSR_podcast_ksvore_fb • Suri, M. Advances in Neuromorphic Hardware, Springer, 2017 • Nanalyze, 12 AI Hardware Startups Building New AI Chips, May 2017 https://www.nanalyze.com/2017/05/12-ai-hardware-startups-new-ai-chips/ • Lacey, G. et al, Deep Learning on FPGAs: Past, Present, and Future, Feb 2016 https://arxiv.org/abs/1602.04283 • Human Brain Project, Silicon Brains https://www.humanbrainproject.eu/en/silicon- brains/ • Artificial Brains http://www.artificialbrains.com © Peter Morgan, May 2019
• 111. Classical Deep Learning • Schmidhuber, Jurgen, Deep learning in neural networks: An overview, Neural Networks, 61:85–117, 2015 • Goodfellow, I., Bengio, Y. and Courville, A., Deep Learning, MIT Press, 2016 • LeCun, Y., Bengio, Y., and Hinton, G., Deep Learning, Nature, v.521, p.436–444, May 2015 http://www.nature.com/nature/journal/v521/n7553/abs/nature14539.html • Britz, D. et al, Massive Exploration of Neural Machine Translation Architectures, Mar 2017 https://arxiv.org/abs/1703.03906 • Liu H. et al, Hierarchical representations for efficient architecture search, 2017 https://arxiv.org/abs/1711.00436 • NIPS 2017 Proceedings https://papers.nips.cc/book/advances-in-neural-information-processing-systems-30-2017 • Deepmind papers https://deepmind.com/blog/deepmind-papers-nips-2017/ • Jeff Dean, Building Intelligent Systems with Large Scale Deep Learning, TensorFlow slides, Google Brain, 2017 • Rawat, W. and Z. Wang, Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review, Neural Computation, 29(9), Sept 2017 © Peter Morgan, May 2019
  • 112. New Ideas in Deep Learning • Pham H. et al, Efficient Neural Architecture Search via Parameter Sharing, Feb 2018, https://arxiv.org/abs/1802.03268 • Pearl, Judea, Theoretical Impediments to Machine Learning With Seven Sparks from the Causal Revolution, Jan 2018, https://arxiv.org/abs/1801.04016 • Marcus, Gary, Deep Learning: A Critical Appraisal, Jan 2018, https://arxiv.org/abs/1801.00631 • Chaudhari, P. and S. Soatto, Stochastic gradient descent performs variational inference, Jan 2018, https://arxiv.org/abs/1710.11029 • Vidal, R. et al, The Mathematics of Deep Learning, Dec 2017, https://arxiv.org/abs/1712.04741 • Sabour, S. et al, Dynamic Routing Between Capsules, Nov 2017, https://arxiv.org/abs/1710.09829 • Jaderberg, M. et al, Population Based Training of Neural Networks, 28 Nov, 2017, https://arxiv.org/abs/1711.09846 • Chaudhari, P. and S. Soatto, On the energy landscape of deep networks, Apr 2017, https://arxiv.org/abs/1511.06485 • Scellier, B. and Y. Bengio, Equilibrium propagation: bridging the gap between energy-based models and backpropagation, Front. Comput. Neurosci. 11, 24, 2017 © Peter Morgan, May 2019
  • 113. Bio-plausible Deep Learning • Bengio, Y. et al, Towards Biologically Plausible Deep Learning, Aug 2016 https://arxiv.org/abs/1502.04156 • Marblestone, A.H. et al, Toward an Integration of Deep Learning and Neuroscience, Front Comput Neurosci., 14 Sept, 2016 • Costa, R.P. et al, Cortical microcircuits as gated-recurrent neural networks, Jan 2018 https://arxiv.org/abs/1711.02448 • Lillicrap T.P. et al, Random synaptic feedback weights support error backpropagation for deep learning, Nature Communications 7:13276, 2016 • Hassabis, D. et al, Neuroscience-Inspired Artificial Intelligence, Neuron, 95(2), July 2017 • Sacramento, J. et al, Dendritic error backpropagation in deep cortical microcircuits, Dec 2017 https://arxiv.org/abs/1801.00062 • Guerguiev, J. et al, Towards deep learning with segregated dendrites, eLife Neuroscience, 5 Dec, 2017 © Peter Morgan, May 2019
  • 114. Cognitive Science • Dissecting artificial intelligence to better understand the human brain, Cognitive Neuroscience Society, March 25, 2018 https://medicalxpress.com/news/2018-03-artificial-intelligence-human- brain.html • Barbey, A., Network Neuroscience Theory of Human Intelligence, Trends in Cognitive Sciences, 22(1), Jan 2018 • Navlakha, B. et al, Network Design and the Brain, Trends in Cognitive Sciences, 22 (1), Jan 2018 • Lake, B. et al, Building Machines That Learn and Think Like People, Nov 2016 https://arxiv.org/abs/1604.00289 • Lake, B., et al, Human-level concept learning through probabilistic program induction, Science, 350(6266) Dec 2015 • Tenenbaum, J.B. et al, How to Grow a Mind: Statistics, Structure, and Abstraction, Science, 331(1279) March 2011 • Trends in Cognitive Sciences, Special Issue: The Genetics of Cognition 15 (9), Sept 2011 • William Bialek publications, Princeton https://www.princeton.edu/~wbialek/categories.html © Peter Morgan, May 2019
  • 115. Active Inference • Friston, K., The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 2010 • Friston, K., Life as we know it, Journal of the Royal Society Interface, 3 July, 2013 • Friston, K. et al, Active Inference: A Process Theory, Neural Computation, 29(1), Jan 2017 • Friston, K., Consciousness is not a thing, but a process of inference, Aeon, 18 May, 2017 • Kirchoff, M. et al, The Markov blankets of life, Journal of the Royal Society Interface, 17 Jan, 2018 • Frassle, S. et al, A generative model of whole-brain effective connectivity, Neuroimage, 25 May, 2018 • Friston, K. et al, Deep temporal models and active inference, Neuroscience & Biobehavioral Reviews, May 2018 https://www.researchgate.net/publication/325017738_Deep_temporal_models_and_ac tive_inference © Peter Morgan, May 2019
  • 116. AGI • Schmidhuber, J., Goedel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements, Dec 2006, https://arxiv.org/abs/cs/0309048 • Wolpert, D., Physical limits of inference, Oct 2008, https://arxiv.org/abs/0708.1362 • Veness, J. et al, A Monte Carlo AIXI Approximation, Dec 2010, https://arxiv.org/abs/0909.0801 • Sunehag, P. and M. Hutter, Principles of Solomonoff Induction and AIXI, Nov 2011, https://arxiv.org/abs/1111.6117 • Hutter, M., One Decade of Universal Artificial Intelligence, Feb 2012, https://arxiv.org/abs/1202.6153 • Silver, D. et al, Mastering the game of Go without human knowledge, Nature, Vol 550, 19 Oct, 2017 • Goertzel, B., Toward a Formal Model of Cognitive Synergy, Mar 2017, https://arxiv.org/abs/1703.04361 • Hauser, Hermann, Are Machines Better than Humans? Evening lecture on machine intelligence at SCI, London, 25 October 2017 https://www.youtube.com/watch?v=SVOMyEeXUow © Peter Morgan, May 2019
• 117. Information Theory • Shwartz-Ziv, R. and N. Tishby, Opening the Black Box of Deep Neural Networks via Information, Apr 29, 2017, https://arxiv.org/abs/1703.00810 • Chaitin, G.J., From Philosophy to Program Size, Mar 2013, https://arxiv.org/abs/math/0303352 • Solomonoff, R.J., Machine Learning — Past and Future, Revision of lecture given at AI@50, The Dartmouth Artificial Intelligence Conference, July 13-15, 2006 • Publications of A. N. Kolmogorov, Annals of Probability, 17(3), July 1989 • Levin, L. A., Universal Sequential Search Problems, Problems of Information Transmission, 9(3), 1973 • Shannon, C.E., A Mathematical Theory of Communication, Bell System Technical Journal, 27 (3):379–423, July 1948 • AIT https://en.m.wikipedia.org/wiki/Algorithmic_information_theory © Peter Morgan, May 2019
• 118. Classic Papers • Marletto, Chiara, Constructor Theory of Life, Journal of the Royal Society Interface, 12(104), 2015 • Crick F., The recent excitement about neural networks, Nature 337:129–132, 1989 • Rumelhart DE, Hinton GE, Williams RJ, Learning representations by back-propagating errors, Nature 323:533–536, 1986 • Solomonoff, R.J., A Formal Theory of Inductive Inference, Part 1, Information and Control, 7(1), Mar, 1964, http://world.std.com/~rjs/1964pt1.ps • F. Rosenblatt, A probabilistic model for information storage and organization in the brain, Psych. Rev. 62, 386-407, 1958 • Turing, A.M., Computing Machinery and Intelligence, Mind 49:433-460, 1950 • Schrodinger, E., What is Life? Based on lectures delivered at Trinity College, Dublin, Feb 1943 http://www.whatislife.ie/downloads/What-is-Life.pdf • McCulloch, W.S. and W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bulletin of Mathematical Biophysics, 5(4):115–133, 1943 • Kolmogorov, A., On Analytical Methods in the Theory of Probability, Mathematische Annalen, 104(1), 1931 © Peter Morgan, May 2019
• 119. Books • Sutton, R. S. & A.G. Barto, Reinforcement Learning, 2nd ed., MIT Press, 2018 • Goodfellow, I. et al, Deep Learning, MIT Press, 2016 • Li, Ming and Paul Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, N.Y., 2008 • Hutter M., Universal Artificial Intelligence, Springer–Verlag, 2004 • MacKay, David, Information theory, inference and learning algorithms, Cambridge University Press, 2003 • Wolfram, S., A New Kind of Science, Wolfram Media, 2002 • Hebb, D. O. The Organization of Behavior, A Neuropsychological Theory, John Wiley & Sons, 1949 © Peter Morgan, May 2019