The History and
Near Future of
Deep Learning
David Kammeyer
Kammeyer Development
kammeyer@kammeyer.org
Big Data Beers 15.9.2015
What’s the Big Deal?
Solving Problems that are
Easy for Humans, Hard
for Computers
• Visual Recognition, including OCR
• Speech Recognition
• Natural Language Processing (Translation,
Sentiment Analysis)
Where did this all
come from?
1957: The Perceptron
Frank Rosenblatt @ Cornell, MIT, ONR
How the Perceptron Works
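A minimal sketch of the perceptron's decision rule and weight update, in NumPy (the toy AND data below is an illustrative assumption, not from the talk):

import numpy as np

# Toy linearly separable data: logical AND, labels in {-1, +1}.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1, -1, -1, 1])

w = np.zeros(2)   # weight vector
b = 0.0           # bias
lr = 1.0          # learning rate

for epoch in range(10):
    for xi, yi in zip(X, y):
        # Predict with a hard threshold on the weighted sum.
        pred = 1 if np.dot(w, xi) + b > 0 else -1
        # Update weights only on misclassified examples.
        if pred != yi:
            w += lr * yi * xi
            b += lr * yi

print(w, b)   # a separating hyperplane for AND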
Limitations and
Winter #1
A single-layer perceptron cannot learn
the XOR function, or any other function
that is not linearly separable.
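A quick argument for the XOR case (standard notation, not from the slides): a single threshold unit $\hat y = \mathbb{1}[w_1 x_1 + w_2 x_2 + b > 0]$ would have to satisfy

$$b \le 0, \qquad w_1 + b > 0, \qquad w_2 + b > 0, \qquad w_1 + w_2 + b \le 0.$$

Adding the two middle inequalities gives $w_1 + w_2 + 2b > 0$, while adding the outer two gives $w_1 + w_2 + 2b \le 0$, a contradiction, so no such weights exist.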
Multilayer Perceptrons
1989: Cybenko’s Universal Approximation theorem for
Single Hidden Layer Perceptrons
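Informally, Cybenko's result says that sums of the form

$$G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(w_j^{\top} x + b_j\right)$$

with a continuous sigmoidal $\sigma$ are dense in $C([0,1]^n)$: for any continuous $f$ and any $\varepsilon > 0$ there is a finite $N$ such that $|G(x) - f(x)| < \varepsilon$ for all $x$ in the unit cube.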
Backpropagation
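A minimal NumPy sketch of backpropagation in a one-hidden-layer network with sigmoid units and squared-error loss (the hyperparameters and the XOR training set are illustrative assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny network: 2 inputs -> 3 hidden units -> 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
lr = 0.5

# Example task: XOR, which becomes learnable with a hidden layer.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

for step in range(5000):
    for x, y in zip(X, Y):
        # Forward pass.
        h = sigmoid(W1 @ x + b1)           # hidden activations
        o = sigmoid(W2 @ h + b2)           # output
        # Backward pass: chain rule, layer by layer.
        delta_o = (o - y) * o * (1 - o)    # gradient at the output pre-activation
        delta_h = (W2.T @ delta_o) * h * (1 - h)
        # Gradient descent update.
        W2 -= lr * np.outer(delta_o, h)
        b2 -= lr * delta_o
        W1 -= lr * np.outer(delta_h, x)
        b1 -= lr * delta_h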
Training Methods and
Winter #2
• Just because a function can be represented
by a single-hidden-layer net doesn't mean
it can be learned (more layers may be
needed for learning to succeed)
• SVMs provided better learning
guarantees
The Renaissance
Convolutional Neural Networks
LeCun, 1993
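The core operation, sketched in plain NumPy (single channel, no padding; deep learning libraries implement this as cross-correlation, and the edge-detector kernel below is just an illustrative assumption):

import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution of a single-channel image with a small kernel."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output pixel is a dot product of the kernel with a local patch,
            # so the same weights are reused across the whole image.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Example: a vertical-edge detector applied to a toy 5x5 image.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])
print(conv2d(image, kernel))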
ImageNet 2012
A. Krizhevsky’s AlexNet wins ImageNet Competition
Image Captioning
Karpathy 2015
What Changed?
GPUs
• 40x speedup relative to
CPUs allows the training
of much larger models
than before
Very Deep Models
• Allows for Hierarchical Representation of Knowledge
Big Data
Newer Techniques
RNN, LSTM, Deep Q-Learning, New Activation
Functions, Max Pooling
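Two of these ingredients sketched in NumPy: the ReLU activation and non-overlapping 2x2 max pooling (the toy feature map is an assumption for illustration):

import numpy as np

def relu(x):
    # Rectified linear unit: cheap, and its gradient does not saturate for x > 0.
    return np.maximum(0.0, x)

def max_pool_2x2(feature_map):
    """Downsample a 2-D feature map by taking the max over non-overlapping 2x2 blocks."""
    h, w = feature_map.shape
    fm = feature_map[:h - h % 2, :w - w % 2]   # drop odd borders
    blocks = fm.reshape(fm.shape[0] // 2, 2, fm.shape[1] // 2, 2)
    return blocks.max(axis=(1, 3))

x = np.array([[1., -2., 3., 0.],
              [4., 5., -6., 7.],
              [0., 1., 2., 3.],
              [-1., 0., 1., 2.]])
print(max_pool_2x2(relu(x)))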
What’s Next?
Faster Processing
• Faster GPUs
• FPGAs
• ASICs
More Recurrence,
Bidirectional Hierarchies
• LSTM and RNN models now define the state of
the art (see the recurrence sketch below)
• Next step is Deep Recurrent models to capture
conceptual hierarchies
• Will Require new learning algorithms
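For reference, the recurrence these models share, as a single vanilla-RNN step in NumPy (an LSTM adds gating on top of this; all sizes below are arbitrary assumptions):

import numpy as np

def rnn_step(x_t, h_prev, Wxh, Whh, b):
    """One step of a vanilla recurrent cell: new state mixes input and previous state."""
    return np.tanh(Wxh @ x_t + Whh @ h_prev + b)

# Toy example: unroll over a short sequence.
rng = np.random.default_rng(2)
Wxh = rng.normal(size=(5, 3)) * 0.1   # input -> hidden
Whh = rng.normal(size=(5, 5)) * 0.1   # hidden -> hidden (the recurrence)
b = np.zeros(5)

h = np.zeros(5)
sequence = rng.normal(size=(7, 3))    # 7 time steps, 3 features each
for x_t in sequence:
    h = rnn_step(x_t, h, Wxh, Whh, b) # hidden state carries context forward
print(h)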
Hierarchical Representations in the Brain
Attentional Models
Allow the network to sequentially focus attention on a
particular part of the input
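A minimal sketch of soft attention, using a bilinear scoring function as one simple choice (the function names and shapes are assumptions, not from the talk): scores become softmax weights, and the weighted sum is the part of the input the network "focuses" on.

import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max())   # subtract max for numerical stability
    return e / e.sum()

def attend(inputs, query, W):
    """Soft attention: weight each input position by its relevance to the query."""
    scores = inputs @ (W @ query)        # one score per input position
    weights = softmax(scores)            # attention weights sum to 1
    context = weights @ inputs           # weighted sum focuses on relevant positions
    return context, weights

# Toy example: 4 input positions with 3-dimensional features.
rng = np.random.default_rng(1)
inputs = rng.normal(size=(4, 3))
query = rng.normal(size=3)
W = np.eye(3)                            # hypothetical scoring matrix
context, weights = attend(inputs, query, W)
print(weights, context)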
Simulated (or Real) Worlds
• Lots of Data Needed to Train Large Models
• We’re going to have to Generate it, or Capture it from the Real World
More Researchers
Questions?
Dave Kammeyer
Kammeyer Development
kammeyer@kammeyer.org
Thanks!
