Deep Learning
for Data Scientists

Andrew B. Gardner
agardner@momentics.com
http://linkd.in/1byADxC

www.momentics.com/deep-learning
Deep Learning in the Press…

Ng, Hinton, LeCun, Zuckerberg, Kurzweil

Google Hires Brains that Helped Supercharge Machine Learning. Wired 3/2013.
Facebook Taps ‘Deep Learning’ Giant for New AI Lab. Wired 12/2013.
Is “Deep Learning” a Revolution in Artificial Intelligence? New Yorker 11/2012.
The Man Behind the Google Brain: Andrew Ng and the Quest for the New AI. Wired 5/2013.
New Techniques from Google and Ray Kurzweil Are Taking Artificial Intelligence to Another Level. MIT Technology Review 5/2013.
… Publication & Search Trends …
[Figure: Google Scholar Citations for “deep learning” + “neural network”, ‘06 to ‘11]
[Figure: Google Trends for big data, data science, deep learning, machine learning, ‘06 to ‘11]

domains: computer vision, speech & audio, bioinformatics, etc.

Conferences: NIPS, ICLR, ICML, …
… Industry & Products
• Google
– Android Voice Recognition
– Maps
– Image+
• SIRI
• Translation
• Documents
• …

Microsoft Real-time English-Chinese Translation

https://www.youtube.com/watch?v=Nu-nlQqFCKg

Microsoft Chief Research Officer Rick
Rashid, 11/2012
Deep Learning Epicenters (North America)

de Freitas (UBC)
Microsoft

Bengio (U Montreal)
Hinton (U Toronto)

Facebook
Ng (Stanford)
Google
Yahoo

LeCun (NYU)
Deep Learning: The Origin Story
Before: A Cat Detector
We want to build this….

classifier
f : X -> Y

Y ~ the labels {“cat”, “dog”}

X ~ the images

… for less than $1.0M !
Challenge: Labeled Data
Labels are expensive -> Less data
Intuitively: more data is good
[Figure: a few images labeled “cat” or “dog”; the remaining images are unused, unlabeled]
Challenge: Features
Features are expensive -> Fewer, shallow
Intuitively: better features are good
image (pixels) -> magic feature dictionary -> x = (1.3, 2.8, …)
Magic feature dictionary: SIFT, HoG, B/W binary histogram, Moments, Shape Histogram, Fang detector, Something new
[Figure: scatter plot of feature vectors]
Machine Learning (Before)
Building a Cat Detector 1.0

expensive
important*

Features

Detector
(Classifier)

How Good is “More Data?”

Labels are expensive -> Less data
• More data dominates* better techniques
• Often have lots of data
• … we just don’t have lots of labels
• What if there was a way to use unlabeled data?

[Figure 1: Learning curves for confusion set disambiguation, e.g. {to, two, too}. Test accuracy vs. millions of words (0.1 to 1000) for Memory-Based, Winnow, Perceptron and Naïve Bayes learners, trained on a 1-billion-word corpus drawn from a variety of English texts. The memory-based learner used only the word before and word after as features.]

“Scaling to Very Very Large Corpora for Natural Language Disambiguation,” Banko and Brill, 2001.
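
A minimal sketch of the idea behind the memory-based learner mentioned on this slide, which used only the word before and the word after as features. This is an illustrative stand-in, not Banko and Brill's implementation; the function names, sentence markers, and toy corpus are all assumptions.

```python
from collections import Counter, defaultdict

CONFUSION_SET = {"to", "two", "too"}

def train(sentences):
    """Memorize how often each confusable word appears between (previous, next)."""
    table = defaultdict(Counter)
    prior = Counter()
    for sentence in sentences:
        # Hypothetical start/end markers so every word has two neighbours.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(words) - 1):
            if words[i] in CONFUSION_SET:
                table[(words[i - 1], words[i + 1])][words[i]] += 1
                prior[words[i]] += 1
    return table, prior

def predict(table, prior, before, after):
    """Pick the word seen most often in this context; fall back to the overall majority."""
    counts = table.get((before, after)) or prior
    return counts.most_common(1)[0][0]

# Toy corpus (made up for illustration).
corpus = ["I want to go home", "she has two cats",
          "that is too much", "we have two dogs"]
table, prior = train(corpus)
seen = predict(table, prior, "want", "go")
unseen = predict(table, prior, "bought", "apples")
```

With more text, more (before, after) contexts are memorized, so fewer predictions fall back to the majority word. That is one way to read the slide's point that more data can matter more than the choice of technique.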
The Impact of Features
Intuitively: better features are good

• Critical to success – even more than data!
• How to create / engineer features?
– Typically shallow

• Domain-specific
• What if there was a way to automatically
learn features?
Machine Learning (What We Want)
Building a Cat Detector 2.0

bountiful
important*

Features + Detector
(Classifier)

end-to-end
Building an Object Recognition System
car image -> FEATURE EXTRACTOR -> CLASSIFIER -> “CAR”

IDEA: Use data to optimize features for the given task.
(Ranzato)

Deep Nets Intuition
car -> intermediate representations -> label (“CAR”)

Lee et al., “Convolutional DBN's for scalable unsup. learning...” ICML 2009
Another Example of Hierarchy
Hierarchical Learning
• Natural progression from low to high level structure, as in natural complexity
• … to monitor what is being … and to guide the machine to better subspaces
• … good lower level representation can be used for … distinct tasks
[Figure: learned hierarchy: edges -> parts -> faces]
Hierarchy Reusability?
[Figure: learned features for faces, cars, elephants, chairs]
A Breakthrough
G. E. Hinton, S. Osindero, and Y. Teh, “A fast learning
algorithm for deep belief nets,” Neural
Computation, vol. 18, pp. 1527–1554, 2006.
G. E. Hinton and R. R. Salakhutdinov, “Reducing the
dimensionality of data with neural networks,”
Science, vol. 313, no. 5786, pp. 504-507, July 2006.

before

after
Deep Belief Nets
MNIST: 60K + 10K Images

Technique       Test Error (%)
DBN pretrain    1.25
SVM             1.4
kNN             2.8-4.4
ConvNet         0.4 -> 0.23

unsupervised pretraining, then supervised tuning
MNIST Sample Errors

Ciresan et al. “Deep Big Simple Neural Networks Excel on
Handwritten Digit Recognition,” 2010
Key Ideas
• Learn features from data
– Use all data

• Deep architecture
– Representation
– Computational efficiency
– Shared statistics

• Practical training
• State-of-the-art (it worked)
After: Cat Detector
unlabeled images (millions)

labeled images (few)

deep learning
network

more data

automatic (deep) features
How Does It Work?
This Is A Neuron
1. Sum all inputs (weighted)

x = w0 + w1 z1 + w2 z2 + w3 z3

2. Nonlinearly transform

y = f(x)

inputs: z1, z2, z3 (plus a constant 1 for the bias)
weights: w0 (bias), w1, w2, w3
output: y
activation function f(x): sigmoid, tanh
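
The two steps above can be sketched in a few lines of Python. The input and weight values are made up for illustration; only the formulas x = w0 + w1 z1 + w2 z2 + w3 z3 and y = f(x) come from the slide.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes x into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Step 1: weighted sum x = w0 + sum(w_i * z_i). Step 2: y = f(x)."""
    x = bias + sum(w * z for w, z in zip(weights, inputs))
    return activation(x)

# Illustrative values only (not from the slides).
z = [1.0, 2.0, 3.0]        # inputs z1, z2, z3
w = [0.5, -0.25, 0.0]      # weights w1, w2, w3
w0 = 0.0                   # bias weight

y_sigmoid = neuron(z, w, w0)              # weighted sum is 0, so sigmoid gives 0.5
y_tanh = neuron(z, w, w0, math.tanh)      # tanh of 0 is 0.0
```

Swapping the `activation` argument is all it takes to move between the sigmoid and tanh curves shown on the slide.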
A Neural Network
forward propagation: weighted sum inputs, produce activation, feed forward

Output: cat, dog
Hidden
Inputs (the features): weight = 13.5, n_teeth = 21, n_whiskers = 16
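
A rough sketch of forward propagation for a network shaped like the one on this slide: three input features, one hidden layer, and two outputs (cat, dog). The layer sizes, the sigmoid activation, and every weight value below are assumptions chosen for illustration; the weights are untrained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """One layer: each row of `weights` feeds one neuron (weighted sum, then activation)."""
    return [sigmoid(b + sum(w * z for w, z in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Feed activations forward through a list of (weights, biases) layers."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Inputs from the slide: weight, n_teeth, n_whiskers.
features = [13.5, 21.0, 16.0]

# Hypothetical, untrained weights: 3 inputs -> 2 hidden -> 2 outputs (cat, dog).
hidden = ([[0.01, -0.02, 0.03], [-0.01, 0.02, -0.03]], [0.0, 0.0])
output = ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0])

cat_score, dog_score = forward(features, [hidden, output])
```

Each score lands between 0 and 1. With these made-up weights the numbers mean nothing yet; training is what makes them track the labels.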
Training
Back propagation of error.

Output: cat = 1, dog = 0
total error at top
proportional contributions going backwards
Inputs: weight = 13.5, n_teeth = 21, n_whiskers = 16
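
One possible sketch of a backpropagation step for a tiny two-layer sigmoid network. It assumes a squared-error loss, plain gradient descent, and no bias terms, none of which are specified on the slide. The starting weights and the input rescaling are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, W2):
    """Return hidden and output activations (biases omitted for brevity)."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in W2]
    return h, y

def train_step(x, target, W1, W2, lr=0.5):
    """One gradient-descent step on the squared error 0.5 * sum((y - t)^2)."""
    h, y = forward(x, W1, W2)
    # Total error at the top, expressed per output neuron.
    delta_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
    # Each hidden neuron receives error in proportion to its outgoing weights.
    delta_hid = [hj * (1.0 - hj) * sum(delta_out[k] * W2[k][j] for k in range(len(W2)))
                 for j, hj in enumerate(h)]
    # Update weights in place, moving against the gradient.
    for k, row in enumerate(W2):
        for j in range(len(row)):
            row[j] -= lr * delta_out[k] * h[j]
    for j, row in enumerate(W1):
        for i in range(len(row)):
            row[i] -= lr * delta_hid[j] * x[i]
    # Loss measured before this update.
    return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, target))

# Slide inputs (weight, n_teeth, n_whiskers), divided by 100 as an assumed rescaling.
x = [0.135, 0.21, 0.16]
target = [1.0, 0.0]                       # cat = 1, dog = 0
W1 = [[0.1, -0.2, 0.3], [-0.3, 0.2, 0.1]] # hypothetical starting weights
W2 = [[0.2, -0.1], [-0.2, 0.1]]

losses = [train_step(x, target, W1, W2) for _ in range(200)]
```

Repeating the step on this single example drives the loss down, which is the behavior the slide describes: error measured at the output is shared out backwards and each weight is nudged accordingly.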
After Training
network -> layer weights -> weights as a matrix

[.5, -.2, 4, .15, -1, …]
[-.5, -.3, .4, 0, …]

we can view weight matrix as image

… plus performance evaluation & logging
Building Blocks
So many choices!
• Network Topology
– Number of layers
– Nodes per layer

• Layer Type
– Feedforward
– Restricted Boltzmann
– Autoencoder
– Recurrent
– Convolutional

• Neuron Type
– Rectified Linear Unit

• Regularization
– Dropout

• Magic Numbers
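
Two of the building blocks listed above, the Rectified Linear Unit and dropout, are small enough to sketch directly. The dropout shown is the "inverted" variant, which rescales surviving units at training time; that choice, the function names, and the example values are assumptions for illustration.

```python
import random

def relu(x):
    """Rectified Linear Unit: pass positive values through, zero out negatives."""
    return max(0.0, x)

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop during training and
    scale the survivors by 1 / (1 - p_drop) so the expected activation is unchanged."""
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

# Illustrative pre-activations (made up).
hidden = [relu(v) for v in [-1.5, 0.0, 2.0, 0.5]]
dropped = dropout(hidden, p_drop=0.5, rng=random.Random(0))
at_test_time = dropout(hidden, p_drop=0.5, rng=random.Random(0), training=False)
```

The drop probability is one of the "magic numbers" the slide mentions: it is set by hand and tuned, not learned.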
A Deep Learning Recipe, 1.0
• Lots of data, some+ labels
• Train each RBM layer greedily, successively
• Add an output layer and train with labels
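
A compact sketch of the greedy, layer-by-layer part of this recipe, assuming binary RBMs trained with one step of contrastive divergence (CD-1). The class, function names, learning rate, layer sizes, and toy data are all hypothetical, and using probabilities in place of samples for the reconstruction is a common simplification.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class RBM:
    """Tiny binary RBM trained with one step of contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr=0.1):
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)   # reconstruction of the input
        h1 = self.hidden_probs(v1)
        # Positive phase (data) minus negative phase (reconstruction).
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])

def pretrain_stack(data, layer_sizes, epochs=20, seed=0):
    """Train each RBM greedily on the output of the layer below it."""
    rng = random.Random(seed)
    stack, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v)
        stack.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]  # features for the next layer
    return stack, inputs

# Unlabeled toy data (made up): two repeating binary patterns.
unlabeled = [[1, 1, 0, 0], [0, 0, 1, 1]] * 5
stack, features = pretrain_stack(unlabeled, layer_sizes=[3, 2])
```

No labels are used anywhere in this sketch. The last recipe step would place a supervised output layer on top of `features` and train it with whatever labels are available.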
A Few Other Important Things
• Deep Learning Recipe 2.0
– Dropout / regularization
– Rectified Linear Units

• Convolutional networks
• Hyperparameters
• Not just neural networks
• Practical Issues (GPU)
Some Applications
Sample Classification Results

ImageNet validation classification

Krizhevsky et al., NIPS 2012.
Segmentation
neuronal membranes

Ciresan et al. “DNN segment neuronal membranes...” NIPS 2012
Caltech 256
[Figure: results vs. number of training examples (x-axis 0 to 60, y-axis 25 to 75)]

Zeiler & Fergus, “Visualizing and Understanding Convolutional Networks,” arXiv 1311.2901, 2013
Application: Speech
frequencies
in window

“He can for example present significant university wide
issues to the senate.”

small time window
slide 15ms

phoneme

Spectrogram: window in time -> vector of frequencies; slide; repeat
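
The windowing procedure can be sketched with a naive discrete Fourier transform. The sample rate, the 25 ms window length, and the test tone are assumptions; only the 15 ms slide comes from the deck. A practical version would use an FFT library and a tapering window function, both omitted here.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the discrete Fourier transform at the non-negative frequencies."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, window_size, hop):
    """Window in time -> vector of frequencies; slide by `hop` samples; repeat."""
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frames.append(dft_magnitudes(signal[start:start + window_size]))
    return frames

# Hypothetical input: a 1 kHz tone sampled at 8 kHz for 0.1 s.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

window_size = 200   # 25 ms at 8 kHz (assumed window length)
hop = 120           # 15 ms at 8 kHz, matching "slide 15ms"
spec = spectrogram(tone, window_size, hop)

# Strongest frequency bin in the first window, converted to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / window_size
```

Each row of `spec` is one column of the spectrogram image: the frequency content of a single short window. A sequence of such rows is what a speech model would take as input.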
Automatic Speech
CDBNs for speech
Unlabeled TIMIT data -> convolutional DBN
Trained on unlabeled TIMIT corpus
Learned first-layer bases

Experimental Results

• Speaker identification

TIMIT Speaker identification      Accuracy
Prior art (Reynolds, 1995)        99.7%
Convolutional DBN                 100.0%

• Phone classification

TIMIT Phone classification        Accuracy
Clarkson et al. (1999)            77.6%
Gunawardana et al. (2005)         78.3%
Sung et al. (2007)                78.5%
Petrov et al. (2007)              78.6%
Sha & Saul (2006)                 78.9%
Yu et al. (2009)                  79.2%
Convolutional DBN                 80.3%

Lee et al., “Unsupervised feature learning for audio classification using convolutional deep belief networks”, NIPS 2009.
A Long List of Others
• Kaggle
– Merck Molecular Activity (‘12)
– Salary Prediction (‘13)

• Learning to Play Atari Games (‘13)
• NLP – chunking, NER, parsing, etc.
• Activity recognition from video
• Recommendations
Deep Learning In A Nutshell
• Architectures vs. features
• Deep vs. shallow
• Automatic* features
• Lots of data vs. best technique
• Compute- vs. human intensive
• State-of-the-art
• Breaks expert, domain barrier
• Details & tricks can be complex
http://www.deeplearning.net/
Interested in Deep Learning?
Connect for:
• Training Workshop (interest list)
• Projects / consulting

• Collaboration
• Questions

agardner@momentics.com
http://www.momentics.com/deep-learning/

Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 2014-01-08
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 

Deep Learning for Data Scientists - Data Science ATL Meetup Presentation, 2014-01-08

  • 1. Deep Learning for Data Scientists Andrew B. Gardner agardner@momentics.com http://linkd.in/1byADxC www.momentics.com/deep-learning
  • 2.
  • 3. Deep Learning in the Press… Ng, Hinton, LeCun, Zuckerberg, Kurzweil. “Google Hires Brains that Helped Supercharge Machine Learning.” Wired 3/2013. “Facebook Taps ‘Deep Learning’ Giant for New AI Lab.” Wired 12/2013. “Is ‘Deep Learning’ a Revolution in Artificial Intelligence?” New Yorker 11/2012. “The Man Behind the Google Brain: Andrew Ng and the Quest for the New AI.” Wired 5/2013. “New Techniques from Google and Ray Kurzweil Are Taking Artificial Intelligence to Another Level.” MIT Technology Review 5/2013.
  • 4. … Publication & Search Trends … [Chart: Google Scholar citations for “deep learning” + “neural network”, ‘06 to ‘11.] [Chart: Google Trends for big data, data science, deep learning, machine learning, ‘06 to ‘11.] Domains: computer vision, speech & audio, bioinformatics, etc. Conferences: NIPS, ICLR, ICML, …
  • 5. … Industry & Products • Google – Android Voice Recognition – Maps – Image+ • SIRI • Translation • Documents • … Microsoft Real-time English-Chinese Translation: https://www.youtube.com/watch?v=Nu-nlQqFCKg (Microsoft Chief Research Officer Rick Rashid, 11/2012)
  • 6. Deep Learning Epicenters (North America): de Freitas (UBC), Microsoft, Bengio (U Montreal), Hinton (U Toronto), Facebook, Ng (Stanford), Google, Yahoo, LeCun (NYU)
  • 7. Deep Learning: The Origin Story
  • 8. Before: A Cat Detector. We want to build this… classifier f : X → Y, where X ~ the images and Y ~ the labels {“cat”, “dog”} … for less than $1.0M!
  • 9. Challenge: Labeled Data. Labels are expensive → Less data. Intuitively: more data is good. [Figure: a few images labeled cat or dog; the rest unused, unlabeled]
  • 10. Challenge: Features. Features are expensive → Fewer, shallow. Intuitively: better features are good. [Figure: image (pixels) → magic feature dictionary (SIFT, HoG, binary histogram, moments, shape histogram, fang detector, something new) → x=(1.3, 2.8, …)]
  • 11. Machine Learning (Before) Building a Cat Detector 1.0 expensive important* Features Detector (Classifier)
  • 12. How Good is “More Data?” Labels are expensive → Less data • More data dominates* better techniques • Often have lots of data • … we just don’t have lots of labels • What if there was a way to use unlabeled data? [Figure 1: learning curves for confusion set disambiguation, e.g. {to, two, too}; test accuracy vs. millions of words (0.1 to 1000) for Memory-Based, Winnow, Perceptron, and Naïve Bayes learners, trained on a 1-billion-word corpus] “Scaling to Very Very Large Corpora for Natural Language Disambiguation,” Banko and Brill, 2001.
  • 13. The Impact of Features Intuitively: better features are good • Critical to success – even more than data! • How to create / engineer features? – Typically shallow • Domain-specific • What if there was a way to automatically learn features?
  • 14. Machine Learning (What We Want) Building a Cat Detector 2.0 bountiful important* Features + Detector (Classifier) end-to-end
  • 15. Deep Nets Intuition. Building an Object Recognition System: image → FEATURE EXTRACTOR → CLASSIFIER → label “CAR”, with intermediate representations in between. IDEA: Use data to optimize features for the given task. Lee et al., “Convolutional DBN's for scalable unsup. learning...” ICML 2009. Embedded slides: Ranzato.
  • 16. Another Example of Hierarchy. Hierarchical Learning: • natural progression from low to high level structure as … in natural complexity • … to monitor what is being … and guide the machine to … better subspaces • … good lower level representation can be used for … distinct tasks. [Figure: faces, parts, edges]
  • 17. Hierarchy Reusability? [Figure: faces, cars, elephants, chairs]
  • 18. A Breakthrough. G. E. Hinton, S. Osindero, and Y. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, pp. 1527–1554, 2006. G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504-507, July 2006. [Figure: before / after]
  • 19. Deep Belief Nets. MNIST: 60K + 10K images. Unsupervised pretraining, then supervised tuning. Test error (%) by technique: DBN pretrain 1.25; SVM 1.4; kNN 2.8-4.4; ConvNet 0.4 -> 0.23.
  • 20. MNIST Sample Errors Ciresan et al. “Deep Big Simple Neural Networks Excel on Handwritten Digit Recognition,” 2010
  • 21. Key Ideas • Learn features from data – Use all data • Deep architecture – Representation – Computational efficiency – Shared statistics • Practical training • State-of-the-art (it worked)
  • 22. After: Cat Detector unlabeled images (millions) labeled images (few) deep learning network more data automatic (deep) features
  • 23. How Does It Work?
  • 24. This Is A Neuron. 1. Sum all inputs (weighted): x = w0 + w1 z1 + w2 z2 + w3 z3. 2. Nonlinearly transform: y = f(x). Inputs: z1, z2, z3. Weights: w1, w2, w3. Bias: w0 (the weight on a constant input of 1). Output: y. Activation function f: sigmoid or tanh.
  • 25. A Neural Network. Forward propagation: weighted sum of inputs, produce activation, feed forward. Inputs (the features): weight = 13.5, n_teeth = 21, n_whiskers = 16 → Hidden → Output: cat, dog.
  • 26. Training: back propagation of error. Targets: cat = 1, dog = 0. Total error at top; proportional contributions going backwards. Inputs: weight = 13.5, n_teeth = 21, n_whiskers = 16.
  • 27. After Training network layer weights weights as a matrix [.5, -.2, 4, .15, -1,…] -.5 .4 0 .1 .1 .5 -1 2 [-.5, -.3, .4, 0, …] -.3 .7 -.2 .4 we can view weight matrix as image … plus performance evaluation & logging
  • 28. Building Blocks. So many choices! • Network Topology – Number of layers – Nodes per layer • Layer Type – Feedforward – Restricted Boltzmann – Autoencoder – Recurrent – Convolutional • Neuron Type – Rectified Linear Unit • Regularization – Dropout • Magic Numbers
  • 29. A Deep Learning Recipe, 1.0 • Lots of data, some+ labels • Train each RBM layer greedily, successively • Add an output layer and train with labels
  • 30. A Few Other Important Things • Deep Learning Recipe 2.0 – Dropout / regularization – Rectified Linear Units • Convolutional networks • Hyperparameters • Not just neural networks • Practical Issues (GPU)
  • 32. Sample Classification Results. ImageNet validation classification. Krizhevsky et al., NIPS 2012.
  • 33. Segmentation neuronal membranes Ciresan et al. “DNN segment neuronal membranes...” NIPS 2012
  • 34. CalTech 256. [Figure: results plotted against number of training examples] Zeiler & Fergus, “Visualizing and Understanding Convolutional Networks,” arXiv 1311.2901, 2013
  • 35. Application: Speech. “He can for example present significant university wide issues to the senate.” Spectrogram: window in time -> vector of frequencies; slide; repeat. [Figure labels: small time window, frequencies in window, slide 15ms, phoneme]
  • 36. Automatic Speech: CDBNs for speech. Unlabeled TIMIT data -> convolutional DBN (trained on unlabeled TIMIT corpus). • TIMIT speaker identification accuracy: Prior art (Reynolds, 1995) 99.7%; Convolutional DBN 100.0%. • TIMIT phone classification accuracy: Clarkson et al. (1999) 77.6%; Gunawardana et al. (2005) 78.3%; Sung et al. (2007) 78.5%; Petrov et al. (2007) 78.6%; Sha & Saul (2006) 78.9%; Yu et al. (2009) 79.2%; Convolutional DBN 80.3%. [Figure: learned first-layer bases] Lee et al., “Unsupervised feature learning for audio classification using convolutional deep belief networks”, NIPS 2009.
  • 37. A Long List of Others • Kaggle – Merck Molecular Activity (‘12) – Salary Prediction (‘13) • Learning to Play Atari Games (‘13) • NLP – chunking, NER, parsing, etc. • Activity recognition from video • Recommendations
  • 38. Deep Learning In A Nutshell • Architectures vs. features • Deep vs. shallow • Automatic* features • Lots of data vs. best technique • Compute- vs. human-intensive • State-of-the-art • Breaks expert, domain barrier • Details & tricks can be complex. http://www.deeplearning.net/
  • 39. Interested in Deep Learning? Connect for: • Training Workshop (interest list) • Projects / consulting • Collaboration • Questions agardner@momentics.com http://www.momentics.com/deep-learning/
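
Slides 24 and 25 describe a neuron (weighted sum plus bias, then a nonlinearity) and forward propagation through layers. A minimal NumPy sketch of those two steps follows. The layer sizes, the random weights, and the helper names `neuron` and `forward` are assumptions made for illustration; only the three input features (weight, n_teeth, n_whiskers) and the sigmoid/tanh choices come from the slides.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation, one of the two nonlinearities named on slide 24."""
    return 1.0 / (1.0 + np.exp(-x))

def neuron(z, w, w0, f=np.tanh):
    """One neuron: x = w0 + sum_i w_i * z_i, then y = f(x)."""
    x = w0 + np.dot(w, z)
    return f(x)

def forward(z, layers, f=np.tanh):
    """Forward propagation: each layer applies W @ a + b, then the activation f."""
    a = np.asarray(z, dtype=float)
    for W, b in layers:
        a = f(W @ a + b)
    return a

# Inputs from slide 25: weight, n_teeth, n_whiskers.
features = np.array([13.5, 21.0, 16.0])

# Hypothetical untrained network: 3 inputs -> 4 hidden units -> 2 outputs (cat, dog).
rng = np.random.default_rng(0)
layers = [
    (rng.normal(scale=0.1, size=(4, 3)), np.zeros(4)),
    (rng.normal(scale=0.1, size=(2, 4)), np.zeros(2)),
]
scores = forward(features, layers, f=sigmoid)
print(scores)  # two values in (0, 1); meaningless until the weights are trained
```

Each row of a weight matrix `W` plays the role of one neuron's weights, so a layer is just many copies of `neuron` evaluated at once. Training (slide 26) would adjust `W` and `b`; that step is not shown here.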

Editor's Notes

  1. (1:00) Thank organizers & attendees. My background, thesis. Invitation to connect. Talk in 3 parts: introduce and motivate the topic; high-level overview of deep learning details; examples.
  2. How many have heard of deep learning?
  3. Joke: Wired and ad placement. Companies are acquiring talent and demonstrating use cases. Zuckerberg @ NIPS.
  4. Growing popularity. Lots of applications motivated by vision and audio. Sensible because of connections to perception, AI and neural networks. Revolutions have participants.
  5. Products are seeing big lift. Example of real-time translation: kept it in the same voice! “I’m speaking in English and hopefully you’ll hear me speaking in Chinese in my own voice.”
  6. Apology for omission.
  7. As a data scientist, consume machine learning.
  8. Consider canonical problem: classification. Cats and dogs, cats and data scientists. In this case, we want to build a magic box that discriminates cats vs. dogs. Play on the Google cat detector (June 2012): 1000 nodes, 16000 cores, 1 week per trial @ $1/hr = ? Cat detector detects better than a cat. Leaving data on the table.
  9. Many examples, from all classes, required. Consequence -> use less data. Features require lots of engineering and work. Example here, SIFT, took over a decade for David Lowe to develop. Many examples of features: tail, fur, eyes, edges, height, etc.
  10. Features: raw numbers to smaller, better pile of numbers. Many examples, from all classes, required. Consequence -> use less data. Features require lots of engineering and work. Example here, SIFT, took over a decade for David Lowe to develop. Many examples of features: tail, fur, eyes, edges, height, etc. Best disciplined approach: copy and tweak. Show of hands: how many of you have experienced this?
  11. 80% of the data scientist job. We don’t scale: how long to get a PhD? Each loop we have to do invention and ideation. “Won a Kaggle contest using RF.” Workflow, feature engineering.
  12. This is not always true, but good for high variance problems. What examples of extra data? Not just a little more data, but a lot of data. Often have a lot more data today in the connected world.
  13. No principled way to generate features. No playbook for alien data features.
  14. Modules that learn features. Stack and I get a hierarchical decomposition.
  15. Hinton split time. Before & after.
  16. Describe MNIST: boring, easy. “Everything works at 96% accuracy.”
  17. This network achieved 0.35% error using online backprop. 6 layers (2500, 2000, 1500, 1000, 500, 10 units, the last being the 10-class output) with validation & test error .35% & .32%.
  18. Data flows from bottom to top. Affine + nonlinearity. Nonlinear regression. We have to learn the weights and bias. We have to pick the activation function.
  19. Backprop top. Backprop global.
  20. 1000 categories. 25% -> 15% error. Acquired by Google 1/13.
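
Note 17 lists the layer widths of the MNIST network. As a rough sense of scale, the number of learnable weights and biases in such a fully connected stack can be counted as below. This is a sketch under stated assumptions: the input size of 28 × 28 = 784 pixels (the standard MNIST image size) and one bias per unit are not given in the notes, and `count_parameters` is a hypothetical helper name.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of every layer, input first. Each pair of
    adjacent layers contributes n_in * n_out weights plus n_out biases.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Assumption: MNIST images are 28 x 28 pixels, flattened to 784 inputs.
mnist_inputs = 28 * 28
sizes = [mnist_inputs, 2500, 2000, 1500, 1000, 500, 10]
print(count_parameters(sizes))
```

Under these assumptions the count comes to roughly 12 million parameters, which helps explain why the closing slides list GPUs under practical issues.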