The document provides an introduction to deep learning and the Nervana framework. It discusses the speaker's background and Intel's Artificial Intelligence Products Group. It then covers machine learning concepts, a brief history of deep learning, neural network architectures, training procedures, and examples of computer vision applications for deep learning like image classification. Use cases for recurrent neural networks and long short-term memory networks are also mentioned.
Introduction to Deep Learning and neon at Galvanize
1. Proprietary and confidential. Do not distribute.
Introduction to Deep Learning and Neon
MAKING MACHINES SMARTER.™
Kyle H. Ambert, PhD
Senior Data Scientist
May 25th, 2017
@TheKyleAmbert
7. Nervana Systems Proprietary
About me & Intel’s Artificial Intelligence Products Group (AIPG)
Together, we create production deep learning solutions in multiple
domains, while advancing the field of applied analytics and optimization.
Intel’s Interest in Analytics
To provide the infrastructure for the fastest time-to-insight
To create tools that enable scientists to think about their research, rather than their process
To enable users to ask bigger questions
Bigger Data: Image: 1000 KB / picture; Audio: 5000 KB / song; Video: 5,000,000 KB / movie
Better Hardware: Transistor density doubles every 18 months; Cost / GB in 1995: $1000.00; Cost / GB in 2015: $0.03
Smarter Algorithms: Advances in neural networks leading to better accuracy in training models
Great solutions require great hardware!
SOLUTIONS
PLATFORMS/TOOLS: BigDL, Intel® Nervana™ Deep Learning Platform, Intel® Nervana™ Cloud, Intel® Nervana™ Graph
FRAMEWORKS
LIBRARIES: Intel® MKL, Intel® MKL-DNN, Intel® DAAL, Intel Distribution
HARDWARE: Compute, Memory/Storage, Fabric
This Evening
1. Machine Learning and Data Science
2. Introduction to Deep Learning
3. Nervana!
4. Neon
5. Deep Learning Use Cases
Machine learning is the development and application of algorithms that can learn from data in an automated, semi-automated, or supervised setting.
Statistical Learning: algorithms which leverage statistical methods for estimating functions from examples (Naïve Bayes, SVM, GLM, tree-based, kNN)
Deep Learning: algorithms where multiple layers of neurons learn successively complex representations of input data (CNN, RNN, DFF, RBM, LSTM)
Training: building a mathematical model based on input data
Classification (scoring): using a trained model to make predictions about new data
Workflow stages (figure): Ingest Data, Engineer Features, Structure Model, Clean Data, Visualize, Query/Analyze, Train Model, Deploy
A Quite Brief History of Deep Learning
• 1949: The Organization of Behavior is published (Hebb)
• 1960s: Neural networks used for binary classification
• 1970s: Neural networks' popularity dries up after they fail to deliver on the hype
• 1980s: Backpropagation is used to train deep networks
• 1990s: Neural networks take a back seat to support vector machines, which have nice theoretical properties and guaranteed bounds
• 2010s: Access to large datasets and more computation allows deep networks to return and achieve state-of-the-art results in speech, vision, and natural language processing
Key figures: Hebb, Minsky, Vapnik, Hinton
Today: Deep learning is a fast-moving area of academic and applied analytics! There are many opportunities for new discoveries!
ML v. DL: Practical Differences
SVM
Random Forest
Naïve Bayes
Decision Trees
Logistic Regression
Ensemble methods
Harrison
Workflows in Machine Learning
⟹ The same rules apply for deep learning!
➝ Preprocessing data
➝ Feature extraction
➝ Parsimony in model selection
⟹ How we go about some of this does change…
Deep Learning: Networks of Artificial Neurons
Figure labels: output of unit, activation function, linear weights, bias unit, input from unit j
⟹ With an explosion of moving parts,
being able to understand and keep
track of what sort of model is being
built becomes even more important!
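The labels on the neuron diagram (inputs from units j, linear weights, a bias unit, an activation function, and the output of the unit) match the usual artificial neuron, y_i = f(sum_j w_ij * x_j + b_i). The following is a minimal sketch of that computation, assuming a sigmoid activation; the function names and the example numbers are illustrative and do not come from the slides.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice of activation function)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Output of one unit: activation(weighted sum of inputs + bias)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Illustrative values only.
x = [0.5, -1.0, 2.0]   # inputs from units j
w = [0.4, 0.3, -0.2]   # linear weights
b = 0.1                # bias
y = neuron_output(x, w, b)
print(round(y, 4))
```

A full network is many of these units arranged in layers, which is why the number of moving parts grows so quickly.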
Practical example: recognition of handwritten digits
MNIST dataset: 70,000 images (28x28 pixels)
Goal: classify each image as a digit 0-9
Input: N = 28 x 28 pixels = 784 input units
Hidden: N = 100 hidden units (user-defined parameter)
Output: N = 10 output units (one for each digit); each unit i encodes the probability that the input image is the digit i
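To make the 784-100-10 layout concrete, here is a rough sketch of one forward pass through such a network in plain Python. The ReLU hidden activation, the softmax output, the Gaussian initialization, and the random stand-in image are all assumptions made for illustration; the slides specify only the layer sizes.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 28 * 28, 100, 10


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (assumed initialization)."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(x, weights, biases):
    """One fully connected layer: weighted sums plus biases."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Turn output scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(image, hidden, output):
    h = [max(0.0, v) for v in affine(image, *hidden)]  # ReLU hidden units (assumed)
    return softmax(affine(h, *output))


rng = random.Random(0)
hidden = init_layer(N_INPUT, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUT, rng)

# Weights plus biases for both layers.
n_params = N_INPUT * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUT + N_OUTPUT

image = [rng.random() for _ in range(N_INPUT)]  # stand-in for a flattened 28x28 image
probs = forward(image, hidden, output)
print(n_params, len(probs))
```

Even this small network has 79,510 trainable parameters, most of them in the input-to-hidden weights.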
Why Does This Work at All?
Krizhevsky, 2012: 60 million parameters
Taigman, 2014: 120 million parameters
Nervana in 30 seconds. Possibly less.
neon deep learning framework
train, deploy, explore
nervana engine
2-3x speedup on Titan X GPUs
cloud
Curated Models
• https://github.com/NervanaSystems/ModelZoo
• Pre-trained weights and models
SegNet
Deep Speech 2
Skip-thought
Autoencoders
Deep Dream
Neon workflow
1. Generate backend
2. Load data
3. Specify model architecture
4. Define training parameters
5. Train model
6. Evaluate
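The six steps above can be sketched end to end in plain Python. This is not neon code and uses none of neon's API: the "backend" is just a seeded random generator and a batch size, the data is a toy two-feature problem, and the model is a single logistic unit. Every name here is hypothetical and chosen only to mirror the step names.

```python
import math
import random

# 1. Generate backend (stand-in: a seeded RNG and a batch size).
backend = {"rng": random.Random(0), "batch_size": 4}


# 2. Load data (toy points labeled by whether x0 + x1 > 1).
def make_data(n, rng):
    data = []
    for _ in range(n):
        x = [rng.random(), rng.random()]
        data.append((x, 1.0 if x[0] + x[1] > 1.0 else 0.0))
    return data


train_set = make_data(200, backend["rng"])
valid_set = make_data(100, backend["rng"])

# 3. Specify model architecture (one logistic unit).
model = {"w": [0.0, 0.0], "b": 0.0}


def predict(model, x):
    z = sum(w * xi for w, xi in zip(model["w"], x)) + model["b"]
    return 1.0 / (1.0 + math.exp(-z))


# 4. Define training parameters.
learning_rate, num_epochs = 0.5, 50


# 5. Train model (mini-batch gradient descent on cross-entropy loss).
def train(model, data, lr, epochs, batch_size):
    for _ in range(epochs):
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad_w, grad_b = [0.0, 0.0], 0.0
            for x, y in batch:
                err = predict(model, x) - y
                grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
                grad_b += err
            model["w"] = [w - lr * g / len(batch)
                          for w, g in zip(model["w"], grad_w)]
            model["b"] -= lr * grad_b / len(batch)


train(model, train_set, learning_rate, num_epochs, backend["batch_size"])


# 6. Evaluate (misclassification rate on held-out data).
def error_rate(model, data):
    wrong = sum(1 for x, y in data if (predict(model, x) > 0.5) != (y == 1.0))
    return wrong / len(data)


valid_error = error_rate(model, valid_set)
print(valid_error)
```

In a framework, each step becomes a call or two, but the order and the separation of concerns stay the same.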
•Layers: convolution, rectified linear units, pooling, dropout, softmax
•Popular with 2D + depth (+ time) inputs
•Gray or RGB images
•Videos
•Synthetic aperture radar
•Spectrogram (speech)
•Layer: convolution
•Use multiple copies of the same feature on the input
(correlation)
•Use several features (aka kernels, filters)
•Reduces number of weights compared to fully connected
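As a small illustration of sliding one shared feature across the input, the sketch below applies a single 2x2 kernel to a 4x4 image with a "valid" cross-correlation and compares the weight count with a fully connected layer of the same input and output size. The image, the kernel values, and the function name are made up for the example.

```python
def correlate2d(image, kernel):
    """Valid 2D cross-correlation: the same kernel is applied at every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# A 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# One feature (kernel) that responds to dark-to-bright vertical edges.
vertical_edge = [[-1, 1],
                 [-1, 1]]

feature_map = correlate2d(image, vertical_edge)

# Weight sharing: the kernel has 4 weights no matter how large the image is.
conv_weights = len(vertical_edge) * len(vertical_edge[0])
# A fully connected layer from 16 inputs to the same 9 outputs needs 16 * 9 weights.
dense_weights = (4 * 4) * (len(feature_map) * len(feature_map[0]))
print(feature_map, conv_weights, dense_weights)
```

The feature map is nonzero only in the column where the edge sits, and using several kernels simply produces several such maps.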
•Layer: rectified linear units (ReLU)
•It is fast – no normalization or exponential computations
•Induces sparsity in the hidden units
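A ReLU is just max(0, z), which is why it needs no exponentials or normalization. The sketch below applies it to a few made-up pre-activation values and measures how many hidden units end up exactly zero.

```python
def relu(values):
    """Rectified linear unit: pass positive values through, zero out the rest."""
    return [max(0.0, v) for v in values]


pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0]  # illustrative values
activations = relu(pre_activations)

# Fraction of units that are exactly zero after the ReLU.
sparsity = sum(1 for a in activations if a == 0.0) / len(activations)
print(activations, sparsity)
```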
•Layer: pooling
•Downsampling
•Reduces the number of parameters
•Provides some translation invariance
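The following sketch shows non-overlapping 2x2 max pooling, one common form of pooling (the slides do not say which variant is meant). It downsamples a 4x4 feature map to 2x2 and shows that moving a feature by one pixel inside a pooling window leaves the output unchanged. The feature maps are invented for the example.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, cols - size + 1, size)]
            for r in range(0, rows - size + 1, size)]


original = [[9, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 5],
            [0, 0, 0, 0]]

# Same features, each shifted by one pixel within its 2x2 window.
shifted = [[0, 9, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 5, 0],
           [0, 0, 0, 0]]

pooled_original = max_pool2d(original)
pooled_shifted = max_pool2d(shifted)
print(pooled_original, pooled_shifted)
```

The invariance is only partial: a shift that crosses a window boundary does change the pooled output.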
•Layer: dropout
•Reduces overfitting – Prevents co-adaptation on training data
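One way to picture dropout is as a random mask over the hidden units during training. The sketch below uses the "inverted dropout" variant, which rescales the surviving units so the expected activation is unchanged and nothing needs to happen at inference time. That variant, the keep probability of 0.5, and the function name are assumptions, not something the slides specify.

```python
import random


def dropout(activations, keep_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability 1 - keep_prob,
    and scale the survivors by 1 / keep_prob. No-op when not training."""
    if not training:
        return list(activations)
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)
hidden = [1.0] * 1000  # illustrative hidden-layer activations

dropped = dropout(hidden, keep_prob=0.5, rng=rng)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)

at_inference = dropout(hidden, keep_prob=0.5, rng=rng, training=False)
print(kept_fraction)
```

Because a different random subset of units is silenced on every pass, no unit can rely on a specific partner being present.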
•Layer: softmax
•aka “normalized exponential function”
•Normalizes vector to a probability distribution
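The normalized exponential can be written in a few lines. This sketch subtracts the maximum score before exponentiating, a common trick for numerical stability that does not change the result; the input scores are made up.

```python
import math


def softmax(scores):
    """Normalized exponential: exp(s_i) / sum_k exp(s_k)."""
    m = max(scores)  # subtracting the max avoids overflow without changing the result
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # illustrative output scores
print([round(p, 3) for p in probs])
```

The outputs are positive, sum to 1, and preserve the ordering of the input scores.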
Long Short-Term Memory (LSTM)
Manipulate memory cell:
1. “forget” (flush the memory)
2. “input” (add to memory)
3. “output” (get from memory)
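The three memory operations correspond to the forget, input, and output gates of the standard LSTM cell. Below is a minimal sketch of one time step using scalars instead of vectors to keep it readable. The parameter names (wf, uf, bf, and so on) and the `make_params` helper are hypothetical, and the gate biases are set by hand only to show the forget gate either keeping or flushing the memory.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; p holds input (w), recurrent (u), and bias (b) terms."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # "forget": how much memory to keep
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # "input": how much to add
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # "output": how much to expose
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory content
    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # value read out of the cell
    return h, c


def make_params(**overrides):
    """All-zero parameters, with selected entries overridden (illustrative helper)."""
    params = {kind + gate: 0.0 for kind in ("w", "u", "b") for gate in ("f", "i", "o", "g")}
    params.update(overrides)
    return params


keep = make_params(bf=20.0, bi=-20.0)    # forget gate open, input gate closed
flush = make_params(bf=-20.0, bi=-20.0)  # forget gate closed, input gate closed

_, c_kept = lstm_step(x=1.0, h_prev=0.0, c_prev=5.0, p=keep)
_, c_flushed = lstm_step(x=1.0, h_prev=0.0, c_prev=5.0, p=flush)
print(c_kept, c_flushed)
```

In a trained network the gates are learned functions of the input and previous state, so the cell decides for itself when to remember and when to forget.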
Example – Sentiment analysis with LSTM
“Okay, sorry, but I loved this movie. I just
love the whole 80’s genre of these kind
of movies, because you don’t see many
like this...” -~CupidGrl~
POSITIVE
The plot/writing is completely unrealistic and just dumb at
times. Bond is dressed up in a white tux on an overnight
train ride? eh, OK. But then they just show up at the
villain’s compound like nothing bad is going to happen to
them. How stupid is this Bond?
NEGATIVE
Preprocessing
“Okay, sorry, but I loved this movie. I just
love the whole 80’s genre of these kind
of movies, because you don’t see many
like this...” -~CupidGrl~
[5, 4, 940, 107, 14, 672, 1790,
333, 47, 11, 7890, …,1]
Out-of-Vocab
(e.g. CupidGrl)
• Limit vocab size to 20,000 words
• Truncate each example to 128 words [from the left]
• Pad shorter examples up to 128 words with whitespace
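The preprocessing steps can be sketched as follows. Several details are assumptions made for illustration: the reserved ids (0 for padding, 1 for out-of-vocabulary words), whitespace tokenization, reading "from the left" as dropping the earliest words, and padding on the left. The function names are hypothetical, and a tiny vocabulary and length stand in for the 20,000 words and 128 tokens from the slide.

```python
from collections import Counter

PAD, OOV = 0, 1     # assumed reserved ids
VOCAB_SIZE = 20000  # from the slide
MAX_LEN = 128       # from the slide


def build_vocab(texts, max_size=VOCAB_SIZE):
    """Map the max_size most frequent words to ids, starting after the reserved ids."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: idx + 2 for idx, (word, _) in enumerate(counts.most_common(max_size))}


def encode(text, vocab, max_len=MAX_LEN):
    """Convert text to a fixed-length list of word ids."""
    ids = [vocab.get(word, OOV) for word in text.lower().split()]
    ids = ids[-max_len:]                       # truncate from the left (keep the last words)
    return [PAD] * (max_len - len(ids)) + ids  # pad short examples (left padding assumed)


# Toy corpus with a tiny vocabulary limit so the effect is visible.
corpus = ["loved this movie", "loved this", "loved"]
vocab = build_vocab(corpus, max_size=2)

short_example = encode("loved this movie CupidGrl", vocab, max_len=6)
long_example = encode("this this loved this loved", vocab, max_len=3)
print(short_example, long_example)
```

Words outside the vocabulary limit, such as the username in the review, all collapse to the same out-of-vocabulary id.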
In Summary…
1. Deep learning methods are powerful and versatile
2. It’s important to understand how DL relates to
traditional ML methods
3. The barrier to entry for using DL in practice is
lowered with the neon framework on the Nervana
ecosystem
kyle.h.ambert@intel.com
@TheKyleAmbert