Developing deep learning models with neon
Arjun Bansal
startup.ml
November 7, 2015
Outline
2
• Intro to Deep Learning
• Nervana platform
• Neon
• Building a sentiment analysis model (hands-on)
• Building a model that learns to play video games (demo)
• Nervana Cloud
INTRO TO DEEP LEARNING
3
4
What is deep learning?
A method for extracting features at multiple levels of abstraction
• Features are discovered from data
• Performance improves with more data
• Network can express complex transformations
• High degree of representational power

WHAT IS DEEP LEARNING? MORE THAN AN ALGORITHM - A FUNDAMENTALLY DISTINCT COMPUTE PARADIGM
A method of extracting features at multiple levels of abstraction
• Unsupervised learning can find structure in unlabeled datasets
• Supervised learning optimizes solutions for a particular application
• Performance improves with more training data
5
Convolutional neural networks
[Diagram: a stack of Filter + Non-Linearity and Pooling layers followed by fully connected layers; inputs such as speech ("how can I help you?") or an image of a cat are transformed through low level features, mid level features, object parts / phonemes, and finally objects / words]
*Hinton et al., LeCun, Zeiler, Fergus
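Expressed in neon layer terms, a minimal sketch of such a stack (imports omitted as in the deck's other snippets; the filter shapes and the Pooling window size are illustrative assumptions, not a prescribed architecture):

# hypothetical small convolutional stack, illustrative only
init = Gaussian(loc=0.0, scale=0.01)
layers = [
    Conv((5, 5, 16), init=init, activation=Rectlin()),   # filter + non-linearity
    Pooling(2),                                           # pooling
    Conv((5, 5, 32), init=init, activation=Rectlin()),    # filter + non-linearity
    Pooling(2),                                           # pooling
    Affine(nout=10, init=init, activation=Softmax())      # fully connected classifier
]
model = Model(layers=layers)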
6-8
Improved accuracy
[Chart: ImageNet top-5 error rate1 by year, 2010-2015; the error falls sharply once deep learning techniques are adopted, approaching the marked human performance level]
Source: ImageNet
1: ImageNet top 5 error rate
9
Scene Parsing
*Yann LeCun https://www.youtube.com/watch?v=ZJMtDRbqH40
10
Speech Translation
*Skype https://www.youtube.com/watch?v=eu9kMIeS0wQ
11
Understanding Images
*Karpathy http://cs.stanford.edu/people/karpathy/deepimagesent/
12
Types of models
Convolutional Neural Network (CNN): Object localization and classification in images
Restricted Boltzmann Machines (RBM): Drug targeting, Collaborative Filtering, Imputing missing interactions
Recurrent Neural Networks (RNN): Forecasting or predictions for timeseries and sequence datasets
Multilayer Perceptrons (MLP): Arbitrary input-output problems
Deep Q Networks (DQN): Reinforcement Learning problems, State-Action learning, decision-making
13
Recurrent neural networks
• MLP: input → hidden → output
• Add recurrent connections: input → recurrent → output
• Unroll and train as a feed-forward network: input → hidden → output, repeated over timesteps…
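To make the unrolling concrete, a minimal sketch in plain numpy (illustrative only, not the neon implementation; array shapes are assumptions):

import numpy as np

def rnn_forward(x, W_xh, W_hh, W_hy, h0):
    # x: (T, input_dim); weights map input->hidden, hidden->hidden, hidden->output
    h = h0
    outputs = []
    for t in range(x.shape[0]):               # each loop iteration is one unrolled copy
        h = np.tanh(x[t] @ W_xh + h @ W_hh)   # recurrent hidden update
        outputs.append(h @ W_hy)              # per-timestep output
    return np.stack(outputs), h

Training then backpropagates through this unrolled graph as if it were a deep feed-forward network with shared weights.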
14
Long short term memory
Network activations determine states of the input, forget, and output gates:
• Open input, open output, closed forget: LSTM network acts like a standard RNN
• Closing input, opening forget: Memory cell recalls previous state, new input is ignored
• Closing output: Internal state is stored for the next time step without producing any output
[Diagram: LSTM cell with gates f, g, i, o, squashing function φ, previous cell state ct-1, updated cell state ct, and hidden states ht-1, ht]
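In equations, one common formulation consistent with the f, g, i, o, φ labels in the diagram (σ is the logistic gate activation, φ a tanh-like squashing function, ⊙ the elementwise product; the weight names are illustrative):

\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
g_t &= \phi(W_g x_t + U_g h_{t-1} + b_g) && \text{(cell input)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t && \text{(cell state)}\\
h_t &= o_t \odot \phi(c_t) && \text{(hidden output)}
\end{aligned}

Setting i_t ≈ 0 and f_t ≈ 1 carries c_{t-1} forward unchanged (the memory-recall case above); setting o_t ≈ 0 keeps the state for the next step without emitting output.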
15
LSTM networks
[Diagram: LSTM weights grouped into blocks for the cell input, input gate, forget gate, and memory]
• Requires less tuning than an RNN, with the same or better performance
• neon implementation hides internal complexity from the user
• LSTMs achieve state-of-the-art results on sequence and time series data: machine translation, video recognition, speech recognition, caption generation
NERVANA PLATFORM
16
17
Scalable deep learning is hard and expensive
[Pipeline: pre-process training data → augment data → design model → perform hyperparameter search]
• Team of data scientists with deep learning expertise
• Enormous compute (CPUs / GPUs) and engineering resources
http://papers.nips.cc/paper/4687-large-scale-distributed-deep-networks.pdf
18
nervana platform for deep learning
neon deep learning framework: explore, train, deploy
nervana cloud
[Diagram: the nervana cloud running on AWS (web front end, VMs, S3 storage), backed by GPUs, CPUs, and the nervana engine]
20
Deep learning as a core technology
[Diagram: deep learning on the Nervana Platform at the hub of applications: image classification, image localization, speech recognition, video indexing, sentiment analysis, machine translation]
21
Core technology
• Unprecedented compute density
• Scalable distributed architecture
• Learning and inference
• Architecture optimized for algorithm
22
Verticals
Pharma, Oil & Gas, Agriculture, Medical, Finance, Internet, Government
NEON
23
neon: nervana python deep learning library
24
• User-friendly, extensible, abstracts parallelism & data caching
• Support for many deep learning models
• Interface to nervana cloud
• Supports multiple backends
• Currently optimized for Maxwell GPU at assembler level
• Basic automatic differentiation
• Open source (Apache 2.0)
Backends: nervana engine, GPU cluster, CPU cluster
See github for details
High level design
25
Backends: NervanaCPU, NervanaGPU; NervanaEngine (internal)
Datasets: Images: ImageNet, CIFAR-10, MNIST; Captions: flickr8k, flickr30k, COCO; Text: Penn Treebank, hutter-prize, IMDB, Amazon
Initializers: Constant, Uniform, Gaussian, Glorot Uniform
Learning rules: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad
Activations: Rectified Linear, Softmax, Tanh, Logistic
Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, Recurrent Sum, LookupTable
Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error
Metrics: Misclassification, TopKMisclassification, Accuracy
• Modular components
• Extensible, OO design
• Documentation: neon.nervanasys.com
Using neon
26
Start with basic model (mlp.py):

# create training set
train_set = DataIterator(X, y)

# define model
init_norm = Gaussian(loc=0.0, scale=0.01)
layers = [
    Affine(nout=100, init=init_norm, activation=Rectlin()),
    Affine(nout=10, init=init_norm, activation=Logistic(shortcut=True))
]
model = Model(layers=layers)
cost = GeneralizedCost(CrossEntropyBinary())
optimizer = GradientDescentMomentum(0.1, momentum_coef=0.9)

# fit model
model.fit(train_set, optimizer=optimizer, cost=cost)

[Diagram: multilayer perceptron mapping input x to output y]
Using neon
27
Define data, model (rnn.py):

# create training set
train_set = DataIteratorSequence(X, y)

# define model
init = Uniform(low=-0.08, high=0.08)
layers = [
    LSTM(hidden, init, Logistic(), Tanh()),
    Dropout(keep=0.5),
    Affine(features, init, bias=init, activation=Identity())
]
model = Model(layers=layers)
cost = GeneralizedCost(SumSquared())
optimizer = RMSProp()

# fit model
model.fit(train_set, optimizer=optimizer, cost=cost)

[Diagram: recurrent neural net mapping inputs xt1 … xtk to outputs yt1 … ytk]
Speed is important
28
iteration = innovation
VGG-B ImageNet training
[Bar chart: train time in hours. CPU: 25,000* (*estimate); Single GPU: 1,000; NervanaGPU: 450; Multi NervanaGPU: 64]
Benchmarks for convnets1
29
Benchmarks compiled by Facebook. Smaller is better.
1 Soumith Chintala, github.com/soumith/convnet-benchmarks
Benchmarks for convnets (updated1)
30
Benchmarks compiled by Facebook. Smaller is better.
1 Soumith Chintala, github.com/soumith/convnet-benchmarks
31
VGG-D speed comparison
Runtimes:

VGG-D                        NEON [NervanaGPU]   Caffe [CuDNN v3]   NEON Speed Up
fprop                        363 ms              581 ms             1.6x
bprop                        762 ms              1472 ms            1.9x
full forward/backward pass   1125 ms             2053 ms            1.8x
Benchmarks for RNNs1
32
GEMM benchmarks compiled by Baidu. Bigger is better. 1 Erich Elsen, http://svail.github.io/
33
Optimized data loading
• Goal: ensure neon never blocks waiting for data
• C++ multi-threaded
• Double buffered, pooled resources
[Sequence diagram: a library wrapper drives the DataLoader, which creates IO and decode thread pools, reads macrobatch files into raw file buffers, decodes them into macrobatch buffers, and fills pinned minibatch buffers on each next call until stop destroys the thread pools]
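The neon data loader itself is C++; as a rough illustration of the double-buffering idea only, a minimal Python sketch under assumed names (not the actual implementation):

import threading, queue

class DoubleBufferedLoader:
    """Illustrative only: decode the next minibatch on a background
    thread while the current one is being consumed."""
    def __init__(self, minibatches, depth=2):
        self._src = iter(minibatches)
        self._q = queue.Queue(maxsize=depth)    # a small pool of in-flight buffers
        self._t = threading.Thread(target=self._fill, daemon=True)
        self._t.start()

    def _fill(self):
        for mb in self._src:                    # "decode" work happens here
            self._q.put(mb)
        self._q.put(None)                       # sentinel: no more data

    def __iter__(self):
        while True:
            mb = self._q.get()
            if mb is None:
                return
            yield mb                            # training never blocks on IO

# usage: for X, y in DoubleBufferedLoader(batches): run one training step on (X, y)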
HANDS ON EXERCISE
34
Sentiment analysis using LSTMs
35
• Analyze text and map it to a numerical rating (1-5)
• Movie reviews (IMDB)
• Product reviews (Amazon, coming soon)
Data preprocessing
36
• Convert words to one-hot indices
• Top 50,000 words kept in the vocabulary
• PAD, OOV, START tags
• Ids assigned based on frequency
• Pre-defined sentence length
• Targets binarized to positive (>=7), negative (<7)
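A rough sketch of this preprocessing in plain Python (illustrative only; the special-tag ids, sentence length, and padding scheme are assumptions, and the actual pipeline is handled by neon's load_text / Text.pad_data shown later):

from collections import Counter

PAD, START, OOV = 0, 1, 2           # special tag ids (assumed)
VOCAB_SIZE = 50000                  # top 50,000 words
SENTENCE_LENGTH = 128               # pre-defined sentence length (assumed value)

def build_vocab(tokenized_reviews):
    # ids assigned by frequency: more frequent words get smaller ids after the tags
    counts = Counter(w for review in tokenized_reviews for w in review)
    return {w: i + 3 for i, (w, _) in enumerate(counts.most_common(VOCAB_SIZE - 3))}

def encode(review, vocab):
    ids = [START] + [vocab.get(w, OOV) for w in review]
    ids = ids[:SENTENCE_LENGTH]                          # truncate long reviews
    return [PAD] * (SENTENCE_LENGTH - len(ids)) + ids    # pad short reviews

def binarize_rating(rating):
    return 1 if rating >= 7 else 0   # positive (>=7) vs negative (<7)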
Embedding
37
• Learning to embed words from a sparse representation to a dense space
Mikolov et al. 2013a
*http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/
W(woman)−W(man) ≃ W(aunt)−W(uncle)
W(woman)−W(man) ≃ W(queen)−W(king)
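These relations are read directly off the learned embedding matrix; a minimal numpy sketch of the analogy lookup (W here is a placeholder dict mapping words to vectors):

import numpy as np

def analogy(W, a, b, c):
    # returns the word closest to W[b] - W[a] + W[c],
    # e.g. analogy(W, 'man', 'woman', 'king') is expected to be 'queen'
    target = W[b] - W[a] + W[c]
    cos = lambda u, v: float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return max((w for w in W if w not in (a, b, c)), key=lambda w: cos(W[w], target))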
Model architecture
38
http://deeplearning.net/tutorial/lstm.html
See J. Li et al., EMNLP 2015 - http://arxiv.org/pdf/1503.00185v5.pdf
[Diagram: example phrases ("This movie was awesome", "the opposite of …") flow through an Embedding layer, an LSTM layer (128), Recurrent Sum + Dropout, and an Affine layer producing positive / negative outputs]
Backend
39
NervanaCPU, NervanaGPU
NervanaEngine (internal)

# setup backend
be = gen_backend(backend=args.backend,
                 batch_size=batch_size,
                 rng_seed=args.rng_seed,
                 device_id=args.device_id,
                 default_dtype=args.datatype)

# invoking from command line with arguments
python examples/imdb_lstm.py -b cpu -e 2 -val 1 -r 0
Dataset
40

# make dataset
path = load_text('imdb', path=args.data_dir)
(X_train, y_train), (X_test, y_test), nclass = Text.pad_data(
    path, vocab_size=vocab_size,
    sentence_length=sentence_length)
train_set = DataIterator(X_train, y_train, nclass=2)
test_set = DataIterator(X_test, y_test, nclass=2)

Images: ImageNet, CIFAR-10, MNIST
Captions: flickr8k, flickr30k, COCO
Text: Penn Treebank, hutter-prize, IMDB, Amazon reviews
Initializers
41

# weight initialization
init_emb = Uniform(low=-0.1/embedding_dim, high=0.1/embedding_dim)
init_glorot = GlorotUniform()

Constant, Uniform, Gaussian, Glorot Uniform
Architecture
42

# Layers and Activations
layers = [
    LookupTable(vocab_size=vocab_size,
                embedding_dim=embedding_dim, init=init_emb),
    LSTM(hidden_size, init_glorot, activation=Tanh(),
         gate_activation=Logistic(), reset_cells=True),
    RecurrentSum(),
    Dropout(keep=0.5),
    Affine(2, init_glorot, bias=init_glorot,
           activation=Softmax())
]

Activations: Rectified Linear, Softmax, Tanh, Logistic
Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, Recurrent Sum, LookupTable
Cost & Metrics
43

cost = GeneralizedCost(costfunc=CrossEntropyMulti(usebits=True))
metric = Accuracy()
model = Model(layers=layers)

Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error
Metrics: Misclassification, TopKMisclassification, Accuracy
Learning rules & Callbacks
44

optimizer = Adagrad(learning_rate=0.01,
                    clip_gradients=clip_gradients)

# configure callbacks
callbacks = Callbacks(model, train_set, args,
                      valid_set=test_set)

Learning rules: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad
Train model
45

model.fit(train_set,
          optimizer=optimizer,
          num_epochs=num_epochs,
          cost=cost,
          callbacks=callbacks)
Demo
46
• Training
• python train.py -e 2 -val 1 -r 0 -s model.pkl --serialize 1
• Inference
• python inference.py --train_fname model
• Exercise
• Use word2vec to initialize embeddings
git checkout tutorial
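For the word2vec exercise, one possible starting point (a sketch under assumptions: gensim's KeyedVectors for loading pretrained vectors, a vocab dict mapping words to the ids built earlier, and a placeholder file path; how the matrix is handed to the LookupTable depends on the neon version you are using):

import numpy as np
from gensim.models import KeyedVectors

# the vector file path is a placeholder
w2v = KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

# build an initial embedding matrix aligned with the IMDB vocabulary ids
emb = np.random.uniform(-0.1, 0.1, (vocab_size, w2v.vector_size))
for word, idx in vocab.items():          # vocab: word -> id mapping (assumed available)
    if idx < vocab_size and word in w2v:
        emb[idx] = w2v[word]             # seed known words with their pretrained vectors

# then use emb to initialize the LookupTable weights in place of init_emb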
DEMO
47
Deep Reinforcement Learning*
48
• Learning video games from raw pixels and scores
• Developer contribution: Tambet Matiisen, University of Tartu, Estonia
• https://github.com/tambetm/simple_dqn
*Mnih et al., Nature (2015)
Deep Reinforcement Learning
49
• Convnet to compute Q score for state, action pairs
• Replay memory (to remove correlations in observation sequence)
• Freezing network (to reduce correlation with target)
• Clipping scores between -1, +1 (same learning rate across games)
• Same network can play a range of games
Mnih et al., Nature (2015)
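In equation form, the training loss from Mnih et al. ties these pieces together, with θ⁻ the periodically frozen target-network parameters and transitions (s, a, r, s') sampled from replay memory:

L(\theta) = \mathbb{E}_{(s,a,r,s')}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\right)^{2}\right]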
Algorithm
50
Mnih et al., Nature (2015)
Deep Reinforcement Learning
51
Mnih et al., Nature (2015)
[Diagram: game frames pass through three Conv layers and two FC layers to produce Q*(s,a)]
DQN code (deepqnetwork.py)
52

init_norm = Gaussian(loc=0.0, scale=0.01)
layers = []
layers.append(Conv((8, 8, 32), strides=4, init=init_norm, activation=Rectlin()))
layers.append(Conv((4, 4, 64), strides=2, init=init_norm, activation=Rectlin()))
layers.append(Conv((3, 3, 64), strides=1, init=init_norm, activation=Rectlin()))
layers.append(Affine(nout=512, init=init_norm, activation=Rectlin()))
layers.append(Affine(nout=num_actions, init=init_norm))
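The final Affine layer has no activation, so its outputs are the raw Q values, one per action. Action selection is then typically ε-greedy over those outputs; a minimal illustrative sketch (not the simple_dqn code):

import numpy as np

def choose_action(q_values, epsilon, rng=np.random):
    # q_values: Q(s, a) for every action in the current state
    if rng.rand() < epsilon:
        return rng.randint(len(q_values))   # explore: random action
    return int(np.argmax(q_values))         # exploit: highest current estimate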
Other parts of the code
53
• main.py: executable
• agent.py: Agent class (learning and playing)
• environment.py: wrapper for Arcade Learning Environment (ALE)
• replay_memory.py: replay memory class
Demo
54
• Training
• ./train.sh --minimal_action_set roms/breakout.bin
• ./train.sh --minimal_action_set roms/pong.bin
• Plot results
• ./plot.sh results/breakout.csv
• Play (observe the network learning)
• ./play.sh --minimal_action_set roms/pong.bin --load_weights snapshots/pong_<epoch>.pkl
• Record
• ./record.sh --minimal_action_set roms/pong.bin --load_weights snapshots/pong_<epoch>.pkl
NERVANA CLOUD
55
Using neon and nervana cloud
56
Running locally:
% python rnn.py # or neon rnn.yaml
Running in nervana cloud:
% ncloud submit rnn.py # or rnn.yaml
% ncloud show <model_id>
% ncloud list
% ncloud deploy <model_id>
% ncloud predict <model_id> <data> # or use REST api
Contact
57
arjun@nervanasys.com
@coffeephoenix
github.com/NervanaSystems/neon