Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Deep Learning on HDP
Dhruv Kumar – dkumar@hortonworks.com
Solutions Engineer, Hortonworks
2015
Version 1.0
Scientists See Promise in Deep-Learning Programs
John Markoff
November 23, 2012
Rick Rashid in Tianjin, October 25, 2012
Impact of deep learning in speech technology
Where is it?
……Facebook’s foray into deep learning sees it following its
competitors Google and Microsoft, which have used the approach to
impressive effect in the past year. Google has hired and acquired
leading talent in the field (see “10 Breakthrough Technologies 2013:
Deep Learning”), and last year
created software that taught itself to recognize cats and other objects
by reviewing stills from YouTube videos. The underlying deep learning
technology was later used to slash the error rate of Google’s voice
recognition services (see “Google’s Virtual Brain Goes to Work”)
….Researchers at Microsoft have used deep learning to build a
system that translates speech from English to Mandarin Chinese in
real time (see “Microsoft Brings Star Trek’s Voice Translator to Life”).
Chinese Web giant Baidu also recently established a Silicon Valley
research lab to work on deep learning.
September 20, 2013
“The word is spreading in all corners of the tech industry that the biggest part of big data, the
unstructured part, possesses learnable patterns that we now have the computing power and
algorithmic leverage to discern…This change marks a true disruption, and there are fortunes to be
made. There are also tremendous social consequences to consider that require as much creativity
and investment as the more immediately lucrative deep learning startups that are popping up all
over…”
“Using an artificial intelligence technique inspired by theories about how the brain
recognizes patterns, technology companies are reporting startling gains in fields as
diverse as computer vision, speech recognition and the identification of promising
new molecules for designing drugs.
The advances have led to widespread enthusiasm among researchers who design
software to perform human activities like seeing, listening and thinking. They offer
the promise of machines that converse with humans and perform tasks like driving
cars and working in factories, raising the specter of automated robots that could
replace human workers.”
Google Trends for “Deep Learning” keyword
Enterprise use cases
Deep Learning
•  One of the many pattern recognition techniques in Data
Science
•  Excels at rich media applications:
•  Image recognition
•  Speech translation
•  Voice recognition
•  Loosely inspired by human brain models
•  Often used synonymously with multi-layer Artificial Neural
Networks
In this workshop
-  Fundamentals of Deep Learning
-  Implementation and Libraries in Real Life
-  Demo!
So, 1. what exactly is deep learning?
And, 2. why is it generally better than other methods on image,
speech and certain other types of data?
The short answers
1. ‘Deep Learning’ means using a neural network
with several layers of nodes between input and output
2. the series of layers between input & output do
feature identification and processing in a series of stages,
just as our brains seem to.
but:
3. multilayer neural networks have been around for
25 years. What’s actually new?
we have always had good algorithms for learning the
weights in networks with 1 hidden layer
but these algorithms are not good at
learning the weights for networks with
more hidden layers
what’s new is: algorithms for training many-layer networks
Longer answers
1.  reminder/quick-explanation of how neural network
weights are learned;
2.  the idea of unsupervised feature learning (why
‘intermediate features’ are important for difficult
classification tasks, and how NNs seem to naturally learn
them)
3.  The ‘breakthrough’ – the simple trick for training Deep
neural networks
Neuron Quick Look
[Figure: a single neuron with three inputs, weights W1, W2, W3, and an activation function f(x)]
Example: inputs -0.06, -2.5, 1.4 with weights 2.7, -8.6, 0.002
x = (-0.06 × 2.7) + (-2.5 × -8.6) + (1.4 × 0.002) ≈ 21.34
The neuron outputs f(x)
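The weighted sum on this slide can be checked with a few lines of Python. This is only a sketch: the slide does not say which f(x) is used, so the sigmoid below is an assumed, commonly used choice, and the function names are illustrative.

```python
import math

def neuron_input(inputs, weights):
    """Weighted sum of the inputs: x = sum(input_i * weight_i)."""
    return sum(i * w for i, w in zip(inputs, weights))

def sigmoid(x):
    """One common choice for the activation f(x) (an assumption here)."""
    return 1.0 / (1.0 + math.exp(-x))

inputs = [-0.06, -2.5, 1.4]    # values from the slide
weights = [2.7, -8.6, 0.002]   # values from the slide

x = neuron_input(inputs, weights)
output = sigmoid(x)

print(round(x, 2))  # 21.34
```

With a sigmoid, such a large positive x gives an output very close to 1.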
A dataset
Fields            class
1.4  2.7  1.9     0
3.8  3.4  3.2     0
6.4  2.8  1.7     1
4.1  0.1  0.2     0
etc …
Training the neural network
1. Initialise with random weights
2. Present a training pattern: inputs 1.4, 2.7, 1.9
3. Feed it through to get output: 0.8
4. Compare with target output: target 0, error 0.8
5. Adjust weights based on error
6. Present a training pattern: inputs 6.4, 2.8, 1.7
7. Feed it through to get output: 0.9
8. Compare with target output: target 1, error -0.1
9. Adjust weights based on error
And so on ….
Repeat this thousands, maybe millions of times, each time
taking a random training instance, and making slight
weight adjustments
Algorithms for weight adjustment are designed to make
changes that will reduce the error
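The loop described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm from the slides: it trains a single sigmoid unit (no hidden layer) on the four example rows, and the learning rate, step count and update rule (error times input) are assumed choices.

```python
import math
import random

# The four example rows from the slide: (fields, class).
DATA = [
    ([1.4, 2.7, 1.9], 0),
    ([3.8, 3.4, 3.2], 0),
    ([6.4, 2.8, 1.7], 1),
    ([4.1, 0.1, 0.2], 0),
]

def predict(weights, bias, fields):
    """Feed one pattern through a single sigmoid unit."""
    x = sum(w * f for w, f in zip(weights, fields)) + bias
    x = max(-60.0, min(60.0, x))  # keep exp() in a safe range
    return 1.0 / (1.0 + math.exp(-x))

def total_loss(weights, bias):
    """Cross-entropy loss summed over the whole data set."""
    loss = 0.0
    for fields, target in DATA:
        out = predict(weights, bias, fields)
        out = min(max(out, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        loss -= target * math.log(out) + (1 - target) * math.log(1.0 - out)
    return loss

rng = random.Random(0)
weights = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # initialise with random weights
bias = rng.uniform(-0.5, 0.5)
initial_loss = total_loss(weights, bias)

LEARNING_RATE = 0.05  # assumed value
for _ in range(20000):
    fields, target = rng.choice(DATA)        # present a random training pattern
    output = predict(weights, bias, fields)  # feed it through to get output
    error = output - target                  # compare with target output
    # adjust each weight slightly in the direction that reduces the error
    weights = [w - LEARNING_RATE * error * f for w, f in zip(weights, fields)]
    bias -= LEARNING_RATE * error

final_loss = total_loss(weights, bias)
print(initial_loss, final_loss)
```

The error is computed as output minus target, which matches the sign convention on the slides (0.8 - 0 = 0.8, 0.9 - 1 = -0.1).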
The decision boundary perspective…
Initial random weights
Present a training instance / adjust the weights
(repeated for successive training instances)
Eventually ….
In essence
•  weight-learning algorithms for NNs are dumb
•  they work by making thousands and thousands of
tiny adjustments, each making the network do better
at the most recent pattern, but perhaps a little worse
on many others
•  but, by dumb luck, eventually this tends to be good
enough to learn effective classifiers for many real
applications
Some other points
Detail of a standard NN weight learning algorithm – later
If f(x) is non-linear, a network with 1 hidden layer can, in theory, learn
perfectly any classification problem. A set of weights exists that can
produce the targets from the inputs. The problem is finding them.
Some other ‘by the way’ points
If f(x) is linear, the NN can only draw straight decision boundaries (even if there
are many layers of units)
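A quick way to see why extra layers do not help when f(x) is linear: two linear layers applied in sequence give the same result as one layer whose weight matrix is the product of the two. The sketch below uses made-up weights and ignores bias terms for brevity; all names are illustrative.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

# Hypothetical weights for two layers with a linear f(x) = x.
W1 = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]  # 2 inputs -> 3 hidden units
W2 = [[1.0, -2.0, 0.5]]                        # 3 hidden units -> 1 output

point = [1.4, 2.7]

two_layer = matvec(W2, matvec(W1, point))   # pass through both layers
one_layer = matvec(matmul(W2, W1), point)   # single equivalent linear layer

print(two_layer, one_layer)
```

Because the stacked network is equivalent to a single linear map, its decision boundary stays a straight line (or hyperplane) no matter how many layers are added.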
NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged
SVMs only draw straight lines, but they transform the data first in a way that makes that OK
Feature detectors
What’s this unit doing?
Hidden layer units become self-organised feature
detectors
[Figure: input pixel grid and hidden units (axis labels 1 … 63 and 1, 5, 10, 15, 20, 25, …); legend: strong +ve weight, low/zero weight]
What does this unit detect?
It will send a strong signal for a horizontal
line in the top row, ignoring everywhere else
What does this unit detect?
Strong signal for a dark area in the top left
corner
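The "top row" detector can be imitated with a hand-written weight vector. This is a sketch only: the 8x8 grid size, the flattening order (row by row) and the pixel encoding (1.0 = dark) are assumptions made for illustration, and in a real network these weights would be learned rather than set by hand.

```python
SIZE = 8  # assumed 8x8 pixel grid, flattened row by row

# Strong positive weight on every top-row pixel, zero weight elsewhere.
top_row_detector = [1.0 if i < SIZE else 0.0 for i in range(SIZE * SIZE)]

def response(weights, image):
    """Weighted sum the unit computes before applying f(x)."""
    return sum(w * p for w, p in zip(weights, image))

def blank_image():
    return [0.0] * (SIZE * SIZE)

top_line = blank_image()
for col in range(SIZE):
    top_line[col] = 1.0                         # horizontal line in the top row

bottom_line = blank_image()
for col in range(SIZE):
    bottom_line[(SIZE - 1) * SIZE + col] = 1.0  # same line in the bottom row

print(response(top_row_detector, top_line))     # 8.0
print(response(top_row_detector, bottom_line))  # 0.0
```

The unit responds strongly to the line in the top row and ignores the same line anywhere else, which is the behaviour described on the slide.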
What features might you expect a good NN
to learn, when trained with data like this?
•  vertical lines
•  horizontal lines
•  small circles
successive layers can learn higher-level features …
[Figure: a first hidden layer that detects lines in specific positions, etc …, feeding higher-level detectors (“horizontal line”, “RHS vertical line”, “upper loop”, etc …)]
What does this unit detect?
So: multiple layers make sense
Your brain works that way
Many-layer neural network architectures should be capable of learning the true underlying
features and ‘feature logic’, and therefore generalise very well …
But, until very recently, weight-learning algorithms simply
did not work on multi-layer architectures
Along came deep learning …
The new way to train multi-layer NNs…
Train this layer first
then this layer
then this layer
then this layer
finally this layer
EACH of the (non-output) layers is trained to be
an auto-encoder
Basically, it is forced to learn good features that
describe what comes from the previous layer
An auto-encoder is trained, with an absolutely standard weight-adjustment
algorithm, to reproduce the input
By making this happen with (many) fewer units than the inputs, this
forces the ‘hidden layer’ units to become good feature detectors
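A tiny auto-encoder makes the idea concrete. The sketch below is an illustration under assumptions, not the method of any particular library: it squeezes four one-hot input patterns through two hidden units and trains with plain backpropagation on squared error. The network size, learning rate and step count are all made-up values. In the layer-by-layer scheme from the previous slides, this would correspond to training one layer.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Four one-hot patterns to reproduce (toy data, assumed for illustration).
PATTERNS = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
N_IN, N_HID = 4, 2  # fewer hidden units than inputs

rng = random.Random(1)
# Each row holds the incoming weights of one unit; the last entry is a bias.
enc = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN + 1)] for _ in range(N_HID)]
dec = [[rng.uniform(-0.5, 0.5) for _ in range(N_HID + 1)] for _ in range(N_IN)]

def layer(weights, values):
    """Apply one sigmoid layer; a constant 1.0 is appended for the bias."""
    return [sigmoid(sum(w * v for w, v in zip(row, values + [1.0])))
            for row in weights]

def forward(pattern):
    hidden = layer(enc, pattern)
    return hidden, layer(dec, hidden)

def reconstruction_error():
    total = 0.0
    for p in PATTERNS:
        _, out = forward(p)
        total += sum((o - t) ** 2 for o, t in zip(out, p))
    return total

error_before = reconstruction_error()
RATE = 0.5  # assumed learning rate
for _ in range(20000):
    p = rng.choice(PATTERNS)
    hidden, out = forward(p)
    # Output deltas: the target is the input itself.
    d_out = [(o - t) * o * (1 - o) for o, t in zip(out, p)]
    # Hidden deltas, computed before the decoder weights are changed.
    d_hid = [h * (1 - h) * sum(d_out[k] * dec[k][j] for k in range(N_IN))
             for j, h in enumerate(hidden)]
    for k in range(N_IN):
        for j, v in enumerate(hidden + [1.0]):
            dec[k][j] -= RATE * d_out[k] * v
    for j in range(N_HID):
        for i, v in enumerate(p + [1.0]):
            enc[j][i] -= RATE * d_hid[j] * v
error_after = reconstruction_error()

print(error_before, error_after)
```

After training, the two hidden activations act as a compact code for the four patterns, which is the sense in which the hidden units become feature detectors.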
intermediate layers are each trained to be auto encoders
(or similar)
Final layer trained to predict class based on outputs from
previous layers
But, how does one train?
Overall, the NN is trying to minimize a cost
function by adjusting the weights. To
find a minimum, Gradient Descent (GD) is
used.
If the training set is very large, GD can be
too slow, since every input is evaluated in
the cost function at each step. So, one can
compute the gradient on a sampled subset of
the inputs. If the sampling is done at random,
it is called Stochastic Gradient Descent.
The gradients themselves are computed by the
Error Backpropagation Algorithm.
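The difference between the two can be sketched on a one-parameter toy problem. Everything below is an assumed example (noise-free data from y = 3x, a mean squared error cost, made-up learning rate and batch size); it only shows that one variant evaluates every input per step while the other evaluates a random subset.

```python
import random

# Toy data generated from y = 3 * x (an assumed example, no noise).
XS = [i / 10 for i in range(1, 11)]
YS = [3.0 * x for x in XS]

def gradient(w, indices):
    """Gradient of the mean squared error over the chosen examples."""
    return sum(2 * (w * XS[i] - YS[i]) * XS[i] for i in indices) / len(indices)

def gradient_descent(steps, rate):
    w = 0.0
    for _ in range(steps):
        w -= rate * gradient(w, range(len(XS)))  # every input is evaluated
    return w

def stochastic_gradient_descent(steps, rate, batch_size, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = rng.sample(range(len(XS)), batch_size)  # random subset
        w -= rate * gradient(w, batch)
    return w

w_gd = gradient_descent(500, 0.5)
w_sgd = stochastic_gradient_descent(500, 0.5, 3)
print(w_gd, w_sgd)
```

Both runs end up close to the true slope of 3, but each stochastic step touches only 3 of the 10 examples.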
And that’s that
•  That’s the basic idea
•  There are many, many types of deep learning:
•  Different kinds of autoencoder, variations on
architectures and training algorithms, etc…
•  Very fast growing area …
Doing this in real life
Typical Workflow
1. Ingest training data and store it
2. Split data set into: training, testing and validation sets
3. Vectorize and extract features to go into next step
4. Architect multi layer network, initialize
5. Feed data and train
6. Test and Validate
7. Repeat steps 4 and 5 until the desired results are reached
8. Store model
9. Put model in app, start generalizing on real data.
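Step 2 of the workflow can be sketched with the standard library alone. The split fractions (70/15/15), the seed and the function name are assumptions for illustration; libraries such as those named later in the deck ship their own utilities for this.

```python
import random

def split_dataset(rows, train_frac=0.7, validation_frac=0.15, seed=42):
    """Shuffle rows and split them into training, validation and testing sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever is left over
    return train, validation, test

rows = list(range(100))  # stand-in for real records
train, validation, test = split_dataset(rows)
print(len(train), len(validation), len(test))  # 70 15 15
```

Shuffling before splitting keeps the three sets from inheriting any ordering in the ingested data.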
So what do you get?
Step 1: ingest using Kafka, Flume, good old web scraping
Steps 2, 3, 4 and 5: use libraries such as Caffe, Theano, Deeplearning4j, H2O
Distributed Deep Learning on Hadoop
•  In ASF: Apache Singa – new incubator project
•  Two main partners of Hortonworks: Skymind and H2O
•  Skymind is focussed on Deep Learning exclusively (deeplearning4j), H2O includes other ML libraries.
•  Both provide scale out on HDP + Spark
•  Skymind has GPU Acceleration built in - uses CUDA for doing linear algebra operations
•  Both are open source, Apache licensed.
Deeplearning4j Architecture
DL4J: Canova for Vectorization and Ingest
•  Canova uses an input/output format system (similar to
how Hadoop uses MapReduce)
•  Supports all major types of input data (text, CSV, audio,
image and video)
•  Can be extended for specialized input formats
•  Connects to Kafka
ND4J:
•  N-dimensional vector library
•  Scientific computing for JVM
•  DL4J uses it to do linear algebra for backpropagation
•  Supports GPUs via CUDA and Native via Jblas
•  Deploys on Android
•  DL4J code remains unchanged whether using GPU or
CPU
How to choose a Neural Net in DL4J core?
Demo!
Thank You
hortonworks.com

More Related Content

What's hot

Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARN
DataWorks Summit
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble Storage
Hortonworks
 
Hortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts Presentation
Hortonworks
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks
 
Protecting enterprise Data in Hadoop
Protecting enterprise Data in HadoopProtecting enterprise Data in Hadoop
Protecting enterprise Data in Hadoop
DataWorks Summit
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
Hortonworks
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Hortonworks
 
Hortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
Hortonworks
 
Falcon Meetup
Falcon Meetup Falcon Meetup
Falcon Meetup
Hortonworks
 
Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4
Hortonworks
 
Introduction to Hortonworks Data Platform
Introduction to Hortonworks Data PlatformIntroduction to Hortonworks Data Platform
Introduction to Hortonworks Data PlatformHortonworks
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Hortonworks
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARN
Hortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Hortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
Hortonworks
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Hortonworks
 
Edw Optimization Solution
Edw Optimization Solution Edw Optimization Solution
Edw Optimization Solution
Hortonworks
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
Hortonworks
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks
 

What's hot (20)

Enabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARNEnabling Diverse Workload Scheduling in YARN
Enabling Diverse Workload Scheduling in YARN
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble Storage
 
Hortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts PresentationHortonworks for Financial Analysts Presentation
Hortonworks for Financial Analysts Presentation
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Protecting enterprise Data in Hadoop
Protecting enterprise Data in HadoopProtecting enterprise Data in Hadoop
Protecting enterprise Data in Hadoop
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
 
Hortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data London
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
Falcon Meetup
Falcon Meetup Falcon Meetup
Falcon Meetup
 
Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4Hortonworks Data In Motion Series Part 4
Hortonworks Data In Motion Series Part 4
 
Introduction to Hortonworks Data Platform
Introduction to Hortonworks Data PlatformIntroduction to Hortonworks Data Platform
Introduction to Hortonworks Data Platform
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Combine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARNCombine SAS High-Performance Capabilities with Hadoop YARN
Combine SAS High-Performance Capabilities with Hadoop YARN
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Edw Optimization Solution
Edw Optimization Solution Edw Optimization Solution
Edw Optimization Solution
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 

Viewers also liked

Hive on spark is blazing fast or is it final
Hive on spark is blazing fast or is it finalHive on spark is blazing fast or is it final
Hive on spark is blazing fast or is it final
Hortonworks
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
Hortonworks
 
Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks Solving Big Data Problems using Hortonworks
Solving Big Data Problems using Hortonworks
DataWorks Summit/Hadoop Summit
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Hortonworks
 
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for productionFaster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Cloudera, Inc.
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Hortonworks
 
Hortonworks tech workshop in-memory processing with spark
Hortonworks tech workshop   in-memory processing with sparkHortonworks tech workshop   in-memory processing with spark
Hortonworks tech workshop in-memory processing with spark
Hortonworks
 
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks Technical Workshop: What's New in HDP 2.3
Hortonworks
 
Hortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical ApplicationsHortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks Technical Workshop: HBase For Mission Critical Applications
Hortonworks
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
DataWorks Summit
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
Hortonworks
 
Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015
Hortonworks
 
Hortonworks technical workshop operations with ambari
Hortonworks technical workshop   operations with ambariHortonworks technical workshop   operations with ambari
Hortonworks technical workshop operations with ambari
Hortonworks
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
Hortonworks
 
The path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial ServicesThe path to a Modern Data Architecture in Financial Services
The path to a Modern Data Architecture in Financial Services
Hortonworks
 
Cassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market SceinceCassandra and Storm at Health Market Sceince
Cassandra and Storm at Health Market SceinceP. Taylor Goetz
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Cluster
ahortonworks
 
Ranger admin dev overview
Ranger admin dev overviewRanger admin dev overview
Ranger admin dev overviewTushar Dudhatra
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
trihug
 
Apache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, ScaleApache Hive 2.0; SQL, Speed, Scale
Apache Hive 2.0; SQL, Speed, Scale
Hortonworks
 


Deep learning with Hortonworks and Apache Spark - Hortonworks technical workshop

  • 1. Deep Learning on HDP. Dhruv Kumar (dkumar@hortonworks.com), Solutions Engineer, Hortonworks. 2015, Version 1.0. © Hortonworks Inc. 2011 – 2014. All Rights Reserved
  • 3. Scientists See Promise in Deep-Learning Programs. John Markoff, November 23, 2012. Rich Rashid in Tianjin, October 25, 2012
  • 5. Impact of deep learning in speech technology. Where is it?
  • 6. “… Facebook’s foray into deep learning sees it following its competitors Google and Microsoft, which have used the approach to impressive effect in the past year. Google has hired and acquired leading talent in the field (see “10 Breakthrough Technologies 2013: Deep Learning”), and last year created software that taught itself to recognize cats and other objects by reviewing stills from YouTube videos. The underlying deep learning technology was later used to slash the error rate of Google’s voice recognition services (see “Google’s Virtual Brain Goes to Work”) … Researchers at Microsoft have used deep learning to build a system that translates speech from English to Mandarin Chinese in real time (see “Microsoft Brings Star Trek’s Voice Translator to Life”). Chinese Web giant Baidu also recently established a Silicon Valley research lab to work on deep learning.” September 20, 2013
  • 13. “The word is spreading in all corners of the tech industry that the biggest part of big data, the unstructured part, possesses learnable patterns that we now have the computing power and algorithmic leverage to discern… This change marks a true disruption, and there are fortunes to be made. There are also tremendous social consequences to consider that require as much creativity and investment as the more immediately lucrative deep learning startups that are popping up all over…”
  • 14. “Using an artificial intelligence technique inspired by theories about how the brain recognizes patterns, technology companies are reporting startling gains in fields as diverse as computer vision, speech recognition and the identification of promising new molecules for designing drugs. The advances have led to widespread enthusiasm among researchers who design software to perform human activities like seeing, listening and thinking. They offer the promise of machines that converse with humans and perform tasks like driving cars and working in factories, raising the specter of automated robots that could replace human workers.”
  • 15. Google Trends for “Deep Learning” keyword
  • 16. Enterprise use cases
  • 17. Deep Learning: one of the many pattern recognition techniques in data science; excels at rich media applications (image recognition, speech translation, voice recognition); loosely inspired by models of the human brain; often used synonymously with artificial neural networks and multi-layer networks.
  • 18. In this workshop: fundamentals of deep learning; implementation and libraries in real life; demo!
  • 19. So, 1. what exactly is deep learning? And, 2. why is it generally better than other methods on image, speech and certain other types of data?
  • 20. The short answers: 1. ‘Deep learning’ means using a neural network with several layers of nodes between input and output. 2. The series of layers between input and output do feature identification and processing in a series of stages, just as our brains seem to.
  • 21. But: 3. multilayer neural networks have been around for 25 years. What’s actually new?
  • 22. We have always had good algorithms for learning the weights in networks with 1 hidden layer, but these algorithms are not good at learning the weights for networks with more hidden layers. What’s new is: algorithms for training many-layer networks.
  • 23. Longer answers: 1. a reminder/quick explanation of how neural network weights are learned; 2. the idea of unsupervised feature learning (why ‘intermediate features’ are important for difficult classification tasks, and how NNs seem to naturally learn them); 3. the ‘breakthrough’: the simple trick for training deep neural networks.
  • 24. Neuron Quick Look
  • 25. A neuron: weights W1, W2, W3 feed the activation function f(x). Weight values: 1.4, -2.5, -0.06.
  • 26. Inputs 2.7, -8.6, 0.002 with weights 1.4, -2.5, -0.06: x = (-0.06 × 2.7) + (-2.5 × -8.6) + (1.4 × 0.002) = 21.34, which is then passed to f(x).
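The weighted sum on slide 26 can be checked with a few lines of Python. This is a minimal sketch: the pairing of inputs to weights follows the formula on the slide, while the choice of a sigmoid for f(x) is an assumption, since the slides do not say which activation function is used.

```python
import math

def weighted_sum(inputs, weights):
    """Sum of input * weight over all incoming connections."""
    return sum(x * w for x, w in zip(inputs, weights))

def sigmoid(x):
    """Assumed activation f(x); the slides do not name one."""
    return 1.0 / (1.0 + math.exp(-x))

# Pairing taken from the slide's formula: 2.7 with -0.06, -8.6 with -2.5, 0.002 with 1.4.
inputs = [2.7, -8.6, 0.002]
weights = [-0.06, -2.5, 1.4]

x = weighted_sum(inputs, weights)
output = sigmoid(x)
print(round(x, 2))  # 21.34
```

With an input sum this large the assumed sigmoid saturates, so the neuron's output is very close to 1.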
  • 27. A dataset (three fields, then class): 1.4 2.7 1.9 → 0; 3.8 3.4 3.2 → 0; 6.4 2.8 1.7 → 1; 4.1 0.1 0.2 → 0; etc.
  • 28. Training the neural network on this dataset
  • 29. Initialise with random weights
  • 30. Present a training pattern: 1.4 2.7 1.9
  • 31. Feed it through to get the output: 0.8
  • 32. Compare with the target output: output 0.8, target 0, error 0.8
  • 33. Adjust weights based on the error
  • 34. Present a training pattern: 6.4 2.8 1.7
  • 35. Feed it through to get the output: 0.9
  • 36. Compare with the target output: output 0.9, target 1, error -0.1
  • 37. Adjust weights based on the error
  • 38. And so on. Repeat this thousands, maybe millions of times, each time taking a random training instance and making slight weight adjustments. Algorithms for weight adjustment are designed to make changes that will reduce the error.
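The loop described on slides 29 to 38 can be sketched for a single neuron. This is an illustration only, not the network drawn on the slides. It assumes a sigmoid output unit, a learning rate of 0.05, and the update rule weight -= learning_rate × error × input (the gradient step for a sigmoid unit with cross-entropy loss). The function names are hypothetical. The error is output minus target, matching the sign used on the slides.

```python
import math
import random

# The four rows shown on the slides: three fields and a class label.
DATA = [
    ([1.4, 2.7, 1.9], 0),
    ([3.8, 3.4, 3.2], 0),
    ([6.4, 2.8, 1.7], 1),
    ([4.1, 0.1, 0.2], 0),
]

def sigmoid(z):
    # Numerically safe form for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, fields):
    return sigmoid(sum(w * x for w, x in zip(weights, fields)) + bias)

def mean_squared_error(weights, bias):
    return sum((predict(weights, bias, f) - t) ** 2 for f, t in DATA) / len(DATA)

def train(steps=100_000, learning_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Initialise with small random weights.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(3)]
    bias = rng.uniform(-0.1, 0.1)
    error_before = mean_squared_error(weights, bias)
    for _ in range(steps):
        fields, target = rng.choice(DATA)        # present a random training pattern
        output = predict(weights, bias, fields)  # feed it through to get the output
        error = output - target                  # compare with the target output
        # Adjust each weight slightly in the direction that reduces the error.
        weights = [w - learning_rate * error * x for w, x in zip(weights, fields)]
        bias -= learning_rate * error
    return weights, bias, error_before, mean_squared_error(weights, bias)

weights, bias, error_before, error_after = train()
predictions = [round(predict(weights, bias, fields)) for fields, _ in DATA]
```

Each step only looks at one pattern, so a single adjustment can make the neuron slightly worse on the other rows. The improvement shows up over many repetitions.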
  • 39. The decision boundary perspective… Initial random weights
  • 40. The decision boundary perspective… Present a training instance / adjust the weights
  • 41. The decision boundary perspective… Present a training instance / adjust the weights
  • 42. The decision boundary perspective… Present a training instance / adjust the weights
  • 43. The decision boundary perspective… Present a training instance / adjust the weights
  • 44. The decision boundary perspective… Eventually …
  • 45. In essence: weight-learning algorithms for NNs are dumb; they work by making thousands and thousands of tiny adjustments, each making the network do better at the most recent pattern, but perhaps a little worse on many others; but, by dumb luck, eventually this tends to be good enough to learn effective classifiers for many real applications.
  • 46. Some other points. Detail of a standard NN weight-learning algorithm: later. If f(x) is non-linear, a network with 1 hidden layer can, in theory, learn perfectly any classification problem: a set of weights exists that can produce the targets from the inputs. The problem is finding them.
  • 47. Some other ‘by the way’ points: if f(x) is linear, the NN can only draw straight decision boundaries (even if there are many layers of units).
  • 48. Some other ‘by the way’ points: NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged.
  • 49. Some other ‘by the way’ points: NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged. SVMs only draw straight lines, but they transform the data first in a way that makes that OK.
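The claim that a network with linear f(x) can only draw straight boundaries, however many layers it has, follows from the fact that stacked linear layers collapse into one linear layer. The sketch below checks this numerically. The weight values and helper names are made up for illustration.

```python
def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(u, v):
    return [p + q for p, q in zip(u, v)]

# Hypothetical weights: 2 inputs -> 3 hidden units -> 1 output, with linear f(x) = x.
W1 = [[1.0, -2.0], [0.5, 3.0], [-1.0, 0.0]]
b1 = [0.1, -0.2, 0.3]
W2 = [[2.0, 0.0, -1.0]]
b2 = [0.5]

def two_layer(x):
    hidden = add(matvec(W1, x), b1)
    return add(matvec(W2, hidden), b2)

# The same mapping written as one layer: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)

def one_layer(x):
    return add(matvec(W, x), b)

samples = [[0.0, 0.0], [1.0, 2.0], [-3.5, 0.25]]
max_gap = max(abs(two_layer(x)[0] - one_layer(x)[0]) for x in samples)
```

Because the two functions agree on every input, the extra linear layer adds no expressive power. A nonlinear f(x) between the layers is what prevents this collapse.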
  • 50. Feature detectors
  • 51. What’s this unit doing?
  • 52. Hidden layer units become self-organised feature detectors. (Diagram: numbered input pixels connected to a hidden unit; legend: strong +ve weight, low/zero weight.)
  • 53. What does this unit detect? (Diagram as on slide 52.)
  • 54. What does this unit detect? It will send a strong signal for a horizontal line in the top row, ignoring everywhere else.
  • 55. What does this unit detect? (Diagram as on slide 52.)
  • 56. What does this unit detect? Strong signal for a dark area in the top left corner.
  • 57. What features might you expect a good NN to learn, when trained with data like this?
  • 58. Vertical lines
  • 59. Horizontal lines
  • 60. Small circles
  • 61. Successive layers can learn higher-level features: the first layer detects lines in specific positions; higher-level detectors follow (“horizontal line”, “RHS vertical line”, “upper loop”, etc.).
  • 62. Successive layers can learn higher-level features. What does this unit detect?
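A hidden unit like the one on slide 54 can be imitated by hand. This sketch makes several assumptions the slides do not state: an 8×8 image, weights of exactly 1 on the top row and 0 elsewhere, a bias of minus half the row width, and a sigmoid activation. In a trained network such weights would be learned, not set by hand.

```python
import math

SIZE = 8  # assumed image width and height; the slides do not state it

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def top_row_detector():
    # Strong positive weight on every top-row pixel, zero weight everywhere else.
    weights = [1.0 if index < SIZE else 0.0 for index in range(SIZE * SIZE)]
    bias = -SIZE / 2  # assumed threshold: fire when more than half the row is on
    return weights, bias

def horizontal_line(row):
    """Flattened image with one full horizontal line in the given row."""
    return [1.0 if index // SIZE == row else 0.0 for index in range(SIZE * SIZE)]

def response(weights, bias, pixels):
    return sigmoid(sum(w * p for w, p in zip(weights, pixels)) + bias)

weights, bias = top_row_detector()
top = response(weights, bias, horizontal_line(0))            # line in the top row
bottom = response(weights, bias, horizontal_line(SIZE - 1))  # line in the bottom row
```

The unit responds strongly to the top-row line and stays near zero for the same line in another position, which is what "ignoring everywhere else" means in terms of weights.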
  • 63. So: multiple layers make sense.
  • 64. So: multiple layers make sense. Your brain works that way.
  • 65. So: multiple layers make sense. Many-layer neural network architectures should be capable of learning the true underlying features and ‘feature logic’, and therefore generalise very well …
  • 66. But, until very recently, weight-learning algorithms simply did not work on multi-layer architectures.
  • 67. Along came deep learning …
  • 68. The new way to train multi-layer NNs…
  • 69. Train this layer first
  • 70. then this layer
  • 71. then this layer
  • 72. then this layer
  • 73. finally this layer
  • 74. EACH of the (non-output) layers is trained to be an auto-encoder. Basically, it is forced to learn good features that describe what comes from the previous layer.
  • 75. An auto-encoder is trained, with an absolutely standard weight-adjustment algorithm, to reproduce the input.
  • 76. By making this happen with (many) fewer units than the inputs, this forces the ‘hidden layer’ units to become good feature detectors.
  • 77. Intermediate layers are each trained to be auto-encoders (or similar).
  • 78. The final layer is trained to predict the class based on outputs from previous layers.
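The layer-by-layer procedure on slides 69 to 77 can be sketched in plain Python. This is a toy illustration under several assumptions: 3×3 images of horizontal and vertical lines as data, sigmoid units, squared reconstruction error, and layer sizes 9 → 5 → 3. The class and function names are hypothetical. The supervised final layer from slide 78 is left out to keep the sketch short.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class AutoEncoder:
    """One hidden layer trained to reproduce its own input."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.enc_bias = [0.0] * n_hidden
        self.dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_inputs)]
        self.dec_bias = [0.0] * n_inputs

    @staticmethod
    def _layer(weights, biases, values):
        return [sigmoid(sum(w * v for w, v in zip(row, values)) + b)
                for row, b in zip(weights, biases)]

    def encode(self, x):
        return self._layer(self.enc, self.enc_bias, x)

    def decode(self, h):
        return self._layer(self.dec, self.dec_bias, h)

    def loss(self, data):
        """Mean squared reconstruction error per example."""
        total = 0.0
        for x in data:
            r = self.decode(self.encode(x))
            total += sum((ri - xi) ** 2 for ri, xi in zip(r, x))
        return total / len(data)

    def train(self, data, epochs, learning_rate, rng):
        for _ in range(epochs):
            for x in rng.sample(data, len(data)):  # random order each epoch
                h = self.encode(x)
                r = self.decode(h)
                # Backpropagate the reconstruction error through both layers.
                d_out = [(ri - xi) * ri * (1.0 - ri) for ri, xi in zip(r, x)]
                d_hid = [sum(d_out[i] * self.dec[i][j] for i in range(len(x)))
                         * h[j] * (1.0 - h[j]) for j in range(len(h))]
                for i, d in enumerate(d_out):
                    self.dec[i] = [w - learning_rate * d * hj for w, hj in zip(self.dec[i], h)]
                    self.dec_bias[i] -= learning_rate * d
                for j, d in enumerate(d_hid):
                    self.enc[j] = [w - learning_rate * d * xk for w, xk in zip(self.enc[j], x)]
                    self.enc_bias[j] -= learning_rate * d

def line_images():
    """Six 3x3 images: three horizontal and three vertical lines (assumed toy data)."""
    images = []
    for k in range(3):
        images.append([1.0 if r == k else 0.0 for r in range(3) for c in range(3)])
        images.append([1.0 if c == k else 0.0 for r in range(3) for c in range(3)])
    return images

def pretrain(data, hidden_sizes, epochs=1000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    layers, report = [], []
    for n_hidden in hidden_sizes:
        layer = AutoEncoder(len(data[0]), n_hidden, rng)
        before = layer.loss(data)
        layer.train(data, epochs, learning_rate, rng)  # train this layer first ...
        report.append((before, layer.loss(data)))
        layers.append(layer)
        data = [layer.encode(x) for x in data]         # ... then feed its codes to the next
    return layers, report, data

layers, report, codes = pretrain(line_images(), hidden_sizes=[5, 3])
```

Each layer has fewer hidden units than inputs, so it cannot simply copy its input and has to find a compressed description. The `report` list holds the reconstruction error of each layer before and after its own training pass.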
  • 79. But how does one train? Overall, the NN tries to minimize a cost function by adjusting the weights. Gradient descent (GD) is used to find a minimum. If the training set is very large, GD can be too slow, since every input is evaluated in the cost function at each step. So one can compute the gradient on a sample of the inputs instead. If the sample is drawn at random, this is called stochastic gradient descent (SGD). The gradients themselves are computed by the error backpropagation algorithm.
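The difference between full gradient descent and the sampled version can be shown on a problem small enough to differentiate by hand. This sketch fits a straight line with a mean squared error cost. The data, batch size, learning rate and function names are all assumptions for illustration, and the gradient is written out analytically, so no backpropagation is needed here.

```python
import random

def gradient(points, slope, intercept):
    """Gradient of the mean squared error over the given points."""
    n = len(points)
    grad_slope = sum(2 * (slope * x + intercept - y) * x for x, y in points) / n
    grad_intercept = sum(2 * (slope * x + intercept - y) for x, y in points) / n
    return grad_slope, grad_intercept

def minibatch_sgd(points, batch_size=10, learning_rate=0.1, steps=5000, seed=0):
    rng = random.Random(seed)
    slope = intercept = 0.0
    for _ in range(steps):
        # Full GD would call gradient(points, ...) here and touch every input.
        batch = rng.sample(points, batch_size)  # random subset of the inputs
        grad_slope, grad_intercept = gradient(batch, slope, intercept)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

# Hypothetical noise-free data on the line y = 2x + 1.
POINTS = [(i / 100, 2 * (i / 100) + 1) for i in range(100)]
slope, intercept = minibatch_sgd(POINTS)
```

Each step here costs a tenth of a full pass over the data, and the estimate still settles on the line that generated the points.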
  • 80. And that’s that: that’s the basic idea; there are many, many types of deep learning; different kinds of autoencoder, variations on architectures and training algorithms, etc.; a very fast-growing area …
  • 81. Doing this in real life
  • 82. Typical workflow: 1. Ingest training data and store it. 2. Split the data set into training, testing and validation sets. 3. Vectorize and extract features to go into the next step. 4. Architect a multi-layer network and initialize it. 5. Feed data and train. 6. Test and validate. 7. Repeat steps 4 and 5 until the desired result is reached. 8. Store the model. 9. Put the model in an app and start generalizing on real data.
  • 83. So what do you get? Step 1: ingest using Kafka, Flume, or good old web scraping. Steps 2, 3, 4 and 5: use libraries such as Caffe, Theano, Deeplearning4j, H2O.
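Step 2 of the workflow, splitting the data, is simple enough to sketch without any library. The 70/15/15 proportions, the fixed seed and the function name are assumptions, not something the slides prescribe. The libraries listed on slide 83 have their own utilities for this.

```python
import random

def split_dataset(rows, train_fraction=0.7, validation_fraction=0.15, seed=0):
    """Shuffle a copy of the rows and cut it into train, validation and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains
    return train, validation, test

rows = list(range(100))  # stand-in for 100 ingested records
train, validation, test = split_dataset(rows)
```

Shuffling before cutting matters when the ingested data arrives in some order, such as by date or by class. Without it the three sets could end up with different distributions.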
  • 84. Distributed Deep Learning on Hadoop: in the ASF, Apache SINGA is a new incubator project; two main partners of Hortonworks are Skymind and H2O; Skymind is focused exclusively on deep learning (Deeplearning4j), while H2O includes other ML libraries; both provide scale-out on HDP + Spark; Skymind has GPU acceleration built in and uses CUDA for linear algebra operations; both are open source, Apache licensed.
  • 85. Deeplearning4j Architecture
  • 86. DL4J: Canova for vectorization and ingest: Canova uses an input/output format system (similar to how Hadoop uses MapReduce); supports all major types of input data (text, CSV, audio, image and video); can be extended for specialized input formats; connects to Kafka.
  • 87. ND4J: N-dimensional vector library; scientific computing for the JVM; DL4J uses it to do linear algebra for backpropagation; supports GPUs via CUDA and native execution via jblas; deploys on Android; DL4J code remains unchanged whether using GPU or CPU.
  • 88. How to choose a neural net in DL4J core?
  • 89. Demo!
  • 90. Thank You. hortonworks.com