Georgia Tech cse6242 - Intro to Deep Learning and DL4J

Introduction to deep learning and DL4J - http://deeplearning4j.org/ - a guest lecture by Josh Patterson at Georgia Tech for the cse6242 graduate class.

Josh Patterson
Email:
josh@pattersonconsultingtn.com
Twitter:
@jpatanooga
GitHub:
https://github.com/jpatanooga
Past
Published in IAAI-09:
“TinyTermite: A Secure Routing Algorithm”
Grad work in Meta-heuristics, Ant-algorithms
Tennessee Valley Authority (TVA)
Hadoop and the Smartgrid
Cloudera
Principal Solution Architect
Today: Patterson Consulting

Overview
• What is Deep Learning?
• Deep Belief Networks
• DL4J

What is Deep Learning?
An algorithm that tries to learn simple features in lower layers
and more complex features in higher layers

Interesting Properties of Deep Learning
Reduces a problem with overfitting in neural networks.
Introduces new techniques for “unsupervised feature learning”.
Introduces new, more automatic ways to figure out which parts of your data you should feed into your learning algorithm.

Chasing Nature
Learning sparse representations of auditory signals leads to filters that closely correspond to neurons in early audio processing in mammals
When applied to speech, the learned representations showed a striking resemblance to the cochlear filters in the auditory cortex

Yann LeCun on Deep Learning
Has become the dominant method for acoustic modeling in speech recognition
Quickly becoming the dominant method for several vision tasks, such as
object recognition
object detection
semantic segmentation

What is a Deep Belief Network?
Generative probabilistic model
Composed of one visible layer
Many hidden layers
Restricted Boltzmann Machines
Each hidden layer learns relationships between units in the layer below
Higher layer representations tend to become more complex

Restricted Boltzmann Machines
• Unsupervised model
• Does feature learning by repeated sampling of the input data.
• Learns how to reconstruct data for good feature detection.
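
The sampling-and-reconstruction idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not DL4J code: it assumes binary (Bernoulli) units and one step of contrastive divergence (CD-1), and the names TinyRBM, cd1_update, and reconstruct are hypothetical.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class TinyRBM:
    """Hypothetical toy Bernoulli RBM trained with CD-1 (an assumption, not DL4J's API)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # Small random weights; biases start at zero.
        self.w = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        # P(h_j = 1 | v) for every hidden unit.
        return [sigmoid(self.b_hid[j]
                        + sum(v[i] * self.w[i][j] for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        # P(v_i = 1 | h) for every visible unit.
        return [sigmoid(self.b_vis[i]
                        + sum(h[j] * self.w[i][j] for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def sample(self, probs):
        # Draw binary states from unit-wise probabilities.
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def reconstruct(self, v):
        # Deterministic up-down pass: visible -> hidden -> visible.
        return self.visible_probs(self.hidden_probs(v))

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden probabilities driven by the data.
        h0 = self.hidden_probs(v0)
        # Negative phase: sample hidden states, reconstruct, re-infer hidden.
        v1 = self.visible_probs(self.sample(h0))
        h1 = self.hidden_probs(v1)
        # Move parameters toward the data statistics, away from the model's.
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.b_hid[j] += lr * (h0[j] - h1[j])
```

Calling cd1_update repeatedly on unlabeled vectors and then comparing reconstruct(v) with v gives a rough view of how well the features have been learned.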

Deep Belief Network Training
Pre-Train
We show each RBM layer unlabeled vectors
“unsupervised learning”
For each layer we want to minimize the Cross Entropy
Fine-Tune
We move the learned weights (and hidden bias units) from the RBMs to a traditional feed-forward neural network
We run gentle back-propagation with some labeled data
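
As a rough illustration of the pre-training objective, the sketch below computes a cross entropy between an input vector and its reconstruction. It assumes inputs in [0, 1] and reconstructions given as per-unit probabilities; the function name reconstruction_cross_entropy and the eps clamp are illustrative choices, not taken from the slides.

```python
import math


def reconstruction_cross_entropy(v, v_recon, eps=1e-12):
    """Sum over units of -[x*log(p) + (1-x)*log(1-p)]; lower means a better reconstruction."""
    total = 0.0
    for x, p in zip(v, v_recon):
        # Clamp probabilities so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total -= x * math.log(p) + (1.0 - x) * math.log(1.0 - p)
    return total
```

A reconstruction close to the input yields a small value, while an uninformative reconstruction (every unit at 0.5) costs log 2 per unit.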

Pre-Train Reconstructions
[Figure: reconstruction panels labeled “High Cross Entropy” and “Low Cross Entropy”]

Deep Belief Network Diagram
• DBNs are classifiers
• Layers of RBMs
• Capped with a Logistic Layer
• RBMs are feature extractors
• RBMs learn features via sampling
• Creates “simpler problem” for later layers in stack
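
The stack described above (feature-extracting layers capped with a logistic layer) can be sketched as a forward pass. This is a minimal sketch under assumptions: each pre-trained RBM contributes only its weights and hidden biases, the cap is a softmax (multinomial logistic) layer, and the names dense and dbn_predict are hypothetical.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dense(inputs, weights, biases):
    # weights[i][j] connects input unit i to output unit j.
    return [biases[j] + sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
            for j in range(len(biases))]


def softmax(zs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def dbn_predict(x, rbm_layers, out_weights, out_biases):
    """rbm_layers is a list of (weights, hidden_biases) pairs, lowest layer first."""
    h = x
    for weights, hidden_biases in rbm_layers:
        # Each RBM layer turns its input into a new feature vector.
        h = [sigmoid(z) for z in dense(h, weights, hidden_biases)]
    # The logistic cap maps the top-level features to class probabilities.
    return softmax(dense(h, out_weights, out_biases))
```

With this layout, fine-tuning would adjust all of these parameters with back-propagation on labeled data.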

DeepLearning4J
Implementation in Java
Self-contained & built on Akka, Hazelcast, Jblas
Runs on desktop
Runs on Hadoop via YARN natively to scale out
Distributed to run faster and with more features than current Theano-based implementations

Vectorized Implementation
Handles lots of data concurrently.
Any number of examples at once, but the code does not change.
Faster: Allows for native/GPU execution.
One format: Everything is a matrix.
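
To make the "code does not change" point concrete, here is a small sketch in plain Python where a batch is a matrix with one example per row. The same function handles one example or many. The helper names matmul and layer_activations are illustrative; a real implementation would hand the matrix product to a native or GPU linear algebra library.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matmul(a, b):
    # Plain-Python matrix product: (n x k) times (k x m) gives (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def layer_activations(batch, weights):
    # batch holds one example per row; the code is identical for 1 or N rows.
    return [[sigmoid(z) for z in row] for row in matmul(batch, weights)]
```

Passing a one-row batch and a many-row batch through layer_activations uses exactly the same code path, which is what allows the matrix product to be swapped for a faster backend.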

What are Good Applications for Deep Learning?
Image Processing
High MNIST Scores
Audio Processing
Current Champ on TIMIT dataset
Text / NLP Processing
Word2vec, etc

Parameter Averaging
McDonald, 2010
Distributed Training Strategies for the Structured Perceptron
Langford, 2007
Vowpal Wabbit
Jeff Dean’s Work on Parallel SGD
Downpour SGD

Parallelizing Deep Belief Networks
Two phase training
Pre Train
Fine tune
Each phase can do multiple passes over dataset
Entire network is averaged at master
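
The averaging step can be sketched as follows. This is an illustrative simplification: the whole network is flattened into one list of floats per worker, each worker starts a round from the master's parameters, and local_update is a hypothetical stand-in for a worker's training pass over its data shard.

```python
def average_parameters(worker_params):
    """Element-wise mean of equally sized parameter vectors, one per worker."""
    n = len(worker_params)
    return [sum(values) / n for values in zip(*worker_params)]


def training_round(master_params, shards, local_update):
    # Each worker starts from a copy of the master parameters and trains
    # on its own shard of the dataset.
    worker_params = [local_update(list(master_params), shard) for shard in shards]
    # The master replaces its parameters with the average of the workers'.
    return average_parameters(worker_params)
```

Running training_round several times in a row corresponds to multiple passes over the dataset, first for the pre-train phase and then for fine-tuning.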

PreTrain and Lots of Data
We’re exploring how to better leverage the
unsupervised aspects of the PreTrain phase of
Deep Belief Networks
Allows for the use of far less labeled data
Allows us to more easily model the massive amounts of structured data in HDFS

References
Visualizing RBMs
https://jpatanooga.github.io/Metronome/rbm20140306.html
DL4J
http://deeplearning4j.org/

15. Rendering RBM Hidden Neuron Filters

Editor's Notes

Bottou similar to Xu2010 in the 2010 paper
