SlideShare a Scribd company logo
1 of 32
Download to read offline
Large Scale Deep Learning
Quoc V. Le	

Google & CMU
Deep Learning	


•  Google is using Machine Learning 	

•  Machine Learning is difficult	

•  Requires domain knowledge from human experts	

Deep Learning: 	


•  Great performances for many problems	

•  Works well with a large amount of data	

•  Requires less domain knowledge	

Focus: 	


•  Scale deep learning to bigger models and bigger problems	


Quoc V. Le
Deep Learning	


•  Google is using Machine Learning 	

•  Machine Learning is difficult	

•  Requires domain knowledge from human experts	

Deep Learning: 	


•  Great performances for many problems	

•  Works well with a large amount of data	

•  Requires less domain knowledge	

Focus: 	


•  Scale deep learning to bigger models and bigger problems	


Quoc V. Le
What is Deep Learning?	


Quoc V. Le
What is Deep Learning?	


…	

v = g(B u)	

B	

A	


u = g(A x)	

x	


(images, audio, texts, etc.)	


Quoc V. Le
What is Deep Learning?	


…	

v = g(B u)	

B	

A	


u = g(A x)	

x	


(images, audio, texts, etc.)	


Quoc V. Le
High-level features by Deep Learning	

Face detector, Cat detector	

…	


Edge detectors	

Pixels	


Quoc V. Le
Google’s DistBelief	

Model	

Goal: Train deep learning on many
machines	

	

Model: A multiple layered architecture	

	

Forward pass to compute the
features	

Backward pass to compute the
gradient 	


	

Training Data	


Quoc V. Le
Model partition with DistBelief 	

Model	


DistBelief distributes a model across
multiple machines and multiple cores. 	


Machine (Model Partition)	

Training Data	


Quoc V. Le
Model partition with DistBelief 	

Model	


DistBelief distributes a model across
multiple machines and cores. 	


Machine (Model Partition)	

Training Data	


Core	


Quoc V. Le
Model partition with DistBelief 	

Model	


Stochastic Gradient Descent (SGD)	

	

Model parameters are partitioned	

	

Can use up to 1000 cores	


Training Data	


Quoc V. Le
Model partition with DistBelief 	

Model	

But training is still slow on large data sets	

Can we add more parallelism?	

	

Idea: Train multiple models on different
partitions of the data, and merge them	

Training Data	


Quoc V. Le
Data partition with DistBelief 	

Parameter Server	


∆p	


p’ = p + ∆p	


p’	


Model	

Workers	

Data	

Shards	


Quoc V. Le
Parallelism in DistBelief 	

Model parallelism via model partitioning	

	

Data parallelism via data partitioning and asynchronous communications	

	

DistBelief can scale to billion examples and use 100,000 cores or more	

	

Thanks to its speed, DistBelief dramatically improves many applications	

	


Quoc V. Le
Applications	


Voice Search	


Photo Search	

 Text Understanding	


Quoc V. Le
Voice Search	

Classifier	

Hidden layers with 1000s nodes	


Speech frame	


label!

Quoc V. Le
Voice Search	


Quoc V. Le
Applications	


Voice Search	


Photo Search	

 Text Understanding	


Quoc V. Le
Photo Search
Cat detector	

Front page of New York Times	


Quoc V. Le
Seat-belt	


Archery	


Boston rocker	


Shredder
Face	

Amusement, Park	


Hammock
Google+ PhotoSearch
Applications	


Voice Search	


Photo Search	

 Text Understanding	


Quoc V. Le
Text understanding	


Very useful but also difficult	

	

We should try to understand the meaning of words	

	

Deep Learning can learn the meaning of words	

	


Quoc V. Le
Text understanding	

~100-D vector space	

Clinton

Paris

Obama

whale

dolphin

Quoc V. Le
Predicting the next word in a sentence	

Classifier	

Hidden Layers	


E	


E	


E	


E	


E	


the!

Word Matrix	


cat!

sat!

on!

the!

is a matrix of dimension ||Vocab|| x d	


Quoc V. Le
Visualizing the word vectors	


• 

Example nearest neighbors trained on Google News	

apple

Apple

iPhone
Relation Extraction	


Mikolov, Sutskever, Le. Learning the Meaning behind Words. Google OpenSource Blog, 2013	


Quoc V. Le
Machine Translation	


Quoc V. Le
Summary	

Model partition	

Data partition	


Voice Search	


Photo Search	

 Text Understanding	


Quoc V. Le
Joint work with	


Kai Chen	


Greg Corrado	


Rajat Monga	

 Andrew Ng	


Jeff Dean	


Matthieu Devin	


Paul Tucker	


Ke Yang	


Samy Bengio, Tom Dean, Josh Levenberg, Geoff Hinton, Tomas
Additional	

Mikolov, Mark Mao, Patrick Nguyen, Marc’Aurelio Ranzato,
Thanks:	

 Mark Segal, Jon Shlens, Ilya Sutskever, Vincent Vanhoucke

More Related Content

What's hot

What's hot (13)

Deep Learning on Qubole Data Platform
Deep Learning on Qubole Data PlatformDeep Learning on Qubole Data Platform
Deep Learning on Qubole Data Platform
 
Dato Keynote
Dato KeynoteDato Keynote
Dato Keynote
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
 
Analyzing Data With Python
Analyzing Data With PythonAnalyzing Data With Python
Analyzing Data With Python
 
Introduction To TensorFlow
Introduction To TensorFlowIntroduction To TensorFlow
Introduction To TensorFlow
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016
 
Data Science at the Command Line
Data Science at the Command LineData Science at the Command Line
Data Science at the Command Line
 
Deep learning with TensorFlow
Deep learning with TensorFlowDeep learning with TensorFlow
Deep learning with TensorFlow
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData Ecosystem
 
Deep learning trends
Deep learning trendsDeep learning trends
Deep learning trends
 
Webinar: Deep Learning with H2O
Webinar: Deep Learning with H2OWebinar: Deep Learning with H2O
Webinar: Deep Learning with H2O
 
SciPy Latin America 2019
SciPy Latin America 2019SciPy Latin America 2019
SciPy Latin America 2019
 
Deep Learning with CNTK
Deep Learning with CNTKDeep Learning with CNTK
Deep Learning with CNTK
 

Similar to Quoc le, slides MLconf 11/15/13

"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
Edge AI and Vision Alliance
 

Similar to Quoc le, slides MLconf 11/15/13 (20)

How to use transfer learning to bootstrap image classification and question a...
How to use transfer learning to bootstrap image classification and question a...How to use transfer learning to bootstrap image classification and question a...
How to use transfer learning to bootstrap image classification and question a...
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learning
 
Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015Strata London - Deep Learning 05-2015
Strata London - Deep Learning 05-2015
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye view
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
 
OReilly AI Transfer Learning
OReilly AI Transfer LearningOReilly AI Transfer Learning
OReilly AI Transfer Learning
 
Promises of Deep Learning
Promises of Deep LearningPromises of Deep Learning
Promises of Deep Learning
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
 
AI @ Microsoft, How we do it and how you can too!
AI @ Microsoft, How we do it and how you can too!AI @ Microsoft, How we do it and how you can too!
AI @ Microsoft, How we do it and how you can too!
 
Tomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLPTomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLP
 
Video+Language: From Classification to Description
Video+Language: From Classification to DescriptionVideo+Language: From Classification to Description
Video+Language: From Classification to Description
 
Google Cloud: Data Analysis and Machine Learningn Technologies
Google Cloud: Data Analysis and Machine Learningn Technologies Google Cloud: Data Analysis and Machine Learningn Technologies
Google Cloud: Data Analysis and Machine Learningn Technologies
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
 
Deep Neural Networks (DNN)
Deep Neural Networks (DNN)Deep Neural Networks (DNN)
Deep Neural Networks (DNN)
 
Week3-Deep Neural Network (DNN).pptx
Week3-Deep Neural Network (DNN).pptxWeek3-Deep Neural Network (DNN).pptx
Week3-Deep Neural Network (DNN).pptx
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016
 
Video AI for Media and Entertainment Industry
Video AI for Media and Entertainment IndustryVideo AI for Media and Entertainment Industry
Video AI for Media and Entertainment Industry
 
Introducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en AzureIntroducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en Azure
 
Learn Real World Machine Learning By Building Projects
Learn Real World Machine Learning By Building ProjectsLearn Real World Machine Learning By Building Projects
Learn Real World Machine Learning By Building Projects
 

More from MLconf

Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
MLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
MLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
MLconf
 

More from MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Decarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational PerformanceDecarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational Performance
 
Quantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation ComputingQuantum Leap in Next-Generation Computing
Quantum Leap in Next-Generation Computing
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using Ballerina
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Quoc le, slides MLconf 11/15/13