SlideShare a Scribd company logo
1 of 36
Download to read offline
Applications
Lecture slides for Chapter 12 of Deep Learning
www.deeplearningbook.org
Ian Goodfellow
2018-10-25
(Goodfellow 2018)
Disclaimer
• Details of applications change much faster than the
underlying conceptual ideas
• A printed book is updated on the scale of years, state-
of-the-art results come out constantly
• These slides are somewhat more up to date
• Applications involve much more specific knowledge, the
limitations of my own knowledge will be much more
apparent in these slides than others
(Goodfellow 2018)
Large Scale Deep Learning
1950 1985 2000 2015 2056
Year
10 2
10 1
100
101
102
103
104
105
106
107
108
109
1010
1011
Number
of
neurons
(logarithmic
scale)
1 2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Sponge
Roundworm
Leech
Ant
Bee
Frog
Octopus
Human
ure 1.11: Since the introduction of hidden units, artificial neural networks have doub
ize roughly every 2.4 years. Biological neural network sizes from Wikipedia (2015
Figure 1.11
(Goodfellow 2018)
Fast Implementations
• CPU
• Exploit fixed point arithmetic in CPU families where this offers a speedup
• Cache-friendly implementations
• GPU
• High memory bandwidth
• No cache
• Warps must be synchronized
• TPU
• Similar to GPU in many respects but faster
• Often requires larger batch size
• Sometimes requires reduced precision
(Goodfellow 2018)
Distributed Implementations
• Distributed
• Multi-GPU
• Multi-machine
• Model parallelism
• Data parallelism
• Trivial at test time
• Synchronous or asynchronous SGD at train time
(Goodfellow 2018)
Synchronous SGD
TensorFlow tutorial
(Goodfellow 2018)
Example: ImageNet in 18
minutes for $40
Blog post
(Goodfellow 2018)
Model Compression
• Large models often have lower test error
• Very large model trained with dropout
• Ensemble of many models
• Want small model for low resource use at test time
• Train a small model to mimic the large one
• Obtains better test error than directly training a small
model
(Goodfellow 2018)
Quantization
(TensorFlow Lite)
Important for
mobile deployment
(Goodfellow 2018)
Dynamic Structure: Cascades
(Viola and Jones, 2001)
(Goodfellow 2018)
Dynamic Structure
Outrageously Large Neural Networks
(Goodfellow 2018)
Dataset Augmentation for
Computer Vision
Affine
Distortion
Noise
Elastic
Deformation
Horizontal
flip
Random
Translation
Hue Shift
(Goodfellow 2018)
Training Data Sample Generator
(CelebA) (Karras et al, 2017)
Generative Modeling:
Sample Generation
Covered in Part III Progressed rapidly
after the book was
written
Underlies many
graphics and
speech applications
(Goodfellow 2018)
Graphics
(Table by Augustus Odena)
(Goodfellow 2018)
Video Generation
(Wang et al, 2018)
(Goodfellow 2018)
Everybody Dance Now!
(Chan et al 2018)
(Goodfellow 2018)
g optimization with a learned predictor model. a) Original experimental
and measured binding scores (horizontal axis); we fit a model to this data
an oracle for scoring generated sequences. Plot shows scores on held-out
elation 0.97). b) Data is restricted to sequences with oracle scores in the
Model-Based Optimization
(Killoran et al, 2017)
(Goodfellow 2018)
Designing Physical Objects
(Hwang et al 2018)
(Goodfellow 2018)
Attention Mechanisms
translations (Cho et al., 2014a) and for generating translated sentences (Sutskever
et al., 2014). Jean et al. (2014) scaled these models to larger vocabularies.
12.4.5.1 Using an Attention Mechanism and Aligning Pieces of Data
↵(t 1)
↵(t 1)
↵(t)
↵(t)
↵(t+1)
↵(t+1)
h(t 1)
h(t 1)
h(t)
h(t)
h(t+1)
h(t+1)
c
c
⇥
⇥ ⇥
⇥ ⇥
⇥
+
Figure 12.6: A modern attention mechanism, as introduced by Bahdanau et al. (2015), is
essentially a weighted average. A context vector c is formed by taking a weighted average
of feature vectors h(t)
with weights ↵(t)
. In some applications, the feature vectors h are
hidden units of a neural network, but they may also be raw input to the model. The
weights ↵(t)
are produced by the model itself. They are usually values in the interval
(t)
Figure 12.6
Important in many vision, speech, and NLP applications
Improved rapidly after the book was written
(Goodfellow 2018)
Attention for Images
Attention mechanism from
Wang et al 2018
Image model from Zhang et al 2018
(Goodfellow 2018)
Generating Training Data
(Bousmalis et al, 2017)
(Goodfellow 2018)
Generating Training Data
(Bousmalis et al, 2017)
(Goodfellow 2018)
Natural Language Processing
• An important predecessor to deep NLP is the family
of models based on n-grams:
natural language. Depending on how the model is designed, a token may
word, a character, or even a byte. Tokens are always discrete entities. The
st successful language models were based on models of fixed-length sequences
kens called n-grams. An n-gram is a sequence of n tokens.
Models based on n-grams define the conditional probability of the n-th token
the preceding n 1 tokens. The model uses products of these conditional
butions to define the probability distribution over longer sequences:
P(x1, . . . , x⌧ ) = P(x1, . . . , xn 1)
⌧
Y
t=n
P(xt | xt n+1, . . . , xt 1). (12.5)
461
imply by looking up two stored probabilities. For this to exactly reproduce
nference in Pn, we must omit the final character from each sequence when we
rain Pn 1.
As an example, we demonstrate how a trigram model computes the probability
of the sentence “THE DOG RAN AWAY.” The first words of the sentence cannot be
handled by the default formula based on conditional probability because there is no
ontext at the beginning of the sentence. Instead, we must use the marginal prob-
ability over words at the start of the sentence. We thus evaluate P3(THE DOG RAN).
Finally, the last word may be predicted using the typical case, of using the condi-
ional distribution P(AWAY | DOG RAN). Putting this together with equation 12.6,
we obtain:
P(THE DOG RAN AWAY) = P3(THE DOG RAN)P3(DOG RAN AWAY)/P2(DOG RAN).
(12.7)
A fundamental limitation of maximum likelihood for n-gram models is that Pn
as estimated from training set counts is very likely to be zero in many cases, even
hough the tuple (xt n+1, . . . , xt) may appear in the test set. This can cause two
different kinds of catastrophic outcomes. When Pn 1 is zero, the ratio is undefined,
o the model does not even produce a sensible output. When Pn 1 is non-zero but
Pn is zero, the test log-likelihood is 1. To avoid such catastrophic outcomes,
Improve with:
-Smoothing
-Backoff
-Word categories
(Goodfellow 2018)
Word Embeddings in Neural
Language Models
CHAPTER 12. APPLICATIONS
multiple latent variables (Mnih and Hinton, 2007).
34 32 30 28 26
14
13
12
11
10
9
8
7
6
Canada
Europe
Ontario
North
English
Canadian
Union
African
Africa
British
France
Russian
China
Germany
French
Assembly
EU Japan
Iraq
South
European
35.0 35.5 36.0 36.5 37.0 37.5 38.0
17
18
19
20
21
22
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
Figure 12.3: Two-dimensional visualizations of word embeddings obtained from a neural
machine translation model (Bahdanau et al., 2015), zooming in on specific areas where
semantically related words have embedding vectors that are close to each other. Countries
Figure 12.3
(Goodfellow 2018)
High-Dimensional Output
Layers for Large Vocabularies
• Short list
• Hierarchical softmax
• Importance sampling
• Noise contrastive estimation
(Goodfellow 2018)
A Hierarchy of Words and
Word Categories
CHAPTER 12. APPLICATIONS
(1)
(0)
(0,0,0) (0,0,1) (0,1,0) (0,1,1) (1,0,0) (1,0,1) (1,1,0) (1,1,1)
(1,1)
(1,0)
(0,1)
(0,0)
w0
w0 w1
w1 w2
w2 w3
w3 w4
w4 w5
w5 w6
w6 w7
w7
Figure 12.4: Illustration of a simple hierarchy of word categories, with 8 words w0, . . . , w7
Figure 12.4
(Goodfellow 2018)
Neural Machine Translation
HAPTER 12. APPLICATIONS
Decoder
Output object (English
sentence)
Intermediate, semantic representation
Source object (French sentence or image)
Encoder
igure 12.5: The encoder-decoder architecture to map back and forth between a surfac
epresentation (such as a sequence of words or an image) and a semantic representatio
By using the output of an encoder of data from one modality (such as the encoder mappin
Figure 12.5
(Goodfellow 2018)
Google Neural Machine Translation
Wu et al 2016
(Goodfellow 2018)
Speech Recognition
Chan et al 2015
“Listen, Attend, and Spell”
Graphic from
Current speech recognition
is based on seq2seq with
attention
(Goodfellow 2018)
Speech Synthesis
WaveNet
(van den Oord et al, 2016)
(Goodfellow 2018)
Deep RL for Atari game playing
(Mnih et al 2013)
Convolutional network estimates the value function (future
rewards) used to guide the game-playing agent.
(Note: deep RL didn’t really exist when we started the book,
became a success while we were writing it, extremely hot topic by the time the book was printed)
(Goodfellow 2018)
Superhuman Go Performance
(Silver et al, 2016)
Monte Carlo tree search, with convolutional networks for value
function and policy
(Goodfellow 2018)
Robotics
(Google Brain)
(Goodfellow 2018)
Healthcare and Biosciences
(Google Brain)
(Goodfellow 2018)
Autonomous Vehicles
(WayMo)
(Goodfellow 2018)
Questions

More Related Content

What's hot

A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)Thomas da Silva Paula
 
Learning To Rank data2day 2017
Learning To Rank data2day 2017Learning To Rank data2day 2017
Learning To Rank data2day 2017Stefan Kühn
 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models Chia-Wen Cheng
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Universitat Politècnica de Catalunya
 
Capitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNNCapitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNNAlpaca
 
Learning when to give up: theory, practice and perspectives
Learning when to give up: theory, practice and perspectivesLearning when to give up: theory, practice and perspectives
Learning when to give up: theory, practice and perspectivesGiuseppe (Pino) Di Fabbrizio
 
EuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEmanuele Ghelfi
 
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEYIMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEYijcsit
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsArtifacia
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and ApplicationsHoang Nguyen
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical ImagingSanghoon Hong
 
Entity embeddings for categorical data
Entity embeddings for categorical dataEntity embeddings for categorical data
Entity embeddings for categorical dataPaul Skeie
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networksYunjey Choi
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Ian Morgan
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelstuxette
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)NamHyuk Ahn
 
A Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial NetworksA Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial NetworksJong Wook Kim
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기NAVER Engineering
 
PixelGAN Autoencoders
  PixelGAN Autoencoders  PixelGAN Autoencoders
PixelGAN AutoencodersMLReview
 

What's hot (20)

A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs) A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
A (Very) Gentle Introduction to Generative Adversarial Networks (a.k.a GANs)
 
Learning To Rank data2day 2017
Learning To Rank data2day 2017Learning To Rank data2day 2017
Learning To Rank data2day 2017
 
Open-ended Visual Question-Answering
Open-ended  Visual Question-AnsweringOpen-ended  Visual Question-Answering
Open-ended Visual Question-Answering
 
Deep Generative Models
Deep Generative Models Deep Generative Models
Deep Generative Models
 
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
Generative Adversarial Networks (D2L5 Deep Learning for Speech and Language U...
 
Capitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNNCapitalico / Chart Pattern Matching in Financial Trading Using RNN
Capitalico / Chart Pattern Matching in Financial Trading Using RNN
 
Learning when to give up: theory, practice and perspectives
Learning when to give up: theory, practice and perspectivesLearning when to give up: theory, practice and perspectives
Learning when to give up: theory, practice and perspectives
 
EuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and ApplicationsEuroSciPy 2019 - GANs: Theory and Applications
EuroSciPy 2019 - GANs: Theory and Applications
 
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEYIMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
IMAGE GENERATION WITH GANS-BASED TECHNIQUES: A SURVEY
 
Generative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their ApplicationsGenerative Adversarial Networks and Their Applications
Generative Adversarial Networks and Their Applications
 
GANs and Applications
GANs and ApplicationsGANs and Applications
GANs and Applications
 
Generative Adversarial Networks and Their Applications in Medical Imaging
Generative Adversarial Networks  and Their Applications in Medical ImagingGenerative Adversarial Networks  and Their Applications in Medical Imaging
Generative Adversarial Networks and Their Applications in Medical Imaging
 
Entity embeddings for categorical data
Entity embeddings for categorical dataEntity embeddings for categorical data
Entity embeddings for categorical data
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
Professor Steve Roberts; The Bayesian Crowd: scalable information combinati...
 
A short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction modelsA short and naive introduction to using network in prediction models
A short and naive introduction to using network in prediction models
 
Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)Generative Adversarial Network (+Laplacian Pyramid GAN)
Generative Adversarial Network (+Laplacian Pyramid GAN)
 
A Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial NetworksA Short Introduction to Generative Adversarial Networks
A Short Introduction to Generative Adversarial Networks
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
 
PixelGAN Autoencoders
  PixelGAN Autoencoders  PixelGAN Autoencoders
PixelGAN Autoencoders
 

Similar to 12_applications.pdf

Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsYoung-Geun Choi
 
A Decomposition Technique For Solving Integer Programming Problems
A Decomposition Technique For Solving Integer Programming ProblemsA Decomposition Technique For Solving Integer Programming Problems
A Decomposition Technique For Solving Integer Programming ProblemsCarrie Romero
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysisodsc
 
SNLI_presentation_2
SNLI_presentation_2SNLI_presentation_2
SNLI_presentation_2Viral Gupta
 
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetupLucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetupLuba Elliott
 
Beginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptxBeginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptxIshaq Khan
 
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONEXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONijaia
 
TensorfLow_Basic.pptx
TensorfLow_Basic.pptxTensorfLow_Basic.pptx
TensorfLow_Basic.pptxTMUb202109065
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningmustafa sarac
 
Seed rl paper review
Seed rl paper reviewSeed rl paper review
Seed rl paper reviewKyoungman Lee
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Sudeep Das, Ph.D.
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIWithTheBest
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...Edge AI and Vision Alliance
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDatabricks
 
result analysis for deep leakage from gradients
result analysis for deep leakage from gradientsresult analysis for deep leakage from gradients
result analysis for deep leakage from gradients國騰 丁
 

Similar to 12_applications.pdf (20)

04 numerical
04 numerical04 numerical
04 numerical
 
Chap 8. Optimization for training deep models
Chap 8. Optimization for training deep modelsChap 8. Optimization for training deep models
Chap 8. Optimization for training deep models
 
A Decomposition Technique For Solving Integer Programming Problems
A Decomposition Technique For Solving Integer Programming ProblemsA Decomposition Technique For Solving Integer Programming Problems
A Decomposition Technique For Solving Integer Programming Problems
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysis
 
SNLI_presentation_2
SNLI_presentation_2SNLI_presentation_2
SNLI_presentation_2
 
06 mlp
06 mlp06 mlp
06 mlp
 
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetupLucas Theis - Compressing Images with Neural Networks - Creative AI meetup
Lucas Theis - Compressing Images with Neural Networks - Creative AI meetup
 
Beginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptxBeginner's Guide to Diffusion Models..pptx
Beginner's Guide to Diffusion Models..pptx
 
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATIONEXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
EXTENDING OUTPUT ATTENTIONS IN RECURRENT NEURAL NETWORKS FOR DIALOG GENERATION
 
TensorfLow_Basic.pptx
TensorfLow_Basic.pptxTensorfLow_Basic.pptx
TensorfLow_Basic.pptx
 
SEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial NetworkSEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial Network
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learning
 
Seed rl paper review
Seed rl paper reviewSeed rl paper review
Seed rl paper review
 
Deeplearning in finance
Deeplearning in financeDeeplearning in finance
Deeplearning in finance
 
Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it! Crafting Recommenders: the Shallow and the Deep of it!
Crafting Recommenders: the Shallow and the Deep of it!
 
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAIGenerative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
Generative Adversarial Networks (GANs) - Ian Goodfellow, OpenAI
 
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn..."Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
"Large-Scale Deep Learning for Building Intelligent Computer Systems," a Keyn...
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
 
result analysis for deep leakage from gradients
result analysis for deep leakage from gradientsresult analysis for deep leakage from gradients
result analysis for deep leakage from gradients
 
Plug play language_models
Plug play language_modelsPlug play language_models
Plug play language_models
 

More from KSChidanandKumarJSSS (10)

16_graphical_models.pdf
16_graphical_models.pdf16_graphical_models.pdf
16_graphical_models.pdf
 
10_rnn.pdf
10_rnn.pdf10_rnn.pdf
10_rnn.pdf
 
15_representation.pdf
15_representation.pdf15_representation.pdf
15_representation.pdf
 
14_autoencoders.pdf
14_autoencoders.pdf14_autoencoders.pdf
14_autoencoders.pdf
 
Adversarial ML - Part 2.pdf
Adversarial ML - Part 2.pdfAdversarial ML - Part 2.pdf
Adversarial ML - Part 2.pdf
 
Adversarial ML - Part 1.pdf
Adversarial ML - Part 1.pdfAdversarial ML - Part 1.pdf
Adversarial ML - Part 1.pdf
 
17_monte_carlo.pdf
17_monte_carlo.pdf17_monte_carlo.pdf
17_monte_carlo.pdf
 
18_partition.pdf
18_partition.pdf18_partition.pdf
18_partition.pdf
 
8803-09-lec16.pdf
8803-09-lec16.pdf8803-09-lec16.pdf
8803-09-lec16.pdf
 
13_linear_factors.pdf
13_linear_factors.pdf13_linear_factors.pdf
13_linear_factors.pdf
 

Recently uploaded

Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...chandars293
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...parulsinha
 
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Sheetaleventcompany
 
Top Rated Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
Top Rated  Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...Top Rated  Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
Top Rated Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...chandars293
 
Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...
Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...
Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...parulsinha
 
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableCall Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableGENUINE ESCORT AGENCY
 
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...Sheetaleventcompany
 
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...parulsinha
 
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls ServiceGENUINE ESCORT AGENCY
 
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...Sheetaleventcompany
 
Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...
Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...
Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...Sheetaleventcompany
 
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...adilkhan87451
 
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...Dipal Arora
 
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...khalifaescort01
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In AhmedabadGENUINE ESCORT AGENCY
 
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426jennyeacort
 
Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...
Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...
Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...chennailover
 

Recently uploaded (20)

Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
 
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
Call Girls Service Jaipur {9521753030} ❤️VVIP RIDDHI Call Girl in Jaipur Raja...
 
Top Rated Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
Top Rated  Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...Top Rated  Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
Top Rated Hyderabad Call Girls Chintal ⟟ 9332606886 ⟟ Call Me For Genuine Se...
 
Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...
Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...
Call Girls Service Jaipur {8445551418} ❤️VVIP BHAWNA Call Girl in Jaipur Raja...
 
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableCall Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
 
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
Dehradun Call Girls Service {8854095900} ❤️VVIP ROCKY Call Girl in Dehradun U...
 
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
 
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
 
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
 
Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...
Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...
Low Rate Call Girls Bangalore {7304373326} ❤️VVIP NISHA Call Girls in Bangalo...
 
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
 
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
 
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
 
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
 
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
Call Girls in Delhi Triveni Complex Escort Service(🔝))/WhatsApp 97111⇛47426
 
Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...
Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...
Coimbatore Call Girls in Thudiyalur : 7427069034 High Profile Model Escorts |...
 

12_applications.pdf

  • 1. Applications Lecture slides for Chapter 12 of Deep Learning www.deeplearningbook.org Ian Goodfellow 2018-10-25
  • 2. (Goodfellow 2018) Disclaimer • Details of applications change much faster than the underlying conceptual ideas • A printed book is updated on the scale of years, state- of-the-art results come out constantly • These slides are somewhat more up to date • Applications involve much more specific knowledge, the limitations of my own knowledge will be much more apparent in these slides than others
  • 3. (Goodfellow 2018) Large Scale Deep Learning 1950 1985 2000 2015 2056 Year 10 2 10 1 100 101 102 103 104 105 106 107 108 109 1010 1011 Number of neurons (logarithmic scale) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Sponge Roundworm Leech Ant Bee Frog Octopus Human ure 1.11: Since the introduction of hidden units, artificial neural networks have doub ize roughly every 2.4 years. Biological neural network sizes from Wikipedia (2015 Figure 1.11
  • 4. (Goodfellow 2018) Fast Implementations • CPU • Exploit fixed point arithmetic in CPU families where this offers a speedup • Cache-friendly implementations • GPU • High memory bandwidth • No cache • Warps must be synchronized • TPU • Similar to GPU in many respects but faster • Often requires larger batch size • Sometimes requires reduced precision
  • 5. (Goodfellow 2018) Distributed Implementations • Distributed • Multi-GPU • Multi-machine • Model parallelism • Data parallelism • Trivial at test time • Synchronous or asynchronous SGD at train time
  • 7. (Goodfellow 2018) Example: ImageNet in 18 minutes for $40 Blog post
  • 8. (Goodfellow 2018) Model Compression • Large models often have lower test error • Very large model trained with dropout • Ensemble of many models • Want small model for low resource use at test time • Train a small model to mimic the large one • Obtains better test error than directly training a small model
  • 10. (Goodfellow 2018) Dynamic Structure: Cascades (Viola and Jones, 2001)
  • 12. (Goodfellow 2018) Dataset Augmentation for Computer Vision Affine Distortion Noise Elastic Deformation Horizontal flip Random Translation Hue Shift
  • 13. (Goodfellow 2018) Training Data Sample Generator (CelebA) (Karras et al, 2017) Generative Modeling: Sample Generation Covered in Part III Progressed rapidly after the book was written Underlies many graphics and speech applications
  • 16. (Goodfellow 2018) Everybody Dance Now! (Chan et al 2018)
  • 17. (Goodfellow 2018) g optimization with a learned predictor model. a) Original experimental and measured binding scores (horizontal axis); we fit a model to this data an oracle for scoring generated sequences. Plot shows scores on held-out elation 0.97). b) Data is restricted to sequences with oracle scores in the Model-Based Optimization (Killoran et al, 2017)
  • 18. (Goodfellow 2018) Designing Physical Objects (Hwang et al 2018)
  • 19. (Goodfellow 2018) Attention Mechanisms translations (Cho et al., 2014a) and for generating translated sentences (Sutskever et al., 2014). Jean et al. (2014) scaled these models to larger vocabularies. 12.4.5.1 Using an Attention Mechanism and Aligning Pieces of Data ↵(t 1) ↵(t 1) ↵(t) ↵(t) ↵(t+1) ↵(t+1) h(t 1) h(t 1) h(t) h(t) h(t+1) h(t+1) c c ⇥ ⇥ ⇥ ⇥ ⇥ ⇥ + Figure 12.6: A modern attention mechanism, as introduced by Bahdanau et al. (2015), is essentially a weighted average. A context vector c is formed by taking a weighted average of feature vectors h(t) with weights ↵(t) . In some applications, the feature vectors h are hidden units of a neural network, but they may also be raw input to the model. The weights ↵(t) are produced by the model itself. They are usually values in the interval (t) Figure 12.6 Important in many vision, speech, and NLP applications Improved rapidly after the book was written
  • 20. (Goodfellow 2018) Attention for Images Attention mechanism from Wang et al 2018 Image model from Zhang et al 2018
  • 21. (Goodfellow 2018) Generating Training Data (Bousmalis et al, 2017)
  • 22. (Goodfellow 2018) Generating Training Data (Bousmalis et al, 2017)
  • 23. (Goodfellow 2018) Natural Language Processing • An important predecessor to deep NLP is the family of models based on n-grams: natural language. Depending on how the model is designed, a token may word, a character, or even a byte. Tokens are always discrete entities. The st successful language models were based on models of fixed-length sequences kens called n-grams. An n-gram is a sequence of n tokens. Models based on n-grams define the conditional probability of the n-th token the preceding n 1 tokens. The model uses products of these conditional butions to define the probability distribution over longer sequences: P(x1, . . . , x⌧ ) = P(x1, . . . , xn 1) ⌧ Y t=n P(xt | xt n+1, . . . , xt 1). (12.5) 461 imply by looking up two stored probabilities. For this to exactly reproduce nference in Pn, we must omit the final character from each sequence when we rain Pn 1. As an example, we demonstrate how a trigram model computes the probability of the sentence “THE DOG RAN AWAY.” The first words of the sentence cannot be handled by the default formula based on conditional probability because there is no ontext at the beginning of the sentence. Instead, we must use the marginal prob- ability over words at the start of the sentence. We thus evaluate P3(THE DOG RAN). Finally, the last word may be predicted using the typical case, of using the condi- ional distribution P(AWAY | DOG RAN). Putting this together with equation 12.6, we obtain: P(THE DOG RAN AWAY) = P3(THE DOG RAN)P3(DOG RAN AWAY)/P2(DOG RAN). (12.7) A fundamental limitation of maximum likelihood for n-gram models is that Pn as estimated from training set counts is very likely to be zero in many cases, even hough the tuple (xt n+1, . . . , xt) may appear in the test set. This can cause two different kinds of catastrophic outcomes. When Pn 1 is zero, the ratio is undefined, o the model does not even produce a sensible output. When Pn 1 is non-zero but Pn is zero, the test log-likelihood is 1. To avoid such catastrophic outcomes, Improve with: -Smoothing -Backoff -Word categories
  • 24. (Goodfellow 2018) Word Embeddings in Neural Language Models CHAPTER 12. APPLICATIONS multiple latent variables (Mnih and Hinton, 2007). 34 32 30 28 26 14 13 12 11 10 9 8 7 6 Canada Europe Ontario North English Canadian Union African Africa British France Russian China Germany French Assembly EU Japan Iraq South European 35.0 35.5 36.0 36.5 37.0 37.5 38.0 17 18 19 20 21 22 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 Figure 12.3: Two-dimensional visualizations of word embeddings obtained from a neural machine translation model (Bahdanau et al., 2015), zooming in on specific areas where semantically related words have embedding vectors that are close to each other. Countries Figure 12.3
  • 25. (Goodfellow 2018) High-Dimensional Output Layers for Large Vocabularies • Short list • Hierarchical softmax • Importance sampling • Noise contrastive estimation
  • 26. (Goodfellow 2018) A Hierarchy of Words and Word Categories CHAPTER 12. APPLICATIONS (1) (0) (0,0,0) (0,0,1) (0,1,0) (0,1,1) (1,0,0) (1,0,1) (1,1,0) (1,1,1) (1,1) (1,0) (0,1) (0,0) w0 w0 w1 w1 w2 w2 w3 w3 w4 w4 w5 w5 w6 w6 w7 w7 Figure 12.4: Illustration of a simple hierarchy of word categories, with 8 words w0, . . . , w7 Figure 12.4
  • 27. (Goodfellow 2018) Neural Machine Translation HAPTER 12. APPLICATIONS Decoder Output object (English sentence) Intermediate, semantic representation Source object (French sentence or image) Encoder igure 12.5: The encoder-decoder architecture to map back and forth between a surfac epresentation (such as a sequence of words or an image) and a semantic representatio By using the output of an encoder of data from one modality (such as the encoder mappin Figure 12.5
  • 28. (Goodfellow 2018) Google Neural Machine Translation Wu et al 2016
  • 29. (Goodfellow 2018) Speech Recognition Chan et al 2015 “Listen, Attend, and Spell” Graphic from Current speech recognition is based on seq2seq with attention
  • 31. (Goodfellow 2018) Deep RL for Atari game playing (Mnih et al 2013) Convolutional network estimates the value function (future rewards) used to guide the game-playing agent. (Note: deep RL didn’t really exist when we started the book, became a success while we were writing it, extremely hot topic by the time the book was printed)
  • 32. (Goodfellow 2018) Superhuman Go Performance (Silver et al, 2016) Monte Carlo tree search, with convolutional networks for value function and policy
  • 34. (Goodfellow 2018) Healthcare and Biosciences (Google Brain)