Deep Learning for Natural Language
Processing
Jonathan Mugan, PhD
NLP Community Day
June 4, 2015
Overview
• About me and DeepGrammar (4 minutes)
• Introduction to Deep Learning for NLP
• Recurrent Neural Networks
• Deep Learning and Question Answering
• Limitations of Deep Learning for NLP
• How You Can Get Started
The importance of finding dumb mistakes
Deep learning enables sub-symbolic processing
Symbolic systems can be brittle.
I bought a car .
<i> <bought> <a> <car> <.>
You have to remember to
represent “purchased” and
“automobile.”
What about “truck”?
How do you encode the
meaning of the entire
sentence?
Deep learning begins with a little function
It all starts with a humble linear function called a perceptron.
sum = weight1 × input1 + weight2 × input2 + weight3 × input3

Perceptron:
If sum > threshold: output 1
Else: output 0
Example: The inputs can be your data. Question: Should I buy this car?
sum = 0.2 × gas mileage + 0.3 × horsepower + 0.5 × num cup holders

Perceptron:
If sum > threshold: buy
Else: walk
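As a sketch, the perceptron above fits in a few lines of Python. The weights match the slide; the input values and threshold are made-up numbers for illustration.

```python
def perceptron(inputs, weights, threshold):
    """Weighted sum of the inputs, then a hard threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

# Inputs: gas mileage, horsepower, number of cup holders (made-up values).
car = [30.0, 200.0, 2.0]
weights = [0.2, 0.3, 0.5]
decision = perceptron(car, weights, threshold=50.0)
print("buy" if decision == 1 else "walk")  # 0.2*30 + 0.3*200 + 0.5*2 = 67 > 50, so "buy"
```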
These little functions are chained together
Deep learning comes from chaining a bunch of these little
functions together. Chained together, they are called
neurons.
To create a neuron, we add a nonlinearity to the perceptron to get extra representational power when we chain them together. Our nonlinear perceptron is sometimes called a sigmoid:

output = σ(weight · input + b), where σ(z) = 1 / (1 + e^(−z))

The value b just offsets the sigmoid so the center is at 0.
Plot of a sigmoid
Single artificial neuron

σ(weight1 × input1 + weight2 × input2 + weight3 × input3 + b) → output, or input to the next neuron
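A single neuron is the perceptron with the sigmoid swapped in for the hard threshold; a minimal sketch (the inputs, weights, and bias here are made up):

```python
import math

def sigmoid(z):
    # Squashes any real number into (0, 1); centered at 0 when b = 0.
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, b):
    """A perceptron with a sigmoid nonlinearity instead of a hard threshold."""
    z = sum(w * x for w, x in zip(weights, inputs)) + b
    return sigmoid(z)

out = neuron([1.0, 2.0, 3.0], [0.2, 0.3, 0.5], b=-1.0)  # a value in (0, 1)
```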
Three-layered neural network
A bunch of neurons chained together is called a neural network.
Layer 2: hidden layer. Called
this because it is neither input
nor output.
Layer 3: output. E.g., cat
or not a cat; buy the car or
walk.
Layer 1: input data. Can
be pixel values or the number
of cup holders.
This network has three layers.
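A minimal forward pass for such a three-layer network, assuming NumPy and randomly initialized (untrained) weights; the layer sizes and input values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 3 inputs (e.g., mileage, horsepower, cup holders), 4 hidden units, 1 output.
# Weights are random here; training would set them with backpropagation.
W1 = rng.normal(size=(4, 3)) * 0.01
b1 = np.zeros(4)
W2 = rng.normal(size=(1, 4))
b2 = np.zeros(1)

def forward(x):
    hidden = sigmoid(W1 @ x + b1)      # layer 2: hidden layer
    return sigmoid(W2 @ hidden + b2)   # layer 3: output, e.g. P(buy)

y = forward(np.array([30.0, 200.0, 2.0]))
```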
Training with supervised learning
Supervised Learning: You show the network a bunch of things
with labels saying what they are, and you want the network to
learn to classify future things without labels.
Example: here are some
pictures of cats. Tell me
which of these other pictures
are of cats.
To train the network, you want to
find the weights that
correctly classify all of the
training examples. You hope
it will work on the testing
examples.
Done with an algorithm
called Backpropagation
[Rumelhart et al., 1986].
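A toy version of this training idea, using gradient descent on a one-layer network (logistic regression) with a made-up labeled dataset (the AND function). This is a sketch of the principle, not the full backpropagation algorithm:

```python
import numpy as np

# Four labeled examples: the AND function on two binary inputs.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 0., 0., 1.])

w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
    grad = p - y                            # gradient of cross-entropy loss
    w -= lr * (X.T @ grad) / len(y)         # update weights ...
    b -= lr * grad.mean()                   # ... and bias

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
```

After training, the network classifies all four training examples correctly.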
Training with supervised learning

Learning is learning the parameter values: the network maps the
input x to the output y using the weight parameters w and W.
These parameters are tensors flowing through the network, which
is why Google’s deep learning toolbox is called TensorFlow.
Deep learning is adding more layers
There is no exact
definition of
what constitutes
“deep learning.”
The number of
weights (parameters)
is generally large.
Some networks have
millions of parameters
that are learned.
Recall our standard architecture
Layer 2: hidden layer. Called
this because it is neither input
nor output.
Layer 3: output. E.g., cat
or not a cat; buy the car or
walk.
Layer 1: input data. Can
be pixel values or the number
of cup holders.
Is this a cat?
Neural nets with multiple outputs
Okay, but what kind of cat is it?
Introduce a new node
called a softmax.
Probability a
house cat
Probability a
lion
Probability a
panther
Probability a
bobcat
Just normalize each exponentiated
output by the sum of all the
exponentiated outputs.
Gives a probability.
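The softmax node itself is short; a sketch with made-up scores for the four cat classes:

```python
import numpy as np

def softmax(scores):
    # Exponentiate, then normalize by the sum; subtracting the max
    # first keeps the exponentials numerically stable.
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

# Raw network outputs for house cat, lion, panther, bobcat (made up).
probs = softmax(np.array([2.0, 0.5, -1.0, 0.1]))
```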
Learning word vectors
[13.2, 5.4, −3.1] [−12.1, 13.1, 0.1] [7.2, 3.2, −1.9]
the man ran
From the sentence, “The man ran fast.”
Probability
of “fast”
Probability
of “slow”
Probability
of “taco”
Probability
of “bobcat”
Learns a vector for each word based on the “meaning” in the sentence by
trying to predict the next word [Bengio et al., 2003].
These numbers updated
along with the weights
and become the vector
representations of the
words.
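The forward pass of such a model can be sketched as follows. The vocabulary, vector size, and weights are all toy values; in a real model the embedding rows E and the weights W are learned together by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "man", "ran", "fast", "slow", "taco", "bobcat"]
dim = 3  # toy vector size; real models use hundreds of dimensions

# One trainable vector per word. During training these rows are updated
# along with the weights; here they are just randomly initialized.
E = rng.normal(size=(len(vocab), dim))
W = rng.normal(size=(len(vocab), 3 * dim))  # context vectors -> word scores

def predict_next(context):
    """P(next word) from a three-word context, e.g. ['the', 'man', 'ran']."""
    x = np.concatenate([E[vocab.index(word)] for word in context])
    scores = W @ x
    e = np.exp(scores - scores.max())
    return e / e.sum()

p = predict_next(["the", "man", "ran"])  # one probability per vocabulary word
```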
Comparing vector and symbolic representations
Vector representation
taco = [17.32, 82.9, −4.6, 7.2]
Symbolic representation
taco = 𝑡𝑎𝑐𝑜
• Vectors have a similarity score.
• A taco is not a burrito but similar.
• Symbols can be the same or not.
• A taco is just as different from a
burrito as a Toyota.
• Vectors have internal structure
[Mikolov et al., 2013].
• Italy – Rome = France – Paris
• King – Queen = Man – Woman
• Symbols have no structure.
• Symbols are arbitrarily assigned.
• Meaning relative to other symbols.
• Vectors are grounded in
experience.
• Meaning relative to predictions.
• Ability to learn representations
makes agents less brittle.
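The similarity score and the internal structure can both be shown with cosine similarity. The vectors below are hypothetical, constructed so the analogy holds; real learned vectors satisfy it only approximately.

```python
import numpy as np

def cosine(a, b):
    """Similarity score between two word vectors, in [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical vectors, constructed so the analogy structure holds.
king  = np.array([0.9, 0.8, 0.1])
queen = np.array([0.9, 0.2, 0.1])
man   = np.array([0.5, 0.8, 0.3])
woman = np.array([0.5, 0.2, 0.3])

# King - Queen and Man - Woman point in the same direction.
sim = cosine(king - queen, man - woman)
```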
Overview
• About me and DeepGrammar (4 minutes)
• Introduction to Deep Learning for NLP
• Recurrent Neural Networks
• Deep Learning and Question Answering
• Limitations of Deep Learning for NLP
• How You Can Get Started
Encoding sentence meaning into a vector
Like a hidden Markov model, but doesn’t make the Markov
assumption and benefits from a vector representation.
h0 →(The)→ h1 →(patient)→ h2 →(fell)→ h3 →(.)→ final encoding

“The patient fell.”
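The encoder sketched above is just a loop over the words; a minimal version with made-up word vectors and a toy hidden-state size:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # toy hidden-state size

Wh = rng.normal(size=(dim, dim)) * 0.1  # previous state -> new state
Wx = rng.normal(size=(dim, dim)) * 0.1  # current word vector -> new state
b = np.zeros(dim)

def step(h, x):
    # One RNN step: combine the previous state with the current word.
    return np.tanh(Wh @ h + Wx @ x + b)

# Made-up vectors standing in for "The", "patient", "fell", ".".
sentence = [rng.normal(size=dim) for _ in range(4)]
h = np.zeros(dim)          # h0
for x in sentence:
    h = step(h, x)         # h1, h2, h3, then the final state
# h now encodes the whole sentence as a single vector.
```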
Decoding sentence meaning
Machine translation, or structure learning more generally.
h3 →(El)→ h4 →(paciente)→ h5 →(cayó)→ h6 →(.)
[Cho et al., 2014]
It keeps generating until it generates a stop symbol.
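Greedy decoding can be sketched like this. The weights are random (untrained), and the tiny Spanish vocabulary is just for illustration; a trained decoder would produce the actual translation.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["El", "paciente", "cayó", ".", "<stop>"]
dim = 4

Wh = rng.normal(size=(dim, dim)) * 0.5   # state -> state
Wo = rng.normal(size=(len(vocab), dim))  # state -> word scores
E = rng.normal(size=(len(vocab), dim))   # word -> input vector

def decode(h, max_len=10):
    """Greedy decoding: emit the most likely word, feed it back in,
    and stop when the stop symbol is generated."""
    words = []
    x = np.zeros(dim)                    # start-of-sequence input
    for _ in range(max_len):
        h = np.tanh(Wh @ h + x)
        word_id = int(np.argmax(Wo @ h))
        if vocab[word_id] == "<stop>":
            break
        words.append(vocab[word_id])
        x = E[word_id]                   # generated word becomes next input
    return words

out = decode(rng.normal(size=dim))       # stand-in for the encoder's h3
```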
Generating image captions
Convolutional neural network → h0 →(An)→ h1 →(angry)→ h2 →(sister)→ h3 →(.)
[Karpathy and Fei-Fei, 2015]
[Vinyals et al., 2015]
Image caption examples
See: [Karpathy and Fei-Fei, 2015] http://cs.stanford.edu/people/karpathy/deepimagesent/
Attention [Bahdanau et al., 2014]
Encoder: h0 →(The)→ h1 →(patient)→ h2 →(fell)→ h3 →(.)
Decoder: h3 →(El)→ h4 →(paciente)→ h5 →(cayó)→ h6 →(.)

At each decoding step, the decoder attends to all of the encoder’s
hidden states, rather than relying only on the final vector.
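The core of attention is a few lines: score each encoder state against the current decoder state, softmax, and average. This is a dot-product-attention sketch with made-up states; Bahdanau et al. actually use a small learned network for the scores.

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Dot-product attention: score every encoder state against the
    current decoder state, softmax the scores, and return the
    weighted average of the encoder states."""
    scores = encoder_states @ decoder_state
    e = np.exp(scores - scores.max())
    weights = e / e.sum()               # how much to attend to each word
    context = weights @ encoder_states  # blended encoder information
    return weights, context

# Encoder states h1..h3 for a three-word sentence (made-up values).
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
weights, context = attention(np.array([1.0, 0.0]), H)
```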
RNNs and Structure Learning
• These are sometimes called seq2seq models.
• In addition to machine translation and generating
captions for images, can be used to learn just about
any kind of structure you’d want, as long as you
have lots of training data.
Overview
• About me and DeepGrammar (4 minutes)
• Introduction to Deep Learning for NLP
• Recurrent Neural Networks
• Deep Learning and Question Answering
• Limitations of Deep Learning for NLP
• How You Can Get Started
Deep learning and question answering
RNNs answer questions.
What is the translation of this
phrase to French?
What is the next word?
Attention is useful for question
answering.
This can be generalized to which facts
the learner should pay attention to
when answering questions.
Deep learning and question answering
Bob went home.
Tim went to the junkyard.
Bob picked up the jar.
Bob went to town.
Where is the jar? A: town
• Memory Networks [Weston et al.,
2014]
• Updates memory vectors based on
a question and finds the best one to
give the output.
The office is north of the yard.
The bath is north of the office.
The yard is west of the kitchen.
How do you go from the office to
the kitchen? A: south, east
• Neural Reasoner [Peng et al.,
2015]
• Encodes the question and facts in
many layers, and the final layer is
put through a function that gives
the answer.
Overview
• About me and DeepGrammar (4 minutes)
• Introduction to Deep Learning for NLP
• Recurrent Neural Networks
• Deep Learning and Question Answering
• Limitations of Deep Learning for NLP
• How You Can Get Started
Limitations of deep learning
The encoded meaning is grounded with respect to other words.
There is no linkage to the physical world.
"ICubLugan01 Reaching". Licensed under CC BY-SA 3.0 via Wikipedia - https://en.wikipedia.org/wiki/File:ICubLugan01_Reaching.png#/media/File:ICubLugan01_Reaching.png
The iCub http://www.icub.org/
Limitations of deep learning
Bob went home.
Tim went to the junkyard.
Bob picked up the jar.
Bob went to town.
Where is the jar? A: town
Deep learning has no
understanding of what it means for
the jar to be in town.
For example, that it can’t also be at
the junkyard, or that it may be in
Bob’s car, or still in his hands.
The encoded meaning is grounded with respect to other words.
There is no linkage to the physical world.
Limitations of deep learning
Imagine a dude standing
on a table. How would a
computer know that if
you move the table you
also move the dude?
Likewise, how could a
computer know that it
only rains outside?
Or, as Marvin Minsky asks, how could a computer learn
that you can pull a box with a string but not push it?
No one knows how to explain
all of these situations to a
computer. There are just too
many variations.
A robot can learn through
experience, but it must be
able to efficiently generalize
that experience.
Overview
• About me and DeepGrammar (4 minutes)
• Introduction to Deep Learning for NLP
• Recurrent Neural Networks
• Deep Learning and Question Answering
• Limitations of Deep Learning for NLP
• How You Can Get Started
Overview
• About me and DeepGrammar (4 minutes)
• Introduction to Deep Learning for NLP
• Recurrent Neural Networks
• Deep Learning and Question Answering
• Limitations of Deep Learning for NLP
• How You Can Get Started
Best learning resources
Stanford class on deep learning for
NLP.http://cs224d.stanford.edu/syllabus.html
Hinton’s Coursera Course. Get it right from the horse’s mouth.
He explains things well.
https://www.coursera.org/course/neuralnets
Online textbook in preparation for deep learning from Yoshua
Bengio and friends. Clear and understandable.
http://www.iro.umontreal.ca/~bengioy/dlbook/
TensorFlow tutorials.
https://www.tensorflow.org/versions/r0.8/tutorials/index.html
TensorFlow has a seq2seq abstraction
data_utils handles the vocabulary.
seq2seq_model wraps the seq2seq function with bucketing.
translate trains the model.
Check out spaCy for simple text processing
https://nicschrading.com/project/Intro-to-NLP-with-spaCy/
It also does
word vectors.
Thanks for listening
Jonathan Mugan
@jmugan
www.deepgrammar.com
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
Data Hops
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 

Recently uploaded (20)

June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |Astute Business Solutions | Oracle Cloud Partner |
Astute Business Solutions | Oracle Cloud Partner |
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
SAP S/4 HANA sourcing and procurement to Public cloud
SAP S/4 HANA sourcing and procurement to Public cloudSAP S/4 HANA sourcing and procurement to Public cloud
SAP S/4 HANA sourcing and procurement to Public cloud
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 

Deep Learning for Natural Language Processing

  • 1. Deep Learning for Natural Language Processing Jonathan Mugan, PhD NLP Community Day June 4, 2015
  • 2. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 3. The importance of finding dumb mistakes
  • 4. The importance of finding dumb mistakes
  • 5. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 6. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 7. Deep learning enables sub-symbolic processing Symbolic systems can be brittle. I bought a car . <i> <bought> <a> <car> <.> You have to remember to represent “purchased” and “automobile.” What about “truck”? How do you encode the meaning of the entire sentence?
  • 8. Deep learning begins with a little function It all starts with a humble linear function called a perceptron. weight1 ✖ input1 weight2 ✖ input2 weight3 ✖ input3 sum ✚ Perceptron: If sum > threshold: output 1 Else: output 0 Example: The inputs can be your data. Question: Should I buy this car? 0.2 ✖ gas mileage 0.3 ✖ horsepower 0.5 ✖ num cup holders sum ✚ Perceptron: If sum > threshold: buy Else: walk
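The weighted-sum-and-threshold step above fits in a few lines of Python. The weights are the car-buying numbers from the slide; the feature values and threshold are made up for illustration:

```python
def perceptron(inputs, weights, threshold):
    """Weighted sum of the inputs; fire (1) if the sum clears the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

# Should I buy this car? Weights from the slide; the car's numbers are invented.
weights = [0.2, 0.3, 0.5]      # gas mileage, horsepower, num cup holders
car = [30.0, 150.0, 4.0]       # a hypothetical car
decision = perceptron(car, weights, threshold=20.0)
print("buy" if decision == 1 else "walk")  # prints "buy": 6 + 45 + 2 = 53 > 20
```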
  • 9. These little functions are chained together Deep learning comes from chaining a bunch of these little functions together. Chained together, they are called neurons. To create a neuron, we add a nonlinearity to the perceptron to get extra representational power when we chain them together. Our nonlinear perceptron is sometimes called a sigmoid. The value 𝑏 just offsets the sigmoid so the center is at 0. Plot of a sigmoid
  • 10. Single artificial neuron Output, or input to next neuron weight1 ✖ input1 weight2 ✖ input2 weight3 ✖ input3
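A minimal sketch of a single sigmoid neuron, assuming the standard logistic nonlinearity the slides describe (the weights and inputs here are placeholders):

```python
import math

def sigmoid(z):
    """Squash any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, b):
    """A single artificial neuron: sigmoid of the weighted sum plus a bias b.
    The bias b offsets the sigmoid so its center need not sit at 0."""
    z = sum(w * x for w, x in zip(weights, inputs)) + b
    return sigmoid(z)

print(sigmoid(0.0))  # 0.5: with b = 0 the sigmoid is centered at 0
```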
  • 11. Three-layered neural network A bunch of neurons chained together is called a neural network. Layer 2: hidden layer. Called this because it is neither input nor output. Layer 3: output. E.g., cat or not a cat; buy the car or walk. Layer 1: input data. Can be pixel values or the number of cup holders. This network has three layers. (Some edges lighter to avoid clutter.) [16.2, 17.3, −52.3, 11.1]
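A toy forward pass through such a three-layer network. Random weights stand in for learned ones; the input vector is the one shown on the slide:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: each neuron sees every input."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

random.seed(0)
x = [16.2, 17.3, -52.3, 11.1]                      # layer 1: input data
W_hid = [[random.uniform(-1, 1) for _ in x] for _ in range(3)]
W_out = [[random.uniform(-1, 1) for _ in range(3)]]
hidden = layer(x, W_hid, [0.0, 0.0, 0.0])          # layer 2: hidden layer
output = layer(hidden, W_out, [0.0])               # layer 3: e.g., P(cat)
print(output)
```

Training would adjust `W_hid` and `W_out` via backpropagation; here they are arbitrary, so the output is meaningless except to show the data flow.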
  • 12. Training with supervised learning Supervised Learning: You show the network a bunch of things with labels saying what they are, and you want the network to learn to classify future things without labels. Example: here are some pictures of cats. Tell me which of these other pictures are of cats. To train the network, we want to find the weights that correctly classify all of the training examples. You hope it will work on the testing examples. Done with an algorithm called Backpropagation [Rumelhart et al., 1986]. [16.2, 17.3, −52.3, 11.1]
  • 13. Training with supervised learning Supervised Learning: You show the network a bunch of things with labels saying what they are, and you want the network to learn to classify future things without labels. 𝑤 𝑊 𝑦 𝑥 [16.2, 17.3, −52.3, 11.1] Learning means finding the parameter values. This is why Google’s deep learning toolbox is called TensorFlow.
  • 14. Deep learning is adding more layers There is no exact definition of what constitutes “deep learning.” The number of weights (parameters) is generally large. Some networks have millions of parameters that are learned. (Some edges omitted to avoid clutter.) [16.2, 17.3, −52.3, 11.1]
  • 15. Recall our standard architecture Layer 2: hidden layer. Called this because it is neither input nor output. Layer 3: output. E.g., cat or not a cat; buy the car or walk. Layer 1: input data. Can be pixel values or the number of cup holders. Is this a cat? [16.2, 17.3, −52.3, 11.1]
  • 16. Neural nets with multiple outputs Okay, but what kind of cat is it? 𝑃(𝑥)𝑃(𝑥)𝑃(𝑥) 𝑃(𝑥) 𝑃(𝑥) Introduce a new node called a softmax. Probability a house cat Probability a lion Probability a panther Probability a bobcat Just normalize the output over the sum of the other outputs (using the exponential). Gives a probability. [16.2, 17.3, −52.3, 11.1]
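The normalization the softmax node performs can be written out directly. The raw scores below are invented for illustration:

```python
import math

def softmax(scores):
    """Exponentiate and normalize so the outputs form a probability distribution."""
    shifted = [s - max(scores) for s in scores]  # subtract max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for house cat, lion, panther, bobcat.
probs = softmax([2.0, 1.0, 0.1, -1.0])
print(probs)  # the largest score gets the largest probability; the list sums to 1
```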
  • 17. Learning word vectors 13.2, 5.4, −3.1 [−12.1, 13.1, 0.1] [7.2, 3.2,-1.9] the man ran From the sentence, “The man ran fast.” 𝑃(𝑥)𝑃(𝑥) 𝑃(𝑥) 𝑃(𝑥) Probability of “fast” Probability of “slow” Probability of “taco” Probability of “bobcat” Learns a vector for each word based on the “meaning” in the sentence by trying to predict the next word [Bengio et al., 2003]. These numbers updated along with the weights and become the vector representations of the words.
  • 18. Comparing vector and symbolic representations Vector representation taco = [17.32, 82.9, −4.6, 7.2] Symbolic representation taco = 𝑡𝑎𝑐𝑜 • Vectors have a similarity score. • A taco is not a burrito but similar. • Symbols can be the same or not. • A taco is just as different from a burrito as a Toyota. • Vectors have internal structure [Mikolov et al., 2013]. • Italy – Rome = France – Paris • King – Queen = Man – Woman • Symbols have no structure. • Symbols are arbitrarily assigned. • Meaning relative to other symbols. • Vectors are grounded in experience. • Meaning relative to predictions. • Ability to learn representations makes agents less brittle.
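The "vectors have a similarity score" point can be made concrete with cosine similarity. The word vectors below are invented for illustration, not taken from a trained model:

```python
import math

def cosine(u, v):
    """Cosine similarity: near 1.0 means same direction, near 0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Invented vectors: similar foods point in similar directions.
taco    = [17.3, 82.9, -4.6, 7.2]
burrito = [15.1, 80.2, -3.9, 6.8]
toyota  = [-40.0, 2.1, 55.0, -9.3]

print(cosine(taco, burrito))  # high: a taco is not a burrito, but it is similar
print(cosine(taco, toyota))   # low: with symbols, both pairs would be equally different
```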
  • 19. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 20. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 21. Encoding sentence meaning into a vector h0 The “The patient fell.”
  • 22. Encoding sentence meaning into a vector h0 The h1 patient “The patient fell.”
  • 23. Encoding sentence meaning into a vector h0 The h1 patient h2 fell “The patient fell.”
  • 24. Encoding sentence meaning into a vector Like a hidden Markov model, but doesn’t make the Markov assumption and benefits from a vector representation. h0 The h1 patient h2 fell h3 . “The patient fell.”
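A toy version of this encoder as a vanilla RNN, with random weights and word vectors standing in for trained ones. Each step folds one word into the running hidden state, and the final state encodes the whole sentence:

```python
import math
import random

def rnn_step(h, x, W_h, W_x):
    """One recurrence: h_t = tanh(W_h · h_{t-1} + W_x · x_t)."""
    return [math.tanh(sum(w * a for w, a in zip(W_h[i], h)) +
                      sum(w * a for w, a in zip(W_x[i], x)))
            for i in range(len(h))]

random.seed(1)
dim = 4
rand_vec = lambda n: [random.uniform(-0.5, 0.5) for _ in range(n)]
W_h = [rand_vec(dim) for _ in range(dim)]
W_x = [rand_vec(dim) for _ in range(dim)]
embed = {w: rand_vec(dim) for w in ["The", "patient", "fell", "."]}

h = [0.0] * dim                       # h0
for word in ["The", "patient", "fell", "."]:
    h = rnn_step(h, embed[word], W_h, W_x)
print(h)                              # one vector encoding "The patient fell."
```

Because the state at each step depends on the entire history, no Markov assumption is made, just as the slide notes.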
  • 25. Decoding sentence meaning Machine translation, or structure learning more generally. El h3
  • 26. Decoding sentence meaning Machine translation, or structure learning more generally. El h3 h4
  • 27. Decoding sentence meaning Machine translation, or structure learning more generally. El h3 paciente h4
  • 28. Decoding sentence meaning Machine translation, or structure learning more generally. El h3 paciente h4 cayó h5 . h5 [Cho et al., 2014] It keeps generating until it generates a stop symbol.
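The decoder's generate-until-stop loop can be sketched with a toy stand-in for the trained network. Here `next_word` is a hard-coded script; in a real model it would be a softmax over the vocabulary conditioned on the hidden state:

```python
def next_word(state):
    """Toy stand-in for the trained decoder: maps the current state to the next word."""
    script = {"<s>": "El", "El": "paciente", "paciente": "cayó",
              "cayó": ".", ".": "<stop>"}
    return script[state]

def decode(start="<s>", max_len=10):
    """Keep generating until a stop symbol appears, as the slide describes."""
    out, state = [], start
    for _ in range(max_len):
        word = next_word(state)
        if word == "<stop>":
            break
        out.append(word)
        state = word
    return out

print(decode())  # ['El', 'paciente', 'cayó', '.']
```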
  • 29. Generating image captions Convolutional neural network An h0 angry h1 sister h2 . h3 [Karpathy and Fei-Fei, 2015] [Vinyals et al., 2015]
  • 30. Image caption examples [Karpathy and Fei-Fei, 2015] http://cs.stanford.edu/people/karpathy/deepimagesent/ See:
  • 31. Attention [Bahdanau et al., 2014] El h3 paciente h4 cayó h5 . h5 h0 The h1 patient h2 fell h3 .
  • 32. RNNs and Structure Learning • These are sometimes called seq2seq models. • In addition to machine translation and generating captions for images, can be used to learn just about any kind of structure you’d want, as long as you have lots of training data.
  • 33. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 34. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 35. Deep learning and question answering RNNs answer questions. What is the translation of this phrase to French? What is the next word? Attention is useful for question answering. This can be generalized to which facts the learner should pay attention to when answering questions.
  • 36. Deep learning and question answering Bob went home. Tim went to the junkyard. Bob picked up the jar. Bob went to town. Where is the jar? A: town • Memory Networks [Weston et al., 2014] • Updates memory vectors based on a question and finds the best one to give the output. The office is north of the yard. The bath is north of the office. The yard is west of the kitchen. How do you go from the office to the kitchen? A: south, east • Neural Reasoner [Peng et al., 2015] • Encodes the question and facts in many layers, and the final layer is put through a function that gives the answer.
  • 37. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 38. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 39. Limitations of deep learning The encoded meaning is grounded with respect to other words. There is no linkage to the physical world. "ICubLugan01 Reaching". Licensed under CC BY-SA 3.0 via Wikipedia - https://en.wikipedia.org/wiki/File:ICubLugan01_Reaching.png#/media/File:ICubLugan01_Reaching.png The iCub http://www.icub.org/
  • 40. Limitations of deep learning Bob went home. Tim went to the junkyard. Bob picked up the jar. Bob went to town. Where is the jar? A: town Deep learning has no understanding of what it means for the jar to be in town. For example that it can’t also be at the junkyard. Or that it may be in Bob’s car, or still in his hands. The encoded meaning is grounded with respect to other words. There is no linkage to the physical world.
  • 41. Limitations of deep learning Imagine a dude standing on a table. How would a computer know that if you move the table you also move the dude? Likewise, how could a computer know that it only rains outside? Or, as Marvin Minsky asks, how could a computer learn that you can pull a box with a string but not push it?
  • 42. Limitations of deep learning Imagine a dude standing on a table. How would a computer know that if you move the table you also move the dude? Likewise, how could a computer know that it only rains outside? Or, as Marvin Minsky asks, how could a computer learn that you can pull a box with a string but not push it? No one knows how to explain all of these situations to a computer. There are just too many variations. A robot can learn through experience, but it must be able to efficiently generalize that experience.
  • 43. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 44. Overview • About me and DeepGrammar (4 minutes) • Introduction to Deep Learning for NLP • Recurrent Neural Networks • Deep Learning and Question Answering • Limitations of Deep Learning for NLP • How You Can Get Started
  • 45. Best learning resources Stanford class on deep learning for NLP. http://cs224d.stanford.edu/syllabus.html Hinton’s Coursera Course. Get it right from the horse’s mouth. He explains things well. https://www.coursera.org/course/neuralnets Online textbook in preparation for deep learning from Yoshua Bengio and friends. Clear and understandable. http://www.iro.umontreal.ca/~bengioy/dlbook/ TensorFlow tutorials. https://www.tensorflow.org/versions/r0.8/tutorials/index.html
  • 46. TensorFlow has a seq2seq abstraction data_utils handles the vocabulary. seq2seq_model wraps the seq2seq function with buckets. translate trains the model.
  • 47. Check out spaCy for simple text processing https://nicschrading.com/project/Intro-to-NLP-with-spaCy/ It also does word vectors. See:
  • 48. Thanks for listening Jonathan Mugan @jmugan www.deepgrammar.com