1
@graphific
Roelof Pieters
Deep Learning for Natural Language Processing: Word Embeddings
3 December 2015
KTH
www.csc.kth.se/~roelof/
roelof@kth.se
Language Understanding
2
Can we understand Language ?
1. Language is ambiguous:

Every sentence has many possible interpretations.
2. Language is productive:

We will always encounter new words or new
constructions
3. Language is culturally specific
Some of the challenges in Language Understanding:
3
Can we understand Language ?
1. Language is ambiguous:

Every sentence has many possible interpretations.
2. Language is productive:

We will always encounter new words or new
constructions
• plays well with others
VB ADV P NN
NN NN P DT
• fruit flies like a banana
NN NN VB DT NN
NN VB P DT NN
NN NN P DT NN
NN VB VB DT NN
• the students went to class
DT NN VB P NN
4
Some of the challenges in Language Understanding:
Can we understand Language ?
1. Language is ambiguous:

Every sentence has many possible interpretations.
2. Language is productive:

We will always encounter new words or new
constructions
5
Some of the challenges in Language Understanding:
[Karlgren 2014, NLP Sthlm Meetup]
6
Can we understand Language ?
1. Language is ambiguous:

Every sentence has many possible interpretations.
2. Language is productive:

We will always encounter new words or new
constructions
3. Language is culturally specific
Some of the challenges in Language Understanding:
7
ML: Traditional Approach
1. Gather as much LABELED data as you can get
2. Throw some algorithms at it (mainly put in an SVM and
keep it at that)
3. If you actually have tried more algos: Pick the best
4. Spend hours hand engineering some features / feature
selection / dimensionality reduction (PCA, SVD, etc)
5. Repeat…
For each new problem/question:
8
Machine Learning for NLP
Data
Classic Approach: Data is fed into a learning algorithm:
Learning 

Algorithm
9
Machine Learning for NLP
some of the (many) treebank datasets
source: http://www-nlp.stanford.edu/links/statnlp.html#Treebanks
!
10
Penn Treebank
That’s a lot of “manual” work:
11
• the students went to class
DT NN VB P NN
• plays well with others
VB ADV P NN
NN NN P DT
• fruit flies like a banana
NN NN VB DT NN
NN VB P DT NN
NN NN P DT NN
NN VB VB DT NN
With a lot of issues:
Penn Treebank
12
Machine Learning for NLP
Learning 

Algorithm
Data
“Features”
Prediction
Prediction/

Classifier
train set
test set
13
Machine Learning for NLP
Learning 

Algorithm
“Features”
Prediction
Prediction/

Classifier
train set
test set
14
One Model rules them all ?



DL approaches have been successfully applied to:
Deep Learning: Why for NLP ?
Automatic summarization Coreference resolution Discourse analysis
Machine translation Morphological segmentation Named entity recognition (NER)
Natural language generation
Natural language understanding
Optical character recognition (OCR)
Part-of-speech tagging
Parsing
Question answering
Relationship extraction
Sentence boundary disambiguation
Sentiment analysis
Speech recognition
Speech segmentation
Topic segmentation and recognition
Word segmentation
Word sense disambiguation
Information retrieval (IR)
Information extraction (IE)
Speech processing
15
Deep Learning: Why for NLP ?
16
• What is the meaning of a word?

(Lexical semantics)
• What is the meaning of a sentence?

([Compositional] semantics)
• What is the meaning of a longer piece of text?
(Discourse semantics)
Semantics: Meaning
18
• NLP treats words mainly (rule-based/statistical
approaches at least) as atomic symbols:

• or in vector space:

• also known as “one hot” representation.
• Its problem ?
Word Representation
Love Candy Store
[0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …]
Candy [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] AND
Store [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 …] = 0 !
19
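A minimal sketch (not from the slides) of the problem just described: one-hot vectors of different words are always orthogonal, so their dot product carries no similarity information.

```python
import numpy as np

vocab = ["love", "candy", "store"]          # toy vocabulary
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

# Distinct words are always orthogonal: the dot product ("AND" on the slide)
# is 0, so one-hot vectors encode identity but no notion of similarity.
print(one_hot["candy"] @ one_hot["store"])   # 0.0
print(one_hot["candy"] @ one_hot["candy"])   # 1.0
```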
Word Representation
20
• Structure corresponds to meaning:
Structure and Meaning
21
• Semantics
• Syntax
22
NLP: what can we work with?
• Language models define probability distributions
over (natural language) strings or sentences
• Joint and Conditional Probability
Language Model
23
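The standard factorisation behind these models (not spelled out on the slide): the joint probability of a sentence decomposes by the chain rule into conditional next-word probabilities, and an n-gram model truncates each condition to the previous k-1 words:

P(w_1, \dots, w_n) \;=\; \prod_{i=1}^{n} P(w_i \mid w_1, \dots, w_{i-1}) \;\approx\; \prod_{i=1}^{n} P(w_i \mid w_{i-k+1}, \dots, w_{i-1})

Neural language models instead learn these conditional probabilities from distributed word representations.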
• Language models define probability distributions
over (natural language) strings or sentences
Language Model
24
• Language models define probability distributions
over (natural language) strings or sentences
Language Model
25
Word senses
What is the meaning of words?
• Most words have many different senses:

dog = animal or sausage?
How are the meanings of different words related?
• - Specific relations between senses:

Animal is more general than dog.
• - Semantic fields:

money is related to bank
26
Word senses
Polysemy:
• A lexeme is polysemous if it has different related
senses
• bank = financial institution or building
Homonyms:
• Two lexemes are homonyms if their senses are
unrelated, but they happen to have the same spelling
and pronunciation
• bank = (financial) bank or (river) bank
27
Word senses: relations
Symmetric relations:
• Synonyms: couch/sofa

Two lemmas with the same sense
• Antonyms: cold/hot, rise/fall, in/out

Two lemmas with the opposite sense
Hierarchical relations:
• Hypernyms and hyponyms: pet/dog

The hyponym (dog) is more specific than the hypernym
(pet)
• Holonyms and meronyms: car/wheel

The meronym (wheel) is a part of the holonym (car)
28
Distributional representations
“You shall know a word by the company it keeps”

(J. R. Firth 1957)
One of the most successful ideas of modern
statistical NLP!
these words represent banking
• Hard (class based) clustering models
• Soft clustering models
29
Distributional hypothesis
He filled the wampimuk, passed it
around and we all drunk some
We found a little, hairy wampimuk
sleeping behind the tree
(McDonald & Ramscar 2001)
30
Distributional semantics
Landauer and Dumais (1997), Turney and Pantel (2010), …
31
Distributional semantics
Distributional meaning as co-occurrence vector:
32
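An illustrative sketch (not from the slides) of how such a co-occurrence vector can be built, reusing the wampimuk sentences above with a symmetric context window:

```python
from collections import Counter

corpus = [
    "he filled the wampimuk passed it around and we all drunk some".split(),
    "we found a little hairy wampimuk sleeping behind the tree".split(),
]

window = 2
cooc = Counter()
for sent in corpus:
    for i, word in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                cooc[(word, sent[j])] += 1

# The counts for "wampimuk" form its distributional (co-occurrence) vector:
print({ctx: n for (w, ctx), n in cooc.items() if w == "wampimuk"})
```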
Distributional representations
• Taking it further:
• Continuous word embeddings
• Combine vector space semantics with the
prediction of probabilistic models
• Words are represented as a dense vector:
Candy =
33
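As a hedged illustration of such dense vectors, a few lines with the gensim library (listed among the tools at the end of this deck); parameter names differ between gensim versions (vector_size was called size in older releases) and the corpus here is a toy:

```python
from gensim.models import Word2Vec

sentences = [
    "i love the candy store".split(),
    "the candy store sells sweets".split(),
    "i love sweets".split(),
]

model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)

print(model.wv["candy"])                        # a dense 50-dimensional vector
print(model.wv.similarity("candy", "sweets"))   # nonzero, unlike one-hot vectors
```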
Word Embeddings: Socher
Vector Space Model
adapted from Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA
In a perfect world:
34
Word Embeddings: Socher
Vector Space Model
adapted from Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA
In a perfect world:
the country of my birth
the place where I was born
35
• Can theoretically (given enough units) approximate
“any” function
• and fit to “any” kind of data
• Efficient for NLP: hidden layers can be used as word
lookup tables
• Dense distributed word vectors + efficient NN
training algorithms:
• Can scale to billions of words !
Why Neural Networks for NLP?
36
• Representation of words as continuous vectors has a
long history (Hinton et al. 1986; Rumelhart et al. 1986;
Elman 1990)
• First neural network language model: NNLM (Bengio
et al. 2001; Bengio et al. 2003) based on earlier ideas of
distributed representations for symbols (Hinton 1986)
How?
37
Word Embeddings: Socher
Vector Space Model
Figure (edited) from Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA
In a perfect world:
the country of my birth
the place where I was born ?
…
38
Compositionality
Principle of compositionality:
the “meaning (vector) of a
complex expression (sentence)
is determined by:
— Gottlob Frege 

(1848 - 1925)
- the meanings of its constituent
expressions (words) and
- the rules (grammar) used to
combine them”
39
• How do we handle the compositionality of language in
our models?
40
Compositionality
• How do we handle the compositionality of language in
our models?
• Recursion :

the same operator (same parameters) is
applied repeatedly on different components
41
Compositionality
• How do we handle the compositionality of language in
our models?
• Option 1: Recurrent Neural Networks (RNN)
42
RNN 1: Recurrent Neural Networks
• How do we handle the compositionality of language in
our models?
• Option 2: Recursive Neural Networks (also
sometimes called RNN)
43
RNN 2: Recursive Neural Networks
• achieved SOTA in 2011 on
Language Modeling (WSJ AR
task) (Mikolov et al.,
INTERSPEECH 2011):
• and again at ASRU 2011:
44
Recurrent Neural Networks
“Comparison to other LMs shows that RNN
LMs are state of the art by a large margin.
Improvements increase with more training data.”
“[ RNN LM trained on a] single core on 400M words in a few days,
with 1% absolute improvement in WER on state of the art setup”
Mikolov, T., Karafiát, M., Burget, L., Černocký, J.H., Khudanpur, S. (2011)

Recurrent neural network based language model
45
Recurrent Neural Networks
(simple recurrent 

neural network for LM)
input
hidden layer(s)
output layer
+ sigmoid activation function
+ softmax function:
Mikolov, T., Karafiát, M., Burget, L., Černocký, J.H., Khudanpur, S. (2011)

Recurrent neural network based language model
46
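A minimal numpy sketch of the forward pass the slide describes (matrix names are mine, not Mikolov's notation): a one-hot input word and the previous hidden state feed a sigmoid hidden layer, and a softmax output layer gives the next-word distribution.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

V, H = 1000, 100                    # vocabulary size, hidden-layer size
U = np.random.randn(H, V) * 0.01    # one-hot input word  -> hidden
W = np.random.randn(H, H) * 0.01    # previous hidden     -> hidden (recurrence)
O = np.random.randn(V, H) * 0.01    # hidden              -> next-word scores

def step(word_id, h_prev):
    """One time step of the simple recurrent LM sketched on the slide."""
    h = sigmoid(U[:, word_id] + W @ h_prev)   # sigmoid hidden layer
    y = softmax(O @ h)                        # softmax over the vocabulary
    return h, y

h = np.zeros(H)
for w in [12, 7, 42]:               # a toy sequence of word ids
    h, p_next = step(w, h)
```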
Recurrent Neural Networks
backpropagation through time
47
Recurrent Neural Networks
backpropagation through time
class based recurrent NN
[code (Mikolov’s RNNLM Toolkit) and more info: http://rnnlm.org/ ]
• Recursive Neural
Network for LM (Socher
et al. 2011; Socher
2014)
• achieved SOTA on new
Stanford Sentiment
Treebank dataset (but
comparing it to many
other models):
Recursive Neural Network
48
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
info & code: http://nlp.stanford.edu/sentiment/
Recursive Neural Tensor Network
49
code & info: http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks
Socher, R., Liu, C.C., Ng, A.Y., Manning, C.D. (2011)

Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Recursive Neural Tensor Network
50
• RNN (Socher et al.
2011a)
Recursive Neural Network
51
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
info & code: http://nlp.stanford.edu/sentiment/
• RNN (Socher et al.
2011a)
• Matrix-Vector RNN
(MV-RNN) (Socher et
al., 2012)
Recursive Neural Network
52
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
info & code: http://nlp.stanford.edu/sentiment/
• RNN (Socher et al.
2011a)
• Matrix-Vector RNN
(MV-RNN) (Socher et
al., 2012)
• Recursive Neural
Tensor Network (RNTN)
(Socher et al. 2013)
Recursive Neural Network
53
• negation detection:
Recursive Neural Network
54
Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
info & code: http://nlp.stanford.edu/sentiment/
Parse Tree
[parse-tree figure: NP and PP/IN nodes over the tags DT NN PRP$ NN]
Recurrent NN for Vector Space
55
Parse Tree
[parse-tree figure: the same parse tree, with word-level tags shown]
Compositionality
56
Recurrent NN: Compositionality
Recurrent NN for Vector Space
Parse Tree
[parse-tree figure: word vectors composed bottom-up into phrase (NP) vectors]
Compositionality
57
Recurrent NN: Compositionality
Recurrent NN for Vector Space
[parse-tree figure: phrase vectors composed up to NP (S / ROOT); the grammar supplies the “rules”, the vectors the “meanings”]
Compositionality
58
Recurrent NN: Compositionality
Recurrent NN for Vector Space
Vector Space + Word Embeddings: Socher
59
Recurrent NN: Compositionality
Recurrent NN for Vector Space
Vector Space + Word Embeddings: Socher
60
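A sketch of the composition step these slides illustrate, in the form commonly used for the plain recursive NN, p = tanh(W[c1; c2] + b): the same matrix W (the “rules”) is applied at every node to merge child vectors (the “meanings”) into a parent vector. The names and initialisation below are illustrative; the MV-RNN and RNTN discussed earlier use richer composition functions.

```python
import numpy as np

d = 50                                   # dimensionality of word/phrase vectors
W = np.random.randn(d, 2 * d) * 0.01     # shared composition matrix (the "rules")
b = np.zeros(d)

def compose(c1, c2):
    """Merge two child vectors into a parent vector; reused at every tree node."""
    return np.tanh(W @ np.concatenate([c1, c2]) + b)

# bottom-up over a tiny parse: ((the cat) sat) -> sentence vector
the, cat, sat = (np.random.randn(d) for _ in range(3))   # stand-in word vectors
np_phrase = compose(the, cat)
sentence = compose(np_phrase, sat)
```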
Recurrent NN for Vector Space
Word Embeddings: Turian (2010)
Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning
code & info: http://metaoptimize.com/projects/wordreprs/
61
Word Embeddings: Turian (2010)
Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning
code & info: http://metaoptimize.com/projects/wordreprs/
62
Word Embeddings: Collobert & Weston (2011)
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P. (2011).
Natural Language Processing (almost) from Scratch
63
Multi-embeddings: Stanford (2012)
Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng (2012)

Improving Word Representations via Global Context and Multiple Word Prototypes
64
Linguistic Regularities: Mikolov (2013)
code & info: https://code.google.com/p/word2vec/
Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations
65
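A hedged sketch of querying such linguistic regularities with gensim, assuming a pretrained vector file is available locally (the file name below is only illustrative):

```python
from gensim.models import KeyedVectors

# file name is illustrative; any word2vec-format vectors will do
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# vector offset: king - man + woman ≈ queen
print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```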
Word Embeddings for MT: Mikolov (2013)
Mikolov, T., Le, Q.V., Sutskever, I. (2013)

Exploiting Similarities among Languages for Machine Translation
66
Word Embeddings for MT: Kiros (2014)
67
Recursive Deep Models & Sentiment: Socher (2013)
Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C., Ng, A., Potts, C. (2013)

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank.
code & demo: http://nlp.stanford.edu/sentiment/index.html
68
Paragraph Vectors: Le & Mikolov (2014)
Le, Q., Mikolov, T. (2014) Distributed Representations of Sentences and Documents
69
• add context (sentence, paragraph, document) to word
vectors during training
!
Results on Stanford Sentiment 

Treebank dataset:
Paragraph Vectors: Dai et al. (2014)
70
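A hedged sketch of the paragraph-vector idea using gensim's Doc2Vec implementation (API details vary across gensim versions; the documents below are toys):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    TaggedDocument("this movie was wonderful".split(), [0]),
    TaggedDocument("a dull and boring film".split(), [1]),
]

model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)

print(model.dv[0])                                     # learned paragraph vector
print(model.infer_vector("a wonderful film".split()))  # vector for unseen text
```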
Paragraph Vectors: Dai et al. (2014)
71
Paragraph Vectors: Dai et al. (2014)
72
Global Vectors, GloVe: Stanford (2014)
Pennington, J., Socher, R., Manning, C.D. (2014).

GloVe: Global Vectors for Word Representation
code & demo: http://nlp.stanford.edu/projects/glove/
vs
results on the word analogy task
“similar accuracy”
73
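A hedged sketch of loading pretrained GloVe vectors with gensim by first converting the GloVe text format to word2vec format (file names are illustrative):

```python
from gensim.scripts.glove2word2vec import glove2word2vec
from gensim.models import KeyedVectors

glove2word2vec("glove.6B.100d.txt", "glove.6B.100d.w2v.txt")  # adds the header line
wv = KeyedVectors.load_word2vec_format("glove.6B.100d.w2v.txt")

print(wv.most_similar("frog", topn=5))
```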
Dependency-based Embeddings: Levy & Goldberg (2014)
Levy, O., Goldberg, Y. (2014). Dependency-Based Word Embeddings
code & demo: https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/
- Syntactic Dependency Context
Australian scientist discovers star with telescope
- Bag of Words (BoW) Context
[precision-recall plot: dependency-based contexts vs. bag-of-words contexts]
“Dependency-based
embeddings have more
functional
similarities”
74
• LSTMs
• Attention
Wanna Play ?
Recent breakthroughs
75
• LSTMs
• Attention
Wanna Play ?
Recent breakthroughs
76
Wanna Play ?
LSTM
77
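Since the slide only shows the cell diagram, here is a minimal numpy sketch of one step of a standard LSTM cell (weight names and initialisation are mine, purely illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d, h = 50, 100                         # input and hidden sizes (illustrative)
Wf, Wi, Wo, Wc = (np.random.randn(h, d + h) * 0.01 for _ in range(4))
bf = bi = bo = bc = np.zeros(h)

def lstm_step(x, h_prev, c_prev):
    """One step of a standard LSTM cell: gates decide what to forget,
    what to write into the cell state, and what to expose as output."""
    z = np.concatenate([x, h_prev])
    f = sigmoid(Wf @ z + bf)           # forget gate
    i = sigmoid(Wi @ z + bi)           # input gate
    o = sigmoid(Wo @ z + bo)           # output gate
    c = f * c_prev + i * np.tanh(Wc @ z + bc)
    return o * np.tanh(c), c
```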
• LSTMs
• Attention
Wanna Play ?
Recent breakthroughs
78
Attention
Gregor et al (2015) DRAW: A Recurrent Neural Network For Image
Generation (arxiv) (code)
• Question-Answering Systems (&Memory)
• Summarization
• Text Generation
• Dialogue Systems
• Image Captioning & other multimodal tasks
Wanna Play ?
Recent breakthroughs
80
• Question-Answering Systems (&Memory)
• Summarization
• Text Generation
• Dialogue Systems
• Image Captioning & other multimodal tasks
Wanna Play ?
Recent breakthroughs
81
Wanna Play ?
QA & Memory
82
• Memory Networks (Weston et al 2015)
• Dynamic Memory Network (Kumar et al 2015)
• Neural Turing Machine (Graves et al 2014)
Facebook
Metamind
DeepMind
Weston et al (2015) Memory Networks (arxiv)
QA & Memory
83
Iyyer et al. (2014) A Neural Network for Factoid Question Answering over
Paragraphs (paper)
Wanna Play ?
QA & Memory
84
• Memory Networks (Weston et al 2015)
• Dynamic Memory Network (Kumar et al 2015)
• Neural Turing Machine (Graves et al 2014)
Facebook
Metamind
DeepMind
Zaremba & Sutskever (2015) Learning to Execute (arxiv)
Wanna Play ?
QA & Memory
85
bAbI Dataset
• Question-Answering Systems (&Memory)
• Summarization
• Text Generation
• Dialogue Systems
• Image Captioning & other multimodal tasks
Wanna Play ?
Recent breakthroughs
86
Wanna Play ?
Text generation
87
Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural
Networks (blog)
Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural
Networks (blog)
• Question-Answering Systems (&Memory)
• Summarization
• Text Generation
• Dialogue Systems
• Image Captioning & other multimodal tasks
Wanna Play ?
Recent breakthroughs
91
Image-Text Embeddings
92
Socher et al (2013) Zero Shot Learning Through Cross-Modal Transfer (info)
Image-Captioning
• Andrej Karpathy, Li Fei-Fei, 2015.

Deep Visual-Semantic Alignments for Generating Image Descriptions (pdf) (info) (code)
• Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan , 2015. Show and Tell: A
Neural Image Caption Generator (arxiv)
• Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan
Salakhutdinov, Richard Zemel, Yoshua Bengio, Show, Attend and Tell: Neural Image
Caption Generation with Visual Attention (arxiv) (info) (code)
“A person riding a motorcycle on a dirt road.”???
Image-Captioning
“Two hockey players are fighting over the puck.”???
Image-Captioning
“A stop sign is flying in blue skies.”
“A herd of elephants flying in the blue skies.”
Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan
Salakhutdinov, 2015. Generating Images from Captions
with Attention (arxiv) (examples)
Image-Captioning
• TensorFlow: Recently released library by Google. 

http://tensorflow.org
• Theano - CPU/GPU symbolic expression compiler in python (from
LISA lab at University of Montreal). http://deeplearning.net/software/theano/
• Caffe - Computer Vision oriented Deep Learning framework:
caffe.berkeleyvision.org
• Torch - Matlab-like environment for state-of-the-art machine learning
algorithms in lua (from Ronan Collobert, Clement Farabet and Koray
Kavukcuoglu) http://torch.ch/
• more info: http://deeplearning.net/software_links/
Wanna Play ? General Deep Learning
97
• RNNLM (Mikolov)

http://rnnlm.org
• NB-SVM

https://github.com/mesnilgr/nbsvm
• Word2Vec (skipgrams/cbow)

https://code.google.com/p/word2vec/ (original)

http://radimrehurek.com/gensim/models/word2vec.html (python)
• GloVe

http://nlp.stanford.edu/projects/glove/ (original)

https://github.com/maciejkula/glove-python (python)
• Socher et al / Stanford RNN Sentiment code:

http://nlp.stanford.edu/sentiment/code.html
• Deep Learning without Magic Tutorial:

http://nlp.stanford.edu/courses/NAACL2013/
Wanna Play ? NLP
98
Questions?
roelof@kth.se
www.csc.kth.se/~roelof/
99
Code & Papers:
Collaborative Open Computer Science
.com
@graphific

More Related Content

What's hot

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - IntroductionChristian Perone
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity RecognitionTomer Lieber
 
BERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersBERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersLiangqun Lu
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsBhaskar Mitra
 
1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)WarNik Chow
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDevashish Shanker
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERTshaurya uppal
 
BERT Finetuning Webinar Presentation
BERT Finetuning Webinar PresentationBERT Finetuning Webinar Presentation
BERT Finetuning Webinar Presentationbhavesh_physics
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You NeedDaiki Tanaka
 
bag-of-words models
bag-of-words models bag-of-words models
bag-of-words models Xiaotao Zou
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language ModelsLeon Dohmen
 

What's hot (20)

Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
What is word2vec?
What is word2vec?What is word2vec?
What is word2vec?
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - Introduction
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity Recognition
 
BERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersBERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from Transformers
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 
1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)1909 BERT: why-and-how (CODE SEMINAR)
1909 BERT: why-and-how (CODE SEMINAR)
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Text Classification
Text ClassificationText Classification
Text Classification
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
 
BERT Finetuning Webinar Presentation
BERT Finetuning Webinar PresentationBERT Finetuning Webinar Presentation
BERT Finetuning Webinar Presentation
 
BERT introduction
BERT introductionBERT introduction
BERT introduction
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
 
Bert
BertBert
Bert
 
Bert.pptx
Bert.pptxBert.pptx
Bert.pptx
 
bag-of-words models
bag-of-words models bag-of-words models
bag-of-words models
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
And then there were ... Large Language Models
And then there were ... Large Language ModelsAnd then there were ... Large Language Models
And then there were ... Large Language Models
 

Similar to Deep Learning for Natural Language Processing: Word Embeddings

Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageRoelof Pieters
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddingsRoelof Pieters
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Saurabh Kaushik
 
CNN for NLP using text analysis by using deep learning
CNN for NLP using text analysis by using deep learningCNN for NLP using text analysis by using deep learning
CNN for NLP using text analysis by using deep learningKv Sagar
 
MixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptxMixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptxMariYam371004
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingTed Xiao
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processingpunedevscom
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...Digital Classicist Seminar Berlin
 
NLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.pptNLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.pptOlusolaTop
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Mustafa Jarrar
 
Deep Learning for Information Retrieval
Deep Learning for Information RetrievalDeep Learning for Information Retrieval
Deep Learning for Information RetrievalRoelof Pieters
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2Karthik Murugesan
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingParrotAI
 
Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...Ana Marasović
 
NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA DATASCIENCE
 
A Simple Explanation of XLNet
A Simple Explanation of XLNetA Simple Explanation of XLNet
A Simple Explanation of XLNetDomyoung Lee
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLPSatyam Saxena
 

Similar to Deep Learning for Natural Language Processing: Word Embeddings (20)

Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
 
Deep learning for nlp
Deep learning for nlpDeep learning for nlp
Deep learning for nlp
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
 
CNN for NLP using text analysis by using deep learning
CNN for NLP using text analysis by using deep learningCNN for NLP using text analysis by using deep learning
CNN for NLP using text analysis by using deep learning
 
MixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptxMixedLanguageProcessingTutorialEMNLP2019.pptx
MixedLanguageProcessingTutorialEMNLP2019.pptx
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
[DCSB] Gregory Crane, Stella Dee, Maryam Foradi, Monica Lent, Maria Moritz (U...
 
NLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.pptNLP introduced and in 47 slides Lecture 1.ppt
NLP introduced and in 47 slides Lecture 1.ppt
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
1 Introduction.ppt
1 Introduction.ppt1 Introduction.ppt
1 Introduction.ppt
 
Deep Learning for Information Retrieval
Deep Learning for Information RetrievalDeep Learning for Information Retrieval
Deep Learning for Information Retrieval
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
L1 nlp intro
L1 nlp introL1 nlp intro
L1 nlp intro
 
Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...Colloquium talk on modal sense classification using a convolutional neural ne...
Colloquium talk on modal sense classification using a convolutional neural ne...
 
NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2NOVA Data Science Meetup 1/19/2017 - Presentation 2
NOVA Data Science Meetup 1/19/2017 - Presentation 2
 
A Simple Explanation of XLNet
A Simple Explanation of XLNetA Simple Explanation of XLNet
A Simple Explanation of XLNet
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 

More from Roelof Pieters

Speculations in anthropology and tech for an uncertain future
Speculations in anthropology and tech for an uncertain futureSpeculations in anthropology and tech for an uncertain future
Speculations in anthropology and tech for an uncertain futureRoelof Pieters
 
AI assisted creativity
AI assisted creativity AI assisted creativity
AI assisted creativity Roelof Pieters
 
Creativity and AI: 
Deep Neural Nets "Going Wild"
Creativity and AI: 
Deep Neural Nets "Going Wild"Creativity and AI: 
Deep Neural Nets "Going Wild"
Creativity and AI: 
Deep Neural Nets "Going Wild"Roelof Pieters
 
Deep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with styleDeep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with styleRoelof Pieters
 
Building a Deep Learning (Dream) Machine
Building a Deep Learning (Dream) MachineBuilding a Deep Learning (Dream) Machine
Building a Deep Learning (Dream) MachineRoelof Pieters
 
Multi-modal embeddings: from discriminative to generative models and creative ai
Multi-modal embeddings: from discriminative to generative models and creative aiMulti-modal embeddings: from discriminative to generative models and creative ai
Multi-modal embeddings: from discriminative to generative models and creative aiRoelof Pieters
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsRoelof Pieters
 
Creative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadCreative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadRoelof Pieters
 
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsPython for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsRoelof Pieters
 
Explore Data: Data Science + Visualization
Explore Data: Data Science + VisualizationExplore Data: Data Science + Visualization
Explore Data: Data Science + VisualizationRoelof Pieters
 
Deep Learning as a Cat/Dog Detector
Deep Learning as a Cat/Dog DetectorDeep Learning as a Cat/Dog Detector
Deep Learning as a Cat/Dog DetectorRoelof Pieters
 
Graph, Data-science, and Deep Learning
Graph, Data-science, and Deep LearningGraph, Data-science, and Deep Learning
Graph, Data-science, and Deep LearningRoelof Pieters
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye viewRoelof Pieters
 
Learning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionaryLearning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionaryRoelof Pieters
 
Zero shot learning through cross-modal transfer
Zero shot learning through cross-modal transferZero shot learning through cross-modal transfer
Zero shot learning through cross-modal transferRoelof Pieters
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersRoelof Pieters
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsRoelof Pieters
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Roelof Pieters
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRoelof Pieters
 
Hackathon 2014 NLP Hack
Hackathon 2014 NLP HackHackathon 2014 NLP Hack
Hackathon 2014 NLP HackRoelof Pieters
 

More from Roelof Pieters (20)

Speculations in anthropology and tech for an uncertain future
Speculations in anthropology and tech for an uncertain futureSpeculations in anthropology and tech for an uncertain future
Speculations in anthropology and tech for an uncertain future
 
AI assisted creativity
AI assisted creativity AI assisted creativity
AI assisted creativity
 
Creativity and AI: 
Deep Neural Nets "Going Wild"
Creativity and AI: 
Deep Neural Nets "Going Wild"Creativity and AI: 
Deep Neural Nets "Going Wild"
Creativity and AI: 
Deep Neural Nets "Going Wild"
 
Deep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with styleDeep Neural Networks 
that talk (Back)… with style
Deep Neural Networks 
that talk (Back)… with style
 
Building a Deep Learning (Dream) Machine
Building a Deep Learning (Dream) MachineBuilding a Deep Learning (Dream) Machine
Building a Deep Learning (Dream) Machine
 
Multi-modal embeddings: from discriminative to generative models and creative ai
Multi-modal embeddings: from discriminative to generative models and creative aiMulti-modal embeddings: from discriminative to generative models and creative ai
Multi-modal embeddings: from discriminative to generative models and creative ai
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
 
Creative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadCreative AI & multimodality: looking ahead
Creative AI & multimodality: looking ahead
 
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsPython for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
 
Explore Data: Data Science + Visualization
Explore Data: Data Science + VisualizationExplore Data: Data Science + Visualization
Explore Data: Data Science + Visualization
 
Deep Learning as a Cat/Dog Detector
Deep Learning as a Cat/Dog DetectorDeep Learning as a Cat/Dog Detector
Deep Learning as a Cat/Dog Detector
 
Graph, Data-science, and Deep Learning
Graph, Data-science, and Deep LearningGraph, Data-science, and Deep Learning
Graph, Data-science, and Deep Learning
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye view
 
Learning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionaryLearning to understand phrases by embedding the dictionary
Learning to understand phrases by embedding the dictionary
 
Zero shot learning through cross-modal transfer
Zero shot learning through cross-modal transferZero shot learning through cross-modal transfer
Zero shot learning through cross-modal transfer
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ers
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Recommender Systems, Matrices and Graphs
Recommender Systems, Matrices and GraphsRecommender Systems, Matrices and Graphs
Recommender Systems, Matrices and Graphs
 
Hackathon 2014 NLP Hack
Hackathon 2014 NLP HackHackathon 2014 NLP Hack
Hackathon 2014 NLP Hack
 

Recently uploaded

Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Salam Al-Karadaghi
 
Motivation and Theory Maslow and Murray pdf
Motivation and Theory Maslow and Murray pdfMotivation and Theory Maslow and Murray pdf
Motivation and Theory Maslow and Murray pdfakankshagupta7348026
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...NETWAYS
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...NETWAYS
 
LANDMARKS AND MONUMENTS IN NIGERIA.pptx
LANDMARKS  AND MONUMENTS IN NIGERIA.pptxLANDMARKS  AND MONUMENTS IN NIGERIA.pptx
LANDMARKS AND MONUMENTS IN NIGERIA.pptxBasil Achie
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfhenrik385807
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesPooja Nehwal
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Delhi Call girls
 
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxGenesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxFamilyWorshipCenterD
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )Pooja Nehwal
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...NETWAYS
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Hasting Chen
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...NETWAYS
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@vikas rana
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...NETWAYS
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfhenrik385807
 
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
 
Motivation and Theory Maslow and Murray pdf
Motivation and Theory Maslow and Murray pdfMotivation and Theory Maslow and Murray pdf
Motivation and Theory Maslow and Murray pdf
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
 
LANDMARKS AND MONUMENTS IN NIGERIA.pptx
LANDMARKS  AND MONUMENTS IN NIGERIA.pptxLANDMARKS  AND MONUMENTS IN NIGERIA.pptx
LANDMARKS AND MONUMENTS IN NIGERIA.pptx
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
 
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxGenesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
 

Deep Learning for Natural Language Processing: Word Embeddings

  • 1. 1 @graphific Roelof Pieters Deep  Learning  for  Natural   Language  Processing:  Word   Embeddings 3  December  2015  
 KTH www.csc.kth.se/~roelof/ roelof@kth.se
  • 3. Can we understand Language ? 1. Language is ambiguous:
 Every sentence has many possible interpretations. 2. Language is productive:
 We will always encounter new words or new constructions 3. Language is culturally specific Some of the challenges in Language Understanding: 3
  • 4. Can we understand Language ? 1. Language is ambiguous:
 Every sentence has many possible interpretations. 2. Language is productive:
 We will always encounter new words or new constructions • plays well with others VB ADV P NN NN NN P DT • fruit flies like a banana NN NN VB DT NN NN VB P DT NN NN NN P DT NN NN VB VB DT NN • the students went to class DT NN VB P NN 4 Some of the challenges in Language Understanding:
  • 5. Can we understand Language ? 1. Language is ambiguous:
 Every sentence has many possible interpretations. 2. Language is productive:
 We will always encounter new words or new constructions 5 Some of the challenges in Language Understanding:
  • 6. [Karlgren 2014, NLP Sthlm Meetup]6
  • 7. Can we understand Language ? 1. Language is ambiguous:
 Every sentence has many possible interpretations. 2. Language is productive:
 We will always encounter new words or new constructions 3. Language is culturally specific Some of the challenges in Language Understanding: 7
  • 8. ML: Traditional Approach 1. Gather as much LABELED data as you can get 2. Throw some algorithms at it (mainly put in an SVM and keep it at that) 3. If you actually have tried more algos: Pick the best 4. Spend hours hand engineering some features / feature selection / dimensionality reduction (PCA, SVD, etc) 5. Repeat… For each new problem/question:: 8
  • 9. Machine Learning for NLP Data Classic Approach: Data is fed into a learning algorithm: Learning 
 Algorithm 9
  • 10. Machine Learning for NLP some of the (many) treebank datasets source: http://www-nlp.stanford.edu/links/statnlp.html#Treebanks ! 10
  • 11. Penn Treebank That’s a lot of “manual” work: 11
  • 12. • the students went to class DT NN VB P NN • plays well with others VB ADV P NN NN NN P DT • fruit flies like a banana NN NN VB DT NN NN VB P DT NN NN NN P DT NN NN VB VB DT NN With a lot of issues: Penn Treebank 12
  • 13. Machine Learning for NLP Learning 
 Algorithm Data “Features” Prediction Prediction/
 Classifier train set test set 13
  • 14. Machine Learning for NLP Learning 
 Algorithm “Features” Prediction Prediction/
 Classifier train set test set 14
  • 15. One Model rules them all ?
 
 DL approaches have been successfully applied to: Deep Learning: Why for NLP ? Automatic summarization Coreference resolution Discourse analysis Machine translation Morphological segmentation Named entity recognition (NER) Natural language generation Natural language understanding Optical character recognition (OCR) Part-of-speech tagging Parsing Question answering Relationship extraction sentence boundary disambiguation Sentiment analysis Speech recognition Speech segmentation Topic segmentation and recognition Word segmentation Word sense disambiguation Information retrieval (IR) Information extraction (IE) Speech processing 15
  • 16. Deep Learning: Why for NLP ? 16
  • 17.
  • 18. • What is the meaning of a word?
 (Lexical semantics) • What is the meaning of a sentence?
 ([Compositional] semantics) • What is the meaning of a longer piece of text? (Discourse semantics) Semantics: Meaning 18
  • 19. • NLP treats words mainly (rule-based/statistical approaches at least) as atomic symbols:
 • or in vector space:
 • also known as “one hot” representation. • Its problem ? Word Representation Love Candy Store [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] Candy [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] AND Store [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 …] = 0 ! 19
  • 21. • Structure corresponds to meaning: Structure and Meaning 21
  • 22. • Semantics • Syntax 22 NLP: what can we work with?
  • 23. • Language models define probability distributions over (natural language) strings or sentences • Joint and Conditional Probability Language Model 23
  • 24. • Language models define probability distributions over (natural language) strings or sentences Language Model 24
  • 25. • Language models define probability distributions over (natural language) strings or sentences Language Model 25
  • 26. Word senses What is the meaning of words? • Most words have many different senses:
 dog = animal or sausage? How are the meanings of different words related? • - Specific relations between senses:
 Animal is more general than dog. • - Semantic fields:
 money is related to bank 26
  • 27. Word senses Polysemy: • A lexeme is polysemous if it has different related senses • bank = financial institution or building Homonyms: • Two lexemes are homonyms if their senses are unrelated, but they happen to have the same spelling and pronunciation • bank = (financial) bank or (river) bank 27
  • 28. Word senses: relations Symmetric relations: • Synonyms: couch/sofa
 Two lemmas with the same sense • Antonyms: cold/hot, rise/fall, in/out
 Two lemmas with the opposite sense Hierarchical relations: • Hypernyms and hyponyms: pet/dog
 The hyponym (dog) is more specific than the hypernym (pet) • Holonyms and meronyms: car/wheel
 The meronym (wheel) is a part of the holonym (car) 28
  • 29. Distributional representations “You shall know a word by the company it keeps”
 (J. R. Firth 1957) One of the most successful ideas of modern statistical NLP! these words represent banking • Hard (class based) clustering models • Soft clustering models 29
  • 30. Distributional hypothesis He filled the wampimuk, passed it around and we all drunk some We found a little, hairy wampimuk sleeping behind the tree (McDonald & Ramscar 2001) 30
  • 31. Distributional semantics Landauer and Dumais (1997), Turney and Pantel (2010), … 31
  • 32. Distributional semantics Distributional meaning as co-occurrence vector: 32
  • 33. Distributional representations • Taking it further: • Continuous word embeddings • Combine vector space semantics with the prediction of probabilistic models • Words are represented as a dense vector: Candy = 33
  • 34. Word Embeddings: SocherVector Space Model adapted rom Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA In a perfect world: 34
  • 35. Word Embeddings: SocherVector Space Model adapted rom Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA In a perfect world: the country of my birth the place where I was born 35
  • 36. • Can theoretically (given enough units) approximate “any” function • and fit to “any” kind of data • Efficient for NLP: hidden layers can be used as word lookup tables • Dense distributed word vectors + efficient NN training algorithms: • Can scale to billions of words ! Why Neural Networks for NLP? 36
  • 37. • Representation of words as continuous vectors has a long history (Hinton et al. 1986; Rumelhart et al. 1986; Elman 1990) • First neural network language model: NNLM (Bengio et al. 2001; Bengio et al. 2003) based on earlier ideas of distributed representations for symbols (Hinton 1986) How? 37
 • 38. Word Embeddings: Socher Vector Space Model Figure (edited) from Bengio, “Representation Learning and Deep Learning”, July 2012, UCLA In a perfect world: the country of my birth the place where I was born ? … 38
 • 39. Compositionality Principle of compositionality: the “meaning (vector) of a complex expression (sentence) is determined by: - the meanings of its constituent expressions (words) and - the rules (grammar) used to combine them” (Gottlob Frege, 1848-1925) 39
  • 40. • How do we handle the compositionality of language in our models? 40 Compositionality
  • 41. • How do we handle the compositionality of language in our models? • Recursion :
 the same operator (same parameters) is applied repeatedly on different components 41 Compositionality
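A minimal sketch of that idea, assuming random untrained parameters and a hand-specified parse: the same composition function (one weight matrix W) is applied at every node of the tree.

import numpy as np

d = 4
rng = np.random.default_rng(0)
W = rng.standard_normal((d, 2 * d))  # one set of parameters, reused at every node
b = np.zeros(d)

def compose(left, right):
    # the same operator (W, b) merges any two children into a parent vector
    return np.tanh(W @ np.concatenate([left, right]) + b)

# hypothetical word vectors for "the movie was great"
the, movie, was, great = (rng.standard_normal(d) for _ in range(4))

# compose bottom-up along a hand-specified parse: ((the movie) (was great))
sentence_vec = compose(compose(the, movie), compose(was, great))
print(sentence_vec)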
  • 42. • How do we handle the compositionality of language in our models? • Option 1: Recurrent Neural Networks (RNN) 42 RNN 1: Recurrent Neural Networks
  • 43. • How do we handle the compositionality of language in our models? • Option 2: Recursive Neural Networks (also sometimes called RNN) 43 RNN 2: Recursive Neural Networks
 • 44. • achieved SOTA in 2011 on Language Modeling (WSJ ASR task) (Mikolov et al., INTERSPEECH 2011): • and again at ASRU 2011: 44 Recurrent Neural Networks “Comparison to other LMs shows that RNN LMs are state of the art by a large margin. Improvements increase with more training data.” “[RNN LM trained on a] single core on 400M words in a few days, with 1% absolute improvement in WER on state of the art setup” Mikolov, T., Karafiat, M., Burget, L., Cernocky, J.H., Khudanpur, S. (2011)
 Recurrent neural network based language model
  • 45. 45 Recurrent Neural Networks (simple recurrent 
neural network for LM) input hidden layer(s) output layer + sigmoid activation function + softmax function: Mikolov, T., Karafiat, M., Burget, L., Cernocky, J.H., Khudanpur, S. (2011)
 Recurrent neural network based language model
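A rough numpy sketch of a single step of such a simple recurrent LM (sizes and weights are untrained placeholders): the hidden state is computed from the current word and the previous hidden state through a sigmoid, and a softmax over the output layer gives a distribution over the next word.

import numpy as np

V, H = 10, 8                       # toy vocabulary and hidden-layer sizes
rng = np.random.default_rng(1)
U = rng.standard_normal((H, V))    # input -> hidden
Wh = rng.standard_normal((H, H))   # hidden -> hidden (the recurrence)
O = rng.standard_normal((V, H))    # hidden -> output

def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))
def softmax(x): e = np.exp(x - x.max()); return e / e.sum()

def step(word_id, h_prev):
    x = np.zeros(V); x[word_id] = 1.0   # 1-of-V input encoding
    h = sigmoid(U @ x + Wh @ h_prev)    # new hidden state
    y = softmax(O @ h)                  # distribution over the next word
    return h, y

h = np.zeros(H)
for w in [3, 7, 1]:                     # a toy word-id sequence
    h, y = step(w, h)
print(y.sum())                          # 1.0 -- a proper probability distribution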
 • 47. 47 Recurrent Neural Networks backpropagation through time; class-based recurrent NN [code (Mikolov’s RNNLM Toolkit) and more info: http://rnnlm.org/ ]
 • 48. • Recursive Neural Network for LM (Socher et al. 2011; Socher 2014) • achieved SOTA on the new Stanford Sentiment Treebank dataset (comparing against many other models): Recursive Neural Network 48 Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)
 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank info & code: http://nlp.stanford.edu/sentiment/
 • 49. Recursive Neural Tensor Network 49 code & info: http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks Socher, R., Liu, C.C., Ng, A.Y., Manning, C.D. (2011) 
 Parsing Natural Scenes and Natural Language with Recursive Neural Networks
 • 51. • RNN (Socher et al. 2011a) Recursive Neural Network 51 Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)
 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank info & code: http://nlp.stanford.edu/sentiment/
 • 52. • RNN (Socher et al. 2011a) • Matrix-Vector RNN (MV-RNN) (Socher et al., 2012) Recursive Neural Network 52 Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)
 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank info & code: http://nlp.stanford.edu/sentiment/
  • 53. • RNN (Socher et al. 2011a) • Matrix-Vector RNN (MV-RNN) (Socher et al., 2012) • Recursive Neural Tensor Network (RNTN) (Socher et al. 2013) Recursive Neural Network 53
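As a rough sketch of what distinguishes the RNTN: the two child vectors are combined not only through a matrix but also through a tensor, so that dimensions of the children can interact multiplicatively. The code below follows the composition p = tanh([a;b]^T V [a;b] + W [a;b]) with made-up, untrained parameters.

import numpy as np

d = 4
rng = np.random.default_rng(2)
W = rng.standard_normal((d, 2 * d))         # standard recursive-NN part
V = rng.standard_normal((d, 2 * d, 2 * d))  # tensor: one 2d x 2d slice per output dimension

def rntn_compose(a, b):
    ab = np.concatenate([a, b])
    tensor_part = np.array([ab @ V[k] @ ab for k in range(d)])
    return np.tanh(tensor_part + W @ ab)

a, b = rng.standard_normal(d), rng.standard_normal(d)
print(rntn_compose(a, b))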
 • 54. • negation detection: Recursive Neural Network 54 Socher, R., Perelygin, A., Wu, J.Y., Chuang, J., Manning, C.D., Ng, A.Y., Potts, C. (2013)
 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank info & code: http://nlp.stanford.edu/sentiment/
 • 55. Parse Tree (figure: parse tree with node labels NP, PP/IN, NP, DT, NN, PRP$, NN) Recurrent NN for Vector Space 55
 • 56. Parse Tree + Compositionality (figure: the same parse tree, with word vectors composed at each node) 56 Recurrent NN: Compositionality / Recurrent NN for Vector Space
 • 57. Compositionality (figure: composition continues further up the tree) 57 Recurrent NN: Compositionality / Recurrent NN for Vector Space
 • 58. Compositionality (figure: the full parse tree up to S / ROOT, annotated with “rules” and “meanings”) 58 Recurrent NN: Compositionality / Recurrent NN for Vector Space
 • 59. Vector Space + Word Embeddings: Socher 59 Recurrent NN: Compositionality / Recurrent NN for Vector Space
  • 60. Vector Space + Word Embeddings: Socher 60 Recurrent NN for Vector Space
  • 61. Word Embeddings: Turian (2010) Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning code & info: http://metaoptimize.com/projects/wordreprs/61
  • 62. Word Embeddings: Turian (2010) Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning code & info: http://metaoptimize.com/projects/wordreprs/ 62
  • 63. Word Embeddings: Collobert & Weston (2011) Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P. (2011) . Natural Language Processing (almost) from Scratch 63
  • 64. Multi-embeddings: Stanford (2012) Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng (2012)
 Improving Word Representations via Global Context and Multiple Word Prototypes 64
  • 65. Linguistic Regularities: Mikolov (2013) code & info: https://code.google.com/p/word2vec/ Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations 65
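A small sketch of these regularities with gensim (assuming gensim is installed and the pretrained GoogleNews vectors have been downloaded from the word2vec page above):

from gensim.models import KeyedVectors

# assumes GoogleNews-vectors-negative300.bin was downloaded from the word2vec page
model = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

# the famous regularity: vec(king) - vec(man) + vec(woman) is closest to vec(queen)
print(model.most_similar(positive=["king", "woman"], negative=["man"], topn=3))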
 • 66. Word Embeddings for MT: Mikolov (2013) Mikolov, T., Le, Q.V., Sutskever, I. (2013)
 Exploiting Similarities among Languages for Machine Translation 66
  • 67. Word Embeddings for MT: Kiros (2014) 67
 • 68. Recursive Deep Models & Sentiment: Socher (2013) Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C., Ng, A., Potts, C. (2013) 
 Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. code & demo: http://nlp.stanford.edu/sentiment/index.html 68
 • 69. Paragraph Vectors: Le & Mikolov (2014) Le, Q., Mikolov, T. (2014) Distributed Representations of Sentences and Documents 69 • add context (sentence, paragraph, document) to word vectors during training! Results on the Stanford Sentiment 
 Treebank dataset:
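A minimal gensim Doc2Vec (paragraph vectors) sketch of the idea, on made-up toy documents and assuming a recent gensim (4.x) is installed:

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# toy "documents"; real paragraph vectors are trained on large corpora
docs = ["the movie was great fun",
        "a dull and boring film",
        "an absolutely wonderful picture"]
tagged = [TaggedDocument(words=d.split(), tags=[i]) for i, d in enumerate(docs)]

model = Doc2Vec(tagged, vector_size=20, min_count=1, epochs=50)

# each document now has its own dense vector, alongside the word vectors
print(model.dv[0])
print(model.infer_vector("a wonderful fun movie".split()))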
  • 70. Paragraph Vectors: Dai et al. (2014) 70
  • 71. Paragraph Vectors: Dai et al. (2014) 71
  • 72. Paragraph Vectors: Dai et al. (2014) 72
 • 73. Global Vectors, GloVe: Stanford (2014) Pennington, J., Socher, R., Manning, C.D. (2014).
GloVe: Global Vectors for Word Representation code & demo: http://nlp.stanford.edu/projects/glove/ GloVe vs. word2vec: results on the word analogy task show “similar accuracy” 73
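The project page above distributes plain-text vector files (e.g. glove.6B.50d.txt); a small sketch for loading them and comparing words by cosine similarity:

import numpy as np

# assumes glove.6B.50d.txt has been downloaded from the GloVe project page above
vectors = {}
with open("glove.6B.50d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = np.array(parts[1:], dtype=float)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vectors["king"], vectors["queen"]))   # relatively high
print(cosine(vectors["king"], vectors["banana"]))  # much lower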
 • 74. Dependency-based Embeddings: Levy & Goldberg (2014) Levy, O., Goldberg, Y. (2014). Dependency-Based Word Embeddings code & demo: https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/ - Syntactic Dependency Context: Australian scientist discovers star with telescope - Bag of Words (BoW) Context (figure: precision-recall curves) “Dependency-based embeddings have more functional similarities” 74
 • 75. • LSTMs • Attention Wanna Play ? Recent breakthroughs 75
 • 76. • LSTMs • Attention Wanna Play ? Recent breakthroughs 76
 • 78. • LSTMs • Attention Wanna Play ? Recent breakthroughs 78
  • 79. Attention Gregor et al (2015) DRAW: A Recurrent Neural Network For Image Generation (arxiv) (code)
  • 80. • Question-Answering Systems (&Memory) • Summarization • Text Generation • Dialogue Systems • Image Captioning & other multimodal tasks Wanna Play ? Recent breakthroughs 80
  • 81. • Question-Answering Systems (&Memory) • Summarization • Text Generation • Dialogue Systems • Image Captioning & other multimodal tasks Wanna Play ? Recent breakthroughs 81
  • 82. Wanna Play ? QA & Memory 82 • Memory Networks (Weston et al 2015) • Dynamic Memory Network (Kumar et al 2015) • Neural Turing Machine (Graves et al 2014) Facebook Metamind DeepMind Weston et al (2015) Memory Networks (arxiv)
 • 83. QA & Memory 83 Iyyer et al. (2014) A Neural Network for Factoid Question Answering over Paragraphs (paper)
  • 84. Wanna Play ? QA & Memory 84 • Memory Networks (Weston et al 2015) • Dynamic Memory Network (Kumar et al 2015) • Neural Turing Machine (Graves et al 2014) Facebook Metamind DeepMind Zaremba & Sutskever (2015) Learning to Execute (arxiv)
 • 85. Wanna Play ? QA & Memory 85 bAbI Dataset
  • 86. • Question-Answering Systems (&Memory) • Summarization • Text Generation • Dialogue Systems • Image Captioning & other multimodal tasks Wanna Play ? Recent breakthroughs 86
  • 87. Wanna Play ? Text generation 87 Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
  • 88.
  • 89.
  • 90. Karpathy (2015), The Unreasonable Effectiveness of Recurrent Neural Networks (blog)
  • 91. • Question-Answering Systems (&Memory) • Summarization • Text Generation • Dialogue Systems • Image Captioning & other multimodal tasks Wanna Play ? Recent breakthroughs 91
  • 92. Image-Text Embeddings 92 Socher et al (2013) Zero Shot Learning Through Cross-Modal Transfer (info)
 • 93. Image-Captioning • Andrej Karpathy, Li Fei-Fei, 2015. 
Deep Visual-Semantic Alignments for Generating Image Descriptions (pdf) (info) (code) • Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, 2015. Show and Tell: A Neural Image Caption Generator (arxiv) • Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio, 2015. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (arxiv) (info) (code)
  • 94. “A person riding a motorcycle on a dirt road.”??? Image-Captioning
  • 95. “Two hockey players are fighting over the puck.”??? Image-Captioning
  • 96. “A stop sign is flying in blue skies.” “A herd of elephants flying in the blue skies.” Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov, 2015. Generating Images from Captions with Attention (arxiv) (examples) Image-Captioning
  • 97. • TensorFlow: Recently released library by Google. 
http://tensorflow.org • Theano - CPU/GPU symbolic expression compiler in Python (from the LISA lab at the University of Montreal). http://deeplearning.net/software/theano/ • Caffe - Computer-Vision-oriented Deep Learning framework: caffe.berkeleyvision.org • Torch - Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/ • more info: http://deeplearning.net/software_links/ Wanna Play ? General Deep Learning 97
  • 98. • RNNLM (Mikolov)
 http://rnnlm.org • NB-SVM
 https://github.com/mesnilgr/nbsvm • Word2Vec (skipgrams/cbow)
 https://code.google.com/p/word2vec/ (original)
 http://radimrehurek.com/gensim/models/word2vec.html (python) • GloVe
 http://nlp.stanford.edu/projects/glove/ (original)
 https://github.com/maciejkula/glove-python (python) • Socher et al / Stanford RNN Sentiment code:
 http://nlp.stanford.edu/sentiment/code.html • Deep Learning without Magic Tutorial:
 http://nlp.stanford.edu/courses/NAACL2013/ Wanna Play ? NLP 98