[EMNLP] What is GloVe? Part III
An introduction to unsupervised learning of word embeddings from
co-occurrence matrices.
Brendan Whitaker
May 27, 2018 · 5 min read
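$$J = \sum_{i,j=1}^{V} f(X_{ij})\left(w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij}\right)^2$$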
The final GloVe model. We haven’t defined a lot of the variables seen here, but worry not, we’ll get there.
If you’re just joining us, please feel free to read Parts I and II first, as we’re picking up
right where they left off:
[EMNLP] What is GloVe? Part I
An introduction to unsupervised learning of word
embeddings from co-occurrence matrices.
towardsdatascience.com
[EMNLP] What is GloVe? Part II
An introduction to unsupervised learning of word
embeddings from co-occurrence matrices.
towardsdatascience.com
In this article, we’ll discuss one of the newer methods of creating vector space models
of word semantics, more commonly known as word embeddings. The original paper by
J. Pennington, R. Socher, and C. Manning is available here:
http://www.aclweb.org/anthology/D14-1162. This method combines elements from
the two main word embedding models that existed when GloVe, short for “Global
Vectors [for word representation],” was proposed: global matrix factorization and local
context window methods. In Part I, we compared these two approaches. In Part II,
we began walking through the authors’ development of the GloVe model. Now
we’ll summarize the rest of the derivation.
. . .
Recall that we’re attempting to design a function which maps word vectors to ratios of
co-occurrence probabilities. We have two word vectors which we’d like to discriminate
between, and a context word vector which is used to this effect. Our naive model
simply maps (using magic or whatever) the vectors right to these probabilities.
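In symbols (writing $w_i$ and $w_j$ for the two word vectors we’re comparing and $v_k$ for the context word vector), the naive model is

$$F(w_i, w_j, v_k) = \frac{P_{ik}}{P_{jk}}.$$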
Unfortunately, a plethora of different functions satisfy these constraints, so we must
find one that best reflects the relationship we’re trying to model, namely similarity of
meaning.
The authors therefore use the vector difference of the two words i and j we’re
comparing as an input, instead of each word individually, since our output is a
ratio of their co-occurrence probabilities with the context word. So now we have
two arguments: the context word vector, and the vector difference of the two words
we’re comparing. Since the authors wish to take scalar values to scalar values (note that
the ratio of probabilities is a scalar), they take the dot product of these two arguments,
and so the next iteration of our model looks like this:
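$$F\big((w_i - w_j)^\top v_k\big) = \frac{P_{ik}}{P_{jk}}$$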
The next issue we’ll resolve is the labeling of certain words as “context words.”
The problem is that the distinction between ordinary word vectors and context word
vectors is in reality arbitrary: there is no distinction, and we should be able to
interchange the two roles without causing problems. The way we work around this is by
requiring that F be a homomorphism from the additive group of real numbers to the
multiplicative group of positive real numbers.
Recall from elementary group theory that a homomorphism is a well-defined mapping
which preserves the group operation. So we need the following condition to be
satisfied:
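$$F(a + b) = F(a)\,F(b) \quad \text{for all } a, b \in \mathbb{R}$$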
Note the addition within the domain of the function and the multiplication in the target
space. Now recall that we said our function’s domain is now scalar, specifically all real
numbers. That means that any input must be the dot product of two word vectors, as
opposed to a single word vector, since a single word vector is not a scalar. So we can
think of a and b in the above condition as dot products of two arbitrary pairs of word
vectors: a = w_a · v_a and b = w_b · v_b.
Letting V be the vector space where all our word vectors live, we can then rewrite the
condition:
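$$F\big(w_a^\top v_a + w_b^\top v_b\big) = F\big(w_a^\top v_a\big)\,F\big(w_b^\top v_b\big) \quad \text{for all } w_a, v_a, w_b, v_b \in V$$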
But now remember that we want to define everything in terms of vector differences. So
instead of adding in the domain, we’ll add the additive inverse, i.e. subtract. And since
we want this to be a homomorphism, this will correspond to multiplying by the
multiplicative inverse in the target space (remember the target space is the group of
positive real numbers under multiplication). And this is just division. So we have
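$$F\big(w_a^\top v_a - w_b^\top v_b\big) = \frac{F\big(w_a^\top v_a\big)}{F\big(w_b^\top v_b\big)}$$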
With a bit of relabeling to reflect that the context word vectors v_a and v_b are equal
(set w_a = w_i, w_b = w_j, and v_a = v_b = v_k), and using the fact that the dot product
distributes over vector subtraction, we arrive at the condition the authors give:
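$$F\big((w_i - w_j)^\top v_k\big) = \frac{F\big(w_i^\top v_k\big)}{F\big(w_j^\top v_k\big)}$$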
Now, setting this equation equal to the scalar input model we derived above, we have
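$$\frac{F\big(w_i^\top v_k\big)}{F\big(w_j^\top v_k\big)} = \frac{P_{ik}}{P_{jk}},$$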
and we make the following natural definition for the quantities we’re dividing on the
left:
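$$F\big(w_i^\top v_k\big) = P_{ik} = \frac{X_{ik}}{X_i}.$$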
Recall that in Part II we defined X_{ik} to be the number of times word k appears in the
context of word i, and X_i to be the number of times any word appears in the context of
word i.
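To make these counts concrete, here’s a minimal Python sketch (an illustration of the idea, not the authors’ released code) that tallies a co-occurrence matrix over a toy tokenized corpus with a symmetric window and computes P_{ik} = X_{ik}/X_i. Note that the actual GloVe implementation additionally weights each count by the inverse of its distance from the center word, which we omit here for clarity:

```python
from collections import defaultdict

def cooccurrence_counts(corpus, window=2):
    """Tally X[i][k]: how many times word k appears within `window`
    positions of word i, over a list of tokenized sentences."""
    X = defaultdict(lambda: defaultdict(float))
    for tokens in corpus:
        for pos, word in enumerate(tokens):
            lo = max(0, pos - window)
            hi = min(len(tokens), pos + window + 1)
            for ctx in range(lo, hi):
                if ctx != pos:
                    X[word][tokens[ctx]] += 1.0
    return X

def cooccurrence_prob(X, i, k):
    """P_ik = X_ik / X_i: the probability that word k appears in the
    context of word i, where X_i sums X_ik over all context words."""
    X_i = sum(X[i].values())
    return X[i][k] / X_i if X_i else 0.0

corpus = [["ice", "is", "cold", "and", "solid"],
          ["steam", "is", "hot", "and", "gaseous"]]
X = cooccurrence_counts(corpus, window=2)
print(cooccurrence_prob(X, "ice", "cold"))  # 0.5: "cold" is 1 of the 2 context words seen with "ice"
```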
Now, what remains is to find a function F which behaves like the arbitrary one we’ve
described above. A nice place to start would be something that gives us a natural
homomorphism between the additive and multiplicative real numbers, i.e. a function
that turns addition into multiplication, or vice versa, as long as we have an inverse
where we need it. So what might work?
We’ll answer that question in Part IV 😊 Thanks so much for reading!
[EMNLP] What is GloVe? Part IV
An introduction to unsupervised learning of word
embeddings from co-occurrence matrices.
towardsdatascience.com
Please check out the source paper!
GloVe: Global Vectors for Word Representation
Jeffrey Pennington, Richard Socher, Christopher D. Manning
Computer Science Department, Stanford University, Stanford, CA 94305
jpennin@stanford.edu, richard@socher.org, manning@stanford.edu
Abstract
Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.
1 Introduction
Semantic vector space models of language represent each word with a real-valued vector. These vectors can be used as features in a variety of applications, such as information retrieval (Manning et al., 2008), document classification (Sebastiani, 2002), question answering (Tellex et al., 2003), named entity recognition (Turian et al., 2010), and parsing (Socher et al., 2013).

Most word vector methods rely on the distance or angle between pairs of word vectors as the primary method for evaluating the intrinsic quality of such a set of word representations. Recently, Mikolov et al. (2013c) introduced a new evaluation scheme based on word analogies that probes the finer structure of the word vector space by examining not the scalar distance between word vectors, but rather their various dimensions of difference. For example, the analogy “king is to queen as man is to woman” should be encoded in the vector space by the vector equation king − queen = man − woman. This evaluation scheme favors models that produce dimensions of meaning, thereby capturing the multi-clustering idea of distributed representations (Bengio, 2009).

The two main model families for learning word vectors are: 1) global matrix factorization methods, such as latent semantic analysis (LSA) (Deerwester et al., 1990), and 2) local context window methods, such as the skip-gram model of Mikolov et al. (2013c). Currently, both families suffer significant drawbacks. While methods like LSA efficiently leverage statistical information, they do relatively poorly on the word analogy task, indicating a sub-optimal vector space structure. Methods like skip-gram may do better on the analogy task, but they poorly utilize the statistics of the corpus since they train on separate local context windows instead of on global co-occurrence counts.

In this work, we analyze the model properties necessary to produce linear directions of meaning and argue that global log-bilinear regression models are appropriate for doing so. We propose a specific weighted least squares model that trains on global word-word co-occurrence counts and thus makes efficient use of statistics. The model produces a word vector space with meaningful substructure, as evidenced by its state-of-the-art performance of 75% accuracy on the word analogy dataset. We also demonstrate that our methods outperform other current methods on several word similarity tasks, and also on a common named entity recognition (NER) benchmark. We provide the source code for the model as well as trained word vectors at http://nlp.stanford.edu/projects/glove/.