SlideShare a Scribd company logo
1 of 29
Understanding
GloVe
(Global Vectors for Word Representation)
-
Slides written by Park JeeHyun
27 DEC 2017
Contents
1. Introduction
2. GloVe model
3. GloVe cost function
4. Experiments & Results
5. References
1. Introduction
• The statistics of word occurrences in a corpus is the primary
source of information available to all unsupervised methods
for learning word representations.
• Although many such methods now exist, the question still
remains as to how meaning is generated from these
statistics, and how the resulting word vectors might
represent that meaning.
1. Introduction
• Recent methods for learning vector space representations
of words have succeeded in capturing fine-grained
semantic and syntactic regularities using vector arithmetic,
• But the origin of these regularities has remained opaque.
1. Introduction : pros & cons
2. GloVe model
• Combines the advantages of the two major model families
in the literature:
• global matrix factorization and,
• local context window methods
• Our model efficiently leverages statistical information by
training only on the nonzero elements in a word-word co-
occurrence matrix, rather than on the entire sparse matrix or
on individual context windows in a large corpus.
3. GloVe cost function
• Deriving the Cost Function
• The above argument suggests that the appropriate starting point for word vector
learning should be with ratios of co-occurrence probabilities
rather than the probabilities themselves.
3. GloVe cost function
• Establish some notations
• let the matrix of word-word co-occurrence counts be denoted by 𝑋,
• whose entries 𝑋𝑖𝑗 tabulate the number of times word 𝑗
occurs in the context of word 𝑖
• let 𝑋𝑖 = 𝑘 𝑋𝑖𝑘 be the number of times any word
appears in the context of word 𝑖
• let 𝑃𝑖𝑗 = 𝑃 𝑖 𝑗 = 𝑋𝑖𝑗 𝑋𝑖 be the probability that word j
appear in the context of word 𝑖
3. GloVe cost function
• Deriving the Cost Function
• set a function F that represents ratios of co-occurrence probabilities rather than t
he probabilities themselves
• we would like F to encode the information present the ratio 𝑃𝑖𝑘 𝑃𝑗𝑘 in the
word vector space. Since vector spaces are inherently linear structures,
the most natural way to do this is with vector differences.
3. GloVe cost function
 Note that 𝑤𝑖 and 𝑤 𝑘 are v
ectors from different
vector-spaces
• Deriving the Cost Function
• We note that the arguments of F in Eqn. (2) are vectors while the right-hand sid
e is a scalar.
• While F could be taken to be a complicated function parameterized by,
e.g., a neural network, doing so would obfuscate the linear structure we are tryin
g to capture.
• To avoid this issue, we can first take the dot product of the arguments,
which prevents F from mixing the vector dimensions in undesirable ways.
3. GloVe cost function
• Deriving the Cost Function
• for word-word co-occurrence matrices,
the distinction between a word and a context word is arbitrary
and that we are free to exchange the two roles.
• the symmetry can be restored in two steps.
• First, we require that 𝐹 be a homomorphism between the groups
(ℝ,+) and (ℝ >0, ×), i.e.,
• which, by Eqn. (3), is solved by,
3. GloVe cost function
3. GloVe cost function : homomorphism
Homo (same)
+
Morph (shape)
3. GloVe cost function : homomorphism
• Deriving the Cost Function
• The solution to Eqn. (4) is 𝐹 = 𝑒𝑥𝑝 or,
• Next, we note that Eqn. (6) would exhibit the exchange symmetry
if not for the 𝐥𝐨𝐠(𝑿𝒊) on the right-hand side.
• However, this term is independent of 𝑘 so it can be absorbed into
a bias 𝑏𝑖 for 𝑤𝑖.
• Finally, adding an additional bias 𝑏𝑖 for 𝑤𝑖 restores the symmetry.
3. GloVe cost function
• Deriving the Cost Function
3. GloVe cost function : weighting function
• Deriving the Cost Function
3. GloVe cost function : weighting function
4. Experiments & Results
• Experiments
• Word analogy
• Word similarity
• Named entity recognition (NER)
• Result
4. Experiments : analogy task
• The word analogy task consists of questions like,
“a is to b as c is to ___?”
• The semantic questions, like
“Athens is to Greece as Berlin is to ___?”.
• The syntactic questions like,
“dance is to dancing as fly is to ___?”
4. Experiments : analogy task
4. Experiments : analogy task
4. Experiments : similarity task
• A similarity score is obtained from the word vectors by first
normalizing each feature across the vocabulary and then
calculating the cosine similarity.
• We compute Spearman’s rank correlation coefficient
between this score and the human judgments.
4. Experiments : similarity task
4. Experiments : similarity task
4. Experiments : NER task
• The CoNLL-2003 English benchmark dataset for NER is a
collection of documents from Reuters newswire articles,
annotated with four entity types:
• Person
• Location
• Organization
• Miscellaneous
• We train models on CoNLL-03 training data on test on three
datasets:
1) ConLL-03 testing data
2) ACE Phase 2 (2001-02) and ACE-2003 data
3) MUC7 Formal Run test set.
4. Experiments : NER task
4. Result
• GloVe, is a new global log-bilinear regression model for the
unsupervised learning of word representations that
outperforms other models on word analogy, word similarity,
and named entity recognition tasks.
5. References
• GloVe
• https://nlp.stanford.edu/projects/glove/
• Stanford NLP lecture
• http://web.stanford.edu/class/cs224n/
• https://www.youtube.com/watch?v=ASn7ExxLZws&t=43s
• Socratica’s video about homomorphism
• https://www.youtube.com/watch?v=cYzp5IWqCsg
Understanding GloVe

More Related Content

What's hot

Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer modelsDing Li
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSangwoo Mo
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishHemantha Kulathilake
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet AllocationSangwoo Mo
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsRoelof Pieters
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTSuman Debnath
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingMinh Pham
 
Presentation on Text Classification
Presentation on Text ClassificationPresentation on Text Classification
Presentation on Text ClassificationSai Srinivas Kotni
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsBhaskar Mitra
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronMostafa G. M. Mostafa
 
A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)Shuntaro Yada
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model佳蓉 倪
 
Text classification presentation
Text classification presentationText classification presentation
Text classification presentationMarijn van Zelst
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?Kazuki Yoshida
 

What's hot (20)

Topic Modeling
Topic ModelingTopic Modeling
Topic Modeling
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Topic Models
Topic ModelsTopic Models
Topic Models
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
Self-supervised Learning Lecture Note
Self-supervised Learning Lecture NoteSelf-supervised Learning Lecture Note
Self-supervised Learning Lecture Note
 
NLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for EnglishNLP_KASHK:Context-Free Grammar for English
NLP_KASHK:Context-Free Grammar for English
 
Text Classification
Text ClassificationText Classification
Text Classification
 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
Presentation on Text Classification
Presentation on Text ClassificationPresentation on Text Classification
Presentation on Text Classification
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)A Review of Deep Contextualized Word Representations (Peters+, 2018)
A Review of Deep Contextualized Word Representations (Peters+, 2018)
 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
 
Text classification presentation
Text classification presentationText classification presentation
Text classification presentation
 
What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?What is the Expectation Maximization (EM) Algorithm?
What is the Expectation Maximization (EM) Algorithm?
 

Similar to Understanding GloVe

Word_Embedding.pptx
Word_Embedding.pptxWord_Embedding.pptx
Word_Embedding.pptxNameetDaga1
 
[Emnlp] what is glo ve part iii - towards data science
[Emnlp] what is glo ve  part iii - towards data science[Emnlp] what is glo ve  part iii - towards data science
[Emnlp] what is glo ve part iii - towards data scienceNikhil Jaiswal
 
[Emnlp] what is glo ve part i - towards data science
[Emnlp] what is glo ve  part i - towards data science[Emnlp] what is glo ve  part i - towards data science
[Emnlp] what is glo ve part i - towards data scienceNikhil Jaiswal
 
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
 A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP TasksMasahiro Kaneko
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelHemantha Kulathilake
 
[Emnlp] what is glo ve part ii - towards data science
[Emnlp] what is glo ve  part ii - towards data science[Emnlp] what is glo ve  part ii - towards data science
[Emnlp] what is glo ve part ii - towards data scienceNikhil Jaiswal
 
Word_Embeddings.pptx
Word_Embeddings.pptxWord_Embeddings.pptx
Word_Embeddings.pptxGowrySailaja
 
Turkish language modeling using BERT
Turkish language modeling using BERTTurkish language modeling using BERT
Turkish language modeling using BERTAbdurrahimDerric
 
DETERMINING CUSTOMER SATISFACTION IN-ECOMMERCE
DETERMINING CUSTOMER SATISFACTION IN-ECOMMERCEDETERMINING CUSTOMER SATISFACTION IN-ECOMMERCE
DETERMINING CUSTOMER SATISFACTION IN-ECOMMERCEAbdurrahimDerric
 
Interface for Finding Close Matches from Translation Memory
Interface for Finding Close Matches from Translation MemoryInterface for Finding Close Matches from Translation Memory
Interface for Finding Close Matches from Translation MemoryPriyatham Bollimpalli
 
Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...
Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...
Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...Yuki Tomo
 
Word2vec on the italian language: first experiments
Word2vec on the italian language: first experimentsWord2vec on the italian language: first experiments
Word2vec on the italian language: first experimentsVincenzo Lomonaco
 
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnL6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnRwanEnan
 
Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...
Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...
Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...Jinho Choi
 
Intrinsic and Extrinsic Evaluations of Word Embeddings
Intrinsic and Extrinsic Evaluations of Word EmbeddingsIntrinsic and Extrinsic Evaluations of Word Embeddings
Intrinsic and Extrinsic Evaluations of Word EmbeddingsJinho Choi
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problemJaeHo Jang
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentationSurya Sg
 

Similar to Understanding GloVe (20)

Word_Embedding.pptx
Word_Embedding.pptxWord_Embedding.pptx
Word_Embedding.pptx
 
[Emnlp] what is glo ve part iii - towards data science
[Emnlp] what is glo ve  part iii - towards data science[Emnlp] what is glo ve  part iii - towards data science
[Emnlp] what is glo ve part iii - towards data science
 
[Emnlp] what is glo ve part i - towards data science
[Emnlp] what is glo ve  part i - towards data science[Emnlp] what is glo ve  part i - towards data science
[Emnlp] what is glo ve part i - towards data science
 
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
 A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
 
NLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language ModelNLP_KASHK:Evaluating Language Model
NLP_KASHK:Evaluating Language Model
 
[Emnlp] what is glo ve part ii - towards data science
[Emnlp] what is glo ve  part ii - towards data science[Emnlp] what is glo ve  part ii - towards data science
[Emnlp] what is glo ve part ii - towards data science
 
Word_Embeddings.pptx
Word_Embeddings.pptxWord_Embeddings.pptx
Word_Embeddings.pptx
 
Turkish language modeling using BERT
Turkish language modeling using BERTTurkish language modeling using BERT
Turkish language modeling using BERT
 
DETERMINING CUSTOMER SATISFACTION IN-ECOMMERCE
DETERMINING CUSTOMER SATISFACTION IN-ECOMMERCEDETERMINING CUSTOMER SATISFACTION IN-ECOMMERCE
DETERMINING CUSTOMER SATISFACTION IN-ECOMMERCE
 
Interface for Finding Close Matches from Translation Memory
Interface for Finding Close Matches from Translation MemoryInterface for Finding Close Matches from Translation Memory
Interface for Finding Close Matches from Translation Memory
 
Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...
Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...
Deep Learning勉強会@小町研 "Learning Character-level Representations for Part-of-Sp...
 
Word embeddings
Word embeddingsWord embeddings
Word embeddings
 
Word2vec on the italian language: first experiments
Word2vec on the italian language: first experimentsWord2vec on the italian language: first experiments
Word2vec on the italian language: first experiments
 
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffnL6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
L6.pptxsdv dfbdfjftj hgjythgfvfhjyggunghb fghtffn
 
Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...
Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...
Incremental Sense Weight Training for In-depth Interpretation of Contextualiz...
 
Intrinsic and Extrinsic Evaluations of Word Embeddings
Intrinsic and Extrinsic Evaluations of Word EmbeddingsIntrinsic and Extrinsic Evaluations of Word Embeddings
Intrinsic and Extrinsic Evaluations of Word Embeddings
 
Marvin_Capstone
Marvin_CapstoneMarvin_Capstone
Marvin_Capstone
 
wordembedding.pptx
wordembedding.pptxwordembedding.pptx
wordembedding.pptx
 
Open vocabulary problem
Open vocabulary problemOpen vocabulary problem
Open vocabulary problem
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentation
 

More from JEE HYUN PARK

keti companion classifier
keti companion classifierketi companion classifier
keti companion classifierJEE HYUN PARK
 
Kcc201728apr2017 170828235330
Kcc201728apr2017 170828235330Kcc201728apr2017 170828235330
Kcc201728apr2017 170828235330JEE HYUN PARK
 
neural based_context_representation_learning_for_dialog_act_classification
neural based_context_representation_learning_for_dialog_act_classificationneural based_context_representation_learning_for_dialog_act_classification
neural based_context_representation_learning_for_dialog_act_classificationJEE HYUN PARK
 
a deep reinforced model for abstractive summarization
a deep reinforced model for abstractive summarizationa deep reinforced model for abstractive summarization
a deep reinforced model for abstractive summarizationJEE HYUN PARK
 
Historical Finance Data
Historical Finance DataHistorical Finance Data
Historical Finance DataJEE HYUN PARK
 
Understanding lstm and its diagrams
Understanding lstm and its diagramsUnderstanding lstm and its diagrams
Understanding lstm and its diagramsJEE HYUN PARK
 
Short-Term Load Forecasting of Australian National Electricity Market by Hier...
Short-Term Load Forecasting of Australian National Electricity Market by Hier...Short-Term Load Forecasting of Australian National Electricity Market by Hier...
Short-Term Load Forecasting of Australian National Electricity Market by Hier...JEE HYUN PARK
 

More from JEE HYUN PARK (9)

keti companion classifier
keti companion classifierketi companion classifier
keti companion classifier
 
[Paper review] BERT
[Paper review] BERT[Paper review] BERT
[Paper review] BERT
 
Kcc201728apr2017 170828235330
Kcc201728apr2017 170828235330Kcc201728apr2017 170828235330
Kcc201728apr2017 170828235330
 
neural based_context_representation_learning_for_dialog_act_classification
neural based_context_representation_learning_for_dialog_act_classificationneural based_context_representation_learning_for_dialog_act_classification
neural based_context_representation_learning_for_dialog_act_classification
 
a deep reinforced model for abstractive summarization
a deep reinforced model for abstractive summarizationa deep reinforced model for abstractive summarization
a deep reinforced model for abstractive summarization
 
Historical Finance Data
Historical Finance DataHistorical Finance Data
Historical Finance Data
 
Understanding lstm and its diagrams
Understanding lstm and its diagramsUnderstanding lstm and its diagrams
Understanding lstm and its diagrams
 
KCC2017 28APR2017
KCC2017 28APR2017KCC2017 28APR2017
KCC2017 28APR2017
 
Short-Term Load Forecasting of Australian National Electricity Market by Hier...
Short-Term Load Forecasting of Australian National Electricity Market by Hier...Short-Term Load Forecasting of Australian National Electricity Market by Hier...
Short-Term Load Forecasting of Australian National Electricity Market by Hier...
 

Recently uploaded

Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...Call girls in Ahmedabad High profile
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 

Recently uploaded (20)

Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 

Understanding GloVe

  • 1. Understanding GloVe (Global Vectors for Word Representation) - Slides written by Park JeeHyun 27 DEC 2017
  • 2. Contents 1. Introduction 2. GloVe model 3. GloVe cost function 4. Experiments & Results 5. References
  • 3. 1. Introduction • The statistics of word occurrences in a corpus is the primary source of information available to all unsupervised methods for learning word representations. • Although many such methods now exist, the question still remains as to how meaning is generated from these statistics, and how the resulting word vectors might represent that meaning.
  • 4. 1. Introduction • Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, • But the origin of these regularities has remained opaque.
  • 5. 1. Introduction : pros & cons
  • 6. 2. GloVe model • Combines the advantages of the two major model families in the literature: • global matrix factorization and, • local context window methods • Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co- occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus.
  • 7. 3. GloVe cost function
  • 8. • Deriving the Cost Function • The above argument suggests that the appropriate starting point for word vector learning should be with ratios of co-occurrence probabilities rather than the probabilities themselves. 3. GloVe cost function
  • 9. • Establish some notations • let the matrix of word-word co-occurrence counts be denoted by 𝑋, • whose entries 𝑋𝑖𝑗 tabulate the number of times word 𝑗 occurs in the context of word 𝑖 • let 𝑋𝑖 = 𝑘 𝑋𝑖𝑘 be the number of times any word appears in the context of word 𝑖 • let 𝑃𝑖𝑗 = 𝑃 𝑖 𝑗 = 𝑋𝑖𝑗 𝑋𝑖 be the probability that word j appear in the context of word 𝑖 3. GloVe cost function
  • 10. • Deriving the Cost Function • set a function F that represents ratios of co-occurrence probabilities rather than t he probabilities themselves • we would like F to encode the information present the ratio 𝑃𝑖𝑘 𝑃𝑗𝑘 in the word vector space. Since vector spaces are inherently linear structures, the most natural way to do this is with vector differences. 3. GloVe cost function  Note that 𝑤𝑖 and 𝑤 𝑘 are v ectors from different vector-spaces
  • 11. • Deriving the Cost Function • We note that the arguments of F in Eqn. (2) are vectors while the right-hand sid e is a scalar. • While F could be taken to be a complicated function parameterized by, e.g., a neural network, doing so would obfuscate the linear structure we are tryin g to capture. • To avoid this issue, we can first take the dot product of the arguments, which prevents F from mixing the vector dimensions in undesirable ways. 3. GloVe cost function
  • 12. • Deriving the Cost Function • for word-word co-occurrence matrices, the distinction between a word and a context word is arbitrary and that we are free to exchange the two roles. • the symmetry can be restored in two steps. • First, we require that 𝐹 be a homomorphism between the groups (ℝ,+) and (ℝ >0, ×), i.e., • which, by Eqn. (3), is solved by, 3. GloVe cost function
  • 13. 3. GloVe cost function : homomorphism
  • 14. Homo (same) + Morph (shape) 3. GloVe cost function : homomorphism
  • 15. • Deriving the Cost Function • The solution to Eqn. (4) is 𝐹 = 𝑒𝑥𝑝 or, • Next, we note that Eqn. (6) would exhibit the exchange symmetry if not for the 𝐥𝐨𝐠(𝑿𝒊) on the right-hand side. • However, this term is independent of 𝑘 so it can be absorbed into a bias 𝑏𝑖 for 𝑤𝑖. • Finally, adding an additional bias 𝑏𝑖 for 𝑤𝑖 restores the symmetry. 3. GloVe cost function
  • 16. • Deriving the Cost Function 3. GloVe cost function : weighting function
  • 17. • Deriving the Cost Function 3. GloVe cost function : weighting function
  • 18. 4. Experiments & Results • Experiments • Word analogy • Word similarity • Named entity recognition (NER) • Result
  • 19. 4. Experiments : analogy task • The word analogy task consists of questions like, “a is to b as c is to ___?” • The semantic questions, like “Athens is to Greece as Berlin is to ___?”. • The syntactic questions like, “dance is to dancing as fly is to ___?”
  • 20. 4. Experiments : analogy task
  • 21. 4. Experiments : analogy task
  • 22. 4. Experiments : similarity task • A similarity score is obtained from the word vectors by first normalizing each feature across the vocabulary and then calculating the cosine similarity. • We compute Spearman’s rank correlation coefficient between this score and the human judgments.
  • 23. 4. Experiments : similarity task
  • 24. 4. Experiments : similarity task
  • 25. 4. Experiments : NER task • The CoNLL-2003 English benchmark dataset for NER is a collection of documents from Reuters newswire articles, annotated with four entity types: • Person • Location • Organization • Miscellaneous • We train models on CoNLL-03 training data on test on three datasets: 1) ConLL-03 testing data 2) ACE Phase 2 (2001-02) and ACE-2003 data 3) MUC7 Formal Run test set.
  • 26. 4. Experiments : NER task
  • 27. 4. Result • GloVe, is a new global log-bilinear regression model for the unsupervised learning of word representations that outperforms other models on word analogy, word similarity, and named entity recognition tasks.
  • 28. 5. References • GloVe • https://nlp.stanford.edu/projects/glove/ • Stanford NLP lecture • http://web.stanford.edu/class/cs224n/ • https://www.youtube.com/watch?v=ASn7ExxLZws&t=43s • Socratica’s video about homomorphism • https://www.youtube.com/watch?v=cYzp5IWqCsg