Ashis Kumar Chanda
PhD Candidate
Understanding Word2Vec
Authors: Tomas Mikolov et al. 2013
Contents
• Problem description
• Motivation
• Proposed Method
• Experiments
• Conclusion
• Criticism
Problem description
• Every word has a meaning
• But how can we learn a new word?
• We can check a dictionary for its meaning
– It takes time, and a dictionary is not always at hand
• Otherwise, we can guess the meaning of a new word from its
context
Her limpid prose made even the most difficult subjects accessible to all.
The rest of the sentence helps us guess the meaning of “limpid”
It should mean something like “pleasant” or “clear”
Problem description
• How can a machine understand a word’s meaning?
• It can translate from a dictionary or word library
– such a library is difficult to create and maintain
• Moreover, a word can have different meanings
– neighboring / context words can help to suggest the right one
• The machine should learn word representations itself
Word embeddings
• There are many methods to find word embeddings
– Frequency-based embeddings: count vectors, TF-IDF, co-occurrence matrix
– Prediction-based embeddings: Skip-gram model, CBOW
• We are going to discuss the last two methods
https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
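As a small sketch of the frequency-based idea (my own illustration, not from the slides or the linked post), the snippet below counts how often pairs of words co-occur within a fixed window — the raw material of a co-occurrence-matrix embedding:

```python
from collections import Counter

def cooccurrence_counts(tokens, window=2):
    """Count how often each ordered pair of words appears within
    `window` positions of each other."""
    counts = Counter()
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[(center, tokens[j])] += 1
    return counts

tokens = "the cat sat on the mat".split()
counts = cooccurrence_counts(tokens, window=2)
print(counts[("cat", "sat")])  # 1
print(counts[("the", "sat")])  # 2 -- "sat" is near both occurrences of "the"
```

Each row of the resulting matrix (all counts for one center word) is a high-dimensional, sparse vector — exactly the kind of representation the prediction-based methods below compress into dense, low-dimensional vectors.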
Motivation
• Finding the semantic meaning of words
• Learning a word from its context words
• Representing a word as a low-dimensional vector
• Easy to compare two words in vector space
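To make the last point concrete: once words are vectors, similarity is just geometry. The sketch below uses cosine similarity with made-up 3-dimensional vectors (these are illustrative values, not real Word2Vec output):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: close to 1.0 for
    similar directions, close to 0.0 for unrelated ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# hypothetical embeddings for illustration only
cat = [0.9, 0.1, 0.3]
dog = [0.8, 0.2, 0.3]
car = [0.1, 0.9, 0.0]
print(cosine_similarity(cat, dog) > cosine_similarity(cat, car))  # True
```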
Proposed method
• Representing a word as a vector
• How should we learn these vector values?
• There are two methods
– 1. Continuous Bag of Words (CBOW)
– 2. Skip-gram model (SG)
Fig: example sparse vectors for the words “cat” and “dog”
Proposed method
• CBOW: use a fixed-length set of surrounding words (a window) to
predict the middle word
• SG: use a word to predict the surrounding words within a fixed
distance (window)
Proposed method
• Scan words in a window over an article
• Word order is not important within a window
• E.g.: Many days ago, there was a king who had ……
Here, “king” is our target word, Wt
Window = 5: Wt-2 Wt-1 Wt Wt+1 Wt+2 (then slide to the next window)
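The windowing above can be sketched as follows. The sentence and the 5-word window come from the slide; the function name is my own, and `window=2` is the half-width (two context words on each side of the target):

```python
def training_pairs(tokens, window=2):
    """Slide a window over the text; for each target word, collect
    its context words. CBOW predicts the target from the context;
    skip-gram predicts each context word from the target."""
    pairs = []
    for t, target in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, t - window),
                                  min(len(tokens), t + window + 1))
                   if j != t]
        pairs.append((target, context))
    return pairs

tokens = "many days ago there was a king who had".split()
target, context = training_pairs(tokens)[6]  # position of "king"
print(target, context)  # king ['was', 'a', 'who', 'had']
```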
Proposed method
• Uses a two-layer neural network
• The first layer is fully connected
• The final layer uses a softmax function to obtain the probability of one
word with respect to the others
• Stochastic gradient descent is used to learn the parameters during
backpropagation
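A minimal sketch of that forward pass, with a toy 5-word vocabulary and random weights (this illustrates the architecture only; it is not the paper's implementation, which also uses tricks like hierarchical softmax):

```python
import math
import random

random.seed(0)
V, D = 5, 3  # vocabulary size, embedding dimension
W1 = [[random.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(V)]  # input -> hidden
W2 = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(D)]  # hidden -> output

def forward(word_index):
    """One forward pass: for a one-hot input, the fully connected
    first layer reduces to a row lookup; softmax then turns the
    output scores into a probability distribution over the vocabulary."""
    hidden = W1[word_index]  # the input word's embedding
    scores = [sum(hidden[d] * W2[d][o] for d in range(D)) for o in range(V)]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = forward(2)
print(round(sum(probs), 6))  # 1.0 -- a valid probability distribution
```

Training with SGD would then nudge `W1` and `W2` so the probability of the observed context (or target) word increases; the learned rows of `W1` are the word vectors.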
Proposed method
• Representing a word as a vector
Fig: word vs. feature vector representations of “cat” (collected from Andrew Ng’s Coursera course)
Conclusions
• Introduced a new state of the art in natural language processing
• Big data is needed to find a good embedding
• The training process takes a long time
• A Word2Vec model trained on Wikipedia documents is publicly
available
• Successfully used in many applications
Project Links
• https://code.google.com/archive/p/word2vec/
• https://radimrehurek.com/gensim/models/word2vec.html
Application on Medical Data
• Medical data contains notes and codes
• A note is a description of a patient’s condition and treatments
• Codes are unique values used to represent diagnoses and
medicines
• There are many standard coding systems, like ICD-9, CPT …
• Word2Vec can be used on medical datasets to learn medical
code embeddings
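The key step is treating each visit's codes as a "sentence" so that codes appearing in the same visit play the role of co-occurring words. A hedged sketch of that data shaping (the ICD-9 codes below are only illustrative, not from the cited paper):

```python
# Each patient visit becomes a "sentence" of codes, so a Word2Vec-style
# model can learn code embeddings from co-occurrence within visits.
visits = [
    ["250.00", "401.9", "272.4"],  # hypothetical example visit
    ["250.00", "272.4"],
    ["401.9", "428.0"],
]

def code_sentences(visits):
    """Turn visit records into token sequences -- the same input shape
    that text corpora give a word-embedding model."""
    return [[str(code) for code in visit] for visit in visits]

sentences = code_sentences(visits)
print(sentences[0])  # ['250.00', '401.9', '272.4']
```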
T. Bai, A. K. Chanda, S. Vucetic, B. L. Egleston. "Joint learning of
representations of medical concepts and words from EHR data". In the
BIBM conference, 2017
References
• T. Mikolov, K. Chen, G. Corrado, J. Dean. Efficient estimation of word representations in vector space. CoRR
abs/1301.3781. arXiv:1301.3781. URL http://arxiv.org/abs/1301.3781
• X. Rong. word2vec parameter learning explained. CoRR abs/1411.2738. arXiv:1411.2738. URL
http://arxiv.org/abs/1411.2738
• T. Bai, A. K. Chanda, B. L. Egleston, S. Vucetic. Joint learning of representations of medical concepts and words from
EHR data. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017, Kansas City, MO, USA,
November 13-16, 2017, pp. 764–769. doi:10.1109/BIBM.2017.8217752. URL
https://doi.org/10.1109/BIBM.2017.8217752
