Efficient Estimation of Word Representations in Vector Space, by T. Mikolov et al. (2013). Continuous vector representations of words learned from their context words.
3. Problem description
• Every word has a meaning
• But, how can we learn a new word?
• We can check a dictionary for its meaning
– It takes time, and a dictionary is not always at hand
• Otherwise, we can guess the meaning of a new word from its context
Her limpid prose made even the most difficult subjects accessible to all.
This part of the sentence helps us guess the meaning of “limpid”
It would be “pleasant” or “clear”
4. Problem description
• How can a machine understand a word’s meaning?
• It can translate using a dictionary or word library
– but it is difficult to create and maintain such a library
• Moreover, a word can have different meanings
– neighboring / context words can help suggest the right one
• The machine should learn word representations itself
5. Word embeddings
• There are many methods to find word embeddings
– Frequency-based embeddings
– Count vectors
– TF-IDF
– Co-occurrence matrix
– Prediction-based embeddings
– Skip-gram model
– CBOW
• We are going to discuss the last two methods
https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
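The frequency-based methods above can be illustrated with the simplest one, count vectors: each document becomes a vector of raw term counts over a shared vocabulary. A minimal sketch on a toy two-document corpus (the documents and vocabulary are invented for illustration):

```python
from collections import Counter

# Toy corpus: each document becomes a "count vector" of raw term
# counts over the shared vocabulary of both documents.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

tokenized = [d.split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})

def count_vector(tokens, vocab):
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

vectors = [count_vector(doc, vocab) for doc in tokenized]
print(vocab)    # shared vocabulary, sorted
print(vectors)  # one count vector per document
```

TF-IDF and co-occurrence matrices refine this same idea by reweighting or re-scoping the counts; CBOW and skip-gram, discussed next, instead learn dense vectors by prediction.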
6. Motivation
• Finding the semantic meaning of words
• Learning a word from its context words
• Representing a word as a low-dimensional vector
• Easy to compare two words in vector space
7. Proposed method
• Representing a word as a vector
• How should we learn these vector values?
• There are two methods
– 1. Continuous Bag of Words (CBOW)
– 2. Skip-gram model (SG)
[Figure: example word vectors: “cat” = (0.2, 0, 0, 0.7, 0, 0, 0, 0, …, 0), “dog” = (0.1, 0.3, 0.9, 0, 0, 0, 0, 0, …, 0)]
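Comparing two words in vector space typically means cosine similarity between their vectors. A minimal sketch using the illustrative “cat” and “dog” values from the figure above (the values are from the slide, not a trained model):

```python
import math

# Illustrative vectors from the slide; real word2vec vectors
# are dense values learned from data.
cat = [0.2, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0, 0.0]
dog = [0.1, 0.3, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0]

def cosine(u, v):
    """Cosine similarity: dot product over the product of norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(round(cosine(cat, dog), 3))
```

A value near 1 means similar directions (similar meanings under the model); near 0 means unrelated.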
8. Proposed method
• CBOW: use a set of words in fixed length (window) to predict
the middle word
• SG: use a word to predict the surrounding words in a fixed
distance (window)
9. Proposed method
• Scanning words in a window over an article
• Word order is not important within the window
• E.g.: Many days ago, there was a king who had ……
Here, “king” is our target word = Wt
[Diagram: Wt-2 Wt-1 Wt Wt+1 Wt+2, window = 5; the window then slides to the next position]
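The sliding-window scan above can be sketched directly: for each position, collect the two neighbors on each side (window = 5) as the unordered context of the target word. CBOW reads each pair as context → target; skip-gram reads it as target → each context word.

```python
# Sketch: extract (context, target) pairs with a window of 5
# (the target word plus two neighbors on each side).
sentence = "many days ago there was a king who had".split()

def windows(tokens, half=2):
    pairs = []
    for t in range(len(tokens)):
        # Unordered context: word order inside the window is ignored.
        context = [tokens[i]
                   for i in range(max(0, t - half),
                                  min(len(tokens), t + half + 1))
                   if i != t]
        pairs.append((context, tokens[t]))  # CBOW: context -> target
    return pairs

pairs = windows(sentence)
print(pairs[6])  # the window centred on "king"
```

At the edges of the text the window is simply truncated, so the first and last words have smaller contexts.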
10. Proposed method
• Uses a two-layer neural network
• The first layer is fully connected
• The final layer uses a softmax function to obtain the probability of one word with respect to the others
• Stochastic gradient descent is used to learn the parameters via back-propagation
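The two-layer architecture can be sketched as a forward pass for CBOW: average the context words’ embeddings from the first layer, then apply the output layer and a softmax over the vocabulary. The vocabulary size, embedding dimension, and random weights below are assumptions for illustration; the real model learns the weights by SGD.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 10, 4  # vocabulary size and embedding dimension (assumed)

# First (fully connected) layer: one D-dim embedding row per word.
W_in = rng.normal(scale=0.1, size=(V, D))
# Output layer, followed by softmax over the vocabulary.
W_out = rng.normal(scale=0.1, size=(D, V))

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

def cbow_forward(context_ids):
    h = W_in[context_ids].mean(axis=0)  # average of context embeddings
    return softmax(h @ W_out)           # probability of each target word

p = cbow_forward([1, 2, 4, 5])  # four context word ids (assumed)
print(p.sum())  # probabilities over the vocabulary sum to 1
```

Training would compare `p` against the one-hot target word and back-propagate the cross-entropy loss through `W_out` and `W_in`.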
11. Proposed method
• Representing a word as a vector
[Figure: word-feature matrix, e.g. “cat” represented by feature values (0.2, 0, 0, 0.7, 0, …, 0); collected from Andrew Ng’s Coursera course]
12. Conclusions
• Introduces a new state of the art in natural language processing
• Big data is needed to find a good embedding
• The training process takes a long time
• A W2V model learned on Wikipedia documents is publicly available
• Used successfully in many applications
14. Application on Medical Data
• Medical data contains notes and codes
• A note is a description of a patient’s condition and treatments
• Codes are unique values used to represent diagnoses and medicines
• There are many standard coding systems, like ICD-9, CPT …
• W2V can be used on a medical dataset to learn medical code embeddings
T. Bai, A. K. Chanda, S. Vucetic, B. L. Egleston. "Joint learning of representations of
medical concepts and words from EHR data". In the BIBM conference, 2017
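Once code embeddings are learned, related codes can be found by nearest-neighbor search in the vector space. A minimal sketch with hypothetical toy vectors for a few ICD-9 codes (both the codes’ vectors and their groupings are invented for illustration, not taken from the cited work):

```python
import math

# Hypothetical toy embeddings for a few ICD-9 codes; a real model
# would learn these from EHR notes as in the cited paper.
code_vecs = {
    "250.00": [0.9, 0.1, 0.0],  # diabetes mellitus
    "401.9":  [0.8, 0.2, 0.1],  # essential hypertension
    "487.1":  [0.0, 0.1, 0.9],  # influenza
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def nearest(code):
    """Return the other code whose embedding is most similar."""
    query = code_vecs[code]
    return max((c for c in code_vecs if c != code),
               key=lambda c: cosine(query, code_vecs[c]))

print(nearest("250.00"))  # nearest neighbor of the diabetes code
```

In this toy space the two chronic-condition codes sit near each other while influenza is far away, which is the kind of structure such embeddings are expected to capture.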
15. References
• T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, CoRR abs/1301.3781. arXiv:1301.3781. URL http://arxiv.org/abs/1301.3781
• X. Rong, word2vec parameter learning explained, CoRR abs/1411.2738. arXiv:1411.2738. URL http://arxiv.org/abs/1411.2738
• T. Bai, A. K. Chanda, B. L. Egleston, S. Vucetic, Joint learning of representations of medical concepts and words from EHR data, in: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017, Kansas City, MO, USA, November 13-16, 2017, pp. 764-769. doi:10.1109/BIBM.2017.8217752. URL https://doi.org/10.1109/BIBM.2017.8217752
Editor's Notes
• Problem description, Motivation, Proposal, Experiments, Conclusion, Criticism
• A multiple choice selection panel
• No link for their application