2. Outline • Goal
• History
• Word Embedding
• Introduction toWord2Vec
• CBOW
• Skip-Gram
• Parameters
• Implementations
• Other usecases
2
3. When? Who?
• Word2vec was created by a team of
researchers led by Tomas Mikolov at Google.
• Embedding vectors created using the
Word2vec algorithm have many advantages
compared to earlier algorithmssuch as :
latent semantic analysis.
2013
3
12. What is
word2vec?
• Word2vec is a combination of two
techniques
– CBOW(Continuous bag of words)
– Skip-gram model.
• Both of these map word(s) to
word(s).
• learn weights which act as word
vector representations.
Skip-
gram
CBOW
12
13. How it
works?
1. Both input word wi and the output word wj are one-hot
encoded into binary vectors x and y of size V.
2. First, the multiplication of the binary vector xx and the
word embedding matrix W of size V×N gives us the
embedding vector of the input word wi: the i-th row of
the matrix W.
3. The multiplication of the hidden layer and the word
context matrix W′ of size N×W produces the output
one-hot encoded vector y.
13
19. Parametrization • Sub-sampling
– High frequency words often provide little information.
• Dimensionality
– Quality of word embedding increases with higher
dimensionality.
– But after reaching some point, marginal gain will
diminish.
– Typically, the dimensionality of the vectors is set to be
between 100 and 1,000.
• Context window
– The recommended value is 10 for skip-gram and 5 for
CBOW.
19
21. Variants models class
• documents to vector spaceDoc2vec
• There are a lot of noisy text and informal
language structure.tweet2vec
• dealing with item and user similarity is at heart
of lot of recommendation algorithmsitem2vec
• this embedding technique tries to marry best of
both worlds, word2vec and LDALda2vec
21