Word2Vec Network Structure Explained
Presented by: Subhashis Hazarika (The Ohio State University)
(Visualization Seminar Study)
Related Work
• Efficient Estimation of Word Representations in Vector Space – Mikolov et al. 2013
• Distributed Representations of Words and Phrases and their Compositionality – Mikolov et al. 2013
• Linguistic Regularities in Continuous Space Word Representations – Mikolov et al. 2013
• Implementation: https://code.google.com/archive/p/word2vec/ – Mikolov et al.
• word2vec Parameter Learning Explained – Rong 2014
• word2vec Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method – Goldberg and Levy 2014
Word Embedding
• Atomic Word Representation:
• 1-of-N / one-hot vector encoding
King : 1 0 0 0 0
Queen: 0 1 0 0 0
Man:   0 0 1 0 0
Woman: 0 0 0 1 0
Child: 0 0 0 0 1
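For reference, a minimal sketch of building such 1-of-N vectors for a toy five-word vocabulary (the vocabulary and its ordering are illustrative assumptions, not from the slides):

import numpy as np

# Illustrative toy vocabulary; the word's position determines where the 1 goes.
vocab = ["king", "queen", "man", "woman", "child"]
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """1-of-N encoding: a vector of zeros with a single 1 at the word's index."""
    vec = np.zeros(len(vocab))
    vec[word_to_index[word]] = 1.0
    return vec

print(one_hot("king"))   # [1. 0. 0. 0. 0.]
print(one_hot("woman"))  # [0. 0. 0. 1. 0.]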
Word Embedding
• Distributed Word Representation (word2vec):
• 1 word → high-dimensional vector
King : 0.99 0.99 0.05 0.7
Queen: 0.99 0.05 0.93 0.6
Man:   0.05 0.99 0.05 0.4
Woman: 0.02 0.01 0.99 0.5
Child: 0.4  0.43 0.45 0.2
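Unlike one-hot vectors, which are always orthogonal, these dense vectors can be compared directly. A minimal sketch using the illustrative values from the table above (they are hand-picked feature values, not learned embeddings):

import numpy as np

# Dense vectors from the table above (illustrative values, not learned embeddings).
vectors = {
    "king":  np.array([0.99, 0.99, 0.05, 0.7]),
    "queen": np.array([0.99, 0.05, 0.93, 0.6]),
    "man":   np.array([0.05, 0.99, 0.05, 0.4]),
    "woman": np.array([0.02, 0.01, 0.99, 0.5]),
    "child": np.array([0.4,  0.43, 0.45, 0.2]),
}

def cosine(a, b):
    """Cosine similarity between two word vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["king"], vectors["man"]))    # ~0.78: similar feature patterns
print(cosine(vectors["king"], vectors["woman"]))  # ~0.25: less similar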
Word Embedding
• Reasoning with word vectors:
• X_apple – X_apples ≈ X_car – X_cars ,  X_family – X_families ≈ X_car – X_cars
• Answering analogy questions:
Word Embedding
• Vector composition to answer questions like "King – Man + Woman = ?" (Queen!)
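A minimal sketch of this analogy query as plain vector arithmetic plus a nearest-neighbour lookup, reusing the illustrative toy vectors from the table above (with real trained word2vec embeddings the same query, excluding the input words, returns "queen"):

import numpy as np

# Reuses the illustrative toy vectors from the previous sketch.
vectors = {w: np.array(v) for w, v in {
    "king": [0.99, 0.99, 0.05, 0.7], "queen": [0.99, 0.05, 0.93, 0.6],
    "man": [0.05, 0.99, 0.05, 0.4], "woman": [0.02, 0.01, 0.99, 0.5],
    "child": [0.4, 0.43, 0.45, 0.2]}.items()}

def nearest(target, exclude=()):
    """Word whose vector is most cosine-similar to `target`, skipping `exclude`."""
    sims = {w: float(target @ v / (np.linalg.norm(target) * np.linalg.norm(v)))
            for w, v in vectors.items() if w not in exclude}
    return max(sims, key=sims.get)

query = vectors["king"] - vectors["man"] + vectors["woman"]
print(nearest(query, exclude={"king", "man", "woman"}))  # -> "queen" for these toy vectors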
Word Embedding
Word2Vec overview
Context-based Representation
• A word is represented by the contexts in which it is used.
I eat an apple every day. → eat | apple
Sometimes I like to eat an orange as well. → eat | orange
I like to drive my own car to work. → drive | car
Context-based Representation Models
• Continuous Bag-of-words model (CBOW):
I eat an apple every day. → eat, an, every, day | apple
• Skip-gram (SG):
I eat an apple every day. → apple | eat, an, every, day
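A minimal sketch of how a sliding window turns a sentence into CBOW and skip-gram training pairs (the fixed window size of 2 and lowercased tokens are illustrative assumptions; the original tool samples the window size per word):

def training_pairs(tokens, window=2):
    """Yield (context_words, center_word) pairs for each position in the sentence.
    CBOW predicts the center word from its context; skip-gram does the reverse."""
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        yield context, center

sentence = "i eat an apple every day".split()
for context, center in training_pairs(sentence):
    print(f"CBOW:      {context} -> {center}")
    for c in context:
        print(f"skip-gram: {center} -> {c}")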
Word2Vec Neural Net Structure
Basic Neuron Structure
Basic Neuron Structure : Training Process
Multilayer Neural Network
Multilayer Neural Network : Training
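As a rough illustration of what these training slides cover, here is a minimal sketch of a single sigmoid neuron fitted to one example by stochastic gradient descent (all values are illustrative; the full network is trained the same way via backpropagation):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One sigmoid neuron, one training example (all values illustrative).
rng = np.random.default_rng(0)
w, b = rng.normal(size=3), 0.0            # weights and bias
x, t = np.array([1.0, 0.0, 1.0]), 1.0     # input and target output
lr = 0.1                                  # learning rate

for _ in range(100):
    y = sigmoid(w @ x + b)                # forward pass
    grad = y - t                          # dLoss/dz for the logistic (cross-entropy) loss
    w -= lr * grad * x                    # backward pass: gradient-descent update
    b -= lr * grad

print(sigmoid(w @ x + b))                 # close to the target 1.0 after training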
Word2Vec Network
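A minimal sketch of the word2vec network itself, assuming the basic skip-gram architecture with a full softmax output (no hierarchical softmax or negative sampling yet); the vocabulary size and embedding dimension below are illustrative:

import numpy as np

V, N = 5, 4                                   # vocabulary size, embedding dimension (illustrative)
rng = np.random.default_rng(0)
W_in  = rng.normal(scale=0.1, size=(V, N))    # input -> hidden weights: one word vector per row
W_out = rng.normal(scale=0.1, size=(N, V))    # hidden -> output weights: one context vector per column

def forward(center_index):
    """Skip-gram forward pass: the one-hot input just selects a row of W_in,
    and a softmax over the output layer gives P(context word | center word)."""
    h = W_in[center_index]                    # hidden layer (no nonlinearity in word2vec)
    scores = h @ W_out                        # one score per vocabulary word
    exp = np.exp(scores - scores.max())       # numerically stable softmax
    return exp / exp.sum()

print(forward(0))                             # probability distribution over the 5-word vocabulary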
Online demo
Intuition
Context-based Representation Models
• Continuous Bag-of-words model (CBOW):
I eat an apple every day.  eat, an, every, day | apple
• Skip-gram (SG):
I eat an apple every day.  apple | eat, an, every, day
Training Scalability
Hierarchical Softmax
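For reference, the hierarchical-softmax probability (as in Mikolov et al. 2013 and Rong 2014) replaces the V-way softmax with a walk down a binary Huffman tree over the vocabulary, i.e. roughly log2(V) binary decisions per output word:

p(w \mid w_I) = \prod_{j=1}^{L(w)-1} \sigma\!\left( [\![\, n(w, j{+}1) = \mathrm{ch}(n(w,j)) \,]\!] \cdot {v'_{n(w,j)}}^{\top} h \right)

where n(w, j) is the j-th node on the path from the root to w, L(w) the path length, ch(n) an arbitrary fixed child of n, [[x]] is +1 if x is true and -1 otherwise, v'_n the vector of inner node n, and h the hidden-layer vector.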
Negative Sampling
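Negative sampling keeps the full output layer but, per training pair, updates only the observed output word w_O and k sampled "noise" words; the objective from Mikolov et al. 2013 is:

\log \sigma\!\left({v'_{w_O}}^{\top} h\right) + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma\!\left(-{v'_{w_i}}^{\top} h\right) \right]

with the noise distribution P_n(w) taken as the unigram distribution raised to the 3/4 power in the original implementation, and k typically 5-20 (2-5 for large corpora).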
Interpreting Word Embedding Models
Neural Word Embedding as Implicit Matrix Factorization: Levy & Goldberg, NIPS 2014
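Their key result, as a formula: skip-gram with negative sampling (k negatives) implicitly factorizes a shifted pointwise mutual information (PMI) matrix over word-context pairs,

{v_w}^{\top} v'_c \approx \mathrm{PMI}(w, c) - \log k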
Limitations
• Word ambiguity: a single vector per word form conflates different senses
• Debuggability: the learned dimensions are hard to inspect and interpret
• Sequence: word order within the context window is ignored
Paragraph Vector / Doc2Vec
• Similar network structure
Distributed Representations of Sentences and Documents: Le & Mikolov 2014
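A minimal usage sketch with the gensim library's Doc2Vec implementation (gensim, the toy corpus, and the parameter values are assumptions for illustration, not part of the original slides):

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Illustrative toy corpus; each document gets an integer tag.
corpus = [
    TaggedDocument(words="i eat an apple every day".split(), tags=[0]),
    TaggedDocument(words="i like to drive my own car to work".split(), tags=[1]),
]

# Train paragraph vectors alongside word vectors (parameters are illustrative).
model = Doc2Vec(corpus, vector_size=16, window=2, min_count=1, epochs=50)

# Infer a vector for an unseen piece of text.
print(model.infer_vector("i eat an orange".split()))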
Thank You
