Skip gram and cbow

Skip-gram & CBOW
Hyunyoung2
Natural Language Processing Labs

CONTENTS
001. F = Wx
002. Skip-gram
003. CBOW

· F = Wx
- x : one-hot vector of a word in the vocabulary.
- W : weight matrix whose rows are the word vectors that we want.
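[Figure: x (1 by 5) multiplied by W (5 by 5), and x (1 by 5) multiplied by W (5 by 7)]
- The number of columns of x and the number of rows of W are always the same (5 here).
- The number of columns of W (7 in the second case) is the dimension of word2vec.
- The result is the hidden layer in the neural network.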
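
The multiplication above can be checked with a short NumPy sketch. It assumes the row-vector shapes from the slide (x is 1 by 5, W is 5 by 7), so the product is written as x @ W. The random values in W, the seed, and the choice of the 3rd word are placeholders for illustration only.

```python
import numpy as np

vocab_size, embed_dim = 5, 7      # shapes from the slide: x is 1 by 5, W is 5 by 7
rng = np.random.default_rng(0)    # arbitrary seed, only for reproducibility
W = rng.normal(size=(vocab_size, embed_dim))  # placeholder weights

x = np.zeros((1, vocab_size))
x[0, 2] = 1.0                     # one-hot vector for the 3rd word (arbitrary choice)

hidden = x @ W                    # hidden layer, shape (1, 7)

# Multiplying by a one-hot vector simply selects one row of W.
row_selected = np.allclose(hidden[0], W[2])
print(hidden.shape, row_selected)
```

This is why each row of W can be read as the vector of one word: the one-hot input picks out exactly that row.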

· Let me explain the architecture of skip-gram.
[Figure: skip-gram architecture. Input vector (one-hot coding of the center word) -> Input vector * W -> Hidden Layer -> Hidden layer * W’ -> Output Layer (window word) -> Softmax -> Cross-entropy (cost function)]
- W and W’ are different!
- W’ : Word2Vec we want from skip-gram
- Backpropagation is used to minimize the cost function (cross-entropy here).
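
A minimal sketch of one forward pass through this architecture, assuming a vocabulary of 6 words, a hidden layer of 7 units, and randomly initialised weights. The names W_prime, center_idx, and window_idx are hypothetical and chosen only for this illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

vocab_size, embed_dim = 6, 7         # assumed sizes for illustration
rng = np.random.default_rng(0)       # arbitrary seed
W = rng.normal(scale=0.1, size=(vocab_size, embed_dim))        # input -> hidden
W_prime = rng.normal(scale=0.1, size=(embed_dim, vocab_size))  # hidden -> output

center_idx, window_idx = 1, 0        # hypothetical indices of center and window word

x = np.zeros(vocab_size)
x[center_idx] = 1.0                  # one-hot coding of the center word

hidden = x @ W                       # Input vector * W
scores = hidden @ W_prime            # Hidden layer * W'
probs = softmax(scores)              # predicted distribution over the vocabulary

# Cross-entropy against a one-hot target reduces to -log of the
# probability assigned to the real window word.
loss = -np.log(probs[window_idx])
print(probs.shape, float(loss))
```

Backpropagation would then adjust W and W_prime to make this loss smaller.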

· Let’s say our vocabulary is {I, like, the, natural, language, processing}, taken from the sentence “I like the natural language processing”, and the window size is 1.
- Each pair consists of {center word, window word}.
I like the natural language processing
{I, like}
{like, I}, {like, the}
{the, like}, {the, natural}
{natural, the}, {natural, language}
{language, natural}, {language, processing}
{processing, language}
A sample for an example of skip-gram: {like, I}, {like, the}
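
The pair list above can be reproduced with a few lines of Python. The helper name skipgram_pairs is hypothetical, and splitting the sentence on whitespace is an assumed tokenisation.

```python
def skipgram_pairs(tokens, window=1):
    """Return (center word, window word) pairs for every position."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:               # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

tokens = "I like the natural language processing".split()
pairs = skipgram_pairs(tokens, window=1)
for pair in pairs:
    print(pair)
```

With a window size of 1, the first and last words produce one pair each and every other word produces two.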

· A sample for an example of skip-gram: {like, I}, {like, the}, from “I like the natural language processing”.
One-hot vector of “I” 1 0 0 0 0 0
One-hot vector of “like” 0 1 0 0 0 0
One-hot vector of “the” 0 0 1 0 0 0
[Figure: Input vector (“like” word) -> Hidden Layer -> Output Layer (“I” word that neural net expects) -> cost function]
- W and W’ are different!
- Step 1: compare the “I” word vector that the neural net expects to the real “I” word vector.

· A sample for an example of skip-gram: {like, I}, {like, the}, from “I like the natural language processing”.
One-hot vector of “I” 1 0 0 0 0 0
One-hot vector of “like” 0 1 0 0 0 0
One-hot vector of “the” 0 0 1 0 0 0
[Figure: Input vector (“like” word) -> Hidden Layer -> Output Layer (“the” word that neural net expects) -> cost function]
- W and W’ are different!
- Step 2: compare the “the” word vector that the neural net expects to the real “the” word vector.
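
The two comparisons above can be turned into a small training loop. This is only a sketch under stated assumptions: plain stochastic gradient descent on softmax cross-entropy, a hidden size of 7, and a learning rate and epoch count picked arbitrarily. None of these hyperparameters come from the slides.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

vocab = ["I", "like", "the", "natural", "language", "processing"]
idx = {w: i for i, w in enumerate(vocab)}
samples = [("like", "I"), ("like", "the")]    # {center word, window word}

rng = np.random.default_rng(0)                # arbitrary seed
V, D = len(vocab), 7                          # D = 7 is an assumed hidden size
W = rng.normal(scale=0.1, size=(V, D))
W_prime = rng.normal(scale=0.1, size=(D, V))
lr, epochs = 0.1, 200                         # assumed hyperparameters

def total_loss():
    # Sum of cross-entropy losses over both sample pairs.
    return sum(-np.log(softmax(W[idx[c]] @ W_prime)[idx[o]]) for c, o in samples)

loss_before = total_loss()
for _ in range(epochs):
    for center, window_word in samples:
        c, o = idx[center], idx[window_word]
        hidden = W[c].copy()                  # one-hot input selects row c of W
        probs = softmax(hidden @ W_prime)     # what the neural net expects
        err = probs.copy()
        err[o] -= 1.0                         # expected minus real (one-hot) word
        grad_W_prime = np.outer(hidden, err)  # gradient w.r.t. W'
        grad_hidden = W_prime @ err           # gradient w.r.t. the hidden layer
        W_prime -= lr * grad_W_prime
        W[c] -= lr * grad_hidden              # only the center word's row changes
loss_after = total_loss()
print(float(loss_before), float(loss_after))
```

Because the same center word "like" has two different window words, the network cannot drive the loss to zero; it can only learn to split the probability between "I" and "the".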

· Let me explain the architecture of Continuous Bag-of-Word.
[Figure: CBOW architecture. Input Layer (window words) -> Hidden Layer -> Output Layer (center word) -> cost function]
- W and W’ are different!
- W’ : Word2Vec we want from CBOW
* It is normal to use negative sampling as the cost function.

· Let’s say our vocabulary is {I, like, the, NLP, programming}, taken from the sentence “I like the NLP programming”, and the window size is 1.
- Each pair consists of {[window words], center word}.
I like the NLP programming
{ [like], I }
{ [I, the], like }
{ [like, NLP], the }
{ [the, programming], NLP }
{ [NLP], programming }
A sample for an example of CBOW: { [I, the], like }
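
The CBOW pairs can be generated in the same way as the skip-gram pairs, with the window words grouped together. The helper name cbow_pairs is hypothetical, and whitespace tokenisation is assumed.

```python
def cbow_pairs(tokens, window=1):
    """Return ([window words], center word) pairs for every position."""
    pairs = []
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # words before the center word
        right = tokens[i + 1:i + window + 1]     # words after the center word
        pairs.append((left + right, center))
    return pairs

tokens = "I like the NLP programming".split()
pairs = cbow_pairs(tokens, window=1)
for pair in pairs:
    print(pair)
```

Each position in the sentence yields exactly one pair whose target is the word at that position.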

· A sample for an example of CBOW: { [I, the], like }, from “I like the NLP programming”.
[Figure: Input Layer (“I” word & “the” word) -> Hidden Layer -> Output Layer (“like” word that neural net expects) -> Softmax -> Cross-entropy (cost function)]
- W and W’ are different!
- W’ : Word2Vec we want from CBOW
- Compare the expectation of the neural net to the real value (the real “like” word).
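
A minimal sketch of the CBOW forward pass for the sample { [I, the], like }. Two things here are assumptions, not taken from the slides: the window-word vectors are averaged to form the hidden layer (a common choice), and the cost is the full softmax cross-entropy shown in the figure rather than negative sampling. The hidden size of 7 and the random weights are also placeholders.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

vocab = ["I", "like", "the", "NLP", "programming"]
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 7                      # D = 7 is an assumed hidden size
rng = np.random.default_rng(0)            # arbitrary seed
W = rng.normal(scale=0.1, size=(V, D))
W_prime = rng.normal(scale=0.1, size=(D, V))

window_words, center = ["I", "the"], "like"

# Assumption: the input is the average of the window words' one-hot vectors,
# so the hidden layer is the average of their rows in W.
x = np.zeros(V)
for w in window_words:
    x[idx[w]] += 1.0 / len(window_words)

hidden = x @ W
probs = softmax(hidden @ W_prime)         # what the neural net expects
loss = -np.log(probs[idx[center]])        # compared with the real "like" word
print(float(loss))
```

Compared with skip-gram, only the input side changes: several window words go in and a single center word is predicted.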
