Deep Learning in NLP
微光國際資訊有限公司 2018/06/09
About this Section
• Preprocessing data
• Work with NLTK + word2vec
• Introduce Recurrent Neural Networks (RNN)
• Introduce Long Short-Term Memory Networks (LSTM)
Preprocessing Data (Classic)
Preprocessing Data (NN)
Difference between Classic and NN
Classic vs. NN
This is called Feature Engineering
Bag-of-words
N-gram
Example sentence: Bob went to the market to buy some flowers.
Bag-of-words vector: [1, 1, 2, 1, 1, 1, 1, 1, 0, 0, 0]
Character bigrams: ["Bo", "ob", "b ", " w", "we", "en", ..., "me", "e ", " f", "fl", "lo", "ow", "we", "er", "rs"]
The drawbacks:
1. It consumes too many resources.
2. It requires domain experts, and their experience is hard to replicate.
Load Data
Import NLTK & use tokenize
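The original slides showed code screenshots here; a minimal sketch of loading a text file and tokenizing it with NLTK might look like this (the file name corpus.txt is a placeholder, not from the slides):

import nltk
from nltk.tokenize import sent_tokenize, word_tokenize

nltk.download("punkt")   # tokenizer models required by NLTK

# "corpus.txt" is a placeholder file name.
with open("corpus.txt", encoding="utf-8") as f:
    raw_text = f.read()

sentences = sent_tokenize(raw_text)                         # split into sentences
tokenized = [word_tokenize(s.lower()) for s in sentences]   # word tokens per sentence
print(tokenized[:2])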
Use word2vec and store it
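A hedged sketch of training a word2vec model with gensim on those tokenized sentences and storing it on disk; the model file name is a placeholder:

from gensim.models import Word2Vec

# `tokenized` is the list of token lists produced by the NLTK step above.
model = Word2Vec(tokenized, min_count=1)   # library defaults for vector size/window
model.save("word2vec.model")               # placeholder file name

# Reload it later:
model = Word2Vec.load("word2vec.model")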
word2vec - distance
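For the distance slide, gensim's word vectors expose a cosine-based distance between two words; the word pair here is only an example and assumes both words occur in the corpus:

# Cosine distance (1 - cosine similarity) between two in-vocabulary words.
print(model.wv.distance("man", "woman"))
print(model.wv.similarity("man", "woman"))   # the complementary similarity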
word2vec - similar
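For the similarity slide, most_similar returns the words whose vectors are closest to a query word; "king" is an assumed example query:

# Nearest neighbours of a query word by cosine similarity.
print(model.wv.most_similar("king", topn=5))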
word2vec - calculate distance
target = father + woman - man
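A sketch of the same arithmetic with gensim: most_similar with positive/negative word lists computes vec(father) + vec(woman) - vec(man) and returns the nearest words (with a good corpus, "mother" should rank near the top):

# target = father + woman - man
result = model.wv.most_similar(positive=["father", "woman"],
                               negative=["man"], topn=3)
print(result)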
An image we want you to remember
Before starting with NN, let's recall a few things
Recall three things:
• Activation Function
• Loss Function
• Optimizer
Activation Function
Loss Function
Optimizer
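These three choices all appear when a network is defined and compiled; the following Keras-style sketch is illustrative only (layer sizes, activation, loss, and optimizer are assumed choices, not taken from the slides):

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(100,)),  # activation function
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",               # optimizer
              loss="binary_crossentropy",     # loss function
              metrics=["accuracy"])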
About this Section
• Preprocessing data
• Work with NLTK + word2vec
• Introduce Recurrent Neural Networks (RNN)
• Introduce Long Short-Term Memory Networks (LSTM)
The problem with feedforward NN
Given these known relationships
Text prediction: James has a cat and it likes to drink ___ .
Graphical representation
The problem with feedforward NN
Now change these relationships
Text prediction: James has a cat and it likes to drink ___ .
Graphical representation
Begin with RNN
Given these known relationships
We can derive Y2
We can derive Y4
Graphical representation
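To make "derive Y2 and Y4" concrete, here is a minimal NumPy sketch of one vanilla RNN step; the dimensions and random weights are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(16, 10))   # input -> hidden (toy sizes)
W_hh = rng.normal(size=(16, 16))   # hidden -> hidden (the recurrent part)
W_hy = rng.normal(size=(5, 16))    # hidden -> output

def rnn_step(x_t, h_prev):
    # The hidden state h carries information from all previous steps.
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev)
    y_t = W_hy @ h_t               # output at this time step
    return h_t, y_t

h = np.zeros(16)
for x_t in rng.normal(size=(4, 10)):   # a toy 4-step input sequence
    h, y = rnn_step(x_t, h)            # Y2 depends on X1 and X2 through h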
Application in RNN
Application in RNN (Img)
Application in RNN (Table)
A problem with backpropagation
We want to find the relationship between the error and the weights
After expanding
Recall the previous video on YouTube
The problem with backpropagation in an RNN
We want to find the same relationship
But after expanding, we find that...
h appears recursively
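Written out (a standard BPTT-style expansion, sketched in the usual notation), the gradient of the error at step t with respect to the recurrent weights unrolls into a sum over earlier steps:

\frac{\partial E_t}{\partial W}
  = \sum_{k=1}^{t}
    \frac{\partial E_t}{\partial y_t}
    \frac{\partial y_t}{\partial h_t}
    \left( \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}} \right)
    \frac{\partial h_k}{\partial W}

The product of Jacobians \partial h_j / \partial h_{j-1} is the recursive part; multiplying many factors smaller or larger than 1 is what makes the gradient vanish or explode.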
Limit the number of recursive steps
Backpropagation Through Time (BPTT)
Effective, but it can make the gradients vanish or explode
Another problem: long-term dependencies
John lives in France. (five thousand words omitted…..)
(a few more words omitted.....) John speaks ___.
The gap is so long that the memory fades away
And so LSTM was born
About this Section
• Preprocessing data
• Work with NLTK + word2vec
• Introduce Recurrent Neural Networks (RNN)
• Introduce Long Short-Term Memory Networks (LSTM)
How does LSTM differ from RNN?
A standard RNN unit vs. the smallest LSTM unit
How does LSTM differ from RNN?
An LSTM can decide which memories to forget and which to keep
Application in LSTM
What is LSTM?
An LSTM has five values, four actions, and three gates
• Cell state: This is the internal cell state (that is, memory) of an LSTM cell
• Hidden state: This is the external hidden state used to calculate predictions
• Input gate: This determines how much of the current input is read into the cell state
• Forget gate: This determines how much of the previous cell state is sent into the current cell state
• Output gate: This determines how much of the cell state is output into the hidden state
(The cell state represents past experience; the hidden state represents the prediction.)
• Discard old information (that is, forget something…)
• Decide which new information to store in the cell state
• Update the old cell state Ct-1 into the new cell state Ct
• Output information
How does LSTM work?
• Input gate: This determines how much of the current input is read into the cell state (0~1)
• Forget gate: This determines how much of the previous cell state is sent into the current cell state (0~1)
• Output gate: This determines how much of the cell state is output into the hidden state (0~1)
A classic article on LSTM
Material from "Understanding LSTM Networks"
Discard old information (that is, forget something…)
• Forget gate: This determines how much of the previous cell state is sent into the current cell state (0~1)
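In the notation of "Understanding LSTM Networks", the forget gate is a sigmoid over the previous hidden state and the current input:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)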
Decide which new information to store in the cell state
• Input gate: This determines how much of the current input is read into the cell state (0~1)
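Following the same article, the input gate and the candidate cell values are:

i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)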
Update the old cell state Ct-1 into the new cell state Ct
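The update combines the two gates elementwise: keep part of the old memory, add part of the new candidate.

C_t = f_t * C_{t-1} + i_t * \tilde{C}_t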
Output information
• Output gate: This determines how much of the cell state is output into the hidden state (0~1)
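Finally, the output gate filters a squashed copy of the cell state to produce the new hidden state:

o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
h_t = o_t * \tanh(C_t)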
Practice it yourself
Use these functions to build an LSTM cell (a starting-point sketch follows the gate list below)
Input Gate
Output Gate
Forget Gate
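As a starting point for the exercise, here is a minimal NumPy sketch of one LSTM step built from the three gates above; the dimensions and initialization are illustrative assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W and b hold one weight matrix / bias per gate ("f", "i", "o") plus the candidate ("c").
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate
    c_hat = np.tanh(W["c"] @ z + b["c"])    # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat        # update the cell state
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate
    h_t = o_t * np.tanh(c_t)                # new hidden state
    return h_t, c_t

# Toy dimensions: 8-dim input, 16-dim hidden/cell state.
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(16, 24)) for k in "fico"}
b = {k: np.zeros(16) for k in "fico"}
h, c = np.zeros(16), np.zeros(16)
h, c = lstm_step(rng.normal(size=8), h, c, W, b)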
