Neural Turing Machine
Mark Chang
Outline
• Neurons -> neural networks
• Short-term memory -> from neural networks to deep learning
• Neural Turing Machine
Neurons and Action Potentials
http://humanphisiology.wikispaces.com/file/view/neuron.png/216460
814/neuron.png
http://upload.wikimedia.org/wikipedia/commons/thumb/4
/4a/Action_potential.svg/1037px-Action_potential.svg.png
Synapses
http://www.quia.com/files/quia/users/lmcgee/Systems/endocrine-nervous/synapse.gif
Simulating a Neuron
[Diagram: inputs x1 and x2 enter a neuron n through weights W1 and W2, together with a bias b weighted by Wb; n_in = W1·x1 + W2·x2 + Wb, and the output is y = n_out = 1/(1 + e^(−n_in)).]
[Diagram: in the (x1, x2) plane, the line W1·x1 + W2·x2 + Wb = 0 separates the region where y ≈ 1 from the region where y ≈ 0.]
Binary Classification: AND Gate

x1  x2  y
0   0   0
0   1   0
1   0   0
1   1   1

[Diagram: with weights 20, 20 and bias −30, the boundary 20·x1 + 20·x2 − 30 = 0 puts only (1,1) on the y = 1 side; (0,0), (0,1), and (1,0) fall on the y = 0 side.]
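The AND weights can be checked numerically. A minimal sketch in Python with NumPy, using the weights 20, 20 and bias −30 from the slide:

```python
import numpy as np

def neuron(x1, x2, w1, w2, wb):
    # n_in = w1*x1 + w2*x2 + wb, n_out = sigmoid(n_in)
    n_in = w1 * x1 + w2 * x2 + wb
    return 1.0 / (1.0 + np.exp(-n_in))

# AND gate: only (1,1) lands on the positive side of 20*x1 + 20*x2 - 30 = 0
outputs = [int(neuron(x1, x2, 20, 20, -30) > 0.5) for x1, x2 in
           [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 0, 0, 1]
```

The same neuron with a different bias gives the OR gate on the next slide.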
Binary Classification: OR Gate

x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   1

[Diagram: with weights 20, 20 and bias −10, the boundary 20·x1 + 20·x2 − 10 = 0 puts (0,1), (1,0), and (1,1) on the y = 1 side; only (0,0) falls on the y = 0 side.]
XOR Gate?

x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   0

[Diagram: no single line can separate (0,1) and (1,0) from (0,0) and (1,1), so one neuron cannot compute XOR.]
Binary Classification: XOR Gate

[Diagram: a two-layer solution. Hidden neuron n1 computes AND (weights 20, 20, bias −30) and hidden neuron n2 computes OR (weights 20, 20, bias −10); the output neuron combines them with weights −20, 20 and bias −10, i.e. y = (NOT n1) AND n2.]

x1  x2  n1  n2  y
0   0   0   0   0
0   1   0   1   1
1   0   0   1   1
1   1   1   1   0
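The table above can be verified directly; a small sketch reusing the slide's weights (n1 = AND, n2 = OR, output weights −20, 20, bias −10):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xor_net(x1, x2):
    n1 = sigmoid(20 * x1 + 20 * x2 - 30)      # hidden neuron: AND
    n2 = sigmoid(20 * x1 + 20 * x2 - 10)      # hidden neuron: OR
    return sigmoid(-20 * n1 + 20 * n2 - 10)   # output: (NOT AND) AND OR

outputs = [int(xor_net(x1, x2) > 0.5) for x1, x2 in
           [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # [0, 1, 1, 0]
```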
Neural Network

[Diagram: a 2-2-2 feedforward network. Inputs x and y (the Input Layer) feed hidden neurons n11 and n12 (the Hidden Layer) through weights W11,x, W11,y, W12,x, W12,y and biases W11,b, W12,b; the hidden neurons feed output neurons n21 and n22 (the Output Layer) through weights W21,11, W21,12, W22,11, W22,12 and biases W21,b, W22,b; the targets are z1 and z2.]
Visual Perception
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg
The (Supervised) Machine Learning Process
Training data → machine learning model → output.
Compare the output with the correct answer; if it is wrong, correct the model.
Once training is done: test data → machine learning model → output.
Long-Term Memory
http://www.pnas.org/content/102/49/17846/F7.large.jpg
Training a Neural Network
• Initialize the model parameters w with random values
• Forward Propagation
  – Compute the answer with the current model parameters
• Compute the error (with an error function)
• Backward Propagation
  – Use the error to correct the model
Training a Neural Network
Training data → machine learning model → output; compare with the correct answer and correct the model when it is wrong. The cycle: Initialization → Forward Propagation → Error Function → Backward Propagation.
Initialization
• Set every W to a random value between −N and N
• The W values in a layer must not all be identical
[Diagram: the 2-2-2 network again, with all weights W11,x … W22,b to be initialized.]
Forward Propagation
Error Function
[Diagram: outputs n21 and n22 are compared against targets z1 and z2.]
Gradient Descent
[Diagram: the error surface over parameters w0 and w1; each update moves the parameters along the negative gradient.]
Backward Propagation
http://cpmarkchang.logdown.com/posts/277349-neural-network-backward-propagation
Implementation
• Neural Network
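As a sketch of the whole loop above (initialization → forward propagation → error function → backward propagation), here is a minimal NumPy network trained on XOR with the cross-entropy error J and gradient-descent updates from the notes. This is an illustration with a single output unit, not the code from the linked post:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR training data: inputs X, targets Z
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Z = np.array([[0], [1], [1], [0]], dtype=float)

# Initialization: random weights in -N..N (here N = 1)
W1 = rng.uniform(-1, 1, (2, 2)); b1 = rng.uniform(-1, 1, (1, 2))
W2 = rng.uniform(-1, 1, (2, 1)); b2 = rng.uniform(-1, 1, (1, 1))

def forward(X):
    h = sigmoid(X @ W1 + b1)   # hidden layer (n11, n12)
    y = sigmoid(h @ W2 + b2)   # output layer
    return h, y

def loss(y):
    # Cross-entropy error function J
    return -np.mean(Z * np.log(y) + (1 - Z) * np.log(1 - y))

eta = 1.0
_, y0 = forward(X)
loss_before = loss(y0)
for _ in range(5000):
    h, y = forward(X)
    # Backward propagation: delta at the output is (n_out - z) for sigmoid + cross-entropy
    d_out = (y - Z) / len(X)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    W2 -= eta * h.T @ d_out; b2 -= eta * d_out.sum(axis=0)
    W1 -= eta * X.T @ d_hid; b1 -= eta * d_hid.sum(axis=0)
_, y1 = forward(X)
loss_after = loss(y1)
print(loss_before, "->", loss_after)
```

With enough iterations the error J decreases; the learning rate eta and iteration count are arbitrary choices for the sketch.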
Short-Term Memory
Reading the verse 「白日依山盡,黃河入海流」 one character at a time:
白 → 白日 → 白日依 → 白日依山 → …
Short-Term Memory
[Diagram: a feedforward neuron handles each character independently — 白 → n(白), 日 → n(日) — so it keeps no memory of earlier characters.]
Recurrent Neural Network
白 → n(白)
日 → n(n(白), 日)
依 → n(n(n(白), 日), 依)
From Neural Networks to Deep Learning
Feedforward Neural Network → Recurrent Neural Network → Long Short-Term Memory → Neural Turing Machine
Recurrent Neural Network
Feed the previous time step's n_out back into the current time step's n_in.
Recurrent Neural Network
[Diagram: the network unrolled through time — inputs x0, x1, …, xt produce outputs y0, y1, …, yt, with each hidden state passed on to the next time step.]
Backward Propagation Through Time
[Diagram: the gradient at t = 1 flows back through the recurrent connection to t = 0.]
http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
Implementation
• Recurrent Neural Network
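A minimal sketch of a single recurrent neuron, following n_in,t = w_c·x_t + w_p·n_out,t−1 + w_b from the notes (the scalar weights here are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn(xs, w_c=1.0, w_p=1.0, w_b=-1.0):
    n_out = 0.0                # initial state before the first input
    outputs = []
    for x_t in xs:
        # n_in,t = w_c*x_t + w_p*n_out,t-1 + w_b
        n_out = sigmoid(w_c * x_t + w_p * n_out + w_b)
        outputs.append(n_out)
    return outputs

# The same input 0.0 at two different times gives different outputs,
# because the recurrent state carries short-term memory of the past.
out = rnn([0.0, 0.0])
```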
Vanishing Gradient Problem
In Backward Propagation Through Time, the gradient is a long product of per-step factors, so it can shrink toward zero as it flows back through many time steps.
Long-Short Term Memory
[Diagram: the memory cell. Input x_t enters through neuron k (producing k_out) as the input value C_in; gates C_write, C_forget, and C_read control writing m_in,t into the cell c, keeping the previous content m_out,t−1, and reading the cell out through neuron n (producing n_out) as C_out, which becomes the output y_t.]
Long-Short Term Memory
Input value C_in; read gate C_read; forget gate C_forget; write gate C_write; output value C_out.
Long-Short Term Memory
• Write gate C_write: controls whether the memory cell can be written
• Forget gate C_forget: controls whether the previous value is kept
• Read gate C_read: controls whether the memory cell can be read
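Putting the three gates together, a minimal scalar sketch of the simplified cell in these slides (the gate formulas follow the notes; the numeric weights are arbitrary illustrations, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, y_prev, m_prev, w):
    k_out    = sigmoid(w["kx"] * x_t + w["kb"])                     # candidate input
    c_write  = sigmoid(w["wx"] * x_t + w["wy"] * y_prev + w["wb"])  # write gate
    c_forget = sigmoid(w["fx"] * x_t + w["fy"] * y_prev + w["fb"])  # forget gate
    c_read   = sigmoid(w["rx"] * x_t + w["ry"] * y_prev + w["rb"])  # read gate
    m_in = k_out * c_write              # gated write into the memory cell
    m_out = m_in + c_forget * m_prev    # forget gate decides how much old value to keep
    c_out = sigmoid(m_out) * c_read     # gated read: the cell's output
    return c_out, m_out

# Arbitrary illustrative weights
w = {"kx": 1.0, "kb": 0.0, "wx": 2.0, "wy": 0.0, "wb": -1.0,
     "fx": 0.0, "fy": 0.0, "fb": 2.0, "rx": 2.0, "ry": 0.0, "rb": -1.0}
y, m = 0.0, 0.0
for x in [1.0, 0.0, 0.0]:
    y, m = lstm_step(x, y, m, w)   # the cell value m persists across steps
```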
Training: Backward Propagation
http://www.felixgers.de/papers/phd.pdf
Long-Short Term Memory
https://class.coursera.org/neuralnets-2012-001/lecture/95
Neural Turing Machine
[Diagram: a controller receives the Input and produces the Output, and reads and writes the Memory through Read/Write Heads.]
Memory
[Diagram: the memory is a matrix of memory blocks. Memory addresses run 0, 1, …, i, …, n across the blocks; each block is a vector of length m indexed 0, …, j, …, m.]
Read Operation
Read Operation: r ← Σ_i w(i)·M(i), where Σ_i w(i) = 1 and 0 ≤ w(i) ≤ 1 for all i.
Example — Memory M (blocks as columns):
1  2  3  …
1  1  2  …
2  4  1  …
Head location w = (0.9, 0.1, 0, 0, 0) gives the read vector r = (1.1, 1.0, 2.2).
Erase Operation
Erase Operation: M(i) ← (1 − w(i)·e) ∘ M(i) elementwise, where 0 ≤ e(j) ≤ 1 for all j.
Example — with head location w = (0.9, 0.1, 0, 0, 0) and erase vector e = (1, 0, 1), the memory becomes:
0.1  1.8  3  …
1    1    2  …
0.2  3.6  1  …
Add Operation
Add Operation: M(i) ← M(i) + w(i)·a.
Example — with head location w = (0.9, 0.1, 0, 0, 0) and add vector a = (1, 1, 0), the memory becomes:
1.0  1.9  3  …
1.9  1.1  2  …
0.2  3.6  1  …
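The three operations above can be reproduced with the slide's numbers; a sketch in NumPy where memory blocks are the columns of M:

```python
import numpy as np

# Memory with block length 3 and three visible blocks (columns), from the slides
M = np.array([[1., 2., 3.],
              [1., 1., 2.],
              [2., 4., 1.]])
w = np.array([0.9, 0.1, 0.0])   # head location: attention weights over blocks

# Read: r = sum_i w(i) * M(i)
r = M @ w                        # -> [1.1, 1.0, 2.2]

# Erase: M(i) <- (1 - w(i)*e) * M(i), elementwise within each block
e = np.array([1., 0., 1.])
M = M * (1 - np.outer(e, w))     # rows 0 and 2 are partially erased

# Add: M(i) <- M(i) + w(i)*a
a = np.array([1., 1., 0.])
M = M + np.outer(a, w)
```

Because w sums to 1 and is differentiable, all three operations are differentiable, which is what lets the whole machine be trained by gradient descent.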
Controller
[Diagram: the controller takes the Input and the Read Vector, and outputs the Output, the Erase Vector, the Add Vector, and the Head Location.]
Addressing Mechanisms
The controller outputs a Memory Key k and four parameters: a Content Addressing parameter β, an Interpolation parameter g, a Convolutional Shift parameter s, and a Sharpening parameter γ. Starting from the previous state (the memory and the previous head location), the new head location is produced in four stages:
Content Addressing → Interpolation → Convolutional Shift → Sharpening
Content Addressing
Find the locations in memory M whose content is close to the memory key k:
w(i) ← e^(β·K[k, M(i)]) / Σ_j e^(β·K[k, M(j)]), where K[u, v] = u·v / (|u|·|v|) (cosine similarity).
The parameter β adjusts how concentrated the weights are: β = 0 gives uniform weights (.16, .16, .16, .16, .16, .16), while a larger β concentrates them on the best-matching blocks (e.g. .15 .10 .47 .08 .13 .17 for the key (2, 3, 1) in the slide's example).
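A sketch of content addressing in NumPy; the memory matrix is the read-operation example, and the key is chosen here to match one block exactly:

```python
import numpy as np

def content_addressing(M, key, beta):
    # w(i) proportional to exp(beta * cosine(key, M(i))), over blocks (columns) of M
    M_cols = M.T
    cos = (M_cols @ key) / (np.linalg.norm(M_cols, axis=1) * np.linalg.norm(key))
    w = np.exp(beta * cos)
    return w / w.sum()

M = np.array([[1., 2., 3.],
              [1., 1., 2.],
              [2., 4., 1.]])
key = np.array([2., 1., 4.])     # equals column 1 of M exactly
w_sharp = content_addressing(M, key, beta=10.0)   # concentrated on block 1
w_flat = content_addressing(M, key, beta=0.0)     # beta = 0 ignores content: uniform
```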
Interpolation
w_t ← g·w_t + (1 − g)·w_{t−1}: blend the head location from content addressing with the previous time step's head location. The parameter g adjusts the ratio between the current and previous locations — e.g. g = 0.5 blends the current location (0, 0, 1, 0, 0, 0) with the previous location (0.9, 0.1, 0, 0, 0, 0) into (.45, .05, .5, 0, 0, 0).
Convolutional Shift
Shift the values within w: w(i) ← w(i−1)·s(1) + w(i)·s(0) + w(i+1)·s(−1), a circular convolution. The parameter s adjusts the shift direction: s = (0, 1, 0) leaves w unchanged, a one-hot kernel shifts the weights one position, and s = (.5, 0, .5) spreads them to both neighbours — e.g. (.45, .05, .5, 0, 0, 0) becomes (.025, .475, .025, .25, 0, .225).
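The shift can be written as a circular convolution; the example numbers below match the slide, with s indexed as (s(−1), s(0), s(+1)):

```python
import numpy as np

def conv_shift(w, s):
    # w(i) <- w(i-1)*s(+1) + w(i)*s(0) + w(i+1)*s(-1), with circular wrap-around
    n = len(w)
    return np.array([w[(i - 1) % n] * s[2] + w[i] * s[1] + w[(i + 1) % n] * s[0]
                     for i in range(n)])

w = np.array([.45, .05, .5, 0., 0., 0.])
same = conv_shift(w, s=(0., 1., 0.))    # s(0) = 1: no shift
blur = conv_shift(w, s=(.5, 0., .5))    # spread to both neighbours
# blur -> [.025, .475, .025, .25, 0., .225]
```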
Sharpening
w(i) ← w(i)^γ / Σ_j w(j)^γ: make the values in w more concentrated (or more spread out). The parameter γ adjusts the concentration — a large γ sharpens (0, .45, .05, .5, 0, 0) toward (0, .37, 0, .62, 0, 0), while a small γ flattens the weights toward uniform (.16, .16, .16, .16, .16, .16).
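Sharpening is an elementwise power followed by renormalization; a short sketch:

```python
import numpy as np

def sharpen(w, gamma):
    # w(i) <- w(i)^gamma / sum_j w(j)^gamma
    wg = w ** gamma
    return wg / wg.sum()

w = np.array([0., .45, .05, .5, 0., 0.])
peaked = sharpen(w, gamma=5.0)   # large gamma concentrates the weights
spread = sharpen(w, gamma=0.5)   # gamma < 1 spreads them out
```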
Neural Turing Machine
Implementation
http://awawfumin.blogspot.tw/2015/03/neural-turing-machines-implementation.html
Experiment: Repeat Copy
https://github.com/fumin/ntm
Evolution of Recurrent Neural Network
• Recurrent Neural Network: short-term memory
• Long Short Term Memory: controllable memory reads and writes
• Neural Turing Machine: more flexible control over the position of the memory read/write heads
Implementation
• Neural Turing Machine
Further Reading
• Machine learning
– Logistic Regression
• http://cpmarkchang.logdown.com/posts/189069-logisti-regression-model
– Overfitting and Regularization
• http://cpmarkchang.logdown.com/posts/193261-machine-learning-overfitting-and-regularization
– Model Selection
• http://cpmarkchang.logdown.com/posts/193914-machine-learning-model-selection
• Neural networks
– Neural Network Backward Propagation
• http://cpmarkchang.logdown.com/posts/277349-neural-network-backward-propagation
– Recurrent Neural Network
• http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
– Long Short Term Memory
• http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf
• http://www.felixgers.de/papers/phd.pdf
– Neural Turing Machine
• http://arxiv.org/pdf/1410.5401.pdf
• http://awawfumin.blogspot.tw/2015/03/neural-turing-machines-implementation.html
Online Courses
• Machine learning
– https://www.coursera.org/course/ntumlone
– https://www.coursera.org/course/ntumltwo
• Neural networks
– https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
– https://www.coursera.org/course/neuralnets
Neural Turing Machine Source Code
• https://github.com/fumin/ntm
Speaker Contact:
• Mark Chang
– facebook:
https://www.facebook.com/ckmarkoh.chang
– Github:http://github.com/ckmarkoh
– Blog:http://cpmarkchang.logdown.com
– email:ckmarkoh at gmail.com
• Fumin
– Github:https://github.com/fumin
– Email:awawfumin at gmail.com
Editor's Notes
  1. y = \frac{1}{ 1+e^{- ( w_{1} x_{1} + w_{2}x_{2}+w_{b} ) }} & n_{in} = w_{1} x_{1} + w_{2}x_{2}+w_{b} \\ & n_{out} = \frac{1}{1+e^{-n_{in}}}
  2. w_{1}x_{1}+w_{2}x_{2}+w_{b} = 0 w_{1}x_{1}+w_{2}x_{2}+w_{b} < 0 w_{1}x_{1}+w_{2}x_{2}+w_{b} >0
  3. y = \frac{1}{1+e^{-(20x_{1}+20x_{2}-30)}} 20x_{1}+20x_{2}-30 = 0
  4. y = \frac{1}{1+e^{-(20x_{1}+20x_{2}-10)}} 20x_{1}+20x_{2}-10 = 0
  5. & J = -( z_{1} log(n_{21(out)}) + (1-z_{1}) log (1 -n_{21(out)} )) \\ &\mspace{30mu} -( z_{2} log(n_{22(out)}) + (1-z_{2}) log (1 -n_{22(out)} )) \\ & n_{out} \approx 0 \text{ and } z = 0 \Rightarrow J \approx 0 \\ & n_{out} \approx 1 \text{ and } z = 1 \Rightarrow J \approx 0 \\ & n_{out} \approx 0 \text{ and } z = 1 \Rightarrow J \approx \infty \\ & n_{out} \approx 1 \text{ and } z = 0 \Rightarrow J \approx \infty \\
  6. & w_{21,11} \leftarrow w_{21,11} - \eta \dfrac{\partial J}{\partial w_{21,11}} \\ & w_{21,12} \leftarrow w_{21,12} - \eta \dfrac{\partial J}{\partial w_{21,12}} \\ & w_{21,b} \leftarrow w_{21,b} - \eta \dfrac{\partial J}{\partial w_{21,b}} \\ & w_{22,11} \leftarrow w_{22,11} - \eta \dfrac{\partial J}{\partial w_{22,11}} \\ & w_{22,12} \leftarrow w_{22,12} - \eta \dfrac{\partial J}{\partial w_{22,12}} \\ & w_{22,b} \leftarrow w_{22,b} - \eta \dfrac{\partial J}{\partial w_{22,b}} \\ &w_{11,x} \leftarrow w_{11,x} - \eta \dfrac{\partial J}{\partial w_{11,x}} \\ &w_{11,y} \leftarrow w_{11,y} - \eta \dfrac{\partial J}{\partial w_{11,y}} \\ &w_{11,b} \leftarrow w_{11,b} - \eta \dfrac{\partial J}{\partial w_{11,b}} \\ &w_{12,x} \leftarrow w_{12,x} - \eta \dfrac{\partial J}{\partial w_{12,x}} \\ &w_{12,y} \leftarrow w_{12,y} - \eta \dfrac{\partial J}{\partial w_{12,y}} \\ &w_{12,b} \leftarrow w_{12,b} - \eta \dfrac{\partial J}{\partial w_{12,b}} \\ ( – \dfrac{ \partial J}{\partial w_{0}} , – \dfrac{ \partial J}{\partial w_{1}} )
  7. \dfrac{\partial J}{\partial w_{21,11}} = \dfrac{\partial J}{\partial n_{21(out)}} \dfrac{\partial n_{21(out)}}{\partial n_{21(in)}} \dfrac{\partial n_{21(in)}}{\partial w_{21,11}} = (n_{21(out)}-z_{1}) n_{11(out)} = \delta_{21(in)} n_{11(out)} \qquad w_{21,11} \leftarrow w_{21,11} - \eta \delta_{21(in)} n_{11(out)}
  8. \dfrac{\partial J}{\partial w_{11,x}} = \dfrac{\partial J}{\partial n_{11(in)}} \dfrac{\partial n_{11(in)}}{\partial w_{11,x}} = \delta_{11(in)} x \qquad w_{11,x} \leftarrow w_{11,x} - \eta \delta_{11(in)} x
  9. & {\color[rgb]{0.597455,0.000000,0.759310}\delta_{11(in)}} =\dfrac{\partial J}{\partial n_{11(in)}} ={\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{21(out)}} } \dfrac{\partial n_{21(out)}}{\partial n_{11(in)}} + {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{22(out)}}} \dfrac{\partial n_{22(out)}}{\partial n_{11(in)}} \\ & {\color[rgb]{0.597455,0.000000,0.759310}\delta_{11(in)}} =\dfrac{\partial J}{\partial n_{11(in)}} ={\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{21(out)}} } \dfrac{\partial n_{21(out)}}{\partial n_{11(in)}} + {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{22(out)}}} \dfrac{\partial n_{22(out)}}{\partial n_{11(in)}} \\ &= {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{21(out)}}} {\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{21(out)}}{\partial n_{21(in)}} } {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial n_{21(in)}}{\partial n_{11(out)}} } {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{11(out)}}{\partial n_{11(in)}} } + {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J_{2}}{\partial n_{22(out)}} } {\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{22(out)}}{\partial n_{22(in)}} } {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial n_{22(in)}}{\partial n_{11(out)}} } {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{11(out)}}{\partial n_{11(in)}}} \\ &= ({\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{21(out)}}} {\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{21(out)}}{\partial n_{21(in)}} } {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial n_{21(in)}}{\partial n_{11(out)}} } + {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J_{2}}{\partial n_{22(out)}} } {\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{22(out)}}{\partial n_{22(in)}} } {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial 
n_{22(in)}}{\partial n_{11(out)}} }) {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{11(out)}}{\partial n_{11(in)}}} \\ &= ( {\color[rgb]{1.000000,0.000000,0.000000}\delta_{21(in)} } {\color[rgb]{0.795165,0.000000,0.447221}w_{21,11} } + {\color[rgb]{1.000000,0.000000,0.000000}\delta_{22(in)} } {\color[rgb]{0.795165,0.000000,0.447221}w_{22,11} }) {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{11(out)}}{\partial n_{11(in)}}} \\
  10. & n_{in,t} = w_{c}x_{t}+ w_{p}n_{out,t-1} + w_{b} \\ & n_{out,t} = \frac{1}{1+e^{-n_{in,t}}} \\
  11. & n_{in,t} = w_{c}x_{t}+ w_{p}n_{out,t-1} + w_{b} \\ & n_{out,t} = \frac{1}{1+e^{-n_{in,t}}} \\
  12. & {\color[rgb]{1.000000,0.000000,0.000000}\delta_{in,0} } = {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{out,0}} }{\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{out,0}}{\partial n_{in,0}}} \\ & = {\color[rgb]{1.000000,0.500000,0.000000}\delta_{out,0}} {\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{out,0}}{\partial n_{in,0}} } & {\color[rgb]{0.597455,0.000000,0.759310}\delta_{in,0} } {\color[rgb]{0.000000,0.000000,0.000000}=} {\color[rgb]{1.000000,0.500000,0.000000}\dfrac{\partial J}{\partial n_{out,1}} }{\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{out,1}}{\partial n_{in,1}}} {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial n_{in,1}}{\partial n_{out,0} }} {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{out,0}}{\partial n_{in,0} }} \\ & {\color[rgb]{0.000000,0.000000,0.000000}=} {\color[rgb]{1.000000,0.500000,0.000000}\delta_{out,1}} {\color[rgb]{1.000000,0.000000,0.000000}\dfrac{\partial n_{out,1}}{\partial n_{in,1}} } {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial n_{in,1}}{\partial n_{out,0} }} {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{out,0}}{\partial n_{in,0} }} \\ & {\color[rgb]{0.000000,0.000000,0.000000}=} {\color[rgb]{1.000000,0.000000,0.000000}\delta_{in,1} } {\color[rgb]{0.795165,0.000000,0.447221}\dfrac{\partial n_{in,1}}{\partial n_{out,0} }} {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{out,0}}{\partial n_{in,0} }} {\color[rgb]{0.000000,0.000000,0.000000}=} {\color[rgb]{0.795165,0.000000,0.447221}\delta_{out,0} } {\color[rgb]{0.597455,0.000000,0.759310}\dfrac{\partial n_{out,0}}{\partial n_{in,0} }}\\
  13. \delta_{in,s}= \begin{cases} \dfrac{\partial J}{ \partial n_{out,s} } \dfrac{ \partial n_{out,s}}{\partial n_{in,s} } & \text{if } s = t \\ \delta_{in,s+1} \dfrac{ \partial n_{in,s+1}}{\partial n_{out,s} } \dfrac{ \partial n_{out,s}}{\partial n_{in,s} } & \text{otherwise} \end{cases}
  14. \delta_{in,0} = \dfrac{\partial J}{\partial n_{in,0}} = \dfrac{\partial J}{\partial n_{out,t}} \dfrac{\partial n_{out,t} }{\partial n_{in,t}} \dfrac{\partial n_{in,t} }{\partial n_{out,t-1}} ... \dfrac{\partial n_{in,1} }{\partial n_{out,0}} \dfrac{\partial n_{out,0} }{\partial n_{in,0}} \delta_{in,0} = \delta_{out,t} \dfrac{\partial n_{out,t} }{\partial n_{in,t}} \dfrac{\partial n_{in,t} }{\partial n_{out,t-1}} ... \dfrac{\partial n_{in,1} }{\partial n_{out,0}} \dfrac{\partial n_{out,0} }{\partial n_{in,0}}
  15. k_{out} = sigmoid(w_{k,x}x_{t}+w_{k,b}) C_{write} = sigmoid(w_{cw,x}x_{t}+w_{cw,y}y_{t-1}+w_{cw,b}) m_{in,t} = k_{out} C_{write}
  16. C_{forget}= sigmoid(w_{cf,x}x_{t} + w_{cf,y}y_{t-1} + w_{cf,b}) m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}
  17. n_{out}=sigmoid(m_{out,t}) C_{read}= sigmoid(w_{cr,x} x_{t} + w_{cr,y} y_{t-1} + w_{cr,b}) C_{out} = n_{out} C_{read}
  18. {\color[rgb]{0.036634,0.303698,0.550063}\dfrac{\partial m_{out,t}}{\partial w_{k,x}} }= {\color[rgb]{0.036634,0.303698,0.550063}\dfrac{\partial m_{in,t}}{\partial w_{k,x}}} + {\color[rgb]{0.615686,0.188235,0.215686}C_{forget} }{\color[rgb]{0.813054,0.443433,0.792399}\dfrac{\partial m_{out,t-1}}{\partial w_{k,x}}}
  19. \begin{bmatrix} r_{0} \\[0.3em] r_{1} \\[0.3em] r_{2} \\[0.3em] \end{bmatrix} =\begin{bmatrix} 1*0.9+2*0.1 \\[0.3em] 1*0.9+1*0.1 \\[0.3em] 2*0.9+4*0.1 \\[0.3em] \end{bmatrix} = \begin{bmatrix} 1.1 \\[0.3em] 1.0 \\[0.3em] 2.2 \\[0.3em] \end{bmatrix} \textbf{r} \leftarrow \sum_{i}w(i)\textbf{M}(i) &\sum_{i}w(i) = 1 \\ & 0 \leq w(i) \leq 1, \forall i \\
  20. \textbf{M}(i) \leftarrow (1-w(i) \textbf{e} ) \textbf{M}(i) 0 \leq e(j) \leq 1, \forall j M= \begin{bmatrix} 1(1-0.9) & 2(1-0.1) & 3 & ... \\[0.3em] 1 & 1 & 2 & ... \\[0.3em] 2(1-0.9) & 4(1-0.1) & 1 & ... \\[0.3em] \end{bmatrix} =\begin{bmatrix} 0.1 & 1.8 & 3 & ... \\[0.3em] 1 & 1 & 2 & ... \\[0.3em] 0.2 & 3.6 & 1 & ... \\[0.3em] \end{bmatrix}
  21. \textbf{M}(i) \leftarrow \textbf{M}(i) + w(i) \textbf{a} M= \begin{bmatrix} 0.1+0.9 & 1.8+0.1 & 3 & ... \\[0.3em] 1.0+0.9 & 1.0+0.1 & 2 & ... \\[0.3em] 0.2 & 3.6 & 1 & ... \\[0.3em] \end{bmatrix} =\begin{bmatrix} 1.0 & 1.9 & 3 & ... \\[0.3em] 1.9 & 1.1 & 2 & ... \\[0.3em] 0.2 & 3.6 & 1 & ... \\[0.3em] \end{bmatrix}
  22. \textbf{k}
  23. w(i) \leftarrow \frac{e^{\beta K[\textbf{k},\textbf{M}(i)] } }{ \sum_{j} e^{ \beta K[\textbf{k},\textbf{M}(j)] } } K[\textbf{u},\textbf{v} ] = \frac{ \textbf{u} \cdot \textbf{v} }{ |\textbf{u}| \cdot |\textbf{v}| }
  24. \textbf{w}_{t} \leftarrow g \textbf{w}_{t} + (1-g) \textbf{w}_{t-1}
  25. w(i) \leftarrow w(i-1) s(1) + w(i)s(0) + w(i+1)s(-1)
  26. w(i) \leftarrow \frac{w(i)^{\gamma}}{\sum_{j}w(j)^{\gamma}}