DIFFERENTIABLE
NEURAL COMPUTER
Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external
memory." Nature 538.7626 (2016): 471-476.
Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural Turing Machines." arXiv preprint arXiv:1410.5401 (2014).
문지형 (jhmoon@dm.snu.ac.kr)
LAB SEMINAR
SNU DataMining Center
PROBLEM OF RNN
➤ Exploding & vanishing gradient
2
if the largest eigenvalue of the recurrent weight matrix is > 1, the gradient will explode
if the largest eigenvalue is < 1, the gradient will vanish
Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016. p.404-405
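A minimal numeric sketch of this effect (toy diagonal matrices with eigenvalues of my own choosing, not from the slides): repeatedly multiplying a vector by the recurrent matrix blows it up or shrinks it depending on the largest eigenvalue.

```python
import numpy as np

# Repeated multiplication by the recurrent Jacobian is what makes
# gradients explode or vanish over long sequences.
def repeated_norm(W, steps=50):
    v = np.ones(W.shape[0])
    for _ in range(steps):
        v = W @ v
    return np.linalg.norm(v)

W_explode = np.array([[1.1, 0.0], [0.0, 0.9]])   # largest eigenvalue 1.1 > 1
W_vanish  = np.array([[0.9, 0.0], [0.0, 0.5]])   # largest eigenvalue 0.9 < 1

print(repeated_norm(W_explode))  # grows roughly like 1.1**50
print(repeated_norm(W_vanish))   # shrinks roughly like 0.9**50
```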
LSTM & GRU & ATTENTION
➤ Gates manage information flow effectively
➤ Attention brings in information from the time steps where it is needed
3
Hochreiter, Sepp, and Jürgen Schmidhuber.
"Long short-term memory." Neural
computation 9.8 (1997): 1735-1780.
Chung, Junyoung, et al. "Gated feedback
recurrent neural networks." International
Conference on Machine Learning. 2015.
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural
machine translation by jointly learning to align and translate." arXiv
preprint arXiv:1409.0473 (2014).
STILL…
➤ LSTM cannot even solve simple tasks like copying
4
Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014).
WHY THIS HAPPENS
➤ Artificial neural networks are remarkably adept at sensory processing, sequence
learning and reinforcement learning
➤ But are limited in their ability to represent variables and data structures and to store
data over long timescales
➤ LSTM keeps information and performs computation in the same hidden state
5
BASIC IDEA
➤ Modern computers separate computation and memory
➤ Can this separation be modeled with a neural network?
6
http://people.idsia.ch/~rupesh/rnnsymposium2016/slides/graves.pdf
TURING MACHINE
➤ An abstract computing machine devised by Turing in 1936. A Turing machine carries out computation and logical operations in sequence, and showed that any computation is possible given suitable storage and an algorithm, thereby providing the prototype of the modern computer. ‒ Turing machine (Dictionary of Experimental Psychology Terms, 2008, Sigma Press)
7
VON NEUMANN ARCHITECTURE
➤ The Turing Machine is a theoretical concept for using a machine to solve mathematical problems the way a person would
➤ The Von Neumann Architecture is a design for implementing an actual computer based on the Turing Machine concept
8
https://en.wikipedia.org/wiki/Von_Neumann_architecture
NEURAL TURING MACHINE
NEURAL TURING MACHINE
➤ Tries to mimic Von Neumann Architecture with neural networks
➤ Because it is a neural network, it can be trained with gradient descent
➤ Controller (LSTM or Feed-forward Network)
A. interacts with the external world via input and output vectors
B. also interacts with a memory matrix using selective read and write operations (heads)
➤ Memory (Matrix)
10
DEALING WITH MEMORY
➤ How to access the specific location of memory
➤ How to read and write to memory
11
DEALING WITH MEMORY
➤ How to access the specific location of memory
➤ How to read and write to memory
12
ADDRESSING
➤ Addressing mechanism
1. Content Addressing
2. Interpolation
3. Convolutional Shift
4. Sharpening
13
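The four stages above can be sketched as follows (a rough sketch, not the slides' exact equations; the shapes, shift-kernel convention, and variable names are my own assumptions): memory `M` is N x W, key `k` has length W, and `w_prev` is the previous weighting.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Minimal sketch of the four NTM addressing stages.
def ntm_address(M, k, beta, g, shift, gamma, w_prev):
    # 1. Content addressing: cosine similarity, sharpened by beta
    sim = (M @ k) / (np.linalg.norm(M, axis=1) * np.linalg.norm(k) + 1e-8)
    w_c = softmax(beta * sim)
    # 2. Interpolation: gate g blends content weighting with w_prev
    w_g = g * w_c + (1 - g) * w_prev
    # 3. Convolutional shift: circular convolution with the shift kernel
    N = len(w_g)
    w_s = np.array([sum(w_g[(i - j) % N] * shift[j] for j in range(len(shift)))
                    for i in range(N)])
    # 4. Sharpening: raise to gamma >= 1 and renormalize
    w = w_s ** gamma
    return w / w.sum()
```

With a key equal to one of the memory rows and a large beta, the resulting weighting concentrates on that row, which is the behavior the content-addressing stage is after.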
DEALING WITH MEMORY
➤ How to access the specific location of memory
➤ How to read and write to memory
14
CONTROLLER
➤ Access memory to read and write
➤ Read
➤ Given the 'location' in memory where the information to read resides (1)
➤ Read out the memory stored at that location (2)
15
CONTROLLER
➤ Access memory to read and write
➤ Write
➤ Erase: before writing, remove information that is no longer needed from memory
16
CONTROLLER
➤ Access memory to read and write
➤ Write
➤ Add: write in the new information
17
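The read/erase/add operations above can be sketched in a few lines (shapes are my assumption: memory `M` is N x W, weighting `w` has length N, erase `e` and add `a` have length W):

```python
import numpy as np

def read(M, w):
    # weighted sum over memory rows: r = M^T w
    return M.T @ w

def write(M, w, e, a):
    # erase first (entries of e in [0, 1]), then add, both scaled by w
    M = M * (1 - np.outer(w, e))
    return M + np.outer(w, a)
```

With a one-hot `w` this reduces to reading or overwriting a single row; soft weightings blend rows, which is what keeps the whole machine differentiable.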
DIFFERENTIABLE
NEURAL COMPUTER
DIFFERENTIABLE NEURAL COMPUTER (DNC)
➤ DNC extends the NTM by addressing the following limitations:
1. Ensuring that blocks of allocated memory do not overlap and interfere
2. Freeing memory that has already been written to
3. Handling of non-contiguous memory through temporal links
19
DIFFERENTIABLE NEURAL COMPUTER (DNC)
➤ More like computer
➤ When a computer evaluates a+b, the + is handled by the CPU while a and b are managed in memory
➤ a and b are memory addresses, so whatever values are supplied, an answer can be produced as long as + is available
➤ https://youtu.be/B9U8sI7TcMY
20
1 9 2 3 4 7 9 9 8
source edge destination
CONTROLLER NETWORK
➤ LSTM or Feed-forward Network (same as NTM)
➤ LSTM:
➤ Feed-forward: function of X_t
21
CONTROLLER NETWORK
➤ LSTM or Feed-forward Network (same as NTM)
➤ LSTM:
➤ Feed-forward: function of X_t
22
learnable parameters
DEALING WITH MEMORY
➤ How to access the specific location of memory
➤ Better than (= more complicated than) NTM
1. Keep allocated memory blocks from overlapping
2. Free old memories
3. Remember the order in which memories were written
➤ How to read and write to memory
➤ Same as NTM
23
DEALING WITH MEMORY
➤ How to access the specific location of memory
➤ Better than (= more complicated than) NTM
1. Keep allocated memory blocks from overlapping
2. Free old memories
3. Remember the order in which memories were written
➤ How to read and write to memory
➤ Same as NTM
24
NTM’S ADDRESSING REVIEW
➤ Addressing mechanism
1. Content Addressing
2. Interpolation
3. Convolutional Shift
4. Sharpening
25
INTERFACE PARAMETERS
➤ Contains the information that parameterizes the memory interactions
26
INTERFACE PARAMETERS
➤ Contains the information that parameterizes the memory interactions
27
INTERFACE PARAMETERS
➤ Contains the information that parameterizes the memory interactions
28
1. Keep allocated memory blocks from overlapping
2. Free old memories
3. Remember the order in which memories were written
MEMORY ADDRESSING
➤ 3 forms of attention
1. content-based addressing
2. dynamic memory allocation
3. temporal memory linkage
29
CONTENT-BASED ADDRESSING
30
beta controls the spread of the weighting (larger beta → sharper, smaller beta → smoother)
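The effect of beta can be illustrated with a tiny sketch (toy similarity scores of my own; the real content weighting computes cosine similarity against every memory row before this softmax):

```python
import numpy as np

def content_weights(sims, beta):
    # softmax over similarity scores; beta scales the sharpness
    e = np.exp(beta * (sims - sims.max()))
    return e / e.sum()

sims = np.array([0.9, 0.5, 0.1])         # toy similarity scores
print(content_weights(sims, beta=1.0))   # relatively smooth
print(content_weights(sims, beta=20.0))  # nearly one-hot on the best match
```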
DYNAMIC MEMORY ALLOCATION
➤ Goal: write new information to the unused parts of memory
➤ Which parts are unused?
➤ memory usage vector
➤ update the memory usage vector given the previous usage vector (which does not yet reflect the locations written at step t-1), the locations that have just been written to, and the memories that have been retained by the free gates (recently read information = recently consumed information)
31
ψ is called the memory retention vector: it represents "how much each location will not be freed by the free gates"
DYNAMIC MEMORY ALLOCATION
➤ Goal: write new information to the unused parts of memory
➤ How is the new information written?
➤ Build a new vector listing the locations of the updated usage vector in increasing order of usage
➤ Its first entry holds the location with the lowest usage
➤ Build the allocation vector so that low-usage locations are more likely to receive the new information
32
(figure: memory locations sorted in ascending order of usage, e.g. 2 4 5 7 8 10 11 13 6 12 9 3 1)
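The allocation steps above can be sketched as follows (a rough sketch following the paper's usage/allocation equations; variable names are mine): `u` is the usage vector, `psi` the memory retention vector, and `w_write_prev` the previous write weighting, all with one entry per memory location.

```python
import numpy as np

def update_usage(u_prev, w_write_prev, psi):
    # locations written at the previous step become more used;
    # psi < 1 frees locations released by the free gates
    return (u_prev + w_write_prev - u_prev * w_write_prev) * psi

def allocation(u):
    # visit locations from least to most used ("free list"); the least-used
    # location gets the largest share of the allocation weighting
    phi = np.argsort(u)
    a = np.zeros_like(u)
    prod = 1.0
    for j in phi:
        a[j] = (1 - u[j]) * prod
        prod *= u[j]
    return a

u = np.array([0.9, 0.1, 0.5])
print(allocation(u))  # largest weight at index 1, the least-used slot
```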
TEMPORAL MEMORY LINKAGE
➤ Goal: to keep track of consecutively modified memory locations
➤ Linkage matrix
➤ entry (i, j) indicates whether information was stored at location i right after location j
➤ needs the location being written at the current step (t) and the location written at the immediately preceding step (t-1)
➤ the precedence weighting vector records the locations written at step t
➤ the linkage matrix is updated from this information
33
Old links gradually decay; an entry becomes 1 when location i is written now and location j was written immediately before
<Linkage matrix example: write order 2 -> 1 -> 4 -> 3>
0 1 0 0 0
0 0 0 0 0
0 0 0 1 0
1 0 0 0 0
0 0 0 0 0
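The linkage and precedence updates can be sketched like this (a minimal sketch of the update rules; variable names are mine): `L[i, j]` encodes "location i was written right after location j", `p` is the precedence weighting, `w` the current write weighting.

```python
import numpy as np

def update_linkage(L, p_prev, w):
    N = len(w)
    for i in range(N):
        for j in range(N):
            if i != j:
                # old links decay as locations i or j are rewritten; a new
                # link forms when i is written now and j was written before
                L[i, j] = (1 - w[i] - w[j]) * L[i, j] + w[i] * p_prev[j]
    return L

def update_precedence(p_prev, w):
    # p tracks how recently each location was written to
    return (1 - w.sum()) * p_prev + w

# One-hot writes to locations 1, 0, 3, 2 in turn (0-indexed):
L, p = np.zeros((4, 4)), np.zeros(4)
for loc in [1, 0, 3, 2]:
    w = np.eye(4)[loc]
    L = update_linkage(L, p, w)
    p = update_precedence(p, w)
# L now encodes the write chain 1 -> 0 -> 3 -> 2, e.g. L[0, 1] == 1
```

Multiplying `L` by a read weighting then points the read head at the location written next in the chain, which is how sequential recall works.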
READ AND WRITE WEIGHTING (ADDRESSING)
➤ Write head allocation
➤ content-based addressing + dynamic memory allocation
➤ write at memory locations similar to the key + write at free locations
➤ Read head allocation
➤ content-based addressing + temporal memory linkage
➤ read from memory locations similar to the key + read related pieces of information in sequence
34
READ AND WRITE WEIGHTING (ADDRESSING)
➤ Write head allocation
➤ content-based addressing + dynamic memory allocation (allocate to free locations)
➤ Read head allocation
➤ content-based addressing + temporal memory linkage (read related information sequentially)
35
<Forward weighting example: write order 2 -> 1 -> 4 -> 3>
0 1 0 0 0
0 0 0 0 0
0 0 0 1 0   @   [0 0 0 1 0]^T = [0 0 1 0 0]^T
1 0 0 0 0
0 0 0 0 0
(multiplying the linkage matrix by a read weighting focused on location 4 yields a forward weighting focused on location 3, the location written next)
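Putting the two head allocations together (a sketch of the blends described above; the gate and mode names are mine, not the slides'): `g_w` scales overall write strength, `g_a` trades allocation against content lookup, and `pi` picks among the three read modes.

```python
import numpy as np

def write_weighting(g_w, g_a, alloc, content):
    # blend of dynamic allocation and content-based lookup
    return g_w * (g_a * alloc + (1 - g_a) * content)

def read_weighting(pi, backward, content, forward):
    # pi sums to 1 (a softmax over the three read modes)
    return pi[0] * backward + pi[1] * content + pi[2] * forward
```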
DEALING WITH MEMORY
➤ How to access the specific location of memory
➤ Better than (= more complicated than) NTM
1. Keep allocated memory blocks from overlapping
2. Free old memories
3. Remember the order in which memories were written
➤ How to read and write to memory
➤ Same as NTM
36
READING AND WRITING TO MEMORY
➤ Reading
➤ Writing
37
missing weight implicitly assigned to a null operation
that does not access any of the locations
Reading Writing
REVIEW
38
EXPERIMENT (GRAPH TASK)
➤ Traversal
➤ Training
39
1 9 2 3 4 7 9 9 8
source edge destination
Graph Description Phase Query Phase Answer Phase
1 9 2 3 4 7 0 0 0
0 0 0 3 4 7 0 0 0
1 9 2 3 4 7 9 9 8
9 9 8 3 4 7 9 9 7
EXPERIMENT (GRAPH TASK)
➤ Traversal
➤ London Underground data
40
RESULT (GRAPH TASK)
➤ London Underground
41
EXPERIMENT (GRAPH TASK)
➤ Inference
➤ Training
42
1 9 2 0 0 7 2 9 7
start | single-edge relation (e.g. "son1") | end
Graph Description Phase Query Phase Answer Phase
1 9 2 3 4 7 0 0 0 1 9 2 3 4 7 9 9 8
multiple-edge relation (e.g. "paternal grandson")
The model infers that the relation "paternal grandson" (347) is the son (007) of a son (007)
EXPERIMENT (GRAPH TASK)
➤ Inference
43
SUMMARY
➤ Gates and attention in RNNs were introduced to prevent information loss and to use only the information that is needed
➤ content memory is fragile
➤ can’t increase the amount of memory easily (computation also increases)
➤ NTM and DNC were proposed to separate memory and computation, as in a computer
➤ Being end-to-end models, they can be trained with gradient descent
➤ Memory access is modeled with attention so the model can focus on the parts it needs
➤ Read / Erase / Write (Add)
➤ Memory addressing
44
DNC is also another form of RNN!
https://github.com/deepmind/dnc
REFERENCE
1. Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external
memory." Nature 538.7626 (2016): 471-476.
2. Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural Turing Machines." arXiv preprint arXiv:1410.5401 (2014).
3. https://www.slideshare.net/carpedm20/differentiable-neural-computer
4. https://norman3.github.io/papers/docs/neural_turing_machine.html
5. https://www.youtube.com/watch?v=r5XKzjTFCZQ
6. https://deepmind.com/blog/differentiable-neural-computers/
45
Thank you

Differentiable Neural Computer

  • 1. DIFFERENTIABLE NEURAL COMPUTER Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature 538.7626 (2016): 471-476. Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv: 1410.5401 (2014). 문지형 (jhmoon@dm.snu.ac.kr) LAB SEMINAR SNU DataMining Center
  • 2. PROBLEM OF RNN ➤ Exploding & vanishing gradient 2 ! ! !!!! if the largest eigenvalue is > 1, 
 gradient will explode if the largest eigenvalue is < 1, 
 gradient will vanish Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT Press, 2016. p.404-405
  • 3. LSTM & GRU & ATTENTION ➤ Gate를 통해 정보를 효과적으로 관리 ➤ Attention을 통해 필요한 시점의 정보를 반영 3 Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780. Chung, Junyoung, et al. "Gated feedback recurrent neural networks." International Conference on Machine Learning. 2015. Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
  • 4. STILL… ➤ LSTM cannot even solve simple tasks like copying 4 Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014).
  • 5. WHY THIS HAPPENS ➤ Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning ➤ But are limited in their ability to represent variables and data structures and to store data over long timescales ➤ LSTM simultaneously keeps information and processes computing 5
  • 6. BASIC IDEA ➤ Modern computers separate computation and memory ➤ Modeling? 6 http://people.idsia.ch/~rupesh/rnnsymposium2016/slides/graves.pdf
  • 7. TURING MACHINE ➤ 1936년에 Turing이 고안한 추상적 계산 기계. 튜링 머신은 순서에 따라 계산이나 논리 조작을 행하는 장치로, 적절한 기억 장소와 알고리즘만 주어진다면 어떠한 계산이라도 가능함을 보여 주어 현대 컴퓨터의 원형을 제시하였다. ‒ 튜링 머신 [Turing machine] (실험심리학용어사전, 2008., 시그마프레스㈜) 7
  • 8. VON NEUMANN ARCHITECTURE ➤ Turing Machine은 기계를 사용해 수학적인 문제를 사람처럼 풀기 위한 이론적인 개념 ➤ Von Neumann Architecture는 Turing Machine의 개념을 바탕으로 실제 컴퓨터를 
 구현하기 위한 구조 8 https://en.wikipedia.org/wiki/Von_Neumann_architecture
  • 10. NEURAL TURING MACHINE ➤ Tries to mimic Von Neumann Architecture with neural networks ➤ Neural Network를 사용하므로 gradient descent를 통해 학습이 가능 ➤ Controller (LSTM or Feed-forward Network) A. interacts with the external world via input and output vectors B. also interacts with a memory matrix using selective read and write operations (heads) ➤ Memory (Matrix) 10
  • 11. DEALING WITH MEMORY ➤ How to access the specific location of memory ➤ How to read and write to memory 11
  • 12. DEALING WITH MEMORY ➤ How to access the specific location of memory ➤ How to read and write to memory 12
  • 13. ADDRESSING ➤ Addressing mechanism 1. Content Addressing 2. Interpolation 3. Convolutional Shift 4. Sharpening 13
  • 14. DEALING WITH MEMORY ➤ How to access the specific location of memory ➤ How to read and write to memory 14
  • 15. CONTROLLER ➤ Access memory to read and write ➤ Read ➤ 메모리에서 읽을 정보가 있는 ‘위치’가 주어지면 (1) ➤ 그 위치에 저장된 메모리를 읽어들임 (2) 15
  • 16. CONTROLLER ➤ Access memory to read and write ➤ Write ➤ Erase: 메모리에 쓰기 전, 필요없는 정보를 지움 16
  • 17. CONTROLLER ➤ Access memory to read and write ➤ Write ➤ Add: 새로운 정보를 입력 17
  • 19. DIFFERENTIABLE NEURAL COMPUTER (DNC) ➤ DNC extends NTM addressing the following limitations: 1. Ensuring that blocks of allocated memory do not overlap and interfere 2. Freeing memory that have already been written to 3. Handling of non-contiguous memory through temporal links 19
  • 20. DIFFERENTIABLE NEURAL COMPUTER (DNC) ➤ More like a computer ➤ When a computer evaluates a+b, the + is handled by the CPU while a and b are managed in memory ➤ a and b are memory addresses, so whatever values are supplied, the answer can be produced as long as + is available ➤ https://youtu.be/B9U8sI7TcMY 20 (graph triples: source, edge, destination — e.g., 1 9 2 / 3 4 7 / 9 9 8)
  • 21. CONTROLLER NETWORK ➤ LSTM or Feed-forward Network (same as the NTM) ➤ LSTM: ➤ Feed-forward: function of X_t 21
  • 22. CONTROLLER NETWORK ➤ LSTM or Feed-forward Network (same as the NTM) ➤ LSTM: ➤ Feed-forward: function of X_t 22 (highlighted terms are the learnable parameters)
  • 23. DEALING WITH MEMORY ➤ How to access the specific location of memory ➤ Better than (= more complicated than) the NTM 1. Keep memory blocks from overlapping 2. Free stale memory 3. Remember the order in which memories were written ➤ How to read and write to memory ➤ Same as the NTM 23
  • 24. DEALING WITH MEMORY ➤ How to access the specific location of memory ➤ Better than (= more complicated than) the NTM 1. Keep memory blocks from overlapping 2. Free stale memory 3. Remember the order in which memories were written ➤ How to read and write to memory ➤ Same as the NTM 24
  • 25. NTM’S ADDRESSING REVIEW ➤ Addressing mechanism 1. Content Addressing 2. Interpolation 3. Convolutional Shift 4. Sharpening 25
  • 26. INTERFACE PARAMETERS ➤ Contains information that parameterizes the memory interactions 26
  • 27. INTERFACE PARAMETERS ➤ Contains information that parameterizes the memory interactions 27
  • 28. INTERFACE PARAMETERS ➤ Contains information that parameterizes the memory interactions 28 1. Keep memory blocks from overlapping 2. Free stale memory 3. Remember the order in which memories were written
  • 29. MEMORY ADDRESSING ➤ 3 forms of attentions 1. content-based addressing 2. dynamic memory allocation 3. temporal memory linkage 29
  • 30. CONTENT-BASED ADDRESSING 30 beta controls how spread out the weighting is (larger beta → sharper, smaller beta → smoother)
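A hedged sketch of content-based addressing: a softmax over cosine similarities to the key, sharpened by beta. The toy memory contents and the helper name `content_weighting` are made up for illustration; the two calls demonstrate the sharpening effect of beta described on the slide.

```python
import numpy as np

def content_weighting(M, key, beta):
    # softmax over cosine similarities between each memory row and the key
    sim = (M @ key) / (np.linalg.norm(M, axis=1) * np.linalg.norm(key) + 1e-8)
    e = np.exp(beta * sim)
    return e / e.sum()

M = np.array([[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]])  # toy 3-slot memory
k = np.array([1.0, 0.0])
w_sharp = content_weighting(M, k, beta=50.0)   # large beta: nearly one-hot
w_smooth = content_weighting(M, k, beta=1.0)   # small beta: spread out
```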
  • 31. DYNAMIC MEMORY ALLOCATION ➤ Goal: write new information into the unused parts of memory ➤ Unused parts? ➤ memory usage vector ➤ update the memory usage vector from the previous usage vector (which does not yet reflect the locations written at step t-1), the locations that have just been written to, and the memories retained by the free gates (recently read information = recently consumed information) 31 the memory retention vector represents 'how much each location will not be freed by the free gates'
  • 32. DYNAMIC MEMORY ALLOCATION ➤ Goal: write new information into the unused parts of memory ➤ Write new information? ➤ Sort the locations by the updated usage vector, least used first, into a new vector (the free list) ➤ its first entry holds the least-used location ➤ build the allocation vector so that less-used locations are more likely to receive the new write 32
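The usage update and allocation weighting described on the last two slides can be sketched as follows, following the DNC paper's equations as I understand them; `allocation_weighting` and its argument names are illustrative, not from the reference implementation.

```python
import numpy as np

def allocation_weighting(u_prev, w_write_prev, free_gates, w_reads_prev):
    # retention: how much each location will NOT be freed by the free gates
    psi = np.prod(1 - free_gates[:, None] * w_reads_prev, axis=0)
    # usage rises where we just wrote, and decays where reads freed memory
    u = (u_prev + w_write_prev - u_prev * w_write_prev) * psi
    # visit locations from least used to most used (the sorted "free list")
    a = np.zeros_like(u)
    prod = 1.0
    for j in np.argsort(u):
        a[j] = (1 - u[j]) * prod  # most allocation mass to the least-used slot
        prod *= u[j]
    return u, a
```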
  • 33. TEMPORAL MEMORY LINKAGE ➤ Goal: keep track of consecutively modified memory locations ➤ Linkage matrix ➤ its (i, j) entry indicates whether location i was written to right after location j ➤ this requires the location being written at the current step (t) and the location written at the previous step (t-1) ➤ a precedence weighting vector records the location written at step t ➤ the linkage matrix is updated from this information 33 (old links gradually fade away; an entry is 1 when i is written immediately after j) <Linkage matrix example: writing in the order 2 → 1 → 4 → 3 puts ones at entries (1, 2), (4, 1), and (3, 4)>
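A sketch of the precedence and linkage updates, assuming the DNC update rules (old links from or to freshly written locations decay, and the diagonal is zeroed); the function and variable names are illustrative.

```python
import numpy as np

def update_linkage(L, p, w_write):
    # links touching freshly written locations decay; new links point
    # from the current write location i back to the previous one j (via p)
    wi = w_write[:, None]
    wj = w_write[None, :]
    L = (1 - wi - wj) * L + wi * p[None, :]
    np.fill_diagonal(L, 0.0)  # a location is never linked to itself
    # precedence weighting: "where did the most recent write happen?"
    p = (1 - w_write.sum()) * p + w_write
    return L, p
```

Running two one-hot writes in a row puts a 1 in the row of the second location and the column of the first, matching the example above.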
  • 34. READ AND WRITE WEIGHTING (ADDRESSING) ➤ Write head allocation ➤ content-based addressing + dynamic memory allocation ➤ write to locations whose memory is similar to the key + write to free locations ➤ Read head allocation ➤ content-based addressing + temporal memory linkage ➤ read from locations whose memory is similar to the key + read out related information in sequence 34
  • 35. READ AND WRITE WEIGHTING (ADDRESSING) ➤ Write head allocation ➤ content-based addressing + dynamic memory allocation (allocating to free locations) ➤ Read head allocation ➤ content-based addressing + temporal memory linkage (reading out related information in sequence) 35 (example: with write order 2 → 1 → 4 → 3, multiplying the linkage matrix by a read weighting focused on location 1 moves the focus forward to location 4)
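The linkage-based part of the read weighting can be sketched by mixing backward, content, and forward weightings with a read-mode distribution, as in the DNC paper; `read_weighting` and its argument names are illustrative.

```python
import numpy as np

def read_weighting(L, w_read_prev, w_content, pi):
    f = L @ w_read_prev    # forward: locations written right AFTER the last read
    b = L.T @ w_read_prev  # backward: locations written right BEFORE it
    # pi = (backward, content, forward) read-mode distribution
    return pi[0] * b + pi[1] * w_content + pi[2] * f
```

In pure forward mode (pi = 0, 0, 1) the head steps through memory in the order it was written, which is what lets the DNC replay sequences.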
  • 36. DEALING WITH MEMORY ➤ How to access the specific location of memory ➤ Better than (= more complicated than) the NTM 1. Keep memory blocks from overlapping 2. Free stale memory 3. Remember the order in which memories were written ➤ How to read and write to memory ➤ Same as the NTM 36
  • 37. READING AND WRITING TO MEMORY ➤ Reading ➤ Writing 37 (any missing weight is implicitly assigned to a null operation that does not access any of the locations)
  • 39. EXPERIMENT (GRAPH TASK) ➤ Traversal ➤ Training 39 [Figure: input triples of (source, edge, destination), e.g., 1 9 2 / 3 4 7 / 9 9 8, presented over a Graph Description Phase, a Query Phase, and an Answer Phase]
  • 40. EXPERIMENT (GRAPH TASK) ➤ Traversal ➤ London Underground data 40
  • 41. RESULT (GRAPH TASK) ➤ London Underground 41
  • 42. EXPERIMENT (GRAPH TASK) ➤ Inference ➤ Training 42 [Figure: triples of (start, relation, end), e.g., 1 9 2 and 2 9 7, over a Graph Description Phase, a Query Phase, and an Answer Phase] ➤ single-edge relation, e.g., 'son' (0 0 7) ➤ multiple-edge relation, e.g., 'paternal grandson' (3 4 7) ➤ the model infers that the relation 'paternal grandson' (3 4 7) is the 'son' (0 0 7) of a 'son' (0 0 7)
  • 44. SUMMARY ➤ Gates and attention in RNNs were introduced to prevent information loss and to use only the information that is needed ➤ content memory is fragile ➤ can't increase the amount of memory easily (computation also increases) ➤ NTM and DNC are proposed, separating memory from computation as a computer does ➤ Being end-to-end models, they can be trained with gradient descent ➤ memory access is modeled with attention so that the relevant parts can be focused on ➤ Read / Erase / Write (Add) ➤ Memory addressing 44 DNC is also another form of RNN! https://github.com/deepmind/dnc
  • 45. REFERENCE 1. Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature 538.7626 (2016): 471-476. 2. Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014). 3. https://www.slideshare.net/carpedm20/differentiable-neural-computer 4. https://norman3.github.io/papers/docs/neural_turing_machine.html 5. https://www.youtube.com/watch?v=r5XKzjTFCZQ 6. https://deepmind.com/blog/differentiable-neural-computers/ 45