Rabbit challenge 5_dnn3
TOMMYLINK1
RNN,LSTM,GRU,Seq2Seq
1.
Rabbit Challenge Stage 4-1
Deep Learning DAY3: RNN, LSTM, GRU, Seq2Seq, and more
2.
Purpose
Summarize what was learned in Stage 4 (Day 3) as an assignment report for the Rabbit Challenge.
3.
1. RNN 2. LSTM 3. GRU 4. Bidirectional RNN 5. Seq2Seq 6. Implementation exercises
4.
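1.1 What is an RNN?
RNN (Recurrent Neural Network): a neural network that learns from and makes predictions on time-series data.
Time-series data: a sequence of data observed at regular intervals in temporal order, in which statistical dependencies between observations can be recognized.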
5.
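1.2 RNN model: basic structure
[Figure: the RNN unrolled in time. Inputs $x_1, \dots, x_n$ enter through input weights $w^{in}_1, \dots, w^{in}_n$; hidden states $z_0, z_1, \dots, z_n$ are chained through recurrent weights $w_0, \dots, w_{n-1}$; outputs $y_1, \dots, y_n$ leave through output weights $w^{out}_1, \dots, w^{out}_n$.]
The recurrent weights determine how much of the input from the previous hidden layer is taken into the next hidden layer.
Key feature: the hidden-layer output at the previous time step is used for learning at the next time step.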
6.
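1.3 RNN model: computation
[Figure: one time step of the RNN. $x_t$ and $z_{t-1}$ are weighted by $w_{in}$ and $w$ and added to the bias $b$ to give $u_t$, then $z_t = f(u_t)$; $z_t$ is weighted by $w_{out}$ and added to the bias $c$ to give $v_t$, then $y_t = g(v_t)$.]
$$u_t = w_{in} x_t + w z_{t-1} + b$$
$$z_t = f(u_t)$$
$$v_t = w_{out} z_t + c$$
$$y_t = g(v_t)$$
Implementation example:
    u[:, t] = np.dot(x, w_in) + np.dot(z[:, t-1].reshape(1, -1), w) + b
    z[:, t] = sigmoid(u[:, t])
    v[:, t] = np.dot(z[:, t].reshape(1, -1), w_out) + c
    y[:, t] = sigmoid(v[:, t])
Here z[:, t-1] is the slice at column t-1: the second index runs along the time direction and the first over the data.
Note: the weights ($w$, $w_{in}$, $w_{out}$) do not depend on the time step, so they are constants during forward propagation.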
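The four equations of slide 1.3 can be run end to end as a small NumPy sketch. This is only an illustration under stated assumptions: the function name `rnn_forward`, the shapes, the zero initial state, and the time-first array layout (the slide uses time as the second index) are choices made here, not taken from the slides.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def rnn_forward(x, w_in, w, w_out, b, c):
    """Forward pass of a simple RNN (hypothetical helper).

    x: (T, input_dim) input sequence.
    Returns z: (T + 1, hidden_dim) with z[0] the all-zero initial state,
    and y: (T, output_dim).
    """
    T = x.shape[0]
    z = np.zeros((T + 1, w.shape[0]))
    y = np.zeros((T, w_out.shape[1]))
    for t in range(T):
        u = x[t] @ w_in + z[t] @ w + b   # u_t = w_in x_t + w z_{t-1} + b
        z[t + 1] = sigmoid(u)            # z_t = f(u_t), with f = sigmoid
        v = z[t + 1] @ w_out + c         # v_t = w_out z_t + c
        y[t] = sigmoid(v)                # y_t = g(v_t), with g = sigmoid
    return z, y

# Illustrative sizes and random weights (assumptions).
rng = np.random.default_rng(0)
T, input_dim, hidden_dim, output_dim = 5, 2, 4, 1
x = rng.normal(size=(T, input_dim))
w_in = rng.normal(size=(input_dim, hidden_dim))
w = rng.normal(size=(hidden_dim, hidden_dim))
w_out = rng.normal(size=(hidden_dim, output_dim))
b = np.zeros(hidden_dim)
c = np.zeros(output_dim)

z, y = rnn_forward(x, w_in, w, w_out, b, c)
print(z.shape, y.shape)
```

The same weight arrays are reused at every step of the loop, which is the code-level counterpart of the note that the weights do not depend on the time step.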
7.
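1.4 BPTT (1)
BPTT (Back Propagation Through Time): starting from the error at the time step where the output is produced, the error is back-propagated in the direction that goes back through time.
[Figure: one RNN time step ($x_t$, $z_{t-1}$, $u_t$, $z_t$, $v_t$, $y_t$) with the target $d_t$ and the error $E$ attached to the output.]
Note: the weights are the same at every time step, so the error derivatives are summed over all time steps.
Definitions:
$$\delta_{out,t} \equiv \frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}$$
$$\delta_t \equiv \frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial z_t}\frac{\partial z_t}{\partial u_t}$$
Gradients:
$$\frac{\partial E}{\partial w_{out}} = \sum_{t=1}^{T}\frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial w_{out}} = \sum_{t=1}^{T}\delta_{out,t}\frac{\partial}{\partial w_{out}}(w_{out} z_t + c) = \sum_{t=1}^{T}\delta_{out,t}\, z_t$$
$$\frac{\partial E}{\partial c} = \sum_{t=1}^{T}\frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial c} = \sum_{t=1}^{T}\delta_{out,t}\frac{\partial}{\partial c}(w_{out} z_t + c) = \sum_{t=1}^{T}\delta_{out,t}$$
$$\frac{\partial E}{\partial w} = \sum_{t=1}^{T}\frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial z_t}\frac{\partial z_t}{\partial u_t}\frac{\partial u_t}{\partial w} = \sum_{t=1}^{T}\delta_t\frac{\partial}{\partial w}(w_{in} x_t + w z_{t-1} + b) = \sum_{t=1}^{T}\delta_t\, z_{t-1}$$
$$\frac{\partial E}{\partial w_{in}} = \sum_{t=1}^{T}\frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial z_t}\frac{\partial z_t}{\partial u_t}\frac{\partial u_t}{\partial w_{in}} = \sum_{t=1}^{T}\delta_t\frac{\partial}{\partial w_{in}}(w_{in} x_t + w z_{t-1} + b) = \sum_{t=1}^{T}\delta_t\, x_t$$
$$\frac{\partial E}{\partial b} = \sum_{t=1}^{T}\frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial z_t}\frac{\partial z_t}{\partial u_t}\frac{\partial u_t}{\partial b} = \sum_{t=1}^{T}\delta_t\frac{\partial}{\partial b}(w_{in} x_t + w z_{t-1} + b) = \sum_{t=1}^{T}\delta_t$$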
8.
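1.5 BPTT (2): expanding δ
[Figure: one RNN time step ($x_t$, $z_{t-1}$, $u_t$, $z_t$, $v_t$, $y_t$) with the target $d_t$ and the error $E$.]
Expand and rearrange
$$\delta_t \equiv \frac{\partial E}{\partial y_t}\frac{\partial y_t}{\partial v_t}\frac{\partial v_t}{\partial z_t}\frac{\partial z_t}{\partial u_t} = \frac{\partial E}{\partial u_t}$$
Contribution from the output layer:
$$\delta_t = \frac{\partial E}{\partial u_t} = \frac{\partial E}{\partial v_t}\frac{\partial v_t}{\partial u_t} = \delta_{out,t}\frac{\partial}{\partial u_t}(w_{out} z_t + c) = \delta_{out,t}\frac{\partial}{\partial u_t}(w_{out} f(u_t) + c) = \delta_{out,t}\, w_{out}\, f'(u_t)$$
Contribution from the hidden layer at the later time step:
$$\delta_{t-1} = \frac{\partial E}{\partial u_{t-1}} = \frac{\partial E}{\partial u_t}\frac{\partial u_t}{\partial u_{t-1}} = \delta_t\frac{\partial u_t}{\partial z_{t-1}}\frac{\partial z_{t-1}}{\partial u_{t-1}} = \delta_t\frac{\partial}{\partial z_{t-1}}(w_{in} x_t + w z_{t-1} + b)\,\frac{\partial}{\partial u_{t-1}} f(u_{t-1}) = \delta_t\, w\, f'(u_{t-1})$$
δ_t is the sum of the output-layer contribution and the contribution from the hidden layer at the later time step:
$$\delta_t = (\delta_{out,t}\, w_{out} + \delta_{t+1}\, w)\, f'(u_t)$$
Note: the error at a given time step is defined as the sum of the error that originates at that time step and the error propagated back from the later time step.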
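The recursion for δ on slide 1.5 and the gradient sums on slide 1.4 can be checked numerically. The sketch below is a scalar RNN under assumptions that are not in the slides: f = tanh, g = identity, squared-error loss, and δ beyond the last time step taken as 0. The function names are hypothetical. It compares the BPTT gradients with central finite differences.

```python
import numpy as np

def forward(params, x):
    """Scalar RNN; index 0 of u, z, y holds the initial state."""
    w_in, w, w_out, b, c = params
    T = len(x)
    u, z, y = np.zeros(T + 1), np.zeros(T + 1), np.zeros(T + 1)
    for t in range(1, T + 1):
        u[t] = w_in * x[t - 1] + w * z[t - 1] + b
        z[t] = np.tanh(u[t])              # f = tanh (assumption)
        y[t] = w_out * z[t] + c           # g = identity (assumption)
    return u, z, y

def loss(params, x, d):
    _, _, y = forward(params, x)
    return 0.5 * np.sum((y[1:] - d) ** 2)  # squared error (assumption)

def bptt(params, x, d):
    """Gradients in the order (w_in, w, w_out, b, c)."""
    w_in, w, w_out, b, c = params
    u, z, y = forward(params, x)
    grads = np.zeros(5)
    delta_next = 0.0                       # delta beyond the last step is 0
    for t in range(len(x), 0, -1):
        delta_out = y[t] - d[t - 1]        # dE/dy_t * dy_t/dv_t
        f_prime = 1.0 - np.tanh(u[t]) ** 2
        delta = (delta_out * w_out + delta_next * w) * f_prime
        grads[0] += delta * x[t - 1]       # dE/dw_in  = sum_t delta_t x_t
        grads[1] += delta * z[t - 1]       # dE/dw     = sum_t delta_t z_{t-1}
        grads[2] += delta_out * z[t]       # dE/dw_out = sum_t delta_out,t z_t
        grads[3] += delta                  # dE/db     = sum_t delta_t
        grads[4] += delta_out              # dE/dc     = sum_t delta_out,t
        delta_next = delta
    return grads

def numerical_grad(params, x, d, eps=1e-6):
    g = np.zeros_like(params)
    for i in range(len(params)):
        p_plus, p_minus = params.copy(), params.copy()
        p_plus[i] += eps
        p_minus[i] -= eps
        g[i] = (loss(p_plus, x, d) - loss(p_minus, x, d)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
params = np.array([0.5, -0.3, 0.8, 0.1, -0.2])   # illustrative values
x, d = rng.normal(size=6), rng.normal(size=6)
analytic, numeric = bptt(params, x, d), numerical_grad(params, x, d)
print(np.max(np.abs(analytic - numeric)))
```

Dropping the `delta_next * w` term makes the check fail for the hidden-side parameters, which is one way to see why the error from the later time step has to be added in.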
9.
1.6 BPTT (3)
重み更新 𝑑𝑡 E 誤差 𝒛𝒕 𝑥𝑡 𝑦𝑡 𝑧𝑡−1 𝑔(𝑣𝒕) + 𝑓(𝑢𝒕) 𝑏 𝑢𝒕 + 𝑐 𝑣𝒕 … 𝑤𝑜𝑢𝑡 𝑤𝑖𝑛 𝑤 𝑤𝑜𝑢𝑡𝑡+1 = 𝑤𝑜𝑢𝑡𝑡 − 𝜖 𝜕𝐸 𝜕𝑤𝑜𝑢𝑡 = 𝑤𝑜𝑢𝑡𝑡+1 − δ𝑜𝑢𝑡,𝑡 𝑧𝑡 𝑐𝑡+1 = 𝑐𝑡 − 𝜖 𝜕𝐸 𝜕𝑐 = 𝑐𝑡 − δ𝑜𝑢𝑡,𝑡 𝑤𝑡+1 = 𝑤𝑡 − 𝜖 𝜕𝐸 𝜕𝑤 = 𝑤𝑡 − 𝑇=1 𝑇𝑡 δ𝑇 𝑧𝑇−1 𝑤𝑖𝑛𝑡+1 = 𝑤𝑖𝑛𝑡 − 𝜖 𝜕𝐸 𝜕𝑤𝑖𝑛 = 𝑤𝑖𝑛𝑡 − 𝑇=1 𝑇𝑡 δ𝑇 𝑥𝑇 𝑏𝑡+1 = 𝑏𝑡 − 𝜖 𝜕𝐸 𝜕𝑏 = 𝑏𝑡 − 𝑇=1 𝑇𝑡 δ𝑇 出力に関する重みなので、 当該時間のみで重み更新する 入力から中間層に関する重み なので、ここまでの時間成分 で重み更新する
10.
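1.7 Why BPTT is processed recursively in the time direction
Recall
$$u_t = w_{in} x_t + w z_{t-1} + b, \qquad z_t = f(u_t)$$
Consider the error $E$ at time $t$:
$$E_t = loss(y_t, d_t)$$
$$= loss(g(w_{out} z_t + c), d_t)$$
$$= loss(g(w_{out} f(w_{in} x_t + w z_{t-1} + b) + c), d_t)$$
$$= loss(g(w_{out} f(w_{in} x_t + w f(w_{in} x_{t-1} + w z_{t-2} + b) + b) + c), d_t)$$
Each $z$ term expands into an expression containing the $z$ of the previous time step, so $t$ keeps going further back: the computation is recursive.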
11.
1. RNN 2. LSTM 3. GRU 4. Bidirectional RNN 5. Seq2Seq 6. Implementation exercises
12.
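2.1 Problems with RNNs
Because the layers are lined up along the time axis, the network gets deeper as time steps accumulate.
This causes the vanishing gradient problem and the exploding gradient problem (*).
(*) Exploding gradient: the gradient grows every time it propagates through a layer (this tends to happen, for example, when the activation function is the identity function).
One countermeasure is gradient clipping. Sample code:
    import numpy as np

    def grad_clip(grad, thresh):
        norm = np.linalg.norm(grad)   # norm (vector magnitude) of the gradient
        rate = thresh / norm          # ratio of the threshold to the norm
        if rate < 1:                  # below 1 means the norm is too large
            return grad * rate        # scale the gradient down by thresh / norm
        return grad
LSTM, introduced next, is a model designed so that these problems are less likely to occur.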
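To make the problem concrete, the recursion from slide 1.5, δ_{t-1} = δ_t w f'(u_{t-1}), multiplies δ by w at every step back in time when the activation is the identity (f' = 1). The sketch below shows that factor for a few illustrative values of w and then applies clipping to an oversized gradient. `backprop_factor` is a hypothetical helper, and `grad_clip` is repeated here only to keep the block self-contained.

```python
import numpy as np

def backprop_factor(w, steps):
    """Factor applied to delta after `steps` steps back in time,
    assuming an identity activation so that f'(u) = 1."""
    return w ** steps

def grad_clip(grad, thresh):
    """Rescale grad so that its norm does not exceed thresh."""
    norm = np.linalg.norm(grad)
    rate = thresh / norm
    if rate < 1:
        return grad * rate
    return grad

for w in (0.5, 1.0, 1.5):                 # illustrative recurrent weights
    print(w, backprop_factor(w, 50))      # shrinks, stays at 1, or blows up

big_grad = np.array([300.0, -400.0])      # norm 500
clipped = grad_clip(big_grad, thresh=1.0)
small_grad = np.array([0.3, -0.4])        # norm 0.5, below the threshold
untouched = grad_clip(small_grad, thresh=1.0)
print(np.linalg.norm(clipped), np.linalg.norm(untouched))
```

Clipping changes only the length of the gradient, not its direction, and it leaves gradients that are already below the threshold as they are.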
13.
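2.2 LSTM
LSTM: Long Short-Term Memory.
A model that replaces the hidden layer with a memory unit and learns the gates that control what goes into and out of the memory unit.
[Figure: LSTM block built around the CEC. The current input $x(t)$ and the previous output $h(t-1)$ feed the block input $a(t)$ (weights $W_a$, $U_a$, activation $f$), the input gate $i(t)$ (weights $W_i$, $U_i$), the forget gate $f(t)$ (weights $W_f$, $U_f$), and the output gate $o(t)$ (weights $W_o$, $U_o$). Legend: σ is the sigmoid function; $f$ and $g$ are activation functions (tanh, etc.).]
$$c(t) = a(t)\, i(t) + f(t)\, c(t-1)$$
$$h(t) = o(t)\, g(c(t)) \quad \text{(current output)}$$
Reference: わかるLSTM ~ 最近の動向と共に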
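One LSTM time step following the relations on slide 2.2 can be sketched in NumPy as below. Several things here are assumptions made for illustration: the gate pre-activations take the usual form σ(W x(t) + U h(t-1)) suggested by the weight labels in the figure, biases are omitted because the figure shows none, tanh is used for both f and g, and the dictionary layout and the name `lstm_step` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h_prev, c_prev, W, U):
    """One LSTM step without biases or peephole connections.

    W[k]: (hidden, input) and U[k]: (hidden, hidden) for k in "a", "i", "f", "o".
    """
    a = np.tanh(W["a"] @ x + U["a"] @ h_prev)   # block input a(t)
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev)   # input gate i(t)
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev)   # forget gate f(t)
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev)   # output gate o(t)
    c = a * i + f * c_prev                      # c(t) = a(t) i(t) + f(t) c(t-1)
    h = o * np.tanh(c)                          # h(t) = o(t) g(c(t))
    return h, c

# Illustrative sizes and random weights (assumptions).
rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 6
W = {k: rng.normal(scale=0.5, size=(hidden_dim, input_dim)) for k in "aifo"}
U = {k: rng.normal(scale=0.5, size=(hidden_dim, hidden_dim)) for k in "aifo"}

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x in rng.normal(size=(T, input_dim)):
    h, c = lstm_step(x, h, c, W, U)
print(h.shape, c.shape)
```

The cell state `c` is carried forward by an addition and an element-wise product with the forget gate, rather than through a weight matrix and squashing function at every step as in the plain RNN.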
14.
2.3 CEC
CEC: Constant Error Carousel.
It always outputs the (past) input from the previous layer as a linear sum with weight = 1.
This prevents exploding and vanishing gradients.
On its own, however, this would also make learning impossible, so input and output gates are added and those gates are learned instead.
[Figure: the LSTM block diagram from 2.2, with the CEC at the center.]
Reference: RNNとLSTMを理解する
15.
2.4 Input gate and output gate
Gates that adjust how much of the "input to the current unit" and of the "output of the previous unit (the unit one time step earlier)" is taken in.
They sit on the input side and the output side of the CEC. Because they use a sigmoid function, they multiply by a value between 0 and 1, which is what lets them act as gates.
[Figure: the LSTM block diagram from 2.2, highlighting the input gate $i(t)$ and the output gate $o(t)$.]
Reference: RNNとLSTMを理解する
16.
2.5 Forget gate
If the CEC keeps holding past state, learning from new information becomes difficult.
To prevent this, the LSTM has a mechanism for forgetting the information held in the CEC.
[Figure: the LSTM block diagram from 2.2, highlighting the forget gate $f(t)$, which scales $c(t-1)$.]
17.
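2.6 Peephole connections
In the conventional LSTM structure, the "current input" and the "previous output" are used to control the gates, but the contents of the CEC are never used directly for that control.
To address this, the CEC data is weighted and used directly to control each gate.
(That said, peephole connections are often omitted in common LSTM implementations.)
[Figure: the LSTM block diagram from 2.2 with added peephole weights $P_i$, $P_f$, $P_o$ running from the CEC to the input, forget, and output gates. $c(t) = a(t)\, i(t) + f(t)\, c(t-1)$; the current output is $h(t) = o(t)\, g(c(t))$.]
Reference: わかるLSTM ~ 最近の動向と共に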
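Written out as equations, the peephole connections add one term to each gate. The slide only shows the weights $P_i$, $P_f$, $P_o$ in the figure, so the form below is an assumption based on the commonly used peephole formulation, in which the input and forget gates look at the previous cell state and the output gate looks at the current one. Biases are omitted and $\odot$ denotes the element-wise product.

```latex
\begin{aligned}
i(t) &= \sigma\bigl(W_i\, x(t) + U_i\, h(t-1) + P_i \odot c(t-1)\bigr) \\
f(t) &= \sigma\bigl(W_f\, x(t) + U_f\, h(t-1) + P_f \odot c(t-1)\bigr) \\
c(t) &= a(t) \odot i(t) + f(t) \odot c(t-1) \\
o(t) &= \sigma\bigl(W_o\, x(t) + U_o\, h(t-1) + P_o \odot c(t)\bigr) \\
h(t) &= o(t) \odot g\bigl(c(t)\bigr)
\end{aligned}
```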
18.
1. RNN 2. LSTM 3. GRU 4. Bidirectional RNN 5. Seq2Seq 6. Implementation exercises
19.
3.1 Problems with LSTM
LSTM has many parameters, so its computational cost is high.
GRU is a lighter model that reduces the number of parameters.
20.
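3.2 GRU
GRU: Gated Recurrent Unit.
It drops the memory unit and controls the state with a reset gate and an update gate.
[Figure: GRU block. The current input $x(t)$ and the previous output $h(t-1)$ feed the reset gate $r(t)$ (weights $W_r$, $U_r$), the update gate $z(t)$ (weights $W_z$, $U_z$), and the candidate state (weights $W_h$, $U_h$, activation $f$). Legend: σ is the sigmoid function; $f$ is an activation function (tanh, etc.).]
Reset gate:
$$r(t) = \sigma(W_r\, x(t) + U_r \cdot h(t-1) + b_r)$$
Update gate:
$$z(t) = \sigma(W_z\, x(t) + U_z \cdot h(t-1) + b_z)$$
Candidate state:
$$\tilde{h}(t) = f(W_h\, x(t) + r(t) \odot (U_h \cdot h(t-1)) + b_h)$$
Current output:
$$h(t) = (1 - z(t)) \odot \tilde{h}(t) + z(t) \odot h(t-1)$$
Reference: 〈機械学習基礎〉数式なし! LSTM・GRU超入門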
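A single GRU step can be sketched in NumPy as follows. The assumptions are stated here rather than taken from the slides: both gates pass through a sigmoid (as the σ blocks in the figure suggest), the activation f is tanh, the biases are constant in time, and the intermediate candidate state is called `h_tilde`. The parameter dictionary and the name `gru_step` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x, h_prev, p):
    """One GRU step. p holds W_* (hidden, input), U_* (hidden, hidden), b_* (hidden,)."""
    r = sigmoid(p["W_r"] @ x + p["U_r"] @ h_prev + p["b_r"])              # reset gate
    z = sigmoid(p["W_z"] @ x + p["U_z"] @ h_prev + p["b_z"])              # update gate
    h_tilde = np.tanh(p["W_h"] @ x + r * (p["U_h"] @ h_prev) + p["b_h"])  # candidate
    return (1.0 - z) * h_tilde + z * h_prev                               # new output

def init_params(rng, input_dim, hidden_dim):
    p = {}
    for name in ("r", "z", "h"):
        p["W_" + name] = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
        p["U_" + name] = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
        p["b_" + name] = np.zeros(hidden_dim)
    return p

# Illustrative sizes (assumptions).
rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 6
params = init_params(rng, input_dim, hidden_dim)

h = np.zeros(hidden_dim)
for x in rng.normal(size=(T, input_dim)):
    h = gru_step(x, h, params)
print(h.shape)
```

With three weight groups (r, z, h) instead of the four in the LSTM sketch (a, i, f, o) and no separate cell state, the parameter count is smaller, which matches the motivation given on slide 3.1.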
21.
1. RNN 2. LSTM 3. GRU 4. Bidirectional RNN 5. Seq2Seq 6. Implementation exercises
22.
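4.1 Bidirectional RNN
Learns using not only past information but also future information.
Often used for tasks where context has to be taken into account, such as machine translation.
[Figure: the unrolled RNN from 1.2 (inputs $x_1, \dots, x_n$, hidden states $z_0, \dots, z_n$, outputs $y_1, \dots, y_n$, recurrent weights $w_0, \dots, w_{n-1}$) with a second chain of hidden states $z'_1, \dots, z'_n$ connected by weights $w'_1, \dots, w'_{n-1}$ that runs in the opposite time direction.]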
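The idea of a second chain running in the opposite direction can be sketched as below: one RNN reads the sequence from the start, another reads it from the end, and the two hidden states for each time step are put side by side. Concatenating the two states, the tanh activation, and the function names are assumptions made for this sketch; the slide does not say how the two chains are combined.

```python
import numpy as np

def run_rnn(x, w_in, w, b):
    """Hidden states of a simple tanh RNN for x of shape (T, input_dim)."""
    z = np.zeros(w.shape[0])
    states = np.zeros((x.shape[0], w.shape[0]))
    for t in range(x.shape[0]):
        z = np.tanh(x[t] @ w_in + z @ w + b)
        states[t] = z
    return states

def bidirectional_states(x, fwd, bwd):
    """Concatenate forward-in-time and backward-in-time hidden states."""
    z_fwd = run_rnn(x, *fwd)                 # reads x_1 ... x_n
    z_bwd = run_rnn(x[::-1], *bwd)[::-1]     # reads x_n ... x_1, then re-aligned
    return np.concatenate([z_fwd, z_bwd], axis=1)

# Illustrative sizes and weights (assumptions).
rng = np.random.default_rng(0)
T, input_dim, H = 4, 2, 3

def make_weights():
    return (rng.normal(scale=0.5, size=(input_dim, H)),
            rng.normal(scale=0.5, size=(H, H)),
            np.zeros(H))

fwd, bwd = make_weights(), make_weights()
x = rng.normal(size=(T, input_dim))
states = bidirectional_states(x, fwd, bwd)

# Changing only the last input leaves the forward state at the first step
# unchanged, while the backward state at the first step is affected.
x_changed = x.copy()
x_changed[-1] += 1.0
states_changed = bidirectional_states(x_changed, fwd, bwd)
print(states.shape)
```

The comparison at the end shows what "future information" means here: the backward half of the state at the first time step depends on inputs that come later in the sequence.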
23.
1. RNN 2. LSTM 3. GRU 4. Bidirectional RNN 5. Seq2Seq 6. Implementation exercises
24.
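5.1 What is Seq2Seq?
[Figure: Encoder → feature vector → Decoder]
A kind of Encoder-Decoder model.
It encodes the input data into a feature vector, then decodes that feature vector to generate new data.
It is used for machine translation and dialogue systems.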
25.
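5.2 Encoding procedure
[Figure: inputs $x_1, x_2, \dots, x_T$ feeding hidden states $h_1, h_2, \dots, h_T$, with the example tokens 私, は, ..., です。]
① Tokenize: split the sentence into words and assign an ID to each.
② Embedding: convert each ID into a vector using a distributed representation (methods such as word2vec exist).
③ Encoder-RNN: feed the vectors one by one into an RNN (or similar) to generate the feature vector.
The hidden state obtained when the last vector is fed in is a context vector that represents the whole input sentence. It is called the thought vector.
Reference: DeepLearning における会話モデル: Seq2Seq から VHRED まで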
26.
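5.3 Decoding procedure
[Figure: hidden states $h_1, h_2, \dots, h_T$ producing outputs $y_1, y_2, \dots, y_T$, with the example tokens I, am, ..., student.]
① Decoder-RNN: from the thought vector generated by the encoder, output the generation probability of each token.
② Sampling: randomly select a token according to the generation probabilities.
③ Embedding: embed the selected token and use it as the next input to the Decoder-RNN.
④ Repeat these steps to turn the thought vector into a meaningful string.
Reference: DeepLearning における会話モデル: Seq2Seq から VHRED まで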
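The decoding loop (Decoder-RNN, Sampling, Embedding, repeat) can be sketched with a toy decoder. Everything concrete here is an assumption for illustration: the weights are random and untrained, so the sampled tokens carry no meaning; the start and end token IDs (`bos_id`, `eos_id`), the tanh cell, and the function names are hypothetical and not taken from the slides or the linked notebooks.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

def decode(thought_vector, embed, w_in, w, w_out, bos_id, eos_id, max_len, rng):
    """Generate token IDs from a thought vector by repeated sampling."""
    h = thought_vector                 # decoder state starts from the thought vector
    token = bos_id                     # hypothetical start-of-sequence token
    generated = []
    for _ in range(max_len):
        h = np.tanh(embed[token] @ w_in + h @ w)       # (1) Decoder-RNN step
        probs = softmax(h @ w_out)                     #     per-token probability
        token = int(rng.choice(len(probs), p=probs))   # (2) Sampling
        if token == eos_id:                            # stop at end-of-sequence
            break
        generated.append(token)                        # (3) embedded on the next pass
    return generated

# Illustrative sizes and random, untrained weights (assumptions).
rng = np.random.default_rng(0)
vocab, emb_dim, hid = 6, 4, 5
embed = rng.normal(size=(vocab, emb_dim))
w_in = rng.normal(size=(emb_dim, hid))
w = rng.normal(size=(hid, hid))
w_out = rng.normal(size=(hid, vocab))
thought_vector = rng.normal(size=hid)

tokens = decode(thought_vector, embed, w_in, w, w_out,
                bos_id=0, eos_id=1, max_len=10, rng=rng)
print(tokens)
```

Replacing the sampling line with `int(np.argmax(probs))` gives greedy decoding, which is deterministic but always picks the single most probable token.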
27.
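5.4 word2vec
One of the embedding methods.
When words are turned into vectors, it reduces the amount of data needed while preserving the meaning relationships between words.
For example, it yields vector representations in which arithmetic like the following holds (computed on the vector representation of each word):
king - man + woman = queen
word2vec is further divided into two methods:
CBOW (Continuous Bag-of-Words) model: predicts the target word from the surrounding words.
skip-gram model: given the target word, predicts the surrounding words.
Reference: 挑戦! word2vecで自然言語処理(Keras+TensorFlow使用)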
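The difference between CBOW and skip-gram shows up in how training pairs are built from a sentence. The sketch below only constructs those pairs; it does not train any embedding. The function names, the window size of 1, and the example sentence are assumptions for illustration.

```python
def cbow_pairs(tokens, window=1):
    """(context words, target) pairs: predict the target from its neighbors."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

def skipgram_pairs(tokens, window=1):
    """(target, context word) pairs: predict each neighbor from the target."""
    return [(target, ctx)
            for context, target in cbow_pairs(tokens, window)
            for ctx in context]

tokens = ["I", "am", "a", "student"]   # illustrative sentence
for context, target in cbow_pairs(tokens):
    print("CBOW     ", context, "->", target)
for target, ctx in skipgram_pairs(tokens):
    print("skip-gram", target, "->", ctx)
```

CBOW produces one example per position with several inputs, while skip-gram produces one example per (target, neighbor) combination.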
28.
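5.5 Attention Mechanism
The problem with Seq2Seq
The decoder receives the information of all the words in compressed form.
As a result, for long sentences, the information about words in the middle is not conveyed well enough.
The attention mechanism
A mechanism that lets the decoder refer directly to the information of the input sequence while decoding.
From the decoder's hidden-layer vector and the encoder's hidden-layer vectors at each time step, it determines a score for which input word to attend to.
Based on those scores, it computes a weighted average of the hidden-layer vectors.
Reference: Seq2Seq+Attentionのその先へ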
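The two steps of the attention mechanism, scoring and weighted averaging, can be sketched in a few lines of NumPy. The slide does not say how the score is computed, so the dot product between the decoder state and each encoder state, followed by a softmax, is an assumption here; the function name and the example vectors are illustrative.

```python
import numpy as np

def attention(decoder_h, encoder_hs):
    """Return (weights over input positions, weighted average of encoder states).

    decoder_h: (hidden,) current decoder hidden vector.
    encoder_hs: (T, hidden) encoder hidden vectors, one per input time step.
    """
    scores = encoder_hs @ decoder_h            # dot-product score (assumption)
    weights = np.exp(scores - scores.max())    # softmax over input positions
    weights = weights / weights.sum()
    context = weights @ encoder_hs             # weighted average of encoder states
    return weights, context

# Illustrative encoder states for a 3-word input and one decoder state.
encoder_hs = np.array([[1.0, 0.0],
                       [0.0, 1.0],
                       [0.0, 0.0]])
decoder_h = np.array([10.0, 0.0])

weights, context = attention(decoder_h, encoder_hs)
print(weights, context)
```

Because the context vector is recomputed at every decoding step, the decoder is no longer limited to a single fixed vector for the whole input sentence.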
29.
1. RNN 2. LSTM 3. GRU 4. Bidirectional RNN 5. Seq2Seq 6. Implementation exercises
30.
6.1 Implementation exercises
RNN: predict the sum of two 8-bit arrays with an RNN.
https://github.com/33quitykubby/Rabbit_DNN_3/blob/main/Rabbit_RNN_1_simple.ipynb
Seq2Seq: predict words with Seq2Seq, based on the text files in a specific folder.
https://github.com/33quitykubby/Rabbit_DNN_3/blob/main/Rabbit_RNN_2_Seq2Seq_RNN_wordPredict.ipynb
LSTM: predict the sum of two 8-bit arrays with an LSTM (using the keras framework).
https://github.com/33quitykubby/Rabbit_DNN_3/blob/main/Rabbit_RNN_3_simple_LSTM.ipynb
31.
6.2 Implementation exercises
GRU: predict the sum of two 8-bit arrays with a GRU (using the keras framework).
https://github.com/33quitykubby/Rabbit_DNN_3/blob/main/Rabbit_RNN_4_simple_GRU.ipynb
Bidirectional RNN: predict the sum of two 8-bit arrays with a bidirectional RNN (using the keras framework).
https://github.com/33quitykubby/Rabbit_DNN_3/blob/main/Rabbit_RNN_5_simple_BidirectionalRNN.ipynb
Bidirectional RNN + Attention: predict the sum of two 8-bit arrays with a bidirectional RNN plus Attention (using the keras framework).
https://github.com/33quitykubby/Rabbit_DNN_3/blob/main/Rabbit_RNN_6_simple_BidirectionalRNN_Attention.ipynb
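For readers who want to try the 8-bit addition task, one possible way to generate training samples is sketched below. This is not taken from the linked notebooks: the helper names, the least-significant-bit-first ordering (chosen so that the carry moves forward along the time axis), and the restriction of both operands to values below 128 (so the sum fits in 8 bits) are all assumptions.

```python
import numpy as np

def to_bits(n, width=8):
    """Bits of n, least significant bit first."""
    return np.array([(n >> k) & 1 for k in range(width)], dtype=np.int64)

def make_sample(rng, width=8):
    """One sample: x has shape (width, 2), one row per time step; d has shape (width,)."""
    a = int(rng.integers(0, 2 ** (width - 1)))   # operands below 128 so that
    b = int(rng.integers(0, 2 ** (width - 1)))   # a + b fits in `width` bits
    x = np.stack([to_bits(a, width), to_bits(b, width)], axis=1)
    d = to_bits(a + b, width)
    return x, d, a, b

rng = np.random.default_rng(0)
x, d, a, b = make_sample(rng)
print(a, b, a + b)
print(x.T)   # the two input bit sequences
print(d)     # the target bit sequence
```

At each time step the network sees one bit from each operand and has to output the matching bit of the sum, so it needs its hidden state to remember the carry.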