SlideShare a Scribd company logo
James Bradbury∗, Stephen Merity∗ , Caiming Xiong & Richard Socher
Salesforce Research Palo Alto, California
arXiv:1611.01576v2 [cs.NE] 21 Nov 2016
ICLR 2017 accepted
文献紹介
QRNN: QUASI-RECURRENT NEURAL
NETWORKS
Abstract
- QRNN = RNN processing like CNN
- can process sequential data in parallel
- up to 16 times faster than LSTM in train/test
- can make visual analysis of weights easy
Outline
- Introduction
- review of RNN/LSTM
- Model (QRNN)
- Variants
- Results
- sentiment classification
- language modeling
- character-level machine translation
- Conclusion
- Reference
Introduction (review of RNN)
- the standard model architecture for deep learning approaches to sequence
modeling tasks
- sentence classification | word- and character-level language modeling | machine translation |
question answering | image caption | time series forecasting
Introduction (review of RNN)
- the network which has loop
arhictectures
- RNN is very deep (causing
gradient vanising) word2vec(“私”)
昨日の株価
(“の”:0.2, “は”:0.3, ...)
今日の株価の予測値
Introduction (review of RNN)
- problem: not good at
learning very long
sequences
- document classification |
character-level
- why?: can’t deal with
sequential data in parallel
Introduction (review of LSTM)
- LSTM solves gradient vanising, using memory cell
- LSTM has 3 gates to control information flow
Introduction (review of LSTM)
- forget gate to control long-term information (in memory cell c)
Introduction (review of LSTM)
- input gate to control current+short-time information (in x and h(t-1))
Introduction (review of LSTM)
- update memory cell, mixing the current with the previos memory cell
- output gate to control current hidden-state information to the next layer
Introduction (review of LSTM)
- using a forget gate instead of an input gate
Introduction (variants of LSTM)
Model
Model (convolution component)
“ズン”, “ドコ”, “きよし”
( 1, 0, 0, )
=“ズン”
この例はone-hotだが
word2vecというもっといい変換
を使う
ズン, ズン, ズン, ドコ, きよし
時刻tの値を予測す
るのに未来の時刻
t+1のデータを用い
てはいけないので、
masked
convolution
Model
bottle-neckになっていた前の層のhidden
state h[t-1] を用いるのではなく、前の時刻
の入力x[t-1/2/...]を用いて並列処理を可能
にした。
Model (pooling component)
LSTM
さらに、hidden state h に重みをかけずに渡していくので、各要素
の情報がごっちゃにならないので可視化しやすい。
ここは従来のLSTMと同じく逐次
計算するが、そんなに大して時
間かからない。
Model (pooling component)
other type poolings (この論文では使われていない?)
f-pooling
ifo-pooling
Variants
- Zoneout: Dropout for LSTM
- skip-connection like DenseNet
- Attention for Encoder-Decoder
Experiments
- Sentiment Classification (document binary-classification)
- IMDb movie review
- 25,000 positive/negative reviews
- Language Modeling (word-level prediction)
- PTB: Penn Treebank
- Character-level Machine Translatoin
- IWST English-German spoken language translation task
Results (sentiment classification)
- 小 batch_size, 長 seq_len
に向いている(最大16倍早
かった。)
- training 時間は3倍早い
Results (sentiment classification)
final layer’s
hidden state
Results (language modeling)
Results (character-level machine translation)
BLEU: upper is better
http://unicorn.ike.tottori-u.ac.jp/2010/s072046/paper/graduation-thesis/node32.html
考察
- LSTMに精度で少し負けてしまった理由は、隠れ層の状態 h[t-1] ではなく、直前の
入力 x[t-1|t-2|,...]を使って近似したからと考えられる。
- 入力で、隠れ層の状態を近似する場合、使う、前の時刻の filter size k を無限大ま
で長くすれば一致する。(sentiment classificationのtaskではkを大きくしたら精度
上がった)
- なので、filter-sizeを大きくすればいいが、そうすると、計算速度はどれほど落ちる
のかが問題。
Conclusion
- QRNN = RNN processing like CNN
- can process sequential data in parallel
- up to 16x faster than LSTM in train/test
- can make visual analysis of weights easy
Reference
- LSTM
- LSTMネットワークの概要 https://qiita.com/KojiOhki/items/89cd7b69a8a6239d67ca
- わかるLSTM ~ 最近の動向と共に https://qiita.com/KojiOhki/items/89cd7b69a8a6239d67ca
- ニューラルネットワーク勉強会
http://isw3.naist.jp/~neubig/student/2015/seitaro-s/161025neuralnet_study_LSTM.pdf
- conv の 3D図作成
- thinkercad https://www.tinkercad.com/
- QRNN
- LSTMを超える期待の新星、QRNN https://qiita.com/icoxfog417/items/d77912e10a7c60ae680e
- slideshare
https://www.slideshare.net/DeepLearningJP2016/dlquasirecurrent-neural-networks?qid=a4ead77d-d8dd-458b-965c-5e53723d7757
&v=&b=&from_search=1
- pytorchでの公式実装 https://github.com/salesforce/pytorch-qrnn/blob/master/torchqrnn/qrnn.py

More Related Content

What's hot

[DL輪読会]Dense Captioning分野のまとめ
[DL輪読会]Dense Captioning分野のまとめ[DL輪読会]Dense Captioning分野のまとめ
[DL輪読会]Dense Captioning分野のまとめ
Deep Learning JP
 
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
Deep Learning JP
 
CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)
CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)
CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)
Tenki Lee
 
動画認識における代表的なモデル・データセット(メタサーベイ)
動画認識における代表的なモデル・データセット(メタサーベイ)動画認識における代表的なモデル・データセット(メタサーベイ)
動画認識における代表的なモデル・データセット(メタサーベイ)
cvpaper. challenge
 
【DL輪読会】Scaling Laws for Neural Language Models
【DL輪読会】Scaling Laws for Neural Language Models【DL輪読会】Scaling Laws for Neural Language Models
【DL輪読会】Scaling Laws for Neural Language Models
Deep Learning JP
 
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
Deep Learning JP
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!Transformer
Arithmer Inc.
 
畳み込みLstm
畳み込みLstm畳み込みLstm
畳み込みLstm
tak9029
 
【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識
Hirokatsu Kataoka
 
[DL輪読会]World Models
[DL輪読会]World Models[DL輪読会]World Models
[DL輪読会]World Models
Deep Learning JP
 
SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~
SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~
SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~
SSII
 
ResNetの仕組み
ResNetの仕組みResNetの仕組み
ResNetの仕組み
Kota Nagasato
 
Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...
Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...
Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...
Yusuke Uchida
 
【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models
【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models
【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models
Deep Learning JP
 
【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks?
【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks? 【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks?
【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks?
Deep Learning JP
 
【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展
【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展
【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展
Deep Learning JP
 
【チュートリアル】コンピュータビジョンによる動画認識 v2
【チュートリアル】コンピュータビジョンによる動画認識 v2【チュートリアル】コンピュータビジョンによる動画認識 v2
【チュートリアル】コンピュータビジョンによる動画認識 v2
Hirokatsu Kataoka
 
Transformerを雰囲気で理解する
Transformerを雰囲気で理解するTransformerを雰囲気で理解する
Transformerを雰囲気で理解する
AtsukiYamaguchi1
 
言語と画像の表現学習
言語と画像の表現学習言語と画像の表現学習
言語と画像の表現学習
Yuki Noguchi
 
DockerコンテナでGitを使う
DockerコンテナでGitを使うDockerコンテナでGitを使う
DockerコンテナでGitを使う
Kazuhiro Suga
 

What's hot (20)

[DL輪読会]Dense Captioning分野のまとめ
[DL輪読会]Dense Captioning分野のまとめ[DL輪読会]Dense Captioning分野のまとめ
[DL輪読会]Dense Captioning分野のまとめ
 
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
【DL輪読会】論文解説:Offline Reinforcement Learning as One Big Sequence Modeling Problem
 
CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)
CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)
CVPR2018 pix2pixHD論文紹介 (CV勉強会@関東)
 
動画認識における代表的なモデル・データセット(メタサーベイ)
動画認識における代表的なモデル・データセット(メタサーベイ)動画認識における代表的なモデル・データセット(メタサーベイ)
動画認識における代表的なモデル・データセット(メタサーベイ)
 
【DL輪読会】Scaling Laws for Neural Language Models
【DL輪読会】Scaling Laws for Neural Language Models【DL輪読会】Scaling Laws for Neural Language Models
【DL輪読会】Scaling Laws for Neural Language Models
 
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
[DL輪読会]Transframer: Arbitrary Frame Prediction with Generative Models
 
全力解説!Transformer
全力解説!Transformer全力解説!Transformer
全力解説!Transformer
 
畳み込みLstm
畳み込みLstm畳み込みLstm
畳み込みLstm
 
【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識【チュートリアル】コンピュータビジョンによる動画認識
【チュートリアル】コンピュータビジョンによる動画認識
 
[DL輪読会]World Models
[DL輪読会]World Models[DL輪読会]World Models
[DL輪読会]World Models
 
SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~
SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~
SSII2019OS: 深層学習にかかる時間を短くしてみませんか? ~分散学習の勧め~
 
ResNetの仕組み
ResNetの仕組みResNetの仕組み
ResNetの仕組み
 
Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...
Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...
Image Retrieval Overview (from Traditional Local Features to Recent Deep Lear...
 
【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models
【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models
【DL輪読会】Hyena Hierarchy: Towards Larger Convolutional Language Models
 
【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks?
【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks? 【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks?
【DL輪読会】How Much Can CLIP Benefit Vision-and-Language Tasks?
 
【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展
【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展
【DL輪読会】マルチエージェント強化学習における近年の 協調的方策学習アルゴリズムの発展
 
【チュートリアル】コンピュータビジョンによる動画認識 v2
【チュートリアル】コンピュータビジョンによる動画認識 v2【チュートリアル】コンピュータビジョンによる動画認識 v2
【チュートリアル】コンピュータビジョンによる動画認識 v2
 
Transformerを雰囲気で理解する
Transformerを雰囲気で理解するTransformerを雰囲気で理解する
Transformerを雰囲気で理解する
 
言語と画像の表現学習
言語と画像の表現学習言語と画像の表現学習
言語と画像の表現学習
 
DockerコンテナでGitを使う
DockerコンテナでGitを使うDockerコンテナでGitを使う
DockerコンテナでGitを使う
 

Similar to 20171110 qrnn quasi-recurrent neural networks

Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
Sharath TS
 
lepibwp74jd2rz.pdf
lepibwp74jd2rz.pdflepibwp74jd2rz.pdf
lepibwp74jd2rz.pdf
SajalTyagi6
 
Lecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural NetworksLecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural Networks
Sang Jun Lee
 
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Universitat Politècnica de Catalunya
 
Applying Deep Learning Machine Translation to Language Services
Applying Deep Learning Machine Translation to Language ServicesApplying Deep Learning Machine Translation to Language Services
Applying Deep Learning Machine Translation to Language Services
Yannis Flet-Berliac
 
14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt
ManiMaran230751
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Junaid Bhat
 
2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練
Abner Huang
 
RNN and sequence-to-sequence processing
RNN and sequence-to-sequence processingRNN and sequence-to-sequence processing
RNN and sequence-to-sequence processing
Dongang (Sean) Wang
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysis
odsc
 
Role of Tensors in Machine Learning
Role of Tensors in Machine LearningRole of Tensors in Machine Learning
Role of Tensors in Machine Learning
Anima Anandkumar
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
Knoldus Inc.
 
Single shot multiboxdetectors
Single shot multiboxdetectorsSingle shot multiboxdetectors
Single shot multiboxdetectors
지현 백
 
Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...
Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...
Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...
Fordham University
 
Rabbit challenge 5_dnn3
Rabbit challenge 5_dnn3Rabbit challenge 5_dnn3
Rabbit challenge 5_dnn3
TOMMYLINK1
 
Deep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsDeep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applications
Buhwan Jeong
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson Studio
Sasha Lazarevic
 
Deep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.pptDeep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.ppt
ShankerRajendiran2
 
Deep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.pptDeep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.ppt
SatyaNarayana594629
 
RNN.ppt
RNN.pptRNN.ppt

Similar to 20171110 qrnn quasi-recurrent neural networks (20)

Recurrent Neural Networks
Recurrent Neural NetworksRecurrent Neural Networks
Recurrent Neural Networks
 
lepibwp74jd2rz.pdf
lepibwp74jd2rz.pdflepibwp74jd2rz.pdf
lepibwp74jd2rz.pdf
 
Lecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural NetworksLecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural Networks
 
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
 
Applying Deep Learning Machine Translation to Language Services
Applying Deep Learning Machine Translation to Language ServicesApplying Deep Learning Machine Translation to Language Services
Applying Deep Learning Machine Translation to Language Services
 
14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練2009 CSBB LAB 新生訓練
2009 CSBB LAB 新生訓練
 
RNN and sequence-to-sequence processing
RNN and sequence-to-sequence processingRNN and sequence-to-sequence processing
RNN and sequence-to-sequence processing
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysis
 
Role of Tensors in Machine Learning
Role of Tensors in Machine LearningRole of Tensors in Machine Learning
Role of Tensors in Machine Learning
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Single shot multiboxdetectors
Single shot multiboxdetectorsSingle shot multiboxdetectors
Single shot multiboxdetectors
 
Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...
Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...
Foundation of Generative AI: Study Materials Connecting the Dots by Delving i...
 
Rabbit challenge 5_dnn3
Rabbit challenge 5_dnn3Rabbit challenge 5_dnn3
Rabbit challenge 5_dnn3
 
Deep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsDeep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applications
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson Studio
 
Deep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.pptDeep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.ppt
 
Deep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.pptDeep-Learning-2017-Lecture6RNN.ppt
Deep-Learning-2017-Lecture6RNN.ppt
 
RNN.ppt
RNN.pptRNN.ppt
RNN.ppt
 

Recently uploaded

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
a9qfiubqu
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
mkkikqvo
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 

Recently uploaded (20)

06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
 
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
原版一比一多伦多大学毕业证(UofT毕业证书)如何办理
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 

20171110 qrnn quasi-recurrent neural networks

  • 1. James Bradbury∗, Stephen Merity∗ , Caiming Xiong & Richard Socher Salesforce Research Palo Alto, California arXiv:1611.01576v2 [cs.NE] 21 Nov 2016 ICLR 2017 accepted 文献紹介 QRNN: QUASI-RECURRENT NEURAL NETWORKS
  • 2. Abstract - QRNN = RNN processing like CNN - can process sequential data in parallel - up to 16 times faster than LSTM in train/test - can make visual analysis of weights easy
  • 3. Outline - Introduction - review of RNN/LSTM - Model (QRNN) - Variants - Results - sentiment classification - language modeling - character-level machine translation - Conclusion - Reference
  • 4. Introduction (review of RNN) - the standard model architecture for deep learning approaches to sequence modeling tasks - sentence classification | word- and character-level language modeling | machine translation | question answering | image caption | time series forecasting
  • 5. Introduction (review of RNN) - the network which has loop arhictectures - RNN is very deep (causing gradient vanising) word2vec(“私”) 昨日の株価 (“の”:0.2, “は”:0.3, ...) 今日の株価の予測値
  • 6. Introduction (review of RNN) - problem: not good at learning very long sequences - document classification | character-level - why?: can’t deal with sequential data in parallel
  • 7. Introduction (review of LSTM) - LSTM solves gradient vanising, using memory cell - LSTM has 3 gates to control information flow
  • 8. Introduction (review of LSTM) - forget gate to control long-term information (in memory cell c)
  • 9. Introduction (review of LSTM) - input gate to control current+short-time information (in x and h(t-1))
  • 10. Introduction (review of LSTM) - update memory cell, mixing the current with the previos memory cell
  • 11. - output gate to control current hidden-state information to the next layer Introduction (review of LSTM)
  • 12. - using a forget gate instead of an input gate Introduction (variants of LSTM)
  • 13. Model
  • 14. Model (convolution component) “ズン”, “ドコ”, “きよし” ( 1, 0, 0, ) =“ズン” この例はone-hotだが word2vecというもっといい変換 を使う ズン, ズン, ズン, ドコ, きよし 時刻tの値を予測す るのに未来の時刻 t+1のデータを用い てはいけないので、 masked convolution
  • 15. Model
  • 16. bottle-neckになっていた前の層のhidden state h[t-1] を用いるのではなく、前の時刻 の入力x[t-1/2/...]を用いて並列処理を可能 にした。 Model (pooling component) LSTM さらに、hidden state h に重みをかけずに渡していくので、各要素 の情報がごっちゃにならないので可視化しやすい。 ここは従来のLSTMと同じく逐次 計算するが、そんなに大して時 間かからない。
  • 17. Model (pooling component) other type poolings (この論文では使われていない?) f-pooling ifo-pooling
  • 18. Variants - Zoneout: Dropout for LSTM - skip-connection like DenseNet - Attention for Encoder-Decoder
  • 19. Experiments - Sentiment Classification (document binary-classification) - IMDb movie review - 25,000 positive/negative reviews - Language Modeling (word-level prediction) - PTB: Penn Treebank - Character-level Machine Translatoin - IWST English-German spoken language translation task
  • 20. Results (sentiment classification) - 小 batch_size, 長 seq_len に向いている(最大16倍早 かった。) - training 時間は3倍早い
  • 23. Results (character-level machine translation) BLEU: upper is better http://unicorn.ike.tottori-u.ac.jp/2010/s072046/paper/graduation-thesis/node32.html
  • 24. 考察 - LSTMに精度で少し負けてしまった理由は、隠れ層の状態 h[t-1] ではなく、直前の 入力 x[t-1|t-2|,...]を使って近似したからと考えられる。 - 入力で、隠れ層の状態を近似する場合、使う、前の時刻の filter size k を無限大ま で長くすれば一致する。(sentiment classificationのtaskではkを大きくしたら精度 上がった) - なので、filter-sizeを大きくすればいいが、そうすると、計算速度はどれほど落ちる のかが問題。
  • 25. Conclusion - QRNN = RNN processing like CNN - can process sequential data in parallel - up to 16x faster than LSTM in train/test - can make visual analysis of weights easy
  • 26. Reference - LSTM - LSTMネットワークの概要 https://qiita.com/KojiOhki/items/89cd7b69a8a6239d67ca - わかるLSTM ~ 最近の動向と共に https://qiita.com/KojiOhki/items/89cd7b69a8a6239d67ca - ニューラルネットワーク勉強会 http://isw3.naist.jp/~neubig/student/2015/seitaro-s/161025neuralnet_study_LSTM.pdf - conv の 3D図作成 - thinkercad https://www.tinkercad.com/ - QRNN - LSTMを超える期待の新星、QRNN https://qiita.com/icoxfog417/items/d77912e10a7c60ae680e - slideshare https://www.slideshare.net/DeepLearningJP2016/dlquasirecurrent-neural-networks?qid=a4ead77d-d8dd-458b-965c-5e53723d7757 &v=&b=&from_search=1 - pytorchでの公式実装 https://github.com/salesforce/pytorch-qrnn/blob/master/torchqrnn/qrnn.py