James Bradbury∗, Stephen Merity∗, Caiming Xiong & Richard Socher
Salesforce Research, Palo Alto, California
arXiv:1611.01576v2 [cs.NE] 21 Nov 2016
Accepted at ICLR 2017
QRNN: QUASI-RECURRENT
NEURAL NETWORKS
Abstract
- QRNN = an RNN-style sequence model that computes like a CNN
- can process the timesteps of a sequence in parallel
- up to 16 times faster than LSTM at training and test time
- makes visual analysis of the hidden state easier
Outline
- Introduction
- review of RNN/LSTM
- Model (QRNN)
- Variants
- Results
- sentiment classification
- language modeling
- character-level machine translation
- Conclusion
- Reference
Introduction (review of RNN)
- the standard model architecture for deep learning approaches to sequence modeling tasks
- sentence classification | word- and character-level language modeling | machine translation | question answering | image captioning | time series forecasting
Introduction (review of RNN)
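- a network with a loop (recurrent) architecture
- unrolled over time, an RNN is very deep (causing vanishing gradients)
- (figure) input examples: word2vec(“私”), yesterday's stock price
- (figure) output examples: next-word probabilities (“の”: 0.2, “は”: 0.3, ...), predicted value of today's stock price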
Introduction (review of RNN)
- problem: not good at learning very long sequences
- e.g. document classification | character-level tasks
- why?: cannot process the timesteps of a sequence in parallel
Introduction (review of LSTM)
- LSTM mitigates vanishing gradients by using a memory cell
- LSTM has 3 gates to control information flow
Introduction (review of LSTM)
- forget gate to control long-term information (in memory cell c)
Introduction (review of LSTM)
- input gate to control current and short-term information (from x(t) and h(t-1))
Introduction (review of LSTM)
- update the memory cell, mixing the current candidate with the previous memory cell
Introduction (review of LSTM)
- output gate to control how much of the current hidden state is passed to the next layer
Introduction (variants of LSTM)
- using the forget gate in place of a separate input gate
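The gate-by-gate review above can be sketched as a single LSTM timestep. This is only an illustrative sketch in plain Python lists (no libraries) under assumed conventions: the function names `affine` and `lstm_step`, the `params` layout, and the gate key "g" for the candidate are all hypothetical and do not come from the slides.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, x, U, h, b):
    """Return W @ x + U @ h + b for plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM timestep.

    params maps a gate name ("f", "i", "o", "g") to a tuple (W, U, b).
    Every gate depends on h_prev, which is why timesteps cannot run in parallel.
    """
    pre = {name: affine(W, x, U, h_prev, b) for name, (W, U, b) in params.items()}
    f = [sigmoid(v) for v in pre["f"]]    # forget gate: how much of c_prev to keep
    i = [sigmoid(v) for v in pre["i"]]    # input gate: how much new information to write
    o = [sigmoid(v) for v in pre["o"]]    # output gate: how much of the cell to expose
    g = [math.tanh(v) for v in pre["g"]]  # candidate values computed from x and h_prev
    # Memory cell update: mix the previous cell with the current candidate.
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so the step simply halves the previous memory cell; this makes a convenient sanity check.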
Model
Model (convolution component)
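- vocabulary: “ズン”, “ドコ”, “きよし”
- one-hot encoding: (1, 0, 0) = “ズン”
- this example uses one-hot vectors, but in practice a better transformation, word2vec, is used
- input sequence: ズン, ズン, ズン, ドコ, きよし
- masked convolution: data from the future time t+1 must not be used to predict the value at time t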
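The masked convolution on this slide can be sketched as follows, using the slide's own toy vocabulary. This is a minimal sketch under assumptions: the names `VOCAB`, `one_hot` and `masked_conv1d` and the filter layout are hypothetical, and plain Python lists stand in for tensors.

```python
VOCAB = ["ズン", "ドコ", "きよし"]

def one_hot(token):
    """Encode a token as a one-hot vector over VOCAB."""
    return [1.0 if token == v else 0.0 for v in VOCAB]

def masked_conv1d(xs, filters):
    """Causal (masked) 1-D convolution over time.

    xs:      list of T input vectors, each of length n
    filters: list of k matrices (m x n); filters[j] is applied to
             x[t - (k - 1) + j], so the last matrix sees the current
             input and no matrix ever sees a future input.
    Returns a list of T output vectors, each of length m.
    """
    k = len(filters)
    n = len(xs[0])
    padded = [[0.0] * n for _ in range(k - 1)] + xs  # left-pad with zeros
    out = []
    for t in range(len(xs)):
        window = padded[t:t + k]  # x[t-k+1] .. x[t]
        y = [0.0] * len(filters[0])
        for W, x in zip(filters, window):
            for r, row in enumerate(W):
                y[r] += sum(w * xi for w, xi in zip(row, x))
        out.append(y)
    return out

# Filter width k = 2 with one output channel: the first matrix responds to
# "ズン" at time t-1, the second to "ドコ" at time t.
xs = [one_hot(tok) for tok in ["ズン", "ズン", "ズン", "ドコ", "きよし"]]
filters = [[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]]]
ys = masked_conv1d(xs, filters)  # [[0.0], [1.0], [1.0], [2.0], [0.0]]
```

Every output position depends only on inputs that are already known, so all T positions can be computed independently of each other, which is the source of the parallelism.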
Model
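- instead of the hidden state h[t-1] from the previous timestep, which was the bottleneck, the inputs from previous timesteps x[t-1], x[t-2], ... are used, which makes parallel processing possible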
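Written out, the idea of replacing h[t-1] with past inputs gives the following gate equations. This follows my reading of the QRNN paper; the symbols Z, F, O for the candidate, forget and output sequences and the superscripted weight slices are notation assumed here rather than taken from the slides, and * denotes the masked convolution along the time axis.

```latex
\begin{align}
Z &= \tanh(W_z * X), & F &= \sigma(W_f * X), & O &= \sigma(W_o * X) \\
\intertext{For filter width $k = 2$ this reduces to}
z_t &= \tanh\left(W_z^{1} x_{t-1} + W_z^{2} x_t\right) \\
f_t &= \sigma\left(W_f^{1} x_{t-1} + W_f^{2} x_t\right) \\
o_t &= \sigma\left(W_o^{1} x_{t-1} + W_o^{2} x_t\right)
\end{align}
```

None of the right-hand sides contains h_{t-1}, so every timestep of Z, F and O can be computed at once.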
Model (pooling component)
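- (figure label) LSTM
- furthermore, the hidden state h is passed along without being multiplied by a weight matrix, so the information in individual elements does not get mixed together, which makes it easy to visualize
- this part is computed sequentially, as in a conventional LSTM, but it does not take much time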
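The pooling step described here can be sketched as the fo-pooling recurrence, c_t = f_t * c_{t-1} + (1 - f_t) * z_t and h_t = o_t * c_t, with all products taken elementwise. This is a sketch under assumptions: the function name `fo_pooling` and the list-of-lists layout are hypothetical, and Z, F, O are assumed to have been computed beforehand by the convolution component.

```python
def fo_pooling(Z, F, O, c0=None):
    """Sequential fo-pooling over T timesteps.

    Z, F, O: lists of T vectors (candidate, forget gate, output gate).
    Returns (H, C), the hidden states and memory cells at every timestep.
    """
    c = list(c0) if c0 is not None else [0.0] * len(Z[0])
    H, C = [], []
    for z, f, o in zip(Z, F, O):
        # Elementwise update: no weight matrix touches c, so channel j of the
        # state only ever depends on channel j of z, f and o.
        c = [fj * cj + (1.0 - fj) * zj for fj, cj, zj in zip(f, c, z)]
        C.append(c)
        H.append([oj * cj for oj, cj in zip(o, c)])
    return H, C

Z = [[1.0, -1.0], [0.0, 0.0]]
F = [[0.5, 0.0], [0.5, 1.0]]
O = [[1.0, 1.0], [1.0, 0.5]]
H, C = fo_pooling(Z, F, O)
```

The loop is still sequential in t, but each step is only a handful of elementwise multiply-adds, which is why it costs little compared with the matrix products inside an LSTM step.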
Model (pooling component)
other pooling types (not used in this paper?)
f-pooling
ifo-pooling
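For comparison, the two other pooling types can be sketched in the same style. As I understand the paper, f-pooling uses only a forget gate, h_t = f_t * h_{t-1} + (1 - f_t) * z_t, while ifo-pooling adds an independent input gate, c_t = f_t * c_{t-1} + i_t * z_t with h_t = o_t * c_t. The function names and the list layout below are hypothetical.

```python
def f_pooling(Z, F):
    """f-pooling: a gated running average of the candidates Z."""
    h = [0.0] * len(Z[0])
    H = []
    for z, f in zip(Z, F):
        h = [fj * hj + (1.0 - fj) * zj for fj, hj, zj in zip(f, h, z)]
        H.append(h)
    return H

def ifo_pooling(Z, F, I, O):
    """ifo-pooling: independent input and forget gates plus an output gate."""
    c = [0.0] * len(Z[0])
    H = []
    for z, f, i, o in zip(Z, F, I, O):
        c = [fj * cj + ij * zj for fj, cj, ij, zj in zip(f, c, i, z)]
        H.append([oj * cj for oj, cj in zip(o, c)])
    return H

Z = [[1.0], [1.0]]
F = [[0.5], [0.5]]
H_f = f_pooling(Z, F)                                # [[0.5], [0.75]]
H_ifo = ifo_pooling(Z, F, [[1.0], [1.0]], [[1.0], [0.5]])  # [[1.0], [0.75]]
```

Both variants keep the elementwise structure, so the per-channel independence noted for the main pooling component still holds.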
Variants
- Zoneout: a dropout-like regularizer for recurrent networks
- dense skip connections, as in DenseNet
- attention for encoder-decoder models
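The zoneout variant can be sketched as an operation on the forget gate. As I understand the paper, it is written as F = 1 - dropout(1 - sigmoid(W_f * X)) with dropout's automatic rescaling switched off; selecting 1.0 directly, as below, is the same operation. The function name `zoneout_forget_gate`, the parameter `p` and the use of a seeded `random.Random` are assumptions made for illustration.

```python
import random

def zoneout_forget_gate(F, p, rng):
    """Training-time zoneout on forget gates F (a list of T vectors).

    With probability p a channel's forget gate is forced to 1.0, so the
    pooling step copies the previous state unchanged for that channel.
    Otherwise the gate value is left as it was (no rescaling).
    """
    return [[1.0 if rng.random() < p else fj for fj in f] for f in F]

rng = random.Random(0)  # seeded only to make the sketch repeatable
F = [[0.2, 0.7], [0.5, 0.9]]
F_zoned = zoneout_forget_gate(F, 0.5, rng)
```

At test time the gate would be used without this masking.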
Experiments
- Sentiment Classification (document binary-classification)
- IMDb movie review
- 25,000 positive and 25,000 negative reviews
- Language Modeling (word-level prediction)
- PTB: Penn Treebank
- Character-level Machine Translation
- IWSLT German-English spoken-domain translation task
Results (sentiment classification)
- best suited to a small batch_size and a long seq_len (up to 16 times faster)
- training time is 3 times faster
Results (sentiment classification)
- (figure) final layer's hidden state
Results (language modeling)
Results (character-level machine translation)
BLEU: higher is better
http://unicorn.ike.tottori-u.ac.jp/2010/s072046/paper/graduation-thesis/node32.html
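For reference, the standard definition of the BLEU score is reproduced below. The symbols follow the usual convention and are not taken from the slides: p_n is the modified n-gram precision, w_n the weight of each n-gram order (commonly 1/N with N = 4), c the total length of the candidate translation and r the effective reference length.

```latex
\begin{align}
\mathrm{BLEU} &= \mathrm{BP} \cdot \exp\left( \sum_{n=1}^{N} w_n \log p_n \right) \\
\mathrm{BP} &=
\begin{cases}
1 & \text{if } c > r \\
\exp\left(1 - r / c\right) & \text{if } c \le r
\end{cases}
\end{align}
```

Both factors are at most 1, so the score lies between 0 and 1 (often reported multiplied by 100), and a larger value means closer agreement with the reference translation.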
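Discussion
- the reason QRNN was slightly less accurate than LSTM is thought to be that it approximates the hidden state h[t-1] using the preceding inputs x[t-1], x[t-2], ... instead
- when the hidden state is approximated from the inputs, the two coincide if the filter size k over previous timesteps is extended to infinity (in the sentiment classification task, accuracy improved when k was increased)
- so the filter size should be increased, but the open question is how much the computation speed drops as a result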
Conclusion
- QRNN = an RNN-style sequence model that computes like a CNN
- can process the timesteps of a sequence in parallel
- up to 16 times faster than LSTM at training and test time
- makes visual analysis of the hidden state easier
Reference
- LSTM
- LSTMネットワークの概要 (Overview of LSTM networks) https://qiita.com/KojiOhki/items/89cd7b69a8a6239d67ca
- わかるLSTM ~ 最近の動向と共に (Understanding LSTM, with recent trends) https://qiita.com/KojiOhki/items/89cd7b69a8a6239d67ca
- ニューラルネットワーク勉強会 (Neural network study group) http://isw3.naist.jp/~neubig/student/2015/seitaro-s/161025neuralnet_study_LSTM.pdf
- tool used to draw the 3D convolution figures
- Tinkercad https://www.tinkercad.com/
- QRNN
- LSTMを超える期待の新星、QRNN (QRNN, a promising newcomer that surpasses LSTM) https://qiita.com/icoxfog417/items/d77912e10a7c60ae680e
- SlideShare https://www.slideshare.net/DeepLearningJP2016/dlquasirecurrent-neural-networks?qid=a4ead77d-d8dd-458b-965c-5e53723d7757&v=&b=&from_search=1
- official PyTorch implementation https://github.com/salesforce/pytorch-qrnn/blob/master/torchqrnn/qrnn.py

Lab literature presentation, 11/10: QRNN
