핵심 딥러닝 입문, Chapter 4: RNN

This document discusses recurrent neural networks (RNNs) and their training methods. It covers the basic architecture of RNNs, including their ability to process sequential data over time. It then discusses feedforward propagation and backpropagation for training RNNs, including challenges like exploding and vanishing gradients. It introduces techniques like truncated backpropagation through time (BPTT) and mini-batches to help address these issues during training. The document provides code examples to help understand RNN concepts in practice.

Interaction Lab. Seoul National University of Science and Technology
핵심 딥러닝 입문
Chapter 4. RNN
Jeong Jae-Yeop

Agenda
■Intro
■Training method
■Code practice
■Conclusion

Intro
■What is RNN?
▪ Recurrent Neural Network
• Sequence data
• $t$ : Time
[Figure: RNN diagram with input, hidden, and output layers]

■Recurrent architecture

■Activation function
▪ Hyperbolic tangent
• $x_t$ : Input
• $W_x$ : Input weight
• $b$ : Bias
• $h_{t-1}$ : Previous output
• $W_h$ : Previous output weight
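The symbols on this slide suggest a hidden-state update of the form h_t = tanh(x_t W_x + h_{t-1} W_h + b). The formula itself is not in the extracted text, so the sketch below takes it as an assumption. It uses scalars for readability, and the names `rnn_cell_scalar` and `tanh_grad` are illustrative only.

```python
import math

def rnn_cell_scalar(x_t, h_prev, w_x, w_h, b):
    """One recurrent step with scalar values: h_t = tanh(x_t*w_x + h_prev*w_h + b)."""
    return math.tanh(x_t * w_x + h_prev * w_h + b)

def tanh_grad(h_t):
    """Derivative of tanh written in terms of its output: 1 - h_t**2."""
    return 1.0 - h_t ** 2

h = 0.0  # initial hidden state (assumed to be zero)
for x in [1.0, -2.0, 0.5]:
    h = rnn_cell_scalar(x, h, w_x=0.8, w_h=0.5, b=0.1)
    print(f"x={x:+.1f}  h={h:+.4f}  dtanh={tanh_grad(h):.4f}")
```

The output always stays inside (-1, 1), and the derivative is largest (equal to 1) when the output is 0, which matters later when gradients are multiplied across many time steps.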

Training method
■Feed forward propagation
▪ Compute and store the intermediate variables in order, from the input layer to the output layer of the NN
■Backpropagation
▪ Computes the gradient of the loss with respect to each parameter of the NN

■Feed forward propagation of RNN
▪ Deep Neural Network
• $U = XW + B$
• $Y = f(U)$
▪ RNN
• $U^{(t)} = X^{(t)} W + Y^{(t-1)} V + B$
• $Y^{(t)} = f(U^{(t)})$
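The two sets of equations on this slide can be compared side by side in NumPy. This is a minimal sketch, not the book's code: the function names, the (batch, time, features) layout, the zero initial state, and the choice of tanh for f are all assumptions made for illustration.

```python
import numpy as np

def dense_forward(X, W, B, f=np.tanh):
    """Feed-forward layer: U = XW + B, Y = f(U)."""
    return f(X @ W + B)

def rnn_forward(X_seq, W, V, B, f=np.tanh):
    """RNN layer: U(t) = X(t)W + Y(t-1)V + B, Y(t) = f(U(t)).

    X_seq has shape (batch, time, n_in); the result has shape
    (batch, time, n_hidden).
    """
    batch, steps, _ = X_seq.shape
    n_hidden = W.shape[1]
    Y_prev = np.zeros((batch, n_hidden))      # assumed zero initial state
    outputs = np.zeros((batch, steps, n_hidden))
    for t in range(steps):
        U = X_seq[:, t, :] @ W + Y_prev @ V + B
        Y_prev = f(U)                          # reused at the next time step
        outputs[:, t, :] = Y_prev
    return outputs

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = rng.normal(scale=0.1, size=(n_in, n_hidden))
V = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
B = np.zeros(n_hidden)
X_seq = rng.normal(size=(2, 5, n_in))
Y_seq = rnn_forward(X_seq, W, V, B)
print(Y_seq.shape)  # (2, 5, 4)
```

With a zero initial state, the first time step reduces to the plain feed-forward layer; the extra term Y(t-1)V only contributes from the second step on.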

■Feed forward propagation of RNN
[Figure: Input(t) and the previous output are each multiplied by a weight matrix (matrix product), added together with the bias, and passed through the activation function; the output goes to the next layer and to the next time point.]

■Feed forward propagation of RNN
▪ $U^{(t)} = x_t W_{xh} + h_{t-1} W_{hh} + b_h$
• Same update as in the matrix form, with $W_{xh}$, $W_{hh}$, $b_h$ in place of $W$, $V$, $B$ and $h_{t-1}$ in place of $Y^{(t-1)}$

■Backpropagation of RNN
▪ We have to update the parameters $W_{xh}$, $W_{hh}$, and $b_h$
[Figure: backpropagation through one RNN step, with the gradient $dh_{t-1}$ passed back to the previous time step]
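A minimal sketch of the backward pass for a single time step, assuming the forward step is h_t = tanh(x_t W_xh + h_{t-1} W_hh + b_h) as in the forward-propagation slide. The function names and the toy loss (the sum of all entries of h_t) are illustrative assumptions, not the book's implementation.

```python
import numpy as np

def rnn_step_forward(x_t, h_prev, W_xh, W_hh, b_h):
    """h_t = tanh(x_t W_xh + h_prev W_hh + b_h), inputs shaped (batch, features)."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

def rnn_step_backward(dh_t, x_t, h_prev, h_t, W_xh, W_hh):
    """Gradients for one step, given dL/dh_t of shape (batch, n_hidden)."""
    du = dh_t * (1.0 - h_t ** 2)   # back through tanh
    dW_xh = x_t.T @ du             # gradient for the input weight
    dW_hh = h_prev.T @ du          # gradient for the recurrent weight
    db_h = du.sum(axis=0)          # gradient for the bias
    dh_prev = du @ W_hh.T          # dh_{t-1}: handed to the previous time step
    return dW_xh, dW_hh, db_h, dh_prev

rng = np.random.default_rng(0)
x_t = rng.normal(size=(2, 3))
h_prev = rng.normal(size=(2, 4))
W_xh = rng.normal(size=(3, 4))
W_hh = rng.normal(size=(4, 4))
b_h = np.zeros(4)

h_t = rnn_step_forward(x_t, h_prev, W_xh, W_hh, b_h)
dh_t = np.ones_like(h_t)           # toy loss: sum of all entries of h_t
dW_xh, dW_hh, db_h, dh_prev = rnn_step_backward(dh_t, x_t, h_prev, h_t, W_xh, W_hh)
print(dW_xh.shape, dW_hh.shape, db_h.shape, dh_prev.shape)  # (3, 4) (4, 4) (4,) (2, 4)
```

In a full sequence, `dh_prev` from step t is added to whatever gradient the loss sends directly into step t-1, and the three parameter gradients are summed over all time steps because the same weights are shared.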

■BPTT (Backpropagation Through Time)
▪ As the time series gets longer, the computing resources consumed by BPTT (memory for the stored intermediate values and computation time) also increase
▪ As the time series gets longer, the backpropagated gradient becomes unstable: it can vanish or explode
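The instability can be illustrated with a deliberately simplified model: a scalar linear recurrence in which every backward step multiplies the gradient by the recurrent weight. The tanh derivative is left out here; since it lies between 0 and 1, it can only shrink the gradient further. The function name `backprop_factor` is a hypothetical helper.

```python
def backprop_factor(w_hh, steps):
    """Scale applied to a gradient after it travels back `steps` time steps
    through a scalar linear recurrence with recurrent weight w_hh."""
    factor = 1.0
    for _ in range(steps):
        factor *= w_hh   # one multiplication per time step
    return factor

for w in (0.5, 1.0, 1.5):
    scales = [backprop_factor(w, steps) for steps in (1, 10, 50)]
    print(w, ["%.3e" % s for s in scales])
```

A weight below 1 drives the factor toward zero (vanishing gradient), a weight above 1 makes it grow without bound (exploding gradient), and only a weight of exactly 1 keeps it stable.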

■Truncated BPTT
▪ Data must be fed in order, since the hidden state is carried forward from one block to the next
▪ Cut the backpropagation connection at an appropriate length
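The two points above can be sketched as a training loop that walks through the sequence block by block. This is an illustration under several assumptions: there is no separate output layer (the hidden state is compared directly with the target using a squared error), updates are plain SGD, and all function names are hypothetical.

```python
import numpy as np

def forward_block(xs, h0, W_xh, W_hh, b_h):
    """Run the RNN over one block; xs has shape (steps, n_in)."""
    hs = [h0]                                  # hs[0] is the carried-in state
    for x in xs:
        hs.append(np.tanh(x @ W_xh + hs[-1] @ W_hh + b_h))
    return hs

def backward_block(xs, hs, dhs, W_hh):
    """Backpropagate inside one block only; dhs[t] is dL/dh at step t."""
    n_hidden = W_hh.shape[0]
    dW_xh = np.zeros((xs.shape[1], n_hidden))
    dW_hh = np.zeros_like(W_hh)
    db_h = np.zeros(n_hidden)
    dh_next = np.zeros(n_hidden)               # nothing flows in from later blocks
    for t in reversed(range(len(xs))):
        du = (dhs[t] + dh_next) * (1.0 - hs[t + 1] ** 2)
        dW_xh += np.outer(xs[t], du)
        dW_hh += np.outer(hs[t], du)
        db_h += du
        dh_next = W_hh @ du                    # gradient for the previous step
    return dW_xh, dW_hh, db_h

def truncated_bptt(xs, targets, block_len, n_hidden, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    W_xh = rng.normal(scale=0.1, size=(xs.shape[1], n_hidden))
    W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
    b_h = np.zeros(n_hidden)
    h = np.zeros(n_hidden)
    losses = []
    for start in range(0, len(xs), block_len):  # blocks in their original order
        x_blk = xs[start:start + block_len]
        t_blk = targets[start:start + block_len]
        hs = forward_block(x_blk, h, W_xh, W_hh, b_h)
        dhs = [hs[t + 1] - t_blk[t] for t in range(len(x_blk))]
        losses.append(0.5 * sum(float(d @ d) for d in dhs))
        dW_xh, dW_hh, db_h = backward_block(x_blk, hs, dhs, W_hh)
        W_xh -= lr * dW_xh
        W_hh -= lr * dW_hh
        b_h -= lr * db_h
        h = hs[-1]        # the state is carried forward; the gradient is not
    return losses

rng = np.random.default_rng(1)
xs = rng.normal(size=(20, 3))
targets = np.zeros((20, 2))
losses = truncated_bptt(xs, targets, block_len=5, n_hidden=2)
print(len(losses))  # 4 blocks of 5 steps each
```

The only link between blocks is `h = hs[-1]`: forward propagation stays continuous, while `dh_next` restarts from zero in every block, which is what bounds the cost and the length of the gradient path.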

■Truncated BPTT using mini-batch
▪ Mini-batch size : 2
▪ 1,000 data points : split into two streams of 500 / 500, one per mini-batch row
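One way to read the 500 / 500 split is that each row of the mini-batch starts reading the series at a different offset and then advances in order. The sketch below follows that reading; the helper names and the wrap-around at the end of the series are assumptions.

```python
def batch_offsets(data_len, batch_size):
    """Starting index of each mini-batch row when the series is split evenly."""
    jump = data_len // batch_size
    return [i * jump for i in range(batch_size)]

def get_batch(data, offsets, step, block_len):
    """One mini-batch: `block_len` consecutive items for every row.

    `step` counts how many blocks have already been consumed. Indices wrap
    around at the end of the series (an assumption of this sketch).
    """
    n = len(data)
    return [
        [data[(off + step * block_len + k) % n] for k in range(block_len)]
        for off in offsets
    ]

data = list(range(1000))
offsets = batch_offsets(len(data), batch_size=2)
print(offsets)                          # [0, 500]
print(get_batch(data, offsets, 0, 5))   # [[0, 1, 2, 3, 4], [500, 501, 502, 503, 504]]
print(get_batch(data, offsets, 1, 5))   # [[5, 6, 7, 8, 9], [505, 506, 507, 508, 509]]
```

Within each row the data still arrives in order, so the hidden state of that row can be carried from one block to the next as truncated BPTT requires.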

Code practice
■Binary addition
▪ $5 = 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$ : 101
▪ $36 = 1 \times 2^5 + 0 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$ : 100100
▪ Input : two randomly selected binary numbers
▪ Label : sum of the two numbers
▪ Link
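A small sketch of how such training pairs could be generated. The linked practice code is not part of this transcript, so the bit width, the least-significant-bit-first ordering (so that carries move forward in time), and the helper names are assumptions.

```python
import random

def to_bits(n, width):
    """Binary digits of n, least significant bit first."""
    return [(n >> i) & 1 for i in range(width)]

def make_sample(width=8, rng=random):
    """Two random addends whose sum still fits in `width` bits, plus the label."""
    limit = 2 ** (width - 1)
    a, b = rng.randrange(limit), rng.randrange(limit)
    # One (bit of a, bit of b) pair per time step.
    inputs = list(zip(to_bits(a, width), to_bits(b, width)))
    label = to_bits(a + b, width)
    return a, b, inputs, label

print(to_bits(5, 3)[::-1])    # [1, 0, 1]
print(to_bits(36, 6)[::-1])   # [1, 0, 0, 1, 0, 0]

a, b, inputs, label = make_sample(width=8, rng=random.Random(0))
print(a, b, inputs, label)
```

Each sample is a sequence of 8 time steps with two input bits and one target bit per step, which makes the task a natural fit for an RNN: the carry has to be remembered in the hidden state.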

Conclusion
■Disadvantage of RNN
▪ Vanishing and exploding gradients
• LSTM and GRU : gated architectures that mitigate vanishing gradients
• Gradient clipping : limits exploding gradients
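Gradient clipping can be sketched in a few lines. The version below rescales all gradients together when their combined L2 norm exceeds a threshold; this is one common variant, chosen here for illustration, and the function name and threshold are assumptions rather than the slide's own code.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together if their combined L2 norm exceeds max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads, total

grads = [np.array([[3.0, 0.0]]), np.array([4.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
print(norm_before)  # 5.0
print(clipped)
```

Because every gradient is multiplied by the same factor, the update direction is preserved and only its length is limited. Clipping does nothing for vanishing gradients, which is where LSTM and GRU come in.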
