Playing Atari with Deep Reinforcement Learning
KyeongUkJang
1.
Playing Atari with
Deep Reinforcement Learning
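$$V_\pi(s) = E_\pi\left[R_1 + \gamma R_2 + \cdots \mid s\right] = E_\pi\left[\sum_{t=1}^{T} \gamma^{t-1} R_t \mid s\right]$$
$$V_\pi^{i+1}(s) = V_\pi^{i}(s) + \frac{1}{i+1}\left(g_{i+1} - V_\pi^{i}(s)\right)$$
$$V_\pi^{1}(s) = V_\pi^{0}(s) + \frac{1}{1}\left(g_1 - V_\pi^{0}(s)\right) = g_1$$
$$V_\pi^{2}(s) = V_\pi^{1}(s) + \frac{1}{2}\left(g_2 - V_\pi^{1}(s)\right) = \frac{1}{2}\left(g_1 + g_2\right)$$
$$V_\pi^{3}(s) = V_\pi^{2}(s) + \frac{1}{3}\left(g_3 - V_\pi^{2}(s)\right) = \frac{1}{3}\left(g_1 + g_2 + g_3\right)$$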
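The discounted return and the running-average value estimate on this slide can be checked numerically. The following is a minimal sketch, not code from the deck: the function names `discounted_return` and `incremental_mean` and the sample numbers are illustrative assumptions, and each `g` is taken to be one sampled return observed from the same state.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma^(t-1) * R_t for t = 1..T (rewards[0] is R_1)."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def incremental_mean(returns, v0=0.0):
    """Apply V <- V + (g - V) / (i + 1) once per sampled return.

    Returns the list of estimates V^1, V^2, ... so each step can be
    compared with the plain average of the returns seen so far.
    """
    v = v0
    history = []
    for i, g in enumerate(returns):
        v = v + (g - v) / (i + 1)  # step size 1/(i+1) gives the exact mean
        history.append(v)
    return history


# Hypothetical sampled returns g_1, g_2, g_3 for a single state.
sampled_returns = [4.0, 2.0, 6.0]
estimates = incremental_mean(sampled_returns)
example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

With these inputs the third estimate equals the ordinary mean of the three returns, which is the point of the expansion for V^1, V^2, and V^3: the initial value V^0 drops out after the first update.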
19.
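$$V_\pi^{i+1}(s) = V_\pi^{i}(s) + \frac{1}{i+1}\left(g_{i+1} - V_\pi^{i}(s)\right)$$
$$V_\pi^{i+1}(s) = V_\pi^{i}(s) + \alpha\left(g_{i+1} - V_\pi^{i}(s)\right)$$
$$V_\pi^{i+1}(s) = (1 - \alpha)\,V_\pi^{i}(s) + \alpha\, g_{i+1}$$
SARSA:
$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha\left(R_{t+1} + \gamma\, Q(S_{t+1}, A_{t+1}) - Q(S_t, A_t)\right)$$
Q-learning:
$$Q(S_t, A_t) \leftarrow Q(S_t, A_t) + \alpha\left(R_{t+1} + \gamma \max_{a} Q(S_{t+1}, a) - Q(S_t, A_t)\right)$$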
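The two Q updates differ only in the bootstrap target: SARSA uses the action actually taken next, while Q-learning uses the greedy maximum over next actions. Below is a minimal tabular sketch under stated assumptions: `Q` is a plain list of per-state action-value lists, the function names and the toy numbers are hypothetical, and terminal-state handling is left out for brevity.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action a_next that was chosen."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """Off-policy update: bootstrap from the best action in s_next."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


# Toy table: 2 states x 2 actions (illustrative values only).
Q_sarsa = [[0.0, 0.0], [1.0, 2.0]]
Q_qlearn = [[0.0, 0.0], [1.0, 2.0]]

# Same transition for both: state 0, action 0, reward 1, next state 1.
# The behaviour policy happened to pick the non-greedy action 0 next.
sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=0.5)
q_learning_update(Q_qlearn, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=0.5)
```

Because the next action was not the greedy one, the two tables diverge after a single step: SARSA moves toward a target of 1.5, Q-learning toward 2.0.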
26.
- Mini-batch size: 32
- Replay memory size: 400,000
- ε: decreases from 1 to 0.1 over 100,000 steps
- Discount factor: 0.99
- Learning rate: 0.00025
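These settings can be collected in one configuration object. The sketch below is illustrative rather than the deck's implementation: the names `DQNConfig`, `epsilon_at`, and `ReplayMemory` are hypothetical, and the linear shape of the ε decay is an assumption, since the slide gives only the start value, end value, and number of steps.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class DQNConfig:
    batch_size: int = 32
    replay_memory_size: int = 400_000
    epsilon_start: float = 1.0
    epsilon_end: float = 0.1
    epsilon_decay_steps: int = 100_000
    gamma: float = 0.99            # discount factor
    learning_rate: float = 0.00025


def epsilon_at(step, cfg):
    """Exploration rate at a given step (ASSUMED linear decay, then constant)."""
    clipped = min(max(step, 0), cfg.epsilon_decay_steps)
    fraction = clipped / cfg.epsilon_decay_steps
    return cfg.epsilon_start + fraction * (cfg.epsilon_end - cfg.epsilon_start)


class ReplayMemory:
    """Fixed-size buffer; the oldest transitions are dropped when full."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


cfg = DQNConfig()
memory = ReplayMemory(cfg.replay_memory_size)
for step in range(100):
    # Placeholder transition tuple: (state, action, reward, next_state, done).
    memory.push((step, 0, 0.0, step + 1, False))
batch = memory.sample(cfg.batch_size)
```

Sampling mini-batches uniformly from the buffer, rather than training on consecutive frames, is what breaks the correlation between successive updates.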
30.
References
- Stanford University School of Engineering: https://www.youtube.com/watch?v=lvoHnicueoE
- Sung Kim: https://www.youtube.com/watch?v=V7_cNTfm2i8&list=PL0oFI08O71gKjGhaWctTPvvM7_cVzsAtK&index=5
- 파이썬과 케라스로 배우는 강화학습 (Reinforcement Learning with Python and Keras)
- 좌충우돌 강화학습의 이론과 구현 (Theory and Implementation of Reinforcement Learning by Trial and Error), manuscript, posted by 숨은원리 출판사