3. Abstract
First deep learning model using reinforcement
learning
• Successfully learn control policies directly
• From high-dimensional sensory input (pixels)
CNN model trained with a variant of Q-learning
• Input: raw pixels
• Output: a value function estimating future reward
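The idea above can be sketched in a few lines. This is a minimal illustration, not the paper's CNN: a hypothetical linear approximator stands in for the network, and the action count, discount factor, and learning rate are assumed values. It shows the core of the Q-learning variant: the network maps pixels to one Q-value per action, and each update regresses the chosen action's value toward the target r + γ·max Q(s′, a′).

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 4        # hypothetical action count; Atari games vary
GAMMA = 0.99         # assumed discount factor
STATE_DIM = 84 * 84  # flattened preprocessed frame

# Hypothetical linear "network" standing in for the CNN:
# maps a flattened pixel state to one Q-value per action.
W = rng.normal(scale=0.01, size=(STATE_DIM, N_ACTIONS))

def q_values(state):
    """Estimate Q(s, a) for every action from raw pixels."""
    return state @ W

# One Q-learning step on a single transition (s, a, r, s')
state = rng.random(STATE_DIM)
next_state = rng.random(STATE_DIM)
reward, action, lr = 1.0, 2, 1e-4

target = reward + GAMMA * q_values(next_state).max()  # bootstrapped return
td_error = target - q_values(state)[action]
# Gradient step on the squared TD error w.r.t. the chosen action's weights
W[:, action] += lr * td_error * state
```

In the actual paper the linear map is replaced by a convolutional network and the update is applied to minibatches drawn from a replay memory.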
Applied to seven Atari 2600 games (no adjustment of
architecture or learning algorithm)
• Outperforms ALL previous approaches on six games
• Surpasses a human expert on three games
4. Introduction
Learning directly from high-dimensional sensory input
is a long-standing challenge of RL
Most successful RL applications rely on hand-crafted features
Recent advances in deep learning extract high-level
features from raw sensory data
• Breakthroughs in computer vision / speech recognition
Neural architectures on supervised/unsupervised
learning
• Convolutional networks
• Multilayer perceptrons
• Restricted Boltzmann machines
• Recurrent neural nets
Problems to solve: Motivation
5. Introduction
Most DL requires hand-labeled training data
• RL must learn from a scalar reward signal
• Reward signal is often sparse, noisy, and delayed
• Delay between actions and resulting rewards can be
thousands of time steps
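The delayed-reward problem can be made concrete with the discounted return, G_t = r_t + γ·r_{t+1} + γ²·r_{t+2} + …. The snippet below is a small illustration with an assumed discount factor: when the only reward arrives a thousand steps in the future, its discounted contribution to the current state's value is tiny, which is what makes credit assignment hard.

```python
GAMMA = 0.99  # assumed discount factor

def discounted_return(rewards, gamma=GAMMA):
    """G_t = r_t + gamma*r_{t+1} + ... , computed right-to-left."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A sparse, delayed reward: a single point 1000 steps in the future
rewards = [0.0] * 1000 + [1.0]
print(discounted_return(rewards))  # 0.99**1000, roughly 4e-5: heavily attenuated
```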
CNN with a variant of Q-learning
• Most DL assumes data samples are independent
• RL encounters sequences of highly correlated states
Experience replay: sample past transitions at random to
break these correlations
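Experience replay can be sketched as a fixed-size buffer of transitions. This is a minimal illustration (class name, capacity, and batch size are assumptions, not from the paper): the agent stores (state, action, reward, next_state, done) tuples as it acts, and training draws uniform random minibatches, so consecutive, highly correlated states are not consumed in order.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size memory of (state, action, reward, next_state, done)
    transitions. Uniform random sampling breaks the correlation between
    consecutive states that plain online Q-learning would train on."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Usage: store transitions while acting, then train on random minibatches
buf = ReplayBuffer(capacity=100)
for t in range(50):
    buf.push(t, 0, 0.0, t + 1, False)  # dummy transitions for illustration
batch = buf.sample(8)  # 8 decorrelated transitions for one update
```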
Remaining challenges