27. Optimal Action Value Q*(s,a)
Q*(s,a) = max_π 𝔼[R_t | s_t = s, a_t = a, π]
Approximate it with a parameterized function: Q(s,a;θ) ≈ Q*(s,a)
Minimize the loss function L_i(θ_i):
L_i(θ_i) = 𝔼_{s,a ~ p(∙)} [(y_i − Q(s,a;θ_i))²],
y_i = 𝔼_{s′~Ɛ} [r + γ max_{a′} Q(s′,a′;θ_{i−1}) | s,a]
p(s,a) is a probability distribution over sequences s and actions a
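The target y_i and the squared-error loss above can be sketched for a small batch of transitions. This is a minimal NumPy illustration, not the slides' implementation: `dqn_targets`, `dqn_loss`, and the toy batch values are made up for the example, and the expectation over p(∙) is approximated by a batch mean.

```python
import numpy as np

def dqn_targets(rewards, next_q_values, gamma=0.99, terminal=None):
    # y_i = r + gamma * max_a' Q(s', a'; theta_{i-1}); the old-parameter
    # network is represented here simply by the precomputed next_q_values.
    max_next = next_q_values.max(axis=1)
    if terminal is not None:
        max_next = np.where(terminal, 0.0, max_next)  # no bootstrap at episode end
    return rewards + gamma * max_next

def dqn_loss(q_values, actions, targets):
    # L_i(theta_i) ~ mean over the batch of (y_i - Q(s, a; theta_i))^2,
    # taking Q only at the actions actually chosen.
    q_sa = q_values[np.arange(len(actions)), actions]
    return np.mean((targets - q_sa) ** 2)

# Toy batch: 2 transitions, 3 actions.
rewards = np.array([1.0, 0.0])
next_q = np.array([[0.5, 2.0, 1.0],
                   [0.0, 0.0, 0.0]])
y = dqn_targets(rewards, next_q, gamma=0.9)   # [1.0 + 0.9*2.0, 0.0] = [2.8, 0.0]
q = np.array([[2.8, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
loss = dqn_loss(q, np.array([0, 0]), y)       # mean of (2.8-2.8)^2 and (0.0-1.0)^2 = 0.5
```

In practice only θ_i is differentiated; θ_{i−1} (the target network) is held fixed, which is why `next_q_values` enters here as plain data.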
32. Book
Reinforcement Learning: An Introduction (Sutton & Barto)
http://incompleteideas.net/book/the-book-2nd.html
Courses
David Silver’s UCL Course on RL
http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Teaching.html
Berkeley CS 294: Deep Reinforcement Learning
rll.berkeley.edu/deeprlcourse/
Implementations
Denny Britz
https://github.com/dennybritz/reinforcement-learning
Article
Deep Reinforcement Learning Doesn’t Work Yet
https://www.alexirpan.com/2018/02/14/rl-hard.html
Code
PyTorch Deep Q Learning
http://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html
Papers
Playing Atari with Deep Reinforcement Learning
https://deepmind.com/research/publications/playing-atari-deep-reinforcement-learning/
Human-level Control through Deep Reinforcement Learning
https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf