A Deep Reinforcement Learning Approach to Traffic Management
By Osvaldo Castellanos
Motivation
Ref: Machine Learning for Everyone
Ref: https://xkcd.com/1838/
RL Model
Ref: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html#what-is-reinforcement-learning
Markov Decision Processes
Ref: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html#what-is-reinforcement-learning
Important Concepts:
Ref: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html#what-is-reinforcement-learning
Backup Diagram
Ref: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html#what-is-reinforcement-learning
Ref: https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe
A Taxonomy of RL Algorithms
Ref: Spinning up RL
Approaches
• Dynamic Programming
  • Policy Evaluation
  • Policy Improvement
  • Policy Iteration
• Monte-Carlo Methods
• Temporal-Difference Learning
  • SARSA: On-Policy TD
  • Q-Learning: Off-Policy TD
  • Deep Q-Network
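The on-policy/off-policy distinction between SARSA and Q-Learning comes down to which next-step value the TD target bootstraps from. The following is a minimal tabular sketch, not code from the gym-traffic repository; the function names, the dictionary-based Q-table, and the default `alpha` and `gamma` values are assumptions made for illustration.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD: bootstrap from the action the agent actually takes next."""
    target = r + gamma * q.get((s_next, a_next), 0.0)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)


def q_learning_update(q, s, a, r, s_next, n_actions, alpha=0.1, gamma=0.9):
    """Off-policy TD: bootstrap from the greedy (highest-valued) next action."""
    best_next = max(q.get((s_next, b), 0.0) for b in range(n_actions))
    target = r + gamma * best_next
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (target - old)


if __name__ == "__main__":
    # Same transition, two different targets.
    q_sarsa = {(1, 0): 1.0, (1, 1): 2.0}
    q_qlearn = dict(q_sarsa)
    sarsa_update(q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0, alpha=0.5, gamma=0.5)
    q_learning_update(q_qlearn, s=0, a=0, r=1.0, s_next=1, n_actions=2, alpha=0.5, gamma=0.5)
    print(q_sarsa[(0, 0)], q_qlearn[(0, 0)])
```

Unseen state-action pairs default to a value of 0.0 here; a Deep Q-Network replaces the table with a neural network but keeps the Q-Learning style target.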
Deep Q-Network
Ref: https://2.bp.blogspot.com/-bZERYUNyjao/Wa98yt7GjhI/AAAAAAAACt8/SYQjUNrbe1YDtKTMKR6LPt68C0pPqkoowCLcBGAs/s1600/DRL.JPG
OpenAI Gym
Main Functions Needed in a Custom Environment to Interface with Gym:
• reset
• step
• render
Step returns:
• next state
• reward
• done
• info
https://github.com/oscastellanos/gym-traffic/blob/master/gym_traffic/envs/TrEnv.py
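To make the reset/step/render interface concrete, here is a minimal sketch of an environment that follows the same shape. It is not the author's TrEnv.py and it deliberately does not import gym; the class name `ToyTrafficEnv`, the two-queue traffic model, the reward, and all parameters are hypothetical and chosen only to keep the example self-contained.

```python
import random


class ToyTrafficEnv:
    """Hypothetical one-intersection environment with a Gym-style interface."""

    def __init__(self, arrival_prob=0.3, max_steps=50, seed=0):
        self.arrival_prob = arrival_prob  # chance a car arrives per approach per step
        self.max_steps = max_steps        # episode length
        self.rng = random.Random(seed)
        self.reset()

    def _state(self):
        # (cars waiting north-south, cars waiting east-west, approach with green)
        return (self.queues[0], self.queues[1], self.green)

    def reset(self):
        """Start a new episode and return the initial state."""
        self.queues = [0, 0]
        self.green = 0
        self.t = 0
        return self._state()

    def step(self, action):
        """Apply an action (0 or 1: which approach gets green)."""
        self.green = action
        for i in range(2):
            if self.rng.random() < self.arrival_prob:
                self.queues[i] += 1
        if self.queues[self.green] > 0:
            self.queues[self.green] -= 1   # one car clears the green approach
        self.t += 1
        reward = -sum(self.queues)          # fewer waiting cars is better
        done = self.t >= self.max_steps
        info = {"t": self.t}
        return self._state(), reward, done, info

    def render(self):
        """Text rendering; the real project draws with pygame instead."""
        print(f"t={self.t} NS={self.queues[0]} EW={self.queues[1]} green={self.green}")


if __name__ == "__main__":
    env = ToyTrafficEnv()
    state = env.reset()
    done = False
    while not done:
        state, reward, done, info = env.step(random.randrange(2))
    env.render()
```

The four-value return of `step` matches the list above. Newer Gym and Gymnasium releases split `done` into `terminated` and `truncated`, so the tuple has five elements there.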
pygame (the library) is a free and open source Python programming language library for making multimedia applications like games, built on top of the excellent SDL library. Like SDL, pygame is highly portable and runs on nearly every platform and operating system.
• Does not require OpenGL
• Multi-core CPUs can be used easily
• Uses optimized C and Assembly code for core functions
Ref: https://www.pygame.org/wiki/about
traffic_simulator.py
https://github.com/oscastellanos/gym-traffic/blob/master/gym_traffic/envs/traffic_simulator.py
"Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks," Liang et al.,
(2018), arxiv.org/abs/1803.11115
Faulty Reward Example
• https://youtu.be/tlOIHko8ySg
• From https://openai.com/blog/faulty-reward-functions/
• An intersection can be in one of several statuses.
• Complex behaviors such as "left turn on green" require their own status.
• The time duration spent at one status is called a phase. The number of phases is determined by the number of legal statuses.
• In the Liang et al. paper, a cycle consists of phases in a fixed sequence, but the duration of every phase is adaptive.
"Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks," Liang et al., (2018),
arxiv.org/abs/1803.11115
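The idea of a fixed phase sequence with adaptive durations can be sketched in a few lines. This is an illustration only, not the paper's or the simulator's implementation; the phase names, the 5 to 60 second bounds, and the helper names `adjust_duration` and `run_cycle` are all assumptions.

```python
# Hypothetical legal statuses, visited in this fixed order every cycle.
PHASES = ["NS_through", "NS_left", "EW_through", "EW_left"]


def adjust_duration(durations, phase, delta, min_s=5, max_s=60):
    """Return a copy of durations with one phase lengthened or shortened.

    The result is clamped to [min_s, max_s] seconds (assumed bounds).
    """
    new = dict(durations)
    new[phase] = max(min_s, min(max_s, new[phase] + delta))
    return new


def run_cycle(durations):
    """Return one cycle as a list of (start_time, phase, duration) tuples."""
    t = 0
    timeline = []
    for phase in PHASES:          # the order never changes
        timeline.append((t, phase, durations[phase]))
        t += durations[phase]     # only the durations adapt
    return timeline


if __name__ == "__main__":
    durations = {p: 20 for p in PHASES}
    durations = adjust_duration(durations, "NS_left", -5)
    for start, phase, length in run_cycle(durations):
        print(f"{start:3d}s  {phase:<10}  {length}s")
```

In an RL setting, the agent's action would select which phase to adjust and by how much, while the cycle order stays untouched.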
Example of my gym-traffic
• https://www.youtube.com/watch?v=sVswDx8WfPU
Ref: https://github.com/sarcturus00/Tidy-Reinforcement-learning/blob/master/Pseudo_code/DQN.png
"Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks," Liang et al., (2018),
arxiv.org/abs/1803.11115
A to-do list of upcoming changes to the simulator/environment:
• Refactor traffic_simulator.py
• Add docstrings to methods
• Include more statuses at an intersection
• Extend to multiple lanes
• Implement render in the environment and add compatibility with Gym's Monitor class
• Add TensorBoard summaries for variables
For the Poster:
• Finish implementing DQN
• Adaptive phase duration
• Implement DDQN
• Add more graphs/results comparing random, fixed-timer, DQN, and DDQN
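DQN and DDQN (Double DQN) differ mainly in how the bootstrap target is computed for a transition. The sketch below shows the two targets using plain Python lists in place of network outputs; the function names, argument names, and `gamma` default are assumptions for illustration and are not taken from the project code.

```python
def dqn_target(reward, done, q_target_next, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    q_online_next = [1.0, 3.0]   # online network's Q-values for the next state
    q_target_next = [5.0, 2.0]   # target network's Q-values for the next state
    print(dqn_target(1.0, False, q_target_next, gamma=0.5))
    print(ddqn_target(1.0, False, q_online_next, q_target_next, gamma=0.5))
```

Decoupling selection from evaluation is intended to reduce the overestimation that comes from taking a max over noisy value estimates.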
Final report:
• Implement multi-agent reinforcement learning for multiple intersections
• Add randomness to the environment by closing lanes for a period of time
References:
• "Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks," Liang et al., (2018), arxiv.org/abs/1803.11115
• Machine Learning for Everyone: https://vas3k.com/blog/machine_learning/
• A (Long) Peek into Reinforcement Learning by Lilian Weng: https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html#what-is-reinforcement-learning
• OpenAI Spinning Up: https://spinningup.openai.com/en/latest/spinningup/rl_intro.html
• Understanding RL: The Bellman Equations by Josh Greaves: https://joshgreaves.com/reinforcement-learning/understanding-rl-the-bellman-equations/
• OpenAI Gym basics: https://katefvision.github.io/10703_openai_gym_recitation.pdf
• Diving Deeper into Reinforcement Learning with Q-Learning: https://medium.freecodecamp.org/diving-deeper-into-reinforcement-learning-with-q-learning-c18d0db58efe
THANK YOU!
