Smart Traffic Lights that Learn!
Multi-Agent Reinforcement Learning Integrated Network of Adaptive Traffic Signal Controllers (MARLIN)

The University of Toronto’s Dr. Samah el-Tantawy has developed MARLIN-ATSC (Multi-agent Reinforcement Learning for Integrated Network of Adaptive Traffic Signal Controllers), an artificial intelligence traffic system that uses cameras and inter-system collaboration to determine how best to manage traffic flow throughout a region. MARLIN’s success is largely due to its adaptive nature, which allows it to learn and adjust as it gathers traffic pattern information.

Read more: http://www.utne.com/science-and-technology/marlin-transforms-traffic.aspx#ixzz2yxjuDpCP

Visit http://techronto.com for more information like this.


  1. Smart Traffic Lights that Learn! Multi-Agent Reinforcement Learning Integrated Network of Adaptive Traffic Signal Controllers (MARLIN)
     Samah El-Tantawy, Ph.D., Postdoctoral Fellow, Dept. of Civil Engineering
     Baher Abdulhai, Ph.D., P.Eng., Director, ITS Centre and Testbed, Dept. of Civil Engineering
     Hossam Abdelgawad, Ph.D., P.Eng., Manager of ITS Centre and Testbed
     ACGM 2013 - Intelligent Transport for Smart Cities
  2. Outline
     1. In a Nutshell
     2. Theory in Brief: Reinforcement Learning and Game Theory
     3. Applications: City of Toronto Testbed
     4. Hardware-in-the-Loop Testing Approach: Integration with PEEK ATC-1000
     Next Steps and Q&A
  3. In a Nutshell
     • Grand objective
       • Intersections "talk to each other"
       • Each is affected by what is happening upstream
       • Each affects what is happening downstream
       • Whole-network control in one shot from a grand brain is the dream
     • Issue
       • Intractable theoretically
       • Too complex practically
       • Requires massive and very expensive communication
     • Solution
       • Decentralized
       • Self-learning: agents learn to control their local intersection
       • Game-theory based: agents learn to collaborate
  4. What is MARLIN?
     • Artificial-intelligence-based control software
     • Enables traffic lights to self-learn and self-collaborate with neighbouring traffic lights
     • Cuts down motorists' delay, fuel consumption, and the negative environmental effects of congestion
     • Easier to operate (self-learning)
     • Needs less expensive communication, if any at all (less costly)
  5. MARLIN-ATSC: Level 4 in the Evolution of "Adaptive" Signal Control
     • Level 0: Fixed-Time and Actuated Control (TRANSYT, 1969, UK)
     • Level 1: Centralized Control, Off-line Optimization (SCATS, 1979, Australia; >50 installations worldwide)
     • Level 2: Centralized Control, On-line Optimization (SCOOT, 1981, UK; >170 installations worldwide)
     • Level 3: Distributed Control, Model-Based (OPAC, RHODES, 1992, USA; 5 installations in the USA)
     • Level 4: Distributed Self-Learning Control (MARLIN-ATSC, 2011, Canada)
  6. Issues with Leading ATSC Technologies
     • Centralized: expensive, not scalable, not robust
     • Model-based: relies on an accurate traffic modelling framework, the accuracy of which is questionable
     • Curse of dimensionality: system complexity grows exponentially with the number of intersections/controllers
     • Human intervention requirements: highly skilled labour is needed to operate them due to their complexity
  7. Why is MARLIN Different?
     • MARLIN is self-learning, decentralized, model-free, coordinated, scalable, pattern-sensitive, and generic
     • Leading ATSC technologies, by contrast, are centralized, model-based, and built to a specific design; they suffer from the curse of dimensionality, inefficient coordination, prediction requirements, and human intervention requirements
  8. Learning the Control Law: Reinforcement Learning Architecture
     • RL architecture: the agent observes the state of the environment, takes an action, and receives a reward
     • Goal: optimal control law = mapping between states and actions
     • Q-learning update rule:
       Q_{k+1}(s_k, a_k) = Q_k(s_k, a_k) + α [ r_{k+1} + γ max_a Q_k(s_{k+1}, a) - Q_k(s_k, a_k) ]
       a*_{k+1} = argmax_a Q_{k+1}(s_{k+1}, a)
     • Balancing exploration and exploitation
     • Example Q-table:
       Q    | a1  | a2
       s1   | -10 | -5
       s2   | -3  | -15
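     To make the update rule above concrete, the sketch below shows tabular Q-learning with an epsilon-greedy rule for balancing exploration and exploitation. It is a minimal Python illustration: the class, parameter values, and example states are assumptions for this sketch, not MARLIN's implementation.

```python
import random
from collections import defaultdict

class QLearningAgent:
    """Minimal tabular Q-learning agent (illustrative only, not the MARLIN code)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.actions = list(actions)   # e.g. ["extend", "switch"]
        self.alpha = alpha             # learning rate
        self.gamma = gamma             # discount factor
        self.epsilon = epsilon         # exploration probability
        self.Q = defaultdict(float)    # Q-table: (state, action) -> value

    def choose_action(self, state):
        """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Q(s_k,a_k) += alpha * [r_{k+1} + gamma * max_a Q(s_{k+1},a) - Q(s_k,a_k)]."""
        best_next = max(self.Q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.Q[(state, action)]
        self.Q[(state, action)] += self.alpha * td_error

# Example with hypothetical discretized queue-length states and extend/switch actions.
agent = QLearningAgent(actions=["extend", "switch"])
a = agent.choose_action(state=(3, 5))
agent.update(state=(3, 5), action=a, reward=-2.0, next_state=(2, 6))
```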
  9. RL-based ATSC Architecture
     • Agent: the RL software controlling the signal
     • State: queue lengths
     • Action: extend / switch
     • Reward: delay savings
     • Environment: the traffic simulation
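     The slide's state/action/reward mapping could be encoded as follows for a single intersection. The bin size, delay measure, and function names are assumptions made for this sketch; MARLIN's actual discretization and reward definition may differ.

```python
ACTIONS = ["extend", "switch"]   # extend the current green phase, or switch to the next phase

def encode_state(queue_lengths, bin_size=4, max_bin=5):
    """Discretize per-approach queue lengths (in vehicles) into a small state tuple."""
    return tuple(min(q // bin_size, max_bin) for q in queue_lengths)

def delay_savings(prev_cumulative_delay, current_cumulative_delay):
    """Reward = delay savings: positive when cumulative delay decreases between decisions."""
    return prev_cumulative_delay - current_cumulative_delay

# Example: four approaches with 12, 3, 7 and 1 queued vehicles -> state (3, 0, 1, 0).
state = encode_state([12, 3, 7, 1])
reward = delay_savings(prev_cumulative_delay=250.0, current_cumulative_delay=240.0)  # 10 s saved
```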
  10. MARLIN-ATSC: Coordination Principle
      • Each agent plays a game with each adjacent intersection in its neighbourhood
      • Example on a 3x3 grid of intersections (I1-I9):
        • Intermediate intersection (e.g., I5): 4 games
        • Edge intersection (e.g., I2): 3 games
        • Corner intersection (e.g., I1): 2 games
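      The game counts above follow directly from grid adjacency. The short sketch below (the grid indexing and function name are mine, purely for illustration) enumerates each intersection's neighbour games on a 3x3 grid.

```python
from itertools import product

def neighbour_games(rows=3, cols=3):
    """Map each intersection on a rows x cols grid to the neighbours it plays a game with."""
    games = {}
    for r, c in product(range(rows), range(cols)):
        games[(r, c)] = [(r + dr, c + dc)
                         for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                         if 0 <= r + dr < rows and 0 <= c + dc < cols]
    return games

games = neighbour_games()
print(len(games[(1, 1)]))  # 4 games: intermediate (centre) intersection
print(len(games[(0, 1)]))  # 3 games: edge intersection
print(len(games[(0, 0)]))  # 2 games: corner intersection
```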
  11. MARLIN-ATSC Available Modes
      • (a) Independent mode: each agent maps its own inputs (queue lengths) to its own actions (extend/switch) and rewards (delay), with no exchange between agents
      • (b) Integrated mode: the agents' learning is coordinated through MARLIN-ATSC, which links the neighbouring agents' states, actions, and rewards
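      One way to picture the difference between the two modes is in what the learned value tables are keyed on: the agent's own state and action in independent mode, versus the joint state and joint action of each neighbour pair in integrated mode. The keys below are an illustrative assumption consistent with the coordination principle on the previous slide, not the exact MARLIN data structures.

```python
from collections import defaultdict

# (a) Independent mode: key = (local_state, local_action)
Q_independent = defaultdict(float)

# (b) Integrated mode: one table per neighbour game,
#     key = ((local_state, neighbour_state), (local_action, neighbour_action))
Q_integrated = {"I6": defaultdict(float), "I2": defaultdict(float)}

# Example lookups with hypothetical discretized queue-length states:
q_a = Q_independent[((3, 0, 1, 0), "extend")]
q_b = Q_integrated["I6"][(((3, 0, 1, 0), (1, 2, 0, 4)), ("extend", "switch"))]
```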
  12. Large-Scale Application: Network-Wide MOE in the Normal Scenario
      System MOE                           | BC    | % Improvement, MARL-TI vs. BC | % Improvement, MARLIN-IC vs. BC | % Improvement, MARLIN-IC vs. MARL-TI
      Average Intersection Delay (sec/veh) | 35.27 | 27% | 38% | 14%
      Throughput (veh)                     | 23084 | 3%  | 6%  | 3%
      Avg. Queue Length (veh)              | 8.66  | 24% | 32% | 11%
      Std. Avg. Queue Length (veh)         | 2.12  | 23% | 31% | 10%
      Avg. Link Delay (sec)                | 9.45  | 10% | 47% | 41%
  13. Large-Scale Application: % Improvement in Average Delay
      [Map: % improvement in average delay, MARLIN-IC vs. BC, shown for Area 1, Area 2, and Area 3]
  14. Large-Scale Application: Average Route Travel Time for Selected Routes
      [Chart: average travel time (min) per 5-minute time interval for the Gardiner EB route (freeway), comparing BC, MARL-TI, and MARLIN-IC]
  15. Large-Scale Application: Average Route Travel Time for Selected Routes
      [Chart: average travel time (min) per 5-minute time interval for the Lake Shore EB to Spadina NB route (major arterial), comparing BC, MARL-TI, and MARLIN-IC]
  16. MARLIN-HILS Architecture
      • Traffic signal controller connected to a Controller Interface Device (CID) over RS485 (SDLC protocol)
      • CID connected to an industrial computer over USB (RS485-to-USB, SDLC protocol)
      • Industrial computer connected to the Paramics Modeller over Ethernet (NTCIP protocol)
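      The diagram implies a closed loop between the simulated network and the physical controller. The pseudocode-style sketch below outlines that data exchange; every class and method name is a hypothetical placeholder, since the real integration goes through the Paramics API, the CID's SDLC link, and NTCIP as named on the slide.

```python
class SimulationLink:
    """Placeholder for the microsimulation side (e.g. a Paramics model)."""
    def read_detector_calls(self): ...
    def apply_phase_indications(self, phases): ...
    def advance_one_step(self): ...

class ControllerLink:
    """Placeholder for the CID / traffic signal controller side."""
    def send_detector_calls(self, calls): ...
    def read_phase_indications(self): ...

def hils_step(sim: SimulationLink, ctl: ControllerLink):
    calls = sim.read_detector_calls()       # detector actuations from the simulated network
    ctl.send_detector_calls(calls)          # forwarded to the physical controller via the CID
    phases = ctl.read_phase_indications()   # controller's current signal indications
    sim.apply_phase_indications(phases)     # drive the simulated signal heads accordingly
    sim.advance_one_step()                  # advance the simulation one time step
```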
  17. HILS Setup: Demo
  18. Conclusion
      • MARLIN is state-of-the-art, generation-4+ signal control
      • Thoroughly developed and tested
      • Patent pending
      • Ongoing:
        • HILS and PEEK ATC-1000 integration
        • Potential field operational test
        • Productization
        • From transit signal priority (TSP) to people priority (PSP)
  19. Samah El-Tantawy: samah.el.tantawy@utoronto.ca
      Baher Abdulhai: baher.abdulhai@utoronto.ca
      Hossam Abdelgawad: h.abdel.gawad@utoronto.ca
      ACGM 2013 - Intelligent Transport for Smart Cities
