Deep_Reinforcement_Learning_based_Dynamic_Timetable.pptx

Deep Reinforcement Learning based
Dynamic Optimization of Bus
Timetable
Ankit Sharma

Bus Timetable Optimization
• Bus timetable optimization is key to reducing the operational cost of bus
companies and improving service quality.
• Heuristic algorithms work offline and do not account for changes in passenger flow.
https://doi.org/10.48550/arXiv.2107.07066

Bus Timetable Optimization
• Bus timetable optimization aims to balance the interests of both
passengers and the bus company by setting bus departure times to meet
passenger flow demand.
• The main quantitative indicators for passengers are bus congestion and
waiting time, while the interests of bus companies are mainly affected by the
number of departures (departure intervals) in the timetable.

Deep Reinforcement Learning Formulation
• Actions: No Departure (0), Departure (1)
• States: Time in hours ($X_1^t$), Time in minutes ($X_2^t$), Load Rate ($X_3^t$), Waiting Time ($X_4^t$),
Carrying Capacity ($X_5^t$), Stranded Passengers ($X_6^t$).
• Reward (from the reference paper):
$$r_t = \begin{cases} 1 - X_5^t - \alpha X_4^t - \beta X_6^t, & \text{action} = 0 \\ X_5^t - \beta X_6^t, & \text{action} = 1 \end{cases}$$
Alternate reward functions were also tested.
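
The reward from the reference paper can be written as a small function. This is a minimal sketch: the function name, argument names, and the default values of alpha and beta are assumptions for illustration, since the slides do not give the weights.

```python
def reward(action, x4_wait, x5_capacity, x6_stranded, alpha=0.5, beta=0.5):
    """Reward for one decision point, following the two cases on the slide.

    alpha and beta are placeholder weights (assumed; no values are given).
    """
    if action == 0:  # no departure
        return 1.0 - x5_capacity - alpha * x4_wait - beta * x6_stranded
    if action == 1:  # departure
        return x5_capacity - beta * x6_stranded
    raise ValueError("action must be 0 (no departure) or 1 (departure)")
```

Holding a bus back is rewarded when required carrying capacity is low, and penalized as waiting time and stranded passengers grow. Dispatching is rewarded when required carrying capacity is high.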

Bus Environment
• The bus timetable is treated as an episodic task (start-to-end schedule) with 6 state variables that change
over time.
$S_t = [X_1^t, X_2^t, X_3^t, X_4^t, X_5^t, X_6^t]$
• $X_1^t = t_h / 24$, $X_2^t = t_m / 60$
• $X_3^t = \frac{\text{Max Passengers}}{\text{Max Bus Capacity}}$
• $X_4^t$ = normalized waiting time of all passengers
• $X_5^t = \frac{\text{Need of Carrying Capacity}}{\text{Carrying Capacity of Vehicle}}$
• $X_6^t$ = number of stranded passengers
Assumptions*:
• $X_6^t$ was assumed to follow an exponential distribution after every departure.
• $X_4^t$ and $X_5^t$ are calculated from the number of stranded passengers awaiting a bus,
or from the remainder left after a departure.
• $X_3^t$ is calculated from the max bus capacity and the number of stranded passengers at the
time of departure.
• Each episode runs from 06:00 to 10:00 AM, with a decision point every 5 minutes.
* Real data was not available
http://www.muxingyun.com/en/digital-factory
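
The episode layout and state vector described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the date is arbitrary, and treating 06:00 to 10:00 as a half-open window (no decision at 10:00) is an assumption.

```python
from datetime import datetime, timedelta

START = datetime(2000, 1, 1, 6, 0)   # arbitrary date; only the time of day matters
END = datetime(2000, 1, 1, 10, 0)
STEP = timedelta(minutes=5)

def decision_points(start=START, end=END, step=STEP):
    """Return the decision times in [start, end), one every `step`."""
    points = []
    t = start
    while t < end:
        points.append(t)
        t += step
    return points

def make_state(t, load_rate, wait_norm, capacity_ratio, stranded):
    """Assemble S_t = [X1, ..., X6], normalizing time as t_h / 24 and t_m / 60."""
    return [t.hour / 24, t.minute / 60, load_rate, wait_norm, capacity_ratio, stranded]
```

Under the half-open assumption, one episode has 48 decision points.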

DRL - Algorithms
Deep SARSA
S.No.  Parameter               Value
1.     DNN hidden layers       2
2.     DNN hidden units        188
3.     Activation function     ReLU
4.     Epsilon                 0.2
5.     Gamma                   0.99
6.     Experience memory size  10000
7.     Batch size              32
8.     Learning rate           0.001
DQN
S.No.  Parameter               Value
1.     DNN hidden layers       2
2.     DNN hidden units        188
3.     Activation function     ReLU
4.     Epsilon                 0.2
5.     Gamma                   0.99
6.     Experience memory size  10000
7.     Batch size              32
8.     Learning rate           0.001
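
To make the architecture in the tables concrete, here is a minimal pure-Python sketch of a Q-network with 6 inputs, 2 hidden layers of 188 ReLU units, and 2 outputs (one per action), plus epsilon-greedy action selection with epsilon 0.2. The weight initialization, function names, and the use of plain lists instead of a deep learning framework are assumptions for illustration. Training (replay memory, batches, learning rate) is not shown.

```python
import random

STATE_DIM, HIDDEN, N_ACTIONS = 6, 188, 2
EPSILON = 0.2

def init_layer(n_in, n_out, rng):
    """Random weights (scaled by fan-in, an assumed choice) and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, relu):
    """One fully connected layer, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

def init_q_network(seed=0):
    """Build layers 6 -> 188 -> 188 -> 2."""
    rng = random.Random(seed)
    sizes = [STATE_DIM, HIDDEN, HIDDEN, N_ACTIONS]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def q_values(network, state):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    x = state
    for i, layer in enumerate(network):
        x = dense(x, layer, relu=i < len(network) - 1)
    return x

def epsilon_greedy(q, rng, epsilon=EPSILON):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=q.__getitem__)
```

Both Deep SARSA and DQN can share this network shape, since the tables list identical settings for the two.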

Bus Timetable: Fixed Interval (30 mins)
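
A fixed-interval baseline timetable can be generated as below. This is a sketch under assumptions: the first bus leaves at 06:00, the 06:00 to 10:00 window is half-open, and the function name is hypothetical. The slides give only the interval and the time window.

```python
from datetime import datetime, timedelta

def fixed_interval_timetable(start, end, interval_minutes):
    """Return departure times as HH:MM strings, every interval in [start, end)."""
    times = []
    t = start
    while t < end:
        times.append(t.strftime("%H:%M"))
        t += timedelta(minutes=interval_minutes)
    return times

timetable = fixed_interval_timetable(
    datetime(2000, 1, 1, 6, 0), datetime(2000, 1, 1, 10, 0), 30
)
```

Under these assumptions the timetable has 8 departures, from 06:00 through 09:30.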

Testing with Deep SARSA & DQN
Algorithm   Episodes  Epsilon
Deep SARSA  1500      0.2
DQN         1500      0.2
Reward: Same as in the reference paper.
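
The two algorithms differ in the target they bootstrap from. The sketch below shows the standard one-step targets with gamma 0.99. The function names are hypothetical, and details such as a separate target network for DQN are left out because the slides do not mention them.

```python
GAMMA = 0.99

def dqn_target(reward, next_q, done, gamma=GAMMA):
    """Off-policy target: bootstrap from the best action in the next state."""
    return reward if done else reward + gamma * max(next_q)

def sarsa_target(reward, next_q, next_action, done, gamma=GAMMA):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward if done else reward + gamma * next_q[next_action]
```

With epsilon 0.2, the next action in Deep SARSA is sometimes exploratory, so its target can be lower than the DQN target for the same transition.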

Figures: Load Factor and Normalized Waiting Time (panels: Bus Timetable, DQN, Deep SARSA)
Departures: Fixed Interval (30 mins): 8, Deep SARSA: 23, DQN: 23

Figures: Required Carrying Capacity and Stranded Passengers (panels: Bus Timetable, DQN, Deep SARSA)

Algorithm   Episodes  Epsilon
Deep SARSA  1500      0.2
DQN         1500      0.2
Reward: Modified to reduce the number of departures and increase the load rate, to account
for the bus agency's interests. A condition was added requiring Load Rate > 0.7.
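
The slides do not state how the load-rate condition enters the reward. One possible way, shown purely as a hypothetical sketch, is to subtract a penalty when a bus departs with a load rate at or below the threshold. The function name, the penalty value, and the shaping rule itself are all assumptions, not the method used in the experiments.

```python
def modified_reward(action, load_rate, base_reward, min_load_rate=0.7, penalty=1.0):
    """Hypothetical shaping: discourage departures at or below the load-rate threshold.

    base_reward is the reward from the reference paper for this step.
    """
    if action == 1 and load_rate <= min_load_rate:
        return base_reward - penalty
    return base_reward
```

Raising `min_load_rate` makes the agent wait for fuller buses, which trades fewer departures against more waiting.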

Figure: Load Factor (panels: Bus Timetable, DQN, Deep SARSA)

Figure: Stranded Passengers (panels: Bus Timetable, DQN, Deep SARSA)

Algorithm   Episodes  Epsilon
Deep SARSA  1500      0.2
DQN         1500      0.2
Reward: Modified to reduce the number of departures further and raise the load rate more, to
account for the bus agency's interests. A condition was added requiring Load Rate > 0.8.

Figure: Load Factor (panels: Bus Timetable, DQN, Deep SARSA)

Figure: Stranded Passengers (panels: Bus Timetable, DQN, Deep SARSA)

Conclusion
• With the timetable at a fixed 30-minute interval, waiting time and the number of
stranded passengers kept rising and could not be contained. Load Rate was also
100%, which travelers dislike because of the heavy congestion.
• With the right reward function, DQN and Deep SARSA learned the
stranded passenger rate and reduced waiting time and the number of stranded
passengers while keeping Load Rate below 90%, with just 3 more
departures in 4 hours (6:00 to 10:00 AM).
• With Deep SARSA and DQN, real-time decisions based on the number of stranded
passengers can be made without re-computing the whole problem.
