The document proposes a traffic demand prediction model based on a dynamic transition convolutional neural network. It defines a transition network using density-peak based clustering to identify virtual stations. A dynamic graph convolutional gated recurrent unit is developed to model the station-to-station transitions over time. Meteorological data is also incorporated as external features to improve the prediction performance. The model is evaluated on bike and taxi demand data from New York City and shows improved results over various baseline models.
4. 4
Overview
Suggest novel methods to apply NLP approaches to music domain
Introduce MusicBERT, a large-scale pretrained model for symbolic music understanding
Evaluate the performance on four tasks
5. 5
Contributions
A traffic transition network is defined
• Virtual station by density-peak based clustering
Dynamic Graph Convolution Gated Recurrent Unit
• A new dynamic transition convolution unit
Unifying learning framework (weather factors with hidden states of demands)
6. 6
Model Overview
Framework
1) Transition Network Construction
1) Virtual Station Discovery
• DPC-based virtual station recognition algorithm
2) Transition Matrix Construction
• Network is constructed from the virtual stations.
2) Dynamic Transition Convolution Unit Design
1) A Dynamic Graph Convolutional Gated Recurrent Unit is devised.
8. 8
Model Overview
Preliminaries and Notations
1. Virtual Station
2. Station-to-Station Transition
3. Traffic Demand
4. Transition Flow
5. Transition Network
9. 9
Model Overview
Preliminaries and Notations
• Virtual Station
• Density Peak Clustering(DPC) Virtual Station Recognition
• Station-to-Station Transition
• A traffic trajectory is a tuple: ((p.t, p.s), (d.t, d.s))
• p, d – origin and destination points of the trajectory
• t, s – timestamp and virtual station
• Traffic Demand
• In fixed time duration
• Each station’s pick-up and drop-off demand
• X = [pickup, dropoff], Y = [estimated traffic demand at (t+1)]
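The demand definition above can be sketched as a simple count over trajectories. This is a minimal illustration, assuming each trajectory is a ((p.t, p.s), (d.t, d.s)) tuple with timestamps in seconds; the function name `demand_per_slot` and the `slot_seconds` parameter are hypothetical and do not come from the slides.

```python
from collections import Counter

def demand_per_slot(trajectories, slot_seconds=3600):
    """Count pick-up and drop-off demand per (time slot, station).

    Each trajectory is ((p_t, p_s), (d_t, d_s)): timestamp in seconds and
    virtual station ID for the origin p and the destination d.
    slot_seconds is an assumed slot length, not a value from the slides.
    """
    pickup, dropoff = Counter(), Counter()
    for (p_t, p_s), (d_t, d_s) in trajectories:
        pickup[(p_t // slot_seconds, p_s)] += 1
        dropoff[(d_t // slot_seconds, d_s)] += 1
    return pickup, dropoff

# Toy data: three trips between stations 0 and 1.
trips = [((10, 0), (900, 1)), ((20, 0), (4000, 1)), ((3700, 1), (5000, 0))]
pickup, dropoff = demand_per_slot(trips)
```

Stacking the two counts for one slot gives X = [pickup, dropoff]; the target Y is the same quantity one slot later.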
10. 10
Model Overview
Preliminaries and Notations
• Transition Flow
• W – number of trips from station i to station j at time step t
• k – starting time, l – arrival time, 1 ≤ t ≤ T
• s – time slot, s ∈ {1, 2, …, M}
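A minimal sketch of the transition flow count follows. It assumes a trip is assigned to the time step in which its starting time k falls (the slides do not state the exact rule), and it uses 0-based indices where the slides use 1 ≤ t ≤ T. The names `transition_flow` and `slot_seconds` are hypothetical.

```python
def transition_flow(trajectories, n_stations, n_steps, slot_seconds=3600):
    """Return W with W[t][i][j] = number of trips from station i to j at step t."""
    W = [[[0] * n_stations for _ in range(n_stations)] for _ in range(n_steps)]
    for (k, i), (l, j) in trajectories:
        t = k // slot_seconds  # assumption: index the trip by its starting time k
        if 0 <= t < n_steps:
            W[t][i][j] += 1
    return W

# Toy data: two trips 0 -> 1 in the first hour, one trip 1 -> 0 in the second.
trips = [((10, 0), (900, 1)), ((20, 0), (4000, 1)), ((3700, 1), (5000, 0))]
W = transition_flow(trips, n_stations=2, n_steps=2)
```

Each `W[t]` is an N×N count matrix, so the sequence of matrices can serve as the weighted adjacency matrices of a series of directed graphs.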
11. 11
Model Overview
Preliminaries and Notations
• Transition Network
• Defined as a series of directed graphs
• 𝘝 – the set of nodes (virtual stations)
• ε – the set of edges
• 𝐰 – an N×N real matrix, the weighted adjacency matrix
13. 13
Model Overview
Framework
1) Transition Network Construction
1) Virtual Station Discovery
• DPC (Density Peak Clustering)-based virtual station recognition algorithm
<Figure labels: cores, subs>
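To make the clustering step concrete, here is a minimal sketch of generic density peak clustering on lattice cells. It is not the authors' exact algorithm: it assumes the local density of a cell is already given as a count, and the function name, the two threshold parameters, and the toy data are all hypothetical.

```python
import math

def density_peak_clusters(cells, density_threshold, min_center_dist):
    """cells: dict mapping (x, y) lattice coordinates to a local density."""
    # Sort cells by decreasing density; ties are broken by coordinates.
    order = sorted(cells, key=lambda c: (-cells[c], c))
    delta, parent = {}, {}
    for rank, c in enumerate(order):
        higher = order[:rank]  # cells that rank as denser than c
        if not higher:
            delta[c], parent[c] = math.inf, None
        else:
            # delta = distance to the nearest denser cell
            parent[c] = min(higher, key=lambda h: math.dist(c, h))
            delta[c] = math.dist(c, parent[c])
    # A center is dense enough and far enough from any denser cell.
    centers = [c for c in order
               if cells[c] >= density_threshold and delta[c] >= min_center_dist]
    if not centers:
        return [], {}
    label = {c: i for i, c in enumerate(centers)}
    for c in order:  # denser cells first, so the parent is already labelled
        if c not in label:
            label[c] = label[parent[c]]
    return centers, label

# Toy lattice: two dense cells far apart, each with sparse neighbours.
cells = {(0, 0): 40, (1, 0): 10, (0, 1): 5, (100, 100): 35, (101, 100): 8}
centers, label = density_peak_clusters(cells, density_threshold=30, min_center_dist=50)
```

In this reading, each center plays the role of a virtual station and the remaining cells are attached to the station of their nearest denser neighbour.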
18. 18
Experiments & Results
Data set
1. Bike-NYC
• Dates, times, station IDs of pick-up and drop-off points
2. Taxi-NYC
• Latitude and longitude of pick-up and drop-off points, times
3. Weather-NYC
• Hourly, Kennedy International Station
20. 20
Experiments & Results
Setting
1. Setting Boundary
• 40°28′38.6363″N to 40°55′3.2772″N and 73°42′0.9792″W to 74°15′32.7240″W
• Divide the region into a 1000 × 1000 grid map
• The size of each lattice is about 47 m × 49 m
• Set the minimum distance between station centers to 50 lattices (about 2400 meters)
• The local density threshold is set as 30
• 1192 virtual stations are identified.
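The lattice size quoted above can be checked with a short calculation. This sketch assumes a spherical-Earth approximation of about 111,320 m per degree of latitude, which is not stated in the slides; the helper names `dms_to_deg` and `to_cell` are hypothetical.

```python
import math

def dms_to_deg(d, m, s):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return d + m / 60 + s / 3600

lat_min, lat_max = dms_to_deg(40, 28, 38.6363), dms_to_deg(40, 55, 3.2772)
# Longitudes are kept as positive degrees west.
lng_min, lng_max = dms_to_deg(73, 42, 0.9792), dms_to_deg(74, 15, 32.7240)

GRID = 1000
M_PER_DEG_LAT = 111_320  # rough spherical-Earth approximation (assumption)
mid_lat = math.radians((lat_min + lat_max) / 2)

# Cell height (north-south) and width (east-west) in meters.
cell_ns = (lat_max - lat_min) * M_PER_DEG_LAT / GRID
cell_ew = (lng_max - lng_min) * M_PER_DEG_LAT * math.cos(mid_lat) / GRID

def to_cell(lat, lng_west):
    """Map a point (latitude N, longitude W, decimal degrees) to lattice indices."""
    row = int((lat - lat_min) / (lat_max - lat_min) * GRID)
    col = int((lng_west - lng_min) / (lng_max - lng_min) * GRID)
    return min(row, GRID - 1), min(col, GRID - 1)
```

Under these assumptions the cells come out at roughly 49 m north-south and 47 m east-west, so 50 lattices span about 2400 m.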
21. 21
Experiments & Results
Setting
1. Meteorological semantics embedding
• Recorded by Kennedy International Station per hour
• 26 features except for timestamp
• Encode categorical features as a one-hot numeric array
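One-hot encoding of a categorical weather feature can be sketched in a few lines. The feature values below are made up for illustration, since the slides do not list the 26 features, and `one_hot_encode` is a hypothetical helper rather than part of the model.

```python
def one_hot_encode(values):
    """Return (categories, rows): one 0/1 row per value, one column per category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows

# Hypothetical categorical weather feature, one value per hourly record.
conditions = ["clear", "rain", "clear", "snow"]
categories, encoded = one_hot_encode(conditions)
```

Each hourly record then contributes one numeric row that can be concatenated with the remaining numeric features.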
Results
22. 22
Future Work
Increasing the depth of the model and further decreasing the computational complexity (search algorithm efficiency)
Modeling the spatial and temporal dependencies by transition networks with dynamic nodes
Extending this model to multi-step forecasting
23. 23
My Research
• Motivation
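• Predict the demand that would arise when an MOD service is introduced, based on bus and taxi data
• Data on MOD is frequently scarce
• The service launched only recently, so deployment cases are still few and data is lacking
• MOD companies are reluctant to provide data
• For companies and government agencies that want to launch an MOD service, provide vehicle-level predictions of the demand that would arise under the service, based on the region's taxi and bus demand data
Generate simulation data from optimal routes with shareability applied (detour ratio, shared routes)
Feed the data to a deep learning model to predict the number of passengers aboard an MOD vehicle traveling a given route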
<Bus & Taxi data-based simulation scenario>
24. 24
My Research
• Data Analysis
• 2016.01 New York Taxi data
• Randomly picked demands
<Randomly picked demands>
<Integrated demand>
25. 25
My Research
• Data Analysis
• 2016.01 New York Taxi data
• Apply the Lin-Kernighan traveling salesman heuristic to the randomly picked demands to extract the optimal route
• Cost = distance
<Randomly picked demands> <Randomly picked demands>
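For illustration only, the route search can be sketched with 2-opt, a much simpler local-search move that the Lin-Kernighan heuristic generalizes. This is a stand-in chosen for brevity, not the Lin-Kernighan procedure itself; the cost is plain Euclidean distance and the demand points are toy values.

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour; cost = Euclidean distance."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points):
    """Improve the tour 0, 1, ..., n-1 by reversing segments until no gain remains."""
    tour = list(range(len(points)))
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                # Reverse the segment tour[i..j] and keep it if strictly shorter.
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(points, candidate) < tour_length(points, tour) - 1e-12:
                    tour, improved = candidate, True
    return tour

# Toy demand points: corners of a unit square listed in a crossing order.
demands = [(0, 0), (1, 1), (1, 0), (0, 1)]
best = two_opt(demands)
```

On real data a dedicated Lin-Kernighan implementation would be the better choice, since this naive version recomputes the full tour length for every candidate.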
26. 26
My Research
• Data Analysis
• 2016.01 New York Taxi data
• In total, P(10906858, 5) × 15 sec (for 2 demands on the optimal path) would be spent to randomly pick demands and calculate the optimal path
• Problems
• Too much computational load (expected to grow further once the detour ratio is applied)
• Hard to separate points with minor demand from points with major demand
<Randomly picked demands> <Randomly picked demands>
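The size of the computational load can be estimated with a short calculation. This sketch assumes the slide's figure means the number of ordered picks of 5 demands out of 10906858 (a permutation count), each costing 15 seconds; that reading is an interpretation, not something the slides state outright.

```python
import math

N_DEMANDS = 10_906_858   # number of demands quoted on the slide
PICKED = 5               # assumption: 5 demands picked in order per route
SECONDS_PER_ROUTE = 15   # per-route computation time quoted on the slide

# Number of ordered selections of PICKED demands out of N_DEMANDS.
n_routes = math.perm(N_DEMANDS, PICKED)
total_seconds = n_routes * SECONDS_PER_ROUTE
total_years = total_seconds / (365 * 24 * 3600)
print(f"{n_routes:.3e} ordered picks, about {total_years:.1e} years")
```

Even if the assumed reading is off by several orders of magnitude, exhaustive enumeration is clearly infeasible, which motivates restricting the search to areas with major demand.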
27. 27
My Research
• Data Analysis
• 2016.01 New York Taxi data
• In total 10906858 demands: randomly pick demands, then calculate the optimal path
• Ways to solve the issues
• Narrow down to areas where the majority of demands exist, relieving the computational cost
• Give more weight to potential trajectories so that demand prediction accuracy increases (candidate real-demand locations identified from daily patterns)
28. 28
My Research
• Data Analysis
• 2016.01 New York Taxi data
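• In total 10906858 demands: randomly pick demands, then calculate the optimal path
• Ways to solve the issues
• Rather than randomly picking among all demand points when computing the optimal route, what if we first identify the main demand locations and compute the optimal route starting from each virtual station? Could that reduce the computational load while also raising model accuracy around the real demand locations?
• When training DTCNN, what if the graph information is carried over as-is and attention, which reminds the decoder output of the encoder's information at every step, is used to help demand prediction?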
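The attention idea can be illustrated with a generic dot-product attention step. This is only a sketch of the general mechanism, not part of DTCNN; the function name, the toy encoder states, and the query vector are all hypothetical.

```python
import math

def attention(query, encoder_states):
    """Dot-product attention: weight encoder states by similarity to the query."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax over encoder positions
    dim = len(encoder_states[0])
    # Context vector: weighted sum of the encoder states.
    context = [sum(w * state[d] for w, state in zip(weights, encoder_states))
               for d in range(dim)]
    return weights, context

# Toy encoder states (one hidden vector per past time step) and a decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention([1.0, 0.0], encoder_states)
```

At each decoding step the context vector would be combined with the decoder state, which is how the decoder is "reminded" of the encoder's information.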