[20240429_LabSeminar_Huy]Spatio-Temporal Graph Neural Point Process for Traffic Congestion Event Prediction.pptx
Quang-Huy Tran
Network Science Lab
Dept. of Artificial Intelligence
The Catholic University of Korea
E-mail: huytran1126@gmail.com
2024-04-29
Spatio-Temporal Graph Neural Point
Process for Traffic Congestion Event
Prediction
Guangyin Jin et al.
AAAI-23: The 37th AAAI Conference on Artificial Intelligence (2023)
MOTIVATION
• Traffic congestion is one of the most serious problems in urban management.
• Traffic congestion is a continuous process from generation to dissipation.
o Individual congestion event: occurrence time and duration.
o Predicting both is meaningful for improving traffic management and scheduling:
when will the next congestion event occur?
how long will it last?
Traffic congestion overview
• Previous methods have disadvantages:
o Conventional methods only model dense variables (e.g., road traffic states); sparse variables such as congestion events are not handled.
o They only support prediction within a given short future time window, which is unsuitable for congestion events that unfold over long time spans.
MOTIVATION
• Neural Point Process: an appropriate framework for sparse event prediction in continuous time.
o Probabilistic models of variable-length point sequences observed on the real half-line, here interpreted as the arrival times of events.
• Challenges:
o 1) How to effectively capture the spatio-temporal dependencies in road networks?
o 2) How to effectively model the continuous and instantaneous temporal dynamics simultaneously for each road?
INTRODUCTION
• Propose a novel model named Spatio-Temporal Graph Neural Point Process (STGNPP)
for traffic congestion event prediction.
o Transformer and Graph Convolutional Network (GCN) jointly capture the spatio-temporal dependencies from traffic state data.
o Extract contextual link representations to incorporate with congestion event information for modeling the history of the point process.
• To encode the hidden evolution patterns of each road, present a novel continuous Gated Recurrent Unit (GRU) layer with a neural flow architecture.
• First work to propose a spatio-temporal graph neural point process.
METHODOLOGY
Task definition
• A road network with 𝑁 links 𝑉 (|𝑉| = 𝑁) as a graph G = (V, E, A).
• Traffic states 𝑋𝑛 (e.g., link speed) on each link 𝑉𝑛 are dense features in snapshots of a certain time granularity.
• Given a fixed-length historical time window T for each sample:
o predict the occurrence time and duration of the next congestion event.
• Sequential congestion events 𝑆𝑛 = {𝑠𝑛,𝑖}, 𝑖 = 1, 2, . . . , |𝑆𝑛|:
o Link 𝑉𝑛 has 𝑠𝑛,𝑖 = (𝑡𝑛,𝑖, 𝑑𝑛,𝑖).
o 𝑡𝑛,𝑖: occurrence time.
o 𝑑𝑛,𝑖: duration.
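The event structure above can be sketched as a small data type; the names `CongestionEvent`, `t`, and `d` are illustrative, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class CongestionEvent:
    """One congestion event s_{n,i} on link V_n."""
    t: float  # occurrence time t_{n,i}
    d: float  # duration d_{n,i}

def inter_event_times(events):
    """Inter-event times tau_i = t_i - t_{i-1}, the quantity the point
    process models for each link's event sequence S_n."""
    times = [e.t for e in events]
    return [b - a for a, b in zip(times, times[1:])]
```

For example, two events at 8.25 h and 17.75 h give a single inter-event time of 9.5 h.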
METHODOLOGY
Point Process Distribution
• Stochastic process to simulate the sequential events in a given observation time interval [0, 𝑇].
• Intensity function of events at time point 𝑡, conditioned on the historical sequential events 𝐻𝑡 up to 𝑡:
λ∗(𝑡) = λ(𝑡 | 𝐻𝑡)
• Probability density function to observe an event sequence {𝑡𝑖}, 𝑖 = 1, . . . , 𝑛, with inter-event times 𝜏𝑖 = 𝑡𝑖 − 𝑡𝑖−1 (𝑡0 = 0):
𝑝(𝜏𝑖) = λ∗(𝑡𝑖) exp(−∫[𝑡𝑖−1, 𝑡𝑖] λ∗(𝑠) d𝑠)
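As a concrete special case of this likelihood, take a constant intensity λ on [0, T]; the sequence negative log-likelihood then reduces to −n·log λ + λT (log-intensity at each event minus the integrated intensity). A minimal sketch with an illustrative function name:

```python
import math

def sequence_nll(event_times, lam, T):
    """Negative log-likelihood of an event sequence under a constant
    intensity lam on [0, T]: the sum of log-intensities at the events
    minus the integrated intensity (compensator), negated."""
    return -len(event_times) * math.log(lam) + lam * T
```

With lam = 1 and T = 3, any two-event sequence has NLL = 3; in the general model the intensity is history-dependent, so the integral must be evaluated numerically.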
METHODOLOGY
Spatio-Temporal Graph Learning Module
• Link-wise Transformer layer.
• Graph convolution layer.
• Spatio-temporal inquirer.
• First, a fully connected layer maps the historical traffic states into a high-dimensional representation.
METHODOLOGY
Link-Wise Transformer Layer
• Self-attention network: Employ trigonometric functions-based position encoding
method.
where 𝑄, 𝐾, and 𝑉 are the query, key, and value matrices obtained by three linear transformations 𝑊𝑄, 𝑊𝐾, 𝑊𝑉 ∈ ℝ𝐷×𝐷, and 𝐷 is the hidden dimension.
• Pass into a two-layer position-wise feed-forward neural network.
• 𝑀𝐷: mask operation that sets the upper triangle of the attention matrix to 0, so each time step attends only to past and current steps.
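A minimal NumPy sketch of the masked single-head self-attention described above, applied to one link's time series. Position encoding is omitted, and the mask is implemented the usual way, as −∞ on future logits before the softmax (equivalent to zero attention weight there); all names are illustrative:

```python
import numpy as np

def causal_self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention with a causal mask: step t attends only
    to steps t' <= t. X: (T, D) traffic-state representations of one link;
    Wq, Wk, Wv: (D, D) learned projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(X.shape[1])      # (T, T) scaled logits
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)    # hide future positions
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)        # row-wise softmax
    return w @ V
```

Because of the mask, the first time step can only attend to itself, so its output equals its own value vector.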
METHODOLOGY
Graph Convolution Layer - Spatio-temporal Inquirer
• Simple graph convolution operation with mix-hop aggregation.
where A is the normalized predefined adjacency matrix, 𝛼1, 𝛼2 ∈ ℝ𝑁×𝐷′ (𝐷′ ≪ 𝑁) are two learnable node-embedding matrices, and Θ𝑖 is the learnable weight of each convolution layer.
• Select the corresponding hidden representations based on indexes.
• Obtain those representations using sum aggregation and zero padding.
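A sketch of a mix-hop graph convolution in NumPy, under stated assumptions: the adaptive adjacency is formed from the two learnable embeddings as row-softmax(ReLU(𝛼1𝛼2ᵀ)), and it is averaged with the predefined adjacency before mixing hops — the paper's exact combination may differ:

```python
import numpy as np

def softmax_rows(M):
    """Row-wise softmax, used to normalize the learned adjacency."""
    e = np.exp(M - M.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mixhop_gcn(H, A_hat, alpha1, alpha2, thetas):
    """Mix-hop aggregation: hop k propagates H through the k-th power of
    the combined adjacency and applies its own weight Theta_k, then the
    hop outputs are summed. H: (N, D); A_hat: (N, N) normalized adjacency;
    alpha1, alpha2: (N, D') node embeddings; thetas: list of (D, D)."""
    A_adp = softmax_rows(np.maximum(alpha1 @ alpha2.T, 0.0))
    A_mix = 0.5 * (A_hat + A_adp)   # assumption: simple average of graphs
    out = np.zeros_like(H)
    Ak = np.eye(H.shape[0])         # hop 0 is the identity (no propagation)
    for theta in thetas:
        out += Ak @ H @ theta
        Ak = Ak @ A_mix
    return out
```

Keeping 𝐷′ ≪ 𝑁 makes the adaptive adjacency a low-rank factorization, which is what keeps the learned graph tractable for large road networks.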
METHODOLOGY
Congestion Event Learning Module – Continuous GRU Layer
• Congestion event representation
where the duration embedding denotes the historical duration of each congestion event after zero padding.
• Insight: the traffic state of each link is a combination of continuous changes and instantaneous changes.
• Apply an ODE-based (ordinary differential equation) formulation:
where 𝜙(𝑡) is a continuous function satisfying two properties: i) 𝜙(0) = 0 and ii) |𝜙(𝑡)| < 1, and Γ(𝑡, 𝑥) is an arbitrary contractive neural network.
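A minimal sketch of a flow-style state update of the form h(t) = h₀ + 𝜙(t)·Γ(t, h₀) obeying the two properties on this slide; the gate and the contractive network are simple stand-ins, not the paper's GRU flow parameterization:

```python
import numpy as np

def phi(t, tau=1.0):
    """Continuous gate with phi(0) = 0 and |phi(t)| < 1 for t >= 0
    (properties i and ii on the slide)."""
    return 1.0 - np.exp(-t / tau)

def gamma(t, h, W, b, lip=0.9):
    """Contractive network Gamma(t, h): rescaling W by its spectral norm
    keeps the Lipschitz constant of the tanh layer (w.r.t. h) below 1."""
    W = lip * W / max(np.linalg.norm(W, 2), 1e-8)
    return np.tanh(h @ W + t + b)

def flow_state(t, h0, W, b):
    """h(t) = h0 + phi(t) * Gamma(t, h0): the state is unchanged at
    t = 0, then evolves continuously between congestion events."""
    return h0 + phi(t) * gamma(t, h0, W, b)
```

The contractivity of Γ together with |𝜙(t)| < 1 is what makes the map h₀ ↦ h(t) invertible, so the layer defines a valid flow without calling an ODE solver.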
METHODOLOGY
Optimization and Prediction
• Optimize the negative log-likelihood of the probability density function of the inter-event time plus the absolute error of the duration prediction:
where 𝑓𝑑(·) denotes the fully connected layer for duration prediction of the next traffic congestion, and 𝛼 denotes the trade-off ratio.
• Intensity Function Network: To approximate the distribution of inter-event time and
characterize the effect of periodic patterns of congestion, a periodic gated unit to
adjust the intensity function is defined:
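The objective (inter-event NLL plus 𝛼-weighted duration error) and a periodically gated intensity can be sketched as below; the softplus and sigmoid forms are assumptions for illustration, not the paper's exact formulas:

```python
import numpy as np

def gated_intensity(base, periodic_gate):
    """Basic intensity adjusted by a periodic gated unit: softplus keeps
    the intensity positive, and a sigmoid gate in (0, 1) modulates it
    with time-of-day / day-of-week information."""
    softplus = np.log1p(np.exp(base))
    gate = 1.0 / (1.0 + np.exp(-periodic_gate))
    return softplus * gate

def joint_loss(nll_tau, d_true, d_pred, alpha=0.5):
    """Total loss = NLL of the inter-event time + alpha * duration MAE."""
    return nll_tau + alpha * np.mean(np.abs(d_true - d_pred))
```

The single scalar 𝛼 trades off when the next event happens against how long it lasts, so both targets are trained jointly from one backward pass.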
EXPERIMENT AND RESULT
EXPERIMENT
• Measurement:
o Mean Absolute Errors (MAE).
o Negative log-likelihood (NLL).
• Dataset: collected from the Amap application
o Beijing and Chengdu.
o Inter-event times, durations, and periodic features.
• Task:
o Predict the congestion condition of each link in the next 6 hours.
• Baseline:
o Simple models: Historical Average (HA), Gradient Boosting Decision Tree (GBDT) [1], Gated Recurrent Unit (GRU) [2].
o Spatio-temporal GNNs: DCRNN [3], GraphWaveNet [4], STGODE [5].
o Neural point process models: NHTPP [6], RMTPP [7], THPP [8], FNN-TPP [9].
EXPERIMENT AND RESULT
EXPERIMENT
[1] Ye, J., Chow, J. H., Chen, J., & Zheng, Z. (2009, November). Stochastic gradient boosted distributed decision trees. In Proceedings of the 18th ACM conference on Information and knowledge management (pp. 2061-2064).
[2] Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
[3] Li, Y., Yu, R., Shahabi, C., & Liu, Y. (2017). Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. arXiv preprint arXiv:1707.01926.
[4] Wu, Z., Pan, S., Long, G., Jiang, J., & Zhang, C. (2019). Graph wavenet for deep spatial-temporal graph modeling. arXiv preprint arXiv:1906.00121.
[5] Fang, Z., Long, Q., Song, G., & Xie, K. (2021, August). Spatial-temporal graph ode networks for traffic flow forecasting. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining (pp. 364-373).
[6] Mei, H., & Eisner, J. M. (2017). The neural hawkes process: A neurally self-modulating multivariate point process. Advances in neural information processing systems, 30.
[7] Du, N., Dai, H., Trivedi, R., Upadhyay, U., Gomez-Rodriguez, M., & Song, L. (2016, August). Recurrent marked temporal point processes: Embedding event history to vector. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1555-1564).
[8] Zuo, S., Jiang, H., Li, Z., Zhao, T., & Zha, H. (2020, November). Transformer hawkes process. In International conference on machine learning (pp. 11692-11702). PMLR.
[9] Omi, T., & Aihara, K. (2019). Fully neural network based model for general temporal point processes. Advances in neural information processing systems, 32.
CONCLUSION
• Propose a novel spatio-temporal graph neural point process framework for traffic
congestion event prediction.
o Utilize the spatio-temporal graph, incorporated with a neural point process, for traffic congestion event modeling.
o Consider periodic features along with continuous and instantaneous dynamics to improve inter-event dependency learning.
• Experiments demonstrate the superiority of the proposed model compared with traditional methods.
Editor's Notes
Example of the traffic congestion features and link speed trends from the Beijing dataset adopted in this paper. In sub-figure (a), we select the traffic congestion statistics of three neighboring links on 12 May 2021 to visualize the occurrence time and duration over 24 hours. In sub-figure (b), we select the speed of link 1 from 7 a.m. to 10 a.m. on 12 May 2021 to visualize the change trend.
Zero padding is a technique typically employed to make variable-length input sequences equal in size.
The green squares denote the moment when the congestion event occurred.
The green curves and arrows represent continuous and instantaneous changes in the hidden representation of link states, which are learned by GRU flow and discrete GRU.
grey strip denotes the input contextual information at each time step.
𝑓𝑙(·) denotes the fully connected layer for computing the basic intensity function, 𝑓𝑝(·) denotes the fully connected layer for the periodic gated unit, and 𝑃𝑖𝑑, 𝑃𝑖𝑤 respectively denote the time of day and the day of week of the i-th event.