Graph Neural Controlled Differential Equations for
Traffic Forecasting
Jeongwhan Choi, Hwangyong Choi,
Jeehyun Hwang, Noseong Park
Yonsei University
Seoul, South Korea
AAAI 2022
Jeongwhan Choi (Yonsei Univ.) STG-NCDE Yonsei AI Workshop 2022 1 / 34
Table of Contents
1 Introduction
2 Proposed Method
3 Experiments
4 Conclusion
Traffic Forecasting (1/2)
Traffic Forecasting (2/2)
Predicting traffic flows in a city is of great importance to Intelligent
Transportation Systems (ITS):
help manage and control traffic to ease congestion,
pre-allocate taxis and shared bikes,
plan routes.
Figure: Example of heavy traffic (credit: KTLA)
Spatio-temporal Forecasting (1/2)
Figure: Traffic congestion in front of the Yonsei AI workshop venue
Spatio-temporal Forecasting (2/2)
The traffic forecasting task is one of the most popular problems in the
area of spatio-temporal processing.
Spatio-temporal forecasting predicts the traffic volume at each location
of a road network for the next time points, given historical traffic
patterns.
Figure: Example of spatio-temporal graph data
Motivation (1/3)
Neural Controlled Differential Equations (NCDEs) are considered a
continuous analogue to recurrent neural networks (RNNs).
NCDEs show state-of-the-art accuracy in many time-series tasks.
Figure: The overall workflow of the original NCDE for processing time-series
Motivation (2/3)
Model Temporal Dependency Spatial Dependency
STGCN [11] GLU GCN
DCRNN [8] LSTM GCN (Pre-defined Graph)
GraphWaveNet [10] TCN GCN (Adaptive Graph)
ASTGCN [5] 1D CNN+Attention Attention+GCN(Pre-defined Graph)
STSGCN [9] GCN (Sliding Window) GCN (Multiple Graph)
AGCRN [1] GRU GCN (Adaptive Graph)
STFGNN [7] 1D CNN GCN (Multiple Graph)
STGODE [4] TCN ODE+GCN (Dynamic Graph)
STG-NCDE (ours) NCDE NCDE + GCN (Adaptive Graph)
Table: Classification of spatial-temporal modeling methods
Motivation
How to combine the NCDE technology with graph convolutional processing
has not been studied yet.
Motivation (3/3)
Objective
We integrate them into a single framework to solve the spatio-temporal
forecasting problem.
Figure: The overall workflow of the original NCDE for processing time-series
Figure: The overall workflow in our proposed STG-NCDE
Graph Neural Controlled Differential Equations
Our proposed spatio-temporal graph neural controlled differential
equation (STG-NCDE) consists of two NCDEs.
Figure: The overall workflow in our proposed STG-NCDE.
Experimental Results (1/2)
Model            PeMSD3 (MAE RMSE MAPE)   PeMSD4 (MAE RMSE MAPE)   PeMSD7 (MAE RMSE MAPE)   PeMSD8 (MAE RMSE MAPE)
HA 31.58 52.39 33.78% 38.03 59.24 27.88% 45.12 65.64 24.51% 34.86 59.24 27.88%
ARIMA 35.41 47.59 33.78% 33.73 48.80 24.18% 38.17 59.27 19.46% 31.09 44.32 22.73%
VAR 23.65 38.26 24.51% 24.54 38.61 17.24% 50.22 75.63 32.22% 19.19 29.81 13.10%
FC-LSTM 21.33 35.11 23.33% 26.77 40.65 18.23% 29.98 45.94 13.20% 23.09 35.17 14.99%
TCN 19.32 33.55 19.93% 23.22 37.26 15.59% 32.72 42.23 14.26% 22.72 35.79 14.03%
TCN(w/o causal) 18.87 32.24 18.63% 22.81 36.87 14.31% 30.53 41.02 13.88% 21.42 34.03 13.09%
GRU-ED 19.12 32.85 19.31% 23.68 39.27 16.44% 27.66 43.49 12.20% 22.00 36.22 13.33%
DSANet 21.29 34.55 23.21% 22.79 35.77 16.03% 31.36 49.11 14.43% 17.14 26.96 11.32%
STGCN 17.55 30.42 17.34% 21.16 34.89 13.83% 25.33 39.34 11.21% 17.50 27.09 11.29%
DCRNN 17.99 30.31 18.34% 21.22 33.44 14.17% 25.22 38.61 11.82% 16.82 26.36 10.92%
GraphWaveNet 19.12 32.77 18.89% 24.89 39.66 17.29% 26.39 41.50 11.97% 18.28 30.05 12.15%
ASTGCN(r) 17.34 29.56 17.21% 22.93 35.22 16.56% 24.01 37.87 10.73% 18.25 28.06 11.64%
MSTGCN 19.54 31.93 23.86% 23.96 37.21 14.33% 29.00 43.73 14.30% 19.00 29.15 12.38%
STG2Seq 19.03 29.83 21.55% 25.20 38.48 18.77% 32.77 47.16 20.16% 20.17 30.71 17.32%
LSGCN 17.94 29.85 16.98% 21.53 33.86 13.18% 27.31 41.46 11.98% 17.73 26.76 11.20%
STSGCN 17.48 29.21 16.78% 21.19 33.65 13.90% 24.26 39.03 10.21% 17.13 26.80 10.96%
AGCRN 15.98 28.25 15.23% 19.83 32.26 12.97% 22.37 36.55 9.12% 15.95 25.22 10.09%
STFGNN 16.77 28.34 16.30% 20.48 32.51 16.77% 23.46 36.60 9.21% 16.94 26.25 10.60%
STGODE 16.50 27.84 16.69% 20.84 32.82 13.77% 22.59 37.54 10.14% 16.81 25.97 10.62%
Z-GCNETs 16.64 28.15 16.39% 19.50 31.61 12.78% 21.77 35.17 9.25% 15.76 25.11 10.01%
STG-NCDE 15.57 27.09 15.06% 19.21 31.09 12.76% 20.53 33.84 8.80% 15.45 24.81 9.92%
Table: Forecasting error on PeMSD3, PeMSD4, PeMSD7 and PeMSD8
Experimental Results (2/2)
Figure: Traffic forecasting visualization (ground truth vs. STG-NCDE and
Z-GCNETs): (a) Node 261 in PeMSD4; (b) Node 9 in PeMSD8.
Irregular traffic forecasting
Since NCDEs can handle irregular time-series by design, STG-NCDE can
also perform irregular forecasting without any changes to its model
design.
Table: Forecasting error on irregular PeMSD4

Model      Missing rate   MAE     RMSE    MAPE
STG-NCDE   10%            19.36   31.28   12.79%
           30%            19.40   31.30   13.04%
           50%            19.98   32.09   13.48%
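A minimal sketch of why this works: the irregular observations are first
interpolated into a continuous control path X(t), and the NCDE is integrated
along that path, so missing readings only change the path, not the model. The
toy series and 30% drop rate below are our own assumptions; [6] uses natural
cubic splines, while we use linear interpolation here for brevity:

```python
import numpy as np

# Hypothetical toy series: 12 readings from one traffic sensor, with
# roughly 30% of the observations dropped to simulate an irregular series.
rng = np.random.default_rng(0)
t = np.arange(12.0)                      # observation times
x = np.array([200, 210, 230, 260, 300, 340, 360, 350, 320, 280, 240, 210.])
observed = rng.random(12) > 0.3          # True where a reading survived
observed[0] = observed[-1] = True        # keep the endpoints

# Build a continuous control path X(t) over the surviving observations;
# an NCDE integrates against dX(t)/dt, so any continuous, piecewise-
# differentiable reconstruction of the path works.
def control_path(query_times):
    return np.interp(query_times, t[observed], x[observed])

dense_t = np.linspace(0.0, 11.0, 111)
path = control_path(dense_t)
assert path.shape == (111,)
assert np.isclose(control_path(0.0), x[0])  # interpolant matches kept readings
```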
Conclusion
Figure: STG-NCDE
We presented a spatio-temporal NCDE
model to perform traffic forecasting.
In our experiments with 6 datasets and
20 baselines, our method clearly shows
the best overall accuracy.
In addition, our model can perform
irregular traffic forecasting.
We believe that the combination of
NCDEs and GCNs is a promising
research direction for spatio-temporal
processing.
Reference (1/3)
[1] Lei Bai et al. “Adaptive Graph Convolutional Recurrent Network for
Traffic Forecasting”. In: NeurIPS. Vol. 33. 2020, pp. 17804–17815.
[2] Ricky T. Q. Chen et al. “Neural Ordinary Differential Equations”.
In: NeurIPS. 2018.
[3] Yuzhou Chen, Ignacio Segovia-Dominguez, and Yulia R Gel.
“Z-GCNETs: Time Zigzags at Graph Convolutional Networks for
Time Series Forecasting”. In: ICML. 2021.
[4] Zheng Fang et al. “Spatial-Temporal Graph ODE Networks for
Traffic Flow Forecasting”. In: KDD. 2021.
[5] Shengnan Guo et al. “Attention Based Spatial-Temporal Graph
Convolutional Networks for Traffic Flow Forecasting”. In: AAAI. July
2019. doi: 10.1609/aaai.v33i01.3301922.
[6] Patrick Kidger et al. “Neural Controlled Differential Equations for
Irregular Time Series”. In: NeurIPS (2020).
Reference (2/3)
[7] Mengzhang Li and Zhanxing Zhu. “Spatial-Temporal Fusion Graph
Neural Networks for Traffic Flow Forecasting”. In: AAAI. May 2021.
[8] Yaguang Li et al. “Diffusion Convolutional Recurrent Neural
Network: Data-Driven Traffic Forecasting”. In: ICLR. 2018.
[9] Chao Song et al. “Spatial-Temporal Synchronous Graph
Convolutional Networks: A New Framework for Spatial-Temporal
Network Data Forecasting”. In: AAAI. Apr. 2020. doi:
10.1609/aaai.v34i01.5438.
[10] Zonghan Wu et al. “Graph WaveNet for Deep Spatial-Temporal
Graph Modeling”. In: IJCAI. July 2019, pp. 1907–1913.
Reference (3/3)
[11] Bing Yu, Haoteng Yin, and Zhanxing Zhu. “Spatio-Temporal Graph
Convolutional Networks: A Deep Learning Framework for Traffic
Forecasting”. In: IJCAI. July 2018. doi:
10.24963/ijcai.2018/505. url:
https://doi.org/10.24963/ijcai.2018/505.
Thank You!
Visit our poster session for more details.
homepage: jeongwhanchoi.me
email: jeongwhan.choi@yonsei.ac.kr
github: jeongwhanchoi/STG-NCDE
twitter: jeongwhan_choi
More Details
Neural Ordinary Differential Equations (NODEs)
NODEs showed how to model residual neural networks (ResNets)
continuously with differential equations.
A NODE can be written as follows:

z(T) = z(0) + \int_0^T f(z(t), t; \theta_f) \, dt,   (1)

where the neural network parameterized by \theta_f approximates dz(t)/dt.
Figure: ResNet and NODE figures
from [2]
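As an illustrative sketch (not the paper's implementation), Eq. (1) can be
approximated with an explicit-Euler solver; the vector field f below is a
hypothetical linear one, chosen so the exact answer is known:

```python
import numpy as np

def odeint_euler(f, z0, T, steps=1000):
    """Approximate z(T) = z(0) + integral_0^T f(z(t), t) dt with explicit Euler."""
    z, t = np.asarray(z0, dtype=float), 0.0
    dt = T / steps
    for _ in range(steps):
        z = z + f(z, t) * dt   # one Euler step: z <- z + f(z, t) dt
        t += dt
    return z

# Hypothetical vector field f(z, t) = -z, whose exact solution is z(0) * exp(-T).
zT = odeint_euler(lambda z, t: -z, z0=[1.0], T=1.0)
assert np.allclose(zT, np.exp(-1.0), atol=1e-3)
```

In practice, NODE/NCDE libraries use adaptive solvers (e.g., Dormand-Prince)
rather than fixed-step Euler; the fixed step is used here only for clarity.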
Neural controlled differential equations (NCDEs)
NCDEs [6] generalize RNNs in a continuous
manner and can be written as follows:
z(T) = z(0) + \int_0^T f(z(t); \theta_f) \, dX(t)   (2)

     = z(0) + \int_0^T f(z(t); \theta_f) \frac{dX(t)}{dt} \, dt,   (3)

The original CDE formulation in Eq. (2) reduces to Eq. (3), where
dz(t)/dt is approximated by f(z(t); \theta_f) \frac{dX(t)}{dt}.
Figure: The overall workflow of the original NCDE for processing time-series.
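The difference between Eq. (2) and a plain ODE can be sketched the same way:
the Euler update is driven by increments of the control path X rather than by
dt. The names below are illustrative, not from the paper's code:

```python
import numpy as np

def cdeint_euler(f, z0, X, t_grid):
    """Approximate z(T) = z(0) + integral f(z(t)) dX(t) with explicit Euler:
    each step uses the path increment dX = X(t_{k+1}) - X(t_k)."""
    z = np.asarray(z0, dtype=float)
    for k in range(len(t_grid) - 1):
        dX = X(t_grid[k + 1]) - X(t_grid[k])   # the path increment drives the dynamics
        z = z + f(z) * dX
    return z

# Toy check: with f(z) = 1 the CDE just accumulates the increments of X,
# so z(T) = z(0) + X(T) - X(0) for any control path X (telescoping sum).
X = lambda t: np.sin(t)
t_grid = np.linspace(0.0, 2.0, 201)
zT = cdeint_euler(lambda z: 1.0, z0=[0.0], X=X, t_grid=t_grid)
assert np.allclose(zT, np.sin(2.0) - np.sin(0.0))
```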
Graph Neural Controlled Differential Equations (1/7)
Our proposed spatio-temporal graph neural controlled differential
equation (STG-NCDE) consists of two NCDEs.
Graph Neural Controlled Differential Equations (2/7)
Figure: The overall workflow in our proposed STG-NCDE.
Graph Neural Controlled Differential Equations (3/7)
Figure: STG-NCDE
The first NCDE, for the temporal processing, can be written as follows:

H(T) = H(0) + \int_0^T f(H(t); \theta_f) \frac{dX(t)}{dt} \, dt,   (4)

where H^{(v)}(t) is a hidden trajectory (over time t \in [0, T]) of the
temporal information of node v, and X(t) is a matrix whose v-th row is
X^{(v)}(t).
Graph Neural Controlled Differential Equations (4/7)
Figure: STG-NCDE
The second NCDE performs the spatial processing as follows:

Z(T) = Z(0) + \int_0^T g(Z(t); \theta_g) \frac{dH(t)}{dt} \, dt,   (5)

where the hidden trajectory Z(t) is controlled by H(t).
Combining Eqs. (4) and (5), we have the following single equation, which
incorporates both the temporal and the spatial processing:

Z(T) = Z(0) + \int_0^T g(Z(t); \theta_g) f(H(t); \theta_f) \frac{dX(t)}{dt} \, dt.   (6)
Graph Neural Controlled Differential Equations (5/7)
The definition of f : \mathbb{R}^{|V| \times \dim(h^{(v)})} \to \mathbb{R}^{|V| \times \dim(h^{(v)})} is as follows:

f(H(t); \theta_f) = \psi(\mathrm{FC}_{|V| \times \dim(h^{(v)}) \to |V| \times \dim(h^{(v)})}(A_K)),
    \vdots
A_1 = \sigma(\mathrm{FC}_{|V| \times \dim(h^{(v)}) \to |V| \times \dim(h^{(v)})}(A_0)),
A_0 = \sigma(\mathrm{FC}_{|V| \times \dim(h^{(v)}) \to |V| \times \dim(h^{(v)})}(H(t))),   (7)

where \sigma is a rectified linear unit and \psi is a hyperbolic tangent.
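A hedged NumPy sketch of Eq. (7): K + 1 fully-connected layers with ReLU
(\sigma), followed by a final FC with tanh (\psi). The sizes, depth, and
initialization below are illustrative, not taken from the released code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, hdim, K = 4, 8, 2               # illustrative |V|, dim(h^{(v)}), depth K

relu = lambda x: np.maximum(x, 0.0)      # sigma in Eq. (7)
# One weight/bias pair per FC layer: A_0, ..., A_K, plus the final psi layer.
Ws = [rng.standard_normal((hdim, hdim)) * 0.1 for _ in range(K + 2)]
bs = [np.zeros(hdim) for _ in range(K + 2)]

def f(H):
    """A_0 = relu(FC(H)), ..., A_K = relu(FC(A_{K-1})), output = tanh(FC(A_K))."""
    A = H
    for k in range(K + 1):
        A = relu(A @ Ws[k] + bs[k])
    return np.tanh(A @ Ws[K + 1] + bs[K + 1])  # psi = tanh keeps the field bounded

H = rng.standard_normal((n_nodes, hdim))
out = f(H)
assert out.shape == (n_nodes, hdim)
assert np.all(np.abs(out) <= 1.0)        # tanh output is bounded in [-1, 1]
```

The final tanh bounds the vector field, a common choice for keeping the
resulting CDE well-behaved during integration.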
Graph Neural Controlled Differential Equations (6/7)
The definition of g : \mathbb{R}^{|V| \times \dim(z^{(v)})} \to \mathbb{R}^{|V| \times \dim(z^{(v)})} is as follows:

g(Z(t); \theta_g) = \psi(\mathrm{FC}_{|V| \times \dim(z^{(v)}) \to |V| \times \dim(z^{(v)})}(B_1)),   (8)
B_1 = (I + \phi(\sigma(E \cdot E^\top))) B_0 W_{\mathrm{spatial}},   (9)
B_0 = \sigma(\mathrm{FC}_{|V| \times \dim(z^{(v)}) \to |V| \times \dim(z^{(v)})}(Z(t))),   (10)

where I is the |V| \times |V| identity matrix, \phi is a softmax activation,
E \in \mathbb{R}^{|V| \times C} is a trainable node-embedding matrix, and
W_{\mathrm{spatial}} is a trainable weight transformation matrix.
\phi(\sigma(E \cdot E^\top)) corresponds to the normalized adjacency matrix
D^{-1/2} A D^{-1/2}, where A = \sigma(E \cdot E^\top) and the softmax activation
plays the role of normalizing the adaptive adjacency matrix [10, 1].
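A hedged NumPy sketch of the adaptive adjacency term \phi(\sigma(E \cdot E^\top))
in Eq. (9): ReLU of the embedding similarities followed by a row-wise softmax,
so each row becomes a normalized neighborhood distribution. The sizes |V| and C
below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, C = 5, 3                        # illustrative |V| nodes, embedding width C
E = rng.standard_normal((n_nodes, C))    # trainable node-embedding matrix

A = np.maximum(E @ E.T, 0.0)             # sigma(E E^T): non-negative similarities
expA = np.exp(A - A.max(axis=1, keepdims=True))
A_norm = expA / expA.sum(axis=1, keepdims=True)  # phi: row-wise softmax

# Each row now sums to 1: a normalized neighborhood distribution, playing
# the role of D^{-1/2} A D^{-1/2} for a pre-defined graph.
assert A_norm.shape == (n_nodes, n_nodes)
assert np.allclose(A_norm.sum(axis=1), 1.0)
```

Because E is learned, the graph is adaptive: the model discovers which
sensors influence each other rather than relying on road-network distance.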
Graph Neural Controlled Differential Equations (7/7)
Initial value generation
The initial value of the temporal processing is generated as follows:

H(0) = \mathrm{FC}_{D \to \dim(h^{(v)})}(F_{t_0}).

We use a similar strategy to generate Z(0):

Z(0) = \mathrm{FC}_{\dim(h^{(v)}) \to \dim(z^{(v)})}(H(0)).

After generating these initial values for the two NCDEs, we can calculate
Z(T) by solving the Riemann–Stieltjes integral problem.
Datasets
We use six real-world traffic datasets which were widely used in the
previous studies [11, 5, 4, 3, 9].
Dataset |V| Time Steps Time Range Type
PeMSD3 358 26,208 09/2018 - 11/2018 Volume
PeMSD4 307 16,992 01/2018 - 02/2018 Volume
PeMSD7 883 28,224 05/2017 - 08/2017 Volume
PeMSD8 170 17,856 07/2016 - 08/2016 Volume
PeMSD7(M) 228 12,672 05/2012 - 06/2012 Velocity
PeMSD7(L) 1,026 12,672 05/2012 - 06/2012 Velocity
Table: The summary of the datasets used in our work. We predict either traffic
volume (i.e., # of vehicles) or velocity.
Experimental Settings
We use the forecasting settings of S = 12 and M = 1 after reading
past 12 graph snapshots.
We use the mean absolute error (MAE), the mean absolute
percentage error (MAPE), and the root mean squared error (RMSE)
to measure the performance.
We compare our proposed STG-NCDE with 20 baseline models.
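For reference, the three metrics can be computed as follows (a self-contained
sketch; variable names are ours):

```python
import numpy as np

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))            # mean absolute error

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))    # root mean squared error

def mape(y, yhat):
    return np.mean(np.abs((y - yhat) / y)) * 100.0  # percent; assumes y != 0

# Toy check on hypothetical traffic flows.
y    = np.array([100.0, 200.0, 400.0])
yhat = np.array([110.0, 190.0, 380.0])
assert np.isclose(mae(y, yhat), 40.0 / 3)                 # (10+10+20)/3
assert np.isclose(rmse(y, yhat), np.sqrt(200.0))          # mean sq. err. = 200
assert np.isclose(mape(y, yhat), (0.10 + 0.05 + 0.05) / 3 * 100)
```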
