Scaling Deep Learning Models for Large Spatial
Time-Series Forecasting
Zainab Abbas1, Jon Reginbald Ivarsson1, Ahmad Al-Shishtawy2 and Vladimir Vlassov1
1 KTH Royal Institute of Technology
2 RISE SICS
Stockholm, Sweden
IEEE BIGDATA 2019, LOS ANGELES, DEC 9-12, 2019
Challenge of Scale
● Deep neural networks (NNs) are used for many machine learning tasks, such as
spatial time-series forecasting.
● At scale, training deep NNs is computationally and memory intensive.
● Partitioning and distribution is a general approach to the challenge of scale in
NN-based modelling:
○ divide the problem into smaller tasks;
○ these tasks consist of smaller models working on subsets of the data.
Traffic Data
● Sensor ID
● GPS coordinates
● Time
● Flow (no. of cars per minute)
● Average Speed (km per hour)
Flow / Speed = Density (cars per km)
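The density relation above hides a unit conversion: flow is reported in cars per minute while speed is in km/h, so flow must be scaled to cars per hour before dividing. A minimal sketch (the function name is illustrative, not from the slides):

```python
def density(flow_per_min, speed_kmh):
    """Density (cars/km) from flow (cars/min) and mean speed (km/h).

    Flow is converted to cars/hour so the units cancel:
    (cars/h) / (km/h) = cars/km.
    """
    if speed_kmh <= 0:
        return float("nan")  # sensor reported no moving traffic
    return (flow_per_min * 60.0) / speed_kmh

print(density(20, 60))  # 20 cars/min = 1200 cars/h at 60 km/h -> 20.0 cars/km
```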
Large amount of sensor data
● Radar sensors deployed on
Stockholm highways
● More than 88 million data points collected by 2058 sensors
● Number of sensors increasing every year
● Sensor values reported every minute
Research Questions
● How to partition spatial time-series while preserving dependencies among
them?
● Which and how many spatial time-series do we take into account for a fast and
accurate forecast?
Graph Representation
● The traffic sensors are represented in the
form of a directed weighted graph
● Sensors at the same location but on
different lanes are merged into a
single vertex
● The paths between sensors are
represented as edges
● An edge is weighted with the travel time
between the corresponding sensors
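The graph described above can be sketched as a small adjacency-list structure (a hypothetical minimal representation, not the paper's implementation); vertices are sensor locations and each directed edge carries the travel time to the downstream sensor:

```python
from collections import defaultdict

class SensorGraph:
    """Directed weighted graph of sensor locations."""

    def __init__(self):
        # vertex -> {downstream vertex: travel time in seconds}
        self.adj = defaultdict(dict)

    def add_edge(self, u, v, travel_time_s):
        self.adj[u][v] = travel_time_s

    def predecessors(self, v):
        # sensors immediately upstream of v
        return [u for u, nbrs in self.adj.items() if v in nbrs]

g = SensorGraph()
g.add_edge("s1", "s2", 45)  # 45 s travel time between sensor locations
g.add_edge("s2", "s3", 60)
g.add_edge("s4", "s2", 30)  # merging ramp into s2
print(sorted(g.predecessors("s2")))  # ['s1', 's4']
```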
Graph Partitioning
1) Directed Weighted Graph
2) Creation of Base Partitions: do a backward traversal from the starting vertex in the graph until the threshold is met.
3) Creation of Base Partitions Graph
4) Addition of Partitions from Front and Behind
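The backward traversal in the base-partition step can be sketched as follows (a simplified illustration assuming an adjacency-dict graph; the traversal and threshold semantics follow the slide, the code itself is not from the paper). Starting from a vertex, it walks edges backwards and keeps every upstream sensor whose cumulative travel time stays within the threshold:

```python
def base_partition(graph_adj, start, threshold_s):
    """Upstream sensors within `threshold_s` cumulative travel time of `start`.

    graph_adj: {u: {v: travel_time_s}} directed weighted graph.
    Returns {vertex: cumulative travel time to start}.
    """
    # Build reverse adjacency: v -> [(u, w)] for each edge u -> v.
    rev = {}
    for u, nbrs in graph_adj.items():
        for v, w in nbrs.items():
            rev.setdefault(v, []).append((u, w))

    part, frontier = {start: 0.0}, [start]
    while frontier:
        v = frontier.pop()
        for u, w in rev.get(v, []):
            d = part[v] + w
            # Keep u only if it is reachable within the travel-time threshold.
            if d <= threshold_s and d < part.get(u, float("inf")):
                part[u] = d
                frontier.append(u)
    return part

# a -> b -> c with a merging edge d -> b; threshold of 3 minutes (180 s)
adj = {"a": {"b": 120}, "b": {"c": 120}, "d": {"b": 400}}
print(base_partition(adj, "c", 180))  # {'c': 0.0, 'b': 120.0}
```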
Data Representation
Spatio-temporal dependencies of sensor data. Space-time window representation of sensor data.
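A space-time window can be sketched as slicing a (time, sensors) array into fixed-length history windows with a forecast target some steps ahead (a generic windowing sketch; the window lengths and array layout are illustrative assumptions, not the paper's exact preprocessing):

```python
import numpy as np

def space_time_windows(series, history, horizon):
    """Slice a (T, S) array of S sensor series into supervised samples.

    X[i] holds `history` past steps for all sensors (a space-time window);
    y[i] is the value `horizon` steps ahead for all sensors.
    """
    T = series.shape[0]
    X, y = [], []
    for t in range(history, T - horizon + 1):
        X.append(series[t - history:t])    # (history, S) window
        y.append(series[t + horizon - 1])  # (S,) target
    return np.stack(X), np.stack(y)

data = np.arange(20.0).reshape(10, 2)  # 10 minutes, 2 sensors
X, y = space_time_windows(data, history=3, horizon=1)
print(X.shape, y.shape)  # (7, 3, 2) (7, 2)
```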
Stacked LSTMs
● Stacked LSTMs are built of multiple LSTM layers
placed on top of each other
● A more powerful and deeper neural network than
the conventional single-layer architecture
● Capable of learning non-linear dependencies in the data
Technology
● Apache Spark 2.4.0
● Python 2.7.15
● TensorFlow 1.11.0
Prediction Models
1-1 Single Sensor Model (SSM)
1-1 Single Sensor Model (SSM)
n-n Entire Sensor Infrastructure Model (ESIM)
Partition-Based Models (B-t)
3 min partitions (B3) 5 min partitions (B5)
Partition-Based Models (B-t)
10 min partitions (B10) 20 min partitions (B20)
Results
Partition-based model B15 achieves roughly half the RMSE of SSM and ESIM.
Results
Sequential training time (sec) vs. parallel training time (sec)
Results
* SSMs have very short prediction times and are therefore omitted.
Conclusion
● Scalability
● Accuracy
● Performance
Thank you :)
Zainab Abbas
zainabab@kth.se
Vladimir Vlassov
vladv@kth.se
Ahmad Al-Shishtawy
ahmad.al-shishtawy@ri.se
Jon R. Ivarsson
mail@reginbald.com