In this paper, we provide a proof of concept on how to model and forecast average Vessel Service Time (VST) using Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs). The proposed model is learned from Automatic Identification System (AIS) data using machine learning. Geohash area (GeoArea) at a given precision, convex hull area (ConvArea), and average vessel proximity (Delta) are mined for the port of Singapore every hour. These three metrics are used to calculate port spatial complexity (SpComplexity) and port spatial density (SpDensity) indicators. In addition, we propose an algorithm to mine the VST and associate it with the mined GeoArea, ConvArea, and the calculated indicators (i.e., SpDensity and SpComplexity). Then, an LSTM model is trained and subsequently tested to forecast future VST, as port authorities are increasingly relying on data-driven insights for decision-making purposes. We trained and tested several LSTM models with four different time aggregation granularities (2, 4, 6, and 8 hours) and compared their performance in terms of Mean Square Error (MSE). The experiments demonstrated the feasibility of the proposed LSTM model to forecast VST.
1. DATA-DRIVEN VESSEL SERVICE TIME FORECASTING USING LONG
SHORT-TERM MEMORY RECURRENT NEURAL NETWORKS
Ibrahim AbuAlhaol, Rafael Falcon, Rami Abielmona, and Emil Petriu
Project: Big Data Analytics for the Maritime Internet of Things
Funding: NSERC CRD 499024-16, Ontario Centres of Excellence (OCE), Larus Technologies
2. Motivation
12/11/18 2
‣ Maritime port congestion causes delays
in shipping services, which result
in financial and reputational losses.
‣ Disruption management mitigates the
impact of disruption events but
requires data-driven insights to
evaluate the disruption.
‣ Can we model and forecast average
Vessel Service Time using
Automatic Identification System (AIS) data?
3. Contributions
‣ Spatiotemporal mining
algorithms to calculate
convex hull area,
geohash area, and
vessel proximity.
‣ Analytical formulation of two Port Congestion Indicators (PCIs) to
capture port spatial complexity and spatial density.
‣ AIS-based mining algorithm to estimate average Vessel Service Time.
‣ LSTM-based model to forecast average Vessel Service Time.
4. Connecting the dots!
‣ Providing reliable shipping services is an important challenge for maritime port
authorities and liner shipping companies.
‣ Regular uncertainties (e.g., port productivity) and disruptions (e.g., weather
conditions) are the main factors that affect the quality of maritime port services.
‣ Port authorities need to mitigate disruption by considering all possible causes
and appropriate countermeasures.
‣ We propose data-driven Port Congestion Indicators (PCIs) and an average Vessel
Service Time forecasting model to provide actionable insights to port authority
operation managers and other stakeholders (e.g., liner shipping companies).
‣ Seaborne trade is estimated to account for 90% of
global trade by volume; maritime port
performance and resilience are therefore
crucial in sustaining global economic growth.
5. Data: Automatic Identification System (AIS)
‣ AIS is a vessel tracking system
that provides regular updates
on a vessel’s movement and
other relevant ship voyage
data to other parties.
‣ Static and dynamic vessel
information can be
exchanged electronically
between AIS-receiving
stations (on board, ashore, or
satellite).*
* The draught of a ship’s hull is the vertical distance
between the waterline and the bottom of the hull.
6. Big Data characteristics of AIS
‣ Volume: AIS data is large in volume;
one year of terrestrial data is around 1
terabyte.
‣ Velocity: AIS dynamic messages are broadcast
at different time intervals depending on the
vessel’s speed and rate of turn (from 2 seconds
to 3 minutes).
‣ Veracity: Some of the fields in an AIS message could be either left outdated
or intentionally spoofed.
‣ Variety: AIS messages report different data types (e.g., Destination is text
whereas Maritime Mobile Service Identity (MMSI) is an integer).
7. Area of Interest [Port of Singapore]
Convex Hull Area
Geohash Area
8. Geohash precision and cell height/width
‣ Geohash is a geocoding system that encodes a geographic
location into a short string of letters and digits.
‣ We used precision-7 (PREC = 7) geohashes, which divide the
Area of Interest into 153 m x 153 m cells.
Geohash Area
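The 153 m cell size can be checked from the geohash bit layout. The following is a minimal sketch, assuming the standard geohash scheme (5 bits per character, bits interleaved starting with longitude) and a rough spherical-Earth factor of 111.32 km per degree. The function name and the latitude used for Singapore are illustrative and do not come from the slides.

```python
import math

def geohash_cell_size_m(precision, lat_deg=0.0):
    """Approximate (height, width) in metres of one geohash cell.

    Assumes the standard layout: 5 bits per character, interleaved
    starting with longitude, so longitude gets the extra bit when the
    total number of bits is odd.
    """
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2
    lat_bits = total_bits // 2
    m_per_deg = 111_320.0  # rough metres per degree (assumption)
    height = 180.0 / (2 ** lat_bits) * m_per_deg
    # Cells get narrower away from the equator by cos(latitude).
    width = 360.0 / (2 ** lon_bits) * m_per_deg * math.cos(math.radians(lat_deg))
    return height, width

# Latitude of about 1.26 degrees is an approximate value for Singapore.
height, width = geohash_cell_size_m(7, lat_deg=1.26)
print(round(height), round(width))
```

Near the equator both dimensions round to 153 m, which agrees with the cell size quoted for precision 7.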
9. AIS Data Mining Framework
The framework is composed of:
• Cassandra for the consumption of AIS data,
• Spark for mining and extracting, and
• TensorFlow for LSTM modeling and forecasting.
10. Spatiotemporal Data Mining
‣ Convex Hull Area (ConvArea): The port
convex hull area is defined as the area
enclosed by the smallest convex polygon
(the shortest perimeter fence) that
contains all vessel positions.
‣ Geohash Area (GeoArea):
Geohash is a geocoding system which
encodes a geographic location into a
short string of letters and digits.
Geohash area with precision-7 is the
sum of the areas of all blue squares
shown in the figure.
Convex hull and precision-7 geohashes for Port
of Hong Kong in July 2015 (ShipType: 70-79,
ShipSpeed <= 5 knots, ConvArea = 63.55 km²,
GeoArea = 4.28 km²).
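The two area metrics can be illustrated with a short, dependency-free sketch. It is not the authors' Spark implementation: it assumes vessel positions are given as (lat, lon) pairs, projects them to local kilometres with a rough 111.32 km-per-degree factor, uses Andrew's monotone chain for the hull, and takes the 153 m cell size quoted for precision 7. All function names are hypothetical.

```python
import math

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"
KM_PER_DEG = 111.32  # rough km per degree; assumption for a small area

def geohash_encode(lat, lon, precision=7):
    """Standard geohash: interleave lon/lat bisection bits, 5 bits per char."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, value, nbits, use_lon = [], 0, 0, True
    while len(chars) < precision:
        rng, coord = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        bit = 1 if coord >= mid else 0
        rng[1 - bit] = mid  # keep the half that contains the coordinate
        value = value * 2 + bit
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:
            chars.append(BASE32[value])
            value, nbits = 0, 0
    return "".join(chars)

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def conv_area_km2(positions):
    """Convex hull area of (lat, lon) positions, in km^2 (planar approximation)."""
    lat0 = sum(lat for lat, _ in positions) / len(positions)
    lon0 = sum(lon for _, lon in positions) / len(positions)
    scale_x = KM_PER_DEG * math.cos(math.radians(lat0))
    xy = [((lon - lon0) * scale_x, (lat - lat0) * KM_PER_DEG)
          for lat, lon in positions]
    hull = convex_hull(xy)
    # Shoelace formula over consecutive hull vertices.
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]))
    return abs(twice_area) / 2

def geo_area_km2(positions, precision=7, cell_km=0.153):
    """Number of distinct occupied geohash cells times the cell area.

    cell_km = 0.153 only matches precision 7.
    """
    cells = {geohash_encode(lat, lon, precision) for lat, lon in positions}
    return len(cells) * cell_km ** 2

# Illustrative positions only (not real AIS data).
positions = [(1.20, 103.80), (1.20, 103.81), (1.21, 103.80),
             (1.21, 103.81), (1.205, 103.805)]
print(conv_area_km2(positions), geo_area_km2(positions))
```

GeoArea counts only occupied cells, so it is typically much smaller than ConvArea, as in the figure caption (4.28 km² versus 63.55 km²).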
‣ Average Vessel Proximity (Δ): The average distance between the
locations of all vessels that are reported as either “Anchored” or
“Moored” and have a speed less than a predefined threshold.
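One way to read this definition is as the mean pairwise great-circle distance between the filtered vessels. The sketch below follows that reading, which is an assumption, since the slides do not spell out the exact formula. The record layout (keys `status`, `sog`, `lat`, `lon`), the function names, and the 5-knot default are hypothetical.

```python
import math
from itertools import combinations

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def average_proximity_km(reports, max_speed_knots=5.0):
    """Mean pairwise distance between anchored or moored slow vessels.

    Returns None when fewer than two vessels pass the filter.
    """
    stopped = [(r["lat"], r["lon"]) for r in reports
               if r["status"] in ("Anchored", "Moored")
               and r["sog"] < max_speed_knots]
    pairs = list(combinations(stopped, 2))
    if not pairs:
        return None
    return sum(haversine_km(*a, *b) for a, b in pairs) / len(pairs)

# Illustrative records only (not real AIS data).
reports = [
    {"status": "Anchored", "sog": 0.1, "lat": 0.0, "lon": 0.0},
    {"status": "Moored", "sog": 0.0, "lat": 0.0, "lon": 1.0},
    {"status": "Under way", "sog": 12.0, "lat": 0.5, "lon": 0.5},
]
print(average_proximity_km(reports))
```

The number of pairs grows quadratically with the number of vessels, so a production version would likely need sampling or a distributed computation.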
15. Port Congestion Indicators (PCI) [1/2]
‣ Spatial Complexity (SpComplexity) is calculated after mining the convex
hull area (ConvArea) and the average vessel proximity (Δ), as presented in
Algorithm 1 and Algorithm 3.
• i is the hour index from all hours (I) in January and February of 2018.
• G(i) denotes the number of unique geohashes at the ith aggregation period.
16. Port Congestion Indicators (PCI) [2/2]
‣ Spatial Density (SpDensity) is calculated after mining the convex hull area
(ConvArea) and the Geohash area (GeoArea), as presented in Algorithm 1
and Algorithm 2.
• i is the hour index from all hours (I) in January and February of 2018.
20. LSTM with dense layer architecture to forecast Average VST
‣ Lag Features: Current and past VST, spatiotemporal characteristics (i.e., ConvArea,
GeoArea, Δ), and congestion indicators (i.e., SpComplexity and SpDensity).
The lags k = 0, 1, 2, 3, 4 cover the current (i.e., k = 0) and the
past four aggregation periods (a maximum lag of 4 was
selected after trying many possible lags).
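The lag features can be arranged as sliding windows of five consecutive aggregation periods, which is the (samples, time steps, features) layout an LSTM layer expects. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the target is taken to be the VST of the next period (the slides do not state the forecast horizon), and VST is assumed to be stored in column 0 of each feature row.

```python
def make_lag_windows(series, target_index=0, max_lag=4):
    """Build LSTM inputs from a list of per-period feature rows.

    Each row is assumed to hold
    [VST, ConvArea, GeoArea, Delta, SpComplexity, SpDensity].
    Returns X with shape (samples, max_lag + 1, features) as nested
    lists and y holding the next-period value of the target column.
    """
    X, y = [], []
    for t in range(max_lag, len(series) - 1):
        # Rows ordered from the oldest lag (k = max_lag) to the current (k = 0).
        window = series[t - max_lag:t + 1]
        X.append([list(row) for row in window])
        y.append(series[t + 1][target_index])
    return X, y

# Toy series with two features per period (illustrative values only).
series = [[i, 10 * i] for i in range(8)]
X, y = make_lag_windows(series)
print(len(X), len(X[0]), y)
```

With eight periods and a maximum lag of 4, three windows are produced, because each sample needs four earlier periods and one later period for the target.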
21. Architecture parameters and MSE Performance
We ran each experiment 50 times
and provided the 95% confidence
intervals.
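A minimal sketch of how an MSE score and a 95% confidence interval over repeated runs might be computed. The slides do not say how the intervals were derived; the normal approximation (1.96 standard errors) used here is an assumption, and the function names and toy values are illustrative.

```python
import math
import statistics

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def mean_ci95(values):
    """Mean and normal-approximation 95% confidence interval of the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Toy MSE values standing in for repeated runs (not results from the study).
run_scores = [1.0, 2.0, 3.0, 4.0]
print(mean_ci95(run_scores))
```

For 50 runs the normal approximation is close to the Student-t interval, whose multiplier for 49 degrees of freedom is about 2.01.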
23. Summary and Conclusion
‣ The mined ConvArea and GeoArea were negatively correlated with
average VST, which aligns with the fact that a larger area reduces
congestion and therefore decreases average VST values.
‣ SpComplexity, SpDensity, and Δ were positively correlated with
average VST; this corroborates the fact that smaller values of these
congestion indicators lead to lower average VST values.
‣ The results provide empirical evidence of the practicality of using
LSTM Recurrent Neural Networks to model and forecast average VST
using current/past spatiotemporal characteristics and congestion
indicators mined from AIS data.
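The sign of such a relationship can be checked with a correlation coefficient. The slides do not name the coefficient used; the sketch below assumes Pearson's r, and the function name and toy values are illustrative rather than results from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy values: area grows while service time falls, so r is negative.
area = [1.0, 2.0, 3.0, 4.0]
service_time = [8.0, 6.0, 4.0, 2.0]
print(pearson(area, service_time))
```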
24. Limitations and Future Work
‣ One necessary extension of this work is to train and validate the
models on more AIS data.
‣ Investigate the model on several time and geohash granularities.
‣ The LSTM architecture could be enhanced by adding more layers
and/or incorporating more lags.
‣ Investigate advanced LSTM architectures such as:
• Bi-directional LSTM
• Stacked LSTM