Exploratory Analysis of Massive Movement Data (RGS-IBG GIScience Research Group Seminar 2021)

The potential of Big Data for understanding human mobility patterns and other complex phenomena in transportation and movement research is significant. Many contemporary Big Data sources have clear spatiotemporal dimensions. However, Big Spatiotemporal Data is usually messy and presents numerous challenges to researchers and analysts trying to extract information and knowledge. Exploratory data analysis tools for massive movement data are necessary to gain an understanding of our data, its biases and messiness and how they might affect our analyses. This talk presents methods for the exploration of movement patterns in massive quasi-continuous GPS tracking datasets, with examples focusing on international maritime vessel movements.

  1. 1. EXPLORATORY ANALYSIS OF MASSIVE MOVEMENT DATA. RGS-IBG GIScRG webinar series. Anita Graser. Supported by the Austrian Federal Ministry of Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK) within the programme “IKT der Zukunft” under Grant 861258 (project MARNG).
  2. 2. GEOGRAPHIC DATA SCIENCE. Gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others; dealing with data that incorporates spatial and often temporal elements; turning Big Spatiotemporal Data into insight and understanding. Diagram labels: Big Data & Data Science, Geography, Geographic / Spatial Data Science, Quant. Geogr. & Spatial Statistics & GIScience & Geomatics, CS & Math & Statistics, AI / ML, Humanities, STEM, Critical Data Studies, ICT & Engineering. References: Andrienko et al. (2017) Geographic Data Science; Singleton & Arribas-Bel (2021) Geographic Data Science
  3. 3. MOVEMENT DATA SCIENCE. Opportunistic reuse of data; black box / undocumented data collection; usually biased & messy data. Data sources: geotagged social media posts, cell phone network data, mobile app usage data, vehicle tracking systems, check-in data, WiFi tracking, payment data. “All metadata records are incomplete as it is impossible to foresee future uses” Janowicz et al. (2020) GeoAI
  4. 4. CHALLENGES. Accept messiness in data. Need to understand: causes of bias & messiness; consequences of using such data in analyses. Data visualization & exploratory approaches. References: Brunsdon & Comber (2020) Big issues for big data; Graser & Dragaschnig (2020) Open Geospatial Tools for Movement Data Exploration
  5. 5. EXPLORATORY DATA ANALYSIS (EDA) OF MOVEMENT DATA. Complex spatiotemporal phenomena; context & scale dependent; spatial, temporal & attribute uncertainty; plus a lack of established tools & practices. References: Demšar & Virrantaus (2010) Space-time density of trajectories; Andrienko & Andrienko (2011) Spatial generalization and aggregation of massive movement data; Andrienko et al. (2017) Visual exploration of movement and event data with interactive time masks
  6. 6. MARITIME MOVEMENT DATA. Unconstrained movement. Strongly varying trip properties, e.g. regarding duration & spatial extent. Observation gaps: data sources usually limited to certain regions. Unreliable information, e.g. movement speed and direction, trip destination, vessel identity, time stamps. Image source: https://anitagraser.com/2017/10/28/movement-data-in-gis-10-open-tools-for-ais-tracks-from-marinecadastre-gov/
  7. 7. EDA OF MASSIVE MARITIME MOVEMENT DATA. Input: AIS by Danish Maritime Authority for 2017; 4 billion records by 89,926 distinct vessels. Graser et al. (2020) The M³ massive movement model
  8. 8. AGGREGATING UNCONSTRAINED MOVEMENT DATA
  10. 10. © Eric Fischer © Ravi Shekhar © Marine Traffic © Strava
  11. 11. M³ MASSIVE MOVEMENT MODEL PROTOTYPE AGGREGATION. Why density grids are not good enough. Graser et al. (2020) The M³ massive movement model
  12. 12. M³ MASSIVE MOVEMENT MODEL PROTOTYPE AGGREGATION. Multiple (up to nmax) prototypes per cell. Gaussian mixture models (GMMs) including: location, speed, direction, count. Graser et al. (2020) The M³ massive movement model
  13. 13. M³ MASSIVE MOVEMENT MODEL PROTOTYPE AGGREGATION. Scalable: grid cells independent; independent of input order. Fast: Spark implementation (41 min for 4 billion records on an 8-node cluster). Supports streaming data. Graser et al. (2020) The M³ massive movement model
  14. 14. QUICK TECHNICAL BACKGROUND
  15. 15. M³ MASSIVE MOVEMENT MODEL PROTOTYPE AGGREGATION. Link to “GeoAI” or Predictive Analytics: anomaly detection; trajectory prediction. References: Graser & Widhalm (2018) Modelling Massive AIS Streams with Quad Trees and Gaussian Mixtures; Graser et al. (2019) Data-driven Trajectory Prediction and Spatial Variability of Prediction Performance in Maritime Location Based Services
  16. 16. AGGREGATING UNCONSTRAINED MOVEMENT DATA
  17. 17. TRAJECTORY AGGREGATION. Cleaning & splitting at stops & observation gaps. Graser et al. (2020) Exploratory Trajectory Analysis for Massive Historical AIS Datasets
  18. 18. TRAJECTORY AGGREGATION. Challenges: no straightforward spatial or temporal binning; records of individual ships need to be processed in chronological order; individual trips can be weeks long, so the collected records can exceed available memory. Implementation: needs to iteratively process the sorted records WITHOUT materializing them in memory all at once; 1 hour runtime for 6 months of AIS (on a 6-node cluster). Graser et al. (2020) Exploratory Trajectory Analysis for Massive Historical AIS Datasets
  20. 20. TRAJECTORY AGGREGATION. Link to “GeoAI” or Predictive Analytics: travel time prediction; destination prediction. Figure: travel times between Gothenburg and Gibraltar. Graser et al. (2020) Exploratory Trajectory Analysis for Massive Historical AIS Datasets
  21. 21. AGGREGATING UNCONSTRAINED MOVEMENT DATA
  22. 22. M³ MASSIVE MOVEMENT MODEL FLOW AGGREGATION. Movement between prototypes: prototype pair, speed, count. Scalable. ✘ Not suitable for data streams. Graser et al. (2020) Extracting Patterns from Large Movement Datasets
  23. 23. M³ MASSIVE MOVEMENT MODEL FLOW AGGREGATION. Figure panels: passenger patterns; tanker patterns. Graser et al. (2020) Extracting Patterns from Large Movement Datasets
  24. 24. M³ MASSIVE MOVEMENT MODEL FLOW AGGREGATION. Figure: passenger vessel speed & variability. Graser et al. (2020) Extracting Patterns from Large Movement Datasets
  25. 25. M³ MASSIVE MOVEMENT MODEL FLOW AGGREGATION. Link to “GeoAI” or Predictive Analytics: “routing graph”; improved trajectory prediction.
  26. 26. AGGREGATING UNCONSTRAINED MOVEMENT DATA
  27. 27. SOME FUTURE CHALLENGES. Continued need for domain-specific movement EDA tools: (1) Best practices / checklist tools: systematic evaluation of data quality*. (2) Openness & reproducibility: document all operations undertaken on the data. (3) Privacy: built-in privacy protection. (4) Uncertainty: efficient evaluation & visualization concepts for uncertainty. * Graser (in press) An exploratory data analysis protocol for identifying problems in continuous movement data.
  28. 28. anita.graser@ait.ac.at @underdarkGIS ANITA GRASER
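
Slides 17 and 18 describe splitting each vessel's records at observation gaps while processing them in chronological order, without holding all records in memory at once. The following is a minimal single-machine Python sketch of that idea, not the implementation from the talk (which ran on a cluster and is not shown in the slides). It assumes the input is already sorted by vessel and time. The names `Record`, `TripSummary`, `summarize_trips` and the `max_gap` threshold are hypothetical, and cleaning and stop detection are omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Record:
    """One position report (hypothetical minimal schema)."""
    vessel_id: str
    t: datetime
    lon: float
    lat: float


@dataclass
class TripSummary:
    """Constant-size state kept per trip instead of the full record list."""
    vessel_id: str
    start: datetime
    end: datetime
    n_records: int


def summarize_trips(
    records: Iterable[Record], max_gap: timedelta
) -> Iterator[TripSummary]:
    """Yield one summary per trip.

    Assumes `records` is sorted by (vessel_id, t). A new trip starts when
    the vessel changes or the time since the previous record exceeds
    `max_gap`. Only the current trip's summary is held in memory.
    """
    current: Optional[TripSummary] = None
    for rec in records:
        same_trip = (
            current is not None
            and rec.vessel_id == current.vessel_id
            and rec.t - current.end <= max_gap
        )
        if same_trip:
            current.end = rec.t
            current.n_records += 1
        else:
            if current is not None:
                yield current
            current = TripSummary(rec.vessel_id, rec.t, rec.t, 1)
    if current is not None:
        yield current
```

Because the function is a generator over an iterable, the caller can feed it a lazily read, pre-sorted stream, and memory use stays bounded no matter how many weeks a single trip spans.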

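Slides 12 and 13 describe per-cell prototypes that capture location, speed, direction and count, with grid cells independent of each other and of input order. The sketch below is a deliberately simplified stand-in and not the M³ model: it keeps a single running summary per cell rather than up to nmax Gaussian mixture prototypes, and it uses a fixed square grid. The names `CellSummary` and `aggregate` and the `cell_size` parameter are hypothetical. It is meant only to illustrate why additive per-cell summaries are insensitive to input order (up to floating-point rounding) and can be merged across partitions.

```python
import math
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# x, y, speed, direction in degrees (hypothetical record layout)
Point = Tuple[float, float, float, float]


@dataclass
class CellSummary:
    """Additive sums for one grid cell; all updates are commutative."""
    count: int = 0
    sum_x: float = 0.0
    sum_y: float = 0.0
    sum_speed: float = 0.0
    sum_sin: float = 0.0
    sum_cos: float = 0.0

    def add(self, x: float, y: float, speed: float, direction_deg: float) -> None:
        rad = math.radians(direction_deg)
        self.count += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_speed += speed
        # Directions are circular, so accumulate unit-vector components.
        self.sum_sin += math.sin(rad)
        self.sum_cos += math.cos(rad)

    def merge(self, other: "CellSummary") -> None:
        """Combine with a summary of the same cell from another partition."""
        self.count += other.count
        self.sum_x += other.sum_x
        self.sum_y += other.sum_y
        self.sum_speed += other.sum_speed
        self.sum_sin += other.sum_sin
        self.sum_cos += other.sum_cos

    def mean_location(self) -> Tuple[float, float]:
        return self.sum_x / self.count, self.sum_y / self.count

    def mean_speed(self) -> float:
        return self.sum_speed / self.count

    def mean_direction(self) -> float:
        """Circular mean in degrees, in the range [0, 360]."""
        return math.degrees(math.atan2(self.sum_sin, self.sum_cos)) % 360.0


def aggregate(
    points: Iterable[Point], cell_size: float
) -> Dict[Tuple[int, int], CellSummary]:
    """Assign each point to a square grid cell and update that cell only."""
    cells: Dict[Tuple[int, int], CellSummary] = {}
    for x, y, speed, direction in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells.setdefault(key, CellSummary()).add(x, y, speed, direction)
    return cells
```

The circular mean matters for direction: headings of 350° and 10° should average to roughly north, whereas an arithmetic mean would give 180°.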